org.jpedal.examples.text
Class ExtractTextAsWordlist

java.lang.Object
  extended by org.jpedal.examples.text.ExtractTextAsWordlist

public class ExtractTextAsWordlist
extends java.lang.Object

Sample code showing how the JPedal library can be used with PDF files to extract text from a specified rectangle as a set of words. This example is based on extractTextInRectangle.java. The words can then be entered into an index engine such as Lucene.
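To illustrate what "entering words into an index" can look like, here is a minimal sketch in plain Java. It assumes the extracted words are already available as a list of strings. The class WordIndexSketch and the method buildIndex are hypothetical names used only for illustration; they are not part of JPedal or Lucene, and a real index engine does considerably more (tokenizing, scoring, persistence).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only; not part of JPedal or Lucene.
class WordIndexSketch {

    // Maps each lower-cased word to the list positions at which it occurs.
    static Map<String, List<Integer>> buildIndex(List<String> words) {
        Map<String, List<Integer>> index = new LinkedHashMap<>();
        for (int i = 0; i < words.size(); i++) {
            String key = words.get(i).toLowerCase(Locale.ROOT);
            index.computeIfAbsent(key, k -> new ArrayList<>()).add(i);
        }
        return index;
    }

    public static void main(String[] args) {
        // Stand-in for the words extracted from a page rectangle (assumed input).
        List<String> words = Arrays.asList("Extract", "text", "from", "PDF", "text");
        Map<String, List<Integer>> index = buildIndex(words);
        System.out.println(index);
    }
}
```

Lower-casing the keys makes lookups case-insensitive, which is a common choice for a word index but an assumption here rather than documented behaviour of this example class.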


Field Summary
static boolean isTest
          used in our regression tests to limit extraction to the first 10 pages
static boolean outputMessages
          flag controlling whether messages are printed
 
Constructor Summary
ExtractTextAsWordlist()
           
ExtractTextAsWordlist(byte[] array)
          example method to open a file supplied as a byte array and extract the raw text
ExtractTextAsWordlist(java.lang.String file_name)
          example method to open a file and extract the raw text
 
Method Summary
 int getWordsExtractedCount()
          returns the number of words extracted.
static void main(java.lang.String[] args)
          main routine which checks for any files passed and runs the demo
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

outputMessages

public static boolean outputMessages
flag controlling whether messages are printed


isTest

public static boolean isTest
used in our regression tests to limit extraction to the first 10 pages
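One plausible way a flag like this could cap the work done is sketched below. This is an assumption about how such a limit might be applied, not the actual code of this class; PageLimitSketch, pagesToProcess and TEST_PAGE_LIMIT are hypothetical names.

```java
// Hypothetical illustration of a test-mode page cap; not taken from the JPedal source.
class PageLimitSketch {

    // Assumed cap, matching the "first 10 pages" in the field description.
    static final int TEST_PAGE_LIMIT = 10;

    // Returns how many pages to process: all of them normally,
    // at most TEST_PAGE_LIMIT when running in test mode.
    static int pagesToProcess(int pageCount, boolean isTest) {
        return isTest ? Math.min(pageCount, TEST_PAGE_LIMIT) : pageCount;
    }

    public static void main(String[] args) {
        System.out.println(pagesToProcess(25, true));
        System.out.println(pagesToProcess(25, false));
    }
}
```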

Constructor Detail

ExtractTextAsWordlist

public ExtractTextAsWordlist()

ExtractTextAsWordlist

public ExtractTextAsWordlist(java.lang.String file_name)
example method to open a file and extract the raw text


ExtractTextAsWordlist

public ExtractTextAsWordlist(byte[] array)
example method to open a file supplied as a byte array and extract the raw text

Method Detail

main

public static void main(java.lang.String[] args)
main routine which checks for any files passed and runs the demo


getWordsExtractedCount

public int getWordsExtractedCount()
returns the number of words extracted. We use this in some tests.