This Java program gets all the words in a PDF document and echoes them to the console using Qoppa’s jPDFText library.

import java.util.Vector;
import com.qoppa.pdfText.PDFText;

// Load the document (the second argument is the password; null means none)
PDFText pdfText = new PDFText("input.pdf", null);

// Get the words in the document
Vector<String> wordList = pdfText.getWords();

// Echo each word to the console
for (String word : wordList)
{
    System.out.println(word);
}
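
Once the words are available as a list, they can be processed like any other Java collection. The following is a minimal sketch, not part of the jPDFText sample: it counts how often each word occurs, ignoring case. The class name WordCounts and the method countWords are hypothetical, and the hard-coded list only stands in for the list returned by pdfText.getWords(), so the sketch runs without the library.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration; not part of jPDFText.
class WordCounts {
    // Counts case-insensitive occurrences of each word, keeping first-seen order.
    static Map<String, Integer> countWords(List<String> words) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : words) {
            counts.merge(word.toLowerCase(Locale.ROOT), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Stand-in data (assumption); in practice, pass the list returned by pdfText.getWords().
        List<String> words = Arrays.asList("PDF", "text", "pdf", "words");
        countWords(words).forEach((word, count) -> System.out.println(word + ": " + count));
    }
}
```

A LinkedHashMap is used here so that the counts are printed in the order the words first appear in the document.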

Java Program to Get Words in a PDF
GetWordList.java_.txt