This Java program extracts the text of each page in a PDF document and writes each page's text to its own file, using Qoppa's jPDFText library.
// Load the document
PDFText pdfText = new PDFText("input.pdf", null);

// Loop through the pages
for (int pageIx = 0; pageIx < pdfText.getPageCount(); ++pageIx)
{
    // Get the text for the page
    String pageText = pdfText.getText(pageIx);

    // Save the text in a file
    FileWriter output = new FileWriter("output_" + pageIx + ".txt");
    output.write(pageText);
    output.close();
}
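The file-writing step in the loop can also be isolated into a small helper. This is a minimal sketch, not part of jPDFText: the class name `PageTextWriter`, the method `writePages`, and the `outputDir` parameter are hypothetical names introduced for illustration. It assumes the page strings have already been obtained (for example from `pdfText.getText(pageIx)`), and it writes them as UTF-8 rather than relying on the platform default encoding that `FileWriter` uses.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; it does not call jPDFText.
class PageTextWriter {

    // Writes pageTexts.get(i) to <outputDir>/output_<i>.txt as UTF-8
    // and returns the paths that were written, in page order.
    static List<Path> writePages(List<String> pageTexts, Path outputDir) throws IOException {
        // Make sure the target directory exists before writing into it
        Files.createDirectories(outputDir);

        List<Path> written = new ArrayList<>();
        for (int pageIx = 0; pageIx < pageTexts.size(); ++pageIx) {
            Path target = outputDir.resolve("output_" + pageIx + ".txt");
            // Files.write opens, writes, and closes the file in one call,
            // so no handle is left open if the write fails
            Files.write(target, pageTexts.get(pageIx).getBytes(StandardCharsets.UTF_8));
            written.add(target);
        }
        return written;
    }
}
```

One way to use it would be to collect each page's text into a `List<String>` inside the loop and call `PageTextWriter.writePages(pages, Paths.get("."))` once afterwards. That keeps the extraction and the file output separate.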