Here is a Java sample program that uses Qoppa's jPDFText library to determine whether a PDF file contains any text content. The method "findTextInPDF" returns true if text was found on at least one page of the PDF, and false if no page contains text.
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

import com.qoppa.pdf.PDFException;
import com.qoppa.pdfText.PDFText;

public static boolean findTextInPDF(String absoluteFilePath) throws PDFException, FileNotFoundException, IOException
{
    // open the file; try-with-resources closes the stream even if an exception is thrown
    try (InputStream inputStream = new FileInputStream(absoluteFilePath))
    {
        // create a PDFText object from the PDF loaded from the filepath provided
        // (null means no password is supplied)
        PDFText pdfText = new PDFText(inputStream, null);

        // get the number of pages in this PDF
        int pageCount = pdfText.getPageCount();

        // loop through all the pages
        for (int i = 0; i < pageCount; i++)
        {
            // get the text content from the current page
            String pageText = pdfText.getText(i);

            // if the text content is not empty, stop at the first page that has text
            if (pageText != null && pageText.trim().length() > 0)
            {
                return true;
            }
        }
    }

    // no page contained any text
    return false;
}
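The page loop in the sample above can be tried out without a PDF file or the library on the classpath. The sketch below is only an illustration of the same check: the class name TextPresenceCheck, the method anyPageHasText and the IntFunction parameter are hypothetical names chosen for this example and are not part of jPDFText. The function argument stands in for a call such as pdfText.getText(i).

```java
import java.util.function.IntFunction;

public class TextPresenceCheck {

    // Returns true as soon as one page yields text that is not null and not only whitespace.
    // pageTextAt is a hypothetical stand-in for a per-page lookup such as pdfText.getText(i).
    public static boolean anyPageHasText(int pageCount, IntFunction<String> pageTextAt) {
        for (int i = 0; i < pageCount; i++) {
            String pageText = pageTextAt.apply(i);
            if (pageText != null && !pageText.trim().isEmpty()) {
                return true; // stop at the first page with text
            }
        }
        return false; // every page was null, empty, or whitespace only
    }

    public static void main(String[] args) {
        // made-up page contents, used only to exercise the loop
        String[] blankPages = { "", "   ", null };
        String[] mixedPages = { "", "some text", "" };

        System.out.println(anyPageHasText(blankPages.length, i -> blankPages[i])); // prints false
        System.out.println(anyPageHasText(mixedPages.length, i -> mixedPages[i])); // prints true
    }
}
```

Keeping the check separate from the file handling is a design choice for this sketch only; it makes the "null, empty, or whitespace counts as no text" rule easy to verify on its own.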