This Java sample shows how to search for text in a PDF and add text highlights or other text markup annotations (underline, strikeout, squiggly) on top of the matching text using jPDFProcess, Qoppa’s PDF library.
// Open the document
PDFDocument inDoc = new PDFDocument("c:/input.pdf", null);

// Loop through the pages, searching for text
for (int pageIx = 0; pageIx < inDoc.getPageCount(); ++pageIx)
{
    // Search for the text in a page
    PDFPage page = inDoc.getPage(pageIx);
    Vector<TextPosition> searchResults = page.findText("the", false, false);
    System.out.println("Page " + pageIx + " - Found " + searchResults.size() + " instances");

    if (searchResults.size() > 0)
    {
        for (int count = 0; count < searchResults.size(); ++count)
        {
            // Get the position of the text
            TextPosition textPos = (TextPosition)searchResults.get(count);

            // Vector for annotation quadrilateral bounds
            Vector<Point2D[]> quadList = new Vector<Point2D[]>();
            quadList.add(textPos.getQuadrilateral());

            // Create markup annotation and add it to the page
            // subtype can be Highlight, Underline, StrikeOut, Squiggly
            TextMarkup markup = inDoc.getAnnotationFactory().createTextMarkup("Test Markup", quadList, "Highlight");
            markup.setColor(Color.yellow);
            page.addAnnotation(markup);
        }
    }
}

// Save the annotated document
inDoc.saveDocument("c:/output.pdf");
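Each search result in the sample above is described by a quadrilateral, passed to the annotation factory as a Point2D[]. When debugging, it can help to reduce such a quadrilateral to a simple bounding rectangle, for instance to log where a match was found. The following is a minimal sketch using only standard Java classes; QuadBounds and boundsOf are hypothetical names, not part of jPDFProcess, and the sketch assumes the array holds the corner points of the quadrilateral.

```java
import java.awt.geom.Point2D;
import java.awt.geom.Rectangle2D;

// Hypothetical helper, not part of jPDFProcess.
class QuadBounds {

    // Returns the smallest axis-aligned rectangle that contains
    // every corner point of the given quadrilateral.
    static Rectangle2D boundsOf(Point2D[] quad) {
        double minX = Double.POSITIVE_INFINITY;
        double minY = Double.POSITIVE_INFINITY;
        double maxX = Double.NEGATIVE_INFINITY;
        double maxY = Double.NEGATIVE_INFINITY;
        for (Point2D p : quad) {
            minX = Math.min(minX, p.getX());
            minY = Math.min(minY, p.getY());
            maxX = Math.max(maxX, p.getX());
            maxY = Math.max(maxY, p.getY());
        }
        return new Rectangle2D.Double(minX, minY, maxX - minX, maxY - minY);
    }

    public static void main(String[] args) {
        // Made-up corner points standing in for a search result
        Point2D[] quad = {
            new Point2D.Double(72, 700),
            new Point2D.Double(100, 700),
            new Point2D.Double(100, 712),
            new Point2D.Double(72, 712)
        };
        System.out.println(boundsOf(quad));
    }
}
```

In the sample, the array returned by textPos.getQuadrilateral() could be passed to boundsOf in the same way.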
Unicode Support
It is possible to search for CJK characters, for instance:
Vector<TextPosition> searchResults = page.findText("电压", false, false);
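The Java source file containing the search string must be saved in UTF-8 encoding and compiled with the same encoding (for example, javac -encoding UTF-8), so that the characters in the string literal are read correctly.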
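If controlling the source file encoding is not practical, one alternative is to write the search string with Java Unicode escapes, which keeps the source file pure ASCII. The sketch below uses standard Java only; the class and constant names are made up for illustration. The two escapes are the code points of the characters 电 (U+7535) and 压 (U+538B).

```java
// Hypothetical holder class, not part of jPDFProcess.
class UnicodeSearchTerm {

    // Two-character CJK search string (U+7535 followed by U+538B),
    // written with Unicode escapes so the source file stays ASCII.
    static final String SEARCH_TERM = "\u7535\u538b";

    public static void main(String[] args) {
        // Print the length and the code points to confirm the string content
        System.out.println("length = " + SEARCH_TERM.length());
        for (int i = 0; i < SEARCH_TERM.length(); i++) {
            System.out.println(String.format("U+%04X", (int) SEARCH_TERM.charAt(i)));
        }
    }
}
```

The constant can then be passed to page.findText in place of the literal.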
Markup Annotation Subtypes
There are 4 subtypes of text markup annotations:
- Text Markup Highlight
- Text Markup Underline
- Text Markup StrikeOut
- Text Markup Squiggly
To switch from text highlights to text underline markups in the sample code above, change the subtype passed to the createTextMarkup call from “Highlight” to “Underline”.
// Create markup annotation and add it to the page
TextMarkup markup = inDoc.getAnnotationFactory().createTextMarkup("Test Markup", quadList, "Underline");
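Because the subtype is passed as a plain string, a typo such as "Strikeout" instead of "StrikeOut" would not be caught by the compiler. One possible way to guard against this is to keep the four subtype names in a small enum. This is only a sketch: MarkupSubtype and pdfName() are hypothetical names, not part of jPDFProcess, and the string values are the ones listed in the sample's comment.

```java
// Hypothetical enum, not part of jPDFProcess: collects the four
// subtype strings mentioned in the sample in one place.
enum MarkupSubtype {
    HIGHLIGHT("Highlight"),
    UNDERLINE("Underline"),
    STRIKE_OUT("StrikeOut"),
    SQUIGGLY("Squiggly");

    private final String pdfName;

    MarkupSubtype(String pdfName) {
        this.pdfName = pdfName;
    }

    // The subtype string to pass as the third createTextMarkup argument
    String pdfName() {
        return pdfName;
    }
}

class MarkupSubtypeDemo {
    public static void main(String[] args) {
        // List every subtype string
        for (MarkupSubtype subtype : MarkupSubtype.values()) {
            System.out.println(subtype.pdfName());
        }
    }
}
```

With this in place, the call in the sample could pass MarkupSubtype.UNDERLINE.pdfName() instead of the literal "Underline".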