This Java sample shows how to search for text in a PDF and add text markup annotations (highlight, underline, strikeout, squiggly) on top of each match using Qoppa’s PDF library jPDFProcess.

// Imports assume the standard jPDFProcess package layout
import java.awt.Color;
import java.awt.geom.Point2D;
import java.util.Vector;

import com.qoppa.pdf.TextPosition;
import com.qoppa.pdf.annotations.TextMarkup;
import com.qoppa.pdfProcess.PDFDocument;
import com.qoppa.pdfProcess.PDFPage;

public class HighlightText
{
    public static void main(String[] args) throws Exception
    {
        // Open the document
        PDFDocument inDoc = new PDFDocument("c:/input.pdf", null);

        // Loop through the pages, searching for text
        for (int pageIx = 0; pageIx < inDoc.getPageCount(); ++pageIx)
        {
            // Search for the text in a page
            PDFPage page = inDoc.getPage(pageIx);
            Vector<TextPosition> searchResults = page.findText("the", false, false);
            System.out.println("Page " + pageIx + " - Found " + searchResults.size() + " instances");

            // The loop body is skipped when the page has no matches
            for (TextPosition textPos : searchResults)
            {
                // Vector for annotation quadrilateral bounds,
                // holding the position of the matched text
                Vector<Point2D[]> quadList = new Vector<Point2D[]>();
                quadList.add(textPos.getQuadrilateral());

                // Create markup annotation and add it to the page
                // subtype can be Highlight, Underline, StrikeOut, Squiggly
                TextMarkup markup = inDoc.getAnnotationFactory().createTextMarkup("Test Markup", quadList, "Highlight");
                markup.setColor(Color.yellow);
                page.addAnnotation(markup);
            }
        }

        // Save the annotated document under a new name
        inDoc.saveDocument("c:/output.pdf");
    }
}
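
Each match in the sample above is described by a quadrilateral passed around as a Point2D[]. When logging or debugging search results it can help to reduce that quadrilateral to a plain bounding rectangle. The following is a minimal sketch using only java.awt.geom; the class QuadBounds and the method boundsOf are hypothetical helpers, not part of jPDFProcess, and the sketch only assumes that the array holds the corner points of the quadrilateral, in any order.

```java
import java.awt.geom.Point2D;
import java.awt.geom.Rectangle2D;

// Hypothetical helper, not part of jPDFProcess
final class QuadBounds
{
    // Returns the smallest axis-aligned rectangle containing all corners
    static Rectangle2D boundsOf(Point2D[] quad)
    {
        // Start with an empty rectangle at the first corner, then grow it
        Rectangle2D bounds = new Rectangle2D.Double(quad[0].getX(), quad[0].getY(), 0, 0);
        for (Point2D corner : quad)
        {
            bounds.add(corner);
        }
        return bounds;
    }

    public static void main(String[] args)
    {
        // Made-up corner coordinates, for illustration only
        Point2D[] quad = {
            new Point2D.Double(10, 20),
            new Point2D.Double(60, 20),
            new Point2D.Double(60, 32),
            new Point2D.Double(10, 32)
        };
        System.out.println(boundsOf(quad));
    }
}
```

In the search loop, the same helper could be called as QuadBounds.boundsOf(textPos.getQuadrilateral()) to print where each match was found.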

Unicode Support

It is possible to search for CJK characters, for instance:

Vector<TextPosition> searchResults = page.findText("电压", false, false);

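Your Java source file must be saved using UTF-8 encoding and compiled as UTF-8 (for example with javac -encoding UTF-8), otherwise the characters in the string literal will not be read correctly.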
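
If saving the source file as UTF-8 is not practical, one alternative is to write the search string with Java Unicode escapes, which keeps the source file pure ASCII. The sketch below is not part of jPDFProcess: the class UnicodeEscapes and the method toEscapes are hypothetical helpers that print the escaped form of a string so that it can be pasted into the findText call.

```java
// Hypothetical helper, not part of jPDFProcess: converts a string to
// Java Unicode escapes so the source file can stay pure ASCII.
final class UnicodeEscapes
{
    // Returns one escape sequence per UTF-16 code unit of the text
    static String toEscapes(String text)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); ++i)
        {
            sb.append(String.format("\\u%04x", (int) text.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args)
    {
        // Escaped form of the two CJK characters searched for above
        String searchText = "\u7535\u538b";
        System.out.println(toEscapes(searchText));
    }
}
```

With the escaped form, the search call becomes page.findText("\u7535\u538b", false, false), which the compiler treats the same as the literal characters regardless of the file encoding.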

Markup Annotation Subtypes

There are 4 subtypes of text markup annotations:

  • Text Markup Highlight
  • Text Markup Underline
  • Text Markup StrikeOut
  • Text Markup Squiggly

To change from text highlights to text underline markups in the sample code above, simply change the subtype in the createTextMarkup call from "Highlight" to "Underline".

// Create markup annotation and add it to the page
TextMarkup markup = inDoc.getAnnotationFactory().createTextMarkup("Test Markup", quadList, "Underline");
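
Because the subtype is passed as a plain string, a typo such as "Strikeout" instead of "StrikeOut" would not be caught by the compiler. One way to guard against that is to keep the four names in a small enum. This is only a sketch: MarkupSubtype and pdfName are hypothetical names, not part of jPDFProcess, and the strings are the four subtype values listed in the sample code above.

```java
// Hypothetical helper, not part of jPDFProcess: names the four subtype
// strings used with createTextMarkup in the sample above.
enum MarkupSubtype
{
    HIGHLIGHT("Highlight"),
    UNDERLINE("Underline"),
    STRIKE_OUT("StrikeOut"),
    SQUIGGLY("Squiggly");

    private final String pdfName;

    MarkupSubtype(String pdfName)
    {
        this.pdfName = pdfName;
    }

    // The string to pass as the subtype argument
    String pdfName()
    {
        return pdfName;
    }

    public static void main(String[] args)
    {
        // List every subtype string
        for (MarkupSubtype subtype : values())
        {
            System.out.println(subtype.pdfName());
        }
    }
}
```

The call in the sample would then read createTextMarkup("Test Markup", quadList, MarkupSubtype.UNDERLINE.pdfName()).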