OCR Language Download Links
Required Data File for All Languages
Orientation and script detection
Common Languages
English – English
French – Français
German – Deutsch
Spanish – Español
Italian – Italiano
Chinese (Simplified) – 中文简体
Chinese (Traditional) – 中文繁體
All Other Languages – This file contains all the languages available (large file)
tessdata_fast.zip
Articles Tagged: PDF OCR
Creating Searchable PDF from Image Files
Q: Can we convert image files into searchable PDF documents by performing OCR with Qoppa’s Java PDF library?
A: Yes, you can do that using jPDFProcess.
1. Convert Images to PDF Pages
The first step is to create a PDF from the images:
// create a new PDF document
PDFDocument pdfDoc = new PDFDocument();
// […]
How to add OCR to jPDFProcess
jPDFProcess, Qoppa’s Java PDF creation and manipulation library, has an OCR module. Please contact us regarding licensing this additional feature.
How to Activate / Implement OCR
To get started, download the latest jPDFProcess version from here: https://www.qoppa.com/pdfprocess/demo/download
Then download the JNI native bridge files from here: https://www.qoppa.com/files/pdfprocess/ocr/libtessjni411.zip
The JNI zip file contains the […]
Activate OCR in jPDFEditor
As of version 2013R2, jPDFEditor, Qoppa’s Java PDF editing component, has an optional OCR function available. OCR is also available in jPDFNotes and the steps for integration are the same as for jPDFEditor. Follow the instructions below to add an “OCR” button to the toolbar so your users can perform OCR on PDF documents open in Qoppa’s visual […]
Java PDF OCR Library SDK
Qoppa offers a PDF OCR solution for Java which supports most languages, including English, German, French, and Spanish as well as Chinese, Japanese and Korean. It is available for Windows®, Mac OS X® and Linux®, in both 32-bit and 64-bit versions. This is a clean, production-level Java integration of the well-known Tesseract engine with Qoppa’s own advanced […]
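Because the OCR module is built on Tesseract, each language needs its trained data file on disk before OCR can run. The sketch below is not Qoppa code and does not call any Qoppa API. It uses only the Java standard library to check that a tessdata folder holds the files for the languages listed on this page. The language codes (`eng`, `fra`, `osd`, and so on) are the standard Tesseract codes. The class name `TessdataCheck`, the default folder name `tessdata`, and the assumption that the downloaded files use the usual `<code>.traineddata` naming are illustrative only.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of any Qoppa library.
public class TessdataCheck {

    // Display name -> standard Tesseract language code.
    static final Map<String, String> LANGUAGE_CODES = new LinkedHashMap<>();
    static {
        LANGUAGE_CODES.put("Orientation and script detection", "osd");
        LANGUAGE_CODES.put("English", "eng");
        LANGUAGE_CODES.put("French", "fra");
        LANGUAGE_CODES.put("German", "deu");
        LANGUAGE_CODES.put("Spanish", "spa");
        LANGUAGE_CODES.put("Italian", "ita");
        LANGUAGE_CODES.put("Chinese (Simplified)", "chi_sim");
        LANGUAGE_CODES.put("Chinese (Traditional)", "chi_tra");
    }

    // Tesseract data files are conventionally named <code>.traineddata.
    static String fileNameFor(String code) {
        return code + ".traineddata";
    }

    // Returns the expected data file names that are not present in the folder.
    static List<String> missingFiles(Path tessdataDir) {
        List<String> missing = new ArrayList<>();
        for (String code : LANGUAGE_CODES.values()) {
            String name = fileNameFor(code);
            if (!Files.isRegularFile(tessdataDir.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumed default location; pass the real folder as the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : "tessdata");
        List<String> missing = missingFiles(dir);
        if (missing.isEmpty()) {
            System.out.println("All expected language files are present in " + dir);
        } else {
            System.out.println("Missing from " + dir + ": " + missing);
        }
    }
}
```

Running a check like this before initializing OCR may make a missing language file easier to diagnose than an error raised later by the native engine. Trim the map to the languages your application actually needs.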