[INTERNAL] How Invisible Text Works in IronOCR’s SaveAsSearchablePdf()

Making PDFs Searchable with Invisible Text

When working with scanned documents or images, the text inside them is not selectable or searchable. IronOCR's SaveAsSearchablePdf() method solves this problem by placing invisible text over the original content, which makes the PDF searchable while keeping its appearance unchanged.

What is a Searchable PDF?

A searchable PDF is a document where you can select and search for text, even if the original file was just an image. This is useful for scanned documents, receipts, or old books where the text is stored as an image rather than actual characters.

How Does It Work?

IronOCR runs Optical Character Recognition (OCR) on the document and identifies the text within its images. It then overlays the recognized text as invisible text on top of the original document, positioned over the words it was read from. The document looks the same, but users can now search, copy, and highlight the text.
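The overlay idea can be illustrated with the standard PDF mechanism for invisible text: text rendering mode 3 (the "3 Tr" operator), which draws glyphs with neither fill nor stroke. The sketch below is a minimal illustration only. It assumes IronOCR does something comparable, which this article does not confirm, and the OcrWord type and both function names are hypothetical and not part of the IronOCR API.

```python
from dataclasses import dataclass


@dataclass
class OcrWord:
    """Hypothetical OCR result: a recognized word and where it sits on the page."""
    text: str
    x: float          # left edge, in PDF points
    y: float          # baseline, in PDF points
    font_size: float  # size chosen so the invisible word covers the printed one


def escape_pdf_string(text: str) -> str:
    """Escape the characters that are special inside a PDF literal string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def invisible_text_stream(words, font_name="F1"):
    """Build a PDF content-stream fragment that places each word invisibly.

    "3 Tr" selects text rendering mode 3 (no fill, no stroke), so the text
    is present for search and selection but nothing is painted on the page.
    """
    lines = ["BT", "3 Tr"]
    for w in words:
        lines.append(f"/{font_name} {w.font_size:g} Tf")  # font and size
        lines.append(f"1 0 0 1 {w.x:g} {w.y:g} Tm")        # absolute position
        lines.append(f"({escape_pdf_string(w.text)}) Tj")  # show the text
    lines.append("ET")
    return "\n".join(lines)


if __name__ == "__main__":
    print(invisible_text_stream([OcrWord("Total (USD)", 72, 700, 12)]))
```

Because the fragment is drawn on top of the page image without painting anything, the scanned appearance is untouched while a PDF viewer can still match and select the words.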

When is Invisible Text Added?

  • If a document is entirely an image (e.g., a scanned receipt or a photo of a page), IronOCR adds invisible text to all recognized words.
  • If a document already has searchable text, IronOCR skips those sections and only applies OCR to unsearchable areas, such as images or handwritten notes.
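The two cases above amount to a simple filter: only regions without an existing text layer are sent to OCR. The following is a minimal sketch of that rule. The Region type and the regions_needing_ocr function are hypothetical names for illustration, and how IronOCR actually detects existing text is not described in this article.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical page region used only for this illustration."""
    label: str
    has_text_layer: bool  # True if the PDF already stores real characters here


def regions_needing_ocr(regions):
    """Return the regions that have no searchable text yet."""
    return [r for r in regions if not r.has_text_layer]


if __name__ == "__main__":
    page = [
        Region("body paragraph", has_text_layer=True),
        Region("embedded chart image", has_text_layer=False),
        Region("handwritten note", has_text_layer=False),
    ]
    for region in regions_needing_ocr(page):
        print(region.label)
```

A fully scanned page is just the case where every region lacks a text layer, so every recognized word receives invisible text.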

Example

In the image below, the document has both searchable and unsearchable text. IronOCR only applies OCR to the unsearchable parts, such as text inside images. The red rectangles highlight five areas where invisible text has been added in the final searchable PDF.


[Image: document with red rectangles marking the five areas where invisible text was added]

By using this feature, users can convert non-searchable PDFs into fully searchable documents without altering their original layout or appearance.