OCR PDF — Extract Text from Scanned PDFs Free

Run optical character recognition (OCR) on any scanned or image-only PDF to add a searchable text layer, free and without sending the file to a server. The tool is powered by Tesseract.js, which runs entirely in your browser via WebAssembly. The result is a searchable PDF in which you can select, copy and find text.

To make a scanned PDF searchable for free, run OCR to add an invisible text layer behind the scanned image. On ihatepdf, select the scanned PDF and Tesseract recognises the text in your browser, producing a PDF you can search, select and copy from while it still looks identical. There is no charge, no sign-up and no watermark, and your document never leaves your device.

✓ 100% Free ✓ No Watermark ✓ No Sign-Up ✓ No Upload — Files Stay on Your Device ✓ Works Offline

How to use OCR PDF

Upload your scanned PDF: Go to ihatepdf.cv/ocr-pdf and choose your image-based or scanned PDF. The file is opened in your browser and is not sent to a server.
Select language: Choose the primary language of the document for best OCR accuracy.
Run OCR: Click "Make Searchable". Tesseract.js processes each page locally, so processing time depends on the number of pages.
Download: Download the PDF with an embedded text layer. Text is now selectable, copyable, and searchable.

What is OCR and why does your scanned PDF need it?

Optical Character Recognition (OCR) is the technology that converts page images into machine-readable text. When you scan a physical document, the scanner creates an image of each page — like a photograph. Without OCR, the PDF contains no text data: you cannot search it, copy from it, or use it in any text-processing workflow. Running OCR adds an invisible text layer that mirrors the visual text in the image, making the document fully searchable, selectable, and accessible to screen readers.
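To make the "invisible text layer" idea concrete, here is a minimal sketch of how one recognised word could be positioned so that it lines up with the scanned image. It assumes the OCR engine reports a bounding box in image pixels with the origin at the top-left, while PDF page coordinates are measured in points (72 per inch) from the bottom-left. The `OcrWord` and `PdfTextPlacement` shapes and the `toPdfPlacement` helper are hypothetical names for illustration, not this tool's actual code.

```typescript
// Hypothetical shape for one recognised word (assumption for illustration).
interface OcrWord {
  text: string;
  // Bounding box in image pixels, origin at the top-left corner.
  x0: number;
  y0: number;
  x1: number;
  y1: number;
}

// Hypothetical description of where the invisible text should be drawn.
interface PdfTextPlacement {
  text: string;
  x: number; // left edge, in PDF points
  y: number; // bottom edge, in PDF points (origin at the bottom-left)
  fontSize: number; // approximate font size, in points
}

const POINTS_PER_INCH = 72;

// Convert a length in scan pixels to PDF points at the given scan DPI.
function pxToPt(px: number, dpi: number): number {
  return (px * POINTS_PER_INCH) / dpi;
}

// Map a pixel-space bounding box onto PDF page coordinates.
function toPdfPlacement(
  word: OcrWord,
  imageHeightPx: number,
  dpi: number
): PdfTextPlacement {
  return {
    text: word.text,
    x: pxToPt(word.x0, dpi),
    // Flip the vertical axis: image y grows downward, PDF y grows upward.
    y: pxToPt(imageHeightPx - word.y1, dpi),
    // Use the box height as a rough stand-in for the font size.
    fontSize: pxToPt(word.y1 - word.y0, dpi),
  };
}

// A word on a US Letter page scanned at 300 DPI (2550 x 3300 pixels).
const placement = toPdfPlacement(
  { text: "Invoice", x0: 300, y0: 150, x1: 600, y1: 200 },
  3300,
  300
);
console.log(placement); // x: 72, y: 744, fontSize: 12
```

A PDF writer would then draw each word at that position without painting it, so the page looks unchanged while search and copy operate on the hidden text.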

Tips for best OCR accuracy

Scan at 300 DPI or higher: resolution has a large effect on recognition accuracy.
Use even lighting and keep pages straight: skewed pages confuse the character recognition engine.
Choose the correct document language in the settings.
For documents with mixed content (typed text plus handwriting), expect typed areas to be recognized accurately and handwritten sections less so.
After OCR, press Ctrl+F (Cmd+F on a Mac) in any PDF viewer to confirm the text is searchable.

Frequently asked questions

How accurate is the text recognition?

Clean, high-resolution scans of typed text typically achieve 95–99% accuracy with Tesseract. Handwritten text, low-resolution scans, and unusual fonts have lower accuracy.
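The page does not say how accuracy is measured. One common convention is character-level accuracy: one minus the edit distance between the OCR output and a hand-checked transcript, divided by the transcript length. Under that assumed definition, 95% accuracy means roughly one wrong character in every 20. The sketch below shows the calculation; `editDistance` and `characterAccuracy` are illustrative names.

```typescript
// Levenshtein edit distance: the minimum number of single-character
// insertions, deletions and substitutions needed to turn `a` into `b`.
// Characters are compared as UTF-16 code units.
function editDistance(a: string, b: string): number {
  // prev[j] holds the distance between the first i-1 chars of a
  // and the first j chars of b.
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);

  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // delete from a
        curr[j - 1] + 1, // insert into a
        prev[j - 1] + cost // substitute, or match at no cost
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Character accuracy relative to a trusted transcript, clamped to [0, 1].
function characterAccuracy(truth: string, ocr: string): number {
  if (truth.length === 0) return ocr.length === 0 ? 1 : 0;
  return Math.max(0, 1 - editDistance(truth, ocr) / truth.length);
}

// One misread character ("o" read as "0") in an 11-character string.
console.log(characterAccuracy("hello world", "hell0 world").toFixed(3));
```

Checking a page or two this way against a manual transcript gives a quick estimate of how much proofreading a particular scan will need.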

Is my file uploaded to a server?

No. OCR runs locally in your browser using Tesseract.js via WebAssembly. Your file never leaves your device.

What languages does OCR support?

Tesseract supports 100+ languages including English, French, German, Spanish, Arabic, Chinese, Hindi, and many more.
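Tesseract identifies each language by a short code for its trained data file, and several codes can be joined with "+" when a document mixes languages. The sketch below maps the languages named above to those codes. The lookup table and the `toTesseractLangs` helper are illustrative assumptions; how this site's language menu maps to codes internally is not documented here.

```typescript
// Tesseract trained-data codes for the languages listed above.
// "Chinese" is assumed to mean Simplified Chinese (chi_sim).
const LANGUAGE_CODES: Record<string, string> = {
  English: "eng",
  French: "fra",
  German: "deu",
  Spanish: "spa",
  Arabic: "ara",
  Chinese: "chi_sim",
  Hindi: "hin",
};

// Build a Tesseract language string such as "eng+fra" from display names.
function toTesseractLangs(names: string[]): string {
  const codes = names.map((name) => {
    if (!Object.prototype.hasOwnProperty.call(LANGUAGE_CODES, name)) {
      throw new Error(`No language code known for "${name}"`);
    }
    return LANGUAGE_CODES[name];
  });
  // Drop duplicates while keeping the caller's order.
  const unique = codes.filter((code, i) => codes.indexOf(code) === i);
  return unique.join("+");
}

console.log(toTesseractLangs(["English", "French"])); // eng+fra
```

Listing only the languages actually present in the document is usually the safer choice, since each extra language adds recognition work.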

Will OCR change the visual appearance of the PDF?

No. The original page appearance is preserved exactly. OCR adds an invisible text layer underneath the page image so the text can be searched and copied.

What DPI scan quality gives best OCR results?

300 DPI is the recommended minimum for reliable OCR. 200 DPI usually works for clean documents. Below 150 DPI, accuracy drops significantly.
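If you only know a scan's pixel dimensions, its effective DPI is the pixel width divided by the physical page width in inches. The sketch below applies the thresholds from the answer above. The function names and the "marginal" label for the 150 to 200 DPI range, which the answer does not describe, are assumptions for illustration.

```typescript
type ScanQuality = "recommended" | "usually workable" | "marginal" | "poor";

// Effective resolution of a scan, given its pixel width and the
// physical width of the page that was scanned.
function effectiveDpi(widthPx: number, pageWidthInches: number): number {
  return widthPx / pageWidthInches;
}

// Thresholds follow the guidance above: 300+ recommended, 200+ usually
// fine for clean documents, below 150 a significant accuracy drop.
// The "marginal" band between 150 and 200 is an assumed label.
function classifyDpi(dpi: number): ScanQuality {
  if (dpi >= 300) return "recommended";
  if (dpi >= 200) return "usually workable";
  if (dpi >= 150) return "marginal";
  return "poor";
}

// A US Letter page (8.5 inches wide) scanned to 2550 pixels across.
const dpi = effectiveDpi(2550, 8.5);
console.log(dpi, classifyDpi(dpi)); // 300 "recommended"
```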

