Why do PDF files get corrupted?
PDF corruption happens more often than most people expect. The most common causes are interrupted downloads (a browser crash or network drop mid-transfer), storage device failure (bad sectors on a hard drive or a failing USB drive), email attachment encoding errors, incomplete saves when an application crashes, and file system errors after an improper shutdown. The result is the same: a PDF that Adobe Reader, Chrome, or any other viewer refuses to open.
What "corrupted PDF" actually means technically
A PDF file follows a structured binary format with several critical components that a viewer relies on to open it. When any of these is damaged, the file can become unreadable:
- Cross-reference table (xref) — a directory of every object in the file and its byte offset. If this is broken, the viewer can't locate any content
- Trailer dictionary — tells the viewer where the xref table starts and which object is the document root
- %%EOF marker — signals the end of the file. Missing in truncated files
- Stream lengths — each content stream declares how many bytes it contains. If the data changes without the declared length being updated, the viewer reads the wrong number of bytes and the stream becomes unreadable
- Catalog and page tree — the root objects that list all pages. Without these, no pages can be rendered
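To show how these components depend on each other, here is a minimal Python sketch that follows the same chain a viewer follows: from the end of the file to the xref table, then to the trailer, then to the catalog. The function name `trace_pdf_structure` and the report keys are illustrative assumptions, not part of any tool described here. The sketch only handles classic xref tables, not the xref streams that PDF 1.5 and later may use.

```python
import re


def trace_pdf_structure(data: bytes) -> dict:
    """Follow %%EOF -> startxref -> xref -> trailer /Root -> catalog object."""
    report = {"eof": False, "startxref": None, "xref_found": False,
              "root": None, "catalog_found": False}

    tail = data[-1024:]
    report["eof"] = b"%%EOF" in tail

    # The number after the last 'startxref' is the byte offset of the xref table.
    pointer = re.findall(rb"startxref\s+(\d+)", tail)
    if not pointer:
        return report
    offset = int(pointer[-1])
    report["startxref"] = offset

    # A classic table begins with the keyword 'xref' at exactly that offset.
    if data[offset:offset + 4] != b"xref":
        return report
    report["xref_found"] = True

    # The trailer dictionary follows the table and names the catalog via /Root.
    trailer = re.search(rb"trailer\s*<<.*?/Root\s+(\d+)\s+(\d+)\s+R",
                        data[offset:], re.DOTALL)
    if not trailer:
        return report
    number, generation = int(trailer.group(1)), int(trailer.group(2))
    report["root"] = (number, generation)

    # Parse the table: "first count" lines open a subsection, and every
    # three-field line after that describes the next object number.
    entries = {}
    current = None
    for line in data[offset + 4:offset + trailer.start()].splitlines():
        parts = line.split()
        if len(parts) == 2:
            current = int(parts[0])
        elif len(parts) == 3 and current is not None:
            if parts[2] == b"n":  # 'n' marks an in-use object, 'f' a free one
                entries[current] = int(parts[0])
            current += 1

    # The catalog's recorded offset should land on its "N G obj" header.
    catalog_offset = entries.get(number)
    if catalog_offset is not None:
        header = rb"%d\s+%d\s+obj" % (number, generation)
        chunk = data[catalog_offset:catalog_offset + 64]
        report["catalog_found"] = re.match(header, chunk) is not None
    return report
```

Each step depends on the one before it, so damage early in the chain (for example a `startxref` offset that no longer points at `xref`) hides everything after it, even when the objects themselves are intact.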
How to repair a corrupted PDF free online
- Open ihatepdf.cv/repair-pdf in your browser
- Drag and drop the damaged PDF onto the upload area
- Click Scan & Repair PDF
- Watch the forensic analysis and repair log in real time
- If recovery succeeds, click Download Repaired PDF — no watermark
The entire process runs locally in your browser. Your file is never uploaded anywhere.
The 5 repair strategies — explained
The repair engine tries five increasingly aggressive approaches, stopping as soon as one succeeds:
- Strategy 1 — Direct structural repair: Loads the PDF through pdf-lib's error-tolerant parser and immediately re-saves it. Fixes many minor corruption issues where the structure is mostly intact
- Strategy 2 — Stream length normalization: Scans every content stream, measures the actual byte length, and corrects any declared lengths that don't match. Fixes corruption where file edits shifted stream data without updating length values
- Strategy 3 — Manual xref reconstruction: Scans the raw binary for every object definition, builds a brand-new cross-reference table from scratch, and appends it to the file. Fixes files where the xref was completely wiped but objects are still present
- Strategy 4 — Truncation recovery: Strips everything after the last valid endobj marker, then rebuilds the xref from the surviving object data. Fixes files that were cut off mid-write
- Strategy 5 — Page-by-page salvage: Attempts to copy each page individually into a new blank document, skipping pages that throw errors. Recovers as many pages as possible when the document structure is too damaged for a full repair
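Strategies 3 and 4 can be illustrated together, since both end by writing a new cross-reference table. The Python sketch below is an assumption about how such a rebuild might look, not the tool's actual code; `rebuild_xref` is a hypothetical name. It trims the data after the last `endobj`, scans for `N G obj` headers, and appends a new xref table and trailer. It assumes uncompressed objects (objects packed into PDF 1.5 object streams would not be found), it can be fooled by binary stream data that happens to resemble an object header, and it writes only `/Size` and `/Root` in the trailer, so entries such as `/Info` are dropped.

```python
import re

OBJ_HEADER = re.compile(rb"(\d+)\s+(\d+)\s+obj\b")
CATALOG = re.compile(rb"/Type\s*/Catalog\b")


def rebuild_xref(data: bytes) -> bytes:
    """Return a copy of `data` ending in a newly built xref table and trailer."""
    # Truncation recovery: discard everything after the last complete object.
    last_end = data.rfind(b"endobj")
    if last_end == -1:
        raise ValueError("no complete PDF objects found")
    body = data[:last_end + len(b"endobj")] + b"\n"

    # Map object number -> (byte offset, generation). A later definition of
    # the same number replaces an earlier one.
    objects = {}
    root = None
    for match in OBJ_HEADER.finditer(body):
        number, generation = int(match.group(1)), int(match.group(2))
        if number == 0:
            continue  # object 0 is reserved for the head of the free list
        objects[number] = (match.start(), generation)
        obj_end = body.find(b"endobj", match.end())
        if CATALOG.search(body, match.end(), obj_end):
            root = (number, generation)
    if root is None:
        raise ValueError("no catalog object found, cannot write a trailer")

    # Write one subsection per run of consecutive object numbers.
    # Every entry line is exactly 20 bytes long.
    numbers = sorted(objects)
    xref = [b"xref\n0 1\n0000000000 65535 f \n"]
    i = 0
    while i < len(numbers):
        j = i
        while j + 1 < len(numbers) and numbers[j + 1] == numbers[j] + 1:
            j += 1
        xref.append(b"%d %d\n" % (numbers[i], j - i + 1))
        for n in numbers[i:j + 1]:
            offset, generation = objects[n]
            xref.append(b"%010d %05d n \n" % (offset, generation))
        i = j + 1

    # startxref points at the new table, which begins where the body ends.
    trailer = (b"trailer\n<< /Size %d /Root %d %d R >>\nstartxref\n%d\n%%%%EOF\n"
               % (numbers[-1] + 1, root[0], root[1], len(body)))
    return body + b"".join(xref) + trailer
```

Because the rebuild works only from object headers it can actually find, an object that was cut off mid-write is not listed in the new table.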
What the forensic analysis tells you
Before attempting repair, the tool scans the file and reports a Recovery Probability score (0–100%) based on how much structure is still intact. It also lists specific detected issues — whether the xref is missing, whether %%EOF is present, whether the catalog root object exists — so you understand exactly what damage you're dealing with before repair begins.
A score above 80% means the file is mostly intact and repair is highly likely to succeed. Below 25% means significant damage with uncertain recovery.
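The exact formula behind the score is not given here, so the following Python sketch is purely hypothetical: it shows one way a structure-based score could be computed, by assigning a weight to each landmark and summing the weights of those that are present. The check names, the weights, and the function `estimate_recovery` are all assumptions made for illustration.

```python
import re

# Hypothetical weights that sum to 100. They are illustrative only and are
# not the weights used by any real tool.
CHECKS = [
    ("%PDF- header", 10,
     lambda d: b"%PDF-" in d[:1024]),
    ("object definitions", 30,
     lambda d: re.search(rb"\d+\s+\d+\s+obj\b", d) is not None),
    ("catalog root object", 25,
     lambda d: re.search(rb"/Type\s*/Catalog\b", d) is not None),
    ("xref table", 20,
     lambda d: re.search(rb"(?<!start)xref\b", d) is not None),
    ("%%EOF marker", 15,
     lambda d: b"%%EOF" in d[-1024:]),
]


def estimate_recovery(data: bytes):
    """Return (score from 0 to 100, list of detected issues)."""
    score = 0
    issues = []
    for label, weight, present in CHECKS:
        if present(data):
            score += weight
        else:
            issues.append(f"{label} missing")
    return score, issues
```

With these made-up weights, a truncated file that still has its header, objects, and catalog but has lost its xref table and %%EOF marker scores 65.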
What to do if all 5 strategies fail
If the repair engine exhausts all five strategies, the file most likely:
- Was never a valid PDF (wrong file type, renamed extension)
- Suffered catastrophic storage failure with no recoverable PDF objects
- Was corrupted during transfer with an encoding mismatch (binary sent as text)
- Is a zero-byte or near-zero-byte fragment with no content
In these cases, your best option is to contact whoever sent you the file and request a fresh copy, or check if a backup copy exists in your cloud storage (Google Drive, Dropbox, and OneDrive all keep version history).
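Before asking for a fresh copy, a quick check of the raw bytes can often tell you which of these cases applies. The Python sketch below is a rough illustration. The function name, the 64-byte cutoff, and the threshold of ten replacement characters are hypothetical choices, and the signature table covers only a few common formats.

```python
# Leading bytes of a few common non-PDF formats.
KNOWN_SIGNATURES = {
    b"PK\x03\x04": "ZIP container (for example .docx or .xlsx)",
    b"\xff\xd8\xff": "JPEG image",
    b"\x89PNG": "PNG image",
}

MIN_USEFUL_SIZE = 64  # hypothetical cutoff for a "near-zero-byte" fragment

# U+FFFD as UTF-8. It is left behind when binary data is decoded as text
# with invalid bytes replaced, then saved again.
REPLACEMENT_CHAR = "\ufffd".encode("utf-8")


def triage_unrepairable(data: bytes) -> str:
    """Guess why a file could not be repaired, checking the cheapest signs first."""
    if len(data) < MIN_USEFUL_SIZE:
        return "empty or near-empty fragment"

    if b"%PDF-" not in data[:1024]:
        for magic, kind in KNOWN_SIGNATURES.items():
            if data.startswith(magic):
                return f"not a PDF: looks like a {kind}"
        return "not a PDF: no %PDF- header"

    if data.count(REPLACEMENT_CHAR) >= 10:  # hypothetical threshold
        return "binary data was probably decoded as text in transit"

    if b"endobj" not in data:
        return "no recoverable PDF objects"

    return "no obvious cause found"
```

For example, a Word document renamed to `.pdf` starts with the ZIP signature, so the sketch reports it as a ZIP container rather than a damaged PDF.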
Frequently asked questions
Will the repaired PDF look exactly like the original?
In most cases, yes — especially for Strategy 1 and 2 repairs where the content streams are intact. For salvage repairs (Strategy 5), pages that couldn't be extracted will be missing from the output. The repair log tells you exactly how many pages were recovered.
Can it repair a password-protected PDF?
The engine loads files with encryption ignored where possible. If an encrypted file still fails to repair, remove the password first if you have it, then try the repair again.
Is there a file size limit?
There is no server-side limit, because nothing is uploaded; the constraint is your device's available memory. Files up to 150 MB work reliably on most desktop browsers.
Does the repaired file have a watermark?
No. ihatepdf never adds watermarks to any output file.