What hidden data can a PDF contain?
A PDF file is more than the text and images you see on screen. Embedded inside is metadata that is invisible during normal reading but fully accessible to anyone who knows where to look. Before sharing a PDF externally, it is worth checking exactly what data you would be sharing along with it.
Common hidden data found in PDFs:
- Author name: usually pulled from the OS user account or application profile at the time of creation
- Company name: usually taken from the software license registration
- Creation and modification timestamps: exact time down to the second, often with the time zone offset
- Software and version: which application created the PDF and its version number
- GPS coordinates: images embedded in the PDF may contain EXIF location data
- Tracked changes and comments: reviewer comments and text deleted in a later save may still be present
- Previous authors: earlier author names retained in the revision history
- Custom XMP properties: application-specific metadata fields
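Several of these fields sit in the PDF's document information dictionary as plain key/value pairs. The Python sketch below shows roughly how they can be read; it is not how any particular scanner works. It assumes the dictionary is uncompressed and uses simple literal strings. The function names `read_info_fields` and `parse_pdf_date` and the sample bytes are made up for illustration.

```python
import re
from datetime import datetime, timedelta, timezone

# Standard keys of the PDF document information dictionary.
INFO_KEYS = ("Author", "Creator", "Producer", "CreationDate", "ModDate")


def read_info_fields(pdf_bytes: bytes) -> dict[str, str]:
    """Collect literal-string values such as /Author (Jane Doe) from raw PDF bytes."""
    found = {}
    for key in INFO_KEYS:
        # Matches "/Key (value)". Escaped parentheses, hex strings, and
        # compressed objects are deliberately not handled in this sketch.
        match = re.search(rb"/" + key.encode() + rb"\s*\(([^()\\]*)\)", pdf_bytes)
        if match:
            found[key] = match.group(1).decode("latin-1")
    return found


def parse_pdf_date(value: str) -> datetime:
    """Parse a full PDF date such as D:20240131093015+02'00'."""
    match = re.fullmatch(r"D:(\d{14})(?:Z|([+-])(\d{2})'?(\d{2})?'?)?", value)
    if not match:
        raise ValueError(f"unsupported PDF date: {value!r}")
    stamp = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    sign, hours, minutes = match.group(2), match.group(3), match.group(4)
    if sign is None:
        # "Z" or no offset at all. Treating a missing offset as UTC is an assumption.
        return stamp.replace(tzinfo=timezone.utc)
    offset = timedelta(hours=int(hours), minutes=int(minutes or 0))
    if sign == "-":
        offset = -offset
    return stamp.replace(tzinfo=timezone(offset))


# Hypothetical sample: an uncompressed information dictionary.
sample = (
    b"<< /Author (Jane Doe) /Creator (Microsoft Word) "
    b"/Producer (Acrobat PDFMaker 23) /CreationDate (D:20240131093015+02'00') >>"
)
fields = read_info_fields(sample)
created = parse_pdf_date(fields["CreationDate"])
print(fields["Author"], created.isoformat())
```

A real tool would use a full PDF parser, since values can also be hex-encoded, stored in compressed object streams, or duplicated in the XMP packet.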
How to scan a PDF for hidden data for free, step by step
- Open ihatepdf.cv/privacy-scanner (no sign-up required)
- Select the PDF you want to check before sharing
- Click Scan for Privacy Risks
- Review the detected metadata: each field is listed with its value and risk rating
- Select which fields to strip, then download a clean copy with no watermark
The scanner reads the PDF structure entirely in your browser. Your file is never uploaded.
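One way stripping can work, shown here as a hedged Python sketch rather than the Privacy Scanner's actual implementation: overwrite each selected value with spaces of the same length, so every byte offset recorded in the file's cross-reference table stays valid. `blank_info_fields`, `STRIP_KEYS`, and the sample bytes are hypothetical, and the sketch assumes uncompressed literal strings.

```python
import re

# Hypothetical selection of fields the user chose to strip.
STRIP_KEYS = ("Author", "Creator", "Producer")


def blank_info_fields(pdf_bytes: bytes, keys=STRIP_KEYS) -> bytes:
    """Overwrite selected literal-string values with spaces of equal length."""
    cleaned = pdf_bytes
    for key in keys:
        # Group 1: "/Key (", group 2: the value, group 3: the closing ")".
        pattern = rb"(/" + key.encode() + rb"\s*\()([^()\\]*)(\))"
        cleaned = re.sub(
            pattern,
            lambda m: m.group(1) + b" " * len(m.group(2)) + m.group(3),
            cleaned,
        )
    return cleaned


# Hypothetical sample object; /Title is not in STRIP_KEYS and is left alone.
sample = (
    b"1 0 obj << /Author (Jane Doe) /Producer (Acrobat PDFMaker 23) "
    b"/Title (Q3 report) >> endobj"
)
cleaned = blank_info_fields(sample)
print(cleaned)
```

Because the length is unchanged, nothing else in the file has to be rewritten. The sketch does not touch the XMP packet, compressed object streams, or earlier revisions, which need separate handling.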
Combined privacy workflow
For thorough privacy protection before sharing a PDF: first use Redact PDF to permanently remove visible sensitive text (names, account numbers, personal details), then run Privacy Scanner to identify and strip hidden metadata, then use Flatten PDF to remove any remaining interactive layers and embedded metadata. This three-step approach covers both visible and invisible personal data.
Real privacy risks from PDF metadata
- Anonymous submissions exposed: author metadata in PDFs has revealed identities in whistleblower submissions and anonymous legal filings
- Internal software exposed: creator application metadata reveals your internal tooling to external recipients
- Location data from photos: GPS EXIF data from embedded photos can expose your home or office address
- Draft text in revision history: deleted or revised text can persist in earlier revisions kept inside the file by incremental saves, where anyone who knows how to extract it can read it
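The revision-history risk can be checked with a rough heuristic. When a PDF is saved incrementally, the new objects are appended after the old file content and a new `%%EOF` marker is written, so the bytes up to each earlier marker form a complete previous version. The Python sketch below counts those markers and slices out the earlier versions; the function names and sample bytes are illustrative assumptions.

```python
def count_revisions(pdf_bytes: bytes) -> int:
    """Count %%EOF markers; each incremental save normally appends one."""
    return pdf_bytes.count(b"%%EOF")


def earlier_revisions(pdf_bytes: bytes) -> list[bytes]:
    """Return the byte prefix ending at each %%EOF marker except the last one."""
    marker = b"%%EOF"
    ends = []
    start = 0
    while (pos := pdf_bytes.find(marker, start)) != -1:
        ends.append(pos + len(marker))
        start = pos + len(marker)
    return [pdf_bytes[:end] for end in ends[:-1]]


# Hypothetical sample: a draft followed by one incremental save.
sample = (
    b"%PDF-1.7\n1 0 obj (draft: salary 90k) endobj\ntrailer\n%%EOF\n"
    b"1 0 obj (final text) endobj\ntrailer\n%%EOF\n"
)
old = earlier_revisions(sample)
print(count_revisions(sample), len(old))
```

A count above one is a hint, not proof: linearized ("fast web view") PDFs also contain two `%%EOF` markers without holding any earlier revision.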
Frequently asked questions
Will stripping metadata change how the PDF looks?
No. Document metadata is stored separately from the visible page content, so removing it does not affect text, images, or layout.
Is my PDF uploaded to scan it?
No. The scanner reads and analyzes the PDF structure entirely in your browser.
Does the cleaned PDF have a watermark?
No. ihatepdf never adds watermarks to any output file.