Replaces names, companies, legal and financial IDs, addresses, emails and phones with structured tokens in .docx, .pdf and .xlsx, including scanned PDFs when local OCR is available. Runs locally. Russian and English, plus optional Spanish. No telemetry.
curl -fsSL anonymizer.site/install | sh

Lawyers want AI feedback on contracts but can't paste raw client data into third-party tools. Manual redaction is slow and error-prone, especially for scanned documents. anonymizer automates the redaction step locally so the rest of the AI workflow stays unchanged.
Stable tokens that preserve grammatical position. Numbering is consistent within a session.
John Smith → [Person_1]
Acme Corp. → [Company_1]
+1 (415) 555-1234 → [PHONE_1]
EIN 12-3456789 → [NUMBER_1]
GB29 NWBK 6016... → [NUMBER_1]
4276 1300 ... → [NUMBER_1]
1 Main St, New York → [ADDRESS_1]
03/12/2024 → [DATE_1]
Contract No. SVC-2025-0847 → [NUMBER_1]
passport 45 11 123456 → [NUMBER_1]

Drag a .docx, .pdf or .xlsx into the local web UI. Scanned PDFs use local Tesseract OCR when installed.
Natasha + spaCy run on your CPU. Regex catches structured PII. Never opens a socket.
Structure preserved, metadata cleared. Original file untouched.
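To make the idea of stable, session-consistent tokens concrete, here is a minimal sketch in Python. It is not anonymizer's actual engine: the Session class, the two regex patterns and the EMAIL token name are assumptions made for illustration, and the real tool also relies on Natasha and spaCy for names and companies.

```python
import re

# Hypothetical patterns for illustration only; the real engine's rules
# are not shown on this page.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d ()-]{8,}\d"),
}


class Session:
    """Maps each distinct value to one token and reuses it for the whole session."""

    def __init__(self):
        self._tokens = {}    # (kind, value) -> token already issued
        self._counters = {}  # kind -> last number issued for that kind

    def token_for(self, kind, value):
        key = (kind, value)
        if key not in self._tokens:
            self._counters[kind] = self._counters.get(kind, 0) + 1
            self._tokens[key] = f"[{kind}_{self._counters[kind]}]"
        return self._tokens[key]

    def redact(self, text):
        # Apply each pattern in turn, replacing matches with their tokens.
        for kind, pattern in PATTERNS.items():
            text = pattern.sub(
                lambda m, k=kind: self.token_for(k, m.group(0)), text
            )
        return text


session = Session()
print(session.redact("Call +1 (415) 555-1234 or mail jo@example.com"))
print(session.redact("Reach jo@example.com or bob@example.com"))
```

Because the mapping lives on the session object, the same email address gets the same token in the second document, while a new address gets the next number.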
No data leaves your laptop. Ever.
curl -fsSL anonymizer.site/install | sh
iwr -useb anonymizer.site/install.ps1 | iex
uv tool install docs-anonymizer

See /docs/installation/manual for SHA256 and offline mirror options.
A build-time test asserts the redaction engine opens no socket. Any regression in the network policy fails CI before a release ships.
Full source ships as sdist alongside the wheel on PyPI.
Feedback is opt-in via an in-UI button. No passive analytics, ever.
Every release publishes the wheel SHA256 in /version.json. Verify what 'uv tool install' got you against the manifest before trusting it on a sensitive machine.
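The check can be scripted in a few lines of Python. The sketch below assumes you have the wheel file on disk and a saved copy of /version.json; the "sha256" field name is a guess at the manifest layout, so adjust the key to whatever the real file uses.

```python
import hashlib
import json


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large wheels do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_manifest(wheel_path, manifest_text, key="sha256"):
    # `key` is an assumed field name; check the real /version.json layout.
    expected = json.loads(manifest_text)[key]
    return sha256_of(wheel_path) == expected.strip().lower()
```

A False result means the file on disk is not the one the manifest describes, and it should not be run on a sensitive machine.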
Strip the personal data locally first. Our step-by-step guide shows the safe workflow — and how anonymizer compares to other tools.
anonymize --no-update-check. Optional OCR setup during install may use your package manager to install Tesseract.

curl -fsSL anonymizer.site/install | sh

or install manually with uv tool install docs-anonymizer. On non-apt distros, install Tesseract OCR language packs manually if you need scanned PDFs.