<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>anonymizer changelog</title><description>Release history for anonymizer — offline PII redactor.</description><link>https://anonymizer.site/</link><item><title>0.4.0</title><link>https://anonymizer.site/changelog#v0-4-0</link><guid isPermaLink="true">https://anonymizer.site/changelog#v0-4-0</guid><description>This release sharpens detection quality for Spanish documents. Mexican
identifiers (RFC, CURP, CLABE bank accounts) are now recognized and masked.
Spanish dates written out in words (&quot;15 de julio de 2025&quot;) are masked whole
instead of leaving fragments, payment percentages are no longer mistaken for
dates, and date-named streets (&quot;Calle 5 de Mayo&quot;) stay part of the address.
Company names the model previously missed (like &quot;GreenLeaf Organics Corp.,
S.A. de C.V.&quot;) are now masked as one piece. Russian and English documents are
unaffected.</description><pubDate>Wed, 10 Jun 2026 00:00:00 GMT</pubDate></item><item><title>0.3.0</title><link>https://anonymizer.site/changelog#v0-3-0</link><guid isPermaLink="true">https://anonymizer.site/changelog#v0-3-0</guid><description>This release adds Spanish (es) as an optional detection language. Install the
Spanish model at setup time (or download it later from Settings → Languages),
restart, and Spanish documents will have names (including double surnames),
companies, and addresses detected — alongside the existing email, phone, IBAN
and date detection. Russian and English are unchanged and Spanish is fully
opt-in: nothing changes for you unless you install the Spanish model.</description><pubDate>Tue, 09 Jun 2026 00:00:00 GMT</pubDate></item><item><title>0.2.31</title><link>https://anonymizer.site/changelog#v0-2-31</link><guid isPermaLink="true">https://anonymizer.site/changelog#v0-2-31</guid><description>This release improves redaction accuracy: repeated dates, contract numbers with
lowercase letters, hyphenated Russian surnames, and bank-card/IBAN numbers
written with unusual spacing or dashes are now caught more reliably.</description><pubDate>Tue, 02 Jun 2026 00:00:00 GMT</pubDate></item><item><title>0.2.28</title><link>https://anonymizer.site/changelog#v0-2-28</link><guid isPermaLink="true">https://anonymizer.site/changelog#v0-2-28</guid><description>- Scanned and hybrid PDFs can now be anonymized locally when Tesseract OCR with
  English and Russian language packs is available.
- The app warns about low OCR confidence and embedded images that need manual
  review.
- The installer and the doctor command help you check or set up OCR.
- Detection is stronger for legal identifiers, Hong Kong address fragments,
  company names, and international phone separators.</description><pubDate>Thu, 28 May 2026 00:00:00 GMT</pubDate></item><item><title>0.2.21</title><link>https://anonymizer.site/changelog#v0-2-21</link><guid isPermaLink="true">https://anonymizer.site/changelog#v0-2-21</guid><description>Document comparison in the preview is easier to use: the source and anonymized
panes can be resized, collapsed, and scrolled horizontally for wide tables.
Documents also keep KPI, fee, target, and cadence fragments visible when those
values are not real dates.</description><pubDate>Tue, 26 May 2026 00:00:00 GMT</pubDate></item><item><title>0.2.20</title><link>https://anonymizer.site/changelog#v0-2-20</link><guid isPermaLink="true">https://anonymizer.site/changelog#v0-2-20</guid><description>Russian report documents are handled more accurately when they contain date
ranges or amount rows that look similar to phone numbers. Values such as
24-28.02.2022, 13.10.2022, and 2022 - 360 000 remain visible when they are not
phone numbers, while real phone numbers continue to be masked.</description><pubDate>Tue, 26 May 2026 00:00:00 GMT</pubDate></item><item><title>0.2.19</title><link>https://anonymizer.site/changelog#v0-2-19</link><guid isPermaLink="true">https://anonymizer.site/changelog#v0-2-19</guid><description>Internal quality checks for anonymization coverage are now stricter. The release
keeps critical entity categories at the required priority level and verifies
fixed regression cases against the exact intended synthetic entity.</description><pubDate>Sun, 24 May 2026 00:00:00 GMT</pubDate></item><item><title>0.2.18</title><link>https://anonymizer.site/changelog#v0-2-18</link><guid isPermaLink="true">https://anonymizer.site/changelog#v0-2-18</guid><description>- **Russian person names:** the detector no longer absorbs the К(Ф)Х farm-organisation prefix into the masked span. In documents like &quot;Глава К(Ф)Х Иванов И.И.&quot;, only the person&apos;s name is masked; К(Ф)Х remains visible in the text.
- **Document processing:** rare edge cases are handled more safely and precisely, reducing the risk of incorrect masking or unstable output when documents contain unusual structure.</description><pubDate>Fri, 22 May 2026 00:00:00 GMT</pubDate></item><item><title>0.2.17</title><link>https://anonymizer.site/changelog#v0-2-17</link><guid isPermaLink="true">https://anonymizer.site/changelog#v0-2-17</guid><description>Fixed a Windows-only crash when anonymizing documents containing company names: the legal-forms dictionary failed to load with `UnicodeDecodeError` on Russian Windows (default codepage cp1251). The file is now explicitly read as UTF-8 regardless of system locale.</description><pubDate>Thu, 21 May 2026 00:00:00 GMT</pubDate></item><item><title>0.2.16</title><link>https://anonymizer.site/changelog#v0-2-16</link><guid isPermaLink="true">https://anonymizer.site/changelog#v0-2-16</guid><description>English company-name detection now correctly skips &quot;Google Looker Studio&quot; in marketing/adtech contexts. Previously the prefixed form was wrongly masked as a counterparty name; now it stays visible, like the bare &quot;Looker Studio&quot; form already did.</description><pubDate>Thu, 21 May 2026 00:00:00 GMT</pubDate></item><item><title>0.2.15</title><link>https://anonymizer.site/changelog#v0-2-15</link><guid isPermaLink="true">https://anonymizer.site/changelog#v0-2-15</guid><description>Tables in documents (signature blocks, registers, schedules) now render in the preview as real tables instead of a flat list of paragraphs, which makes it easier to review the anonymization row by row.
Russian civil-act registry series like &quot;II-МО&quot; / &quot;I-МК&quot; are now masked as NUMBER; previously they slipped through to the output DOCX while the trailing certificate number was masked.</description><pubDate>Thu, 21 May 2026 00:00:00 GMT</pubDate></item><item><title>0.2.14</title><link>https://anonymizer.site/changelog#v0-2-14</link><guid isPermaLink="true">https://anonymizer.site/changelog#v0-2-14</guid><description>A batch of pilot-driven fixes across coverage, PDF handling, and the webapp UX:

- **DOCX tables are anonymized correctly.** A parser dedup bug previously dropped most cells of large tables (signature blocks, registers, schedules), letting PII through untouched.
- **PDFs with pre-marked redaction areas are handled safely.** Covered content is fully redacted and any embedded replacement text on those areas is dropped before anonymization.
- **Russian bank accounts, SNILS, and INN are masked in pilot documents**, while public regulators and sanctions bodies are left readable so sanctions clauses stay intelligible.
- **Upload progress is honest end-to-end.** Per-stage progress now starts as soon as you drop a file and animates visibly through all stages, including for small documents that finish quickly. The misleading &quot;failed&quot; banner on slow page loads is gone.
- **Webapp shuts itself down when you close the browser tab.**
- **New offline diagnostic command** reports installation health. Document carriers and diagnostic logs no longer retain raw entity text after anonymization.</description><pubDate>Wed, 20 May 2026 00:00:00 GMT</pubDate></item><item><title>0.2.9</title><link>https://anonymizer.site/changelog#v0-2-9</link><guid isPermaLink="true">https://anonymizer.site/changelog#v0-2-9</guid><description>Improved PII recognition in Russian legal documents: passport division codes (`NNN-NNN`) are now caught correctly, numeric phrases such as &quot;in three (3) years&quot; no longer get misclassified as dates, and addresses that start with a region or district block are picked up. Added an optional flag to mask country names — disabled by default.</description><pubDate>Tue, 19 May 2026 00:00:00 GMT</pubDate></item><item><title>0.2.4</title><link>https://anonymizer.site/changelog#v0-2-4</link><guid isPermaLink="true">https://anonymizer.site/changelog#v0-2-4</guid><description>Internal: cleanup of the agent workflow infrastructure used by the dev team. No user-facing changes.</description><pubDate>Tue, 19 May 2026 00:00:00 GMT</pubDate></item><item><title>0.2.3</title><link>https://anonymizer.site/changelog#v0-2-3</link><guid isPermaLink="true">https://anonymizer.site/changelog#v0-2-3</guid><description>### Fixed

- **Processing the first Russian document no longer crashes.** A transitive dependency could be resolved to a too-new version on user machines, which broke the Russian NER on first use. The version is now pinned at install time.</description><pubDate>Tue, 19 May 2026 00:00:00 GMT</pubDate></item><item><title>0.2.2</title><link>https://anonymizer.site/changelog#v0-2-2</link><guid isPermaLink="true">https://anonymizer.site/changelog#v0-2-2</guid><description>### Fixed

- **Install on a fresh machine just works.** The one-line installer now ends with a working tool without manual cleanup steps.
- **`anonymize --version` works.** Prints the version and exits cleanly.</description><pubDate>Tue, 19 May 2026 00:00:00 GMT</pubDate></item><item><title>0.2.1</title><link>https://anonymizer.site/changelog#v0-2-1</link><guid isPermaLink="true">https://anonymizer.site/changelog#v0-2-1</guid><description>### Fixed

- **Installer is stricter about the Python it picks.** Prevents a silent failure on machines that auto-selected an incompatible interpreter.</description><pubDate>Tue, 19 May 2026 00:00:00 GMT</pubDate></item><item><title>0.2.0</title><link>https://anonymizer.site/changelog#v0-2-0</link><guid isPermaLink="true">https://anonymizer.site/changelog#v0-2-0</guid><description>First public release on PyPI as `docs-anonymizer`. Local web app at `127.0.0.1` redacts `.docx`, `.xlsx`, and text-layer `.pdf` documents — names, companies, addresses, taxpayer / banking identifiers, phones, emails, dates — fully offline. Russian + English interface. Pilot-grade quality; promotion to `1.0.0` after pilot feedback.</description><pubDate>Tue, 19 May 2026 00:00:00 GMT</pubDate></item></channel></rss>