diff --git a/README.md b/README.md
index e1b7876..f0901ff 100644
--- a/README.md
+++ b/README.md
@@ -245,9 +245,9 @@ Authoritative list is `nodejs-v2/src/engine/entity-types.ts` (`SUPPORTED_ENTITIE
 `EU_VAT`, `EU_PASSPORT`
 
 **Country-specific**:
-`DE_TAX_ID`, `DE_SOCIAL_SECURITY`, `FR_NIR`, `FR_CNI`, `IT_FISCAL_CODE`, `IT_VAT`, `ES_DNI`, `ES_NIE`, `CY_TIC`, `CY_ID_CARD`
+`DE_TAX_ID`, `DE_SOCIAL_SECURITY`, `FR_NIR`, `FR_CNI`, `IT_FISCAL_CODE`, `IT_VAT`, `ES_DNI`, `ES_NIE`, `CY_TIC`, `CY_ID_CARD`, `FI_HETU`, `FI_BUSINESS_ID`
 
-33 types total (4 NER + 29 pattern-based).
+35 types total (4 NER + 31 pattern-based).
 
 ## Logs
 
diff --git a/nodejs-v2/README.md b/nodejs-v2/README.md
index 2a5f2cd..386c2b5 100644
--- a/nodejs-v2/README.md
+++ b/nodejs-v2/README.md
@@ -1,6 +1,6 @@
 # pii-shield
 
-> Anonymize PII in legal documents locally. Node.js CLI — 33 entity types via GLiNER NER + EU/UK/US patterns. Reads `.pdf` / `.docx` / `.txt`. Pure offline, no Python.
+> Anonymize PII in legal documents locally. Node.js CLI — 35 entity types via GLiNER NER + EU/UK/US/FI patterns. Reads `.pdf` / `.docx` / `.txt`. Pure offline, no Python.
 
 [![npm](https://img.shields.io/npm/v/pii-shield.svg?style=flat-square)](https://www.npmjs.com/package/pii-shield) [![License](https://img.shields.io/badge/license-MIT-blue.svg?style=flat-square)](https://github.com/gregmos/PII-Shield/blob/main/LICENSE) [![Node](https://img.shields.io/badge/node-22%2B-339933.svg?style=flat-square&logo=nodedotjs&logoColor=white)](https://nodejs.org/)
 
@@ -55,13 +55,13 @@ See `pii-shield --help ` or the [full CLI manual](https://github.com/gr
 
 ## What it detects
 
-33 entity types — 4 NER classes (`PERSON`, `ORGANIZATION`, `LOCATION`, `NRP`) plus 29 pattern-based recognizers:
+35 entity types — 4 NER classes (`PERSON`, `ORGANIZATION`, `LOCATION`, `NRP`) plus 31 pattern-based recognizers:
 
 - **Generic**: email, phone, URL, IP, ID doc, credit card, IBAN, crypto, medical licence
 - **US**: SSN, passport, driver licence
 - **UK**: NIN, NHS, passport, CRN, driving licence
 - **EU-wide**: VAT, passport
-- **Country-specific**: DE (tax ID, social security), FR (NIR, CNI), IT (fiscal code, VAT), ES (DNI, NIE), CY (TIC, ID card)
+- **Country-specific**: DE (tax ID, social security), FR (NIR, CNI), IT (fiscal code, VAT), ES (DNI, NIE), CY (TIC, ID card), FI (henkilötunnus, Y-tunnus)
 
 Authoritative list: [`src/engine/entity-types.ts`](https://github.com/gregmos/PII-Shield/blob/main/nodejs-v2/src/engine/entity-types.ts).
 
diff --git a/nodejs-v2/cli/USAGE.md b/nodejs-v2/cli/USAGE.md
index ec07a7b..2d736e5 100644
--- a/nodejs-v2/cli/USAGE.md
+++ b/nodejs-v2/cli/USAGE.md
@@ -959,7 +959,7 @@ tail -f ~/.pii_shield/audit/ner_init.log # NER bootstrap detail
 
 ## Detected entity types
 
-33 types in total. The full authoritative list is `nodejs-v2/src/engine/entity-types.ts` (`SUPPORTED_ENTITIES`).
+35 types in total. The full authoritative list is `nodejs-v2/src/engine/entity-types.ts` (`SUPPORTED_ENTITIES`).
 
 ### NER-based (GLiNER zero-shot)
 
@@ -983,7 +983,7 @@ tail -f ~/.pii_shield/audit/ner_init.log # NER bootstrap detail
 
 ### Country-specific
 
-`DE_TAX_ID`, `DE_SOCIAL_SECURITY`, `FR_NIR`, `FR_CNI`, `IT_FISCAL_CODE`, `IT_VAT`, `ES_DNI`, `ES_NIE`, `CY_TIC`, `CY_ID_CARD`.
+`DE_TAX_ID`, `DE_SOCIAL_SECURITY`, `FR_NIR`, `FR_CNI`, `IT_FISCAL_CODE`, `IT_VAT`, `ES_DNI`, `ES_NIE`, `CY_TIC`, `CY_ID_CARD`, `FI_HETU`, `FI_BUSINESS_ID`.
 
 To list at runtime in JSON: `pii-shield scan small.txt --json | jq '.entities[].type' | sort -u` (assuming sample file has at least one of each).
 
diff --git a/nodejs-v2/package.json b/nodejs-v2/package.json
index 86e9503..77fcec3 100644
--- a/nodejs-v2/package.json
+++ b/nodejs-v2/package.json
@@ -2,7 +2,7 @@
   "name": "pii-shield",
   "version": "2.2.0",
   "type": "module",
-  "description": "Anonymize PII in legal documents locally — Node.js CLI (GLiNER NER, EU/UK/US patterns, .docx/.pdf/.txt, HITL review). MCP plugin distributed separately as a .mcpb.",
+  "description": "Anonymize PII in legal documents locally — Node.js CLI (GLiNER NER, EU/UK/US/FI patterns, .docx/.pdf/.txt, HITL review). MCP plugin distributed separately as a .mcpb.",
   "keywords": [
     "pii",
     "anonymization",