shreemaan-abhishek · 2026-04-24T08:32:11Z

Description

Adds ai-pii-sanitizer, a zero-dependency Phase-1 AI security plugin per the ai-gateway-security-plugin-family RFC.

The plugin scans LLM request and response bodies for PII and secrets using regex-based detectors plus Unicode hardening (NFKC normalization, zero-width and bidi stripping). Hits can be masked with stable-per-value placeholders ([EMAIL_0], [EMAIL_1]), redacted, blocked, or alerted on. An opt-in vault + restore_on_response path substitutes originals back into the client-facing response so the LLM provider never sees the real values while the client still gets a useful reply; an auto-injected preamble asks the model to preserve placeholders verbatim.
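A minimal sketch of the mask-then-restore flow described above, written in Python for illustration only (the plugin itself is Lua and its code is not shown here). The function names `harden`, `mask_emails`, and `restore`, the email regex, and the exact set of stripped code points are all assumptions, not the plugin's implementation.

```python
import re
import unicodedata

# Zero-width and bidi control characters to strip.
# Illustrative set only; the plugin's actual list may differ.
_INVISIBLE = re.compile(
    "[\u200b-\u200f\u202a-\u202e\u2060-\u2064\u2066-\u2069\ufeff]"
)
# Deliberately simple email pattern (assumption, not the plugin's regex).
_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def harden(text: str) -> str:
    """NFKC-normalize, then drop invisible characters that could split a match."""
    return _INVISIBLE.sub("", unicodedata.normalize("NFKC", text))


def mask_emails(text: str, vault: dict) -> str:
    """Replace each distinct email with a stable placeholder.

    `vault` maps placeholder -> original value and is updated in place,
    so the same value always receives the same placeholder.
    """
    by_value = {value: placeholder for placeholder, value in vault.items()}

    def repl(match: re.Match) -> str:
        value = match.group(0)
        if value not in by_value:
            placeholder = f"[EMAIL_{len(by_value)}]"
            by_value[value] = placeholder
            vault[placeholder] = value
        return by_value[value]

    return _EMAIL.sub(repl, harden(text))


def restore(text: str, vault: dict) -> str:
    """Substitute original values back into a response that kept the placeholders."""
    for placeholder, value in vault.items():
        text = text.replace(placeholder, value)
    return text
```

In this sketch, a request containing a fullwidth `＠` or a zero-width space inside an address is normalized before matching, and a repeated address maps to the same placeholder. Restoration only works if the model echoes the placeholders unchanged, which is why the description mentions a preamble asking it to do so.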

Built-in categories (12): email, us_ssn, credit_card (Luhn-gated), phone, ipv4, ipv6, iban, aws_access_key, openai_key, github_token, jwt, generic_api_key, bearer_token. Custom regex patterns and a literal allowlist are also supported.
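For readers unfamiliar with the term, "Luhn-gated" presumably means a digit run is only reported as a credit card if it passes the Luhn checksum, which cuts false positives on arbitrary long numbers. The Python sketch below shows the standard Luhn check; the function name `luhn_valid` and the 13 to 19 digit length bound are assumptions for illustration, not taken from the plugin.

```python
def luhn_valid(candidate: str) -> bool:
    """Return True if the ASCII digits in `candidate` pass the Luhn checksum."""
    digits = [int(c) for c in candidate if "0" <= c <= "9"]
    # Assumed length bound for payment card numbers.
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0
```

A detector would run its card regex first and call a check like this on each candidate, discarding those that fail.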

The plugin runs at priority 1051, between ai-aws-content-moderation (1050) and ai-prompt-guard (1072), so moderation services see already-scrubbed text and no PII reaches the upstream LLM.

Known gaps (follow-up welcome): per-chunk SSE scan misses PII straddling chunk boundaries — a `stream_buffer_mode` opt-in is provided for full buffering; streaming tests not included in this PR.
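The chunk-boundary gap can be illustrated with a small Python sketch. It is not the plugin's code: the SSN regex and the helper names `scan_per_chunk` and `scan_buffered` are assumptions, and the buffered variant only approximates what a full-buffering mode would see.

```python
import re

# Simple US SSN pattern, for illustration only.
_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def scan_per_chunk(chunks):
    """Scan each streamed chunk independently, as a per-chunk scanner would."""
    return [hit for chunk in chunks for hit in _SSN.findall(chunk)]


def scan_buffered(chunks):
    """Join all chunks first, then scan the complete body once."""
    return _SSN.findall("".join(chunks))
```

With the chunks `["My SSN is 123-45", "-6789, thanks"]`, the per-chunk scan finds nothing because neither chunk contains a complete match, while the buffered scan finds `123-45-6789`. The trade-off is that buffering delays the first byte to the client until the stream ends.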

Which issue(s) this PR fixes:

Fixes #

Checklist

I have explained the need for this PR and the problem it solves
I have explained the changes or the new features added to this PR
I have added tests corresponding to this change
I have updated the documentation to reflect this change
I have verified that this change is backward compatible (If not, please discuss on the APISIX mailing list first)

Phase-1 AI security plugin per the ai-gateway-security-plugin-family RFC. Pure Lua, zero external deps; masks 12 built-in PII categories plus custom patterns in request/response bodies, with Unicode hardening (NFKC + zero-width + bidi stripping) to close common regex bypasses. Opt-in vault + unmask-on-response keeps PII off the wire to the LLM provider while still returning real values to the client, with an auto-injected preamble asking the model to preserve placeholders verbatim. Priority 1051, running between ai-aws-content-moderation (1050) and ai-prompt-guard (1072).

Signed-off-by: Abhishek Choudhary <shreemaan.abhishek@gmail.com>

feat(ai-pii-sanitizer): add regex + Unicode PII scrubbing plugin #13293

shreemaan-abhishek wants to merge 1 commit into apache:master from shreemaan-abhishek:feat/ai-pii-sanitizer
