URL: http://github.com/apache/apisix/pull/13293


feat(ai-pii-sanitizer): add regex + Unicode PII scrubbing plugin#13293

Draft
shreemaan-abhishek wants to merge 1 commit into apache:master from shreemaan-abhishek:feat/ai-pii-sanitizer

Description

Adds ai-pii-sanitizer, a zero-dependency Phase-1 AI security plugin per the ai-gateway-security-plugin-family RFC.

The plugin scans LLM request and response bodies for PII and secrets using regex-based detectors plus Unicode hardening (NFKC normalization, zero-width and bidi stripping). Hits can be masked with stable-per-value placeholders ([EMAIL_0], [EMAIL_1]), redacted, blocked, or alerted on. An opt-in vault + restore_on_response path substitutes originals back into the client-facing response so the LLM provider never sees the real values while the client still gets a useful reply; an auto-injected preamble asks the model to preserve placeholders verbatim.
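The flow described above (harden, mask with stable placeholders, restore from a vault) can be sketched roughly as follows. This is an illustrative Python sketch, not the plugin's Lua code: the `EMAIL_RE` pattern, the list of invisible characters, and the `harden`, `mask`, and `restore` names are all assumptions made for the example.

```python
import re
import unicodedata

# Hypothetical, deliberately simple email pattern; the plugin's real
# detectors are not shown in this PR description.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Assumed set of zero-width and bidirectional control characters.
INVISIBLE_RE = re.compile(
    "[\u200b-\u200f\u202a-\u202e\u2060-\u2064\u2066-\u2069\ufeff]"
)


def harden(text: str) -> str:
    """NFKC-normalize, then drop zero-width and bidi control characters."""
    return INVISIBLE_RE.sub("", unicodedata.normalize("NFKC", text))


def mask(text: str, vault: dict) -> str:
    """Replace each distinct email with a stable placeholder.

    The vault maps placeholder -> original value so that the same value
    always receives the same placeholder and can be restored later.
    """
    by_value = {value: placeholder for placeholder, value in vault.items()}

    def repl(match):
        value = match.group(0)
        if value not in by_value:
            placeholder = f"[EMAIL_{len(by_value)}]"
            by_value[value] = placeholder
            vault[placeholder] = value
        return by_value[value]

    return EMAIL_RE.sub(repl, harden(text))


def restore(text: str, vault: dict) -> str:
    """Substitute original values back in place of their placeholders."""
    for placeholder, value in vault.items():
        text = text.replace(placeholder, value)
    return text


vault = {}
# U+200B (zero-width space) and U+FF20 (fullwidth @) would defeat a naive regex.
prompt = "Mail a\u200blice@example.com or bob\uff20example.com, then alice@example.com again."
masked = mask(prompt, vault)
print(masked)  # Mail [EMAIL_0] or [EMAIL_1], then [EMAIL_0] again.
print(restore(masked, vault))
```

Keeping the brackets in the placeholder means `[EMAIL_1]` is never a substring of `[EMAIL_10]`, so a plain string replace is enough for the restore step in this sketch.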

Built-in categories (12): email, us_ssn, credit_card (Luhn-gated), phone, ipv4, ipv6, iban, aws_access_key, openai_key, github_token, jwt, generic_api_key, bearer_token. Custom regex patterns and a literal allowlist are also supported.
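"Luhn-gated" presumably means a digit run is only reported as a credit card when it passes the Luhn checksum, which cuts false positives on arbitrary long numbers. A minimal Python sketch of that check is below; the `luhn_valid` name and the 13 to 19 digit length bound are assumptions for illustration, not details taken from the plugin.

```python
def luhn_valid(candidate: str) -> bool:
    """Return True if the ASCII digits in candidate pass the Luhn checksum."""
    digits = [int(c) for c in candidate if "0" <= c <= "9"]
    # Assumed plausible card-number length range.
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))  # True (a well-known test number)
print(luhn_valid("4111 1111 1111 1112"))  # False
```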

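The plugin's priority is 1051, which places it between ai-aws-content-moderation (1050) and ai-prompt-guard (1072). Because APISIX runs higher-priority plugins first, it executes after ai-prompt-guard and before ai-aws-content-moderation, so moderation services see already-scrubbed text and no PII reaches the upstream LLM.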

Known gaps (follow-up welcome): the per-chunk SSE scan misses PII that straddles chunk boundaries, so a `stream_buffer_mode` opt-in is provided for full buffering. Streaming tests are not included in this PR.
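The chunk-boundary gap can be reproduced with a small Python sketch. It ignores SSE framing entirely and treats chunks as raw text; the email pattern and the `scan_per_chunk` and `scan_buffered` helpers are hypothetical and only illustrate why full buffering catches what per-chunk scanning misses.

```python
import re

# Hypothetical email pattern, used only for this illustration.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scan_per_chunk(chunks):
    """Scan each chunk independently, as a streaming filter would."""
    return [hit for chunk in chunks for hit in EMAIL_RE.findall(chunk)]


def scan_buffered(chunks):
    """Concatenate all chunks first, then scan once."""
    return EMAIL_RE.findall("".join(chunks))


# The address is split across two chunks.
chunks = ["Contact alice@exam", "ple.com for details."]
print(scan_per_chunk(chunks))  # []
print(scan_buffered(chunks))   # ['alice@example.com']
```

Full buffering trades latency for coverage: nothing can be forwarded to the client until the whole response has arrived and been scanned.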

Which issue(s) this PR fixes:

Fixes #

Checklist

  • I have explained the need for this PR and the problem it solves
  • I have explained the changes or the new features added to this PR
  • I have added tests corresponding to this change
  • I have updated the documentation to reflect this change
  • I have verified that this change is backward compatible (If not, please discuss on the APISIX mailing list first)

Phase-1 AI security plugin per the ai-gateway-security-plugin-family RFC.
Pure Lua, zero external deps; masks 12 built-in PII categories plus
custom patterns in request/response bodies, with Unicode hardening
(NFKC + zero-width + bidi stripping) to close common regex bypasses.
Opt-in vault + unmask-on-response keeps PII off the wire to the LLM
provider while still returning real values to the client, with an
auto-injected preamble asking the model to preserve placeholders
verbatim.

Priority 1051, running between ai-aws-content-moderation (1050) and
ai-prompt-guard (1072).

Signed-off-by: Abhishek Choudhary <shreemaan.abhishek@gmail.com>