Overview
Benchmarks and a small human study show strong detection for several PII types, but key evaluations use an internal dataset and limited human tests, so broader generalization needs more validation.
Citations: 1
Evidence Strength: 0.60
Confidence: 0.85
Risk Signals: 11
Trust Signals
Findings with numeric evidence: 3/3
Findings with evidence refs: 3/3
Results with explicit delta: 3/4
Reproducibility
Status: No open assets linked
Open source: Partial
At A Glance
Cost impact: 60%
Production readiness: 70%
Novelty: 45%
Why It Matters For Business
Automated, policy-aware PII detection reduces legal risk and audit effort while preserving data utility for ML pipelines.
Summary TLDR
This paper presents OneShield, an adaptive pipeline that detects personally identifiable information (PII) in text with context-aware scoring and applies regulation-aware masking strategies. Benchmarks on an in-house set (~1,500 points) and a Kaggle PII dataset show strong F1 scores (e.g., passport numbers 0.95–1.0). A 20-person study rated perceived protection 4.6/5. The system integrates a policy engine to map laws (GDPR/CCPA) into actionable masking rules and logs actions for audits.
Problem Statement
LLMs consume large public text corpora, but the legal rules that govern personal data (GDPR, CCPA, PIPEDA) vary by jurisdiction. Static redaction and pattern rules either miss sensitive cases or remove useful context. Enterprises need a scalable, updatable system that detects PII in context and applies jurisdiction-specific remediation without degrading downstream model utility.
Main Contribution
Adaptive Risk Mitigation Framework: a policy-driven system that converts laws into executable masking rules.
Contextual PII Detection: multi-step detector that scores entity sensitivity using local semantics and metadata.
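To make the policy-driven idea concrete, here is a minimal sketch of a policy table that maps a jurisdiction and entity type to a masking action and records each action for audit. Everything in it is an assumption for illustration: the `POLICY` entries, the `Entity` type, the `apply_policy` function, and the action names are hypothetical, are not OneShield's actual rules or API, and are not legal guidance.

```python
import hashlib
from dataclasses import dataclass

# Hypothetical policy table: jurisdiction -> entity type -> masking action.
# Placeholder values only; not legal guidance and not OneShield's rules.
POLICY = {
    "GDPR": {"PERSON": "redact", "PASSPORT": "redact", "EMAIL": "hash"},
    "CCPA": {"PERSON": "keep", "PASSPORT": "redact", "EMAIL": "redact"},
}


@dataclass
class Entity:
    kind: str   # entity type reported by some upstream detector
    start: int  # inclusive character offset
    end: int    # exclusive character offset


def apply_policy(text, entities, jurisdiction, audit_log):
    """Mask detected entities according to the jurisdiction's rules."""
    rules = POLICY[jurisdiction]
    out = text
    # Work right to left so earlier offsets stay valid after replacement.
    for ent in sorted(entities, key=lambda e: e.start, reverse=True):
        action = rules.get(ent.kind, "redact")  # unknown types: most conservative
        original = out[ent.start:ent.end]
        if action == "keep":
            replacement = original
        elif action == "hash":
            digest = hashlib.sha256(original.encode("utf-8")).hexdigest()[:8]
            replacement = f"[{ent.kind}:{digest}]"
        else:
            replacement = f"[{ent.kind}]"
        out = out[:ent.start] + replacement + out[ent.end:]
        # Record what was done, not the raw value, so the log itself holds no PII.
        audit_log.append({
            "jurisdiction": jurisdiction,
            "kind": ent.kind,
            "action": action,
            "span": (ent.start, ent.end),
        })
    return out


text = "Ann Lee, passport X1234567"
entities = [Entity("PERSON", 0, 7), Entity("PASSPORT", 18, 26)]
log = []
print(apply_policy(text, entities, "GDPR", log))
print(log)
```

Keeping the rules in a plain table means a regulation change becomes a data update rather than a code change, which is the property the summary attributes to a policy engine.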
Key Findings
Passport number detection outperforms other tools on evaluated benchmarks.
Person name detection is near-perfect on evaluated data.
Results
| Metric | Value | Baseline | Delta | Split / Dataset | Evidence | Evidence Ref |
|---|---|---|---|---|---|---|
| PassportNumber F1 (Benchmark1) | 0.95 | Presidio 0.33; Comprehend 0.54 | OneShield +0.62 vs Presidio | Benchmark1 (in-house) | Table 2 passport row | Table 2 |
| Person name F1 (Benchmark1) | 1.00 | StarPII 0.99; Comprehend 0.88 | Comparable to best open-source NER | Benchmark1 (in-house) | Table 2 person row | Table 2 |
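The Delta column can be recomputed directly from the Value and Baseline columns. The short sketch below does that arithmetic for both rows; the scores are copied from the table above, and the variable names are illustrative.

```python
# F1 scores copied from the Results table (Benchmark1, in-house).
results = {
    "PassportNumber": {"OneShield": 0.95, "Presidio": 0.33, "Comprehend": 0.54},
    "Person name": {"OneShield": 1.00, "StarPII": 0.99, "Comprehend": 0.88},
}

deltas = {}
for metric, scores in results.items():
    ours = scores["OneShield"]
    # Absolute F1 difference against each baseline, rounded to two decimals.
    deltas[metric] = {
        tool: round(ours - f1, 2)
        for tool, f1 in scores.items()
        if tool != "OneShield"
    }

for metric, per_tool in deltas.items():
    print(metric, per_tool)
```

These are absolute differences on a single in-house benchmark, so they inherit the generalization caveats listed under Limitations.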
What To Try In 7 Days
Run OneShield or a contextual PII detector over a small training slice and compare F1 on known PII
Map your GDPR/CCPA must-rules into a policy table and test masking behaviors
Enable audit logging for remediation actions for one model pipeline and review outputs
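For the first item above, a detector comparison needs only a small scoring helper. The sketch below computes exact-match, span-level precision, recall, and F1 against hand-labeled PII. The `prf1` function and the `(start, end, type)` span format are assumptions for illustration; the paper's own matching criteria may differ.

```python
def prf1(gold, pred):
    """Exact-match precision, recall, and F1 over (start, end, type) spans."""
    gold, pred = set(gold), set(pred)
    true_pos = len(gold & pred)
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical labels for one document: two of three gold spans are found,
# and one predicted span is a false positive.
gold_spans = [(0, 7, "PERSON"), (18, 26, "PASSPORT"), (30, 45, "EMAIL")]
pred_spans = [(0, 7, "PERSON"), (18, 26, "PASSPORT"), (50, 55, "PERSON")]

precision, recall, f1 = prf1(gold_spans, pred_spans)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Running the same helper over the outputs of each candidate tool on the same slice gives per-tool scores that can be compared like for like.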
Risks & Boundaries
Limitations
Evaluation relies on an internal in-house dataset (~1,500 points) and one public Kaggle set; generalization to other domains is unproven
Human trust study is small (n=20) and subjective
When Not To Use
When legal audit requires fully auditable, certified third-party tools without internal customization
Where latency must be minimal and complex contextual scoring cannot be afforded
Failure Modes
False negatives on nested or obfuscated PII (e.g., an email address inside a URL), a weakness noted for other tools
False positives on public organizations or names without contextual disambiguation
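The nested-PII failure mode is easy to reproduce with a plain pattern matcher. The sketch below shows a simple email regex missing a percent-encoded address inside a URL and finding it once the text is decoded first. The regex and the `find_emails` helper are simplified assumptions for illustration and do not describe how OneShield or the compared tools work.

```python
import re
from urllib.parse import unquote

# Deliberately simple email pattern; real-world address syntax is broader.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_emails(text):
    """Search both the raw text and its percent-decoded form."""
    decoded = unquote(text)
    return sorted(set(EMAIL_RE.findall(text)) | set(EMAIL_RE.findall(decoded)))


url = "https://example.com/reset?user=jane.doe%40example.org&step=2"

print(EMAIL_RE.findall(url))  # raw scan: the encoded "@" hides the address
print(find_emails(url))       # decoded scan recovers it
```

Decoding handles only one obfuscation layer; other encodings or deliberate spacing would need their own normalization step.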

