ARTFEED — Contemporary Art Intelligence

On-Device PII Substitution with Locale-Conditioned Few-Shot Prompting

ai-technology · 2026-05-14

A recent arXiv paper (2605.13538) describes an on-device pipeline that replaces personally identifiable information (PII) with consistent, type-matched fake data instead of redacting it. The motivation is that traditional redaction can impair downstream retrieval and named entity recognition. The pipeline detects PII spans with openai/privacy-filter, a 1.5-billion-parameter mixture-of-experts token classifier, generates contextual replacements with the 1-bit Bonsai-1.7B small language model, and uses the rule-based faker generator for patterned fields. The authors report that a fixed set of three demonstrations causes the model to regurgitate demonstration outputs verbatim, and that the 1.58-bit Ternary-Bonsai-1.7B variant fails in the same byte-identical way. Rotating the few-shot demonstrations and conditioning them on locale fixes the regurgitation, which leads the authors to conclude that prompt choice matters more than quantization.
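
To make "consistent, type-matched" substitution concrete, here is a minimal Python sketch. It is not the paper's implementation: the surrogate pools, the hash-based selection, and the names `SURROGATES`, `pick_surrogate`, and `substitute` are all hypothetical, and the detected spans are assumed to arrive from an upstream classifier as (start, end, type) tuples.

```python
import hashlib

# Hypothetical surrogate pools keyed by entity type. In the described
# pipeline a language model or rule-based generator would supply these.
SURROGATES = {
    "PERSON": ["Alex Morgan", "Priya Nair", "Tomas Silva"],
    "CITY": ["Springfield", "Riverton", "Lakewood"],
}


def pick_surrogate(entity_type, original):
    """Pick a surrogate of the same type, deterministically per original.

    Hashing the original value means repeated mentions of the same
    entity always receive the same replacement (consistency).
    """
    pool = SURROGATES[entity_type]
    digest = hashlib.sha256(f"{entity_type}:{original}".encode("utf-8")).digest()
    return pool[digest[0] % len(pool)]


def substitute(text, spans):
    """Replace each (start, end, entity_type) span with its surrogate.

    Spans are assumed to be non-overlapping. They are processed from
    right to left so earlier offsets stay valid after each replacement.
    """
    out = text
    for start, end, entity_type in sorted(spans, reverse=True):
        surrogate = pick_surrogate(entity_type, text[start:end])
        out = out[:start] + surrogate + out[end:]
    return out


if __name__ == "__main__":
    sample = "Maria met Maria in Oslo."
    detected = [(0, 5, "PERSON"), (10, 15, "PERSON"), (19, 23, "CITY")]
    print(substitute(sample, detected))
```

With pools this small, two different originals can map to the same surrogate. A fuller version would need a collision-free mapping, which this sketch does not attempt.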

Key facts

  • arXiv paper 2605.13538 proposes on-device PII substitution pipeline
  • Uses openai/privacy-filter 1.5B MoE token classifier for detection
  • Uses 1-bit Bonsai-1.7B SLM for contextual surrogate generation
  • Uses faker rule-based generator for patterned fields
  • Fixed three-shot demonstrations cause verbatim regurgitation of demonstration outputs
  • 1.58-bit Ternary-Bonsai-1.7B shows same byte-identical failures
  • Locale-conditioned rotating few-shot prompting fixes regurgitation
  • Prompting choice found more important than quantization
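
The contrast between fixed and rotating demonstrations can be sketched in Python as follows. This is an illustration under stated assumptions, not the paper's method: the demonstration pools, the rotation rule, the prompt wording, and the names `DEMOS`, `select_demos`, `build_prompt`, and `is_regurgitated` are all hypothetical.

```python
# Hypothetical demonstration pools keyed by locale. Each pair is
# (original value, surrogate value) for a PERSON entity.
DEMOS = {
    "en_US": [
        ("John Smith", "Carl Benton"),
        ("Emily Clark", "Dana Whitfield"),
        ("Robert Hall", "Miles Avery"),
        ("Susan Reed", "Joan Tarrant"),
        ("Kevin Ward", "Peter Lowell"),
    ],
    "de_DE": [
        ("Hans Becker", "Uwe Brandt"),
        ("Petra Vogel", "Heike Lorenz"),
        ("Jonas Keller", "Malte Seidel"),
        ("Sabine Wolf", "Karin Albrecht"),
        ("Lukas Braun", "Tobias Engel"),
    ],
}


def select_demos(locale, call_index, k=3):
    """Pick k demonstrations for this call from the locale's pool.

    The window start advances with call_index, so consecutive calls see
    different examples. Unknown locales fall back to en_US (an assumption).
    """
    pool = DEMOS.get(locale, DEMOS["en_US"])
    start = (call_index * k) % len(pool)
    return [pool[(start + i) % len(pool)] for i in range(k)]


def build_prompt(locale, call_index, entity_type, original):
    """Assemble a few-shot prompt from the rotated demonstrations."""
    lines = [f"Replace the {entity_type} with a realistic fake value for locale {locale}."]
    for source, surrogate in select_demos(locale, call_index):
        lines.append(f"Input: {source}\nOutput: {surrogate}")
    lines.append(f"Input: {original}\nOutput:")
    return "\n".join(lines)


def is_regurgitated(model_output, demos):
    """Flag an output that copies a demonstration surrogate verbatim."""
    return model_output.strip() in {surrogate for _, surrogate in demos}


if __name__ == "__main__":
    print(build_prompt("de_DE", 0, "PERSON", "Anna Schmidt"))
```

A check like `is_regurgitated` is one simple way to detect the byte-identical failure described above. A fixed-demonstration baseline corresponds to always calling `select_demos` with the same `call_index`.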

Entities

Institutions

  • arXiv
  • OpenAI
  • Bonsai

Sources