Sending raw user data to a third-party model API is how regulated data leaks. The fix isn’t to stop using LLMs — it’s to put a PII-redaction proxy in front of them: a small service that strips personal data before the request ever leaves your network, forwards the sanitized prompt to the model, and re-inserts the original values in the response if you need them. This guide builds that proxy in Spring Boot, with working code and the trade-offs that matter. It’s a Java take on a problem most tutorials only solve in Python.
Why a redaction proxy, not just a prompt rule
“Don’t include PII in prompts” is a policy, not a control — it depends on every developer remembering, forever. A proxy makes redaction a chokepoint: every call to the model goes through one service that enforces the rule mechanically. That gives you three things a guideline can’t:
- Data minimization by default: the model provider only ever sees tokens like [EMAIL_1], never the real address. This is the engineering backbone of a zero-data-retention posture.
- One place to audit: every redaction is logged in a single service, so you can prove what left your network.
- Reversibility when you need it: keep a request-scoped map of token → original value, and you can re-hydrate the model’s answer so the user still sees real names and emails.
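
The audit point is worth making concrete, because the obvious mistake is to log the redacted values themselves. Below is a minimal sketch, assuming placeholders shaped like [EMAIL_1] and one token → value map per request. RedactionAudit, countByType, and auditLine are hypothetical names for illustration, not part of any library.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: builds an audit line from a token -> value mapping
// by reading only the keys, never the sensitive values.
final class RedactionAudit {

    // Counts placeholders per type, e.g. {EMAIL=2, SSN=1}, from keys like "[EMAIL_1]".
    // Assumes every key has the "[TYPE_n]" shape.
    static Map<String, Integer> countByType(Map<String, String> mapping) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : mapping.keySet()) {
            // "[EMAIL_1]" -> "EMAIL": skip the leading '[' and cut at the last '_'.
            String type = token.substring(1, token.lastIndexOf('_'));
            counts.merge(type, 1, Integer::sum);
        }
        return counts;
    }

    // Formats a log line carrying only the request id and the per-type counts.
    static String auditLine(String requestId, Map<String, String> mapping) {
        return "request=" + requestId + " redacted=" + countByType(mapping);
    }

    public static void main(String[] args) {
        Map<String, String> mapping = new LinkedHashMap<>();
        mapping.put("[EMAIL_1]", "jane@example.com");
        mapping.put("[SSN_2]", "123-45-6789");
        mapping.put("[EMAIL_3]", "bob@example.com");
        System.out.println(auditLine("req-42", mapping));
        // prints: request=req-42 redacted={EMAIL=2, SSN=1}
    }
}
```

Logging types and counts is usually enough to show what left the network without turning the audit log into a second copy of the PII.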
The architecture
Three components, one request flow:
1. Your app calls the proxy’s /v1/chat endpoint with a raw prompt.
2. A PiiRedactor replaces detected PII with placeholders and returns a reversible mapping.
3. An LlmClient forwards the redacted prompt to the model provider.
4. The proxy re-inserts the original values into the model’s reply and returns it.
Only step 3 crosses your network boundary — and by then the data is already tokenized.
Step 1 — The redactor
Start with pattern-based detection for the high-confidence, structured PII: emails, US SSNs, credit-card numbers, and phone numbers. Each match is swapped for a typed, numbered placeholder such as [EMAIL_1], and the reverse mapping is kept so we can undo it later.
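import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.springframework.stereotype.Component;

@Component
public class PiiRedactor {
    // Insertion-ordered on purpose: Map.of() has no defined iteration order, and the
    // broad PHONE pattern must run after SSN and CARD so it cannot claim their digits.
    private static final Map<String, Pattern> PATTERNS = new LinkedHashMap<>();
    static {
        PATTERNS.put("EMAIL", Pattern.compile("[\\w.+-]+@[\\w-]+\\.[\\w.-]+"));
        PATTERNS.put("SSN", Pattern.compile("\\b\\d{3}-\\d{2}-\\d{4}\\b"));
        PATTERNS.put("CARD", Pattern.compile("\\b(?:\\d[ -]*?){13,16}\\b"));
        // (?<!\w) rather than \b, so a leading "+" is part of the match instead of left behind.
        PATTERNS.put("PHONE", Pattern.compile("(?<!\\w)\\+?\\d[\\d ().-]{7,}\\d\\b"));
    }

    // Redacted text plus the token -> original value mapping needed to undo it.
    public record Redaction(String text, Map<String, String> mapping) {}

    public Redaction redact(String input) {
        String text = input;
        Map<String, String> mapping = new LinkedHashMap<>();
        int counter = 0; // shared across types, so every token in a request is unique
        for (var entry : PATTERNS.entrySet()) {
            Matcher m = entry.getValue().matcher(text);
            StringBuilder sb = new StringBuilder();
            while (m.find()) {
                String token = "[" + entry.getKey() + "_" + (++counter) + "]";
                mapping.put(token, m.group());
                // quoteReplacement keeps any '$' or '\' in the token literal.
                m.appendReplacement(sb, Matcher.quoteReplacement(token));
            }
            m.appendTail(sb);
            text = sb.toString(); // the next pattern scans the already-tokenized text
        }
        return new Redaction(text, mapping);
    }

    // Swaps each token in the model output back to its original value.
    public String reinsert(String modelOutput, Map<String, String> mapping) {
        String out = modelOutput;
        for (var e : mapping.entrySet()) {
            out = out.replace(e.getKey(), e.getValue());
        }
        return out;
    }
}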
The mapping is intentionally request-scoped — it lives only for the lifetime of one call and is never persisted. That’s the point: the sensitive values never touch disk or the model provider.
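
The redactor above issues a fresh token for every match, so the same email appearing twice in one prompt becomes two different tokens. If you would rather the model see that both mentions refer to the same value, one possible variation is to reuse a token per distinct value within the request. The sketch below is an illustration under that assumption, not a drop-in replacement. StableTokenRedactor is a hypothetical name, and it handles a single EMAIL pattern to stay short.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical variation on redact(): one token per distinct value within a request.
final class StableTokenRedactor {
    private static final Pattern EMAIL = Pattern.compile("[\\w.+-]+@[\\w-]+\\.[\\w.-]+");

    record Redaction(String text, Map<String, String> mapping) {}

    static Redaction redact(String input) {
        Map<String, String> mapping = new LinkedHashMap<>(); // token -> value
        Map<String, String> tokenByValue = new HashMap<>();  // value -> token, for reuse
        Matcher m = EMAIL.matcher(input);
        StringBuilder sb = new StringBuilder();
        while (m.find()) {
            // Mint a new token only the first time this exact value is seen.
            String token = tokenByValue.computeIfAbsent(m.group(), value -> {
                String fresh = "[EMAIL_" + (mapping.size() + 1) + "]";
                mapping.put(fresh, value);
                return fresh;
            });
            m.appendReplacement(sb, Matcher.quoteReplacement(token));
        }
        m.appendTail(sb);
        return new Redaction(sb.toString(), mapping);
    }

    public static void main(String[] args) {
        Redaction r = redact("Email jane@example.com, cc bob@example.com, then jane@example.com again");
        System.out.println(r.text());
        // prints: Email [EMAIL_1], cc [EMAIL_2], then [EMAIL_1] again
    }
}
```

Matching is exact, so Jane@Example.com and jane@example.com would still get separate tokens unless you normalize values first. Both maps stay request-scoped, exactly like the original mapping.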
Step 2 — Forward the sanitized prompt to the model
With the prompt sanitized, forwarding it is ordinary API work. Using the official Anthropic Java SDK (add com.anthropic:anthropic-java to your build) and Claude as the example provider:
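// Package names below follow recent releases of the Anthropic Java SDK; adjust if your version differs.
import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
import com.anthropic.models.messages.Message;
import com.anthropic.models.messages.MessageCreateParams;
import com.anthropic.models.messages.Model;
import com.anthropic.models.messages.TextBlock;
import java.util.stream.Collectors;
import org.springframework.stereotype.Component;

@Component
public class LlmClient {
    // Reads ANTHROPIC_API_KEY from the environment
    private final AnthropicClient client = AnthropicOkHttpClient.fromEnv();

    public String complete(String redactedPrompt) {
        MessageCreateParams params = MessageCreateParams.builder()
                // Use whichever Model constant your SDK version exposes.
                .model(Model.CLAUDE_OPUS_4_8)
                .maxTokens(2048L)
                .addUserMessage(redactedPrompt)
                .build();
        Message response = client.messages().create(params);
        // Keep only the text blocks and join them into a single reply string.
        return response.content().stream()
                .flatMap(block -> block.text().stream())
                .map(TextBlock::text)
                .collect(Collectors.joining());
    }
}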
The model only ever receives [EMAIL_1], never the real address. The proxy is provider-agnostic by design — swap LlmClient for any other model API and the redaction layer is unchanged.
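
One way to keep that swap cheap is to put a small interface between the proxy and the provider, which also gives you a test double for checking what actually crosses the boundary. The following is a sketch only. ModelProvider, RecordingProvider, and ProviderSwapDemo are hypothetical names, and the reinsert method here is a copy of the re-hydration loop from Step 1 so the example stands alone.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical seam: anything that turns a redacted prompt into a reply.
interface ModelProvider {
    String complete(String redactedPrompt);
}

// Test double that records what it was sent and echoes it back.
final class RecordingProvider implements ModelProvider {
    final List<String> received = new ArrayList<>();

    @Override
    public String complete(String redactedPrompt) {
        received.add(redactedPrompt);
        return "You wrote: " + redactedPrompt;
    }
}

final class ProviderSwapDemo {
    // Same re-hydration loop as reinsert(): replace each token with its original value.
    static String reinsert(String reply, Map<String, String> mapping) {
        String out = reply;
        for (Map.Entry<String, String> e : mapping.entrySet()) {
            out = out.replace(e.getKey(), e.getValue());
        }
        return out;
    }

    // Forwards an already-redacted prompt through any provider, then re-hydrates the reply.
    static String ask(ModelProvider provider, String redactedPrompt, Map<String, String> mapping) {
        return reinsert(provider.complete(redactedPrompt), mapping);
    }

    public static void main(String[] args) {
        RecordingProvider provider = new RecordingProvider();
        String answer = ask(provider, "Contact [EMAIL_1] today",
                Map.of("[EMAIL_1]", "jane@example.com"));
        System.out.println(provider.received.get(0)); // Contact [EMAIL_1] today
        System.out.println(answer);                   // You wrote: Contact jane@example.com today
    }
}
```

In a Spring application, LlmClient could implement such an interface and be injected by type, so a different provider becomes a bean swap and not a controller change.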
Step 3 — Wire it together in a controller
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/v1")
public class ChatController {
    private final PiiRedactor redactor;
    private final LlmClient llm;

    // Constructor injection: Spring supplies both components.
    public ChatController(PiiRedactor redactor, LlmClient llm) {
        this.redactor = redactor;
        this.llm = llm;
    }

    public record ChatRequest(String prompt) {}
    public record ChatResponse(String answer) {}

    @PostMapping("/chat")
    public ChatResponse chat(@RequestBody ChatRequest req) {
        var redaction = redactor.redact(req.prompt());                      // 1. strip PII
        String modelReply = llm.complete(redaction.text());                 // 2. call model with tokens only
        String answer = redactor.reinsert(modelReply, redaction.mapping()); // 3. re-hydrate
        return new ChatResponse(answer);
    }
}
That’s the whole proxy: redact, forward, re-hydrate. Everything sensitive stays on your side of the wire.
The trade-offs nobody mentions
| Concern | Reality |
|---|---|
| Regex misses unstructured PII | Names, addresses, and free-text identifiers won’t match a pattern. For those, add a Named Entity Recognition (NER) step (a local model or a library like Apache OpenNLP) before the regex pass. Never send the text to a cloud NER service, or you’ve defeated the purpose. |
| Over-redaction breaks meaning | If you redact too aggressively, the model loses context it needs to answer. Tune per field; sometimes a typed placeholder ([PERSON_1]) preserves enough structure for the model to reason. |
| Reversibility is a risk too | The token→value map is itself sensitive. Keep it request-scoped and in memory; never log it or cache it. |
| Credit-card regex needs a Luhn check | The broad digit pattern over-matches. Validate candidates with the Luhn algorithm to cut false positives. |
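
The Luhn row in the table above is the easiest of these to act on. Below is a minimal sketch of the checksum, assuming candidates arrive as the raw regex match with optional spaces or hyphens between digits. The class name Luhn and the method isValid are hypothetical. You would call it on each CARD match and skip the redaction when it returns false.

```java
// Hypothetical helper: Luhn checksum over the digits of a candidate card number.
final class Luhn {

    static boolean isValid(String candidate) {
        // Drop the separators that the CARD pattern tolerates between digits.
        String digits = candidate.replaceAll("[ -]", "");
        if (!digits.matches("\\d{13,16}")) {
            return false;
        }
        int sum = 0;
        boolean doubleIt = false; // the rightmost digit is not doubled
        for (int i = digits.length() - 1; i >= 0; i--) {
            int d = digits.charAt(i) - '0';
            if (doubleIt) {
                d *= 2;
                if (d > 9) {
                    d -= 9; // same as adding the two digits of the product
                }
            }
            sum += d;
            doubleIt = !doubleIt;
        }
        return sum % 10 == 0;
    }

    public static void main(String[] args) {
        System.out.println(isValid("4111 1111 1111 1111")); // true (a common test number)
        System.out.println(isValid("4111 1111 1111 1112")); // false
    }
}
```

A passing checksum only says the digits are well formed, not that the number is a real card, so this trims false positives without replacing the regex.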
Where this fits in a production LLM stack
A redaction proxy is one layer of a defense-in-depth posture. Pair it with input/output guardrails and prompt-injection defense, and treat the whole thing as part of the broader engineering playbook for taking an AI demo to production. The redaction proxy answers the data-governance question; guardrails answer the abuse question; observability answers the “what happened” question.
Frequently asked questions
How do you redact PII before sending it to an LLM?
Put a proxy service in front of the model. It detects PII with pattern matching (and optionally local NER), replaces each value with a placeholder token, forwards only the tokenized prompt to the model API, and re-inserts the original values into the response. The provider never sees the real data.
Is regex enough for PII redaction?
Regex handles structured PII well (emails, SSNs, card numbers, phones) but misses unstructured data like names and addresses. For comprehensive coverage, add a local Named Entity Recognition pass before the regex step — and never use a cloud NER service, which would expose the very data you’re protecting.
Can I do PII redaction in Java instead of Python?
Yes. A Spring Boot service makes an excellent redaction proxy: a @Component for detection, a provider client for forwarding, and a @RestController to tie them together. The official Anthropic Java SDK handles the model call, so the whole stack stays on the JVM.
