Memproof includes a ContentRedactor that applies regex-based patterns to strip sensitive data from memory content before it reaches the storage adapter. Redaction runs as part of the pipeline, after auth and before risk assessment, so downstream components never see raw PII or secrets.
How It Works
The ContentRedactor maintains an ordered list of RedactionPattern objects. When redact() is called, each pattern is applied in order. Matched substrings are replaced with a bracketed placeholder (e.g., [SSN_REDACTED]), and a RedactionResult is returned alongside the safe content.
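The ordered application described above can be sketched with Python's standard `re` module. This is a minimal illustration, not Memproof's implementation: `SketchPattern`, `sketch_redact`, and the SSN and email regexes are assumptions made for the example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchPattern:
    """Hypothetical stand-in for RedactionPattern (not the library class)."""
    name: str
    regex: str
    placeholder: str


# Illustrative regexes only; the library's real default patterns may differ.
SKETCH_PATTERNS = [
    SketchPattern("SSN", r"\b\d{3}-\d{2}-\d{4}\b", "[SSN_REDACTED]"),
    SketchPattern("Email", r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", "[EMAIL_REDACTED]"),
]


def sketch_redact(content, patterns=SKETCH_PATTERNS):
    """Apply each pattern in list order and count replacements per pattern."""
    safe = content
    redactions = []
    for pattern in patterns:
        # re.subn returns the rewritten string and the number of substitutions.
        safe, count = re.subn(pattern.regex, pattern.placeholder, safe)
        if count:
            redactions.append({"pattern": pattern.name, "count": count})
    return safe, redactions


safe, redactions = sketch_redact("SSN 123-45-6789, mail bob@example.com")
print(safe)        # SSN [SSN_REDACTED], mail [EMAIL_REDACTED]
print(redactions)  # [{'pattern': 'SSN', 'count': 1}, {'pattern': 'Email', 'count': 1}]
```

Because each pattern sees the output of the previous one, list order can matter when two patterns are able to match overlapping text.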
Default Patterns
The following patterns are enabled out of the box:
| Pattern Name | Placeholder | Example Match |
|---|---|---|
| SSN | [SSN_REDACTED] | 123-45-6789 |
| Email | [EMAIL_REDACTED] | user@example.com |
| Credit Card | [CREDIT_CARD_REDACTED] | 4111-1111-1111-1111 |
| Phone | [PHONE_REDACTED] | (555) 123-4567 |
| Secret / Key | [SECRET_REDACTED] | sk-abcdef1234567890 |
| Bearer Token | [BEARER_REDACTED] | Bearer eyJhbGciOi... |
| API Key Header | [API_KEY_REDACTED] | x-api-key: abc123 |
Basic Usage
from memproof.redaction import ContentRedactor
redactor = ContentRedactor()
safe, result = redactor.redact("My SSN is 123-45-6789")
print(safe)
# "My SSN is [SSN_REDACTED]"
print(result.wasRedacted) # True
print(result.redactions) # [{"pattern": "SSN", "count": 1}]
import { ContentRedactor } from "@kyberon/memproof";
const redactor = new ContentRedactor();
const [safe, result] = redactor.redact("My SSN is 123-45-6789");
console.log(safe);
// "My SSN is [SSN_REDACTED]"
console.log(result.wasRedacted); // true
console.log(result.redactions); // [{ pattern: "SSN", count: 1 }]
Multiple Matches
The redactor handles content containing several types of sensitive data in a single redact() call, and result.redactions reports each matched pattern separately.
from memproof.redaction import ContentRedactor
redactor = ContentRedactor()
content = "Email me at alice@corp.com, my card is 4111-1111-1111-1111"
safe, result = redactor.redact(content)
print(safe)
# "Email me at [EMAIL_REDACTED], my card is [CREDIT_CARD_REDACTED]"
print(result.originalLength) # 58
print(result.redactedLength) # 63
print(result.redactions)
# [{"pattern": "Email", "count": 1}, {"pattern": "Credit Card", "count": 1}]
import { ContentRedactor } from "@kyberon/memproof";
const redactor = new ContentRedactor();
const content = "Email me at alice@corp.com, my card is 4111-1111-1111-1111";
const [safe, result] = redactor.redact(content);
console.log(safe);
// "Email me at [EMAIL_REDACTED], my card is [CREDIT_CARD_REDACTED]"
console.log(result.originalLength); // 58
console.log(result.redactedLength); // 63
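Placeholders are often longer than the text they replace, so the redacted length can exceed the original length. The check below uses plain Python strings (no Memproof import); the input string is made up, and the placeholder text is taken from the default patterns table.

```python
# Plain-string check: a placeholder can be longer than the value it replaces.
original = "Contact: bob@x.io"
redacted = original.replace("bob@x.io", "[EMAIL_REDACTED]")

print(len(original))  # 17
print(len(redacted))  # 25
```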
Custom Patterns
Add your own RedactionPattern to handle domain-specific sensitive data.
from memproof.redaction import ContentRedactor, RedactionPattern
# Define a custom pattern for internal employee IDs
employee_id_pattern = RedactionPattern(
    name="EmployeeID",
    regex=r"EMP-\d{6}",
    placeholder="[EMPLOYEE_ID_REDACTED]",
)
redactor = ContentRedactor(extra_patterns=[employee_id_pattern])
safe, result = redactor.redact("Assigned to EMP-482910")
print(safe)
# "Assigned to [EMPLOYEE_ID_REDACTED]"
import { ContentRedactor, RedactionPattern } from "@kyberon/memproof";
const employeeIdPattern: RedactionPattern = {
  name: "EmployeeID",
  regex: /EMP-\d{6}/g,
  placeholder: "[EMPLOYEE_ID_REDACTED]",
};
const redactor = new ContentRedactor({ extraPatterns: [employeeIdPattern] });
const [safe, result] = redactor.redact("Assigned to EMP-482910");
console.log(safe);
// "Assigned to [EMPLOYEE_ID_REDACTED]"
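Before registering a custom pattern, it can help to exercise the regex on its own with Python's `re` module. The sketch below shows one thing worth checking: `EMP-\d{6}` also matches the first six digits of a longer number. Whether Memproof adds any anchoring of its own is not stated here, so the stricter variant is an assumption about what you may want, and the sample strings are made up.

```python
import re

loose = re.compile(r"EMP-\d{6}")
# Stricter variant (an assumption, not from the docs): word boundaries on both sides.
strict = re.compile(r"\bEMP-\d{6}\b")

samples = [
    "Assigned to EMP-482910",
    "Ticket EMP-4829101 is not an employee ID",
]

for text in samples:
    print(loose.sub("[EMPLOYEE_ID_REDACTED]", text))
    print(strict.sub("[EMPLOYEE_ID_REDACTED]", text))
```

With the loose pattern, the second sample is partially redacted and leaves a stray trailing `1`; the strict pattern leaves it untouched.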
RedactionResult Fields
The RedactionResult object returned from redact() contains metadata about what was redacted:
| Field | Type | Description |
|---|---|---|
| originalLength | int | Character count of the original input |
| redactedLength | int | Character count after redaction |
| redactions | list | Array of {"pattern": str, "count": int}, one entry per pattern that matched |
| wasRedacted | bool | True if at least one pattern matched |
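One way to picture these fields is as a small record derived from the input, the output, and the per-pattern counts. The dataclass below is a hypothetical model that mirrors the field names in the table; it is not the library's class, and `build_result` is an illustrative helper.

```python
from dataclasses import dataclass, field


@dataclass
class SketchRedactionResult:
    """Hypothetical model mirroring the documented fields."""
    originalLength: int
    redactedLength: int
    redactions: list = field(default_factory=list)

    @property
    def wasRedacted(self) -> bool:
        # True if at least one pattern matched.
        return len(self.redactions) > 0


def build_result(original, redacted, redactions):
    """Illustrative helper: derive the metadata from before/after strings."""
    return SketchRedactionResult(len(original), len(redacted), list(redactions))


result = build_result(
    "My SSN is 123-45-6789",
    "My SSN is [SSN_REDACTED]",
    [{"pattern": "SSN", "count": 1}],
)
print(result.originalLength, result.redactedLength, result.wasRedacted)
# 21 24 True
```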
The redactor does not store or log the original sensitive values. Once redacted, the raw content is discarded. Audit events record that redaction occurred and which patterns matched, but never the matched content itself.
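A sketch of what such an audit record might contain, assuming a simple dict-based event. The `make_audit_event` helper, the event name, and the event field names are hypothetical; the point is that only pattern names and counts are copied, never the matched text.

```python
import json


def make_audit_event(redactions):
    """Build a hypothetical audit event from redaction metadata only."""
    return {
        "event": "content_redacted",  # assumed event name
        "patterns": [
            {"pattern": r["pattern"], "count": r["count"]} for r in redactions
        ],
    }


raw = "My SSN is 123-45-6789"
redactions = [{"pattern": "SSN", "count": 1}]  # shape documented for result.redactions

event = make_audit_event(redactions)
serialized = json.dumps(event)
print(serialized)

# The raw value never appears in the serialized event.
assert "123-45-6789" not in serialized
```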
Combine content redaction with the risk engine. The risk engine detects PII and secrets for scoring purposes; the redactor removes them before persistence. Together they provide defense in depth: even if a policy rule allows the operation, the stored content is already sanitized.