Memproof includes a ContentRedactor that applies regex-based patterns to strip sensitive data from memory content before it reaches the storage adapter. Redaction runs as part of the pipeline, after auth and before risk assessment, so downstream components never see raw PII or secrets.

How It Works

The ContentRedactor maintains an ordered list of RedactionPattern objects. When redact() is called, each pattern is applied in order. Matched substrings are replaced with a bracketed placeholder (e.g., [SSN_REDACTED]), and a RedactionResult is returned alongside the safe content.
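The ordered application described above can be sketched with the standard library's re module. This is a minimal illustration, not Memproof's actual implementation: the Pattern dataclass, the apply_patterns function, and both regexes are assumptions made for the example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    """Hypothetical stand-in for RedactionPattern (not Memproof's class)."""
    name: str
    regex: str
    placeholder: str


def apply_patterns(content: str, patterns: list[Pattern]) -> tuple[str, list[dict]]:
    """Apply each pattern in list order and count replacements per pattern."""
    redactions = []
    for p in patterns:
        # re.subn returns the new string and the number of substitutions made.
        content, count = re.subn(p.regex, p.placeholder, content)
        if count:
            redactions.append({"pattern": p.name, "count": count})
    return content, redactions


# Illustrative regexes only; the library's built-in expressions may differ.
patterns = [
    Pattern("SSN", r"\b\d{3}-\d{2}-\d{4}\b", "[SSN_REDACTED]"),
    Pattern("Email", r"[\w.+-]+@[\w-]+\.[\w.-]+", "[EMAIL_REDACTED]"),
]

safe, redactions = apply_patterns("SSN 123-45-6789, mail bob@example.com", patterns)
print(safe)        # SSN [SSN_REDACTED], mail [EMAIL_REDACTED]
print(redactions)  # [{'pattern': 'SSN', 'count': 1}, {'pattern': 'Email', 'count': 1}]
```

Because each pattern runs on the output of the previous one, list order can matter when two expressions could match overlapping text.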

Default Patterns

The following patterns are enabled out of the box:
| Pattern Name | Placeholder | Example Match |
| --- | --- | --- |
| SSN | [SSN_REDACTED] | 123-45-6789 |
| Email | [EMAIL_REDACTED] | user@example.com |
| Credit Card | [CREDIT_CARD_REDACTED] | 4111-1111-1111-1111 |
| Phone | [PHONE_REDACTED] | (555) 123-4567 |
| Secret / Key | [SECRET_REDACTED] | sk-abcdef1234567890 |
| Bearer Token | [BEARER_REDACTED] | Bearer eyJhbGciOi... |
| API Key Header | [API_KEY_REDACTED] | x-api-key: abc123 |

Basic Usage

from memproof.redaction import ContentRedactor

redactor = ContentRedactor()

safe, result = redactor.redact("My SSN is 123-45-6789")
print(safe)
# "My SSN is [SSN_REDACTED]"

print(result.wasRedacted)     # True
print(result.redactions)      # [{"pattern": "SSN", "count": 1}]

Multiple Matches

The redactor handles content containing several types of sensitive data in a single redact() call. Each pattern that matches contributes its own entry to result.redactions.
from memproof.redaction import ContentRedactor

redactor = ContentRedactor()

content = "Email me at alice@corp.com, my card is 4111-1111-1111-1111"
safe, result = redactor.redact(content)
print(safe)
# "Email me at [EMAIL_REDACTED], my card is [CREDIT_CARD_REDACTED]"

print(result.originalLength)   # 58
print(result.redactedLength)   # 63
print(result.redactions)
# [{"pattern": "Email", "count": 1}, {"pattern": "Credit Card", "count": 1}]

Custom Patterns

Add your own RedactionPattern to handle domain-specific sensitive data.
from memproof.redaction import ContentRedactor, RedactionPattern

# Define a custom pattern for internal employee IDs
employee_id_pattern = RedactionPattern(
    name="EmployeeID",
    regex=r"EMP-\d{6}",
    placeholder="[EMPLOYEE_ID_REDACTED]",
)

redactor = ContentRedactor(extra_patterns=[employee_id_pattern])

safe, result = redactor.redact("Assigned to EMP-482910")
print(safe)
# "Assigned to [EMPLOYEE_ID_REDACTED]"
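Before registering a custom pattern, it can help to check the regex on its own with Python's re module. The sketch below uses only the standard library. The sample strings are made up, and the stricter variant with \b word boundaries is a suggestion rather than something Memproof requires.

```python
import re

# The regex from the example above, with no anchoring.
loose = re.compile(r"EMP-\d{6}")
# Word boundaries keep the pattern from matching inside longer tokens.
strict = re.compile(r"\bEMP-\d{6}\b")

samples = [
    "Assigned to EMP-482910",
    "Ticket TEMP-1234567",
    "EMP-12345 is too short",
]

loose_hits = [bool(loose.search(s)) for s in samples]
strict_hits = [bool(strict.search(s)) for s in samples]

print(loose_hits)   # [True, True, False]
print(strict_hits)  # [True, False, False]
```

The unanchored form also fires on "TEMP-1234567", which may or may not be what you want. Deciding this up front avoids over-redacting unrelated identifiers.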

RedactionResult Fields

The RedactionResult object returned from redact() contains metadata about what was redacted:
| Field | Type | Description |
| --- | --- | --- |
| originalLength | int | Character count of the original input |
| redactedLength | int | Character count after redaction |
| redactions | list | List of {"pattern": str, "count": int} entries, one per pattern that matched |
| wasRedacted | bool | True if at least one pattern matched |
The redactor does not store or log the original sensitive values. Once redacted, the raw content is discarded. Audit events record that redaction occurred and which patterns matched, but never the matched content itself.
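One way to picture an audit record that satisfies this constraint is sketched below. The build_audit_event function and the event field names are hypothetical, not Memproof's audit schema. The input dict simply mirrors the RedactionResult fields documented above.

```python
import json
from datetime import datetime, timezone


def build_audit_event(result: dict) -> dict:
    """Build an event from redaction metadata only; raw content is never passed in."""
    return {
        "event": "content_redacted",  # hypothetical event name
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "patterns": [r["pattern"] for r in result["redactions"]],
        "total_matches": sum(r["count"] for r in result["redactions"]),
        "original_length": result["originalLength"],
        "redacted_length": result["redactedLength"],
    }


# Metadata shaped like the RedactionResult fields, written out by hand here
# for the input "My SSN is 123-45-6789".
result = {
    "originalLength": 21,
    "redactedLength": 24,
    "redactions": [{"pattern": "SSN", "count": 1}],
    "wasRedacted": True,
}

event = build_audit_event(result)
print(json.dumps(event, indent=2))
```

Since the function only ever receives counts and pattern names, the matched values cannot leak into the event by construction.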
Combine content redaction with the risk engine. The risk engine detects PII and secrets for scoring purposes; the redactor removes them before persistence. Together they provide defense in depth — even if a policy rule allows the operation, the stored content is already sanitized.