Product · AI Data Redaction

Strip every SSN, PHI, secret, and identifier before any prompt reaches the model.

PortEden's redaction engine inspects every field bound for Claude, ChatGPT, Copilot, or Gemini — masking the 18 HIPAA Safe Harbor identifiers, payment data, API keys, and your custom rules at the egress boundary. The AI gets useful structure. Your data never leaves your perimeter.

See pricing

Free tier · No credit card · Works with any AI client

Mapped to the frameworks your auditor reads
HIPAA §164.514(b) · GDPR Art. 32 · PCI-DSS Req. 3.5 · SOC 2 CC6.7 · CCPA §1798.140 · ISO 27001 A.8.10 · GLBA Safeguards
The problem

Every prompt is an unmonitored data export.

The moment your team types into ChatGPT, Claude, Copilot, or Gemini, your data leaves your perimeter. Once it lands at OpenAI, Anthropic, Google, or Microsoft, you have no control over how long it lives, who reads it, or whether it shows up in someone else's training set.

Your team pastes customer emails into ChatGPT

One copy-paste sends names, addresses, account numbers, and full message bodies to OpenAI's servers in plain text. By the time procurement finds out, it's been logged, possibly used for abuse review, and you have no record of what went where.

Claude summarizes a contract and ingests every clause

Counterparty names, settlement figures, NDAs, and proprietary terms are now sitting in Anthropic's context window — protected by their privacy policy, not your control. If a client or auditor asks what data was disclosed, you can't answer.

A developer pastes a stack trace with API keys into Copilot

Error logs and config dumps are full of secrets — DB connection strings, JWTs, AWS access keys, OAuth tokens. Your secrets scanning catches Git pushes; it doesn't catch AI chat.

The firewall in action

Sensitive data, redacted before it reaches the model.

PortEden inspects every field. SSNs, PHI, payment data, secrets, and your custom identifiers are masked at the boundary — never sent to OpenAI, Anthropic, Google, or Microsoft.

[Diagram: Your data → PortEden Redact → Your AI (Claude, ChatGPT, Copilot, Gemini, Grok). Legend: Safe · Sensitive · Redacted]
Coverage

200+ patterns, four pillars of protection.

PortEden ships with deterministic regex, transformer NER, and Luhn-validated payment detectors. Add your own rules for matter numbers, claim IDs, internal SKUs — anything with a pattern.

PHI · 18 HIPAA Safe Harbor identifiers

  • Patient names
  • Geographic subdivisions smaller than a state
  • All elements of dates (DOB, admission, discharge)
  • Telephone numbers
  • Fax numbers
  • Email addresses
  • Social Security Numbers
  • Medical Record Numbers (MRN)
  • Health plan beneficiary numbers
  • Account numbers
  • Certificate / license numbers
  • Vehicle identifiers (VIN, plate)
  • Device identifiers and serial numbers
  • URLs
  • IP addresses
  • Biometric identifiers
  • Full-face photographs
  • Any other unique identifying number / characteristic
  • Also detected, beyond the 18 Safe Harbor identifiers: diagnoses & procedure codes (ICD-10, CPT)

PCI · Payment Card Industry data

  • Primary Account Numbers (PAN, with Luhn check)
  • Card expiration dates
  • CVV / CVC / CID
  • Cardholder names tied to PAN
  • IBAN & SWIFT codes
  • ACH routing & account numbers
  • Stripe / Plaid / payment processor IDs
  • Tax ID / EIN / VAT numbers
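
For readers curious what "with Luhn check" means in practice, here is a minimal Python sketch. It is not PortEden's detector: the candidate regex, the 13 to 19 digit length bounds, and the function names are assumptions made for illustration. The idea is that a digit run only counts as a card number if its checksum passes, which cuts false positives on order numbers and similar IDs.

```python
import re

# Assumed candidate shape: 13 to 19 digits, optionally separated by single
# spaces or hyphens. Real detectors may use stricter, issuer-aware patterns.
PAN_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return substrings that look like card numbers and pass the Luhn check."""
    return [m.group() for m in PAN_CANDIDATE.finditer(text) if luhn_valid(m.group())]
```

A 13-digit order number that fails the checksum is ignored, while a well-known test number such as 4111-1111-1111-1111 is flagged.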

PII · GDPR Article 4 personal data

  • Full names & aliases
  • Government IDs (passport, driver's license)
  • Date of birth & age combined with location
  • Home & work addresses
  • Geolocation coordinates
  • Online identifiers (cookie IDs, device fingerprints)
  • IP addresses & MAC addresses
  • Biometric & genetic data

Secrets · keys, tokens, credentials

  • API keys (OpenAI, Stripe, Twilio, SendGrid)
  • AWS access keys & session tokens
  • GCP service-account JSON
  • Azure connection strings
  • GitHub & GitLab PATs
  • JWTs & OAuth bearer tokens
  • Private keys (RSA, EC, SSH, PGP)
  • Database connection strings & passwords
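
As a rough illustration of deterministic secret matching, the Python sketch below masks three common token shapes. The patterns reflect widely seen formats (AWS access key IDs starting with AKIA, classic GitHub PATs starting with ghp_, three-part JWTs) but are simplified assumptions, not PortEden's rule set; the function name and the numbered [SECRET_n] placeholder style are also hypothetical.

```python
import re

# Illustrative patterns only; real key formats vary and change over time.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "jwt": re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),
}


def mask_secrets(text):
    """Replace matches with [SECRET_n]; return the masked text and the rules that fired."""
    fired = []
    for name, pattern in SECRET_PATTERNS.items():
        def replace(match, name=name):
            fired.append(name)
            # Number placeholders in the order secrets are found.
            return f"[SECRET_{len(fired)}]"
        text = pattern.sub(replace, text)
    return text, fired
```

Returning the list of fired rule names alongside the masked text makes it easy to feed the result into an audit record without ever logging the secret itself.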
How it works

Inspect. Detect. Redact. Re-hydrate.

1. Inspect

Every payload bound for an AI client is parsed in your perimeter. Email bodies, attachments, calendar events, chat messages, document text, and code snippets are tokenized and scanned by both regex and an NER model.

2. Detect

Layered detection: ~120 deterministic patterns (SSN, PCI, IBAN, JWT, AWS keys, GitHub PATs), a transformer NER for context-dependent entities (names, MRNs, locations, organizations), and your custom rules. Detections are confidence-scored.
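
One way to picture how layered detections could be combined is sketched below in Python. This is an assumption about the general technique, not a description of PortEden's internals: the Detection fields, the 0.5 default threshold, and the "highest confidence wins on overlap" policy are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    start: int         # inclusive character offset
    end: int           # exclusive character offset
    label: str         # e.g. "SSN", "PERSON"
    confidence: float  # 0.0 to 1.0
    source: str        # "regex", "ner", or "custom"


def merge_detections(detections, threshold=0.5):
    """Drop low-confidence hits, then keep the best hit among overlapping spans."""
    candidates = [d for d in detections if d.confidence >= threshold]
    # Highest confidence first; longer spans win ties.
    candidates.sort(key=lambda d: (-d.confidence, -(d.end - d.start)))
    kept = []
    for cand in candidates:
        overlaps = any(cand.start < k.end and k.start < cand.end for k in kept)
        if not overlaps:
            kept.append(cand)
    # Return in document order so replacements can be applied left to right.
    return sorted(kept, key=lambda d: d.start)
```

With this policy, a regex SSN hit at 0.99 confidence displaces an overlapping, lower-scored NER guess for the same characters, and anything under the threshold is discarded before merging.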

3. Redact

Sensitive values are replaced with structure-preserving placeholders ([PERSON_1], [SSN_a4f2], [DATE_1]) so the LLM still has enough signal to be useful. The original values go into a short-lived encrypted token vault.
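
The placeholder idea can be sketched in a few lines of Python. Everything here is hypothetical: the two regex rules, the known_names argument (standing in for what an NER model would find), and the plain dict used as a vault, which in this sketch is neither encrypted nor short-lived. The point it shows is consistency: the same value always maps to the same placeholder within a prompt.

```python
import re

# Hypothetical rules for illustration; names are supplied directly here,
# standing in for what an NER model would detect.
RULES = [
    ("DATE", re.compile(r"\b\d{2}/\d{2}/\d{4}\b")),
    ("PHONE", re.compile(r"\(\d{3}\) \d{3}-\d{4}")),
]


def redact(text, known_names=()):
    """Return (redacted_text, vault), where vault maps placeholder -> original value."""
    vault = {}     # placeholder -> original
    assigned = {}  # (label, original) -> placeholder, so repeats reuse one token
    counters = {}  # label -> last index used

    def placeholder_for(label, value):
        key = (label, value)
        if key not in assigned:
            counters[label] = counters.get(label, 0) + 1
            token = f"[{label}_{counters[label]}]"
            assigned[key] = token
            vault[token] = value
        return assigned[key]

    for name in known_names:
        if name in text:
            text = text.replace(name, placeholder_for("PERSON", name))
    for label, pattern in RULES:
        text = pattern.sub(lambda m, label=label: placeholder_for(label, m.group()), text)
    return text, vault
```

Because both mentions of a name collapse to one token, the model can still tell that the person in the first sentence is the person in the second.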

4. Re-hydrate

When the model's response comes back referencing the placeholders, PortEden swaps them for real values in the user's browser. The model never sees the originals; the user never sees the placeholders. Every event is logged.
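
The swap-back step is conceptually simple. The Python sketch below shows only the substitution logic, under the assumption that placeholders look like [LABEL_id] and that a placeholder-to-value mapping is available; the regex and function name are hypothetical, and in the product this step is described as running in the user's browser rather than in Python.

```python
import re

# Assumed placeholder shape: uppercase label, underscore, alphanumeric id.
PLACEHOLDER = re.compile(r"\[[A-Z]+_[A-Za-z0-9]+\]")


def rehydrate(response: str, vault: dict[str, str]) -> str:
    """Swap known placeholders back to original values; leave unknown ones untouched."""
    return PLACEHOLDER.sub(lambda m: vault.get(m.group(), m.group()), response)
```

Leaving unknown placeholders untouched is a deliberate choice in this sketch: if a vault entry has expired, the user sees the token rather than a wrong or empty value.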

See it in action

Same email, two very different exposures.

Here's a realistic message from your inbox: first what you typed, then exactly what reaches the AI. The model still has enough context to draft a useful reply, and your patient's details never leave your network.

What you typed
Sensitive
From: you@clinic.com
To: Claude
Subject: Help drafting a patient reply

Hi team — quick update on Jane Doe (DOB 03/15/1985).

Her MRN is 0048-2231 and her insurance member ID is BC889922.

She called from (555) 123-4567 about her prescription.

Her card on file (4111-1111-1111-1111) was declined yesterday.

Can you draft a reply asking her to confirm the new copay amount?

What the AI sees
Safe
From: you@clinic.com
To: Claude
Subject: Help drafting a patient reply

Hi team — quick update on [PERSON_1] (DOB [DATE_1]).

Her MRN is [MRN_1] and her insurance member ID is [MEMBER_1].

She called from [PHONE_1] about her prescription.

Her card on file ([CARD_1]) was declined yesterday.

Can you draft a reply asking her to confirm the new copay amount?

What PortEden caught — automatically
Patient name · Date of birth · Medical record # · Member ID · Phone · Credit card
The AI reply, rehydrated for you
Real names restored in your browser only

Of course — here's a friendly draft asking Jane Doe to confirm the updated copay for her prescription, with a link to update her card on file. Tone is reassuring and avoids referencing the decline directly. Want me to adjust the formality?

The model wrote [PERSON_1] in its response. PortEden swapped it back to Jane Doe in your browser before you saw the reply — so you get the useful answer, and her name never reaches OpenAI or Anthropic.

With and without redaction

The same workflow, two very different audit trails.

Drafting a customer reply with ChatGPT
Without
Customer name, account number, and full thread sent to OpenAI in plain text. Logged for abuse review under their retention policy.
With
Identifiers replaced with stable placeholders. The model drafts a coherent reply; the original values never leave your network.
Asking Claude to summarize a contract
Without
Full contract text — counterparty names, dollar figures, indemnity terms — flows into Anthropic's context window with no audit record.
With
Names and amounts redacted before upload. Claude still produces a useful summary; the audit log shows exactly what was masked.
Copilot helping debug a stack trace
Without
Stack trace pasted into chat carries DB passwords, JWTs, and AWS keys. Secrets are now in Microsoft's logs and possibly indexed.
With
Secrets pattern-matched and replaced with [SECRET_*] before the request leaves your perimeter. Copilot debugs the structure; keys stay home.
Calendar-aware AI assistant on Google Calendar or Outlook
Without
Meeting titles like "Acme acquisition — final number" and attendee lists with C-suite emails sync to the AI vendor for the length of their retention policy.
With
Titles, attendees, and locations sanitized at sync time. The assistant schedules and recaps without learning your deal pipeline.
Patient note dictation with an AI scribe
Without
Patient name, DOB, MRN, and clinical history sent to a third-party model. PHI disclosure exposure under HIPAA with no record of what went where.
With
All 18 Safe Harbor identifiers stripped, narrative preserved. Per-request log supports auditor evidence requests.
Auditor asks: "What client data has touched OpenAI in 2026?"
Without
No record. You're reconstructing it from screenshots, browser history, and developer Slack threads.
With
Per-user, per-integration, per-AI-client log of every redaction event with rule-level detail. Exportable to SIEM in one click.
Auditor-readable

Citations, not vague reassurances.

Each redaction control maps to a specific clause in the framework your auditor is reading. Evidence is exportable from the audit trail.

Framework | Citation | PortEden control
HIPAA | §164.514(b) Safe Harbor | All 18 identifiers de-identified at egress. Per-request audit log of every replacement.
HIPAA | §164.312(a)(1) Access Control | Per-user, per-integration redaction policies. PHI never reaches a workforce member or AI client without authorization.
GDPR | Art. 32 + Recital 28 | Pseudonymization at the controller boundary. Tokens stored separately from data, with technical and organizational measures.
PCI-DSS v4.0 | Req. 3.5.1: PAN masking | PAN truncation/masking on display and transmission. CVV never stored. Luhn-validated detection.
SOC 2 | CC6.7: Confidential transmission | Sensitive data is identified and masked during transmission. Per-request audit log; signed CSV export.
CCPA / CPRA | §1798.140(ae) Personal info | Personal information stripped before transmission to service providers (AI vendors). Right-to-delete supported via vault TTL.
ISO 27001 | A.8.10 / A.5.34 | Information deletion and PII protection controls across the AI data path.
GLBA · Safeguards Rule | 16 CFR §314.4(c) | Encryption and access controls on customer information transmitted to third-party AI services.
Where redaction runs

Every source your AI tries to read from.

Gmail
Outlook
Google Calendar
Google Drive
SharePoint
Slack
Microsoft Teams
Jira
Confluence
Notion
Asana
Linear
Architecture

Egress redaction, not endpoint trust.

Browser extensions and per-app DLP miss as much as they catch — every new AI tool is another rollout. PortEden sits at the integration boundary instead, so the rules apply uniformly whether the prompt comes from Claude desktop, ChatGPT web, a Cursor agent, or an MCP server.

Integration-side, not endpoint-side

Redaction happens once at the data source — Gmail, Drive, Slack, etc. — not on every laptop. Add a new AI client and rules apply automatically.

Tokens stay in your tenant

Original values are stored encrypted in a short-lived per-tenant vault. Default 5-minute TTL, configurable. Vault is isolated; the model never reaches it.
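
To make the TTL behavior concrete, here is a minimal in-memory sketch in Python. It is an illustration only: the class and method names are hypothetical, nothing is encrypted or tenant-isolated here, and the 300-second default simply mirrors the 5-minute figure above.

```python
import time


class TokenVault:
    """In-memory placeholder store whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}   # placeholder -> (original, expires_at)

    def put(self, placeholder, original):
        self._entries[placeholder] = (original, self._clock() + self._ttl)

    def get(self, placeholder):
        """Return the original value, or None if missing or expired."""
        entry = self._entries.get(placeholder)
        if entry is None:
            return None
        original, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[placeholder]  # purge on expired read
            return None
        return original
```

A monotonic clock is used so that wall-clock adjustments cannot extend an entry's lifetime.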

Every event is an audit record

Rule fired, count, category, integration, user, AI client, timestamp, payload hash. Stream to Splunk, Datadog, Elastic, or download a signed CSV.
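
A record with those fields might look like the output of the Python sketch below. The field names, the function name, and the choice of SHA-256 are assumptions for illustration; the source only says that a payload hash is recorded, not which algorithm or schema is used.

```python
import hashlib
import json
from collections import Counter
from datetime import datetime, timezone


def build_audit_record(payload, fired_rules, integration, user, ai_client):
    """Build one JSON-serializable audit event; stores a hash, never the payload itself.

    `fired_rules` is a list of (rule_name, category) pairs, one per replacement.
    """
    counts = Counter(category for _rule, category in fired_rules)
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "integration": integration,
        "user": user,
        "ai_client": ai_client,
        "rules": sorted({rule for rule, _category in fired_rules}),
        "counts_by_category": dict(counts),
        "payload_sha256": hashlib.sha256(payload.encode("utf-8")).hexdigest(),
    }
```

Serializing each record with json.dumps yields one line per event, a shape that log shippers and SIEM tools generally ingest without extra parsing.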

Available in: Pro, Business, Enterprise tiers · See pricing

Data redaction questions

What is AI data redaction and why do I need it?
AI data redaction is the process of detecting and masking sensitive fields — SSNs, payment data, PHI, secrets, internal identifiers — in any payload that's about to be sent to an LLM. Without it, every prompt your team writes effectively exports raw business data to OpenAI, Anthropic, Google, or Microsoft. PortEden's redaction engine runs in your perimeter, replaces sensitive values with reversible placeholders, and only the sanitized payload reaches the model. The AI gets the structure it needs to be useful; your data never leaves your control.
Which identifiers does PortEden's redaction engine cover by default?
Out of the box: all 18 HIPAA Safe Harbor identifiers (names, geographic subdivisions smaller than a state, dates, phone numbers, fax numbers, email, SSN, MRN, health plan beneficiary numbers, account numbers, license numbers, vehicle IDs, device IDs, URLs, IPs, biometric IDs, photos, and any other unique identifying number/code), PCI-DSS payment data (PAN, expiry, CVV, IBAN, routing numbers), GDPR Article 4 personal data (names, government IDs, online identifiers, location, biometrics), and common secrets (API keys, JWTs, OAuth tokens, AWS access keys, GitHub PATs, private keys). You can extend the rule set with regex patterns, allow-lists, and custom NER labels for proprietary identifiers like matter numbers, claim IDs, or internal SKUs.
How does redaction work technically — regex, ML, or both?
Both, layered. Regex patterns catch deterministic formats (SSN, credit card with Luhn check, IBAN, JWT, AWS keys). A transformer-based NER model catches contextual entities (names, addresses, organizations, medical terms) where format alone isn't enough. A custom rule layer lets you add domain-specific patterns (matter numbers, claim IDs, MRNs in your system's format). Detections are merged, deduplicated, and confidence-scored before replacement. You can tune thresholds per integration or per AI client.
Will redaction break the AI's ability to be useful?
No, and that's the design point. PortEden uses structure-preserving placeholders: a name becomes [PERSON_1] (consistent across the prompt so the model can still reason about who's who), a date becomes [DATE_1] (still recognizable as a date, with the value itself masked), an account number becomes [ACCOUNT_8821] keeping last-four for reference. The model gets a coherent, redacted payload, so it can still summarize, classify, draft, and reason. On the response path, PortEden re-hydrates placeholders client-side so the human user sees the real values without them ever leaving your perimeter.
What evidence does this produce for HIPAA, GDPR, PCI-DSS, and SOC 2 auditors?
PortEden's redaction engine produces a per-request log of every identifier masked at egress, with rule, category, integration, AI client, user, and a hash of the original payload. Auditors evaluating HIPAA §164.514(b) Safe Harbor de-identification, GDPR Article 32 pseudonymization (Recital 28), PCI-DSS Requirement 3.5 truncation/masking, and SOC 2 CC6.7 transmission of confidential information typically request exactly this kind of evidence. Logs export to SIEM or to a signed CSV. Compliance with these frameworks remains your responsibility — PortEden provides the technical control, you operate the program around it.
Can I see exactly what was redacted from each prompt?
Yes. Every redaction event is recorded in the audit trail with: the integration source (Gmail, Outlook, Drive, Calendar, Slack, etc.), the AI client that received the request (Claude, ChatGPT, Copilot, Gemini), the rules that fired, the count and category of replacements, the user, the timestamp, and a hash of the original payload. Logs export to SIEM (Splunk, Datadog, Elastic) or to a signed CSV for compliance review.
Does redaction work with Claude, ChatGPT, Copilot, and Gemini?
Yes. PortEden is model-agnostic — redaction happens at the integration layer (between your data sources and any AI client). Claude desktop, Claude API, ChatGPT, ChatGPT Enterprise, GitHub Copilot, Microsoft 365 Copilot, Gemini, Perplexity, Cursor, and any MCP-compatible client are supported with the same policy. Switch models without rewriting a single rule.
How fast is the redaction engine? Will it slow down my AI workflows?
Median redaction latency is under 40 ms for a typical email-sized payload (~4 KB), under 120 ms for a long document (~50 KB). For streaming workflows, the engine processes deltas in-flight so the user-perceived response latency increase is negligible. Heavy NER inference runs on dedicated workers and is cached per-tenant.
Can I customize redaction rules for my own internal identifiers?
Yes. Add custom regex rules with named placeholders, upload allow-lists for terms that should never be redacted (your product names, public officers, etc.), or train a lightweight custom NER model on a sample of your data. Rules can be scoped to specific integrations, specific users, specific AI clients, or specific times of day — useful for separating production from sandbox traffic.
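
As a rough picture of what a scoped custom rule with an allow-list could look like, here is a small Python sketch. It does not show PortEden's configuration format: the CustomRule shape, the function name, and the M-123456 matter-number pattern are all made up for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomRule:
    label: str                             # placeholder prefix, e.g. "MATTER"
    pattern: re.Pattern                    # compiled regex for the identifier
    integrations: frozenset = frozenset()  # empty means "all integrations"

    def applies_to(self, integration):
        return not self.integrations or integration in self.integrations


def apply_custom_rules(text, rules, integration, allow_list=()):
    """Mask rule matches unless the matched text is on the allow-list."""
    allowed = {term.lower() for term in allow_list}
    for rule in rules:
        if not rule.applies_to(integration):
            continue
        count = 0

        def replace(match):
            nonlocal count
            if match.group().lower() in allowed:
                return match.group()  # allow-listed terms pass through unchanged
            count += 1
            return f"[{rule.label}_{count}]"

        text = rule.pattern.sub(replace, text)
    return text
```

In this sketch a rule scoped to Gmail leaves Slack traffic untouched, and an allow-listed value survives even when it matches the pattern.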
What happens to data we've already sent to OpenAI or Anthropic before installing PortEden?
PortEden can't undo prior exfiltration, but it can help you contain blast radius. The product includes a one-time discovery scan that identifies which integrations have been connected to which AI clients, what scopes were granted, and how much data has flowed. From there, you set redaction policy going forward, and the audit trail starts the day you turn it on. We recommend rotating any secrets that have been pasted into AI tools as a precaution.
Is the redaction reversible? How do I see the original values?
Yes — for authorized users on the same prompt round-trip. PortEden keeps a short-lived, encrypted token vault (default 5-minute TTL, configurable) so when the AI's response comes back referencing [PERSON_1] or [ACCOUNT_8821], the placeholders are swapped back to real values in the user's browser before display. The model never sees the original; the user sees the original; the audit log records both.
What pricing tier includes data redaction?
Basic regex-based redaction (SSN, credit card, common PII) is included on the Pro tier. Full NER, HIPAA Safe Harbor, custom rules, SSO/SAML, SCIM, and SIEM export are on the Enterprise tier. See pricing for the full breakdown.

Stop your data from leaking into someone else's AI.

Set up redaction in under 5 minutes. Free tier covers solo users; Enterprise adds SSO/SAML, SCIM, and SIEM export.

Talk to sales