Compliance-First Protection

PII/PHI Protection Engine

Seven configurable protection methods per data type, auto-detection of sensitive fields, and built-in HIPAA Safe Harbor support covering all 18 PHI identifiers.

7 Protection Methods
18 HIPAA Identifiers
11+ PII Types Detected
11+ PHI Types Detected

7 Protection Methods

Each method has different characteristics. Choose based on whether you need to recover the data and your security requirements.

MASK: Masking
One-way

Partial masking that hides most characters but preserves format. Useful for display and logs.

Original data CANNOT be recovered. Only use if you don't need the full value.

Good for

  • Display in UI
  • Customer support viewing
  • Logs
  • Reports

Not for

  • Analytics on full value
  • Future data recovery

Examples

john@email.com → ****@email.com
555-123-4567 → ***-***-4567
1234-5678-9012-3456 → ****-****-****-3456
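A minimal sketch of how format-preserving masking like the examples above could be written. The function names `mask_email` and `mask_keep_last` are hypothetical and are not part of Nexion's API; the rule of keeping the last four characters is inferred from the examples.

```python
def mask_email(value: str) -> str:
    """Hide the local part of an email address and keep the domain."""
    local, sep, domain = value.partition("@")
    if not sep:
        # Not an email-shaped value: mask everything.
        return "*" * len(value)
    return "****@" + domain


def mask_keep_last(value: str, keep: int = 4) -> str:
    """Mask all letters and digits except the last `keep`, preserving separators."""
    total = sum(ch.isalnum() for ch in value)
    to_mask = max(total - keep, 0)
    out = []
    for ch in value:
        if ch.isalnum() and to_mask > 0:
            out.append("*")
            to_mask -= 1
        else:
            out.append(ch)
    return "".join(out)


print(mask_email("john@email.com"))
print(mask_keep_last("555-123-4567"))
print(mask_keep_last("1234-5678-9012-3456"))
```

Because separators are left in place, the masked value keeps the shape of the original, which is what makes it usable in UIs and logs.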

HASH: SHA-256 Hash
One-way

One-way cryptographic hash. The same input always produces the same output, which makes it well suited to deduplication.

Original data CANNOT be recovered. Only use for comparison/deduplication.

Good for

  • Deduplication
  • Matching records
  • Audit trails
  • Anonymization

Not for

  • Reading original value
  • Customer communication

Examples

john@email.com → a8f5f167f44f4964e6c998...
Jane Doe → 7c4a8d09ca3762af61e59...
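A minimal sketch of deterministic hashing for deduplication, using Python's standard `hashlib` and `hmac` modules. The normalization step (trim and lowercase) and the keyed variant are assumptions for illustration; the source does not say whether Nexion normalizes or salts values. A plain hash of a low-entropy value such as an email or phone number can be guessed by hashing candidate values, which is why a keyed variant is shown alongside.

```python
import hashlib
import hmac


def sha256_hex(value: str) -> str:
    """Plain SHA-256 of a normalized value (normalization is an assumption)."""
    normalized = value.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def keyed_sha256_hex(value: str, key: bytes) -> str:
    """HMAC-SHA-256: still deterministic, but not guessable without the key."""
    normalized = value.strip().lower()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


# Two spellings of the same email collapse to one digest, so records match.
a = sha256_hex("John@Email.com ")
b = sha256_hex("john@email.com")
print(a == b)
```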

REDACT: Redaction
One-way

Complete removal and replacement with placeholder text. Maximum data minimization.

Data is PERMANENTLY DELETED. Use only when you truly don't need the value.

Good for

  • Compliance reports
  • Data minimization
  • Public datasets
  • GDPR right to erasure

Not for

  • Any future use of the data

Examples

123-45-6789 → [SSN REDACTED]
John Smith → [NAME REDACTED]
123 Main St, NY → [ADDRESS REDACTED]
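As a rough illustration, redaction can be as simple as swapping the value for a typed placeholder. The helper names below are hypothetical, and the placeholder format is copied from the examples above.

```python
def redact(label: str) -> str:
    """Build a placeholder such as '[SSN REDACTED]'; the original value is discarded."""
    return f"[{label.upper()} REDACTED]"


def redact_record(record: dict, labels: dict) -> dict:
    """Redact every field listed in `labels` (field name -> placeholder label)."""
    return {
        field: redact(labels[field]) if field in labels else value
        for field, value in record.items()
    }


row = {"ssn": "123-45-6789", "name": "John Smith", "order_status": "shipped"}
print(redact_record(row, {"ssn": "SSN", "name": "NAME"}))
```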

ENCRYPT: AES-256 Encryption
Reversible

Reversible encryption. Data can be decrypted with the encryption key. Recommended for PHI.

Recommended for data you need to access later. Requires encryption key management.

Good for

  • Secure storage
  • PHI protection
  • Data that needs future access
  • HIPAA compliance

Examples

john@email.com → gAAAAABl7x2KsH8f...
Patient diagnosis → gAAAAABl7x2Kt9Kj...

TOKENIZE: Tokenization
Reversible

Replace with a token reference. Original stored in secure vault. Best for PCI-DSS.

Best for references that pass through multiple systems. Vault required.

Good for

  • PCI-DSS compliance
  • Reference without exposure
  • API responses
  • Cross-system references

Examples

john@email.com → tok_a1b2c3d4e5f6
4532-1234-5678-9012 → tok_cc_x7y8z9a0b1c2
ACCT-12345678 → tok_bank_m3n4o5p6q7
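The vault idea can be sketched with an in-memory mapping. `TokenVault` below is a hypothetical stand-in, not Nexion's vault: a real vault would be a hardened, access-controlled store. The token shape (a prefix plus 12 hex characters) mirrors the examples above.

```python
import secrets


class TokenVault:
    """In-memory stand-in for a secure token vault (illustration only)."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value, prefix="tok"):
        """Return a random token for `value`, reusing the token if one exists."""
        key = (prefix, value)
        if key in self._value_to_token:
            return self._value_to_token[key]
        token = f"{prefix}_{secrets.token_hex(6)}"  # 12 random hex characters
        self._token_to_value[token] = value
        self._value_to_token[key] = token
        return token

    def detokenize(self, token):
        """Look up the original value; only callers with vault access can do this."""
        return self._token_to_value[token]


vault = TokenVault()
card_token = vault.tokenize("4532-1234-5678-9012", prefix="tok_cc")
print(card_token.startswith("tok_cc_"), vault.detokenize(card_token))
```

Unlike a hash, the token is random and carries no information about the original value, so downstream systems can pass it around without ever holding the card number.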

GENERALIZE: Generalization
One-way

Reduce precision while maintaining analytical value. Required by HIPAA Safe Harbor for dates.

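HIPAA Safe Harbor requires dates to be generalized to year only and ZIP codes to be truncated to their first 3 digits (replaced with 000 where the 3-digit area contains 20,000 or fewer people). Ages over 89 must be grouped into a single 90+ category.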

Good for

  • HIPAA Safe Harbor dates
  • Statistical analysis
  • Demographics
  • Geographic regions

Not for

  • Exact date requirements
  • Precise location needs

Examples

1985-03-15 → 1985
90210 → 902**
Age: 45 → 40-49
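A minimal sketch of the three generalizations shown above. Function names are hypothetical. The 10-year age bands are inferred from the `40-49` example, and the ZIP helper deliberately ignores the population check that Safe Harbor applies to sparsely populated 3-digit areas.

```python
from datetime import date


def generalize_date(value: str) -> str:
    """Reduce an ISO date (YYYY-MM-DD) to its year."""
    return str(date.fromisoformat(value).year)


def generalize_zip(value: str) -> str:
    """Keep the first 3 digits of a ZIP code and star out the rest."""
    return value[:3] + "*" * (len(value) - 3)


def generalize_age(age: int, width: int = 10) -> str:
    """Bucket an age into a band; ages of 90 and above share one '90+' band."""
    if age >= 90:
        return "90+"
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


print(generalize_date("1985-03-15"), generalize_zip("90210"), generalize_age(45))
```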

SKIP: Skip (No Protection)
Reversible

No transformation applied. Data passes through unchanged. Use for non-sensitive fields.

Data is NOT protected. Only use for non-sensitive fields.

Good for

  • Non-sensitive data
  • Public information
  • IDs that must remain readable

Not for

  • PII
  • PHI
  • Any sensitive data

Examples

product_category → product_category
order_status → order_status

PII Auto-Detection

Nexion automatically detects these PII types and applies the configured protection method. Override defaults per field or per Data Pod.

Type              Description               Default method
EMAIL             Email addresses           HASH
PHONE_NUMBER      Phone numbers             MASK
PERSON_NAME       Full names                REDACT
ADDRESS           Physical addresses        REDACT
SSN               Social Security Numbers   REDACT
CREDIT_CARD       Credit card numbers       TOKENIZE
DATE_OF_BIRTH     Birth dates               MASK
IP_ADDRESS        IP addresses              MASK
PASSPORT_NUMBER   Passport numbers          REDACT
DRIVER_LICENSE    Driver license numbers    REDACT
BANK_ACCOUNT      Bank account numbers      TOKENIZE
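The defaults and overrides described above can be pictured as a lookup with two override layers. This is a sketch only: `resolve_method` and its parameters are hypothetical, and the precedence (field override, then Data Pod override, then default) is an assumption, since the source does not state which override wins.

```python
# Default method per detected PII type, as listed in this section.
DEFAULT_PII_METHODS = {
    "EMAIL": "HASH",
    "PHONE_NUMBER": "MASK",
    "PERSON_NAME": "REDACT",
    "ADDRESS": "REDACT",
    "SSN": "REDACT",
    "CREDIT_CARD": "TOKENIZE",
    "DATE_OF_BIRTH": "MASK",
    "IP_ADDRESS": "MASK",
    "PASSPORT_NUMBER": "REDACT",
    "DRIVER_LICENSE": "REDACT",
    "BANK_ACCOUNT": "TOKENIZE",
}


def resolve_method(pii_type, field_name, pod_overrides=None, field_overrides=None):
    """Pick a method: field override, else pod-level override, else the default."""
    field_overrides = field_overrides or {}
    pod_overrides = pod_overrides or {}
    if field_name in field_overrides:
        return field_overrides[field_name]
    if pii_type in pod_overrides:
        return pod_overrides[pii_type]
    return DEFAULT_PII_METHODS[pii_type]


print(resolve_method("EMAIL", "contact_email"))
print(resolve_method("EMAIL", "contact_email", pod_overrides={"EMAIL": "ENCRYPT"}))
```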
HIPAA Compliance

HIPAA Safe Harbor

HIPAA Safe Harbor de-identification requires removing or generalizing 18 specific identifiers. Nexion automatically detects and protects all 18 when PHI protection is enabled.

Automatic Detection

AI-powered detection identifies PHI fields with a configurable confidence threshold (default 80%).

Minimum Protection Enforced

PHI fields cannot be downgraded to less secure methods; ENCRYPT or TOKENIZE is required.

BAA Support

Business Associate Agreement available for enterprise customers handling PHI.
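The confidence threshold and the minimum-protection rule described above can be sketched as two small checks. The names are hypothetical, the threshold is treated as inclusive by assumption, and the allowed set is taken literally from the statement that ENCRYPT or TOKENIZE is required.

```python
DEFAULT_CONFIDENCE_THRESHOLD = 0.80      # "default 80%" from the text above
PHI_ALLOWED_METHODS = {"ENCRYPT", "TOKENIZE"}


def is_detected(confidence, threshold=DEFAULT_CONFIDENCE_THRESHOLD):
    """Treat a field as PHI when the detector's confidence meets the threshold."""
    return confidence >= threshold


def validate_phi_method(field, method):
    """Reject any attempt to protect a PHI field with a weaker method."""
    if method not in PHI_ALLOWED_METHODS:
        allowed = ", ".join(sorted(PHI_ALLOWED_METHODS))
        raise ValueError(f"{field}: {method} is not allowed for PHI (use {allowed})")
    return method


print(is_detected(0.85), is_detected(0.60))
print(validate_phi_method("diagnosis", "ENCRYPT"))
```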

18 HIPAA Safe Harbor Identifiers

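1. Names
2. Geographic data (subdivisions smaller than a state)
3. Dates (except year)
4. Phone numbers
5. Fax numbers
6. Email addresses
7. Social Security numbers
8. Medical record numbers
9. Health plan beneficiary numbers
10. Account numbers
11. Certificate/license numbers
12. Vehicle identifiers and serial numbers
13. Device identifiers and serial numbers
14. URLs
15. IP addresses
16. Biometric identifiers
17. Full-face photographs
18. Any other unique identifying number, characteristic, or code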

PHI Auto-Detection

Healthcare-specific data types automatically detected and protected with ENCRYPT by default.

Type               Description                  Default method
MEDICAL_RECORD     Medical record numbers       ENCRYPT
HEALTH_CONDITION   Diagnoses, conditions        ENCRYPT
MEDICATION         Prescriptions, medications   ENCRYPT
TREATMENT          Treatment information        ENCRYPT
DIAGNOSIS          Diagnosis codes (ICD-10)     ENCRYPT
LAB_RESULT         Laboratory results           ENCRYPT
PATIENT_ID         Patient identifiers          TOKENIZE
PROVIDER_ID        Healthcare provider IDs      TOKENIZE
HEALTH_PLAN_ID     Insurance plan IDs           TOKENIZE
DEVICE_ID          Medical device identifiers   HASH
BIOMETRIC_ID       Biometric identifiers        HASH
HIPAA Compliance by Design

Two-Layer Protection Architecture

Sensitive data is never persisted unprotected. Our architecture ensures HIPAA compliance through defense in depth: protection at the pipeline level AND the storage level.

1. Pipeline Protection

First line of defense

  • PII/PHI nodes auto-detect sensitive fields
  • HIPAA-appropriate method suggested per type
  • All processing in-memory (never persisted raw)
  • Override or skip fields as needed

Data Flow

Source → Extract → Transform → PII/PHI Node → Load

2. DataPod Safety Net

Second line of defense

  • Respects pipeline decisions (skip = skip)
  • Catches NEW sensitive fields not in pipeline
  • Applies DataPod-level compliance rules
  • Nothing stored unprotected without explicit skip

Safety Net

Pipeline Output → DataPod → Protected Storage
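A rough sketch of how the second layer could behave, under the rules listed above: fields the pipeline already decided on (including an explicit skip) pass through untouched, while any newly detected sensitive field is protected before storage. `safety_net`, `detect_type`, and `protect` are hypothetical names, and the detector and protector here are toy stand-ins.

```python
def safety_net(record, pipeline_decisions, detect_type, protect):
    """Second layer: protect sensitive fields the pipeline did not handle.

    record:             field -> value as emitted by the pipeline
    pipeline_decisions: field -> method already applied upstream (SKIP included)
    detect_type:        callable(field, value) -> sensitive type name or None
    protect:            callable(type_name, value) -> protected value
    """
    stored = {}
    for field, value in record.items():
        if field in pipeline_decisions:
            # Respect the pipeline's decision, including an explicit skip.
            stored[field] = value
            continue
        sensitive_type = detect_type(field, value)
        stored[field] = protect(sensitive_type, value) if sensitive_type else value
    return stored


record = {"email": "<hashed upstream>", "order_status": "shipped", "ssn": "123-45-6789"}
decisions = {"email": "HASH", "order_status": "SKIP"}
result = safety_net(
    record,
    decisions,
    detect_type=lambda field, value: "SSN" if field == "ssn" else None,
    protect=lambda type_name, value: f"[{type_name} REDACTED]",
)
print(result)
```

Here the `ssn` column was never configured in the pipeline, so the safety net catches it; the other two fields are stored exactly as the pipeline emitted them.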

Traditional ETL

  • Raw PHI written to disk during processing
  • Manual protection rules per pipeline
  • No automatic detection
  • Single point of failure for compliance

Nexion

  • All processing in-memory until final protected write
  • Zero-config auto-detection with HIPAA defaults
  • Two independent protection layers
  • Defense in depth for compliance

How Protection Works

1. Schema Discovery

Nexion scans your source tables and discovers the schema structure.

2. PII/PHI Detection

AI analyzes column names, data types, and sample values to detect sensitive fields.

3. Rule Application

Default protection rules are applied based on the Data Pod's compliance level.

4. Review & Override

Review detected fields and override protection methods as needed.

5. Data Flow Execution

When the data flow runs, the configured protection methods are applied during transformation, after extraction and before load.

6. Audit Logging

Every protection action is logged with field, method, and timestamp.
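To make the detection and audit steps above concrete, here is a small rule-based sketch. It is a stand-in, not Nexion's AI detector: it scores a column from name hints and regex matches on sample values, with weights chosen arbitrarily for illustration. All names (`detect_column`, `audit_entry`, the hint and pattern tables) are hypothetical; the audit entry carries the three attributes named in step 6.

```python
import re
from datetime import datetime, timezone

# Illustrative hints and patterns only; a real detector would cover far more types.
NAME_HINTS = {
    "EMAIL": ("email",),
    "SSN": ("ssn", "social_security"),
    "PHONE_NUMBER": ("phone", "mobile"),
}
VALUE_PATTERNS = {
    "EMAIL": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "SSN": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}


def detect_column(name, samples):
    """Return (type, confidence) from column-name hints and sample values."""
    lowered = name.lower()
    best_type, best_score = None, 0.0
    for pii_type, hints in NAME_HINTS.items():
        name_score = 0.5 if any(hint in lowered for hint in hints) else 0.0
        pattern = VALUE_PATTERNS.get(pii_type)
        if pattern and samples:
            match_ratio = sum(bool(pattern.match(s)) for s in samples) / len(samples)
        else:
            match_ratio = 0.0
        score = name_score + 0.5 * match_ratio
        if score > best_score:
            best_type, best_score = pii_type, score
    return best_type, best_score


def audit_entry(field, method):
    """Record one protection action with field, method, and a UTC timestamp."""
    return {
        "field": field,
        "method": method,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


print(detect_column("customer_email", ["a@b.com", "c@d.org"]))
print(audit_entry("customer_email", "HASH"))
```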

Protect your sensitive data today

Start detecting and protecting PII/PHI automatically with Nexion.