Core Concept: Hard Execution Boundaries

What is a Data Pod?

A Data Pod is a logical storage unit that encapsulates data together with enforceable rules for quality, compliance, access control, and lineage. It acts as a hard execution boundary within the Nexion runtime.

Unlike traditional datasets, which are passive storage, Data Pods actively enforce rules at runtime. No data is stored in a Pod until it has passed through the Pod's configured validation and protection pipeline.
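The idea of a write path that cannot be bypassed can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Nexion's API: the `DataPod` class, its `write` method, the `ValidationError` type, and the helper functions are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable

Record = dict[str, str]


class ValidationError(Exception):
    """Raised when a record fails one of the pod's quality gates."""


@dataclass
class DataPod:
    # Hypothetical stand-in for a pod; not the Nexion API.
    name: str
    validators: list[Callable[[Record], bool]] = field(default_factory=list)
    protections: dict[str, Callable[[str], str]] = field(default_factory=dict)
    _rows: list[Record] = field(default_factory=list)

    def write(self, record: Record) -> Record:
        # Gate 1: every validator must pass, otherwise nothing is stored.
        for check in self.validators:
            if not check(record):
                raise ValidationError(f"{self.name}: rejected by {check.__name__}")
        # Gate 2: apply the configured protection to each protected field.
        stored = {
            key: self.protections[key](value) if key in self.protections else value
            for key, value in record.items()
        }
        self._rows.append(stored)
        return stored

    def rows(self) -> list[Record]:
        # Return copies so callers cannot mutate stored data directly.
        return [dict(row) for row in self._rows]


def has_email(record: Record) -> bool:
    return "@" in record.get("email", "")


def mask_email(value: str) -> str:
    # Keep the domain, hide the local part.
    _, _, domain = value.partition("@")
    return "****@" + domain


pod = DataPod("pii_pod", validators=[has_email], protections={"email": mask_email})
stored = pod.write({"email": "john@email.com", "country": "DE"})
```

Because `write` is the only way rows reach `_rows` in this sketch, a record either passes every gate and is stored in protected form, or it raises and is not stored at all.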

Dataset vs Data Pod

Aspect             | Traditional Dataset          | Nexion Data Pod
Primary Function   | Storage                      | Storage + Enforcement
Policies           | Documented                   | Executed at Runtime
Compliance         | Only at rest (after storage) | At execution + at rest
Enforcement        | User-dependent               | Runtime-enforced
Data Quality       | Optional validation          | Mandatory gates
PII/PHI Protection | Manual process               | Auto-enforced
Lineage            | External tool                | Built-in tracking

Data Pods are the core differentiator of Nexion. They transform passive storage into active, governed data products.

Data Pod Architecture

PII Pod
Compliance: PII + GDPR
Protection: MASK, HASH
Destination: Configurable
Format: Delta Lake
Tables: customers, orders

PHI Pod
Compliance: PHI + HIPAA
Protection: ENCRYPT
Destination: Configurable
Format: Delta Lake
Tables: patients, diagnoses

Analytics Pod
Compliance: GENERAL
Protection: None
Destination: Configurable
Format: Delta Lake
Tables: metrics, aggregates

Each pod can connect to any Storage Destination (OneLake, ADLS, S3, GCS) and has its own isolated encryption keys and access policies.
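The three example pods can be written down as plain configuration records, which makes the per-pod isolation easy to check. This is a sketch only: `PodConfig`, the `key_id` field, the key names, and the destinations chosen here are assumptions for illustration (the source lists each destination as configurable), not Nexion's configuration schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PodConfig:
    # Hypothetical configuration record; field names are illustrative.
    name: str
    compliance: str
    protections: tuple[str, ...]
    tables: tuple[str, ...]
    destination: str   # chosen per pod; the values below are arbitrary picks
    key_id: str        # assumed: one encryption key per pod
    storage_format: str = "delta"


PODS = [
    PodConfig("pii_pod", "PII", ("MASK", "HASH"),
              ("customers", "orders"), "s3", "key-pii"),
    PodConfig("phi_pod", "PHI", ("ENCRYPT",),
              ("patients", "diagnoses"), "adls", "key-phi"),
    PodConfig("analytics_pod", "GENERAL", (),
              ("metrics", "aggregates"), "onelake", "key-analytics"),
]


def keys_are_isolated(pods: list[PodConfig]) -> bool:
    # Isolation here simply means no two pods share an encryption key.
    key_ids = [pod.key_id for pod in pods]
    return len(key_ids) == len(set(key_ids))
```

Keeping the key identifier on the pod record, rather than on the storage destination, reflects the statement that isolation follows the pod and not the backend it happens to use.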

Compliance Levels

Choose the right compliance level for your data. GENERAL is the baseline; PII adds protections on top of GENERAL, and PHI and PCI each add their own protections on top of PII.

GENERAL

Standard data without special compliance requirements

  • Delta Lake storage with ACID
  • Basic encryption at rest
  • Standard audit logging
  • Access control policies

Use cases:

  • Analytics data
  • Public datasets
  • Non-sensitive reports

PII

Personally Identifiable Information with GDPR/CCPA protection

  • All GENERAL features
  • Auto PII detection
  • 6 protection methods
  • GDPR compliance ready
  • Data retention policies
  • Subject access requests

Use cases:

  • Customer data
  • HR records
  • Marketing databases

PHI

Protected Health Information with HIPAA Safe Harbor

  • All PII features
  • HIPAA Safe Harbor (18 identifiers)
  • PHI auto-detection
  • Enhanced encryption
  • Compliance audit trails
  • BAA support

Use cases:

  • Patient records
  • Medical claims
  • Healthcare analytics

PCI

Payment Card Industry Data Security Standard

  • All PII features
  • Credit card tokenization
  • PCI-DSS controls
  • Cardholder data isolation
  • Enhanced monitoring
  • Quarterly scans support

Use cases:

  • Payment processing
  • Transaction data
  • Billing systems
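Going by the feature lists above, the levels form a small tree rather than a single ladder: PII starts from "All GENERAL features", while PHI and PCI each start from "All PII features". The sketch below resolves the effective feature set for a level by walking that tree. The `PARENT` and `OWN_FEATURES` tables and the `effective_features` function are illustrative names, not part of Nexion.

```python
from typing import Optional

# Which level each level builds on, as read from the feature lists.
PARENT: dict[str, Optional[str]] = {
    "GENERAL": None,
    "PII": "GENERAL",
    "PHI": "PII",
    "PCI": "PII",
}

# Features each level adds on top of its parent.
OWN_FEATURES: dict[str, list[str]] = {
    "GENERAL": ["Delta Lake storage with ACID", "Basic encryption at rest",
                "Standard audit logging", "Access control policies"],
    "PII": ["Auto PII detection", "6 protection methods", "GDPR compliance ready",
            "Data retention policies", "Subject access requests"],
    "PHI": ["HIPAA Safe Harbor (18 identifiers)", "PHI auto-detection",
            "Enhanced encryption", "Compliance audit trails", "BAA support"],
    "PCI": ["Credit card tokenization", "PCI-DSS controls",
            "Cardholder data isolation", "Enhanced monitoring",
            "Quarterly scans support"],
}


def effective_features(level: str) -> list[str]:
    # Collect the chain from the requested level up to the root.
    chain = []
    current: Optional[str] = level
    while current is not None:
        chain.append(current)
        current = PARENT[current]
    # List inherited features first, then the level's own additions.
    features: list[str] = []
    for name in reversed(chain):
        features.extend(OWN_FEATURES[name])
    return features
```

One consequence of this shape is that a PHI pod does not pick up PCI features such as credit card tokenization, and a PCI pod does not pick up the HIPAA-specific ones.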

6 Protection Methods

Configure how each field type is protected. Each method suits a different use case.

MASK (one-way)
Partial masking that preserves format
Example: john@email.com → ****@email.com

HASH (one-way)
SHA-256 cryptographic hash
Example: john@email.com → a8f5f167f44f...

REDACT (one-way)
Complete removal with placeholder
Example: 123-45-6789 → [SSN REDACTED]

ENCRYPT (reversible)
AES-256 encryption
Example: john@email.com → gAAAAABl7x2K...

TOKENIZE (reversible)
Replace with token, original in vault
Example: john@email.com → tok_a1b2c3d4e5f6

SKIP (reversible)
No transformation (explicit opt-out)
Example: john@email.com → john@email.com
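Most of these methods are simple enough to sketch with the Python standard library. The functions below are illustrative stand-ins, not Nexion's implementations: the names, the in-memory `TokenVault`, and the token length are assumptions. ENCRYPT is left out because the standard library has no AES implementation, and the source does not say which library or mode Nexion uses.

```python
import hashlib
import secrets


def mask_email(value: str) -> str:
    # MASK: hide the local part, keep the "@domain" shape.
    _, sep, domain = value.partition("@")
    return "****" + sep + domain


def hash_value(value: str) -> str:
    # HASH: plain SHA-256 hex digest. Whether a salt or key is used
    # in practice is not stated in the source, so none is added here.
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def redact(label: str) -> str:
    # REDACT: the original value is dropped; only a placeholder remains.
    return f"[{label} REDACTED]"


class TokenVault:
    """TOKENIZE: hypothetical in-memory vault mapping tokens to originals."""

    def __init__(self) -> None:
        self._originals: dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        token = "tok_" + secrets.token_hex(6)  # 12 random hex characters
        self._originals[token] = value
        return token

    def detokenize(self, token: str) -> str:
        return self._originals[token]


def skip(value: str) -> str:
    # SKIP: explicit opt-out, the value is stored unchanged.
    return value
```

The one-way methods return values from which the original cannot be read back, while the vault keeps the original so that an authorized caller can recover it from the token.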

Flexible Storage

Each Data Pod can use a different Storage Destination. Choose based on your cloud strategy, compliance requirements, or performance needs.

Microsoft OneLake

Fabric native, Unity Catalog compatible

Azure Data Lake Gen2

Hierarchical namespace, Azure native

Amazon S3

AWS native, most compatible

Google Cloud Storage

GCP native, BigQuery integration
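In practice, the four destinations differ mainly in how a table location is addressed. The sketch below builds a location string using the URI schemes commonly used for each backend. How Nexion itself addresses storage is not stated in the source, so the `table_uri` function, its parameters, and the short destination names are assumptions for illustration.

```python
def table_uri(destination: str, container: str, path: str, account: str = "") -> str:
    """Build a table location for a pod; a hypothetical helper, not Nexion's API.

    `container` is the bucket (S3, GCS), the container (ADLS Gen2),
    or the workspace (OneLake). `account` is only used for ADLS Gen2.
    """
    if destination == "s3":
        return f"s3://{container}/{path}"
    if destination == "gcs":
        return f"gs://{container}/{path}"
    if destination == "adls":
        if not account:
            raise ValueError("ADLS Gen2 needs a storage account name")
        return f"abfss://{container}@{account}.dfs.core.windows.net/{path}"
    if destination == "onelake":
        return f"abfss://{container}@onelake.dfs.fabric.microsoft.com/{path}"
    raise ValueError(f"unknown destination: {destination}")
```

For example, `table_uri("s3", "pii-bucket", "customers")` yields `s3://pii-bucket/customers`; the bucket and table names here are made up.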

Delta Lake

All storage uses Delta Lake format

ACID transactions, time travel, and schema evolution on any backend
