A production-grade Streamlit application that intercepts raw data, audits it against the PECA 2016 (Pakistan's Prevention of Electronic Crimes Act) and GDPR compliance frameworks, encrypts sensitive PII fields, and produces an immutable structured audit log.

Pipeline stages:

                                Raw Data Source
                                       │
                                       ▼
    ┌──────────────────────────────────────────────────────────┐
    │  Stage 1 · INTERCEPT                                     │
    │  DataInterceptor: SHA-256 checksum, batch ID, metadata   │
    └──────────────────────────────────┬───────────────────────┘
                                       │
                                       ▼
    ┌──────────────────────────────────────────────────────────┐
    │  Stage 2 · AUDIT (GDPR + PECA)                           │
    │  ComplianceAuditor: PII detection, rule matching,        │
    │  violation scoring, CRITICAL/HIGH/MEDIUM/LOW risk rating │
    └──────────────────────────────────┬───────────────────────┘
                                       │
                                       ▼
    ┌──────────────────────────────────────────────────────────┐
    │  Stage 3 · ENCRYPT                                       │
    │  DataEncryptor: AES-256-GCM (or Fernet / RSA-OAEP+AES)   │
    │  All PII fields replaced with ENC:<base64(nonce+ct+tag)> │
    └──────────────────────────────────┬───────────────────────┘
                                       │
                                       ▼
    ┌──────────────────────────────────────────────────────────┐
    │  Stage 4 · LOG                                           │
    │  ComplianceLogger: append-only JSON structured log,      │
    │  exportable as JSON Lines (SIEM) or CSV                  │
    └──────────────────────────────────────────────────────────┘
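
Stages 1 and 4 in the diagram above could look roughly like the sketch below. It is an illustration under assumptions, not the code in `interceptor.py` or `logger.py`: the function names `intercept` and `append_log`, the batch field names, and the use of canonical JSON for the checksum are all hypothetical choices.

```python
import hashlib
import json
import uuid
from datetime import datetime, timezone


def intercept(records, source="unknown"):
    """Stage 1 sketch: wrap raw records in a batch with a checksum and metadata."""
    # Canonical JSON (sorted keys, fixed separators) so that the same records
    # always produce the same SHA-256 digest regardless of key order.
    payload = json.dumps(records, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return {
        "batch_id": str(uuid.uuid4()),  # hypothetical batch ID scheme
        "checksum_sha256": hashlib.sha256(payload).hexdigest(),
        "record_count": len(records),
        "source": source,
        "intercepted_at": datetime.now(timezone.utc).isoformat(),
        "records": records,
    }


def append_log(path, event):
    """Stage 4 sketch: append one JSON object per line (JSON Lines)."""
    # Mode "a" only ever adds to the end of the file; real immutability would
    # additionally need write-once storage or similar, which is out of scope here.
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(event, sort_keys=True) + "\n")
```

Because the checksum is computed over canonical JSON, two batches with the same records in a different key order share a digest while still receiving distinct batch IDs.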

GDPR rules:

| Rule ID | Article | Field | Risk |
|---|---|---|---|
| GDPR-ART25-001 | Art. 25 – Data Minimisation | national_id | HIGH |
| GDPR-ART32-001 | Art. 32 – Security of Processing | credit_card | CRITICAL |
| GDPR-ART35-001 | Art. 35 – DPIA Required | dob | MEDIUM |
| GDPR-ART5-001 | Art. 5 – Purpose Limitation | ip_address | MEDIUM |
| GDPR-ART5-002 | Art. 5 – Lawfulness | | MEDIUM |

PECA 2016 rules:

| Rule ID | Section | Field | Risk |
|---|---|---|---|
| PECA-SEC14-001 | Sec. 14 – Identity Information | national_id | HIGH |
| PECA-SEC18-001 | Sec. 18 – Data Protection | phone | MEDIUM |
| PECA-SEC34-001 | Sec. 34 – Dignity/Privacy | dob | LOW |
| PECA-SEC14-002 | Sec. 14 – Identity Information | credit_card | CRITICAL |
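
The rule tables above can be read as a lookup from field name to rule and risk level. The sketch below shows one way such a rule engine might work; it is not the logic in `auditor.py`. It assumes a rule fires whenever its field is present and non-empty in a record, and that the overall rating is the highest risk among the matched rules. The `Rule` class and `audit_record` function are hypothetical names, and GDPR-ART5-002 is left out because its field is not listed.

```python
from dataclasses import dataclass

# Risk levels from lowest to highest, used to pick the overall rating.
RISK_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]


@dataclass(frozen=True)
class Rule:
    rule_id: str
    framework: str
    field: str
    risk: str


# Rule IDs, fields and risk levels copied from the tables above.
RULES = [
    Rule("GDPR-ART25-001", "GDPR", "national_id", "HIGH"),
    Rule("GDPR-ART32-001", "GDPR", "credit_card", "CRITICAL"),
    Rule("GDPR-ART35-001", "GDPR", "dob", "MEDIUM"),
    Rule("GDPR-ART5-001", "GDPR", "ip_address", "MEDIUM"),
    Rule("PECA-SEC14-001", "PECA", "national_id", "HIGH"),
    Rule("PECA-SEC18-001", "PECA", "phone", "MEDIUM"),
    Rule("PECA-SEC34-001", "PECA", "dob", "LOW"),
    Rule("PECA-SEC14-002", "PECA", "credit_card", "CRITICAL"),
]


def audit_record(record):
    """Return (matched_rules, overall_risk) for one record (a dict)."""
    # Assumption: a rule matches when its field holds any non-empty value.
    hits = [r for r in RULES if record.get(r.field) not in (None, "")]
    if not hits:
        return [], None
    overall = max((r.risk for r in hits), key=RISK_ORDER.index)
    return hits, overall
```

For example, a record carrying only `dob` and `phone` matches GDPR-ART35-001, PECA-SEC18-001 and PECA-SEC34-001, so its overall rating under this scheme is MEDIUM.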

Encryption algorithms:

| Algorithm | Key Size | Mode | Notes |
|---|---|---|---|
| AES-256-GCM | 256-bit | Authenticated | Default — recommended |
| Fernet (AES-128-CBC) | 128-bit AES + 128-bit HMAC key | CBC + HMAC-SHA256 | Simple symmetric |
| RSA-OAEP + AES | 2048-bit RSA + 256-bit AES | Hybrid | Key wrapping |

Encrypted value format: `ENC:<base64(nonce + ciphertext + tag)>`
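
To make the `ENC:` layout concrete, the sketch below packs and unpacks such a value using only the standard library. It performs no encryption and is not the code in `encryptor.py`. It assumes a 12-byte nonce, a 16-byte tag (the usual AES-GCM sizes) and the standard base64 alphabet. `pack_enc` and `unpack_enc` are hypothetical helper names.

```python
import base64

PREFIX = "ENC:"
NONCE_LEN = 12  # bytes; assumed, the usual AES-GCM nonce size
TAG_LEN = 16    # bytes; assumed, the full-length GCM authentication tag


def pack_enc(nonce, ciphertext, tag):
    """Build 'ENC:<base64(nonce + ciphertext + tag)>' from its three parts."""
    if len(nonce) != NONCE_LEN or len(tag) != TAG_LEN:
        raise ValueError("unexpected nonce or tag length")
    return PREFIX + base64.b64encode(nonce + ciphertext + tag).decode("ascii")


def unpack_enc(value):
    """Split an ENC: value back into (nonce, ciphertext, tag)."""
    if not value.startswith(PREFIX):
        raise ValueError("not an ENC: value")
    raw = base64.b64decode(value[len(PREFIX):], validate=True)
    if len(raw) < NONCE_LEN + TAG_LEN:
        raise ValueError("payload too short")
    # Fixed-size nonce at the front, fixed-size tag at the back,
    # everything in between is ciphertext.
    return raw[:NONCE_LEN], raw[NONCE_LEN:-TAG_LEN], raw[-TAG_LEN:]
```

If the encryption itself is done with the `cryptography` package, `AESGCM.encrypt` already returns the ciphertext with the tag appended, so prefixing the nonce to its output yields the same byte layout.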

Run locally:

    pip install -r requirements.txt
    streamlit run app.py

Run with Docker Compose:

    docker compose up --build
    # Open http://localhost:8501

Run with Docker directly:

    docker build -t complianceshield .
    docker run -p 8501:8501 complianceshield

Replace DataInterceptor.intercept() with a PySpark job:

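    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("ComplianceShield").getOrCreate()
    df = spark.read.json("s3a://raw-data/landing/")
    # compliance_audit_udf must accept an iterator of Rows (one partition) and yield Rows
    df = df.rdd.mapPartitions(compliance_audit_udf).toDF()
    # The "delta" format requires the Delta Lake package on the Spark classpath
    df.write.format("delta").mode("append").save("s3a://processed/compliant/")

Orchestrate the four stages with Airflow:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    # intercept, audit_records, encrypt_fields and write_audit_log are the stage callables.
    # schedule= requires Airflow 2.4+; the start_date below is a placeholder.
    with DAG("compliance_pipeline", schedule="@hourly",
             start_date=datetime(2024, 1, 1), catchup=False) as dag:
        ingest = PythonOperator(task_id="ingest", python_callable=intercept)
        audit = PythonOperator(task_id="audit", python_callable=audit_records)
        encrypt = PythonOperator(task_id="encrypt", python_callable=encrypt_fields)
        log_task = PythonOperator(task_id="log", python_callable=write_audit_log)
        ingest >> audit >> encrypt >> log_task

Key management:

- Store AES keys in Azure Key Vault or AWS KMS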
- Implement 90-day automatic key rotation
- Use Hardware Security Modules (HSM) for RSA private keys
- Log all key access events to the compliance audit trail
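
The rotation and access-logging points above can be sketched without tying the example to a particular key store. The code below is illustrative only: `rotation_due` and `key_access_event` are hypothetical helpers, and in practice Azure Key Vault or AWS KMS would hold the key metadata and enforce rotation.

```python
from datetime import datetime, timedelta, timezone

ROTATION_PERIOD = timedelta(days=90)  # the 90-day policy from the list above


def rotation_due(created_at, now=None):
    """Return True once a key is at least 90 days old."""
    now = now or datetime.now(timezone.utc)
    return now - created_at >= ROTATION_PERIOD


def key_access_event(key_id, actor, action, now=None):
    """Build an audit-trail entry for a key access (field names are assumptions)."""
    now = now or datetime.now(timezone.utc)
    return {
        "event": "key_access",
        "key_id": key_id,
        "actor": actor,
        "action": action,  # e.g. "encrypt", "decrypt", "rotate"
        "timestamp": now.isoformat(),
    }
```

A scheduled job could call `rotation_due` for each key and append the dictionary from `key_access_event` to the compliance audit trail whenever a key is used or rotated.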

Project structure:

    compliance_pipeline/
    ├── app.py                 # Streamlit UI
    ├── Dockerfile
    ├── docker-compose.yml
    ├── requirements.txt
    ├── README.md
    └── pipeline/
        ├── __init__.py
        ├── interceptor.py     # Stage 1: Data interception + checksums
        ├── auditor.py         # Stage 2: GDPR/PECA rule engine
        ├── encryptor.py       # Stage 3: AES-256-GCM encryption
        ├── logger.py          # Stage 4: Structured audit logging
        └── spark_engine.py    # Spark/Airflow execution simulation