Organizations cannot validate whether their Data Security Posture Management (DSPM) tools actually work without uploading real, sensitive production data, which creates a chicken-and-egg security risk. Security teams need a way to test classification accuracy, data discovery coverage, and alert tuning without exposing actual PII, PHI, or financial data.
A Python utility that generates high-fidelity synthetic PII: files that look like real customer data to a scanner but contain zero actual sensitive information. The tool produces realistic patterns for SSNs, credit card numbers, email addresses, phone numbers, and medical record numbers across multiple file formats.
Validates tool effectiveness and classification accuracy in a controlled environment. Security teams can benchmark DSPM solutions (BigID, Microsoft Purview, Wiz) before purchase, or tune existing deployments, without compliance risk.
- Script to generate 1,000 rows of synthetic data in CSV, JSON, and Parquet formats.
- Configurable "data profiles" (e.g., Healthcare, Financial Services, Retail).
- A "Ground Truth" manifest file that documents what sensitive data patterns exist in each generated file.
- Python
- Faker library
- Pandas
- generate_test_data.py
- ground_truth_manifest.json
- sample_datasets/ folder with pre-generated test files
- README.md with usage examples
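For reference, a ground-truth manifest entry might look like the following. This schema is a hypothetical sketch, not a fixed format; the field names are illustrative:

```json
{
  "files": [
    {
      "path": "sample_datasets/financial_test.csv",
      "rows": 1000,
      "expected_findings": {
        "ssn": "US_SSN",
        "credit_card": "CREDIT_CARD",
        "email": "EMAIL_ADDRESS"
      }
    }
  ]
}
```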
- Adding "obfuscated" data patterns to test advanced ML-based classification.
- Support for unstructured data (PDFs, DOCX files with embedded PII).
- API endpoint to generate data on-demand for CI/CD testing.