Releases: didrod205/datalint
v0.1.0
First public release of datalint — a local, deterministic CLI that profiles and lints CSV/TSV data quality.
Profiles
Per-column inferred type, empty rate, distinct count, min/max/mean and top values.
Catches
- 🧱 Ragged rows, duplicate/empty headers, empty columns/rows
- 🔢 Type drift (numbers mixed with text), numeric outliers (Tukey/IQR)
- 🈳 Missing values, leading/trailing whitespace
- 📅 Mixed date formats, inconsistent casing (US vs us)
- ♻️ Duplicate rows
- 📐 Optional schema: required, type, enum, range, pattern, unique, not-null
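The Tukey/IQR outlier check listed above can be sketched as follows. This is a minimal illustration, not datalint's actual code: the function names `quantile` and `tukeyOutliers`, the linear-interpolation quantile convention, and the default multiplier of 1.5 are all assumptions made for the example.

```typescript
// Quantile of an already-sorted array using linear interpolation
// (one common convention; other tools may use a different one).
function quantile(sorted: number[], q: number): number {
  const pos = (sorted.length - 1) * q;
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

// Returns the values lying outside the Tukey fences
// [Q1 - k * IQR, Q3 + k * IQR]; k = 1.5 is the conventional choice.
function tukeyOutliers(values: number[], k = 1.5): number[] {
  if (values.length < 4) return []; // too few points for meaningful quartiles
  const sorted = [...values].sort((a, b) => a - b);
  const q1 = quantile(sorted, 0.25);
  const q3 = quantile(sorted, 0.75);
  const iqr = q3 - q1;
  const lower = q1 - k * iqr;
  const upper = q3 + k * iqr;
  return values.filter((v) => v < lower || v > upper);
}

// Hypothetical column of numeric cells with one suspicious value.
console.log(tukeyOutliers([10, 12, 11, 13, 12, 11, 95])); // [ 95 ]
```

Fences built from quartiles are far less sensitive to the outliers themselves than a mean and standard deviation rule would be, which is presumably why a data-quality linter would favor them.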
Features
- Hand-rolled RFC 4180 CSV/TSV parser (no data-lib dependency) + delimiter auto-detection.
- Quality score + A–F grade, per file and overall.
- JSON + Markdown export; colored console output.
- --min-score CI gate (non-zero exit).
- Config file: delimiter, header, thresholds, schema, rule severities.
- Programmatic API (analyze, buildReport, parseCsv, toMarkdown). ESM + CJS + types.
- No Python, no API key, no server, nothing uploaded.
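To give a feel for what an RFC 4180 style parser with delimiter auto-detection involves, here is a small sketch. It is not datalint's implementation and does not use its `parseCsv` export: `parseDelimited` and `detectDelimiter` are hypothetical names, and the detection heuristic (pick the candidate that yields the widest consistent column count over the first ten lines) is an assumption chosen for simplicity.

```typescript
// Parses delimited text into rows of fields. Handles quoted fields,
// doubled quotes ("") as an escaped quote, and delimiters or line
// breaks inside quotes. Blank lines come back as a row with one empty field.
function parseDelimited(text: string, delimiter = ","): string[][] {
  const rows: string[][] = [];
  let row: string[] = [];
  let field = "";
  let inQuotes = false;
  let pending = false; // true once the current record has consumed any character

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes) {
      if (ch !== '"') {
        field += ch;
      } else if (text[i + 1] === '"') {
        field += '"'; // escaped quote inside a quoted field
        i++;
      } else {
        inQuotes = false; // closing quote
      }
    } else if (ch === "\n" || ch === "\r") {
      if (ch === "\r" && text[i + 1] === "\n") i++; // treat CRLF as one break
      row.push(field);
      rows.push(row);
      row = [];
      field = "";
      pending = false;
      continue;
    } else if (ch === '"' && field === "") {
      inQuotes = true; // opening quote at the start of a field
    } else if (ch === delimiter) {
      row.push(field);
      field = "";
    } else {
      field += ch;
    }
    pending = true;
  }
  if (pending) {
    // Flush the last record when the text has no trailing line break.
    row.push(field);
    rows.push(row);
  }
  return rows;
}

// Heuristic: parse a small sample with each candidate and keep the one
// that gives the most columns while every sampled row has the same width.
// Simplification: the sample is cut on raw line breaks, so a quoted
// multi-line field near the cut could skew the result.
function detectDelimiter(text: string, candidates = [",", "\t", ";", "|"]): string {
  const sample = text.split(/\r?\n/).slice(0, 10).join("\n");
  let best = candidates[0];
  let bestScore = 0;
  for (const d of candidates) {
    const widths = parseDelimited(sample, d).map((r) => r.length);
    const consistent = widths.every((w) => w === widths[0]);
    const score = consistent && widths[0] > 1 ? widths[0] : 0;
    if (score > bestScore) {
      best = d;
      bestScore = score;
    }
  }
  return best;
}

const tsv = "id\tname\n1\tAnn\n";
console.log(parseDelimited(tsv, detectDelimiter(tsv)));
```

A character-level state machine like this keeps ragged rows visible (each row simply has a different length), which is what a linter needs in order to report them rather than silently repair them.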
Install
npx @didrod2539/datalint scan data.csv
💖 Sponsor via Lemon Squeezy.