Releases: didrod205/datalint

v0.1.0

31 May 23:40

First public release of datalint — a local, deterministic CLI that profiles and lints CSV/TSV data quality.

Profiles

Per-column inferred type, empty rate, distinct count, min/max/mean and top values.
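
As a rough illustration of what such a per-column profile involves, the sketch below computes empty rate, distinct count and min/max/mean for one column of raw cell strings. The names `ColumnProfile` and `profileColumn` are hypothetical and are not part of datalint's API; the real tool's type inference is likely more careful than the `Number()` check used here.

```typescript
// Hypothetical shape for illustration only; not datalint's actual report format.
interface ColumnProfile {
  emptyRate: number; // share of cells that are empty or whitespace-only
  distinct: number;  // number of distinct non-empty values
  min?: number;      // numeric stats are set only when every non-empty cell is numeric
  max?: number;
  mean?: number;
}

function profileColumn(cells: string[]): ColumnProfile {
  const nonEmpty = cells.filter((c) => c.trim() !== "");
  const emptyRate =
    cells.length === 0 ? 0 : (cells.length - nonEmpty.length) / cells.length;
  const distinct = new Set(nonEmpty).size;

  // Simplistic numeric inference: Number() also accepts forms such as "0x10",
  // which a production linter would probably treat differently.
  const nums = nonEmpty.map(Number);
  const allNumeric = nums.length > 0 && nums.every((n) => Number.isFinite(n));
  if (!allNumeric) return { emptyRate, distinct };

  const sum = nums.reduce((a, b) => a + b, 0);
  return {
    emptyRate,
    distinct,
    min: Math.min(...nums),
    max: Math.max(...nums),
    mean: sum / nums.length,
  };
}
```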

Catches

  • 🧱 Ragged rows, duplicate/empty headers, empty columns/rows
  • 🔢 Type drift (numbers mixed with text), numeric outliers (Tukey/IQR)
  • 🈳 Missing values, leading/trailing whitespace
  • 📅 Mixed date formats, inconsistent casing (US vs us)
  • ♻️ Duplicate rows
  • 📐 Optional schema: required, type, enum, range, pattern, unique, not-null
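
For readers unfamiliar with the Tukey/IQR rule mentioned above, here is a minimal sketch of it. It flags values outside the fences Q1 - k·IQR and Q3 + k·IQR, with the conventional k = 1.5. The function names, the quartile interpolation method and the minimum sample size are assumptions made for illustration; datalint's own implementation and thresholds may differ.

```typescript
// Quantile of an ascending-sorted array using linear interpolation between
// neighbouring ranks (one of several common conventions).
function quantile(sorted: number[], q: number): number {
  const pos = (sorted.length - 1) * q;
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

// Returns the values lying outside the Tukey fences, in input order.
function tukeyOutliers(values: number[], k = 1.5): number[] {
  if (values.length < 4) return []; // assumed cutoff: too few points for quartiles
  const sorted = [...values].sort((a, b) => a - b);
  const q1 = quantile(sorted, 0.25);
  const q3 = quantile(sorted, 0.75);
  const iqr = q3 - q1;
  const lower = q1 - k * iqr;
  const upper = q3 + k * iqr;
  return values.filter((v) => v < lower || v > upper);
}
```

With the input `[10, 12, 11, 13, 12, 11, 100]`, Q1 is 11 and Q3 is 12.5, so the fences are 8.75 and 14.75 and only `100` is flagged.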

Features

  • Hand-rolled RFC 4180 CSV/TSV parser (no data-lib dependency) + delimiter auto-detection.
  • Quality score + A–F grade, per file and overall.
  • JSON + Markdown export; colored console output.
  • --min-score CI gate (non-zero exit).
  • Config file: delimiter, header, thresholds, schema, rule severities.
  • Programmatic API (analyze, buildReport, parseCsv, toMarkdown). ESM + CJS + types.
  • No Python, no API key, no server, nothing uploaded.
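
To give a sense of what an RFC 4180 parser has to handle, the sketch below is a small state machine covering quoted fields, doubled quotes as escapes, delimiters and line breaks inside quotes, and both CRLF and LF record endings. It is an independent illustration, not datalint's parser: `parseCsvSketch` is a hypothetical name, and the exported `parseCsv` may have a different signature and behavior.

```typescript
// Minimal RFC 4180-style parser sketch. Returns one string[] per record.
function parseCsvSketch(text: string, delimiter = ","): string[][] {
  const rows: string[][] = [];
  let row: string[] = [];
  let field = "";
  let inQuotes = false;
  let i = 0;

  while (i < text.length) {
    const ch = text[i];
    if (inQuotes) {
      if (ch === '"') {
        if (text[i + 1] === '"') {
          field += '"'; // a doubled quote inside a quoted field is a literal quote
          i += 2;
          continue;
        }
        inQuotes = false; // closing quote
        i++;
        continue;
      }
      field += ch; // delimiters and newlines are literal while quoted
      i++;
      continue;
    }
    if (ch === '"') {
      inQuotes = true;
    } else if (ch === delimiter) {
      row.push(field);
      field = "";
    } else if (ch === "\r" || ch === "\n") {
      if (ch === "\r" && text[i + 1] === "\n") i++; // treat CRLF as one line break
      row.push(field);
      rows.push(row);
      row = [];
      field = "";
    } else {
      field += ch;
    }
    i++;
  }

  // Flush a final record that has no trailing line break. Simplification: a last
  // line consisting of a single empty field is not distinguished from no line.
  if (field !== "" || row.length > 0) {
    row.push(field);
    rows.push(row);
  }
  return rows;
}
```

Once records are parsed this way, a ragged-row check reduces to comparing each record's length with the header's length, and passing `"\t"` as the delimiter covers TSV input.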

Install

npx @didrod2539/datalint scan data.csv

💖 Sponsor via Lemon Squeezy.