Joel Natividad edited this page May 13, 2026 · 2 revisions

Integrations

Tier: Intermediate

qsv lives inside larger pipelines. This page maps the major integration surfaces — pick what's relevant to your stack.

CKAN / DataPusher+

CKAN is the de facto open-source data portal. qsv is a first-class CKAN citizen via:

  • safenames — produces CKAN Datastore-safe column names
  • applydp — slim transform operations for the DataPusher+ Postgres COPY pipeline
  • qsvdp binary variant — the slim build DataPusher+ ships
  • to postgres — bulk load into the CKAN Datastore's underlying PostgreSQL
  • dynamicEnum with ckan:// URLs — validate column values against any CKAN-hosted reference CSV
  • sniff — health-check remote CKAN resources without downloading

See Recipe: CKAN Integration for the full pipeline.
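A minimal sketch of that path, assuming a local CKAN instance (the resource URL, connection string, and file names below are hypothetical):

```shell
# Sanitize headers so the CKAN Datastore accepts them
qsv safenames raw.csv > clean.csv

# Health-check a remote CKAN resource without downloading it
qsv sniff https://demo.ckan.org/datastore/dump/some-resource-id

# Bulk-load into the Datastore's underlying PostgreSQL
qsv to postgres 'postgres://ckan:pass@localhost:5432/ckan_datastore' clean.csv
```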

DuckDB

qsv complements DuckDB:

  • qsv to parquet + DuckDB = an excellent CSV → analytics pipeline. qsv cleans and converts; DuckDB queries.
  • qsv sqlp uses Polars SQL; qsv scoresql --duckdb uses DuckDB's planner for query scoring.
  • qsv describegpt with QSV_DUCKDB_PATH set uses DuckDB for SQL-RAG (highly recommended).

Typical pipeline:

qsv stats --stats-jsonl raw.csv             # build stats cache
qsv schema --polars raw.csv                 # build Polars schema
qsv to parquet outdir/ raw.csv              # convert to Parquet

duckdb -c "SELECT borough, COUNT(*) FROM read_parquet('outdir/raw.parquet') GROUP BY borough"

For more on DuckDB integration, see docs/help/sqlp.md and docs/help/scoresql.md.

Python notebooks

qsv ships sample Jupyter / Colab notebooks in contrib/notebooks/.

The pattern: call qsv from a notebook cell with a shell escape (!qsv …) or via subprocess.run. qsv handles the heavy data work; pandas / Polars / matplotlib handle modeling and plotting.

import subprocess, pandas as pd

subprocess.run(
    ['qsv', 'stats', '--everything', '--stats-jsonl', 'data.csv'],
    capture_output=True, text=True, check=True
)   # --stats-jsonl also caches the stats next to the input file
stats_df = pd.read_csv('data.stats.csv')   # read the stats cache qsv wrote

CI/CD — GitHub Actions

qsv works great as a data-quality gate. Drop a step into your workflow:

# .github/workflows/data-quality.yml
name: Data quality
on:
  pull_request:
    paths: ['data/**.csv']
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install qsv
        run: |
          # Pin a release tag: the asset name embeds the version, so a
          # 'releases/latest' URL breaks whenever a new version ships
          curl -L -o qsv.zip \
            https://github.com/dathere/qsv/releases/download/1.0.0/qsv-1.0.0-x86_64-unknown-linux-gnu.zip
          unzip qsv.zip
          sudo install qsv /usr/local/bin/
      - name: Validate
        run: |
          qsv validate data/customers.csv data/customers.schema.json
          if [ -f data/customers.csv.validation-errors.tsv ]; then
            echo "::error::Validation failed; see data/customers.csv.validation-errors.tsv"
            exit 1
          fi
      - name: Diff against last release
        run: |
          qsv diff --select id last_release/data.csv data/data.csv > delta.csv
          qsv count delta.csv

See Recipe: JSON Schema Validation and Recipe: Diff & Audit for patterns you can drop into CI.

qsv-recipes — community Luau scripts

qsv-recipes is a curated repo of Luau scripts for qsv luau:

  • ISBN validation
  • Unemployment rate enrichment
  • Stemming / text normalization
  • Geographic enrichment
  • Time-series transforms
  • … and many more

Use it with qsv luau map -x -f .... See Scripting (Luau / Python) and Recipe: Date Enrichment.

qsv-lookup-tables — community reference CSVs

qsv-lookup-tables is the curated reference-data repo. Access from Luau / template / validate with the dathere:// URL scheme:

qsv_register_lookup("us_states", "dathere://us-states-example.csv")

See Lookup Tables.
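The same dathere:// scheme works inside a validation schema via the dynamicEnum extension mentioned above; a hedged fragment, assuming a state column (property name is illustrative):

```json
{
  "properties": {
    "state": { "dynamicEnum": "dathere://us-states-example.csv" }
  }
}
```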

qsv-cookiecutter

qsv-cookiecutter is a Cookiecutter project scaffold for qsv-based data pipelines:

pipx install cookiecutter
cookiecutter gh:dathere/qsv-cookiecutter

You get a templated project with directory layout, Makefile, baseline shell scripts, and an analytics/ folder pre-wired for qsv.

Claude / LLMs

Three integration shapes:

  1. qsv MCP Server — qsv as an MCP server for any MCP-aware client. See MCP Server.
  2. Claude Cowork Plugin — 15 skills + 3 agents on top of the MCP server, for Claude Desktop. See Claude Cowork Plugin.
  3. qsv describegpt — qsv calls any OpenAI-compatible LLM directly (including Ollama / LM Studio / Jan local LLMs). See AI & Documentation.

Spreadsheets (Excel / LibreOffice)

  • qsv excel — Excel/ODS sheet → CSV.
  • qsv to xlsx — CSV(s) → Excel workbook (one sheet per CSV).
  • qsv to ods — CSV(s) → LibreOffice Calc.
  • MCP server auto-converts Excel inputs to CSV before running qsv commands (transparent to the LLM).

See Conversion & I/O.
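A round-trip sketch (file names are placeholders):

```shell
# Extract a worksheet to CSV (defaults to the first sheet)
qsv excel report.xlsx > report.csv

# Bundle several CSVs into one workbook, one sheet per file
qsv to xlsx combined.xlsx q1.csv q2.csv q3.csv
```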

Geospatial tooling

  • qsv geoconvert — CSV ↔ GeoJSON / SHP / KML / GPX.
  • qsv geocode — local Geonames + MaxMind GeoLite2 lookups (360k records/sec).
  • Pair with QGIS, GeoPandas, PostGIS, or kepler.gl for visualization.

See Geospatial and Recipe: Geographic Enrichment.
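A hedged sketch of the geocoding step (column and file names are made up; check qsv geocode --help for the exact subcommands and flags):

```shell
# Enrich a city-name column using the local Geonames index
qsv geocode suggest city stations.csv -o stations-geocoded.csv
```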

Shell pipelines

qsv reads/writes stdin/stdout. Pair freely with:

  • jq / jaq — JSON manipulation (qsv ships jaq as a built-in via --jaq)
  • xargs -P — parallelize over qsv-produced row lists
  • curl / wget — feed remote files (or use qsv fetch / sniff directly)
  • awk / sed — when qsv's regex isn't enough (rare)
  • psql / sqlite3 — for the database hop
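For instance, a sketch chaining qsv with jq-style JSON tooling (column names are invented):

```shell
# Filter rows with qsv, emit JSON Lines, reshape with jq
qsv search --select status 'active' users.csv \
  | qsv tojsonl \
  | jq -c '{id, email}'
```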

See also
