SQL-first ClickHouse migrations for Python teams.
PyClickHouseMigrator is a small, predictable migration runner for ClickHouse. It keeps schema changes in plain .sql files, applies them in order, stores migration state inside ClickHouse, validates checksums, supports rollback SQL, and fits naturally into CI/CD.
No ORM. No schema diff engine. No framework. Just a clean migration flow for teams that want their ClickHouse DDL to live in Git.
- Plain SQL migrations — migration files are
.sql, not Python modules. - Explicit statement blocks — each
-- @stmtblock is executed as one ClickHouse query; the migrator does not split SQL by semicolons. - Rollback SQL —
-- migrator:downstores the rollback SQL used bymigrator rollback. - Checksum validation — detects edited or missing applied migration files.
- Dry-run mode — preview pending migrations without writing migration state.
- Baseline — adopt an existing database without executing historical migrations.
- Advisory locking — protects common CI/CD concurrency cases.
- Cluster-aware service tables — migration metadata and lock tables can be created with
ON CLUSTERand replicated engines. - Small dependency set — depends on
clickandclickhouse-driver.
pip install py-clickhouse-migratorWith uv:
uv add py-clickhouse-migratorThe CLI command is:
migrator --helpSet the ClickHouse connection URL. The database in the URL must already exist.
export CLICKHOUSE_MIGRATE_URL=clickhouse://default@localhost:9000/mydbCreate a migrations directory and a first migration:
migrator init
migrator new create_users_tableEdit the generated file in ./db/migrations:
-- migrator:up
-- @stmt
CREATE TABLE IF NOT EXISTS users
(
id UInt64,
name String,
created_at DateTime DEFAULT now()
)
ENGINE = MergeTree
ORDER BY id
-- migrator:down
-- @stmt
DROP TABLE IF EXISTS usersPreview what will run:
migrator up --dry-runApply pending migrations:
migrator upCheck status:
migrator showRollback the last applied migration:
migrator rollbackinit and new work offline. Commands that read or write migration state require a ClickHouse connection.
A migration is a .sql file with two required sections:
-- migrator:up
-- @stmt
-- SQL to apply the migration
-- migrator:down
-- @stmt
-- SQL to roll it backRules:
- each file must contain exactly one
-- migrator:upsection and exactly one-- migrator:downsection; -- migrator:upmust appear before-- migrator:down;- SQL must be placed inside
-- @stmtblocks; - the
upsection must contain at least one non-empty statement block; - the
downsection may be empty; - each
-- @stmtblock is sent to ClickHouse as one query; - the migrator does not split SQL by
;.
This is intentional. It avoids fragile semicolon splitting and makes multi-statement migrations explicit.
Good:
-- migrator:up
-- @stmt
CREATE TABLE IF NOT EXISTS events
(
id UInt64,
message String
)
ENGINE = MergeTree
ORDER BY id
-- @stmt
ALTER TABLE events ADD COLUMN IF NOT EXISTS created_at DateTime DEFAULT now()
-- migrator:down
-- @stmt
ALTER TABLE events DROP COLUMN IF EXISTS created_at
-- @stmt
DROP TABLE IF EXISTS eventsAvoid putting multiple ClickHouse queries into one block:
-- bad
-- @stmt
CREATE TABLE a (id UInt64) ENGINE = MergeTree ORDER BY id;
CREATE TABLE b (id UInt64) ENGINE = MergeTree ORDER BY id;Use separate blocks instead:
-- good
-- @stmt
CREATE TABLE a (id UInt64) ENGINE = MergeTree ORDER BY id
-- @stmt
CREATE TABLE b (id UInt64) ENGINE = MergeTree ORDER BY idSee Migration format for more examples.
Create the migrations directory. Default: ./db/migrations.
migrator init
migrator --path ./migrations initWorks offline.
Create a timestamped .sql migration file.
migrator new create_users_tableGenerated filename format:
YYYYMMDDHHmmss_create_users_table.sql
The name suffix is optional, but recommended. Only letters, digits, and underscores are allowed.
Works offline.
Apply pending migrations in filename order.
migrator up # apply all pending migrations
migrator up 3 # apply next 3 pending migrations
migrator up --dry-run| Option | Default | Description |
|---|---|---|
N |
all | Optional positional limit: number of pending migrations to apply. |
--lock / --no-lock |
--lock |
Enable or disable the migration lock. |
--lock-ttl |
600 |
Lock TTL in seconds. |
--lock-retry |
3 |
Lock acquire retry attempts. |
--dry-run |
off | Print pending migration SQL without executing it. |
--validate / --no-validate |
--validate |
Enable or disable preflight validation with EXPLAIN AST. |
--allow-dirty |
off | Skip checksum mismatch failures for this run. |
Example output:
20260421140000_create_users_table.sql applied [✔]
20260421143000_add_events_table.sql applied [✔]
Rollback applied migrations in reverse order.
migrator rollback # rollback last applied migration
migrator rollback 3 # rollback last 3 applied migrations
migrator rollback --dry-run| Option | Default | Description |
|---|---|---|
N |
1 |
Optional positional limit: number of migrations to rollback. |
--lock / --no-lock |
--lock |
Enable or disable the migration lock. |
--lock-ttl |
600 |
Lock TTL in seconds. |
--lock-retry |
3 |
Lock acquire retry attempts. |
--dry-run |
off | Print rollback SQL without executing it. |
--validate / --no-validate |
--validate |
Enable or disable preflight validation with EXPLAIN AST. |
Rollback uses the down SQL stored in db_migrations at the time the migration was applied, not the current file content.
Display applied migrations, pending migrations, HEAD, baseline markers, and integrity warnings.
migrator show
migrator show --all| Option | Default | Description |
|---|---|---|
--all |
off | Show all applied migrations. By default, only the latest 5 applied migrations are shown. |
Example:
Applied:
[X] 20260421143000_add_events_table.sql (HEAD)
[X] 20260421140000_create_users_table.sql
Pending:
[ ] 20260421150000_add_status_column.sql
Applied: 2 | Pending: 1
Possible applied migration markers:
| Marker | Meaning |
|---|---|
HEAD |
Latest applied migration row. |
baseline |
Migration was recorded by baseline, not executed. |
modified |
Applied migration file exists but its checksum no longer matches. |
missing |
Applied migration file is missing locally. |
Adopt an existing ClickHouse database without replaying historical DDL.
migrator baselineUse this when your database schema already exists and you have .sql migration files that represent that existing schema. Point the migrator at that directory with --path or CLICKHOUSE_MIGRATE_DIR, or put the files in the default ./db/migrations directory. Then run migrator baseline once and use migrator up for future migrations.
Baseline behavior:
- creates the
db_migrationsservice table if it does not exist; - requires
db_migrationsto have no rows; - marks all current
.sqlfiles in the migrations directory as already applied (baselinerows); - does not execute SQL;
- does not validate that the ClickHouse schema matches the files;
- baseline rows are not selected by
rollback; - baseline rows are excluded from checksum validation.
See Baseline existing databases.
Update stored checksums to match current migration files.
migrator repairUse this only after intentionally editing already-applied migration file(s) and confirming that the database state is still consistent with those edits. migrator show and migrator up report that applied migration SQL changed; repair updates the stored checksum to accept the current file content in future checks. It does not execute SQL, does not modify your application schema, and skips missing files.
Show active migration lock information.
migrator lock-infoExample:
Locked by: runner-01:1234:abc123ef
Locked at: 2026-04-21 14:30:00
Expires at: 2026-04-21 14:40:00
Manually release a stuck lock.
migrator force-unlockUse this when a deployment process crashed and the lock did not get released. Locks also expire automatically after their TTL.
Global options can be provided through CLI flags or environment variables.
| CLI option | Environment variable | Default | Description |
|---|---|---|---|
--url |
CLICKHOUSE_MIGRATE_URL |
— | ClickHouse connection URL. Required for DB commands. |
--path |
CLICKHOUSE_MIGRATE_DIR |
./db/migrations |
Migrations directory. |
--cluster |
CLICKHOUSE_MIGRATE_CLUSTER |
— | ClickHouse cluster name for service table DDL. |
--connect-retries |
CLICKHOUSE_MIGRATE_CONNECT_RETRIES |
0 |
Connection retry attempts. |
--connect-retries-interval |
CLICKHOUSE_MIGRATE_CONNECT_RETRIES_INTERVAL |
1 |
Seconds between connection retries. |
--send-receive-timeout |
CLICKHOUSE_MIGRATE_SEND_RECEIVE_TIMEOUT |
600 |
ClickHouse client send/receive timeout in seconds. |
-v, --verbose |
— | off | Enable DEBUG logging. |
-q, --quiet |
— | off | Suppress INFO/WARNING logs; command output such as dry-run SQL is still printed. |
Connection URL example:
clickhouse://user:password@host:9000/databaseFor TLS/secure connections, use connection parameters supported by clickhouse-driver URLs, for example:
clickhouse://user:password@host:9440/database?secure=Truedocker pull maksimburtsev/py-clickhouse-migrator:latestRun migrations:
docker run --rm \
-v "$PWD/db/migrations:/migrations" \
-e CLICKHOUSE_MIGRATE_URL=clickhouse://default@clickhouse:9000/mydb \
maksimburtsev/py-clickhouse-migrator:latest \
upInside the Docker image, the default migrations directory is /migrations.
See Docker usage.
In CI/CD, the usual deployment step is intentionally simple:
migrator upGitHub Actions example:
- name: Run ClickHouse migrations
run: migrator up
env:
CLICKHOUSE_MIGRATE_URL: ${{ secrets.CLICKHOUSE_MIGRATE_URL }}Recommended deployment pattern:
- build and test application code;
- run
migrator uponce per deployment; - deploy application code that depends on the new schema.
The CLI exits with 0 on success and 1 for handled migration errors such as invalid migrations, connection failures, checksum mismatches, missing databases, and lock errors.
See CI/CD usage.
When a migration is applied, PyClickHouseMigrator stores a SHA-256 checksum in db_migrations.
The checksum is computed from the up and down statement blocks extracted by the migrator, not from the migration file as a whole. This makes the checksum tied to the SQL blocks the tool will send to ClickHouse.
On migrator up, applied migration checksums are compared with the current local files. If an applied migration was modified or deleted, the command fails unless --allow-dirty is used.
Useful commands:
migrator show # display modified/missing markers
migrator up # fail fast on checksum mismatch
migrator up --allow-dirty
migrator repair # update stored checksums after intentional editsBy default, up, rollback, and their dry-run variants validate statements with EXPLAIN AST before execution.
migrator up --no-validate
migrator rollback --no-validateValidation is best-effort. It catches many syntax and parse problems early, but it is not a guarantee that execution will succeed and it is not a production-safety analyzer.
up, rollback, and baseline use an advisory migration lock by default.
The lock is stored in a ClickHouse service table named _migrations_lock. It has a TTL, a lock owner identity, retry behavior, and manual recovery commands.
migrator up --lock-ttl 900
migrator up --lock-retry 10
migrator up --no-lock
migrator lock-info
migrator force-unlockThe lock is meant to protect common deployment races, for example two CI jobs starting at the same time. It is still best practice to run migrations from a single deployment job or Kubernetes Job.
Set a cluster name when you want the migrator's own service tables to be created across a ClickHouse cluster:
export CLICKHOUSE_MIGRATE_CLUSTER=my_cluster
migrator upWhen cluster mode is enabled:
db_migrationsis created withON CLUSTERand a replicated engine;_migrations_lockis created withON CLUSTERand a replicated replacing engine;- service table writes use cluster consistency settings;
- your migration SQL is executed exactly as written.
PyClickHouseMigrator does not inject ON CLUSTER into user migrations. If your ClickHouse DDL must run on the whole cluster, write ON CLUSTER in the migration yourself.
See Cluster mode.
from py_clickhouse_migrator import Migrator
migrator = Migrator(
database_url="clickhouse://default@localhost:9000/mydb",
migrations_dir="./db/migrations",
)
migrator.up()See Python API.
- ClickHouse DDL is not transactional. A multi-statement migration can partially apply if a later statement fails.
- The advisory lock is best-effort. It reduces common concurrency problems, but it is not a substitute for a single well-defined migration job.
- The target database must exist before the migrator runs.
- Baseline does not compare migration files with the existing database schema.
- Preflight validation is best-effort and can be disabled with
--no-validate. - Each
-- @stmtblock must contain one ClickHouse query.
See Known limitations.
