Feature/schema statistics and samples by Nechja · Pull Request #4 · Nechja/Schemalyzer

Nechja · 2025-10-04T20:00:42Z

This pull request introduces new schema statistics and data sampling features to the export and comparison commands, enhancing the ability to analyze and understand database schemas. The main changes include support for collecting table and column statistics (such as row counts and sample values), updates to the schema data model, and implementations for MySQL, PostgreSQL, and Oracle backends.

New statistics and sampling features:

Added command-line flags to enable collection of schema statistics, row counts, and column sample values, including a configurable sample size, in both the export and compare commands (cmd/schemalyzer/commands/export.go, cmd/schemalyzer/commands/compare.go).
Implemented the collectStatistics function to aggregate statistics (table count, view count, total columns, index count, row counts, and column samples) when exporting or comparing schemas (cmd/schemalyzer/commands/common.go).
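To make the flag surface concrete, here is a minimal sketch of how the four flags could be registered. It uses the standard library flag package purely for illustration; the actual commands may use a different CLI library. The StatsOptions struct and the registerStatsFlags and parseStatsArgs helpers are hypothetical names, and only the flag names and the default sample size of 3 come from this PR.

```go
package main

import "flag"

// StatsOptions is a hypothetical container for the four flags named in this PR.
type StatsOptions struct {
	WithStats    bool
	WithRowCount bool
	WithSamples  bool
	SampleSize   int
}

// registerStatsFlags binds the statistics flags to fs. The flag names and the
// default sample size of 3 come from the PR description; the rest is assumed.
func registerStatsFlags(fs *flag.FlagSet) *StatsOptions {
	opts := &StatsOptions{}
	fs.BoolVar(&opts.WithStats, "with-stats", false, "include schema statistics (table count, column count, etc.)")
	fs.BoolVar(&opts.WithRowCount, "with-row-count", false, "include row counts for each table")
	fs.BoolVar(&opts.WithSamples, "with-samples", false, "include sample values for each column")
	fs.IntVar(&opts.SampleSize, "sample-size", 3, "number of sample values per column")
	return opts
}

// parseStatsArgs parses args with a fresh FlagSet, the way a subcommand might.
func parseStatsArgs(args []string) (*StatsOptions, error) {
	fs := flag.NewFlagSet("export", flag.ContinueOnError)
	opts := registerStatsFlags(fs)
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	return opts, nil
}
```

Sharing one registration helper between export and compare would keep the two commands' flags in sync, which is one plausible reason to place shared logic in common.go.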

Schema model enhancements:

Extended the Schema, Table, and Column structs to include fields for overall statistics (Stats), per-table row counts (RowCount), and per-column sample values (Samples). Added a new SchemaStats struct to represent aggregated statistics (pkg/models/schema.go).
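The model change can be pictured roughly as follows. This is a sketch, not the contents of pkg/models/schema.go: the struct names and the Stats, RowCount, and Samples field names come from this PR, while the field types, the JSON tags, the remaining fields, and the toJSON helper are assumptions chosen to show how optional statistics can drop out of the output when they were not collected.

```go
package main

import "encoding/json"

// SchemaStats holds aggregated counts. Field set and tags are assumed.
type SchemaStats struct {
	TableCount   int `json:"table_count"`
	ViewCount    int `json:"view_count"`
	TotalColumns int `json:"total_columns"`
	IndexCount   int `json:"index_count"`
}

// Column carries optional sample values; omitted from JSON when empty.
type Column struct {
	Name    string   `json:"name"`
	Samples []string `json:"samples,omitempty"`
}

// Table carries an optional row count. A pointer distinguishes
// "not collected" (nil) from an empty table (0).
type Table struct {
	Name     string   `json:"name"`
	Columns  []Column `json:"columns"`
	RowCount *int64   `json:"row_count,omitempty"`
}

// Schema carries optional overall statistics.
type Schema struct {
	Name   string       `json:"name"`
	Tables []Table      `json:"tables"`
	Stats  *SchemaStats `json:"stats,omitempty"`
}

// toJSON is a hypothetical helper that serializes a schema for export.
func toJSON(s Schema) (string, error) {
	b, err := json.Marshal(s)
	if err != nil {
		return "", err
	}
	return string(b), nil
}
```

With this shape, an export run without any statistics flags produces the same JSON as before the change, since every new field is omitted when unset.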

Backend support for statistics:

Defined a new StatisticsReader interface for retrieving row counts and column samples, and implemented it for MySQL, PostgreSQL, and Oracle readers, including safe identifier quoting and value conversion for each backend (internal/database/interfaces.go, internal/database/mysql/reader.go, internal/database/postgres/reader.go, internal/database/oracle/reader.go).
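As a rough illustration of the backend contract, the sketch below pairs an interface with the kind of quoting and conversion helpers the description mentions. The interface name and the GetTableRowCount and GetColumnSamples method names appear in this PR; the method signatures and every helper function here are assumptions and may differ from the code in the reader packages.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// StatisticsReader uses the names from this PR; the signatures are assumed.
type StatisticsReader interface {
	GetTableRowCount(ctx context.Context, schema, table string) (int64, error)
	GetColumnSamples(ctx context.Context, schema, table, column string, limit int) ([]string, error)
}

// quotePostgresIdent wraps an identifier in double quotes and doubles any
// embedded double quote, so the name cannot break out of the quoting.
func quotePostgresIdent(name string) string {
	return `"` + strings.ReplaceAll(name, `"`, `""`) + `"`
}

// quoteMySQLIdent does the same with backticks, MySQL's identifier quote.
func quoteMySQLIdent(name string) string {
	return "`" + strings.ReplaceAll(name, "`", "``") + "`"
}

// postgresRowCountQuery builds a COUNT(*) query. Identifiers cannot be bound
// as query parameters, which is why they are quoted and concatenated.
func postgresRowCountQuery(schema, table string) string {
	return "SELECT COUNT(*) FROM " + quotePostgresIdent(schema) + "." + quotePostgresIdent(table)
}

// sampleToString converts a scanned driver value into a printable sample.
func sampleToString(v any) string {
	switch x := v.(type) {
	case nil:
		return "NULL"
	case []byte:
		return string(x) // many drivers return text columns as raw bytes
	case string:
		return x
	default:
		return fmt.Sprint(x)
	}
}
```

Table and column names come from the inspected database rather than from trusted constants, so quoting them before building a query guards against names that contain quote characters or reserved words.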

- Added --with-stats flag to include schema statistics (table count, column count, etc.)
- Added --with-row-count flag to include row counts for each table
- Added --with-samples flag to include sample values for each column
- Added --sample-size flag to control number of samples (default: 3)
- Implemented StatisticsReader interface for PostgreSQL
- Updated models to include optional statistics fields
- Statistics collection continues even if some queries fail

This helps users understand data volume and content patterns without manual queries.
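The "continues even if some queries fail" behavior can be sketched as a loop that records a warning and moves on instead of returning at the first error. This is an illustration only, not the collectStatistics function from this PR: the rowCounter interface, tableStats struct, collectRowCounts function, and fakeReader type are all hypothetical and simplified.

```go
package main

import (
	"errors"
	"fmt"
)

// rowCounter is a simplified, hypothetical stand-in for the statistics reader.
type rowCounter interface {
	GetTableRowCount(table string) (int64, error)
}

// tableStats keeps RowCount as a pointer so a failed query stays nil.
type tableStats struct {
	Name     string
	RowCount *int64
}

// collectRowCounts queries every table. A failure is recorded as a warning
// and the loop continues, so one unreadable table does not abort the run.
func collectRowCounts(r rowCounter, tables []string) ([]tableStats, []error) {
	out := make([]tableStats, 0, len(tables))
	var warnings []error
	for _, name := range tables {
		ts := tableStats{Name: name}
		n, err := r.GetTableRowCount(name)
		if err != nil {
			warnings = append(warnings, fmt.Errorf("row count for %s: %w", name, err))
		} else {
			ts.RowCount = &n
		}
		out = append(out, ts)
	}
	return out, warnings
}

// fakeReader is an in-memory reader used only to exercise the sketch.
type fakeReader map[string]int64

func (f fakeReader) GetTableRowCount(table string) (int64, error) {
	n, ok := f[table]
	if !ok {
		return 0, errors.New("table not readable")
	}
	return n, nil
}
```

Calling collectRowCounts(fakeReader{"users": 10}, []string{"users", "orders"}) returns two entries, with a row count only for users, plus one warning for orders. Returning the warnings separately lets the caller decide whether to log them or surface them to the user.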

Nechja added 2 commits October 4, 2025 11:07

feat: add GetTableRowCount and GetColumnSamples methods to MySQL and Oracle readers

9984615

Nechja merged commit c1242ae into main Oct 4, 2025
7 checks passed

Nechja deleted the feature/schema-statistics-and-samples branch December 10, 2025 05:59

Nechja merged 2 commits into main from feature/schema-statistics-and-samples
