Skip to content

xzlknr/Aniwa

 
 

Repository files navigation

Aniwa

Aniwa Logo

See your data clearly.

Aniwa is an open-source universal dataset profiling and intelligence tool for developers, analysts, data engineers, researchers, and modern data teams.

Aniwa helps users quickly understand datasets through:

  • schema profiling
  • data quality analysis
  • statistical summaries
  • intelligent insights
  • rich terminal reports
  • shareable reports
  • configurable profiling workflows

Whether you're working with CSV files, Excel spreadsheets, JSON datasets, or Parquet files, Aniwa provides a fast and elegant way to inspect, understand, and trust your data.

Full documentation available here: https://reginalderzoah.github.io/Aniwa/


Current Version

v0.1.1

Why Aniwa?

Modern data workflows constantly involve:

unknown datasets

Before trusting a dataset, teams need to answer questions like:

  • What columns exist?
  • What data types are present?
  • Are there missing values?
  • Are there duplicates?
  • Are there suspicious patterns?
  • Which columns may contain IDs or sensitive information?
  • Is the dataset healthy?

Aniwa makes answering those questions:

fast, intelligent, and developer-friendly

Features

Universal Dataset Support

Aniwa currently supports:

  • CSV
  • Excel (.xlsx/.xls)
  • JSON
  • Parquet

Future releases are planned to support:

  • PostgreSQL
  • MySQL
  • DuckDB
  • BigQuery
  • Snowflake

Core Profiling

Aniwa currently provides:

Dataset Summary

  • row counts
  • column counts
  • dataset size analysis

Schema Profiling

  • type inference
  • schema overview
  • mixed type detection

Data Quality Analysis

  • null analysis
  • duplicate detection
  • uniqueness analysis
  • sparse column detection

Statistical Profiling

  • minimum values
  • maximum values
  • mean
  • median
  • standard deviation

Intelligent Insights

  • possible ID detection
  • high-cardinality warnings
  • sparse column warnings
  • suspicious quality patterns

Reporting

Aniwa currently supports:

  • Rich terminal reports
  • JSON reports
  • HTML reports

Upcoming releases are planned to include:

  • Markdown reports
  • Excel reports
  • PDF reports
  • charts
  • report templates

Quick Installation

Install Aniwa from PyPI:

pip install aniwa

Verify installation:

aniwa --help

Upgrade Aniwa:

pip install --upgrade aniwa

Quick Start

Profile a dataset:

aniwa profile customers.csv

Generate a JSON report:

aniwa profile customers.csv --report json --output profile.json

Generate an HTML report:

aniwa profile customers.csv --report html --output profile.html

Run lightweight profiling:

aniwa profile customers.csv --mode fast

Run full profiling:

aniwa profile customers.csv --mode deep

Configuration Files

Aniwa supports configuration-driven workflows.

Supported config formats:

  • YAML
  • TOML
  • JSON

Aniwa automatically searches for:

aniwa.yaml
aniwa.yml
aniwa.toml
aniwa.json

Example:

mode: deep

report:
  format: html
  output_dir: reports/

sections:
  include:
    - summary
    - schema
    - statistics
    - insights

Use a custom config file:

aniwa profile customers.csv --config config.yaml

Report Sections

Aniwa supports configurable report sections.

Current sections include:

  • summary
  • schema
  • quality
  • statistics
  • insights

Example:

aniwa profile customers.csv --include summary,statistics

Exclude sections:

aniwa profile customers.csv --exclude statistics

Example Console Output

┌──────────────────────────────┐
│      Aniwa Dataset Profile   │
├──────────────────────────────┤
│ Rows: 5                      │
│ Columns: 5                   │
│ Duplicate Rows: 1            │
└──────────────────────────────┘

Documentation

Aniwa now includes a full documentation system.

Full documentation available here: https://reginalderzoah.github.io/Aniwa/

Documentation includes:

  • getting started guides
  • architecture documentation
  • developer guides
  • release notes
  • roadmap
  • philosophy

View documentation locally with MkDocs:

mkdocs serve

Build documentation:

mkdocs build

Documentation structure:

docs/
├── index.md
├── roadmap.md
├── philosophy.md
├── getting-started/
├── developer-guide/
└── release-notes/

Installation for Development

Clone the repository:

git clone https://github.com/ReginaldErzoah/Aniwa.git
cd Aniwa

Create a virtual environment:

python -m venv .venv

Activate the environment:

Windows

source .venv/Scripts/activate

macOS/Linux

source .venv/bin/activate

Install dependencies:

pip install -r requirements.txt

Install Aniwa locally:

pip install -e .

Architecture

Aniwa currently follows a modular layered architecture:

CLI
→ Configuration
→ Readers
→ Profiling Engine
→ Models
→ Reports

This architecture prioritizes:

  • modularity
  • maintainability
  • scalability
  • contributor friendliness

Project Structure

Aniwa/
│
├── aniwa/
│   ├── cli.py
│   ├── config/
│   ├── core/
│   ├── io/
│   ├── models/
│   ├── reports/
│   ├── templates/
│   └── utils/
│
├── docs/
├── tests/
├── examples/
│
├── README.md
├── CONTRIBUTING.md
├── SPRINT.md
├── mkdocs.yml
├── pyproject.toml
└── requirements.txt

Roadmap

v0.1.x - Foundation

  • universal dataset profiling
  • reporting systems
  • configuration workflows
  • modular architecture
  • developer-first UX

v0.2.x - Better User Experience

Planned features:

  • Better HTML with charts.
  • Better report templates.
  • Report modes and presets.
  • Better output management.
  • Metadata
  • Incldue & exclude sections.
  • Config file supports (yml, json & toml).
  • Improved documentation.

v0.3.x - Intelligence

Planned features:

  • correlation analysis
  • anomaly detection
  • semantic profiling
  • improved insights

v0.4.x - Universal Connectivity

Planned features:

  • PostgreSQL support
  • MySQL support
  • DuckDB support
  • BigQuery support
  • profiling history
  • snapshot management

v0.5.x - Extensibility

Planned features:

  • plugin system
  • custom profiling modules
  • community extensions

v0.6.x - AI Intelligence

Planned features:

  • dataset summarization
  • semantic understanding
  • AI-assisted recommendations
  • anomaly explanations

Philosophy

Aniwa is built around several core principles:

  • universal
  • developer-first
  • fast
  • modular
  • intelligent
  • beautiful
  • automation-friendly

The long-term goal is to build:

universal data intelligence infrastructure

For deeper architectural and ecosystem thinking, see:

docs/philosophy.md
docs/roadmap.md

Contributing

Contributions are welcome.

See:

  • CONTRIBUTING.md
  • SPRINT.md
  • docs/developer-guide/

for:

  • local development
  • testing workflows
  • architecture guidance
  • release workflows
  • contributor standards

Support

If you find this project useful:

  • star the repository
  • Share repo with friends
  • Contribute to making this project better solving open issues
  • Recommendations can also be sent to maintainer

License

Aniwa is released under the MIT License.

See LICENSE for details.

About

Open-source universal dataset profiling and intelligence tool designed for developers, analysts, data engineers, researchers, and modern data teams

Resources

License

Contributing

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages

  • HTML 90.5%
  • Python 8.3%
  • JavaScript 1.2%