Overview
This PR introduces the completed Phase 1 Minimum Viable Product (MVP) for the AStats project, designed for the UW-Madison GSoC 2026 proposal. It establishes a fully functional, end-to-end agentic AI framework capable of exploring datasets, generating statistical visualizations, executing Python code, and compiling professional analytical reports — validated against real-world scientific datasets.
🚀 What Was Accomplished (Phase 1)
We built the entire modular framework from scratch, following best practices in software architecture and AI integration:
1. Core Agentic Architecture
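In outline, the plan-execute-reflect cycle that the components below implement can be sketched as follows. This is a minimal illustration only, not the project's actual API: `plan_fn`, `execute_fn`, and `StepResult` are hypothetical names.

```python
# Minimal plan-execute-reflect sketch (illustrative; names are assumptions,
# not the real AStats interface).
from dataclasses import dataclass
from typing import Callable

@dataclass
class StepResult:
    ok: bool
    output: str

class BaseAgent:
    def __init__(self, plan_fn: Callable[[str], list[str]],
                 execute_fn: Callable[[str], StepResult],
                 max_retries: int = 2):
        self.plan_fn = plan_fn        # LLM call that breaks a query into steps
        self.execute_fn = execute_fn  # sandboxed code/tool execution
        self.max_retries = max_retries

    def run(self, query: str) -> list[StepResult]:
        results = []
        for step in self.plan_fn(query):                 # Plan
            result = self.execute_fn(step)               # Execute
            attempts = 0
            while not result.ok and attempts < self.max_retries:
                # Reflect: feed the execution error back and retry the step
                step = f"{step}\n# previous error: {result.output}"
                result = self.execute_fn(step)
                attempts += 1
            results.append(result)
        return results
```

The key property is that a failed step is not fatal: the error text is appended to the step's context so the next attempt can self-correct, which is the iterative behavior described above.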
Plan-Execute-Reflect Loop: Implemented a robust `BaseAgent` class that autonomously guides the LLM through a structured reasoning cycle with iterative self-correction on execution errors.
Specialist Agents: Created domain-specific agents (`EDAAgent`, `HypothesisAgent`, `RegressionAgent`, `TimeSeriesAgent`) for targeted statistical methodologies.
Workflow Orchestrator: Developed an intelligent router that analyzes user queries to automatically select the best agent, with configurable autonomy levels (
`full-auto`, `semi-auto`, `step-by-step`).

2. Multi-Provider LLM Abstraction
Built a flexible BaseLLMProvider abstraction layer to prevent vendor lock-in.
Integrated 4 providers:
Google Gemini 2.5 Flash — Primary free-tier default
Groq (Llama 3 70B / Mixtral) — Blazing-fast open-weight inference
Anthropic Claude 3.5 — Sonnet, Opus, and Haiku support
OpenAI / Codex — GPT-4o and legacy Codex support
All providers include streaming, structured function calling, and automatic retry with exponential backoff.
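As an illustration of the retry pattern, a provider base class with exponential backoff might look like the sketch below. Method names are assumptions; the real `BaseLLMProvider` interface may differ.

```python
# Hedged sketch of a provider abstraction with exponential-backoff retry.
import time
from abc import ABC, abstractmethod

class BaseLLMProvider(ABC):
    @abstractmethod
    def _complete(self, prompt: str) -> str:
        """Provider-specific API call (Gemini, Groq, Claude, OpenAI, ...)."""

    def complete(self, prompt: str, retries: int = 3,
                 base_delay: float = 1.0) -> str:
        # Retry transient failures with exponential backoff: 1s, 2s, 4s, ...
        for attempt in range(retries):
            try:
                return self._complete(prompt)
            except Exception:
                if attempt == retries - 1:
                    raise  # give up after the final attempt
                time.sleep(base_delay * (2 ** attempt))
        raise RuntimeError("unreachable")
```

Because the retry logic lives in the base class, every concrete provider gets it for free, which is how a single abstraction can keep four backends behaviorally consistent.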
3. Data Engine & Tool Registry
Intelligent Tooling: Provided agents with sandboxed execution environments via the `ToolRegistry` (`create_plot`, `describe_data`, `run_code`, `run_statistical_test`, `fit_model`).
Auto-Profiler: Built a high-speed profiling engine to ingest datasets (CSV, JSON, Parquet, Excel, Stata, SPSS, Feather, etc.) and generate LLM-readable statistical summaries.
Smart Visualization Engine: Auto-selects distributions, correlation heatmaps, box plots, and categorical charts based on data profile.
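To illustrate the dispatch pattern behind such a registry, here is a minimal sketch. The method names (`register`, `call`, `describe`) are assumptions for illustration, not AStats's actual interface.

```python
# Illustrative tool-registry sketch: maps tool names to callables and
# exposes LLM-readable descriptions for prompt injection.
from typing import Any, Callable

class ToolRegistry:
    def __init__(self):
        self._tools: dict[str, Callable[..., Any]] = {}

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self._tools[name] = fn

    def call(self, name: str, **kwargs: Any) -> Any:
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](**kwargs)

    def describe(self) -> list[str]:
        # Short summaries the orchestrator can inject into the LLM prompt
        return [f"{name}: {fn.__doc__ or 'no description'}"
                for name, fn in self._tools.items()]
```

An agent then only ever emits a tool name plus keyword arguments, and the registry handles lookup and execution inside the sandbox boundary.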
4. Real-World Dataset Validation
Validated the full pipeline end-to-end on standard scientific datasets. Each dataset has a fully documented example script in `examples/`.

5. Statistical Methodology Documentation
6. CLI, UX & Reporting
Interactive CLI: Built a `Click`-based CLI with subcommands (`explore`, `analyze`, `profile`, `config`).
HTML/Markdown Reports: Implemented automated, beautifully formatted HTML report generation using Jinja2, with Base64-embedded Seaborn/Matplotlib plots for zero-dependency sharing.
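The Base64 embedding trick is simple to sketch. The project's reports use Jinja2 templates; the dependency-free sketch below uses plain f-strings to show only the data-URI part, and the function names (`embed_png`, `render_report`) are illustrative, not the project's API.

```python
# A plot's PNG bytes are Base64-encoded into a data: URI, so the HTML
# report renders inline in any browser with no sidecar image files.
import base64

def embed_png(png_bytes: bytes, alt: str = "plot") -> str:
    b64 = base64.b64encode(png_bytes).decode("ascii")
    return f'<img alt="{alt}" src="data:image/png;base64,{b64}"/>'

def render_report(title: str, figures: list[bytes]) -> str:
    body = "\n".join(embed_png(f) for f in figures)
    return (f"<html><head><title>{title}</title></head>"
            f"<body>{body}</body></html>")
```

In the real pipeline the same data URIs would simply be interpolated into a Jinja2 template instead of an f-string, which is what makes the generated report a single self-contained file.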
🔮 What's Next (Phase 2 & Beyond)
Now that the core framework is mathematically and architecturally sound, the next steps for the project are:
Local Open-Weight Integration: Wiring up local inference servers (e.g., Ollama or vLLM) so the analysis can be run entirely air-gapped without API keys.
Fine-Tuning Pipelines: Researching and building dataset generation pipelines to fine-tune a smaller LLM specifically on data practitioner reasoning.
Advanced Evaluators: Implementing "critic" agents to double-check statistical code and assumptions before execution.
Workflow Templates: Developing reusable, recipe-based workflow templates for common statistical analyses.
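As a rough sketch of the local-inference direction, a request to an Ollama-style server could be assembled as below. This is speculative Phase 2 material: `http://localhost:11434/api/generate` is Ollama's default REST endpoint, but the model name and the integration shape here are assumptions.

```python
# Sketch of calling a local Ollama server: fully air-gapped, no API keys.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def build_request(model: str, prompt: str) -> urllib.request.Request:
    # Non-streaming generation request; Ollama returns a single JSON object
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def generate(model: str, prompt: str) -> str:
    # Requires a running local Ollama server
    with urllib.request.urlopen(build_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]
```

Plugging this behind the existing `BaseLLMProvider` abstraction would let local models slot in beside the hosted providers without touching agent code.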
🧪 Testing Notes
Tested end-to-end on `sample_sales.csv`, `iris.csv`, `diabetes.csv`, and `titanic.csv`.
All 3 example scripts (`eda_iris_example.py`, `regression_diabetes_example.py`, `hypothesis_titanic_example.py`) execute successfully and produce correct statistical outputs.
Verified environment variable injection (`.env`) for seamless API key configuration across all 4 providers.
All generated plots are verified to embed natively via Base64 in output HTML reports.