AprilYoungs/data_analyze_agent

Data Analyze Agent

This project automates the generation of data analysis reports using a multi-agent system built with crewai. It takes a dataset as input and produces a Jupyter Notebook with a complete analysis, including data cleaning, exploratory data analysis (EDA), insights, and predictive modeling suggestions.

About the Project

The core of this project is a two-agent system that collaborates to perform a comprehensive data analysis:

  1. Data Analyst Expert: This agent creates a detailed, step-by-step plan for analyzing the data.
  2. Expert Python Developer: This agent takes the plan and writes the corresponding Python code, assembling both the plan and the code into a final Jupyter Notebook.

This approach allows for a clear separation of concerns, where one agent focuses on the analytical strategy and the other on the implementation, resulting in a well-structured and easy-to-understand report.

How to Use

  1. Install dependencies:

    python -m venv .venv
    source .venv/bin/activate
    pip install -r requirements.txt
  2. Set up your environment:

    • Create a .env file in the root of the project.
    • Add your OPENROUTER_API_KEY to the .env file:
      OPENROUTER_API_KEY="your_api_key"
      
  3. Configure your data:

    • Open main.py and update the following variables:
      • company_name: The name of the company for the analysis.
      • dataset_description: A brief description of the dataset.
      • dataset_path: The path to your dataset.
  4. Run the agent:

    python main.py
  5. View the output:

    • analysis_plan.md: The analysis plan generated by the Data Analyst Expert.
    • analysis_report.ipynb: The final Jupyter Notebook with the complete analysis.
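
Before running the agent, it can help to confirm that the setup steps above actually took effect. The following is a minimal sketch using only the standard library; `read_env_file` and `preflight` are hypothetical helpers that are not part of this repository, and the project itself may load the `.env` file differently.

```python
import os
from pathlib import Path


def read_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=value lines; surrounding quotes on the value are stripped."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("\"'")
    return values


def preflight(dataset_path: str, env_path: str = ".env") -> list[str]:
    """Return a list of problems; an empty list means the setup looks ready."""
    problems: list[str] = []
    # Prefer a key already exported in the shell, then fall back to the .env file.
    api_key = os.environ.get("OPENROUTER_API_KEY") or read_env_file(env_path).get(
        "OPENROUTER_API_KEY"
    )
    if not api_key or api_key == "your_api_key":
        problems.append("OPENROUTER_API_KEY is missing or still the placeholder value")
    if not Path(dataset_path).is_file():
        problems.append(f"dataset not found: {dataset_path}")
    return problems
```

Calling `preflight(dataset_path)` with the same path configured in main.py returns an empty list when both the API key and the dataset are in place.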

How it Works

The project uses the crewai library to create and manage the two agents.

  • analyze_planer_agent: This agent is responsible for creating the data analysis plan. It takes the dataset and a description as input and outputs a Markdown file with a step-by-step guide for the analysis.

  • analyze_coder_agent: This agent takes the analysis plan from the first agent and generates the Python code to execute it. It then assembles the plan (as Markdown cells) and the code (as code cells) into a Jupyter Notebook.
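
To make the notebook assembly step concrete, here is a minimal sketch of how plan text and code could be interleaved into a notebook file using only the standard library. It is an illustration, not the repository's implementation: `make_cell`, `build_notebook`, and `write_notebook` are hypothetical names, and the agent may produce the notebook in another way.

```python
import json


def make_cell(cell_type: str, source: str) -> dict:
    """Build one notebook cell in the nbformat 4 JSON layout."""
    cell = {
        "cell_type": cell_type,
        "metadata": {},
        # Notebooks store cell text as a list of lines with newlines kept.
        "source": source.splitlines(keepends=True),
    }
    if cell_type == "code":
        # Code cells additionally carry execution state, empty before a run.
        cell["execution_count"] = None
        cell["outputs"] = []
    return cell


def build_notebook(steps: list[tuple[str, str]]) -> dict:
    """Turn (plan_markdown, python_code) pairs into a notebook dictionary."""
    cells = []
    for plan_text, code_text in steps:
        cells.append(make_cell("markdown", plan_text))
        cells.append(make_cell("code", code_text))
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


def write_notebook(steps: list[tuple[str, str]], path: str = "analysis_report.ipynb") -> None:
    """Serialize the notebook so Jupyter can open it."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(build_notebook(steps), f, indent=1)
```

Each plan step becomes a Markdown cell followed by the code cell that implements it, which matches the plan-then-code layout described above.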

The two agents work in a sequential process, with the output of the first agent being the input for the second. This ensures that the final report is based on a well-defined and structured plan.
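
The sequential hand-off can be sketched with crewai roughly as follows. This is an assumption-laden illustration rather than the code in main.py: the goals, backstories, and task descriptions are invented for the example, `build_inputs` and `build_crew` are hypothetical helpers, and LLM configuration (for example, pointing crewai at OpenRouter) is omitted.

```python
from typing import Any


def build_inputs(company_name: str, dataset_description: str, dataset_path: str) -> dict[str, str]:
    """Collect the three values configured in main.py into one inputs mapping."""
    return {
        "company_name": company_name,
        "dataset_description": dataset_description,
        "dataset_path": dataset_path,
    }


def build_crew() -> Any:
    # Imported lazily so build_inputs stays usable without crewai installed.
    from crewai import Agent, Crew, Process, Task

    # Illustrative agent definitions; the real prompts live in the repository.
    planner = Agent(
        role="Data Analyst Expert",
        goal="Produce a step-by-step data analysis plan",
        backstory="An experienced analyst who plans before coding.",
    )
    coder = Agent(
        role="Expert Python Developer",
        goal="Implement the analysis plan as a Jupyter Notebook",
        backstory="A developer who writes clear, runnable analysis code.",
    )

    # Placeholders in braces are filled from the inputs passed to kickoff().
    plan_task = Task(
        description=(
            "Plan an analysis for {company_name} of the dataset at "
            "{dataset_path}. Dataset description: {dataset_description}"
        ),
        expected_output="A Markdown analysis plan",
        agent=planner,
        output_file="analysis_plan.md",
    )
    code_task = Task(
        description="Write Python code for each step of the plan and assemble a notebook.",
        expected_output="A Jupyter Notebook in JSON form",
        agent=coder,
        context=[plan_task],  # the planner's output feeds the coder
        output_file="analysis_report.ipynb",
    )
    return Crew(
        agents=[planner, coder],
        tasks=[plan_task, code_task],
        process=Process.sequential,
    )
```

With an LLM configured, `build_crew().kickoff(inputs=build_inputs(...))` would run the planner first and then the coder, passing the plan along as context.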

Vestiaire Collective Dataset

Second Hand Luxury Fashion Data

https://www.kaggle.com/datasets/justinpakzad/vestiaire-fashion-dataset

About Dataset

Context

This dataset contains product listings from Vestiaire Collective, an online marketplace for buying and selling pre-owned luxury fashion items. It was scraped with Python and the hrequests library. The CSV file contains approximately 900k rows and 36 columns.

Inspiration

  • Trend Analysis: Investigate trends in second-hand luxury fashion, such as brands, product types, and item pricing, to gain a deeper understanding of the current market.
  • Geographical Analysis: Analyze which countries are the most active in terms of both buyers and sellers on Vestiaire Collective. Look for trends in user demographics, such as regions with a high concentration of second-hand luxury fashion.
  • Item Price Prediction: Use machine learning algorithms to predict the price of listed items based on the available features.
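
As a small illustration of the trend-analysis idea, the sketch below computes a median price per group (for example, per brand) using only the standard library. `median_price_by_group` is a hypothetical helper, and the column names are parameters because the dataset's actual column names are not listed here.

```python
from collections import defaultdict
from statistics import median
from typing import Iterable, Mapping


def median_price_by_group(
    rows: Iterable[Mapping[str, str]], group_col: str, price_col: str
) -> dict[str, float]:
    """Group rows by one column and return the median of a numeric column."""
    prices: dict[str, list[float]] = defaultdict(list)
    for row in rows:
        try:
            price = float(row[price_col])
        except (KeyError, TypeError, ValueError):
            # Skip rows with a missing or non-numeric price.
            continue
        group = row.get(group_col)
        if group:
            prices[group].append(price)
    return {group: median(values) for group, values in prices.items()}
```

The rows can come from `csv.DictReader` opened on the dataset file, with `group_col` and `price_col` set to whichever header names the CSV uses for brand and price.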
