This project performs a comprehensive Exploratory Data Analysis (EDA) on a car details dataset to uncover patterns, trends, and actionable insights related to automotive specifications and customer preferences.
The analysis focuses on understanding:
- Market dominance of car brands
- Vehicle type distribution
- Fuel efficiency comparisons
- Production trends over time
- Engine type and drivetrain preferences
- Ground clearance comparison across brands
🔗 Dataset Source: https://carapi.app/features/vehicle-csv-download
- 🔍 Identify the most common car brands and vehicle types
- ⛽ Compare fuel efficiency across segments (SUVs, Sedans, etc.)
- 🏭 Analyze production trends by brand over the years
- 🌱 Explore preferred fuel/engine types (Gas, Hybrid, etc.)
- 🚗 Examine preferred drivetrain modes (FWD, AWD, RWD)
- 🏔️ Compare highest ground clearance provided by brands
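As a rough illustration of how the first two objectives might be approached with Pandas, the sketch below counts cars per brand and averages fuel efficiency per vehicle type. The column names (`make`, `body_type`, `combined_mpg`) and the sample rows are hypothetical placeholders, not the actual schema or contents of the CarAPI CSV.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset's column names may differ.
cars = pd.DataFrame({
    "make": ["Toyota", "Toyota", "Ford", "Honda", "Ford", "Toyota"],
    "body_type": ["SUV", "Sedan", "SUV", "Sedan", "Truck", "SUV"],
    "combined_mpg": [28.0, 34.0, 24.0, 36.0, 20.0, 30.0],
})

def brand_counts(df: pd.DataFrame) -> pd.Series:
    """Number of rows per brand, most common first."""
    return df["make"].value_counts()

def mean_mpg_by_body_type(df: pd.DataFrame) -> pd.Series:
    """Average combined MPG per vehicle type, most efficient first."""
    return (
        df.groupby("body_type")["combined_mpg"]
        .mean()
        .sort_values(ascending=False)
    )

counts = brand_counts(cars)
mpg = mean_mpg_by_body_type(cars)
print(counts)
print(mpg)
```

The same `groupby` pattern would extend to the other objectives, for example grouping by brand and year for production trends.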
- Python 3.x
- 📊 Pandas – Data cleaning & manipulation
- 🧮 NumPy – Numerical computations
- 🎨 Matplotlib – Core visualizations
- 📈 Seaborn – Statistical & advanced plots
- 🤖 Scikit-learn – Basic preprocessing & analysis
- Handling missing values
- Data type corrections
- Feature selection & filtering
- Outlier inspection
- Structured formatting for analysis
These cleaning steps keep the resulting visualizations accurate, reliable, and interpretable.
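The cleaning steps listed above could look roughly like the following. This is a minimal sketch under stated assumptions, not the notebook's actual code: the column names, the sample values, the median fill, and the 1.5×IQR outlier rule are all illustrative choices.

```python
import numpy as np
import pandas as pd

# Hypothetical raw rows with typical problems: a missing brand,
# years stored as strings, a missing MPG value, and an implausible MPG.
raw = pd.DataFrame({
    "make": ["Toyota", "Ford", None, "Honda", "Kia"],
    "year": ["2020", "2021", "2019", "2022", "2020"],
    "combined_mpg": [30.0, np.nan, 25.0, 32.0, 250.0],
})

def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows without a brand, fix dtypes, and fill missing MPG."""
    out = df.dropna(subset=["make"]).copy()
    # Data type correction: year strings -> nullable integers.
    out["year"] = pd.to_numeric(out["year"], errors="coerce").astype("Int64")
    # Missing values: fill MPG with the column median (one possible choice).
    out["combined_mpg"] = out["combined_mpg"].fillna(out["combined_mpg"].median())
    return out

def iqr_outliers(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Boolean mask marking values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)

cleaned = clean(raw)
flags = iqr_outliers(cleaned["combined_mpg"])
print(cleaned)
print(cleaned[flags])  # rows worth inspecting before plotting
```

Flagged rows are only inspected here rather than dropped, since an extreme value may be a data-entry error or a genuinely unusual vehicle.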
The project includes:
- 📌 Bar charts for brand and vehicle distribution
- 📌 Pie charts for categorical comparisons
- 📌 Scatter plots for fuel efficiency relationships
- 📌 Violin plots for distribution analysis
- 📌 Trend analysis charts for production growth
Each visualization is designed to communicate clear and meaningful insights.
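For instance, a brand-distribution bar chart might be produced along these lines. The `make` column and the sample rows are hypothetical, and the styling is only one possible choice rather than what the notebook necessarily uses.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample rows; the real dataset's column names may differ.
cars = pd.DataFrame({
    "make": ["Toyota", "Toyota", "Ford", "Honda", "Ford", "Toyota"],
})

def plot_brand_counts(df: pd.DataFrame, top_n: int = 10):
    """Bar chart of the top_n most frequent brands."""
    counts = df["make"].value_counts().head(top_n)
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.bar(list(counts.index), counts.to_numpy())
    ax.set_xlabel("Brand")
    ax.set_ylabel("Number of cars")
    ax.set_title("Most common car brands")
    fig.tight_layout()
    return fig, ax

fig, ax = plot_brand_counts(cars)
# fig.savefig("visuals/brand_counts.png")  # optional: write the chart to disk
```

Returning `fig` and `ax` rather than calling `plt.show()` inside the function makes it easier to tweak or save the chart afterwards.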
- Dominant brands and market concentration patterns
- Fuel efficiency variations across vehicle segments
- Increasing trends in hybrid and alternative engine adoption
- Popular drivetrain preferences (e.g., AWD in SUVs)
- Brands offering superior ground clearance
These insights mirror the kind of findings produced in real-world automotive market research.
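A pattern such as drivetrain preference per vehicle type can be checked with a normalized cross-tabulation. The sketch below uses made-up rows and hypothetical column names (`body_type`, `drivetrain`), so the shares it prints illustrate the method only and are not findings from the dataset.

```python
import pandas as pd

# Made-up rows for illustration; not results from the actual dataset.
cars = pd.DataFrame({
    "body_type": ["SUV", "SUV", "SUV", "Sedan", "Sedan", "Truck"],
    "drivetrain": ["AWD", "AWD", "FWD", "FWD", "FWD", "RWD"],
})

def drivetrain_share(df: pd.DataFrame) -> pd.DataFrame:
    """Share of each drivetrain within every vehicle type (rows sum to 1)."""
    return pd.crosstab(df["body_type"], df["drivetrain"], normalize="index")

shares = drivetrain_share(cars)
print(shares.round(2))
```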
Car-EDA/
│
├── data/ # Dataset files
├── notebooks/ # Jupyter notebooks (EDA analysis)
├── visuals/ # Generated charts and plots
├── requirements.txt # Dependencies
└── README.md
git clone https://github.com/Jeet-Lohar-itzJeeSKUULL/Data_Analysis_EDA_Process.git
cd car-eda-project
python -m venv venv
source venv/bin/activate # Mac/Linux
venv\Scripts\activate # Windows
pip install -r requirements.txt
jupyter notebook
Open the EDA notebook and execute cells step-by-step.
- Strong data cleaning and preprocessing skills
- Practical use of Python data analysis libraries
- Ability to derive insights from real-world datasets
- Effective data storytelling through visualizations
- Structured exploratory workflow
- Analytical thinking for business-driven conclusions
- Implement predictive modeling (price prediction, demand trends)
- Emission and sustainability analysis
- Integration of additional automotive datasets
- Interactive dashboards using Plotly or Power BI
- Deployment as a web-based analytics dashboard
Contributions, suggestions, and improvements are welcome!
Feel free to:
- Fork the repository
- Raise issues
- Submit pull requests
Jeet Lohar
Data Analyst | Python Developer | Django Enthusiast
🔗 LinkedIn: https://www.linkedin.com/in/jeet-lohar/
This is not just a basic dataset exploration. It reflects:
- Real-world market analysis simulation
- Structured analytical thinking
- Clean visualization techniques
- Industry-relevant insights
- Business-focused interpretation of data
It showcases the ability to convert raw automotive data into meaningful, decision-support insights.
This project is licensed under the MIT License.