yunusdonmez-dev/Data-preprocessing

Team2 Data Processing Project

Introduction

The aim of this project is to preprocess and clean a car sales dataset by removing dirty records and outliers, preparing it for machine learning analysis. The dataset contains information about car sales, including attributes such as location, make, model, price, mileage, gearbox, fuel type, and power, among many others.
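The README does not say how outliers are detected, so the following is only a minimal sketch of one common approach, the interquartile range (IQR) rule. The function names, the `k=1.5` multiplier, and the sample prices are all hypothetical, and the code uses only the Python standard library so that it runs without extra packages.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return the (low, high) fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(rows, column, k=1.5):
    """Keep only the rows whose `column` value lies inside the IQR fences."""
    low, high = iqr_bounds([r[column] for r in rows], k)
    return [r for r in rows if low <= r[column] <= high]


# Hypothetical toy listings; the last price is an obvious data-entry error.
listings = [
    {"price": 9500}, {"price": 10200}, {"price": 11000},
    {"price": 12500}, {"price": 13000}, {"price": 990000},
]

cleaned = drop_outliers(listings, "price")
print(len(listings), "->", len(cleaned), "rows")
```

The IQR rule is robust to the extreme values it is trying to remove, which is why it is often preferred over a mean and standard deviation cutoff for skewed columns such as price.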

Dataset Description

This project analyzes a dataset of 62,766 rows and 49 columns pertaining to car sales. The columns include attributes of the car, such as make, model, and body type, as well as attributes of the sale, such as price and seller information. The data also includes technical specifications of the car, such as engine power, fuel consumption, and CO2 emissions.

Results and Insights

The dataset was cleaned and preprocessed before being used for machine learning analysis. The cleaning process involved removing duplicate and irrelevant data, filling in missing values, and converting some categorical data into numerical data. A linear regression model was then used to estimate car prices, with mileage and power as the independent variables.
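As a rough illustration of the steps described above, the sketch below removes duplicates, fills missing values, encodes a categorical column, and fits price = b0 + b1 * mileage + b2 * power by ordinary least squares. The column names come from the attribute list in the introduction. Everything else is an assumption: the sample rows, the helper names `clean`, `solve`, and `fit_ols`, the choice of median imputation, and the integer coding of `gearbox`. It uses only the standard library; the notebooks themselves may well rely on other tools.

```python
from statistics import median


def clean(rows):
    """Drop exact duplicates, median-fill missing numbers, encode gearbox."""
    seen, unique = set(), []
    for r in rows:
        key = tuple(sorted(r.items()))
        if key not in seen:          # keep the first copy of each record
            seen.add(key)
            unique.append(dict(r))
    for col in ("mileage", "power"):  # assumed imputation rule: column median
        med = median(r[col] for r in unique if r[col] is not None)
        for r in unique:
            if r[col] is None:
                r[col] = med
    labels = sorted({r["gearbox"] for r in unique})
    codes = {name: i for i, name in enumerate(labels)}
    for r in unique:                  # categorical -> integer code
        r["gearbox_code"] = codes[r["gearbox"]]
    return unique


def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ols(rows, features, target):
    """Return [intercept, coef_1, ...] from the normal equations X'X b = X'y."""
    X = [[1.0] + [float(r[f]) for f in features] for r in rows]
    y = [float(r[target]) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * t for x, t in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical sample rows, not taken from the real dataset.
raw = [
    {"mileage": 30000, "power": 110, "gearbox": "Manual", "price": 18500},
    {"mileage": 30000, "power": 110, "gearbox": "Manual", "price": 18500},
    {"mileage": 85000, "power": 90, "gearbox": "Manual", "price": 11200},
    {"mileage": None, "power": 150, "gearbox": "Automatic", "price": 21000},
    {"mileage": 120000, "power": None, "gearbox": "Automatic", "price": 9800},
    {"mileage": 60000, "power": 130, "gearbox": "Manual", "price": 17000},
]

cleaned = clean(raw)
coef = fit_ols(cleaned, ["mileage", "power"], "price")
print("intercept, mileage, power:", coef)
```

Solving the normal equations directly keeps the example short. For a dataset of the size described here, a library solver based on QR or SVD would usually be the more numerically stable choice.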

Future Work

There are several directions for future work on this project, such as:

1- Adding more data sources to enrich the dataset.
2- Trying other machine learning algorithms and hyperparameter tuning techniques.
3- Applying the models to real-world scenarios, such as car valuation or demand forecasting.
