The aim of this project is to clean and preprocess a car sales dataset, removing dirty and outlier records so that the data is ready for machine learning analysis. The dataset contains attributes such as location, make, model, price, mileage, gearbox, fuel type, and power, among many others.
The dataset consists of 62,766 rows and 49 columns. The columns cover attributes of the car, such as make, model, and body type; attributes of the sale, such as price and seller information; and technical specifications, such as engine power, fuel consumption, and CO2 emissions.
The dataset was cleaned and preprocessed before being used for machine learning analysis. Cleaning involved removing duplicate and irrelevant data, filling in missing values, and converting some categorical data into numerical data. Car prices were then estimated with a linear regression model that uses mileage and power as the independent variables.
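The cleaning and regression steps described above could look roughly like the sketch below. It is not the project's actual code. The data is a small synthetic table, the column names (make, gearbox, mileage, power, price) are assumed from the attribute list, and the specific choices (median and mode imputation, integer category codes, a 1.5 x IQR rule for price outliers) are illustrative assumptions rather than the methods confirmed for this project.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the car sales data; every value here is made up.
raw = pd.DataFrame({
    "make": ["Audi", "Audi", "BMW", "Opel", "Opel", "Ford", "Ford", "BMW", "Fiat"],
    "gearbox": ["Manual", "Manual", "Automatic", "Manual", None,
                "Manual", "Automatic", "Automatic", "Manual"],
    "mileage": [50000, 50000, 30000, 120000, 80000, np.nan, 10000, 60000, 70000],
    "power": [110, 110, 140, 66, 85, 74, 92, 130, 69],
    "price": [16500, 16500, 23000, 2900, 9750, 10100, 17800, 18500, 999999],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical cleaning pipeline; each step is an assumed choice."""
    # Remove exact duplicate rows.
    df = df.drop_duplicates().copy()

    # Fill missing values: median for numeric, most frequent value for categorical.
    df["mileage"] = df["mileage"].fillna(df["mileage"].median())
    df["gearbox"] = df["gearbox"].fillna(df["gearbox"].mode()[0])

    # Convert a categorical column into integer codes.
    df["gearbox_code"] = df["gearbox"].astype("category").cat.codes

    # Drop price outliers that fall outside 1.5 * IQR of the quartiles.
    q1, q3 = df["price"].quantile([0.25, 0.75])
    iqr = q3 - q1
    in_range = df["price"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return df[in_range].reset_index(drop=True)


cars = clean(raw)

# Fit price as a linear function of mileage and power.
model = LinearRegression().fit(cars[["mileage", "power"]], cars["price"])

# Estimate the price of an unseen (hypothetical) car.
new_car = pd.DataFrame({"mileage": [40000], "power": [100]})
predicted_price = float(model.predict(new_car)[0])
print(f"coefficients: {model.coef_}, intercept: {model.intercept_:.2f}")
print(f"predicted price: {predicted_price:.2f}")
```

On the real 62,766-row dataset, the same pattern would normally be combined with a train/test split and an error metric, so that the quality of the price estimate can be judged on data the model has not seen.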
There are several directions for future work on this project, such as:
1- Adding more data sources to enrich the dataset.
2- Trying other machine learning algorithms and hyperparameter tuning techniques.
3- Applying the models to real-world scenarios, such as car valuation or demand forecasting.