Almas-ansari/Ev-range-prediction

Electric Vehicle (EV) Range Prediction using Classical and Neural Regression Models

Abstract

This project investigates machine-learning approaches to predicting the remaining driving range of electric vehicles from onboard telemetry and contextual features. The work, conducted during a research internship at the Department of Management Studies, IIT Roorkee, implements a full pipeline (data collection and cleaning, feature encoding, baseline linear models, tree-based ensemble models, and a feed-forward neural network) and provides reproducible notebooks for exploratory analysis, preprocessing, model training, evaluation, and visualization.

Problem statement

Predict the remaining driving range (a continuous variable, in km) of an electric vehicle given a snapshot of vehicle state and environmental context (battery level, recent driving behavior, ambient conditions, etc.). Accurate short-term range prediction helps reduce driver range anxiety, enables smarter charging strategies, and improves energy management for EV fleets.

Data

Files present: data.csv (raw), data_cleaned.csv (processed), and two encoded feature variants, data_enc_dummies.csv and data_enc_label.csv.

Collection method: the scraping script data_scrapper.py assembled the dataset programmatically as part of the internship pipeline, after the required permission and license checks.

Typical features (as used across the notebooks): telemetry-like features (battery percentage/state of charge, recent distance/speed statistics), environmental/context features (temperature, possibly terrain), and engineered variables produced in the feature-extraction notebooks. Exact column names and counts are available in data_cleaned.csv and the feature notebooks.

Data snapshots/artifacts: the cleaned dataset and the two encoded variants (one-hot and label-encoded) are included to support different model types (tree vs. linear/NN).
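The two encoded variants can be reproduced with pandas. This is a minimal sketch; the column names below are illustrative stand-ins, since the real ones live in data_cleaned.csv:

```python
import pandas as pd

# Illustrative frame; the actual column names are in data_cleaned.csv
df = pd.DataFrame({
    "battery_pct": [80, 55, 30],
    "terrain": ["flat", "hilly", "flat"],
})

# One-hot variant (suits linear models and NNs) -> data_enc_dummies.csv
df_dummies = pd.get_dummies(df, columns=["terrain"])

# Label-encoded variant (suits tree models) -> data_enc_label.csv
df_label = df.copy()
df_label["terrain"] = df_label["terrain"].astype("category").cat.codes
```

Keeping both variants on disk lets each model notebook load the representation it works best with.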

Preprocessing & feature engineering

Implemented steps:

Data cleaning (data cleaning.ipynb): missing-value handling, basic sanity checks, duplicate removal, and timestamp parsing where applicable.

Feature extraction & encoding (Features extraction.ipynb, Feature_encoding.ipynb): features derived from telemetry (e.g., rolling averages over recent speed or distance windows), categorical encoding (one-hot and label encodings saved as separate CSVs), and normalization/standardization where needed for linear/NN models.

Exploration (basic_data_exploration.ipynb, Advanced data exploration.ipynb): univariate and bivariate analyses, target-distribution checks, and correlation inspection to select candidate features.

These steps are reproducible in the notebooks and produce the data_cleaned.csv and encoded variants used by the model notebooks.
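The rolling-window derivation mentioned above can be sketched in pandas. The column names and window size are assumptions for illustration, not the notebook's exact choices:

```python
import pandas as pd

# Hypothetical telemetry log, ordered by time; column names are illustrative
telemetry = pd.DataFrame({"speed_kmh": [40, 55, 62, 48, 70, 66]})

# Rolling statistics over the last 3 samples, in the spirit of the
# feature-extraction notebooks (min_periods=1 avoids leading NaNs)
telemetry["speed_roll_mean"] = (
    telemetry["speed_kmh"].rolling(window=3, min_periods=1).mean()
)
telemetry["speed_roll_max"] = (
    telemetry["speed_kmh"].rolling(window=3, min_periods=1).max()
)
```

Rolling features like these summarize recent driving behavior into a fixed-size snapshot that a tabular model can consume.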

Models implemented

Each model has an associated notebook that trains and evaluates it on the processed dataset:

Linear Regression (linear_regression.ipynb)

Purpose: a simple, interpretable baseline that sets a performance floor.

Preprocessing: feature scaling / encoding as required.
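A minimal scikit-learn sketch of such a baseline, on synthetic data standing in for the real features (the notebook's actual features and split differ):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(200, 2))  # stand-ins: battery %, avg speed
y = 3.5 * X[:, 0] - 0.8 * X[:, 1] + rng.normal(0, 5, 200)  # synthetic range (km)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Scaling + linear model in one pipeline, as the preprocessing note suggests
model = make_pipeline(StandardScaler(), LinearRegression())
model.fit(X_train, y_train)
score = model.score(X_test, y_test)  # R^2 on held-out data
```

Wrapping the scaler and regressor in a pipeline keeps the scaling fit on the training split only, avoiding leakage into the test set.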

Tree-based methods (Trees.ipynb)

Implemented algorithms: Decision Tree and ensemble methods (Random Forest, Gradient Boosting-style models).

Strengths: they handle nonlinear interactions and categorical features without heavy scaling, and serve as a useful reference for feature importance.
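The feature-importance reference mentioned above can be sketched with a random forest on synthetic data (feature names and data are illustrative, not the notebook's):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(300, 3))      # 3 illustrative features
y = 3.0 * X[:, 0] + rng.normal(0, 2, 300)   # only feature 0 drives the target

forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
importances = forest.feature_importances_    # normalized, sums to 1.0
```

Inspecting `importances` shows which inputs the ensemble actually relies on, which is a quick sanity check on the engineered features.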

Feed-forward Neural Network (MLP) (Feed forward neural netwrok.ipynb)

A simple multi-layer perceptron to model complex non-linear relations; notebook contains architecture, training loop, and loss curves.

Implemented with the deep-learning framework used in the notebook; see the notebook header for the exact framework and hyperparameters.

Notes: Notebooks include hyperparameter choices and training code — run them top-to-bottom to reproduce training and evaluation.
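Since the notebook header specifies the actual framework, the sketch below uses scikit-learn's MLPRegressor as a framework-neutral stand-in; the architecture and data are assumptions for illustration:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
X = rng.uniform(0, 100, size=(400, 2))
y = 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(0, 3, 400)

# Two hidden layers; scaling first, since MLPs are sensitive to input scale
mlp = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0),
)
mlp.fit(X, y)
r2 = mlp.score(X, y)  # training-set R^2
```

The notebook's version additionally records the training loss curve, which is the main diagnostic for choosing the number of epochs.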

Method

Data split: the notebooks hold out a test set via a train/test split to evaluate generalization; cross-validation and hyperparameter tuning are performed where indicated in the corresponding notebooks.

Evaluation metrics: Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE) are used as primary metrics for regression performance. Notebooks compute and report these metrics per model.
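Both metrics follow directly from their definitions; a self-contained sketch with made-up predictions:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Squared Error: penalizes large errors quadratically."""
    diff = np.asarray(y_true) - np.asarray(y_pred)
    return float(np.sqrt(np.mean(diff ** 2)))

def mae(y_true, y_pred):
    """Mean Absolute Error: average error magnitude, in the target's units (km)."""
    diff = np.asarray(y_true) - np.asarray(y_pred)
    return float(np.mean(np.abs(diff)))

# Illustrative actual vs. predicted remaining range (km)
y_true = [120.0, 80.0, 200.0]
y_pred = [110.0, 85.0, 195.0]
# errors: 10, 5, 5 -> RMSE = sqrt(150/3) ≈ 7.07 km, MAE = 20/3 ≈ 6.67 km
```

RMSE exceeding MAE, as here, indicates a few larger errors dominating the squared term; when the two are close, errors are uniform in size.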

Reproducibility: The repo contains cleaned datasets and model scripts/notebooks; saving trained artifacts and metrics to a results/ folder is recommended for future runs.

Suggested next steps

Add temporal models (LSTM/Transformer) for sequences of telemetry to capture state transitions.

Implement uncertainty quantification (e.g., quantile regression or Monte Carlo Dropout) to produce confidence bounds — valuable in driver-facing applications.

Evaluate model calibration and produce a lightweight on-device inference pipeline (pruning/distillation) for embedded deployment.

Create an evaluation suite that stresses the model on edge-case scenarios (cold start, steep slopes, heavy loads).
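The uncertainty-quantification suggestion above (quantile regression) can be sketched with scikit-learn's quantile loss; the data and quantile levels are illustrative assumptions:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(500, 1))       # e.g. battery %
y = 2.0 * X[:, 0] + rng.normal(0, 10, 500)   # synthetic range (km)

# One model per quantile yields a lower/upper bound on predicted range
lo = GradientBoostingRegressor(loss="quantile", alpha=0.1).fit(X, y)
hi = GradientBoostingRegressor(loss="quantile", alpha=0.9, ).fit(X, y)

X_new = np.array([[50.0]])
bounds = (lo.predict(X_new)[0], hi.predict(X_new)[0])  # ~80% interval
```

For a driver-facing display, showing the lower bound (a conservative range estimate) is typically safer than the point prediction.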

Reproducibility & artifacts

Run the notebooks in order: data cleaning → feature encoding → model training notebooks.
