This project builds a machine learning model that predicts students' final grades from academic, demographic, and socio-economic factors. The goal is to identify the factors that most influence student success and to develop a reliable predictive model from real-world data. The project covers:
- Data cleaning and preprocessing
- Exploratory Data Analysis (EDA) with visualizations
- Feature engineering and encoding categorical variables
- Model training using Random Forest Regressor
- Performance evaluation using R² score and Mean Squared Error
- Visualization of actual vs predicted results

Tools and libraries:
- Python
- Pandas
- Scikit-learn
- Seaborn
- Matplotlib
- Jupyter Notebook
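
The training and evaluation steps listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the data is synthetic, the feature names `hours_studied` and `attendance` are hypothetical, and the split ratio and `n_estimators` value are assumptions chosen only for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (hypothetical features, NOT the project's dataset).
rng = np.random.default_rng(0)
n = 300
hours_studied = rng.uniform(0, 10, n)
attendance = rng.uniform(50, 100, n)
X = np.column_stack([hours_studied, attendance])
y = 5 * hours_studied + 0.3 * attendance + rng.normal(0, 2, n)

# Hold out 20% of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit a Random Forest Regressor on the training rows.
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Evaluate with the two metrics named in the list above.
r2 = r2_score(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
print(f"R^2: {r2:.3f}, MSE: {mse:.3f}")

# Feature importances give a first look at which factors matter most.
importances = dict(zip(["hours_studied", "attendance"], model.feature_importances_))
print(importances)
```

The `feature_importances_` attribute is one common way to approach the "key factors" goal, though impurity-based importances can be biased toward high-cardinality features, so treat them as a starting point.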
The dataset (student_scores.csv) contains features such as student demographics and study habits, among other factors, along with each student's final grade.
(Include dataset source or upload dataset file if permitted)
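
Because the dataset's columns are not documented here, the cleaning and encoding steps can only be sketched on made-up rows. In the example below, every column name (`gender`, `parental_education`, `hours_studied`, `final_grade`) is hypothetical, and the choice of median imputation and one-hot encoding is an assumption rather than the project's confirmed approach.

```python
import pandas as pd

# Hypothetical rows standing in for the real dataset.
raw = pd.DataFrame({
    "gender": ["F", "M", "F", None, "M"],
    "parental_education": ["college", "high_school", "college", "college", None],
    "hours_studied": [6.0, 2.5, None, 4.0, 8.0],
    "final_grade": [82, 55, 74, 68, 91],
})

# Cleaning: drop rows with a missing categorical value,
# then fill missing numeric values with the column median.
cleaned = raw.dropna(subset=["gender", "parental_education"]).copy()
cleaned["hours_studied"] = cleaned["hours_studied"].fillna(
    cleaned["hours_studied"].median()
)

# Encoding: one-hot encode the categorical columns.
encoded = pd.get_dummies(
    cleaned, columns=["gender", "parental_education"], dtype=int
)

# Separate the features from the target.
X = encoded.drop(columns=["final_grade"])
y = encoded["final_grade"]
print(X.columns.tolist())
```

With the real file, the in-memory frame would be replaced by something like `pd.read_csv("student_scores.csv")`, with the column lists adjusted to match.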
- Clone the repository:

      git clone https://github.com/yourusername/student-performance-predictor.git

- Navigate to the project directory:

      cd student-performance-predictor

- (Optional but recommended) Create and activate a virtual environment:
  - On Linux/Mac:

        python3 -m venv venv
        source venv/bin/activate

  - On Windows:

        python -m venv venv
        venv\Scripts\activate

- Install the required libraries:

      pip install -r requirements.txt

- Launch the Jupyter Notebook:

      jupyter notebook Student_Performance_Predictor.ipynb

- Follow the instructions inside the notebook to input student data and get predictions.

student-performance-predictor/
│
├── LICENSE # Project license (MIT)
├── README.md # Project overview and instructions
├── requirements.txt # Python dependencies
├── student_score_predictor.py # Main Python script for prediction
├── student_scores.csv # Dataset file with student data
├── predicted_vs_actual.png # Visualization: predicted vs actual results
├── regression_plot.png # Visualization: regression plot
└── Student_Performance_Predictor.ipynb # Jupyter notebook with EDA, modeling, and evaluation
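
A plot along the lines of predicted_vs_actual.png could be produced with a sketch like the one below. The values are randomly generated placeholders rather than model output, and the output filename `predicted_vs_actual_sketch.png` is chosen here so that it does not overwrite the project's own image.

```python
import matplotlib
matplotlib.use("Agg")  # render to a file without needing a display
import matplotlib.pyplot as plt
import numpy as np

# Placeholder values (hypothetical, not real model predictions).
rng = np.random.default_rng(1)
y_test = rng.uniform(40, 100, 50)
y_pred = y_test + rng.normal(0, 5, 50)

fig, ax = plt.subplots(figsize=(6, 6))
ax.scatter(y_test, y_pred, alpha=0.7)

# Diagonal reference line: points on it are predicted exactly.
lims = [min(y_test.min(), y_pred.min()), max(y_test.max(), y_pred.max())]
ax.plot(lims, lims, linestyle="--", color="gray", label="perfect prediction")

ax.set_xlabel("Actual final grade")
ax.set_ylabel("Predicted final grade")
ax.set_title("Predicted vs actual")
ax.legend()

fig.savefig("predicted_vs_actual_sketch.png", dpi=150)
plt.close(fig)
```

Points far from the dashed diagonal are the students whose grades the model predicts worst, which makes this plot a quick visual complement to the R² and MSE figures.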