
Add Section 10: Deep Learning – LSTM and GRU Pedestrian Count Forecasting #1869

Open
Thivainv wants to merge 23 commits into master from Thivain_t1_26


@Thivainv
Collaborator

Summary
This pull request adds the deep learning section (Section 10) to the urban pedestrian
climate impact prediction notebook. This section builds on the data cleaning, EDA,
time series analysis, and feature engineering completed in earlier sections by
implementing and comparing three recurrent neural network architectures for hourly
pedestrian count forecasting.

What Was Added

  • Random seed configuration for full reproducibility across all runs
  • Input sequence creation (24-hour sliding window, 22 features)
  • Three deep learning model architectures:
    • Baseline LSTM (24,385 parameters)
    • Stacked LSTM (35,777 parameters)
    • GRU (19,009 parameters)
  • Model training with EarlyStopping and ReduceLROnPlateau callbacks
  • Model saving and loading for guaranteed reproducible evaluation
  • Evaluation using MAE, RMSE, and R² on the held-out test set
  • Training loss curve visualisations for all three models
  • Actual vs predicted plots across the first 200 test hours
  • Side-by-side model comparison table
  • Full markdown documentation with inline citations throughout
  • 11 new references (fn-53 to fn-63) added to the references section
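The parameter counts listed above can be sanity-checked with plain arithmetic. The sketch below is only an assumption about the layer sizes: the PR description does not state them, and the configuration shown (64 recurrent units, a 32-unit second LSTM layer for the stacked model, a 32-unit dense layer, and a single output unit) is simply one that reproduces the reported totals for 22 input features under Keras-style parameter counting. The helper names are hypothetical.

```python
# Hypothetical helpers: standard per-layer parameter formulas (Keras-style counting).

def dense_params(n_in, n_out):
    # Weight matrix plus one bias per output unit.
    return n_in * n_out + n_out

def lstm_params(n_in, units):
    # 4 gates, each with input weights, recurrent weights and one bias vector.
    return 4 * (units * (n_in + units) + units)

def gru_params(n_in, units, reset_after=True):
    # 3 gates; with reset_after=True (the TF2 Keras default) each gate
    # carries two bias vectors instead of one.
    biases = 2 * units if reset_after else units
    return 3 * (units * (n_in + units) + biases)

N_FEATURES = 22  # from the PR description

# ASSUMED layer sizes (not confirmed by the PR). Dropout layers add no parameters,
# and the 24-step window length does not affect the counts.
baseline = lstm_params(N_FEATURES, 64) + dense_params(64, 32) + dense_params(32, 1)
stacked = (lstm_params(N_FEATURES, 64) + lstm_params(64, 32)
           + dense_params(32, 32) + dense_params(32, 1))
gru = gru_params(N_FEATURES, 64) + dense_params(64, 32) + dense_params(32, 1)

print(baseline, stacked, gru)
```

Under these assumed sizes the GRU is smaller than the baseline LSTM because it has three gates rather than four.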

@Thivainv Thivainv requested review from Litxinh123 and NguyenMav May 13, 2026 02:23
Collaborator

@Litxinh123 Litxinh123 left a comment


Hi Thivain, looks good overall!

A few comments from my side:

  • The scenario, introduction, and learning outcomes are clearly written and align well with the pedestrian climate impact prediction topic.
  • The data preparation is strong, especially the hourly aggregation, climate-pedestrian merging, and time-series based feature engineering.
  • The chronological train/validation/test split is appropriate for this forecasting task and helps avoid future data leakage.
  • Model comparison is also well explained, with Baseline LSTM, Stacked LSTM, and GRU evaluated clearly using MAE, RMSE, and R².
  • The optimisation approach looks reasonable, especially with dropout, EarlyStopping, and ReduceLROnPlateau supporting more stable model training.

Overall, the notebook is well structured and the GRU result is clearly justified as the best model. I don’t see any major issue from my side. Thanks!
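For readers unfamiliar with the three metrics named in this review, here is a minimal pure-Python sketch of MAE, RMSE, and R². It is illustrative only: the notebook may well use library implementations instead, and the sample values are made up rather than taken from the PR's results.

```python
import math

def mae(y_true, y_pred):
    # Mean absolute error: average magnitude of the errors.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    # Root mean squared error: penalises large errors more heavily than MAE.
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    # Coefficient of determination: 1 minus residual variance over total variance.
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up hourly counts, for illustration only.
actual = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 140.0, 210.0, 240.0]

print(mae(actual, predicted), rmse(actual, predicted), r2(actual, predicted))
```

Lower MAE and RMSE and a higher R² indicate a better fit, which is the basis on which the three models are ranked.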

Collaborator

@NguyenMav NguyenMav left a comment


Hi Don, looks good on my end, didn't see anything glaringly wrong. Based on the documentation (https://github.com/Chameleon-company/MOP-Code/blob/master/datascience/documentation/Peer%20review%20work%20practices/Peer%20review%20work%20practices.pdf):

  • Functionality: The code works end-to-end, no errors popped up for me.
  • Reusability: The dataset seems to be updated in real-time, and the notebook has no errors despite these new additions.
  • Readability: Comments in the code blocks were provided, and explanations of the outputs were included.
  • Maintainability: The code is fairly easy to follow, and changes can be made if the client (City of Melbourne) wants to do anything with the use case.
  • Others: Australian English used, templates followed, V2.1 API followed, usecase naming followed, step-by-step tutorial with the headings and sub-headers added.

Thivainv added 21 commits May 17, 2026 19:49
…026/T1/UC00213_Urban_Pedestrian_Climate_Impact_Prediction directory
…026/T1/UC00213_Urban_Pedestrian_Climate_Impact_Prediction/test.txt
…026/T1/UC00213_Urban_Pedestrian_Climate_Impact_Prediction directory
…026/T1/UC00213_Urban_Pedestrian_Climate_Impact_Prediction/test.txt
Collaborator

@Litxinh123 Litxinh123 left a comment


Hi Thivainv, looks good overall!

  • The preprocessing workflow is well structured, especially the hourly aggregation, climate-pedestrian merging, and handling of datetime-based features.
  • The feature engineering part is also strong, including lag features, rolling averages, and cyclical time encoding for sequential modelling.
  • The visualisations are clear and help explain both pedestrian flow and climate behaviour effectively.
  • The chronological train/validation/test split is appropriate for this forecasting task and helps avoid data leakage.
  • The comparison between Baseline LSTM, Stacked LSTM, and GRU is presented clearly with suitable evaluation metrics.
  • The tuning techniques such as dropout, EarlyStopping, and ReduceLROnPlateau are also implemented properly to improve training stability and model performance.
  • All required files have been included for a more complete handover package.

Overall, the notebook is well organised and the deep learning workflow makes sense for this prediction task. Happy to approve!
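The chronological split and 24-hour windowing mentioned in this review can be sketched roughly as follows. This is an assumption-laden illustration, not the notebook's code: the 70/15/15 split percentages, the function names, and the synthetic data are all hypothetical. Only the 24-hour window and the 22 features come from the PR description.

```python
def chronological_split(n_rows, train_pct=70, val_pct=15):
    # Return index boundaries for a time-ordered split (no shuffling),
    # so validation and test rows are always later than training rows.
    # The percentages are ASSUMED; the PR does not state them.
    train_end = n_rows * train_pct // 100
    val_end = n_rows * (train_pct + val_pct) // 100
    return train_end, val_end

def make_windows(features, targets, window=24):
    # Each sample is the previous `window` hours of features;
    # the label is the target at the hour immediately after the window.
    X, y = [], []
    for end in range(window, len(features)):
        X.append(features[end - window:end])
        y.append(targets[end])
    return X, y

# Synthetic data: 200 hours, 22 features per hour (values are placeholders).
N_HOURS, N_FEATURES = 200, 22
features = [[float(h)] * N_FEATURES for h in range(N_HOURS)]
targets = [float(h) for h in range(N_HOURS)]

train_end, val_end = chronological_split(N_HOURS)

# Window each split separately so no window straddles a split boundary.
X_train, y_train = make_windows(features[:train_end], targets[:train_end])
X_val, y_val = make_windows(features[train_end:val_end], targets[train_end:val_end])
X_test, y_test = make_windows(features[val_end:], targets[val_end:])

print(len(X_train), len(X_train[0]), len(X_train[0][0]))
```

Splitting before windowing is what keeps future hours out of the training samples; shuffling rows first would mix later observations into earlier windows.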

Contributor

@molliefernandez-mentor molliefernandez-mentor left a comment


@Thivainv Please make sure you delete the files in your Playground folder


4 participants