Skip to content
/ PPTSA Public

PPTSA (Precipitation Probabilistic Time Series Analysis) examines the statistical structure and predictability of precipitation, emphasizing the impulse-like nature of rainfall and its implications for forecasting. The study uses ~140 years of daily precipitation data from Central Park, New York City.

License

Notifications You must be signed in to change notification settings

omidemam/PPTSA

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

28 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Precipitation Probabilistic Time Series Analysis
Frequency analysis, statistical learning, and uncertainty-aware forecasting of long-term precipitation records.

Made with Jupyter Python MATLAB License

🌧️ Overview

This repository presents a probabilistic time-series analysis of precipitation, focusing on the statistical structure, impulse-like behavior, and predictability of rainfall.

Using 140 years of daily precipitation data from Central Park, New York City, the project combines:

  • Frequency-domain analysis
  • Classical stochastic models
  • Point-process formulations
  • Deep learning approaches
  • Probabilistic uncertainty quantification

The goal is to understand precipitation dynamics, assess forecasting limits, and explicitly characterize uncertainty, particularly in the presence of intermittent and impulsive rainfall events.


🧾 Results

📌 Frequency-Domain Reconstruction (Daily PRCP)

Daily precipitation signal reconstruction and error

Frequency-domain reconstruction of daily precipitation (PRCP) using the dominant Fourier components (top frequencies). Left: original PRCP time series overlaid with the reconstructed signal. Right: reconstruction error (PRCP − reconstructed signal), highlighting how impulse-like rainfall events dominate the mismatch.

📌 Marked Point Process (MPP): State Estimation (two years)

MPP state estimation results

MPP state estimation results for two years: (a) the measured precipitation data, (b) the continuous decoded hidden state, and (c) the normalized decoded hidden state.

📌 LSTM + Gaussian Process (GP): Prediction + Uncertainty (1912)

LSTM and GP residual uncertainty (1912)

Observed precipitation values, LSTM predictions, LSTM uncertainty band, Gaussian processes of residuals, and their mean and confidence interval for the year 1912.


🔬 Methods & Modeling Framework

The repository implements and compares the following approaches:

  • Frequency Domain Analysis
    Identifies periodicity (or lack thereof) and highlights challenges introduced by impulse-like precipitation patterns.

  • ARIMAX / SARIMAX Models
    Captures trends and conditional dependencies using classical time-series models with exogenous variables.

  • Marked Point Process (MPP)
    Models precipitation as an impulse-driven stochastic process, offering physical and statistical intuition beyond Gaussian assumptions.

  • LSTM Neural Networks
    Learns temporal dependencies among multiple meteorological variables using tuned deep recurrent architectures.

  • Gaussian Process (GP) Residual Modeling
    Quantifies predictive uncertainty by modeling LSTM residuals probabilistically.


📊 Key Findings (High Level)

  • Precipitation exhibits no strong periodicity, complicating frequency-based forecasting.
  • ARIMAX/SARIMAX effectively captures long-term trends but struggles with impulses.
  • Marked Point Processes provide interpretability for impulse-like rainfall events.
  • LSTM models capture trends but show limited accuracy for exact daily values.
  • Gaussian Processes accurately model residual uncertainty, improving probabilistic predictions.

📁 Repository Structure

PPTSA/
├─ Data/
│  └─ Raw and processed meteorological datasets
│
├─ Figures/
│  └─ Generated plots, diagnostics, and visual summaries
│
├─ Models/
│  ├─ Frequency analysis
│  ├─ ARIMAX / SARIMAX models
│  ├─ Marked Point Process formulations
│  ├─ LSTM implementations
│  └─ Gaussian Process residual models
│
├─ Report/
│  └─ PDF/
│     └─ Paper.pdf
│
├─ README.md
└─ LICENSE


Dataset:

The dataset is publicly available on Kaggle: Central Park Weather Data 1869-2022.

🤝 Contributions

  • Omid Emamjomehzadeh
    Conceptualization; finding the dataset and learning how to use it; data preprocessing; methodology development; implementing the LSTM model; hyper-parameter tuning for the LSTM; applying a Gaussian Process to model the residuals of LSTM predictions; analysis and interpretation of results; presentation preparation; writing the original report draft; and editing.

  • Ahmadreza Ahmadjou
    Implementing the SARIMAX forecasting model (data processing, training, and parameter tuning via trial-and-error); Marked Point Process (MPP) approach, including conceptualization, implementation, and results analysis/interpretation; report writing (writing and editing); and presentation preparation.

  • Ruixuan Zhang
    Implementing frequency-domain analysis (DFT and DSTFT), results analysis and visualization, report writing, and presentation preparation.

📬 Contact

For questions, feedback, or collaboration opportunities, please email me at:
omid.emamjomehzadeh@nyu.edu


📚 Citation

If you use this repository in your research or projects, please cite it as follows.

@misc{emamjomehzadeh2024pptsa,
  author       = {Emamjomehzadeh, Omid and Zhang, Ruixuan and Ahmadjou, Ahmadreza},
  title        = {Daily Rainfall Time Series Analysis of Central Park, New York},
  year         = {2024},
  howpublished = {\url{https://github.com/omidemam/PPTSA}},
}





About

PPTSA (Precipitation Probabilistic Time Series Analysis) examines the statistical structure and predictability of precipitation, emphasizing the impulse-like nature of rainfall and its implications for forecasting. The study uses ~140 years of daily precipitation data from Central Park, New York City.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published