GitHub - VahidMonfared/Air-Pollution-Data-Science-EDA: Air Pollution Data Science EDA

Dimensionality Reduction – Air Pollution Dataset

Context: Air pollution is something we all experience, whether we're indoors or outdoors, in big cities or rural areas. It's often invisible but has a real impact on our health and environment. Over the past few decades, we've seen a steady increase in the levels of pollutants in the air. This has contributed not only to worsening air quality but also to changes in weather patterns and more frequent smog. The harmful effects on public health are significant and can even disrupt local economies. Urban areas, in particular, are struggling with the health consequences of rising pollution levels.

In this case study, we'll dive into and visualize an air pollution dataset. It includes detailed records of air pollutants and meteorological data, helping us better understand the composition and trends of air quality in a city over time.

Objective: The main goal here is to apply dimensionality reduction techniques, such as PCA (Principal Component Analysis) and t-SNE (t-distributed Stochastic Neighbor Embedding), to reduce the number of features in the dataset while still preserving its key patterns and insights.
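As a rough illustration of the objective, the sketch below runs both techniques with scikit-learn on synthetic data. It is not the notebook's code: the random matrix `X` merely stands in for the numeric pollutant and weather columns, and the 90% variance threshold and the perplexity of 30 are illustrative assumptions rather than values taken from this repository.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the numeric features (assumption: NOT the real data).
# Three hidden drivers are mixed into 12 correlated observed columns.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 12))
X = latent @ mixing + 0.1 * rng.normal(size=(200, 12))

# Standardize first so columns measured on different scales contribute comparably.
X_scaled = StandardScaler().fit_transform(X)

# PCA: keep the fewest components that explain at least 90% of the variance.
pca = PCA(n_components=0.90, svd_solver="full")
X_pca = pca.fit_transform(X_scaled)
explained = pca.explained_variance_ratio_.cumsum()

# t-SNE: nonlinear 2-D embedding, typically used for visualization only.
tsne = TSNE(n_components=2, perplexity=30, init="pca", random_state=0)
X_tsne = tsne.fit_transform(X_scaled)

print("PCA components kept:", X_pca.shape[1])
print("Cumulative explained variance:", explained.round(3))
print("t-SNE embedding shape:", X_tsne.shape)
```

PCA is a linear projection, so its components can be interpreted through their loadings on the original features. t-SNE preserves local neighborhoods rather than global distances, which is why it is usually reserved for plotting rather than as input to downstream models.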

Dataset Overview:

The dataset covers 13 months of air quality and weather data from a city.

It includes average daily readings of several pollutants:

    Nitric Oxide (NO), Nitrogen Dioxide (NO₂), Oxides of Nitrogen (NOx), Ozone, PM10, PM2.5 (Particulate Matter), Sulphur Dioxide (SO₂), Benzene, Toluene, and p-Xylene

    Units: Micrograms per cubic meter of air (µg/m³)

Meteorological features include:

    Wind Speed, Vertical Wind Speed, Barometric Pressure, Temperature, Wind Direction, and Weather condition

The dataset also provides previous day readings (prefix PD_) for five pollutants:

    PD_PM10, PD_NO2, PD_SO2, PD_CO, and PD_PM2.5

Each record is timestamped by a Date field, indicating when the data was captured.
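The README does not say how the PD_ columns were produced. One plausible way to derive a previous-day reading from a date-ordered daily table is a one-row shift in pandas, sketched below. The tiny frame and the column names `PM10` and `NO2` are hypothetical and chosen only for illustration; the actual column names in AirPollution.csv may differ.

```python
import pandas as pd

# Hypothetical miniature frame (assumption: values and column names are illustrative).
daily = pd.DataFrame({
    "Date": pd.date_range("2020-01-01", periods=5, freq="D"),
    "PM10": [80.0, 95.0, 70.0, 110.0, 90.0],
    "NO2": [30.0, 28.0, 35.0, 40.0, 33.0],
})

# Order by date so that "previous row" really means "previous day".
daily = daily.sort_values("Date").reset_index(drop=True)

# Shift each pollutant down by one row to get the previous day's reading.
for col in ["PM10", "NO2"]:
    daily[f"PD_{col}"] = daily[col].shift(1)

print(daily)
```

The first row has no previous day, so its PD_ values are missing (NaN). Such rows need to be dropped or imputed before PCA or t-SNE, since neither accepts missing values in scikit-learn.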

Repository files:

    AirPollution.csv

    AirPollution_PCA_and_TSNE_Data_Science_for_ML.ipynb

    Instruction_ imensionality Reduction_Air Pollution Dataset.txt

    README.md
