Macro-Sentiment-Engine

Geopolitical, Global Macro & Financial News Sentiment Analysis

Project overview

This project implements a macro-focused sentiment analysis system designed to ingest global financial and geopolitical news, classify dominant market narratives, and quantify sentiment trends over time.

The system is intended to support macro analysis, market monitoring, and research workflows, enabling users to observe how narratives such as inflation, monetary policy, geopolitics, and risk sentiment evolve and potentially become reflected in market expectations.

The project was built as part of a Data Bootcamp, but it deliberately uses modern, industry-relevant tools and patterns so that it can serve as a portfolio-grade project rather than a purely academic exercise.

Objectives

Ingest financial and geopolitical news from multiple sources
Clean and normalise unstructured text data
Apply NLP techniques to score sentiment
Classify news into interpretable macro narratives
Aggregate sentiment by source and narrative over time
Produce structured outputs suitable for dashboards and decision support
Maintain a modular architecture that supports future expansion

High-level architecture

News Feeds (RSS / APIs)
↓
Ingestion Layer
↓
Raw Articles (Database)
↓
Cleaning & Normalisation
↓
Sentiment Scoring (NLP)
↓
Narrative Classification
↓
Daily Aggregations
↓
Dashboards / Visualisations

Data sources

Public sources (MVP)

RSS feeds from major financial and global news providers
(e.g. Bloomberg Markets, BBC World; availability may vary depending on network restrictions)

These sources are used for the MVP to avoid licensing and credential dependencies while validating the full analytical pipeline.

Subscription-based / premium sources (optional extension)

The system is architected to support authenticated or subscription-based news feeds for users who have valid access rights. This includes, but is not limited to:

Enterprise financial news APIs
Authenticated RSS feeds
Licensed data platforms providing structured news content

Integration of these feeds is not enabled by default in the MVP, but the ingestion layer is deliberately modular so that premium sources can be added without refactoring the core pipeline.

This design allows the project to scale from an educational MVP to an institutional-style research tool.
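One way to picture the modular ingestion layer is a small source interface that public and authenticated feeds both implement. The sketch below is an assumption about how this could look, not the repository's actual code; `Article`, `NewsSource`, `StaticSource`, and `collect` are hypothetical names, and `StaticSource` stands in for a real feed client.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Protocol


@dataclass(frozen=True)
class Article:
    """Minimal article record shared by every source (hypothetical schema)."""
    source: str
    link: str
    title: str
    published_at: datetime


class NewsSource(Protocol):
    """Anything that can yield articles: RSS, API, or licensed feed."""
    name: str

    def fetch(self) -> Iterable[Article]: ...


class StaticSource:
    """Stand-in source returning fixed articles instead of calling a real feed."""

    def __init__(self, name: str, articles: list[Article]) -> None:
        self.name = name
        self._articles = articles

    def fetch(self) -> Iterable[Article]:
        return iter(self._articles)


def collect(sources: Iterable[NewsSource]) -> list[Article]:
    """Pull from every registered source; downstream steps never see source details."""
    collected: list[Article] = []
    for source in sources:
        collected.extend(source.fetch())
    return collected


public = StaticSource("public_rss", [
    Article("public_rss", "https://example.com/a", "Headline A",
            datetime(2024, 1, 2, tzinfo=timezone.utc)),
])
premium = StaticSource("premium_api", [
    Article("premium_api", "https://example.com/b", "Headline B",
            datetime(2024, 1, 3, tzinfo=timezone.utc)),
])
articles = collect([public, premium])
```

Under this design, adding a premium feed means writing one more class with a `fetch` method and adding it to the list passed to `collect`; cleaning, scoring, and aggregation stay unchanged.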

Core functionality

1. News ingestion

Modular ingestion pipeline
Support for multiple feed endpoints
Deduplication by article link
Preservation of source and publication timestamp
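Deduplication by link can be pushed down to the database with a uniqueness constraint. The following is a minimal sketch under stated assumptions: the `raw_articles` table, its columns, and `store_entries` are hypothetical, and the input dictionaries only imitate the shape of parsed feed entries.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS raw_articles (
    id           INTEGER PRIMARY KEY,
    link         TEXT NOT NULL UNIQUE,   -- deduplication key
    source       TEXT NOT NULL,          -- which feed the article came from
    title        TEXT,
    published_at TEXT                    -- publication timestamp as supplied
)
"""


def store_entries(conn: sqlite3.Connection, source: str, entries: list[dict]) -> int:
    """Insert entries, skipping links already stored; returns the rows actually added."""
    added = 0
    for entry in entries:
        cursor = conn.execute(
            "INSERT OR IGNORE INTO raw_articles (link, source, title, published_at) "
            "VALUES (?, ?, ?, ?)",
            (entry["link"], source, entry.get("title"), entry.get("published")),
        )
        added += cursor.rowcount  # 1 if inserted, 0 if the link already existed
    conn.commit()
    return added


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

entries = [
    {"link": "https://example.com/a", "title": "Rates on hold",
     "published": "2024-01-02T09:00:00+00:00"},
    {"link": "https://example.com/a", "title": "Rates on hold",
     "published": "2024-01-02T09:00:00+00:00"},  # duplicate link
    {"link": "https://example.com/b", "title": "Oil climbs",
     "published": "2024-01-02T10:30:00+00:00"},
]
first_run = store_entries(conn, "example_feed", entries)
second_run = store_entries(conn, "example_feed", entries)
```

Because the constraint lives in the schema, re-running ingestion over the same feed adds nothing new, which keeps repeated pipeline runs idempotent.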

2. Data cleaning

HTML stripping and text normalisation
Consistent datetime handling
Separation of raw and cleaned data layers
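The cleaning step can be illustrated with a short sketch. The project lists BeautifulSoup for this work; the example below swaps in the standard library's `html.parser` so that it runs without extra dependencies. The function names `clean_text` and `to_utc` are hypothetical, and treating timezone-less timestamps as UTC is an assumption.

```python
import re
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and ignores tags; entities are decoded by the parser."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts: list[str] = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def clean_text(raw_html: str) -> str:
    """Strip tags and collapse runs of whitespace into single spaces."""
    parser = _TextExtractor()
    parser.feed(raw_html)
    parser.close()  # flush any buffered trailing text
    # Joining with a space keeps adjacent block elements from fusing into one word.
    return re.sub(r"\s+", " ", " ".join(parser.parts)).strip()


def to_utc(published: str) -> datetime:
    """Parse an RFC 2822 date string (common in RSS) and convert it to UTC."""
    parsed = parsedate_to_datetime(published)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return parsed.astimezone(timezone.utc)


cleaned = clean_text("<p>Inflation eased to <b>3.1%</b>\n in the US &amp; UK</p>")
published_utc = to_utc("Tue, 02 Jan 2024 09:00:00 -0500")
```

Storing `cleaned` in a separate table or column from the raw HTML preserves the raw layer, so the cleaning rules can be changed and re-applied later.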

3. Sentiment analysis

Baseline NLP sentiment scoring using a transparent, lexicon-based model (VADER)
Compound sentiment scores per article
Model version recorded alongside outputs
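To show how a compound score and a model version can travel together, here is a hedged sketch with a pluggable scorer. The toy lexicon, `MODEL_VERSION`, `toy_compound`, `SentimentResult`, and `score_article` are all hypothetical; the toy scorer only mimics the way VADER squashes a summed valence into the range -1 to 1 and is not a substitute for it.

```python
import math
import re
from dataclasses import dataclass
from typing import Callable

# Toy lexicon for illustration only; VADER ships a much larger, validated one.
TOY_LEXICON = {"gain": 1.5, "rally": 2.0, "growth": 1.0,
               "loss": -1.5, "crisis": -2.5, "recession": -2.0}
MODEL_VERSION = "toy-lexicon-0.1"  # hypothetical version label


def toy_compound(text: str) -> float:
    """Sum word valences, then normalise with x / sqrt(x^2 + 15)."""
    words = re.findall(r"[a-z]+", text.lower())
    total = sum(TOY_LEXICON.get(word, 0.0) for word in words)
    return total / math.sqrt(total * total + 15)


@dataclass(frozen=True)
class SentimentResult:
    article_id: int
    compound: float
    model_version: str


def score_article(article_id: int, text: str,
                  scorer: Callable[[str], float] = toy_compound,
                  model_version: str = MODEL_VERSION) -> SentimentResult:
    """Score one article and record which model produced the number."""
    return SentimentResult(article_id, scorer(text), model_version)


positive = score_article(1, "Equities rally on strong growth")
negative = score_article(2, "Banking crisis deepens recession fears")
neutral = score_article(3, "Central bank meets on Thursday")
```

With the `vaderSentiment` package installed, a function returning `SentimentIntensityAnalyzer().polarity_scores(text)["compound"]` could be passed as `scorer`, with `model_version` set to the installed package version. Recording the version makes it possible to tell which scores need recomputing after a model change.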

4. Narrative classification

Articles are classified into high-level macro narratives using a rule-based taxonomy prioritising interpretability:

Inflation
Monetary policy / interest rates
Growth & recession
Geopolitics
Energy & commodities
Risk sentiment

This approach enables clear reasoning about why an article is associated with a narrative, which is essential in financial analysis contexts.
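A rule-based tagger of this kind can be sketched as a keyword lookup that returns the matching terms along with each narrative. The keywords below are hypothetical examples chosen for illustration and are not the project's actual taxonomy; `tag_narratives` here is likewise an assumed name and signature.

```python
import re

# Hypothetical keyword taxonomy, keyed by the narratives listed above.
NARRATIVE_KEYWORDS = {
    "Inflation": ["inflation", "cpi", "consumer prices"],
    "Monetary policy / interest rates": ["interest rate", "rate hike", "central bank", "fed"],
    "Growth & recession": ["gdp", "recession", "slowdown"],
    "Geopolitics": ["sanctions", "conflict", "election"],
    "Energy & commodities": ["oil", "opec", "natural gas", "copper"],
    "Risk sentiment": ["sell-off", "safe haven", "volatility"],
}


def tag_narratives(text: str) -> dict[str, list[str]]:
    """Return each matched narrative with the keywords that triggered it."""
    lowered = text.lower()
    matches: dict[str, list[str]] = {}
    for narrative, keywords in NARRATIVE_KEYWORDS.items():
        # Word boundaries stop "oil" from matching inside words such as "turmoil".
        hits = [kw for kw in keywords
                if re.search(rf"\b{re.escape(kw)}\b", lowered)]
        if hits:
            matches[narrative] = hits
    return matches


tags = tag_narratives("Fed signals rate hike as inflation stays hot; oil slips")
```

Returning the triggering keywords, rather than only the label, is what makes the classification auditable: an analyst can see exactly which terms placed an article under a narrative. An article can also carry several narratives at once.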

5. Aggregation & analytics

Daily sentiment by source
Daily sentiment by narrative
Article counts and averages suitable for dashboards and reporting
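The daily aggregations amount to grouping scored articles by day and by source or narrative, then reporting a count and a mean. The sketch below uses only the standard library so that it runs anywhere; the project lists pandas for data processing, where a `groupby` would express the same grouping. The sample rows and the `daily_sentiment` helper are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical scored articles: (publication date, source, narrative, compound score).
scored = [
    (date(2024, 1, 2), "feed_a", "Inflation", 0.40),
    (date(2024, 1, 2), "feed_a", "Geopolitics", -0.20),
    (date(2024, 1, 2), "feed_b", "Inflation", -0.60),
    (date(2024, 1, 3), "feed_a", "Inflation", 0.10),
]


def daily_sentiment(rows: list[tuple], key_index: int) -> list[dict]:
    """Group by (day, key) and report article count and mean compound score."""
    buckets: dict[tuple, list[float]] = defaultdict(list)
    for row in rows:
        buckets[(row[0], row[key_index])].append(row[3])
    return [
        {"date": day.isoformat(), "key": key,
         "article_count": len(scores), "avg_compound": round(mean(scores), 4)}
        for (day, key), scores in sorted(buckets.items())
    ]


by_source = daily_sentiment(scored, key_index=1)      # group on the source column
by_narrative = daily_sentiment(scored, key_index=2)   # group on the narrative column
```

Keeping the article count next to the average matters for interpretation: a strongly negative daily mean built from a single article deserves less weight than one built from dozens.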

Technologies used

Python 3.12
SQLAlchemy (database modelling and access)
SQLite (MVP persistence layer)
pandas / numpy (data processing)
feedparser / BeautifulSoup (ingestion and cleaning)
VADER Sentiment (baseline NLP sentiment analysis)
GitHub Codespaces (cloud development environment)
Streamlit (interactive demo application)
Power BI (external dashboarding and visual analytics)

Repository structure

macro-sentiment-engine/
├── app/                      # Streamlit demo app
├── db/
│   ├── models.py             # Database schema
│   └── init_db.py            # Database initialisation
├── pipeline/
│   ├── ingest_rss.py         # News ingestion
│   ├── clean.py              # Text cleaning
│   ├── score_sentiment.py    # NLP sentiment scoring
│   ├── narratives.py         # Narrative taxonomy
│   ├── tag_narratives.py     # Narrative tagging
│   ├── aggregate.py          # Aggregation by source
│   └── run_pipeline.py       # End-to-end pipeline runner
├── data_out/
│   ├── sentiment.db
│   ├── daily_sentiment_by_source.csv
│   └── daily_sentiment_by_narrative.csv
├── requirements.txt
└── README.md

Running the project

Setup

python -m pip install -r requirements.txt

Initialise the database

python -m db.init_db

Run the full pipeline

python pipeline/run_pipeline.py

Outputs

SQLite database containing raw articles, cleaned text, sentiment scores, and narrative tags
CSV files for dashboarding:
- Daily sentiment by source
- Daily sentiment by narrative
Interactive Streamlit dashboard (optional deployment)

Limitations

RSS feed availability may vary depending on network restrictions
Rule-based narrative tagging may not capture nuanced or implicit themes
Baseline sentiment models may struggle with sarcasm or complex financial language
The system does not attempt to infer causality between sentiment and market prices

Future extensions

Integration of licensed, subscription-based news feeds via authenticated APIs
Replacement of baseline sentiment models with finance-specific transformer models
Event-based sentiment and “surprise” scoring
Integration of market price data for sentiment–price divergence analysis
Source credibility weighting and time-decay modelling
Deployment as a hosted API or research tool
