jmankani-coder/macro-sentiment-engine

Macro-Sentiment-Engine

Geopolitical, Global Macro & Financial News Sentiment Analysis

Project overview

This project implements a macro-focused sentiment analysis system designed to ingest global financial and geopolitical news, classify dominant market narratives, and quantify sentiment trends over time.

The system is intended to support macro analysis, market monitoring, and research workflows, enabling users to observe how narratives such as inflation, monetary policy, geopolitics, and risk sentiment evolve and potentially become reflected in market expectations.

The project was built as part of a Data Bootcamp but deliberately uses modern, industry-relevant tools and patterns, so that it can serve as a portfolio-grade project rather than a purely academic exercise.

Objectives

  • Ingest financial and geopolitical news from multiple sources
  • Clean and normalise unstructured text data
  • Apply NLP techniques to score sentiment
  • Classify news into interpretable macro narratives
  • Aggregate sentiment by source and narrative over time
  • Produce structured outputs suitable for dashboards and decision support
  • Maintain a modular architecture that supports future expansion

High-level architecture

News Feeds (RSS / APIs)
        ↓
Ingestion Layer
        ↓
Raw Articles (Database)
        ↓
Cleaning & Normalisation
        ↓
Sentiment Scoring (NLP)
        ↓
Narrative Classification
        ↓
Daily Aggregations
        ↓
Dashboards / Visualisations

Data sources

Public sources (MVP)

  • RSS feeds from major financial and global news providers
    (e.g. Bloomberg Markets, BBC World; availability may vary depending on network restrictions)

These sources are used for the MVP to avoid licensing and credential dependencies while validating the full analytical pipeline.

Subscription-based / premium sources (optional extension)

The system is architected to support authenticated or subscription-based news feeds for users who have valid access rights. This includes, but is not limited to:

  • Enterprise financial news APIs
  • Authenticated RSS feeds
  • Licensed data platforms providing structured news content

Integration of these feeds is not enabled by default in the MVP, but the ingestion layer is deliberately modular so that premium sources can be added without refactoring the core pipeline.

This design allows the project to scale from an educational MVP to an institutional-style research tool.
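One way to picture the modular ingestion layer described above is a small source interface that both public and premium feeds could implement. This is only a sketch: `NewsSource`, `StaticSource`, and `ingest_all` are hypothetical names chosen for illustration and are not taken from the repository.

```python
from typing import Iterable, Protocol


class NewsSource(Protocol):
    """Hypothetical interface: anything with a name and a fetch() method."""

    name: str

    def fetch(self) -> Iterable[dict]:
        ...


class StaticSource:
    """Toy source returning fixed items; stands in for an RSS or API client."""

    def __init__(self, name: str, items: Iterable[dict]) -> None:
        self.name = name
        self._items = list(items)

    def fetch(self) -> list[dict]:
        return list(self._items)


def ingest_all(sources: Iterable[NewsSource]) -> list[dict]:
    """Collect items from every source and stamp each one with its source name."""
    articles = []
    for source in sources:
        for item in source.fetch():
            articles.append({**item, "source": source.name})
    return articles
```

Under this assumption, adding an authenticated feed would mean writing one more class with a `fetch` method, leaving the downstream cleaning and scoring steps untouched.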

Core functionality

1. News ingestion

  • Modular ingestion pipeline
  • Support for multiple feed endpoints
  • Deduplication by article link
  • Preservation of source and publication timestamp
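
Deduplication by article link can be sketched as follows. The `RawArticle` fields and the `deduplicate_by_link` helper are assumptions made for illustration; the actual schema is defined in `db/models.py` and may differ.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class RawArticle:
    # Hypothetical record shape, not the repository's real model.
    source: str
    link: str
    title: str
    published: str  # publication timestamp exactly as supplied by the feed


def deduplicate_by_link(
    articles: Iterable[RawArticle], seen_links: set[str]
) -> list[RawArticle]:
    """Return articles whose link is new, adding each new link to seen_links."""
    fresh = []
    for article in articles:
        key = article.link.strip()
        if not key or key in seen_links:
            continue  # skip items with no link or an already stored link
        seen_links.add(key)
        fresh.append(article)
    return fresh
```

In a database-backed pipeline, `seen_links` would typically be loaded from the links already stored, or replaced by a unique constraint on the link column.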

2. Data cleaning

  • HTML stripping and text normalisation
  • Consistent datetime handling
  • Separation of raw and cleaned data layers
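
A minimal sketch of the cleaning step, using only the standard library. The project lists BeautifulSoup for this purpose; the regex-based tag stripping below is a simplified stand-in, and `clean_text` and `to_utc` are hypothetical helper names.

```python
import html
import re
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

_TAG_RE = re.compile(r"<[^>]+>")
_WS_RE = re.compile(r"\s+")


def clean_text(raw: str) -> str:
    """Strip tags, decode HTML entities, and collapse runs of whitespace."""
    no_tags = _TAG_RE.sub(" ", raw)
    decoded = html.unescape(no_tags)
    return _WS_RE.sub(" ", decoded).strip()


def to_utc(published: str) -> datetime:
    """Parse an RFC 2822 date (common in RSS feeds) into an aware UTC datetime."""
    parsed = parsedate_to_datetime(published)
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)
```

Storing the output of `clean_text` next to the untouched raw text, rather than overwriting it, keeps the raw and cleaned layers separate.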

3. Sentiment analysis

  • Baseline NLP sentiment scoring using a transparent, interpretable model
  • Compound sentiment scores per article
  • Model version recorded alongside outputs
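
Recording the model version alongside each compound score might look like the sketch below. `SentimentResult`, `score_article`, and the version label are hypothetical; the scorer is passed in as a callable so the example does not depend on any particular library.

```python
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass(frozen=True)
class SentimentResult:
    compound: float     # expected to lie in [-1, 1]
    model_version: str  # label identifying the model that produced the score


def score_article(
    text: str,
    scorer: Callable[[str], Mapping[str, float]],
    model_version: str,
) -> SentimentResult:
    """Run the scorer on the text and attach the model version to its compound score."""
    scores = scorer(text)
    compound = float(scores["compound"])
    if not -1.0 <= compound <= 1.0:
        raise ValueError(f"compound score out of range: {compound}")
    return SentimentResult(compound=compound, model_version=model_version)
```

With the `vaderSentiment` package installed, `SentimentIntensityAnalyzer().polarity_scores` can be passed as `scorer`, since it returns a dictionary that includes a `compound` key.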

4. Narrative classification

Articles are classified into high-level macro narratives using a rule-based taxonomy prioritising interpretability:

  • Inflation
  • Monetary policy / interest rates
  • Growth & recession
  • Geopolitics
  • Energy & commodities
  • Risk sentiment

This approach enables clear reasoning about why an article is associated with a narrative, which is essential in financial analysis contexts.
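
A rule-based tagger of this kind can be sketched as a keyword lookup that returns the matched terms, so each tag carries its own explanation. The keyword lists below are invented for illustration and are not the taxonomy in `pipeline/narratives.py`.

```python
import re

# Hypothetical keyword lists; the real taxonomy may be larger or different.
NARRATIVE_KEYWORDS = {
    "Inflation": ["inflation", "cpi", "consumer prices"],
    "Monetary policy / interest rates": ["rate hike", "rate cut", "central bank", "interest rate"],
    "Growth & recession": ["recession", "gdp", "slowdown"],
    "Geopolitics": ["sanctions", "conflict", "election"],
    "Energy & commodities": ["oil", "opec", "natural gas"],
    "Risk sentiment": ["sell-off", "risk-off", "volatility"],
}


def tag_narratives(text: str) -> dict[str, list[str]]:
    """Map each matching narrative to the keywords that triggered it."""
    lowered = text.lower()
    matches = {}
    for narrative, keywords in NARRATIVE_KEYWORDS.items():
        # Word boundaries stop "oil" from matching inside "turmoil".
        hits = [kw for kw in keywords if re.search(rf"\b{re.escape(kw)}\b", lowered)]
        if hits:
            matches[narrative] = hits
    return matches
```

An article can match several narratives at once, and returning the matched keywords makes it possible to show why each tag was assigned.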

5. Aggregation & analytics

  • Daily sentiment by source
  • Daily sentiment by narrative
  • Article counts and averages suitable for dashboards and reporting
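
The daily aggregations can be illustrated with a small standard-library sketch. The project lists pandas for data processing, so the real implementation likely uses a group-by instead; `daily_sentiment` and the row keys used here are assumptions.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_sentiment(rows: list[dict], key: str) -> list[dict]:
    """Group rows by calendar day and `key`, returning counts and mean compound scores.

    Each row is assumed to hold a 'published' datetime, a 'compound' float,
    and a value under `key` (for example 'source' or 'narrative').
    """
    buckets = defaultdict(list)
    for row in rows:
        day = row["published"].date().isoformat()
        buckets[(day, row[key])].append(row["compound"])
    return [
        {
            "date": day,
            key: group,
            "article_count": len(scores),
            "avg_compound": round(mean(scores), 4),
        }
        for (day, group), scores in sorted(buckets.items())
    ]
```

Calling the same function with `key="narrative"` would give the per-narrative view, provided each row carries a single narrative label.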

Technologies used

  • Python 3.12
  • SQLAlchemy (database modelling and access)
  • SQLite (MVP persistence layer)
  • pandas / numpy (data processing)
  • feedparser / BeautifulSoup (ingestion and cleaning)
  • VADER Sentiment (baseline NLP sentiment analysis)
  • GitHub Codespaces (cloud development environment)
  • Streamlit (interactive demo application)
  • Power BI (external dashboarding and visual analytics)

Repository structure

macro-sentiment-engine/
├── app/                                  # Streamlit demo app
├── db/
│   ├── models.py                         # Database schema
│   └── init_db.py                        # Database initialisation
├── pipeline/
│   ├── ingest_rss.py                     # News ingestion
│   ├── clean.py                          # Text cleaning
│   ├── score_sentiment.py                # NLP sentiment scoring
│   ├── narratives.py                     # Narrative taxonomy
│   ├── tag_narratives.py                 # Narrative tagging
│   ├── aggregate.py                      # Aggregation by source
│   └── run_pipeline.py                   # End-to-end pipeline runner
├── data_out/
│   ├── sentiment.db
│   ├── daily_sentiment_by_source.csv
│   └── daily_sentiment_by_narrative.csv
├── requirements.txt
└── README.md

Running the project

Setup

python -m pip install -r requirements.txt

Initialise the database

python -m db.init_db

Run the full pipeline

python pipeline/run_pipeline.py

Outputs

  • SQLite database containing raw articles, cleaned text, sentiment scores, and narrative tags
  • CSV files for dashboarding:
    • Daily sentiment by source
    • Daily sentiment by narrative
  • Interactive Streamlit dashboard (optional deployment)

Limitations

  • RSS feed availability may vary depending on network restrictions
  • Rule-based narrative tagging may not capture nuanced or implicit themes
  • Baseline sentiment models may struggle with sarcasm or complex financial language
  • The system does not attempt to infer causality between sentiment and market prices

Future extensions

  • Integration of licensed, subscription-based news feeds via authenticated APIs
  • Replacement of baseline sentiment models with finance-specific transformer models
  • Event-based sentiment and “surprise” scoring
  • Integration of market price data for sentiment–price divergence analysis
  • Source credibility weighting and time-decay modelling
  • Deployment as a hosted API or research tool
