AryanSharma1017/ASX-Stock-Prediction

🚀 Alpha Engine

A Multimodal Hybrid Deep Learning Framework for ASX Stock Prediction


🧠 Research-Grade Financial AI System

Multimodal Deep Learning + Financial NLP + Production Deployment

🌐 Live Application


📊 Predicting ASX Stock Direction Using:

📈 Technical Indicators • 💬 Reddit Sentiment • 📰 Financial News • 🌍 Macroeconomics • 🧠 Sequential Deep Learning


📌 Overview

Alpha Engine is a research-grade multimodal forecasting framework that predicts the directional movement (up or down) of Australian Securities Exchange (ASX) stocks by combining deep learning with several heterogeneous financial data sources.

The framework integrates:

  • 📈 Historical stock market data
  • 📊 Technical indicators
  • 💬 Reddit financial sentiment
  • 📰 Financial news sentiment
  • 🌍 Macroeconomic market-regime indicators
  • 🧠 Sequential deep learning architectures

The project investigates whether multimodal temporal fusion architectures can outperform traditional standalone machine learning approaches for ASX stock forecasting.



🌐 Live Demo

🚀 Deployed AI Application

🔗 Frontend (Vercel)

https://asx-stock-prediction-dpm6ecmax-aryansharma1017s-projects.vercel.app


✨ Features Available in Deployment

✅ Real-time ASX prediction interface
✅ Dynamic model switching
✅ Probability confidence visualization
✅ Deep learning inference API
✅ Multi-model forecasting engine
✅ Modern financial dashboard UI
✅ Production-ready FastAPI backend
✅ Cloud-hosted multimodal AI system


🧠 Currently Deployed Models

| Model | Configuration | Test AUC |
|-------|---------------|----------|
| 🥇 Best Model | LSTM + Transformer + News FinBERT | 0.6449 |
| 🥈 Second Model | LSTM + Transformer + Macro + Reddit + News | 0.6237 |

🎯 Research Objectives

| RQ | Research Question |
|----|-------------------|
| RQ1 | How does integrating sentiment data affect ASX stock prediction performance? |
| RQ2 | Do macroeconomic and geopolitical variables improve forecasting capability? |
| RQ3 | Do multimodal sequential architectures outperform traditional ML models? |
| RQ4 | How effectively can deep sequential models learn temporal financial dependencies? |

🏗️ System Architecture

                    ┌────────────────────┐
                    │  Yahoo Finance API │
                    └─────────┬──────────┘
                              │
                    Historical OHLCV Data
                              │
                              ▼
                    ┌────────────────────┐
                    │ Technical Features │
                    └─────────┬──────────┘
                              │
        ┌─────────────────────┼─────────────────────┐
        │                     │                     │
        ▼                     ▼                     ▼

┌────────────────┐  ┌────────────────┐  ┌──────────────────┐
│ Reddit Dataset │  │ News Headlines │  │ Macro Indicators │
└───────┬────────┘  └────────┬───────┘  └────────┬─────────┘
        │                    │                   │
        ▼                    ▼                   ▼

┌────────────────┐  ┌────────────────┐  ┌──────────────────┐
│ FinBERT NLP    │  │ FinBERT /      │  │ Market Regime    │
│ Sentiment      │  │ VADER Pipeline │  │ Features         │
└───────┬────────┘  └────────┬───────┘  └────────┬─────────┘
        │                    │                   │
        └────────────┬───────┴────────────┬──────┘
                     ▼                    ▼

              ┌─────────────────────────────┐
              │ Multimodal Feature Fusion   │
              └─────────────┬──────────────┘
                            ▼

          ┌──────────────────────────────────┐
          │ Sequential Deep Learning Models  │
          │ LSTM / Transformer / Hybrid      │
          └──────────────────────────────────┘

☁️ Production Deployment Architecture

                    ┌──────────────────────────┐
                    │      Next.js Frontend    │
                    │        (Vercel)          │
                    └────────────┬─────────────┘
                                 │
                       HTTPS REST API Calls
                                 │
                                 ▼

                    ┌──────────────────────────┐
                    │      FastAPI Backend     │
                    │         (Render)         │
                    └────────────┬─────────────┘
                                 │
                ┌────────────────┼────────────────┐
                │                │                │
                ▼                ▼                ▼

      ┌────────────────┐ ┌────────────────┐ ┌────────────────┐
      │ PyTorch Models │ │ Scalers (.pkl) │ │ Metadata JSON  │
      └────────────────┘ └────────────────┘ └────────────────┘
                                 │
                                 ▼

                  ┌─────────────────────────┐
                  │ Multimodal Inference AI │
                  └─────────────────────────┘
                                 │
                                 ▼

                   ASX Directional Prediction

⚡ Deployment Stack

| Layer | Technology |
|-------|------------|
| Frontend | Next.js 16 + TailwindCSS |
| Backend API | FastAPI |
| Deep Learning | PyTorch |
| NLP | HuggingFace Transformers |
| Hosting | Vercel + Render |
| Model Serving | Artifact-based inference |
| Version Control | GitHub |

📂 Dataset Overview

📈 Financial Market Dataset

| Attribute | Details |
|-----------|---------|
| Primary Ticker | CBA.AX |
| Market | Australian Securities Exchange |
| Source | Yahoo Finance (yfinance) |
| Period | 2015–2024 |
| Frequency | Daily |
| Problem Type | Binary Classification |

🔌 API Documentation

📡 Prediction Endpoint

POST /predict

Example Request

{
  "model": "best",
  "ticker": "CBA.AX"
}

Example Response

{
  "ticker": "CBA.AX",
  "prediction": "UP",
  "confidence": 0.7421,
  "model": "LSTM + Transformer",
  "target": "Target_t7"
}

📂 Available Endpoints

| Endpoint | Description |
|----------|-------------|
| /predict | Run stock direction inference |
| /models | Return available deployed models |
| /health | API health check |
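
As a rough illustration of how a client might call the prediction endpoint, here is a minimal sketch using only the Python standard library. The base URL is the local development address from the setup section and is only a placeholder; the helper names (`build_predict_request`, `parse_prediction`, `predict`) are hypothetical and not part of the project. The request and response fields follow the example JSON above.

```python
import json
import urllib.request

# Placeholder: the local development address; swap in your own backend URL.
API_BASE_URL = "http://127.0.0.1:8000"


def build_predict_request(model: str, ticker: str,
                          base_url: str = API_BASE_URL) -> urllib.request.Request:
    """Build a POST /predict request carrying the JSON body from the example."""
    payload = json.dumps({"model": model, "ticker": ticker}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/predict",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_prediction(body: bytes) -> tuple[str, float]:
    """Extract (direction, confidence) from a /predict response body."""
    result = json.loads(body)
    return result["prediction"], float(result["confidence"])


def predict(model: str = "best", ticker: str = "CBA.AX") -> tuple[str, float]:
    """Send the request and parse the reply (needs a running backend)."""
    request = build_predict_request(model, ticker)
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_prediction(response.read())
```

With the backend running locally, `predict()` would return a pair such as a direction label and a confidence value. Building and parsing are kept separate so each can be checked without network access.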

🧮 Engineered Financial Features

📊 Technical Indicators & Feature Engineering

Momentum Features

  • Daily Returns
  • Log Returns
  • Lag Returns (t-1, t-2, t-3, t-5)

Trend Features

  • Rolling Moving Averages
  • MACD
  • MACD Signal
  • Ichimoku Indicators

Volatility Features

  • Bollinger Bands
  • Rolling Volatility
  • Rolling Standard Deviation

Volume Features

  • Volume Change
  • Volume Moving Average

Oscillators

  • RSI (Relative Strength Index)
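
The indicator families listed above can be computed with pandas along the following lines. This is a sketch under assumptions, not the project's actual feature code: the column names `Close` and `Volume` follow the usual yfinance layout, the window lengths (20-day averages, 12/26/9 MACD, 14-day RSI) are common defaults rather than confirmed settings, and Ichimoku is left out for brevity.

```python
import numpy as np
import pandas as pd


def add_technical_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add a subset of the listed indicators to a daily OHLCV frame.

    Assumes columns named "Close" and "Volume"; window lengths are
    common defaults, not values confirmed by this project.
    """
    out = df.copy()
    close = out["Close"]

    # Momentum: simple returns, log returns, lagged returns
    out["return"] = close.pct_change()
    out["log_return"] = np.log(close / close.shift(1))
    for lag in (1, 2, 3, 5):
        out[f"return_lag_{lag}"] = out["return"].shift(lag)

    # Trend: moving average and MACD (12/26 EMA difference, 9-period signal)
    out["sma_20"] = close.rolling(20).mean()
    macd = close.ewm(span=12, adjust=False).mean() - close.ewm(span=26, adjust=False).mean()
    out["macd"] = macd
    out["macd_signal"] = macd.ewm(span=9, adjust=False).mean()

    # Volatility: rolling std of returns and Bollinger Bands (20-day, 2 sigma)
    out["volatility_20"] = out["return"].rolling(20).std()
    band = close.rolling(20).std()
    out["bb_upper"] = out["sma_20"] + 2 * band
    out["bb_lower"] = out["sma_20"] - 2 * band

    # Volume: day-over-day change and moving average
    out["volume_change"] = out["Volume"].pct_change()
    out["volume_sma_20"] = out["Volume"].rolling(20).mean()

    # Oscillator: RSI from simple rolling means of gains and losses (14-day)
    delta = close.diff()
    gain = delta.clip(lower=0).rolling(14).mean()
    loss = (-delta.clip(upper=0)).rolling(14).mean()
    out["rsi_14"] = 100 - 100 / (1 + gain / loss)
    return out
```

Every feature here looks only backwards in time, so the early rows contain NaN values until each window has filled; those rows would need to be dropped before training.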


💬 Sentiment Analysis Pipelines

🧠 Reddit Financial Sentiment

| Component | Details |
|-----------|---------|
| Source | r/AusFinance, r/ASX_Bets |
| NLP Model | FinBERT |
| Purpose | Retail investor sentiment modelling |
| Aggregation | Daily grouped aggregation |

Generated Features

[
    "sentiment_mean",
    "sentiment_std",
    "positive_ratio",
    "negative_ratio",
    "post_volume"
]
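
One way the daily grouped aggregation could produce these five features is sketched below. The input layout is an assumption: one row per post, with hypothetical columns `date`, `label` (the FinBERT class) and `score` (a signed sentiment value). How the project actually derives the signed score is not stated here.

```python
import pandas as pd


def aggregate_daily_sentiment(posts: pd.DataFrame) -> pd.DataFrame:
    """Collapse per-post sentiment into one row per calendar day.

    Assumes hypothetical columns: "date" (post date), "label"
    ("positive" / "negative" / "neutral") and "score" (signed sentiment).
    """
    frame = posts.assign(
        is_positive=posts["label"].eq("positive"),
        is_negative=posts["label"].eq("negative"),
    )
    daily = frame.groupby("date").agg(
        sentiment_mean=("score", "mean"),
        sentiment_std=("score", "std"),
        positive_ratio=("is_positive", "mean"),
        negative_ratio=("is_negative", "mean"),
        post_volume=("score", "size"),
    )
    # A day with a single post has an undefined sample std; treat it as zero spread.
    daily["sentiment_std"] = daily["sentiment_std"].fillna(0.0)
    return daily
```

The same pattern would apply to news headlines by renaming the output columns with a `news_` prefix.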

📰 Financial News Sentiment

| Component | Details |
|-----------|---------|
| Source | The Guardian Open Platform API |
| Initial Model | VADER |
| Final Model | FinBERT |
| Observation | FinBERT > VADER for financial forecasting |

Final News Features

[
    "news_sentiment_mean",
    "news_sentiment_std",
    "news_positive_ratio",
    "news_negative_ratio",
    "news_headline_volume"
]

🌍 Macroeconomic Features

📉 Market Regime Indicators

| Variable | Description |
|----------|-------------|
| ^VIX | Global volatility / fear index |
| ^GSPC | S&P500 spillover effects |
| AUDUSD=X | Currency sentiment |
| CL=F | Oil futures |
| GC=F | Gold futures |

Engineered Macro Features

[
    "vix_return",
    "sp500_return",
    "audusd_return",
    "oil_return",
    "gold_return"
]

🎯 Prediction Targets

| Target | Description |
|--------|-------------|
| Target_t1 | Next-day movement |
| Target_t7 | 7-day future movement |
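
The two targets could be built from the closing price roughly as follows. Two assumptions are made and should be checked against the notebook: a horizon of `h` is counted in trading rows rather than calendar days, and a day is labelled 1 only when the future close is strictly higher than the current close.

```python
import pandas as pd


def add_direction_targets(close: pd.Series, horizons=(1, 7)) -> pd.DataFrame:
    """Label each day 1.0 if the close `h` rows later is higher, else 0.0.

    Rows whose future close is not yet known get NaN; drop them before
    training so they are never treated as a "down" move.
    """
    targets = pd.DataFrame(index=close.index)
    for h in horizons:
        future = close.shift(-h)  # price h rows ahead, aligned to today
        label = (future > close).astype(float)
        targets[f"Target_t{h}"] = label.where(future.notna())
    return targets
```

Keeping the trailing rows as NaN, instead of letting the comparison default to 0, avoids quietly adding false "down" labels at the end of the series.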

🔥 Major Research Discovery

Target_t7 outperformed Target_t1 across nearly all architectures and feature groups.

This suggests:

  • the 7-day horizon contains more learnable temporal structure,
  • while next-day direction is harder to predict, which is closer to what the Efficient Market Hypothesis would lead one to expect.

🤖 Implemented Models

📌 Traditional Machine Learning

| Model | Purpose |
|-------|---------|
| Logistic Regression | Linear interpretable baseline |
| XGBoost | Nonlinear ensemble baseline |

Observation

AUC ≈ 0.50–0.53

Traditional tabular methods struggled to capture temporal dependencies and multimodal interactions.


🧠 Sequential Deep Learning Models

| Architecture | Description | Performance |
|--------------|-------------|-------------|
| Improved LSTM | Multi-layer temporal sequence model | Strong |
| LSTM + CNN | Local temporal motif extraction | Moderate |
| LSTM + Transformer | Sequential memory + temporal attention | Best Overall |
| LSTM + Informer | Efficient long-sequence attention | Weak Generalization |

🏆 Current Best Model

🥇 LSTM + Transformer Hybrid

| Configuration | Value |
|---------------|-------|
| Target | Target_t7 |
| Features | Stock + News FinBERT |
| Sequence Length | 60 |
| Hidden Size | 96 |
| Optimizer | AdamW |
| Epochs | 150 |
📊 Best Results

| Metric | Score |
|--------|-------|
| Validation AUC | 0.7075 |
| Test AUC | 0.6449 |

📉 Evaluation Metrics

evaluation_metrics = [
    "ROC-AUC",
    "F1-Score",
    "Accuracy",
    "Confusion Matrix",
    "Threshold Analysis",
    "Probability Distribution Analysis"
]

⚠️ Important Methodological Finding

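The models were better at ranking days by their likelihood of an upward move than at producing calibrated UP/DOWN labels at a fixed threshold.

ROC-AUC, which measures ranking quality independently of any threshold, was therefore used as the primary evaluation metric rather than raw accuracy alone.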
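
The difference between ranking quality and thresholded accuracy can be seen on a toy example. The sketch below computes ROC-AUC directly from its pairwise definition in plain Python; the labels and scores are invented for illustration and are not project results.

```python
def roc_auc(y_true: list[int], scores: list[float]) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. O(n^2), fine for illustration.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(y_true: list[int], scores: list[float], threshold: float = 0.5) -> float:
    """Share of correct labels after thresholding the scores."""
    preds = [int(s >= threshold) for s in scores]
    return sum(p == y for p, y in zip(preds, y_true)) / len(y_true)


# Toy scores: perfectly ordered, but all below 0.5 (poorly calibrated).
y = [0, 0, 1, 1]
p = [0.10, 0.20, 0.30, 0.40]
print(roc_auc(y, p))         # 1.0: every up-day outranks every down-day
print(accuracy(y, p))        # 0.5: the default threshold labels everything DOWN
print(accuracy(y, p, 0.25))  # 1.0 once the threshold is tuned
```

A model can therefore carry a useful ranking signal while looking poor on accuracy at the default threshold, which is why threshold analysis appears alongside ROC-AUC in the metric list.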

🛡️ Leakage Prevention Strategy

🔒 Research-Grade Temporal Integrity

Strict Time-Series Principles

✅ Chronological splitting only
✅ No random shuffling
✅ Sentiment shifted by +1 day
✅ Scaling fitted only on training data
✅ Sequence generation after preprocessing
✅ No future information leakage


Sentiment Alignment Rule

sentiment(t-1) ---> predicts ---> stock movement(t)

This preserves realistic financial forecasting conditions.
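
The principles above can be sketched with NumPy as four small steps: split by time, lag the sentiment by one day, fit scaling statistics on the training rows only, then cut sequences. The function names and the 70/15/15 split are assumptions for illustration; the project's actual proportions are not given here.

```python
import numpy as np


def chronological_split(n_rows: int, train_frac: float = 0.7, val_frac: float = 0.15):
    """Return (train, val, test) slices in time order, with no shuffling."""
    train_end = int(n_rows * train_frac)
    val_end = int(n_rows * (train_frac + val_frac))
    return slice(0, train_end), slice(train_end, val_end), slice(val_end, n_rows)


def lag_by_one_day(sentiment: np.ndarray) -> np.ndarray:
    """Row t receives the sentiment observed on day t-1; the first row has none."""
    lagged = np.full(sentiment.shape, np.nan, dtype=float)
    lagged[1:] = sentiment[:-1]
    return lagged


def scale_with_train_stats(x: np.ndarray, train: slice) -> np.ndarray:
    """Standardise every row using the mean and std of the training rows only."""
    mean = x[train].mean(axis=0)
    std = x[train].std(axis=0)
    return (x - mean) / np.where(std == 0, 1.0, std)


def make_sequences(x: np.ndarray, y: np.ndarray, seq_len: int = 60):
    """Build windows of `seq_len` past rows, each paired with the label of its last row."""
    windows = np.stack([x[i - seq_len + 1 : i + 1] for i in range(seq_len - 1, len(x))])
    return windows, y[seq_len - 1 :]
```

Because each window ends at the row whose label it predicts, and that label is itself defined from future prices, no window ever contains information from after its own prediction date.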


🧪 Experimental Ablation Studies

Evaluated Feature Groups

[
    "Stock Only",
    "Stock + Reddit FinBERT",
    "Stock + News FinBERT",
    "Stock + Macro",
    "Stock + Reddit + News",
    "Stock + Macro + News",
    "Stock + Macro + Reddit + News"
]
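
An ablation over these groups amounts to training the same model on different column subsets. A minimal way to express that mapping is shown below; the Reddit, news and macro column names are the ones listed in this README, while `STOCK` is a hypothetical stand-in for the full technical feature set.

```python
# STOCK is a hypothetical subset; the other lists reuse names from this README.
STOCK = ["return", "log_return", "macd", "rsi"]
REDDIT = ["sentiment_mean", "sentiment_std", "positive_ratio",
          "negative_ratio", "post_volume"]
NEWS = ["news_sentiment_mean", "news_sentiment_std", "news_positive_ratio",
        "news_negative_ratio", "news_headline_volume"]
MACRO = ["vix_return", "sp500_return", "audusd_return", "oil_return", "gold_return"]

FEATURE_GROUPS = {
    "Stock Only": STOCK,
    "Stock + Reddit FinBERT": STOCK + REDDIT,
    "Stock + News FinBERT": STOCK + NEWS,
    "Stock + Macro": STOCK + MACRO,
    "Stock + Reddit + News": STOCK + REDDIT + NEWS,
    "Stock + Macro + News": STOCK + MACRO + NEWS,
    "Stock + Macro + Reddit + News": STOCK + MACRO + REDDIT + NEWS,
}


def columns_for(group: str) -> list[str]:
    """Return the ordered feature columns for one ablation group."""
    return list(FEATURE_GROUPS[group])
```

Looping over `FEATURE_GROUPS` with an otherwise fixed training setup keeps the comparison between groups like for like.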

📁 Project Structure

ASX-Stock-Prediction/
│
├── Backend/
│   ├── app/
│   │   ├── main.py
│   │   ├── inference.py
│   │   ├── model.py
│   │   └── config.py
│   │
│   ├── artifacts/
│   │   ├── best_lstm_transformer_news_finbert/
│   │   └── second_lstm_transformer_finbert_macro/
│   │
│   ├── requirements.txt
│   └── runtime.txt
│
├── frontend/
│   ├── app/
│   ├── public/
│   ├── package.json
│   └── .env.local
│
├── notebooks/
│   └── alphaengine.ipynb
│
├── data/
│   └── multimodal datasets
│
├── README.md
└── LICENSE

⚙️ Tech Stack

| Category | Technologies |
|----------|--------------|
| Deep Learning | PyTorch |
| NLP | HuggingFace Transformers |
| Financial NLP | FinBERT |
| ML | Scikit-learn, XGBoost |
| Data Processing | Pandas, NumPy |
| Visualization | Matplotlib, Seaborn |
| Financial Data | yfinance |

📌 Core Research Contributions

✅ ASX-specific multimodal forecasting framework
✅ Temporal multimodal fusion methodology
✅ Medium-horizon forecasting superiority discovery
✅ FinBERT > VADER financial NLP finding
✅ Sequential models > tabular models finding
✅ Transformer-enhanced temporal forecasting improvements
✅ Research-grade leakage-safe methodology


🔮 Future Work

  • Explainable AI (SHAP/LIME)
  • Cross-stock generalization
  • Portfolio optimization
  • Event-aware transformers
  • Attention visualization
  • Regime-switching architectures
  • Reinforcement learning integration

⚙️ Local Development Setup

1️⃣ Clone Repository

git clone https://github.com/AryanSharma1017/ASX-Stock-Prediction.git
cd ASX-Stock-Prediction

🧠 Backend Setup

Create Environment

cd Backend

python -m venv venv
source venv/bin/activate   # on Windows: venv\Scripts\activate

Install Dependencies

pip install -r requirements.txt

Run FastAPI Server

uvicorn app.main:app --reload

Backend runs on:

http://127.0.0.1:8000

🎨 Frontend Setup

cd frontend

npm install
npm run dev

Frontend runs on:

http://localhost:3000

🔐 Environment Variables

Frontend .env.local

NEXT_PUBLIC_API_URL=https://your-render-backend-url.onrender.com

Backend Environment Variables

MODEL_DIR=artifacts/
DEVICE=cpu

🧠 Model Artifact System

Each deployed model contains:

model.pth
scaler.pkl
feature_columns.json
metadata.json

Artifact Responsibilities

| File | Purpose |
|------|---------|
| model.pth | PyTorch trained weights |
| scaler.pkl | StandardScaler object |
| feature_columns.json | Ordered feature schema |
| metadata.json | Model configuration |

This architecture enables:

✅ Multi-model deployment
✅ Dynamic model switching
✅ Production-safe inference
✅ Modular AI deployment
✅ Scalable experimentation
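
A loader for this layout might first verify that an artifact folder is complete, then use `feature_columns.json` to put incoming features in the order the model was trained on. The sketch below covers only the JSON parts with the standard library. Loading `model.pth` and `scaler.pkl` is deliberately omitted, and the function names are hypothetical and do not come from `inference.py`.

```python
import json
from pathlib import Path

REQUIRED_FILES = ("model.pth", "scaler.pkl", "feature_columns.json", "metadata.json")


def load_artifact_schema(artifact_dir: str) -> tuple[list[str], dict]:
    """Check that an artifact folder is complete and read its JSON parts.

    Loading the weights and the scaler is left out so this sketch runs
    without PyTorch or scikit-learn installed.
    """
    root = Path(artifact_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    if missing:
        raise FileNotFoundError(f"{root} is missing: {', '.join(missing)}")
    feature_columns = json.loads((root / "feature_columns.json").read_text())
    metadata = json.loads((root / "metadata.json").read_text())
    return feature_columns, metadata


def order_features(row: dict[str, float], feature_columns: list[str]) -> list[float]:
    """Arrange one row of named features in the order the model was trained on."""
    absent = [c for c in feature_columns if c not in row]
    if absent:
        raise KeyError(f"missing features: {absent}")
    return [row[c] for c in feature_columns]
```

Failing fast on a missing file or feature is what makes switching between models safe: a mismatched schema raises an error instead of silently feeding columns to the network in the wrong order.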


🚀 Deployment Workflow

Research Notebook
        ↓
Model Training
        ↓
Best Model Selection
        ↓
Artifact Serialization
        ↓
FastAPI Inference Engine
        ↓
Render Cloud Deployment
        ↓
Next.js Financial Dashboard
        ↓
Vercel Production Hosting

☁️ Deployment Platforms

| Service | Purpose |
|---------|---------|
| Vercel | Frontend Hosting |
| Render | Backend API Hosting |
| GitHub | Version Control |
| HuggingFace | NLP Models |
| yfinance | Financial Data |

📚 Academic & Engineering Significance

Alpha Engine bridges the gap between:

| Academic Research | Production AI Systems |
|-------------------|-----------------------|
| Financial forecasting research | Cloud AI deployment |
| Multimodal deep learning | Real-time inference |
| Financial NLP experimentation | Production APIs |
| Sequential temporal modelling | Full-stack AI engineering |

This project evolved from:

🎓 Research Thesis
➡️ into
🚀 Deployable Financial AI Platform


🛣️ Future Deployment Roadmap

Alpha Engine is actively evolving from a research-grade forecasting framework into a scalable real-world financial AI platform.

The following production upgrades and research extensions are planned for future releases.


🚀 Upcoming Platform Features

📈 Multi-Stock Prediction Engine

Current deployment focuses on:

CBA.AX

Future versions will support:

  • Commonwealth Bank (CBA.AX)
  • BHP Group (BHP.AX)
  • CSL Limited (CSL.AX)
  • NAB (NAB.AX)
  • ANZ (ANZ.AX)
  • Westpac (WBC.AX)
  • ASX200 constituent forecasting

Goal:

  • generalized cross-stock inference,
  • portfolio-scale forecasting,
  • sector-aware modelling.

🧠 Advanced Transformer Architectures

Future research models include:

  • Temporal Fusion Transformer (TFT)
  • PatchTST
  • TimeGPT-style architectures
  • Cross-attention multimodal transformers
  • Event-aware transformers
  • Hierarchical attention systems

These architectures aim to improve:

  • long-term temporal understanding,
  • event sensitivity,
  • market regime adaptation.

🌍 Real-Time Financial Intelligence Pipeline

Planned upgrades:

  • Live market data streaming
  • Real-time news ingestion
  • Streaming Reddit sentiment analysis
  • Intraday prediction support
  • Continuous model updates
  • Automated retraining pipelines

Future architecture:

Live APIs
    ↓
Streaming NLP Pipelines
    ↓
Real-Time Feature Fusion
    ↓
Continuous Inference Engine

📊 Explainable AI Integration

Future deployment versions will include:

  • SHAP explainability
  • Attention heatmaps
  • Feature importance dashboards
  • Temporal contribution visualization
  • Prediction reasoning interfaces

Goal:

  • improve interpretability,
  • enhance financial trustworthiness,
  • support human-AI collaboration.

📱 Future Frontend Enhancements

Upcoming UI improvements:

  • Interactive financial charts
  • Historical prediction explorer
  • Confidence trend visualization
  • Portfolio dashboard
  • Mobile-responsive analytics
  • AI-generated market summaries

🧪 Research Expansion

Future research directions include:

  • Cross-market forecasting
  • Crypto-financial multimodal fusion
  • Reinforcement learning trading agents
  • Regime-switching neural systems
  • Event-driven forecasting
  • Financial graph neural networks
  • LLM-enhanced market reasoning

🎯 Long-Term Vision

Alpha Engine aims to evolve into:

A fully scalable multimodal financial intelligence platform capable of combining quantitative finance, financial NLP, deep temporal learning, and cloud AI deployment into a unified real-time forecasting ecosystem.


🔄 Current Development Status

| Phase | Status |
|-------|--------|
| Research Framework | ✅ Completed |
| Deep Learning Experiments | ✅ Completed |
| Multimodal Fusion System | ✅ Completed |
| Production API Deployment | ✅ Completed |
| Cloud Frontend Deployment | ✅ Completed |
| Real-Time Streaming AI | 🚧 In Progress |
| Explainable AI Dashboard | 🔜 Planned |
| Portfolio Intelligence System | 🔜 Planned |
| Large-Scale Multi-Stock Engine | 🔜 Planned |

👨‍💻 Author

Aryan Sharma

🎓 Software Engineering — Artificial Intelligence
📍 Melbourne, Australia
🧠 AI Researcher • Full-Stack Developer • Financial AI Enthusiast


🚀 Alpha Engine

A Research-to-Production Financial AI System


⭐ Star the Repository

If you found Alpha Engine interesting or useful, consider starring the project and following its development.


Built with ❤️ using Deep Learning, Financial NLP, and Full-Stack AI Engineering
