This work implements an end-to-end automated trading prototype that integrates natural language sentiment extraction with time-series price modeling for crypto/stock assets. The pipeline begins with large-scale tweet collection (~1.5M items) using a scraping framework (twint), followed by preprocessing, tokenization, and sentiment classification with a fine-tuned BERT transformer. The sentiment features are temporally aligned with OHLCV and technical indicators to produce a fused feature space for downstream modeling. The system architecture includes modular scripts for data ingestion, feature engineering, model training, and a Flask-based bot capable of live inference and simulated trade execution.
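The temporal alignment step described above can be sketched as follows. This is an illustrative assumption of how per-tweet sentiment scores might be bucketed and joined onto OHLCV bars; the function name, column names ("timestamp", "sentiment", "close"), and the hourly bar frequency are hypothetical, not the repository's actual schema.

```python
import pandas as pd


def fuse_sentiment_with_ohlcv(tweets: pd.DataFrame, ohlcv: pd.DataFrame) -> pd.DataFrame:
    """Aggregate tweet-level sentiment into hourly features and join them onto OHLCV bars."""
    # Index tweets by time, then bucket sentiment into hourly mean and tweet count.
    tweets = tweets.set_index(pd.to_datetime(tweets["timestamp"]))
    hourly = tweets["sentiment"].resample("1h").agg(["mean", "count"])
    hourly.columns = ["sent_mean", "tweet_volume"]

    # Left-join so every price bar survives, even hours with no tweets.
    ohlcv = ohlcv.set_index(pd.to_datetime(ohlcv["timestamp"]))
    fused = ohlcv.join(hourly, how="left")
    fused["sent_mean"] = fused["sent_mean"].fillna(0.0)          # no tweets -> neutral
    fused["tweet_volume"] = fused["tweet_volume"].fillna(0).astype(int)
    return fused
```

Keeping the tweet count alongside the mean score lets the downstream model distinguish a quiet neutral hour from a loud but balanced one.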
The modeling stack explores multiple paradigms: (1) classical ML baselines (RandomForestRegressor, logistic regression, SVM) to benchmark signal strength; (2) a deep sequence model (LSTM/GRU) trained on concatenated sentiment-price sequences to capture temporal dependencies; and (3) hybrid pipelines in which transformer-derived sentiment embeddings are aggregated into fixed-length windows and merged with technical features before being passed to the LSTM. The project also experiments with feed-forward neural networks and gradient boosting to compare sample efficiency. Models are evaluated on predictive accuracy (classification F1 / regression RMSE) as well as on strategy-level backtest metrics (cumulative returns, Sharpe ratio, maximum drawdown).
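The strategy-level backtest metrics listed above can be computed from a series of per-period strategy returns; a minimal sketch, not the repository's evaluation code (the function name and annualization convention are assumptions):

```python
import numpy as np


def backtest_metrics(returns: np.ndarray, periods_per_year: int = 252) -> dict:
    """Cumulative return, annualized Sharpe ratio, and max drawdown from per-period returns."""
    equity = np.cumprod(1.0 + returns)                  # growth of $1 over the backtest
    cumulative = equity[-1] - 1.0
    sharpe = (returns.mean() / returns.std(ddof=1)) * np.sqrt(periods_per_year)
    running_peak = np.maximum.accumulate(equity)        # best equity seen so far
    max_drawdown = np.min(equity / running_peak - 1.0)  # most negative dip from a peak
    return {"cum_return": cumulative, "sharpe": sharpe, "max_drawdown": max_drawdown}
```

Reporting drawdown alongside Sharpe matters here because sentiment-driven strategies can look strong on average yet suffer sharp losses during sentiment regime shifts.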
Empirical evaluation shows that incorporating sentiment features lowers error (RMSE) and improves directional accuracy relative to price-only models. The LSTM with BERT-based sentiment inputs consistently outperformed the baselines, highlighting the predictive value of social signals during high-volatility intervals. The repository provides not only model notebooks but also a structured codebase (structured_code/) for reproducibility, checkpointed models (Models-Trade Bot/), and a Flask API for integrating predictions into a live trading loop. Limitations include the noise sensitivity of sentiment features and a tendency to overfit on small training windows. Future extensions could involve domain-specific transformers (FinBERT), probabilistic forecasting (Bayesian LSTMs, quantile regression), and reinforcement learning agents for direct policy optimization.
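The simulated trade execution that closes the loop can be sketched as a simple next-bar paper-trading routine. The signal convention ({-1, 0, +1} per bar), the function name, and the flat per-rebalance fee are illustrative assumptions, not the bot's actual execution logic:

```python
def simulate_trades(closes: list[float], signals: list[int], fee: float = 0.001) -> float:
    """Paper-trade $1 on per-bar signals in {-1, 0, +1}; return final equity.

    The position held during bar i is the signal chosen at bar i-1, so the
    model's prediction is only ever applied to the *next* bar's return.
    """
    equity, position = 1.0, 0
    for i in range(1, len(closes)):
        bar_return = closes[i] / closes[i - 1] - 1.0
        equity *= 1.0 + position * bar_return   # P&L from the position held into this bar
        if signals[i] != position:              # rebalance to the new signal
            equity *= 1.0 - fee                 # charge a flat fee per position change
            position = signals[i]
    return equity
```

Applying each signal only from the following bar onward avoids the look-ahead bias that inflates backtest returns when predictions are traded on the same bar they were generated from.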