Almas-ansari/Sentiment-based-automatic-crypto-investor


Project Overview

This project implements an end-to-end automated trading prototype that combines natural-language sentiment extraction with time-series price modeling for crypto and stock assets. The pipeline begins with large-scale tweet collection (~1.5M tweets) using the twint scraping framework, followed by preprocessing, tokenization, and sentiment classification with a fine-tuned BERT transformer. The sentiment features are then aligned in time with OHLCV data and technical indicators to produce a fused feature set for downstream modeling. The codebase is organized into modular scripts for data ingestion, feature engineering, and model training, plus a Flask-based bot capable of live inference and simulated trade execution.
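To make the temporal alignment step concrete, here is a minimal standard-library sketch. It assumes hourly candles and one scalar sentiment score per tweet; the names `align_sentiment_with_candles`, `open_time`, `sent_mean`, and `tweet_count` are hypothetical and are not taken from the repository, whose actual bucketing interval and feature names may differ.

```python
from collections import defaultdict
from datetime import datetime


def floor_to_hour(ts):
    """Truncate a datetime to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


def align_sentiment_with_candles(tweets, candles):
    """Attach aggregated tweet sentiment to each hourly candle.

    tweets:  iterable of (datetime, score) pairs (assumed format).
    candles: list of dicts with an 'open_time' datetime at the hour start,
             plus whatever OHLCV fields the caller carries along.
    """
    # Group sentiment scores by the hour in which each tweet was posted.
    buckets = defaultdict(list)
    for ts, score in tweets:
        buckets[floor_to_hour(ts)].append(score)

    rows = []
    for candle in candles:
        scores = buckets.get(candle["open_time"], [])
        row = dict(candle)  # copy so the input candle is not mutated
        # Hours without tweets get a neutral score of 0.0 (an assumption).
        row["sent_mean"] = sum(scores) / len(scores) if scores else 0.0
        row["tweet_count"] = len(scores)
        rows.append(row)
    return rows


tweets = [
    (datetime(2021, 5, 1, 10, 5), 0.5),
    (datetime(2021, 5, 1, 10, 40), -0.1),
    (datetime(2021, 5, 1, 11, 10), 1.0),
]
candles = [
    {"open_time": datetime(2021, 5, 1, 10), "close": 100.0},
    {"open_time": datetime(2021, 5, 1, 11), "close": 101.5},
    {"open_time": datetime(2021, 5, 1, 12), "close": 99.0},
]
fused = align_sentiment_with_candles(tweets, candles)
```

Each fused row describes a completed hour, so a model that predicts the next candle should only consume rows whose hour has already closed; otherwise the sentiment feature leaks future information.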

Modeling Approach

The modeling stack explores three paradigms: (1) classical ML baselines (RandomForestRegressor, logistic regression, SVM) to benchmark signal strength; (2) a deep sequence model (LSTM/GRU) trained on concatenated sentiment-price sequences to capture temporal dependencies; and (3) hybrid pipelines in which transformer-derived sentiment embeddings are aggregated into fixed-length windows and merged with technical features before being passed to the LSTM. The project also experiments with feed-forward neural nets and gradient boosting to compare sample efficiency. Models are evaluated on predictive accuracy (F1 for classification, RMSE for regression) as well as strategy-level backtest metrics (cumulative returns, Sharpe ratio, max drawdown).
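The fixed-length windows fed to a sequence model can be illustrated with a short sketch. It assumes each time step is already a flat feature vector (here a hypothetical `[return, rsi, sent_mean]` triple) and that the target is the value at the step immediately after the window; the repository's actual window length and feature layout are not stated, so `make_windows` and the sample values are illustrative only.

```python
def make_windows(features, targets, window):
    """Slice a feature series into overlapping fixed-length windows.

    features: list of per-step feature vectors (lists of floats).
    targets:  list of per-step targets, same length as features.
    window:   number of consecutive steps per input sequence.

    Returns (X, y), where X[i] holds steps i .. i+window-1 and y[i] is the
    target at step i+window, i.e. the step right after the window.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if len(features) != len(targets):
        raise ValueError("features and targets must have the same length")

    X, y = [], []
    for end in range(window, len(features)):
        X.append(features[end - window:end])
        y.append(targets[end])
    return X, y


# Hypothetical per-step vectors: [price return, RSI, mean sentiment].
features = [
    [0.010, 55.0, 0.20],
    [-0.004, 52.0, 0.10],
    [0.007, 58.0, 0.35],
    [0.002, 60.0, 0.30],
    [-0.012, 47.0, -0.15],
]
targets = [1, 0, 1, 1, 0]  # e.g. 1 if the price rose on that step

X, y = make_windows(features, targets, window=3)
```

With five steps and a window of three, this yields two training sequences of shape 3 x 3, which matches the `(samples, timesteps, features)` layout that LSTM/GRU layers in common deep learning libraries expect once converted to an array.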

Metrics & System Capabilities

Empirical evaluation shows that incorporating sentiment features lowers prediction error and improves directional accuracy compared to price-only models. The LSTM with BERT-based sentiment inputs consistently outperformed the baselines, highlighting the predictive value of social signals during high-volatility intervals. Beyond the model notebooks, the repository provides a structured codebase (structured_code/) for reproducibility, checkpointed models (Models-Trade Bot/), and a Flask API for integrating predictions into a live trading loop. Limitations include the sensitivity of sentiment features to noise and a tendency to overfit on small windows. Future extensions could involve domain-specific transformers (FinBERT), probabilistic forecasting (Bayesian LSTM, quantile regression), and reinforcement learning agents for direct policy optimization.
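For readers who want to reproduce this kind of comparison, the following sketch shows one common way to compute the prediction-level and strategy-level metrics involved. The definitions are standard textbook forms, not code from the repository: the function names are hypothetical, returns are assumed to be simple per-period returns, and the annualization factor of 365 assumes daily data on a market that trades every day.

```python
import math
from statistics import mean, stdev


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(mean((a - p) ** 2 for a, p in zip(actual, predicted)))


def directional_accuracy(actual, predicted):
    """Fraction of periods where predicted and actual returns share a sign."""
    def sign(x):
        return (x > 0) - (x < 0)
    hits = sum(1 for a, p in zip(actual, predicted) if sign(a) == sign(p))
    return hits / len(actual)


def sharpe_ratio(returns, periods_per_year=365, risk_free_per_period=0.0):
    """Annualized Sharpe ratio; returns 0.0 when returns have no variance."""
    excess = [r - risk_free_per_period for r in returns]
    spread = stdev(excess)  # sample standard deviation, needs >= 2 points
    if spread == 0:
        return 0.0
    return mean(excess) / spread * math.sqrt(periods_per_year)


def max_drawdown(returns):
    """Largest peak-to-trough loss of the equity curve, as a negative fraction."""
    equity = peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst


# Made-up illustrative numbers, not results from the project.
actual = [0.01, -0.02, 0.03, -0.01]
predicted = [0.02, -0.01, -0.01, -0.03]
hit_rate = directional_accuracy(actual, predicted)
error = rmse(actual, predicted)
drawdown = max_drawdown([0.10, -0.50, 0.20])
```

Comparing a price-only model against a sentiment-augmented one then amounts to calling these functions on each model's predictions and backtest returns over the same held-out period.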

About

End-to-end automated trading framework fusing transformer-based sentiment analysis (BERT) with sequence models (LSTM/GRU) on fused social+price signals, benchmarked against classical ML baselines
