Project by: Jana Nikolovska
Supervised by: Giacomo Frisoni, MSc
Prof. Gianluca Moro, PhD
ALMA MATER STUDITORIUM - University of Bologna
November 2025
Summary:
In this project, I explore the use of Large Language Models (LLMs) for time series forecasting, focusing on the task of stock market prediction. The work was proposed and mentored by Prof. Gianluca Moro and Giacomo Frisoni at the University of Bologna.
As a starting point, I used a provided notebook by my mentors. The notebook introduces the dataset (historical S&P 500 data via yfinance), a basic linear regression model, which I chose not to include in the final results, that served as an early benchmark and highlighted the underlying complexity of the forecasting problem and finally, the "Trading Protocol" — a framework designed to evaluate forecasting performance through simulated trading strategies —that I extended upon to incorporate multiple strategy types and Monte Carlo-based robustness testing.
On the chosen dataset, a traditional statistical model ARIMA and LLM-based forecasting approaches have been tested and evaluated. For the implementation for LLM-Time, I followed the methodology described in this paper and its official implementation. The code from the paper has been adapted and extended with additional functionality tailored to the specific requirements of my experiments. I also included a comparison with TimesFM following the example code from this blog and code repository.
The problem was defined comparable to that of the referenced paper, particularly in terms of training and testing proportions. The task was defined such that, given 150 days of historical values, the objective was to predict the subsequent 29 days values following an autoregressive approach without direct access to ground truth during prediction.
Unlike the provided baseline notebook, which operated directly on price levels, this project instead utilized log returns as the primary time-series input. This choice was motivated by findings from the exploratory data analysis, which indicated that the raw price series exhibited non-stationary behavior — making a dificult target for predicting. Transforming the data into returns helped achieve approximate stationarity, giving a chance to the models to capture relative changes more effectively.
As mentioned earlier, both statistical and trading-based evaluation methods were used, with greater emphasis placed on the trading metrics (Trading Protocol), while the statistical ones were included primarily for completeness and exploratory insight.
DATA_ANALYSIS.ipynb Plotly visualizations and analysis of time series data
PROJECT.ipynb Inference and evaluation pipeline for ARIMA and LLMTime. Combined results including TimesFM
PROJECT_timesfm.ipynb Inference and evaluation pipeline for TimesFM (needed to be separated due to different preconditions and environments)
PRESENTATION.pdf short PowerPoint Presentation
The steps for replicability refer to the PROJECT.ipynb and DATA_ANALYSIS.ipynb notebook, whereas the PROJECT_timesfm.ipynb can be run as it is on Google Colab. They will guide you to setting up an OpenAI API needed for accessing the GPT models used in the LLMTime implementation.
Disclaimer: it needs to be run on Linux or WSL
Run the following command to install all dependencies in a conda environment named llm_stock_prediciton. Change the cuda version for torch if you don't have cuda 11.8.
source install.shAfter installation, activate the environment with
conda activate llm_stock_predicitonIf you prefer not using conda, you can also install the dependencies listed in install.sh manually.
In the root of your project directory, create a folder named secrets:
mkdir secretsInside the secrets folder, create a file called openai_key.txt:
touch secrets/openai_key.txtPaste your OpenAI API key into that file. It should look like this:
sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
jupyter notebook
Open and execute either DATA_ANALYSIS.ipynb or PROJECT.ipynb