Crypto_Price_pipeline is a tool that collects and organizes cryptocurrency price data. It gathers information from CoinGecko and processes it into easy-to-understand tables and charts. This helps users follow market trends without technical steps.
This guide will help you download and run Crypto_Price_pipeline on a Windows computer. No programming skills or technical knowledge are needed.
Crypto_Price_pipeline automatically collects crypto prices from CoinGecko, a popular data provider. It then sorts the data into three levels of detail:
- Bronze: Raw data straight from the source
- Silver: Cleaned and organized data
- Gold: Final, user-friendly tables and reports
The pipeline runs tasks in a set order, improving data quality at each stage. It also creates interactive charts so you can explore price trends as of the most recent run.
The software uses powerful tools like PySpark and Delta Lake, but you do not need to know them to use the app.
Before starting, make sure your computer meets these requirements:
- Windows 10 or later (64-bit)
- At least 8 GB of RAM (16 GB recommended for smooth use)
- 5 GB of free disk space
- Stable internet connection to fetch data
- Administrator rights to install new software
You do not need to install Python or Spark yourself. The app package includes everything needed.
This project focuses on:
- Real-time cryptocurrency price data
- Organizing data in layers for clean analysis
- Using Apache Spark for data processing
- Collecting data from CoinGecko API
- Running data jobs automatically with a Databricks setup
- Creating reports using SQL and Python code behind the scenes
- Storing data efficiently with Delta Lake
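For readers curious about what "collecting data from the CoinGecko API" can look like, here is a minimal Python sketch. It uses CoinGecko's public `simple/price` endpoint. Whether Crypto_Price_pipeline calls this endpoint is an assumption, and the function names `build_price_url` and `flatten_prices` are hypothetical, not taken from the project.

```python
from urllib.parse import urlencode

# Public CoinGecko v3 endpoint for spot prices (assumed here; the project
# may use a different endpoint).
BASE_URL = "https://api.coingecko.com/api/v3/simple/price"


def build_price_url(coin_ids, vs_currency="usd"):
    """Return the request URL for the given CoinGecko coin ids."""
    query = urlencode({"ids": ",".join(coin_ids), "vs_currencies": vs_currency})
    return f"{BASE_URL}?{query}"


def flatten_prices(payload, vs_currency="usd"):
    """Turn {"bitcoin": {"usd": 1.0}} into [("bitcoin", 1.0)], sorted by coin id."""
    return sorted(
        (coin, quote[vs_currency])
        for coin, quote in payload.items()
        if vs_currency in quote
    )


# Illustrative payload in the shape the endpoint returns; the numbers are made up.
sample = {"ethereum": {"usd": 10.0}, "bitcoin": {"usd": 100.0}}
print(build_price_url(["bitcoin", "ethereum"]))
print(flatten_prices(sample))
```

The sketch makes no network request. To fetch real data, the URL could be passed to `urllib.request.urlopen` and the response decoded with `json.load`.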
Step 1: Visit the download page using the button below:
Step 2: On the GitHub page, look for the green “Code” button. Click it and then choose “Download ZIP.”
Step 3: Once the ZIP file finishes downloading, find it in your Downloads folder.
Step 4: Right-click the ZIP file and select “Extract All...” Choose a folder you will remember.
Step 5: Open the extracted folder.
Crypto_Price_pipeline includes an easy-to-use launcher. Follow these steps to start it:
Step 1: Inside the extracted folder, find the file named run_pipeline.bat. This is a batch file that starts the program.
Step 2: Double-click run_pipeline.bat. A command window will open, and the program will begin working.
You will see messages showing different steps. Do not close the window until the process finishes.
Step 3: When the program completes, it will create reports you can open.
After running the app:
- Look inside the `output` folder in the extracted directory.
- You will find files and folders with data organized by Bronze, Silver, and Gold levels.
- Open the `dashboard.html` file with your internet browser to see an interactive view of the latest crypto prices.
This dashboard updates every time you run the pipeline.
To keep your data accurate, run the run_pipeline.bat file regularly. The program will fetch new market information and refresh reports.
If an update is available on GitHub:
- Visit the download page (use the button above).
- Download the new ZIP file.
- Extract it as before and replace the old files with the new ones.
- Run the batch file again.
- If the command window closes immediately, try right-clicking `run_pipeline.bat` and selecting “Run as administrator.”
- Make sure you have a stable internet connection for data downloads.
- If you see errors about missing files, re-extract the ZIP file to ensure all files are present.
- Restart your computer if the app does not start after installation.
- Check the `logs` folder for error details if the program runs but data does not update.
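If the `logs` folder contains many files, a short Python script can pull out the relevant lines. The sketch below assumes the log files end in `.log` and mark failures with the word `ERROR`. Neither detail is confirmed by the project, so adjust both to match what you find in the folder. The function name `find_error_lines` is hypothetical.

```python
from pathlib import Path


def find_error_lines(log_dir, keyword="ERROR"):
    """Return (file name, line number, text) for every line containing keyword."""
    hits = []
    # Assumption: log files use the .log extension.
    for path in sorted(Path(log_dir).glob("*.log")):
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if keyword in line:
                    hits.append((path.name, number, line.rstrip()))
    return hits


if __name__ == "__main__":
    # Run from inside the extracted folder so that "logs" resolves correctly.
    for name, number, text in find_error_lines("logs"):
        print(f"{name}:{number}: {text}")
```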
Crypto_Price_pipeline uses Medallion Architecture. This means it builds data in steps:
- Bronze: Collects raw crypto prices from CoinGecko API using PySpark
- Silver: Cleans and formats data for easier use
- Gold: Creates final tables and visual reports
These steps run in order using Databricks Jobs, a service for scheduling and running big data tasks.
All the data is stored in Delta Lake, a format that keeps track of changes and lets you view history.
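The Bronze, Silver, and Gold steps described above can be illustrated with a small plain-Python sketch. This is not the project's PySpark code. It only shows the layering idea on an in-memory payload, and the function names, field names, and cleaning rules are all assumptions made for illustration.

```python
def to_bronze(payload, fetched_at):
    """Bronze: keep every record as received, adding only the fetch time."""
    return [
        {"coin": coin, "raw": quote, "fetched_at": fetched_at}
        for coin, quote in payload.items()
    ]


def to_silver(bronze_rows, currency="usd"):
    """Silver: drop rows without a usable price and normalize types."""
    silver = []
    for row in bronze_rows:
        price = row["raw"].get(currency)
        if isinstance(price, (int, float)) and price > 0:
            silver.append(
                {
                    "coin": row["coin"].lower(),
                    "price": float(price),
                    "fetched_at": row["fetched_at"],
                }
            )
    return silver


def to_gold(silver_rows):
    """Gold: one report row per coin (latest fetch), highest price first."""
    latest = {}
    for row in silver_rows:
        current = latest.get(row["coin"])
        # ISO 8601 timestamps in the same format compare correctly as strings.
        if current is None or row["fetched_at"] > current["fetched_at"]:
            latest[row["coin"]] = row
    report = [{"coin": r["coin"], "price": r["price"]} for r in latest.values()]
    return sorted(report, key=lambda r: r["price"], reverse=True)


# Made-up sample data, including one record that Silver should reject.
payload = {"bitcoin": {"usd": 100.0}, "ethereum": {"usd": 10}, "broken": {"usd": None}}
bronze = to_bronze(payload, "2024-01-01T00:00:00Z")
print(to_gold(to_silver(bronze)))
```

Each function takes the previous layer's output as its input, so the raw Bronze records remain available for re-processing if the cleaning rules change later.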
Q: Do I need Databricks or Spark installed?
No. The package runs these tools behind the scenes, so you do not install them separately.
Q: Can I use this software offline?
No. It needs an internet connection to get data from CoinGecko.
Q: What cryptocurrencies does it track?
It tracks all currencies available via CoinGecko’s free API.
Q: Can I change the data update frequency?
Currently, you run the program manually. Scheduling automatic runs is possible but requires technical setup.
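As one possible form of that technical setup, the Python sketch below runs the launcher a fixed number of times with a pause between runs. It assumes Python is installed separately on your computer and that the script is started from the extracted folder. The function name `run_repeatedly` and its parameters are hypothetical.

```python
import subprocess
import time


def run_repeatedly(command, interval_seconds, runs,
                   runner=subprocess.run, sleep=time.sleep):
    """Run command `runs` times, pausing between runs; return the exit codes."""
    exit_codes = []
    for index in range(runs):
        result = runner(command, check=False)
        exit_codes.append(result.returncode)
        # No pause is needed after the final run.
        if index < runs - 1:
            sleep(interval_seconds)
    return exit_codes


if __name__ == "__main__":
    # On Windows, batch files are started through cmd; this runs once an hour,
    # 24 times in total.
    codes = run_repeatedly(["cmd", "/c", "run_pipeline.bat"], 3600, 24)
    print("Exit codes:", codes)
```

The `runner` and `sleep` parameters exist so the loop can be tried out without waiting or launching anything. For unattended long-term use, Windows Task Scheduler is likely a better fit than a script that must stay running.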
- Crypto_Price_pipeline GitHub Repository
- CoinGecko API Documentation
- Apache Spark Overview
- Delta Lake Guide
If you encounter problems or have suggestions, create an issue on the GitHub repository. Include clear details about your system and what you tried.
If you want to run pipeline tasks one by one:
- Open a command prompt inside the extracted folder.
- Use Python scripts located under `/scripts` to run specific parts.
- Refer to the code comments for how each script works.
This tool only reads public market data. It does not store personal information or require login credentials.
Run the software in a safe environment and be cautious when opening files generated by the app.