What does this program do?
This pipeline is designed to process transaction data stored in CSV format, filtering and aggregating the data based on specified criteria, and writing the results to JSON files.
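As a rough illustration of the read, filter, aggregate, and write flow described above, here is a minimal sketch. It is not the contents of beam_pipeline.py: the two-column CSV layout (timestamp, amount), the minimum amount of 20, the grouping by date, and the names parse_row, is_large, to_json and run are all assumptions made up for the example.

```python
import csv
import json


def parse_row(line):
    """Parse one CSV line into (date, amount). The column layout is hypothetical."""
    timestamp, amount = next(csv.reader([line]))
    # Keep only the YYYY-MM-DD part of the timestamp as the grouping key.
    return timestamp[:10], float(amount)


def is_large(record, minimum=20.0):
    """Keep only transactions above a hypothetical minimum amount."""
    return record[1] > minimum


def to_json(key_value):
    """Serialise one (date, total) pair as a JSON object string."""
    date, total = key_value
    return json.dumps({"date": date, "total_amount": total})


def run(input_path, output_prefix):
    """Build and run the sketch pipeline; both paths are supplied by the caller."""
    # Imported here so the helpers above can be tried without Beam installed.
    import apache_beam as beam

    with beam.Pipeline() as pipeline:
        (
            pipeline
            | "Read" >> beam.io.ReadFromText(input_path, skip_header_lines=1)
            | "Parse" >> beam.Map(parse_row)
            | "Filter" >> beam.Filter(is_large)
            | "SumPerDate" >> beam.CombinePerKey(sum)
            | "ToJson" >> beam.Map(to_json)
            | "Write" >> beam.io.WriteToText(output_prefix, file_name_suffix=".jsonl")
        )
```

Keeping the parsing, filtering and formatting steps as plain functions means they can be checked on a single line of input without starting a pipeline.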
To run the pipeline, execute the beam_pipeline.py script from a terminal:
"python beam_pipeline.py" OR "python3 beam_pipeline.py"
There is no need to run any pip commands beforehand: I've included the pip install for requirements.txt inside the script.
The requirements are pandas and apache-beam[GCP].
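For readers curious how a script can install its own requirements, one common approach looks roughly like the sketch below. This is an assumption about the general technique, not a copy of what beam_pipeline.py does; the function names pip_install_command and install_requirements are hypothetical.

```python
import subprocess
import sys


def pip_install_command(requirements_path="requirements.txt"):
    """Build the pip command, using the same interpreter that runs this script."""
    return [sys.executable, "-m", "pip", "install", "-r", requirements_path]


def install_requirements(requirements_path="requirements.txt"):
    """Run pip and raise CalledProcessError if the install fails."""
    subprocess.check_call(pip_install_command(requirements_path))
```

Calling pip through sys.executable keeps the packages in the same environment as the interpreter running the script, which matters when both python and python3 are present on the machine.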
Thanks for reading!