This repo currently contains several duplicated files, including:
- Template files
  - `nds/*.template` -> `shared/*.template`
  - `nds/spark-submit-template` -> `shared/spark-submit-template`
- Benchmark listener and reporter
  - `nds/PysparkBenchReport.py` -> `utils/python_benchmark_reporter/PysparkBenchReport.py`
  - `nds/python_listener/PythonListener.py` -> `utils/python_benchmark_reporter/PythonListener.py`
- Check functions
  - `nds/check.py` -> `utils/check.py`
- Other utils
  - `nds/jvm_listener` -> `utils/jvm_listener`
  - `nds/properties` -> `utils/properties`
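Before removing anything, it may be worth confirming that each legacy copy still matches its shared counterpart. The following is a minimal sketch of such a check, not an existing tool in this repo: `find_diverged` and `DUPLICATE_FILES` are hypothetical names, and only the single-file pairs from the list above are covered (the glob and directory entries would need their own handling).

```python
import filecmp
from pathlib import Path

# Legacy path -> shared path, copied from the list above (single files only).
DUPLICATE_FILES = {
    "nds/spark-submit-template": "shared/spark-submit-template",
    "nds/PysparkBenchReport.py": "utils/python_benchmark_reporter/PysparkBenchReport.py",
    "nds/python_listener/PythonListener.py": "utils/python_benchmark_reporter/PythonListener.py",
    "nds/check.py": "utils/check.py",
}


def find_diverged(repo_root, pairs=DUPLICATE_FILES):
    """Return legacy paths whose shared copy is missing or has different content."""
    root = Path(repo_root)
    diverged = []
    for legacy, shared in pairs.items():
        legacy_path, shared_path = root / legacy, root / shared
        if not (legacy_path.is_file() and shared_path.is_file()):
            # One side is missing, so the pair cannot be treated as a safe duplicate.
            diverged.append(legacy)
        elif not filecmp.cmp(legacy_path, shared_path, shallow=False):
            # shallow=False forces a byte-by-byte comparison.
            diverged.append(legacy)
    return diverged
```

Running `find_diverged(".")` from the repo root would list any legacy file that needs a manual review before it is deleted.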
We need to update the nds scripts to import their dependencies from the `utils` and `shared` directories, and then remove the legacy duplicates from `nds/`. All affected scripts (including `nds_power.py`, `nds_maintenance.py`, `nds_gen_data.py`, `nds_gen_query_stream.py`, etc.) need to be updated.
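One possible way for a script under `nds/` to pick up modules from the repo-level directories is to extend `sys.path` relative to its own location. This is only a sketch under assumptions: `add_shared_dirs_to_path` is a hypothetical helper, and it assumes the script sits exactly one level below the repo root. The actual change may well use package imports instead.

```python
import sys
from pathlib import Path


def add_shared_dirs_to_path(script_file,
                            subdirs=("utils", "utils/python_benchmark_reporter")):
    """Prepend repo-level directories to sys.path for a script located in nds/.

    Returns the list of entries that were newly added.
    """
    # Assumption: script_file is <repo_root>/nds/<script>.py
    repo_root = Path(script_file).resolve().parent.parent
    added = []
    for sub in subdirs:
        candidate = str(repo_root / sub)
        if candidate not in sys.path:
            sys.path.insert(0, candidate)
            added.append(candidate)
    return added
```

Inside `nds/nds_power.py` this might be called as `add_shared_dirs_to_path(__file__)` before `import check` or `import PysparkBenchReport`, so that those module names resolve to the copies under `utils/` once the legacy files are gone.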
To ensure a smooth transition from the legacy version to the new version of the nds scripts in internal jobs, we need to keep both versions of the power run and other nds scripts for a period of time. For example:
- Copy the current `nds/nds_power.py` to `nds/nds_power_v1.py`, which keeps the legacy dependencies in the `nds` folder for compatibility
- Update `nds/nds_power.py` to use the new dependencies in the `utils` folder
Once all internal jobs have switched to the new version, we can remove all duplicate files and `v1` scripts from `nds/`.
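To decide when the `v1` scripts are safe to delete, a simple text scan over job definitions could flag remaining references. The sketch below is illustrative only: `find_legacy_references`, the `nds_*_v1.py` naming pattern, and the scanned file suffixes are assumptions, and the location of the internal job definitions is not specified here.

```python
import re
from pathlib import Path

# Assumed naming scheme for the temporary legacy copies, e.g. nds_power_v1.py
LEGACY_PATTERN = re.compile(r"nds_\w+_v1\.py")


def find_legacy_references(search_root, pattern=LEGACY_PATTERN,
                           suffixes=(".sh", ".py", ".yaml", ".yml")):
    """Return (path, line number, line) for every line that mentions a v1 script."""
    hits = []
    for path in sorted(Path(search_root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

An empty result from `find_legacy_references(<job definitions dir>)` would suggest that no scanned job still calls a `v1` script.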