Two different lines of an Upload metadata.tsv should never share the same data_path, because that would imply that the exact same data would end up in two different datasets. This problem is currently not detected. As a result, a multi-line metadata.tsv file with every data_path entry set to '.' caused all the files to be scanned by the plugin validators many times, once for each line of metadata.
I know of no way to describe a relationship between lines in the schema language that defines the table schemata, but it would be easy to insert an ad-hoc test of the 'data_path' field somewhere around here:
ingest-validation-tools/src/ingest_validation_tools/table_validator.py, line 71 in b359ac2
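
A rough sketch of what such an ad-hoc test might look like. The function names `find_duplicate_data_paths` and `duplicate_data_path_errors` are hypothetical and not part of ingest-validation-tools, and the sketch assumes the rows are available as dicts keyed by column name. It only strips surrounding whitespace, so it would not notice that '.' and './' point at the same directory.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def find_duplicate_data_paths(
    rows: Iterable[Mapping[str, str]], field: str = "data_path"
) -> Dict[str, List[int]]:
    """Map each repeated data_path to the TSV line numbers where it appears.

    Hypothetical helper: assumes line 1 of the TSV is the header,
    so the first data row is reported as line 2.
    """
    seen = defaultdict(list)
    for line_number, row in enumerate(rows, start=2):
        value = row.get(field)
        if value is None:
            # Column missing or row too short; leave that to schema validation.
            continue
        seen[value.strip()].append(line_number)
    return {path: lines for path, lines in seen.items() if len(lines) > 1}


def duplicate_data_path_errors(tsv_text: str) -> List[str]:
    """Return one human-readable message per data_path shared by several lines."""
    rows = csv.DictReader(io.StringIO(tsv_text), dialect="excel-tab")
    duplicates = find_duplicate_data_paths(rows)
    return [
        f'data_path "{path}" is shared by lines {", ".join(map(str, lines))}'
        for path, lines in duplicates.items()
    ]
```

For the case described above, a two-line file with both data_path entries set to '.' would produce a single message naming lines 2 and 3, which could then be reported alongside the existing schema errors.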