Data export fetching ambiguity due to project name change #4497

@lcjohnso

Expectation: when a project team requests a project-level data export (e.g., the subjects export) and that export previously existed, the new export CSV will overwrite the previous results, the updated_at timestamp will update, and the metadata.state will change to "ready" when available.

Problem: if the project has changed names, the query to the media table returns more than one row. The filename stored in the content_disposition column is derived from the project name (on download, this filename replaces the hashed filename), so an export requested after a rename is not matched to the existing row and a second row is created. Unfortunately, in some cases the first result is the older, defunct file, meaning teams do not have access to the desired, most recent file.

Example: DELVE Dwarf Galaxy Question (project_id = 22365) has two project_subjects_export rows in the media table because of a project name change: OLD filename = "delve-dwarfquest-for-the-public-subjects.csv" vs. CURRENT filename = "delve-dwarf-galaxy-quest-milky-way-neighbors-subjects.csv".
Panoptes DB Query = select * from media where linked_id=22365 and linked_type='Project' and type = 'project_subjects_export'
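
A minimal sketch of the ambiguity, using an in-memory SQLite table as a stand-in for the Panoptes media table. The column subset, the content_disposition format, and the timestamps are assumptions for illustration only; only the project id, type, and filenames come from the example above. It shows that the query returns two rows, and that an explicit ordering on updated_at would at least make "first result" mean the most recent file.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified subset of the media table's columns.
conn.execute(
    "CREATE TABLE media (id INTEGER PRIMARY KEY, linked_id INTEGER, "
    "linked_type TEXT, type TEXT, content_disposition TEXT, updated_at TEXT)"
)
conn.executemany(
    "INSERT INTO media (linked_id, linked_type, type, content_disposition, updated_at) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        # Timestamps are made up; only the filenames are from the issue.
        (22365, "Project", "project_subjects_export",
         'attachment; filename="delve-dwarfquest-for-the-public-subjects.csv"',
         "2023-01-01T00:00:00Z"),
        (22365, "Project", "project_subjects_export",
         'attachment; filename="delve-dwarf-galaxy-quest-milky-way-neighbors-subjects.csv"',
         "2024-01-01T00:00:00Z"),
    ],
)

where = "linked_id = 22365 AND linked_type = 'Project' AND type = 'project_subjects_export'"

# The query from the issue: more than one row, with no guaranteed order.
all_rows = conn.execute(f"SELECT * FROM media WHERE {where}").fetchall()

# With an explicit ordering, the first row is the most recently updated export.
latest = conn.execute(
    f"SELECT content_disposition FROM media WHERE {where} "
    "ORDER BY updated_at DESC LIMIT 1"
).fetchone()
```

Ordering only works around the symptom when fetching; the duplicate row still exists.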

Possible Solution: prevent the creation of multiple rows in the media table for a single type of data export by improving the identification and overwriting of existing rows when new versions of
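
One way the idea above could look, sketched in Python against an in-memory SQLite table rather than the actual Panoptes code. The helper name save_export, the column subset, and the timestamps are all hypothetical. The point is that the existing row is identified by (linked_id, linked_type, type) only, never by the filename, so a project rename cannot produce a second row.

```python
import sqlite3

def save_export(conn, linked_id, media_type, filename, now):
    """Overwrite the export row for this project and type, or insert one.

    Hypothetical helper: matches on linked_id / linked_type / type and
    ignores the filename, which changes when the project is renamed.
    """
    key = (linked_id, "Project", media_type)
    existing = conn.execute(
        "SELECT id FROM media WHERE linked_id = ? AND linked_type = ? AND type = ? "
        "ORDER BY updated_at DESC, id DESC",
        key,
    ).fetchall()
    disposition = f'attachment; filename="{filename}"'
    if existing:
        # Reuse the newest row and refresh its filename and timestamp.
        conn.execute(
            "UPDATE media SET content_disposition = ?, updated_at = ? WHERE id = ?",
            (disposition, now, existing[0][0]),
        )
        # Remove any older duplicates left over from earlier renames.
        conn.executemany(
            "DELETE FROM media WHERE id = ?", [(row[0],) for row in existing[1:]]
        )
    else:
        conn.execute(
            "INSERT INTO media (linked_id, linked_type, type, content_disposition, updated_at) "
            "VALUES (?, ?, ?, ?, ?)",
            key + (disposition, now),
        )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE media (id INTEGER PRIMARY KEY, linked_id INTEGER, "
    "linked_type TEXT, type TEXT, content_disposition TEXT, updated_at TEXT)"
)
# Export requested under the old project name, then again after the rename.
save_export(conn, 22365, "project_subjects_export",
            "delve-dwarfquest-for-the-public-subjects.csv", "2023-01-01T00:00:00Z")
save_export(conn, 22365, "project_subjects_export",
            "delve-dwarf-galaxy-quest-milky-way-neighbors-subjects.csv", "2024-01-01T00:00:00Z")
remaining = conn.execute("SELECT content_disposition, updated_at FROM media").fetchall()
```

After both calls the table holds a single row carrying the current filename. A unique index on (linked_id, linked_type, type) for export types would be a stricter alternative, at the cost of a data cleanup for projects that already have duplicates.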
