
Code to populate re3data into OpenSearch #589

@sfisher


Unfortunately, re3data doesn't provide a robust way to search repositories through its API. Its OpenSearch endpoint appears to be broken and returns the same list no matter what is searched for.

We will need to populate the data into OpenSearch ourselves. See the information about the Zod and OpenSearch schemas (in other subtasks).
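As a rough sketch of what the populate step could look like, the helper below builds a newline-delimited body for the OpenSearch `_bulk` API, with one action line and one document line per repository. The function name `buildBulkBody`, the `RepositoryDocument` shape, and the use of `uri` as the document id are assumptions for illustration; the real fields should come from the schema subtasks.

```typescript
// Hypothetical document shape: only `uri` is assumed, used here as a stable document id.
interface RepositoryDocument {
  uri: string;
  [field: string]: unknown;
}

// Builds an NDJSON body for the OpenSearch _bulk API:
// an "index" action line followed by the document itself, for each document.
function buildBulkBody(indexName: string, docs: RepositoryDocument[]): string {
  const lines: string[] = [];
  for (const doc of docs) {
    lines.push(JSON.stringify({ index: { _index: indexName, _id: doc.uri } }));
    lines.push(JSON.stringify(doc));
  }
  // The _bulk API requires the body to end with a newline.
  return lines.length > 0 ? lines.join("\n") + "\n" : "";
}
```

The resulting string could be sent as the body of a `POST /_bulk` request with the `application/x-ndjson` content type. Using the index action with a fixed `_id` means a rerun overwrites existing documents rather than duplicating them, which suits a periodic refresh.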

I believe the code in the current DMPRoadmap codebase will largely work fine and can just be translated into the language of choice (TypeScript?). See https://github.com/CDLUC3/dmptool/blob/fa9dc0100d14e72af2e5b2898616eaba5bb1a9d0/app/services/external_apis/re3data_service.rb#L7

I believe this may be able to be done quickly by an LLM.

The main difference will be that rather than having a JSON field for "info" we can move the items under the info key to the top level as part of the OpenSearch document.
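A minimal sketch of that flattening step, assuming a record shape loosely modeled on what the existing service stores. The field names in `LegacyRepository` and the function name `toOpenSearchDocument` are hypothetical; only the `info` key comes from the description above.

```typescript
// Hypothetical shape of a repository record as the existing service stores it.
// All field names other than `info` are assumptions for illustration.
interface LegacyRepository {
  name: string;
  uri: string;
  description?: string;
  homepage?: string;
  info?: Record<string, unknown>;
}

type FlatRepositoryDocument = Record<string, unknown>;

// Moves everything under `info` to the top level and drops the `info` key itself.
function toOpenSearchDocument(repo: LegacyRepository): FlatRepositoryDocument {
  const { info, ...topLevel } = repo;
  // Spread `info` first so existing top-level fields win if a key appears in both.
  return { ...(info ?? {}), ...topLevel };
}
```

Letting top-level fields win on a key collision is a design choice in this sketch, not something the existing code prescribes; if `info` should take precedence, swap the spread order.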

Where this will run is another factor. Is it appropriate for a Lambda (is the run time to populate short enough)? Or does it belong under AWS Batch or some other mechanism for periodic runs? I'm guessing DMP will come up with a common way to handle these kinds of tasks, which used to run as cron jobs or services in the old system.
