nickyjgarland/dri_metadata_relation

DRI Metadata Compilation - Relation matching

This GitHub repository contains R code to assist in compiling the DRI Batch Metadata Template (https://doi.org/10.7486/DRI.qn603p95v-11), which is used for batch ingest of multiple digital objects into the Digital Repository of Ireland (DRI). More specifically, the code searches for related digital objects within a single collection based on the ‘Filename’ field and populates the ‘dc:Relation’ field of the template, creating additional fields where an object has more than one relation.

The code requires each filename to contain a unique identifier, shared by all related objects, so that the relations can be found. In the sample data the filename follows the format OBJ_XX_description, where XX is the unique object number. All filenames with the same object number are placed in a single group and are therefore related to one another. For digital objects to be linked in the DRI's online catalogue (under ‘Related materials’), the relation needs to be recorded in the batch metadata template.
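As a minimal illustration of this grouping, the base R sketch below pulls the object number out of filenames that follow the OBJ_XX_description pattern and groups the filenames by it. The filenames and the regular expression are assumptions made for the example; they are not taken from 'relation_matching.R'.

```r
# Hypothetical filenames following the OBJ_XX_description pattern
filenames <- c("OBJ_01_photo.jpg", "OBJ_01_drawing.jpg", "OBJ_02_photo.jpg")

# Keep only the digits between "OBJ_" and the next underscore
ids <- sub("^OBJ_([0-9]+)_.*$", "\\1", filenames)

# Group the filenames by identifier: one list element per object
groups <- split(filenames, ids)
```

Here `groups[["01"]]` holds the two files belonging to object 01, which are the files that would be recorded as related to one another.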

The code can be broken down into 8 key steps as follows:

  1. Reads the completed batch metadata template (i.e. the .ods file with the ‘Filename’ field populated – the filenames must contain unique identifiers).
  2. Creates a new dataframe based on the unique identifiers from the ‘Filename’ column and groups all filenames under the same unique identifier together in a field called ‘relation’. This step also creates a new field called ‘Fnm2’ which contains the unique identifier used for each grouping.
  3. Creates a new dataframe of the completed batch metadata template with a new field called ‘Fnm2’ populated with the unique identifier.
  4. Joins the dataframe from step 2 with the dataframe from step 3 based on the field ‘Fnm2’.
  5. Adds a new field called ‘dc:Relation’ to the dataframe from step 4 and populates it with the groupings. It then searches this field and removes each row's own filename, so that an object is not listed as related to itself.
  6. Splits dc:Relation into multiple dc:Relation columns so each column only contains one entity, as per the requirements of the DRI.
  7. Tidies the dataframe by removing unnecessary columns so that it contains only the columns found in the batch metadata template (some of which, such as dc:Relation, may be repeated).
  8. Exports the tidied dataframe to .ods format.
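The core of steps 2 to 7 can be sketched in base R as follows. This is an illustrative sketch under stated assumptions, not the code in 'relation_matching.R': the toy data frame stands in for the template, the regular expression assumes the OBJ_XX_description pattern, and the suffixed column names (`dc:Relation_1`, `dc:Relation_2`) are hypothetical. Reading and writing the .ods file (steps 1 and 8) is left out so that the example has no package dependencies.

```r
# Toy stand-in for the template; in practice it would be read from the .ods file
template <- data.frame(
  Filename = c("OBJ_01_photo.jpg", "OBJ_01_drawing.jpg",
               "OBJ_01_thin_section.jpg", "OBJ_02_photo.jpg"),
  stringsAsFactors = FALSE
)

# Steps 2-3: derive the identifier (Fnm2) from each filename
template$Fnm2 <- sub("^OBJ_([0-9]+)_.*$", "\\1", template$Filename)

# Steps 4-5: for each row, collect the other filenames sharing its identifier,
# leaving out the row's own filename
related <- lapply(seq_len(nrow(template)), function(i) {
  same_group <- template$Filename[template$Fnm2 == template$Fnm2[i]]
  setdiff(same_group, template$Filename[i])
})

# Step 6: one relation per column, with as many columns as the largest group needs
max_rel <- max(lengths(related), 0)
for (j in seq_len(max_rel)) {
  template[[paste0("dc:Relation_", j)]] <- vapply(
    related,
    function(r) if (length(r) >= j) r[j] else NA_character_,
    character(1)
  )
}

# Step 7: drop the helper column so only template fields remain
template$Fnm2 <- NULL
```

Rows whose object has no other files (OBJ_02 in this toy data) are left with `NA` in the relation columns.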

Please note: the code groups related objects using a partial string shared by their filenames (e.g. '10'). As such, line 18 of the attached code will need to be adjusted to suit your dataset.
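When adapting the matching to a different naming scheme, it can help to check that every filename actually yields an identifier before grouping. The sketch below uses a hypothetical helper, `extract_id()`, which takes a regular expression with a single capture group; the helper, the patterns and the example filenames are all assumptions for illustration and do not appear in 'relation_matching.R'.

```r
# Hypothetical helper: `pattern` must contain exactly one capture group
extract_id <- function(filenames, pattern) {
  matches <- regmatches(filenames, regexec(pattern, filenames))
  # Element 1 is the full match, element 2 the captured identifier
  vapply(matches,
         function(m) if (length(m) >= 2) m[2] else NA_character_,
         character(1))
}

# The sample-data scheme, and an invented alternative scheme
ids_obj <- extract_id(c("OBJ_10_photo.jpg", "OBJ_11_photo.jpg"), "^OBJ_([0-9]+)_")
ids_alt <- extract_id(c("axe-0042-a.tif", "notes.txt"), "^axe-([0-9]+)-")

# Filenames that did not match come back as NA and can be reviewed before grouping
unmatched <- is.na(ids_alt)
```

Returning `NA` for non-matching filenames, rather than the filename itself, makes stray files easy to spot before they end up in a group of their own.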

This repository contains the following objects:

  • The R code - 'relation_matching.R' to undertake this automated process
  • A test dataset - 'data/DRI_metadata_test.odb' - created based on metadata from 'The Irish Stone Axe Project Digital Collection' (see details below)
  • The R Project file utilised in RStudio
  • A ReadMe and Licence file
  • 'sessionInfo.text' - A SessionInfo file to capture the R session information while running the code
  • The output file – ‘data/dri_metadata.ods’

The DRI's Batch Metadata Template can be downloaded from the DRI at this DOI: https://doi.org/10.7486/DRI.qn603p95v-11


This code was initially designed to compile metadata for 'The Irish Stone Axe Project Digital Collection', compiled and deposited by The Discovery Programme and University College Dublin. The metadata is provided here under the CC0 licence that applies to this material in the DRI.

The original collection is a digital image catalogue of prehistoric stone axe heads with an Irish provenance. For each stone axe head in the catalogue, at least one artefact photograph, one line drawing and two images of petrographic thin sections have been deposited in the DRI. This means there was a minimum of 4 related digital objects per stone axe head that needed to be entered into the batch metadata template.

The Discovery Programme, University College Dublin. The Irish Stone Axe Project Digital Collection. Collection [Type]. Digital Repository of Ireland (2024) [Publisher]. The Discovery Programme [Depositor]. https://doi.org/10.7486/DRI.8623xr528 (Accessed: 2026/03/15)

Authors

Nicky Garland: Methodology, Software (Lead), Writing – Original Draft Preparation, Writing – Review & Editing (joint)

Lesley Davidson: Conceptualization, Data curation, Software (Supporting), Writing – Review & Editing (joint), Validation

These roles have been assigned following the structure of the Contributor Role Taxonomy (CRediT).

Maintainers

The following people are responsible for maintaining this repository.

Licence

The code in this repository and the accompanying materials are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Contributing

Pull requests are welcome to this repository. For major changes, please open an issue first to discuss what you would like to change.
