In LanguageDCAT-AP, we use dct:format and dcat:mediaType with maximum cardinality ..n, whereas DCAT-AP sets the maximum cardinality of both to ..1.
To describe appropriately datasets such as annotated corpora, which consist of source files and their annotations (cf. the DankMemes Dataset, with source images in JPG format and their annotations in CSV), or datasets composed of data files in various formats (e.g. Audio Skills, which includes audio files in MP3 and WAV), we need to allow multiple format values on the same Distribution.
Note that we considered using a separate Distribution for each format, but we do not deem this appropriate because (a) dcat:Distribution is intended for multiple representations/serializations of the dataset, not for the different file types included in the dataset, and (b) we prefer to use multiple Distributions to represent the same Dataset under different licences (e.g., offered for free for research purposes vs. for a fee for commercial applications).
We also considered using a different property, but we prefer to reuse the two properties above for interoperability when datasets are exchanged across data spaces and other catalogues.
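The difference between the two cardinality constraints can be illustrated with a minimal sketch. It is not part of the profile itself: the triples are plain Python tuples rather than a real RDF graph, every identifier under the `ex:` prefix is hypothetical, the media type strings stand in for the corresponding IANA media type IRIs, and the helper `cardinality_violations` is an illustrative name, not an existing validator.

```python
from collections import Counter

# Prefixed names stand in for full IRIs; everything under "ex:" is hypothetical.
DCT_FORMAT = "dct:format"
DCAT_MEDIA_TYPE = "dcat:mediaType"

# One Distribution of an annotated corpus: source images plus CSV annotations.
triples = [
    ("ex:corpus-dist", "rdf:type", "dcat:Distribution"),
    ("ex:corpus-dist", DCT_FORMAT, "ex:format/JPEG"),
    ("ex:corpus-dist", DCT_FORMAT, "ex:format/CSV"),
    ("ex:corpus-dist", DCAT_MEDIA_TYPE, "image/jpeg"),
    ("ex:corpus-dist", DCAT_MEDIA_TYPE, "text/csv"),
]

# Maximum cardinality per profile; None means unbounded (..n).
MAX_CARDINALITY = {
    "DCAT-AP": {DCT_FORMAT: 1, DCAT_MEDIA_TYPE: 1},
    "LanguageDCAT-AP": {DCT_FORMAT: None, DCAT_MEDIA_TYPE: None},
}


def cardinality_violations(triples, profile):
    """Return (subject, property, count) for each property used too often."""
    limits = MAX_CARDINALITY[profile]
    # Count how many values each subject has for each constrained property.
    counts = Counter((s, p) for s, p, _ in triples if p in limits)
    return sorted(
        (s, p, n)
        for (s, p), n in counts.items()
        if limits[p] is not None and n > limits[p]
    )


for profile in MAX_CARDINALITY:
    print(profile, cardinality_violations(triples, profile))
```

Under these assumptions, the same Distribution is reported as exceeding the ..1 limit for both properties when checked against the DCAT-AP limits, and yields no violations against the LanguageDCAT-AP limits.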