Metadata and subject-specific profiles in DORIS
SND aims to ensure that data shared via our services are easy to find and described in a way that aligns as much as possible with the FAIR principles.
For data to be findable, they must be described in a standardized way that both humans and machines can understand. This is achieved using metadata – data about data – and through the use of so-called metadata standards. Metadata standards are sets of rules for how to structure and relate metadata elements within a shared domain, such as a specific research field. When research data are described using an established metadata standard, the descriptions become readable and interpretable by both humans and machines, which is key to meeting the FAIR principles. Machine readability also allows metadata to be integrated into various systems, such as search engines or systems that automatically transfer information.
As far as possible, SND uses internationally established standards also used by other research infrastructures. For example, the metadata structure in DORIS largely builds on the DDI Lifecycle 3.3 metadata standard and on DataCite’s metadata recommendations.
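To illustrate what machine readability means in practice, the following is a minimal Python sketch of a structured metadata record. The field names are a simplified assumption loosely modelled on DataCite's mandatory properties. They are not the actual DORIS schema, and the record contents, `REQUIRED_FIELDS`, `missing_fields`, and `to_json` are hypothetical names made up for this example.

```python
import json

# Simplified, hypothetical field list loosely modelled on DataCite's
# mandatory properties. This is NOT the actual DORIS metadata schema.
REQUIRED_FIELDS = (
    "identifier",
    "creators",
    "titles",
    "publisher",
    "publicationYear",
    "resourceType",
)

# Made-up example record; the DOI and names are placeholders.
record = {
    "identifier": "10.1234/example-doi",
    "creators": [{"name": "Doe, Jane"}],
    "titles": [{"title": "Example survey dataset"}],
    "publisher": "Example University",
    "publicationYear": 2024,
    "resourceType": "Dataset",
}


def missing_fields(rec, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or empty in rec."""
    return [field for field in required if not rec.get(field)]


def to_json(rec):
    """Serialize the record so that other systems can parse it."""
    return json.dumps(rec, indent=2, sort_keys=True, ensure_ascii=False)


if __name__ == "__main__":
    print(missing_fields(record))
    print(to_json(record))
```

Because the record follows a fixed structure, a program can check it for completeness and pass it on to another system, such as a search index, without human interpretation.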
Metadata profiles in DORIS
To make it easier for researchers from different disciplines to describe and share data, SND has developed subject-specific metadata profiles. These profiles align with the top levels of the Fields of Research and Development classification (FORD) from the OECD Frascati Manual and Sweden’s national research subject classification system from Statistics Sweden (SCB).
Current metadata profiles in DORIS:
- Agricultural Sciences (top level at OECD FORD and SCB)
- Medical and Health Sciences (top level at OECD FORD and SCB)
- Natural Sciences (top level at OECD FORD and SCB)
- Social Sciences (top level at OECD FORD and SCB)
- Engineering and Technology (top level at OECD FORD and SCB)
- Veterinary Science (top level at OECD FORD and SCB)
- History and Archaeology
- Earth and Related Environmental Sciences
- Language Resources
A profile for Humanities and the Arts is currently in development.
SND also provides a general profile for data that do not fit into any of the other categories.
For further reading, SND has published the documentation for the metadata profiles on Zenodo.
SND’s metadata profiles build on domain-specific metadata standards and requirements from international research infrastructures. For example, the Social Sciences profile meets the metadata requirements from CESSDA, the Language Resources profile is interoperable with the metadata schema used by CLARIN, and the Earth and Related Environmental Sciences profile fulfils requirements from both ISO 19115 and INSPIRE.
Subject headings and keywords in DORIS
To facilitate machine readability and interpretability, domain-specific, standardized lists of words and phrases – so-called controlled vocabularies – are often used for elements such as subject headings, keywords, and menu options. DORIS uses controlled lists from standards such as DDI, Dublin Core, CESSDA, and DataCite, and supplements them with others, for example GeoNames for geographic information, ISO 639 for language codes, and Sweden's national research subject classification system for research subjects.
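As a rough illustration of how a controlled vocabulary constrains input, the Python sketch below maps free-text language entries to codes from a small excerpt of ISO 639-1. The three-entry excerpt, `LANGUAGE_CODES`, and `normalize_language` are assumptions made for this example and do not describe how DORIS itself is implemented.

```python
# Hypothetical excerpt of a controlled vocabulary (ISO 639-1 codes mapped
# to English labels). A real vocabulary would be loaded from its source.
LANGUAGE_CODES = {
    "sv": "Swedish",
    "en": "English",
    "fi": "Finnish",
}


def normalize_language(value, vocabulary=LANGUAGE_CODES):
    """Map a free-text entry to a vocabulary code, or None if no term matches."""
    cleaned = value.strip().casefold()
    # Accept the code itself, ignoring case and surrounding whitespace.
    if cleaned in vocabulary:
        return cleaned
    # Otherwise accept the preferred label of a term.
    for code, label in vocabulary.items():
        if label.casefold() == cleaned:
            return code
    # Anything else is outside the controlled vocabulary.
    return None
```

Entries that match neither a code nor a label are rejected, so the same concept is not stored under several spellings.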
DORIS supports the following lists of keywords and subject headings:
- AAT, Art & Architecture Thesaurus
- AGROVOC, Vocabulary for Agricultural Sciences
- ALLFO, Allmän finländsk ontologi (General Finnish Ontology)
- ELSST, The European Language Social Science Thesaurus
- EnvThes, Environmental Thesaurus
- FISH, Thesaurus of Monument Types
- GCMD, Global Change Master Directory, vocabulary for Earth Science
- GEMET, GEneral Multilingual Environmental Thesaurus
- ICD-10, International Classification of Diseases
- MeSH, Medical Subject Headings
- NASA STI Thesaurus
Controlled vocabularies (CVs) for specific metadata elements are presented in the documentation for SND’s metadata profiles.
For some metadata elements for which no machine-readable controlled vocabularies exist, SND has developed lists based on other, established lists of keywords. For example, DORIS uses vocabularies from the Swedish National Heritage Board (Riksantikvarieämbetet) for types of remains and investigations, and terms developed in the ARIADNE collaboration and published in PeriodO for historical time periods.