Review metadata
This page covers the review of data descriptions created using SND’s documentation system, DORIS. In this context, “metadata” refers to the information entered in the fields of the DORIS web form, not information included in documentation files or data files.
- Under General guidance on reviewing metadata, you will find advice on reviewing metadata descriptions as a whole – such as which fields must be completed and how detailed the metadata should be.
- Under Reviewing metadata in DORIS, a selection of sections in the data description form is listed, with selected fields discussed to support a review of their content.
Below is a brief overview of what you need to check to ensure that metadata meet the minimum requirements for inclusion in the SND catalogue. These requirements are described in the SND document Krav och rekommendationer för data och metadata i SND:s forskningsdatakatalog (Requirements and Recommendations for Data and Metadata in SND’s Research Data Catalogue) on Zenodo.
General guidance on reviewing metadata
When reviewing metadata for a data description, the starting point is that the researcher who created the description is the person best suited to describe the data. The reviewer’s main task is therefore to ensure that all required fields – and as many relevant optional fields as possible – are completed, and that the content appears reasonable. You are not expected to verify the accuracy of the content, but rather to check that the information provided in each field makes sense in light of the accompanying documentation.
Some minor edits to the metadata may be easiest for you as the reviewer to make directly, such as correcting typos, adding keywords, or entering ORCID or ROR IDs. Other fields, such as data collection method, are best completed by the researcher. If you make changes as a reviewer, it is good practice to inform the researcher of what was done and to ask them to approve the catalogue entry preview before publication.
Which metadata fields must be completed?
DORIS has a minimum set of metadata elements that must be provided for a data description to be published. These requirements ensure that research data can be found, accessed, and reused, and that the metadata can be disseminated through other systems. Meeting these minimum requirements is an important step towards fulfilling the FAIR principles.
Fields marked with an orange-red symbol (·) in DORIS are mandatory and must be completed before a data description can be published. The required fields vary depending on the subject profile selected in the form. Some fields are mandatory for all profiles. Non-mandatory fields should still be completed as fully as possible.
Mandatory metadata fields must be provided in both Swedish and English. In fields where terms are selected from a controlled vocabulary, translation is handled automatically. However, in free-text fields, you must manually add a translation in the form. The translations do not need to be exact, but the same level of information should be available in both languages. For longer text entries, it may be appropriate to refer to the catalogue entry in the other language, but an introduction should be present in both languages.
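As a rough illustration of the bilingual check, the sketch below flags mandatory free-text fields that lack text in either language. The record layout (a dictionary mapping each field name to its "sv" and "en" texts) and the field names are assumptions made for this example; they do not reflect DORIS's actual data model or any export format.

```python
def missing_translations(record, mandatory_fields):
    """Return {field: [missing language codes]} for mandatory free-text fields.

    `record` is a hypothetical structure: {field_name: {"sv": str, "en": str}}.
    """
    problems = {}
    for field in mandatory_fields:
        values = record.get(field) or {}
        # A language counts as missing if the text is absent or only whitespace.
        missing = [lang for lang in ("sv", "en")
                   if not (values.get(lang) or "").strip()]
        if missing:
            problems[field] = missing
    return problems


# Hypothetical example record with illustrative field names.
example_record = {
    "description": {"sv": "Kort sammanfattning.", "en": "A full description."},
    "data_collection": {"sv": "", "en": "Web survey"},
}
report = missing_translations(
    example_record, ["description", "data_collection", "purpose"]
)
```

A check like this only confirms that text is present in both languages. Whether the two versions carry the same level of information still needs a human reading.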

Researchers may submit a data description even if some mandatory fields are missing. In such cases, the reviewer is expected to help provide the missing information. It is technically possible to publish a data description without completing all mandatory fields, but this is not recommended. Such publication should be reserved for exceptional cases; for example, when the PID requirement can be waived (see below under What applies to data descriptions that only contain metadata?). A note explaining the missing information should be added in the Notes section in DORIS.
How much information is needed and at what level?
Good metadata strike a balance between general information and detailed content. The Description field is especially important. It should explain the context and nature of the data in a way that is understandable even to researchers outside the field. It should also provide enough detail to allow potential users to quickly assess whether the dataset is relevant and worth exploring further. Other metadata fields, such as those under Collection and method, are typically aimed at a more specialized audience.
Some metadata fields contain links to external sources such as publications, websites, or related resources. These links should, whenever possible, be persistent identifiers such as a DOI, handle, or URN. If no persistent identifier is available, a standard URL can often be used, but since URLs do not guarantee long-term accessibility, they may need to be verified periodically.
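To make the distinction between persistent identifiers and ordinary URLs concrete, here is a minimal sketch that sorts a link into one of the categories mentioned above. The patterns are simplified assumptions for illustration, not a full implementation of the DOI, Handle, or URN syntax, and the function name is hypothetical.

```python
import re

# Simplified patterns (assumptions for illustration, not full specifications).
DOI_PATTERN = re.compile(
    r"^(?:https?://(?:dx\.)?doi\.org/|doi:)?10\.\d{4,9}/\S+$", re.IGNORECASE
)
HANDLE_PATTERN = re.compile(r"^(?:https?://hdl\.handle\.net/|hdl:)\S+$", re.IGNORECASE)
URN_PATTERN = re.compile(r"^urn:[a-z0-9][a-z0-9-]{0,31}:\S+$", re.IGNORECASE)
URL_PATTERN = re.compile(r"^https?://\S+$", re.IGNORECASE)


def classify_link(link):
    """Return 'doi', 'handle', 'urn', 'url', or 'unknown' for a link string."""
    link = link.strip()
    if DOI_PATTERN.match(link):
        return "doi"
    if HANDLE_PATTERN.match(link):
        return "handle"
    if URN_PATTERN.match(link):
        return "urn"
    if URL_PATTERN.match(link):
        # Plain URLs are acceptable but may need periodic verification.
        return "url"
    return "unknown"
```

Links classified as "url" are the ones worth rechecking periodically. Note that a handle or URN embedded in some other resolver address is reported as a plain URL by this sketch.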
Make sure the metadata are reasonable and useful, that no fields contain meaningless values, and that there are no spelling errors or typos.
What applies to data descriptions that only contain metadata?
DORIS can be used to share metadata for datasets that have already been published elsewhere. This is done by ticking the box I only want to share metadata. Reviewing these data descriptions can be more challenging, as you may not have direct access to the dataset: it may be openly accessible through another portal, or you may have to request access to it.
Note: This checkbox should not be used for datasets that cannot be shared openly – such access restrictions should instead be indicated using the Data access level field.
Data descriptions that only contain metadata are not assigned a DOI in DORIS, as the description does not include a dataset. Instead, an existing persistent identifier (PID) for the dataset must be entered in DORIS, in line with SND’s Policy for Persistent Identifiers (PID) on Research Data. Exceptions to the PID requirement may be made when the data description refers and links to a resource collection (such as a database, catalogue, or portal) that itself contains multiple datasets.
Reviewing metadata in DORIS
Below is a list of the sections included in the DORIS data description form. Each section contains several fields and some of the fields are described in more detail to highlight aspects that may be relevant to review. Keep in mind that the number of visible fields depends on the research area profile selected.
- Research area profile
- Files and access
- Citation and description
- Administrative information
- Collection and method
- Topics and keywords
- Geographic coverage
- Publications and relations
- Language resources
Research area profile
Whether a suitable research area profile has been selected often only becomes clear during review. It is possible to change the profile in the form even after the description has been submitted, but subject-specific fields will be cleared if the profile is changed. It is not essential to use the “correct” profile, but selecting an appropriate profile generally improves the quality of the metadata by aligning them with disciplinary practices.
Files and access
This section lists the data and documentation files to be shared with the dataset. If SND CARE is used, the files are uploaded here; if local storage is used, local routines apply for specifying data files. For datasets with restricted access, it is especially important that accompanying documentation files are indicated correctly, as these should be openly accessible in the catalogue entry even if the data files cannot be shared openly.
Licence and copyright
Ensure that the licence and copyright information do not conflict with the selected data access level, your institution’s local policies, or other applicable frameworks. As an example, datasets with restricted access should not have a Creative Commons licence applied.
Data contain personal data / Data contain other protected information
It is crucial that the information provided here is correct. A detailed review and discussion with the researcher are often needed to verify this. Datasets that contain personal data may only be shared via DORIS if the researcher’s organization has an agreement with SND. This may involve local storage or a data processing agreement with SND.
Also indicate what type of personal data are included. This is important for both prospective secondary users and internal processes for assessing requests for access and disclosure of the data.
If it is not possible to fully determine whether personal data are included – for example, due to the size or complexity of the material – we recommend selecting Data contain personal data and using the Type of personal data field to describe any potential risks related to the personal data.
Data access level
Check that the selected data access level aligns with the content of the data – particularly regarding personal or protected information. If the data contain information that cannot be shared openly (e.g., sensitive personal data), the access level should be set to Access to data is restricted, which means a request to access the data is required before access is granted.
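The consistency checks in this section (licence against access level, and personal or protected data against access level) can be summarized in a small sketch. The dictionary keys and the values "open" and "restricted" are hypothetical stand-ins for the corresponding DORIS fields, and the warnings only point to entries worth a closer look; they do not replace the reviewer's judgement.

```python
def access_warnings(entry):
    """Return a list of warnings for a hypothetical data description dict."""
    warnings = []
    restricted = entry.get("access_level") == "restricted"
    licence = (entry.get("licence") or "").upper()

    # Restricted datasets should not carry a Creative Commons licence.
    if restricted and licence.startswith("CC"):
        warnings.append("Creative Commons licence on a restricted dataset")

    # Personal or protected information combined with open access
    # should be discussed with the researcher.
    sensitive = (entry.get("contains_personal_data")
                 or entry.get("contains_protected_information"))
    if sensitive and not restricted:
        warnings.append("personal or protected information with open access")

    return warnings
```

For example, an entry with access level "restricted" and licence "CC BY 4.0" produces the first warning, while an open dataset flagged as containing personal data produces the second.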
Read more about data that contain personal data on the page GDPR and personal information.
Remember that even if data cannot be made directly downloadable, documentation files should be. This is important so that a prospective data requester can obtain as much information as possible to decide whether the material may be useful to them.
Citation and description
This section includes information that gives visitors an overview of the dataset: title, creator(s), and a description of the data. Much of the information in this section forms the basis for how the dataset will be cited. At the top of this section, a suggested citation for the dataset is provided, which will be visible in the catalogue entry. This makes it easier for users to cite the dataset.
Title
Titles should ideally be provided in both Swedish and English, but this field is exempt from the general bilingual requirement as a suitable Swedish title may not always be available for the purpose of data publication.
The Alternative title field can be used to enhance discoverability, for example by providing a title in a third language.
Avoid using the same title for the data description as any publication associated with the research results. In such cases, use a title such as “Data for: [Title of the article]”.
Creator/Principal investigator
This may be one or more individuals or organizations. It is relatively uncommon for both individuals and organizations to be listed as creators of a dataset. An organization may be listed as the creator when the dataset stems from a large research project or collaboration. For example, the SOM Institute lists the organization as the creator of its surveys.
The names entered here will be included in the data citation. Individuals or organizations that contributed but should not be included in the citation should instead be listed in the Contributors field (see next section, Administrative information). Whenever possible, include ORCID IDs for individuals and ROR IDs for organizations to improve interoperability and findability.
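When entering ORCID IDs by hand, a mistyped digit is easy to miss. ORCID iDs end in a check character computed with the ISO 7064 MOD 11-2 algorithm, so a quick check is possible. The sketch below accepts only the bare hyphenated form (for example 0000-0002-1825-0097, the sample iD used in ORCID's own documentation); the function names are chosen for this example. It confirms that an iD is well formed, not that it belongs to the right person.

```python
import re

# Bare ORCID iD: four groups of four characters, last one may be "X".
ORCID_FORMAT = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def orcid_check_character(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Return True if the string is a well-formed ORCID iD with a correct checksum."""
    orcid = orcid.strip().upper()
    if not ORCID_FORMAT.match(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_character(digits[:15]) == digits[15]
```

ROR IDs use a different format and are not covered by this sketch.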
If a researcher’s institution does not have an official Swedish name, use the English name in both language fields rather than leaving the Swedish field blank.
Description
This is a central field in any data description. The description should provide users with a clear understanding of what kind of data are involved and the context in which they were collected or generated.
A good description balances general information – accessible to non-specialists – with sufficient detail to enable researchers with domain knowledge to assess whether the dataset may be relevant to them for further research. Larger and more complex datasets from completed projects may benefit more from broad descriptions aimed at a wider audience than smaller, article-specific datasets do. For highly specialized material, it may be acceptable to provide a more targeted, specialized description.
Description is a mandatory field in both Swedish and English. However, the text does not have to be identical; a shorter summary in Swedish with a reference to the English catalogue page for more information is acceptable.
If you draft a simple description yourself, always ask the researcher to review and approve what you have written, especially if the research field is unfamiliar to you. Consider including a reference in the text to the English page for further information.
Administrative information
It is essential that the correct Research principal (Forskningshuvudman) is provided, as this determines which organization is responsible for reviewing the dataset, which storage system will connect to DORIS, and who is responsible for decisions on data access and disclosure.
Note: You cannot change the research principal afterwards. If the wrong research principal organization has been selected, the researcher must create a new data description. To simplify this, the previous description can be copied using Other actions > Duplicate data description. You may also contact the SND office at snd@snd.se for possible solutions.
Research principal
Research principal refers to the organization within which the research was conducted and which bears ultimate responsibility for it. If there are multiple research principals, enter the organization responsible for making the data accessible. You can, for example, ask the researcher which organization is responsible for archiving the data or which organization submitted the ethics application.
Other research principals
This field can be used to list additional research principals, e.g., in collaborative projects. In collaborations, such as those between a university and a regional authority, it may be unclear which organization should be listed as research principal. The researcher should usually know this, but you should be alert to cases where the creator and principal investigator come from different organizations, or where the metadata indicate a collaborative project. If in doubt, ask the researcher for clarification.
Collection and method
In the Time period(s) investigated field, the researcher can specify either a specific date (year, month) or a more general period (e.g., “Bronze Age”). Note that this field does not automatically refer to the data collection period.
If the person submitting the data description has entered a broad date range (from–to) but only studied selected times within that range, it may be more appropriate to list the individual time periods instead of an interval. This improves catalogue search results. For example, if the researcher has chosen the range AA–ZZ but the data only cover periods AA, EE, and ZZ, it is better to list AA, EE, and ZZ separately as individual time periods.
Topics and keywords
At least one Research area from Statistics Sweden’s standard classification of fields of research must be provided. Depending on the research field and data type, additional classifications from CESSDA or INSPIRE may be relevant. CESSDA is primarily used for the social sciences, while INSPIRE is recommended for spatial data.
Keywords
Keywords improve the findability of data. A data description must include at least one keyword, but the more keywords provided, the more findable the dataset becomes. Whenever possible, keywords should be selected from one of the controlled vocabularies. If a suitable term is not available, free-text keywords can be entered. The advanced keyword search allows you to explore terms within specific subject areas and vocabularies. You can also search using the English version of the keyword lists, as some terms do not have Swedish equivalents.
As a reviewer, you are encouraged to help the researcher by suggesting or adding more relevant keywords.
Geographic coverage
The section Geographic coverage includes several fields where you can specify the geographical area covered by the data. It is also possible to mark the area on a map, which enables geospatial search in the catalogue and adds a map to the catalogue entry. You may also add a free-text description of the geographical coverage or upload GIS data to avoid duplicated work.
Publications and relations
Many researchers make their data accessible in connection with the publication of an article. In such cases, it is essential that the article is linked to the data description. A copy of the article should be saved even if it is published Open Access, to ensure the information remains accessible if the original publication becomes unavailable. If a publication describes how the data were created, it is particularly important to link it to the data description.
Publication details can be imported automatically from SwePub. Always double-check that the imported and automatically generated information is accurate. For larger studies, there may be publication lists that are maintained on project websites. In such cases, you can use the Link to publication list field – but be aware that links may need updating over time.
Additional publications can be added to the catalogue entry later.
Language resources
The Language resources section only appears if the researcher has selected the Language resources research area profile.