Rich Metadata

Overview

Teaching: 10 min
Exercises: 15 min

Questions

1. What is the difference between Metadata and Rich metadata?

2. How to create a Rich Metadata file?

3. Where to put a Rich Metadata file?

Objectives

The participant will understand the difference between Metadata and Rich Metadata.

The participant will be able to retrieve Rich Metadata files and use them.

FAIR principles used for Rich Metadata:

Findable

FM-F2 (Machine-readability of Metadata) → doi.org/10.25504/FAIRsharing.ztr3n9

Interoperable

FM-I1 (Use a Knowledge Representation Language) → doi.org/10.25504/FAIRsharing.jLpL6i
FM-I2 (Use FAIR Vocabularies) → doi.org/10.25504/FAIRsharing.0A9kNV

1. What is the difference between Metadata and Rich metadata?

Metadata is the “data about the data”. It is a detailed description of a Digital Object, for example the documentation of a dataset’s properties.

For example:

Data for: Women Insurgents, Rebel Organization Structure and Sustaining the Rebellion: The Case of Kurdistan Workers’ Party → LINK TO EXAMPLE
This record shows how a data source is described for future use.

Metadata Attribute	Example
Descriptive Metadata	DOI
Structural Metadata	Data Size
Administrative Metadata	Distributor
Statistical Metadata	Date of Data Collection

Image: Women Insurgents dataset

Metadata by itself is plain text; to be meaningful for further reuse, it needs to be in a machine-readable format. When data is submitted to a trusted data repository, the machine-readable metadata is generated by the data repository. If the information is not in a repository, a text file with machine-readable metadata can be added as part of the documentation.
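As a rough illustration of the last point, the four attribute types from the table above could be written to a machine-readable text file with a few lines of Python. This is only a sketch: the field names and all values (the DOI, the distributor, the date, the size) are hypothetical placeholders, not taken from the example dataset.

```python
import json

# Hypothetical placeholder values; replace them with the real record.
metadata = {
    "descriptive": {"doi": "10.1234/example-doi"},
    "structural": {"data_size_bytes": 2048},
    "administrative": {"distributor": "Example Repository"},
    "statistical": {"date_of_data_collection": "2020-01-31"},
}


def to_machine_readable(record: dict) -> str:
    """Serialise the metadata record as indented JSON text."""
    return json.dumps(record, indent=2, sort_keys=True)


text = to_machine_readable(metadata)

# A program can parse the text back into exactly the same structure,
# which is what makes the file machine-readable and not just plain text.
parsed = json.loads(text)
print(parsed["descriptive"]["doi"])
```

The resulting text can be saved next to the rest of the documentation, for instance as a file in the project folder.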

Rich Metadata is

Standardised

Structured

Machine- and human-readable

A subset of documentation

Metadata alone is plain text

There are many examples and resources where you can learn more about metadata. The idea is straightforward: we want to document the scientific data we generate for future use.

More resources

Metadata definition (The Turing Way) → LINK
What is Metadata, and how do I document my data? (CESSDA Training) → LINK
Rich Metadata and indexability by Search Engines (Google) → LINK

S. Bonaretti (2019) Video: A walk-through video to describe Metadata in ZENODO.

2. How to create a Rich Metadata file?

There are several ways, but the main takeaway is that you don’t need to do it manually. The idea is to fill in a form using a tool for Rich Metadata generation (table below) and export it (or copy-paste it) in a machine-readable format such as JSON-LD, which supports interoperability.

Platform for Rich Metadata generation	Source	Online	Note
Dataverse Export Button	LINK	✅	Specific for datasets, it’s the easiest way to get a rich metadata file → RECOMMENDED
JSON-LD Generator Form	LINK	✅	Specific to the NanoSafety community, but it can be adapted
Steal Our JSON-LD	LINK	✅	General use. Ideal for blog posts, tables, videos and research project websites
JSON-LD Schema Generator For SEO	LINK	✅	Tailored for SEO, but quite comprehensive
FAIR Metadata Wizard	LINK	✅	A bit slow, but it is tailored to a generic scientific project → RECOMMENDED

Export JSON-LD Metadata

Image: With Dataverse Export Button, you can quickly get your Rich Metadata file
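To give an idea of what such an exported file contains, here is a minimal sketch that builds a JSON-LD description by hand using the schema.org vocabulary. It is not the output of any of the tools listed above; the function name `build_dataset_jsonld` and every value passed to it (title, DOI, creator, licence) are made-up placeholders.

```python
import json


def build_dataset_jsonld(name, description, identifier, creators, license_url):
    """Return a schema.org Dataset description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org/",  # vocabulary the keys below come from
        "@type": "Dataset",
        "name": name,
        "description": description,
        "identifier": identifier,
        "creator": [{"@type": "Person", "name": c} for c in creators],
        "license": license_url,
    }


# Placeholder values for illustration only.
record = build_dataset_jsonld(
    name="Example interview study",
    description="Placeholder description of a hypothetical dataset.",
    identifier="https://doi.org/10.1234/example-doi",
    creators=["Jane Doe"],
    license_url="https://creativecommons.org/licenses/by/4.0/",
)

jsonld_text = json.dumps(record, indent=2)
print(jsonld_text)
```

In practice a generator form produces this structure for you; the sketch only shows that the result is ordinary JSON with an added `@context` and `@type`.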

3. Where to put a Rich Metadata file?

A straightforward rule: Everywhere your data is stored.

In the project’s root folder

In the data repository

In the GitHub repository

On the project’s webpage
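Two of the locations listed above can be sketched in Python: saving the file in the project’s root folder and embedding it in a webpage. The file name `metadata.jsonld` is an assumption of this sketch, not a required convention, and the record is a placeholder. The `<script type="application/ld+json">` element is the usual way to embed JSON-LD in HTML so that search engines can read it.

```python
import json
import tempfile
from pathlib import Path

# Placeholder record; in practice this is the exported Rich Metadata file.
record = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example dataset (placeholder)",
}


def save_to_project_root(project_root: Path, record: dict) -> Path:
    """Write the record to <project_root>/metadata.jsonld (assumed file name)."""
    target = project_root / "metadata.jsonld"
    target.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return target


def as_html_snippet(record: dict) -> str:
    """Wrap the record in a script element for the project's webpage."""
    body = json.dumps(record, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# A temporary folder stands in for the real project root here.
with tempfile.TemporaryDirectory() as tmp:
    path = save_to_project_root(Path(tmp), record)
    saved = json.loads(path.read_text(encoding="utf-8"))

snippet = as_html_snippet(record)
print(snippet)
```

The same file can be committed to the GitHub repository and uploaded to the data repository, so that every copy of the data travels with its description.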

Example:

More and more services are using common schemas such as DataCite’s Metadata Schema or Dublin Core to foster greater use and discovery. A schema provides an overall structure for the metadata and describes core metadata properties. While DataCite’s Metadata Schema is more general, there are discipline-specific schemas such as the Data Documentation Initiative (DDI) and Darwin Core.
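As a small sketch of what using a schema can look like, the snippet below describes a record with four Dublin Core terms in JSON-LD and checks that none of them is empty. The values are placeholders, and the list of expected properties is this sketch’s own choice for illustration; it is not an official list of mandatory fields from Dublin Core or DataCite.

```python
import json

# Namespace of the Dublin Core terms vocabulary.
DCTERMS = "http://purl.org/dc/terms/"

# Properties this sketch chooses to expect (illustrative, not an official list).
EXPECTED = ("dcterms:title", "dcterms:creator", "dcterms:identifier", "dcterms:date")


def to_dublin_core(title, creator, identifier, date):
    """Describe a record with Dublin Core terms in a JSON-LD dictionary."""
    return {
        "@context": {"dcterms": DCTERMS},
        "dcterms:title": title,
        "dcterms:creator": creator,
        "dcterms:identifier": identifier,
        "dcterms:date": date,
    }


def missing_properties(record: dict) -> list:
    """Return the expected properties that are absent or empty."""
    return [key for key in EXPECTED if not record.get(key)]


# Placeholder values for illustration only.
dc_record = to_dublin_core(
    title="Example interview study",
    creator="Jane Doe",
    identifier="https://doi.org/10.1234/example-doi",
    date="2020-01-31",
)

print(json.dumps(dc_record, indent=2))
print(missing_properties(dc_record))
```

A check of this kind catches a forgotten field before the file is published.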

Thanks to schemas, the process of adding metadata has been standardised to some extent, but there is still room for error. For instance, DataCite reports that the number of links between papers and data is still very low. Publishers and authors are missing this opportunity.

Discussion

Real case:
The project Practicing Legitimation: How Chinese State Capital is Transnationalized in Europe (doi:10.34894/BFZF0Y) is a clear example of a study where the interviews can’t be made public. Nevertheless, the metadata is made available so that search engines can index the study and make it visible.

Discuss with your team the value (or not) of publishing and describing the study even though the actual data is not disclosed.

Discuss with your team whether there is something in the study that would be worth making publicly available.

Key Points

Rich Metadata = Metadata + using FAIR Vocabularies (e.g. Dublin Core) + in an Interoperable format (e.g. JSON-LD)

There are tools for creating Rich Metadata files. Researchers do not have to do it manually. For example: FAIR Metadata Wizard

Circular Research Data Coursebook. 2nd Edition
