FAIR Research Data Coursebook: Glossary

Key Points

0. Introduction
  • FAIR data sources are friendly to both humans and machines; they aim for transparency in science and future reuse.

  • A DOI (Digital Object Identifier) is a type of PID (Persistent Identifier).
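As a small illustration of what "persistent" means in practice, the sketch below turns a DOI into a link that goes through the https://doi.org resolver. The function name doi_to_url is made up for this example, and the sample DOI 10.1000/182 is used only as a placeholder. The sketch skips details such as percent-encoding of special characters.

```python
def doi_to_url(doi: str) -> str:
    """Turn a DOI (bare, 'doi:'-prefixed, or already a doi.org link) into a resolver URL."""
    doi = doi.strip()
    lowered = doi.lower()
    # Drop a known prefix so that only the bare DOI remains.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if lowered.startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Every DOI starts with the directory indicator "10."
    if not doi.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return f"https://doi.org/{doi}"


link = doi_to_url("doi:10.1000/182")
print(link)
```

The DOI itself never changes; the resolver is what keeps pointing to the current location of the object.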

1. Set up your own terms
  • The Data Terms of Use statement is the legal basis for using the data source it refers to.

  • A License is the bare minimum requirement for Data Terms of Use.

  • If a standard License does not fit your project, you can use a Terms of Use template, e.g., a Sample Data Usage Agreement.

2. Speak the same language
  • ‘Codebook’, ‘data glossary’, and ‘data dictionary’ are other names for Data Descriptions.

  • Ontologies (in information science) are like public online vocabularies of community-curated terms and their definitions.

  • By reusing Ontology terms or community-accepted vocabularies, we aim to create a culture of recycling terminology by default.

3. Securely share
  • Data Access Protocols are a set of formatting and processing rules for data communication. For example, imagine entering a secure room: you must follow certain steps or hold the right keys to get in.

  • When you expose your data using FAIR protocols, you must register your service in a registry for FAIR APIs such as SMART API.
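To make the "steps and keys" picture a bit more concrete, here is a minimal Python sketch of preparing a request over HTTP, one common data access protocol. The URL, the token and the function name build_data_request are hypothetical placeholders, not a real service. The request is only built, not sent.

```python
from urllib.request import Request


def build_data_request(url: str, token: str) -> Request:
    """Prepare (but do not send) an HTTP GET request for a protected dataset."""
    headers = {
        # The "key": a bearer token issued by the data provider (hypothetical).
        "Authorization": f"Bearer {token}",
        # A "formatting rule": ask for a machine-readable JSON-LD response.
        "Accept": "application/ld+json",
    }
    return Request(url, headers=headers, method="GET")


request = build_data_request("https://data.example.org/datasets/42", "my-secret-token")
print(request.get_method(), request.full_url)
print(request.get_header("Accept"))
```

Which headers and credentials a real service expects depends on that service; its registry entry or API description is the place to check.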

4. Publish and preserve
  • Data repositories can make research data more discoverable by machines (e.g., the Google search engine).

  • Always aim for a repository that fits your community (e.g., DataverseNL). Otherwise, deposit your dataset in a generic repository (e.g., Zenodo).

  • If the data is about human subjects or includes demographics, you can always choose to make it private or deposit an aggregated subset.
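One possible way to produce an aggregated subset is to replace individual records with counts per group. The sketch below assumes a made-up list of participant records with an "age" field and counts participants per 10-year age band. The field name, the band width and the function names are illustrative choices only.

```python
from collections import Counter


def age_band(age: int, width: int = 10) -> str:
    """Map an exact age to a band label such as '20-29'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def aggregate_by_age_band(records, width: int = 10) -> dict:
    """Count records per age band, so that no individual row is shared."""
    counts = Counter(age_band(r["age"], width) for r in records)
    # Sort bands by their numeric lower bound, not alphabetically.
    return dict(sorted(counts.items(), key=lambda kv: int(kv[0].split("-")[0])))


# Hypothetical individual-level records (not real data).
records = [{"age": 23}, {"age": 27}, {"age": 31}, {"age": 8}]
summary = aggregate_by_age_band(records)
print(summary)
```

Only the summary would be deposited; the individual-level records stay private.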

5. Make machines work for you
  • Rich Metadata = Metadata + using FAIR Vocabularies (e.g., Dublin Core) + in an Interoperable format (e.g., JSON-LD)

  • There are tools for creating Rich Metadata files, so researchers do not have to write them by hand. One example is the FAIR Metadata Wizard.
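The "Rich Metadata" formula above can be sketched in a few lines of Python: plain metadata values, described with Dublin Core terms, written out as JSON-LD. This is a minimal illustration, not the output of any particular wizard. The function name build_rich_metadata and all sample values (title, creator, identifier) are placeholders.

```python
import json


def build_rich_metadata(title, creator, issued, license_url, identifier):
    """Describe a dataset with Dublin Core terms in a JSON-LD structure."""
    return {
        # The context maps the 'dcterms' prefix to the Dublin Core Terms vocabulary.
        "@context": {"dcterms": "http://purl.org/dc/terms/"},
        "@id": identifier,
        "dcterms:title": title,
        "dcterms:creator": creator,
        "dcterms:issued": issued,
        "dcterms:license": {"@id": license_url},
    }


record = build_rich_metadata(
    title="Example survey dataset",              # placeholder value
    creator="Jane Researcher",                   # placeholder value
    issued="2024-01-15",                         # placeholder value
    license_url="https://creativecommons.org/licenses/by/4.0/",
    identifier="https://doi.org/10.1234/example",  # placeholder, not a real DOI
)
print(json.dumps(record, indent=2))
```

Because the terms come from a shared vocabulary, a machine that knows Dublin Core can tell which value is the title and which is the license without guessing from column names.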

6. Responsibly reuse
  • Digital Objects (e.g., Datasets) “live” on the web. Imagine they are like fish in the sea 🐠🐠. The way to get picked up by a “fisherman” 🎣, i.e., a search engine (e.g., Google), is to describe these Digital Objects with Rich Metadata.

Glossary

RDM Research Data Management  
FAIR Findable, Accessible, Interoperable and Reusable  
DMP Data Management Plan  
CC Creative Commons  
ODC Open Data Commons  
RDF Resource Description Framework  
API Application Programming Interface  
HTTP Hypertext Transfer Protocol  
DOI Digital Object Identifier  
PID Persistent Identifier