Record-level Metadata

Overview

Teaching: 15 min
Exercises: 20 min
Questions
Objectives
  • Evaluate and rank the quality of existing metadata records.

  • Describe the types and importance of record-level metadata.

  • Compose an appropriate set of descriptive keywords for a given text.

Creating Record Level Metadata

Learning objectives:

Metadata quality: Good - Better - Best

Exercise 1 - rank these Zenodo entries in terms of metadata quality (7 minutes)

This is a continuation of Exercise 3 in the Documentation section. Rank these from 1 (most helpful/informative) to 3 (least helpful/informative):

Discuss the results. Specifically, answer and discuss the following questions:

The metadata in your life

You’re used to metadata within your research. You’ve got metadata about specific data points, observations, samples, etc. But there are many more kinds of metadata.

The information that you were looking at in the Zenodo records is metadata: metadata about the dataset (record) on that page. Let’s take a look at the pieces of these pages.

Point out where these pieces are:

This information is important because:

Good metadata are important for reproducible research, because they describe the data, and thus provide the context for interpreting the data, analysis, and results.

Let’s think about the workflow of discovery. The user:

  1. Searches for something
  2. Reviews the results - is this the kind of thing I was looking for, and if so, is it worth studying further?
  3. Might add some filters to reduce and refine the results
  4. Selects a record to review and goes to that record’s page
  5. Reviews the new information on this page, including the fuller description, keywords, and other readme/documentation files.
  6. Downloads and digs into the data files

This person would continue to move through these steps so long as the information continues to look sufficiently interesting.

Metadata also aid discovery. Metadata should be clearly defined and tightly integrated with the data and project (Hart et al. 2016).
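
The search and filter steps of the discovery workflow can be sketched in a few lines of Python. This is only an illustration: the records, their field names (`title`, `description`, `keywords`), and the helper functions `search` and `filter_by_keyword` are all hypothetical, not the schema or search logic of Zenodo or any other repository. It shows how a well-described record is found by a query while a poorly described one stays invisible.

```python
# Hypothetical records; the field names are illustrative, not a repository schema.
records = [
    {
        "title": "Gapminder life expectancy analysis",
        "description": "Jupyter Notebooks and tables on life expectancy by country.",
        "keywords": ["gapminder", "life expectancy", "jupyter notebook"],
    },
    {
        # Same kind of content, but with almost no record-level metadata.
        "title": "final_v2",
        "description": "",
        "keywords": [],
    },
]


def search(records, query):
    """Return records whose title, description, or keywords mention the query."""
    needle = query.lower()
    hits = []
    for record in records:
        # Combine all searchable metadata fields into one string.
        text = " ".join([record["title"], record["description"], *record["keywords"]])
        if needle in text.lower():
            hits.append(record)
    return hits


def filter_by_keyword(records, keyword):
    """Keep only records tagged with this keyword (case-insensitive match)."""
    wanted = keyword.lower()
    return [r for r in records if wanted in (k.lower() for k in r["keywords"])]


results = search(records, "life expectancy")
refined = filter_by_keyword(results, "Gapminder")
print([r["title"] for r in refined])
```

The second record can never be returned by a topical query, because nothing in its metadata says what it contains.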

Keywords, best friend/worst enemy (7 minutes)

There are many places where you might need to add in tags and keywords about items. You do this for organizing your pictures, maybe your electronic notes, or your issue tracker tickets.

The keywords you add need to be items:

The same keyword may count as too general, too specific, or just right depending on the platform you are using.

Exercise 2: What makes a good keyword a good keyword?

For example, assume you have a series of Jupyter Notebooks you are going to deposit in an archive in support of a manuscript you are publishing on the population genetics of BRCA1 alleles. Which of the following keywords would be useful or not useful?

What kind of context could change your answers?

Exercise 3: Picking keywords for the gapminder data

Imagine that you are finishing up a project on the gapminder dataset that we’ve been using over the course of the workshop. You are preparing to deposit the dataset and the Jupyter Notebooks into an archive such as Zenodo. The submission interface allows you to provide a set of keywords to describe your deposit, and you want to maximize the impact of your deposit by making it easy to find for those who would benefit from it.

You will work with a partner (or a small group).

The entire room now decides on a single set of 5 keywords. Again, this may be some form of union or new creation.
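
If the room wants a starting point for that discussion, one possible way to combine the groups' proposals is to count how many groups suggested each keyword and keep the most common ones. The sketch below assumes this simple voting rule; the function names `normalize` and `merge_keywords` and the example proposals are hypothetical, and the room is free to overrule the result.

```python
from collections import Counter


def normalize(keyword):
    """Lowercase a keyword and collapse runs of whitespace."""
    return " ".join(keyword.lower().split())


def merge_keywords(proposals, limit=5):
    """Rank keywords by how many groups proposed them and keep the top `limit`."""
    counts = Counter()
    for group in proposals:
        # A set ensures each keyword counts at most once per group.
        counts.update({normalize(k) for k in group if k.strip()})
    # Most-proposed first; ties are broken alphabetically so the result is stable.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [keyword for keyword, _ in ranked[:limit]]


# Hypothetical proposals from three groups.
proposals = [
    ["Gapminder", "life expectancy", "GDP"],
    ["gapminder", "Life  Expectancy", "population"],
    ["gapminder", "gdp", "jupyter notebook"],
]
print(merge_keywords(proposals))
```

Normalizing before counting matters here: without it, "Gapminder" and "gapminder" would be treated as different keywords and split the vote.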

Key Points

  • TODO