Documentation

Overview

Teaching: 10 min
Exercises: 20 min
Questions
  • Why should I invest time in good documentation?

  • How does my target audience influence my documentation strategy?

  • What are some published examples of good documentation?

Objectives
  • Describe how documentation is useful to yourself and to others

  • Evaluate and rank the quality of comments in published notebooks

Overview

Documenting your process, especially as it concerns your data, is a key element of making your research more reproducible. Data manipulation is as integral to your analysis as statistical modelling and inference. If you do not thoroughly record all the steps you used to process data, it will likely be impossible for you, or anyone else, to repeat the analysis in the future (Wilson et al. 2016). Scripting your data processing in the Jupyter Notebook is powerful because the notebook saves the code, which documents what action was taken, and lets you intersperse that code with text explaining the motivation behind each step, i.e., the why. There is also project-level documentation, which is not needed to understand a particular series of data processing steps but helps readers understand the organization of the project as a whole. Finally, documentation can be used to aid discoverability.
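As a minimal sketch of the "what" versus the "why", consider the notebook cell below. The data, the sentinel value of -999.0, and the function name `drop_missing` are all hypothetical and chosen only for illustration. The code records what was done, and the comments and docstring record why.

```python
# Hypothetical data: daily temperature readings in degrees Celsius.
# Assumption for this example: the sensor logs -999.0 when it fails.
raw_readings = [21.4, 22.0, -999.0, 23.1, -999.0, 20.8]

MISSING_SENTINEL = -999.0


def drop_missing(readings, sentinel=MISSING_SENTINEL):
    """Return the readings with sentinel values removed.

    Why: the sentinel marks a sensor failure, not a real measurement.
    Leaving it in would pull the mean far below any plausible temperature.
    """
    return [value for value in readings if value != sentinel]


# What: filter the raw data, then average only the valid readings.
clean_readings = drop_missing(raw_readings)
mean_temperature = sum(clean_readings) / len(clean_readings)
print(f"Mean of {len(clean_readings)} valid readings: {mean_temperature:.2f} degrees C")
```

In a notebook, the longer explanation of why the sentinel exists could equally live in a Markdown cell placed just above the code cell.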

In this lesson, we will discuss the types and styles of documentation, their utility, and how you might tailor them for different audiences.

Learning objectives

Wondering how others have been using the Notebook? Luckily, the IPython community curates a Gallery of IPython Notebooks that have been used for scientific research and educational tutorials. Browse the topics that pique your curiosity.

Here are some cool features that have been added to or done with notebooks:

Exercise 1a

Evaluate and rank the breadth and quality of documentation in two notebooks. What is one good thing and one bad thing about each? Add your comments to the etherpad.

Exercise 1b

Here is a piece of a notebook. Modify the existing Markdown documentation to improve either the text itself or the formatting. Paste your modification in the etherpad.

Optional Jupyter Notebook Demonstration

The instructors and/or helpers for this lesson may want to give a brief demonstration of how they use the Jupyter Notebook in their own research.

FIXME: This portion of the lesson will detail the best practices for doing that (i.e. time, providing the links, how to prepare, etc).

Documentation for different target audiences

Exercise 2

Think-Pair-Share: Should you leave commented-out code in your workflow when you publish it? Although it is useful to you while you’re working, is it useful to someone who is interested in your process? Is it necessary to understand and recreate your research?
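To make the discussion concrete, here is a hypothetical cell (the function name `normalize` and the data are invented for illustration) in which an abandoned approach has been left as commented-out code. Consider whether a reader needs those lines to understand or rerun the final version.

```python
def normalize(values):
    """Scale values to the range [0, 1] using min-max normalization."""
    low, high = min(values), max(values)
    # Earlier attempt, left over from experimenting:
    # mean = sum(values) / len(values)
    # return [v - mean for v in values]
    return [(v - low) / (high - low) for v in values]


scaled = normalize([2.0, 4.0, 6.0])
print(scaled)
```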

README file

It is important to write a brief overview of your project. A README file is a short file (think 1-pager) in the project’s home directory, and it is typically the main entry point to the project, and in particular to its code. It should thus answer questions others will commonly have when they come upon the project, including the following:

A README should be written in plain text, with markup that is easy to read (such as Markdown, Reitz 2016).

Based on the above, a README file should include the following items:

Exercise 3

Compare and contrast different research product archives for the quality and value of their documentation, and their corresponding utility for reuse.

Wrap-up

By this point in the workshop, you have learned a lot about using the Jupyter Notebook and how to document your process. Documenting your process is very important for communicating your work to others, and it is also a tool for communicating with yourself.

Exercise 4

  1. Take a moment to reflect on what you learned in this workshop and what changes you want to make in your scientific process.
  2. In your Jupyter notebook, write down what was the most useful thing that you learned.
  3. In your Jupyter notebook, write one change you will make to your current workflow to make it more reproducible.
  4. Set an alert to check in with yourself to see if you have implemented the change you wanted to make.

Key Points

  • Your code tells what you did. Your documentation tells why you did it and why it is important.

  • Documentation is the key to communicating your workflow and findings with your future self, collaborators, peers, and the general public.