Module 3: Containers and Reproducible Builds

Module Overview

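"Works on my machine" describes code that runs in its author's environment but fails elsewhere. It is especially common when code is written without a software engineering background. For code (and the science built on it) to be trusted, it must run reproducibly on other machines.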

We've already learned about pipenv as a Python packaging tool, which goes a long way towards giving reproducible builds. For even greater reproducibility (and deployability), containers are the tool of choice. A container is an isolated, minimal operating-system environment that shares the host's kernel and is packaged with all the software needed to run the desired application. Because a container image bundles the application together with its dependencies, it behaves the same way on any host that can run it.

Docker is a common standard and tool for containers, and we will use it to build and run Linux containers with Python code.
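To make "running a container" concrete, here is a minimal sketch in Python that assembles a `docker run` command line without executing it. The image tag `python:3.11-slim` and the helper name `docker_run_command` are assumptions chosen for illustration; they are not prescribed by this module.

```python
import shlex

# Assumption: any official Python image tag would work here.
IMAGE = "python:3.11-slim"


def docker_run_command(image, command, remove=True):
    """Return the argument list for `docker run` (nothing is executed)."""
    args = ["docker", "run"]
    if remove:
        # --rm deletes the container once its process exits
        args.append("--rm")
    args.append(image)
    args.extend(command)
    return args


# Ask the Python interpreter inside the container for its version.
cmd = docker_run_command(IMAGE, ["python", "--version"])
print(shlex.join(cmd))
```

On a machine with Docker installed, the same list could be passed to `subprocess.run(cmd, check=True)` to actually start the container. Building the command as a list, rather than one string, avoids shell-quoting mistakes.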

Learning Objectives

1. Launch Docker containers and access/execute programs on them

  • Understanding Docker container basics
  • Running pre-built Docker containers
  • Executing commands within containers
  • Managing container lifecycle
  • Accessing container resources
  • Interacting with container processes

2. Create/customize a Dockerfile to build a basic custom container

  • Writing Dockerfile instructions
  • Setting up container environments
  • Installing dependencies
  • Configuring container settings
  • Building custom images
  • Managing container configurations
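The second objective can be previewed with a short sketch that generates the text of a basic Dockerfile from Python. The helper `render_dockerfile`, the `/app` working directory, the `requests` dependency, and the `app.py` script name are all hypothetical choices for illustration; the guided project may structure its Dockerfile differently.

```python
def render_dockerfile(base_image, packages, entry_script):
    """Return the text of a small Dockerfile for a single-script Python app."""
    lines = [
        f"FROM {base_image}",  # start from an existing base image
        "WORKDIR /app",        # assumption: the app lives in /app
    ]
    if packages:
        # Install dependencies at build time so every build gets the same set
        lines.append("RUN pip install --no-cache-dir " + " ".join(packages))
    lines.append(f"COPY {entry_script} .")
    lines.append(f'CMD ["python", "{entry_script}"]')
    return "\n".join(lines) + "\n"


# Hypothetical inputs: a slim Python base image, one dependency, one script.
dockerfile_text = render_dockerfile("python:3.11-slim", ["requests"], "app.py")
print(dockerfile_text)
```

Saving this text as a file named `Dockerfile` next to `app.py` and running `docker build -t my-image .` (the tag `my-image` is also an assumption) would produce a custom image. Pinning package versions in the `RUN` line would make the build more reproducible still.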

Guided Project

In this guided project, we'll learn how to create Docker containers for reproducible Python environments.

Guided Project File:

guided-project.md

Module Assignment

Please read the assignment.md file in the GitHub repository for detailed instructions.

Assignment File:

assignment.md

Solution Video

Additional Resources