Beyond Jupyter

Beyond Jupyter is a collection of self-study materials on software design, with a specific focus on machine learning applications. It demonstrates how sound software design can accelerate both development and experimentation.

The software developed in machine learning contexts often remains at a fairly low level of abstraction and fails to satisfy well-established standards of software design and software engineering. One could argue that development environments such as Jupyter actively encourage unstructured design. We therefore deem it necessary to abandon the respective development patterns and to metaphorically go “beyond Jupyter”.

The goal of the course material is for practitioners to

understand how a principled software design approach supports every aspect of a machine learning project, accelerating both development & experimentation.

It is a common misconception that good design slows down development; in fact, the opposite is true. We showcase the limitations of (unstructured) procedural code and explain how principled design approaches can drastically increase development speed while simultaneously improving the quality of the code along multiple dimensions. We advocate object-oriented design principles, which naturally encourage modularity and map well to real-world concepts in the application domain, be they concrete or abstract. Our overarching goal is to foster

  • maintainability
  • efficiency
  • generality, and
  • reproducibility.
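To make the contrast with unstructured procedural code more concrete, here is a minimal sketch of what an object-oriented design might look like in a small experiment. It is not taken from the course material: the names `Regressor`, `MeanRegressor`, `LinearRegressor` and `run_experiment` are hypothetical and chosen purely for illustration, and the data is a toy example. The point is that once "a model" is an explicit abstraction, the experiment code can stay unchanged while models are added or swapped.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from statistics import mean


class Regressor(ABC):
    """Hypothetical abstraction: anything that can be fitted and can predict."""

    @abstractmethod
    def fit(self, xs: list[float], ys: list[float]) -> Regressor:
        ...

    @abstractmethod
    def predict(self, x: float) -> float:
        ...


class MeanRegressor(Regressor):
    """Baseline that always predicts the mean of the training targets."""

    def fit(self, xs: list[float], ys: list[float]) -> Regressor:
        self._mean = mean(ys)
        return self

    def predict(self, x: float) -> float:
        return self._mean


class LinearRegressor(Regressor):
    """Least-squares line through the training points (one input feature)."""

    def fit(self, xs: list[float], ys: list[float]) -> Regressor:
        x_bar, y_bar = mean(xs), mean(ys)
        sxx = sum((x - x_bar) ** 2 for x in xs)
        sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        self._slope = sxy / sxx
        self._intercept = y_bar - self._slope * x_bar
        return self

    def predict(self, x: float) -> float:
        return self._intercept + self._slope * x


def mean_absolute_error(model: Regressor, xs: list[float], ys: list[float]) -> float:
    """Average absolute deviation between predictions and targets."""
    return mean(abs(model.predict(x) - y) for x, y in zip(xs, ys))


def run_experiment(
    models: dict[str, Regressor],
    train: tuple[list[float], list[float]],
    test: tuple[list[float], list[float]],
) -> dict[str, float]:
    """Fit every model on the training data and score it on the test data.

    This function depends only on the Regressor abstraction, so adding a
    new model does not require touching it.
    """
    return {
        name: mean_absolute_error(model.fit(*train), *test)
        for name, model in models.items()
    }


# Toy data following y = 2x + 1 (made up for this illustration).
train_data = ([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
test_data = ([4.0, 5.0], [9.0, 11.0])

results = run_experiment(
    {"mean": MeanRegressor(), "linear": LinearRegressor()},
    train_data,
    test_data,
)
for name, error in results.items():
    print(f"{name}: MAE = {error:.3f}")
```

In a procedural notebook, the fitting and evaluation steps would typically be copied and edited for each model variant. Here, comparing another model amounts to adding one entry to the dictionary passed to `run_experiment`, which is one small way in which modular design can support experimentation.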

Content

Our content is available on GitHub and covers the following modules:

  1. Object-Oriented Programming: Essentials

    This module explains the core principles of object-oriented programming (OOP), which lay the foundation for subsequent modules.

  2. Guiding Principles

    This module puts forth our set of guiding principles for software development in machine learning applications. These principles can critically inform design decisions during development.

  3. Spotify Song Popularity Prediction: A Refactoring Journey

    This module addresses the full journey from a notebook implemented in Jupyter to a highly structured solution that is vastly more flexible, easier to maintain, and strongly facilitates experimentation as well as deployment for production use. We transform the implementation step by step, clearly explaining the benefits achieved and naming the relevant principles being implemented along the way.

  4. Anti-Patterns

    While the rest of the course material focuses on demonstrating positive design patterns, this module collects a number of common anti-patterns.