Research Software Engineering with Python


In this course, you will move beyond programming to learn how to construct reliable, readable, efficient research software in a collaborative environment. The emphasis is on practical techniques, tips, and technologies for building and maintaining complex code effectively. This is an intensive, practical course.


  • Prior knowledge of at least one programming language, including variables, control flow, and functions.
  • You are required to bring your own laptop to the course as the classrooms we are using do not have desktop computers.
  • We have provided setup instructions for installing the software needed for the course on your computer.
  • Eligibility: This course is for Turing PhD students. Turing researchers might join too if capacity allows.



Introduction to Python

  • Why use scripting languages?
  • Python, IPython, and the IPython notebook.
  • Data structures: lists, dictionaries, and sets.
  • List comprehensions
  • Functions in Python
  • Modules in Python
  • An introduction to classes

Research Data in Python

  • Working with files on the disk
  • Interacting with the internet
  • JSON and YAML
  • Plotting with Matplotlib
  • Animations with Matplotlib

Version Control

  • Why use version control
  • Solo use of version control
  • Publishing your code to GitHub
  • Collaborating with others through Git
  • Branching
  • Rebasing and Merging
  • Debugging with git bisect
  • Forks, Pull Requests and the GitHub Flow

Testing your code

  • Why test?
  • Unit testing and regression testing
  • Negative testing
  • Mocking
  • Debugging
  • Continuous Integration

Software Projects

  • Turning your code into a package
  • Releasing code
  • Choosing an open-source license
  • Software project management
  • Organising issues and tasks

Construction and Design

  • Coding conventions
  • Comments
  • Refactoring
  • Documentation
  • Object Orientation
  • Design Patterns

Advanced Programming Techniques

  • Functional programming
  • Metaprogramming
  • Duck typing and exceptions
  • Operator overloading
  • Iterators and Generators

Programming for Speed

  • Optimisation
  • Profiling
  • Scaling laws
  • NumPy
  • Cython

Scientific file formats 1

  • Serialisation and Deserialisation
  • Domain specific languages
  • Templating languages: Mako

Scientific file formats 2

  • Binary file formats: XDR and HDF5
  • Parsers and grammars: Python Lex and Yacc
  • Ontologies
  • Semantic file formats


Examples and exercises for this course will be provided in Python. Python will be introduced during the course, but we will assume you can already program, so you may find supplementary Python content useful.


None: you are not graded. You will be provided with two exercises for self-assessment.


You can find the course notes as HTML via the navigation bar to the left.

The notes are also available in a printable PDF format.

If you encounter any problem or bug in these materials, please open an issue in the course repo, explaining the problem and, if possible, its solution. This helps us improve the materials for future users.