Research Software Engineering with Python

Introduction

In this course, you will move beyond programming to learn how to construct reliable, readable, efficient research software in a collaborative environment. The emphasis is on practical techniques, tips, and technologies for building and maintaining complex code effectively. This is an intensive, practical course.

Pre-requisites

  • Prior knowledge of at least one programming language, including variables, control flow, and functions.
  • You are required to bring your own laptop to the course as the classrooms we are using do not have desktop computers.
  • We have provided setup instructions for installing the software needed for the course on your computer.
  • Eligibility: This course is for Turing PhD students. Turing researchers might join too if capacity allows.

Registration

See email and Turing Complete. The first class is open; subsequent classes require registration via Eventbrite.

This course may not be audited.

Instructors

Schedule

Class                                        Date    When           Where
Introduction to Python                       12 Oct  10am-1pm       Enigma
Research Data in Python                      26 Oct  10am-1pm       Enigma
Research Data in Python and Version Control  2 Nov   9:30am-4:30pm  Enigma
Testing your code                            9 Nov   10am-1pm       Enigma
Software Projects                            15 Nov  10am-1pm       Enigma
Construction and Design                      11 Jan  10am-1pm       Enigma
Advanced Programming Techniques              18 Jan  10am-1pm       Enigma
Programming for Speed                        25 Jan  10am-1pm       Enigma
Scientific file formats 1 and 2              22 Feb  10am-2pm       Enigma

Bonus: HPC Carpentry Workshop, 6-7 December.

Synopsis

Introduction to Python

  • Why use scripting languages?
  • Python, IPython, and the IPython notebook
  • Data structures: lists, dictionaries, and sets
  • List comprehensions
  • Functions in Python
  • Modules in Python
  • An introduction to classes

Research Data in Python

  • Working with files on the disk
  • Interacting with the internet
  • JSON and YAML
  • Plotting with Matplotlib
  • Animations with Matplotlib

Version Control

  • Why use version control
  • Solo use of version control
  • Publishing your code to GitHub
  • Collaborating with others through Git
  • Branching
  • Rebasing and Merging
  • Debugging with git bisect
  • Forks, Pull Requests and the GitHub Flow

Testing your code

  • Why test?
  • Unit testing and regression testing
  • Negative testing
  • Mocking
  • Debugging
  • Continuous Integration

Software Projects

  • Turning your code into a package
  • Releasing code
  • Choosing an open-source license
  • Software project management
  • Organising issues and tasks

Construction and Design

  • Coding conventions
  • Comments
  • Refactoring
  • Documentation
  • Object Orientation
  • Design Patterns

Advanced Programming Techniques

  • Functional programming
  • Metaprogramming
  • Duck typing and exceptions
  • Operator overloading
  • Iterators and Generators

Programming for Speed

  • Optimisation
  • Profiling
  • Scaling laws
  • NumPy
  • Cython

Scientific file formats 1

  • Serialisation and Deserialisation
  • Domain specific languages
  • Templating languages: Mako

Scientific file formats 2

  • Binary file formats: XDR and HDF5
  • Parsers and grammars: Python Lex and Yacc
  • Ontologies
  • Semantic file formats

Exercises

Examples and exercises for this course will be provided in Python. Python will be introduced during this course, but we will assume you can already program. That means that you may find supplementary Python content useful.

Evaluation

None: you are not graded. You will be provided with two exercises for self-assessment.

Versions

You can find the course notes as HTML via the navigation bar to the left.

The notes are also available in a printable PDF format.

If you encounter any problem or bug in these materials, please open an issue in the course repo, describing the problem and, if possible, its solution. In this way we can improve the materials for future users.