Software Reliability (440)
Open to MEng 4 students and MSc in Computing (Advanced
Computing, Software Engineering and Secure Software
Systems specialisms) students, this course provides an overview of
exciting recent research into techniques and tools that aim to help
developers improve the reliability of their software.
The importance of software reliability
Society is becoming ever more reliant on software and
software-controlled systems. Some of this software is
safety-critical, e.g., the software used to control cars,
aeroplanes and other high-speed transport. Defects in safety-critical
software can lead to serious injury or death. A much larger volume of
software is business-critical, e.g., software that runs in mobile
phones, powers web servers and manages data centers. Defects in this
type of software can lead to significant financial losses.
Underpinning all of these areas is
systems software: the
low-level operating systems, compilers, device drivers and networking
software on which complex systems are built. This foundational role
means that the reliability of systems software is of primary
importance.
Traditional methods for improving software reliability
Three of the main techniques used in industrial and open source
projects to improve software reliability are:
- Manual testing: Manually crafting a suite of tests to
exercise a software system to a reasonably high degree, e.g., covering
a high percentage of program statements.
- Code reviews: Requiring that source code additions or
modifications (patches) are reviewed by experienced developers before
being committed to the code base.
- Coding standards: Requiring that all developers adhere to a
set of rules when writing or maintaining code. Coding standards can
improve source code readability, making it easier to spot defects, and
may ban the use of programming idioms that are arguably dangerous.
Rigorous manual testing, code reviews and adherence to standards
are essential to the success of large software projects, but they all
suffer from two common problems:
- They depend fundamentally on human reasoning and judgement.
Humans are clever, but software can be devilishly complex. It is
easy for subtle defects to creep into a project despite
adherence to coding standards, and to evade manual testing and code
review.
- They do not provide guarantees. A test suite can
demonstrate that certain executions of a software system do not
exhibit defects, but provides no further guarantee. For safety- (and
often business-) critical systems this may not be enough: it is highly
desirable to have a guarantee of defect freedom; ideally an
absolute guarantee, but at least a guarantee that system executions
have been systematically checked up to some well-defined bound.
This course
The focus of this course is on automatic techniques for improving
software reliability which go
beyond manual testing. The
course will cover:
- Verification condition generation
- Procedure summaries
- Bounded model checking
- Dynamic symbolic execution
- Constraint solving
- Invariant generation
- Systematic testing for concurrent programs
- The lockset algorithm
- Undefined behaviour, compiler bugs and unstable code
- Introduction to security and stack canaries
- Safe C compilers
- Control-flow, data-flow and write integrity
The constraint solving topic is at the heart of many of the
techniques we will study, as a common approach is to encode program
behaviours as logical formulae to be discharged to a solver. Dramatic
advances in the power of constraint solvers in recent years has made
it possible to apply relevant techniques to larger and larger software
systems.