Archie Maskill
A software development blog

Building Correct Software

An Introduction to Contracts
There are two measures of the quality of software: how correct it is, and how robust it is.

This article is about using Contracts to improve the correctness of software, with the aim of getting it right the first time round.

An implementation’s correctness can only be judged according to its specification. If an implementation could somehow include its specification (or a subset of it), this would be a great step towards it meeting its specification and assessing its correctness.

It’s extremely common for developers to write code that does what they say, and not what they mean. Computers, while incredibly faithful, are utterly stupid, and a great deal of precision is required when instructing them to do something non-trivial. In the business world, Contracts are a formal agreement between partners, clarifying the terms of a relationship. Their purpose is to specify obligations and benefits with clarity, and remove ambiguity. This idea can be used to directly address the correctness of software, by expressing its specification in the form of Contracts which declare the obligations and benefits of routines, and the code that calls these routines.

Contracts are implemented by way of assertions; statements which are believed by the developer to be true. When an assertion is evaluated by the program and found to be true, all is well. But when the assertion is false, the software is deemed to be behaving contrarily to the developer’s expectations.

Assertions are nothing new to most developers. Despite this, they’re rarely used in software and, when they are, tend only to be pulled out of the toolbox after a bug has been found. Once a developer is thinking in terms of Contracts, it becomes second nature for her to make use of assertions before bugs are found, and even before an implementation has been written.

The specification of a routine can be expressed with two sets of assertions:

  • Preconditions are checked upon entry to a routine.
  • Postconditions are checked upon exit from a routine.

If an assertion fails, then an appropriate exception is thrown. If a precondition was violated, this was due to a bug in the Client (the part of the code that calls the routine). A postcondition violation indicates a bug in the Supplier (the module that supplied the routine being called).

Let’s look at how we would expect a stack to behave (the specification):

Routine Name Description Preconditions Postconditions
Count Returns the number of items on the stack
IsEmpty Returns a boolean indicating if the stack is empty The result should be true if Count = 0, false otherwise
IsFull Returns a boolean indicating if the stack is full The result should be true if Count = _capacity, false otherwise
Item Returns the top-most item without popping the stack Can only be executed if the stack isn’t empty
Put Pushes an item onto the stack Can only be executed if the stack isn’t full Count must be incremented; the stack must not be empty; the top-most item must be the item just pushed
Remove Pops an item from the stack Can only be executed if the stack isn’t empty Count must be decremented; the stack must not be full; the top-most item is not the item just Removed

_capacity is the number of items a stack may hold. This is a hidden feature (using the convention of a leading underscore to imply this), and is unavailable to the Client

In Python, the implementation of Put would look something like the following…

def Put(item):
    assert not IsFull, "The stack must not be full"
    oldCount = Count   # required for postcondition

    # ...
    # implementation
    # ...

    # postcondition
    assert Count == oldCount + 1, "Count must be incremented"
    assert not IsEmpty, "The stack must not be empty"
    assert item == Item, "The top-most item must be the item just pushed"

Python is good for this first example because it’s quite a nice pseudo-code-looking language. That said, it’s also the last example you’ll be seeing in Python, because it (along with most other programming languages) does not offer support for Contracts. Simply placing assert statements at the beginning and end of a routine simulates Contract-usage only at its most basic level. By examining a language that directly supports Contracts, we will see the full richness that Contracts offer.

Languages that were designed with Contracts in mind have Contract-specific syntax baked into the language. This allows the formal separation of contracts from implementation, opening up assistance from the compiler and automated documentation tools, and offering further advantages that will be addressed. The .Net 4 framework offers a rich Contracts library, called Code Contracts. However, the syntax is sufficiently verbose that it’s not so good for didactic examples. Instead, I shall focus on Eiffel, touted as the poster child for Contract-driven programming. Here is the same example, written using Eiffel notation:

Put(item: G) is
        not IsFull
        -- ...
        -- implementation
        -- ...
        CountIncremented: Count = old Count + 1
        NotEmpty: not IsEmpty
        ItemJustPushed: Item = item

A few things to notice about Contract-supportive languages:

  • Preconditions and postconditions are formally separated from the implementation using the require and ensure clauses.
  • A syntax is provided for accessing the values of expressions from before the routine was executed (old Count, in the example above). This removes the need to store this value separately at the beginning of the routine.
  • The syntax allows for Assertion labels, such as CountIncremented, which are used to diagnose which assertion failed
  • Preconditions and appropriate postconditions should be available to Client code. In a language that supports Contracts, this is more easily supported within automated document generators and development environments.

So, when the routine Put is called, the precondition is checked. If this fails, an exception is thrown. When this happens, we can be sure that the Client did not fulfill its obligations to the Supplier. If it succeeds, then execution continues as normal. When it’s time to exit the routine, the postcondition is checked. If this fails then an exception is raised, signifying that the Supplier has failed to fulfill its obligation to the Client.

Once decent Contracts in place, it can be easier to see how we go about writing an implementation. It clarifies both the obligations required of the Client code before it may expect the benefits of the routine, and the checklist of tasks that must be achieved within the routine. Assertions help us get our software right in the first place by aiding in analysis, design, implementation and documentation. It is software with reliability built-in, as opposed to reliability achieved through iterations of post-implementation debugging.

When developers are writing Client code, they shouldn’t have to inspect the Supplier’s implementation. In fact, the source may not even be available for inspection. This makes Contracts a very useful device, serving as abstract descriptions for routines.

The whole point of preconditions is that they are available to Client code, giving it a fair chance to ensure the conditions are correct for calling the routine. It would therefore be unfair for preconditions to depend on attributes hidden from the Client. This is why private attributes are not allowed in preconditions, and why a compiler should flag such a violation.

There is no such rule for postconditions. Any postcondition that refers to features that are kept secret from the Client are simply not mentioned in the documentation generated for Client authors.

Preconditions and Defensive Programming
Preconditions allow the Supplier’s implementation to be simpler. The implementation can be written, safe in the knowledge that the preconditions have been satisfied, without having to provide checks for calling errors that may have been passed on from the Client code. Authors often try to achieve robustness by writing routines that accept a wide range of input states, by second-guessing malformed Client requests. Ironically, this can often lead to code that is less robust. It complicates the routine, bringing all the traditional problems associated with such complexity: difficult and costly to maintain, less reliable, brittle, and buggy. It also blurs the distinction between the Client’s and Supplier’s obligations. Is the Client asking something unreasonable of the Supplier? Or, if implemented ‘robustly’, is it a failure of the Supplier’s robustness?

For example, consider a square root function, SQRT. Ignoring complex numbers, it does not make sense to calculate the square root of a negative number. Such a request should be considered an error in the Client code.

SQRT(x: INT) is
        if (x = 0
        -- Remainder of implementation

SQRT(x: INT) is
        x >= 0
        -- Remainder of implementation

The first example of the SQRT function is an example of defensive programming. The idea here is that in order to build reliable software, every component should be designed to protect itself and the Client as much as possible. The second example uses Contracts. Which method to choose is a matter of personal choice, but a stronger case can be made for Contracts.

Any correctness condition required to successfully call a routine should be met by either the Client or the Supplier, but not both (to avoid redundancy). But which one?

  • We could assign this responsibility to the Supplier – rather than use a precondition, if…else statements are used to check for unsuitable conditions. The Supplier then handles these somehow. This is the Tolerant approach.
  • We could assign this responsibility to the Client – checks are performed to ensure the correct state before calling the Supplier’s routine. This is the Demanding approach.

The Tolerant approach, at first glance, looks quite appealing. It’s common for a Supplier to have multiple Clients, so effort could be saved by placing the ensuring of correctness conditions within the Supplier. But what happens if the Supplier is asked to, for example, pop an empty stack? Within the Supplier, it’s difficult to know how to react appropriately to such a request. Should it be ignored? Is it an error? Should an exception be raised? Should there be a correction, followed by a re-attempt? Should a user-visible message be printed? The Supplier can’t make this sort of decision, because it lacks the context that the Client has. It’s the Client’s role to know why there is an attempt to pop an empty stack, and what this means.

With a Demanding approach, the Supplier does not try to do everything. It aims to perform a well-defined job, and to be strict about cases it cannot handle. A Supplier that computes, checks for abnormal cases, takes corrective actions, notifies Clients and produces results will be complex and could easily fail some or all of these goals. The preconditions are available to Client authors, so it’s obvious what conditions are required to call the routine correctly.

The Tolerant approach is akin to “killing with kindness”, whereas the Demanding approach is like being “cruel to be kind”.
That said, the Tolerant approach still has its uses; for example, dealing with data coming from the outside world: user input, sensor data, …

Class Invariants
Up until this point, everything that has been said about Contracts has been applicable to any programming language paradigm. However, Contracts start to really shine when used in Object-Oriented languages.

To recap, preconditions and postconditions are Contracts that are checked upon entering and exiting a routine, respectively. Together, they characterise routines, providing an abstract description of what they do.

An invariant is a Contract that characterises the properties of an entire class. When any routine of the class is called remotely (i.e. by Client code, outside of the class), the invariant is checked both on entering and exiting the routine.

Returning to our stack example:

  1. Put‘s precondition ensures that it may only be called if the stack is not full (Count != _capacity)
  2. Remove‘s precondition ensures that it may only be called if the stack is not empty (Count > 0)

While other Contracts also hint at the relationship between Count and _capacity, we only get a clear view of the larger picture once we’ve collected together the fragments to form the invariant:

0 <= Count; Count <=_capacity

As the invariant, this condition must always be met, regardless of which routine is about to be executed, or has finished executing. If it is not true, then an exception is thrown. If the class implementation is correct, the invariant will be satisfied at the following three times:

  1. just after the class has been instantiated (and the constructor has been called)
  2. just before a remote call to a routine of the class
  3. just after a remote call to a routine of the class

Invariants are not checked at other times, as it may be necessary for a class’ routines to violate the invariant if they are to get any useful work done.

Logically speaking, when remote calling a routine, the following assertions are checked;

  1. Entering the routine: Invariant AND Precondition
  2. Exiting the routine: Invariant AND Postcondition.

So, technically, we don’t need the invariant – we could simply add it to all preconditions and postconditions. This would be a bad idea, for the following reasons:

  1. it introduces unnecessary redundancy
  2. the invariant states something meaningful about the class as a whole. When merged with preconditions and postconditions, this deeper meaning is lost
  3. someone needs to remember to add this invariant to the preconditions and postcondition of every new routine

Plain old Assert
Most languages support the plain old Assert instruction. It expresses a certainty that a property will be satisfied at that point in a computation. As well as providing reassurance, the hypotheses upon which you have been relying are made explicit for future readers of your software. While such Assert statements aren’t really Contracts, this article concerns not just Contracts, but correctness, so it will not be too much of a deviation to give a brief summary.

An invaluable application of the Assert instruction is just before a call to a routine with a precondition. If you are convinced that the call satisfies the precondition, but this is not immediately obvious from the context, the Assert instruction makes your conviction clear. Assertions also remind the reader that the lack of routine call protection is deliberate, and not an oversight.

x = a*a + b*b
Assert(X >= 0)  // because x is the sum of two squares
y = sqrt(x)

(This may be a poor example if you’re a mathematician for whom it _is_ immediately obvious that the sum of two squares is non-negative!)

In software construction, calls and operations usually rely on assumptions that are largely implicit. An author will work hard to convince herself that a particular property always holds at a particular point. Such careful consideration is necessary when writing good quality software. After a while, as memory fades or a project’s resources are re-assigned, all that survives is the text, with no trace of the rationale. If someone (perhaps even the original author) needs to understand the text, they will no longer have access to the resulting assumptions of the mental effort that went into writing the text, and will have to recreate the rationale that the previous author had. If an assumption is non-trivial, document it with an assert statement.

In summary:

  1. improve the correctness of software, by embedding specification within implementation.
  2. force obligations upon Client (preconditions), which are beneficial to the Supplier.
  3. force obligations upon the Supplier (postconditions), which are beneficial to the Client.
  4. aid in analysis, design, implementation and documentation.
  5. provide an abstract description of whole classes (invariants) and routines, whether or not the source code is available for inspection.

Additionally, assertions provide reassurance to the author, whilst preserving the efforts of reasoning about the code by expressing certainties and hypotheses to future readers.


2 Responses to “Building Correct Software”

  1. I think your SQRT routines have been truncated somehow: there aren’t two SQRT routines, and the fragment that is there is incomplete

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: