## Verifying loops – Part 2

Last time I showed how it was possible to analyse a loop-free and recursion-free program or function to
determine its semantics (i.e. its weakest precondition and its postcondition), but that this didn’t work
when we have loops. To make it possible to analyze loops thoroughly, eCv
(along with other formal systems) requires you to provide *loop invariants*.

What exactly is a loop invariant? Well, it’s a Boolean expression that depends on all the variables modified by the loop, and is true when the loop is first entered, during each iteration of the loop, and when the loop terminates. Typically, it comprises two parts.

The most important part of the loop invariant is a generalization of the state that the loop is intended to
achieve. It needs to be written so that it is easy to establish by suitable initialization of variables, yet
it becomes exactly the desired state when the loop terminates. I’ll refer to the desired state when the
loop terminates as the* loop postcondition*.

Let’s take a simple example. Suppose we have an array a of integers, and we want to set every element in that array to k. Here’s a function to do that:

voidsetArray(int*arraya, size_t size,intk)pre(a.lwb == 0)pre(a.lim == size)post(foralljin0..(size - 1) :- a[j] == k) { size_t i;for(i = 0; i != size; ++i) { a[i] = k; } }

The first precondition says that a is a regular pointer to the start of an array. The second one says that the number of elements is given by size. The postcondition says that when the function returns, for all indices j in the range 0 to (size – 1), a[j] is equal to k.

In this case, the loop comprises the entire body of the function, so the loop postcondition is the same as the
function postcondition. In fact, loop postconditions are very frequently **forall **expressions,
especially for loops that iterate over an array.

We need to generalize the **forall **expression in the postcondition so that it is true at the
start of every iteration of the loop. What is the state when we are about to commence the ith iteration? Well,
at that point we’ve already set all elements from zero to i-1 inclusive to k, but not
the elements from i onwards. We can express this with a slight modification
to the **forall **expression:

foralljin0..(i - 1) :- a[j] == k)

Do you see what I’ve done? I’ve changed the upper bound of the
**forall **from (size – 1) to (i – 1).
This gives us exactly what we need for the loop invariant:

- The loop initialization sets i to zero, so the initial bounds of the
**forall**are 0 .. -1 (that is, from zero up to minus one). This is an empty range (because -1 is less than 0), and a**forall**over an empty range is**true**. So the loop invariant is established at the start of the first iteration. - During each iteration, we set the ith element to k
**and**we increment i. So the invariant is preserved. For example, after the first iteration, the**forall**has range 0..0 so it just tells us that a[0] == k, which is exactly right. After the second iteration, the range is 0..1 so it says that a[0] and a[1] have value k, which is again exactly right. - The loop terminates when i == size, because I wrote the while-condition as i != size. If we replace i by size in the invariant, we get exactly the desired postcondition.

eCv expects the loop invariant to be written between the loop header and the body, like this:

voidsetArray(int*arraya, size_t size,intk)pre(a.lwb == 0)pre(a.lim == size)post(foralljin0..(i - 1) :- a[j] == k) { size_t i;for(i = 0; i != size; ++i)keep(foralljin0..(i - 1) :- a[j] ==k) { a[i] = k; } }

eCv uses the keyword **keep **to introduce the invariant, because we’re keeping the invariant true.

Is this enough to allow eCv to verify the loop? Unfortunately not. Using the above, eCv reports that it is unable to prove that a[j] is in-bounds in the loop invariant, or that a[i] is in-bounds in the loop body. So we need another component of the loop invariant, to ensure that these accesses are always in bounds. To make sure that a[i] is in bounds, we need to constrain i to be in the range 0..(size - 1) at that point. But we can’t constrain i to this range in the invariant, because when the loop terminates, i ends up with the value size, which is just outside this range (remember, the invariant has to be true not only at the start of every iteration, but also when the loop terminates). So instead we constrain i to the range 0..size. The body isn’t executed when i == size, so that constraint is sufficient to guarantee that a[i] is in bounds in the body. The program then looks like this:

voidsetArray(int*arraya, size_t size,intk)pre(a.lwb == 0)pre(a.lim == size)post(foralljin0..(i - 1) :- a[j] == k) { size_t i;for(i = 0; i != size; ++i)keep(iin0..size)keep(foralljin0..(i - 1) :- a[j] ==k) { a[i] = k; } }

I’ve put this new **keep **clause before the original one.
The reason is that eCv reported that it couldn’t prove that a[j] was in bounds
in the original **keep** clause. Putting the new **keep **clause first
allows eCv to assume that i is in 0..size in the second **keep **clause.
That is enough for it to prove that a[j] is in bounds, because j takes values from
0 to i – 1.

Have we finished? Well, eCv is able to prove that the above function meets its specification,
*if* the loop terminates. To prove the loop terminates, we’ll need to provide a *loop
variant* too. I’ll introduce loop variants next time.