David

Aliasing and how to control it

Today I’ll start by writing a simple function that determines the maximum and minimum of two integers. We want to return two values, and C doesn’t make that easy unless we declare a struct to hold them. So I’ll pass two pointers to where I want the results stored instead. Here goes:

#include "ecv.h"
void minMax(int a, int b, out int *min, out int *max)
post(*min <= a; *min <= b; *min == a || *min == b)
post(*max >= a; *max >= b; *max == a || *max == b)
{ *min = a < b ? a : b;
  *max = a > b ? a : b;
}

I’ve highlighted the eCv annotations like this. Notice how, in the postcondition, I’ve defined the minimum as a value that is less than or equal to both inputs and equal to one of them. Similarly for the maximum. Also notice that the two parameters used to pass the results back are flagged with the out keyword. This tells eCv (and users of this function) that the initial values of *min and *max are not used by the function, and the obligation to initialize them rests with the function, instead of with the caller of the function.

Does this function work? Always? Let’s see what eCv makes of it:

c:\escher\ecvtest\ecvtest20.c (9,25): Warning! Unable to prove:
     Postcondition satisfied when function returns
     (defined at c:\escher\ecvtest\ecvtest20.c (5,27))
     (see c:\escher\ecvTest\ecvtest20_unproven.htm#2),
     did not prove: *min <= b.

c:\escher\ecvtest\ecvtest20.c (9,25): Warning! Unable to prove:
     Postcondition satisfied when function returns
     (defined at c:\escher\ecvtest\ecvtest20.c (5,16))
     (see c:\escher\ecvTest\ecvtest20_unproven.htm#1),
     did not prove: *min <= a.

c:\escher\ecvtest\ecvtest20.c (9,25): Information! Confirmed:
     Postcondition satisfied when function returns
     (defined at c:\escher\ecvtest\ecvtest20.c (5,43))
     (see C:\escher\ecvTest\ecvtest20_proof.htm#3),
     proved: (a == *min) || (b == *min).

c:\escher\ecvtest\ecvtest20.c (9,25): Information! Confirmed:
     Postcondition satisfied when function returns
     (defined at c:\escher\ecvtest\ecvtest20.c (6,16))
     (see c:\escher\ecvTest\ecvtest20_proof.htm#4),
     proved: a <= *max.

c:\escher\ecvtest\ecvtest20.c (9,25): Information! Confirmed:
     Postcondition satisfied when function returns
    (defined at c:\escher\ecvtest\ecvtest20.c (6,27))
     (see c:\Escher\ecvTest\ecvtest20_proof.htm#5),
proved: b <= *max.

c:\escher\ecvtest\ecvtest20.c (9,25): Information! Confirmed:
     Postcondition satisfied when function returns
     (defined at c:\escher\ecvtest\ecvtest20.c (6,43))
     (see c:\Escher\ecvTest\ecvtest20_proof.htm#6),
     proved: (a == *max) || (b == *max).

So eCv proves that the maximum is returned correctly, but not the minimum. What’s going on here? If we follow the link to the “unproven” file, we find a clue:

         Could not prove: !(min == max)

Sure enough, it’s clear that if we make the following call:

int temp;
minMax(42, 24, &temp, &temp);

then minMax fails to meet its specification, because after storing the minimum in temp, it overwrites that value with the maximum.

The fix is to add a precondition to make that sort of call illegal:

void minMax(int a, int b, out int *min, out int *max)
pre(min != max)
post(*min <= a; *min <= b; *min == a || *min == b)
post(*max >= a; *max >= b; *max == a || *max == b)
{ *min = a < b ? a : b;
  *max = a > b ? a : b;
}

eCv is then able to verify the function.

This is a simple example of classic pointer aliasing - the situation in which two pointers refer to the same memory location. Any assignments we make through one of the pointers affects the value we read through the other. Another form of aliasing is pointer-variable aliasing. In this variant, a function is manipulating the value referred to by a pointer, and also a variable – typically a static variable declared outside the function. If the pointer points to the variable, then assignments through the pointer affect the value read from the variable, and vice versa.

eCv adopts a strict aliasing model that disallows casts between different pointer types. This means that in an eCv-compatible C program, pointer-pointer aliasing can only occur if the pointers have the same type, or if one of the types pointed to contains a field or element of the other. Similarly, aliasing between a pointer and a variable can only happen if the type pointed to is either the same as the type of the variable, or the same as the type a field or element contained in the variable. This serves to reduce the potential for aliasing.

Nevertheless, there are still many situations in which aliasing is possible and will break the program if it occurs. For example, consider the following function for copying elements from one array to another:

#include "ecv.h"
void arrayCopy(const int* array src, int* array dst, size_t num)
pre(src.lim >= num; dst.lim >= num)
post(forall i in 0..(num - 1) :- dst[i] == src[i])
{ size_t j;
  int i;
  for (j = 0; j < num; ++j)
  keep(j <= num)
  keep(forall i in 0..(j - 1) :- dst[i] == src[i])
  decrease(size - i)
  { dst[j] = src[j];
  }
}

The preconditions ensure that we don’t get out-of-bounds array accesses, and the postcondition describes what we want to achieve. For an explanation of the loop invariants (keep clauses) and loop variants (decrease clauses), see Verifying loops in C and C++, Verifying loops – part 2 and Verifying loops: proving termination.

Does this function meet its specification? Not if the elements of dst we are writing overlap with the elements of src that we are reading! So we need an anti-aliasing precondition again. Here’s one possibility:

void arrayCopy(const int* array src, int* array dst, size_t num)
pre(src.lim >= num; dst.lim >= num)
pre(forall i in src.indices; j in dst.indices; &src[i] != &dst[j])
post(forall i in 0..(num - 1) :- dst[i] == src[i])
{ ...
}

Stating that arrays don’t overlap in this way is rather cumbersome. But it can get a lot worse! Suppose we have, for example, an int* and a pointer to an array of structs, where each struct has several int fields, some int[] fields, and some struct fields that may themselves contains ints. If we want to write a precondition stating that the int* doesn’t alias any of the ints contained in the array of structs, we would need one precondition term for each possible way in which they could be aliased. If we add another int* parameter that is also forbidden to alias the other pointers, we would need to do much the same all over again.

To make it easier to express absence of aliasing, eCv provides the disjoint-expression. Here’s our example modified to use it:

void arrayCopy(const int* array src, int* array dst, size_t num)
pre(src.lim >= num; dst.lim >= num)
pre(disjoint(src.all, dst.all))
post(forall i in 0..(num - 1) :- dst[i] == src[i])
{ ...
}

The disjoint-expression takes two or more operands and yields true if and only if no pair of operands refer to overlapping storage. Internally, eCv determines all the ways in which pairs of operands could be aliased, and expands the expression into a conjunction of equivalent predicates for the prover to use.

Although eCv can reason about C programs even in the presence of aliasing, it’s best to write programs that avoid the possibility of aliasing as far as possible. To help you, eCv extends the strict aliasing model when you declare a new type using a constrained typedef. eCv doesn’t allow you to cast a pointer to a constrained type to or from a pointer to any other type. For example, suppose I declare the following:

typedef int invariant(value >= 0) count_t;
typedef int invariant(value >= -273) temperature_t;

void foo(count_t * a, temperature_t *b) { ... }

Within foo, eCv knows that *a and *b cannot be aliases for the same variable, because the pointer types are not compatible. If you should want to declare a new sort of integral type without actually constraining it, you can use invariant(true) in its typedef.

ARTICLES   List of Articles     TOPTOP