Predicate Logic
for
Specifying Programs
and
Proving their Correctness
Program Specification
At one level of abstraction, a program is a relation between input and output values.
It is common to express this relation as a pair of assertions about the state of a program:
The input assertion specifies what is must be true at the start in order for the program to be applicable.
The output assertion specifies what will be true at the end.
Such assertions are predicates on program state.

Pre- and Post- Conditions
The input assertion is also called a
pre-condition and the output-assertion a post-condition.
We normally work under the assumption that the input assertion is true.
 
If the input assertion is not true, then Òall bets are offÓ;

Uses of Specification
Specification can be used before coding, to prove that an already-constructed program is correct.
Specification can be used before coding, as a specification to which code can be written.

Specification Example
A program that finds the index of a maximal value in an array a of n > 0 elements.
An array max-finding program:
Input assertion: n > 0.
Output assertion:
(
"i) (i > 0 ô i < n) ¨ a[i] < a[m]

(ÒIndex m is index of a maximal value in a.Ó)
Is this adequate as a specification?

The Need for ÒAnchorsÓ
Input assertion: n > 0 and a == a0
Output assertion: Index m is index of a maximal value in a and a == a0
Without the anchor, the assertion could be satisfied without the program actually computing the maximum; it could simply set all elements of a to 0 and m to 0, and the assertion would be satisfied; a == a0 prevents the array from being modified overall.

Specification Examples
A greatest-common-divisor program:
Input assertion: x == x0, y == y0, x0 > 0, y0 > 0
Output assertion: x is the greatest common divisor of x0 and y0 [notated x == gcd(x0 , y0)].
Implicit here is that all quantities are integers.
Values x0 and y0 are not program variables, but serve to anchor the original values for later reference.

Specification Examples
An array-sorting program:
Input assertion: a == a0
Output assertion:
The elements of a are those of a0  as a ÒbagÓ (or Òmulti-setÓ) i.e. the same number of each element.
The elements of a are in increasing order
Implicit here is that the notion that
order makes sense for the elements.

Examples
A searching program:
Input assertion: a == a0
Output assertion: Index m is such that
a == a0, and
a[m] == v if v occurs in array a, and
m == -1   if v does not occur in array a

Example with more-formal Logic
A searching program:
Input assertion:
a == a0
Output assertion:
a == a0 and
(($i) a[i] == v) ¨ a[m] == v, and
(("i) a[i] != v) ¨ m == -1

Forms of Correctness
The main forms of correctness are:
Partial Correctness (PC):
If the input assertion is satisfied when the program starts and the program eventually terminates, then the output assertion will be satisfied upon termination.
Termination:
If the input assertion is satisfied, then the program will eventually terminate
Total Correctness (TC):
defined to be PC + termination

Forms of Correctness
The forms PC and termination are separated because it often is easier to prove them separately.
(Principle of ÒSeparation of concernsÓ)

Example
Construct, and prove correct, a program that squares a number using only addition.
This technique goes back to BabbageÕs
ÒDifference EngineÓ, 1832.

Slide 14
Squaring Program Idea
Look at squares:
1, 2, 4, 9, 16, 25, É
Look at first differences:
1, 3, 5, 7, 9, ...
Note that first differences are:
odd
differ by 2 each time

Synthesis of Squaring Program
Assert n > 0 is the number to be squared.
int sum = 0; // accumulated sum
int i = 0; // step number
int fd = 1; // first difference
int sd = 2; // second difference
while( i < n )
  {
  sum = sum + fd;
  fd = fd + sd;
  i = i+1;
  }
Assert sum == n2
What/where to add intermediate assertions?

Synthesis of Squaring Program
Assert n > 0 is the number to be squared.
int sum = 0; // accumulated sum
int i = 0; // step number
int fd = 1; // first difference
int sd = 2; // second difference
while( i < n )
  {//  sum == i2
ô i < n ô fd == 2*i+1 ô sd == 2
  sum = sum + fd;
  fd = fd + sd;
  i = i+1;
  }
Assert sum == n2
The arrow-ed assertion is called a loop invariant.

Prove the Squaring Program
Using assertion-based reasoning
(Òloop invariantÓ)
for PC.
Use Òenergy functionÓ or ÒvariantÓ
for termination.

Using Primed Variables
A useful technique to avoid confusion:
Each variable has a primed and un-primed version.
Unprimed: indicates the value of variable before the step or iteration.
Primed: indicates the value after

Using Primed Variables
Instead of assignment statements:
  sum = sum + fd;
  fd = fd + sd;
  i = i+1;
use mathematical equations:
      sumÕ == sum + fd
      fdÕ == fd + sd
      sdÕ == sd (no change)
iÕ == i+1
It is easier to reason about equations:
sum == i2    ô i < n  ô fd == 2*i+1   ô sd == 2
ô i < n
implies
sumÕ == iÕ2 ô iÕ < n ô fdÕ == 2*iÕ+1 ô sdÕ == 2

Transition Induction
To prove that an assertion (e.g. a loop invariant) is true whenever the program reaches the point to which the assertion is attached:
Basis: Show that the assertion is true the first time
Induction: Show that if the assertion is true the current time (unprimed variables), then it is true the next time (primed variables), provided there is a next time.

Loop Invariant Positioning
The loop invariant always goes right before the test for continuing to the next iteration.
In particular, it is ÒtestedÓ:
Before the first iteration, if any.
Before each additional iteration.
Before exit.

Energy Function Principle
for Proving Termination
An energy function (also called a ÒvariantÓ) is a function that one fabricates:
f:States
¨ Natural Numbers
(therefore always non-negative)
If the same point in the program is visited with state s1 after state s2, then necessarily:
f(s2) < f(s1)

Exercise: Prove this ÒcubingÓ program
Example
Construct, and prove correct, a searching program.
The need to prove that a program is correct can have beneficial effects on the structure of the program:
You need to be able to explain (in logic) what all the pieces do.

Search Program Specification
Input assertion:
a == a0
Output assertion:
a == a0, and
(($i) a[i] == v) ¨ a[r] == v, and
(("i) a[i] != v) ¨ r == -1
where r represents the returned value
For simplicity, we assume that the quantifiers range over the valid indices of the array only.

Note
(($i) a[i] == v)
(("i) a[i] != v)
The above two conditions are complementary; one or the other always holds, but not both.
This is a variant of DeMorganÕs Law.

A Search Program
Adding Input and Output Assertions (shown in red)
Globalizing an Assertion
Adding Intermediate Assertions (1)
Adding Intermediate Assertions (2)
Adding Intermediate Assertions (3)
Adding Intermediate Assertions (4)
Proving the Assertions (1)
Proving the Assertions (2)
Proving the Assertions (3)
Proving the Assertions (4)
Proving the Assertions (5)
In more detail (pay careful attention to < vs. <)
Proving the Assertions (6)
Proving the Assertions (8)
Thought Exercise
What if the search program were instead a binary search of an ordered array?
What would the loop invariant be?
How would you prove that it holds?

Structural Induction
Complementary approach to correctness.
Induction on data values or structure.
Resembles classical mathematical induction
Proves PC + termination at the same time.
Useful for functional programs.
due to McCarthy transformation, this means all programs
Set-up can be trickier.

Structural Induction for Lists
Let P be a property of (open) lists
To show
(
"L) P(L)
it is sufficient to show:
P([]) // Basis
("L) ("A)  P(L) => P([A | L]) // Induction step

Structural Induction Example
append([], M) => M;
append([A | L], M) => [A | append(L, M)];
Show
(
"L) ("M)
 
length(append(L, M)) = length(L) + length(M)
Here P(L) is the property:

 (
"M) length(append(L, M)) = length(L) + length(M)

Structural Induction Proof: Basis
To show P([]):
(
"M) length(append([], M)) = length([]) + length(M)
We know from the definition of append that
append([], M) == M.
We also know that
length([]) == 0.
So the thing to be shown reduces to:
(
"M) length(M) == 0 + length(M)
which is true since the equation reduces to an identity.

Structural Induction Proof: Induction Step
To show ("L) ("A)  P(L) => P([A | L])
Assume P(L):
(
"M) length(append(L, M)) = length(L) + length(M)
To show P([A | L]):
(
"M) length(append([A | L], M)) = length([A | L]) + length(M)
From the definition of append, we know that the Òto showÓ part is the same as:
 (
"M) length([A | append(L, M) ]) = length([A | L]) + length(M)
Since length([A | X]) == 1 + length(X), the Òto showÓ part is equivalent to:
 (
"M) 1 + length(append(L, M)) == 1 + length(L) + length(M)
By the induction hypothesis, we can substitute for length(append(L, M)) an expression that renders this as an identity.

Fine Points
To be perfectly rigorous, weÕd need to axiomatize information such as:

length([A | X]) == 1 + length(X)
WeÕd want to formalize the definition of length, addition, etc.
There are software systems such as ACL2 that automate many proofs of this nature.

Undecidability
There is no tool that can determine whether an arbitrary predicate logic formula is valid.
If there were, the halting problem would be solvable.
This is because the computation by a Turing machine can be captured as an appropriate logical formula, one which is valid iff the Turing machine fails to halt.