Computational Complexity
Solving Computational Problems
First, get it right.
Then, make it faster.
Avoid optimizing prematurely.

Topics
Algorithm analysis
Fast algorithm synthesis
Empirical measurement of complexity

Complexity in the algorithm-analysis context
means the cost of a program's execution
(in running time, memory, ...)
rather than the cost of creating the program
(in # of statements, development time, ...).
In this context, a less-complex program may require more development time.

Functions associated with a program
Consider a program with one natural number as input (for simplicity).
Two functions that can be associated with the program are:
f(n), the function computed by the program
T(n), the running time of the program

Running time T(n)
Possible size measures n
for T(n)
Total number of bits used to encode the input.
Number of data values in the input (e.g. size of an array)

The second is viewed as an approximation to the first.

Primitive Operations
These are operations which we don't further decompose in our analysis.
They are considered the fundamental building blocks of the algorithm, e.g.
+   *   -   /   if( )
Typically, the time they take is assumed to be constant.
Caution: This assumption is not always valid, and may need to be revisited.

Is multiplication really primitive?
In doing arithmetic on arbitrarily-large numbers, the size of the numerals representing those numbers may have a definite effect on the time required.
 2 x 2
vs.
 26378491562329846 x 786431258901237
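
One can watch this effect directly. A quick Python sketch (the digit counts are arbitrary; Python integers are arbitrary-precision):

import time

for d in [1000, 10000, 100000, 1000000]:
    a = 10**d - 1                      # a d-digit number (all 9s)
    b = 10**d - 3
    t0 = time.perf_counter()
    p = a * b                          # one multiplication of two d-digit numerals
    t1 = time.perf_counter()
    print(d, "digits:", t1 - t0, "seconds")

The time per multiplication grows with d, so treating * as a single constant-time step is only valid when operands fit in a machine word.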

Size of Numerals
For a single number n as input, the size of the corresponding numeral is typically on the order of log(n)
e.g. decimal numeral encoding:
size(n) = # of digits of n ≈ ⌈log10(n)⌉
⌈x⌉ = smallest integer ≥ x
(called the ceiling of x)

Does Numeral Radix Matter?
Not much, as long as it's > 1.
r-ary numeral encoding:
size(n) = # of digits of n ≈ ⌈logr(n)⌉
logr(n) vs. logs(n)?
All logs differ by only a constant multiple:
logr(n) = logr(s) · logs(n)
logr(s) is a constant, not a function of n.

log2 is the norm
log10(n) = log10(2) · log2(n) ≈ 0.301 · log2(n)
log2(n) ≈ 3.32 · log10(n)

Asymptotic Analysis
Typically in measuring complexity, we look at the growth-rate of the time function as size increases without bound, rather than the value itself.
There is therefore a tendency to pay less attention to constant factors in the run-time function.
This is called asymptotic analysis.
It is only one part of the story; sometimes constants are important.

Step Counting
Using exact running time to measure an algorithm requires calibration based on the type of machine, clock rate, etc.
Instead, we usually just count steps taken in the algorithm.
Often we will assume primitives take one step each.
This is usually enough to give us an accurate view of the growth rate of running time.
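
For instance, a minimal step-counting sketch in Python (the function and counter are illustrative, not from the course):

def sum_array(a):
    steps = 0
    total = 0
    for x in a:        # the loop body runs len(a) times
        total += x     # count one step per primitive addition
        steps += 1
    return total, steps

print(sum_array(list(range(10)))[1])   # 10 steps
print(sum_array(list(range(20)))[1])   # 20 steps

Doubling the input size doubles the step count: T(n) grows linearly.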

Straight-Line Code
Loop Code
Non-Numeric Loop Control
Recursive Code
Recurrence Formulas
represent the running time of recursive code
Solving Recurrence Formulas
One Method: Repeated Substitution
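For example, repeated substitution on a simple recurrence (a generic illustration, not necessarily the slides' example):
T(1) = c
T(n) = T(n-1) + c
Substituting repeatedly:
T(n) = T(n-1) + c = T(n-2) + 2c = ... = T(1) + (n-1)·c = c·n
so T ∈ O(n).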
Solving Recurrence Formulas
Another Example
Try These
Gauss's 3rd-Grade Technique
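For reference, Gauss's trick: pairing the first and last terms of 1 + 2 + ... + n gives n/2 pairs, each summing to n+1, so the total is n(n+1)/2. This gives closed forms for the arithmetic sums that arise in loop analyses, e.g. 1 + 2 + ... + (n-1) ∈ O(n^2).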
Approximate Solutions
Rather than solve a recurrence exactly, it is often simpler, yet serves the same purpose, to get an approximate solution.
More specifically, we'd like a tight upper bound on the solution, that is, an approximation that differs from the actual solution by at most a constant multiple.

O Notation
O is the letter Oh (for order)
Used to express upper bounds on running time (number of steps)
T ∈ O(f) means that
(∃c)(∀n) T(n) < c·f(n)
Think of O(f) as the set of functions that grow no faster than a constant times the value of f.
This is big-Oh; little-oh has a different meaning.

O Notation
Multiplicative constants are irrelevant in O comparisons:
If g is a function, then any function n ⇒ d*g(n) (using anonymous-function notation, à la rex), where d is a constant, is in O(g).
Examples:
n ⇒ 2.5*n^2 ∈ O(n ⇒ n^2)
n ⇒ 100000000*n^2 ∈ O(n ⇒ n^2)

Notation Abuse
It is customary and usual to drop the n Þ and just use the body expression of the anonymous function.
Examples:
O(n^2) instead of O(n ⇒ n^2)
O(n) instead of O(n ⇒ n)
O(1) instead of O(n ⇒ 1)

Notation Abuse
Examples:
2.5*n^2 ∈ O(n^2)
100000000*n ∈ O(n)
10000000000 ∈ O(1)

Words for Functions
A function f is called
constant if f ∈ O(1)
logarithmic if f ∈ O(log(n))
linear if f ∈ O(n)
quadratic if f ∈ O(n^2)
cubic if f ∈ O(n^3)
polynomial if f ∈ O(n^k), for some constant k
exponential if f ∈ O(2^p(n)), for some polynomial p(n)
factorial if f ∈ O(n!)

Why Asymptotic Complexity Matters
Running Time as a function of Complexity
Allowable Problem Size
as a Function of Available Time
O Notation
Example algorithms
O(n^2): insertion sort, selection sort
O(n^3): naive matrix multiplication
O(n): linear search, summing an array
O(log(n)): binary search
O(1): array indexing
O(2^n): enumerating all subsets of a set
O(n!): enumerating all permutations (e.g. brute-force traveling salesman)

Rules for O
Additive Rule
f+g ∈ O(max(f, g))
Here f+g means n ⇒ f(n) + g(n),
and max(f, g) means n ⇒ max(f(n), g(n)).
Corollary: If g ∈ O(f), then f+g ∈ O(f).
Example: For polynomials, the highest-order term dominates, e.g.
n^5 + 1000000*n^3 + 10000*n^2 ∈ O(n^5)

Multiplicative Rule
If f ∈ O(g), then h*f ∈ O(h*g),
where h*f means n ⇒ h(n)*f(n).
For example,
n*log(n) ∈ O(n^2)
since
log(n) ∈ O(n)

Transitivity Rule
If f ∈ O(g) and g ∈ O(h),
then f ∈ O(h).

Derivative Rule
If f′ ∈ O(g′),
where ′ denotes the derivative,
then f ∈ O(g).
Example: log(n) ∈ O(n).
This follows from the derivative rule because 1/n ∈ O(1).

Limit Rule
If lim f(n)/g(n) = k as n → ∞,
then:
If k > 0, f ∈ O(g), and g ∈ O(f).
If k = 0, f ∈ O(g), but not conversely.

Tight Bounds
A bound f ∈ O(g) is tight if g ∈ O(f) also.

Example: log(n) ∈ O(n^(1/2)).
This holds provided 1/n ∈ O(n^(-1/2)), according to the derivative rule.
Apply the limit rule:
    lim ((1/n) / n^(-1/2))
 =  lim (1 / n^(1/2))
 =  0
This bound is not tight, since the limit is 0.
Therefore n^(-1/2) ∉ O(1/n).

Black-box vs. White-box Complexity
Black-box: We have a runnable copy of the program, but not its code. We can run it for different sizes of input.
White-box (aka clear-box): We have the code. We can analyze it.
These terms are from the area of software testing.

Black-Box Complexity
Run the program for different sizes of data set; try to get a fix on the growth rate.
What sizes?
An approach is to repeatedly double the input size, until testing becomes infeasible (e.g. it takes too long, or the program breaks).

Doubling Input Size
Black-Box Complexity
Run on sizes 32, 64, 128, 256, 512, ...
For each n, get time T(n).
How can we estimate the order of run-time (e.g. O(n^2), O(n^3), etc.)?

Black-Box Complexity
Suppose we are trying to bolster a hypothesis
  T(n) ∈ O(f(n))
From the definition, we know this means there is a constant c such that
for all n, T(n) < c f(n)
If our hypothesis is correct, we therefore expect for all n,
  T(n) / f(n) < c
We can simply examine this ratio.

Black-Box Complexity
If we see
  T(n) / f(n) < c
then the hypothesis T(n) ∈ O(f(n)) is supported.
If T(n) / f(n) is approximately constant as n increases, then the bound appears to be tight.
If T(n) / f(n) decreases as n increases, the bound is loose.
If T(n) / f(n) increases, we don't have a bound.
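
A sketch of this procedure in Python (the timed function and the sizes are placeholders; substitute the program under test):

import random
import time

def measure(f, n):
    data = [random.random() for _ in range(n)]
    t0 = time.perf_counter()
    f(data)
    return time.perf_counter() - t0

# Hypothesis: T(n) ∈ O(n^2).  Watch the ratio T(n)/n^2.
for n in [1000, 2000, 4000, 8000, 16000]:
    t = measure(sorted, n)     # 'sorted' stands in for the black box
    print(n, t / n**2)

A roughly constant ratio supports a tight bound; here the ratio decreases, since sorted is actually O(n·log(n)), so the O(n^2) bound is loose.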

Use Empirical Analysis to Predict
If T(n) O(f(n)) is supported, we can predict an upper bound for any data set size n using f and knowing the implied constant.
The prediction might or might not be accurate, since we recall that size n abstracts away variations among data sets of that size.

Sorting Programs:

The fruit flies of
algorithm analysis
Examples: Sorting Programs
See turing:/cs/cs60/examples/sorting
Use the unix command:
run <type>
which will try random dataset sizes in the range 64, 128, 256, ..., 65536
type can be one of:
quicksort, heapsort, minsort, radixsort, bucketsort

Result Tabulation from turing runs
(times in ms.)
Example: minsort
White-Box Complexity
Here we examine the code (loop structure, etc.)

Analysis of Typical Loops
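Typical cases, sketched in Python (the loop bodies are hypothetical O(1) stand-ins):

n = 16
def work():            # an O(1) primitive stand-in
    pass

for i in range(n):     # single loop: body runs n times -> O(n)
    work()

for i in range(n):     # nested loops: body runs n*n times -> O(n^2)
    for j in range(n):
        work()

for i in range(n):     # triangular nest: 0+1+...+(n-1) = n(n-1)/2 -> O(n^2)
    for j in range(i):
        work()

i = n
while i > 1:           # halving loop: about log2(n) iterations -> O(log(n))
    work()
    i //= 2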
Analysis of Actual Programs
minsort:
sorting by selecting minima
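A sketch of minsort in Python (an approximation of the idea, not the course's actual code):

def minsort(a):
    # Sort list a in place by repeatedly selecting the minimum
    # of the unsorted suffix and moving it into place.
    n = len(a)
    for i in range(n):
        m = i
        for j in range(i + 1, n):
            if a[j] < a[m]:
                m = j
        a[i], a[m] = a[m], a[i]
    return a

The inner scans make (n-1) + (n-2) + ... + 1 = n(n-1)/2 comparisons, so minsort is O(n^2).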
insertion sort in rex
recurrence for isort
recurrence for insert
Returning to the recurrence for isort
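A Python rendering of the same recursive structure (an approximation, not the rex code), with the recurrences it suggests:

def insert(x, s):
    # Insert x into an already-sorted list s: O(n) worst case.
    if not s or x <= s[0]:
        return [x] + s
    return [s[0]] + insert(x, s[1:])

def isort(a):
    # Insertion sort: sort the tail, then insert the head.
    if not a:
        return []
    return insert(a[0], isort(a[1:]))

With constants c and d:
Tinsert(n) ≤ Tinsert(n-1) + c, so Tinsert(n) ∈ O(n)
Tisort(n) = Tisort(n-1) + Tinsert(n-1) + d ≤ Tisort(n-1) + c·n, so Tisort(n) ∈ O(n^2)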
Is O(n^2) the best we can do for sorting?
Algorithm Speedup Techniques
Divide and Conquer
The digital principle: Use data values to do selection or direct accessing
Try trees instead of linear arrangement
Dynamic programming (later)

Technique #1:
Divide-and-Conquer
Rearrange the array to be sorted:
Low elements on the bottom
High elements on the top
Use some element as the pivot value
Sort the low and high portions recursively
Called quicksort
Invented by C.A.R. (Tony) Hoare
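
A minimal sketch in Python (first element as pivot, auxiliary lists for clarity; production versions partition the array in place and choose pivots more carefully):

def quicksort(a):
    if len(a) <= 1:
        return a
    pivot = a[0]
    low  = [x for x in a[1:] if x <  pivot]   # elements below the pivot
    high = [x for x in a[1:] if x >= pivot]   # elements at or above it
    return quicksort(low) + [pivot] + quicksort(high)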

Tony Hoare
Quicksort Illustrated
Quicksort Analysis
Overall steps = c·n × number of levels.
Optimistic case: O(log(n)) levels, giving O(n·log(n)) overall.
Pessimistic case: O(n) levels, giving O(n^2) overall.
The average case can be shown to be O(n·log(n)) based on a probabilistic argument, assuming the data are initially randomly ordered.

Another Divide-and-Conquer Sort
Mergesort
Sort by successively merging longer and longer sorted sequences.
Useful with linked lists, or large files.
More difficult to program for arrays.
Different versions exist:
Top-Down: Split unordered sequence, vs.
Bottom-Up: Start with sequences of length 1 and create increasingly longer ones.

Bottom-Up Mergesort
Start with sequences of length 1; these are sorted by default.
Merge pairs of sorted sequences to form single sorted sequences,
until there is only one sequence left.

Bottom-Up Mergesort Example
Bottom-Up Mergesort Analysis
Merging two sequences of length n/2 each can be done in O(n) if linked lists are used.
In merge sort of n elements, we merge:
n/2 pairs of sequences of length 1
n/4 pairs of sequences of length 2
n/8 pairs of sequences of length 4
...
1 pair of sequences of length n/2

Bottom-Up Mergesort Analysis
At each level O(n) steps are used.
There are log(n) levels.
Therefore mergesort is O(n log(n)) worst case.

Bottom-Up Mergesort Analysis
Let T(j) = steps used in the first j levels.
Then
T(1) = c·n, c some constant
T(j+1) = T(j) + c·n
T(log(n)) is the time to sort n elements
= c·n·log(n)

Bottom-Up Mergesort in rex
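A bottom-up sketch in Python (an approximation, not the rex code):

def merge(xs, ys):
    # Merge two sorted lists in O(len(xs) + len(ys)) steps.
    out, i, j = [], 0, 0
    while i < len(xs) and j < len(ys):
        if xs[i] <= ys[j]:
            out.append(xs[i]); i += 1
        else:
            out.append(ys[j]); j += 1
    return out + xs[i:] + ys[j:]

def mergesort(a):
    # Start with runs of length 1; merge pairs until one run remains.
    runs = [[x] for x in a]
    while len(runs) > 1:
        paired = []
        for k in range(0, len(runs) - 1, 2):
            paired.append(merge(runs[k], runs[k + 1]))
        if len(runs) % 2 == 1:     # odd run carries over to the next level
            paired.append(runs[-1])
        runs = paired
    return runs[0] if runs else []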
Top-Down Mergesort Example
Alternate ways to split
Top-Down Mergesort Analysis
Tmerge(n) = c·n (time to merge two sorted lists of length n/2)
Tsplit(n) = d·n (time to split a list of length n)
Tsort(1) = 1
Tsort(n) = Tsplit(n) + 2·Tsort(n/2) + Tmerge(n)
         = 2·Tsort(n/2) + e·n
where e = c+d

Top-Down Mergesort Analysis
Tsort(1) = 1
Tsort(n) = 2·Tsort(n/2) + e·n
Substituting:
T(n) = 2·T(n/2) + e·n
     = 2·(2·T(n/4) + e·n/2) + e·n = 4·T(n/4) + 2·e·n
     = 4·(2·T(n/8) + e·n/4) + 2·e·n = 8·T(n/8) + 3·e·n
     = ...
     = 2^log(n)·T(1) + log(n)·e·n
     = n + e·n·log(n)
     ∈ O(n·log(n))

Technique #2:
Using data values to do selection
Under this category, we have
bucket sort
radix sort
We make non-general assumptions about the data:
The size of keys to be sorted is limited to integers with a fixed upper bound.

Bucket Sort
Related to hashing
Both use indexing, which is O(1), to find bucket
Suppose the set of keys to be sorted is known to be limited to a relatively small integer range, say {0, 1, ..., R-1}.
Create an array of size R of lists, each entry corresponding to a key value.
Go through the data once, putting each element in the corresponding list.
Concatenate the resulting lists.
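
A sketch in Python (assuming keys are integers in {0, 1, ..., R-1}):

def bucket_sort(keys, R):
    buckets = [[] for _ in range(R)]   # one list per possible key value
    for k in keys:                     # one O(1) indexing step per element
        buckets[k].append(k)
    out = []
    for b in buckets:                  # concatenate the buckets in order
        out.extend(b)
    return out

# bucket_sort([3, 1, 4, 1, 5], 10)  ->  [1, 1, 3, 4, 5]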

Analysis of Bucket Sort
Assume that the number of data elements is comparable to, or larger than, the number of buckets.
Go through the data once, putting each element in the corresponding list.
This is O(n).
Concatenate the resulting lists.
This is also O(n).
Therefore we have O(n) overall.
Remember that bucket sorting makes special assumptions about the data.

Radix Sort
Like bucket sort, but with smaller arrays.
Trades array size for multiple passes.
Represent the keys as radix-b integers.
Make P = logb(R) passes.
Sort on the least significant digit first, progressing toward the most significant.
Re-collect the data after each pass, then redistribute on the next.
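
A sketch in Python (radix and pass count are parameters; assumes non-negative integer keys below base**passes):

def radix_sort(keys, base=10, passes=4):
    # One stable bucket pass per digit, least significant digit first.
    for p in range(passes):
        buckets = [[] for _ in range(base)]
        for k in keys:
            digit = (k // base**p) % base
            buckets[digit].append(k)
        keys = [k for b in buckets for k in b]   # re-collect, preserving order
    return keys

# radix_sort([170, 45, 75, 90, 802, 24, 2, 66])
#   ->  [2, 24, 45, 66, 75, 90, 170, 802]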

Analysis of Radix Sort
Assuming a bounded range, the number of passes P is a fixed constant.
Each pass uses O(n).
Therefore we have O(n) overall.

Demo of Radix Sort
Technique #3:
Use a Tree instead of Linear List
For the same amount of data, a sufficiently balanced tree uses only O(log n) to traverse a chain, rather than O(n) (as in naïve bubble, selection, or insertion sort).
Heapsort is a form of sorting that uses trees.

Heapsort
Loosely based on the Peter Principle (Dr. Laurence J. Peter)

Laurence J. Peter
The Peter Principle
In a hierarchy, everyone rises to his/her level of incompetence.
[A person having reached this level is said to have the final placement syndrome.]

A Hierarchy
The Peter Principle
applied to Sorting
A heap is a binary tree structure in which, for each sub-tree, the root is ≥ all elements in that sub-tree.
It suffices for each root to be ≥ its children.
[This is one definition of heap used in CS. The other definition is the area in memory from which storage is allocated dynamically, e.g. when new is called.]

A Heap
A Heap?
The Two Phases in Heapsort
Phase I: Form the data into a heap
Phase II: Transform the heap into a linear sequence

Phase I: Forming a Heap
This is the actual Peter Principle.
Start with the data distributed randomly in the tree.
Form sub-heaps beginning at the leaves and working toward the root.
Successively combine two sub-heaps with a common root into a single heap.

Combination as a Tournament
Analysis of forming a heap from two sub-heaps
3-way rounds are played from the parent downward to the leaves
Only one path toward the leaves is followed, since other sub-trees along the path don't change
A 3-way round uses O(1) steps (2 comparisons, possible exchange)
The time to form a heap at level k from the leaves is therefore O(k).
In the worst case, this is O(log(n)).

Iterating sub-heap formation from leaves to root
The leaves are already heaps by themselves.
We need to play the tournament (O(k) rounds at level k) n/2 times, corresponding to the non-leaves.
Coarsely, Phase I is O(n log(n)) overall.
Is this bound tight?

Illustrating Phase I
After Level 1 Heap Formation
Level 2 Heap Formation
After Level 2 Heap Formation
Level 3 Heap Formation
After Level 3 Heap Formation
Heapsort Phase II
Now that we have a heap, what do we do?

Heapsort Phase II
One antidote for final-placement syndrome is that people retire.
In our heap, we retire the maximum, which is guaranteed to be at the root.
This leaves a hole that needs to be filled.

Retiring the maximum
Filling the hole
The way this happens is different from in a corporation.
We pick the rightmost leaf, and tentatively place it in the hole at the root.
Then we play the tournament from the root so that the new value reaches its proper level.

Filling the hole
Result of the tournament
The cost of keeping
the heap happy
The tournament path could be as long as the path from the root to a leaf.
Along the path, a number of O(1) rounds were played.
The cost for one post-retirement adjustment is therefore O(log(n)).
Retiring all n elements in sequence gives us the sorted order.
Overall then, we have O(n log(n)) for Phase II.

The Rest of Phase II Illustrated (steps 2 through 15)
Conclusion of Phase II
Where to put the retirees?
We can put the retirees back in the tree, as long as we know not to play any more tournaments with them.
There is an easy way to keep track of this, as we shall see.

Phase II with Retiree Placement
Node Numbering
Note that the sorted sequence is readable from the nodes in breadth-first order.
Further, we can maintain the entire tree as an array (with no explicit links), as shown next.

Array Node Numbering Permits Read-out of Sorted Data
Use as a Tree: Finding your parents/children
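With the root stored at index 0, the links are implicit in the numbering: parent(i) = floor((i-1)/2), and the children of node i sit at indices 2i+1 and 2i+2. (With 1-based numbering: parent(i) = floor(i/2), children at 2i and 2i+1.)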
Heapsort Code
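A compact in-place sketch in Python (an approximation, not the course's code; it uses the 0-based index formulas above):

def sift_down(a, root, end):
    # Play the tournament from 'root' downward within a[:end].
    while True:
        child = 2 * root + 1
        if child >= end:
            return
        if child + 1 < end and a[child + 1] > a[child]:
            child += 1                     # pick the larger child
        if a[root] >= a[child]:
            return                         # heap property restored
        a[root], a[child] = a[child], a[root]
        root = child

def heapsort(a):
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):    # Phase I: build the heap
        sift_down(a, i, n)
    for end in range(n - 1, 0, -1):        # Phase II: retire maxima
        a[0], a[end] = a[end], a[0]        # retiree goes to the freed slot
        sift_down(a, 0, end)
    return a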
Heapsort Summary
O(n log(n)) steps
In-place array implementation

Priority Queue
A priority queue is a data repository that has two methods:
insert
removeMax (or removeMin, depending on the version)
A heap is a natural implementation of a priority queue:
Both insertions and deletions are O(log n)

Priority Queue Applications
A priority queue (with min removal) is often found in simulation applications.
Events are time-stamped and put in a priority queue, which orders them by smallest time first.
On a typical simulation cycle:
The event with the next timestamp is removed.
The event may cause the insertion of new events with later timestamps.
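
A sketch of one such loop using Python's heapq module (a min-heap); the events themselves are hypothetical:

import heapq

events = []                                # the priority queue
heapq.heappush(events, (0.0, "start"))     # (timestamp, event)

while events:
    t, ev = heapq.heappop(events)          # event with the smallest timestamp
    print(t, ev)
    if ev == "start":                      # handling an event may schedule
        heapq.heappush(events, (2.5, "arrival"))       # later events
        heapq.heappush(events, (1.0, "departure"))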

Lower-Bound for Sorting
Sorting based on pairwise comparisons (which excludes radix and bucket sorts) requires a minimum of c·n·log2(n) steps.
Sorting methods that achieve this lower bound are called optimal.
Heapsort and mergesort are the two optimal sorts we've studied.

Derivation of the
Sorting Lower-Bound
A sorting algorithm effectively determines which of the n! permutations the data are originally in, and, through a series of comparisons and exchanges, permutes the data to a fixed order.
This can be viewed as a decision tree with n! leaves, whose internal nodes are binary comparisons.

Derivation of the
Sorting Lower-Bound
The worst-case run time of the sorting algorithm is the height of a tree having n! leaves.
The minimum height of such a binary tree is log2(n!).
By Stirling's formula, n! ~ (2πn)^(1/2) · (n/e)^n,
so
log2(n!) grows as n·log(n), to within a constant factor.
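A cruder estimate gives the same growth rate without Stirling: log2(n!) = log2(1) + ... + log2(n) ≤ n·log2(n), while the largest n/2 terms are each at least log2(n/2), so log2(n!) ≥ (n/2)·log2(n/2).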

Sorting Summary
Bucket sort / Radix sort
fastest asymptotic performance
special assumptions about data
Heap sort
optimal performance
sorts arrays in-place
Merge sort
optimal performance
sorts lists
more difficult for arrays