Computational Complexity
Solving Computational Problems
First, get it right.
Then, make it faster.
Avoid optimizing prematurely.

Topics
Algorithm analysis
Fast algorithm synthesis
Empirical measurement of complexity

Complexity in the algorithm-analysis context
means the cost of a program's execution
(in running time, memory, ...)
rather than the cost of creating the program
(in # of statements, development time, ...).
In this context, a less-complex program may require more development time.

Functions associated with a program
Consider a program with one natural number as input (for simplicity).
Two functions that can be associated with the program are:
f(n), the function computed by the program
T(n), the running time of the program

Running time T(n)
Possible size measures n
for T(n)
Total number of bits used to encode the input.
Number of data values in the input (e.g. size of an array)

The second is viewed as an approximation to the first.

Primitive Operations
These are operations which we don't further decompose in our analysis.
They are considered the fundamental building blocks of the algorithm, e.g.
+   *   -   /   if( )
Typically, the time they take is assumed to be constant.
Caution: This assumption is not always valid, and may need to be revisited.

Is multiplication really primitive?
In doing arithmetic on arbitrarily-large numbers, the size of the numerals representing those numbers may have a definite effect on the time required.
 2 x 2
vs.
 26378491562329846 x 786431258901237
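
One can watch this effect directly. A quick Python sketch (the digit counts are arbitrary; Python integers are arbitrary-precision):

import time

for d in [1000, 10000, 100000, 1000000]:
    a = 10**d - 1                      # a d-digit number (all 9s)
    b = 10**d - 3
    t0 = time.perf_counter()
    p = a * b                          # one multiplication of two d-digit numerals
    t1 = time.perf_counter()
    print(d, "digits:", t1 - t0, "seconds")

The time per multiplication grows with d, so treating * as a single constant-time step is only valid when operands fit in a machine word.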

Size of Numerals
For a single number n as input, the size of the corresponding numeral is typically on the order of log(n)
e.g. decimal numeral encoding:
size(n) = # of digits of n ≈ ⌈log10(n)⌉
⌈x⌉ = smallest integer ≥ x
(called the ceiling of x)

Does Numeral Radix Matter?
Not much, as long as it's > 1.
r-ary numeral encoding:
size(n) = # of digits of n ≈ ⌈logr(n)⌉
logr(n) vs. logs(n)?
All logs differ by only a constant multiple:
logr(n) = logr(s) · logs(n)
logr(s) is a constant, not a function of n.

log2 is the norm
log10(n) = log10(2) · log2(n) ≈ 0.301 · log2(n)
log2(n) ≈ 3.32 · log10(n)

Asymptotic Analysis
Typically in measuring complexity, we look at the growth-rate of the time function as size increases without bound, rather than the value itself.
There is therefore a tendency to pay less attention to constant factors in the run-time function.
This is called asymptotic analysis.
It is only one part of the story; sometimes constants are important.

Step Counting
Using exact running time to measure an algorithm requires calibration based on the type of machine, clock rate, etc.
Instead, we usually just count steps taken in the algorithm.
Often we will assume primitives take one step each.
This is usually enough to give us an accurate view of the growth rate of running time.
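
For instance, a minimal step-counting sketch in Python (the function and counter are illustrative, not from the course):

def sum_array(a):
    steps = 0
    total = 0
    for x in a:        # the loop body runs len(a) times
        total += x     # count one step per primitive addition
        steps += 1
    return total, steps

print(sum_array(list(range(10)))[1])   # 10 steps
print(sum_array(list(range(20)))[1])   # 20 steps

Doubling the input size doubles the step count: T(n) grows linearly.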

Straight-Line Code
Loop Code
Non-Numeric Loop Control
Recursive Code
Recurrence Formulas
represent the running time of recursive code
Solving Recurrence Formulas
One Method: Repeated Substitution
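For example, repeated substitution on a simple recurrence (a generic illustration, not necessarily the slides' example):
T(1) = c
T(n) = T(n-1) + c
Substituting repeatedly:
T(n) = T(n-1) + c = T(n-2) + 2c = ... = T(1) + (n-1)·c = c·n
so T ∈ O(n).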
Solving Recurrence Formulas
Another Example
Try These
Gauss's 3rd-Grade Technique
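For reference, Gauss's trick: pairing the first and last terms of 1 + 2 + ... + n gives n/2 pairs, each summing to n+1, so the total is n(n+1)/2. This gives closed forms for the arithmetic sums that arise in loop analyses, e.g. 1 + 2 + ... + (n-1) ∈ O(n^2).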
Approximate Solutions
Rather than solve a recurrence exactly, it is often simpler, yet serves the same purpose, to get an approximate solution.
More specifically, we'd like a tight upper bound on the solution, that is, an approximation that differs from the actual solution by at most a constant multiple.

O Notation
O is the letter Oh (for order)
Used to express upper bounds on running time (number of steps)
T ∈ O(f) means that
(∃c)(∀n) T(n) < c·f(n)
Think of O(f) as the set of functions that grow no faster than a constant times the value of f.
This is big-Oh; little-oh has a different meaning.

O Notation
Multiplicative constants are irrelevant in O comparisons:
If g is a function, then any function n ⇒ d*g(n) (using anonymous-function notation, à la rex), where d is a constant, is in O(g).
Examples:
n ⇒ 2.5*n^2 ∈ O(n ⇒ n^2)
n ⇒ 100000000*n^2 ∈ O(n ⇒ n^2)

Notation Abuse
It is customary and usual to drop the n Þ and just use the body expression of the anonymous function.
Examples:
O(n^2) instead of O(n ⇒ n^2)
O(n) instead of O(n ⇒ n)
O(1) instead of O(n ⇒ 1)

Notation Abuse
Examples:
2.5*n^2 ∈ O(n^2)
100000000*n ∈ O(n)
10000000000 ∈ O(1)

Words for Functions
A function f is called
constant if f ∈ O(1)
logarithmic if f ∈ O(log(n))
linear if f ∈ O(n)
quadratic if f ∈ O(n^2)
cubic if f ∈ O(n^3)
polynomial if f ∈ O(n^k), for some constant k
exponential if f ∈ O(2^p(n)), for some polynomial p(n)
factorial if f ∈ O(n!)

Why Asymptotic Complexity Matters
Running Time as a function of Complexity
Allowable Problem Size
as a Function of Available Time
O Notation
Example algorithms
O(n^2): insertion sort, selection sort
O(n^3): naive matrix multiplication
O(n): linear search, summing an array
O(log(n)): binary search
O(1): array indexing
O(2^n): enumerating all subsets of a set
O(n!): enumerating all permutations (e.g. brute-force traveling salesman)

Rules for O
Additive Rule
f+g ∈ O(max(f, g))
Here f+g means n ⇒ f(n) + g(n),
and max(f, g) means n ⇒ max(f(n), g(n)).
Corollary: If g ∈ O(f), then f+g ∈ O(f).
Example: For polynomials, the highest-order term dominates, e.g.
n^5 + 1000000*n^3 + 10000*n^2 ∈ O(n^5)

Multiplicative Rule
If f ∈ O(g), then h*f ∈ O(h*g),
where h*f means n ⇒ h(n)*f(n).
For example,
n*log(n) ∈ O(n^2)
since
log(n) ∈ O(n)

Transitivity Rule
If f ∈ O(g) and g ∈ O(h),
then f ∈ O(h).

Derivative Rule
If f′ ∈ O(g′),
where ′ denotes the derivative,
then f ∈ O(g).
Example: log(n) ∈ O(n).
This follows from the derivative rule because 1/n ∈ O(1).

Limit Rule
If lim f(n)/g(n) = k as n → ∞,
then:
If k > 0, f ∈ O(g), and g ∈ O(f).
If k = 0, f ∈ O(g), but not conversely.

Tight Bounds
A bound f ∈ O(g) is tight if g ∈ O(f) also.

Example: log(n) ∈ O(n^(1/2)).
This holds provided 1/n ∈ O(n^(-1/2)), according to the derivative rule.
Apply the limit rule:
    lim ((1/n) / n^(-1/2))
 =  lim (1 / n^(1/2))
 =  0
This bound is not tight, since the limit is 0.
Therefore n^(-1/2) ∉ O(1/n).

Black-box vs. White-box Complexity
Black-box: We have a runnable copy of the program, but not its code. We can run it for different sizes of input.
White-box (aka clear-box): We have the code. We can analyze it.
These terms are from the area of software testing.

Black-Box Complexity
Run the program for different sizes of data set; try to get a fix on the growth rate.
What sizes?
An approach is to repeatedly double the input size, until testing becomes infeasible (e.g. it takes too long, or the program breaks).

Doubling Input Size
Black-Box Complexity
Run on sizes 32, 64, 128, 256, 512, ...
For each n, get time T(n).
How can we estimate the order of run-time (e.g. O(n^2), O(n^3), etc.)?

Black-Box Complexity
Suppose we are trying to bolster a hypothesis
  T(n) ∈ O(f(n))
From the definition, we know this means there is a constant c such that
for all n, T(n) < c f(n)
If our hypothesis is correct, we therefore expect for all n,
  T(n) / f(n) < c
We can simply examine this ratio.

Black-Box Complexity
If we see
  T(n) / f(n) < c
then the hypothesis T(n) ∈ O(f(n)) is supported.
If T(n) / f(n) is approximately constant as n increases, then the bound appears to be tight.
If T(n) / f(n) decreases as n increases, the bound is loose.
If T(n) / f(n) increases, we don't have a bound.
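
A sketch of this procedure in Python (the timed function and the sizes are placeholders; substitute the program under test):

import random
import time

def measure(f, n):
    data = [random.random() for _ in range(n)]
    t0 = time.perf_counter()
    f(data)
    return time.perf_counter() - t0

# Hypothesis: T(n) ∈ O(n^2).  Watch the ratio T(n)/n^2.
for n in [1000, 2000, 4000, 8000, 16000]:
    t = measure(sorted, n)     # 'sorted' stands in for the black box
    print(n, t / n**2)

A roughly constant ratio supports a tight bound; here the ratio decreases, since sorted is actually O(n·log(n)), so the O(n^2) bound is loose.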

Use Empirical Analysis to Predict
If T(n) O(f(n)) is supported, we can predict an upper bound for any data set size n using f and knowing the implied constant.
The prediction might or might not be accurate, since we recall that size n abstracts away variations among data sets of that size.

Sorting Programs:

The fruit flies of
algorithm analysis
Examples: Sorting Programs
See turing:/cs/cs60/examples/sorting
Use the unix command:
run <type>
which will try random dataset sizes in the range 64, 128, 256, ..., 65536
type can be one of:
quicksort, heapsort, minsort, radixsort, bucketsort

Result Tabulation from turing runs
(times in ms.)
Example: minsort
White-Box Complexity
Here we examine the code (loop structure, etc.)

Analysis of Typical Loops
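Typical cases, sketched in Python (the loop bodies are hypothetical O(1) stand-ins):

n = 16
def work():            # an O(1) primitive stand-in
    pass

for i in range(n):     # single loop: body runs n times -> O(n)
    work()

for i in range(n):     # nested loops: body runs n*n times -> O(n^2)
    for j in range(n):
        work()

for i in range(n):     # triangular nest: 0+1+...+(n-1) = n(n-1)/2 -> O(n^2)
    for j in range(i):
        work()

i = n
while i > 1:           # halving loop: about log2(n) iterations -> O(log(n))
    work()
    i //= 2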
Analysis of Actual Programs
minsort:
sorting by selecting minima
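A sketch of minsort in Python (an approximation of the idea, not the course's actual code):

def minsort(a):
    # Sort list a in place by repeatedly selecting the minimum
    # of the unsorted suffix and moving it into place.
    n = len(a)
    for i in range(n):
        m = i
        for j in range(i + 1, n):
            if a[j] < a[m]:
                m = j
        a[i], a[m] = a[m], a[i]
    return a

The inner scans make (n-1) + (n-2) + ... + 1 = n(n-1)/2 comparisons, so minsort is O(n^2).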
insertion sort in rex
recurrence for isort
recurrence for insert
Returning to the recurrence for isort
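A Python rendering of the same recursive structure (an approximation, not the rex code), with the recurrences it suggests:

def insert(x, s):
    # Insert x into an already-sorted list s: O(n) worst case.
    if not s or x <= s[0]:
        return [x] + s
    return [s[0]] + insert(x, s[1:])

def isort(a):
    # Insertion sort: sort the tail, then insert the head.
    if not a:
        return []
    return insert(a[0], isort(a[1:]))

With constants c and d:
Tinsert(n) ≤ Tinsert(n-1) + c, so Tinsert(n) ∈ O(n)
Tisort(n) = Tisort(n-1) + Tinsert(n-1) + d ≤ Tisort(n-1) + c·n, so Tisort(n) ∈ O(n^2)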
Is O(n^2) the best we can do for sorting?
Algorithm Speedup Techniques
Divide and Conquer
The digital principle: Use data values to do selection or direct accessing
Try trees instead of linear arrangement
Dynamic programming (later)

Technique #1:
Divide-and-Conquer
Rearrange the array to be sorted:
Low elements on the bottom
High elements on the top
Use some element as the pivot value
Sort the low and high portions recursively
Called quicksort
Invented by C.A.R. (Tony) Hoare
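
A minimal sketch in Python (first element as pivot, auxiliary lists for clarity; production versions partition the array in place and choose pivots more carefully):

def quicksort(a):
    if len(a) <= 1:
        return a
    pivot = a[0]
    low  = [x for x in a[1:] if x <  pivot]   # elements below the pivot
    high = [x for x in a[1:] if x >= pivot]   # elements at or above it
    return quicksort(low) + [pivot] + quicksort(high)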

Tony Hoare
Quicksort Illustrated
Quicksort Analysis
Overall steps = c·n × number of levels.
Optimistic case: O(log(n)) levels, giving O(n·log(n)) overall.
Pessimistic case: O(n) levels, giving O(n^2) overall.
The average case can be shown to be O(n·log(n)) based on a probabilistic argument, assuming the data are initially randomly ordered.

Another Divide-and-Conquer Sort
Mergesort
Sort by successively merging longer and longer sorted sequences.
Useful with linked lists, or large files.
More difficult to program for arrays.
Different versions exist:
Top-Down: Split unordered sequence, vs.
Bottom-Up: Start with sequences of length 1 and create increasingly longer ones.

Bottom-Up Mergesort
Start with sequences of length 1; these are sorted by default.
Merge pairs of sorted sequences to form single sorted sequences,
until there is only one sequence left.

Bottom-Up Mergesort Example
Bottom-Up Mergesort Analysis
Merging two sequences of length n/2 each can be done in O(n) if linked lists are used.
In merge sort of n elements, we merge:
n/2 pairs of sequences of length 1
n/4 pairs of sequences of length 2
n/8 pairs of sequences of length 4
...
1 pair of sequences of length n/2

Bottom-Up Mergesort Analysis
At each level O(n) steps are used.
There are log(n) levels.
Therefore mergesort is O(n log(n)) worst case.

Bottom-Up Mergesort Analysis
Let T(j) = steps used in the first j levels.
Then
T(1) = c·n, c some constant
T(j+1) = T(j) + c·n
T(log(n)) is the time to sort n elements
= c·n·log(n)

Bottom-Up Mergesort in rex
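A bottom-up sketch in Python (an approximation, not the rex code):

def merge(xs, ys):
    # Merge two sorted lists in O(len(xs) + len(ys)) steps.
    out, i, j = [], 0, 0
    while i < len(xs) and j < len(ys):
        if xs[i] <= ys[j]:
            out.append(xs[i]); i += 1
        else:
            out.append(ys[j]); j += 1
    return out + xs[i:] + ys[j:]

def mergesort(a):
    # Start with runs of length 1; merge pairs until one run remains.
    runs = [[x] for x in a]
    while len(runs) > 1:
        paired = []
        for k in range(0, len(runs) - 1, 2):
            paired.append(merge(runs[k], runs[k + 1]))
        if len(runs) % 2 == 1:     # odd run carries over to the next level
            paired.append(runs[-1])
        runs = paired
    return runs[0] if runs else []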
Top-Down Mergesort Example
Alternate ways to split
Top-Down Mergesort Analysis
Tmerge(n) = c·n (time to merge two sorted lists of length n/2)
Tsplit(n) = d·n (time to split a list of length n)
Tsort(1) = 1
Tsort(n) = Tsplit(n) + 2·Tsort(n/2) + Tmerge(n)
         = 2·Tsort(n/2) + e·n
where e = c+d

Top-Down Mergesort Analysis
Tsort(1) = 1
Tsort(n) = 2·Tsort(n/2) + e·n
Substituting:
T(n) = 2·T(n/2) + e·n
     = 2·(2·T(n/4) + e·n/2) + e·n = 4·T(n/4) + 2·e·n
     = 4·(2·T(n/8) + e·n/4) + 2·e·n = 8·T(n/8) + 3·e·n
     = ...
     = 2^log(n)·T(1) + log(n)·e·n
     = n + e·n·log(n)
     ∈ O(n·log(n))

Technique #2:
Using data values to do selection
Under this category, we have
bucket sort
radix sort
We make non-general assumptions about the data:
The size of keys to be sorted is limited to integers with a fixed upper bound.

Bucket Sort
Related to hashing
Both use indexing, which is O(1), to find bucket
Suppose the set of keys to be sorted is known to be limited to a relatively small integer range, say {0, 1, ..., R-1}.
Create an array of size R of lists, each entry corresponding to a key value.
Go through the data once, putting each element in the corresponding list.
Concatenate the resulting lists.
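
A sketch in Python (assuming keys are integers in {0, 1, ..., R-1}):

def bucket_sort(keys, R):
    buckets = [[] for _ in range(R)]   # one list per possible key value
    for k in keys:                     # one O(1) indexing step per element
        buckets[k].append(k)
    out = []
    for b in buckets:                  # concatenate the buckets in order
        out.extend(b)
    return out

# bucket_sort([3, 1, 4, 1, 5], 10)  ->  [1, 1, 3, 4, 5]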

Analysis of Bucket Sort
Assume that the number of data elements is comparable to, or larger than, the number of buckets.
Go through the data once, putting each element in the corresponding list.
This is O(n).
Concatenate the resulting lists.
This is also O(n).
Therefore we have O(n) overall.
Remember that bucket sorting makes special assumptions about the data.

Radix Sort
Like bucket sort, but with smaller arrays.
Trades array size for multiple passes.
Represent the keys as radix-b integers.
Make P = logb(R) passes.
Sort on the least significant digit first, progressing toward the most significant.
Re-collect the data after each pass, then redistribute on the next.
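
A sketch in Python (radix and pass count are parameters; assumes non-negative integer keys below base**passes):

def radix_sort(keys, base=10, passes=4):
    # One stable bucket pass per digit, least significant digit first.
    for p in range(passes):
        buckets = [[] for _ in range(base)]
        for k in keys:
            digit = (k // base**p) % base
            buckets[digit].append(k)
        keys = [k for b in buckets for k in b]   # re-collect, preserving order
    return keys

# radix_sort([170, 45, 75, 90, 802, 24, 2, 66])
#   ->  [2, 24, 45, 66, 75, 90, 170, 802]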

Analysis of Radix Sort
Assuming a bounded range, the number of passes P is a fixed constant.
Each pass uses O(n).
Therefore we have O(n) overall.

Demo of Radix Sort
Technique #3:
Use a Tree instead of Linear List
For the same amount of data, a sufficiently balanced tree uses only O(log n) to traverse a chain, rather than O(n) (as in naïve bubble, selection, or insertion sort).
Heapsort is a form of sorting that uses trees.

Heapsort
Loosely based on the Peter Principle (Dr. Laurence J. Peter)

Laurence J. Peter
The Peter Principle
In a hierarchy, everyone rises to his/her level of incompetence.
[A person having reached this level is said to have the final placement syndrome.]

A Hierarchy
The Peter Principle
applied to Sorting
A heap is a binary tree structure in which, for each sub-tree, the root is ≥ all elements in that sub-tree.
It suffices for each root to be ≥ its children.
[This is one definition of heap used in CS. The other definition is the area in memory from which storage is allocated dynamically, e.g. when new is called.]

A Heap
A Heap?
The Two Phases in Heapsort
Phase I: Form the data into a heap
Phase II: Transform the heap into a linear sequence

Phase I: Forming a Heap
This is the actual Peter Principle.
Start with the data distributed randomly in the tree.
Form sub-heaps beginning at the leaves and working toward the root.
Successively combine two sub-heaps with a common root into a single heap.

Combination as a Tournament
Analysis of forming a heap from two sub-heaps
3-way rounds are played from the parent downward to the leaves
Only one path toward the leaves is followed, since other sub-trees along the path don't change
A 3-way round uses O(1) steps (2 comparisons, possible exchange)
The time to form a heap at level k from the leaves is therefore O(k).
In the worst case, this is O(log(n)).

Iterating sub-heap formation from leaves to root
The leaves are already heaps by themselves.
We need to play the tournament (O(k) rounds at level k) n/2 times, corresponding to the non-leaves.
Coarsely, Phase I is O(n log(n)) overall.
Is this bound tight?

Illustrating Phase I
After Level 1 Heap Formation
Level 2 Heap Formation
After Level 2 Heap Formation
Level 3 Heap Formation
After Level 3 Heap Formation
Heapsort Phase II
Now that we have a heap, what do we do?

Heapsort Phase II
One antidote for final-placement syndrome is that people retire.
In our heap, we retire the maximum, which is guaranteed to be at the root.
This leaves a hole that needs to be filled.

Retiring the maximum
Filling the hole
The way this happens is different from in a corporation.
We pick the rightmost leaf, and tentatively place it in the hole at the root.
Then we play the tournament from the root so that the new value reaches its proper level.

Filling the hole
Result of the tournament
The cost of keeping
the heap happy
The tournament path could be as long as the path from the root to a leaf.
Along the path, a number of O(1) rounds were played.
The cost for one post-retirement adjustment is therefore O(log(n)).
Retiring all n elements in sequence gives us the sorted order.
Overall then, we have O(n log(n)) for Phase II.

The Rest of Phase II Illustrated (steps 2 through 15)
Conclusion of Phase II
Where to put the retirees?
We can put the retirees back in the tree, as long as we know not to play any more tournaments with them.
There is an easy way to keep track of this, as we shall see.

Phase II with Retiree Placement
Node Numbering
Note that the sorted sequence is readable from the nodes in breadth-first order.
Further, we can maintain the entire tree as an array (with no explicit links), as shown next.

Array Node Numbering Permits Read-out of Sorted Data
Use as a Tree: Finding your parents/children
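With the root stored at index 0, the links are implicit in the numbering: parent(i) = floor((i-1)/2), and the children of node i sit at indices 2i+1 and 2i+2. (With 1-based numbering: parent(i) = floor(i/2), children at 2i and 2i+1.)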
Heapsort Code
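A compact in-place sketch in Python (an approximation, not the course's code; it uses the 0-based index formulas above):

def sift_down(a, root, end):
    # Play the tournament from 'root' downward within a[:end].
    while True:
        child = 2 * root + 1
        if child >= end:
            return
        if child + 1 < end and a[child + 1] > a[child]:
            child += 1                     # pick the larger child
        if a[root] >= a[child]:
            return                         # heap property restored
        a[root], a[child] = a[child], a[root]
        root = child

def heapsort(a):
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):    # Phase I: build the heap
        sift_down(a, i, n)
    for end in range(n - 1, 0, -1):        # Phase II: retire maxima
        a[0], a[end] = a[end], a[0]        # retiree goes to the freed slot
        sift_down(a, 0, end)
    return a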
Heapsort Summary
O(n log(n)) steps
In-place array implementation

Priority Queue
A priority queue is a data repository that has two methods:
insert
removeMax (or removeMin, depending on the version)
A heap is a natural implementation of a priority queue:
Both insertions and deletions are O(log n)

Priority Queue Applications
A priority queue (with min removal) is often found in simulation applications.
Events are time-stamped and put in a priority queue, which orders them by smallest time first.
On a typical simulation cycle:
The event with the next timestamp is removed.
The event may cause the insertion of new events with later timestamps.
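
A sketch of one such loop using Python's heapq module (a min-heap); the events themselves are hypothetical:

import heapq

events = []                                # the priority queue
heapq.heappush(events, (0.0, "start"))     # (timestamp, event)

while events:
    t, ev = heapq.heappop(events)          # event with the smallest timestamp
    print(t, ev)
    if ev == "start":                      # handling an event may schedule
        heapq.heappush(events, (2.5, "arrival"))       # later events
        heapq.heappush(events, (1.0, "departure"))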

Lower-Bound for Sorting
Sorting based on pairwise comparisons (which excludes radix and bucket sorts) requires a minimum of c·n·log2(n) steps.
Sorting methods that achieve this lower bound are called optimal.
Heapsort and mergesort are the two optimal sorts we've studied.

Derivation of the
Sorting Lower-Bound
A sorting algorithm effectively determines which of the n! permutations the data are originally in, and, through a series of comparisons and exchanges, permutes the data to a fixed order.
This can be viewed as a decision tree with n! leaves, whose internal nodes are binary comparisons.

Derivation of the
Sorting Lower-Bound
The worst-case run time of the sorting algorithm is the height of a tree having n! leaves.
The minimum height of such a binary tree is log2(n!).
By Stirling's formula, n! ~ (2πn)^(1/2) · (n/e)^n,
so
log2(n!) grows as n·log(n), to within a constant factor.
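A cruder estimate gives the same growth rate without Stirling: log2(n!) = log2(1) + ... + log2(n) ≤ n·log2(n), while the largest n/2 terms are each at least log2(n/2), so log2(n!) ≥ (n/2)·log2(n/2).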

Sorting Summary
Bucket sort / Radix sort
fastest asymptotic performance
special assumptions about data
Heap sort
optimal performance
sorts arrays in-place
Merge sort
optimal performance
sorts lists
more difficult for arrays