| Computational Complexity |
| Solving Computational Problems |
| First, get it right. | |
| Then, make it faster. | |
| Avoid optimizing prematurely. |
| Topics |
| Algorithm analysis | |
| Fast algorithm synthesis | |
| Empirical measurement of complexity |
| Complexity in the algorithm-analysis context |
| means the cost of a program's execution | ||
| (in running time, memory, ...) | ||
| rather than | ||
| the cost of creating the program | ||
| (in # of statements, development time, ...) | ||
| In this context, less-complex programs may require more development time. | ||
| Functions associated with a program |
| Consider a program with one natural number as input (for simplicity). | ||
| Two functions that can be associated with the program are: | ||
| f(n), the function computed by the program | ||
| T(n), the running time of the program | ||
| Running time T(n) |
| Possible size measures n for T(n) |
| Total number of bits used to encode the input. | |
| Number of data values in the input (e.g. size of an array) |
| The second is viewed as an approximation to the first. |
| Primitive Operations |
| These are operations which we don't further decompose in our analysis. | ||
| They are considered the fundamental building blocks of the algorithm, e.g. +, *, -, /, if( ) |
| Typically, the time they take is assumed to be constant. | ||
| Caution: This assumption is not always valid, and may need to be revisited. | ||
| Is multiplication really primitive? |
| In doing arithmetic on arbitrarily-large numbers, the size of the numerals representing those numbers may have a definite effect on the time required. | ||
| 2 × 2 vs. 26378491562329846 × 786431258901237 |
| Size of Numerals |
| For a single number n as input, the size of the corresponding numeral is typically on the order of log(n) |
| e.g. decimal numeral encoding: |
| size(n) = #digits of n = ⌈log_10(n+1)⌉ |
| ⌈x⌉ = smallest integer ≥ x (called the ceiling of x) |
| Does Numeral Radix Matter? |
| Not much, as long as it is greater than 1. |
| r-ary numeral encoding: | ||
| size(n) = #digits of n = ⌈log_r(n+1)⌉ |
| log_r(n) vs. log_s(n)? |
| All logs differ by only a constant multiple: |
| log_r(n) = log_r(s) · log_s(n) |
| log_r(s) is a constant, not a function of n. |
| log_2 is the norm |
| log_10(n) = log_10(2) · log_2(n) ≈ 0.301 · log_2(n) |
| log_2(n) ≈ 3.32 · log_10(n) |
| Asymptotic Analysis |
| Typically in measuring complexity, we look at the growth-rate of the time function as size increases without bound, rather than the value itself. | |
| There is therefore a tendency to pay less attention to constant factors in the run-time function. | |
| This is called asymptotic analysis. | |
| It is only one part of the story; sometimes constants are important. |
| Step Counting |
| Using exact running time to measure an algorithm requires calibration based on the type of machine, clock rate, etc. | |
| Instead, we usually just count steps taken in the algorithm. | |
| Often we will assume primitives take one step each. | |
| This is usually enough to give us an accurate view of the growth rate of running time. |
| Straight-Line Code |
| Loop Code |
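The bodies of these example slides are not reproduced in this text version. As a hypothetical illustration of step counting for loop code (not the slides' own example), the sketch below counts each primitive operation in a doubly-nested loop as one step:

```python
# Hypothetical illustration: counting steps in a doubly-nested loop,
# treating each inner-body operation as one primitive step.
def count_steps(n):
    steps = 0
    total = 0
    for i in range(n):          # outer loop runs n times
        for j in range(n):      # inner loop runs n times per outer pass
            total += i * j      # one "primitive" step
            steps += 1
    return steps                # returns n*n, so the loop is O(n^2)

print(count_steps(10))  # 100
```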
| Non-Numeric Loop Control |
| Recursive Code |
| Recurrence Formulas represent time for recursive expressions |
| Solving Recurrence Formulas (One Method): Repeated Substitution |
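The slide's worked example is not reproduced in this text version; as an illustration of the repeated-substitution method, here is a minimal worked example on a simple recurrence (the slide's own example may differ):

```latex
% Repeated substitution on T(1) = 1, T(n) = T(n-1) + c:
\begin{align*}
T(n) &= T(n-1) + c \\
     &= T(n-2) + 2c \\
     &= \cdots \\
     &= T(1) + (n-1)c \;=\; 1 + (n-1)c \;\in\; O(n)
\end{align*}
```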
| Solving Recurrence Formulas: Another Example |
| Try These |
| Gauss's 3rd-Grade Technique |
| Approximate Solutions |
| Rather than solve a recurrence exactly, it is often simpler, yet serves the same purpose, to get an approximate solution. | |
| More specifically, we'd like a tight upper bound on the solution, that is, an approximation that differs from actuality by at most a constant multiple. |
| O Notation |
| O is the letter "Oh" (for "order") |
| Used to express upper bounds on running time (number of steps) | |
| T ∈ O(f) means that (∃c)(∀n) T(n) < c·f(n) |
| Think of O(f) as the set of functions that grow no faster than a constant times the value of f. | |
| This is big-Oh; little-oh has a different meaning. | |
| O Notation |
| Multiplicative constants are irrelevant in O comparisons: | ||
| If g is a function, then any function n ⇒ d·g(n) (using anonymous function notation, à la rex), where d is a constant, is in O(g). |
| Examples: | ||
| n ⇒ 2.5·n² ∈ O(n ⇒ n²) |
| n ⇒ 100000000·n² ∈ O(n ⇒ n²) |
| Notation Abuse |
| It is customary and usual to drop the "n ⇒" and just use the body expression of the anonymous function. |
| Examples: | ||
| O(n²) instead of O(n ⇒ n²) |
| O(n) instead of O(n ⇒ n) |
| O(1) instead of O(n ⇒ 1) |
| Notation Abuse |
| Examples: | ||
| 2.5·n² ∈ O(n²) |
| 100000000·n ∈ O(n) |
| 10000000000 ∈ O(1) |
| Words for Functions |
| A function f is called | ||
| constant if f ∈ O(1) |
| logarithmic if f ∈ O(log(n)) |
| linear if f ∈ O(n) |
| quadratic if f ∈ O(n²) |
| cubic if f ∈ O(n³) |
| polynomial if f ∈ O(n^k), for some constant k |
| exponential if f ∈ O(2^p(n)), for some polynomial p(n) |
| factorial if f ∈ O(n!) |
| Why Asymptotic Complexity Matters: Running Time as a Function of Complexity |
| Allowable Problem Size as a Function of Available Time |
| O Notation |
| Example algorithms | ||
| O(n²): |
| O(n³): |
| O(n): |
| O(log(n)): |
| O(1): |
| O(2ⁿ): |
| O(n!): | ||
| Rules for O: Additive Rule |
| f+g ∈ O(max(f, g)) |
| Here f+g means n ⇒ f(n) + g(n), and max(f, g) means n ⇒ max(f(n), g(n)) |
| Corollary: If g ∈ O(f), then f+g ∈ O(f) |
| Example: For polynomials, the highest-order term dominates, e.g. n⁵ + 1000000·n³ + 10000·n² ∈ O(n⁵) |
| Multiplicative Rule |
| If f ∈ O(g), then h·f ∈ O(h·g), where h·f means n ⇒ h(n)·f(n) |
| For example, n·log(n) ∈ O(n²), since log(n) ∈ O(n) |
| Transitivity Rule |
| If f ∈ O(g) and g ∈ O(h), then f ∈ O(h). |
| Derivative Rule |
| If f′ ∈ O(g′), where ′ denotes the derivative, then f ∈ O(g). |
| Example: log(n) ∈ O(n). This follows from the derivative rule, because 1/n ∈ O(1). |
| Limit Rule |
| If lim_{n→∞} f(n)/g(n) = k, then: |
| If k > 0, f ∈ O(g) and g ∈ O(f). |
| If k = 0, f ∈ O(g), but not conversely. |
| Tight Bounds |
| A bound f ∈ O(g) is tight if g ∈ O(f) also. |
| Example: log(n) ∈ O(n^(1/2)). |
| This holds provided 1/n ∈ O(n^(−1/2)), according to the derivative rule. |
| Apply the limit rule: lim_{n→∞} (1/n) / n^(−1/2) = lim_{n→∞} 1/n^(1/2) = 0 |
| This bound is not tight, since the limit is 0. |
| Therefore n^(−1/2) ∉ O(1/n). |
| Black-box vs. White-box Complexity |
| Black-box: We have a runnable copy of the program, but not its source code. We can run it for different sizes of input. |
| White-box (aka clear-box): We have the source code. We can analyze it. |
| These terms are from the area of software testing. |
| Black-Box Complexity |
| Run the program for different sizes of data set; try to get a fix on the growth rate. | ||
| What sizes? | ||
| An approach is to repeatedly double the input size, until testing becomes infeasible (e.g. it takes too long, or the program breaks). | ||
| Doubling Input Size |
| Black-Box Complexity |
| Run on sizes 32, 64, 128, 256, 512, … |
| For each n, get the time T(n). |
| How can we estimate the order of the run time (e.g. O(n²), O(n³), etc.)? |
| Black-Box Complexity |
| Suppose we are trying to bolster a hypothesis T(n) ∈ O(f(n)). |
| From the definition, we know this means there is a constant c such that for all n, T(n) < c·f(n). |
| If our hypothesis is correct, we therefore expect that for all n, T(n)/f(n) < c. |
| We can simply examine this ratio. |
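A minimal sketch of this ratio test, assuming hypothetical placeholders `program` and `make_input(n)` for the black box and its size-n inputs (neither name is from the original slides):

```python
import math, random, time

def ratio_test(program, make_input, f, sizes=(32, 64, 128, 256, 512)):
    """Print T(n)/f(n) for doubling sizes; bounded ratios support T in O(f)."""
    for n in sizes:
        data = make_input(n)
        start = time.perf_counter()
        program(data)
        T = time.perf_counter() - start   # measured running time T(n)
        print(n, T / f(n))                # should stay below some constant c

# Example use: test the hypothesis that Python's built-in sort is O(n log n).
ratio_test(sorted,
           lambda n: [random.random() for _ in range(n)],
           lambda n: n * math.log2(n))
```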
| Black-Box Complexity |
| If we see that T(n)/f(n) < c, |
| then the hypothesis T(n) ∈ O(f(n)) is supported. |
| If T(n)/f(n) is approximately a constant as n increases, then the bound appears to be tight. |
| If T(n)/f(n) decreases as n increases, the bound is loose. |
| If T(n)/f(n) increases, we don't have a bound. |
| Use Empirical Analysis to Predict |
| If T(n) ∈ O(f(n)) is supported, we can predict an upper bound for any data set size n using f and knowing the implied constant. |
| The prediction might or might not be accurate, since we recall that size n abstracts away variations among data sets of that size. |
| Sorting Programs: The fruit flies of algorithm analysis |
| Examples: Sorting Programs |
| See turing:/cs/cs60/examples/sorting | ||
| Use the unix command: run <type>, which will try random dataset sizes in the range 64, 128, 256, …, 65536 |
| type can be one of: | ||
| quicksort, heapsort, minsort, radixsort, bucketsort | ||
| Result Tabulation from turing runs (times in ms) |
| Example: minsort |
| White-Box Complexity |
| Here we examine the code (loop structure, etc.) |
| Analysis of Typical Loops |
| Analysis of Actual Programs |
| minsort: sorting by selecting minima |
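The minsort slides' own code and diagrams are not reproduced in this text version; below is a minimal Python sketch of sorting by selecting minima (selection sort):

```python
# Minsort (selection sort): repeatedly select the minimum of the unsorted
# remainder and swap it into place. Each pass scans O(n) elements, and there
# are n-1 passes, giving O(n^2) overall.
def minsort(a):
    n = len(a)
    for i in range(n - 1):
        m = i                         # index of the minimum found so far
        for j in range(i + 1, n):
            if a[j] < a[m]:
                m = j
        a[i], a[m] = a[m], a[i]       # swap the minimum into position i
    return a

print(minsort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```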
| insertion sort in rex |
| recurrence for isort |
| recurrence for insert |
| returning to recurrence for isort |
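The rex code and the recurrences on these slides are not reproduced in this text version. The standard recurrences for insertion sort, which these slides presumably develop (the names isort and insert follow the slide titles; the constants are illustrative), look like this:

```latex
% insert places one element into a sorted list of length n: O(n) worst case.
% isort sorts by inserting each element into the sorted remainder.
\begin{align*}
T_{\mathrm{insert}}(n) &\le a\,n + b \\
T_{\mathrm{isort}}(0)  &= d \\
T_{\mathrm{isort}}(n)  &= T_{\mathrm{isort}}(n-1) + T_{\mathrm{insert}}(n-1) + e
\end{align*}
% Repeated substitution gives T_isort(n) in O(n^2).
```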
| Is O(n²) the best we can do for sorting? |
| Algorithm Speedup Techniques |
| Divide and Conquer | |
| The digital principle: Use data values to do selection or direct accessing | |
| Try trees instead of linear arrangement | |
| Dynamic programming (later) |
| Technique #1: Divide-and-Conquer |
| Rearrange the array to be sorted: | ||
| Low elements on the bottom | ||
| High elements on the top | ||
| Use some element as the pivot value | ||
| Sort the low and high portions recursively | ||
| Called quicksort | ||
| Invented by C.A.R. (Tony) Hoare | ||
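Hoare's original quicksort partitions the array in place; as a minimal sketch of the scheme outlined above (not the slides' own code), a list-based Python version:

```python
# Quicksort: pick a pivot, split the rest into low and high portions,
# and sort the portions recursively.
def quicksort(a):
    if len(a) <= 1:
        return a                              # base case: already sorted
    pivot = a[0]                              # use some element as the pivot
    low  = [x for x in a[1:] if x <  pivot]   # low elements
    high = [x for x in a[1:] if x >= pivot]   # high elements
    return quicksort(low) + [pivot] + quicksort(high)

print(quicksort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```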
| Tony Hoare |
| Quicksort Illustrated |
| Quicksort Analysis |
| Overall steps = c·n × the number of levels. |
| O(log(n)) levels in the optimistic case. |
| O(n) levels in the pessimistic case. |
| Overall O(n²) in the pessimistic case. |
| The average case can be shown to be O(n log(n)) based on a probabilistic argument, assuming the data are initially randomly distributed. |
| Another Divide-and-Conquer Sort: Mergesort |
| Sort by successively merging longer and longer sorted sequences. | ||
| Useful with linked lists, or large files. | ||
| More difficult to program for arrays. | ||
| Different versions exist: |
| Top-Down: Split unordered sequence, vs. | ||
| Bottom-Up: Start with sequences of length 1 and create increasingly longer ones. | ||
| Bottom-Up Mergesort |
| Start with sequences of length 1; these are sorted by default. | |
| Merge pairs of sorted sequences to form single sorted sequences, | |
| until there is only one sequence left. | |
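A minimal Python sketch of the bottom-up scheme just described (the slides' own rex version appears later and is not reproduced here):

```python
# Merge two sorted runs into one sorted run: O(n) for total length n.
def merge(xs, ys):
    out, i, j = [], 0, 0
    while i < len(xs) and j < len(ys):
        if xs[i] <= ys[j]:
            out.append(xs[i]); i += 1
        else:
            out.append(ys[j]); j += 1
    return out + xs[i:] + ys[j:]

# Bottom-up mergesort: start from length-1 runs (sorted by default) and
# merge adjacent pairs until one sorted sequence remains.
def mergesort_bottom_up(a):
    runs = [[x] for x in a]
    while len(runs) > 1:
        paired = []
        for k in range(0, len(runs) - 1, 2):
            paired.append(merge(runs[k], runs[k + 1]))
        if len(runs) % 2:                 # odd run carried to the next level
            paired.append(runs[-1])
        runs = paired
    return runs[0] if runs else []

print(mergesort_bottom_up([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```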
| Bottom-Up Mergesort Example |
| Bottom-Up Mergesort Analysis |
| Merging two sequences of length n/2 each can be done in O(n) if linked lists are used. | |
| In a merge sort of n elements, we merge: |
| n/2 pairs of sequences of length 1, |
| n/4 pairs of sequences of length 2, |
| n/8 pairs of sequences of length 4, |
| …, 1 pair of sequences of length n/2. |
| Bottom-Up Mergesort Analysis |
| At each level O(n) steps are used. | |
| There are log(n) levels. | |
| Therefore mergesort is O(n log(n)) worst case. |
| Bottom-Up Mergesort Analysis |
| Let T(j) = the number of steps used in the first j levels. |
| Then |
| T(1) = c·n, for some constant c |
| T(j+1) = T(j) + c·n |
| T(log(n)), the time to sort n elements, = c·n·log(n) |
| Bottom-Up Mergesort in rex (1) |
| Bottom-Up Mergesort in rex (2) |
| Top-Down Mergesort Example |
| Alternate ways to split |
| Top-Down Mergesort Analysis |
| T_merge(n) = c·n, the time to merge two lists of length n/2 |
| T_split(n) = d·n, the time to split a list of length n |
| T_sort(1) = 1 |
| T_sort(n) = T_split(n) + 2·T_sort(n/2) + T_merge(n) = 2·T_sort(n/2) + e·n, where e = c + d |
| Top-Down Mergesort Analysis |
| T_sort(1) = 1 |
| T_sort(n) = 2·T_sort(n/2) + e·n |
| Substituting: |
| T(n) = 2·T(n/2) + e·n |
| = 2·(2·T(n/4) + e·n/2) + e·n = 4·T(n/4) + 2·e·n |
| = 4·(2·T(n/8) + e·n/4) + 2·e·n = 8·T(n/8) + 3·e·n |
| = … = 2^log(n) · T(1) + log(n)·e·n = n + e·n·log(n) ∈ O(n·log(n)) |
| Technique #2: Using data values to do selection |
| Under this category, we have | ||
| bucket sort | ||
| radix sort | ||
| We make non-general assumptions about the data: | ||
| The size of keys to be sorted is limited to integers with a fixed upper bound. | ||
| Bucket Sort |
| Related to hashing | ||
| Both use indexing, which is O(1), to find the bucket. |
| Suppose the set of keys to be sorted is known to be limited to a relatively small integer range, say {0, 1, …, R−1}. |
| Create an array of size R of lists, each entry corresponding to a key value. | ||
| Go through the data once, putting each element in the corresponding list. | ||
| Concatenate the resulting lists. | ||
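A minimal Python sketch of bucket sort under the slide's assumption that the keys are integers in a known small range {0, 1, …, R−1}:

```python
# Bucket sort: one list per possible key value, one O(1) indexing step per
# element, then an O(n + R) concatenation of the lists.
def bucket_sort(keys, R):
    buckets = [[] for _ in range(R)]
    for k in keys:
        buckets[k].append(k)      # place each element in its bucket
    out = []
    for b in buckets:
        out.extend(b)             # concatenate the resulting lists
    return out

print(bucket_sort([3, 1, 4, 1, 5, 2], R=6))  # [1, 1, 2, 3, 4, 5]
```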
| Analysis of Bucket Sort |
| Assume that the number of data elements is comparable to, or larger than, the number of buckets. | |
| Go through the data once, putting each element in the corresponding list. This is O(n). |
| Concatenate the resulting lists. This is also O(n). |
| Therefore we have O(n) overall. | |
| Remember that bucket sorting makes special assumptions about the data. |
| Radix Sort |
| Like bucket sort, but with smaller arrays. | |
| Trades array size for multiple passes. | |
| Represent the keys as radix-b integers. |
| Make P = ⌈log_b(R)⌉ passes. |
| Sort on the least significant digit first, progressing toward the most significant. |
| Re-collect the data after each pass and redistribute on the next. |
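A minimal Python sketch of this least-significant-digit scheme (one bucket pass per base-b digit; not the slides' own demo code):

```python
# LSD radix sort: bucket sort on each base-`base` digit, least significant
# first, re-collecting the data between passes. Buckets preserve arrival
# order, which makes each pass stable, so earlier passes are not undone.
def radix_sort(keys, base=10):
    p, biggest = 0, max(keys, default=0)
    while base ** p <= biggest or p == 0:     # one pass per digit
        buckets = [[] for _ in range(base)]
        for k in keys:
            buckets[(k // base ** p) % base].append(k)
        keys = [k for b in buckets for k in b]  # re-collect
        p += 1
    return keys

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
# [2, 24, 45, 66, 75, 90, 170, 802]
```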
| Analysis of Radix Sort |
| Assuming a bounded range, the number of passes P is a fixed constant. | |
| Each pass uses O(n). | |
| Therefore we have O(n) overall. |
| Demo of Radix Sort |
| Technique #3: Use a Tree instead of Linear List |
| For the same amount of data, a sufficiently balanced tree uses only O(log n) to traverse a chain, rather than O(n) (as in naïve bubble, selection, or insertion sort). |
| Heapsort is a form of sorting that uses trees. |
| Heapsort |
| Loosely based on the Peter Principle (Dr. Laurence J. Peter) | |
| books by Laurence Peter |
| Laurence J. Peter |
| The Peter Principle |
| In a hierarchy, everyone rises to his/her level of incompetence. | |
| [A person having reached this level is said to have the final placement syndrome.] |
| A Hierarchy |
| The Peter Principle applied to Sorting |
| A heap is a binary tree structure in which, for each sub-tree, the root is ≥ all elements in that sub-tree. |
| It suffices for each root to be ≥ its children. |
| [This is one definition of heap used in CS. The other definition is the area in memory from which storage is allocated dynamically, e.g. when new is called.] |
| A Heap |
| A Heap? |
| The Two Phases in Heapsort |
| Phase I: Form the data into a heap | |
| Phase II: Transform the heap into a linear sequence |
| Phase I: Forming a Heap |
| This is the actual Peter Principle. | |
| Start with the data distributed randomly in the tree. | |
| Form sub-heaps beginning at the leaves and working toward the root. | |
| Successively combine two sub-heaps with a common root into a single heap. |
| Combination as a Tournament |
| Analysis of forming a heap from two sub-heaps |
| 3-way rounds are played from the parent downward to the leaves | |
| Only one path toward the leaves is followed, since other sub-trees along the path don't change |
| A 3-way round uses O(1) steps (2 comparisons, possible exchange) | |
| The time to form a heap at level k from the leaves is therefore O(k). | |
| In the worst case, this is O(log(n)). |
| Iterating sub-heap formation from leaves to root |
| The leaves are already heaps by themselves. | |
| We need to play the tournament (O(k) rounds at level k) n/2 times, corresponding to the non-leaves. | |
| Coarsely, Phase I is O(n log(n)) overall. | |
| Is this bound tight? |
| Illustrating Phase I |
| After Level 1 Heap Formation |
| Level 2 Heap Formation |
| After Level 2 Heap Formation |
| Level 3 Heap Formation |
| After Level 3 Heap Formation |
| Heapsort Phase II |
| Now that we have a heap, what do we do? |
| Heapsort Phase II |
| One antidote for final-placement syndrome is that people retire. | |
| In our heap, we retire the maximum, which is guaranteed to be at the root. | |
| This leaves a hole that needs to be filled. |
| Retiring the maximum |
| Filling the hole |
| The way this happens is different from in a corporation. | |
| We pick the rightmost leaf, and tentatively place it in the hole at the root. | |
| Then we play the tournament from the root so that the new value reaches its proper level. |
| Filling the hole |
| Result of the tournament |
| The cost of keeping the heap happy |
| The tournament path could be as long as the path from the root to a leaf. | |
| Along the path, a number of O(1) rounds were played. | |
| The cost for one post-retirement adjustment is therefore O(log(n)). | |
| Retiring all n elements in sequence gives us the sorted order. | |
| Overall then, we have O(n log(n)) for Phase II. |
| The Rest of Phase II illustrated |
| Phase II, step 2, continued |
| Phase II, step 3 |
| Phase II, step 3, continued |
| Phase II, step 4 |
| Phase II, step 5 |
| Phase II, step 6 |
| Phase II, step 7 |
| Phase II, step 8 |
| Phase II, step 9 |
| Phase II, step 10 |
| Phase II, step 11 |
| Phase II, step 12 |
| Phase II, step 13 |
| Phase II, step 14 |
| Phase II, step 15 |
| Conclusion of Phase II |
| Where to put the retirees? |
| We can put the retirees back in the tree, as long as we know not to play any more tournaments with them. | |
| There is an easy way to keep track of this, as we shall see. |
| Phase II with Retiree Placement |
| Node Numbering |
| Note that the sorted sequence is readable from the nodes in breadth-first order. | |
| Further, we can maintain the entire tree as an array (with no explicit links), as shown next. |
| Array Node Numbering Permits Read-out of Sorted Data |
| Use as a Tree: Finding your parents/children |
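The slide's diagram is not reproduced here. With the common 0-based array encoding of a complete binary tree (the slides' own diagram may number nodes from 1), parents and children are found by index arithmetic:

```python
# 0-based array encoding of a complete binary tree: node i's relatives.
def parent(i): return (i - 1) // 2
def left(i):   return 2 * i + 1
def right(i):  return 2 * i + 2

# Node 0 is the root; e.g. node 4's parent is node 1, children are 9 and 10.
print(parent(4), left(4), right(4))  # 1 9 10
```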
| Heapsort Code (1) |
| Heapsort Code (2) |
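The slides' own heapsort code is not reproduced in this text version; below is a minimal in-place Python sketch of the two phases described above:

```python
# Play the "tournament" from root downward until the heap property holds.
def sift_down(a, root, end):
    while True:
        child = 2 * root + 1                  # left child (0-based indexing)
        if child > end:
            return
        if child + 1 <= end and a[child] < a[child + 1]:
            child += 1                        # pick the larger child
        if a[root] >= a[child]:
            return
        a[root], a[child] = a[child], a[root]
        root = child

def heapsort(a):
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):       # Phase I: heapify, leaves to root
        sift_down(a, i, n - 1)
    for end in range(n - 1, 0, -1):           # Phase II: retire the maximum
        a[0], a[end] = a[end], a[0]           # retiree moves past the boundary
        sift_down(a, 0, end - 1)              # re-settle the tentative root
    return a

print(heapsort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```

Note how the "retirees" stay in the same array, beyond the shrinking heap boundary `end`, which is exactly the in-place retiree placement illustrated above.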
| Heapsort Summary |
| O(n log(n)) steps | |
| In-place array implementation |
| Priority Queue |
| A priority queue is a data repository that has two methods: | ||
| insert | ||
| removeMax (or removeMin, depending on the version) | ||
| A heap is a natural implementation of a priority queue: | ||
| Both insertions and deletions are O(log n) | ||
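As a usage sketch, Python's standard heapq module implements exactly this idea (a binary min-heap over a list), matching the removeMin flavor:

```python
import heapq

pq = []
heapq.heappush(pq, (3, "event C"))   # insert: O(log n)
heapq.heappush(pq, (1, "event A"))
heapq.heappush(pq, (2, "event B"))
print(heapq.heappop(pq))             # removeMin: O(log n) -> (1, 'event A')
```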
| Priority Queue Applications |
| A priority queue (with min removal) is often found in simulation applications. | ||
| Events are time-stamped and put in a priority queue, which orders them by smallest time first. | ||
| On a typical simulation cycle: | ||
| The event with the next timestamp is removed. | ||
| The event may cause the insertion of new events with later timestamps. | ||
| Lower-Bound for Sorting |
| Sorting based on pairwise comparisons (which excludes radix and bucket sorts) requires a minimum of c·n·log_2(n) steps, for some constant c. |
| Sorting methods that achieve this lower bound are called optimal. |
| Heapsort and mergesort are the two optimal sorts we've studied. |
| Derivation of the Sorting Lower-Bound |
| A sorting algorithm effectively establishes which of the n! permutations the data are originally in and, through a series of comparisons and exchanges, permutes the data to a fixed order. |
| This can be viewed as a tree with n! leaves, whose internal nodes are binary comparisons. |
| Derivation of the Sorting Lower-Bound |
| The worst-case run time of the sorting algorithm is the height of a tree having n! leaves. |
| The minimum height of such a tree is log_2(n!). |
| By Stirling's formula, n! ~ (2πn)^(1/2) · (n/e)^n, so log_2(n!) grows as n·log(n), up to constant factors. |
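Spelling out that last step (a standard calculation, not shown on the slide):

```latex
% Taking log_2 of Stirling's approximation n! ~ (2 pi n)^{1/2} (n/e)^n:
\log_2(n!) \;\approx\; \tfrac{1}{2}\log_2(2\pi n) + n\log_2 n - n\log_2 e
```

The dominant term is n·log_2(n), which gives the c·n·log_2(n) lower bound claimed above.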
| Sorting Summary |
| Bucket sort / Radix sort | ||
| fastest asymptotic performance | ||
| special assumptions about data | ||
| Heap sort | ||
| optimal performance | ||
| sorts arrays in-place | ||
| Merge sort | ||
| optimal performance | ||
| sorts lists | ||
| more difficult for arrays | ||