CS 147 Homework Assignment 1

This homework assignment is due at 12 AM on Thursday, February 9, 2012 (i.e., the Wednesday/Thursday boundary). Please give your solutions to me, slide them under my door, or e-mail them.

Most of the problems in this assignment are taken from the textbook. I expect that it will take you about 2 hours to complete the assignment.

If you use a Microsoft product to do your graphing, be sure to turn off the stupid gray background, and ensure that color isn't essential for interpreting the graphs (since I might decide to print things on a B&W printer).

Problems from the textbook: 12.7 through 12.15.

Additional problems:

12.16
What is the skewness (see box 12.1 on page 197) of the data from problem 12.15? What is the kurtosis? The kurtosis is calculated by using an exponent of 4 instead of 3 in the skewness equation, and then subtracting 3 from the result, as follows:
kurtosis = [ 1/(n*s^4) * sum[(xi-xbar)^4] ] - 3
A skewness far from zero indicates a skewed distribution. A kurtosis with an absolute value greater than about 0.5 indicates a distribution whose tails are lighter (negative) or heavier (positive) than those of a normal distribution.
NOTE: Most authorities suggest dividing by (n-1) instead of n when calculating both skewness and kurtosis. For this problem, we'll stick with the biased estimate (using n) since that's what the book uses.
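If you'd rather check your arithmetic with a program, the following is a minimal sketch of the two formulas above. The function name `biased_moments` and the `use_sample_std` flag are my own inventions for illustration. The flag exists because the formula doesn't say how s itself is computed: the sketch assumes the usual sample standard deviation (dividing by n-1), so check box 12.1 and flip the flag if the book defines s differently.

```python
import math


def biased_moments(data, use_sample_std=True):
    """Return (skewness, kurtosis) using the biased 1/n form of both formulas.

    ASSUMPTION: s is the sample standard deviation (divisor n-1) when
    use_sample_std is True, and the population form (divisor n) otherwise.
    """
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    xbar = sum(data) / n
    sum_sq = sum((x - xbar) ** 2 for x in data)
    s = math.sqrt(sum_sq / ((n - 1) if use_sample_std else n))
    if s == 0:
        raise ValueError("all observations are identical; s is zero")
    # skewness = 1/(n*s^3) * sum[(xi-xbar)^3]
    skewness = sum((x - xbar) ** 3 for x in data) / (n * s ** 3)
    # kurtosis = [ 1/(n*s^4) * sum[(xi-xbar)^4] ] - 3
    kurtosis = sum((x - xbar) ** 4 for x in data) / (n * s ** 4) - 3
    return skewness, kurtosis


if __name__ == "__main__":
    # Made-up numbers for illustration only; NOT the data from problem 12.15.
    print(biased_moments([1.0, 2.0, 3.0, 4.0, 5.0]))
```

Only the outer factor is the biased 1/n here; the choice of divisor inside s is a separate question, which is why it gets its own flag.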
12.17
Do the skewness and kurtosis you calculated in 12.16 support the conclusions you reached in 12.15?
12.18
(Extra credit.) Plot a kernel density estimate of the data from problem 12.15, using either a Gaussian or a triangular kernel. If you used an existing software package to create the plot, identify the package (I'm looking for good packages).
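For anyone unsure what a kernel density estimate actually computes, here is a rough sketch of the Gaussian-kernel version: the estimate at x is the average of Gaussian bumps of width h centered on each data point. The function name `gaussian_kde` is hypothetical, and the default bandwidth (the normal-reference rule of thumb, 1.06 * s * n^(-1/5)) is just one common choice that I'm assuming here; the problem doesn't prescribe one. It only evaluates the density, so plotting is left to whatever tool you prefer.

```python
import math
import statistics


def gaussian_kde(data, bandwidth=None):
    """Return a function f(x) giving a Gaussian kernel density estimate.

    ASSUMPTION: if no bandwidth is given, use the normal-reference
    rule of thumb h = 1.06 * s * n**(-1/5).
    """
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    if bandwidth is None:
        bandwidth = 1.06 * statistics.stdev(data) * n ** (-0.2)
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump per observation, then normalize.
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return density


if __name__ == "__main__":
    # Made-up numbers for illustration only; NOT the data from problem 12.15.
    f = gaussian_kde([1.0, 2.0, 2.5, 4.0, 7.0])
    for x in range(0, 9):
        print(x, round(f(x), 4))
```

A triangular kernel would only change the bump inside `density` (and its normalizing constant); the rest of the structure stays the same.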
12.19
(Extra credit.) Derive the formula for the number of samples needed to demonstrate that one system is better than another to a given level of confidence.

Notes:

To save you pointless typing, here are links to the data given in some of the problems:

  • 12.10
  • 12.11
  • 12.15

For problem 12.7, you may find the tables in Appendix A useful, or you might choose to use a program with statistical functions (some options include Excel, Matlab, R, and Minitab). In problem 12.7b, interpret "less than" as "less than or equal to".
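As one example of the "program with statistical functions" route, Python's standard library (3.8 or later) can stand in for a normal-distribution table. This is only a sketch: it assumes you need normal probabilities and quantiles, and the mean and standard deviation below are invented for illustration, not taken from any problem.

```python
from statistics import NormalDist

# HYPOTHETICAL parameters, chosen only to illustrate the calls.
dist = NormalDist(mu=10.0, sigma=2.0)

# P(X <= 12): what you would look up in a table of the normal CDF.
p_le = dist.cdf(12.0)

# P(X > 12) is the complement.
p_gt = 1.0 - p_le

# The 0.95 quantile of the unit normal, as found in a quantile table.
z_95 = NormalDist().inv_cdf(0.95)

if __name__ == "__main__":
    print(p_le, p_gt, z_95)
```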


    © 2012, Geoff Kuenning

    This page is maintained by Geoff Kuenning.