If you have not done much experimental scientific work, start by reading the references on scientific data analysis in the basic laboratory procedure pages. These will explain simple statistical procedures, including simple least-squares regression. These techniques may prove sufficient for your purposes. Furthermore, you will have an easier time understanding the statistics texts if you have mastered this basic background.
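As a rough illustration of the simple least-squares regression mentioned above, the following sketch fits a straight line to a handful of points. The function name `fit_line` and the sample data are assumptions made for this example, not taken from any of the references.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared and cross deviations about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

With real measurements the points will not lie exactly on a line, and the fit minimizes the sum of squared vertical distances from the points to the line.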
There are many elementary introductions to statistics. For computer vision, choose a text which claims to be written for students in science and engineering. Such texts have a more practical (sometimes computational) approach than texts for mathematics and statistics majors. They have a better choice of topics than texts for students in the social sciences. A particularly readable book is
Two introductory books which require somewhat more mathematical maturity and/or more effort on your part:
The book by Papoulis has been particularly popular in computer vision. Therefore, you may find its choice of topics especially appropriate when trying to understand computer vision publications.
Researchers in "robust statistics" have extended standard statistical techniques to work (a) in the presence of outliers (occasional extremely wrong values) and (b) when the real data may be generated by a mechanism which is similar to, but does not belong to, the class of theoretical models used to analyze the data. The robust methods are better able to cope with the types of data found in real scientific data sets.
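To see why outliers matter, here is a small sketch comparing the mean with the median, a simple robust estimate of location, on made-up data containing one grossly wrong value. The helper `mad` (median absolute deviation) and the sample values are assumptions for illustration only.

```python
import statistics


def mad(values):
    """Median absolute deviation from the median, a robust measure of spread."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


clean = [9.8, 10.1, 10.0, 9.9, 10.2]
contaminated = clean + [1000.0]  # one occasional, extremely wrong value

mean_value = statistics.mean(contaminated)      # dragged far away from 10
median_value = statistics.median(contaminated)  # stays close to 10
spread = mad(contaminated)                      # barely affected by the outlier
print(mean_value, median_value, spread)
```

A single bad value moves the mean arbitrarily far, while the median and the median absolute deviation change very little.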
Good introductions to robust statistics can be found in
Robust methods for regression are described in
A similar method was apparently worked out independently, with somewhat less analysis, by Fischler and Bolles
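The Fischler and Bolles method is commonly known as RANSAC (random sample consensus). The following is only a minimal sketch of the idea for fitting a line, not their published algorithm: the function name `ransac_line`, the number of trials, the tolerance, and the test data are all assumptions chosen for this example.

```python
import random


def ransac_line(points, n_trials=200, tol=0.5, seed=0):
    """Fit y = a*x + b by repeated random sampling; returns (a, b, inliers)."""
    rng = random.Random(seed)
    best = (0.0, 0.0, [])
    for _ in range(n_trials):
        # Hypothesize a line from the smallest possible sample: two points.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical pair; cannot be written as y = a*x + b
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        # Count the points that agree with this hypothesis.
        inliers = [(x, y) for x, y in points if abs(y - (a * x + b)) <= tol]
        if len(inliers) > len(best[2]):
            best = (a, b, inliers)
    return best


# Ten points exactly on y = 2x + 1, plus three gross outliers (made-up data).
points = [(float(x), 2.0 * x + 1.0) for x in range(10)]
points += [(2.0, 40.0), (5.0, -30.0), (8.0, 60.0)]
a, b, inliers = ransac_line(points)
print(a, b, len(inliers))
```

In practice the winning hypothesis is usually refined afterwards, for example by an ordinary least-squares fit to the inliers only.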
Grouping and cluster analysis algorithms are still a black art. For a nice survey of the state of play, including problems with existing techniques, from the point of view of robust statistics, see
Do not, however, expect such techniques to work miracles. In particular, if your data is low-dimensional and you can't see the cluster divisions in suitable scatterplots, you should not expect the algorithm to magically find the distinctions. Consider whether you really have a reliable cluster structure, with clear separation between the clusters!
Multivariate analysis techniques analyze how scalar output values depend on many input variables. For example, principal component analysis attempts to locate the combinations of input variables which account for the greatest share of the variance in the data. A very nice book on multivariate analysis is
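As a minimal sketch of principal component analysis, assuming two-dimensional data, the direction of greatest variance can be computed in closed form from the 2x2 sample covariance matrix. The function name `pca_2d` and the sample points are assumptions for this example. Higher-dimensional data would need a general eigenvalue routine.

```python
import math


def pca_2d(points):
    """Return (principal direction, (largest variance, smallest variance))."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Entries of the sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    # Closed-form eigenvalues and major-axis angle of a symmetric 2x2 matrix.
    half_trace = (sxx + syy) / 2.0
    spread = math.hypot((sxx - syy) / 2.0, sxy)
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    direction = (math.cos(theta), math.sin(theta))
    return direction, (half_trace + spread, half_trace - spread)


# Made-up points lying on the line y = x.
points = [(t, t) for t in (-2.0, -1.0, 0.0, 1.0, 2.0)]
direction, variances = pca_2d(points)
print(direction, variances)
```

For these points all of the variance lies along the diagonal direction, so the second variance is zero to within rounding.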
Classification algorithms are given a set of model distributions and a test value, and are asked to determine which model distribution the test value most likely belongs to. Two good references are
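One simple instance of this problem, sketched here under the assumption that each model is a one-dimensional Gaussian and that all models are equally likely beforehand, is to pick the model under which the test value has the highest likelihood. The model names and parameters below are hypothetical.

```python
import math


def gaussian_log_pdf(x, mean, std):
    """Log of the Gaussian density with the given mean and standard deviation."""
    return (-0.5 * math.log(2.0 * math.pi * std ** 2)
            - (x - mean) ** 2 / (2.0 * std ** 2))


def classify(x, models):
    """Return the name of the model under which x has the highest likelihood."""
    return max(models, key=lambda name: gaussian_log_pdf(x, *models[name]))


# Hypothetical models given as (mean, standard deviation), e.g. pixel intensities.
models = {"dark": (50.0, 10.0), "bright": (200.0, 20.0)}
label_low = classify(70.0, models)
label_high = classify(180.0, models)
print(label_low, label_high)
```

If the models are not equally likely beforehand, the log of each prior probability would be added to the corresponding log-likelihood before taking the maximum.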
It is frequently necessary to generate random numbers or, more precisely, pseudo-random numbers. A fun, readable book on how to do this, and how not to do this, is
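As one small example of how not to do this, the sketch below implements a linear congruential generator whose modulus is a power of two and then looks at the lowest bit of each output. The function name `lcg` and the constants are illustrative assumptions, not a recommendation.

```python
import itertools


def lcg(seed, a=1103515245, c=12345, m=2 ** 31):
    """Yield an endless stream from a linear congruential generator."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x


values = list(itertools.islice(lcg(seed=42), 8))
low_bits = [v & 1 for v in values]
print(low_bits)  # → [1, 0, 1, 0, 1, 0, 1, 0]
```

Because the multiplier and increment are both odd and the modulus is even, the lowest bit simply alternates, so using `value % 2` as a coin flip would be far from random. Generators of this kind are normally used by taking the high-order bits only.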
Most statistical techniques work only if values come from a linear space. In computer vision, we must occasionally analyze values from a circular or spherical space, such as 2D or 3D orientations. The only references I have seen on the subject, which fortunately seem to be fairly readable, are
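To illustrate why linear techniques fail on circular data, the sketch below averages two directions that straddle the 0/360 degree wrap-around. It uses the common approach of averaging unit vectors rather than angles. The function name `circular_mean_deg` and the sample angles are assumptions for this example.

```python
import math


def circular_mean_deg(angles_deg):
    """Mean direction in degrees, computed by summing unit vectors."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c)) % 360.0


angles = [350.0, 20.0]               # two directions close to 0 degrees
naive = sum(angles) / len(angles)    # 185.0, pointing the opposite way
circular = circular_mean_deg(angles) # close to 5 degrees
print(naive, circular)
```

The arithmetic mean points almost directly away from both measurements, while the vector-based mean lies between them. The result is undefined when the unit vectors cancel out, for example for two exactly opposite directions.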