Digitization and Interpolation

Camera images used in computer vision are digitized for manipulation by a computer. That is, the continuous 2D image is represented by the values at a finite set of sample locations, typically those on a rectangular grid. In the following discussion, we will assume that sample locations are the 2D locations with integer coordinates. The value at each sample location is mapped into some finite range [0,M] by replacing any larger intensity values by M. The values in this range are then represented to some finite precision.

Many low-level vision operations can be viewed as operations on digitized images which emulate operations on the corresponding continuous images. Two such operations are image rotation and Gaussian convolution. Techniques for doing this emulation are typically based on work in digital signal processing.

Sampling Theorem, Aliasing, etc

Interpolation

In implementing low-level vision operations, it is frequently necessary to interpolate the value at a non-integer location in the image. When high precision is required, consult references on digital signal processing. In computer vision, however, the precision of intensity values is typically low (usually 8-bit) and speed is essential.

The most commonly-used method of interpolation is bilinear (also called twisted-plane, area-weighting, or four-point). Suppose that our digitized image is D(x,y) and our continous image is I(x,y). Suppose that we need the value at location (i+p,j+q), where i and j are integers, p and q are in [0, 1.0). We can approximate I(i+p,j+q) using the values at the four nearest integer locations using the formula

I(x,y) =(approx) (1-p)(1-q) D(i,j) + p(1-q) D(i+1,j) + q(1-p) D(i,j+1) + pq D(i+1,j+1)

The bilinear formula works well for many applications. However, since it is based on a linear approximation to the image intensities, it tends to flatten peaks and troughs in the signal. This distortion of the image values can be noticable, particularly if the operation is applied repeatedly (e.g. rotating an image several times).

Fitting a second-order surface to the data at integer locations gives better performance on peaks, troughs, and similar features. Bracewell (p. 249) gives a second-order formula for interpolating values from 6 nearby points, which is relatively simple (and thus fast) but may be significantly more accurate than bilinear interpolation.

Again, suppose we need the value at location (i+p,j+q). This algorithm first determines which of the four nearest integer locations is closest. Suppose that the nearest integer is (i,j). Then the interpolated value is:

I(x,y) =(approx) 0.5q(q-1) D(i,j-1) + 0.5p(p-1) D(i-1,j) + (1+pq - p^2 - q^2) D(i,j) + 0.5p(p-2q+1) D(i+1,j) + 0.5q(q-2p+1) D(i,j+1) + pq D(i+1,j+1)

Similar formulas are used for the other three cases.

References

Bracewell, Ronald N. (1995) Two-Dimensional Imaging, Prentice-Hall, Englewood Cliffs NJ (ISBN: 0-13-062621-X).

Handbook Main Page

Ownership, Maintenance, and Disclaimers

Last modified