In image processing, there is a natural division between small data structures (numbers, linked lists, etc.) and bulk data structures (images, windows, digitized curves). A typical image (250K bytes to 1M bytes) is so much larger than a typical small data element (< 100 bytes) that the two demand drastically different design priorities. In Envision, bulk data is handled by the image coprocessor, whereas small data is handled by the main Scheme interpreter. This allows each kind of data to be handled appropriately.
Bulk data requires different storage allocation strategies than small data. Bulk data must be stored compactly, with type markers collected at the start of the data sheet. Copying of bulk data should be avoided whenever possible. The user must explicitly allocate and deallocate bulk data because, in the normal course of events, not all pointers to a data sheet are lost (which would make it reclaimable by the garbage collector) before its space must be freed. Each sheet of bulk data should be allocated directly from the operating system, not carved out of a large block of space allocated by Scheme, to avoid fragmentation.
In Envision, bulk data is kept entirely separate from small data. Small data is handled by the usual Scheme interpreter. Bulk data is handled entirely by the coprocessor and never passed to Scheme. At the user level, it appears that bulk data objects exist in Scheme and are manipulated by Scheme operations. In fact, however, the Scheme interpreter holds only a unique identifier for each bulk object. Scheme can operate on bulk objects only indirectly, by sending requests to the coprocessor.
In Envision, the coprocessor is connected to the main Scheme interpreter via a simple, slow socket connection. This arms-length connection offers two advantages: reliability, and the ability to run the two halves on different processors.
Many previous computer vision systems use a foreign function interface in which high-level code (e.g. Lisp) and low-level code (e.g. C) share memory. Although this ought to work in principle, these interfaces seem to be fragile in practice. Having two computer programs share memory is like connecting the blood systems of two creatures: a dangerous procedure that should be reserved for times when it is truly necessary.
Finally, it is not uncommon to run graphics and image processing operations on a separate processor. Some computers have specialized graphics or image processing boards which can run common operations much faster than the general-purpose processor. These boards typically have their own dedicated memory. Moreover, because many image operations are inherently parallel, it can make sense to divide an operation among several processors. The socket model allows individual sites to set up such arrangements easily.
In Envision, the user writes bulk data operations in Scheme, but these operations are compiled into low-level code: the coprocessor's stack assembler and/or C. This compilation process protects the user from many of the trivial housekeeping operations required to make computer vision code run properly. The user describes the algorithm in a simple, high-level way, and the low-level details are filled in by the compiler. The compiler also ensures that the output C code is correct, portable, ANSI-standard C.
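As a hypothetical illustration of what such compiler output might look like, a user-level pixel-wise operation (add a constant to every pixel, clamped to the valid range) could be emitted as a plain ANSI C loop like the one below. The function name and calling convention are invented; the housekeeping in the loop body is exactly the kind of detail the user never writes by hand.

```c
/* Hypothetical illustration of compiled output: a pixel-wise brighten
 * operation as the kind of plain ANSI C the compiler might emit.
 * The name and calling convention are invented, not Envision's output. */
#include <stddef.h>

void brighten(unsigned char *pixels, size_t n, int delta) {
    size_t i;
    for (i = 0; i < n; i++) {
        /* clamping and index housekeeping filled in by the compiler */
        int v = pixels[i] + delta;
        pixels[i] = (unsigned char)(v > 255 ? 255 : (v < 0 ? 0 : v));
    }
}
```

Note that a loop of this shape is also trivially divisible across processors: separate ranges of `i` touch disjoint pixels.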
The compilation process also insulates the algorithm designer from the specifics of the hardware, the coprocessor, and the C compiler. Each new hardware configuration and each system upgrade requires some one-time work from the people maintaining the compiler; user-level code, however, should not need to be modified. This should make it much easier to transfer code from site to site, or between machines with very different hardware, e.g. a slow Linux notebook PC and a lab workstation with attached image processing boards.
Bulk data processing typically dominates the running time of the entire computer vision system. Therefore, operations processing bulk data must be compiled into very efficient code. When specialized image processing or graphical display boards are available, bulk data algorithms must be compiled into whatever language these boards require. Finally, special handling may be required if a bulk data operation is to be divided and run in parallel on several processors.
Fortunately, bulk data algorithms do not require the sophisticated programming-language constructs that make a high-level language difficult to compile. Bulk data operations can, and should, be written in a way that makes them easy to compile into efficient code. The Envision primitives have been designed to make user-level code simple but, at the same time, make its structure correspond in a known way to the structure of efficient low-level code. Opportunities for parallelism are made manifest, so that the compiler does not have to guess at the intentions of the algorithm designer.
In Envision, bulk data operations are written in a special subset of the high-level language, restricted to allow easy compilation. They are processed by a special compiler, which generates code for the coprocessor's stack machine language (for debugging) and will eventually also generate C code (for faster performance of finished algorithms). We hope that experts in compiler design will eventually develop better compilers, including compilers which can exploit specialized hardware available at particular sites.
The distinction between bulk data and small data operations closely reflects an established distinction in computer vision. Most computer vision researchers make a strong distinction between low-level and high-level algorithms, with edge finders forming the transition between the two stages of processing. The transition between the two classes of vision algorithms occurs at approximately the point where the Envision programmer would switch between coprocessor operations running on bulk data and true Scheme functions running on small data.