My goal in writing this software

In writing this software, my goal was to provide programmers with a Java API for modularly distributing parallel computations across a network of processors that are not guaranteed to cooperate with a scheduling server. That is, the nodes that process portions of a large computation are assumed not to be part of a dedicated network, and may drop communication with the server at any time. The server manages the network, redistributing any tasks left incomplete when a slave leaves. Additionally, I wanted an API that clients could use easily, without forcing programmers to design their applications around the system. I envisioned a system to which an application could be adapted in under two hours.

As far as I know, existing software packages for parallelizing computations place some or all of the following restrictions upon software written to utilize them:

The parallelization libraries must be ported to each of the machine architectures and operating systems to be used in donating processor cycles to the network.

The parallel application itself must be ported to each of the machine architectures and operating systems to be used in donating processor cycles to the network.

The person running the parallel application must have some way of starting a server on each machine that is to donate processing cycles to the network. That server requires access to the network, which the administrators of the donor machine may consider a security risk.

The parallelization system designates a single machine to act as a scheduling server that instructs the donor machines as to what computations to do. The problem with this approach is that it is designed to work only on a small scale: there is no guarantee that the scheduling server will be able to keep up with an arbitrarily large number of donor machines.

The parallelization system is designed to be run on a small, fast, local network, and the machines intended to donate cycles to the network are specified and fixed for the duration of the application's run time. The problem with this approach is that the parallel network is usually limited to machines that run some form of UNIX and to which the application writer has physical access. In most cases, this restricts the possible donor machines to those on a small LAN.

Gestalt attempts to solve these problems. It is written entirely in Java, so there are no porting issues. Also, since there is an applet version of the slave program, all that a machine needs in order to donate cycles to the network is a Java 1.1-enabled web browser. This also allows third parties to donate cycles without having to worry that the slave program will maliciously affect the machine it is running on, because the applet runs under the Java applet security restrictions. Furthermore, the Gestalt system is designed to be distributed among multiple servers, so it is scalable. The Gestalt server handles processor nodes dropping out of the network by redistributing the calculations that were running on the departed processors to those that are still available.
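To make the redistribution idea concrete, here is a minimal sketch of a server-side queue that puts a departed slave's unfinished tasks back up for assignment. It is an illustration only, not Gestalt's actual code: the class RequeueSketch and all of its methods are hypothetical names, tasks are reduced to plain string identifiers, and the sketch is written in current Java rather than Java 1.1. It also assumes the server is told when a slave leaves; how a dropped connection is detected is not covered.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from the Gestalt API.
class RequeueSketch {
    // Tasks waiting for a slave, in submission order.
    private final Deque<String> pending = new ArrayDeque<>();
    // Tasks currently handed out, keyed by the slave that holds them.
    private final Map<String, List<String>> assigned = new HashMap<>();

    // A client submits a task to the server.
    void submit(String taskId) {
        pending.addLast(taskId);
    }

    // A slave asks for work; returns null when nothing is pending.
    String assign(String slaveId) {
        String task = pending.pollFirst();
        if (task != null) {
            assigned.computeIfAbsent(slaveId, k -> new ArrayList<>()).add(task);
        }
        return task;
    }

    // A slave reports that it finished a task.
    void complete(String slaveId, String taskId) {
        List<String> tasks = assigned.get(slaveId);
        if (tasks != null) {
            tasks.remove(taskId);
        }
    }

    // A slave has left the network: requeue whatever it had not finished.
    void slaveLeft(String slaveId) {
        List<String> orphaned = assigned.remove(slaveId);
        if (orphaned != null) {
            for (String task : orphaned) {
                pending.addLast(task);
            }
        }
    }

    int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        RequeueSketch server = new RequeueSketch();
        server.submit("task-1");
        server.submit("task-2");
        server.assign("slave-A");    // slave-A takes task-1
        server.assign("slave-B");    // slave-B takes task-2
        server.slaveLeft("slave-A"); // task-1 goes back on the queue
        System.out.println(server.assign("slave-B")); // prints task-1
    }
}
```

Keeping a per-slave record of outstanding tasks is what lets the server recover work without any cooperation from the slave that left.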

In comparison to other, natively compiled parallelization libraries, Gestalt has some obvious shortcomings. Firstly, because it runs only Java code, programs that utilize it will suffer the slowdown associated with any interpreted language. Secondly, there is no interface for communication between nodes: code submitted to the server to be run by a slave cannot pass data to anything until it is returned to the client that submitted it. In spite of these shortcomings, it is my hope that the system will be able to compete with other, natively compiled libraries through the use of a large number of donor machines and the advantage of scalability.
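The lack of communication between nodes means a computation has to be cut into pieces that need nothing from one another, with all combining done by the client. The following sketch shows one way that might look, under stated assumptions: the IndependentTask interface, RangeSum, and ClientSketch are hypothetical names and not part of the Gestalt API, the code is written in current Java rather than Java 1.1, and the tasks are run locally in a loop where a real client would hand them to the server.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface: a unit of work that carries its own input
// and produces one result, with no channel to any other task.
interface IndependentTask extends Serializable {
    long run();
}

// Sums the integers in the half-open range [from, to).
class RangeSum implements IndependentTask {
    private final long from;
    private final long to;

    RangeSum(long from, long to) {
        this.from = from;
        this.to = to;
    }

    public long run() {
        long sum = 0;
        for (long i = from; i < to; i++) {
            sum += i;
        }
        return sum;
    }
}

class ClientSketch {
    // Split the range [0, n) into at most `parts` independent tasks.
    static List<IndependentTask> split(long n, int parts) {
        if (parts < 1) {
            throw new IllegalArgumentException("parts must be at least 1");
        }
        List<IndependentTask> tasks = new ArrayList<>();
        long chunk = (n + parts - 1) / parts; // ceiling division
        for (long start = 0; start < n; start += chunk) {
            tasks.add(new RangeSum(start, Math.min(start + chunk, n)));
        }
        return tasks;
    }

    // Stand-in for remote execution: run each task and combine the
    // results on the client, the only place results can meet.
    static long combine(List<IndependentTask> tasks) {
        long total = 0;
        for (IndependentTask task : tasks) {
            total += task.run();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(combine(split(100, 4))); // prints 4950
    }
}
```

Computations whose pieces must exchange intermediate values would not fit this shape without being restructured into rounds, each round returning to the client.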


Last Modified: 7/24/97 by jack@cs.hmc.edu