CS 152: Neural Networks - Final Project

CS 152: NEURAL NETWORKS

Evolving a Sigma-Pi Network as a Network Simulator

Justin Basilico

[ Main | Problem statement | Approach | Results | References | Code directory | Presentation ]

Sigma-pi networks seem to have properties that make them more powerful than standard neural network models. This power is derived from the addition of pi units, which take the product of their inputs rather than just the sum. One way to look at these products is that they allow the input to the network to control the weight values in the network. In this sense, the network could be "programmed" by its input to act in a variety of different ways.
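As a rough illustration of this "programmability", the sketch below pairs each data input with a second input that plays the role of a weight. The function names (pi_unit, sigma_unit, programmable_sum) are hypothetical and chosen only for this example; they are not taken from the project code.

```python
import math

def pi_unit(inputs):
    """Product unit: multiplies all of its inputs together."""
    return math.prod(inputs)

def sigma_unit(inputs):
    """Summation unit: adds all of its inputs together."""
    return sum(inputs)

def programmable_sum(x, w):
    """One pi unit per (x_i, w_i) pair, followed by one sigma unit.

    The values in w arrive as ordinary network inputs, yet they act
    like weights on the values in x (illustrative assumption).
    """
    return sigma_unit(pi_unit([xi, wi]) for xi, wi in zip(x, w))
```

Calling programmable_sum with the same x but different w values gives a different weighted sum each time, which is the sense in which the input "programs" the network.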

The purpose of this project is to make use of this "programmability" of sigma-pi networks to create a network that can act as a variety of other types of networks based on its input. In particular, the goal is a sigma-pi network that will act as a network simulator. It will take as input the input values for the network that it is simulating, along with the values of the weights in that network. The target output of the sigma-pi simulator is the output that the simulated network (as specified by its weights) would produce on the given input values. If such a network could be created, it could be a first step in looking at networks (such as sigma-pi networks) that can operate on other networks as input, similar to Turing machines that take the encodings of other Turing machines as input. One interesting extension would be a network that could apply a learning rule, such as backpropagation, to another network in order to improve it. Although such an exploration is beyond the timeframe allotted for this project, being able to simulate one network using another network would be a first step towards this goal.
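The input and target described above can be sketched as follows. This is a minimal sketch that assumes the simulated network has no hidden layer and uses linear output units; the statement does not specify the activation function, and the function names are hypothetical.

```python
def simulated_output(weights, x):
    """Output of the simulated network.

    Assumption: a single layer of linear units, where weights[i][j]
    connects input j to output i.
    """
    return [sum(w * xj for w, xj in zip(row, x)) for row in weights]

def simulator_example(weights, x):
    """Build one (input, target) pair for the simulator network.

    The simulator sees the simulated network's inputs followed by its
    flattened weights, and should reproduce that network's output.
    """
    flat_weights = [w for row in weights for w in row]
    return list(x) + flat_weights, simulated_output(weights, x)
```

For a simulated network with two inputs and two outputs, the simulator would receive six values under this layout: the two inputs followed by the four weights.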

In order to create these sigma-pi simulator networks, an evolutionary approach is used. Evolutionary techniques are particularly useful for sigma-pi networks because, unlike in other standard models such as backpropagation networks, the connectivity between layers is important. For a summation unit, a weight of zero simply means that there is no real "influence" from the connected unit. For a product (pi) unit, however, if any one of its weights is zero (or close to it), then the whole unit will always have an activation of zero (or close to it). Thus for product units, there must be a distinction between when there is a connection between two nodes and when there is not. When there is a connection, the weight is applied. When there is not, the activation from that unit is not passed on and is not taken into account in the product. This particular problem also seems well suited to a genetic algorithm, since a simulator sigma-pi network can be built that assumes a weight of 1.0 whenever there is a connection between two units. The networks will be tested against a set of randomly created networks of the specific architecture that the sigma-pi network is simulating, each given random input values. Thus, the fitness of a network chromosome is calculated from how well it simulates these random networks, measured as the average mean squared error. The goal is to evolve a chromosome that perfectly simulates these networks.
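The fitness evaluation described above might look roughly like the sketch below. The chromosome encoding is an assumption made for illustration: each gene is a tuple of input indices connected to one pi unit with weight 1.0, and a single sigma unit sums the pi units. The simulated network is assumed to be a single linear unit with no hidden layer, and all names and parameter values are hypothetical.

```python
import math
import random

def simulate(chromosome, inputs):
    """Sigma-pi forward pass over a connectivity-only chromosome.

    Unconnected inputs are left out of the product entirely rather
    than multiplied in as zero. Genes with no connections are skipped.
    """
    return sum(math.prod(inputs[i] for i in gene) for gene in chromosome if gene)

def fitness(chromosome, n_inputs, n_trials=100, seed=0):
    """Average squared error against random networks (lower is better)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        # A random simulated network (weights w) and a random input (x).
        x = [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
        w = [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
        target = sum(wi * xi for wi, xi in zip(w, x))
        # The simulator receives the inputs followed by the weights.
        output = simulate(chromosome, x + w)
        total += (output - target) ** 2
    return total / n_trials
```

Under this encoding, a chromosome such as [(0, 2), (1, 3)] pairs each input with its own weight for a two-input network, so its error should be zero up to rounding, while a chromosome such as [(0, 1)] multiplies the two inputs together and should score worse.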

This evolutionary technique will be applied to try to evolve sigma-pi simulator networks of varying size. First, simulators will be evolved for simple networks with no hidden layers. If simulators for those networks can be evolved, a simulator for a larger network that contains a hidden layer will be attempted.


This file is located at
http://www.cs.hmc.edu/~jbasilic/cs152/project/statement.html