Problem Statement

Background:

Speaker Recognition: Speaker recognition is the process of determining which of a set of N speakers spoke the phrase that is input to the network. There are two types: text-dependent and text-independent.

In text-dependent speaker recognition, all speakers use the same specific phrase, and the system is trained on that phrase.
In text-independent speaker recognition, the system is trained to recognize the speaker regardless of which phrase they utter.

ART Network: ART stands for Adaptive Resonance Theory. There are many types of ART networks, but they share the same basic principle: an ART network is an attempt to create an unsupervised network that can change in response to significant new inputs while remaining insensitive to noise. A more thorough introduction can be found at http://www.wi.leidenuniv.nl/art/art-intro.html, which is an abstract of a paper examining the first ART network.
ARTMAP Network: An ARTMAP network is a variant of the ART network that can undergo supervised training by mapping the prototypes onto specific outputs.
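
The idea of prototypes mapped onto outputs can be illustrated with a minimal sketch. It assumes a simplified Fuzzy ARTMAP style of network (complement-coded inputs in [0, 1], a vigilance test, and match tracking); the text above does not say which ART variant the project uses, so the class name SimpleARTMAP, the parameter values, and the speaker labels are all hypothetical and chosen only for illustration.

```python
def complement_code(x):
    """Append 1 - v for each feature so every coded input has the same L1 norm."""
    return list(x) + [1.0 - v for v in x]


def fuzzy_and(a, b):
    """Component-wise minimum, the fuzzy AND used by Fuzzy ART."""
    return [min(p, q) for p, q in zip(a, b)]


class SimpleARTMAP:
    """Illustrative simplified Fuzzy ARTMAP: each prototype carries one output label."""

    def __init__(self, vigilance=0.75, alpha=0.001, beta=1.0):
        self.base_rho = vigilance  # baseline vigilance (how strict matching is)
        self.alpha = alpha         # small choice parameter
        self.beta = beta           # learning rate (1.0 means fast learning)
        self.weights = []          # one prototype vector per category
        self.labels = []           # output label mapped to each category

    def _ranked(self, coded):
        """Category indices ordered by the choice function, best first."""
        def choice(j):
            w = self.weights[j]
            return sum(fuzzy_and(coded, w)) / (self.alpha + sum(w))
        return sorted(range(len(self.weights)), key=choice, reverse=True)

    def train(self, x, label):
        coded = complement_code(x)
        rho = self.base_rho
        for j in self._ranked(coded):
            overlap = fuzzy_and(coded, self.weights[j])
            match = sum(overlap) / sum(coded)
            if match < rho:
                continue  # fails the vigilance test, try the next category
            if self.labels[j] == label:
                # Resonance: move the prototype toward the input.
                self.weights[j] = [
                    self.beta * o + (1.0 - self.beta) * w
                    for o, w in zip(overlap, self.weights[j])
                ]
                return j
            # Match tracking: the label was wrong, so raise vigilance
            # just above this match and keep searching.
            rho = match + 1e-6
        # No suitable category exists: commit a new prototype for this input.
        self.weights.append(coded)
        self.labels.append(label)
        return len(self.weights) - 1

    def predict(self, x):
        coded = complement_code(x)
        for j in self._ranked(coded):
            match = sum(fuzzy_and(coded, self.weights[j])) / sum(coded)
            if match >= self.base_rho:
                return self.labels[j]
        return "unknown"  # nothing resonates at baseline vigilance


net = SimpleARTMAP(vigilance=0.8)
net.train([0.1, 0.9], "alice")
net.train([0.12, 0.88], "alice")  # close to the first input: same prototype
net.train([0.9, 0.1], "bob")      # far from it: a new prototype is created
print(net.predict([0.11, 0.89]), net.predict([0.5, 0.5]))
```

In this sketch a small amount of variation updates an existing prototype, while a sufficiently different input creates a new one; an input that matches no prototype at the baseline vigilance is reported as "unknown". Using the vigilance test for rejection is only one possible design, not necessarily the one used in this project.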

Problem Statement:

The purpose of this project is to evaluate the performance of ARTMAP networks at text-dependent speaker recognition.
It will explore several aspects:

Ability to correctly identify the speaker of a new utterance of the same phrase
Ability to identify speakers for whom it has not been trained as "unknown"
The effects of different numbers of training samples and feature selections on performance
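
The first two aspects above could be scored separately, as in the following rough sketch. The function name evaluate, the "unknown" label, and the example speaker names are assumptions made for illustration; the project itself does not specify how results are to be measured.

```python
def evaluate(predicted, actual, enrolled, unknown_label="unknown"):
    """Score enrolled speakers and untrained speakers separately.

    predicted: labels returned by the network, one per test utterance
    actual:    true speaker of each test utterance
    enrolled:  set of speakers the network was trained on
    """
    known_total = known_correct = 0
    outsider_total = outsider_rejected = 0
    for p, a in zip(predicted, actual):
        if a in enrolled:
            known_total += 1
            known_correct += int(p == a)
        else:
            outsider_total += 1
            outsider_rejected += int(p == unknown_label)
    return {
        # Fraction of utterances by trained speakers that were identified correctly.
        "identification_accuracy":
            known_correct / known_total if known_total else None,
        # Fraction of utterances by untrained speakers that were labeled unknown.
        "unknown_rejection_rate":
            outsider_rejected / outsider_total if outsider_total else None,
    }


scores = evaluate(
    predicted=["alice", "bob", "unknown", "alice"],
    actual=["alice", "alice", "carol", "dave"],
    enrolled={"alice", "bob"},
)
print(scores)
```

Repeating such a measurement while varying the number of training samples per speaker, or the features extracted from each utterance, would be one way to examine the third aspect.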