Proposal:

    My proposed project is a text-dependent speaker recognition system that uses an ART2 network for the pattern recognition itself (ART2 is a variant of ART designed to take real-valued vector input).  I will train the system on a population of 8 people, 4 male and 4 female.  Each speaker will utter the same short phrase four times: three utterances to provide training data and one to provide test data.  The training data will be augmented with noisy versions of the same utterances to make the network more robust.  The speech will then be processed with STRUCT, a freely available toolkit that performs various forms of feature extraction.  The extracted training features will then be fed into the ART2 network, and its parameters adjusted to maximize the number of correctly recognized speakers and eliminate incorrect classifications.  Once the network has been trained, it will be given versions of the utterances with additional noise, as well as a new recording of the utterance, to test its stability.  It will also be given utterances by speakers it has not been trained on, to test its discrimination and plasticity.
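
    The noise augmentation and the accept-or-reject behaviour described above can be sketched roughly as follows.  This is only an illustration under stated assumptions, not the proposed implementation: the names add_noise, cosine and VigilanceClassifier are hypothetical, the feature vectors are made-up stand-ins for whatever STRUCT would produce, and the classifier is a much simplified, label-aware prototype matcher with an ART-style vigilance threshold rather than a real ART2 network.

```python
import math
import random


def add_noise(signal, snr_db, rng):
    """Return a copy of `signal` with white Gaussian noise at roughly `snr_db` dB SNR."""
    power = sum(x * x for x in signal) / len(signal)
    noise_power = power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VigilanceClassifier:
    """Simplified ART-like prototype matcher (an assumption, NOT a full ART2 network)."""

    def __init__(self, vigilance=0.95, learning_rate=0.5):
        self.vigilance = vigilance
        self.learning_rate = learning_rate
        self.prototypes = []  # list of (speaker_label, prototype_vector)

    def train(self, vector, label):
        # Look for the closest prototype of the same speaker that passes vigilance.
        best_index, best_sim = None, self.vigilance
        for i, (proto_label, proto) in enumerate(self.prototypes):
            sim = cosine(vector, proto)
            if proto_label == label and sim >= best_sim:
                best_index, best_sim = i, sim
        if best_index is None:
            # Plasticity: no close enough prototype, so commit a new one.
            self.prototypes.append((label, list(vector)))
        else:
            # Stability: nudge the matching prototype toward the new input.
            lr = self.learning_rate
            proto_label, proto = self.prototypes[best_index]
            updated = [(1 - lr) * p + lr * v for p, v in zip(proto, vector)]
            self.prototypes[best_index] = (proto_label, updated)

    def classify(self, vector):
        # Return the best-matching speaker, or None if nothing passes vigilance.
        best_label, best_sim = None, self.vigilance
        for proto_label, proto in self.prototypes:
            sim = cosine(vector, proto)
            if sim >= best_sim:
                best_label, best_sim = proto_label, sim
        return best_label


# Made-up feature vectors standing in for extracted speech features.
rng = random.Random(0)
speaker_a = [1.0, 0.0, 0.2]
speaker_b = [0.0, 1.0, 0.2]
untrained_speaker = [0.5, 0.5, 1.0]

net = VigilanceClassifier(vigilance=0.95)
for label, features in (("A", speaker_a), ("B", speaker_b)):
    net.train(features, label)
    net.train(add_noise(features, 30, rng), label)  # noisy copy for robustness
```

    In this sketch, raising the vigilance value makes the matcher stricter, so it rejects more inputs as unknown, while lowering it makes the matcher more tolerant of noise at the cost of discrimination between speakers.  A None result from classify stands for "speaker not recognized".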
 
