Speaker Recognition by Dmitriy Yakovlev

Abstract

This is a tool for neural network-based speaker recognition. A speaker recognition implementation typically trains on a set of voice samples from the population it is meant to classify. Then, when it is given further samples from that population, it should correctly identify the speaker of each sample. This is typically done by a combination of feature extraction and pattern recognition, but the neural network-based approach comes at it from a different direction. It relies on extracting a characteristic from a sound file that will be similar to the characteristics extracted from other sound files created by the same speaker.

The tool was created in MATLAB and primarily uses Linear Predictive Coding (LPC) to generate characteristic numbers. Results are positive, although significantly worse than the claimed results from previous network-based attempts.
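
As a rough illustration of what "characteristic numbers" means here: LPC models each sample as a weighted sum of the previous few samples, and the weights serve as a compact description of the sound. The sketch below is in Python rather than MATLAB and is not taken from go.m. It uses the standard autocorrelation method with the Levinson-Durbin recursion, and the function names and the order parameter are chosen only for illustration.

```python
def autocorrelation(signal, max_lag):
    """Return [r(0), ..., r(max_lag)], the unnormalized autocorrelation."""
    n = len(signal)
    return [sum(signal[i] * signal[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def lpc(signal, order):
    """Estimate LPC coefficients [1, a1, ..., a_order].

    Sign convention: the predicted sample is
        -(a1 * x[n-1] + a2 * x[n-2] + ... + a_order * x[n-order]).
    """
    r = autocorrelation(signal, order)
    a = [1.0] + [0.0] * order
    error = r[0]  # prediction error energy of the order-0 predictor
    for i in range(1, order + 1):
        if error == 0:
            break  # nothing left to predict (e.g. an all-zero signal)
        # Reflection coefficient for this order.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / error
        # Update the lower-order coefficients, then append the new one.
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        error *= (1.0 - k * k)
    return a


if __name__ == "__main__":
    # Toy signal in which each sample is half of the previous one.
    samples = [0.5 ** n for n in range(50)]
    print(lpc(samples, 2))
```

Calling lpc(samples, 12), for example, would give a 13-element vector whose leading entry is always 1. The remaining entries are the kind of values a network could take as inputs. The order 12 is only an example, not a value taken from this project.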

The directory listing below contains all the MATLAB files you would need to run the implementation. The .pdf files are presentations I gave about the problem and implementation, and discuss the problem in more depth than this abstract.

Presentations

Approaches to Voice Recognition - First presentation on the subject. Introduces techniques and neural network-based approaches.
Speaker Recognition - The outcome of my research. Discussion of the results my implementation achieved.

Files

getdata.m - MATLAB script for reading .wav files, used by
go.m - MATLAB script that creates a neural network to do speaker recognition based on non-windowed LPC coefficients.
go2.m - MATLAB script that creates a neural network to do speaker recognition based on windowed LPC coefficients.
data - zipped folder containing sample speech as .wav files. Read the comments in go.m and getdata.m for format suggestions.
getsound - shell script that captures sound samples. You need the sound-recorder package installed on your system and a microphone that provides input to /dev/dsp1 (or edit the script to use a different input interface).
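
The difference between go.m and go2.m is whether the LPC coefficients come from windowed audio. A common reading of "windowed", assumed here and not confirmed by the scripts themselves, is that the signal is cut into short overlapping frames and each frame is tapered by a window function before coefficients are computed for it. The Python sketch below shows only that framing step. The Hamming window, frame_len and hop are illustrative choices.

```python
import math


def hamming(length):
    """Hamming window: 0.54 - 0.46 * cos(2*pi*n / (length - 1))."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]


def windowed_frames(signal, frame_len, hop):
    """Split a signal into overlapping frames tapered by a Hamming window.

    Trailing samples that do not fill a whole frame are dropped.
    """
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames


if __name__ == "__main__":
    # Ten constant samples, frames of four samples, advancing two at a time.
    for frame in windowed_frames([1.0] * 10, frame_len=4, hop=2):
        print(frame)
```

Under this reading, the non-windowed variant would produce one coefficient vector per recording, while the windowed variant would produce one vector per frame.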

Dmitriy Yakovlev - Harvey Mudd College Fall '08 - CS152 Neural Networks - Professor Robert Keller