Project Description
Optical Character Recognition (OCR) is the process of converting
a digital image of text into a corresponding digital representation of that text.
Applications include:
- Conversion of written paper to digital documents
- Pen-based character input on tablet computers
- Automated sorting of mail based on ZIP code
- Check processing in ATMs
- Book digitizing
We set out to create a neural-network-based OCR system that runs on Android phones and
solves simple mathematical expressions involving the digits 0 - 9, addition, and subtraction.
The program accepts an input image, normalizes it and extracts the characters,
attempts to identify each character, and then tokenizes and solves the equation. Using a network with
two hidden layers trained with scaled conjugate gradient backpropagation, we achieved a
successful classification rate of over 96.5%.
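The final tokenize-and-solve step can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names `tokenize` and `solve` are hypothetical, and it assumes the recognizer yields one single-character string per symbol, in left-to-right order.

```python
DIGITS = set("0123456789")
OPERATORS = set("+-")


def tokenize(symbols):
    """Group recognized symbols into integer tokens and operator tokens."""
    tokens = []
    number = ""
    for s in symbols:
        if s in DIGITS:
            number += s  # consecutive digits form one multi-digit number
        elif s in OPERATORS:
            if number:
                tokens.append(int(number))
                number = ""
            tokens.append(s)
        else:
            raise ValueError(f"unexpected symbol: {s!r}")
    if number:
        tokens.append(int(number))
    return tokens


def solve(tokens):
    """Evaluate left to right; each operator sets the sign of the next number."""
    total, sign, pending = 0, 1, False
    for tok in tokens:
        if tok == "+":
            pending = True
        elif tok == "-":
            sign = -sign
            pending = True
        else:
            total += sign * tok
            sign, pending = 1, False
    if pending or not tokens:
        raise ValueError("expression must end with a number")
    return total


result = solve(tokenize("12+7-5"))
```

Because only addition and subtraction are supported, no operator precedence is needed and a single left-to-right pass is enough.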
Method
We broke this problem into three parts: image processing, character recognition with a neural network,
and equation solving.
- Image processing
To give the neural network good inputs, characters must first be isolated and extracted
from the image and normalized for color, size, and position.
Because our focus was on the character recognition aspect of the project, we used a relatively
naive form of image processing.
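One naive way to normalize a single extracted character for color, size, and position is sketched below. The source does not describe the actual algorithm, so everything here is an assumption: the function name `normalize_character`, the fixed threshold of 128, dark writing on a light background, a non-empty grayscale input given as rows of 0-255 values, and nearest-neighbor resampling that stretches the character to fill the grid.

```python
def normalize_character(gray, size=16, threshold=128):
    """Return a size x size boolean grid from a grayscale character image."""
    # Color: reduce to ink / no ink (assumes dark ink on a light background).
    ink = [[value < threshold for value in row] for row in gray]

    # Position: find the bounding box of the ink so the character fills the grid.
    rows = [r for r, row in enumerate(ink) if any(row)]
    cols = [c for c in range(len(ink[0])) if any(row[c] for row in ink)]
    if not rows:
        return [[False] * size for _ in range(size)]  # blank image
    top, left = rows[0], cols[0]
    height = rows[-1] - top + 1
    width = cols[-1] - left + 1

    # Size: nearest-neighbor resample of the cropped box onto the target grid.
    return [
        [ink[top + (r * height) // size][left + (c * width) // size]
         for c in range(size)]
        for r in range(size)
    ]


# A 2 x 2 "L" shape surrounded by white margin.
sample = [
    [255, 255, 255, 255],
    [255,   0, 255, 255],
    [255,   0,   0, 255],
    [255, 255, 255, 255],
]
grid = normalize_character(sample)
```

Stretching to the bounding box does not preserve aspect ratio, so a narrow character such as 1 becomes a filled block; padding the box to a square first would be one way around that.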
- Character recognition
We used a three-layer feed-forward network trained with scaled conjugate gradient
backpropagation. Training data was a combination of 163 characters we gathered by
soliciting equations written on a whiteboard and 1593 preprocessed characters from the
UCI Machine Learning Repository.
Inputs were a 16 by 16 boolean array, and outputs were a 1 by 12 one-hot encoding
of the digits 0 through 9 and the symbols + and -.
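The input and output encodings described above can be illustrated with a short sketch. The class order (digits first, then + and -), the row-by-row flattening, and the helper names are assumptions made for this example; the source only gives the array shapes.

```python
CLASSES = list("0123456789+-")  # assumed ordering of the 12 output neurons


def encode_input(grid):
    """Flatten a 16 x 16 boolean grid, row by row, into 256 values of 0.0 or 1.0."""
    return [1.0 if cell else 0.0 for row in grid for cell in row]


def one_hot(symbol):
    """Return a 12-element target vector with a single 1.0 at the symbol's index."""
    target = [0.0] * len(CLASSES)
    target[CLASSES.index(symbol)] = 1.0
    return target


def decode_output(outputs):
    """Map network outputs back to a symbol by picking the most active neuron."""
    best = max(range(len(outputs)), key=outputs.__getitem__)
    return CLASSES[best]


inputs = encode_input([[False] * 16 for _ in range(16)])
target = one_hot("+")
```

Decoding by the largest output means the network does not have to produce an exact one-hot vector to be counted as correct.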
Training was done using the MATLAB Neural Network Toolbox. We experimented with
different network sizes, training functions, input sizes, and transfer functions.
Our best results came from a 65 x 58 network with 12 output neurons. Like some
of our sources, we found scaled conjugate gradient training much more effective and faster than
Levenberg-Marquardt backpropagation training. We also achieved better results
using the logsig transfer function than the tansig transfer function.
Figure: Scaled conjugate gradient training
Figure: Levenberg-Marquardt training, 30 x 30 network
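To make the network shape concrete, here is a minimal forward pass for a 256-input network with hidden layers of 65 and 58 neurons and 12 outputs. It is a sketch under assumptions: the weights are random placeholders rather than trained values, the same transfer function is applied at every layer including the output, and `make_layer` and `forward` are hypothetical helper names. The `logsig` and `tansig` definitions follow the standard formulas for those functions.

```python
import math
import random


def logsig(x):
    """Logistic sigmoid: output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tansig(x):
    """Hyperbolic tangent sigmoid: output in (-1, 1)."""
    return math.tanh(x)


def make_layer(n_in, n_out, rng):
    """Random weights and biases; a trained network would load learned values."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return weights, biases


def forward(layers, inputs, transfer=logsig):
    """Propagate inputs through each layer: transfer(W * a + b)."""
    activations = inputs
    for weights, biases in layers:
        activations = [
            transfer(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations


rng = random.Random(0)
sizes = [256, 65, 58, 12]  # inputs, two hidden layers, outputs
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
outputs = forward(layers, [0.0] * 256)
```

With logsig every output lies between 0 and 1, which lines up directly with one-hot targets; tansig outputs range from -1 to 1.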
Results
70% of each training set is used for training, and the remaining 30% for validation and testing. Training runs for a maximum of 1000 epochs.
The final success rate is the percentage of correct classifications on the entire data set, averaged over 10 runs.
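The evaluation procedure can be sketched roughly as below. The source states only that 30% is shared between validation and testing, so the even 15% / 15% division is an assumption, as are the helper names `split_indices`, `success_rate`, and `mean_success`.

```python
import random


def split_indices(n, rng, train=0.70, val=0.15):
    """Shuffle sample indices and split them into training, validation, and test sets."""
    order = list(range(n))
    rng.shuffle(order)
    n_train = round(n * train)
    n_val = round(n * val)
    return (order[:n_train],
            order[n_train:n_train + n_val],
            order[n_train + n_val:])  # the test set takes whatever remains


def success_rate(predicted, actual):
    """Percentage of samples whose predicted label matches the true label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


def mean_success(rates):
    """Average the per-run success rates."""
    return sum(rates) / len(rates)


train_idx, val_idx, test_idx = split_indices(1000, random.Random(1))
rate = success_rate(list("12+4"), list("12-4"))
```

Note that a success rate measured on the entire data set includes the samples used for training, so it is likely to be higher than accuracy on the held-out test portion alone.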
| Training set | Network size | Training method | Success % |
| --- | --- | --- | --- |
| Ours | 15 x 15 | SCG | 91.2 |
| Ours | 25 x 25 | SCG | 93.5 |
| Ours | 65 x 58 | SCG | 95.3 |
| Ours | 100 x 85 | SCG | 94.1 |
| Ours | 15 x 15 | LM | 89.63 |
| Ours + UCI | 30 x 30 | LM | 60.9 |
| Ours + UCI | 15 x 15 | SCG | 90.9 |
| Ours + UCI | 25 x 25 | SCG | 93.7 |
| Ours + UCI | 65 x 58 | SCG | 96.6 |
| Ours + UCI | 100 x 85 | SCG | 97.35 |
| Ours + UCI | 50 x 45 x 25 | SCG | 96.05 |
| Ours + UCI | 70 x 25 x 25 | SCG | 95.73 |
We successfully ported our image processing code along with our neural network onto the Android platform.
The application accepts a picture and attempts to solve the given equation. The app is low on features but
serves as a proof of concept.
Sources
Matan, O.; Baird, H.S.; Bromley, J.; Burges, C.J.C.; Denker, J.S.; Jackel, L.D.; Le Cun, Y.; Pednault, E.P.D.; Satterfield, W.D.; Stenard, C.E.; Thompson, T.J. "Reading handwritten digits: a ZIP code recognition system." Computer, vol. 25, no. 7, pp. 59-63, July 1992.
Shoeb Shatil, Adnan Md. Research Report on Bangla Optical Character Recognition Using Kohonen Network. Tech. Center for Research on Bangla Language Processing, n.d. Web. http://www.panl10n.net/english/final%20reports/pdf%20files/Bangladesh/BAN26.pdf.
Sloss, Steven. "Handwritten Equation Intelligent Character Recognition with Neural Networks." Harvey Mudd College, 2007. http://www.math.hmc.edu/~dyong/math164/2007/sloss/finalreport.pdf.
Xiaojun Zhai; Bensaali, F.; Sotudeh, R. "OCR-based neural network for ANPR." Imaging Systems and Techniques (IST), 2012 IEEE International Conference on, pp. 393-397, 16-17 July 2012.