SCHOOL OF INFORMATION TECHNOLOGY AND
Fall Semester 2018-2019
TOPIC: OPTICAL CHARACTER RECOGNITION USING
ARTIFICIAL NEURAL NETWORK
COURSE NAME: SOFT COMPUTING (ITE 1015)
NISHANT MITTAL (16IT0173) SUBMITTED TO:
PROF. RATHI R.
Optical character recognition is a difficult task that requires heavy image processing
followed by algorithms that convert the processed data into a recognized character. While
programs that perform character recognition already exist, they need intensive
processing that is not always necessary, because they recognize a wide range of characters
spanning numerous fonts. In applications where a particular character set in a particular
font is defined, the processing requirements can be lessened by developing an OCR
system tailored to those specifications. One way to do this is with artificial neural networks.
This method has numerous tunable parameters, and in many cases the optimal settings
need to be determined through trial and error. The primary aim of this work is to explore
the utility of an Artificial Neural Network based approach to the recognition of characters. A
multilayer perceptron neural network is built for classification using the Back
Propagation learning algorithm.
In the proposed system, each typed English letter is represented by binary numbers that are
used as input to a simple feature extraction system whose output, in addition to the input,
is fed to an ANN. Afterwards, the Feed Forward Algorithm gives insight into the inner
workings of a neural network, followed by the Back Propagation Algorithm, which
comprises Training, Calculating Error, and Modifying Weights.
Optical Character Recognition, typically referred to as OCR, is the process of converting the
image obtained by scanning a text or a document into a machine-editable format. A computer
system equipped with such an OCR system can increase the speed of input operations and
reduce some possible human errors. Recognition of printed characters is itself a challenging
problem, since the same character can look different due to a change of font or the
introduction of different types of noise. Differences in font and size make the recognition
task difficult if pre-processing, feature extraction and recognition are not done properly.
There might be noise pixels that are introduced by scanning the image. Besides, the same
font and size may have both bold and normal characters, so the width of the stroke is
also a factor that affects recognition. Thus, a good character recognition approach must
remove the noise after reading the binary image data, smooth the image for better recognition,
extract features properly, train the system and classify patterns.
A lot of people today are trying to write their own OCR system or to improve the
quality of an existing one. We will demonstrate how the use of an artificial neural network
simplifies development of an optical character recognition application while achieving high
recognition quality and good performance. Building an OCR system is a difficult task that
requires a lot of effort. Such systems are usually very complicated and can hide a lot of
logic. The use of an artificial neural network in OCR applications can improve the quality of
recognition while achieving good performance.
There are two basic methods used for OCR: matrix matching and feature extraction. Of the
two, matrix matching is simpler and more common.
Matrix matching compares what the OCR scanner sees as a character with a library of
character matrices or templates. When an image matches one of these prescribed matrices
of dots within a given level of similarity, the computer labels that image as the
corresponding ASCII character. Feature extraction is OCR without strict matching to
prescribed templates. Also known as Intelligent Character Recognition (ICR) or
Topological Feature Analysis, this method varies by how much “computer intelligence” is
applied by the manufacturer. The computer checks for general features such as open areas,
closed shapes, diagonal lines, line intersections, etc. This method is far more versatile than
matrix matching. Matrix matching works best when the OCR faces a limited repertoire of
type styles, with little or no variation within each style. Where the characters are less
predictable, feature (or topographical) analysis is superior.
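The matrix matching idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 3x3 templates, the function names `similarity` and `match_character`, and the 0.8 similarity threshold are all hypothetical and are not taken from any particular OCR product.

```python
# Hypothetical 3x3 templates; a real library would hold one matrix per character and font.
TEMPLATES = {
    "I": [[0, 1, 0],
          [0, 1, 0],
          [0, 1, 0]],
    "L": [[1, 0, 0],
          [1, 0, 0],
          [1, 1, 1]],
    "T": [[1, 1, 1],
          [0, 1, 0],
          [0, 1, 0]],
}


def similarity(a, b):
    """Fraction of pixel positions on which two equally sized binary matrices agree."""
    total = sum(len(row) for row in a)
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total


def match_character(glyph, templates=TEMPLATES, threshold=0.8):
    """Return the best-matching character, or None if no template is similar enough."""
    best_char, best_score = None, 0.0
    for char, template in templates.items():
        score = similarity(glyph, template)
        if score > best_score:
            best_char, best_score = char, score
    return best_char if best_score >= threshold else None


# An "L" with one pixel lost to noise still agrees with its template on 8 of 9 pixels.
noisy_l = [[1, 0, 0],
           [1, 0, 0],
           [1, 1, 0]]
print(match_character(noisy_l))
```

A glyph that does not resemble any template closely enough is rejected rather than forced into a class, which is the "given level of similarity" mentioned in the text.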
STRUCTURE OF OCR SYSTEM
OCR is the short form for Optical Character Recognition. This technology allows a machine
to automatically recognize characters through an optical mechanism. Human beings
recognize many objects in this manner, our eyes being the “optical mechanism.” But while
the brain “sees” the input, the ability to understand these signals varies in each person
according to many factors. By reviewing these variables, we can comprehend the challenges
encountered by the technologist developing an OCR system. The ultimate objective of any
OCR system is to simulate human reading abilities so that the computer can read,
understand, edit and otherwise work with the text as a person does.
Block diagram of the typical OCR system.
Each stage has its own limitations and effects on the overall system’s performance. These
problems can be tackled either by solving each particular problem at its own stage or by
integrating all stages of the OCR system into one main stage, which is what our research
proposes. The study presents a new structure of OCR system which relies on the powerful
properties of artificial neural networks. The algorithm is designed and tested in the related
sections.
ARTIFICIAL NEURAL NETWORK
An Artificial Neuron is essentially an engineering model of a biological neuron, i.e. Neural
Networks basically aim at imitating the structure and functioning of the human brain to
create intelligent behavior.
A Neural Network is a massively parallel distributed processor made up of simple
processing units that have a natural propensity for storing experiential knowledge and making
it available for use. It resembles the brain in two respects. First, knowledge is acquired by
the network from its environment through a learning process. Second, interneuron connection
strengths are used to store the acquired knowledge.
In a Neural Network, each node performs some simple calculation and each connection
conveys a signal from one node to another, marked by a number called the “connection
strength” or weight. For neuron k with inputs $x_j$, weights $w_{kj}$ and bias $b_k$:
Linear Combination $U_k$: $U_k = \sum_j w_{kj} x_j$
Induced Local Field $V_k$: $V_k = U_k + b_k$
Activation function $\varphi$ defines the value of the output $Y_k$: $Y_k = \varphi(V_k)$
The activation functions used here are of different types: Threshold Activation Function,
Piecewise Linear Activation Function, Sigmoid Activation Function, Signum Activation
Function.
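As an illustration of the neuron model and the activation functions listed above, the following minimal Python sketch computes the linear combination, the induced local field and the output of a single neuron. The function names are our own, and the piecewise linear function uses one common convention (linear between -0.5 and 0.5), which is an assumption rather than something fixed by this report.

```python
import math


def threshold(v):
    """Threshold activation: 1 for v >= 0, otherwise 0."""
    return 1.0 if v >= 0 else 0.0


def piecewise_linear(v):
    """Linear between -0.5 and 0.5, saturating at 0 and 1 outside (assumed convention)."""
    return min(1.0, max(0.0, v + 0.5))


def sigmoid(v):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-v))


def signum(v):
    """Signum activation: -1, 0 or +1 according to the sign of v."""
    return (v > 0) - (v < 0)


def neuron_output(weights, inputs, bias, activation):
    u = sum(w * x for w, x in zip(weights, inputs))  # linear combination U_k
    v = u + bias                                     # induced local field V_k
    return activation(v)                             # output Y_k


# Example values chosen so that U_k = 0.5*1 + (-0.25)*2 = 0.
print(neuron_output([0.5, -0.25], [1, 2], 0.0, sigmoid))
```

Swapping the last argument changes only the final step, so the same weighted sum can be passed through any of the four activation functions.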
Learning is formally defined as a process by which the free parameters of a Neural Network
are adapted through a process of stimulation by the environment in which the network is
embedded.
The system begins learning from some initial weight values; as the learning process
proceeds, the weight values keep changing, and the final values give the output at the end.
Learning can be divided into: first, Supervised Learning, i.e. learning with a teacher; second,
Unsupervised Learning, i.e. learning without a teacher.
Typical pattern recognition systems are designed using two passes. The first pass is a feature
extractor that finds features within the data which are specific to the problem being solved.
The second pass is the classifier, which is more general purpose and could be trained using a
neural network and sample data sets.
As Optical Character Recognition is a multiclass problem, amongst the various
classification methods we have used the Multilayer Feed Forward Architecture, which consists
of an Input Layer, an Output Layer and one or more Hidden Layers. As the number of Hidden
Layers increases, the complexity of the network also increases.
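One simple way to see how complexity grows with the number of hidden layers is to count the trainable parameters of a fully connected feed-forward network. The sketch below is illustrative only: the layer sizes (24 inputs, 10 hidden units, 26 outputs) are hypothetical and are not the configuration used in this work.

```python
def count_parameters(layer_sizes):
    """Weights plus one bias per non-input unit in a fully connected feed-forward network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


one_hidden = count_parameters([24, 10, 26])        # input, one hidden layer, output
two_hidden = count_parameters([24, 10, 10, 26])    # same network with a second hidden layer
print(one_hidden, two_hidden)
```

With these assumed sizes, adding a second hidden layer of 10 units raises the count from 536 to 646 parameters, all of which must be adjusted during training.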
The Back-Propagation Neural Network (BPNN) algorithm, proposed by Rumelhart, Hinton and
Williams, is the most popular and the oldest supervised learning algorithm for multilayer
feed-forward neural networks.
Input vectors and the corresponding target vectors are used to train a network until it can
approximate a function, associate input vectors with specific output vectors, or classify input
vectors in an appropriate way as defined by the user. Networks with biases, a sigmoid layer,
and a linear output layer are capable of approximating any function with a finite number of
discontinuities.
The steps of the Back Propagation Algorithm are as follows.
A. Weight Initialization
Set all weights and node thresholds to some small random values.
B. Calculation of Activation
1) Input Unit: The activation level of an input unit is determined by the instance
presented to the network.
2) Hidden Unit and Output Unit: The activation level $O_j$ of a hidden unit or an output unit is
$O_j = F\left(\sum_i w_{ji} O_i - \theta_j\right)$
where $w_{ji}$ = weight from input $O_i$ to unit j,
$\theta_j$ = node threshold at unit j,
$F$ = activation function (the sigmoid, consistent with the error gradients in step C).
C. Weight Training
1) Weight Change: Start at the output units and work backward to the hidden layer, recursively
adjusting the weights by
$w_{ji}(t+1) = w_{ji}(t) + \Delta w_{ji}$
2) Weight Change Computation: The weight change is computed by
$\Delta w_{ji} = \eta \, \delta_j \, O_i$
where $\eta$ = learning rate,
$\delta_j$ = error gradient at unit j.
The error gradient at an output unit is
$\delta_j = O_j (1 - O_j)(T_j - O_j)$
and at a hidden unit is
$\delta_j = O_j (1 - O_j) \sum_k \delta_k w_{kj}$
where $T_j$ = target value,
$O_j$ = actual output value,
$\delta_k$ = error gradient at unit k, to which a connection points from unit j.
D. Repeat the iterations until convergence.
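Steps A to D can be sketched as a small pure-Python network with one hidden layer. This is a minimal illustration, not the implementation used in this work: the class name `BackPropNetwork`, the logical-OR training data, the learning rate of 0.5, the weight range and the fixed number of epochs (used in place of a convergence test) are all assumptions made for the example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class BackPropNetwork:
    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)

        def small():
            return rng.uniform(-0.5, 0.5)

        # Step A: small random weights and node thresholds.
        self.w_hidden = [[small() for _ in range(n_in)] for _ in range(n_hidden)]
        self.t_hidden = [small() for _ in range(n_hidden)]
        self.w_out = [[small() for _ in range(n_hidden)] for _ in range(n_out)]
        self.t_out = [small() for _ in range(n_out)]

    def forward(self, x):
        # Step B: O_j = F(sum_i w_ji * O_i - theta_j)
        hidden = [sigmoid(sum(w * o for w, o in zip(ws, x)) - t)
                  for ws, t in zip(self.w_hidden, self.t_hidden)]
        out = [sigmoid(sum(w * o for w, o in zip(ws, hidden)) - t)
               for ws, t in zip(self.w_out, self.t_out)]
        return hidden, out

    def train_step(self, x, target, eta=0.5):
        hidden, out = self.forward(x)
        # Step C: error gradients, output units first, then hidden units.
        d_out = [o * (1 - o) * (t - o) for o, t in zip(out, target)]
        d_hidden = [h * (1 - h) * sum(d_out[k] * self.w_out[k][j]
                                      for k in range(len(d_out)))
                    for j, h in enumerate(hidden)]
        # Weight change: delta_w_ji = eta * delta_j * O_i.
        for k, dk in enumerate(d_out):
            for j, h in enumerate(hidden):
                self.w_out[k][j] += eta * dk * h
            self.t_out[k] -= eta * dk  # the threshold enters the sum with a minus sign
        for j, dj in enumerate(d_hidden):
            for i, xi in enumerate(x):
                self.w_hidden[j][i] += eta * dj * xi
            self.t_hidden[j] -= eta * dj

    def error(self, samples):
        """Sum of squared errors over all samples and output units."""
        return sum(0.5 * (t - o) ** 2
                   for x, target in samples
                   for t, o in zip(target, self.forward(x)[1]))


samples = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [1])]  # logical OR
net = BackPropNetwork(n_in=2, n_hidden=2, n_out=1)
before = net.error(samples)
for _ in range(2000):  # Step D: a fixed number of epochs stands in for a convergence test
    for x, target in samples:
        net.train_step(x, target)
after = net.error(samples)
print(before, after)
```

The thresholds are updated with the opposite sign to the weights because they are subtracted inside the activation formula; the total error printed after training should be lower than the one printed before.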
The complete process is broken down into Pre-processing and Feature Extraction, after which
the result is passed through the Artificial Neural Network for training and recognition.
A. Pre-processing
In this stage the obtained image is passed through various phases, because the image
cannot be passed directly to the recognition system.
Overview of Character Recognition System
The various phases include Conversion to Gray Image, Segmentation, Complement image,
Normalization and Noise Reduction.
1) Conversion to Gray Image: Transforms the truecolor RGB image to a grayscale intensity
image by using the function rgb2gray().
2) Segmentation: Transforms any image into black text written on a white
background. Thus, it brings uniformity to all the input images. This also lessens the
computation required, as the system has to deal with only two colours, i.e. black and white.
3) Normalization: Resizes the obtained images to the same size so that further
processing can be applied uniformly.
4) Noise Reduction: Lessens the noise, if present, before the image is
passed to the ANN.
5) Complement Image: Transforms the white pixels of an image to black
pixels and vice versa.
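The five phases above can be sketched in plain Python on images stored as nested lists. This is an illustrative sketch under assumptions, not the MATLAB code used in this work: the fixed threshold of 128, the nearest-neighbour resize, the simple speckle filter and all function names other than the grayscale weights (which are the ones documented for rgb2gray) are our own choices.

```python
def to_gray(rgb):
    """Weighted sum of R, G and B, using the weights documented for rgb2gray."""
    return [[0.2989 * r + 0.5870 * g + 0.1140 * b for r, g, b in row] for row in rgb]


def binarize(gray, threshold=128):
    """1 = white background, 0 = black text. The fixed threshold is an assumption."""
    return [[1 if p >= threshold else 0 for p in row] for row in gray]


def resize_nearest(img, height, width):
    """Normalize the image size by nearest-neighbour sampling."""
    src_h, src_w = len(img), len(img[0])
    return [[img[r * src_h // height][c * src_w // width] for c in range(width)]
            for r in range(height)]


def remove_isolated_pixels(binary):
    """Flip any pixel that differs from all of its neighbours (a simple speckle filter)."""
    h, w = len(binary), len(binary[0])
    out = [row[:] for row in binary]
    for r in range(h):
        for c in range(w):
            neighbours = [binary[rr][cc]
                          for rr in range(max(0, r - 1), min(h, r + 2))
                          for cc in range(max(0, c - 1), min(w, c + 2))
                          if (rr, cc) != (r, c)]
            if neighbours and all(n != binary[r][c] for n in neighbours):
                out[r][c] = 1 - binary[r][c]
    return out


def complement(binary):
    """Swap black and white pixels."""
    return [[1 - p for p in row] for row in binary]


def preprocess(rgb, size=(8, 8)):
    binary = binarize(to_gray(rgb))            # phases 1 and 2
    binary = resize_nearest(binary, *size)     # phase 3
    binary = remove_isolated_pixels(binary)    # phase 4
    return complement(binary)                  # phase 5


WHITE, BLACK = (255, 255, 255), (0, 0, 0)
bar = [[WHITE, BLACK, BLACK, WHITE] for _ in range(4)]  # a black vertical bar on white
print(preprocess(bar, size=(4, 4)))
```

After the final complement step the character pixels are 1 and the background is 0, which is a convenient form for the feature extraction stage.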
B. Feature Extraction
Feature extraction serves two purposes: it extracts properties that identify each
character, and at the same time it extracts properties that differentiate between
similar characters.
Stages of Pre-processing
The Output of stage 1 Pre-processing showing different plots.
In this study we use single-level two-dimensional wavelet decomposition with respect to
a particular wavelet, the Symlet 'sym4'. The method calculates the approximation coefficients
and detail coefficients of the images. It extracts 6 statistical features, viz. mean, median,
standard deviation, minimum, maximum, and variance, for these approximation and detail
coefficients.
1) Low Pass Approximation: The approximation coefficients of the input image, produced by a
one- or two-dimensional wavelet analysis function.
2) Horizontal Detail Image: For an input image, the horizontal details of the image are
displayed in this portion.
3) Vertical Detail Image: For an input image, the vertical details of the image are
displayed in this portion.
4) Diagonal Detail Image: For an input image, the diagonal details of the image are
displayed in this portion.
Feature Extraction Process
The output of stage 2 Feature Extraction showing different plots of Low-pass
Approximation, Horizontal Detail Image, Vertical Detail Image, and Diagonal Detail Image
at Level 1 and at level 2.
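The feature extraction step can be illustrated with a small pure-Python sketch. Note the substitution: to keep the example self-contained it uses the Haar wavelet, not the 'sym4' Symlet used in this study, so the coefficient values differ from those of the actual system. The function names and the use of population standard deviation and variance are also assumptions, and sign conventions for the detail sub-bands vary between libraries.

```python
import statistics


def haar_dwt2(img):
    """Single-level 2-D Haar transform of an image with even height and width."""
    approx, horiz, vert, diag = [], [], [], []
    for r in range(0, len(img), 2):
        rows = ([], [], [], [])
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            rows[0].append((a + b + d + e) / 2)  # low-pass approximation
            rows[1].append((a + b - d - e) / 2)  # horizontal detail
            rows[2].append((a - b + d - e) / 2)  # vertical detail
            rows[3].append((a - b - d + e) / 2)  # diagonal detail
        for band, row in zip((approx, horiz, vert, diag), rows):
            band.append(row)
    return approx, horiz, vert, diag


def band_features(band):
    """Mean, median, standard deviation, minimum, maximum and variance of one sub-band."""
    values = [v for row in band for v in row]
    return [statistics.mean(values), statistics.median(values),
            statistics.pstdev(values), min(values), max(values),
            statistics.pvariance(values)]


def extract_features(img):
    """Concatenate the six statistics of all four sub-bands into one feature vector."""
    return [f for band in haar_dwt2(img) for f in band_features(band)]


flat = [[1, 1, 1, 1] for _ in range(4)]  # a constant image has no detail content
features = extract_features(flat)
print(len(features), features[:6])
```

Six statistics for each of the four sub-bands give a vector of 24 values per character image, which can then be presented to the network as its input.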
TRAINING AND CLASSIFICATION
Before recognition is done, the Artificial Neural Network must be trained so that the
network becomes capable of mapping different inputs to their corresponding outputs, and
the system can thus classify the various characters.
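One common way to set up this mapping, shown here only as an assumed illustration since the output coding is not specified in this report, is to give the network one output unit per letter, train it towards a one-hot target vector, and classify by picking the unit with the largest output. The names `one_hot` and `decode` are hypothetical.

```python
import string

LETTERS = string.ascii_uppercase  # 26 output classes, one per English capital letter


def one_hot(letter):
    """Target vector used during training: 1 for the correct letter, 0 elsewhere."""
    return [1.0 if c == letter else 0.0 for c in LETTERS]


def decode(outputs):
    """Classification: the letter whose output unit has the largest activation."""
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return LETTERS[best]


# A hypothetical network response in which the third unit is clearly the strongest.
response = [0.1] * 26
response[2] = 0.9
print(decode(response))
```

During training the target for a character image is `one_hot(letter)`; during classification the network output is passed to `decode`.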
Input and Output Parameters of ANN during the phase of Training and Classification
Artificial neural networks are commonly used to perform character recognition due to their
high noise tolerance. Such systems have the ability to yield remarkable results. The feature
extraction step of optical character recognition is the most crucial step. A poorly chosen set
of features will give poor classification rates with any neural network. At the current stage
of development, the software performs reasonably in terms of both speed and accuracy, but
no better than existing OCR methods. It is not likely to replace them, especially for English
text. A simple
approach for recognition of optical characters by artificial neural networks has been
described. Despite the computational complexity involved, artificial neural networks have
several advantages in back-propagation learning and classification with respect to
emulating adaptive human intelligence to a small extent.