Pattern Recognition

Example: Neural Network Number Recognition

Description

Neural networks have been successfully used for pattern recognition in a number of fields, such as handwriting and speech recognition. As a starting point for further exploration, this example trains two neural networks on numbers rendered in different fonts and tests the ability of the networks to generalize from the training data. The weights are visualized to illuminate the inner workings of the networks.

Output Encoding

The encoding of the output is kept simple so that there is a direct relationship between neurons and the numbers they encode. Each output neuron encodes one number, i.e. its bit is set to one and the rest to zero. Other encodings, such as the ASCII encoding used in the Associative Memory Example, are more condensed and could be used for more complex scenarios.

(* Encode a character as a one-hot bit vector; decode by locating the largest
   activation. The elided middle of this cell is reconstructed from the
   surviving fragments. *)
BitEncode[chars_List, c_String] := Module[{p = First[First[Position[chars, c]]] - 1},
  IntegerDigits[2^p, 2, Length[chars]]]
BitDecode[chars_List, bits_List] := Module[
  {p = Length[chars] - First[First[Position[bits, Max[bits]]]] + 1}, chars〚p〛]
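A brief illustration, assuming the reconstructed definitions above (note the reversed bit order implied by the decoding):

BitEncode[CharacterRange["0", "9"], "3"]
(* -> {0, 0, 0, 0, 0, 0, 1, 0, 0, 0} *)
BitDecode[CharacterRange["0", "9"], {0.1, 0., 0.2, 0.1, 0., 0., 0.9, 0.1, 0., 0.}]
(* -> "3": the largest activation wins *)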

Character Bitmaps

The charbitmaps function takes a list of characters as input. Each character is drawn on a square bitmap, and the bitmaps are shown as an array. To convert from Mathematica graphics primitives to a raster bitmap, the graphics are exported and imported in a native bitmap format, from which the bits are extracted, normalized, and converted to grayscale.

(* Reconstruction sketch: the body of this cell is garbled in the export. Each
   character is drawn as graphics, exported and re-imported in a raster format,
   and the pixel matrix is normalized to [0, 1]; 1 - b inverts the values so
   ink becomes high activations (a grayscale raster is assumed). *)
charbitmaps[chars_List, size_Integer, textstyle_List] :=
  Map[Module[{g, b},
     g = Graphics[Text[StyleForm[#, Sequence @@ textstyle], {0, 0}],
       AspectRatio -> 1,
       PlotRange -> {{-1.2 size, 1.2 size}, {-1.2 size, 1.2 size}}] ;
     Export["char.bmp", g, ImageSize -> {size, size}] ;
     b = Import["char.bmp"]〚1, 1〛/255. ;
     1 - b] &, chars]
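A minimal usage sketch, assuming the reconstruction above:

(* Render the ten digits at 20×20 in one style; each bitmap is a square matrix. *)
bmps = charbitmaps[CharacterRange["0", "9"], 20, {FontSize -> 20, FontFamily -> "Courier"}] ;
Dimensions[First[bmps]]   (* expected: {20, 20} *)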

To generate a suitable sample of training data, the charvariations function contains a list of a variety of fonts. The fonts found on different systems vary, so this list can be edited to include other available fonts. The fonts available on Windows systems can be seen in the Fonts Control Panel applet and, if installed, in the Accessories/System Tools/Character Map application. The Mathematica 1 font is not included in the training data, so it can be used to test the generalization power of the networks. The charvariations function returns the generated bitmaps and their matching classifications in the form used by the Train function.

charvariations[chars_List, size_Integer] := Module[
  {inputs, outputs, data = {}, styles},
  (* The original font list is garbled in this export; these entries are
     placeholders -- edit them to fonts available on the system. *)
  styles = {
    {FontSize -> size, FontFamily -> "Courier"},
    {FontSize -> size, FontFamily -> "Times"},
    {FontSize -> size, FontFamily -> "Arial"},
    {FontSize -> size, FontFamily -> "Courier", FontWeight -> "Bold"},
    {FontSize -> size, FontFamily -> "Times", FontWeight -> "Bold"},
    {FontSize -> size, FontFamily -> "Arial", FontWeight -> "Bold"}} ;
  Map[(
      inputs = charbitmaps[chars, size, #] ;
      outputs = Map[BitEncode[chars, #] &, chars] ;
      data = Join[data, MapThread[({Flatten[#1], #2}) &, {inputs, outputs}]]) &,
    styles] ;
  data]
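Each returned pattern pairs a flattened bitmap with its one-hot classification, the form the Train function expects. A quick shape check under the reconstruction above:

Map[Dimensions, First[charvariations[CharacterRange["0", "9"], 20]]]
(* expected: {{400}, {10}} -- a 400-pixel input and a 10-bit output *)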

Bitmap Generation

To illustrate the principles, the pattern recognition task includes only the decimal numbers; extending the sample is straightforward. The bitmap size used as input is set to 20×20 pixels.

chars = CharacterRange["0", "9"] ; num = Length[chars] ; pixels = 20 ;

The training data is generated and the different fonts are displayed slightly enlarged. Preprocessing steps can assist the network by reducing the complexity of the recognition task: the images in the bitmaps could be resized to a common size, the line widths adjusted to a common width, the images moved to the center of the bitmap, or the images rotated (relevant for italic fonts or handwritten characters). For simplicity, no preprocessing is performed in this example.
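For illustration only, here is a hedged sketch of one such step: recentering a bitmap on its center of ink. The helper centerbitmap is hypothetical and not part of the original example.

(* Hypothetical sketch: recenter a grayscale bitmap on its center of ink. *)
centerbitmap[b_List] := Module[{n = Length[b], pos, shift},
  pos = Position[b, x_ /; x > 0.5] ;               (* coordinates of ink pixels *)
  If[pos === {}, Return[b]] ;                      (* blank bitmap: nothing to do *)
  shift = Round[(n + 1)/2 - Mean[N[pos]]] ;        (* centroid offset from center *)
  RotateLeft[Map[RotateLeft[#, -shift〚2〛] &, b], -shift〚1〛]]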

data = charvariations[chars, pixels] ;

[Graphics:HTMLFiles/PatternRecognition_6.gif][Graphics:HTMLFiles/PatternRecognition_7.gif][Graphics:HTMLFiles/PatternRecognition_8.gif]
[Graphics:HTMLFiles/PatternRecognition_9.gif][Graphics:HTMLFiles/PatternRecognition_10.gif][Graphics:HTMLFiles/PatternRecognition_11.gif]

Neural Network With No Hidden Layer

The neural network is created with Sigmoid activation functions and the standard incremental backpropagation training algorithm. The input layer contains a neuron for each bit in the square bitmap, and the output layer contains a neuron for each number. No hidden layer is used, forcing the network to classify the bitmaps directly.

<< Fann`
(* One input neuron per pixel and one output neuron per digit; the elided
   Layers option is reconstructed from the surrounding text. *)
NeuralNet[net, NeuralNetType -> Layer, ConnectionRate -> 1, Layers -> {pixels^2, num}] ;
ListPlot[Train[net, data, Epochs -> 200], PlotJoined -> True, PlotStyle -> {Hue[1]}] ;

[Graphics:HTMLFiles/PatternRecognition_13.gif]

Input Weight Visualization

With more than 400 neurons and thousands of weights, it is not practical to use the graphs from FannGraph to display the complete layout of the net. Instead, a summary of the input weights can illustrate how the network weighs the different parts of the input bitmaps. First, a sorted list of the weights originating from input neurons is selected from all the connections. Then the weights are summed for each input bit and organized in a square matrix.

(* Select the connections originating at input neurons, take their weight
   values (entries assumed to have the form {from, to, weight}), total them per
   input pixel, and arrange the sums as a square matrix. *)
inweights = Sort[Select[net[Weights], First[#] ≤ First[net[Layers]] &]]〚All, 3〛 ;
insum = Partition[Map[Total, Partition[inweights, net[Layers]〚2〛]], net[InputNeuronCount]^(1/2)] ;
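A quick sanity check of the summary shape, assuming the 20×20 bitmaps used in this example:

Dimensions[insum]   (* expected: {20, 20}, one sum per input pixel *)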

A density plot can now be displayed. To enhance the plot, ColorFunctionScaling is turned off and a ColorFunction is defined so that negative values are displayed in red and positive values in blue. The intensity of the colors is scaled according to the magnitude of the weight sums: brighter colors show stronger weights, while darker colors and black show weights near zero. The plot shows that the network largely ignores the outer parts of the bitmap, since they play no role in distinguishing the inputs.

ListDensityPlot[insum, ColorFunctionScaling->False, ColorFunction-> (If[#≥0, RGBColor[0, 0, #/Max[insum]], RGBColor[#/Min[insum], 0, 0]] &)] ;

[Graphics:HTMLFiles/PatternRecognition_16.gif]

Output Weight Visualization

The visualization of the weights can also be performed for each output neuron, showing how the network classifies its input. As there is no hidden layer, the numbers are easy to see. The plot of the weights that recognize the number 2 shows that the top middle part of the figure plays no part in its classification.

(* Group the input-to-output weights by destination neuron and arrange each
   group as a square matrix; the grouping is reconstructed from the garbled
   export (weight entries assumed to be {from, to, weight}). *)
oc = Select[net[Weights], First[#] ≤ First[net[Layers]] &] ;
ow = Map[Partition[Sort[#]〚All, 3〛, pixels] &, GatherBy[oc, #〚2〛 &]] ;
Do[ListDensityPlot[ow〚i〛, ColorFunctionScaling -> False,
    ColorFunction -> (If[# ≥ 0, RGBColor[0, 0, #/Max[ow〚i〛]], RGBColor[#/Min[ow〚i〛], 0, 0]] &)],
  {i, net[Layers]〚2〛, 1, -1}] ;

[Graphics:HTMLFiles/PatternRecognition_18.gif][Graphics:HTMLFiles/PatternRecognition_19.gif][Graphics:HTMLFiles/PatternRecognition_20.gif][Graphics:HTMLFiles/PatternRecognition_21.gif][Graphics:HTMLFiles/PatternRecognition_22.gif]
[Graphics:HTMLFiles/PatternRecognition_23.gif][Graphics:HTMLFiles/PatternRecognition_24.gif][Graphics:HTMLFiles/PatternRecognition_25.gif][Graphics:HTMLFiles/PatternRecognition_26.gif][Graphics:HTMLFiles/PatternRecognition_27.gif]

Neural Network With One Hidden Layer

The neural network is created as before, with the addition of a hidden layer. The hidden layer is here chosen to contain a neuron for each pattern. The number of hidden neurons can be varied, from just a few up to multiple hidden layers, to see the effect on classification and generalization; a sketch of such an experiment follows after the training plot.

(* As above, with a hidden layer containing one neuron per digit pattern; the
   elided hidden size is reconstructed as num from the text and the ten hidden
   weight plots below. *)
NeuralNet[net2, NeuralNetType -> Layer, ConnectionRate -> 1, Layers -> {pixels^2, num, num}] ;
ListPlot[Train[net2, data, Epochs -> 200], PlotJoined -> True, PlotStyle -> {Hue[1]}] ;

[Graphics:HTMLFiles/PatternRecognition_29.gif]
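The effect of the hidden layer size mentioned above can be explored with a small sweep. This is a hedged sketch reusing the Fann` calls from this example; the throwaway symbol tmp and the candidate sizes are assumptions, not part of the original.

(* Sketch: train a network for each hidden size and compare training error. *)
Map[Function[h,
    NeuralNet[tmp, NeuralNetType -> Layer, ConnectionRate -> 1, Layers -> {pixels^2, h, num}] ;
    Train[tmp, data, Epochs -> 200] ;
    {h, Test[tmp, data]}],
  {5, 10, 20, 40}]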

Displaying a Partial Neural Net Graph

Displaying the complete neural network graph is time-consuming due to the thousands of weights from the input layer, and it serves little purpose. By removing the vertices representing the input layer neurons and the input layer bias, a graph of the mapping from the hidden layer to the output layer can be shown. The graph shows that a neuron can act both as a strong indicator for one number (thick blue arrow) and as an inhibitor of another (thick red arrow). It also shows that the neurons work in parallel to classify the patterns.

<< FannGraph`
nng = NeuralNetGraph[net2, UseStyle -> True] ;
ShowGraph[DeleteVertices[nng, Range[net2[InputNeuronCount] + First[net2[Bias]]]]] ;

[Graphics:HTMLFiles/PatternRecognition_31.gif]

Hidden Weight Visualization

The kinds of features the hidden layer neurons react to can be visualized. Since the network uses combinations of features, the numbers are less recognizable in these weight plots.

(* The same visualization for the weights feeding the hidden layer of net2;
   the grouping by destination neuron is reconstructed as above. *)
hc = Select[net2[Weights], First[#] ≤ First[net2[Layers]] &] ;
hw = Map[Partition[Sort[#]〚All, 3〛, pixels] &, GatherBy[hc, #〚2〛 &]] ;
Do[ListDensityPlot[hw〚i〛, ColorFunctionScaling -> False,
    ColorFunction -> (If[# ≥ 0, RGBColor[0, 0, #/Max[hw〚i〛]], RGBColor[#/Min[hw〚i〛], 0, 0]] &)],
  {i, net2[Layers]〚2〛, 1, -1}] ;

[Graphics:HTMLFiles/PatternRecognition_33.gif][Graphics:HTMLFiles/PatternRecognition_34.gif][Graphics:HTMLFiles/PatternRecognition_35.gif][Graphics:HTMLFiles/PatternRecognition_36.gif][Graphics:HTMLFiles/PatternRecognition_37.gif]
[Graphics:HTMLFiles/PatternRecognition_38.gif][Graphics:HTMLFiles/PatternRecognition_39.gif][Graphics:HTMLFiles/PatternRecognition_40.gif][Graphics:HTMLFiles/PatternRecognition_41.gif][Graphics:HTMLFiles/PatternRecognition_42.gif]

Pattern Recognition

Testing the networks with the training data returns sufficiently low mean square errors to indicate that the input can be correctly recognized and classified.

{Test[net, data], Test[net2, data]}

{0.000449822, 0.00152588}

The recognition of numbers from a font present in the training data can be verified.

testfont = charbitmaps[chars, pixels, {FontSizepixels, FontFamily"Courier ...  Flatten[#]]] &, testfont], Map[BitDecode[chars, Execute[net2, Flatten[#]]] &, testfont]}

[Graphics:HTMLFiles/PatternRecognition_46.gif]

{{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}}

The Mathematica 1 font was left out of the training set, so it can be used to test how the networks perform on bitmaps they have not seen before. The network without a hidden layer manages to correctly classify all the numbers but one. The network with a hidden layer gets only 70% correct, indicating that the features it has chosen from the training data do not generalize well.

(* The held-out font; the elided family name is reconstructed as "Mathematica1". *)
testfont = charbitmaps[chars, pixels, {FontSize -> pixels, FontFamily -> "Mathematica1"}] ;
{Map[BitDecode[chars, Execute[net, Flatten[#]]] &, testfont],
 Map[BitDecode[chars, Execute[net2, Flatten[#]]] &, testfont]}

[Graphics:HTMLFiles/PatternRecognition_49.gif]

{{0, 1, 2, 3, 1, 5, 6, 7, 8, 9}, {0, 1, 2, 5, 1, 5, 6, 7, 6, 9}}
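The classification results above can be summarized as accuracy fractions. The helper accuracy is hypothetical (not part of the original example) and reuses the BitDecode and Execute calls from the test cells:

(* Hypothetical helper: fraction of a test font classified correctly. *)
accuracy[n_, font_] := Count[MapThread[Equal,
      {Map[BitDecode[chars, Execute[n, Flatten[#]]] &, font], chars}], True]/Length[chars]
{accuracy[net, testfont], accuracy[net2, testfont]}   (* expected: 9/10 and 7/10 *)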

With a bold font, the network without a hidden layer gets all numbers right, and the network with a hidden layer gets just one wrong. This indicates that preprocessing would likely improve the networks' ability to recognize patterns.

(* Bold face of the held-out font; the style options are partly elided in the
   export, and FontWeight -> "Bold" is an assumption. *)
testfont = charbitmaps[chars, pixels,
   {FontSize -> pixels, FontFamily -> "Mathematica1", FontWeight -> "Bold"}] ;
{Map[BitDecode[chars, Execute[net, Flatten[#]]] &, testfont],
 Map[BitDecode[chars, Execute[net2, Flatten[#]]] &, testfont]}

[Graphics:HTMLFiles/PatternRecognition_52.gif]

{{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, {0, 1, 2, 3, 4, 5, 6, 7, 6, 9}}

As another illustration of the ability of the neural networks to generalize from the training data, they can be tested with characters and symbols that look somewhat like numbers.

testfont = charbitmaps[{"O", "l", "!", "|", "A&qu ...  Flatten[#]]] &, testfont], Map[BitDecode[chars, Execute[net2, Flatten[#]]] &, testfont]}

[Graphics:HTMLFiles/PatternRecognition_55.gif]

{{0, 1, 1, 1, 4, 1, 8}, {0, 1, 1, 1, 4, 1, 8}}

Saving the Neural Network

The trained networks can be saved. Modify the path before evaluating the cells.

savepath = $HomeDirectory <> "/My Documents/My Math/" ;
netfile = savepath <> "Pattern1.m" ; Put[net[], netfile] ;   (* "Pattern1.m" reconstructed to parallel "Pattern2.m" *)
netfile = savepath <> "Pattern2.m" ; Put[net2[], netfile] ;

The networks can be restored for further use.

(* Point netfile at the matching file before each load. *)
netfile = savepath <> "Pattern1.m" ; NeuralNet[net] ; net[Get[netfile]] ;
netfile = savepath <> "Pattern2.m" ; NeuralNet[net2] ; net2[Get[netfile]] ;

Credits

This example is part of Fann for Mathematica © 2004 freegoldbar (http://www.geocities.com/freegoldbar/).

Discard[All] ;


Created by freegoldbar  (September 16, 2004)
