HW2

CTH, Artificial neural networks

Mahiar Hamedi (770524-0617)

Homework 2
Problem 1

A boolean function consisting of three inputs can be represented with a cube as in figure 1, where each corner of the cube can be 1 or 0 (black or white)

Figure 1. Cube representing a three input boolean function, different color combination of the balls in each corner represent different boolean functions

The boolean functions that can be represented with a perceptron must be linearly separateable. Such function can be seen as a function in figure 1 where the white and black balls can be separated with a plane. Table 1 lists the number of linearly separateable function giving a certain number of 1:s and 0:s in the cube.

Number of 1:s in the cube	Nr of linearly separateable functions
0	1
1	8
2	12
3	24
4	14
5	24
6	12
7	8
Total	104

Table 1. Number of linearly seperateable function for a given number of 1:s and 0:s in the cube.

The total number of lineraly separatable boolean functions with three inputs is 104. As the number of inputs n increase the number of boolean functions will increase rapidly as 2^(2^n). For an input to a linearly seperateable boolean function there must be at least one input with the same value with (n-1) similar coordinates (except for the case of only one false or one true as input). The combination of such pairs of inputs do not increase as rapidly with n as the number of boolean functions and therefore the number of linearly separateable boolean functions divided by the total number of boolean functions will approach zero as n grows.

Problem 2

The 3-parity problem is a linearly nonseparateable function, it can therefore only be solved with a net with one or more hidden layers.Figure 2 shows a possible solution to the 3-partity problem with a fully connected net with one hidden layer and with step functions as neuron functions. Note that all the weights in the first layer are equal to 1.

Figure 2. Feed forward network solving the 3-parity problem. Note that in addition to the inputs in layer 1 there are constant iputs (-1.5, -2.5 and -0.5)

Problem 3

In this problem the 3-parity problem is solved by implementing a back-propagation algorithm in MATLAB on a fully connected net with three inputs one hidden layer and 1 output (similar as the net in figure 2). The neuron function is chosen as

The weights are initially set to random values in the range

Where N is the number of inputs to each node (N=3 for both layers in this net). The training set is all the 8 possible inputs and corresponding outputs of the 3-parity problem. The error E for each epoch is calculated as

Where d is the correct desired output,O is the output of the net and j runs over all the training set. Figure 3 shows a solution to the 3-parity problem obtained by the implemented back-propagation algorithm.

Figure 3 A solution to the 3 parity problem obtained by the backpropagation algorithm with h =0.1 and a =0.

Note that the inputs to the net are -1 or 1, where -1 represents FALSE and 1 represents TRUE, and the output can be a value between 0 and 1 which is the range of the node function. The output value O can be interpreted as FALSE if O<0.5 and TRUE if O>0.5.

Figure 4 shows the error E as a function of the learning rate h for 5000 epochs.

Figure 4. Error evaluation for different learning rates

We can see that for h =0.01 the back-propagation algorithm converges slowly or not at all. The convergence rate increases with increasing h , however if the learning rate is chosen too high, the net will not converge.

For some values of h the net can converge to a local minimum. To avoid this problem one can choose an appropriate value for a . Figure 5 shows the result of the algorithm for a =0.5 and a =1.

Figure 5. Different learning rates with a =0.5 and a =1

we can see that the value of a also has an effect on the convergence rate. Choosing h and a appropriately the algorithm could converge very fast. Choosing a >1 in this problem often results in a non converging iteration. The problem of choosing a right combination of learning rate and learning parameter for a stable and fast converging net is quite tricky.

Problem 4

This problem deals with solving the problem with a net with n inputs and 1 output. The output is 1 if the n-long bit string contains at least two neighboring 1:s.

This problem is solved with the back propagation algorithm implemented in MATLAB on two different nets. The first net is a fully connected net with n inputs, one hidden layer and 1 output. In the second net, only the neighboring weights in the first layers are connected to the second layer and so the structure of the net contains some a priori information on the structure of the problem.

The number of inputs to the net have been chosen to 12. Other parameters of the net are the same as the net in problem 3. The net is trained with a training set of N randomly chosen inputs and corresponding correct outputs. The result is then tested on a test-set with the same size .

Result of the fully connected Net

Some results of the back-propagation iteration on the fully connected net are depicted as function of the average error which is calculated as

where N is the number of training sets, d is the correct output and O is the actual output of the net.

Figure 5. The bitstring problem with 800 training sets (black) and 800 test sets (red).

Figure 5 shows the result for different learning rates and learning parameters with N=800. We can see that when a =0 the net converges more easily to local minimum compared to when a =0.5. for a =0.5 we can also see that although the net is converging faster with increasing h, the error could increase on the test set with increasing h .

Although the test-set converges, the error on the test-set remains more than the error on the training set.

Figure 6. The bitstring problem with 1000 training sets (black) and 1000 test sets (red).

Figure 6 shows an example where the training set is chosen to 1000 which is approximately 1/4 of the total number of possible inputs to the net. The error is less on the test set compared to figure 5 with the best result for h =0.3.

Result on the net with connected neighbors

The problem is solved with a net as the one depicted in figure 7 with only three neighboring inputs connected to the second layer.

Figure 7. A net with one hidden layer, 1 output and with only the three neighbouring inputs connected to the second layer

Figure 8 shows the error evolution for this net structure for a training set of 500. Although the training set is smaller than the sets used for the examples depicted in figure 5 and 6, the net solves the problem more generally and therefore the error on the test set is much closer to the error on the test data compared to the fully connected net. The structure of the net also requires less calculations leading to a faster iteration time.

Figure 8. The problem with 500 training sets (black) and 500 test sets (red).

The design of the net structure for a particular problem is thus very important for speeding up nets and obtaining more general solutions.

Download of the implemented code

The MATLAB code for problems 3&4 can be downloaded here. The parameters for each problem can be changed at the beginning of the program. Once the parameters are set, the program can be executed and the result will be the final weights and the error figures.

Download:

MATLAB_code.zip

Hosted by www.Geocities.ws