
Training the System

  Training occurs in two stages, using the Backpropagation algorithm described in section 2.3.

In the first phase, all sub-networks in the input layer are trained. The individual training set for each sub-network is selected from the original training set: it consists of those components of the original input vector that are connected to this particular network (forming its input vector), together with the desired output class, represented in binary or 1-out-of-k coding.

In the second stage, the decision network is trained. To construct its training set, each original input pattern is applied to the (already trained) input layer; the resulting output vector, together with the desired output class (represented in 1-out-of-k coding), forms a training pair for the decision module.
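
To make the two codings concrete, the following Python sketch encodes a class number both ways; the helper names and the 0-based class numbering are assumptions for illustration, not taken from the text:

    import math

    def binary_code(d, k):
        # Class number d (0 .. k-1) as a ceil(log2(k))-bit vector.
        bits = max(1, math.ceil(math.log2(k)))
        return [(d >> b) & 1 for b in reversed(range(bits))]

    def one_out_of_k_code(d, k):
        # Class number d (0 .. k-1) as a 1-out-of-k (one-hot) vector.
        return [1 if i == d else 0 for i in range(k)]

    # For k = 8 classes, class 5 is [1, 0, 1] in binary code
    # and [0, 0, 0, 0, 0, 1, 0, 0] in 1-out-of-k code.
    print(binary_code(5, 8))
    print(one_out_of_k_code(5, 8))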

To simplify the description of the training, a compact intermediate notation is used; furthermore, it is assumed that the permutation function is the identity, $\pi(x) = x$.

The original training set $TS$ is

    $TS = \{\, (x_1^j, x_2^j, \ldots, x_l^j;\ d^j) \mid j = 1, \ldots, t \,\}$

where $x_i^j \in \mathbb{R}$ is the $i$th component of the $j$th input vector, $d^j$ is the class number, and $t$ is the number of training instances. Here $l = n \cdot m$, where $m$ is the number of modules in the input layer and $n$ is the number of input components per module.

The module $MLP_i$ (for $i = 0, \ldots, m-1$) is connected to the components

    $x_{i n + 1}, x_{i n + 2}, \ldots, x_{(i+1) n}$

The training set $TS_i$ for the module $MLP_i$ is

    $TS_i = \{\, (x_{i n + 1}^j, x_{i n + 2}^j, \ldots, x_{(i+1) n}^j;\ d_{BIN}^j) \mid j = 1, \ldots, t \,\}$

where $d_{BIN}^j$ is the output class $d^j$ represented in binary code.
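
In code, selecting $TS_i$ from $TS$ amounts to slicing each input vector; a minimal sketch, assuming $TS$ is a list of (x, d) pairs with len(x) == n * m, 0-based indices, and the binary_code helper from the sketch above:

    def module_training_set(TS, i, n, k):
        # Components x[i*n] .. x[(i+1)*n - 1] go to module MLP_i,
        # paired with the class number in binary code.
        return [(x[i * n:(i + 1) * n], binary_code(d, k)) for (x, d) in TS]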

The mapping performed by the input layer is denoted by

    $\Phi: \mathbb{R}^{n \cdot m} \rightarrow \mathbb{R}^{m \cdot \lceil \log_2 k \rceil}$

The training set for the decision network is

    $(\, \Phi(x_1^j, x_2^j, \ldots, x_l^j);\ d_{BIT}^j \,)$ for $j = 1, \ldots, t$,

where $d_{BIT}^j$ is the output class $d^j$ represented in 1-out-of-$k$ code.
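
A sketch of how this training set could be computed, assuming a hypothetical forward(mlp, x) that evaluates a trained first-stage module on its input slice, and reusing one_out_of_k_code from the earlier sketch:

    def phi(mlps, x, n):
        # Concatenate the outputs of all m first-stage modules.
        out = []
        for i, mlp in enumerate(mlps):
            out.extend(forward(mlp, x[i * n:(i + 1) * n]))
        return out

    def decision_training_set(TS, mlps, n, k):
        return [(phi(mlps, x, n), one_out_of_k_code(d, k)) for (x, d) in TS]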

The mapping of the decision network is denoted by

    $\Psi: \mathbb{R}^{m \cdot \lceil \log_2 k \rceil} \rightarrow \mathbb{R}^k$

Figure 5.3:  The Training Algorithm.

The training algorithm is summarized in Figure 5.3.
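
The two stages can also be summarized in a short sketch, assuming a hypothetical train(module, ts, eps, max_steps) backpropagation routine and the helpers from the sketches above:

    def train_system(TS, mlps, decision_net, n, k, eps, max_steps):
        # Stage 1: train each input-layer module on its own training set.
        for i, mlp in enumerate(mlps):
            train(mlp, module_training_set(TS, i, n, k), eps, max_steps)
        # Stage 2: train the decision network on the mapped patterns.
        train(decision_net, decision_training_set(TS, mlps, n, k), eps, max_steps)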

Since the training of each module in the input layer is independent of all other modules, the modules can be trained in parallel. Training is stopped either when each module has reached a sufficiently small error or when a defined maximum number of steps has been performed. This keeps the modules independent of each other.

Alternatively, training can be stopped when the overall error over all modules is sufficiently small or when the maximum number of steps has been performed. This assumes that training proceeds step by step, simultaneously in all modules.
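
The two stopping strategies might be sketched as follows; train_epoch (one backpropagation pass over a training set that returns the remaining error) and the parameter names are hypothetical:

    def train_independently(modules, tsets, eps, max_steps):
        # Each module trains on its own (possibly in parallel) until its
        # error is small enough or the step budget is exhausted.
        for module, ts in zip(modules, tsets):
            for _ in range(max_steps):
                if train_epoch(module, ts) < eps:
                    break

    def train_jointly(modules, tsets, eps, max_steps):
        # All modules advance one step at a time in lock-step; training
        # ends when the overall (summed) error is small enough.
        for _ in range(max_steps):
            total = sum(train_epoch(m, ts) for m, ts in zip(modules, tsets))
            if total < eps:
                break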

