In the early stages of the project a large number of possible network structures were considered. The following list summarizes the main design questions that had to be decided before setting up the model.
From the list above it can be seen that the number of possible network structures is huge. In Figure 5.1 four example architectures are given.
Figure 5.1: Some Examples of Modular Architectures.
The network shown in Figure 5.1(a) is a heterogeneous architecture consisting of two modules: a self-organizing map (SOM), which reduces the input dimension, and a multilayer perceptron (MLP), which operates on the reduced input space. This architecture has already been used in several applications; one example is given in [bell95].
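As a rough illustration of this two-stage idea, the sketch below is my own minimal NumPy construction, not the implementation from [bell95]: a small SOM is trained on the raw inputs, and the grid coordinates of the best-matching unit serve as the reduced representation that a subsequent MLP would consume. All names and parameters here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

class SOM:
    """Minimal self-organizing map: maps D-dimensional inputs onto a 2-D grid."""
    def __init__(self, grid=(5, 5), dim=10):
        self.grid = grid
        self.w = rng.normal(size=(grid[0] * grid[1], dim))
        self.coords = np.array([(i, j) for i in range(grid[0])
                                for j in range(grid[1])], dtype=float)

    def bmu(self, x):
        # index of the best-matching unit for input x
        return int(np.argmin(((self.w - x) ** 2).sum(axis=1)))

    def train(self, X, epochs=20, lr=0.5, sigma=2.0):
        for t in range(epochs):
            a = lr * (1 - t / epochs)            # decaying learning rate
            s = sigma * (1 - t / epochs) + 0.5   # shrinking neighbourhood
            for x in X:
                b = self.coords[self.bmu(x)]
                d2 = ((self.coords - b) ** 2).sum(axis=1)
                h = np.exp(-d2 / (2 * s * s))    # neighbourhood function
                self.w += a * h[:, None] * (x - self.w)

    def reduce(self, x):
        # normalized 2-D coordinates of the winner: the reduced input
        # that the second-stage MLP would receive
        return self.coords[self.bmu(x)] / np.array(self.grid)
```

In this arrangement only the two SOM coordinates, rather than the full input vector, would be presented to the MLP, which is exactly the dimension reduction the figure depicts.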
Another heterogeneous structure trains an MLP on the outputs of logical adaptive nodes; see Figure 5.1(b). The distribution of output values may carry important information about the input pattern that a standard LNN does not use. The suggested architecture uses all of the node outputs as the input vector for an MLP, which might increase the performance of the system. To the best of my knowledge there is currently no published work on this type of network.
A homogeneous pyramidal structure is shown in Figure 5.1(c). It is similar to an architecture proposed by Aleksander [alek89b], but uses small MLPs instead of logical nodes. This should provide fast training of the network as well as improved generalization ability. To the best of my knowledge no research has been published on this topic.
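The pyramid can be pictured as successive layers of small MLPs, each wired to a few outputs of the layer below, until a single output unit remains. The forward-pass sketch below is my own assumption-laden illustration (fixed random weights, training omitted), not a published design; the names and the fan-in of 4 are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class SmallMLP:
    """Tiny one-hidden-layer perceptron over a small slice of its input."""
    def __init__(self, fan_in, hidden=3):
        self.W1 = rng.normal(scale=0.5, size=(hidden, fan_in))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(scale=0.5, size=hidden)
        self.b2 = 0.0

    def forward(self, x):
        h = sigmoid(self.W1 @ x + self.b1)
        return sigmoid(self.W2 @ h + self.b2)

def build_pyramid(n_inputs=16, fan_in=4):
    """Stack layers of small MLPs, each seeing `fan_in` outputs of the
    layer below, until the width shrinks to a single unit."""
    layers, width = [], n_inputs
    while width > 1:
        layers.append([SmallMLP(fan_in) for _ in range(width // fan_in)])
        width //= fan_in
    return layers

def forward_pyramid(layers, x):
    a = np.asarray(x, dtype=float)
    for layer in layers:
        fan_in = layer[0].W1.shape[1]
        a = np.array([m.forward(a[i * fan_in:(i + 1) * fan_in])
                      for i, m in enumerate(layer)])
    return float(a[0])
```

Because each module sees only a handful of inputs, every small MLP could be trained quickly in isolation, which is the intuition behind the claimed speed advantage.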
The final network, shown in Figure 5.1(d), takes a different approach. The input and output dimensions of the problem remain the same, but each subnetwork is concerned with only a single class. For any input vector, the output of a particular module indicates whether that input is recognized as belonging to the corresponding class or not. This structure is investigated in [anan95] and [zhao95].
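This one-module-per-class arrangement can be sketched as a one-vs-rest ensemble. The toy version below (my own construction, not the networks of [anan95] or [zhao95]; a plain perceptron stands in for each subnetwork) trains one binary module per class and predicts the class whose module responds most strongly.

```python
import numpy as np

class ClassModule:
    """Binary perceptron standing in for the per-class subnetwork:
    a positive score means 'input belongs to my class'."""
    def __init__(self, dim):
        self.w = np.zeros(dim + 1)  # last entry is the bias

    def score(self, x):
        return self.w[:-1] @ x + self.w[-1]

    def train(self, X, y, epochs=50, lr=0.1):
        # y is 1 for "this class", 0 otherwise (one-vs-rest labels)
        for _ in range(epochs):
            for x, t in zip(X, y):
                p = 1.0 if self.score(x) > 0 else 0.0
                self.w[:-1] += lr * (t - p) * np.asarray(x, dtype=float)
                self.w[-1] += lr * (t - p)

def predict(modules, x):
    # the class whose module responds most strongly wins
    return int(np.argmax([m.score(x) for m in modules]))
```

The appeal of the structure is that each module can be trained independently on its own binary problem, and new classes can be added by training one new module.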
In the process of developing a modular neural network a number of other ideas were considered. These included using partly overlapping modules or different training data sets. Other ideas focused on the connections between the inputs and the modules, including the use of statistical methods or the entropy of the input attributes to structure these connections.
Due to the time constraints involved (less than six months), the project concentrated on one structure. The architecture used is a less general version of the network depicted in Figure 5.1(c).