Unsupervised Clustering Algorithms Comparison for On-line Sensor Clustering

Steven Lowette & Kristof Van Laerhoven


Characteristics overview
 
 
 
Algorithm
Implemented
Online
Learning Kind
Number of Clusters
Visualisation
Network Topology
Speed
Performance on Clustered Data
Performance on Spread Data
Remarks
Self Organising Map
Yes
Yes
Soft competitive
Variable
All dim.
Fixed
++
++
--
 
Recurrent Self Organising Map
Yes
Yes
Soft Competitive
Variable
All dim.
Fixed
+
++
-
The same as the SOM, but with a ‘memory’ for Temporal Sequence Processing
K-Means Clustering
Yes
Yes
Hard competitive
Fixed
2 dim. only
Variable
+
-
+
Learning rule 1/n gives mean over all input in cluster so far
Sequential Leader Clustering
No
Yes
Hard competitive
Variable
2 dim. only
Variable
+
+/-
+/-
Low accuracy and flexibility, because the clusters are static
Growing K-Means Clustering
Yes
Yes
Hard competitive
Variable
2 dim. only
Variable
+
+
+/-
Because of the movement of the clusters, old inputs can get out of it’s cluster’s radius
Growing K-means Clustering with Maximum Cluster Size (using FIFO)
Yes
No
Hard competitive
Variable
2 dim. only
Variable
--
+
+/-
The gain of flexibility on the input data is overshadowed by the loss of on-line processing
Neural Gas
Yes
Yes
Soft competitive
Fixed
2 dim. only
Variable
+
+/-
+
 
Neural Gas with Competitive Hebbian Learning
Yes
Yes
Soft competitive
Fixed
2 dim. only
Variable
+
+/-
+
The addition of the edges between the neurones has no influence on the learning
Growing Neural Gas
Yes
Yes
Soft competitive
Variable
2 dim. only
Variable
++
-
+
 

Algorithm overview

Self Organizing Map (SOM)

Parameters

Description of the Algorithm

  1. Read new input
  1. Find winning neuron:
  1. Update neurons. For each neuron and component i:

Recurrent Self Organising Map (RSOM)

Parameters

Description of the Algorithm

  1. Read new input
  2. Update the difference vectors. For each neuron and component i:
  1. Find winning neuron:
  1. Update neurons. For each neuron and component i:

K-Means Clustering (KMC)

Parameters

, depending on the number of inputs in the cluster.
 
 

Description of the Algorithm

  1. Read New Input
  2. Find the winning cluster:
  1. Update:

Sequential Leader Clustering (SLC)

Parameters

Description of the Algorithm

  1. Read New Input
  2. Find the winning cluster
  1. If distance to winning cluster is under threshold then

Growing K-Means Clustering (GKMC)

Parameters

Description of the Algorithm

  1. Read New Input
  2. Find the winning cluster:
  1. If distance to winning cluster is under threshold then update:

Growing K-means Clustering Algorithm with Maximum Cluster Size, using FIFO (GKMC_FIFO)

Parameters

Description of the Algorithm

  1. Read New Input
  2. Find the winning cluster:
  1. If distance to winning cluster is under threshold then update:

Neural Gas Algorithm (NGA)

Parameters

Description of the Algorithm

  1. Read New Input
  2. Calculate all the distances of the (fixed number of) clusters to the input
  3. Make a list of the increasing distances and their corresponding clusternumber: Sequence()
  4. Update the clusters. For each cluster w and each component i:

Neural Gas Algorithm with Competitive Hebbian Learning (NGA_CHL)

Parameters

Description of the Algorithm

  1. Read New Input
  2. Calculate distances
  1. Make an increasing list of the distances to the input and their corresponding clusternumber: Sequence()
  2. Update the clusters. For each cluster w and each component i:
  1. Maintain edges:

Growing Neural Gas (GNG)

Parameters

Description of the Algorithm

  1. Read New Input
  2. Calculate distances:
  1. Add squared distance between input and Winner to winner's 'error'
  2. 'Refresh' the edge between Winner and Winner2: set age=0 or create
  3. Adapt Clusters:
  1. Maintain edges:
  1. Delete Clusters who are not connected any more
  2. If Number of steps is the integer multiple of chosen Lambda Then
  1. Decrease 'errors' of all clusters by factor Beta

Definitions

Metrics

Neighbor Functions

, the adaptation is slightly modified to avoid overflow, by keeping the vector components between 0 and 255, as it should be.
 
 

Adaptation Rules