*Parametric decision rules* make the membership classification
based on knowledge of the a priori probabilities of occurrence of
objects belonging to some class $C_i$ and the underlying probability
density functions $p(X \mid C_i)$, which characterise how probable
the measurement $X$ is should the membership class be $C_i$.

In many situations, however, these distributions are unknown or are difficult to describe and handle analytically.

*Non-parametric decision rules*, such as the *nearest
neighbour rule*, are attractive because no prior knowledge of
the distributions is required. These rules rely instead on the
training set of objects with known class membership to make
decisions on the membership of unknown objects.

The nearest neighbour rule, as its name suggests, classifies an unknown object to the class of its nearest neighbour in the measurement space, most commonly under the Euclidean metric (see Figure 1.0).

**Figure 1.0:** *The nearest neighbour rule.*

Besides being extremely simple to implement in practice, the nearest
neighbour rule, given enough members in the training set, achieves a
probability of error close to the Bayes (optimal) probability of
error. Cover and Hart **[2]** have shown that as the size of the
training set goes to infinity, the asymptotic nearest neighbour error
is never more than twice the Bayes (optimal) error. More precisely:

$$P_e \le \lim_{N \to \infty} P_N \le 2 P_e (1 - P_e)$$

where $P_e$ is the Bayes error and $P_N$ is the nearest neighbour
error for a training set of size $N$ (the bound for the two-class
case). Furthermore, the performance of the obvious modification of
this rule, namely the *k-nearest neighbour decision rule*, can be
even better. The latter classifies an unknown object to the class
most heavily represented among its $k$ nearest neighbours (see
Figure 1.1).

**Figure 1.1:** *The k-nearest neighbour rule.*
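The two rules above can be sketched in a few lines of NumPy; the points and labels below are a made-up two-class example, not data from the text:

```python
import numpy as np

def knn_classify(X_train, y_train, x, k=1):
    """Classify x by majority vote among its k nearest training
    points under the Euclidean metric (k=1 gives the NN rule)."""
    d = np.linalg.norm(X_train - x, axis=1)           # distances to all training points
    nearest = np.argsort(d)[:k]                       # indices of the k nearest neighbours
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]                  # most heavily represented class

# Class 0 clustered near the origin, class 1 near (5, 5).
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
              [5.0, 5.0], [6.0, 5.0], [5.0, 6.0]])
y = np.array([0, 0, 0, 1, 1, 1])

print(knn_classify(X, y, np.array([0.5, 0.5]), k=1))  # nearest neighbour rule → 0
print(knn_classify(X, y, np.array([4.8, 5.2]), k=3))  # 3-nearest neighbour rule → 1
```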

Despite its simplicity and good performance, the traditional criticism of this decision rule points at the large space required to store the entire training set and the apparent necessity to query the entire training set in order to make a single membership classification. As a result, there has been considerable interest in editing the training set in order to reduce its size.

Different proximity graphs (such as the Delaunay triangulation)
provide an efficient geometric apparatus for solving this problem
**[1]**.

The *Voronoi diagram* of a set of points (*nodes*) partitions space into regions, one per node, each consisting of all locations closer to its node than to any other node (see Figure 2.0).

**Figure 2.0:** *Voronoi diagram.*

The Delaunay triangulation is the dual of the Voronoi diagram: if two
Voronoi regions share a boundary, the nodes of these regions are
connected with an edge (see Figure 2.1). Such nodes are called
*Voronoi neighbours* (or *Delaunay neighbours*).

**Figure 2.1:** *Delaunay triangulation.*

Considering the nearest neighbour rule, observe that if we compute the Voronoi diagram using the training set as nodes, then any unknown point falling into a particular Voronoi region must be classified to the class of that region's node. By definition of a Voronoi region, every point inside it is closer to the region's node than to any other node, so this node is the nearest neighbour of any new point inside the region.

The *decision boundary* of a decision rule separates the points of
space in which an unknown object is classified into one particular
class from the remaining points, where it is classified into some
other class. Clearly, the decision boundary of the nearest neighbour
rule is contained in the Voronoi diagram: the boundaries separating
Voronoi regions whose nodes are of different classes make up the
portions of the decision boundary (see Figure 2.2).

**Figure 2.2:** *The decision boundary.*

Although the decision boundary is rarely computed explicitly in practice for high-dimensional problems, it is clear that the decision boundary alone is sufficient to perform the classification of new objects. Hence the nodes of the Voronoi regions whose boundaries did not contribute to the decision boundary are redundant and can be safely deleted from the training set (see Figure 2.3).

**Figure 2.3:** *The edited training set.*

As can be seen from Figure 2.3, the neighbours of the deleted regions expand, in some sense, to occupy the resulting void.

The above consideration validates the following algorithm for editing the nearest neighbour decision rule:

- Compute the Delaunay triangulation for the training set.
- Visit each node, marking it if all its Delaunay neighbours are of the same class as the current node.
- Delete all marked nodes, exiting with the remaining ones as the edited training set.
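Assuming SciPy is available, the three steps above can be sketched as follows; `voronoi_edit` and the sample points are illustrative names and data, not from the text:

```python
import numpy as np
from scipy.spatial import Delaunay

def voronoi_edit(points, labels):
    """Delete every node whose Delaunay neighbours all share its class."""
    tri = Delaunay(points)
    indptr, indices = tri.vertex_neighbor_vertices   # CSR-style vertex adjacency
    keep = [i for i in range(len(points))
            if np.any(labels[indices[indptr[i]:indptr[i + 1]]] != labels[i])]
    return points[keep], labels[keep]

# Two separated clusters; interior points such as (0.5, 0.5) are redundant.
pts = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.5],
                [4, 0], [5, 0], [4, 1], [5, 1], [4.5, 0.5]], dtype=float)
lab = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
edited_pts, edited_lab = voronoi_edit(pts, lab)
```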

It is not hard to see that in the presence of degeneracies, in particular more than two collinear points, the resulting edited set may not be minimal (see Figure 2.4).

**Figure 2.4:** *A degenerate case.*

As Figure 2.4 illustrates, editing leaves all the points in place, although any two opposite points, one from each class, are enough to preserve the decision boundary.

The following applet demonstrates editing the nearest neighbour rule using the Voronoi diagram approach. Create a training set for a two class classifier by selecting nodes of classes A and B. Press "Find" to mark redundant nodes and "Delete" to produce the edited set.

A drawback of Voronoi editing is that the edited set remains too large: it preserves many points which are well separated from the other classes and which intuitively should be deleted (see Figure 2.5).

**Figure 2.5:** *Keeping more points than necessary.*

As Figure 2.5 illustrates, points $A_1$ and $B_1$ are well separated.
They are preserved by Voronoi editing to maintain a portion of the
decision boundary. However, assuming that new points will come from
the same distribution as the training set, the portions of the
decision boundary remote from the concentration of training set
points are of lesser importance.

Furthermore, computing the Voronoi diagram in high dimensions
is quite costly. Any such algorithm will take in the worst case at
least $\Theta(n^{\lceil d/2 \rceil})$ time
**[3]**,
where $n$ is the number of points and $d$ is
the number of dimensions.

The cost of the method and the practical necessity of obtaining smaller edited sets lead us to consider approximate solutions which would be attractive for practical applications.

Two points $A$ and $B$ are called *Gabriel neighbours* if their diametral sphere (the sphere with the segment $AB$ as its diameter) contains no other points of the set.

**Figure 3.0:** *Gabriel neighbours.*

As Figure 3.0 demonstrates, the points $A$ and $B$ are Gabriel neighbours, whereas $B$ and $C$ are not.

A graph where all pairs of Gabriel neighbours are connected with an edge
is called the *Gabriel graph*.

The Gabriel editing algorithm can be formulated in a way similar to Voronoi editing:

- Compute the Gabriel graph for the training set.
- Visit each node, marking it if all its Gabriel neighbours are of the same class as the current node.
- Delete all marked nodes, exiting with the remaining ones as the edited training set.

Although the resulting Gabriel edited set does not preserve the original decision boundary, the changes occur mainly outside the zones of interest (see Figure 3.1).

**Figure 3.1:** *Gabriel editing.*

The Gabriel graph can be computed by discarding edges from the Delaunay triangulation. However, the complexity of computing the latter may be as bad as $O(n^{\lceil d/2 \rceil})$. Thus, this approach is not very attractive when the number of dimensions is large.

Clearly, the Gabriel graph can be computed by brute force: for every potential pair of neighbours $A$ and $B$ we simply verify whether any other point $X$ is contained in the diametral sphere, i.e. whether

$$d^2(A, B) \ge d^2(A, X) + d^2(B, X)$$

holds for some $X$. Since there are $O(n^2)$ potential pairs of neighbours and each pair is verified by checking the remaining $n$ points, with each distance computation requiring $O(d)$ steps, the resulting complexity of the brute force algorithm is $O(dn^3)$.
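A direct NumPy rendering of this brute force procedure, combined with the editing step, might look as follows (the three-point usage example is made up for illustration):

```python
import numpy as np

def gabriel_edit(points, labels):
    """Brute-force Gabriel editing in O(d n^3): keep a node only if it
    has a Gabriel neighbour of a different class."""
    n = len(points)
    # pairwise squared Euclidean distances
    d2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    keep = np.zeros(n, dtype=bool)
    for a in range(n):
        for b in range(n):
            if a == b or labels[a] == labels[b]:
                continue
            # A and B are Gabriel neighbours iff no X lies in their diametral sphere
            if all(d2[a, b] < d2[a, x] + d2[b, x]
                   for x in range(n) if x != a and x != b):
                keep[a] = True
    return points[keep], labels[keep]

P = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
L = np.array([0, 0, 1])
edited_P, edited_L = gabriel_edit(P, L)  # drops (0,0): its only Gabriel neighbour shares its class
```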

The brute force algorithm can be easily improved by the following heuristic. For a point $A$ we create a list of points which are its potential Gabriel neighbours. When checking whether a point $B$ is indeed a Gabriel neighbour, we test every point $C$ to see whether it is contained in the diametral sphere of $A$ and $B$ (see Figure 3.2).

**Figure 3.2:** *Discarding points which can't be Gabriel neighbours.*

This heuristic reduces the average complexity to $O(dn^2)$.
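One way to realise this kind of pruning is to examine the candidates nearest-first (the nearest point is always a Gabriel neighbour) and to abandon a candidate at the first point found inside its diametral sphere; nearby points are the likeliest blockers, so bad candidates tend to be rejected quickly. The sketch below illustrates the idea and is not necessarily the exact heuristic described in the text:

```python
import numpy as np

def gabriel_neighbours(points, a):
    """Return the indices of the Gabriel neighbours of points[a]."""
    d2a = np.sum((points - points[a]) ** 2, axis=1)      # squared distances from a
    order = [int(i) for i in np.argsort(d2a) if i != a]  # nearest candidates first
    neighbours = []
    for b in order:
        blocked = False
        for c in order:
            if c == b:
                continue
            d2bc = float(np.sum((points[b] - points[c]) ** 2))
            if d2a[b] >= d2a[c] + d2bc:  # c lies inside the diametral sphere of (a, b)
                blocked = True
                break
        if not blocked:
            neighbours.append(b)
    return neighbours

P = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [1.2, 1.0]])
print(gabriel_neighbours(P, 0))   # → [1]
```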

The following applet demonstrates editing the nearest neighbour rule using the Gabriel graph approach. Create a training set for a two class classifier by selecting nodes of classes A and B. Press "Find" to mark redundant nodes and "Delete" to produce the edited set.

Two points $A$ and $B$ are called *relative neighbours* if for all
other points $X$ in the set the following is true:

$$d(A, B) \le \max\{\, d(A, X),\; d(B, X) \,\}$$

Geometrically, the above implies that no point $X$ is contained in
the *"lune"* constructed on the points $A$ and $B$ should the latter
be relative neighbours (see Figure 4.0).

**Figure 4.0:** *Relative neighbours.*

As Figure 4.0 demonstrates, the points $B$ and $C$ are relative neighbours, whereas $A$ and $B$ are not.

A *relative neighbour graph* is a graph in which all pairs of
relative neighbours are connected by an edge.

Thus, the relative neighbour editing algorithm can be expressed similarly to Voronoi or Gabriel editing by substituting the relative neighbour graph for the underlying graphs used in those algorithms.
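Substituting the lune-emptiness test into the same editing loop gives a sketch like the following (`rng_edit` and the usage data are illustrative, not from the text):

```python
import numpy as np

def rng_edit(points, labels):
    """Relative neighbour editing: keep a node only if some relative
    neighbour of it belongs to a different class."""
    n = len(points)
    d = np.sqrt(np.sum((points[:, None] - points[None, :]) ** 2, axis=-1))
    keep = np.zeros(n, dtype=bool)
    for a in range(n):
        for b in range(n):
            if a == b or labels[a] == labels[b]:
                continue
            # A and B are relative neighbours iff the lune is empty:
            # d(A, B) <= max(d(A, X), d(B, X)) for all other X
            if all(d[a, b] <= max(d[a, x], d[b, x])
                   for x in range(n) if x not in (a, b)):
                keep[a] = True
    return points[keep], labels[keep]

P = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
L = np.array([0, 0, 1])
edited_P, edited_L = rng_edit(P, L)
```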

Since the relative neighbour graph is a subgraph of the Gabriel graph, it reduces the size of edited sets even further. However, the changes to the decision boundary can be quite severe, both in the important and less important zones (see Figure 4.1).

**Figure 4.1:** *Relative neighbour editing.*

The algorithm for practical computation of the relative neighbour graph is analogous to that used to obtain the Gabriel graph.

The following applet demonstrates editing the nearest neighbour rule using the relative neighbour graph approach. Create a training set for a two class classifier by selecting nodes of classes A and B. Press "Find" to mark redundant nodes and "Delete" to produce the edited set.

- [1] Binay K. Bhattacharya, Ronald S. Poulsen, Godfried T. Toussaint, *"Application of Proximity Graphs to Editing Nearest Neighbor Decision Rule"*, International Symposium on Information Theory, Santa Monica, 1981.
- [2] T.M. Cover, P.E. Hart, *"Nearest Neighbor Pattern Classification"*, IEEE Transactions on Information Theory, vol. IT-13, No. 1, 1967, pp. 21-27.
- [3] V. Klee, *"On the complexity of d-dimensional Voronoi diagrams"*, Arch. Math., vol. 34, 1980, pp. 75-80.
- [4] Godfried T. Toussaint, *"The Relative Neighborhood Graph of a Finite Planar Set"*, Pattern Recognition, vol. 12, No. 4, 1980, pp. 261-268.