How to resolve the algorithm K-d tree step by step in the Java programming language

Published on 12 May 2024 09:40 PM

How to resolve the algorithm K-d tree step by step in the Java programming language

Table of Contents

Problem Statement

A k-d tree (short for k-dimensional tree) is a space-partitioning data structure for organizing points in a k-dimensional space. k-d trees are a useful data structure for several applications, such as searches involving a multidimensional search key (e.g. range searches and nearest neighbor searches). k-d trees are a special case of binary space partitioning trees. k-d trees are not suitable, however, for efficiently finding the nearest neighbor in high dimensional spaces. As a general rule, if the dimensionality is k, the number of points in the data, N, should be N ≫ 2k. Otherwise, when k-d trees are used with high-dimensional data, most of the points in the tree will be evaluated and the efficiency is no better than exhaustive search, and other methods such as approximate nearest-neighbor are used instead. Task: Construct a k-d tree and perform a nearest neighbor search for two example data sets: For the Wikipedia example, find the nearest neighbor to point (9, 2) For the random data, pick a random location and find the nearest neighbor. In addition, instrument your code to count the number of nodes visited in the nearest neighbor search. Count a node as visited if any field of it is accessed. Output should show the point searched for, the point found, the distance to the point, and the number of nodes visited. There are variant algorithms for constructing the tree.
You can use a simple median strategy or implement something more efficient.
Variants of the nearest neighbor search include nearest N neighbors, approximate nearest neighbor, and range searches.
You do not have to implement these.
The requirement for this task is specifically the nearest single neighbor.
Also there are algorithms for inserting, deleting, and balancing k-d trees.
These are also not required for the task.

Let's start with the solution:

Step by Step solution about How to resolve the algorithm K-d tree step by step in the Java programming language

The provided code implements a K-dimensional (Kd) tree in Java, primarily used for efficient nearest neighbor search in high-dimensional space. Here's a detailed breakdown of the code: 1.KdTree Class:

  • Constructor: It takes the number of dimensions (dimensions_) and a list of nodes (nodes) as parameters. It initializes the tree by calling the makeTree function.
  • findNearest: This method finds the nearest point in the tree to the given target node. It initializes best_ to null, visited_ to 0, and bestDistance_ to 0. It then calls the nearest function with the root node, target node, and an index starting at 0.
  • visited: Returns the number of nodes visited during the nearest neighbor search.
  • distance: Returns the Euclidean distance between the nearest point found and the target node.
  • makeTree: Recursively builds the Kd tree using the QuickSelect algorithm. It finds the median node based on the given index and divides the nodes into two subtrees based on that node's value.
  • NodeComparator: A comparator class used by QuickSelect to compare nodes based on their coordinates at a specific index. 2. Node Class:
  • Coordinates: Represents a point in K-dimensional space, storing coordinates in a coords_ array.
  • Left/Right Child Nodes: Each node can have a left and right child node, representing the subtrees containing points with lower and higher coordinates along the current dimension, respectively.
  • get: Retrieves the coordinate value at the specified index.
  • distance: Calculates the Euclidean distance between this node and another node. 3. QuickSelect Class:
  • select: Implements the QuickSelect algorithm to find the n-th smallest element in a list based on a given comparator. It finds the median, partitions the list based on the median, and recursively applies the algorithm to the appropriate sublist to find the desired element. 4. KdTreeTest Class:
  • testWikipedia: Tests the Kd tree with data from the Wikipedia example and prints the results.
  • testRandom: Generates random points and builds a Kd tree to find the nearest neighbor to a randomly generated target point. It runs the test multiple times with different numbers of points and prints the results.

Source code in the java programming language

import java.util.*;

public class KdTree {
    private int dimensions_;
    private Node root_ = null;
    private Node best_ = null;
    private double bestDistance_ = 0;
    private int visited_ = 0;
    
    public KdTree(int dimensions, List<Node> nodes) {
        dimensions_ = dimensions;
        root_ = makeTree(nodes, 0, nodes.size(), 0);
    }
    
    public Node findNearest(Node target) {
        if (root_ == null)
            throw new IllegalStateException("Tree is empty!");
        best_ = null;
        visited_ = 0;
        bestDistance_ = 0;
        nearest(root_, target, 0);
        return best_;
    }
    
    public int visited() {
        return visited_;
    }
    
    public double distance() {
        return Math.sqrt(bestDistance_);
    }
    
    private void nearest(Node root, Node target, int index) {
        if (root == null)
            return;
        ++visited_;
        double d = root.distance(target);
        if (best_ == null || d < bestDistance_) {
            bestDistance_ = d;
            best_ = root;
        }
        if (bestDistance_ == 0)
            return;
        double dx = root.get(index) - target.get(index);
        index = (index + 1) % dimensions_;
        nearest(dx > 0 ? root.left_ : root.right_, target, index);
        if (dx * dx >= bestDistance_)
            return;
        nearest(dx > 0 ? root.right_ : root.left_, target, index);
    }
    
    private Node makeTree(List<Node> nodes, int begin, int end, int index) {
        if (end <= begin)
            return null;
        int n = begin + (end - begin)/2;
        Node node = QuickSelect.select(nodes, begin, end - 1, n, new NodeComparator(index));
        index = (index + 1) % dimensions_;
        node.left_ = makeTree(nodes, begin, n, index);
        node.right_ = makeTree(nodes, n + 1, end, index);
        return node;
    }
    
    private static class NodeComparator implements Comparator<Node> {
        private int index_;

        private NodeComparator(int index) {
            index_ = index;
        }
        public int compare(Node n1, Node n2) {
            return Double.compare(n1.get(index_), n2.get(index_));
        }
    }
    
    public static class Node {
        private double[] coords_;
        private Node left_ = null;
        private Node right_ = null;

        public Node(double[] coords) {
            coords_ = coords;
        }
        public Node(double x, double y) {
            this(new double[]{x, y});
        }
        public Node(double x, double y, double z) {
            this(new double[]{x, y, z});
        }
        double get(int index) {
            return coords_[index];
        }
        double distance(Node node) {
            double dist = 0;
            for (int i = 0; i < coords_.length; ++i) {
                double d = coords_[i] - node.coords_[i];
                dist += d * d;
            }
            return dist;
        }
        public String toString() {
            StringBuilder s = new StringBuilder("(");
            for (int i = 0; i < coords_.length; ++i) {
                if (i > 0)
                    s.append(", ");
                s.append(coords_[i]);
            }
            s.append(')');
            return s.toString();
        }
    }
}


import java.util.*;

//
// Java implementation of quickselect algorithm.
// See https://en.wikipedia.org/wiki/Quickselect
//
public class QuickSelect {
    private static final Random random = new Random();

    public static <T> T select(List<T> list, int n, Comparator<? super T> cmp) {
        return select(list, 0, list.size() - 1, n, cmp);
    }
    
    public static <T> T select(List<T> list, int left, int right, int n, Comparator<? super T> cmp) {
        for (;;) {
            if (left == right)
                return list.get(left);
            int pivot = pivotIndex(left, right);
            pivot = partition(list, left, right, pivot, cmp);
            if (n == pivot)
                return list.get(n);
            else if (n < pivot)
                right = pivot - 1;
            else
                left = pivot + 1;
        }
    }
    
    private static <T> int partition(List<T> list, int left, int right, int pivot, Comparator<? super T> cmp) {
        T pivotValue = list.get(pivot);
        swap(list, pivot, right);
        int store = left;
        for (int i = left; i < right; ++i) {
            if (cmp.compare(list.get(i), pivotValue) < 0) {
                swap(list, store, i);
                ++store;
            }
        }
        swap(list, right, store);
        return store;
    }
    
    private static <T> void swap(List<T> list, int i, int j) {
        T value = list.get(i);
        list.set(i, list.get(j));
        list.set(j, value);
    }

    private static int pivotIndex(int left, int right) {
        return left + random.nextInt(right - left + 1);
    }
}


import java.util.*;

public class KdTreeTest {
    public static void main(String[] args) {
        testWikipedia();
        System.out.println();
        testRandom(1000);
        System.out.println();
        testRandom(1000000);
    }
    
    private static void testWikipedia() {
        double[][] coords = {
            { 2, 3 }, { 5, 4 }, { 9, 6 }, { 4, 7 }, { 8, 1 }, { 7, 2 }
        };
        List<KdTree.Node> nodes = new ArrayList<>();
        for (int i = 0; i < coords.length; ++i)
            nodes.add(new KdTree.Node(coords[i]));
        KdTree tree = new KdTree(2, nodes);
        KdTree.Node nearest = tree.findNearest(new KdTree.Node(9, 2));
        System.out.println("Wikipedia example data:");
        System.out.println("nearest point: " + nearest);
        System.out.println("distance: " + tree.distance());
        System.out.println("nodes visited: " + tree.visited());
    }

    private static KdTree.Node randomPoint(Random random) {
        double x = random.nextDouble();
        double y = random.nextDouble();
        double z = random.nextDouble();
        return new KdTree.Node(x, y, z);
    }

    private static void testRandom(int points) {
        Random random = new Random();
        List<KdTree.Node> nodes = new ArrayList<>();
        for (int i = 0; i < points; ++i)
            nodes.add(randomPoint(random));
        KdTree tree = new KdTree(3, nodes);
        KdTree.Node target = randomPoint(random);
        KdTree.Node nearest = tree.findNearest(target);
        System.out.println("Random data (" + points + " points):");
        System.out.println("target: " + target);
        System.out.println("nearest point: " + nearest);
        System.out.println("distance: " + tree.distance());
        System.out.println("nodes visited: " + tree.visited());
    }
}


  

You may also check:How to resolve the algorithm Interactive programming (repl) step by step in the Common Lisp programming language
You may also check:How to resolve the algorithm Sorting algorithms/Bubble sort step by step in the ALGOL 68 programming language
You may also check:How to resolve the algorithm Abstract type step by step in the Perl programming language
You may also check:How to resolve the algorithm Determine if a string is numeric step by step in the Befunge programming language
You may also check:How to resolve the algorithm Naming conventions step by step in the Common Lisp programming language