Code Ninja | Search Algorithm

What is the shortest distance from point A to point B?

If you’ve taken a CS 101 class in college, this problem should sound very familiar to you. Some of you may even remember that you can solve this problem using something called Dijkstra’s algorithm. You remember that the algorithm is not too hard, so you never bother giving it a second thought. Yet, when it comes time to implement the solution, particularly during a technical interview, you may find yourself at a complete loss as to where to even begin (much like trying to correctly spell “Dijkstra”).

This article presents a Java implementation of Dijkstra’s algorithm using breadth-first-search with a PriorityQueue. This is a very useful algorithm to keep in your arsenal. It is also a great starting place for interview prep since the solution touches upon some of the most fundamental CS concepts including:

Map (HashMap implementation)
PriorityQueue (Heap implementation)
Overriding equals and hashCode (when using a custom class in a Map)
Overriding compareTo (when using a custom class in a PriorityQueue)

Without further ado, here is the formal question statement:

Question:

Given a set of bi-directional, weighted edges of connected nodes in a graph, compute the shortest distance between a starting node and an ending node. The edge weights are non-negative.

Input (input.txt):

A I
A B 2
B C 3
A D 5
B E 2
C F 1
D E 3
E F 6
D G 2
E H 1
F I 4
G H 8
H I 2

Output (stdout):

Solution:

The solution presented below is an end-to-end solution and has three parts: processing the input, implementing Djkstra’s algorithm, and implementing supporting data structure.

1) Process input and construct graph.

The tricky part here is that the nodes can come in any order and must be correctly linked with previous nodes. Thus, a hashmap is needed in order to build the graph in O(n) time since we need to be able to perform look-up on exisiting nodes in O(1) time. Note that a Node data structure is all we need in order to represent our graph – the edges can be stored as a map of weights to adjacent Nodes within the Node data structure – this greatly simplifies our code.

    public static void main (String [] args) throws IOException{

        //Scanner is much more user-friendly than BufferedReader.
        Scanner in = new Scanner(new FileReader("src/input.txt"));

        //Obtain start and end nodes.
        String startLabel = in.next();
        String endLabel = in.next();

        //Construct bi-directional graph (potentially cyclic).
        Map<String,Node> graph = new HashMap<String,Node>();
        while(in.hasNextLine()){

            //Read edge.
            String labelA = in.next();
            String labelB = in.next();
            int weight = in.nextInt();
            Node nodeA;
            Node nodeB;

            //Retrieve or create nodeA.
            if (graph.keySet().contains(labelA)) {
            	nodeA = graph.get(labelA);
            } else {
            	nodeA = new Node(labelA);
            	graph.put(labelA, nodeA);
            }

            //Retrieve or create nodeB.
            if (graph.keySet().contains(labelB)) {
            	nodeB = graph.get(labelB);
            } else {
            	nodeB = new Node(labelB);
            	graph.put(labelB, nodeB);
            }

            //Add bi-drectional edges.
            //(A possible problem variant can use uni-directional edges.)
            nodeA.addEdge(nodeB,weight);
            nodeB.addEdge(nodeA,weight);
        }

        //Invoke algorithm and print solution.
        System.out.println(solve(graph,startLabel,endLabel));
    }

2) Implement Djkstra’s algorithm.

The crux of Djkstra’s algorithms lies in a simple loop invariant: Each time we “visit” a node, it is guranteed to possess the minimum distance from the starting node. We maintain this invariant by repeatedly minimizing our tentative distances to unvisited nodes AND always visiting the nodes closest to our starting node first. Under this strategy, it would be impossible to end up in a situation where a current node we visit does not possess the minimum distance from the starting node, since a hypothetical alternative path would necessarily include an intermediary node with a smaller minimum distance from the starting node, in which case we would have visited this intermediary node first and have shrinked our tentative distance to our current node to the minimum, a contradiction.

A common setup to this algorithm involves initializing the distance to the starting node to 0 and the distance to all other nodes to infinity. A priority queue is used to provide efficient access to the closest node at all times. (The Java PriorityQueue is a heap-based implementation that offers O(logn) insertion/removal time.)

    public static int solve(Map<String, Node> graph, String startLabel,
            String endLabel) {
        //toVisit queue contains nodes we can reach and visit.
        Queue<Node> toVisit = new PriorityQueue<Node>();

        //Determine start node.
        Node startNode = graph.get(startLabel);

        //Set distance to start node to 0.
        startNode.setDistance(0);

        //Start our search at the start node.
        toVisit.add(startNode);

        //While there are more nodes we can reach and visit...
        while (!toVisit.isEmpty()) {

            //Visit the closest node (Djkstras loop-invariant).
            Node currentNode = toVisit.poll();
            if(currentNode.isVisited()) {
                continue; //See Note A.
            }
            currentNode.setVisited(true);

            //If we reached the end node, return its distance.
            //This must be the shortest distance to the end node.
            if (currentNode.getName().equals(endLabel)) {
                return currentNode.distance;
            }

            //For each neighbor, compute the distance to reach the neighbor
            //through our currentNode, and update shortest distance as needed.
            for (Entry<Integer, Node> edge : currentNode.getEdges().entrySet()) {
                int weight = edge.getKey();
                Node neighbor = edge.getValue();

                //It is unnessary to check on neighbors we already visited
                if (!neighbor.isVisited()) {

                    //Compute new distance
                    int newDistance = currentNode.distance + weight;

                    //Update old distance if new distance is better
                    if (newDistance < neighbor.getDistance()) {
                        neighbor.setDistance(newDistance);
                        //Add neighbor to our toVisit priority queue with the new lower priority. See Note B.
                        toVisit.add(neighbor);
                    }
                }
            }
        }

        return -1; //No path to goal exists.
    }

Note A: Since there can be multiple paths to the same node and we do not remove nodes from our PriorityQueue when we make the path update (for efficiency reasons), it is likely for our PriorityQueue to contain duplicate nodes. We should ignore the nodes we’ve already visited to avoid redundant traversals.

Note B: In light of the above, it is redundant to add a neighbor to our toVisit queue if it doesn’t have a better tentative distance since it will just be discarded when it’s dequeued (it will already have been visited by that time). Of course, since any distance is better than infinity, we will always add a “fresh” neighbor to our toVisit queue.

3) Implement supporting data structure.

The Node data structure contains the key fields that support the algorithm including name, distance, and a visited flag. The weighted edges to neighboring nodes are represented by a mapping of edge weights to nodes – this is simply a convenient construct – we are not actually using the map look-up capabilities.

    static class Node implements Comparable<Node> {
        //Always declare fields as private if possible
        private String name;
        private int distance; //distance from starting node.
        private Map<Integer, Node> edges;
        private boolean visited;

        //Each node has a unique name
        public Node(String name) {
            this.name = name;
            //Our algorithm requires distance to be initialized to infinity
            distance = Integer.MAX_VALUE; 
            edges = new HashMap<Integer, Node>();
            visited = false;
        }

        //Add an edge to a neighboring node with the specified weight.
        public void addEdge(Node node, int weight) {
            edges.put(weight, node);
        }

        //equals and hashCode must be overridden for usage in HashMap.
        //We use "name" as our key for this problem.
        @Override
        public boolean equals(Object obj) {
            if (obj == null)
                return false;
            if (obj == this)
                return true;
            if (!(obj instanceof Node))
                return false;

            Node other = (Node) obj;
            return this.getName() == other.getName();
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }

        //compareTo must be overridden for usage in PriorityQueue.
        //We use "distance" as our comparison value for this problem.
        @Override
        public int compareTo(Node other) {
            if (other == null) {
                return -1;
            }
            if (this.getDistance() == other.getDistance()) {
                return 0;
            }
            return this.getDistance() > other.getDistance() ? 1 : -1;
        }

        //Boiler-plate getter/setters
        public String getName() {
            return name;
        }

        public int getDistance() {
            return distance;
        }

        public void setDistance(int distance) {
            this.distance = distance;
        }

        public Map<Integer, Node> getEdges() {
            return edges;
        }

        public void setVisited(boolean visited) {
            this.visited = visited;
        }

        public boolean isVisited() {
            return visited;
        }
    }

Additional Notes:

The runtime of this algorithm is O(m * log(n)) time where n is the number of vertices (a.k.a. nodes) and m is the number of edges in the graph.
This algorithm does not support negative edge weights since negative edge weights will break the loop-invariant.
If all the edge weights are 1, then this algorithm reduces to a plain breadth-first-search. In this scenario, using a FIFO queue will give us an O(m) solution.

Posts tagged Search Algorithm

Shortest Distance Between Two Nodes

Recent Posts

Archives

Categories

Posts tagged Search Algorithm

Shortest Distance Between Two Nodes

Recent Posts

Archives

Categories

Tags