Lowest common ancestor(LCA) using Range Minimum Query(RMQ)

Lowest common ancestor(LCA) using RMQ

We already have discussed lowest common ancestor and range minimum query. In this post, we will discuss how to use RMQ to find the lowest common ancestor of two given nodes in a binary tree or binary search tree. LCA of two nodes u and v is the node which is furthest from root and u and v are descendant of that node. For example, LCA node(5) and node(9) in below tree is node(2).

lowest common ancestor using RMQ

In earlier solutions, we scan the whole binary tree every time we have to find LCA of two nodes. This has a complexity of O(n) for each query. If this query if fired frequently, this operation may become a bottleneck of the algorithm. One way to avoid processing all nodes on each query is to preprocess binary tree and store precalculated information to find LCA of any two nodes in constant time.

This pattern is very similar to a range minimum query algorithm. Can we reduce the lowest common ancestor problem to range minimum query problem?

Reduction of lowest common ancestor problem to RMQ

Let’s revise what is RMQ: Given an array A of length n; RMQ(i,j) – returns the index of the minimum element in the subarray A[i..j].

lowest common ancestor using RMQ

Let’s find LCA of two nodes 5 and 8 manually in the above binary tree. We notice that LCA(u,v) is a shallowest common node (in terms of distance from root) which is visited when u and v are visited using the depth-first search of the tree. An important thing to note is that we are interested in shallowest, which is minimum depth, the node between u and v. Sounds like RMQ?

Implementation wise, the tree is traversed as Euler tour, which means we visit each node of tree, without lifting the pencil. This is very similar to a preorder traversal of a tree. At most, there can be 2n-1 nodes in Euler tour of a tree with n nodes, store this tour in an array E[1..2n-1].

As algorithm requires the shallowest node, closest to root, so we store the depth of each node while doing Euler tour, so we store the depth of each node in another array D[1..2n-1].

We should maintain the value when the node was visited for the first time. Why?

E[1..2n-1] – Store the nodes visited in a Euler tour of T. Euler[i] stores ith node visited in the tour.
D[1..2n-1] – Stores level of the nodes in tour. D[i] is the level of node at Euler[i]. (level is defined to be the distance from the root).
F[1..n] – F[i] will hold value when node is first visited.

For example of this graph, we start from node(1) and do Euler tour of the binary tree.

lowest common ancestor using rmq

Euler tour would be like

lca using rmq

Depth array is like

lca using rmq

First visit array looks like

lca using rmq

To compute LCA(u,v): All nodes in the Euler tour between the first visits to u and v are E[F[u]...F[v]] (assume F[u] is less than F[v] else, swap u and v). The shallowest node in this tour is at index RMQ D(F[u]..F[v]), since D[i] stores the depth of node at E[i].
RMQ function will return the index of the shallowest node between u and v, thus output node will be E[RMQ D(F[u], F[v])] as LCA(u,v)

Let’s take an example, find the lowest common ancestor of node(5) and node(8).

First of all, find the first visit to node(5) and node(8). It will be F[5] which is 2 and F[8] which is 7.

Now, all the nodes which come between visit of node(5) and node(8) are in E[2..7], we have to find the shallowest node out these nodes. This can be done by applying RMQ on array D with range 3 to 6.

lca using rmq

LCA will be E[RMQ( D(2,7)], in this case, RMQ(D[2..7]) is index 3. E[3] = 2, hence LCA(5,8) is node(2).

Lowest common ancestor using RMQ: Implementation

package com.company.BST;

import java.util.Arrays;

/**
 * Created by sangar on 1.1.19.
 */
public class LowestCommonAncestor {

    private int[] E;
    private int[] D;
    private int[] F;

    int[][] M;

    private int tourCount;

    public LowestCommonAncestor(BinarySearchTree tree){
        //Create Euler tour, Depth array and First Visited array
        E = new int[2*tree.getSize()];
        D = new int[2*tree.getSize()];
        F = new int[tree.getSize() + 1];

        M = new int[2 * tree.getSize()][2 * tree.getSize()];

        Arrays.fill(F, -1);
        getEulerTour(tree.getRoot(), E, D, F, 0);

        preProcess(D);
    }

    public int findLowestCommonAncestor(int u, int v){
        //This means node is not in tree
        if(u >= F.length || v >= F.length || F[u] == -1 || F[u] == -1)
            return -1 ;

        return E[rmq(D, F[u], F[v])];
    }

    /* This function does all the preprocessing on the tree and
       creates all required arrays for the algorithm.
    */
    private void getEulerTour(TreeNode node, int[] E, int[] D, int[] F,
                              int level){
        if(node == null) return;

        int val = (int)node.getValue();

        E[tourCount] = val; // add to tour
        D[tourCount] =  level; // store depth

        if(F[val] == -1) {
            F[(int) node.getValue()] = tourCount;
        }
        tourCount++;
        
        if(node.getLeft() != null ) {
            getEulerTour(node.getLeft(), E, D, F, level + 1);

            E[tourCount] = val;
            D[tourCount++] = level;
        }
        if(node.getRight() != null ) {
            getEulerTour(node.getRight(), E, D, F, level + 1);

            E[tourCount] = val;
            D[tourCount++] = level;
        }
    }

    /*
      This function preprocess the depth array to quickly find 
      RMQ which is used to find shallowest node.
     */
    void preProcess(int[] D) {

        for (int i = 0; i < D.length; i++)
            M[i][0] = i;

        for (int j = 1; 1 << j <D.length ; j++){
            for (int i = 0; i + (1 << j) - 1 < D.length; i++){
                if (D[M[i][j - 1]] < D[M[i + (1 << (j - 1))][j - 1]])
                    M[i][j] = M[i][j - 1];
                else
                    M[i][j] = M[i + (1 << (j - 1))][j - 1];
            }
        }
    }

    private int rmq(int a[], int start, int end){
        int j = (int)Math.floor(Math.log(end-start+1));

        if ( a[ M[start][j] ] <= a[M[end-(1<<j)+1][j]] )
            return M[start][j];

        else
            return M[end-(1<<j)+1][j];
    }
}

The beauty of this algorithm is that it can be used to find LCA of any tree, not just only binary tree or BST. The complexity of the algorithm to find a lowest common ancestor using range minimum query is (O(n), O(1)) with an additional space complexity of O(n).

Reference
Faster algorithms for finding lowest common ancestors in directed acyclic graphs

Please share if there is something wrong or missing. If you are preparing for an interview, please signup for free demo class to guide you through the process.