Lowest common ancestor(LCA) using Range Minimum Query(RMQ)

Lowest common ancestor(LCA) using RMQ

We already have discussed lowest common ancestor and range minimum query. In this post, we will discuss how to use RMQ to find the lowest common ancestor of two given nodes in a binary tree or binary search tree. LCA of two nodes u and v is the node which is furthest from root and u and v are descendant of that node. For example, LCA node(5) and node(9) in below tree is node(2).

lowest common ancestor using RMQ

In earlier solutions, we scan the whole binary tree every time we have to find LCA of two nodes. This has a complexity of O(n) for each query. If this query if fired frequently, this operation may become a bottleneck of the algorithm. One way to avoid processing all nodes on each query is to preprocess binary tree and store precalculated information to find LCA of any two nodes in constant time.

This pattern is very similar to a range minimum query algorithm. Can we reduce the lowest common ancestor problem to range minimum query problem?

Reduction of lowest common ancestor problem to RMQ

Let’s revise what is RMQ: Given an array A of length n; RMQ(i,j) – returns the index of the minimum element in the subarray A[i..j].

lowest common ancestor using RMQ

Let’s find LCA of two nodes 5 and 8 manually in the above binary tree. We notice that LCA(u,v) is a shallowest common node (in terms of distance from root) which is visited when u and v are visited using the depth-first search of the tree. An important thing to note is that we are interested in shallowest, which is minimum depth, the node between u and v. Sounds like RMQ?

Implementation wise, the tree is traversed as Euler tour, which means we visit each node of tree, without lifting the pencil. This is very similar to a preorder traversal of a tree. At most, there can be 2n-1 nodes in Euler tour of a tree with n nodes, store this tour in an array E[1..2n-1].

As algorithm requires the shallowest node, closest to root, so we store the depth of each node while doing Euler tour, so we store the depth of each node in another array D[1..2n-1].

We should maintain the value when the node was visited for the first time. Why?

E[1..2n-1] – Store the nodes visited in a Euler tour of T. Euler[i] stores ith node visited in the tour.
D[1..2n-1] – Stores level of the nodes in tour. D[i] is the level of node at Euler[i]. (level is defined to be the distance from the root).
F[1..n] – F[i] will hold value when node is first visited.

For example of this graph, we start from node(1) and do Euler tour of the binary tree.

lowest common ancestor using rmq

Euler tour would be like

lca using rmq

Depth array is like

lca using rmq

First visit array looks like

lca using rmq

To compute LCA(u,v): All nodes in the Euler tour between the first visits to u and v are E[F[u]...F[v]] (assume F[u] is less than F[v] else, swap u and v). The shallowest node in this tour is at index RMQ D(F[u]..F[v]), since D[i] stores the depth of node at E[i].
RMQ function will return the index of the shallowest node between u and v, thus output node will be E[RMQ D(F[u], F[v])] as LCA(u,v)

Let’s take an example, find the lowest common ancestor of node(5) and node(8).

First of all, find the first visit to node(5) and node(8). It will be F[5] which is 2 and F[8] which is 7.

Now, all the nodes which come between visit of node(5) and node(8) are in E[2..7], we have to find the shallowest node out these nodes. This can be done by applying RMQ on array D with range 3 to 6.

lca using rmq

LCA will be E[RMQ( D(2,7)], in this case, RMQ(D[2..7]) is index 3. E[3] = 2, hence LCA(5,8) is node(2).

Lowest common ancestor using RMQ: Implementation

package com.company.BST;

import java.util.Arrays;

/**
 * Created by sangar on 1.1.19.
 */
public class LowestCommonAncestor {

    private int[] E;
    private int[] D;
    private int[] F;

    int[][] M;

    private int tourCount;

    public LowestCommonAncestor(BinarySearchTree tree){
        //Create Euler tour, Depth array and First Visited array
        E = new int[2*tree.getSize()];
        D = new int[2*tree.getSize()];
        F = new int[tree.getSize() + 1];

        M = new int[2 * tree.getSize()][2 * tree.getSize()];

        Arrays.fill(F, -1);
        getEulerTour(tree.getRoot(), E, D, F, 0);

        preProcess(D);
    }

    public int findLowestCommonAncestor(int u, int v){
        //This means node is not in tree
        if(u >= F.length || v >= F.length || F[u] == -1 || F[u] == -1)
            return -1 ;

        return E[rmq(D, F[u], F[v])];
    }

    /* This function does all the preprocessing on the tree and
       creates all required arrays for the algorithm.
    */
    private void getEulerTour(TreeNode node, int[] E, int[] D, int[] F,
                              int level){
        if(node == null) return;

        int val = (int)node.getValue();

        E[tourCount] = val; // add to tour
        D[tourCount] =  level; // store depth

        if(F[val] == -1) {
            F[(int) node.getValue()] = tourCount;
        }
        tourCount++;
        
        if(node.getLeft() != null ) {
            getEulerTour(node.getLeft(), E, D, F, level + 1);

            E[tourCount] = val;
            D[tourCount++] = level;
        }
        if(node.getRight() != null ) {
            getEulerTour(node.getRight(), E, D, F, level + 1);

            E[tourCount] = val;
            D[tourCount++] = level;
        }
    }

    /*
      This function preprocess the depth array to quickly find 
      RMQ which is used to find shallowest node.
     */
    void preProcess(int[] D) {

        for (int i = 0; i < D.length; i++)
            M[i][0] = i;

        for (int j = 1; 1 << j <D.length ; j++){
            for (int i = 0; i + (1 << j) - 1 < D.length; i++){
                if (D[M[i][j - 1]] < D[M[i + (1 << (j - 1))][j - 1]])
                    M[i][j] = M[i][j - 1];
                else
                    M[i][j] = M[i + (1 << (j - 1))][j - 1];
            }
        }
    }

    private int rmq(int a[], int start, int end){
        int j = (int)Math.floor(Math.log(end-start+1));

        if ( a[ M[start][j] ] <= a[M[end-(1<<j)+1][j]] )
            return M[start][j];

        else
            return M[end-(1<<j)+1][j];
    }
}

The beauty of this algorithm is that it can be used to find LCA of any tree, not just only binary tree or BST. The complexity of the algorithm to find a lowest common ancestor using range minimum query is (O(n), O(1)) with an additional space complexity of O(n).

Reference
Faster algorithms for finding lowest common ancestors in directed acyclic graphs

Please share if there is something wrong or missing. If you are preparing for an interview, please signup for free demo class to guide you through the process.

Lowest common ancestor in binary tree

Lowest common ancestor (LCA) in BST

Given a binary search tree and two nodes, find the lowest node which is the parent of both given nodes, that is the lowest common ancestor (LCA). For example, in the following tree, LCA of 6 and 1 is node(5), whereas lowest common ancestor of node 17 and 6 would be node(10).

lowest common ancestor lca

Lowest common ancestor : Thoughts

What is the condition for a node to be LCA of two nodes?  If paths for given nodes diverges from the node, then node is lowest common ancestor. While path is common for both the nodes, nodes are common ancestor but they are not lowest or least. How can we find where paths are diverging?

Paths are diverging when one node is on left subtree and another node is on right subtree of the node. Brute force solution would be to find one node and then go up the tree and see at what parent node, other given node falls on the opposite subtree.

Implementation wise, traverse to node 1 and node 2 and store both paths on the stack. Then pop from two stacks till you find the same node on both paths, that node would be the lowest common ancestor. There will be two scans of the tree and additional space complexity to store paths which in the worst case be O(n).

However, the brute force solution does not use the property of a binary search tree. Property is that all the nodes on the left side of a node are smaller and all the nodes on the right side of a node are greater than node. Can we use that property to solve this problem?

Basic idea is to return the node if it is found in any of the subtrees. At any node, search for both given nodes in the left subtree.  If we get a non-empty node returned from the left subtree, there is at least one of the two nodes is on left subtree.

Again, search in right subtree these two nodes, if a non-empty node is returned from the right subtree, that means at least one of the node is on right subtree.

What does it means if we have a non-empty node on both left and right subtree? It means two nodes are on the left and right subtree, one on each side. It means the root node is the lowest common ancestor.

What if one of the returned nodes is empty? It means both nodes are on one side of the root node, and we should return the upwards the non-empty node returned.

Let’s take an example and see how does it work? Given the below tree, find the lowest common ancestor of node(1) and node(9).

lowest common anestor in binary tree

Start with the node(10) and look for the left subtree for both node(1) and node(9). Go down all the way to the node(1), at the time, we return 1 as the node as node.value is equal to one of the nodes.

lowest common anestor in binary tree

lowest common anestor in binary treeAt node(5), we have got node(1) return from left subtree. We will search for node(1) and node(9) on right subtree. We go all the way to node(6), which is leaf node.

least common ancestor in binary search tree

At node(8), left subtree returns nothing as none of the nodes in the left subtree of node(8). However, right subtree returns node(9).

lowest common ancestor

As per our algorithm, if either of subtree returns non-empty node, we return the node return from the subtree.

lca in binary search tree

At node(5), we get a non-empty node from right subtree and we already know, from the left subtree, we got node(1). At this point at node(5), we have both left and right subtree returning non-empty node, hence return the node(5).

lca in binary treee

Two nodes will be searched on the right subtree of node(10), which will return nothing, hence, final lowest common ancestor will be node(5).

Lowest common ancestor : Implementation

#include<stdio.h>
#include<stdlib.h>
 
struct node{
	int value;
	struct node *left;
	struct node *right;
};

typedef struct node Node;

Node * findLCA(Node *root, int val1, int val2)
{
    // Base case
    if (root == NULL) return NULL;
 
    /* If either val1 or val2 matches with root's key, 
       report the presence by returning the root
       (Note that if a key is the ancestor of other,
       then the ancestor key becomes LCA 
   */
    if (root->key == val1 || root->key == val2)
        return root;
 
    // Look for keys in left and right subtrees
    Node *left  = findLCA(root->left, val1, val2);
    Node *right = findLCA(root->right, val1, val2);
 
    /* If both of the above calls return Non-NULL,
       then one key is present in once subtree
       and other is present in other,
       So this node is the LCA */
    if (left && right)  return root;
 
    // Otherwise check if left subtree or right subtree is LCA
    return (left != NULL)? left : right;
}

Node * createNode(int value){
  Node *newNode =  (Node *)malloc(sizeof(Node));
  
  newNode->value = value;
  newNode->right= NULL;
  newNode->left = NULL;
  
  return newNode;
}

Node * addNode(Node *node, int value){
  if(node == NULL){
      return createNode(value);
  }
  else{
     if (node->value > value){
        node->left = addNode(node->left, value);
      }
      else{
        node->right = addNode(node->right, value);
      }
  }
  return node;
}
 
/* Driver program for the function written above */
int main(){
  Node *root = NULL;
  //Creating a binary tree
  root = addNode(root,30);
  root = addNode(root,20);
  root = addNode(root,15);
  root = addNode(root,25);
  root = addNode(root,40);
  root = addNode(root,37);
  root = addNode(root,45);
  
  printf("\n least common ancestor: %d ",
      leastCommonAncestor(root, 15, 25));
  
  return 0;
}

Below implementation only works for binary search tree and not for the binary tree as above method works.

#include<stdio.h>
#include<stdlib.h>
 
struct node{
	int value;
	struct node *left;
	struct node *right;
};

typedef struct node Node;

int leastCommonAncestor(Node *root, int val1, int val2){

 	if(!root)
       return -1;

 	if(root->value == val1 || root->value == val2)
    	return root->value;

 	/* Case 3: If one value is less and other greater
             than the current node
             Found the LCS return */
 	if((root->value > val1 && root->value <= val2) ||
  		(root->value <= val1 && root->value >val2)){
             return root->value;
 	}
  	/*Case 2 : If Both values are greater than current node, 
           look in right subtree */
 	else if(root->value < val1 && root->value <val2){
        return leastCommonAncestor(root->right, val1, val2);
 	}
 	/*Case 1 : If Both values are less than current node,
           look in left subtree */
 	else if(root->value > val1 && root->value > val2){
        return leastCommonAncestor(root->left, val1, val2);
 	}
}

Node * createNode(int value){
  Node *newNode =  (Node *)malloc(sizeof(Node));
  
  newNode->value = value;
  newNode->right= NULL;
  newNode->left = NULL;
  
  return newNode;
  
}

Node * addNode(Node *node, int value){
  if(node == NULL){
      return createNode(value);
  }
  else{
     if (node->value > value){
        node->left = addNode(node->left, value);
      }
      else{
        node->right = addNode(node->right, value);
      }
  }
  return node;
}
 
/* Driver program for the function written above */
int main(){
  Node *root = NULL;
  //Creating a binary tree
  root = addNode(root,30);
  root = addNode(root,20);
  root = addNode(root,15);
  root = addNode(root,25);
  root = addNode(root,40);
  root = addNode(root,37);
  root = addNode(root,45);
  
  printf("\n least common ancestor: %d ",
      leastCommonAncestor(root, 15, 25));
  
  return 0;
}

The worst complexity of the algorithm to find the lowest common ancestor in a binary tree is O(n). Also, keep in mind that recursion is involved. More skewed the tree, more stack frames on the stack and more the chances that stack will overflow.

This problem is solved using on traversal of tree and managing states when returning from recursive calls.

Please share if there is something wrong or missing. If you are willing to contribute and share your knowledge with thousands of learners across the world, please reach out to us at communications@algorithmsandme.com