Connect n ropes with minimum cost

Connect n ropes with minimum cost

Given n ropes of different lengths, we need to connect these n ropes into one rope. The cost to connect two ropes is equal to the sum of their lengths. We need to connect all the ropes with the minimum total cost.

For example, if there are 4 ropes of lengths 5, 2, 3 and 9, we can connect the ropes in the following way: First, connect the ropes of lengths 2 and 3; the cost of this connection is the sum of their lengths, which is 2 + 3 = 5. We are left with three ropes of lengths 5, 5 and 9. Next, connect the ropes of lengths 5 and 5; the cost of this connection is 10, so the total cost till now is 5 + 10 = 15. We have two ropes left with lengths 10 and 9. Finally, connect the last two ropes; all ropes are now connected and the total cost is 15 + 19 = 34.

Another way of connecting the ropes would be: connect the ropes with lengths 5 and 9 first (we get three ropes of lengths 3, 2 and 14), then connect 14 and 3, which gives us two ropes of lengths 17 and 2. Finally, we connect 17 and 2. The total cost this way is 14 + 17 + 19 = 50, which is much higher than the optimal cost we had earlier.

Minimum cost to connect n ropes: algorithm

When we were doing the calculations in the examples, did you notice one thing? The lengths of the ropes connected first are added again in all subsequent connections. For example, we connected the ropes with lengths 2 and 3 in the first example; that sum gets added to the next connection as part of the rope with length 5, and again when we connect the ropes with lengths 10 and 9, 2 + 3 is already inside 10.

Read Huffman coding to understand how to solve this problem from this hint.

All we have to make sure is that the rope which gets added again most often is the smallest, then the second smallest and so on. This gives the idea that if we sort the ropes by length, connect the two smallest, and sort the array again, repeating until only one rope is left, we will always get the optimal cost to connect the ropes.

What will be the complexity of this implementation? The complexity will be dominated by the sorting algorithm; the best we can achieve is O(n log n) using quicksort or merge sort. Also, after connecting two ropes we have to sort the array again. So the overall complexity of this method is O(n² log n).

Can we do better than this? Do we need the array sorted at all times? All we need are the two ropes with the least lengths. Which data structure gives us the minimum element in the least time? A min heap does. If we create a min heap with the lengths of the ropes, we can read the minimum in O(1) and extract the two smallest ropes in O(log n) each.

  1. Create a min heap from the array of rope lengths
  2. Fetch the root which will give us smallest rope
  3. Fetch the root again which will give us second smallest rope
  4. Add the two lengths, count the sum towards the total cost, and put the sum back into the heap
  5. Go back to step 2

Minimum cost to connect ropes

package com.company;

import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;
import java.util.stream.Collectors;

/**
 * Created by sangar on 3.1.19.
 */
public class ConnectRopes {

    public int getMinimumCost(int[] ropeLength){

        PriorityQueue<Integer> minHeap = new PriorityQueue<Integer>();

        /*
        There is no shortcut for converting from int[] to List<Integer> as Arrays.asList
        does not deal with boxing and will just create a List<int[]>
        which is not what you want.
         */
        List<Integer> list = Arrays.stream(ropeLength).boxed().collect(Collectors.toList());

        /*
        Javadoc seems to imply that addAll is inherited from AbstractQueue where
        it is implemented as a sequence of adds.
        So complexity of this operation is O(nlogn)
         */
        minHeap.addAll(list);

        int totalLength = 0;

        while(minHeap.size() > 1){
            int len1 = (int)minHeap.remove();
            int len2 = (int)minHeap.remove();

            totalLength+=(len1 + len2);

            minHeap.add(len1+len2);
        }

        return totalLength;
    }
}

Test cases

package test;

import com.company.ConnectRopes;
import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertEquals;
/**
 * Created by sangar on 23.9.18.
 */
public class ConnectRopeTest {

    ConnectRopes tester = new ConnectRopes();

    @Test
    public void minimumCostTest() {

        int[] a = {5,2,3,9};

        assertEquals(34, tester.getMinimumCost(a));
    }
    @Test
    public void minimumCostOneRopeTest() {

        int[] a = {5};

        assertEquals(0, tester.getMinimumCost(a));
    }
}

The complexity of this implementation is O(n log n) to create the min heap out of the array (Java's PriorityQueue addAll performs n individual adds) plus O(n log n) to repeatedly fetch the two minimums and re-heapify. The initial cost of building a heap from an array can, however, be brought down to O(n) by using our own implementation of a min heap with bottom-up heapify.
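
As a rough sketch of that last point (this is not part of the original implementation; the class and method names are just for illustration), bottom-up heapify builds a min heap from an array in O(n):

import java.util.Arrays;

public class MinHeapBuilder {

    // Sift the element at index i down until the min-heap property holds below it.
    private static void siftDown(int[] heap, int size, int i) {
        while (true) {
            int left = 2 * i + 1, right = 2 * i + 2, smallest = i;
            if (left < size && heap[left] < heap[smallest]) smallest = left;
            if (right < size && heap[right] < heap[smallest]) smallest = right;
            if (smallest == i) return;
            int tmp = heap[i]; heap[i] = heap[smallest]; heap[smallest] = tmp;
            i = smallest;
        }
    }

    // Bottom-up heap construction: sift down every internal node, O(n) overall.
    public static int[] buildMinHeap(int[] ropeLengths) {
        int[] heap = Arrays.copyOf(ropeLengths, ropeLengths.length);
        for (int i = heap.length / 2 - 1; i >= 0; i--) {
            siftDown(heap, heap.length, i);
        }
        return heap;
    }

    public static void main(String[] args) {
        // For ropes {5, 2, 3, 9} this prints a valid min heap such as [2, 5, 3, 9]
        System.out.println(Arrays.toString(buildMinHeap(new int[]{5, 2, 3, 9})));
    }
}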

Please share if there is something wrong or missing.

Lowest common ancestor(LCA) using Range Minimum Query(RMQ)

Lowest common ancestor(LCA) using RMQ

We have already discussed the lowest common ancestor and the range minimum query. In this post, we will discuss how to use RMQ to find the lowest common ancestor of two given nodes in a binary tree or binary search tree. The LCA of two nodes u and v is the node which is farthest from the root such that both u and v are its descendants. For example, the LCA of node(5) and node(9) in the tree below is node(2).

lowest common ancestor using RMQ

In earlier solutions, we scan the whole binary tree every time we have to find the LCA of two nodes. This has a complexity of O(n) for each query. If this query is fired frequently, the operation may become a bottleneck of the algorithm. One way to avoid processing all nodes on each query is to preprocess the binary tree and store precalculated information to find the LCA of any two nodes in constant time.

This pattern is very similar to a range minimum query algorithm. Can we reduce the lowest common ancestor problem to range minimum query problem?

Reduction of lowest common ancestor problem to RMQ

Let’s revise what RMQ is: given an array A of length n, RMQ(i,j) returns the index of the minimum element in the subarray A[i..j].

lowest common ancestor using RMQ

Let’s find the LCA of nodes 5 and 8 manually in the above binary tree. We notice that LCA(u,v) is the shallowest common node (in terms of distance from the root) which is visited between the visits to u and v during a depth-first search of the tree. The important thing to note is that we are interested in the shallowest, i.e., the minimum-depth, node between u and v. Sounds like RMQ?

Implementation wise, the tree is traversed as an Euler tour, which means we visit each node of the tree without lifting the pencil; this is very similar to a preorder traversal of a tree. There can be at most 2n-1 entries in the Euler tour of a tree with n nodes; we store this tour in an array E[1..2n-1].

As the algorithm requires the shallowest node, i.e., the one closest to the root, we also record the depth of each node visited during the Euler tour in another array D[1..2n-1].

We also maintain the index at which each node was visited for the first time. Why?

E[1..2n-1] – stores the nodes visited in an Euler tour of T; E[i] is the ith node visited in the tour.
D[1..2n-1] – stores the levels of the nodes in the tour; D[i] is the level of the node at E[i] (level is defined as the distance from the root).
F[1..n] – F[i] holds the index of the first visit to node i in the tour.

For example, for this tree, we start from node(1) and do an Euler tour of the binary tree.

lowest common ancestor using rmq

Euler tour would be like

lca using rmq

Depth array is like

lca using rmq

First visit array looks like

lca using rmq

To compute LCA(u,v): all the nodes visited in the Euler tour between the first visits to u and v are E[F[u]..F[v]] (assume F[u] ≤ F[v]; else swap u and v). The shallowest node in this range is at index RMQ_D(F[u], F[v]), since D[i] stores the depth of the node at E[i].
The RMQ function returns the index of the shallowest node between u and v, so the output node is E[RMQ_D(F[u], F[v])], which is LCA(u,v).

Let’s take an example, find the lowest common ancestor of node(5) and node(8).

First of all, find the first visit to node(5) and node(8). It will be F[5] which is 2 and F[8] which is 7.

Now, all the nodes which come between the visits of node(5) and node(8) are in E[2..7]; we have to find the shallowest node among them. This can be done by applying RMQ on array D over the range 2 to 7.

lca using rmq

The LCA will be E[RMQ_D(2,7)]; in this case, RMQ_D(2,7) is index 3 and E[3] = 2, hence LCA(5,8) is node(2).

Lowest common ancestor using RMQ: Implementation

package com.company.BST;

import java.util.Arrays;

/**
 * Created by sangar on 1.1.19.
 */
public class LowestCommonAncestor {

    private int[] E;
    private int[] D;
    private int[] F;

    int[][] M;

    private int tourCount;

    public LowestCommonAncestor(BinarySearchTree tree){
        //Create Euler tour, Depth array and First Visited array
        E = new int[2*tree.getSize()];
        D = new int[2*tree.getSize()];
        F = new int[tree.getSize() + 1];

        M = new int[2 * tree.getSize()][2 * tree.getSize()];

        Arrays.fill(F, -1);
        getEulerTour(tree.getRoot(), E, D, F, 0);

        preProcess(D);
    }

    public int findLowestCommonAncestor(int u, int v){
        //This means node is not in tree
        if(u >= F.length || v >= F.length || F[u] == -1 || F[v] == -1)
            return -1;

        //rmq expects the earlier first-visit index first, so swap if needed
        if(F[u] > F[v]) return E[rmq(D, F[v], F[u])];
        return E[rmq(D, F[u], F[v])];
    }

    /* This function does all the preprocessing on the tree and
       creates all required arrays for the algorithm.
    */
    private void getEulerTour(TreeNode node, int[] E, int[] D, int[] F,
                              int level){
        if(node == null) return;

        int val = (int)node.getValue();

        E[tourCount] = val; // add to tour
        D[tourCount] =  level; // store depth

        if(F[val] == -1) {
            F[(int) node.getValue()] = tourCount;
        }
        tourCount++;
        
        if(node.getLeft() != null ) {
            getEulerTour(node.getLeft(), E, D, F, level + 1);

            E[tourCount] = val;
            D[tourCount++] = level;
        }
        if(node.getRight() != null ) {
            getEulerTour(node.getRight(), E, D, F, level + 1);

            E[tourCount] = val;
            D[tourCount++] = level;
        }
    }

    /*
      This function preprocess the depth array to quickly find 
      RMQ which is used to find shallowest node.
     */
    void preProcess(int[] D) {

        for (int i = 0; i < D.length; i++)
            M[i][0] = i;

        for (int j = 1; 1 << j <D.length ; j++){
            for (int i = 0; i + (1 << j) - 1 < D.length; i++){
                if (D[M[i][j - 1]] < D[M[i + (1 << (j - 1))][j - 1]])
                    M[i][j] = M[i][j - 1];
                else
                    M[i][j] = M[i + (1 << (j - 1))][j - 1];
            }
        }
    }

    private int rmq(int a[], int start, int end){
        int j = (int)Math.floor(Math.log(end-start+1)/Math.log(2));

        if ( a[ M[start][j] ] <= a[M[end-(1<<j)+1][j]] )
            return M[start][j];

        else
            return M[end-(1<<j)+1][j];
    }
}
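
For completeness, a usage sketch is below. It assumes the same BinarySearchTree and TreeNode helper classes used elsewhere on this site, with insert(), getRoot() and getSize() methods; treat the exact API as an assumption. Note that the class above indexes the first-visit array F by node value, so the values inserted should be small integers no larger than the number of nodes.

package com.company.BST;

public class LowestCommonAncestorExample {
    public static void main(String[] args) {
        BinarySearchTree tree = new BinarySearchTree();
        tree.insert(4);
        tree.insert(2);
        tree.insert(6);
        tree.insert(1);
        tree.insert(3);
        tree.insert(5);

        LowestCommonAncestor lca = new LowestCommonAncestor(tree);

        //Expected output: 2, the lowest common ancestor of node(1) and node(3)
        System.out.println(lca.findLowestCommonAncestor(1, 3));
    }
}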

The beauty of this algorithm is that it can be used to find the LCA of any tree, not just a binary tree or BST. With the sparse table preprocessing used here, we spend O(n log n) time on preprocessing and answer each LCA query in O(1); using the ±1 RMQ technique, preprocessing can be brought down to O(n), giving the well-known (O(n), O(1)) bound with O(n) additional space.

Reference
Faster algorithms for finding lowest common ancestors in directed acyclic graphs

Please share if there is something wrong or missing. If you are preparing for an interview, please signup for free demo class to guide you through the process.

Breadth First traversal

Breadth First traversal

In the last post, we discussed depth first traversal of a graph. Today, we will discuss another way to traverse a graph, which is breadth first traversal. What is breadth first traversal? Unlike depth-first traversal, where we go deep before visiting neighbors, in breadth-first search, we visit all the neighbors of a node before moving a level down. For example, breadth first traversal of the graph shown below will be [1,2,5,3,4,6]

breadth first traversal

In breadth first search, we finish visiting all the nodes at a level before going further down the graph. For example, the graph used in the above example can be divided into three levels as shown.

breadth first search

We start with the node at level 1, which is node(1). Then we visit all the nodes which are one level below node(1), which are node(2) and node(5). Then we visit all the nodes at level 3, which are node(3), node(4) and node(6).

Breadth First Traversal: algorithm

  1. Start with the given node u, put node u in the queue.
  2. While the queue is not empty, repeat the steps below:
    1. Dequeue a node u from the queue and print it.
    2. For each neighbor v of u:
    3. If v is not visited already, add v to the queue
    4. and mark v as visited.

Let’s take an example and see how it works. Below is the graph and we have to find BFS for this graph.
breadth first traversal

We start from node(1), and put it in the queue.
breadth first traversal of graph

While the queue is not empty, we pop from it and print the node. In this case, node(1) will be printed. Next, we go through all the neighbors of node(1) and put all the unvisited nodes on the queue; node(2) and node(5) go onto the queue and are marked as visited. Traversal = {1}

breadth first search

Again, we dequeue from the queue and this time we get node(2). We print it, put its unvisited neighbors node(3) and node(4) on the queue, and mark them as visited. Traversal = {1,2}

node(5) is dequeued next and printed. Here, even though node(4) is a neighbor of node(5), it is already visited and hence not put on to the queue again. But node(6) is not yet visited, so put it on to the queue. Traversal = {1,2,5}

Now, we pop node(3) and print it, however, node(4) is already visited. Hence, nothing is added to the queue. Traversal = {1,2,5,3}

Next, node(4) is taken out from queue and printed, nothing goes on to queue. Traversal = {1,2,5,3,4}

Last, we pop node(6) and print it. Traversal = {1,2,5,3,4,6}.

At this point, the queue is empty and we stop traversal.

Breadth first traversal: implementation

public ArrayList<Integer> breadthFirstTraversal(){

        boolean[] visited = new boolean[this.G.length];
        ArrayList<Integer> traversal = new ArrayList<>();

        Queue<Integer> q = new LinkedList<>();

        //This is start node
        q.add(1);
        visited[1] = true;

        while(!q.isEmpty()){
            int u = (int)q.remove();
            traversal.add(u);

            for(int i = 1; i < this.G[u].length; i++){
                if(this.G[u][i] && !visited[i]){
                    q.add(i);
                    visited[i]= true;
                }
            }
        }
        System.out.println(traversal);
        return traversal;

    }

The complexity of this code is O(V²), as all V nodes go into the queue and for each node the inner for loop runs V times.

Implementation of breadth-first search on a graph represented by an adjacency list

  public ArrayList<Integer> breadthFirstTraversal(){

        boolean[] visited = new boolean[this.G.size()];
        ArrayList<Integer> traversal = new ArrayList<>();

        Queue<Integer> q = new LinkedList<>();

        //This is start node
        q.add(1);
        visited[1] = true;

        //This loop will run for V times, once for each node.
        while(!q.isEmpty()){
            int u = (int)q.remove();
            traversal.add(u);

            /*This loop has a worst-case complexity of O(V), where 
               node has an edge to every other node, but 
               the total number of times this loop will run is E times 
               where E number of edges.
             */
            for(int v : this.G.get(u)){
                if(!visited[v]){
                    q.add(v);
                    visited[v]= true;
                }
            }
        }
        System.out.println(traversal);
        return traversal;

    }

The complexity of Breadth First Search is O(V+E) where V is the number of vertices and E is the number of edges in the graph.

The complexity difference in BFS when implemented with an adjacency list versus an adjacency matrix comes from the fact that in an adjacency matrix, to tell which nodes are adjacent to a given vertex, we take O(|V|) time irrespective of the number of edges. In an adjacency list, the adjacent vertices are immediately available, taking time proportional to the number of adjacent vertices, which summed over all |V| vertices is |E|. So, BFS with an adjacency list gives O(|V| + |E|).

StackOverflow

When a graph is dense, that is, E approaches V², O(V + E) is effectively O(V²)

Applications of Breadth first traversal

  1. To find the shortest path between two nodes u and v in an unweighted graph
  2. To test the bipartite-ness of a graph (a sketch follows this list)
  3. To find all nodes within one connected component
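
As an illustration of the second application, here is a rough sketch (not from the original post) of testing bipartiteness with BFS on a 0-indexed adjacency-list graph; the List<List<Integer>> representation and the two-coloring array are assumptions made for this example.

import java.util.*;

public class BipartiteCheck {
    /* Returns true if the graph (0-indexed adjacency list) can be two-colored,
       i.e., no edge connects two nodes of the same color. */
    public static boolean isBipartite(List<List<Integer>> graph) {
        int[] color = new int[graph.size()];
        Arrays.fill(color, -1); // -1 means not colored/visited yet

        for (int start = 0; start < graph.size(); start++) {
            if (color[start] != -1) continue; // component already handled
            color[start] = 0;
            Queue<Integer> q = new LinkedList<>();
            q.add(start);

            while (!q.isEmpty()) {
                int u = q.remove();
                for (int v : graph.get(u)) {
                    if (color[v] == -1) {
                        color[v] = 1 - color[u]; // opposite color of its parent
                        q.add(v);
                    } else if (color[v] == color[u]) {
                        return false; // an edge joins two nodes of the same color
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The square 0-1-2-3-0 is bipartite; adding edge 0-2 would break it.
        List<List<Integer>> g = Arrays.asList(
            Arrays.asList(1, 3), Arrays.asList(0, 2),
            Arrays.asList(1, 3), Arrays.asList(2, 0));
        System.out.println(isBipartite(g)); // prints true
    }
}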

Please share if there is something wrong or missing. If you are preparing for an interview and want a free coaching session to guide you through it, please reach out to us at communications@algorithmsandme.com

Runway reservation system

Runway reservation system

Given an airport with a single runway, we have to design a runway reservation system for that airport. To add more detail: each reservation request comes with a requested landing time, let’s say t. The landing can go through if there is no landing already scheduled within k minutes of the requested time; in that case, t can be added to the set of scheduled landings. k can vary and depends on external conditions. This system helps with reservations for future landings.
Once a plane lands safely, we have to remove it from the set of scheduled landings.

This is perfectly possible at airports with multiple runways where only one runway can be used because of weather conditions, maintenance, etc. Also, one landing cannot immediately follow another due to safety reasons; that’s why there has to be some minimum time gap before another landing takes place. We have to build this system with the given constraints.

In a nutshell, we have to create a set of landings which are compatible with each other, i.e., they do not violate the given constraint. There are two operations to be performed on this set: insertion and removal. Insertion involves checking the constraint.

Example: let’s say below is the timeline of all the landings currently scheduled and k = 3 minutes.

reservation system

Now, if a new request comes for landing at 48.5, it should be added to the set as it does not violate the constraint of k minutes. However, if a request comes for landing at 53, it cannot be added as it violates the constraint.
If a new request comes for 35, it is invalid as the request is in the past.

Reservation system: thoughts

What is the most brute force solution which comes to mind? We can store all the incoming requests in an unsorted array. Insertion is an O(1) operation as it happens at the end. However, checking that the constraint is satisfied takes O(n), because we have to scan through the entire array.
The same is true even if we use an unsorted linked list.

How about a sorted array? We can use binary search to check the constraint, which takes O(log n). However, insertion will still be O(n), as we have to move all the elements to the right of the insertion position.

A sorted linked list solves the problem of insertion in O(1) once the position is known, but then searching for the position and checking the constraint is O(n). It just moves the problem from one place to another.

Reservation system using binary search tree

We have to optimize two things: first, checking whether the new request meets the constraint; second, inserting the new request into the set.

Let’s think of a binary search tree. To check the constraint, we do not have to check every node of the tree: based on the relationship between the current node and the new request time, we can discard half of the tree at each step (by the binary search tree property, all the nodes in the left subtree are smaller than the current node and all the nodes in the right subtree are greater).

When a new request comes, we check it against the root node; if it does not violate the constraint, we then check whether the requested time is less than the root. If yes, we go to the left subtree and check there; if the requested landing time is greater than the root node, we go to the right subtree.
When we reach a leaf node, we add a new node with the new landing time as the value of that node.

If at any given node the constraint is violated, i.e., the new landing time is within k minutes of the time stored in that node, then we just return, stating that it is not possible to add the new landing.

What will be the complexity of checking the constraint? It will be O(h), where h is the height of the binary search tree. The insert itself is then an O(1) operation.

Reservation system implementation

#include<stdio.h>
#include<stdlib.h>
#include<math.h>
 
struct node{
    int value;
    struct node *left;
    struct node *right;
};
typedef struct node Node;

void inorderTraversal(Node * root){
    if(!root) return;
	
    inorderTraversal(root->left);
    printf("%d ", root->value);
    inorderTraversal(root->right);
}

Node *createNode(int value){
    Node * newNode =  (Node *)malloc(sizeof(Node));
    
    newNode->value = value;
    newNode->right= NULL;
    newNode->left = NULL;
	
    return newNode;
}

Node *addNode(Node *node, int value, int K){
	if(!node)
        return createNode(value);
    
    if ( node->value + K > value && node->value - K < value ){
        return node;
    }
    if (node->value > value)
        node->left = addNode(node->left, value, K);
    else
        node->right = addNode(node->right, value, K);
    return node;
}

/* Driver program for the function written above */
int main(){
    Node *root = NULL;
	
    //Creating a binary tree
    root = addNode(root, 30, 3);
    root = addNode(root, 20, 3);
    root = addNode(root, 15, 3);
    root = addNode(root, 25, 3);
    root = addNode(root, 40, 3);
    root = addNode(root, 38, 3);
    root = addNode(root, 45, 3);
    inorderTraversal(root);
	
    return 0;
}

Let’s say a new requirement comes in: find how many flights are scheduled up to time t.

This problem can easily be solved using a binary search tree by keeping track of the size of the subtree at each node, as shown in the figure below.

runway reservation system

While inserting a new node, update the counters of all the nodes on the path from the root to the new node. When the query is run, accumulate subtree sizes while searching for t and return the number of scheduled landings with time less than or equal to t, as in the sketch below.
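
A minimal sketch of this idea follows (written in Java with a hypothetical node class that stores a subtree size; the names and structure are assumptions, not the original post's code, and the k-minute constraint check is omitted for brevity):

public class RankTree {
    static class Node {
        int value;
        int size = 1;          // number of nodes in the subtree rooted here
        Node left, right;
        Node(int value) { this.value = value; }
    }

    // Insert a landing time and update subtree sizes along the path.
    static Node insert(Node node, int value) {
        if (node == null) return new Node(value);
        if (value < node.value) node.left = insert(node.left, value);
        else node.right = insert(node.right, value);
        node.size = 1 + size(node.left) + size(node.right);
        return node;
    }

    static int size(Node node) { return node == null ? 0 : node.size; }

    // Count landings scheduled at or before time t.
    static int countUpTo(Node node, int t) {
        if (node == null) return 0;
        if (node.value <= t)
            return 1 + size(node.left) + countUpTo(node.right, t);
        return countUpTo(node.left, t);
    }

    public static void main(String[] args) {
        Node root = null;
        for (int time : new int[]{30, 20, 15, 25, 40, 38, 45}) root = insert(root, time);
        System.out.println(countUpTo(root, 37)); // 4 -> 15, 20, 25 and 30
    }
}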

Please share if there is something wrong or missing. If you are preparing for an interview, please signup to receive free preparation material.

Prune nodes not on paths with given sum

Prune nodes not on paths with given sum

Prune nodes not on paths with given sum is a very commonly asked question in Amazon interviews. It involves two concepts in one problem: first, how to find paths with a given sum, and second, how to prune nodes from a binary tree. The problem statement is:

Given a binary tree, prune the nodes which are not on paths with a given sum.

For example, given the binary tree below and a given sum of 43, the red nodes will be pruned as they are not on any path with sum 43.

Prune nodes not on path with given sum
Prune nodes not on path with given sum

Prune nodes in a binary tree: thoughts

To solve this problem, first, understand how to find paths with a given sum in a binary tree. To prune all nodes which are not on these paths, one way is to get all the nodes which are not part of any such path and then delete those nodes one by one. That requires two traversals of the binary tree.
Is it possible to delete a node while calculating the paths with the given sum? At what point do we find out that a node is not on a path with the given sum? At the leaf node.
Once we know that a leaf node is not part of any path with the given sum, we can safely delete it. What about its parent? We cannot directly delete the parent node, as there may be another subtree which leads to a path with the given sum. Hence, for every node, the pruning depends on what comes up from processing its subtrees.

At the leaf node, we return false to the parent if this leaf cannot be part of the path, and delete the leaf node. At a parent node, we look at the return values from both subtrees. If both subtrees return false, it means this node is not part of any path with the given sum and can be deleted. If one of the subtrees returns true, it means the current node is part of a path with the given sum; it should not be deleted and should return true to its parent.

Prune nodes from a binary tree: implementation

#include<stdio.h>
#include<stdlib.h>
#include<math.h>
 
struct node{
	int value;
	struct node *left;
	struct node *right;
};
typedef struct node Node;

#define true 1
#define false 0

int prunePath(Node *node, int sum ){
	
	if( !node ) return false;
	
	int subSum =  sum - node->value;
	
	/*Check if node is leaf node */
	int isLeaf = !( node->left || node->right );
	
	/* If node is a leaf node and it is part of a path with sum equal
	to the given sum, return true to the parent node so that the
	parent node is not deleted */
	if(isLeaf && !subSum )
		return true;
		
	/* If node is a leaf and it is not part of a path with sum equal
	to the given sum, free it and return false to the parent node */
    else if(isLeaf && subSum ){
    	free(node);
    	return false;
    }
    /* Node is not a leaf: check whether the left subtree or the
	right subtree contributes to the total sum */
    int leftVal = prunePath(node->left, subSum);
    int rightVal = prunePath(node->right, subSum);
    
    /* A subtree which returned false has already freed its nodes,
	so detach it from the current node */
    if(!leftVal)
    	node->left = NULL;
    if(!rightVal)
    	node->right = NULL;
    
    /* This is the crux of the algorithm.
    1. If neither subtree can lead to a path with the given sum,
	delete the node.
    2. If at least one subtree can lead to a path with sum equal
	to the given sum, do not delete the node */ 
    if(!(leftVal || rightVal) ){
    	free(node);
    	return false;
    }
    return true;
}

void inorderTraversal(Node * root){
	if(!root)
		return;
	
	inorderTraversal(root->left);
	printf("%d ", root->value);
	inorderTraversal(root->right);
}
Node *createNode(int value){
	Node * newNode =  (Node *)malloc(sizeof(Node));
	newNode->value = value;
	newNode->right= NULL;
	newNode->left = NULL;
	
	return newNode;
}
Node *addNode(Node *node, int value){
	if(node == NULL){
		return createNode(value);
	}
	else{
		if (node->value > value){
			node->left = addNode(node->left, value);
		}
		else{
			node->right = addNode(node->right, value);
		}
	}
	return node;
}

/* Driver program for the function written above */
int main(){
	Node *root = NULL;
	//Creating a binary tree
	root = addNode(root,30);
	root = addNode(root,20);
	root = addNode(root,15);
	root = addNode(root,25);
	root = addNode(root,40);
	root = addNode(root,37);
	root = addNode(root,45);
	
	inorderTraversal(root);	
	prunePath(root, 65);
	
	printf( "\n");
	if( root ){
		inorderTraversal(root);	
	}
	return 0;
}

The complexity of this algorithm to prune all nodes which are not on the path with a given sum is O(n).

Please share if there is something wrong or missing. If you are preparing for interviews, please signup for free interview material.

Print paths in a binary tree

Print paths in a binary tree

We have learned various kinds of traversals of a binary tree, like inorder, preorder and postorder. The paths in a binary tree problem requires traversal of a binary tree too, like every other problem on a binary tree. The problem statement is:

Given a binary tree, print all paths in that binary tree

What is a path in a binary tree? A path is a collection of nodes from the root to any leaf of the tree. By definition, a leaf node is a node which does not have left or right child. For example, one of the paths in the binary tree below is 10,7,9.

paths in a binary tree
Paths in a binary tree

Paths in a binary tree: thoughts

It is clear from the problem statement that we have to start with the root and go all the way to the leaf nodes. The question is: do we need to start from the root again to access each path in the binary tree? Well, no. Paths have common nodes in them. Once we reach the end of a path (a leaf node), we just move upwards one node at a time and explore other paths from the parent node. Once all those paths are explored, we go one more level up and explore all paths from there.

This is a typical postorder traversal of a binary tree: we finish the paths in the left subtree of a node before exploring the paths in its right subtree. We process the node itself before going into the left or right subtree, checking whether it is a leaf node and adding it to the current path. Once we have explored the left and right subtrees, the current node is removed from the path.

Let’s take an example and see how does it work. Below is the tree for each we have to print all the paths in it.

paths in a binary tree
Paths in binary tree

First of all, our list of paths is empty. We create a current path and start from the root node, which is node(10). Add node(10) to the current path. As node(10) is not a leaf node, we move towards the left subtree.

print paths in a binary tree

node(7) is added to the current path. Also, it is not a leaf node either, so we again go down the left subtree.

paths in binary search tree

node(8) is added to the current path and this time, it is a leaf node. We put the entire path into the list of paths or print the entire path based on how we want the output.

paths in a binary tree

At this point, we take out node(8) from the current path and move up to node(7). As we have traversed the left subtree of node(7), we will now traverse the right subtree of node(7).

paths in binary search tree

node(9) is added now to the current path. It is also a leaf node, so again, put the path in the list of paths. node(9) is moved out of the current path.

Now, left and right subtrees of node(7) have been traversed, we remove node(7) from the current path too.

At this point, we have only one node in the current path which is the node(10) We have already traversed the left subtree of it. So, we will start traversing the right subtree, next we will visit node(15) and add it to the current path.

node(15) is not a leaf node, so we move down the left subtree. node(18) is added to the current path. node(18) is a leaf node too. So, add the entire path to the list of paths. Remove node(18) from the current path.

We go next to the right subtree of the node(15), which is the node(19). It is added to the current path. node(19) is also a leaf node, so the path is added to the list of paths.

Now, the left and right subtrees of the node(15) are traversed, it is removed from the current path and so is the node(10).

Print paths in a binary tree: implementation

package com.company.BST;

import java.util.ArrayList;

/**
 * Created by sangar on 21.10.18.
 */
public class PrintPathInBST {
    public void printPath(BinarySearchTree tree){
        ArrayList<TreeNode> path  = new ArrayList<>();
        this.printPathRecursive(tree.getRoot(), path);
    }

    private void printPathRecursive(TreeNode root,
									ArrayList<TreeNode> path){
        if(root == null) return;

        path.add(root);

        //If node is leaf node
        if(root.getLeft() == null && root.getRight() == null){
            path.forEach(node -> System.out.print(" " 
							+ node.getValue()));
            path.remove(path.size()-1);
            System.out.println();
            return;
        }

        /*Not a leaf node, add this node to 
		path and continue traverse */
        printPathRecursive(root.getLeft(),path);
        printPathRecursive(root.getRight(), path);

		//Remove the root node from the path
        path.remove(path.size()-1);
    }
}

Test cases

package com.company.BST;
 
/**
 * Created by sangar on 10.5.18.
 */
public class BinarySearchTreeTests {
    public static void main (String[] args){
        BinarySearchTree binarySearchTree = new BinarySearchTree();
 
        binarySearchTree.insert(7);
        binarySearchTree.insert(8);
        binarySearchTree.insert(6);
        binarySearchTree.insert(9);
        binarySearchTree.insert(3);
        binarySearchTree.insert(4);
 
        new PrintPathInBST().printPath(binarySearchTree);
    }
}

Tree node definition

package com.company.BST;

/**
 * Created by sangar on 21.10.18.
 */
public class TreeNode<T> {
    private T value;
    private TreeNode left;
    private TreeNode right;

    public TreeNode(T value) {
        this.value = value;
        this.left = null;
        this.right = null;
    }

    public T getValue(){
        return this.value;
    }
    public TreeNode getRight(){
        return this.right;
    }
    public TreeNode getLeft(){
        return this.left;
    }

    public void setValue(T value){
        this.value = value;
    }

    public void setRight(TreeNode node){
        this.right = node;
    }

    public void setLeft(TreeNode node){
        this.left = node;
    }
}

Complexity of above algorithm to print all paths in a binary tree is O(n).

Please share if there is something wrong or missing. If you are preparing for Amazon, Microsoft or Google interviews, please signup for interview material for free.

First non repeated character in string

First non repeated character in string

Given a string, find the first non repeated character in it. For example, if the string is abcbdbdebab, the first non repeating character would be c. Even though e is also non repeating in the string, c is the output as it is the first non repeating character.

Non repeating character : thoughts

What does it mean to be a non-repeating character? Well, the character should occur in the string only once. How about we scan the string and find the count of each character? Store the character and its count in a map as a key-value pair.
Now that we have a <character, count> pair for every unique character in the string, how can we find the first non repeating character? Refer back to the original string: scan the string again and for each character check the corresponding count; if it is 1, return that character.

package com.company;

import java.util.HashMap;

/**
 * Created by sangar on 4.10.18.
 */
public class FirstNonRepeatingChar {

    public char firstNonRepeatingCharacter(String s){
        //Best to discuss it with interviewer, what should we return here?
        if(s == null) return ' ';

        if(s.length() == 0) return ' ';

        //Keep the map local so repeated calls do not accumulate counts
        HashMap<Character, Integer> characterCount = new HashMap<>();

        for (char c: s.toCharArray()){
            if(!characterCount.containsKey(c)){
                characterCount.put(c,1);
            }
            else {
                characterCount.put(c, characterCount.get(c) + 1);
            }
        }
        for (char c: s.toCharArray()) {
            if(characterCount.get(c) == 1) return c;
        }

        return ' ';
    }
}
package test;

import com.company.FirstNonRepeatingChar;
import org.junit.jupiter.api.Test;

import static org.junit.jupiter.api.Assertions.assertEquals;

/**
 * Created by sangar on 28.8.18.
 */
public class FirstNonRepeatingCharTest {

    FirstNonRepeatingChar tester = new FirstNonRepeatingChar();

    @Test
    public void firstNonRepeatingCharExists() {
        String s = "abcbcbcbcbad";
        assertEquals('d', tester.firstNonRepeatingCharacter(s));
    }

    @Test
    public void firstNonRepeatingCharDoesNotExists() {
        String s = "abcbcbcbcbadd";
        assertEquals(' ', tester.firstNonRepeatingCharacter(s));
    }

    @Test
    public void firstNonRepeatingCharWithEmptyString() {
        String s = "";
        assertEquals(' ', tester.firstNonRepeatingCharacter(s));
    }

    @Test
    public void firstNonRepeatingCharWithNull() {
        assertEquals(' ', tester.firstNonRepeatingCharacter(null));
    }
}

The complexity of this method to find the first non-repeating character in a string is O(n), along with a space complexity of O(1) to store the character-to-count map.

There is always some confusion about the space complexity of the above method: since up to 256 characters may be stored, should that not be counted as space complexity? Definitely. But in asymptotic notation, this space is independent of the size of the input, so the space complexity remains O(1).

One more thing: even though the time complexity is O(n), the input string is scanned twice, first to get the count of each character and second to find the first non repeating character.

Optimization

Consider a case when the string is very large, with millions of characters, most of them repeated; the above solution may turn slow in the last step, where we scan the whole string again looking for a character with count 1. How can we avoid scanning the string a second time?
How about we store some information with each character in the map along with the count, so that we can figure out whether the character is the first non repeating one or not?
Or we can have two maps: one stores the count and the other stores the index of the first occurrence of each character.

Once we have created the two maps as mentioned above, go through the count map and find all the characters with count 1. For each of these characters, check which one has the minimum index in the second map and return that character.

The complexity of the algorithm remains the same; however, a second scan of the string is now not required. In other words, the second pass is now independent of the size of the input, as it depends on the size of the first map, which is bounded by 256, the number of possible 8-bit characters.

Find first non repeating character : Implementation

package com.company;

import java.util.HashMap;

/**
 * Created by sangar on 4.10.18.
 */
public class FirstNonRepeatingChar {
    public char firstNonRepeatingCharacterOptimized(String s){
        HashMap<Character, Integer> characterCount = new HashMap<>();
        HashMap<Character, Integer>characterIndex = new HashMap<>();
        //Best to discuss it with interviewer, what should we return here?
        if(s == null) return ' ';

        if(s.length() == 0) return ' ';

        for (int i=0; i<s.length(); i++){
            char c  = s.charAt(i);
            if(!characterCount.containsKey(c)){
                characterCount.put(c,1);
                characterIndex.put(c,i);
            }
            else {
                characterCount.put(c, characterCount.get(c) + 1);
            }
        }
        char nonRepeatedCharacter = ' ';
        int prevIndex = s.length();
        for (char c : characterCount.keySet()) {
            if(characterCount.get(c) == 1 
			&& characterIndex.get(c) < prevIndex){
                prevIndex = characterIndex.get(c);
                nonRepeatedCharacter = c;
            }
        }
        return nonRepeatedCharacter;
    }
}
package test;

import com.company.FirstNonRepeatingChar;
import org.junit.jupiter.api.Test;

import static org.junit.jupiter.api.Assertions.assertEquals;

/**
 * Created by sangar on 28.8.18.
 */
public class FirstNonRepeatingCharTest {

    FirstNonRepeatingChar tester = new FirstNonRepeatingChar();

    @Test
    public void firstNonRepeatingOptimizedCharExists() {
        String s = "aebcbcbcbcbad";
        assertEquals('e', tester.firstNonRepeatingCharacterOptimized(s));
    }

    @Test
    public void firstNonRepeatingCharOptimizedDoesNotExists() {
        String s = "abcbcbcbcbadd";
        assertEquals(' ', tester.firstNonRepeatingCharacterOptimized(s));
    }

    @Test
    public void firstNonRepeatingCharOptimizedWithEmptyString() {
        String s = "";
        assertEquals(' ', tester.firstNonRepeatingCharacterOptimized(s));
    }

    @Test
    public void firstNonRepeatingCharOptimizedWithNull() {
        assertEquals(' ', tester.firstNonRepeatingCharacterOptimized(null));
    }
}

Please share if there is anything wrong or missing. If you are interested in taking personalized coaching sessions by our expert teachers, please signup to website and get first session free.


Difference between array and linked list

Difference between array and linked list

In the last post, Linked list data structure, we discussed the basics of linked lists, where I promised to go into the details of the difference between an array and a linked list. Before going into the post, I want to make sure you understand that there is no such thing as one data structure being better than the other. Based on your requirements and use cases, you choose one or the other. It depends on what the most frequent operation your algorithm would perform in its lifetime is. That’s why interviews have a data structure round, to understand whether you can choose the correct one for the problem.

What is an array?
An array is a linear, sequential and contiguous collection of elements which can be addressed using an index.

What is a linked list?
A linked list is a linear, sequential and non-contiguous collection of nodes, where each node stores a reference to the next node. To understand more, please refer to Linked list data structure.

Difference between arrays and linked list

Static Vs dynamic size

The size of an array is defined statically at compile time, whereas a linked list grows dynamically at run time based on need. Consider a case where you know the maximum number of elements the algorithm will ever have; then you can confidently declare it as an array. However, if you do not know, a linked list is better. There is a catch: what if there is only a rare chance that the number of elements will reach the maximum, and most of the time it will be far less than the maximum? In this case, we would be unnecessarily allocating extra memory for the array which may never be used.

Memory allocation

An array is given contiguous memory in the system. So, if you know the address of any element in the array, you can access other elements based on their position relative to that element.

linked list vs arrays
Statically allocated contiguous memory

Linked list nodes are not stored contiguously in memory; they are scattered around. So you can traverse forward in a linked list from a given node (using the next node reference), but you cannot access the nodes prior to it.

arrays vs linked list
Dynamically allocated non-contiguous memory

Contiguous allocation of memory requires a sufficient chunk of memory up front for an array to be stored; for example, if we want to store 20 integers in an array, we require an 80-byte contiguous memory chunk. However, with a linked list we can start with 8 bytes and request more memory as and when required, which may be anywhere in memory. Contiguous allocation of memory also makes it difficult to resize an array: we have to look for a different chunk of memory which fits the new size and move all existing elements to that location. A linked list, on the other hand, is dynamically sized and can grow without relocating existing elements.

Memory requirement

It’s good to have non-contiguous memory then? It comes at a cost. Each node of a linked list has to store a reference to the next node, which adds an extra payload of 4 bytes (8 bytes on a 64-bit system) to each node. An array does not require this extra payload. You have to trade off the extra space against the advantages you are getting. Sometimes spending extra space is better than having cumbersome shifting, adding and deleting operations on an array, or the value stored in the node is big enough to make these extra bytes negligible in the analysis.

Operation efficiency

We perform operations on a data structure to get some output. There are four basic operations we should consider: read, search, insert/update and delete.

Read on an array is O(1): you can directly access any element given its index. By O(1), we mean the read does not depend on the size of the array.
The time complexity of read on a linked list is O(n), where n is the number of nodes, as we may have to walk the list to reach the node. So, if your problem requires many random reads, an array will outweigh a linked list.

Given the contiguous memory allocation and index access of an array, there are optimized algorithms like binary search which search a sorted array with complexity O(log n). Search on a linked list, on the other hand, requires O(n).

Insert on an array is O(1), as long as we are writing within the existing size of the array. In a linked list, the complexity of insert depends on where you want to put the new element: if the insert happens at the head, it is O(1); on the other hand, if the insert happens at the end, it is O(n) (unless a tail pointer is kept).

Insert node at start of linked list
Insert node at the tail of linked list

By update we mean here growing the array or linked list by one more element. In an array this is a costly operation, as it requires reallocation of memory and copying all elements onto it; it does not matter whether you add the element at the end or at the start, the complexity remains O(n).
For a linked list it varies: to add at the end it’s O(n), to add at the head it’s O(1).
In the same vein, delete on an array requires movement of elements (if the first element is deleted, every remaining element shifts), hence a complexity of O(n). However, delete on a linked list is O(1) if it’s the head and O(n) if it’s the tail.

To see the difference between O(1) and O(n), below graph should be useful.

difference between array and linked list
Complexity analysis graph

Key differences between arrays and linked lists are as follows

  • Arrays are really bad at insert and delete operations due to the internal reallocation and shifting of elements.
  • Arrays are statically sized at compile time.
  • Array memory allocation is contiguous, which makes accessing elements easy without any additional pointers; you can jump around the array without accessing all the elements in between.
  • Linked lists have almost the same complexity when insert and delete happen at the end, however no memory shuffling happens.
  • Search on a linked list is bad, usually requiring a scan with O(n) complexity.
  • Linked lists are dynamically sized at run time.
  • Linked list memory allocation is non-contiguous, and an additional pointer is required to store the reference to the neighbouring node. You cannot jump around in a linked list.

Please share if there is something wrong or missing. If you want to contribute to the website, please reach out to us at communications@algorithmsandme.com

Linked list data structure

Linked list data structure

A linked list is a very important data structure to understand, as a lot of problems are asked based on linked lists in Amazon, Microsoft and Google interviews. Today, we will understand the basics of the linked list data structure and its implementation.

A linked list represents a linear sequence of elements. Each element is connected to the next element using a chain of references. Another data structure which stores a linear sequence of items is the array. There are some advantages and use cases where the linked list way of storing a sequence is more efficient than an array; I will cover that in the next post: Arrays Vs Linked Lists.

In the last paragraph, I emphasized that a linked list is a linear data structure. In a linear data structure, there is a sequence and order in which elements are inserted, arranged and traversed. In order to get to the tail of a linked list, we have to go through all of the preceding nodes.

linked list data structure
linear data structure when elements can be traversed only in one order


Non-linear data structures are the ones where elements are not arranged or traversed in a specific order. One element may be connected to many others, hence we cannot traverse them in the same order every time. Examples of non-linear data structures are maps, dictionaries, trees and graphs.

linked list as data structure
Non  linear data structure when nodes cannot be traversed in one order always

Linked list implementation

A linked list consists of nodes, any number of nodes. Each node contains two things: first, the value of the node, which can be of any type (integer, string, or another user-defined type); second, a reference which points to the next node in the linked list. A node can be declared as follows:

typedef struct Node {
	int data;
	struct Node * next;
} Node;
Node structure
Linked list

What happens if the node is the last node in the linked list? At the last node, the next pointer of the node points to null. It’s very important to understand this bit, as this condition is used in almost every problem you have to solve on linked lists.

A linked list is a dynamic data structure. By dynamic data structure, we mean that its size and shape are not defined at compile time but at run time. Every time a new node is added to the linked list, a new memory location is allocated and the previous node’s next pointer is made to point to the new node.

Operations of linked list

  • Adding a node at the end of the list
    There are two cases to handle when adding a node at the end of a linked list:
  1. Check if the list already has a node
    1. If not, then create a new node and return it as the head of the linked list.
  2. If there is a node,
    1. Scan through the linked list using the next pointer and reach the last node.
    2. Create a new node, and point the next pointer of the last node to this new node.
Node * createNode(int val){
	Node * newNode = (Node *)malloc(sizeof(Node));
	if(newNode){
		newNode->data = val;
		newNode->next = NULL;
	}
	return newNode;
}

void addNode(Node **headRef, int value){
	//create new node
	Node *newNode = createNode(value);

	//find the last node
	Node *currentNode = *headRef;
	while(currentNode && currentNode->next != NULL){
		currentNode = currentNode->next;
	}
	if(currentNode){
		currentNode->next = newNode;
	}
	else{
		//List was empty; change headRef to point to the new head.
		*headRef = newNode;
	}
}

Complexity of adding a node to linked list is O(n). 

  • Insert a node at the head of the list
    In this case too, we allocate a new node; however, this time we do not have to scan the entire list. Note that every time we add a node at the head, the head of the list changes.
  1. Check if there is already a node
    1. If no, then create a new node and return it as head of linked list.
  2. If there is a node,
    1. Create a new node, and point next pointer new node to head.
    2. Return new node as head pointer.
Node * createNode(int val){
	Node * newNode = (Node *)malloc(sizeof(Node));
	if(newNode){
		newNode->data = val;
		newNode->next = NULL;
	}
	return newNode;
}

void addNode(Node **headRef, int value){
	//create new node
	Node *newNode = createNode(value);
	newNode->next = *headRef;
	*headRef = newNode;
}

Linked list data structure problems

It’s very important to understand that a linked list is a recursive data structure. The base case is a linked list with no nodes, represented by a NULL node. Every problem on a linked list can be solved using the template: process one node, and then recursively process the remaining linked list.

In programming terms, a linked list is divided into two parts, head and tail. The node being processed is called the head and the rest of the linked list is the tail. The tail has exactly the same structure as the original list.

Problems like merging two linked lists, reversing a linked list, or finding the length of a linked list can all be solved using the same template of processing one node and then recursively calling the function on the remaining list, as in the sketch below.
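
For instance, finding the length of a linked list with this template looks like the sketch below; it is written in Java for brevity, the node mirrors the C structure above, and the class name is just for illustration.

public class LinkedListLength {
    static class Node {
        int data;
        Node next;
        Node(int data) { this.data = data; }
    }

    /* Template: the base case is the empty list (null); otherwise
       process the head and recurse on the tail. */
    static int length(Node head) {
        if (head == null) return 0;        // empty list
        return 1 + length(head.next);      // one node plus the length of the tail
    }

    public static void main(String[] args) {
        Node head = new Node(1);
        head.next = new Node(2);
        head.next.next = new Node(3);
        System.out.println(length(head)); // prints 3
    }
}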

Types of linked list

There are three types of linked lists :
1. Singly linked list 
Singly linked lists contain nodes with data and reference, i.e., next, which points to the next node in the sequence of nodes. The next pointer of the last node will point to null. In singly linked list you can traverse only in one direction.

singly linked list
singly linked list

2. Doubly linked list
In a doubly linked list, each node contains two links: previous, which points to the node before the current node, and next, which points to the next node. The previous pointer of the first node and the next pointer of the last node point to null. In a doubly linked list, you can traverse in both directions. The two references add weight, as extra memory is required per node; a minimal node definition is sketched below the figure.

doubly linked list
doubly linked list
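
A minimal node definition for a doubly linked list could look like the sketch below (in Java; a hypothetical illustration, not code from the original post).

public class DoublyLinkedNode<T> {
    T value;
    DoublyLinkedNode<T> previous; // node before this one, null for the first node
    DoublyLinkedNode<T> next;     // node after this one, null for the last node

    DoublyLinkedNode(T value) {
        this.value = value;
    }
}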

3. Circular linked list
In circular linked list, next pointer of  the last node will point to the first node. A circular linked list can be both singly as well as doubly linked list.

circular linked list
Circular doubly linked list

This was all for the basics of linked lists. I know the problems on them are hard to solve, but if you look at all the problems, they boil down to one thing: an understanding of the node and how recursion can be used. In the next posts, we will be solving many of these problems and see how we can use these basics.

Please share if there is something wrong or missing. If you are interested in contributing to website and share your knowledge with thousands of users across world, please reach out to us at communications@algorithmsandme.com

Inorder predecessor in binary search tree

Inorder predecessor in binary search tree

What is an inorder predecessor in a binary tree? The inorder predecessor is the node which is traversed just before the given node in an inorder traversal of the binary tree. In a binary search tree, it is the largest value smaller than the given node. For example, the inorder predecessor of node(6) in the tree below is 5, and for node(10) it is 6.

inorder predecessor

If the node is the leftmost node in the BST, i.e., the least node, then there is no inorder predecessor for that node.

Inorder predecessor : Thoughts

To find the inorder predecessor, the first thing to do is to find the node itself. As we know, in an inorder traversal the root node is visited after its left subtree. A node can be the predecessor only of a node which is on its right side.

Let’s come up with examples and see what algorithm works. First case: if the given node is the leftmost node of the tree, there is no inorder predecessor, and in that case we return NULL. For example, the predecessor of node(1) is NULL.

predecessor in BST

What if the node has a left subtree? In that case, the maximum value in the left subtree will be the predecessor of the given node. We can find the maximum value in a tree by going deep down the right subtree until the right child is NULL and then returning that last node. For example, the predecessor of node(10) is 6.

inorder predecessor

What are the other cases? Another case is that the node does not have a left subtree but is also not the leftmost node. Then the node at which we last turned right while moving down the tree is the inorder predecessor, so while moving down the tree we keep track of that node, as it may be the answer. The predecessor of node(12) will be 10, as that is where we moved to the right subtree last. Note that we update the predecessor candidate only while moving down into a right subtree.

Algorithm to find inorder predecessor

  1. Start with the root: current = root, predecessor = NULL.
  2. If node.value > current.value, then predecessor = current, current = current.right.
  3. If node.value < current.value, current = current.left.
  4. If node.value == current.value and node.left != null, predecessor = maximum(current.left).
  5. Return predecessor.

Inorder predecessor: Implementation

#include<stdio.h>
#include<stdlib.h>
 
struct node{
	int value;
	struct node *left;
	struct node *right;
};

typedef struct node Node;


/* This function return the maximum node in tree rooted at node root */
Node *findMaximum(Node *root){
    if(!root)
        return NULL;
 
    while(root->right) root = root->right;
    return root;
}
/* This function implements the logic described in algorithm to find inorder predecessor
of a given node */
Node *inorderPredecessor(Node *root, int K){
 
    Node *predecessor 	= NULL;
    Node *current  		= root;
    
    if(!root)
        return NULL;
 
    while(current && current->value != K){
         /* Else take left turn and no need to update predecessor pointer */
        if(current->value >K){
            current= current->left;
        }
        /* If node value is less than the node which are looking for, then go to right sub tree
        Also when we move right, update the predecessor pointer to keep track of last right turn */
        else{
            predecessor = current;
            current = current->right;
        }
    }
    /*Once we have reached the node for which the inorder predecessor is to be found,
    check if it has a left subtree; if yes, then find the maximum in that left subtree and return that node.
    Else, the node where we last took a right turn is already stored in the predecessor pointer and will be returned.*/
    if(current && current->left){
        predecessor = findMaximum(current->left);
    }
    return predecessor;
}
Node * createNode(int value){
  Node *newNode =  (Node *)malloc(sizeof(Node));
  
  newNode->value = value;
  newNode->right= NULL;
  newNode->left = NULL;
  
  return newNode;
  
}

Node * addNode(Node *node, int value){
  if(node == NULL){
      return createNode(value);
  }
  else{
     if (node->value > value){
        node->left = addNode(node->left, value);
      }
      else{
        node->right = addNode(node->right, value);
      }
  }
  return node;
}
 
/* Driver program for the function written above */
int main(){
  Node *root = NULL;
  int n = 0;
  //Creating a binary tree
  root = addNode(root,30);
  root = addNode(root,20);
  root = addNode(root,15);
  root = addNode(root,25);
  root = addNode(root,40);
  root = addNode(root,37);
  root = addNode(root,45);
  
  Node *predecessor = inorderPredecessor(root, 40);
  printf("\n Inorder predecessor node is : %d ", predecessor ? predecessor->value : 0);
  
  return 0;
}

The complexity of the algorithm to find the inorder predecessor is O(log n) in an almost balanced binary search tree. If the tree is skewed, we have a worst-case complexity of O(n).

Please share if there is something wrong or missing. If you want to contribute to website and share your knowledge with thousands of learners across the world, please reach out to us at communications@algorithmsandme.com