Trapping rain water

Given n non-negative integers representing an elevation map where the width of each bar is 1, compute how much water it is able to trap after rain. For example,

Input: [0,1,0,2,1,0,1,3,2,1,2,1] 
Output: 6

trapping rain water

Basic thought

We know water stays at the highest level it is able to, and it always maintains the same flat surface. Using this, we can infer that we need to find holes in the elevation where water would be able to rest at a level. To calculate how much water these holes would need to store, we can see that we need to have elevations on both sides, and we also need to track how much space a particular hole would be able to trap the water. Fig below is an example of a hole which holds 4 units of water.

trap water

Upon further breakdown of these holes, we can notice that we do not need to track the entire hole to find the capacity it holds, but we can parse each unit of the hole individually. Thus, the amount of water in each unit of the hole is

min(leftHeight, rightHeight) - currentUnitHeight.

What remains now is to calculate the leftHeight and the rightHeight. We could parse through them individually to find these out, but we can see a general pattern here: The highest elevation to the left inclusive of the current unit will become the leftHeight, and the highest elevation to the right inclusive of the current unit will become the rightHeight. The problem has been greatly simplified into maintaining track of highest heights on both sides of every unit.

Brute force solution

From our observations in the previous section, the simplest brute force approach is to calculate the highest elevation on both sides of every unit, and sum them up together. Each unit takes O(n) time with this approach, and there are n units to calculate in total. Thus, this approach will take O(n2) time.

Show me the brute force implementation

class Solution:
    def trap(self, height: List[int]) -> int:
        if len(height) < 3:
            return 0
        ans = 0
        for i in range(len(height)):
            left_max = 0
            right_max = 0
            for j in range(i + 1):
                left_max = max(left_max, height[j])
            for j in range(i, len(height)):
                right_max = max(right_max, height[j])
            ans += min(left_max, right_max) - height[i]
        return ans

Dynamic Programming approach

As we can see from the brute force solution, we calculate the leftHeight and the rightHeight multiple times for the same node, i.e. the problem has overlapping subproblems. Thus a dynamic programming approach should optimize the brute force approach further. We can store the leftHeight and rightHeight elements till each index we have iterated, and thus the water storage calculation for each unit will now take O(1) time. Overall, this approach passes over the array thrice, and thus has a runtime of O(n) with a space complexity of O(n).

Show me the dynamic programming implementation

class Solution:
    def trap(self, height: List[int]) -> int:
        if len(height) < 3:
            return 0
        ans = 0
        left_max = [0] * len(height)
        right_max = [0] * len(height)
        left_max[0] = height[0]
        right_max[-1] = height[-1]
        for i in range(1, len(height)):
            left_max[i] = max(height[i], left_max[i - 1])
        for i in reversed(range(len(height) - 1)):
            right_max[i] = max(height[i], right_max[i + 1])
        for i in range(1, len(height) - 1):
            ans += min(left_max[i], right_max[i]) - height[i]
        return ans

Stacks

Since we need to keep track of the highest elevations up to a point, stacks are a good approach to perform this operation in one pass of the array. The basic idea is since we need to store the largest elevations in the stack, as we iterate through the array, we can find the amount of water stored till the currentHeight is higher than elements of the stack and move on to the next value, i.e. water stored will always be of a hole shape, thus we can find the amount of water that can be stored between two high values. This operation passes over the array once, and each element can only have two operations maximum: Pushing and popping from the stack, thus its time complexity is O(n). The space complexity will be O(n), in case the entire array is stored on the stack.

Show me the stack implementation

class Solution:
    def trap(self, height: List[int]) -> int:
        if len(height) < 3: return 0 ans = 0 stack = [] for i in range(len(height)): while len(stack) > 0 and height[i] > height[stack[-1]]:
                top = stack.pop()
                if len(stack) == 0:
                    break
                # Distance between the larger value still in the 
                #stack with a hole the height of the top element 
                #and the current element
                distance = i - stack[-1] - 1
                # Water that can be stored is smaller 
                # heights between these bounds, and the height 
                # of the intermediate region between top of the stack 
                # and current index
                curr_height = min(height[i], height[stack[-1]]) - height[top]
                ans += distance * curr_height
            stack.append(i)
        return ans

Simple Optimization

Another way to approach the problem is that since we need to find the maximum elevations on either side to calculate the current water stored, we can calculate the global maximum in one pass, and once we have the index for the same, we can iterate to it and from it to the rest of the array knowing that we have one bounded measurement which is the highest elevation in the array. The time complexity for this operation is O(n), but there are two passes over the array(once to calculate global maximum, and once to calculate the water amount). The space complexity is O(1).

Show me the implementation

class Solution:
    def trap(self, height: List[int]) -> int:
        if len(height) < 3:
            return 0
        ans = 0
        gMax = 0 # Global Max
        for i in range(len(height)):
            gMax = i if height[gMax] < height[i] else gMax
        lMax = 0 # Left max yet
        for i in range(1, gMax):
            lMax = i if height[lMax] <= height[i] else lMax
            ans += max(0, height[lMax] - height[i])
        lMax = len(height) - 1 # Right max yet
        for i in reversed(range(gMax, len(height) - 1)):
            lMax = i if height[lMax] <= height[i] else lMax
            ans += max(0, height[lMax] - height[i])
        return ans

This article is contributed by Khushman Patel

Maximum area rectangle in a histogram

A histogram is a diagram consisting of rectangles whose area is proportional to the frequency of a variable and whose width is equal to the class interval. Below is an example of a histogram.

maximum area rectangle in histogram

Given a histogram, whose class interval is 1, find maximum area rectangle in it. Let me explain the problem in more details.

In the histogram above, there are at least 6 rectangles with areas 2, 1,5,6,2, and 3. Are there more rectangles? Yes, we can make more rectangles by combining some of these rectangles. A few are shown below.

Apparently, the largest area rectangle in the histogram in the example is 2 x 5 = 10 rectangle. The task is to find a rectangle with maximum area in a given histogram. The histogram will be given as an array of the height of each block, in the example, input will be [2,1,5,6,2,3].

Maximum area rectangle: thoughts

First insight after looking at the rectangles above is: block can be part of a rectangle with a height less than or equal to its height. For each block of height h[i], check what all blocks on the left can be part of a rectangle with this block. All the blocks on the left side with a height greater than the current block height can be part of such a rectangle.
Similarly, all the blocks on the right side with a height greater than the current block height can be part of such a rectangle.
Idea is to calculate leftLimit and rightLimit and find the area (rightLimit - leftLimit) * h[i].
Check if this area is greater than previously known area, then update the maximum area else, continue to the next block.

class Solution {
    public int largestRectangleArea(int[] heights) {
        
        if(heights.length == 0) return 0;
        int maxArea = Integer.MIN_VALUE;

        for(int i=0; i<heights.length; i++){
            //Find the left limit for current block
            int leftLimit = findLeftLimit(heights, i);

            //Find the right limit for current block
            int rightLimit = findRightLimit(heights, i);

            int currentArea = (rightLimit - leftLimit-1) * heights[i];
            maxArea = Integer.max(maxArea, currentArea);
        }

        return maxArea;
    }

    private int findLeftLimit(int [] heights, int index){
        int j = index-1;
        while (j >= 0 && heights[j] >= heights[index]) j--;

        return j;
    }

    private int findRightLimit(int [] heights, int index){
        int j = index+1;
        while (j < heights.length && heights[j] >= heights[index])
            j++;

        return j;
    }
}

The time complexity of the implementation is O(n2); we will left and right of each block which will take n operations, we do it for n blocks and hence the complexity is quadratic. Can we optimize the time complexity?

If heights[j] >= heights[i] and leftLimit of index j is already known, can we safely say that it will also be the leftLimit of index i as well?
Can we say the same thing for rightLimit well? Answers to all the questions are yes. If we store the left and right limit for all indices already seen, we can avoid re-calculating them.

class Solution {
    public int largestRectangleArea(int[] heights) {
        
        if(heights.length == 0) return 0;

        int maxArea = Integer.MIN_VALUE;

        //Finds left limit for each index, complexity O(n)
        int [] leftLimit = getLeftLimits(heights);
        //Find right limit for each index, complexity O(n)
        int [] rightLimit = getRightLimits(heights);

        for(int i=0; i<heights.length; i++){
            int currentArea = 
                (rightLimit[i] - leftLimit[i] -1) * heights[i];
            maxArea = Integer.max(maxArea, currentArea);
        }

        return maxArea;
    }

    private int[] getLeftLimits(int [] heights){

        int [] leftLimit = new int[heights.length];
        leftLimit[heights.length-1] = -1;

        for(int i=0; i<heights.length; i++) {
            int j = i - 1;
            while (j >= 0 && heights[j] >= heights[i]) {
                j = leftLimit[j];
            }
            leftLimit[i] = j;
        }
        return leftLimit;
    }

    private int[] getRightLimits (int [] heights){

        int [] rightLimit = new int[heights.length];
        rightLimit[heights.length-1] = heights.length;

        for(int i=heights.length-2; i>=0; i--){
            int j = i+1;
            while(j<heights.length 
                      && heights[j] > heights[i]){
                j = rightLimit[j];
            }
            rightLimit[i] = j;
        }
        return rightLimit;
    }
}

The array leftLimitcontains at index i the closest index j to the left of i such that height[j] < height[i]. You can think about each value of the array as a pointer (or an arrow) pointing to such j for every i. How to calculate leftLimit[i]? Just point the arrow one to the left and if necessary just follow the arrows from there, until you get to proper j. The key idea here to see why this algorithm runs in O(n) is to observe that each arrow is followed at most once.

Largest area rectangle: stack-based solution

There is a classic method to solve this problem using the stack as well. Let’s see if we can build a stack-based solution using the information we already have. Let’s we do not calculate the area of the rectangle which includes the bar when we are processing it. When should we process it? Where should this bar be put on? If we want to create a rectangle with a height of this bar, we should find the left and right boundaries of such a rectangle. We should put this bar on a stack.
Now when you are processing bar j if height[j] is less than the bar on the top of the stack, we pop out the bar at the top. Why? Because this is the first bar on the right which has a height less than the height of the bar at top of the stack. This means if we want to make a rectangle with a height of the bar at the top of the stack, this index means the right boundary. This also gives away that all the blocks on the stack are in increasing order, as we never put a block which has a height less than the height of block at the top on to the stack. It means the next bar on the stack is the first bar which has a height lower than the bar at the top. To calculate the area of the rectangle with height as h[top], we need to take width as current index j - stack.peek() - 1

So the idea is that:

  1. For each bar, take its height as the rectangle’s height. Then find the left and right boundaries of this rectangle.
  2. The second top bar in the stack is always the first bar lower than the top bar on the stack on the left.
  3. The bar that j points to is always the first bar lower than the top bar in the stack on the right.
  4. After step 2 and 3, we know the left and right boundaries, then know the width, then know the area.
private int maxAreaUsingStack(int[] heights){

        Stack<Integer> s = new Stack<>();

        int maxArea = 0;
        for(int i=0; i<=heights.length; i++){
            //Handling the last case
            int h = i == heights.length ? 0 : heights[i];
            while(!s.empty() && h < heights[s.peek()]){
                int top = s.pop();
                int leftLimit = s.isEmpty() ? -1 : s.peek();
                int width = i-leftLimit-1;

                int area = width * heights[top];
                maxArea = Integer.max(area, maxArea);
            }
            s.push(i);
        }
        return maxArea;
    }
The time complexity of the code is O(n) with an additional space complexity of O(n) If you are preparing for a technical interview in companies like Amazon, Facebook, etc and want help with preparation, please register for a coaching session with us.

Print last n lines of a file

Print last n lines of file

A lot of times, when we are debugging production systems, we go through the logs being generated by systems. To see the logs which are most recent, we commonly use tail -n functionality of Unix.

Tail -n functionality prints the last n lines of each FILE to standard output

After going through many interview experiences at Microsoft, I found that this question regularly features in the majority of interviews. Let’s take an example and see what to expect out of the functionality.

print last n lines of a file
This is given file content

tail -f
This is output of tail functionality

The first thing we notice about this problem is that we have to print the last n lines. It means we have to maintain some kind of order. If we want the last line first, this is typical LIFO, which is implemented using the stack data structure.

However, another constraint is that we have to print most n lines. In that case, if the number of lines on stack goes more than n, we will remove some lines from it. Which lines should be removed? We will remove the lines which came first. Unstack all the lines from the stack and removed the first line and then put all lines back on to the stack.
When you read, we just read from the top of the stack till stack is empty which will give us last n lines of the file.

Also, tail functionality of Unix prints the line in forwarding order rather than reverse order. If we are implementing true tail functionality, the order will be FIFO rather than LIFO. But make sure that you clarify this with the interviewer.

The complexity of reading n lines is O(n) and putting a new line also takes O(n) complexity. If the stack is implemented using linked list, we do not require additional memory.

What if the file is continuously written on, and tail happens occasionally. As mentioned above, the stack solution has O(n) complexity to put every line, which is not ideal in here. Tail -f actually requires that output grows as things are added to the file.

       -f, --follow[={name|descriptor}]
       output appended data as the file grows;

What if we optimize the writing part using queues to store the last n lines of the file. Imagine a case, when a queue has last n lines of the file at a point of time. Now if a new line comes, we can add its tail of the queue and remove them from the front. If we keep track of tail of queue, insertion and removal operation both become O(1).

To read lines, we have to read the queue in reverse order. This should give us the idea that a doubly linked list should be used to implement queue. Using doubly linked list, if we have the tail pointer, we can always traverse queue in reverse order. The complexity of reading n lines will still be O(n). Real Tail does not require it, you can print the entire queue in FIFO manner, howver, it is good to mention in interview why you chose DLL over singly linked list to implement queue.

Print last n lines of a file: Algorithm

  1. For every line being added to the file, do the following:
  2. If size of queue is less than n, we simply enqueue the line in queue.
  3. If size of queue is greater than n, dequeue line from front and enqueue new line at the end.

If you are tailing an existing file, then read the whole file line by line and do the last two operations in the algorithm.

Print last n lines: implementation

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SIZE 500
#define true 1
#define false 0

typedef struct queue_l{
	char data[MAX_SIZE];
	struct queue_l *next;
	struct queue_l *prev;
}Queue;

typedef struct dummyNode{
	int size;
	struct queue_l *front;
	struct queue_l *tail;
}dummyNode;

/* Below are the routine function for init queue, enqueue, 
dequeue, queue_empty etc */
void initializeQueue(dummyNode **q){
	*q  = (dummyNode *)malloc(sizeof(dummyNode));
	if(*q){
		(*q)->front = NULL;
		(*q)->tail = NULL;
		(*q)->size = 0;
	}
}

int isEmpty(dummyNode *q){
	if( !(q->size))
		return true;
	
	return false;
}

Queue * enqueue(dummyNode *q, char * elem){
	Queue *newNode= (Queue *) malloc(sizeof(Queue));
	if(newNode){
		strcpy(newNode->data, elem);
		newNode->next = NULL;
		newNode->prev = q->tail;
		
		if(q->tail){
			q->tail->next = newNode;
		}
		q->tail = newNode;
		if(!q->front)
			q->front = newNode;
		q->size++;
	}
	return newNode;
}

char * dequeue(dummyNode *d){
	
	if(isEmpty(d)){
		printf("\n Queue is empty");
		return NULL;
	}

	Queue *q  = d->front;
	d->front = q->next;
        
	if(q->next)
		q->next->prev = NULL;
	else
		d->tail = NULL;

	char * deletedNode = q->data;
	free(q);
	d->size--;
	
	return deletedNode;
}

void update_lines(dummyNode *d, char *s, int n){
    if(d->size <n){
        enqueue(d, s);
    }
    else{
        dequeue(d);
        enqueue(d, s);
    }
}

int main(){
	
	dummyNode *d =  NULL;
	
	int n=10;

	initializeQueue(&d);

	char line[MAX_SIZE], *result;
	FILE *stream;
	/* Open the file */
	stream  = fopen("problems.txt","rb");

	/*Read lines one by one */
	while((result =fgets(line, MAX_SIZE, stream)) != NULL){
		update_lines(d, line,n);
	}

	fclose(stream);
        print_queue(d);
	
	return 0;
}

Please share if there is something wrong or missing. If you are preparing for an interview and want coaching session to prepare you fast, please book a free session with us.