Maximum area rectangle in a histogram

A histogram is a diagram consisting of rectangles whose area is proportional to the frequency of a variable and whose width is equal to the class interval. Below is an example of a histogram.

maximum area rectangle in histogram

Given a histogram, whose class interval is 1, find maximum area rectangle in it. Let me explain the problem in more details.

In the histogram above, there are at least 6 rectangles with areas 2, 1,5,6,2, and 3. Are there more rectangles? Yes, we can make more rectangles by combining some of these rectangles. A few are shown below.

Apparently, the largest area rectangle in the histogram in the example is 2 x 5 = 10 rectangle. The task is to find a rectangle with maximum area in a given histogram. The histogram will be given as an array of the height of each block, in the example, input will be [2,1,5,6,2,3].

Maximum area rectangle: thoughts

First insight after looking at the rectangles above is: block can be part of a rectangle with a height less than or equal to its height. For each block of height h[i], check what all blocks on the left can be part of a rectangle with this block. All the blocks on the left side with a height greater than the current block height can be part of such a rectangle.
Similarly, all the blocks on the right side with a height greater than the current block height can be part of such a rectangle.
Idea is to calculate leftLimit and rightLimit and find the area (rightLimit - leftLimit) * h[i].
Check if this area is greater than previously known area, then update the maximum area else, continue to the next block.

class Solution {
    public int largestRectangleArea(int[] heights) {
        
        if(heights.length == 0) return 0;
        int maxArea = Integer.MIN_VALUE;

        for(int i=0; i<heights.length; i++){
            //Find the left limit for current block
            int leftLimit = findLeftLimit(heights, i);

            //Find the right limit for current block
            int rightLimit = findRightLimit(heights, i);

            int currentArea = (rightLimit - leftLimit-1) * heights[i];
            maxArea = Integer.max(maxArea, currentArea);
        }

        return maxArea;
    }

    private int findLeftLimit(int [] heights, int index){
        int j = index-1;
        while (j >= 0 && heights[j] >= heights[index]) j--;

        return j;
    }

    private int findRightLimit(int [] heights, int index){
        int j = index+1;
        while (j < heights.length && heights[j] >= heights[index])
            j++;

        return j;
    }
}

The time complexity of the implementation is O(n2); we will left and right of each block which will take n operations, we do it for n blocks and hence the complexity is quadratic. Can we optimize the time complexity?

If heights[j] >= heights[i] and leftLimit of index j is already known, can we safely say that it will also be the leftLimit of index i as well?
Can we say the same thing for rightLimit well? Answers to all the questions are yes. If we store the left and right limit for all indices already seen, we can avoid re-calculating them.

class Solution {
    public int largestRectangleArea(int[] heights) {
        
        if(heights.length == 0) return 0;

        int maxArea = Integer.MIN_VALUE;

        //Finds left limit for each index, complexity O(n)
        int [] leftLimit = getLeftLimits(heights);
        //Find right limit for each index, complexity O(n)
        int [] rightLimit = getRightLimits(heights);

        for(int i=0; i<heights.length; i++){
            int currentArea = 
                (rightLimit[i] - leftLimit[i] -1) * heights[i];
            maxArea = Integer.max(maxArea, currentArea);
        }

        return maxArea;
    }

    private int[] getLeftLimits(int [] heights){

        int [] leftLimit = new int[heights.length];
        leftLimit[heights.length-1] = -1;

        for(int i=0; i<heights.length; i++) {
            int j = i - 1;
            while (j >= 0 && heights[j] >= heights[i]) {
                j = leftLimit[j];
            }
            leftLimit[i] = j;
        }
        return leftLimit;
    }

    private int[] getRightLimits (int [] heights){

        int [] rightLimit = new int[heights.length];
        rightLimit[heights.length-1] = heights.length;

        for(int i=heights.length-2; i>=0; i--){
            int j = i+1;
            while(j<heights.length 
                      && heights[j] > heights[i]){
                j = rightLimit[j];
            }
            rightLimit[i] = j;
        }
        return rightLimit;
    }
}

The array leftLimitcontains at index i the closest index j to the left of i such that height[j] < height[i]. You can think about each value of the array as a pointer (or an arrow) pointing to such j for every i. How to calculate leftLimit[i]? Just point the arrow one to the left and if necessary just follow the arrows from there, until you get to proper j. The key idea here to see why this algorithm runs in O(n) is to observe that each arrow is followed at most once.

Largest area rectangle: stack-based solution

There is a classic method to solve this problem using the stack as well. Let’s see if we can build a stack-based solution using the information we already have. Let’s we do not calculate the area of the rectangle which includes the bar when we are processing it. When should we process it? Where should this bar be put on? If we want to create a rectangle with a height of this bar, we should find the left and right boundaries of such a rectangle. We should put this bar on a stack.
Now when you are processing bar j if height[j] is less than the bar on the top of the stack, we pop out the bar at the top. Why? Because this is the first bar on the right which has a height less than the height of the bar at top of the stack. This means if we want to make a rectangle with a height of the bar at the top of the stack, this index means the right boundary. This also gives away that all the blocks on the stack are in increasing order, as we never put a block which has a height less than the height of block at the top on to the stack. It means the next bar on the stack is the first bar which has a height lower than the bar at the top. To calculate the area of the rectangle with height as h[top], we need to take width as current index j - stack.peek() - 1

So the idea is that:

  1. For each bar, take its height as the rectangle’s height. Then find the left and right boundaries of this rectangle.
  2. The second top bar in the stack is always the first bar lower than the top bar on the stack on the left.
  3. The bar that j points to is always the first bar lower than the top bar in the stack on the right.
  4. After step 2 and 3, we know the left and right boundaries, then know the width, then know the area.
private int maxAreaUsingStack(int[] heights){

        Stack<Integer> s = new Stack<>();

        int maxArea = 0;
        for(int i=0; i<=heights.length; i++){
            //Handling the last case
            int h = i == heights.length ? 0 : heights[i];
            while(!s.empty() && h < heights[s.peek()]){
                int top = s.pop();
                int leftLimit = s.isEmpty() ? -1 : s.peek();
                int width = i-leftLimit-1;

                int area = width * heights[top];
                maxArea = Integer.max(area, maxArea);
            }
            s.push(i);
        }
        return maxArea;
    }
The time complexity of the code is O(n) with an additional space complexity of O(n) If you are preparing for a technical interview in companies like Amazon, Facebook, etc and want help with preparation, please register for a coaching session with us.

Print last n lines of a file

Print last n lines of file

A lot of times, when we are debugging production systems, we go through the logs being generated by systems. To see the logs which are most recent, we commonly use tail -n functionality of Unix.

Tail -n functionality prints the last n lines of each FILE to standard output

After going through many interview experiences at Microsoft, I found that this question regularly features in the majority of interviews. Let’s take an example and see what to expect out of the functionality.

print last n lines of a file
This is given file content

tail -f
This is output of tail functionality

The first thing we notice about this problem is that we have to print the last n lines. It means we have to maintain some kind of order. If we want the last line first, this is typical LIFO, which is implemented using the stack data structure.

However, another constraint is that we have to print most n lines. In that case, if the number of lines on stack goes more than n, we will remove some lines from it. Which lines should be removed? We will remove the lines which came first. Unstack all the lines from the stack and removed the first line and then put all lines back on to the stack.
When you read, we just read from the top of the stack till stack is empty which will give us last n lines of the file.

Also, tail functionality of Unix prints the line in forwarding order rather than reverse order. If we are implementing true tail functionality, the order will be FIFO rather than LIFO. But make sure that you clarify this with the interviewer.

The complexity of reading n lines is O(n) and putting a new line also takes O(n) complexity. If the stack is implemented using linked list, we do not require additional memory.

What if the file is continuously written on, and tail happens occasionally. As mentioned above, the stack solution has O(n) complexity to put every line, which is not ideal in here. Tail -f actually requires that output grows as things are added to the file.

       -f, --follow[={name|descriptor}]
       output appended data as the file grows;

What if we optimize the writing part using queues to store the last n lines of the file. Imagine a case, when a queue has last n lines of the file at a point of time. Now if a new line comes, we can add its tail of the queue and remove them from the front. If we keep track of tail of queue, insertion and removal operation both become O(1).

To read lines, we have to read the queue in reverse order. This should give us the idea that a doubly linked list should be used to implement queue. Using doubly linked list, if we have the tail pointer, we can always traverse queue in reverse order. The complexity of reading n lines will still be O(n). Real Tail does not require it, you can print the entire queue in FIFO manner, howver, it is good to mention in interview why you chose DLL over singly linked list to implement queue.

Print last n lines of a file: Algorithm

  1. For every line being added to the file, do the following:
  2. If size of queue is less than n, we simply enqueue the line in queue.
  3. If size of queue is greater than n, dequeue line from front and enqueue new line at the end.

If you are tailing an existing file, then read the whole file line by line and do the last two operations in the algorithm.

Print last n lines: implementation

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SIZE 500
#define true 1
#define false 0

typedef struct queue_l{
	char data[MAX_SIZE];
	struct queue_l *next;
	struct queue_l *prev;
}Queue;

typedef struct dummyNode{
	int size;
	struct queue_l *front;
	struct queue_l *tail;
}dummyNode;

/* Below are the routine function for init queue, enqueue, 
dequeue, queue_empty etc */
void initializeQueue(dummyNode **q){
	*q  = (dummyNode *)malloc(sizeof(dummyNode));
	if(*q){
		(*q)->front = NULL;
		(*q)->tail = NULL;
		(*q)->size = 0;
	}
}

int isEmpty(dummyNode *q){
	if( !(q->size))
		return true;
	
	return false;
}

Queue * enqueue(dummyNode *q, char * elem){
	Queue *newNode= (Queue *) malloc(sizeof(Queue));
	if(newNode){
		strcpy(newNode->data, elem);
		newNode->next = NULL;
		newNode->prev = q->tail;
		
		if(q->tail){
			q->tail->next = newNode;
		}
		q->tail = newNode;
		if(!q->front)
			q->front = newNode;
		q->size++;
	}
	return newNode;
}

char * dequeue(dummyNode *d){
	
	if(isEmpty(d)){
		printf("\n Queue is empty");
		return NULL;
	}

	Queue *q  = d->front;
	d->front = q->next;
        
	if(q->next)
		q->next->prev = NULL;
	else
		d->tail = NULL;

	char * deletedNode = q->data;
	free(q);
	d->size--;
	
	return deletedNode;
}

void update_lines(dummyNode *d, char *s, int n){
    if(d->size <n){
        enqueue(d, s);
    }
    else{
        dequeue(d);
        enqueue(d, s);
    }
}

int main(){
	
	dummyNode *d =  NULL;
	
	int n=10;

	initializeQueue(&d);

	char line[MAX_SIZE], *result;
	FILE *stream;
	/* Open the file */
	stream  = fopen("problems.txt","rb");

	/*Read lines one by one */
	while((result =fgets(line, MAX_SIZE, stream)) != NULL){
		update_lines(d, line,n);
	}

	fclose(stream);
        print_queue(d);
	
	return 0;
}

Please share if there is something wrong or missing. If you are preparing for an interview and want coaching session to prepare you fast, please book a free session with us.