Arrays With Elements In The Range 1 to N

When an array has all its elements in the range of 1 to N ( where N is the length of the array ) we can use the indices to store the ordered state of the elements in the array. This ordered-state can in-turn be used to solve a variety of problems which we’ll explore soon. First, a very simple demonstration of this property. 

Here is an array which has unique elements in the range of 1 to N.
Given array (A) : 5,3,1,4,2
Indices:                0,1,2,3,4

Sort in Linear Time

The first use-case of this unique property is being able to sort in O(N) time i.e. a special-case(all unique elements) of the Counting Sort. The crux of this sort is to check whether an element is at its corresponding index and swap it to its correct index if it’s not. Following is a demonstration of this logic:

Given array (A) : 5,3,1,4,2
Indices:                0,1,2,3,4

For each A[i] check if A[A[i] – 1] equals A[i] or not. If they are not equal then swap element at A[A[i] – 1] with A[i]. Basically the correct value for any index i is when A[i] contains i+1. 

In the above case, let’s start with i = 0.

A[A[0] – 1] or A[5-1] orA[4] which is 2 and A[0] = 5. This means that A[A[i] – 1] is not equal to A[i] and hence not in its correct position. So we need to swap in order to put A[0] -> 5 to its correct position which is index 4 and A[0] will hold 4 after the swap. Similarly, we need to repeat this check & swap for all the elements.

What if we cancel-out the common terms and modify the check from  A[i] != A[A[i] - 1] to i != A[i]-1 ?

Find The Missing Integer

A similar approach can help us find the smallest missing positive-integer in a given array. By smallest missing positive-integer, we just mean the smallest positive integer that does not exist in the given list of numbers. For example: 

Given Array: -2, 3, 0, 1, 3
In the above case, the smallest missing positive integer is 2.

If we were to apply the usual sorting techniques and then scan the array for the smallest positive integer absent it would imply a time-complexity of O(NLog(N)) + O(N). We can definitely do better than this! At first glance, it seems that this problem does not lie within the unique property of elements being in the range of 1 to N since the numbers in the given array are well outside the range, but to solve this problem we still only need to figure out whether we have elements from 1 to N present in the given array or not.

How do we know whether the given array has elements from 1 to N? We can use the counting-sort discussed earlier to put each element in its “correct position”, i.e index 0 should hold 1, index 1 should hold 2 and so on. The smallest index that does not hold its correct element is the missing integer.

If we sort the given array using counting sort described above, we will get: 1, 0, 3, -2, 3. And the smallest index i to not hold its correct value i.e. i+1 will give us the answer to the smallest missing positive integer. In this case, that index is 1 since it does not hold 2, thus the smallest positive missing integer is 2.

Find The Duplicate Element

The third use-case of this property is to figure out the duplicate elements without using any extra space. We can iterate over the array A and mark the corresponding index of the encountered element as negative – unless it has already been marked negative! For example: if A[1] = 3 (or -3 ) then mark A[ Abs[3] - 1] as negative, this way whenever we encounter 3 (or -3) again in any of the A[i] we will know that the value 3 has been visited before since A[3-1] will be negative.

Given array (A) : 5,3,1,4,3
Indices:                0,1,2,3,4

When we encounter A[0] i.e. 5, we make A[5-1] i.e. A[4] negative, so the array becomes: 
5,3,1,4,-3
Next, we encounter A[1] i.e. 3, we make A[3-1] i.e. A[2] negative, so the array becomes: 
5,3,-1,4,-3
Next, we encounter A[2] i.e. -1, we make A[1-1] i.e. A[0] negative, so the array becomes: 
-5,3,-1,4,-3
Next, we encounter A[3] i.e. 4, we make A[4-1] i.e. A[3] negative, so the array becomes: 
-5,3,-1,-4,-3
Next, we encounter A[4] i.e. -3, we want to make A[3-1] i.e. A[2] negative, but in this case, A[2] is already negative thus we know that A[2] has been visited before! Which means Abs(A[4]) i.e 3 is the duplicate element.


Here is a snippet to demonstrate the code for sorting an array in linear time as per the above approach. The exact same approach can be used to solve the other two applications i.e. Finding the Duplicate and Finding The Missing Integer.

        int swap=0;

        for(int i = 0; i < nums.length;){
            
            if(nums[i] > 0 && nums[i] < nums.length) {

                if(nums[nums[i]-1] != nums[i]){                     
                    swap = nums[i];
                    nums[i] = nums[nums[i] - 1];
                    nums[swap - 1] = swap;
                }else{
                    i++;
                }
                
            }else{
                i++;
            }
        }

 

If you are preparing for a technical interview in companies like Amazon, Facebook, etc and want help with preparation, please register for a coaching session with us.

Amazon interview experience Set 1

Amazon interview experience set 1

Online Round
1. Write a function to merge two sorted linked list in a linked list which is sorted. The time complexity of this function should be O(n+m) where n and m are sizes of two linked list, the space complexity should be O(1).

2. Implement a stack which has three operations: Push, Pop, and Max. The time complexity of all these operations has to be O(1) and space complexity can be O(n) where n is the number of elements present on the stack.

3. Find a missing number in an array which has numbers in range 1 to N and only 1 element is missing.

DS and algorithm round 1
1. Given a stream of integers, find the median of all the integers appeared till now.

2. Print right side view of a binary search tree.

DS and algorithm round 2
1. Given a stream of integers, find if there is a subset of these integers which add up to a particular sum.

2. Given a log file, implement a function which reads the last 10 lines of this file.

System design round
1. Implement an email service.

Cultural fit round
1. How did you handle conflict with your manager or a colleague?
2. What do you do when you see someone is taking credit for things you delivered.

If you are preparing for an interview and looking for personalized coaching, please book a free session with us.

Print unique rows in a boolean matrix

Unique rows in boolean matrix

Given a matrix of 0 and 1, print all unique rows in boolean matrix. For example, if the matrix is show below:
0 1 0 0 1
1 0 1 1 1
0 1 0 0 1
1 0 1 0 0
Output should be
 0 1 0 0 1
 1 0 1 1 1
1 0 1 0 0

The first brute force solution which has a complexity of O(n2 m) where is n is the number of rows and m is the number of columns is as follows:
For each row in the boolean matrix, check all the rows if it matches with any one of them. If yes, skip printing, else print the row.
Another way which is better than the above approach is to convert each row in equivalent decimal and then check for duplicates. Again it has a complexity of O(n * m) where n is the number of rows and m as the number of columns.

Can we do better than this? There is a standard technique to check if a particular pattern is already seen or not. That’s using tries. Each pattern is added to trie, when we try to add a duplicate pattern, we will end up at the leaf node which is already marked as leaf due to an earlier pattern.
With given info, we can say that we will insert each row pattern in the trie. We will modify the insert function of trie to return us whether the row is added newly or it was already present in the trie. As explained above it can be ascertained by the fact that if the last node is marked as leaf node already, then it is duplicate row as we must have traveled the same nodes above. If the last node is not a leaf node already, then there is at least one digit which is different and hence row becomes unique. Mark last node of this pattern as a leaf node, so that same patterns will be detected as duplicates.

Add a row in the trie.
If the last node while entering the pattern is the leaf node, return false. It’s not a unique row.
If the last node is not a leaf node, mark it as a leaf node and return true.
If insert operation returns true, print the row.
Else, skip printing row.

#include<stdio.h>
#include<stdlib.h>

#define MAX_SIZE 2

#define LEAF_NODE  1
#define true 1
#define false 0

typedef struct trie_node_t {
    int value;
    struct trie_node_t * children[MAX_SIZE];
}Node;

typedef struct trie{
    int count ;
    Node *root;
}trie;

void initialize(trie *t){
    t->root = (Node *)malloc(sizeof(Node));
    t->count =0;
}

int insert_1(trie *t, int key[], int col){

    int i;

    if(t->root == NULL){
        printf("\n Trie is not initialized\n");
    }

    Node * current = t->root;
    for(i=0; i<col; i++){
        int index  =  key[i];
        if(current->children[index] == NULL){
            current->children[index] = (Node *)malloc(sizeof(Node));
        }
        current = current->children[key[i]];
    }

    /* To mark it as leaf */
    if(current->value != LEAF_NODE){
        current->value = LEAF_NODE;
        return true;
    }
    else{
        return false;
    }
}

void print_unique_rows(int a[5][5], int col, int row){

    int i,j;
    int key[col];

    trie t;
    initialize(&t);
    for(i=0; i<row; i++){
        for(j=0; j<col; j++){
            key[j] = a[i][j];
        }
        if(insert_1(&t, key, col)){
            for(j=0;j<col; j++){
                printf("%d", key[j]);
            }
            printf("\n");
        }
    }
}
//Driver program
int main(){

    int a[4][5] = {{0, 1, 0, 0, 1},
                    {1, 0, 1, 1, 0},
                    {0, 1, 0, 0, 1},
                    {1, 0, 1, 0, 0}
                  };

    print_unique_rows(a, 5, 4 );

    return 0;
}

The complexity of trie based solution is O(n * m).