Find friend groups

There are Nstudents in a class. Some of them are friends, while some are not. Their friendship is transitive in nature. For example, if Ais a direct friend of B, and Bis a direct friend of C, then Ais an indirect friend of C. And we defined a friend circle is a group of students who are direct or indirect friends.

Given a N*N matrix M representing the friend relationship between students in the class. If M[i][j] = 1, then the ith and jth students are direct friends with each other, otherwise not. And you have to output the total number of friend circles among all the students. For example,

Input: 
[[1,1,0,0,0,0],
 [1,1,0,0,0,0],
 [0,0,1,1,0,0],
 [0,0,1,1,1,0],
 [0,0,0,1,1,0],
 [0,0,0,0,0,1]
]
Output: 3

Thought process

When we talk about the connections or relationships, we immediately will think of graph data structure. The node is the person and the edge is the relationship between two persons. So, first, we have to figure out whether it will be a directed graph or an undirected graph. In this problem, the friendship is a mutual relationship, thus the graph is undirected.

When you are reading this problem, the concept of Strongly Connected Components(SCC) will come into your mind. Ok, we will discuss why? If A is a friend of B, B is a friend of C, then A will be a friend of C. What does it mean? A is indirectly connected to C. It means that every friend can reach every other friend through a path if they are directly or indirectly connected. So in this way, they are forming a strong group or circle, in which every vertex is connected directly or indirectly in its group/circle. Notice that all friends (both direct and indirect), who should be in one friend circle are also in one connected component​ in the graph. 

A particular group of friends is a single component. In this problem we are going to find out how many components are there in the graph.

friend groups

When you make a graph out of it. It will be like this(see below fig). It means there are 3 friends circles.

friend-circle

We can solve this problem using 2 methods: depth first search and disjoint set method

Using Depth first traversal method

Finding connected components for an undirected graph is very easy. We can do either BFS or DFS starting from every unvisited vertex, and we get all strongly connected components

1. Initialize all nodes as not visited.
2. Initialize variable count as 1.
3. for every vertex 'v'.
       (i) If 'v' is unvisited
            Call DFS(v)
            count=count+1
DFS(v)
1. Mark 'v' as visited.
2. Do following for every unvisited neighbor `u`
      recursively call DFS(u)

DFS based approach implementation

class Solution {
Public:
    void DFS(int node,vector<vector<int>> edges,vecto<bool>visited)
    {
        int i;
        visited[node]=true;
        for(int i=0;i<edges[node].size();i++)
        {
            if(edges[node][i]==1 && node!=i && visited[i]==false)
            {
                DFS(i,edges,visited);
            }
        }
    }

    //Main Method
    int findCircleNum(vector<vector<int>> edges) {
        int i,j;
        
        int n=edges.size();
        int count=0;
        vector<bool>visited(m);
        
        //mark all the nodes as unvisited
        for(i=0;i<n;i++)
        {
            visited[i]=false;
        }
        
        for(i=0;i<n;i++)
        {
            if(visited[i]==false)
            {
                DFS(i,edges,visited);
                count=count+1;
            }
        }
        
        return count;
    }
};

Complexity

  • Time Complexity: Since we just go along the edges and visit every node, it will be O(n).
  • Space Complexity: O(n), to store the visited nodes.

Using Disjoint Sets(Union Find)

So, how to think that this problem is solved by Disjoint Sets(union-find algorithm)?

 The answer is simple because we need to keep track of the set of elements(here friends) partitioned into a number of non-overlapping subsets. Disjoint Sets(Union Find) always do this work very efficiently. We will use the Union by Rank algorithm to solve this problem.

If you haven’t heard of the Disjoint Sets. Go to this link and read about it.

To join two nodes, we will compare the rank of parents of both nodes.

  • If the rank is equal, we can make any one of the parent’s node as a parent and increment the rank of the parent node by 1.
  • If the rank is not same, then we can make the parent whose rank is greater than other.

graph breadth first traversal application

 

Let’s start solving this.

Union(1,2): 1 is a parent of itself and 2 is parent of itself. As both of them have different parents, so we have to connect them, and we will any of the parent as root, in this case we chose 1 and make it a parent.

friend groups

Union(2,1): 1 is a parent of itself and 1 is a parent of 2, as both of them have the same parents, already joined.

Union(3,4) :3 is a parent of itself and 4 is a parent of itself. Both of them have different parents, we need to join them.

Union(4,3): 3 is parent of itself and 3 is the parent of 4. Both of them have the same parents, already joined.

Union(4,5):   3 is the parent node of 4 and 5 is the parent node of 5. Since parents are different, we have to compare the rank of the parents of both 4 and 5 nodes. 3 has higher rank then 5, it will be parent of 5 .(Used Path Compression) as shown in the below fig.

Union(5,4): As now, 4 and 5 have the same parents, already joined. Last is the node 6 which connected to itself. So, nothing to do there. At the end of this exercise, we found that there are three different sets in the graph, and that is our answer of number of groups of friends in this graph or matrix.

Disjoint set based approach implementation

class Solution {
public:
    class Node
    {
        public:
            int data;
            Node*parent;
            int rank=0;
    };
    //make a set with only one element.
    void make(int data)
    {
        Node*node=new Node();
        node->data=data;
        node->parent=node;
        node->rank=0;
        mapy[data]=node;
        return;
    }
    map<int,Node*> mapy;
    //To return the address of the particular node having data as `data` 
    Node*find(int data)
    {
        auto k=mapy.find(data);
        if(k==mapy.end())
        {
            //There is no any node created, create the node
            make(data);
            return mapy[data];
        }
        else
        {
            return mapy[data];
        }
        return NULL;
    }
    /*Find the representative(parent) recursively and does path compression  
    as well*/
    Node*parents(Node*node)
    {
        if(node->parent==node)
        {
            return node;
        }
        return node->parent = parents(node->parent);
    }
    //Main Method
    int findCircleNum(vector<vector<int>>edges) {
        int i,j;
        
        vector<int> v;
        int m=edges.size();
        int n=edges[0].size();
        for(i=0;i<m;i++)
        {
            for(j=0;j<n;j++)
            {
                if(edges[i][j]==1)
                {
                    int a=i;
                    int b=j;
                    
                    
                    Node*A=find(a);
                    Node*B=find(b);
            
                    Node*PA=parents(A);
                    Node*PB=parents(B);
            
                    if(PA==PB)
                    {
                    }
                    else
                    {
                        if(PA->rank>=PB->rank)
                        {
                            //increment rank if both sets have Same rank
                            PA->rank=(PA->rank==PB->rank)?PA->rank+1:PA->rank;
                            PB->parent=PA;
                        }
                        else
                        {
                            PA->parent=PB;
                        }
                    }
                    
                    }
                }
            }
        
        int number=0;
        for(auto k: mapy)
        {
            if(k.second->parent==k.second)
            {
                number=number+1;
            }
        }
        return number;
    }
};

Complexity

  • Time Complexity: For each of the edge, we need to find the parents  and do the union, which is O(mlogn).
  • Space Complexity: We used a map to store the parent information, O(n).

This post is contributed by Monika Bhasin

Russian doll envelopes

There are a number of envelopes with widths and heights given as a pair of integers (w, h). An envelope can fit into another if and only if both the width and height of one envelope is greater than the width and height of the other envelope. Given the list of such envelope dimensions, find the maximum number of envelopes can you Russian doll? (put one inside other) For example:

Input: [[5,4],[6,4],[6,7],[2,3]]
Output: 3 
Explanation: The maximum number of envelopes you can Russian doll is 3 ([2,3] => [5,4] => [6,7]).

How should we go about the solution? One of the ways is to find all the subsets of the envelopes and check every subset if envelopes can fit in each other. This solution has exponential time complexity. Can we do better than that?

Let’s look at the conditions again, an envelope can fit in another one only if width and height are smaller than the other envelope. Let’s simplify this. An envelope cannot fit into another one if the width is bigger than the other. What if we sort the envelopes based on their width? Then we are sure that i-th envelop can have all the envelopes inside it that are at index 0 to i-1 as far as the height permits.

Once, we order the envelopes based on width, our problem reduces to one dimension, which is height. Now if we want to find how many will fit in each other till envelope i, we find the envelopes from 0 to i-1 in increasing order. Why? Because even if the height of the envelope j is greater than height of envelope k, where k > j, it will not fit because width of envelope j is less than width of envelope k because of the sorted order.

To find the maximum number, all we have to is to find the longest increasing subsequence based on the height of envelopes.

One thing which you should take care of is what if the width of the two envelopes is the same? In that case, those two will not fit in together in each other even if heights are increasing. To handle this case, if widths are equal, we will put the envelope with a bigger height in front of the other, so that both envelopes are not part of increasing subsequence.

Show me the implementation

class Solution {
    public int maxEnvelopes(int[][] envelopes) {
        
        if(envelopes == null || envelopes.length == 0) return 0;
      
        /*Sort the array based on width in increasing order,
           if the width is equal, sort it based on height
           in decreasing order.
        */
        Arrays.sort(envelopes, (a,b) -> a[0] == b[0] 
                           ? b[1] - a[1] : a[0] - b[0]);
        
        int n = envelopes.length;
        
        int [] dp = new int[n];
        int len = 0;
        
        for(int[] envelope : envelopes){
            int index = Arrays.binarySearch(dp, 0, len, envelope[1]);
  
            if(index < 0)
                index = -(index + 1);
            dp[index] = envelope[1];
            if(index == len)
                len++;
        }
        
        return len;
    }
}

The time complexity of implementation is O(nlogn); we are doing n times binary search on maximum n numbers and space complexity is O(n).

You can solve another problem with the same approach: Box stacking problem, where one box can be put on another one only if the area of the base is less than the area of the base of another.

Three Sum Problem

Given an array nums of n integers, are there elements a, b,c in nums such that a+ b + c= 0? Find all unique triplets in the array which gives the sum of zero.

The solution set must not contain duplicate triplets.

This problem is commonly known as three sum problem.

Input: = [-1, 0, 1, 2, -1, -4],
Output:
[
  [-1, 0, 1],
  [-1, -1, 2]
]

This problem is commonly asked in Amazon and Google interviews.

Three Sum problem : thought process

Before going into the details of find three numbers with a given sum, do you know how to find two numbers with a given sum s? The idea is simple, we keep a map of all the numbers in we already seen, if see a number a[i], we see in the map if S-a[i] is present or not. If yes, then we found the pair, if not, we put a[i] and move forward.

This solution has linear time complexity i.e O(n) with a space complexity of O(n) as well. There is a way to avoid space complexity by sorting the input array first and using two pointer approach.

Details of 2 sum problem, you can read here: 2 Sum problem: Find pair with given sum in array

Can we use our understanding of the two-sum problem to solve this three-sum problem?

Hint 1:
What happens if we take one number from the array and say that this number will be one of the three numbers which add up to zero? What happens to our problem statement now?

Hint 2:
If you have decided that A[0] will be part of three numbers which add up to zero, all we need is two find out two numbers from index 1..len-1 which add up to OA[0]. What does that mean? It looks like a 2-sum problem, doesn’t it?

Can you please explain more?


We will start with index i from 0 and len-1, each time claiming that A[i] is part of the solution. Once, we fixed A[i] in solution, we will find two numbers which add to 0-A[i] solution to that problem we already know. If we find those two numbers let’s say A[j] and A[k], then A[i],A[j] and A[k] from the triplet. If we do not find those two numbers, discard A[i] and move forward.
3 sum problem

Show me the implementation of three sum problem

class Solution {

        public List<List<Integer>> threeSum(int[] nums) {

            List<List<Integer>> result = new ArrayList<>();
            //O(n log n)
            Arrays.sort(nums);

         
            for(int i=0; i<nums.length; i++){ if (i > 0 && nums[i] == nums[i - 1]) {
                 continue;
                }
                if(nums[i] > 0) continue;
                int j = i+1;
                int k = nums .length -1;

                int target = 0-nums[i];
                /*
                 Below code is exactly the same to find 
                 two numbers with a given sum using two pointers
                */
                while(j<k){
                    if(nums[j] + nums[k] == target ){
                      /* If find j and k, such that element at 
                         index i, j and K add to zero, put them in result
                      */
                        ArrayList<Integer> triplet = new ArrayList<Integer>(
                            Arrays.asList(nums[i], nums[j], nums[k]));
                        j++;
                        k--;
                         while (j < k && nums[j] == nums[j - 1]) j++;  // skip same result
                         while (j < k && nums[k] == nums[k + 1]) k--; // skip same result //since we want only unique triplets, using set result.add(triplet); } else if(nums[j] + nums[k] > target){
                        k--;
                    }
                    else{
                        j++;
                    }

                }
            }
            return  result;
        } 
    }

If you do not come up with the idea of moving i, j and k to get unique triplets, it is OK. You can always use HashSet to filter out duplicates and then copy the HashSet to the result list. It will take more time but will give the correct result.

 if(uniqueSet.add(triplet)){
     result.add(triplet);
 }

The time complexity of the solution is O(n2) plus the complexity of sorting algorithm (never forget that to mention!!). However, sorting complexity is O(nlogn), hence the overall complexity is dominated by the quadratic expression.

Can you solve the following problems on the leetcode?
1. 3SUM
2. 3Sum Closest
3. Four sum problem.

Digit sum of n-digit numbers

Find the count of n-digit numbers whose sum of digits is equal to given sum s.
For example,

 
Input:
n=2 and s=5
Output: 
14, 23, 32, 41, 50
Explanation:
we have to find those 2-digit numbers whose sum of digits is 5. Note that leading zeroes are not considered as digits, i.e. in above eg. we will not consider 05 as a 2-digit number. 

Sum of n-digit numbers thought process

A brute force solution that first comes to mind is to consider each of the n-digits numbers, calculate the sum of digits of each number and count the ones where the sum of digits is s. But this solution is of exponential time complexity, we need to find a better solution. Let’s consider a few examples:
Numbers: numbers with n-digits and sum of digits s
Cnt(n,s): count of numbers

Input Cnt(n,s) Numbers
n=2, sum=4 4 13, 22, 31, 40
n=2, sum=3 3 12, 21, 30
n=2, sum=2 2 11, 20
n=2, sum=1 1 10
n=2, sum=0 0
n=3, sum=5 15 104, 113, 122, 131, 140, 203, 212, 221, 230,302, 311, 320,401, 410,500

If we look closely at above example we can see that actual numbers in case of (n=3, sum=5) are generated from numbers in case of (n=2, sum=4), (n=2, sum=3), (n=2, sum=2), (n=2, sum=1) and (n=2, sum=0) by fixing the most significant digit. As stated in the question, leading zeroes cannot be considered as digits hence for the most significant digit available choices are ranging from 1 to 9 or given sum (whichever is less):

1 _ _	2 _ _	3 _ _	4 _ _	5 _ _

After fixing one digit we have got just 2 (n-1 i.e. 3-1 = 2) digits left with us and also the sum reduces by the digit value i, where i ranges from 1 to 9.

Let’s consider the case of (1 _ _), after fixing most significant digit to 1, problem reduces to remaining n – 1 = 3 – 1 = 2-digit numbers and reduced s = s – 1 = 5 – 1 = 4, which is nothing but the sub-problem cnt(n=2, s=4).
But also note that while fixing any digit other than the most significant digit, 0 can also be used as a digit, and the sub-problem is instead denoted by cntWithZero(n=2, s=4). The second digit can similarly be fixed but this time 0 is also a candidate:

1 0 _	1 1 _	1 2 _	1 3 _	1 4 _

After fixing 2 digits, the problem now reduces to 1-digit numbers and reduced sum (s-i) where i ranges from 0 to 9 or last reduced sum (whichever is less).

In this way for any n-digit number we can keep fixing a digit at a time and keep solving the sub-problems, until all n digits are fixed or in case s was small and it reaches zero before all digits were fixed.
As we had observed numbers for cnt(n=3, s=5) are generated by fixing the most significant digit, to find the solution for cnt(n=3, s=5) from its sub-problems –

1 _ _ => cntWithZero(2,4) => 5 (04, 13, 22, 31, 40)
2 _ _ => cntWithZero(2,3) => 4 (03, 12, 21, 30)
3 _ _ => cntWithZero(2,2) => 3 (02, 11, 20)
4 _ _ => cntWithZero(2,1) => 2 (01, 10)
5 _ _ => cntWithZero(2,0) => 1 (00)

Cnt(n=3, s=5) = cntWithZero(2,4) + cntWithZero(2,3) + cntWithZero(2,2) + cntWithZero(2,1) + cntWithZero(2,0)
                         = 5 + 4 + 3 + 2 + 1
                         = 15

Considering this observation we can say that cnt(n, s) can be calculated using sub-problems such as cnt(n-1, s-i) where i ranges from 0 to s:

cnt(n, s) = SUM(cntWithZero(n-1, s-i)), where 1 <= i <= s cntWithZero() = SUM(cntWithZero(n-1, s-i)), where 0 <= i <= s

This shows this problem has optimal substructure (i.e. an optimal solution can be constructed from optimal solutions of subproblems), which is the first condition for application of dynamic programming.

Recursive implementation C++

long long int cntWithZero(int n, int sum)
{
    if (n == 0)
    {
         if (sum == 0)
             return 1;
         return 0;
    }
    if (sum == 0)
       return 1;
 
    long long int ans = 0;
    for(int i=0; i<10; ++i)
    {
         if(sum – i >= 0)
         {
             ans += cntWithZero(n-1, sum-i);
         }
    }
    return ans; 
}
 
long long int cnt(int n, int sum)
{
    long long int ans = 0;
    for(int i=1; i<10; ++i)
    {
         if(sum – i >= 0)
         {
              ans += cntWithZero(n-1, sum-i);
         }
    }
    return ans;
}
 
int main(void) 
{
    int n=3, s=5;
    printf("%lld", cnt(n, s));
    return 0;
}



Recursive implementation Java

import java.io.*;
 
class digit_sum
{
 
     static int cntWithZero(int n, int sum)
    {
         if (n == 0)
        {
              if (sum == 0)
                  return 1;
             return 0;
         }
        if (sum == 0)
           return 1;
 
    int ans = 0;
       for(int i=0; i<10; ++i)
       {
           if(sum - i >= 0)
          {
              ans += cntWithZero(n-1, sum-i);
           }
       }
       return ans; 
   }
 
    static int cnt(int n, int sum)
    {
       int ans = 0;
       for(int i=1; i<10; ++i)
       {
           if(sum - i >= 0)
           {
                ans += cntWithZero(n-1, sum-i);
            }
        }
        return ans;
     }
 
    public static void main(String args[])
    { 
        int n=3, s=5;
        System.out.println(cnt(n, s));
    }
}


This implementation has exponential time complexity. Let’s have a look at the recursive function call tree to find the reason behind this:
sum of digits

In this example of cnt(n=3, s=5), subproblems such as cntWithZero(n=1, s=2) and cntWithZero(n=1, s=1) are calculated repeatedly. Our recursive algorithm for this problem solves the same subproblem over and over rather than always generating new subproblems. These are called overlapping subproblems, the second condition to apply dynamic programming.

As the two properties required for using Dynamic Programming: optimal substructure and overlapping subproblems hold, we can use DP for this problem. But before jumping to DP solution, there’s another way to resolve the issue of overlapping subproblems in a recursive approach: Memoized approach.

Memoised approach C++

long long int memoized_table[101][101];
 
long long int cntWithZero(int n, int sum)
{
    if (n == 0)
    {
         if (sum == 0)
             return 1;
         return 0;
    }
 
    if (memoized_table[n][ sum] == -1)
    {
         long long int ans = 0;
        for(int i=0; i<10; ++i)
        {
            if(sum – i >= 0)
            {
                ans += cntWithZero(n-1, sum -i);
            }
              memoized_table[n][ sum] = ans;
        }
    }
    return memoized_table[n][ sum]; 
}
 
long long int cnt(int n, int sum)
{
    long long int ans = 0;
 
    memset(memoized_table, -1, sizeof(memoized_table));    
 
    for(int i=1; i<10; ++i)
    {
         if(sum – i >= 0)
         {
              ans += cntWithZero(n-1, sum-i);
         }
    }
    return ans;
}
 
int main(void) 
{
    int n=3, s=5;
    printf("%lld", cnt(n, s));
    return 0;
}



Memoised approach Java

Java:
import java.io.*;
 
class digit_sum
{
 
static int memoized_table [][] = new int[101][101]; 
 
static int cntWithZero(int n, int sum)
{
    if (n == 0)
    {
         if (sum == 0)
             return 1;
         return 0;
    }
 
    if (memoized_table[n][sum] == -1)
    {
         int ans = 0;
        for(int i=0; i<10; ++i)
        {
            if(sum - i >= 0)
            {
                ans += cntWithZero(n-1, sum -i);
            }
              memoized_table[n][ sum] = ans;
        }
    }
    return memoized_table[n][ sum]; 
}
 
static int cnt(int n, int sum)
{
    int ans = 0;
 
    for(int i = 0; i <= 100; ++i)
    { 
        for(int j = 0; j <= 100; ++j)
        { 
            memoized_table[i][j] = -1; 
        } 
    } 
 
    for(int i=1; i<10; ++i)
    {
         if(sum - i >= 0)
         {
              ans += cntWithZero(n-1, sum -i);
         }
    }
    return ans;
}
 
public static void main(String[] args) 
{
    int n=3, s=5;
    System.out.println(cnt(n, s));
}
}



The time complexity of the above implementation is O(ns) along with the space complexity of O(ns)

Sum of digits Dynamic Programming Approach

We go bottom-up in a dynamic programming approach. We will use a 2D array / DP table in the implementation. Start from the bottom i.e. i=0, j=0, and keep solving each sub-problem and store its result in DP table until we reach i=n and j=s.

Dynamic programming Java solution of sum of digits problem

import java.io.*;
 
class digit_sum
{
 
static int cnt(int n, int sum)
{
    int temp = 0;
    
    if(sum==0 || n==0)
    {
        return 0;
    }
    
    int dp [][] = new int[n+1][sum+1];
    
    for(int i=0; i<=n; ++i)
        for(int j=0; j<=sum; ++j)
            dp[i][j] = 0;
 
    for(int i=0;i<=9 && i<=sum;i++)
    {
        dp[1][i]=1;
    }
    
    for(int i=2;i<n;i++)
    {
        for(int j=0;j<=sum;j++)
        {
            temp = 0;
            for(int k=0;k<=9;k++)
            {
                if(j-k>=0)
                {
                    temp += dp[i-1][j-k];
                }
                else
                    break;
            }
            dp[i][j] = temp;
        }
    }
    
    temp=0;
    for(int k=1; n>1 && k<=9; k++)
    {
        if((sum-k)>=0)
        {
            temp += dp[n-1][sum-k];
        }
    }
    dp[n][sum] = temp;
    
    return dp[n][sum];
}
 
public static void main(String[] args) 
{
    int n=3, s=5;
    System.out.println(cnt(n, s));
}
}


The time complexity of the dynamic programming solution is O(ns) with the space complexity of O(ns)
This article is contributed by Bhushan Aggrawal.

Ship capacity problem

There are some problems which do not appear to be a binary search problem at first. For example, ship capacity problem on leetcode. Problem statement is:
A conveyor belt has packages that must be shipped from one port to another within D days.
The i-th package on the conveyor belt has a weight of weights[i]. Each day, we load the ship with packages on the conveyor belt (in the order given by weights). We may not load more weight than the maximum weight capacity of the ship.
Return the least weight capacity of the ship that will result in all the packages on the conveyor belt being shipped within D days.

For example:

Input: weights = [1,2,3,4,5,6,7,8,9,10], D = 5
Output: 15
Explanation: 
A ship capacity of 15 is the minimum to ship all the packages in 5 days like this:
1st day: 1, 2, 3, 4, 5
2nd day: 6, 7
3rd day: 8
4th day: 9
5th day: 10

Note that the cargo must be shipped in the order given, so using a ship of capacity 14 and splitting the packages into parts like (2, 3, 4, 5), (1, 6, 7), (8), (9), (10) is not allowed. 

Ship capacity problem and binary search algorithm

At first glance, it does not appear to be a binary search problem. Where is the input we will search on? That’s where we have to remember that to apply a binary search, we actually do not need an input set, all we need is lower and upper bound.

Hint 1
What is the minimum capacity you would need to ship this cargo? You can choose the lowest possible capacity if you infinite number of days to ship all weights.

Hint 2
What if you have only one day to ship all the weights? In that case, what will be the capacity of the ship? Do you need more than capacity ever?

To find these bounds, try to vary the constraint to extremes. In the example of ship capacity, try to put the number of days constraints to an extreme. What if we had an infinite number of days to ship the cargo?

Can you please explain more?


In that case, we will need a ship that has at least capacity greater than equal to the heaviest weight cargo, as no matter how many days, we cannot move the cargo.

Again, what if I had only 1 day to ship the whole cargo? In that case, we need a ship that can take all the weights in 1 day, so ship capacity should be the sum of all the weights.

Now, we know the lower and upper bound of the ship, all we have to adjust the capacity and see if we can ship cargo in D days? Start with mid, see if we can. If yes, then try a smaller capacity. If not, then try greater capacity. All we are doing is find the first capacity between lower and upper bounds. It seems like the first occurrence problem now.

Understood the concept, you still need help with implementation? Please see the code below.

Ship capacity problem implementation

    public int shipWithinDays(int[] weights, int D) {
        
        int upperLimit = 0;
        int lowerLimit = 0;
        
        for(int i = 0; i<weights.length; i++){
            upperLimit+= weights[i];
        }
        //Not returning from while loop :)
        while(lowerLimit < upperLimit){
            int shipCapacity = lowerLimit + (upperLimit - lowerLimit)/2;
            
            if(isItPossible(D, shipCapacity, weights)){
                upperLimit = shipCapacity;
            }
            else{
                lowerLimit = shipCapacity + 1;
            }
        }
        
        return lowerLimit;
        
    }
    
    private boolean isItPossible(int D, int shipCapacity, int[] weights){
        
        int currentLoad = 0;
        int days = 1;
        int i = 0;
        
        while(i<weights.length){
            if(weights[i] > shipCapacity) return false;
            
            currentLoad+= weights[i];
            if(currentLoad == shipCapacity){
                days++; i++;
                currentLoad = 0;
            }
            else if(currentLoad > shipCapacity){
                days++;
                currentLoad = 0;
            }
            else{
                i++;
            }
        }
        
        return days <= D;
    }

The time complexity of the solution is O(nlogc) where n is number of weights and c is capacity range.

Please share if you find anything wrong or missing. If you are looking for coaching to prepare for your technical interviews, please book a free session with us.

Partition labels

Given a string S, partition the string into maximum possible partitions so that each letter appears in at most one part, and return a list of integers containing the size of each partition.
For example:

Input: 
S = "ababcbacadefegdehijhklij"
Output:
[9,7,8]
Explanation:
The partition is "ababcbaca", "defegde", "hijhklij".
This is a partition so that each letter appears in at most one part.
A partition like "ababcbacadefegde", "hijhklij" is incorrect, because it splits S into fewer parts.

This problem is commonly asked in Amazon interviews.

Thought process to partition string in labels

Fundamental idea is that if two instances of characters should be in the same partition. So, we start with the first character and see at what point we can finish the partition. The earliest we can do is at the last instance of this character.
What if we find a character between the first character and the last instance of it? In this case, we increase the length of the partition. Why? If we do not increase the partition, the new character ends up into two partitions, which violates the constraint of the problem.

If we have gone through all the characters between the start of partition and maximum of the last instance of characters, we can close the partition and start a new one.

Let’s take an example and see how it works. Start with the first character of the string and find the last instance of that character, in this case, it is a
partition labels

Go through all the characters until the last instance, if the last instance of any of the characters is greater than the current last instance, expand the partition to include new last instance. In this example, none of the characters have the last index beyond the last index of a

Once, we reach the last index, we can close the current partition and start a new one. Find the last instance of the first character of the new partition.

amazon interview question partition labels

Go though the characters in between. In this example, character e has last instance greater than current one, so we update the partition till e.

There is no character in between with the last instance greater than the last instance of e, so we close the partition there and start the next one from the next character in the string which is h

Next character i will expand the partition to index 22 and next to next character j will push it to 23. There is no other character that has the last occurrence after that index. Hence, we would close the partition there.
partitions

Whenever we are closing the partition, we would add the length of that partition end – start + 1 in a list to return.

Partition labels implementation

class Solution {
    public List<Integer> partitionLabels(String S) {
        
        Map<Character, Integer> map = new HashMap<>(); 
        
        //Get the last index the character.
        for(int i=0; i< S.length(); i++){
            map.put(S.charAt(i), i);
        }
        
        int last = 0;
        int start = 0;
        
        List<Integer> list = new ArrayList<>();
        for(int i=0; i< S.length(); i++){  
            
            last = Math.max(last, map.get(S.charAt(i)));
            if(last == i){
                list.add(last - start + 1);
                start = last + 1;
            }
        }
        
        return list;
    }
}

The time complexity of implementation is O(n), however, we are scanning the string twice. We also need constant space. Do not get confused with map, the maximum size of the map is bound by 52 (upper and lower case characters) irrespective of the size of the string.

Please share if there is something wrong or missing.

Number of islands

Given a 2d grid map of ‘1’s (land) and ‘0’s (water), count the number of islands. An island is surrounded by water and is formed by connecting adjacent lands horizontally or vertically. You may assume all four edges of the grid are all surrounded by water.

Input:
11110
11010
11000
00000

Output: 1

The first challenge of this problem is to identify that this is a graph problem. To identify that, we have to look if cells of the matrix can represent nodes, and are there any relationship between cells that can act as edges of the graph?
One hint is to the relationship between the cells of the matrix. For example, in the island problem, 0s are water, and 1s are land. Land can be connected to land only which is adjacent to the land and a piece of land is isolated if surrounded by water from all four sides.

To find the number of islands in the matrix, we have to find the number of connected components in the graphs as explained above. Which traversal can we use to find the number of connected components of a graph? It is a depth-first search problem.

In this case, nodes seem to be the cells with value 1(land). Most of the time, in these kinds of problems, each cell of the matrix with a certain value is a node. If we have an n x m matrix, we are looking at n x m possible nodes. Also, here be aware that we are not interested in all the cells. Based on the problem statement, we can discard some cells. For example, in the island problem, we are interested in the cells which represent the land.

Now, what will be the edges? In these kinds of problems, a statement is given like an island is surrounded by water and is formed by connecting adjacent lands horizontally or vertically. It means that a cell with value 1 is connected to neighbor cells only if they also contain 1.

Now that we know that cells containing 1 in the matrix are nodes of the graphs and connected to neighbors in they are also 1. To count the number of islands, all we have to do is find connected components in the graph. We reduced a matrix problem to a graph problem after all. Should we create an entire graph? Well no, at any point in time, we do not need the entire graph.

Number of islands implementation

public int numIslands(char[][] grid) {
int count = 0;
Set<Integer> visited = new HashSet&lt;&gt;();
int n = grid.length;

if(n &lt;=0) return 0;
int m = grid[0].length;

for(int i=0; i&lt;n; i++){
    for(int j=0; j&lt;m; j++){
        int cell = i * m + j;
        if(!visited.contains(cell) &amp;&amp; grid[i][j] == '1'){
            count++;
            dfs(i, j, visited, grid);
        }
    }
}
return count;
}

void dfs(int i, int j, Set<Integer> visited, char[][] a){
//Check if the node we are traversing is actually a valid node.
if(i&lt;0 || i&gt;=a.length 
      || j&lt;0 || j&gt;=a[0].length 
      || a[i][j] == '0') return;

int cell = i * a[0].length + j;
if(visited.contains(cell)) return;
visited.add(cell); 

//Visit neighbors
dfs(i+1, j, visited, a);
dfs(i-1, j, visited, a);
dfs(i, j+1, visited, a);
dfs(i, j-1, visited, a);
}

The complexity of the above code is O(n * m) where m and n are rows and columns of the matrix.

Find Binomial Coefficient

In mathematics, Binomial Coefficients are the positive integers that occur as coefficients in the binomial theorem.

A binomial coefficient is the coefficient of the term (xk) in the polynomial expansion of (1 + x)n and is indexed by a pair of integers n >= k >= 0

In this post, we will denote it as bc(n, k).

For example: The fourth power of (1 + x) is –

(1 + x)4 = 1 + 4x + 6x2 + 4x3 + x4

Writing this expression in terms of Binomial Coefficients - 
(1 + x)4 = bc(4,0) * x0 + bc(4,1) * x1 + bc(4,2) * x2 + bc(4,3) * x3 + bc(4,4) * x4

(1 + x)4 = 1 * x0 + 4 * x1 + 6 * x2 + 4 * x3 + 1 * x4

The Binomial coefficient for n=4 and k=2 is bc(4, 2) = 6, the coefficient of (x2) in the expansion of (1+x)4.

Problem is – given values of n and k, find the binomial coefficient bc(n, k).

Let’s have a look at few (1+x)n expansions:

N Expression
0 (1+x)0 = 1
1 (1+x)1 = (1+x)
2 (1+x)2 = (1 + 2x + x2)
3 (1+x)3 = (1+ 3x + 3x2 + x3)
4 (1+x)4 = (1 + 4x + 6x2 + 4x3 + x4)
5 (1+x)5 = (1 + 5x + 10x2 + 10x3 + 5x4 + x5)
6 (1+x)6 = (1 + 6x + 15x2 + 20x3 + 15x4 + 6x5 + x6)
7 (1+x)7 = (1 + 7x + 21x2 + 35x3 + 35x4 + 21x5 + 7x6 + x7)

The binomial coefficients from these expansions can be arranged to form following triangle like structure called Pascal’s Triangle:

binomial coefficient
pascal triangle

From Pascal’s triangle, we observe that an entry in any of the rows is addition of two entries from the row above –

bc(n, k) = bc(n-1, k-1) + bc(n-1, k)

This shows this problem has optimal substructure (i.e. an optimal solution can be constructed from optimal solutions of subproblems). Also we can observe that first and last binomial coefficients of each are 1 i.e. –

bc(n, 0) = bc(n, n) = 1

Recursive Approach C++ implementation

long long int findBinomialCoeff (int n, int k){
    if(k == 0 || n == k) return 1;
    return findBinomialCoeff (n-1, k-1) + findBinomialCoeff (n-1, k);
}

int main(void){
   int n = 4, k=2;
   long long int binomialCoeff = findBinomialCoeff(n, k);
   printf("%lld", binomialCoeff);
   
   return 0;
}

Same can be implemented in Java as well.

Recursive Approach Java implementation

import java.util.*;

class BinomialCoefficient
{
    static int findBinomialCoeff (int n, int k){
        if(k == 0 || n == k) return 1;
        return findBinomialCoeff (n-1, k-1) + findBinomialCoeff (n-1, k);
    }

    public static void main(String[] args){
        int n = 4, k=2;
        int binomialCoeff = findBinomialCoeff(n, k);
        System.out.printf("%d", binomialCoeff);
    }
}

The recursive code has exponential time complexity, let’s have a look at the reasoning behind this:
binomial coefficient

In this example of finding Binomial coefficient for n=5 and k=3, subproblems such as bc(3,2) and bc(2,2) are calculated again and again. Our recursive algorithm for this problem solves the same subproblem over and over. These are called overlapping subproblems.

As the two properties required for using Dynamic Programming : optimal substructure and overlapping subproblems are holding here, we can use DP for this problem.

Binomial coefficient : Dynamic Programming Approach

In DP, we start calculating from the bottom and move up towards the final solution. Solution of all subproblems are stored in a 2D array / DP table so that they can be reused when required.

Binomial coefficient with dynamic programming C++

long long int findBinomialCoeff (int n, int k){
    long long int DP_table[n+1][k+1];
    for(int i=0; i&lt;=n; ++i){
        for(int j=0; j &lt;= min(i,k); ++j){ 
            if (j == 0 || i == j) dp[i][j] = 1;
            else dp[i][j] = dp[i-1][j-1] + dp[i-1][j];
        }
    }
    return dp[n][k];
}



Java implementation is very similar.

Java implementation


import java.util.*;

class BinomialCoefficient{
static int min(int a, int b){
return (a<b) ? a : b;
}

static int findBinomialCoeff (int n, int k){
int dp[][] = new int [n+1][k+1];
for(int i=0; i<=n; ++i){
for(int j=0; j <= min(i,k); ++j){
if (j == 0 || i == j) dp[i][j] = 1;
else dp[i][j] = dp[i-1][j-1] + dp[i-1][j];
}
}
return dp[n][k];
}
[/code]

The time complexity of the dynamic programming approach to implementing the binomial coefficient is O(nk) whereas space complexity is O(nk)
The space complexity can be reduced by storing just the last row of Pascal’s triangle and that too just the (k+1) elements from the last row are required, with this approach instead of a 2D array we can work with a 1D array.

This article is contributed by Bhushan Aggrawal.

Top K Frequent Keywords

Given a list of reviews, a list of keywords and an integer k. Find the most frequent k keywords in order of most to least frequently mentioned.

  • The comparison of strings is case-insensitive.
  • Multiple occurrences of a keyword in a review should be considered as a single mention.
  • If keywords are mentioned an equal number of times in reviews, sort alphabetically.

Example

Input:
k = 2
keywords = ["services", "algorithms", "inteview"]
reviews = [
  "algorithms and Me provides the best services for the interview preparations",
  "Also, their mock interview services awesome services",
  "Best services provided by Algorithms and me, everyone should use their services",
]

Output:
["services", "algorithms"]

Explanation:
"services" is occurring in 3 different reviews, and "algorithms" occurs in 2 reviews. Note that even though "services" occur two times in the same review, it is counted only once.

Rule of thumb, if you have to find top k of anything, give priority queue a thought, and to understand priority queues, it is very important to understand the fundamentals of heaps.

This is a straight forward application of priority queues. All we have to do is count the frequency of each word in the keywords list in the given list of reviews. Once we have the frequency count, use priority queue to find the top K keywords.

Implementation notes
1. Convert the list of keywords into a set, to have an efficient lookup.
2. Split each review into words and check if the word is the keyword set.
3. If yes, increment the frequency count.
4. Put the map of words to frequency into the priority queue and pluck top K words. Adjust the comparator function to have order on the frequency and then alphabetically.

Top K frequent keywords in reviews

public List<String> findKWordsInReviews(int k,
                       List<String>keywords, List<String> reviews) {

        List<String> res = new ArrayList<>();

        //For fast look up. additional space O(n)
        Set<String> set = new HashSet<>(keywords);

        //To store the freq count of keywords
        Map<String, Integer> map = new HashMap<>();

        //Go through each review
        for(String review : reviews) {
            //Split the reviews in words
            String[] words = review .split("\\W");

            //So that we do not count word twice
            Set<String> added = new HashSet<>();
            for(String word : words) {
                //As it is case sensitive
                word = word.toLowerCase();
                //If word is keywords and not yet added
                if(set.contains(word) && !added.contains(word)) {
                    map.put(word, map.getOrDefault(word, 0) + 1);
                    added.add(word);
                }
            }
        }

        //Add the map created into the priority queue.
        Queue<Map.Entry<String, Integer>> maxHeap = new PriorityQueue<>(
                (a, b)->a.getValue() == b.getValue()
                        ? a.getKey().compareTo(b.getKey())
                        : b.getValue() - a.getValue());

        //Add everything to PQ
        maxHeap.addAll(map.entrySet());

        //Take out top k words
        while(!maxHeap.isEmpty() && k-- > 0) {
            res.add(maxHeap.poll().getKey());
        }
        return res;
    }

The complexity of this code depends on the following :
1. No of keywords let’s say is n
2. No of reviews, m
3. K

If we are using max heap approach, building a max heap with n keywords will take O(n), while taking out k words from it, will take O(klogn).

Splitting the reviews into words will take O(m * average length of reviews) with additional space depending on the number of words in reviews.

Critical Connections in a Network

There are n servers numbered from 0 to n-1 connected by undirected server-to-server connections forming a network where connections[i] = [a, b] represents a connection between servers a and b. Any server can reach any other server directly or indirectly through the network.
critical connection is a connection that, if removed, will make some server unable to reach some other server.
For example,
Input: n = 4, connections = [[0,1],[1,2],[2,0],[1,3]]
Output: [[1,3]]
Explanation: [[3,1]] is also accepted.
critical connections in a network

Before going through this solution, I would strongly recommend reading Tarjan’s algorithm and find bridges in a graph problem.

class Solution {
    int timer;

    List<List<Integer>> g;
      
    boolean [] visited;
    /* This map stores the time when the
    current node is visited
     */
    int [] tin;
    int [] low;
    
    void dfs(int u, int parent, 
                    List<List<Integer>> res ) {
        visited[u] = true;

        //Put the current timer.
        tin[u] = timer;
        //Low is the time of entry to start with
        low[u] = timer;

        timer++;

        /*
            Go through all the neighbors
         */
        for (int to : g.get(u)) {
            //If it is parent, nothing to be done
            if (to == parent) continue;

            /* If the neighbor was already visited
                get the minimum of the neighbor entry time
                or the current low of the node.
             */
            if (visited[to]) {
                low[u] = Math.min(low[u], tin[to]);
            } else {
                //Else do the DFS
                dfs(to, u, res);
                /*
                 Normal edge scenario,
                 take the minimum of low of the parent 
                  and the child.
                 */
                low[u] = Math.min(low[u], low[to]);

                /* If low of the child node is less than
                   time of entry of current node, then
                   there is a bridge.
                 */
                if (low[to] > tin[u]){
                   //List<Integer> l = new ArrayList<>();
                    List<Integer> l = 
                           Arrays.asList(new Integer[]{u, to});
                   res.add(l);
                }
            }
        }


    }

    public void findBridges(List<List<Integer>> res) {
        timer = 0;
        for(int i=0; i<g.size(); i++){
            if (!visited[i])
                dfs(i, -1, res);
        }
    }

  public List<List<Integer>> criticalConnections(int n, 
                             List<List<Integer>> connections) {
      

    g = new ArrayList<>(); 
      
    visited = new boolean[n];
    /* This map stores the time when the
    current node is visited
     */
    tin = new int[n];
    low = new int[n];
    
    Arrays.fill(visited, false);
    Arrays.fill(tin, n);
    Arrays.fill(low, n);
    
    for(int i=0; i<n; i++){
        g.add(new ArrayList<>());
    }
    for(List<Integer> l : connections){
          g.get(l.get(0)).add(l.get(1));  
          g.get(l.get(1)).add(l.get(0));   
      }
      
      List<List<Integer>> res = new ArrayList<>();
      findBridges(res);
    
      return res;
        
    }
}

The complexity of this code is O(V + E) where V is number of vertices and E is number of edges in the graph.

If you want help with interview preparations, please feel free to book a free demo session with us.