Three Sum Problem

Given an array nums of n integers, are there elements a, b,c in nums such that a+ b + c= 0? Find all unique triplets in the array which gives the sum of zero.

The solution set must not contain duplicate triplets.

This problem is commonly known as three sum problem.

Input: = [-1, 0, 1, 2, -1, -4],
Output:
[
  [-1, 0, 1],
  [-1, -1, 2]
]

This problem is commonly asked in Amazon and Google interviews.

Three Sum problem : thought process

Before going into the details of find three numbers with a given sum, do you know how to find two numbers with a given sum s? The idea is simple, we keep a map of all the numbers in we already seen, if see a number a[i], we see in the map if S-a[i] is present or not. If yes, then we found the pair, if not, we put a[i] and move forward.

This solution has linear time complexity i.e O(n) with a space complexity of O(n) as well. There is a way to avoid space complexity by sorting the input array first and using two pointer approach.

Details of 2 sum problem, you can read here: 2 Sum problem: Find pair with given sum in array

Can we use our understanding of the two-sum problem to solve this three-sum problem?

Hint 1:
What happens if we take one number from the array and say that this number will be one of the three numbers which add up to zero? What happens to our problem statement now?

Hint 2:
If you have decided that A[0] will be part of three numbers which add up to zero, all we need is two find out two numbers from index 1..len-1 which add up to OA[0]. What does that mean? It looks like a 2-sum problem, doesn’t it?

Can you please explain more?
We will start with index i from 0 and len-1, each time claiming that A[i] is part of the solution. Once, we fixed A[i] in solution, we will find two numbers which add to 0-A[i] solution to that problem we already know. If we find those two numbers let’s say A[j] and A[k], then A[i],A[j] and A[k] from the triplet. If we do not find those two numbers, discard A[i] and move forward.
3 sum problem

Show me the implementation of three sum problem

class Solution {

        public List<List<Integer>> threeSum(int[] nums) {

            List<List<Integer>> result = new ArrayList<>();
            //O(n log n)
            Arrays.sort(nums);

         
            for(int i=0; i<nums.length; i++){ if (i > 0 && nums[i] == nums[i - 1]) {
                 continue;
                }
                if(nums[i] > 0) continue;
                int j = i+1;
                int k = nums .length -1;

                int target = 0-nums[i];
                /*
                 Below code is exactly the same to find 
                 two numbers with a given sum using two pointers
                */
                while(j<k){
                    if(nums[j] + nums[k] == target ){
                      /* If find j and k, such that element at 
                         index i, j and K add to zero, put them in result
                      */
                        ArrayList<Integer> triplet = new ArrayList<Integer>(
                            Arrays.asList(nums[i], nums[j], nums[k]));
                        j++;
                        k--;
                         while (j < k && nums[j] == nums[j - 1]) j++;  // skip same result
                         while (j < k && nums[k] == nums[k + 1]) k--; // skip same result //since we want only unique triplets, using set result.add(triplet); } else if(nums[j] + nums[k] > target){
                        k--;
                    }
                    else{
                        j++;
                    }

                }
            }
            return  result;
        } 
    }

If you do not come up with the idea of moving i, j and k to get unique triplets, it is OK. You can always use HashSet to filter out duplicates and then copy the HashSet to the result list. It will take more time but will give the correct result.

 if(uniqueSet.add(triplet)){
     result.add(triplet);
 }

The time complexity of the solution is O(n2) plus the complexity of sorting algorithm (never forget that to mention!!). However, sorting complexity is O(nlogn), hence the overall complexity is dominated by the quadratic expression.

Can you solve the following problems on the leetcode?
1. 3SUM
2. 3Sum Closest
3. Four sum problem.

Ship capacity problem

There are some problems which do not appear to be a binary search problem at first. For example, ship capacity problem on leetcode. Problem statement is:
A conveyor belt has packages that must be shipped from one port to another within D days.
The i-th package on the conveyor belt has a weight of weights[i]. Each day, we load the ship with packages on the conveyor belt (in the order given by weights). We may not load more weight than the maximum weight capacity of the ship.
Return the least weight capacity of the ship that will result in all the packages on the conveyor belt being shipped within D days.

For example:

Input: weights = [1,2,3,4,5,6,7,8,9,10], D = 5
Output: 15
Explanation: 
A ship capacity of 15 is the minimum to ship all the packages in 5 days like this:
1st day: 1, 2, 3, 4, 5
2nd day: 6, 7
3rd day: 8
4th day: 9
5th day: 10

Note that the cargo must be shipped in the order given, so using a ship of capacity 14 and splitting the packages into parts like (2, 3, 4, 5), (1, 6, 7), (8), (9), (10) is not allowed. 

Ship capacity problem and binary search algorithm

At first glance, it does not appear to be a binary search problem. Where is the input we will search on? That’s where we have to remember that to apply a binary search, we actually do not need an input set, all we need is lower and upper bound.

Hint 1
What is the minimum capacity you would need to ship this cargo? You can choose the lowest possible capacity if you infinite number of days to ship all weights.

Hint 2
What if you have only one day to ship all the weights? In that case, what will be the capacity of the ship? Do you need more than capacity ever?

To find these bounds, try to vary the constraint to extremes. In the example of ship capacity, try to put the number of days constraints to an extreme. What if we had an infinite number of days to ship the cargo?

Can you please explain more?
In that case, we will need a ship that has at least capacity greater than equal to the heaviest weight cargo, as no matter how many days, we cannot move the cargo.

Again, what if I had only 1 day to ship the whole cargo? In that case, we need a ship that can take all the weights in 1 day, so ship capacity should be the sum of all the weights.

Now, we know the lower and upper bound of the ship, all we have to adjust the capacity and see if we can ship cargo in D days? Start with mid, see if we can. If yes, then try a smaller capacity. If not, then try greater capacity. All we are doing is find the first capacity between lower and upper bounds. It seems like the first occurrence problem now.

Understood the concept, you still need help with implementation? Please see the code below.

Ship capacity problem implementation

    public int shipWithinDays(int[] weights, int D) {
        
        int upperLimit = 0;
        int lowerLimit = 0;
        
        for(int i = 0; i<weights.length; i++){
            upperLimit+= weights[i];
        }
        //Not returning from while loop :)
        while(lowerLimit < upperLimit){
            int shipCapacity = lowerLimit + (upperLimit - lowerLimit)/2;
            
            if(isItPossible(D, shipCapacity, weights)){
                upperLimit = shipCapacity;
            }
            else{
                lowerLimit = shipCapacity + 1;
            }
        }
        
        return lowerLimit;
        
    }
    
    private boolean isItPossible(int D, int shipCapacity, int[] weights){
        
        int currentLoad = 0;
        int days = 1;
        int i = 0;
        
        while(i<weights.length){
            if(weights[i] > shipCapacity) return false;
            
            currentLoad+= weights[i];
            if(currentLoad == shipCapacity){
                days++; i++;
                currentLoad = 0;
            }
            else if(currentLoad > shipCapacity){
                days++;
                currentLoad = 0;
            }
            else{
                i++;
            }
        }
        
        return days <= D;
    }

The time complexity of the solution is O(nlogc) where n is number of weights and c is capacity range.

Please share if you find anything wrong or missing. If you are looking for coaching to prepare for your technical interviews, please book a free session with us.

Partition labels

Given a string S, partition the string into maximum possible partitions so that each letter appears in at most one part, and return a list of integers containing the size of each partition.
For example:

Input: 
S = "ababcbacadefegdehijhklij"
Output:
[9,7,8]
Explanation:
The partition is "ababcbaca", "defegde", "hijhklij".
This is a partition so that each letter appears in at most one part.
A partition like "ababcbacadefegde", "hijhklij" is incorrect, because it splits S into fewer parts.

This problem is commonly asked in Amazon interviews.

Thought process to partition string in labels

Fundamental idea is that if two instances of characters should be in the same partition. So, we start with the first character and see at what point we can finish the partition. The earliest we can do is at the last instance of this character.
What if we find a character between the first character and the last instance of it? In this case, we increase the length of the partition. Why? If we do not increase the partition, the new character ends up into two partitions, which violates the constraint of the problem.

If we have gone through all the characters between the start of partition and maximum of the last instance of characters, we can close the partition and start a new one.

Let’s take an example and see how it works. Start with the first character of the string and find the last instance of that character, in this case, it is a
partition labels

Go through all the characters until the last instance, if the last instance of any of the characters is greater than the current last instance, expand the partition to include new last instance. In this example, none of the characters have the last index beyond the last index of a

Once, we reach the last index, we can close the current partition and start a new one. Find the last instance of the first character of the new partition.

amazon interview question partition labels

Go though the characters in between. In this example, character e has last instance greater than current one, so we update the partition till e.

There is no character in between with the last instance greater than the last instance of e, so we close the partition there and start the next one from the next character in the string which is h

Next character i will expand the partition to index 22 and next to next character j will push it to 23. There is no other character that has the last occurrence after that index. Hence, we would close the partition there.
partitions

Whenever we are closing the partition, we would add the length of that partition end – start + 1 in a list to return.

Partition labels implementation

class Solution {
    public List<Integer> partitionLabels(String S) {
        
        Map<Character, Integer> map = new HashMap<>(); 
        
        //Get the last index the character.
        for(int i=0; i< S.length(); i++){
            map.put(S.charAt(i), i);
        }
        
        int last = 0;
        int start = 0;
        
        List<Integer> list = new ArrayList<>();
        for(int i=0; i< S.length(); i++){  
            
            last = Math.max(last, map.get(S.charAt(i)));
            if(last == i){
                list.add(last - start + 1);
                start = last + 1;
            }
        }
        
        return list;
    }
}

The time complexity of implementation is O(n), however, we are scanning the string twice. We also need constant space. Do not get confused with map, the maximum size of the map is bound by 52 (upper and lower case characters) irrespective of the size of the string.

Please share if there is something wrong or missing.

Number of islands

Given a 2d grid map of ‘1’s (land) and ‘0’s (water), count the number of islands. An island is surrounded by water and is formed by connecting adjacent lands horizontally or vertically. You may assume all four edges of the grid are all surrounded by water.

Input:
11110
11010
11000
00000

Output: 1

The first challenge of this problem is to identify that this is a graph problem. To identify that, we have to look if cells of the matrix can represent nodes, and are there any relationship between cells that can act as edges of the graph?
One hint is to the relationship between the cells of the matrix. For example, in the island problem, 0s are water, and 1s are land. Land can be connected to land only which is adjacent to the land and a piece of land is isolated if surrounded by water from all four sides.

To find the number of islands in the matrix, we have to find the number of connected components in the graphs as explained above. Which traversal can we use to find the number of connected components of a graph? It is a depth-first search problem.

In this case, nodes seem to be the cells with value 1(land). Most of the time, in these kinds of problems, each cell of the matrix with a certain value is a node. If we have an n x m matrix, we are looking at n x m possible nodes. Also, here be aware that we are not interested in all the cells. Based on the problem statement, we can discard some cells. For example, in the island problem, we are interested in the cells which represent the land.

Now, what will be the edges? In these kinds of problems, a statement is given like an island is surrounded by water and is formed by connecting adjacent lands horizontally or vertically. It means that a cell with value 1 is connected to neighbor cells only if they also contain 1.

Now that we know that cells containing 1 in the matrix are nodes of the graphs and connected to neighbors in they are also 1. To count the number of islands, all we have to do is find connected components in the graph. We reduced a matrix problem to a graph problem after all. Should we create an entire graph? Well no, at any point in time, we do not need the entire graph.

Number of islands implementation

public int numIslands(char[][] grid) {
int count = 0;
Set<Integer> visited = new HashSet&lt;&gt;();
int n = grid.length;

if(n &lt;=0) return 0;
int m = grid[0].length;

for(int i=0; i&lt;n; i++){
    for(int j=0; j&lt;m; j++){
        int cell = i * m + j;
        if(!visited.contains(cell) &amp;&amp; grid[i][j] == '1'){
            count++;
            dfs(i, j, visited, grid);
        }
    }
}
return count;
}

void dfs(int i, int j, Set<Integer> visited, char[][] a){
//Check if the node we are traversing is actually a valid node.
if(i&lt;0 || i&gt;=a.length 
      || j&lt;0 || j&gt;=a[0].length 
      || a[i][j] == '0') return;

int cell = i * a[0].length + j;
if(visited.contains(cell)) return;
visited.add(cell); 

//Visit neighbors
dfs(i+1, j, visited, a);
dfs(i-1, j, visited, a);
dfs(i, j+1, visited, a);
dfs(i, j-1, visited, a);
}

The complexity of the above code is O(n * m) where m and n are rows and columns of the matrix.

Find Binomial Coefficient

In mathematics, Binomial Coefficients are the positive integers that occur as coefficients in the binomial theorem.

A binomial coefficient is the coefficient of the term (xk) in the polynomial expansion of (1 + x)n and is indexed by a pair of integers n >= k >= 0

In this post, we will denote it as bc(n, k).

For example: The fourth power of (1 + x) is –

(1 + x)4 = 1 + 4x + 6x2 + 4x3 + x4

Writing this expression in terms of Binomial Coefficients - 
(1 + x)4 = bc(4,0) * x0 + bc(4,1) * x1 + bc(4,2) * x2 + bc(4,3) * x3 + bc(4,4) * x4

(1 + x)4 = 1 * x0 + 4 * x1 + 6 * x2 + 4 * x3 + 1 * x4

The Binomial coefficient for n=4 and k=2 is bc(4, 2) = 6, the coefficient of (x2) in the expansion of (1+x)4.

Problem is – given values of n and k, find the binomial coefficient bc(n, k).

Let’s have a look at few (1+x)n expansions:

N Expression
0 (1+x)0 = 1
1 (1+x)1 = (1+x)
2 (1+x)2 = (1 + 2x + x2)
3 (1+x)3 = (1+ 3x + 3x2 + x3)
4 (1+x)4 = (1 + 4x + 6x2 + 4x3 + x4)
5 (1+x)5 = (1 + 5x + 10x2 + 10x3 + 5x4 + x5)
6 (1+x)6 = (1 + 6x + 15x2 + 20x3 + 15x4 + 6x5 + x6)
7 (1+x)7 = (1 + 7x + 21x2 + 35x3 + 35x4 + 21x5 + 7x6 + x7)

The binomial coefficients from these expansions can be arranged to form following triangle like structure called Pascal’s Triangle:

binomial coefficient
pascal triangle

From Pascal’s triangle, we observe that an entry in any of the rows is addition of two entries from the row above –

bc(n, k) = bc(n-1, k-1) + bc(n-1, k)

This shows this problem has optimal substructure (i.e. an optimal solution can be constructed from optimal solutions of subproblems). Also we can observe that first and last binomial coefficients of each are 1 i.e. –

bc(n, 0) = bc(n, n) = 1

Recursive Approach C++ implementation
long long int findBinomialCoeff (int n, int k){
    if(k == 0 || n == k) return 1;
    return findBinomialCoeff (n-1, k-1) + findBinomialCoeff (n-1, k);
}

int main(void){
   int n = 4, k=2;
   long long int binomialCoeff = findBinomialCoeff(n, k);
   printf("%lld", binomialCoeff);
   
   return 0;
}

Same can be implemented in Java as well.

Recursive Approach Java implementation
import java.util.*;

class BinomialCoefficient
{
    static int findBinomialCoeff (int n, int k){
        if(k == 0 || n == k) return 1;
        return findBinomialCoeff (n-1, k-1) + findBinomialCoeff (n-1, k);
    }

    public static void main(String[] args){
        int n = 4, k=2;
        int binomialCoeff = findBinomialCoeff(n, k);
        System.out.printf("%d", binomialCoeff);
    }
}

The recursive code has exponential time complexity, let’s have a look at the reasoning behind this:
binomial coefficient

In this example of finding Binomial coefficient for n=5 and k=3, subproblems such as bc(3,2) and bc(2,2) are calculated again and again. Our recursive algorithm for this problem solves the same subproblem over and over. These are called overlapping subproblems.

As the two properties required for using Dynamic Programming : optimal substructure and overlapping subproblems are holding here, we can use DP for this problem.

Binomial coefficient : Dynamic Programming Approach

In DP, we start calculating from the bottom and move up towards the final solution. Solution of all subproblems are stored in a 2D array / DP table so that they can be reused when required.

Binomial coefficient with dynamic programming C++
long long int findBinomialCoeff (int n, int k){
    long long int DP_table[n+1][k+1];
    for(int i=0; i&lt;=n; ++i){
        for(int j=0; j &lt;= min(i,k); ++j){ 
            if (j == 0 || i == j) dp[i][j] = 1;
            else dp[i][j] = dp[i-1][j-1] + dp[i-1][j];
        }
    }
    return dp[n][k];
}

Java implementation is very similar.

Java implementation
import java.util.*;

class BinomialCoefficient{
static int min(int a, int b){
return (a<b) ? a : b;
}

static int findBinomialCoeff (int n, int k){
int dp[][] = new int [n+1][k+1];
for(int i=0; i<=n; ++i){
for(int j=0; j <= min(i,k); ++j){
if (j == 0 || i == j) dp[i][j] = 1;
else dp[i][j] = dp[i-1][j-1] + dp[i-1][j];
}
}
return dp[n][k];
}
[/code]

The time complexity of the dynamic programming approach to implementing the binomial coefficient is O(nk) whereas space complexity is O(nk)
The space complexity can be reduced by storing just the last row of Pascal’s triangle and that too just the (k+1) elements from the last row are required, with this approach instead of a 2D array we can work with a 1D array.

This article is contributed by Bhushan Aggrawal.

Top K Frequent Keywords

Given a list of reviews, a list of keywords and an integer k. Find the most frequent k keywords in order of most to least frequently mentioned.

  • The comparison of strings is case-insensitive.
  • Multiple occurrences of a keyword in a review should be considered as a single mention.
  • If keywords are mentioned an equal number of times in reviews, sort alphabetically.

Example

Input:
k = 2
keywords = ["services", "algorithms", "inteview"]
reviews = [
  "algorithms and Me provides the best services for the interview preparations",
  "Also, their mock interview services awesome services",
  "Best services provided by Algorithms and me, everyone should use their services",
]

Output:
["services", "algorithms"]

Explanation:
"services" is occurring in 3 different reviews, and "algorithms" occurs in 2 reviews. Note that even though "services" occur two times in the same review, it is counted only once.

Rule of thumb, if you have to find top k of anything, give priority queue a thought, and to understand priority queues, it is very important to understand the fundamentals of heaps.

This is a straight forward application of priority queues. All we have to do is count the frequency of each word in the keywords list in the given list of reviews. Once we have the frequency count, use priority queue to find the top K keywords.

Implementation notes
1. Convert the list of keywords into a set, to have an efficient lookup.
2. Split each review into words and check if the word is the keyword set.
3. If yes, increment the frequency count.
4. Put the map of words to frequency into the priority queue and pluck top K words. Adjust the comparator function to have order on the frequency and then alphabetically.

Top K frequent keywords in reviews

public List<String> findKWordsInReviews(int k,
                       List<String>keywords, List<String> reviews) {

        List<String> res = new ArrayList<>();

        //For fast look up. additional space O(n)
        Set<String> set = new HashSet<>(keywords);

        //To store the freq count of keywords
        Map<String, Integer> map = new HashMap<>();

        //Go through each review
        for(String review : reviews) {
            //Split the reviews in words
            String[] words = review .split("\\W");

            //So that we do not count word twice
            Set<String> added = new HashSet<>();
            for(String word : words) {
                //As it is case sensitive
                word = word.toLowerCase();
                //If word is keywords and not yet added
                if(set.contains(word) && !added.contains(word)) {
                    map.put(word, map.getOrDefault(word, 0) + 1);
                    added.add(word);
                }
            }
        }

        //Add the map created into the priority queue.
        Queue<Map.Entry<String, Integer>> maxHeap = new PriorityQueue<>(
                (a, b)->a.getValue() == b.getValue()
                        ? a.getKey().compareTo(b.getKey())
                        : b.getValue() - a.getValue());

        //Add everything to PQ
        maxHeap.addAll(map.entrySet());

        //Take out top k words
        while(!maxHeap.isEmpty() && k-- > 0) {
            res.add(maxHeap.poll().getKey());
        }
        return res;
    }

The complexity of this code depends on the following :
1. No of keywords let’s say is n
2. No of reviews, m
3. K

If we are using max heap approach, building a max heap with n keywords will take O(n), while taking out k words from it, will take O(klogn).

Splitting the reviews into words will take O(m * average length of reviews) with additional space depending on the number of words in reviews.

Critical Connections in a Network

There are n servers numbered from 0 to n-1 connected by undirected server-to-server connections forming a network where connections[i] = [a, b] represents a connection between servers a and b. Any server can reach any other server directly or indirectly through the network.
critical connection is a connection that, if removed, will make some server unable to reach some other server.
For example,
Input: n = 4, connections = [[0,1],[1,2],[2,0],[1,3]] Output: [[1,3]] Explanation: [[3,1]] is also accepted.
critical connections in a network

Before going through this solution, I would strongly recommend reading Tarjan’s algorithm and find bridges in a graph problem.

class Solution {
    int timer;

    List<List<Integer>> g;
      
    boolean [] visited;
    /* This map stores the time when the
    current node is visited
     */
    int [] tin;
    int [] low;
    
    void dfs(int u, int parent, 
                    List<List<Integer>> res ) {
        visited[u] = true;

        //Put the current timer.
        tin[u] = timer;
        //Low is the time of entry to start with
        low[u] = timer;

        timer++;

        /*
            Go through all the neighbors
         */
        for (int to : g.get(u)) {
            //If it is parent, nothing to be done
            if (to == parent) continue;

            /* If the neighbor was already visited
                get the minimum of the neighbor entry time
                or the current low of the node.
             */
            if (visited[to]) {
                low[u] = Math.min(low[u], tin[to]);
            } else {
                //Else do the DFS
                dfs(to, u, res);
                /*
                 Normal edge scenario,
                 take the minimum of low of the parent 
                  and the child.
                 */
                low[u] = Math.min(low[u], low[to]);

                /* If low of the child node is less than
                   time of entry of current node, then
                   there is a bridge.
                 */
                if (low[to] > tin[u]){
                   //List<Integer> l = new ArrayList<>();
                    List<Integer> l = 
                           Arrays.asList(new Integer[]{u, to});
                   res.add(l);
                }
            }
        }


    }

    public void findBridges(List<List<Integer>> res) {
        timer = 0;
        for(int i=0; i<g.size(); i++){
            if (!visited[i])
                dfs(i, -1, res);
        }
    }

  public List<List<Integer>> criticalConnections(int n, 
                             List<List<Integer>> connections) {
      

    g = new ArrayList<>(); 
      
    visited = new boolean[n];
    /* This map stores the time when the
    current node is visited
     */
    tin = new int[n];
    low = new int[n];
    
    Arrays.fill(visited, false);
    Arrays.fill(tin, n);
    Arrays.fill(low, n);
    
    for(int i=0; i<n; i++){
        g.add(new ArrayList<>());
    }
    for(List<Integer> l : connections){
          g.get(l.get(0)).add(l.get(1));  
          g.get(l.get(1)).add(l.get(0));   
      }
      
      List<List<Integer>> res = new ArrayList<>();
      findBridges(res);
    
      return res;
        
    }
}

The complexity of this code is O(V + E) where V is number of vertices and E is number of edges in the graph.

If you want help with interview preparations, please feel free to book a free demo session with us.

Priority Queue problems

A priority queue a data structure that keeps things in order of priority. The question is how is it different from the monotonic queue we discussed earlier. The difference is in the time complexity and applicability. The time complexity to find insert an element in the priority queue is O(logn), whereas in monotonic queues it is O(n). On the point of applicability, priority queue orders the entire input set, whereas monotonic queues maintain only partial input set in a particular order, and whenever an order is violated, we remove elements from the monotonic queues.

Before we do in detail of priority queues, please refresh fundamentals of heaps.

A priority queue is a collection in which items can be added at any time, but the only item that can be removed is the one with the highest priority.

It means that no matter what the insertion order of elements, removal will always give you the highest or lowest priority element. Internally, a priority queue is implemented as heaps. A heap is a data structure that as the following property:

A parent is greater than both its children (max heap) or a parent is less than both its children (min-heap)

How do I identify that it is a priority queue question? Almost all the time, the problem statement will have top-K, smallest-k, most frequent K in it. It means this problem has something to do with ranking and for ranking problems best data structure is a priority queue.

Priority queue in Java

If your choice of language in the interview is Java, you are in luck, because Java has this in build class called priority queue.

PriorityQueue pq = new PriorityQueue();

With the above instantiation, you can define a min priority queue, it means the smallest number will be polled first.

If you want to declare it in such a way that the largest number is polled first, all you have to do is pass a comparator. We can use an in-built function provided by Comparator.

PriorityQueue pq = new PriorityQueue(Comparator.reverseOrder());

What if you are intending to store objects in a priority queue, then this default will not work as they work on integers or long only. This is a point you must read on comparators in Java. You can pass a comparator to the constructor, it will be used by the queue to order objects accordingly.

 PriorityQueue<Map.Entry<Integer, Integer>> pq = new PriorityQueue<>(
            (a, b) -> a.getValue().compareTo(b.getValue()) //Comparator
        );

You will find problems where you have to first find the frequency or priority of elements, we use HashMap to map element to its priority.

Use Map.Entry as an object to be stored in the queue. It helps us avoid creating new objects

Enough of theory, let’s get to the business, how can I solve problems using priority queues?

Top K Frequent Elements

Given a non-empty array of integers, return the k most frequent elements. For example: nums = [1,1,1,2,2,3], k = 2 Output: [1,2]

From the problem statement, it is evident that it is a ranking problem and we know the data structure to use. All we have to do is collect the respective frequency of elements. However, we have two ways to add elements to the priority queue. The first way is to create a map of element and frequency and add the whole map into the priority queue in decreasing order of frequency (max heap). Take out K element from the priority queue and that will be your K top elements.
The time complexity of the solution is O(n) to add all the elements in the heap, and then O(klogn) to take K elements out. Remember, each take-out incurs O(logn) complexity due to heapification process.

The second way is to add only K elements in the priority queue in min heap order. We keep the size of the queue to constant K, if it goes more, we just take our the top, which will be the minimum element. Once we have processed all the elements, elements in the queue will be top K elements
The time complexity of the solution is O(nlogk). Remember, each take-out and insert incurs O(logn) complexity due to heapification process.
Depending on the relationship between n and k, you can chose which method to use.

 public List<Integer> topKFrequent(int[] nums, int k) {
        
        
        Map<Integer, Integer> map = new HashMap<>();
        for(int i : nums){
            map.put(i, map.getOrDefault(i,0) + 1);
        }
        
        PriorityQueue<Map.Entry<Integer, Integer>> pq = new PriorityQueue<>(
            (a, b) -> a.getValue().compareTo(b.getValue())
        );
        /* We chose the second method. apparently,
           it give better performance in leetcode
        */
        for(Map.Entry e : map.entrySet()){
            pq.offer(e);
            if(pq.size() > k){
                pq.poll();
            }
        }
        
        List<Integer> res = new ArrayList<>();
        pq.stream().forEach(e -> res.add(e.getKey()));
        
        return res;
        
    }

Top K Frequent Words

Given a non-empty list of words, return the k most frequent elements. The answer should be sorted by frequency from highest to lowest. If two words have the same frequency, then the word with the lower alphabetical order comes first.

It is more or less the same problem as above, the only change is we have to add comparator to take care of alphabetical order.

    public List<String> topKFrequent(String[] words, int k) {
        
        Map<String, Integer> map = new HashMap<>();
       
        PriorityQueue<Map.Entry<String, Integer>> pq 
            = new PriorityQueue<>(
                       (a,b) -> a.getValue()==b.getValue() 
            ? b.getKey().compareTo(a.getKey()) : a.getValue()-b.getValue()
        );
        
        for(int i=0; i<words.length; i++){
            map.put(words[i], map.getOrDefault(words[i], 0) + 1);
        }
        
        for(Map.Entry<String, Integer> entry: map.entrySet()){
            pq.offer(entry);
            if(pq.size() > k){
                pq.poll();
            }
        }
        
        List<String> res = new LinkedList<>();
        while(!pq.isEmpty())
            res.add(0, pq.poll().getKey());
        
        return res;
    }

Kth Largest Element in an Array

Find the kth largest element in an unsorted array. Note that it is the kth largest element in the sorted order, not the kth distinct element.

class Solution {
    public int findKthLargest(int[] nums, int k) {
        
        PriorityQueue<Integer> pq = new PriorityQueue<>();
        int i = 0;
        for(; i<nums.length; i++){
            pq.add(nums[i]);
            if(pq.size() > k)
                pq.poll();
        }   
        
        return pq.peek();
    }
}

K Closest Points to Origin

We have a list of points on the plane. Find the K closest points to the origin (0, 0). (Here, the distance between two points on a plane is the Euclidean distance.)

    public int[][] kClosest(int[][] points, int K) {
        
        PriorityQueue<int[]> pq = new PriorityQueue<>(
                (a, b) -> ((a[0] - 0 ) * (a[0] - 0 )  + (a[1] - 0 ) * (a[1] - 0 )
                           - (b[0] - 0 ) * (b[0] - 0 ) - (b[1] - 0 ) * (b[1] - 0 ))
        );
            
        for(int i=0; i<points.length; i++){
            pq.add(new int[]{points[i][0], points[i][1]});
        }
            
        int [][] res = new int[K][2];
        for(int i=0; i<K && !pq.isEmpty(); i++){
            res[i] = pq.poll();
        }
            
        return res;
        
    }

Sort Characters By Frequency

Given a string, sort it in decreasing order based on the frequency of characters. For example, S = “tree”, the output should be “eert”

    public String frequencySort(String s) {
        
        PriorityQueue<Map.Entry<Character, Integer>> pq 
            = new PriorityQueue<>((a,b) -> b.getValue() - a.getValue());
        
        Map<Character, Integer> map = new HashMap<>();
       
        for(int i=0; i<s.length(); i++){
            map.put(s.charAt(i), 1 + map.getOrDefault(s.charAt(i), 0));
        }
        
        pq.addAll(map.entrySet());
       
        StringBuilder sb = new StringBuilder();
        while (!pq.isEmpty()) {
            Map.Entry<Character, Integer> e = pq.poll();
            int n = (int)e.getValue();
            for (int i = 0; i < n;  i++) 
                sb.append(e.getKey());
        }
        return sb.toString();
    }

Find median in a stream of integer

This problem is covered in detail here: Median in a stream of integers.

 PriorityQueue<Integer> pqMin;
    PriorityQueue<Integer> pqMax;
    
    /** initialize your data structure here. */
    public MedianFinder() {
         pqMax = new PriorityQueue<>(Collections.reverseOrder());
         pqMin = new PriorityQueue<>();
    }
    
    public void addNum(int num) {
        /* If size of max heap, i.e left bunch of numbers is 
           greater than the min heap, right bunch of numbers,
           add the number in left bunch and take the Max and 
           add it to right bunch */
        if (pqMax.size() > pqMin.size()) {
            pqMax.offer(num);
            pqMin.offer(pqMax.poll());
        } else {
            //Do the opposite.
            pqMin.offer(num);
            pqMax.offer(pqMin.poll());
        }
        
    }
    
    public double findMedian() {
        /*If the size of two heaps is equal, 
           even number of integers, take the average of two
           middle elements */
        if(pqMin.size() == pqMax.size()){
            return (double)(pqMin.peek() + pqMax.peek())/2.0;
        }
        
        //Else return from the left bunch
        return (double)(pqMax.peek() * 1.0);
    }
}

Find K-th Smallest Pair Distance

Given an integer array, return the kth smallest distance among all the pairs. The distance of a pair (A, B) is defined as the absolute difference between A and B.

  private int findKth(int [] a, int k){
        PriorityQueue<Integer> pq = 
                 new PriorityQueue<>(Comparator.reverseOrder());
        
        for(int i=0; i<a.length; i++){
            for(int j=i+1; j<a.length; j++){
                int diff =  Math.abs(a[i]-a[j]);
                pq.offer(diff);
                
                if(pq.size() > k){
                    pq.poll();
                }
            }
        }
        return pq.peek();
    }    

The above solution works but due to the constraints, it may run out of memory and time. A better solution is based on a binary search and trial and error methods.

Merge k Sorted Lists

Merge k sorted linked lists and return it as one sorted list. For example:
Input:
[
1->4->5,
1->3->4,
2->6
] Output: 1->1->2->3->4->4->5->6

private ListNode mergeList(ListNode[] lists){
        
        if(lists.length == 0) return null;
        
        PriorityQueue<ListNode> pq = new PriorityQueue<>(
            (a,b) -> ((Integer)a.val).compareTo(b.val)
        );
        
        for(int i=0; i<lists.length; i++){
            if(lists[i] != null)
                pq.offer(lists[i]);
        }
        
        ListNode head = pq.poll();
        ListNode tail =  head; 
        if(head != null && head.next != null)
            pq.add(head.next);
    
        while(!pq.isEmpty()){
            tail.next = pq.poll();
            tail = tail.next;
            if(tail.next != null)
                pq.offer(tail.next);
            
        }
        
        return head;
    }

As we see from the above code examples, it is easy to identify a priority queue problem and simple to solve. Please share your views and best of luck with your preparations.

Super Egg drop problem

Egg dropping problem goes like given n eggs and f floors, our task is to find the minimum number of trials required to find the highest floor from which it is safe to drop eggs (i.e. the eggs don’t break) considering worst-case scenario and a few conditions:

  • An egg can be reused only if it survives the fall
  • If an egg breaks from the xth floor, it will also break from all floors above it (x+1th floor to fth floor)
  • If an egg survives a fall from the xth floor, it will also survive the falling from all floors below it (1st floor to x-1th floor)

A couple of variations of this problem where the number of eggs and the number of floors is fixed are famous puzzles like 2 eggs and 100 floors puzzle or 2 eggs and 36 floors puzzle. Finding solutions to such puzzles take a more mathematical approach.

Solution
Let’s say we have 1 egg and 4 floors and we need to find the minimum number of trials required to determine the highest floor from which it is safe to drop eggs. In this case, if we try to drop the egg from a random floor, it might break and we would not be able to conduct further trials to find the required floor.
Hence in such a case, we have to be careful and start from the 1st floor and keep checking on each floor above, as soon as the egg breaks on a floor, we would have found the required floor.

In this example let’s say starting from the 1st floor and reaching the 3rd floor (f-1thth floor) we still couldn’t determine the required floor, only after the trial on the 4th floor, we would find the answer. Hence in the worst case, we require 4 trials.

Now instead of 1 egg if we have 2 eggs and 4 floors, will it reduce the minimum number of trials required? As we have an extra egg we can afford to start from any floor.
If we start from the 4th floor, the egg might break or it might not –

  • If it doesn’t break we can say that it won’t break on any of the lower floors and hence we were lucky enough to have found the answer in the first trial itself.
  • If the egg breaks, we now have 1 egg fewer, but we also know the 4th floor is not the required floors and hence we would perform further trials on floors below it only.
  • But now since we have just 1 egg, we again have to be careful and fall back to the strategy of starting trials on the 1st floor and checking each floor one by one until the 3rd floor. Again in the worst case, we will check each floor from 1st to 3rd and in this way, we will need 3 trials for that.

So in total, we need (1 trial on 4th floor) + (3 trials from 1st to 3rd floor) = 4 trials if we start trials from the 4th floor. Also note in this case the worse of the two possibilities is 4 and we will consider the worst-case scenario always.

But what if we choose some other floor?

If we start from the 3rd floor, the egg might break or it might not –

  • If it doesn’t break, we now have to perform trials on floors above the 3rd floor. In this case, there’s only one floor above: 4th floor and we have two eggs. When we have just one floor to check we just have to drop an egg from it and we can determine our answer from just that one trial.
  • So in total we will have to do 2 trials = (1 trial on 3rd floor) + (1 trial on floors above it i.e. 4th floor).

  • If the egg breaks, we now have 1 egg fewer, but we also know that the 3rd floor is not the required floors and even floors above it will also give the same outcomes, and hence we would perform further trials on floors below it only.
  • Now since we have just 1 egg, we again have to be careful and fall back to the strategy of starting trials on the 1st floor and checking each floor one by one until the 2nd floor. Again in the worst case, we will check each floor from 1st to 2nd and in this way, we will need 2 trials for that.

So in total, we need (1 trial on 3rd floor) + (2 trials from 1st to 2nd floor) = 3 trials if we start trials from the 3rd floor. Also note in this case the worse of the two possibilities is 3 and we will consider the worst-case scenario always.

Similarly, we will start trials from 2nd floor –

  • If the egg breaks on the 2nd floor, we just have to check the 1st floor. Hence total trials required are: 2
  • If the egg doesn’t break, then we have to check for the floors above that are 3rd and 4th floor, i.e. we have 2 eggs and 2 floors for trials.

Note that we do not care about which floor we are or we’ll be (either floor above us or below) doing trials on, after the current trial we just consider which direction will we do next trials and how many floors remaining (either ground or terrace) so that we can calculate the number of required trials accordingly.

Using this similar approach we can find that with 2 eggs and 2 floors we’ll need 2 trials in the worst case. Hence total trials required are: (1 trial on 2nd floor) + (2 trials on 2 floors above 2nd floor) = 3 trials.

Also note in this case the worse of the two possibilities is 3 and we will consider the worst-case scenario always.

Similarly when we start trials from 1st floor-

  • If the egg breaks on the 1st floor, we don’t need more trials as we know there’s no safe floor. Hence total trials required are 1.
  • If the egg doesn’t break, then we have to check for the floors above that are 2nd, 3rd and 4th floor, i.e. we have 2 eggs and 3 floors for trials.

Using this similar approach we can find that with 2 eggs and 3 floors we’ll need 2 trials in the worst case. Hence total trials required are: (1 trial on 1st floor) + (2 trials on 3 floors above 1st floor) = 3 trials. The worse of the two possibilities is 3.

So we tried on all 4 floors: except for starting from the 4th floor all others require 3 trials to find the required answer. So our answer for min_total_trials(2 eggs, 4 floors): 3.

Considering the above simulation and our approach we can, in general, say that when we conduct a trial at xth floor, there are two possible outcomes:

The egg breaks or it does not break

In both cases, we have to conduct more trials, but in case if egg breaks at xth floor, we need to conduct trials on floors below x and in case if the egg does not break at xth floor we need to conduct trials on floors above x. Only exceptions to “conducting more trials” statement are:

  • When we are doing a trial on the 1st floor and the egg breaks, we don’t need more trials as we now know that there are no safe floors.
  • When we are doing a trial at the fth floor (i.e. highest floor) and the egg doesn’t break, we don’t need more trials as we now know that all floors are safe.

And obviously, if we run out of eggs, no more trials can be conducted.

Note that while conducting a trial at any floor we need to consider the worst possible scenario meaning that we cannot make assumptions about the outcomes of the trial conducted and we have to consider the worse of the two.
Also from our approach in the above example, one can observe that we have to think less about finding the highest safe floor and more about minimizing the number of trials. We actually don’t have to find the highest safe floor but have to focus on minimizing the number of trials by simulating drops from different floors.

We’ll consider doing trials on all the floors and take the worst case for every floor and find our final answer by getting the minimum of all the trials from all floors we just calculated. We aim to reduce our current problem to sub-problems. This shows this problem has optimal substructure (i.e. an optimal solution can be constructed from optimal solutions of sub-problems).

Also, we can derive a formula for our finalized approach that will give us the minimum number of trials required to find the highest safe floor given n eggs and f floors:

min_total_trials(n, f) = 1 + MIN( MAX(min_total_trials(n-1, x-1), min_total_trials(n, f – x) ) for each x in {1…f} )

We conduct trials on each floor from 1 to f and for each trial, we consider worse (MAX) of the two possible outcomes, and of all those we consider the one that is minimum. You must have observed in our simulation that we added 1 to the outcome from the sub-problem that was for each of the trials that were done on the current floor, that’s the same 1 that we can see in the formula here.

Using this formula and the exceptions for “conducting more trials” described above as base cases we write a recursive implementation:

int min_total_trials(int e, int f)
{
     /* for 0 floors no trials are required
         for 1 floor only 1 trial is required
        if only 1 egg is available then we have to be 
        careful and perform trials on each of the f floors, 
        hence f trials
     */
   if (f == 0 || f == 1 || e == 1) return f;
   int min_trials = INT_MAX, current_floor_min_trials;
   for (int current_floor = 1 ; current_floor <= f ; ++current_floor)
   {	
      current_floor_min_trials = 1 + 
                max( min_total_trials(e-1, current_floor-1), 
                min_total_trials(e, f - current_floor) );
       min_trials = min(min_trials, current_floor_min_trials); 
   }
   return min_trials;
}
 
int main(void) 
{
    int e = 2, f = 4;
    int min_trials = min_total_trials(e, f);
    printf("%d", min_trials);
    return 0;
}
package com.company;
import java.io.*;
 
class MinTrials
{
    static int max(int a, int b) { return (a > b)? a: b; } 
    static int min(int a, int b) { return (a < b)? a: b; } 
    
    static int min_total_trials(int e, int f)
    {
         /* for 0 floors no trials are required
            for 1 floor only 1 trial is required
            if only 1 egg is available then we have to 
            be careful and perform trials on each of
            the f floors, hence f trials
         */
        if (f == 0 || f == 1 || e == 1) return f;
       int min_trials = Integer.MAX_VALUE, current_floor_min_trials;
       for (int current_floor = 1 ; current_floor <= f ; ++current_floor)
       {
          current_floor_min_trials = 1 
                 + max( min_total_trials(e-1, current_floor-1), 
                        min_total_trials(e, f - current_floor) );
           min_trials = min(min_trials, current_floor_min_trials); 
       }
       return min_trials;
    }
 
	public static void main (String[] args) 
	{
		int e = 2, f = 4;
        System.out.println(min_total_trials(e, f));
    }
}

This implementation takes more than 3 secs for e = 2 and f = 28, and the execution time increases exponentially with increasing inputs.
To find the reason behind such high runtime let’s have a look at the recursive function call tree:
egg drop problem dynamic programming

As we can see while calculating min_total_trials(2,4), min_total_trials(2,2) and min_total_trials(2,2) are calculated repeated.
This means recursive solution solves the same sub-problem over and over rather than always generating new sub-problems. These are called overlapping sub-problems.
As the two properties required for using Dynamic Programming: optimal substructure and overlapping sub-problems hold, we can use dynamic programming for this problem.
Let’s look at the DP solution for this problem: We’ll use a 2D array / DP table to store solutions to the sub-problems and reuse it when required. In DP when moving from bottom-up we will find a solution for each of the sub-problems where-

e = 1 and f = 1, e = 1 and f = 2, e = 1 and f = 3,….. e = 1 and f = F
e = 2 and f = 1, e = 2 and f = 2, e = 2 and f = 3,….. e = 2 and f = F
.
.
.
e = E and f = 1, e = E and f = 2, e = E and f = 3,…… e = E and f = F

DP_table[E][F] will give the final answer for ‘E’ eggs and ‘F’ floors.

int min_total_trials(int e, int f)
{
    int DP_table[e+1][f+1] = {};
     for(int i=1; i<=e; ++i)
    {
         /* for 0 floors no trials are required */
         DP_table[i][0] = 0; 
         /* for 1 floor only 1 trial is required */    
         DP_table[i][1] = 1;    
    }
     for(int i=1; i <= f; ++i)
    {
          /* if only 1 egg is available then we have to 
          be careful and perform trials on each of 
         the f floors, hence f trials */
         DP_table[1][i] = i;
    }
 
     for(int i=2; i <= e; ++i)
    {
          for(int j=2; j <= f; ++j)
         {
             int min_trials = INT_MAX, current_floor_min_trials;
             for (int current_floor=1 ; current_floor<=j; ++current_floor)
             {
                  current_floor_min_trials = 1 
                        + max( DP_table[i-1][current_floor-1], 
                               DP_table[i][j - current_floor] );
                 min_trials = min(min_trials, current_floor_min_trials); 
              }
             DP_table[i][j] = min_trials;
         }
    }
    return DP_table[e][f];
}

int main(void) 
{
    int e = 2, f = 100;
    int min_trials = min_total_trials(e, f);
    printf("%d", min_trials);
    return 0;
}

Javacode

package com.company;	
import java.io.*;
 
class MinTrials
{
    static int max(int a, int b) { return (a > b)? a: b; } 
    static int min(int a, int b) { return (a < b)? a: b; } 
    
    static int min_total_trials(int e, int f)
    {
        int DP_table[][] = new int[e+1][f+1];
         for(int i=1; i<=e; ++i)
        {
             DP_table[i][0] = 0;
             DP_table[i][1] = 1;
        }
         for(int i=1; i <= f; ++i)
        {
             DP_table[1][i] = i;
        }
    
         for(int i=2; i <= e; ++i)
        {
              for(int j=2; j <= f; ++j)
             {
                 int min_trials = Integer.MAX_VALUE, current_floor_min_trials;
                 for (int current_floor=1; current_floor<=j; ++current_floor)
                 {
                      current_floor_min_trials = 1 
                         + max( DP_table[i-1][current_floor-1], 
                                DP_table[i][j - current_floor] );
                     min_trials = min(min_trials, current_floor_min_trials); 
                  }
                 DP_table[i][j] = min_trials;
             }
        }
        return DP_table[e][f];
    }
 
      public static void main (String[] args) 
      {
            int n = 2, f = 100;
            System.out.println(min_total_trials(n, f));
    }
}

Time complexity of DP solution is O(n * f * f) Space complexity of DP solution is O(n * f) where n is no. of eggs and f is no. of floors.

Even though this works for small input, above implementation times out when you run it at Leetcode. We need to change our thinking process. What if we do not think floors and eggs together and think differently in terms of moves and the number of eggs. How about we find that give m moves and k eggs, how many floors can we test?

If we have 1 egg and 1 move only, then we can test only 1 floor. Imagine we are at floor at X, and we have k eggs and m moves allowed. If we make a move on floor x, then there are two possibilities, either the egg will break or it will survive. If the egg breaks, then we are left with k-1 eggs and m-1 moves, this means with k eggs and m moves, we cannot find certainly the floor greater than dp[k-1][m-1]+1. If it does not break, then we have m-1 moves, but still, k eggs remaining, then we can test dp[k][m-1] floors.
If the egg doesn’t break, the egg which is still the first one can start its second try from x + (x-1)floor. Because, if the current egg breaks when it is dropped from the x + (x-1) floor, the second egg still has x-2 opportunities and it can still try from x+1 floor to x + (x-2) floor. If the second egg doesn’t break again, the third try can start from x + (x-1) + (x-2) floor. Hence, the DP recurrence comes up as

dp[m][k] = dp[k-1][m-1] + 1 + dp[k][m-1]

Below is the implementation.

    public int superEggDrop(int K, int N) {
        int[][] dp = new int[N + 1][K + 1];
        int m = 0;
        while (dp[m][K] < N) {
            ++m;
            for (int k = 1; k <= K; ++k)
                dp[m][k] = dp[m - 1][k - 1] + dp[m - 1][k] + 1;
        }
        return m;
    }

This article is contributed by Bhushan Aggrawal. If you would like to contribute, please reach out to us on [email protected]

Sorting algorithms: Technical interviews must-know

Sorting algorithms are a must-know for any technical interview, be it a startup or a FANG company like Facebook, Amazon. If you fail to explain the sorting algorithm you are using and why are you using it, you can kiss goodbye to your interview. However, the most important aspect of sorting algorithms is their complexity. We will be going through 5 most known algorithms and group them by their complexity. Another aspect that is good know is the stability of the sorting algorithm.

Stability does the sorting algorithm maintain the relative order of elements that are equal?

The first group is of polynomial complexity like O(n2) and second group is linear logarithmic complexity like (n log n). Algorithms like Selection Sort, Bubble Sort, and Insertion sort fall in the first group. Quicksort, Merge sort and heap sort they fall in the second group.

Worst-case complexity of quicksort is O(n2), however, that is limited to very specific cases like already sorted input, on all other cases, if the pivot is selected properly, it will give O(nlogn)

Selection sort

The fundamental of this sort is: Find the minimum element in remaining unsorted array and put it at the current position. We start with index 0 and go all the way to the last index, find the minimum in the remaining array, and put it to the current index.

  public static void selectionSort(int[] arr){  
      for (int i = 0; i < arr.length - 1; i++){  
            int index = i;
            //Select minimum in the remaining array  
            for (int j = i + 1; j < arr.length; j++){  
                if (arr[j] < arr[index]){  
                    index = j;
                }  
            }  
            //Swap the smallest number and current index number
            int smallestNumber = arr[index];   
            arr[index] = arr[i];  
            arr[i] = smallerNumber;  
        }  
    }  

If you are wondering why complexity of selection sort is O(n2) even though inner loop goes from i+1 to end, I would suggest reading this post on calculation of time complexity in for loop. The space complexity is O(1).

Bubble sort

The fundamental of Bubble sort is: Compare two adjacent indices, if they are out of order, put them in order. We start with index 0, compare it with every new neighbor till the end of the array. After each pass, we have sorted one element.

   void bubbleSort(int arr[]) 
    { 
        int n = arr.length; 
        for (int i=0; i < n; i++){
            for (int j = 1; j < n-i; j++){
                if (arr[j-1] > arr[j]){ 
                    // swap arr[j-1] and arr[j] 
                    int temp = arr[j]; 
                    arr[j] = arr[j-1]; 
                    arr[j-1] = temp; 
                } 
             }
        }
    }

The time complexity is obviously O(n2) and the space complexity is O(1). Bubble sort is a stable sorting algorithm.

Insertion sort

The third algorithm in the polynomial complexity class is insertion sort, fundamental of it is: Taken element and put it in its correct position in the array by shifting all the indices on right by 1. The idea to keep the part of the array sorted, examine all the indices, and put them at the correct position in the sorted part of the array.

    void insertionSort(int arr[]){ 
        int n = arr.length; 
        for (int i = 1; i < n; ++i) { 
            int key = arr[i]; 
            int j = i - 1; 
  
            /* Move elements of arr[0..i-1], that are 
               greater than key, to one position ahead 
               of their current position */
            while (j >= 0 && arr[j] > key) { 
                arr[j + 1] = arr[j]; 
                j = j - 1; 
            } 
            arr[j + 1] = key; 
        } 
    } 

Time complexity is O(n2) and space complexity is O(1). Insertion sort is stable sort as well.

Quicksort

The entire idea of quicksort revolves around a pivot. Pivot is an element in input around which input is arranged in such a way that all elements on the left side are smaller and all elements on the right side are greater than the pivot. The question is how to find or select pivot and put it into the correct position.
To put this pivot at the correct position in input, start with the next element of pivot in input space, and find an element that is greater than the pivot. Let that be ith position.

At the same time, start from the end of the array and find an element that is smaller than pivot. Let it be jth position.

If i and j have not crossed each other i.e i < j, then swap element at ith and jth positions, and continue moving right on input to find element greater than pivot and moving left to find element smaller than pivot. Once i and j cross each other, swap pivot with element at jth position.
After this step, the pivot will be at its correct position and the array will be divided into two parts. All elements on the left side will be less than the pivot and all elements on the right side will be greater than the pivot.

Recursively apply quicksort on left and right subarray, until there is only one element to sort.

    private void swap(int [] a, int i, int j){
        int temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }

    private int partition(int [] a, int start, int end){
        int pivot = a[start];
        int i  = start;
        int j  = end;

        while(i < j){
            while(i <= end && a[i] <= pivot) i++;
            while(j >= start && a[j] > pivot) j--;
            
            if(i < j) {
                swap(a, i, j);
            }
        }

        swap(a, start, j);
        return j;
    }

    public void quickSort(int [] a, int start, int end){
        if(start < end){
            int p = partition(a, start, end);
            quickSort(a,start, p-1);
            quickSort(a, p+1, end);
        }
    }

Time complexity of the quicksort implementation depends on the selection of pivot. If pivot splits the original array into two equal parts (which is the intention), the complexity of quicksort is O(n log n) (Master Theorem).

The worst-case complexity of quick sort happens when array is sorted or reverse sorted. Array is partitioned into two subarrays, one with size 1 and other with size n-1. Subarray with n-1 elements, it again is divided into two subarrays of size 1 and n-2. In order to completely sort array, it will split for n-1 times and each time it requires to traverse n element to find the correct position of pivot. Hence overall complexity of quick sort comes out as O(n2).

What is space complexity of this algorithm? There is no apparent memory is used. However, recursive implementation internally puts stack frames on the stack for partitioned indices and function call return addresses and so on. In the worst case, there can be n stack frames, so, the worst-case complexity of quicksort will be O(n).

Merge sort

Merge sort algorithm falls in divide and conquer class of algorithms where input space is divided into subproblems and these subproblems are then solved individually and combined together to solve the original problems.
A question arises is what is the smallest subproblem? The smallest subproblem is where input cannot be further divided, a subproblem with one item to sorted. Once the split is done to the smallest size, we start merging. Going from the bottom, start with two one-item subproblems, combine that to create a two-item subproblem, then combine two two-items subproblems to create a four-item subproblem. Go on till you get a problem with the original size.
In merge operation, scan both sorted arrays one by one element and based on the output of comparison, put one of the two items into output array, till both arrays are scanned.

int merge(int[] a, int start, int mid, int end){
 
    int i,j,k;
    int temp[end-start+1];
 
    i = start; j = mid+1; k=0;
    /* Compare all elements of two sorted arrays and store
      them in sorted array. */
    while(i <= mid && j <= end){
        if(a[i]< a[j]){
            temp[k++]= a[i++];
        }
        else {
            temp[k++]  = a[j++];
        }
    }
    while(i <= mid){
        temp[k++] = a[i++]; 
    }
    while(j <= end){
        temp[k++] = a[j++]; 
    }
    //Copy the temporary array into original array
    for(i=0; i<k; i++){
        a[i+start] = temp[i];
    }
}
int mergeSort(int a[], int start, int end ){
 
    int mid  = (start + end)/2;
    if(start<end){
        //sort the left part
        mergeSort(a, start, mid);
        //sort right part
        mergeSort(a, mid+1, end);
 
        //merge them together
        merge(a, start, mid, end);
    }
}

For n elements in an array, there will be O(logn) such steps. Hence, the complexity of merge sort algorithm is O(nlogn) Refer Master theorem.

Every time we merge, two sub-arrays an auxiliary array is needed to temporarily store array elements and that incurs O(n) space complexity on merge sort algorithm.

This in addition to overhead caused by recursion.

Heap Sort

This one depends on the concept of heaps, which is covered on this website here: Heaps Fundamental is that we put all the elements in a priority queue or heap, min-heap to be precise. Then take element one by one and add back into the array. Using Java this is one of the simplest sorting algorithms, given that we get an inbuilt implementation of the priority queue in java.

    public void heapSort (int[] a){

        PriorityQueue<Integer> pq = Arrays.stream(a)
                .boxed()
                .collect(Collectors.toCollection(PriorityQueue::new));

        int i = 0;
        while(!pq.isEmpty()){
            a[i] = pq.poll();
            i++;
        }

        Arrays.stream(a)
                .forEach(e -> System.out.print(e + " "));
   }

The time complexity of heap sort is O(long) and space complexity is O(n).

There is another class of sorting algorithms that sort input in linear time complexity, but those algorithms are dependent on some conditions to be present in the input, for example for Radix sort, the input must be an integer.

Prepare your sorting algorithm basics and understand time and space complexity and stability. It is very good to know the fundamentals of them, even if you are using the inbuild library or function to do the sorting.

Always account for complexity for sorting algorithm in your solution, most the time that dominates the complexity.

If you are preparing for an interview and need some help with preparation, please reach out to us or book a session with us to understand more.