Partition labels

Given a string S, partition it into as many parts as possible so that each letter appears in at most one part, and return a list of integers representing the size of each part.
For example:

Input: 
S = "ababcbacadefegdehijhklij"
Output:
[9,7,8]
Explanation:
The partition is "ababcbaca", "defegde", "hijhklij".
This is a partition so that each letter appears in at most one part.
A partition like "ababcbacadefegde", "hijhklij" is incorrect, because it splits S into fewer parts.

This problem is commonly asked in Amazon interviews.

Thought process to partition string in labels

The fundamental idea is that all instances of a character must fall in the same partition. So, we start with the first character and see at what point we can close the partition; the earliest we can close it is at the last instance of this character.
What if we find another character between the first character and its last instance? In this case, we may have to extend the partition. Why? If we do not extend it to cover the last instance of the new character, that character ends up split across two partitions, which violates the constraint of the problem.

If we have gone through all the characters between the start of partition and maximum of the last instance of characters, we can close the partition and start a new one.

Let’s take an example and see how it works. Start with the first character of the string and find the last instance of that character; in this case, the first character is a and its last instance is at index 8.

Go through all the characters until that last instance; if the last instance of any of those characters is greater than the current last instance, expand the partition to include the new last instance. In this example, none of the characters in between has its last index beyond the last index of a.

Once we reach that last index, we can close the current partition and start a new one. Find the last instance of the first character of the new partition.


Go through the characters in between. In this example, character e has a last instance greater than the current one, so we extend the partition to the last instance of e.

There is no character in between with a last instance greater than the last instance of e, so we close the partition there and start the next one from the next character in the string, which is h.

The next character i will expand the partition to index 22, and the character after that, j, will push it to 23. There is no other character whose last occurrence is after that index. Hence, we close the partition there.

Whenever we close a partition, we add its length, end - start + 1, to the list that will be returned.

Partition labels implementation

class Solution {
    public List<Integer> partitionLabels(String S) {
        
        Map<Character, Integer> map = new HashMap<>(); 
        
        //Get the last index of each character.
        for(int i=0; i< S.length(); i++){
            map.put(S.charAt(i), i);
        }
        
        int last = 0;
        int start = 0;
        
        List<Integer> list = new ArrayList<>();
        for(int i=0; i< S.length(); i++){  
            
            last = Math.max(last, map.get(S.charAt(i)));
            if(last == i){
                list.add(last - start + 1);
                start = last + 1;
            }
        }
        
        return list;
    }
}

The time complexity of the implementation is O(n); however, note that we scan the string twice. We also need only constant space. Do not get confused by the map: its maximum size is bounded by 52 (uppercase and lowercase characters) irrespective of the size of the string.

Please share if there is something wrong or missing.

Tarjan’s Algorithm to find strongly connected components

Tarjan's algorithm is a very useful algorithm for finding strongly connected components in a directed graph. What is a strongly connected component?

What is Strongly Connected Component?

A directed graph is said to be strongly connected if we can reach any node from every other node in the graph. A strongly connected component of a directed graph is a subset of the nodes in the graph such that any two nodes of this subset are reachable from each other.
For any two nodes u and v in the graph, if they are part of a strongly connected component, there exists a path from u to v and vice versa.

For example, in the graph given below, there are 4 strongly connected components.

There are two well-known algorithms to find strongly connected components: one is Kosaraju's algorithm and the other is Tarjan's algorithm.
Tarjan's algorithm requires only one depth-first search traversal to find all strongly connected components present in the graph. It is more efficient than Kosaraju's algorithm, as the latter requires two DFS traversals.
An important thing to know to understand Tarjan's algorithm is that a depth-first traversal creates a DFS tree; when we traverse a graph, the traversal never visits a node twice.

Tarjan’s algorithm to find Strongly Connected Component?

The fundamental idea is very simple: if from a node u, or from any of its descendants (including u itself), we can reach an ancestor of u, then there must be a cycle containing these nodes, and all of them should be part of the same strongly connected component.

For example, if x is a node of the DFS tree, y is a descendant of x, and there is an edge from y to another node z that is an ancestor of x, then the edge y - z together with the tree paths from z to x and from x to y forms a cycle, and all these nodes must be part of the same component.


The edge from y to z is called a back edge.

How can we know if there is a back-edge from a node to another? We keep track of two things for each node v of the graph:

tin[v]: the time at which node v is entered during the DFS, i.e., the number of nodes visited before v.
low[v]: the smallest tin[x] of any node x reachable from the subtree of v using at most one back edge.

In the simplest way, tin[v] can be calculated as the in-time (entry time) of the node in the depth-first traversal.


The interesting part is how to calculate low[u]. To start with, low[u] is tin[u] of the node u itself. Going forward, for each node u, go through its neighbors:
1. If the neighbor v is not already visited, then (u, v) is a tree edge, not a back edge. We go ahead and do the DFS on v. Once that DFS is finished, we update low[u] as follows:

low[u] = min(low[u], low[v]).

2. If the node v is already visited, then (u, v) is a back edge from u to an ancestor v. In that case, low[u] is updated as follows:

low[u] = min(low[u], tin[v])


For a node v, if tin[v] == low[v], then v is the root of a strongly connected component. If you want to print the nodes of each strongly connected component, keep track of them on a stack. As soon as we get a node v with low[v] == tin[v], pop all the nodes from the stack until we pop v itself; all of these nodes belong to the same SCC.

If low[v] > tin[u], we can say that nodes u and v belong to different SCCs and the edge (u, v) is a bridge.

Tarjan's algorithm implementation

package AlgorithmsAndMe;

import java.util.*;

public class TarjansAlgorithm {

    Set<Integer> visited = new HashSet<>();
    /* tin[v] stores the time (entry order) at which
       node v is first visited during DFS.
     */
    int [] tin;
    /*
      low will store minimum on
       tin[v]
       tin[p] for all p for which (v,p) is a back edge
       low[to] for all to for which (v,to) is a tree edge
     */
    int [] low;

    //To maintain monotonic increasing order.
    int timer;

    void dfs(Graph g, int u, int parent, Stack<Integer> stack) {
        visited.add(u);

        //Put the current timer.
        tin[u] = timer;
        //Low is the time of entry to start with
        low[u] = timer;

        stack.push(u);
        timer++;

        /*
            Go through all the neighbors
         */
        for (int to : g.getNeighbors(u)) {
            //If it is parent, nothing to be done
            if (to == parent) continue;

            /* If the neighbor was already visited
                get the minimum of the neighbor entry time
                or the current low of the node.
             */
            if (visited.contains(to)) {
                low[u] = Math.min(low[u], tin[to]);
            } else {
                //Else do the DFS
                dfs(g, to, u,stack);
                /*
                 Normal edge scenario,
                 take the minimum of low of the parent and the child.
                 */
                low[u] = Math.min(low[u], low[to]);
            }
        }

        if (low[u] == tin[u]){
            //Pop and print the current strongly connected component.
            while(!stack.isEmpty() && stack.peek() != u){
                System.out.print(stack.pop() + " ");
            }
            System.out.print(u + " ");
            stack.pop();
            System.out.println();
        }
    }

    public void findConnectedComponents(Graph g) {
        timer = 0;
        Stack<Integer> stack = new Stack<>();
        Iterator it = g.getNodes().iterator();

        tin = new int[g.getSize() + 1];
        low = new int[g.getSize() + 1];
        Arrays.fill(tin, Integer.MAX_VALUE);
        Arrays.fill(low, Integer.MAX_VALUE);

        while(it.hasNext()){
            int i = (int) it.next();
            if (!visited.contains(i))
                dfs(g, i, -1, stack);
        }
    }
}

As we can see from the code, we need only one DFS traversal; therefore, the complexity of the algorithm is O(|V| + |E|), where |V| and |E| are the total numbers of vertices and edges present in the graph.

Problems solved using this algorithm

1. Critical Connections in a Network.

Please write to us if you need coaching for your interview preparations.

Top K Frequent Keywords

Given a list of reviews, a list of keywords and an integer k. Find the most frequent k keywords in order of most to least frequently mentioned.

  • The comparison of strings is case-insensitive.
  • Multiple occurrences of a keyword in a review should be considered as a single mention.
  • If keywords are mentioned an equal number of times in reviews, sort alphabetically.

Example

Input:
k = 2
keywords = ["services", "algorithms", "inteview"]
reviews = [
  "algorithms and Me provides the best services for the interview preparations",
  "Also, their mock interview services awesome services",
  "Best services provided by Algorithms and me, everyone should use their services",
]

Output:
["services", "algorithms"]

Explanation:
"services" is occurring in 3 different reviews, and "algorithms" occurs in 2 reviews. Note that even though "services" occur two times in the same review, it is counted only once.

Rule of thumb: if you have to find the top K of anything, give the priority queue a thought, and to understand priority queues, it is very important to understand the fundamentals of heaps.

This is a straightforward application of priority queues. All we have to do is count the frequency of each word from the keywords list in the given list of reviews. Once we have the frequency count, use a priority queue to find the top K keywords.

Implementation notes
1. Convert the list of keywords into a set to have an efficient lookup.
2. Split each review into words and check if the word is in the keyword set.
3. If yes, increment the frequency count.
4. Put the map of words to frequencies into a priority queue and poll the top K words. Adjust the comparator function to order first by frequency and then alphabetically.

Top K frequent keywords in reviews

public List<String> findKWordsInReviews(int k,
                       List<String> keywords, List<String> reviews) {

        List<String> res = new ArrayList<>();

        //For fast look up. additional space O(n)
        Set<String> set = new HashSet<>(keywords);

        //To store the freq count of keywords
        Map<String, Integer> map = new HashMap<>();

        //Go through each review
        for(String review : reviews) {
            //Split the reviews in words
            String[] words = review.split("\\W+");

            //So that we do not count word twice
            Set<String> added = new HashSet<>();
            for(String word : words) {
                //Comparison is case-insensitive, so normalize to lowercase
                word = word.toLowerCase();
                //If word is keywords and not yet added
                if(set.contains(word) && !added.contains(word)) {
                    map.put(word, map.getOrDefault(word, 0) + 1);
                    added.add(word);
                }
            }
        }

        //Add the map created into the priority queue.
        Queue<Map.Entry<String, Integer>> maxHeap = new PriorityQueue<>(
                (a, b)->a.getValue().equals(b.getValue())
                        ? a.getKey().compareTo(b.getKey())
                        : b.getValue() - a.getValue());

        //Add everything to PQ
        maxHeap.addAll(map.entrySet());

        //Take out top k words
        while(!maxHeap.isEmpty() && k-- > 0) {
            res.add(maxHeap.poll().getKey());
        }
        return res;
    }

The complexity of this code depends on the following:
1. The number of keywords, say n
2. The number of reviews, m
3. K

If we are using the max heap approach, building a max heap with n keywords will take O(n), while taking out k words from it will take O(k log n).

Splitting the reviews into words will take O(m * average length of reviews) with additional space depending on the number of words in reviews.

Critical Connections in a Network

There are n servers numbered from 0 to n-1 connected by undirected server-to-server connections forming a network where connections[i] = [a, b] represents a connection between servers a and b. Any server can reach any other server directly or indirectly through the network.
A critical connection is a connection that, if removed, will make some server unable to reach some other server.
For example,
Input: n = 4, connections = [[0,1],[1,2],[2,0],[1,3]]
Output: [[1,3]]
Explanation: [[3,1]] is also accepted.

Before going through this solution, I would strongly recommend reading about Tarjan's algorithm and the problem of finding bridges in a graph.

class Solution {
    int timer;

    List<List<Integer>> g;
      
    boolean [] visited;
    /* tin[v] stores the time at which node v
       is first visited during DFS.
     */
    int [] tin;
    int [] low;
    
    void dfs(int u, int parent, 
                    List<List<Integer>> res ) {
        visited[u] = true;

        //Put the current timer.
        tin[u] = timer;
        //Low is the time of entry to start with
        low[u] = timer;

        timer++;

        /*
            Go through all the neighbors
         */
        for (int to : g.get(u)) {
            //If it is parent, nothing to be done
            if (to == parent) continue;

            /* If the neighbor was already visited
                get the minimum of the neighbor entry time
                or the current low of the node.
             */
            if (visited[to]) {
                low[u] = Math.min(low[u], tin[to]);
            } else {
                //Else do the DFS
                dfs(to, u, res);
                /*
                 Normal edge scenario,
                 take the minimum of low of the parent 
                  and the child.
                 */
                low[u] = Math.min(low[u], low[to]);

                /* If the low of the child node is greater than
                   the entry time of the current node, the child
                   cannot reach u or any ancestor of u without
                   this edge, so (u, to) is a bridge.
                 */
                if (low[to] > tin[u]){
                    List<Integer> l =
                           Arrays.asList(new Integer[]{u, to});
                    res.add(l);
                }
            }
        }


    }

    public void findBridges(List<List<Integer>> res) {
        timer = 0;
        for(int i=0; i<g.size(); i++){
            if (!visited[i])
                dfs(i, -1, res);
        }
    }

  public List<List<Integer>> criticalConnections(int n, 
                             List<List<Integer>> connections) {
      

    g = new ArrayList<>(); 
      
    visited = new boolean[n];
    /* tin[v] stores the entry time of node v. */
    tin = new int[n];
    low = new int[n];
    
    Arrays.fill(visited, false);
    Arrays.fill(tin, n);
    Arrays.fill(low, n);
    
    for(int i=0; i<n; i++){
        g.add(new ArrayList<>());
    }
    for(List<Integer> l : connections){
          g.get(l.get(0)).add(l.get(1));  
          g.get(l.get(1)).add(l.get(0));   
      }
      
      List<List<Integer>> res = new ArrayList<>();
      findBridges(res);
    
      return res;
        
    }
}

The complexity of this code is O(V + E) where V is number of vertices and E is number of edges in the graph.

If you want help with interview preparations, please feel free to book a free demo session with us.

Priority Queue problems

A priority queue is a data structure that keeps things in order of priority. The question is how it differs from the monotonic queue we discussed earlier. The difference is in time complexity and applicability. The time complexity to insert an element into a priority queue is O(log n), whereas in a monotonic queue it can be O(n) in the worst case. On the point of applicability, a priority queue orders the entire input set, whereas a monotonic queue maintains only a partial input set in a particular order, and whenever the order is violated, we remove elements from the monotonic queue.

Before we go into the details of priority queues, please refresh the fundamentals of heaps.

A priority queue is a collection in which items can be added at any time, but the only item that can be removed is the one with the highest priority.

It means that no matter what the insertion order of elements is, removal will always give you the highest or lowest priority element. Internally, a priority queue is implemented as a heap. A heap is a data structure that has the following property:

A parent is greater than both its children (max heap) or a parent is less than both its children (min-heap)

How do I identify that it is a priority queue question? Almost all the time, the problem statement will have top K, smallest K, or most frequent K in it. It means the problem has something to do with ranking, and for ranking problems the best data structure is a priority queue.

Priority queue in Java

If your choice of language in the interview is Java, you are in luck, because Java has a built-in class called PriorityQueue.

PriorityQueue pq = new PriorityQueue();

With the above instantiation, you get a min priority queue; it means the smallest number will be polled first.

If you want to declare it in such a way that the largest number is polled first, all you have to do is pass a comparator. We can use the built-in comparator provided by Comparator.reverseOrder().

PriorityQueue pq = new PriorityQueue(Comparator.reverseOrder());

What if you intend to store arbitrary objects in a priority queue? The default ordering will not work unless the objects have a natural ordering, which is why you must read up on comparators in Java. You can pass a comparator to the constructor, and it will be used by the queue to order the objects accordingly.

 PriorityQueue<Map.Entry<Integer, Integer>> pq = new PriorityQueue<>(
            (a, b) -> a.getValue().compareTo(b.getValue()) //Comparator
        );

You will find problems where you have to first find the frequency or priority of elements; we use a HashMap to map each element to its frequency or priority.

Use Map.Entry as the object to be stored in the queue. It helps us avoid creating new wrapper objects.

Enough of theory, let’s get to the business, how can I solve problems using priority queues?

Top K Frequent Elements

Given a non-empty array of integers, return the k most frequent elements. For example:
nums = [1,1,1,2,2,3], k = 2
Output: [1,2]

From the problem statement, it is evident that it is a ranking problem, and we know the data structure to use. All we have to do is collect the frequency of each element. However, we have two ways to use the priority queue. The first way is to create a map of element to frequency and add all the entries into a priority queue ordered by decreasing frequency (max heap). Take K elements out of the priority queue and those are your top K elements.
The time complexity of this solution is O(n) to build the heap with all the elements, and then O(k log n) to take K elements out. Remember, each take-out incurs O(log n) complexity due to the heapification process.

The second way is to keep only K elements in a priority queue in min heap order. We keep the size of the queue at a constant K; if it grows beyond K, we just take out the top, which is the minimum element. Once we have processed all the elements, the elements in the queue are the top K elements.
The time complexity of this solution is O(n log k). Remember, each take-out and insert incurs O(log k) complexity due to the heapification process.
Depending on the relationship between n and k, you can choose which method to use. The solution below uses the second method; a sketch of the first method follows it.

 public List<Integer> topKFrequent(int[] nums, int k) {
        
        
        Map<Integer, Integer> map = new HashMap<>();
        for(int i : nums){
            map.put(i, map.getOrDefault(i,0) + 1);
        }
        
        PriorityQueue<Map.Entry<Integer, Integer>> pq = new PriorityQueue<>(
            (a, b) -> a.getValue().compareTo(b.getValue())
        );
        /* We chose the second method (min heap of size k);
           apparently, it gives better performance on leetcode.
        */
        for(Map.Entry<Integer, Integer> e : map.entrySet()){
            pq.offer(e);
            if(pq.size() > k){
                pq.poll();
            }
        }
        
        List<Integer> res = new ArrayList<>();
        pq.stream().forEach(e -> res.add(e.getKey()));
        
        return res;
        
    }
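
For comparison, here is a minimal sketch of the first approach: build a max heap over all map entries and poll K of them. The method name topKFrequentMaxHeap is just illustrative, and the usual java.util classes are assumed.

    public List<Integer> topKFrequentMaxHeap(int[] nums, int k) {

        //Count the frequency of each element.
        Map<Integer, Integer> map = new HashMap<>();
        for(int i : nums){
            map.put(i, map.getOrDefault(i, 0) + 1);
        }

        //Max heap ordered by frequency, highest first.
        PriorityQueue<Map.Entry<Integer, Integer>> pq = new PriorityQueue<>(
            (a, b) -> b.getValue() - a.getValue()
        );
        pq.addAll(map.entrySet());

        //Poll the k most frequent elements.
        List<Integer> res = new ArrayList<>();
        while(!pq.isEmpty() && k-- > 0){
            res.add(pq.poll().getKey());
        }
        return res;
    }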

Top K Frequent Words

Given a non-empty list of words, return the k most frequent words. The answer should be sorted by frequency from highest to lowest. If two words have the same frequency, then the word with the lower alphabetical order comes first.

It is more or less the same problem as above; the only change is that we have to adjust the comparator to take care of the alphabetical order.

    public List<String> topKFrequent(String[] words, int k) {
        
        Map<String, Integer> map = new HashMap<>();
       
        PriorityQueue<Map.Entry<String, Integer>> pq 
            = new PriorityQueue<>(
                       (a,b) -> a.getValue().equals(b.getValue())
            ? b.getKey().compareTo(a.getKey()) : a.getValue()-b.getValue()
        );
        
        for(int i=0; i<words.length; i++){
            map.put(words[i], map.getOrDefault(words[i], 0) + 1);
        }
        
        for(Map.Entry<String, Integer> entry: map.entrySet()){
            pq.offer(entry);
            if(pq.size() > k){
                pq.poll();
            }
        }
        
        List<String> res = new LinkedList<>();
        while(!pq.isEmpty())
            res.add(0, pq.poll().getKey());
        
        return res;
    }

Kth Largest Element in an Array

Find the kth largest element in an unsorted array. Note that it is the kth largest element in the sorted order, not the kth distinct element.

class Solution {
    public int findKthLargest(int[] nums, int k) {
        
        PriorityQueue<Integer> pq = new PriorityQueue<>();
        int i = 0;
        for(; i<nums.length; i++){
            pq.add(nums[i]);
            if(pq.size() > k)
                pq.poll();
        }   
        
        return pq.peek();
    }
}

K Closest Points to Origin

We have a list of points on the plane. Find the K closest points to the origin (0, 0). (Here, the distance between two points on a plane is the Euclidean distance.)

    public int[][] kClosest(int[][] points, int K) {
        
        PriorityQueue<int[]> pq = new PriorityQueue<>(
                (a, b) -> (a[0] * a[0] + a[1] * a[1])
                        - (b[0] * b[0] + b[1] * b[1])
        );
            
        for(int i=0; i<points.length; i++){
            pq.add(new int[]{points[i][0], points[i][1]});
        }
            
        int [][] res = new int[K][2];
        for(int i=0; i<K && !pq.isEmpty(); i++){
            res[i] = pq.poll();
        }
            
        return res;
        
    }

Sort Characters By Frequency

Given a string, sort it in decreasing order based on the frequency of characters. For example, S = “tree”, the output should be “eert”

    public String frequencySort(String s) {
        
        PriorityQueue<Map.Entry<Character, Integer>> pq 
            = new PriorityQueue<>((a,b) -> b.getValue() - a.getValue());
        
        Map<Character, Integer> map = new HashMap<>();
       
        for(int i=0; i<s.length(); i++){
            map.put(s.charAt(i), 1 + map.getOrDefault(s.charAt(i), 0));
        }
        
        pq.addAll(map.entrySet());
       
        StringBuilder sb = new StringBuilder();
        while (!pq.isEmpty()) {
            Map.Entry<Character, Integer> e = pq.poll();
            int n = (int)e.getValue();
            for (int i = 0; i < n;  i++) 
                sb.append(e.getKey());
        }
        return sb.toString();
    }

Find median in a stream of integers

This problem is covered in detail here: Median in a stream of integers.

class MedianFinder {
    PriorityQueue<Integer> pqMin;
    PriorityQueue<Integer> pqMax;
    
    /** initialize your data structure here. */
    public MedianFinder() {
         pqMax = new PriorityQueue<>(Collections.reverseOrder());
         pqMin = new PriorityQueue<>();
    }
    
    public void addNum(int num) {
        /* If size of max heap, i.e left bunch of numbers is 
           greater than the min heap, right bunch of numbers,
           add the number in left bunch and take the Max and 
           add it to right bunch */
        if (pqMax.size() > pqMin.size()) {
            pqMax.offer(num);
            pqMin.offer(pqMax.poll());
        } else {
            //Do the opposite.
            pqMin.offer(num);
            pqMax.offer(pqMin.poll());
        }
        
    }
    
    public double findMedian() {
        /*If the size of two heaps is equal, 
           even number of integers, take the average of two
           middle elements */
        if(pqMin.size() == pqMax.size()){
            return (double)(pqMin.peek() + pqMax.peek())/2.0;
        }
        
        //Else return from the left bunch
        return (double)(pqMax.peek() * 1.0);
    }
}

Find K-th Smallest Pair Distance

Given an integer array, return the kth smallest distance among all the pairs. The distance of a pair (A, B) is defined as the absolute difference between A and B.

  private int findKth(int [] a, int k){
        PriorityQueue<Integer> pq = 
                 new PriorityQueue<>(Comparator.reverseOrder());
        
        for(int i=0; i<a.length; i++){
            for(int j=i+1; j<a.length; j++){
                int diff =  Math.abs(a[i]-a[j]);
                pq.offer(diff);
                
                if(pq.size() > k){
                    pq.poll();
                }
            }
        }
        return pq.peek();
    }    

The above solution works, but given the constraints, it may run out of memory or time. A better solution is based on binary search over the answer: binary search on the candidate distance and count how many pairs fall within it, as sketched below.
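
Here is a minimal sketch of that binary-search idea: sort the array, binary search on the candidate distance, and count pairs within that distance using two pointers. The method names smallestDistancePair and countPairsWithin are ours.

    private int smallestDistancePair(int[] a, int k) {
        Arrays.sort(a);
        int low = 0, high = a[a.length - 1] - a[0];

        /* Binary search on the answer: the smallest distance d
           such that at least k pairs have distance <= d. */
        while (low < high) {
            int mid = low + (high - low) / 2;
            if (countPairsWithin(a, mid) >= k) {
                high = mid;
            } else {
                low = mid + 1;
            }
        }
        return low;
    }

    //Count pairs (i, j), i < j, with a[j] - a[i] <= d using two pointers.
    private int countPairsWithin(int[] a, int d) {
        int count = 0, i = 0;
        for (int j = 0; j < a.length; j++) {
            while (a[j] - a[i] > d) i++;
            count += j - i;
        }
        return count;
    }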

Merge k Sorted Lists

Merge k sorted linked lists and return it as one sorted list. For example:
Input:
[
1->4->5,
1->3->4,
2->6
]
Output: 1->1->2->3->4->4->5->6

private ListNode mergeList(ListNode[] lists){
        
        if(lists.length == 0) return null;
        
        PriorityQueue<ListNode> pq = new PriorityQueue<>(
            (a,b) -> ((Integer)a.val).compareTo(b.val)
        );
        
        for(int i=0; i<lists.length; i++){
            if(lists[i] != null)
                pq.offer(lists[i]);
        }
        
        ListNode head = pq.poll();
        ListNode tail =  head; 
        if(head != null && head.next != null)
            pq.add(head.next);
    
        while(!pq.isEmpty()){
            tail.next = pq.poll();
            tail = tail.next;
            if(tail.next != null)
                pq.offer(tail.next);
            
        }
        
        return head;
    }

As we see from the above code examples, it is easy to identify a priority queue problem and simple to solve. Please share your views and best of luck with your preparations.

Matrix and graph traversals

Today, we will discuss an approach that can be used to solve quite a few problems on matrices. The problem structure is quite similar: you have a matrix where cells represent real-world entities like land and water, oranges, or walls and gates, and we are asked to find something in that matrix. An example problem statement would be the islands problem (taken from leetcode):

Given a 2d grid map of ‘1’s (land) and ‘0’s (water), count the number of islands. An island is surrounded by water and is formed by connecting adjacent lands horizontally or vertically. You may assume all four edges of the grid are all surrounded by water.

In this example, there are three islands.

Most of the time these problems have two entities; in the above example, it is water and land. At first sight, this problem does not give any hint that it is a graph problem. For a problem to be a graph problem, it must have nodes and edges.

What are the nodes and edges of the graph?

One hint lies in the relationship between the cells of the matrix. For example, in the island problem, 0s are water and 1s are land. Land can be connected only to land which is adjacent to it, and a piece of land is isolated if surrounded by water from all four sides.

In this case, nodes seem to be the cells with value 1(land). Most of the time, in these kinds of problems, each cell of the matrix with a certain value is a node. If we have an n x m matrix, we are looking at n x m possible nodes. Also, here be aware that we are not interested in all the cells. Based on the problem statement, we can discard some cells. For example, in the island problem, we are interested in the cells which represent the land.

Now, what will be the edges? In these kinds of problems, a statement is given like an island is surrounded by water and is formed by connecting adjacent lands horizontally or vertically. It means that a cell with value 1 is connected to neighbor cells only if they also contain 1.

Now we know that cells containing 1 in the matrix are nodes of the graph and are connected to neighboring cells if those cells also contain 1. To count the number of islands, all we have to do is find the connected components in the graph. We reduced a matrix problem to a graph problem after all. Should we create the entire graph? Well, no; at any point in time, we do not need the entire graph.

Traversals of graphs like DFS or BFS need a node and its neighbors.

If we are at a particular cell of a matrix, we already know the node. With the problem definition, we can know which other cells can be a neighbor of this node. So, we do not need to actually convert the matrix to a graph but just mimic the getNeighbor(Node u) function of a graph.
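
As a rough sketch, mimicking that getNeighbor function for the island problem could look like the method below; the method name and the use of '1' for land are assumptions matching the problem above, not a fixed API.

    /* Return the valid 4-directional neighbors of cell (i, j)
       that contain '1' (land), without building an explicit graph. */
    private List<int[]> getNeighbors(char[][] grid, int i, int j) {
        int[][] directions = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};
        List<int[]> neighbors = new ArrayList<>();
        for (int[] d : directions) {
            int x = i + d[0], y = j + d[1];
            if (x >= 0 && x < grid.length
                    && y >= 0 && y < grid[0].length
                    && grid[x][y] == '1') {
                neighbors.add(new int[]{x, y});
            }
        }
        return neighbors;
    }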

Number of islands

The problem statement is given above as an example, we already identified the nodes and edges of the graph implicit in the matrix. We also know that to find the number of islands in the matrix, we have to find the number of connected components in the graphs as explained above. Which traversal can we use to find the number of connected components of a graph? It is a depth-first search problem.

  public int numIslands(char[][] grid) {
    int count = 0;
    Set<Integer> visited = new HashSet<>();

    int n = grid.length;
    
    if(n <=0) return 0;
    int m = grid[0].length;

    for(int i=0; i<n; i++){
        for(int j=0; j<m; j++){
            int cell = i * m + j;
            if(!visited.contains(cell) && grid[i][j] == '1'){
                count++;
                dfs(i, j, visited, grid);
            }
        }
    }
    return count;
 }

void dfs(int i, int j, Set<Integer> visited, char[][] a){

    //Check if the node we are traversing is actually a valid node.
    if(i<0 || i>=a.length 
          || j<0 || j>=a[0].length 
          || a[i][j] == '0') return;

    int cell = i * a[0].length + j;
    if(visited.contains(cell)) return;
    visited.add(cell); 

    //Visit neighbors
    dfs(i+1, j, visited, a);
    dfs(i-1, j, visited, a);
    dfs(i, j+1, visited, a);
    dfs(i, j-1, visited, a);
}

Number of closed islands

There is another leetcode problem called Number of closed islands, it goes like:

Given a 2D grid consisting of 0s (land) and 1s (water). An island is a maximal 4-directionally connected group of 0s and a closed island is an island totally (all left, top, right, bottom) surrounded by 1s. Return the number of closed islands.

Again, a similar problem structure: a matrix, cells representing land and water. The only difference is that now we have to find which land pieces are connected to the boundary. Cells with 0s are land, so cells with value 0 are our nodes. We will do a DFS starting from each of these nodes and see if we hit the boundary of the matrix; in that case, none of these nodes can form a closed island.

  int numClosedIslands(int[][] g){
        int count = 0; 
        for(int i = 0;i < g.length;i++){
           for(int j = 0;j < g[0].length;j++){
                /* Our node is all the cells with 
                   value zero, we will do a DFS from each node
                */
                if(g[i][j] == 0){
                       count += dfs(g,i,j);
                }
           }
        }
        return count;
    }

    private int dfs(int[][] g,int row,int col){
         //if we meet the edge return 0;
         if(row < 0 || row >= g.length 
                || col < 0 || col >= g[0].length){
                 return 0;
         }
         //if we meet 1 return 1;
         if(g[row][col] == 1){
             return 1;
         }

         /* This is to make sure that 
             we do not visit same node again and again
         */
         g[row][col] = 1;

         /* Make sure that we do not reach to the boundaries 
         doing DFS. */
         int up = dfs(g,row-1,col);
         int down = dfs(g,row+1,col);
         int left = dfs(g,row,col-1);
         int right = dfs(g,row,col+1);

         //any of the sides meet the edge, is not an island;
         return up & down & left & right;
    }

Walls and gates problem

There is another problem which falls under the same criteria; the problem is known as the walls and gates problem.

Given an m x n 2D grid initialized with these three possible values: -1, a wall or an obstacle; 0, a gate; INF, infinity, meaning an empty room. We use the value 2^31 - 1 = 2147483647 to represent INF, as you may assume that the distance to a gate is less than 2147483647.

Fill each empty room with the distance to its nearest gate. If it is impossible to reach a gate, it should be filled with INF.


In this problem, we have to find the distance of each room from the gates and fill it with the minimum such distance. Question number one: what are our nodes? In this problem, all the cells except the walls are nodes of the graph. Why are walls not nodes? Because one cannot go through an obstacle, it cannot be part of any path between nodes, hence it cannot be considered a node.

We can traverse horizontally and vertically, which means the cells at (i+1, j), (i-1, j), (i, j+1) and (i, j-1) are neighbors of cell (i, j) if they are a room or a gate.

To find the minimum distance from the gates to a room, start a DFS from each gate and update each room's distance with the minimum distance seen so far. All we have to do is keep track of the distance while traversing from a gate.

package AlgorithmsAndMe;

import java.util.HashSet;
import java.util.Set;

public class WallsAndGates {

    public int[][] wallsAndGates(int[][] grid){

        int n = grid.length;
        if(n<=0) return new int[0][0];

        int m = grid[0].length;

        Set<Integer> visited = new HashSet<>();

        for(int i=0; i<n; i++){
            for(int j=0; j<m; j++){
                //This is our gate
                if(grid[i][j] == 0){
                    dfs(i, j, grid, 0, visited);
                }
            }
        }

        return grid;
    }

    private void dfs(int x, int y, int[][] grid,
                     int distance, Set<Integer> visited){

        int cellId = x*grid[0].length + y;
        if(x<0 || x>=grid.length
                || y<0 || y>=grid[0].length
                || grid[x][y] == -1
                || visited.contains(cellId)){
            return;
        }

        //Mark as visited for this round.
        visited.add(cellId);

        //update the distance
        grid[x][y] = Math.min(grid[x][y], distance);

        dfs(x+1, y, grid, distance+1,visited);
        dfs(x-1, y, grid, distance+1, visited);
        dfs(x, y+1, grid, distance+1, visited);
        dfs(x, y-1, grid, distance+1, visited);

        //Mark unvisited for the next round.
        visited.remove(cellId);

    }
}

Surrounded Regions

Given a 2D board containing ‘X’ and ‘O’ (the letter O), capture all regions surrounded by ‘X’. A region is captured by flipping all ‘O’s into ‘X’s in that surrounded region.

In this problem, again, all the cells with 'O' are nodes, and edges are between horizontally and vertically neighboring nodes. To capture all the nodes which are not connected to the edge of the matrix, first find out which nodes actually are connected to the edge and mark them. Once we know all the nodes connected to the edge of the matrix, we can capture the nodes which are not.
To do so, we will start with nodes at the edge of the matrix and find the connected component of it and mark all the nodes in that connected component.

 public void solve(char[][] grid) {
        int n = grid.length;
        if(n<=0) return;

        int m = grid[0].length;

        //Cover the first and last rows.
        for(int j=0; j<m; j++){
            if(grid[0][j] == 'O') dfs(0, j, grid);
            if(grid[n-1][j] == 'O') dfs(n-1, j, grid);
        }

        //Cover the first and last columns.
        for(int i=0; i<n; i++){
            if(grid[i][0] == 'O') dfs(i, 0, grid);
            if(grid[i][m-1] == 'O') dfs(i, m-1, grid);
        }

        for(int i=0; i<n; i++){
            for(int j=0; j<m; j++){
                if(grid[i][j] == 'O') grid[i][j] = 'X';
                if(grid[i][j] == '1') grid[i][j] = 'O';
            }
        }

        return;
    }

    private void dfs(int x, int y, char[][] grid){

        /* If we step outside the grid, return.
           Or if we hit 'X', we cannot move
           further in that direction.
        */
        if(x<0 || x>=grid.length
                || y<0 || y>=grid[0].length
                || grid[x][y] == 'X'){
            return;
        }

        if(grid[x][y] == 'O') {
            grid[x][y] = '1';

            dfs(x + 1, y, grid);
            dfs(x - 1, y, grid);
            dfs(x, y + 1, grid);
            dfs(x, y - 1, grid);
        }
    }  
 

The time complexity of all the solutions mentioned above is O(n * m), where n and m are the numbers of rows and columns of the matrix, because we visit each cell a constant number of times.
Please share if there is something wrong or missing. If you are preparing for an interviews and need help with it, please book a free session with us.

Plus One Linked List

Given a number represented by a linked list, add 1 to that number and return a new linked list. For example, the number 2345 is represented as the linked list 2->3->4->5.


When we add one to 2345 represented by the linked list, the resulting linked list represents 2346, i.e., 2->3->4->6.


Thought process

First of all, think about the case when there is only one node in the linked list; that means the linked list represents a single-digit number. What will we do? We will add 1 to node.val and return a new node with the sum. This is our smallest case; however, there is a catch here as well. What if the single-digit number is 9? In that case, the sum is 10, and since every node contains only a single digit of the number, we can store 0 in the new node. What happens to the 1? We treat that 1 as a carry.
Now, there are no digits left to add the carry to (remember we had a one-node linked list), so we create a new node, put the carry in it, and link this carry node before the node holding 0.


This is a very important case which we learned here: when you have processed all the nodes of the linked list and a carry remains, create a new node and attach it to the head of the list. Most students make a mistake in this case.

How about when there is more than one node in the linked list? In that case, 1 is added to the last node and the carry, if any, is propagated backward till the head. Next question: how can you process the last node of the linked list and then move backward? Well, remember how we print a linked list in reverse order: we go deep into the linked list till the last node and use the unfolding of the recursion to come backward. We can use the same concept. Go to the last node recursively, maintain a carry, and move backward till the head. Once you have processed the head node, check if the carry is non-zero; if it is, create a new node and attach it at the front.


Show me the implementation

package AlgorithmsAndMe;

import java.util.List;

public class PlusOne {
    int carry = 0;


    public ListNode addOneToLinkedList(ListNode head){
        ListNode newList = addOneToLinkedListUtil(head, 1);

        if(carry > 0){
            ListNode newNode = new ListNode(carry);
            newNode.setNext(newList);
            return newNode;
        }
        return newList;
    }

    public ListNode addOneToLinkedListUtil(ListNode head, int val){

        if(head == null) return null;

        if(head.getNext() == null){
            return createNode(head, val);
        }

        ListNode returnedNode = 
             addOneToLinkedListUtil(head.getNext(), val);

        ListNode newNode = createNode(head, carry);
        newNode.setNext(returnedNode);

        return newNode;
    }

    private ListNode createNode(ListNode node, int val){
        int sum = node.getValue() + val;
        int newVal = sum % 10;
        carry = sum / 10;

        return new ListNode(newVal);
    }
}

The complexity of the code is O(n), we will be processing each node at least once. The tricky part is space complexity, as recursion takes implicit stack memory, the space complexity is also O(n).
This code creates a new list. What if a new list does not have to be created and we have to return the same list back? The code below implements plus one such that it modifies and returns the same list.

package AlgorithmsAndMe;

import java.util.List;

public class PlusOne {

    public ListNode addOneToLinkedList2(ListNode head){
        int c = addOneToLinkedListUtil2(head);

        if(c > 0){
            ListNode newNode = new ListNode(c);
            newNode.setNext(head);
            return newNode;
        }
        return head;
    }

    public int addOneToLinkedListUtil2(ListNode head){

        if(head == null) return 1;
        int sum = head.getValue() 
                      + addOneToLinkedListUtil2(head.getNext());
        head.setVal(sum % 10);

        return sum / 10;
    }
}

In the next post, we will look at how to add two numbers represented by two linked lists.

If you are preparing for interview and looking for personalized coaching, please reach out to us on [email protected] or book a free session with us.

Reverse K Nodes In Linked List

Given a singly linked list, reverse every K nodes of the linked list and return the head of the modified linked list.
k is a positive integer.

Good clarifying question: if the number of nodes is not a multiple of k, then the last nodes should remain as they are.
Example:

Input:
List = 1->2->3->4->5
K = 2
Output: 
2->1->4->3->5

Input:
List = 1->2->3->4->5
K = 3
Output: 
3->2->1->4->5

Reverse K nodes in list: thought process

If I have two chunks of k nodes, one after the other, how would I reverse them in chunks of k nodes? Well, I will first reverse the part at the end and get the new head of that reversed part, then I will reverse the first part. Now the most important part: how do I link the first and second parts?
After reversing the first chunk, its original head becomes the last node of that chunk, and this node should link to the head of the (already reversed) second chunk to join the two chunks together.

How can we implement it using recursion? From the above explanation, we know that we have to chunk the linked list into parts of k nodes each till we cannot divide it further. Once we reach the last chunk of k nodes, we apply the method above: reverse the chunk and return the head of the reversed chunk.
Take care of the last chunk, as it may have fewer than k nodes; in that case, we return the head of that chunk without reversing it, as the question demands.

It is very important that you know how to reverse a linked list to solve this problem. If you have doubts, I recommend reading that first.
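
If you need a quick refresher, here is a minimal iterative reversal sketch, assuming a ListNode class with val and next fields as used in the implementation below.

    private ListNode reverseList(ListNode head) {
        ListNode prev = null;
        ListNode current = head;
        while (current != null) {
            ListNode next = current.next; //remember the rest of the list
            current.next = prev;          //reverse the pointer
            prev = current;               //move prev forward
            current = next;               //move current forward
        }
        return prev; //prev is the new head
    }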

Reverse K nodes : Implementation

    public ListNode reverseKGroup(ListNode head, int k) {
        
        //This is head of current chunk
        ListNode currentHead = head;
        
        //Move k nodes;
        ListNode current = head;
        for(int i=1; i< k && current != null; i++){
            current = current.next;   
        }
        
        if(current != null){
            //Reverse the remaining link list and get the head.
            ListNode reverseHead = reverseKGroup(current.next, k);
            
            //Now reverse the current k nodes list.
            current = head;
            ListNode prev = null;
            ListNode next = head;
        
            int count = 1;
            while(current != null && count <= k){
                next = current.next;
                current.next = prev;
                prev = current;
                current = next;
                count++;
            }
        
            //Link the head with the returned head
            currentHead.next = reverseHead;
            /*Return the new head, which is last 
              node of the current of current k nodes */
            return prev;
        }
        //If there are less than k nodes in last chunk
        else return head;
    }

The time complexity of the implementation is O(n) where n is number of nodes in the list.

Swap 2 nodes in a linked list

In this problem, we have to reverse every pair of nodes in the linked list. For example:

Input:
list = 1->2->3->4->5->6
Output:
2->1->4->3->6->5

This problem is a special case of the above problem of reversing K nodes in a linked list, where k is 2; however, it can be implemented in an easier way.

Implementation

    public ListNode swapPairs(ListNode head) {
        if(head == null || head.next == null) return head;
        
        ListNode temp = head.next;
        ListNode next = head.next.next;
        
        //Reversing the pair
        temp.next = head;
        head = temp;

        //Reverse the remaining list.
        temp.next.next = swapPairs(next);
        
        return head;
    }

The time complexity of the implementation is O(n) where n is number of nodes in the list.

Arrays With Elements In The Range 1 to N

When an array has all its elements in the range of 1 to N (where N is the length of the array), we can use the indices to store the ordered state of the elements in the array. This ordered state can, in turn, be used to solve a variety of problems, which we'll explore soon. First, a very simple demonstration of this property.

Here is an array which has unique elements in the range of 1 to N.
Given array (A) : 5,3,1,4,2
Indices:                0,1,2,3,4

Sort in Linear Time

The first use-case of this unique property is being able to sort in O(N) time i.e. a special-case(all unique elements) of the Counting Sort. The crux of this sort is to check whether an element is at its corresponding index and swap it to its correct index if it’s not. Following is a demonstration of this logic:

Given array (A) : 5,3,1,4,2
Indices:                0,1,2,3,4

For each A[i] check if A[A[i] - 1] equals A[i] or not. If they are not equal then swap the element at A[A[i] - 1] with A[i]. Basically the correct value for any index i is when A[i] contains i+1.

In the above case, let’s start with i = 0.

A[A[0] - 1] or A[5-1] or A[4] is 2, and A[0] = 5. This means that A[A[i] - 1] is not equal to A[i], and hence 5 is not in its correct position. So we need to swap in order to put A[0] -> 5 to its correct position, which is index 4, and A[0] will hold 2 after the swap. Similarly, we need to repeat this check & swap for all the elements.

What if we cancel-out the common terms and modify the check from  A[i] != A[A[i] - 1] to i != A[i]-1 ?

Find The Missing Integer

A similar approach can help us find the smallest missing positive-integer in a given array. By smallest missing positive-integer, we just mean the smallest positive integer that does not exist in the given list of numbers. For example: 

Given Array: -2, 3, 0, 1, 3
In the above case, the smallest missing positive integer is 2.

If we were to apply the usual sorting techniques and then scan the array for the smallest positive integer absent, it would imply a time complexity of O(N log N) + O(N). We can definitely do better than this! At first glance, it seems that this problem does not fit the property of elements being in the range of 1 to N, since the numbers in the given array are well outside the range, but to solve this problem we still only need to figure out whether the elements from 1 to N are present in the given array or not.

How do we know whether the given array has elements from 1 to N? We can use the counting-sort discussed earlier to put each element in its “correct position”, i.e index 0 should hold 1, index 1 should hold 2 and so on. The smallest index that does not hold its correct element is the missing integer.

If we sort the given array using counting sort described above, we will get: 1, 0, 3, -2, 3. And the smallest index i to not hold its correct value i.e. i+1 will give us the answer to the smallest missing positive integer. In this case, that index is 1 since it does not hold 2, thus the smallest positive missing integer is 2.
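
Here is a minimal sketch of this idea; the method name firstMissingPositive is ours. It first places elements at their correct indices using the same swap-based placement demonstrated in the snippet at the end of this section, then scans for the first index that does not hold i + 1.

    public int firstMissingPositive(int[] nums) {
        //Place each value v in the range 1..nums.length at index v - 1.
        for (int i = 0; i < nums.length; ) {
            if (nums[i] > 0 && nums[i] <= nums.length
                    && nums[nums[i] - 1] != nums[i]) {
                int swap = nums[i];
                nums[i] = nums[swap - 1];
                nums[swap - 1] = swap;
            } else {
                i++;
            }
        }

        //The first index not holding i + 1 gives the answer.
        for (int i = 0; i < nums.length; i++) {
            if (nums[i] != i + 1) return i + 1;
        }
        return nums.length + 1;
    }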

Find The Duplicate Element

The third use-case of this property is to figure out the duplicate element without using any extra space. We can iterate over the array A and mark the index corresponding to each encountered element as negative, unless it has already been marked negative! For example: if we encounter the value 3 (or -3), we mark A[Abs(3) - 1] as negative; this way, whenever we encounter 3 (or -3) again in any of the A[i], we will know that the value 3 has been visited before, since A[3-1] will be negative.

Given array (A) : 5,3,1,4,3
Indices:                0,1,2,3,4

When we encounter A[0] i.e. 5, we make A[5-1] i.e. A[4] negative, so the array becomes: 
5,3,1,4,-3
Next, we encounter A[1] i.e. 3, we make A[3-1] i.e. A[2] negative, so the array becomes: 
5,3,-1,4,-3
Next, we encounter A[2] i.e. -1, we make A[1-1] i.e. A[0] negative, so the array becomes: 
-5,3,-1,4,-3
Next, we encounter A[3] i.e. 4, we make A[4-1] i.e. A[3] negative, so the array becomes: 
-5,3,-1,-4,-3
Next, we encounter A[4] i.e. -3, we want to make A[3-1] i.e. A[2] negative, but in this case, A[2] is already negative thus we know that A[2] has been visited before! Which means Abs(A[4]) i.e 3 is the duplicate element.
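
A minimal sketch of this negation-marking idea could look like the method below; the method name findDuplicate is ours, and it assumes the values lie in the range 1 to N with at least one duplicate.

    public int findDuplicate(int[] nums) {
        for (int i = 0; i < nums.length; i++) {
            int index = Math.abs(nums[i]) - 1;
            /* If the slot for this value is already negative,
               we have seen the value before: it is the duplicate. */
            if (nums[index] < 0) {
                return Math.abs(nums[i]);
            }
            //Otherwise mark the slot as visited by negating it.
            nums[index] = -nums[index];
        }
        return -1; //no duplicate found
    }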


Here is a snippet to demonstrate the code for sorting an array in linear time as per the above approach. The exact same approach can be used to solve the other two applications i.e. Finding the Duplicate and Finding The Missing Integer.

        int swap=0;

        for(int i = 0; i < nums.length;){
            
            if(nums[i] > 0 && nums[i] <= nums.length) {

                if(nums[nums[i]-1] != nums[i]){                     
                    swap = nums[i];
                    nums[i] = nums[nums[i] - 1];
                    nums[swap - 1] = swap;
                }else{
                    i++;
                }
                
            }else{
                i++;
            }
        }

 

If you are preparing for a technical interview in companies like Amazon, Facebook, etc and want help with preparation, please register for a coaching session with us.

Range sum query- Immutable array


Write a service which, given an integer array, returns the sum of the elements between indices i and j (i ≤ j), inclusive. Example: nums = [-2, 0, 3, -5, 2, -1]
sumRange(0, 2) -> 1
sumRange(2, 5) -> -1
sumRange(0, 5) -> -3

Also, the input set does not change during the calls to the sumRange(i,j).

The brute force solution is to calculate the sum of all the elements from A[i] to A[j] whenever sumRange(i,j) is called. This method has a time complexity of O(n) per query. It is OK to have this solution for a small scale, but as the number of queries goes up, processing all the numbers from i to j becomes inefficient. Also, imagine a case where the array itself is very large; then O(n) complexity for each query will lead to choking of your service.
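
Just to set the baseline, here is a minimal sketch of that brute force approach; the class name NumArrayBruteForce is ours.

class NumArrayBruteForce {
    private final int[] nums;

    public NumArrayBruteForce(int[] nums) {
        this.nums = nums;
    }

    //O(j - i) work per query: add up the elements on every call.
    public int sumRange(int i, int j) {
        int sum = 0;
        for (int idx = i; idx <= j; idx++) {
            sum += nums[idx];
        }
        return sum;
    }
}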

Range sum query- Immutable array : thoughts

There are two hints for optimization in the question: first, the array is immutable, it does not change; second, we have to build a service, which means we have a server with us. These two things allow us to pre-compute and store results even before any query is made.

Now, the question is what do we pre-compute and how do we store it? We can precompute the sum of the elements between every pair of indices i and j and store them in a two-dimensional array, where range[i][j] stores the sum of all the elements between index i and j. It will use O(n^2) additional memory; however, the response time for each sumRange query will be constant. The preprocessing step is also O(n^2).

Can we optimize for space as well? If I know the sum of all the elements from index 0 to index i and sum of all the elements from 0 to j, can I find the sum of all the elements between i and j? Of course, we can do it.

 Sum(i,j) = Sum(0,j) - Sum(0,i) + A[i]. 


However, the integer array is not passed in the query request, so we cannot use A[i] while calculating the sum. Instead, we will use the formula Sum(i,j) = Sum(0,j) - Sum(0,i-1), which is equivalent to the above.

We will pre-calculate the sum of all the elements between index 0 and j for every j, 0 <= j < n.

Implementation

class NumArray {

    int[] rangeSum;
    public NumArray(int[] nums) {
        rangeSum = new int[nums.length];
        
        if(nums.length>0){
            rangeSum[0] = nums[0]; 
            for(int i=1; i<nums.length; i++){
                rangeSum[i] = rangeSum[i-1] + nums[i];
            }
        }
    }
    
    public int sumRange(int i, int j) {
        if(i==0) return rangeSum[j];
        return rangeSum[j] - rangeSum[i-1];
    }
}

Now the preprocessing step is O(n), O(n) additional space is used, and at the same time the query response time is O(1).

Please share if there is anything wrong or missing in the post. If you are preparing for an interview and needs coaching to prepare for it, please book a free demo session with us.