Longest Substring Without Repeating Characters


Given a string, find the longest substring without repeating characters in it. For example, for S = “abcaabaca”, the longest substring without repeating characters is “abc”, which has length 3.

The brute force solution is to scan all substrings of the given string and check which one is the longest while having no repeating characters. A string of size n has n(n+1)/2 non-empty substrings, and checking one of them for unique characters takes up to n comparisons in the worst case. So, the worst case complexity of this algorithm is O(n^3) with additional space of O(n). The code is simple enough.

package com.company;

import java.util.HashMap;

/**
 * Created by sangar on 1.1.18.
 */
public class NonRepeatingCharacters {

    private static boolean allUniqueCharacters(String s, int start, int end) {

        HashMap<Character, Boolean> characters = new HashMap<>();

        for (char c : s.substring(start, end).toCharArray()) {
            if(characters.containsKey(c)) return false;
            characters.put(c, Boolean.TRUE);
        }
        return true;
    }

    private static int longestSubstringWithoutRepeatingCharacters(String s) {
        int len = s.length();
        int maxLength = 0;
          
        for (int i =0; i < len; i++){
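            //End index j is exclusive, so it has to reach len to cover substrings ending at the last character.
            for (int j=i+1; j<=len; j++){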
                int length = j-i;
                if (allUniqueCharacters(s, i, j)){
                    maxLength = Integer.max(maxLength, length);
                }
            }
        }
        return maxLength;
    }

    public static void main(String[] args) {

        String s = "abcdabcbb";
        System.out.println("Longest substring without repeating characters: " +
                longestSubstringWithoutRepeatingCharacters(s));

    }
}

Longest Substring Without Repeating Characters : Sliding window approach

A sliding window is an abstract concept commonly used in array/string problems. A window is a range of elements in an array/string which is defined by start and end indices. A sliding window is a window which “slides” its two boundaries in a certain direction.

In the brute force approach, we repeatedly checked each substring for unique characters. Do we need to check each substring? If a substring s[i,j-1] contains no repeating characters, then while adding the jth character we only need to check whether that character is already present in substring s[i,j-1]. If we scan the substring to ascertain uniqueness of the new character, the complexity of this algorithm is O(n^2).
How about optimizing the scanning part? What if a hash is used to store the characters which are already seen in substring s[i,j-1]? In that case, checking uniqueness of the new character is done in O(1) and the overall algorithm complexity becomes linear.

 public  static int longestSubstringWithoutRepeatingCharacters(String s) {
        int len = s.length();
        HashMap<Character, Boolean> characters = new HashMap<>();

        int maxLength = 0;
        int start = 0;
        int  end = 0;
        while (start < len && end < len) {
            //Check only the last character.
            if(!characters.containsKey(s.charAt(end))){
                characters.put(s.charAt(end), Boolean.TRUE);
                end++;
            }
            else {
                int currentLength = end-start;
                maxLength = Integer.max(maxLength, currentLength);
                //Move start of window one position ahead.
                characters.remove(s.charAt(start));
                start++;
            }
        }
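        //The last window reaches the end of the string and is never measured inside the loop.
        return Integer.max(maxLength, end - start);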
    }

If the character is already present in substring s[i,j-1], it cannot be added to the current substring. Find the length of the substring (j-i) and compare it with the current maximum length. If it is greater, the max length of the longest substring without repeating characters becomes (j-i).
Then move the start of the window one position ahead and check again, until the duplicate has left the window.

Below is an example execution of the above code for the string “abcabcbb”.

Current Character : a
Substring (  ) does not contain a
New length of substring without repeating character : 1
Current Character : b
Substring ( a ) does not contain b
New length of substring without repeating character : 2

Current Character : c
Substring ( ab ) does not contain c
New length of substring without repeating character : 3

Current Character : a
Substring (abc) contains a
Advancing i to 1

Current Character : a
Substring ( bc ) does not contain a
New length of substring without repeating character : 3

Current Character : b
Substring (bca) contains b
Advancing i to 2

Current Character : b
Substring ( ca ) does not contain b
New length of substring without repeating character : 3

Current Character : c
Substring (cab) contains c
Advancing i to 3

Current Character : c
Substring ( ab ) does not contain c
New length of substring without repeating character : 3

Current Character : b
Substring (abc) contains b
Advancing i to 4

Current Character : b
Substring (bc) contains b
Advancing i to 5

Current Character : b
Substring ( c ) does not contain b
New length of substring without repeating character : 3

Current Character : b
Substring (cb) contains b
Advancing i to 6

Current Character : b
Substring (b) contains b
Advancing i to 7

Current Character : b
Substring (  ) does not contain b
New length of substring without repeating character : 3

Longest substring without repeating characters : 3

There is a small optimization which helps us skip more characters when a repeating character is found, instead of skipping one at a time. Store the index of each character seen in substring [i,j-1]. While processing the jth character, if it is already in the hash, we know the index j’ where that character last occurred in the string. No substring which contains both j’ and j can have all unique characters. So, if j’ lies inside the current window, we skip all indices from i to j’ and start from j’+1 instead of i+1 as in the above method.

  public static int longestSubstringWithoutRepeatingCharacters3(String s) {
        int len = s.length();
        HashMap<Character, Integer> characters = new HashMap<>();

        int maxLength = 0;

        for (int start=0, end = 0; end <len; end++) {
            if (characters.containsKey(s.charAt(end))) {
                //find the index of duplicate character.
                int currentIndex = characters.get(s.charAt(end));
                //Jump past the duplicate, but never move start backwards.
                start = Integer.max(currentIndex + 1, start);
            }
            //Window [start, end] is inclusive on both ends.
            int currentLength = end - start + 1;
            maxLength = Integer.max(maxLength, currentLength);
            //Update new location of duplicate character
            characters.put(s.charAt(end), end );
        }
        return maxLength;
    }

The complexity of finding the longest substring without repeating characters is hence O(n), with additional space complexity of O(n).
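The methods above return only the length, while the problem statement asks for the substring. Below is a minimal sketch of how the same index-tracking idea could return the substring itself. The class and method names (LongestUniqueSubstringSketch, longestUniqueSubstring) are made up for this illustration, and when there are several longest substrings it simply returns the first one.

```java
import java.util.HashMap;
import java.util.Map;

class LongestUniqueSubstringSketch {

    // Returns the first longest substring of s that has no repeated character.
    static String longestUniqueSubstring(String s) {
        // Last index at which each character was seen.
        Map<Character, Integer> lastSeen = new HashMap<>();
        int bestStart = 0;
        int bestLength = 0;

        for (int start = 0, end = 0; end < s.length(); end++) {
            char c = s.charAt(end);
            Integer previous = lastSeen.get(c);
            // Only a duplicate inside the current window forces the window to shrink.
            if (previous != null && previous >= start) {
                start = previous + 1;
            }
            lastSeen.put(c, end);

            int length = end - start + 1;
            if (length > bestLength) {
                bestLength = length;
                bestStart = start;
            }
        }
        return s.substring(bestStart, bestStart + bestLength);
    }

    public static void main(String[] args) {
        System.out.println(longestUniqueSubstring("abcaabaca")); // abc
        System.out.println(longestUniqueSubstring("abcdabcbb")); // abcd
    }
}
```

Remembering where the best window starts costs two extra integers, so the O(n) time and O(n) space bounds stay the same.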

Merge overlapping intervals


Given N intervals S = {E_1, E_2, …, E_n}, where each E_i has a start time s_i and an end time e_i, merge all overlapping intervals. Just to clarify, when E_i starts no later than E_j, the two overlap if the start time of E_j, i.e. s_j, is less than the end time of E_i, i.e. e_i. For example, [(1,3),(2,4),(5,8),(6,9)] should transform into [(1,4),(5,9)], as intervals (1,3) and (2,4) overlap and intervals (5,8) and (6,9) also overlap.


Merge overlapping intervals  : Thought process

As we always do, first try to come up with a brute force solution: given enough time and space and money, how would you solve this?
The natural course is to take the ith interval and compare the start time of every jth interval with the end time of the ith. If the start time of the jth interval is less than the end time of the ith interval, you can merge the two intervals. What should the end time of the merged interval be? It should be the maximum of the end times of the two merged intervals.

What is the time complexity of this approach? We are not using any additional space; however, the worst case time complexity is O(n^2). Can we do better?

What are the two times we are comparing in the brute force solution? It’s the start time of one interval with the end time of another. If we arrange the input in a specific order, can we avoid processing some entries?

If we sort all intervals by their start time, then s_i ≤ s_{i+1} ≤ s_{i+2}. Also, an interval is always forward looking: e_i > s_i, e_{i+1} > s_{i+1} and so on.

If s_i is greater than e_{i-1}, then s_{i+1} will also be greater than e_{i-1}, so there is no need to compare s_{i+1} with e_{i-1}; that is, there is no need to go beyond the immediately previous interval for any interval E_i. If s_i is less than e_{i-1}, update e_{i-1} to the maximum of e_{i-1} and e_i and move to E_{i+1}.
Notice that we need the last interval E_{i-1} to decide whether to merge the new interval into the previous one or keep it standalone. A stack is a natural data structure for this. The algorithm will look like:

  1. Consider interval E_i.
  2. If the stack is empty, push E_i on to the stack.
  3. If the stack is not empty, pop the interval at the top of the stack and call it E_{i-1}.
  4. Compare s_i, the start time of E_i, with e_{i-1}, the end time of E_{i-1}.
  5. If s_i is less than e_{i-1}, update e_{i-1} to max(e_{i-1}, e_i), the maximum of the end times of the two intervals, and push E_{i-1} back on to the stack.
  6. Else push E_{i-1} back on to the stack, followed by E_i.
  7. Continue till all intervals are considered.
  8. At the end of processing, the stack will contain all merged intervals.

Let’s take an example and see how this algorithm works. We have the intervals (1,3), (2,4), (5,8) and (6,9) from the example above, and we have to merge the overlapping ones.

First of all, sort all intervals based on their start time.

Create a stack and start with the first interval (1,3). Since the stack is empty, we push the first interval on to the stack.

Take the second interval (2,4). Its start time (2) is less than the end time of the previous interval on the stack (3); hence, find the maximum of the end times of these two intervals, update the last interval with that end time and push it back on to the stack. The stack now holds (1,4).

Look at the third interval (5,8). Its start time is greater than the end time of the interval on top of the stack, so just push the interval on to the stack.

For the last interval (6,9), the start time of the new interval is less than the end time of the interval on top of the stack.

Find the maximum of the end times of the two intervals, update the previous interval with that end time and push it back on to the stack. The stack now holds (1,4) and (5,9).

At this point, when there are no more intervals remaining, the stack contains all merged overlapping intervals.

Merge overlapping intervals : Implementation

package com.company;


import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.Stack;

/**
 * Created by sangar on 8.4.18.
 */
public class OverlappingIntervals {
    public  static ArrayList<Interval>
        mergeOverlappingIntervals(ArrayList<Interval> intervals){

        ArrayList<Interval> mergedIntervals = new ArrayList<>();
        Stack<Interval> s = new Stack<>();

        //Sort the ArrayList of interval based on start time.
        Collections.sort(intervals, Comparator.comparing(p -> p.getStartTime()));
        for(Interval currentInterval : intervals){
            if(s.empty())s.push(currentInterval);
            else {
                Interval previousInterval = s.pop();
                if(previousInterval.getEndTime() > 
                     currentInterval.getStartTime()){
                    /*
                    If current interval's start time is less than end time of
                    previous interval, find max of end times of two intervals
                    and push new interval on to stack.
                     */
                    int endTime = Integer.max(previousInterval.getEndTime(),
                                              currentInterval.getEndTime());
                    /* Notice that we have created new interval and 
                       did not update the old one
                       This concept is called as immutability of class
                     */
                    s.push(new Interval(previousInterval.getStartTime(),
                                        endTime));
                }
                else{
                    s.push(previousInterval);
                    s.push(currentInterval);
                }
            }
        }
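        //Copy from the bottom of the stack up, so the result stays sorted by
        //start time. Popping would return the intervals in reverse order.
        mergedIntervals.addAll(s);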

        return mergedIntervals;
    }

    public static void main(String[] args) {
        ArrayList<Interval> intervals = new ArrayList<>();

        intervals.add(new Interval(1,3));
        intervals.add(new Interval(2,4));
        intervals.add(new Interval(5,8));
        intervals.add(new Interval(6,9));
        ArrayList<Interval> mergedIntervals = mergeOverlappingIntervals(intervals);
        for (Interval interval : mergedIntervals){
            System.out.print("(" + interval.getStartTime() +"," + interval.getEndTime() + ")");
        }
    }
}

The complexity of the algorithm to merge overlapping intervals is O(n log n) due to sorting, with O(n) extra space for the stack. Copying the stack into the list to return also takes O(n) space.
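The code in this post uses an Interval class which is not shown. Below is a guess at its minimal shape, inferred only from the calls made on it (a two-argument constructor, getStartTime, getEndTime and setEndTime); the real class may well differ. It would have to live in the same package as the code that uses it.

```java
// Hypothetical Interval class, reconstructed from how the post's code uses it.
class Interval {
    private final int startTime;
    private int endTime;

    Interval(int startTime, int endTime) {
        this.startTime = startTime;
        this.endTime = endTime;
    }

    int getStartTime() {
        return startTime;
    }

    int getEndTime() {
        return endTime;
    }

    // Needed only by code that extends an interval in place;
    // the stack based version creates a new Interval instead.
    void setEndTime(int endTime) {
        this.endTime = endTime;
    }

    @Override
    public String toString() {
        return "(" + startTime + "," + endTime + ")";
    }
}
```

Note that a setter makes the class mutable, so the immutability mentioned in the stack based code holds only as long as setEndTime is never called.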

There is another way to implement the same function without using a stack. Here we use the fact that ArrayList in Java is backed by an array, so getting an element at a particular index is an O(1) operation. The code looks more or less the same; however, there is no traversal of the stack at the end to create the list to return.

public List<Interval> mergeOptimized(List<Interval> intervals) {

        if(intervals.size() == 0) return intervals;

        //Integer.compare avoids the overflow that subtracting two ints can cause.
        Collections.sort(intervals,
           (Interval a, Interval b) -> Integer.compare(a.getStartTime(), b.getStartTime()));

        List<Interval> mergedIntervals = new ArrayList<Interval>();
        for(Interval interval : intervals){

            /*If the merged list is empty add the interval to
              it or check if the last interval in merged list overlaps.
              Remember the get function on ArrayList is O(1) operation
              because ArrayLists in Java are backed by arrays */
            if(mergedIntervals.isEmpty()
                    || mergedIntervals.get(mergedIntervals.size()-1).getEndTime() < 
                       interval.getStartTime() ){
                mergedIntervals.add(interval);
            }
            else {
                int lastEndTime = Math.max(
                        mergedIntervals.get(mergedIntervals.size()-1).getEndTime(),
                        interval.getEndTime()
                );
                mergedIntervals.get(mergedIntervals.size()-1).setEndTime(lastEndTime);
            }
        }

        return mergedIntervals;
    }

You can use the above snippet of code to submit for this leetcode problem and it should be accepted.


Subarray with sum zero


Given an array of positive and negative integers, find a subarray with sum zero in that array. For example, in the array given below, there are two subarrays whose elements sum to zero.

[Figure: input array, with the subarray that adds up to zero highlighted]

The brute force method to find a subarray with sum zero is to go through all subarrays of the array and add up each one to see if it sums to zero. An array of size n has n(n+1)/2 subarrays; by keeping a running sum for each start index, the complexity of the brute force solution is O(n^2).

package com.company;

import java.util.Arrays;
import java.util.HashMap;

/**
 * Created by sangar on 3.12.18.
 */
public class SubarrayWithZeroSum {
    public int [] findSubarrayWithZeroSumBrute(int[] a){
        int len = a.length;

        for(int i=0; i<len; i++){
            int  sum  = 0;
            for(int j=i; j<len; j++){
                sum += a[j];
                if(sum == 0){
                    return Arrays.copyOfRange(a,i,j+1);
                }
            }
        }
        return new int[0];
    }
}

Test cases

package test;

import com.company.SubarrayWithZeroSum;
import org.junit.jupiter.api.Test;

import java.util.Arrays;

import static org.junit.jupiter.api.Assertions.assertEquals;

/**
 * Created by sangar on 23.9.18.
 */
public class SubarrayWithSumZeroTest {

    SubarrayWithZeroSum tester = new SubarrayWithZeroSum();

    @Test
    public void subarrayWithZeroSumBruteTest() {

        int[] a = {2, -3, -1, 4};
        int [] output = {-3, -1, 4};
        assertEquals(Arrays.toString(output),
              Arrays.toString(tester.findSubarrayWithZeroSumBrute(a)));
    }

    @Test
    public void subarrayWithZeroSumBruteNoSubArrayTest() {

        int[] a = {2, -3, -2, 4};
        int [] output = {};
        assertEquals(Arrays.toString(output),
              Arrays.toString(tester.findSubarrayWithZeroSumBrute(a)));
    }

    @Test
    public void subarrayWithZeroSumBruteOneElementTest() {

        int[] a = {2, 0, -1, 4};
        int [] output = {0};
        assertEquals(Arrays.toString(output),
              Arrays.toString(tester.findSubarrayWithZeroSumBrute(a)));
    }
}

Find subarray with sum zero: thoughts

A subarray is a contiguous part of an array. Let’s say we find the sum of subarray starting at 0 and ending at any index i. So, T[i] represents the sum of subarray A[0..i].

What if we have two indices i and j such that i < j and T[i] = T[j]? In this case, the elements from index i+1 to index j add up to zero, and that is our subarray with sum zero. Similarly, if T[j] = 0 for some j, the subarray A[0..j] itself adds up to zero.
The length of the subarray with sum zero will be j-i.
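To make the prefix sum idea concrete, here is a small sketch that computes T for one of the arrays used in the test cases and shows the repeated value. The class and method names (PrefixSumSketch, prefixSums) are only for illustration.

```java
import java.util.Arrays;

class PrefixSumSketch {

    // T[i] holds a[0] + a[1] + ... + a[i].
    static int[] prefixSums(int[] a) {
        int[] t = new int[a.length];
        int running = 0;
        for (int i = 0; i < a.length; i++) {
            running += a[i];
            t[i] = running;
        }
        return t;
    }

    public static void main(String[] args) {
        int[] a = {2, -3, -1, 4};
        int[] t = prefixSums(a);
        System.out.println(Arrays.toString(t)); // [2, -1, -2, 2]

        // T[0] == T[3], so the elements at indices 1..3 cancel out.
        int[] zeroSum = Arrays.copyOfRange(a, 1, 4);
        System.out.println(Arrays.toString(zeroSum)); // [-3, -1, 4]
    }
}
```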

Implementation

package com.company;

import java.util.Arrays;
import java.util.HashMap;

/**
 * Created by sangar on 3.12.18.
 */
public class SubarrayWithZeroSum {
    public int [] findSubarrayWithZeroSum(int[] a){

        int len = a.length;

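        if(len == 0) return new int[0];

        int [] T = new int[len];

        T[0] = a[0];
        for(int i=1; i<len; i++){
            T[i] = T[i-1] + a[i];
        }

        //Complexity of below code is O(n^2)

        for(int i=0; i<len; i++){
            //If T[i] is zero, the prefix a[0..i] is itself a subarray with sum zero.
            if(T[i] == 0){
                return Arrays.copyOfRange(a, 0, i+1);
            }
            for(int j=i+1; j<len; j++){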
                if(T[i]== T[j]){
                    return Arrays.copyOfRange(a, i+1, j+1);
                }
            }
        }
        return new int[0];
    }
}

Test cases

package test;

import com.company.SubarrayWithZeroSum;
import org.junit.jupiter.api.Test;

import java.util.Arrays;

import static org.junit.jupiter.api.Assertions.assertEquals;

/**
 * Created by sangar on 23.9.18.
 */
public class SubarrayWithSumZeroTest {

    SubarrayWithZeroSum tester = new SubarrayWithZeroSum();

    @Test
    public void subarrayWithZeroSumTest() {

        int[] a = {2, -3, -1, 4};
        int [] output = {-3, -1, 4};
        assertEquals(Arrays.toString(output),
                Arrays.toString(tester.findSubarrayWithZeroSum(a)));
    }

    @Test
    public void subarrayWithZeroSumNoSubArrayTest() {

        int[] a = {2, -3, -2, 4};
        int [] output = {};
        assertEquals(Arrays.toString(output),
                Arrays.toString(tester.findSubarrayWithZeroSum(a)));
    }

    @Test
    public void subarrayWithZeroSumOneElementTest() {

        int[] a = {2, 0, -1, 4};
        int [] output = {0};
        assertEquals(Arrays.toString(output),
                Arrays.toString(tester.findSubarrayWithZeroSum(a)));
    }
}

The complexity of the algorithm to find a subarray with zero sum in a given array of integers is O(n^2), with an additional space complexity of O(n) to store the sum till index i.

We can optimize it further by keeping a hash of all the sums which we have seen while adding. When we add the element at index i to the already calculated sum till index i-1, we check if the new sum is zero. If yes, then the subarray from index 0 to index i adds up to zero. If a sum equal to the current sum is already present in the hash, then there is a subarray with sum zero between the index where we last saw that sum and the current index.

package com.company;

import java.util.Arrays;
import java.util.HashMap;

/**
 * Created by sangar on 3.12.18.
 */
public class SubarrayWithZeroSum {

    public int [] findSubarrayWithZeroSumOptimized(int[] a){

        int len = a.length;

        HashMap<Integer, Integer> T = new HashMap<Integer, Integer>();
        //Sum of the empty prefix, so that a prefix which adds up to zero is also found.
        T.put(0, -1);

        int sum  = 0 ;
        for(int i=0; i<len; i++){
            sum  += a[i];
            if(T.get(sum) != null){
                return Arrays.copyOfRange(a,T.get(sum)+1, i+1);
            }
            T.put(sum, i);
        }

        return new int[0];
    }
}

Test cases

package test;

import com.company.SubarrayWithZeroSum;
import org.junit.jupiter.api.Test;

import java.util.Arrays;

import static org.junit.jupiter.api.Assertions.assertEquals;

/**
 * Created by sangar on 23.9.18.
 */
public class SubarrayWithSumZeroTest {

    SubarrayWithZeroSum tester = new SubarrayWithZeroSum();

    @Test
    public void subarrayWithZeroSumOptimizedTest() {

        int[] a = {2, -3, -1, 4};
        int [] output = {-3, -1, 4};
        assertEquals(Arrays.toString(output),
          Arrays.toString(tester.findSubarrayWithZeroSumOptimized(a)));
    }

    @Test
    public void subarrayWithZeroSumOptimizedNoSubArrayTest() {

        int[] a = {2, -3, -2, 4};
        int [] output = {};
        assertEquals(Arrays.toString(output),
          Arrays.toString(tester.findSubarrayWithZeroSumOptimized(a)));
    }

    @Test
    public void subarrayWithZeroSumOptimizedOneElementTest() {

        int[] a = {2, 0, -1, 4};
        int [] output = {0};
        assertEquals(Arrays.toString(output),
          Arrays.toString(tester.findSubarrayWithZeroSumOptimized(a)));
    }

}

The complexity of this method is O(n) with additional space of O(n) in worst case.


Find k number in sliding window problem

Sliding window problem

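Given a large integer array of size x, a window size of n and a number k, find the smallest k numbers in every window of n elements in the array. This is commonly known as the sliding window problem. For example, for the array [2,3,1,5,6,4,2,5,4,3,8] with k = 2 and n = 6, the output should be [1,2],[1,2],[1,2],[2,4],[2,3],[2,3]. How? The first window is [2,3,1,5,6,4], whose two smallest numbers are 1 and 2. The window then slides one position to [3,1,5,6,4,2], and so on until the last window [4,2,5,4,3,8].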

This problem regularly features in Amazon interviews.

Find k numbers in sliding window : thoughts

If we break the problem down, it reduces to finding the k smallest elements in an array, which can easily be solved in multiple ways. All we have to take care of is moving the window and storing the results for each window.

Quick sort method
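The first way is to use the partition step of quick sort: we randomly pick a pivot and put it in its right place. When the pivot is in its right place, all elements on the right side of the pivot are greater than or equal to the pivot and all elements on the left side are less than or equal to it. If the pivot ends up at the kth position in the array, it and the elements on its left side are the k smallest elements of the given array. Otherwise we repeat the partition on the side which contains the kth position. This selection takes O(n) on average and O(n^2) in the worst case for each window, while fully sorting the window takes O(n log n).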
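The partition idea can be sketched as below. This is an illustration only: the names (QuickSelectSketch, kSmallest) are made up, the window is copied so the input is left untouched, and the final sort of the k results is there only to make the output easy to read.

```java
import java.util.Arrays;
import java.util.Random;

class QuickSelectSketch {
    private static final Random RANDOM = new Random();

    // Returns the k smallest values of window, in ascending order.
    static int[] kSmallest(int[] window, int k) {
        int[] a = Arrays.copyOf(window, window.length);
        if (k <= 0) return new int[0];
        if (k >= a.length) {
            Arrays.sort(a);
            return a;
        }
        int lo = 0;
        int hi = a.length - 1;
        // Narrow [lo, hi] until the pivot lands on index k-1.
        while (lo < hi) {
            int p = partition(a, lo, hi);
            if (p == k - 1) break;
            if (p < k - 1) lo = p + 1;
            else hi = p - 1;
        }
        int[] result = Arrays.copyOf(a, k);
        Arrays.sort(result);
        return result;
    }

    // Lomuto partition around a random pivot; returns the pivot's final index.
    private static int partition(int[] a, int lo, int hi) {
        swap(a, lo + RANDOM.nextInt(hi - lo + 1), hi);
        int pivot = a[hi];
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (a[i] < pivot) {
                swap(a, i, store);
                store++;
            }
        }
        swap(a, store, hi);
        return store;
    }

    private static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    public static void main(String[] args) {
        int[] window = {2, 3, 1, 5, 6, 4};
        System.out.println(Arrays.toString(kSmallest(window, 2))); // [1, 2]
    }
}
```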

Using heaps
What we are interested in is k elements. What if, from the current window, we take the first k numbers and consider them to be the k smallest elements? This set of k numbers may change based on the values of the following numbers in the window. In which way? If a new number is smaller than any of the numbers chosen so far, the new number has to be added to the set of k smallest elements. However, we have only k slots there, so one number has to move out.

If the new number is less than any number in the set, it must be less than the maximum number in the set.

Given the above fact, we can always swap the new number with the maximum of the set. Now the problem is how to find the max in a set. This set will be modified repeatedly, so we cannot just sort it once and find the max. For use cases where data is changing and we have to find the max of that set, heaps are the best data structure to use. In this case we will use a max heap. A max heap is a kind of heap where every node is greater than or equal to its children, so the root holds the maximum. A max heap of k elements gives us O(1) complexity to find the max and O(log k) complexity to heapify on removal of the old max and insertion of the new number.

Algorithm

  1. Create a max heap with the first k elements of the window.
  2. Scan through the remaining elements in the window.
    1. If the new number is less than the root of the max heap, remove the root and add the new number to the heap.
  3. All elements in the heap at the end of processing are the k smallest numbers in the window.
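The steps above can be sketched in Java with java.util.PriorityQueue used as a max heap. This is a rough sketch which assumes 1 ≤ k ≤ n; the names (KSmallestInWindowSketch, kSmallestPerWindow) are made up, and each result is sorted only to keep the printed output readable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

class KSmallestInWindowSketch {

    // Returns the k smallest numbers of every window of n elements.
    static List<int[]> kSmallestPerWindow(int[] buffer, int n, int k) {
        List<int[]> result = new ArrayList<>();
        for (int start = 0; start + n <= buffer.length; start++) {
            // reverseOrder() turns Java's default min heap into a max heap.
            PriorityQueue<Integer> maxHeap =
                    new PriorityQueue<>(Collections.reverseOrder());
            for (int i = start; i < start + n; i++) {
                if (maxHeap.size() < k) {
                    // The first k elements of the window fill the heap.
                    maxHeap.add(buffer[i]);
                } else if (buffer[i] < maxHeap.peek()) {
                    // The new number beats the current maximum, so swap them.
                    maxHeap.poll();
                    maxHeap.add(buffer[i]);
                }
            }
            result.add(maxHeap.stream()
                    .mapToInt(Integer::intValue)
                    .sorted()
                    .toArray());
        }
        return result;
    }

    public static void main(String[] args) {
        int[] buffer = {1, 4, 5, 6, 3, 2, 4, 8, 9, 6};
        // First window [1,4,5,6,3] prints [1, 3, 4, 5];
        // last window [2,4,8,9,6] prints [2, 4, 6, 8].
        for (int[] smallest : kSmallestPerWindow(buffer, 5, 4)) {
            System.out.println(Arrays.toString(smallest));
        }
    }
}
```

Like the approach described above, this rebuilds the heap for every window rather than reusing it.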

    Sliding window algorithm to find k smallest elements : Implementation

    #include<stdio.h>
    #include<stdlib.h>
    #include <math.h>
    
    typedef struct node {
    	struct node * left;
    	struct node * right;
    	int data;
    } heapNode;
    
    int leftChild(int i){
    	return 2*i + 1;
    }
    
    int rightChild(int i){
    	return 2*i + 2;
    }
    
    void swapPtr(heapNode *a[], int i, int largest){
    	heapNode *temp = a[i];
    	a[i] = a[largest];
    	a[largest] = temp;
    }
    /* This function heapifies heap after removal of root  
    or at time of building heap from an array */
    void max_heapify_ptr(heapNode *a[], int i, int len){
            int largest = i;
            int left, right;
    
            left = leftChild(i);
            right = rightChild(i);
           
            if(left <= len && a[i]->data <a[left]->data){
                    largest = left;
            }
            if(right <= len && a[largest]->data < a[right]->data){
                    largest = right;
            }
            if(largest != i){
                    swapPtr(a, i, largest);
                    max_heapify_ptr(a, largest, len);
            }
    }
    
    /* Building heap from given elements */
    void build_max_heap_ptr(heapNode *a[], int len){
            int i = len/2 +1;
            for(; i>=0; i--){
                    max_heapify_ptr(a,i, len);
            }
    }
    
    /* This function allocates node of heap */
    heapNode * create_node(int data){
            heapNode *node = (heapNode *)(malloc)(sizeof(heapNode));
            if(node){
                    node->data = data;
            }
            return node;
    
    }
    
    /* This function is real implementation of 
    the sliding window algorithm */
    void slide_window(int buffer[], int N, int K, int buffer_len){
    
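        int i =0, j =0,s;
        heapNode *max_heap[K+1];
        int num = K;
    
        /* A variable length array is not initialized, so clear it first;
           the NULL check below decides whether a node must be allocated */
        for(i=0; i<=K; i++){
            max_heap[i] = NULL;
        }
    
        /* <= so that the last window of the buffer is processed too */
        for(j=0 ; j + N <= buffer_len; j++){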
          /* Window starts at index 0 and is of size N */
           printf("\nCurrent window :");
           for(s =j; s<j+N; s++){
               printf("%d ", buffer[s]);
           }
           printf("\n");
           /* Put K element from N element window */
           for(i=0;i<K; i++){
           /* Since we wold be doing for every window, 
              avoiding reallocation of node */
               if(max_heap[i]){
                    max_heap[i]->data = buffer[i+j];
                }
                else{
                    max_heap[i] = create_node(buffer[i+j]);
                }
            }
            /* Build max heap with those entered elements */
             build_max_heap_ptr(max_heap,K-1);
    
            /*Now for all remaining N-K elements in window, 
             check if they fit in max heap */ 
             for(i=K+j; i< N+j; i++){
                 heapNode * root = max_heap[0];
                 if(buffer[i] < root->data){
                       root->data = buffer[i];
                       max_heapify_ptr(max_heap, 0, K-1);
                  }
              }
              
              /*Print the current max heap, it will contain K smallest 
                element in current window */
               printf("K minimum elements in this window :");
               for(int x=0; x< K; x++){
               	printf("%d ", max_heap[x]->data);
               }
               
               
            }
    }
    /* Driver Program to execute above code */
    int main(){
       int buffer[10] = {1,4,5,6,3,2,4,8,9,6};
    
       int K= 4;
       int N =5;
       
       int size = sizeof(buffer)/ sizeof(buffer[0]);
       
       slide_window(buffer,N, K,size);
       return 0;
    }
    

    The following steps explain how the window slides and how the heap is updated.
    The window starts at index 0 and ends at index N-1. We take the K minimum elements among the N elements and store them in a max heap. The example uses a window size of 9 and k = 4.
    The first step is to create a max heap with the first 4 elements of the window.

    Next we look at 4, which is less than the max in the max heap. So we remove the max from the heap and add the new element (4) to the heap.

    Next is 2, which is less than the max in the max heap. So we remove the max from the heap and add the new element (2) to the heap.

    Next is 3, which is less than the max in the max heap. So we remove the max from the heap and add the new element (3) to the heap.

    Next we have 10 and 11, which are greater than the root of the max heap, so nothing happens.

    We come to the end of the window. Therefore, the 4 smallest elements in the window are [ 1,2,3,4 ]

    The window then moves one step ahead; that’s where you discard the max heap, create a new empty one and repeat the process.

    We can actually avoid discarding the entire heap when the window moves; however, the complexity of the overall algorithm will remain the same. This problem is also asked in a different way, which is to find the maximum in a sliding window.

    #include <iostream>
    #include<deque>
    using namespace std;
    
    void slidingWindow(int buffer[], int n, int w, int output[])
    {
       deque<int> Q;
       int i;
       /*Initialize deque Q for the first window: put in the first w indices, while
       removing elements which cannot be the maximum in this window */
       for (i = 0; i < w; i++)
       {
       	   //This is where we are removing all less than elements
           while (!Q.empty() && buffer[i] >= buffer[Q.back()])
               Q.pop_back();
           // Pushing the index
           Q.push_back(i);
       }
      
       for (i = w; i < n; i++)
       {
           output[i-w] = buffer[Q.front()];
    
           //update Q for new window
           while (!Q.empty() && buffer[i] >= buffer[Q.back()])
               Q.pop_back();
    
           //Pop older element outside window from Q    
           while (!Q.empty() && Q.front() <= i-w)
               Q.pop_front();
          
           //Insert current element in Q
           Q.push_back(i);
       }
       output[n-w] = buffer[Q.front()];
    }
    
    int main(){
    	int a[]={3,5,4,2,-1,4,0,-3};
    	int n = sizeof(a)/sizeof(a[0]);
    	int output[n];
    
    	slidingWindow(a,n,4,output);
    	return 0;
    }
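
For comparison, here is the same deque idea sketched in Java with java.util.ArrayDeque, on the same input as the C++ driver above. The names (SlidingWindowMaxSketch, slidingWindowMax) are made up for this sketch, and it returns a new array instead of filling an output parameter.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

class SlidingWindowMaxSketch {

    // Returns the maximum of every window of w consecutive elements.
    static int[] slidingWindowMax(int[] buffer, int w) {
        int n = buffer.length;
        if (w <= 0 || w > n) return new int[0];

        int[] output = new int[n - w + 1];
        // Holds indices; their values decrease from front to back.
        Deque<Integer> q = new ArrayDeque<>();

        for (int i = 0; i < n; i++) {
            // Smaller or equal elements can never be the maximum again.
            while (!q.isEmpty() && buffer[i] >= buffer[q.peekLast()]) {
                q.pollLast();
            }
            q.addLast(i);
            // Drop the front index once it falls out of the window.
            if (q.peekFirst() <= i - w) {
                q.pollFirst();
            }
            // The front of the deque is the maximum of the window ending at i.
            if (i >= w - 1) {
                output[i - w + 1] = buffer[q.peekFirst()];
            }
        }
        return output;
    }

    public static void main(String[] args) {
        int[] a = {3, 5, 4, 2, -1, 4, 0, -3};
        System.out.println(Arrays.toString(slidingWindowMax(a, 4))); // [5, 5, 4, 4, 4]
    }
}
```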
    

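    The worst case complexity of the heap based algorithm for the k smallest elements is O(x * (k + (n-k) log k)) for an array of size x and a window of size n: there are x-n+1 windows, it takes O(k) to build a heap of k elements for each window, and each of the remaining n-k elements costs up to O(log k). The deque based sliding window maximum takes O(n) for an array of size n, since every index is pushed and popped at most once.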


Merge k sorted arrays

Given k sorted arrays of varying lengths, merge the arrays into one array in sorted order.

[Figure: three example sorted arrays and the resulting merged array]

Merge k sorted arrays

Since all the input arrays are sorted, the first element in the result array will be among the first elements of input arrays. How can we find the minimum among all the elements plucked from the first index of each array? Easy, take those k elements (there are k arrays, so k first elements) and build a min-heap. The root of the min-heap will be the least element among each of the first elements of the given k arrays, i.e.
initial_root = min(arr1[0], arr2[0], arr3[0]...arrK[0])
Which implies:
result_array[0] = min(arr1[0], arr2[0], arr3[0]...arrK[0])

The initial root above will be the first element in the result array. Now the second element for the result array can be found from the set of first elements of all input arrays except the array from which the first element of result array was taken. For example, if arr3 had the least of all first elements while finding the initial root, then:
result_array[1] = min(arr1[0], arr2[0], arr3[1]...arrK[0])

In order to know which array gave the minimum element at a particular time, we store additional information with each heap entry: the array it came from and its index within that array.

If i represents the array number, and j represents the index of the minimum number currently in the heap from the ith array, then we add (j+1)th element to the min-heap next and re-heapify.

If all the elements of the ith array have already been put in the heap, nothing is added and the size of the min-heap reduces to k-1.

Repeat this procedure until the heap is empty. If each of the k arrays has n elements, that means (n-1)*k more insertions after the initial k, and the result array then holds all `n*k` elements in sorted order.

Merge k sorted arrays: algorithm

  • Build a min heap with the first element of each of the k arrays.
  • Pick the root of the min heap and put it in the result array.
  • If there are remaining elements in the array the root came from, put the next element of that array at the root of the min heap and heapify again.
  • If all elements of that array are already processed, reduce the size of the min heap by 1.
  • Repeat steps 2, 3 and 4 till the min heap is empty.

Merge k sorted arrays: implementation

package com.company;

import java.util.PriorityQueue;

/**
 * Created by sangar on 2.12.18.
 */
public class MergeKSortedArrays {
    private class HeapNode {
        public int arrayNum;
        public int index;
        public int value;

        public HeapNode(int arrayNum, int index, int value) {
            this.arrayNum = arrayNum;
            this.index = index;
            this.value = value;
        }
    }

    public int[] mergeKSortedArrays(int[][] arrays) {

        if (arrays == null) return null;

        //Capacity must be at least 1; Integer.compare avoids overflow of a.value - b.value
        PriorityQueue<HeapNode> minHeap =
                new PriorityQueue<>(Math.max(1, arrays.length),
                        (HeapNode a, HeapNode b) -> Integer.compare(a.value, b.value));

        int size = 0;
        for (int i = 0; i < arrays.length; i++) {
            size += arrays[i].length;
        }
        int[] result = new int[size]; // k * n

        //add the first element of each non-empty array to the heap
        for (int i = 0; i < arrays.length; i++) {
            if (arrays[i].length > 0) {
                minHeap.add(new HeapNode(i, 0, arrays[i][0]));
            }
        }

        //Complexity O(n * k * log k)
        for (int i = 0; i < size; i++) {
            //Take the minimum value and put into result
            HeapNode node = minHeap.poll();

            if (node != null) {
                result[i] = node.value;
                if (node.index + 1 < arrays[node.arrayNum].length) {
                    //Complexity of O(log k)
                    minHeap.add(new HeapNode(node.arrayNum,
                            node.index + 1,
                            arrays[node.arrayNum][node.index + 1]));
                }
            }
        }
        return result;
    }
}

 

Test cases

package test;

import com.company.MergeKSortedArrays;
import org.junit.jupiter.api.Test;

import java.util.Arrays;

import static org.junit.jupiter.api.Assertions.assertEquals;

/**
 * Created by sangar on 23.9.18.
 */
public class MergeKSortedArraysTest {

    MergeKSortedArrays tester = new MergeKSortedArrays();

    @Test
    public void mergeKSortedArraysTest() {

        int[][] input  ={
            { 1, 2, 3, 4 }, { 5, 6, 7, 8 }, { 9, 10, 11, 12 }
        };

        int[] expectedOutput = {1,2,3,4,5,6,7,8,9,10,11,12};

        int [] output = tester.mergeKSortedArrays(input);

        System.out.println(Arrays.toString(output));
        assertEquals(Arrays.toString(expectedOutput), 
					Arrays.toString(output));
    }

    @Test
    public void mergeKSortedArraysWithUnequalSizeTest() {

        int[][] input  ={
                { 1, 2 }, { 5, 6, 7}, { 9, 10, 11, 12 }
        };

        int[] expectedOutput = {1,2,5,6,7,9,10,11,12};

        int [] output = tester.mergeKSortedArrays(input);

        System.out.println(Arrays.toString(output));
        assertEquals(Arrays.toString(expectedOutput),
			Arrays.toString(output));
    }

    @Test
    public void mergeKSortedArraysWithNullTest() {

        int [] output = tester.mergeKSortedArrays(null);

        assertEquals(null, output);
    }
}

The time complexity of the code to merge k sorted arrays is O(n * k * log k), where n is the length of each array. The heap needs O(k) extra space, in addition to the O(n * k) result array.

Please share if there is something wrong or missing.

Find Kth smallest element in array

Kth smallest element in array

Given an unsorted array of integers, find the kth smallest element in that array. For example: if the input array is A = [3,5,1,2,6,9,7], the 4th smallest element in array A is 5, because if you sort the array A, it looks like A = [1,2,3,5,6,7,9] and now you can easily see that the 4th element is 5.

This problem is commonly asked in Microsoft and Amazon interviews as it has multiple layers and there are so many things that can be measured with this one problem.

Kth smallest element : Line of thought

First of all, in any interview, try to come up with brute force solution. Brute force solution to find Kth smallest element in array of integers would be to sort array and return A[k-1] element (K-1 as array is zero base indexed).

What is the complexity of the brute force solution? Is it O(n^2)? Well, we have sort algorithms like merge sort and heap sort which work in O(n log n) complexity. The problem with merge sort is that it uses O(n) additional space; heap sort sorts in place. Quick sort is another sort algorithm. Its problem is that its worst case complexity is O(n^2), which happens, with the first element as pivot, when the input is already sorted.
In our case, the input is given as unsorted, so we can expect quick sort to run in O(n log n), which is its average case complexity. The advantage of using quick sort is that it needs no additional space apart from the recursion stack.
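The brute force approach described above can be sketched in a few lines. This is a minimal illustration, not the article's own code: the class name `KthSmallestBySorting` is hypothetical, it uses the library sort `Arrays.sort` instead of a hand-written quick sort, and it returns -1 for invalid input as an assumed convention.

```java
import java.util.Arrays;

class KthSmallestBySorting {

    // Returns the kth smallest element (k is 1-based), or -1 for invalid input.
    // The -1 sentinel is an assumption made for this sketch.
    static int kthSmallest(int[] a, int k) {
        if (a == null || k < 1 || k > a.length) return -1;

        // Sort a copy so that the caller's array is left untouched.
        int[] sorted = Arrays.copyOf(a, a.length);
        Arrays.sort(sorted);      // O(n log n)
        return sorted[k - 1];     // arrays are zero-indexed
    }

    public static void main(String[] args) {
        int[] a = {3, 5, 1, 2, 6, 9, 7};
        // 4th smallest element of the example array
        System.out.println(kthSmallest(a, 4));
    }
}
```

Sorting a copy costs O(n) extra space; sorting the input itself would avoid that at the price of modifying the caller's data.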

Optimising quick sort

Let’s see how quicksort works and see if we can optimize solution further?
Idea behind quicksort is to find the correct place for the selected pivot. Once the pivot is at the correct position, all the elements on the left side of the pivot are smaller and on the right side of the pivot are greater than the pivot. This step is partitioning.

If after partitioning, pivot is at position j, can we say that pivot is actually jth smallest element of the array? What if j is equal to k? Well problem solved, we found the kth smallest element.

If j is less than k, the left subarray has fewer than k elements, so we need to include more elements from the right subarray; the kth smallest element is therefore somewhere in the right subarray. We have already found the j smallest elements, so all we need is the (k-j)th smallest element of the right subarray.

What if j is greater than k? In this case, we have to drop some elements from the left subarray, so our search space is the left subarray after partition.

In the worst case, this algorithm is no better than quick sort itself and takes O(n^2). On average, however, each partition step discards about half of the remaining elements, so the expected work is n + n/2 + n/4 + ... = O(n), and you never sort the entire array just to find the kth smallest element.

Algorithm to find K smallest elements in array

  1. Select a pivot and partition the array with pivot at correct position j
  2. If position of pivot, j, is equal to k, return A[j].
  3. If j is less than k, discard array from start to j, and look for (k-j)th smallest element in right sub array, go to step 1.
  4. If j is greater than k, discard array from j to end and look for kth element in left subarray, go to step 1

Let’s take an example and see if this algorithm works? A =  [4, 2, 1, 7, 5, 3, 8, 10, 9, 6 ], and we have to find fifth smallest element in array A.

Kth smallest element in array

Start with pivot as first index of array, so pivot = 0, partition the array into two parts around pivot such that all elements on left side of pivot element, i.e. A[pivot] are smaller and all elements on right side are greater than A[pivot].


In our example, array A will look like below after the pivot has found its correct position.

k smallest element
After partition, correct position of pivot is index 3

If pivot == k-1 (array is represented as zero base index), then A[pivot] is kth smallest element. Since pivot (3) is less than k-1 (4), look for kth smallest element on right side of the pivot.

k remains as it is as opposed to k-j mentioned in algorithm as pivot is given w.r.t entire array and not w.r.t subarray.

In second iteration, pivot = 4 (index and not element). After second execution of quick sort array A will be like

After partition of right subarray, correct position of pivot is index 4

The pivot index (4) is now equal to k-1 (5-1), so the 5th smallest element in array A is A[4] = 5.
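The two partition steps of this example can be reproduced with a small trace. This is a sketch under assumptions: the class name `PartitionTrace` is hypothetical, and the partition routine uses the first element as pivot with explicit bounds checks on both scans.

```java
import java.util.Arrays;

class PartitionTrace {

    static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    // Partitions a[start..end] around a[start]; returns the pivot's final index.
    static int partition(int[] a, int start, int end) {
        int pivot = a[start];
        int i = start;
        int j = end;
        while (i < j) {
            while (i <= end && a[i] <= pivot) i++;   // stop at first element greater than pivot
            while (j >= start && a[j] > pivot) j--;  // stop at last element not greater than pivot
            if (i < j) swap(a, i, j);
        }
        swap(a, start, j);
        return j;
    }

    public static void main(String[] args) {
        int[] a = {4, 2, 1, 7, 5, 3, 8, 10, 9, 6};

        // First pass over the whole array.
        int p = partition(a, 0, a.length - 1);
        System.out.println("pivot index " + p + ": " + Arrays.toString(a));

        // The pivot index is less than k-1 = 4, so only the right part is partitioned next.
        p = partition(a, p + 1, a.length - 1);
        System.out.println("pivot index " + p + ": " + Arrays.toString(a));
    }
}
```

The second call receives indices relative to the whole array, which is why k does not need to be adjusted to k-j.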

Kth smallest element : Implementation

package com.company;

/**
	* Created by sangar on 30.9.18.
*/
public class KthSmallest {
	private void swap(int[] a, int i, int j){
		int temp = a[i];
		a[i] = a[j];
		a[j] = temp;
	}
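	private int partition(int[] a, int start, int end){
		int pivot = a[start];
		int i  = start;
		int j  = end;

		while(i < j){
			//Bounds checks keep i and j inside [start, end],
			//"<=" lets i move past duplicates of the pivot
			while(i <= end && a[i] <= pivot) i++;
			while(j >= start && a[j] > pivot) j--;

			if(i < j) {
				swap(a, i, j);
			}
		}
		swap(a, start, j);
		return j;
	}

	public int findKthSmallestElement(int a[], int start,
				int end, int k){
		//Guard against null or empty array and an invalid range
		if(a == null || a.length == 0
				|| start < 0 || end >= a.length) return -1;
		if(start <= end){
			int p = partition(a, start, end);
			if(p == k-1){
				return a[p];
			}
			//Pivot is already in place, so exclude it from the next search
			if(p > k-1)
				return findKthSmallestElement(a, start, p-1, k);
			return findKthSmallestElement(a, p+1, end, k);
		}
		return -1;
	}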
}

Test cases

package test;

import com.company.KthSmallest;
import org.junit.jupiter.api.Test;

import static org.junit.jupiter.api.Assertions.assertEquals;

/**
 * Created by sangar on 28.8.18.
 */
public class KthSmallestTest {

	KthSmallest tester = new KthSmallest();
	private int[] a = {4, 2, 1, 7, 5, 3, 8, 10, 9};
	@Test
	public void kthSmallest() {
		assertEquals(7, tester.findKthSmallestElement(a,0,8,6));
	}

	@Test
	public void firstSmallest() {
		assertEquals(1, tester.findKthSmallestElement(a,0,8,1));
	}

	@Test
	public void lastSmallest() {
		assertEquals(10, tester.findKthSmallestElement(a,0,8,9));
	}

	@Test
	public void kGreaterThanSize() {
		assertEquals(-1, tester.findKthSmallestElement(a,0,8,15));
	}
	@Test
	public void emptyArray() {
		int[] a = {};
		assertEquals(-1, tester.findKthSmallestElement(a,0,0,1));
	}

	@Test
	public void nullArray() {
		assertEquals(-1, tester.findKthSmallestElement(null,0,0,1));
	}
}

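Worst case complexity of using quick sort partitioning to find the kth smallest element in an array of integers is still O(n^2), the same as quick sort; on average it is O(n), because only one side of each partition is processed further.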

Kth smallest element using heaps

Imagine a case where there are a billion integers in the array and you have to find the 5 smallest elements of that array. Partitioning rearranges the entire array and can degrade to O(n^2) in the worst case, which is too costly for that use case. The above algorithm using quick sort does not take into consideration the disparity between k and n.

We want the k smallest elements. How about we choose k elements arbitrarily, call them set A, and then go through the other n-k elements, call them set B, and check whether an element from set B can displace an element in set A?

What will be the condition for an element from set B to replace an element in set A? Well, if the new element is less than the maximum in set A, then the maximum in set A cannot be in the set of k smallest elements, right? The maximum element in set A is replaced by the new element from set B.

Now, the problem is how to quickly find the maximum of set A. A heap is the best data structure for that. What kind of heap: min heap or max heap? A max heap, as it stores the maximum of the set at its root.

Let’s define concrete steps to find the k smallest elements using a max heap.

  1. Create a max heap of size k from the first k elements of the array.
  2. Scan the remaining n-k elements of the array one by one.
    1. If the current element is less than the max on the heap, remove the root, add the current element to the heap and heapify.
    2. If not, then go to the next element.
  3. At the end, the max heap will contain the k smallest elements of the array and the root will be the kth smallest element.

Let’s take an example and see if this algorithm works? The input array is shown below and we have to find the 6th smallest element in this array.

kth smallest element using heaps
input array

Step 1 : Create a max heap with first 6 elements of array.

Create a max heap with set A

Step 2: Take next element from set B and check if it is less than the root of max heap. In this case, yes it is. Remove the root and insert the new element into max heap.

Element from set B removes root from max heap and added to max heap

Step 2: It continues to 10, nothing happens as the new element is greater than the root of max heap. Same for 9.  At 6, again the root of max heap is greater than 6. So remove the root and add 6 to max heap.

Again, new element from set B is less than root of max heap. Root is removed and new element is added.

Array scan is finished, so just return the root of the max heap, 6 which is the sixth smallest element in given array.

	public int findKthSmallestElementUsingHeap(int a[], int k){
		//Validate input before creating the heap: a capacity below 1 throws
		if(a == null || k < 1 || k > a.length) return -1;

		//Needs java.util.PriorityQueue and java.util.Collections
		//https://stackoverflow.com/questions/11003155/change-priorityqueue-to-max-priorityqueue
		PriorityQueue<Integer>  maxHeap =
			new PriorityQueue<>(k, Collections.reverseOrder());

		//Create max heap with first k elements
		for(int i=0; i<k; i++){
			maxHeap.add(a[i]);
		}

		/*Keep updating max heap based on new element
		If new element is less than root, 
		remove root and add new element
		*/

		for(int i=k; i<a.length; i++){
			if(maxHeap.peek() > a[i]){
				maxHeap.remove();
				maxHeap.add(a[i]);
			}
		}
		return maxHeap.peek();
	}

Can you calculate the complexity of the above algorithm? heapify() has complexity of O(log k) with k elements on the heap. In the worst case, we have to do heapify() for all elements in the array, which is n, so the overall complexity of the algorithm becomes O(n log k). Also, there is additional space complexity of O(k) to store the heap.
Even when k is very small as compared to n, this algorithm still depends on the size of the array.

We want the k smallest elements. If we pick the first k elements from a min heap, will it solve the problem? Yes. Create a min heap of n elements in place from the given array, and then remove the root k times.
Creation of a heap from n elements has complexity of O(n). Each of the k deletions is followed by heapify(), which has complexity of O(log n) for an n element heap. So, overall complexity would be O(n + k log n).
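The min heap variant can be sketched as follows. The class name `KthSmallestMinHeap` is hypothetical, and the sketch makes one simplifying assumption: instead of heapifying the input array in place, it copies the values into a `java.util.PriorityQueue`, whose collection constructor builds the heap in one step. That costs O(n) extra space, which an in-place heap would avoid.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

class KthSmallestMinHeap {

    // Returns the kth smallest element (k is 1-based), or -1 for invalid input.
    static int kthSmallest(int[] a, int k) {
        if (a == null || k < 1 || k > a.length) return -1;

        List<Integer> values = new ArrayList<>(a.length);
        for (int v : a) values.add(v);

        // Building the queue from a collection heapifies all n elements at once.
        PriorityQueue<Integer> minHeap = new PriorityQueue<>(values);

        // Discard the k-1 smallest elements; each poll costs O(log n).
        for (int i = 0; i < k - 1; i++) {
            minHeap.poll();
        }
        return minHeap.peek();   // the root is now the kth smallest element
    }
}
```

For example, `KthSmallestMinHeap.kthSmallest(new int[]{4, 2, 1, 7, 5, 3, 8, 10, 9}, 6)` removes 1, 2, 3, 4 and 5 and returns the new root.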

Depending on what you want to optimize, select correct method to find kth smallest element in array.

Please share if there is something wrong or missing.

Quick sort algorithm

Quick sort Algorithm

Quick sort, like merge sort, is a sorting algorithm under the divide and conquer paradigm. The basic idea of the algorithm is to divide the input around a pivot, sort the two smaller parts recursively, and so get the original input sorted.

Selection of pivot

Entire idea of quick sort revolves around pivot. Pivot is an element in input around which input is arranged in such a way that all elements on left side are smaller and all elements on right side are greater than pivot. Question is how to find or select pivot and put it into correct position.

To make things simpler to start with, let’s assume first element of input is pivot element.

To put this pivot at correct position in input, start with next element of pivot in input space and find first element which is greater than pivot. Let that be ith position.

At the same time, start from end of array and find first element which is smaller than pivot. Let it be jth position.

If i and j have not crossed each other, i.e. i < j, then swap the elements at the ith and jth positions, and continue moving right to find an element greater than the pivot and moving left to find an element smaller than the pivot.
Once i and j cross each other, swap the pivot with the element at the jth position. After this step, the pivot will be at its correct position and the array will be divided into two parts. All elements on the left side will be less than the pivot and all elements on the right side will be greater than the pivot.

Quick sort partition example

This is too much to process, I know! Let’s take an example and see how it works. We have an array as follows

quick sort

Let’s select first element as pivot, pivot = 3.

quick sort pivot selection

Start from next element of pivot, move towards right of array, till we see first element which is greater than pivot i.e. 3.

From end of array, move towards left till you find an element which is less than pivot.

Now, there are two indices, i and j, where A[i] > pivot and A[j] < pivot. Note that i and j have not yet crossed each other. Hence, we swap A[i] with A[j]. The array at the bottom of the picture shows the resultant array after the swap.

quick sort partition

Again, start with i+1 and follow the same rule : Stop when you find element greater than pivot. In this case, 10 is greater than 3, hence we stop.

Similarly, move left from end again, till we find an element which is less than pivot. In this case, we end up at index = 2 which is element 1.

Since i > j, the paths have crossed. At this point, instead of swapping the elements at indices i and j, swap the element at index j with the pivot.

After swapping pivot with jth index, we have array divided into two parts, pivot as boundary. All elements on left side of pivot are smaller (they may not be sorted) and all elements on right side of pivot are greater than pivot (again may not be sorted).

quick sort partitions

We apply this same partition process to the left and right arrays again, till the base condition is hit. In this case, the base condition is that there is only one element in the array to be partitioned.

Quick sort algorithm

quickSort(arr, start, end)
1. If array has more than one element, i.e. (start < end):
1.1 Find correct place for pivot.
pivot = partition(arr, start, end)
1.2 Apply same function recursively to left of pivot index
quickSort(arr, start, pivot - 1)
and to the right of pivot index
quickSort(arr, pivot + 1, end)

Quick sort implementation

package AlgorithmsAndMe;

public class QuickSort {

    private void swap(int [] a, int i, int j){
        int temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }

    private int partition(int [] a, int start, int end){
        int pivot = a[start];
        int i  = start;
        int j  = end;

        while(i < j){
            while(i <= end && a[i] <= pivot) i++;
            while(j >= start && a[j] > pivot) j--;
            
            if(i < j) {
                swap(a, i, j);
            }
        }

        swap(a, start, j);
        return j;
    }

    public void quickSort(int [] a, int start, int end){
        if(start < end){
            int p = partition(a, start, end);
            quickSort(a,start, p-1);
            quickSort(a, p+1, end);
        }
    }
}

There is another implementation, based on the Lomuto partition scheme, which makes the last element the pivot. The implementation is compact, but it typically performs more swaps than the partition method above.

#include<stdlib.h>
#include<stdio.h>
 
void swap(int *a, int *b){
    int temp = *a;
    *a = *b;
    *b = temp;
}
 
int partition(int a[], int low, int high)
{
    // set pivot as the last element of this part of the array
    int x  = a[high];
 
    //Current low points to previous of low of this part of array. 
    int i = low - 1;
 
    for (int j = low; j <= high-1; j++)
    {
    	/*Move in the array till current node data is 
        less than the pivot */
        if (a[j] <= x){
            //set the current low appropriately
            i++;
            swap(&a[i], &a[j]);
        }
    }
    //Now swap the next node of current low with pivot
 
    swap(&a[i+1], &a[high]);
 
    printf("\n Pivot : %d\n", a[i+1]);
    for(int j=0; j<=high; j++){
 
    	printf("%d ", a[j]);
    }
    //return current low as partitioning point.
    return i+1;
}
 
/* A recursive implementation of quicksort for an array */
void quickSortUtil(int a[], int low, int high)
{
    if (low < high)
    {
        int p = partition(a,low, high);
        quickSortUtil(a,low, p-1);
        quickSortUtil(a, p+1, high);
    }
}
 
/* Driver program to run above code */
int main(){
 
    int a[] = {5,4,2,7,9,1,6,10,8};
 
    int size = sizeof(a)/sizeof(a[0]);
    quickSortUtil(a, 0, size-1);
 
    for(int i=0; i<size; i++){
    	printf("%d ", a[i]);
    }
    return 0;
}

Complexity analysis of quick sort algorithm

If the pivot splits the original array into two equal parts (which is the intention), the complexity of quick sort is O(n log n). However, the worst case of quick sort happens when the input array is already sorted in increasing or decreasing order. In this case, the array is partitioned into two very unequal subarrays, one with size 1 and the other with size n-1. The subarray with n-1 elements is in turn divided into subarrays of size 1 and n-2. To sort the array completely, it is split n-1 times, and each split traverses up to n elements to find the correct position of the pivot. Hence the overall complexity of quick sort comes out as O(n^2).
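The two cases can be written as recurrences, where T(n) is the time to sort n elements and cn is the cost of one partition pass (c is an assumed constant introduced only for this illustration):

```latex
\begin{align*}
\text{balanced split:} \quad T(n) &= 2\,T(n/2) + cn = O(n \log n) \\
\text{worst case:} \quad T(n) &= T(n-1) + cn = c \sum_{i=2}^{n} i + T(1) = O(n^2)
\end{align*}
```

In the balanced case the recursion is about log n levels deep with linear work per level; in the worst case it is n levels deep.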

There is a very interesting question which tests your understanding of system basics: what is the space complexity of this algorithm? No memory is apparently used. However, the recursive implementation internally puts stack frames on the stack for the partition indices, the function call return address and so on. In the worst case, there can be n stack frames, hence the worst case space complexity of quick sort is O(n).

How can we reduce that? If the partition with fewer elements is (recursively) sorted first, it requires at most O(log n) space. Then the other partition is sorted using tail recursion or iteration, which doesn’t add to the call stack. This idea was described by R. Sedgewick; it keeps the stack depth bounded by O(log n), and hence the space complexity is O(log n).

Quick sort with tail recursion

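Quicksort(A, p, r)
{
 while (p < r)
 {
  q = Partition(A, p, r)    // pivot is now at its final index q
  if (q - p < r - q)
  {
   Quicksort(A, p, q-1)     // recurse on the smaller left part
   p = q+1                  // iterate on the larger right part
  }
  else
  {
   Quicksort(A, q+1, r)     // recurse on the smaller right part
   r = q-1                  // iterate on the larger left part
  }
 }
}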
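A Java sketch of the idea in this section: recurse only into the smaller partition and loop over the larger one, so the recursion depth stays within O(log n). The class name `QuickSortSmallStack` is hypothetical, and the partition routine uses the first element as pivot with bounds checks on both scans.

```java
class QuickSortSmallStack {

    static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    // Partitions a[start..end] around a[start]; returns the pivot's final index.
    static int partition(int[] a, int start, int end) {
        int pivot = a[start];
        int i = start;
        int j = end;
        while (i < j) {
            while (i <= end && a[i] <= pivot) i++;
            while (j >= start && a[j] > pivot) j--;
            if (i < j) swap(a, i, j);
        }
        swap(a, start, j);
        return j;
    }

    static void quickSort(int[] a, int start, int end) {
        while (start < end) {
            int p = partition(a, start, end);
            if (p - start < end - p) {
                quickSort(a, start, p - 1);  // smaller left part: recurse
                start = p + 1;               // larger right part: continue the loop
            } else {
                quickSort(a, p + 1, end);    // smaller right part: recurse
                end = p - 1;                 // larger left part: continue the loop
            }
        }
    }
}
```

Each recursive call handles at most half of the current range, which is what bounds the stack depth; the running time is unchanged.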

Selection of Pivot
If the array is completely sorted, the worst case behavior of quick sort is O(n^2), so there comes another problem: how can we select the pivot so that the two subarrays are of almost equal size? There are many solutions proposed.
1. Taking the median of the array as pivot. How to find the median of an unsorted array is a problem we look into separately, but it does guarantee two halves of the same size.
2. Selecting the pivot randomly. This needs only a random number generator and makes the worst case unlikely for any particular input, although it does not rule it out.
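Random pivot selection can be added with one extra swap before partitioning. This is a minimal sketch under assumptions: the class name `RandomPivotQuickSort` is hypothetical, and the pivot index is drawn with `java.util.Random.nextInt(bound)`.

```java
import java.util.Random;

class RandomPivotQuickSort {

    private static final Random RANDOM = new Random();

    static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    static int partition(int[] a, int start, int end) {
        // Move a randomly chosen element to the front and use it as the pivot.
        int r = start + RANDOM.nextInt(end - start + 1);
        swap(a, start, r);

        int pivot = a[start];
        int i = start;
        int j = end;
        while (i < j) {
            while (i <= end && a[i] <= pivot) i++;
            while (j >= start && a[j] > pivot) j--;
            if (i < j) swap(a, i, j);
        }
        swap(a, start, j);
        return j;
    }

    static void quickSort(int[] a, int start, int end) {
        if (start < end) {
            int p = partition(a, start, end);
            quickSort(a, start, p - 1);
            quickSort(a, p + 1, end);
        }
    }
}
```

With a random pivot, an already sorted input is no longer a guaranteed worst case, because the split point no longer depends on the input order.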

Please leave your comment in case you find something wrong or you have some improved version.