Intersection of two arrays

Given two unsorted arrays of integers, find the intersection of the two arrays, that is, the elements common to both. For example, for A = [1,4,3,2,5,6] and B = [3,2,1,5,6,7,8,10], the intersection of A and B is [1,3,2,5,6].

Sort array and then use binary search
As the given arrays are unsorted, sort one of them, preferably the larger one. Then search for each element of the other array in the sorted array using binary search. If the element is present, put it into the intersection array.

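import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class Solution {
    public int[] intersection(int[] nums1, int[] nums2) {

        // Binary search needs sorted input. For brevity nums1 is always
        // the sorted array here (sorting in place mutates the input).
        Arrays.sort(nums1);
        Set<Integer> result = new HashSet<>();

        for(int i=0; i<nums2.length; i++){
            if(binarySearch(nums1, nums2[i]) != -1){
                result.add(nums2[i]);
            }
        }
        int i = 0;
        int[] resultArray = new int[result.size()];
        for(Integer num : result){
            resultArray[i++] = num ;
        }

        return resultArray;
    }

    // Returns the index of key in the sorted array a, or -1 if absent.
    private int binarySearch(int[] a, int key) {
        int low = 0;
        int high = a.length - 1;

        while(low <= high){
            int mid = low + (high - low) / 2;
            if(a[mid] == key) return mid;
            if(a[mid] < key) low = mid + 1;
            else high = mid - 1;
        }

        return -1;
    }
}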

The time complexity of the binary search method to find the intersection is O(n log n) for sorting the array of size n and then O(m log n) for searching for each of the m elements of the other array. The effective time complexity becomes O((n+m) log n), which is not optimal.

Sort and use merge to find common elements
In this method, sort both arrays first. Then use two pointers to scan both arrays simultaneously (refer to the merge step of merge sort). The difference is that we output only the common elements instead of all elements.

The time complexity of this method is O(n log n) + O(m log m) for sorting and then O(m+n) for scanning both arrays. Asymptotically it is no better than the binary search method, and it requires sorting both arrays instead of one.
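The merge-style scan described above could be sketched as follows. This is only an illustration: the class name MergeIntersection and the method name intersect are made up for this example, and the sketch assumes that each common value should be reported once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MergeIntersection {

    // Returns the distinct common elements of a and b in ascending order.
    static List<Integer> intersect(int[] a, int[] b) {
        // Work on sorted copies so the caller's arrays are left untouched.
        int[] x = a.clone();
        int[] y = b.clone();
        Arrays.sort(x);
        Arrays.sort(y);

        List<Integer> result = new ArrayList<>();
        int i = 0, j = 0;
        while (i < x.length && j < y.length) {
            if (x[i] < y[j]) {
                i++;            // x[i] cannot be in y any more
            } else if (x[i] > y[j]) {
                j++;            // y[j] cannot be in x any more
            } else {
                // Common element: record it once, skipping repeats.
                if (result.isEmpty() || result.get(result.size() - 1) != x[i]) {
                    result.add(x[i]);
                }
                i++;
                j++;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[] a = {1, 4, 3, 2, 5, 6};
        int[] b = {3, 2, 1, 5, 6, 7, 8, 10};
        System.out.println(intersect(a, b)); // prints [1, 2, 3, 5, 6]
    }
}
```

Because both arrays are sorted, the result comes out in ascending order rather than in the order of the original input.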

Use hash
Create a hash whose keys are the elements of the smaller array (this saves space). Then scan through the other array and check whether each element is present in the hash. If it is, put it into the intersection array; otherwise skip it.

package AlgorithmsAndMe;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IntersectionTwoArrays {

    public List<Integer> findIntersecton(int[] a, int[] b) {
        List<Integer> result = new ArrayList<>();
        Map<Integer, Boolean> existingElements = new HashMap<>();

        for (int i = 0; i < a.length; i++) {
            existingElements.put(a[i], true);
        }

        for (int i = 0; i < b.length; i++) {
            if (existingElements.containsKey(b[i])) {
                result.add(b[i]);
            }
        }
        return result;
    }
}

Test case

package Test;

import AlgorithmsAndMe.DuplicatesInArray;
import AlgorithmsAndMe.IntersectionTwoArrays;

import java.util.List;
import java.util.Set;

public class IntersectonTwoArraysTest {


    IntersectionTwoArrays intersectionTwoArrays
             = new IntersectionTwoArrays();

    @org.junit.Test
    public void testIntersectionTwoArrays() {
        int [] a = {1,6,3};
        int [] b = {1,2,3};
        List<Integer> result = intersectionTwoArrays.findIntersecton(a,b);

        result.forEach(s -> System.out.println(s));
    }
}

This method has a time complexity of O(n + m), where n is the number of elements in the larger array and m is the number of elements in the smaller array, which simplifies to O(n). The extra space complexity is O(m).

These methods to find the intersection of two arrays do not account for duplicate elements: a value that occurs several times in both arrays is not reported according to how many times it is repeated (the set-based version, for instance, reports it only once).

Please share if there is something wrong or missing. We would love to hear from you.

Find duplicate numbers in array

Find all duplicate numbers in array

Given an array of N integers, each in the range 0 to N-1, find all duplicate numbers in the array. The array is not sorted. For example:
In A = [2,4,3,2,1,5,4] the duplicate numbers are 2 and 4, whereas in A = [4,1,3,2,1,1,5,5] the duplicate numbers are 1 and 5.

The brute force solution is to keep track of every number already visited and, for each new number, check whether we have seen it before. Which data structure is good for quick lookups like this? A map or hash, of course.
The time complexity of this solution is O(n), but it has an additional space complexity of O(n).

To reduce the space requirement, a bit array can be used, where the ith bit is set whenever we encounter number i in the given array. If the bit is already set, it is a duplicate number. It still takes O(n) extra space, but in practice much less than the hash, as only one bit per possible value is used. The time complexity remains O(n).
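A minimal sketch of the bit array idea, using java.util.BitSet from the standard library. The class name BitArrayDuplicates and the method name findDuplicates are hypothetical, and the sketch assumes every value lies in the range 0 to N-1, as in the problem statement.

```java
import java.util.BitSet;
import java.util.Set;
import java.util.TreeSet;

class BitArrayDuplicates {

    // Assumes every value lies in the range 0 to a.length - 1.
    static Set<Integer> findDuplicates(int[] a) {
        BitSet seen = new BitSet(a.length);
        Set<Integer> duplicates = new TreeSet<>();
        for (int value : a) {
            if (value < 0 || value >= a.length) {
                throw new IllegalArgumentException("value out of range: " + value);
            }
            if (seen.get(value)) {
                duplicates.add(value);   // bit already set: seen before
            } else {
                seen.set(value);         // first time we see this value
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        System.out.println(findDuplicates(new int[]{2, 4, 3, 2, 1, 5, 4}));    // prints [2, 4]
        System.out.println(findDuplicates(new int[]{4, 1, 3, 2, 1, 1, 5, 5})); // prints [1, 5]
    }
}
```

Unlike the in-place approach, this version leaves the input array unchanged, at the cost of N bits of extra space.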

Find duplicate numbers in an array without additional space

Can we use the given array itself to keep track of the numbers already visited? How can we change a number in the array and still be able to get the original number back whenever needed? This is where reading the problem statement carefully pays off. Since the array contains only positive numbers, we can negate the number at the index equal to the number visited. If we ever find the number at that index already negative, we have seen the number before, and hence it is a duplicate.

The idea is to make the number at the ith index of the array negative whenever we see number i in the array. If the number at the ith index is already negative, we have already visited i and it is a duplicate. The limitation of this method is that it does not work for negative numbers; it also cannot mark index i when the value stored there is 0, because negating 0 changes nothing, and it modifies the input array.

Duplicate numbers implementation

package AlgorithmsAndMe;

import java.util.HashSet;
import java.util.Set;

public class DuplicatesInArray {

    public Set<Integer> getAllDuplicates(int[] a ) 
                              throws IllegalArgumentException {

        Set<Integer> result = new HashSet<>();

        if(a == null) return result;

        for(int i=0; i<a.length; i++) {
            //In case input is wrong
            if(Math.abs(a[i]) >= a.length ){
               throw new IllegalArgumentException();
            }
            
            if (a[Math.abs(a[i])] < 0) {
                result.add(Math.abs(a[i]));
            } else {
                a[Math.abs(a[i])] = -a[Math.abs(a[i])];
            }
        }
        return result;
    }
}

Test cases

package Test;

import AlgorithmsAndMe.DuplicatesInArray;
import java.util.Set;

public class DuplicatesInArrayTest {

    DuplicatesInArray duplicatesInArray = new DuplicatesInArray();

    @org.junit.Test
    public void testDuplicatesInArray() {
        int [] a = { 1,2,3,4,2,5,4,3,3};
        Set<Integer> result = duplicatesInArray.getAllDuplicates(a);

        result.forEach(s -> System.out.println(s));
    }

    @org.junit.Test
    public void testDuplicatesInArrayWithNullArray() {
        Set<Integer> result = duplicatesInArray.getAllDuplicates(null);

        result.forEach(s -> System.out.println(s));
    }

    //This case should generate an exception as 3 is not less than the size.
    @org.junit.Test
    public void testDuplicatesInArrayWithInvalidInput() {
        int [] a = { 1,2,3};
        try{
             Set<Integer> result = duplicatesInArray.getAllDuplicates(a);
        } catch (IllegalArgumentException  e){
            System.out.println("invalid input provided");
        }
    }
}

The complexity of the algorithm to find duplicate elements in an array is O(n).

Repeated number in array

Repeated number in an array

In the last post, Find missing number in array, we learned how to find a missing number in an array of integers with values in a given range. Today, we will learn how to find a repeated number in an array of integers confined to a range. Note that here too, the numbers are not sorted. If the size of the array is N, then the range of the numbers is 1 to N-1, as one number is repeated. Example:

A = [1,2,3,3,4,5]. The repeated number is 3.
Size of array: 6. Range: 1 to 5.

Repeated number : Algorithm

As we learned while solving the missing number problem earlier, the XOR principle can be applied here too. Why? Because in this case the repeated number will be XORed with itself three times: twice from the array and once from the range. The method relies on two properties of XOR:

A XOR A = 0
0 XOR A = A

When a number is XORed with itself, the result is zero, and when zero is XORed with a number, the result is the number itself. Extending this, if we XOR the same number three times or, more generally, an odd number of times, the result is the number itself.

Using this odd-number-of-times principle, the algorithm to find the repeating number in an array is:

1. XOR all actual numbers in the array. Call it aXOR.
2. XOR all numbers in range 1 to N-1. Call it eXOR
3. XOR aXOR with eXOR. Result will be repeated number.

This works because every number except the repeated one is XORed an even number of times and cancels out. The repeated number is XORed three times, so the final result is the repeated number. Let’s take an example and see if it works.

A = [1,2,2,3,4], so N = 5 and the range is 1 to 4.

aXOR = 001 XOR 010 XOR 010 XOR 011 XOR 100 = 110
eXOR = 001 XOR 010 XOR 011 XOR 100 = 100

aXOR XOR eXOR = 110 XOR 100 = 010, which is 2, the repeated number.

Repeated number in array implementation

public int repeatedNumber(int[] nums) {
 
    int n =  nums.length;
     
    // XOR of the expected range 1 to N-1.
    int nXOR = 0;
    for(int i=1; i<n; i++){
        nXOR ^= i;
    }
     
    int aXOR = 0;
    for(int i=0; i<n; i++){
        aXOR ^= nums[i];
    }
     
    return aXOR ^ nXOR;
}

The time complexity of the XOR method to find a repeated number in an array is O(n).

Find a missing number in array

Missing number in an array

Given an array of N distinct integers in the range 1 to N+1, find the missing number in that array. With N slots and N+1 candidate values, exactly one number must be missing. For example, for A = [1,2,5,4,6], N = 5 and the range is 1 to 6; the output is 3.
For A = [1,5,3,4,7,8,9,2], N = 8 and the range is 1 to 9; the output is 6.

Methods to find a missing number

Using hash
Create a hash with the size equal to N+1. Scan through elements of the array and mark as true in the hash. Go through the hash and find a number which is still set to false. That number will be the missing number in the array.
The complexity of this method is O(n) with additional O(n) space complexity.
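One way to sketch the marking approach is with a plain boolean array standing in for the hash. The class name MissingNumberByMarking is hypothetical, and the sketch assumes the input holds N distinct values from the range 1 to N+1.

```java
class MissingNumberByMarking {

    // Assumes nums holds N distinct values from the range 1 to N+1.
    static int missingNumber(int[] nums) {
        int n = nums.length;
        // present[v] becomes true once value v is seen; index 0 is unused.
        boolean[] present = new boolean[n + 2];
        for (int value : nums) {
            present[value] = true;
        }
        for (int v = 1; v <= n + 1; v++) {
            if (!present[v]) return v;
        }
        // Not reachable for valid input: N values cannot cover N+1 slots.
        throw new IllegalArgumentException("no number is missing");
    }

    public static void main(String[] args) {
        System.out.println(missingNumber(new int[]{1, 2, 5, 4, 6}));          // prints 3
        System.out.println(missingNumber(new int[]{1, 5, 3, 4, 7, 8, 9, 2})); // prints 6
    }
}
```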

Using mathematics
We know that the sum of the first N natural numbers is N*(N+1)/2. If a number is missing, the sum of the numbers in the array will not be equal to the expected sum. The missing number is the difference between the expected sum and the actual sum.

Missing number = (N+1) * (N+2) / 2 - actual sum. Here N+1 takes the place of N in the formula, because the range of numbers is 1 to N+1.
The complexity is O(n). However, there is a catch: the sum may overflow if N is big enough.
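The sum method could look like the sketch below, which sidesteps the overflow risk by accumulating in a long. The class name MissingNumberBySum is made up for illustration, and the input is again assumed to hold N distinct values from the range 1 to N+1.

```java
class MissingNumberBySum {

    // Assumes nums holds N distinct values from the range 1 to N+1.
    static int missingNumber(int[] nums) {
        long n = nums.length;
        // Sum of 1..N+1, computed in long so that the product cannot
        // overflow for any valid Java array length.
        long expected = (n + 1) * (n + 2) / 2;
        long actual = 0;
        for (int value : nums) {
            actual += value;
        }
        return (int) (expected - actual);
    }

    public static void main(String[] args) {
        System.out.println(missingNumber(new int[]{1, 2, 5, 4, 6}));          // prints 3
        System.out.println(missingNumber(new int[]{1, 5, 3, 4, 7, 8, 9, 2})); // prints 6
    }
}
```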

Using XOR
There is a beautiful property of XOR, that is: if we XOR a number with itself, the result will be zero. How can this property help us to find the missing number? In the problem, there are two sets of numbers: the first one is the range 1 to N+1, and others which are actually present in the array. These two sets differ by only one number and that is our missing number. Now if we XOR first set of numbers with the second set of numbers, all except the missing number will cancel each other. The final result will be the actual missing number.

Algorithm to find a missing number using XOR

1. Scan through the entire array and XOR all elements. Call it aXOR
2. Now XOR all numbers for 1 to N+1. Call it eXOR
3. Now XOR aXOR and eXOR, the result is the missing number

Let’s take an example and see if this works

A = [1,3,4,5]. Here N = 4 and the range is 1 to 5.

XORing the bit representations of the actual numbers:
001 XOR 011 XOR 100 XOR 101 = 011 (aXOR)

XORing the bit representations of the expected numbers:
001 XOR 010 XOR 011 XOR 100 XOR 101 = 001 (eXOR)

Now XOR aXOR and eXOR:
011 XOR 001 = 010 = 2, which is the missing number.

Implementation

    public int missingNumber(int[] nums) {
    
        int n =  nums.length;
        
        // XOR of the expected range 1 to N+1.
        int nXOR = 0;
        for(int i=1; i<=n+1; i++){
            nXOR ^= i;
        }
        
        int aXOR = 0;
        for(int i=0; i<n; i++){
            aXOR ^= nums[i];
        }
        
        return aXOR ^ nXOR;
    }

The complexity of the XOR method to find a missing number in an array of integers is O(n) with no additional space complexity.

Segregate 0s and 1s in an array

Given an array of 0s and 1s, segregate the 0s and 1s in such a way that all 0s come before all 1s. For example, in the array below,

[Figure: example input array of 0s and 1s]

The output will be as shown below.

[Figure: the same array with all 0s placed before all 1s]

This problem is very similar to the Dutch national flag problem.

Different methods to segregate 0s and 1s in an array

Counting 0s and 1s.
The first method is to count the occurrences of 0s and 1s in the array and then rewrite 0 and 1 onto the original array that many times. The complexity of this method is O(n) with no added space complexity. The only drawback is that we traverse the array twice.

package com.company;

/**
 * Created by sangar on 9.1.19.
 */
public class SegregateZerosAndOnes {

    public void segregate(int[] a) throws IllegalArgumentException{

        if(a == null) throw new IllegalArgumentException();
        int zeroCount = 0;
        int oneCount = 0;

        for (int i = 0; i < a.length; i++) {
            if (a[i] == 0) zeroCount++;
            else if (a[i] == 1) oneCount++;
            else throw new IllegalArgumentException();
        }

        for (int i = 0; i < zeroCount; i++) {
            a[i] = 0;
        }

        for (int i = zeroCount; i < zeroCount + oneCount; i++) {
            a[i] = 1;
        }
    }
}

Using two indices.
The second method solves this problem with the same complexity while traversing the array only once. The idea is to maintain two indices: left, which starts from index 0, and right, which starts from the end (n-1), where n is the number of elements in the array.
Move left forward until it encounters a 1; similarly, decrement right until a 0 is encountered. If left is less than right, swap the elements at these two indices and continue.

1. Set left = 0 and right = n-1
2. While left < right
2.a while a[left] is 0, left++
2.b while a[right] is 1, right--
2.c if left < right, swap(a[left], a[right])

segregate 0s and 1s implementation

public void segregateOptimized(int[] a) throws IllegalArgumentException{

        if(a == null) throw new IllegalArgumentException();
        int left = 0;
        int right = a.length-1;

        while(left < right){
            while(left < a.length && a[left] == 0) left++;
            while(right >= 0 && a[right] == 1) right--;

            // right == 0 is still a valid index and must be validated below.
            if(left >= a.length || right < 0) return;
            
            if(a[left] > 1 || a[left] < 0 || a[right] > 1 || a[right] < 0)
                throw new IllegalArgumentException();

            if(left < right){
                a[left] = 0;
                a[right] = 1;
            }
        }
    }

The complexity of this method to segregate 0s and 1s in an array is O(n) and only one traversal of the array happens.

Test cases

package test;

import com.company.SegregateZerosAndOnes;
import org.junit.*;
import org.junit.rules.ExpectedException;

import java.util.Arrays;

import static org.junit.jupiter.api.Assertions.assertEquals;

/**
 * Created by sangar on 28.8.18.
 */
public class SegregateZerosAndOnesTest {

    SegregateZerosAndOnes tester = new SegregateZerosAndOnes();

    @Test
    public void segregateZerosAndOnesOptimizedTest() {

        int[] a = {0,1,0,1,0,1};
        int[] output = {0,0,0,1,1,1};

        tester.segregateOptimized(a);
        assertEquals(Arrays.toString(output), Arrays.toString(a));

    }

    @Test
    public void segregateZerosAndOnesAllZerosOptimizedTest() {

        int[] a = {0,0,0,0,0,0};
        int[] output = {0,0,0,0,0,0};

        tester.segregateOptimized(a);
        assertEquals(Arrays.toString(output), Arrays.toString(a));

    }

    @Test
    public void segregateZerosAndOnesAllOnesOptimizedTest() {

        int[] a = {1,1,1,1,1};
        int[] output = {1,1,1,1,1};

        tester.segregateOptimized(a);
        assertEquals(Arrays.toString(output), Arrays.toString(a));

    }

    @Test(expected=IllegalArgumentException.class)
    public void segregateZerosAndOnesOptimizedIllegalArgumentTest() {

        int[] a = {1,1,1,1,2};
        tester.segregateOptimized(a);
    }

    @Test(expected=IllegalArgumentException.class)
    public void segregateZerosAndOnesOptimizedNullArrayTest() {

        tester.segregateOptimized(null);
    }

}

Please share if you have any suggestion or queries. If you are interested in contributing to the website or have an interview experience to share, please contact us at communications@algorithmsandme.com.

Range minimum query (RMQ)

Range minimum query RMQ

Given an array A[0..n-1], find the index of the element with the minimum value in a given range. This problem is known as Range Minimum Query or RMQ.
For example, in the array below, the index of the minimum value between index 2 and index 7 is 5, which is the index of element 1.

[Figure: RMQ range minimum query example array]

Going by brute force, every time a query is fired, we scan the range and find the minimum in it in the same way as we do for an entire array. The complexity of answering each query is O(n), since in the worst case the range is the entire array.

Can we preprocess our data so that query operations are less costly? If we do so, there are two parts to the solution: first the preprocessing and second the query. Let’s assume the complexity of each step is f(n) and g(n) respectively; then the complexity of the solution can be denoted as (f(n), g(n)).

What kind of preprocessing can be done? The basic idea is to calculate the index of the minimum for every possible range in the array. How many ranges are possible for an array with n elements? There are n(n+1)/2 of them, which is of order n^2, because a range is fixed by its start and end indices, with start <= end.

So, to store the index of the minimum value element of each range, O(n^2) space is required. Finding the minimum of one range by scanning takes O(n), so the preprocessing time goes to O(n^3). However, the complexity of a query is O(1). So the overall complexity of the solution is ( O(n^3), O(1) ).

#include <stdio.h>

int M[100][100];

int findMinimum(int a[], int start, int end, int size){
	if(start >= size || end >= size) return -1;
	int min = start;
	for(int i=start; i<=end; i++){
		if( a[i] < a[min] ){
			min = i;
		}
	}
	return min;
	
}
void preprocess(int a[], int size ){
    for(int i=0; i<size; i++){
        // Only ranges with start <= end are meaningful.
        for(int j=i; j<size; j++){
            M[i][j] = findMinimum(a,i,j,size);
        }
    }
}

int rmq(int start, int end){
	return M[start][end];
}

int main(void) {
	
	int a[] = { 2,3,1,5,9,7,10,5,6,3 };
	int size = sizeof(a)/sizeof(a[0]);
	
	//Preprocessing step
	preprocess(a, size);
	printf("\n Minimum index in range is : %d ", rmq(3,9) );
	printf("\n Minimum index in range is : %d ", rmq(2,7) );
	
	return 0;
}

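With the application of dynamic programming, the complexity of the preprocessing step can be reduced to O(n^2): the minimum of range [i, j] follows from the already computed minimum of range [i, j-1] and the element A[j].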

#include <stdio.h>

int M[100][100];

void preprocess(int a[], int size)
{
	int i,j;
	for (i=0; i<size; i++)
		M[i][i] = i;
	
	for (i=0; i<size; i++){
		for (j=i+1; j<size; j++){
			if (a[M[i][j - 1]] < a[j])
				M[i][j] = M[i][j - 1];
			else
				M[i][j] = j;
		}
	}
}

int rmq(int start, int end){
	return M[start][end];
}

int main(void) {
	
	int a[] = { 2,3,1,5,9,7,10,5,6,3 };
	int size = sizeof(a)/sizeof(a[0]);
	
	//Preprocessing step
	preprocess(a, size);
	printf("\n Minimum index in range is : %d ", rmq(3,9) );
	printf("\n Minimum index in range is : %d ", rmq(2,7) );
	
	return 0;
}

Range minimum query with O(n), O(√n) complexity solution

Can we do better on the preprocessing step by trading off some query time? What if we divide the array into smaller chunks and store the index of the minimum value element of each chunk? And what should the size of the chunks be? Let us divide the array into √n chunks, each of size √n.

[Figure: RMQ based on square root partitioning]

Now, find the index of the minimum element in each chunk and store it. The extra space required is O(√n). Finding the minimum of one chunk takes O(√n), so doing it for all √n chunks takes O(√n * √n), which is O(n).

To find the index of the minimum element in a given range, follow these steps:
1. Find the index of the element with the minimum value among all chunks lying entirely between the start and end of the given range. (At most √n operations, if all chunks fall in the range.)
2. Find the index of the minimum in the chunk where the start of the range lies. (At most √n comparisons, from the start of the range to the end of the chunk.)
3. Find the index of the minimum in the chunk where the end of the range lies, from the start of the chunk to the end of the range.
4. Compare all these values and return the index of the minimum of them.

No matter how big or small the range is, the worst case for finding the index of the element with the minimum value is O(√n), as there are at most 3*√n operations.

Let’s take an example and see how it works. Find the minimum in range (2,7).

[Figure: range minimum query (RMQ) example]

To get RMQ(2,7), which chunks lie entirely within the range? There is only one: chunk 1. The minimum index of chunk 1 is M[1], which is 5, so the minimum element among those chunks is A[5].

Find the index of the minimum value in chunk 0, where the start of the range lies (starting from the start of the range, which is 2). There is only one element, at index 2, so the element to compare is A[2].

Find the minimum from the start of the chunk where the end of the range lies. So, we will be comparing A[6] and A[7].

At the end, compare A[5] (minimum of all chunks between the start and end of the range), A[2] (minimum in the chunk where the start of the range lies) and A[6], A[7] (candidates in the chunk where the end of the range lies). The answer is 5, as A[5] is the minimum of all these values.

Putting it all together, we found a way to solve range minimum query with complexity (O(n), O(√n)).
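The square root partitioning scheme above could be sketched as follows. The class name SqrtRmq and its members are hypothetical, and the chunk size is taken as the ceiling of √n, which is one possible choice. The example reuses the array from the listings above.

```java
class SqrtRmq {
    private final int[] a;
    private final int blockSize;
    private final int[] blockMin; // blockMin[b] = index of the minimum in chunk b

    // Preprocessing: O(n) time, O(sqrt(n)) extra space.
    SqrtRmq(int[] a) {
        this.a = a;
        this.blockSize = Math.max(1, (int) Math.ceil(Math.sqrt(a.length)));
        int blocks = (a.length + blockSize - 1) / blockSize;
        this.blockMin = new int[blocks];
        for (int b = 0; b < blocks; b++) {
            int start = b * blockSize;
            int end = Math.min(start + blockSize, a.length);
            int min = start;
            for (int i = start + 1; i < end; i++) {
                if (a[i] < a[min]) min = i;
            }
            blockMin[b] = min;
        }
    }

    // Index of the minimum value in a[start..end], both inclusive.
    int query(int start, int end) {
        int min = start;
        int i = start;
        while (i <= end) {
            if (i % blockSize == 0 && i + blockSize - 1 <= end) {
                // A whole chunk lies inside the range: use its stored minimum.
                int candidate = blockMin[i / blockSize];
                if (a[candidate] < a[min]) min = candidate;
                i += blockSize;
            } else {
                // Partial chunk at either end of the range: scan element by element.
                if (a[i] < a[min]) min = i;
                i++;
            }
        }
        return min;
    }

    public static void main(String[] args) {
        int[] a = {2, 3, 1, 5, 9, 7, 10, 5, 6, 3};
        SqrtRmq rmq = new SqrtRmq(a);
        System.out.println(rmq.query(3, 9)); // prints 9
        System.out.println(rmq.query(2, 7)); // prints 2
    }
}
```

The single loop in query covers all three cases from the steps above: it scans the partial chunk at the start, jumps over whole chunks using the stored minimums, and scans the partial chunk at the end.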

RMQ using sparse table

Method 3 uses only O(√n) space; however, its query time complexity is also O(√n). To reduce query time at the expense of space, there is another method called the sparse table method. This method combines features of method 2 (dynamic programming) and method 3 (finding minimums of chunks).

In this approach, split the input array into chunks of size 2^j, where j varies from 0 to log n and n is the number of elements in the array. There will be n log n such chunks, and hence the space complexity becomes O(n log n).

After splitting, find the index of the minimum element in each chunk and store it in a lookup table.

M[i][j] stores the index of the minimum in the range starting at i with size 2^j.

[Figure: RMQ using sparse table]

For example, M[0][3] stores the index of the minimum value between 0 and 7 (2^3 = 8 elements).

Now the problem is how to create this lookup table. The table can be created using dynamic programming from the bottom up. Specifically, we find the index of the minimum value in a block of size 2^j by comparing the two minima of its two constituent blocks of size 2^(j-1). More formally,

M[i,j] = M[i, j-1] if A[M[i, j-1]] <= A[M[i+2^(j-1), j-1]]
M[i,j] = M[i+2^(j-1), j-1] otherwise.

How do we find the index of the minimum value in a given range? The idea is to find two subranges which together cover the entire range and then take the minimum of the minimums of these two subranges.
For example, find RMQ(i,j). If 2^k is the size of the largest block that fits into the range from i to j, then k = floor(log2(j - i + 1)).

Now, we have two parts to look at: from i to i+2^k-1 (already computed as M[i,k]) and from j-2^k+1 to j (already computed as M[j-2^k+1, k]).

Formally,

    RMQ(i,j) =  M[i][k] if A[ M[i][k] ] <= A[ M[j-2^k+1][k] ]
    RMQ(i,j) =  M[j-2^k+1][k] otherwise.

RMQ implementation using sparse table

#include <stdio.h>
#include <math.h>

int M[100][100];

void preprocess(int a[], int size)
{
    int i, j;
	
    for (i = 0; i < size; i++)
        M[i][0] = i;
		
    for (j = 1; (1 << j) <= size; j++){
        for (i = 0; i + (1 << j) - 1 < size; i++){
            if (a[M[i][j - 1]] < a[M[i + (1 << (j - 1))][j - 1]])
                M[i][j] = M[i][j - 1];
            else
                M[i][j] = M[i + (1 << (j - 1))][j - 1];
        }
    }
}  
  
int rmq(int a[], int start, int end){
    // Largest power of two that fits in the range: floor(log2(length)).
    int j = (int)floor(log2(end - start + 1));

    if ( a[M[start][j]] <= a[M[end-(1<<j)+1][j]] )
        return M[start][j];
    else 
        return M[end-(1<<j)+1][j];
}

int main(void) {
	
	int a[] = { 2,3,1,5,9,7,10,5,6,3 };
	int size = sizeof(a)/sizeof(a[0]);
	
	//Preprocessing step
	preprocess(a, size);
	printf("\n Minimum index in range is : %d ", rmq(a,3,9) );
	printf("\n Minimum index in range is : %d ", rmq(a,2,7) );
	
	return 0;
}

The two blocks used in the query entirely cover the range, and since only one comparison is required, the complexity of a lookup is O(1).

In this post, we discussed various ways to implement range minimum query based on the tradeoff between space and time complexity. In future posts, we will discuss applications of RMQ such as segment trees and the lowest common ancestor problem.

Number of occurrences of element

Given a sorted array and a key, find the number of occurrences of the key in that array. For example, in the array below, the number of occurrences of 3 is 3.

[Figure: number of occurrences of element, example array]

The brute force method is to scan through the array, find the first instance of the element, then find the last instance, and then do the math. The complexity of that method is O(n). Can we do better than that?

Did you get a hint when the brute force method was described? Yes, we have already cracked the problems of finding the first occurrence and the last occurrence in O(log n) complexity earlier. We will use those two methods; all we need to do now is the math.

occurrences = lastInstance - firstInstance + 1

Number of occurrences of element : Implementation.

package com.company;

/**
 * Created by sangar on 25.3.18.
 */
public class BinarySearcchAlgorithm {

    private static boolean isGreaterThanEqualTo(int[] a, int index, int key){
        if(a[index] >= key) return true;

        return false;
    }

    private static boolean isLessThanEqualTo(int[] a, int index, int key){
        if(a[index] <= key) return true;

        return false;
    }

    private static int findFirstOccurance(int[] nums, int target){
        int start = 0;
        int end = nums.length-1;
        
        while(start<end){
            int mid =  start + (end-start)/2;
            
            if(isGreaterThanEqualTo(nums, mid, target)){
                end = mid;
            }
            else{
                start = mid+1;
            }
        }
        return start < nums.length && nums[start] == target ? start : -1;
    }
    
    private static int findLastOccurance(int[] nums, int target){
        int start = 0;
        int end = nums.length-1;
        
        while(start<=end){
            int mid =  start + (end-start)/2;
        
            if(isLessThanEqualTo(nums, mid, target)){
                start = mid+1;
            }
            else{
                end = mid-1;
            }
        }
        return end >= 0 && nums[end] == target ? end : -1;
    }

    public  static  int numberOfOccurrences(int[] a, int key){
        int firstInstance = findFirstOccurance(a, key);
        int lastInstance = findLastOccurance(a, key);

        return (firstInstance != -1) ? lastInstance-firstInstance + 1 : 0;
    }

    public static void main(String[] args) {
        int[] input = {3,10,11,15,17,17,17,20};

        int count = numberOfOccurrences(input,3);
        System.out.print(count == 0 ? "Element not found" : "Number of occurrences : " + count);

    }
}

The worst case time complexity of the algorithm to find the number of occurrences of an element in a sorted array is O(log n). We are using the iterative method to find the first and last instances, therefore, there is no hidden space complexity of the algorithm.

Longest Substring Without Repeating Characters

Given a string, find the longest substring without repeating characters in it. For example, for S = “abcaabaca”, a longest substring without repeating characters is “abc”.

The brute force solution is to scan all substrings of the given string and check which one has the longest length and no repeating characters. A string of size n has n * (n+1) / 2 substrings, and checking each for unique characters takes n comparisons in the worst case. So, the worst case complexity of this algorithm is O(n^3), with additional space of O(n). The code is simple enough.

package com.company;

import java.util.HashMap;

/**
 * Created by sangar on 1.1.18.
 */
public class NonRepeatingCharacters {

    private static boolean allUniqueCharacters(String s, int start, int end) {

        HashMap<Character, Boolean> characters = new HashMap<>();

        for (char c : s.substring(start, end).toCharArray()) {
            if(characters.containsKey(c)) return false;
            characters.put(c, Boolean.TRUE);
        }
        return true;
    }

    private static int longestSubstringWithoutRepeatingCharacters(String s) {
        int len = s.length();
        int maxLength = 0;
          
        for (int i =0; i < len; i++){
            // j is an exclusive end index, so it must be allowed to reach len.
            for (int j=i+1; j<=len; j++){
                int length = j-i;
                if (allUniqueCharacters(s, i, j)){
                    maxLength = Integer.max(maxLength, length);
                }
            }
        }
        return maxLength;
    }

    public static void main(String[] args) {

	String s = "abcdabcbb";
        System.out.println("Longest substring without repeating characters: " +
                longestSubstringWithoutRepeatingCharacters(s));

    }
}

Longest Substring Without Repeating Characters : Sliding window approach

A sliding window is an abstract concept commonly used in array and string problems. A window is a range of elements in an array or string, defined by start and end indices. A sliding window is a window that “slides” its two boundaries in a certain direction.

In the brute force approach, we repeatedly checked each substring for unique characters. Do we need to check each substring? If a substring s[i,j-1] contains no repeating characters, then when adding the jth character we only need to check whether that character is already present in s[i,j-1]. If we scan the substring to ascertain the uniqueness of the new character, the complexity of this algorithm is O(n^2).
How about optimizing the scanning part? What if a hash is used to store the characters already seen in substring s[i,j-1]? In that case, checking the uniqueness of a new character is done in O(1), and the overall complexity of the algorithm becomes linear.

 public  static int longestSubstringWithoutRepeatingCharacters(String s) {
        int len = s.length();
        HashMap<Character, Boolean> characters = new HashMap<>();

        int maxLength = 0;
        int start = 0;
        int  end = 0;
        while (start < len && end < len) {
            //Check only the last character.
            if(!characters.containsKey(s.charAt(end))){
                characters.put(s.charAt(end), Boolean.TRUE);
                end++;
                //Window [start, end) grew by one unique character.
                maxLength = Integer.max(maxLength, end-start);
            }
            else {
                //Move start of window one position ahead.
                characters.remove(s.charAt(start));
                start++;
            }
        }
        return maxLength;
    }

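If the character is already present in the substring s[i, j-1], it cannot be added to the current window. Find the length of the window (j-i) and compare it with the current maximum length; if it is greater, the maximum length of the substring without repeating characters becomes (j-i). In the code, i is start and j is end.
Finally, move the start of the window one position ahead and check the same character again; this repeats until the earlier copy of the duplicate has left the window.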

Below is an example execution of the above approach for the input "abcabcbb".

Current Character : a
Substring (  ) does not contain a
New length of substring without repeating character : 1
Current Character : b
Substring ( a ) does not contain b
New length of substring without repeating character : 2

Current Character : c
Substring ( ab ) does not contain c
New length of substring without repeating character : 3

Current Character : a
Substring (abc) contains a
Advancing i to 1

Current Character : a
Substring ( bc ) does not contain a
New length of substring without repeating character : 3

Current Character : b
Substring (bca) contains b
Advancing i to 2

Current Character : b
Substring ( ca ) does not contain b
New length of substring without repeating character : 3

Current Character : c
Substring (cab) contains c
Advancing i to 3

Current Character : c
Substring ( ab ) does not contain c
New length of substring without repeating character : 3

Current Character : b
Substring (abc) contains b
Advancing i to 4

Current Character : b
Substring (bc) contains b
Advancing i to 5

Current Character : b
Substring ( c ) does not contain b
New length of substring without repeating character : 3

Current Character : b
Substring (cb) contains b
Advancing i to 6

Current Character : b
Substring (b) contains b
Advancing i to 7

Current Character : b
Substring (  ) does not contain b
New length of substring without repeating character : 3

Longest substring without repeating characters : 3

There is a small optimization which helps us skip more characters when a repeating character is found, instead of skipping one at a time. Store the index of each character seen in the substring s[i, j-1]. While processing the j-th character, if it is already in the hash, we know the index j’ where that character last occurred in the string. No substring that contains both j’ and j can have all unique characters. So, we skip all indices from i to j’ and start from j’+1 instead of i+1 as in the above method. If j’ lies before i, the earlier copy is already outside the window and i stays where it is.

    public static int longestSubstringWithoutRepeatingCharacters3(String s) {
        int len = s.length();
        //Maps each character to the last index at which it was seen.
        HashMap<Character, Integer> characters = new HashMap<>();

        int maxLength = 0;

        for (int start = 0, end = 0; end < len; end++) {
            if (characters.containsKey(s.charAt(end))) {
                //find the index of duplicate character.
                int currentIndex = characters.get(s.charAt(end));
                //Jump past the duplicate, but never move start backwards
                //when the duplicate lies before the current window.
                start = Integer.max(currentIndex + 1, start);
            }
            //Window s[start, end] is inclusive on both ends.
            int currentLength = end - start + 1;
            maxLength = Integer.max(maxLength, currentLength);
            //Update new location of duplicate character
            characters.put(s.charAt(end), end);
        }
        return maxLength;
    }

The complexity of finding the longest substring without repeating characters is hence O(n), with an additional space complexity of O(n).
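If the input is known to use a small alphabet, the hash can be swapped for a plain array, which bounds the extra space by the alphabet size rather than by n. The sketch below is a variation on the idea above, not the post's own code: it assumes every character is 7-bit ASCII (code below 128), and the names AsciiWindow, longestUnique and lastSeen are made up for illustration.

```java
import java.util.Arrays;

class AsciiWindow {
    // Assumption: every character of s has a code below 128 (plain ASCII).
    // Anything else would throw ArrayIndexOutOfBoundsException.
    static int longestUnique(String s) {
        int[] lastSeen = new int[128]; // last index of each character, -1 if unseen
        Arrays.fill(lastSeen, -1);
        int maxLength = 0;
        int start = 0;
        for (int end = 0; end < s.length(); end++) {
            char c = s.charAt(end);
            // Jump past the previous copy only if it lies inside the window.
            start = Math.max(start, lastSeen[c] + 1);
            lastSeen[c] = end;
            maxLength = Math.max(maxLength, end - start + 1);
        }
        return maxLength;
    }
}
```

For example, AsciiWindow.longestUnique("abba") gives 2: when the second 'a' is processed, its earlier copy at index 0 is already outside the window, so start stays at 2.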
Please share if something is wrong or missing. We would love to hear from you.

Merge overlapping intervals


Given N intervals S = {E_1, E_2, ..., E_n}, where each E_i has a start time s_i and an end time e_i, merge the intervals that overlap. Just to clarify, assuming E_i starts first, E_i and E_j overlap when the start time of E_j, i.e. s_j, is less than the end time of E_i, i.e. e_i. For example, [(1,3),(2,4),(5,8),(6,9)] should transform into [(1,4),(5,9)], as intervals (1,3) and (2,4) overlap and intervals (5,8) and (6,9) also overlap.


Merge overlapping intervals  : Thought process

As we always do, first try to come up with a brute force solution: given enough time, space and money, how would you solve this?
The natural course is to take the i-th interval and compare its end time with the start time of every j-th interval that starts after it. If the start time of the j-th interval is less than the end time of the i-th interval, then you can merge the two intervals. What should the end time of the merged interval be? It should be the maximum of the end times of the two merged intervals.

What will be the time complexity of this approach? We are not using any additional space; however, the worst case time complexity is O(n^2). Can we do better?

What are the two times we are comparing in the brute force solution? It’s the start time of one interval with the end time of another. If we arrange the input in a specific order, can we avoid processing some entries?

If we sort all intervals based on their start time, then s_i <= s_{i+1} <= s_{i+2}. Also, an interval is always forward looking: e_i > s_i, e_{i+1} > s_{i+1} and so on.

If s_i is greater than e_{i-1}, then s_{i+1} will also be greater than e_{i-1}, so there is no need to compare s_{i+1} with e_{i-1}; that is, there is no need to look beyond the immediately preceding interval for any interval E_i. If s_i is less than e_{i-1}, update e_{i-1} with the maximum of e_{i-1} and e_i and move on to E_{i+1}.
Notice that we need the last interval E_{i-1} to decide whether to merge the new interval into the previous one or keep it standalone. A stack is a convenient data structure for this. The algorithm will look like:

  1. Consider interval E_i.
  2. If the stack is empty, push E_i on to the stack.
  3. If the stack is not empty, pop the interval at the top of the stack and call it E_{i-1}.
  4. Compare s_i, the start time of E_i, with e_{i-1}, the end time of E_{i-1}.
  5. If s_i is less than e_{i-1}, update e_{i-1} to max(e_{i-1}, e_i), i.e. the maximum of the end times of the two intervals, and push E_{i-1} back on to the stack.
  6. Else push E_{i-1} back on to the stack and then push E_i on top of it.
  7. Continue till all intervals are considered.
  8. At the end of processing, the stack will contain all merged intervals.

Let’s take an example and see how this algorithm works. Say we have the intervals [(1,3),(2,4),(5,8),(6,9)] from the problem statement and we have to merge the overlapping ones.

First of all, sort all intervals based on their start time. This input is already sorted.

Create a stack and start with the first interval (1,3). Since the stack is empty, we push the first interval on to the stack.

After pushing the first interval, the stack contains [(1,3)].

Take the second interval (2,4). Its start time (2) is less than the end time (3) of the previous interval on the stack; hence, find the maximum of the end times of these two intervals, update the last interval with that end time and push it back on to the stack. The stack now contains [(1,4)].

Look at the third interval (5,8). Its start time is greater than the end time of the interval on top of the stack, so just push the interval on to the stack, which now contains [(1,4),(5,8)].

For the last interval (6,9), the start time of the new interval is less than the end time of the interval on top of the stack.

Find the maximum of the end times of the two intervals, update the previous interval with that end time and push it back on to the stack. The stack now contains [(1,4),(5,9)].

At this point, when there are no more intervals remaining, the stack contains all the merged intervals.

Merge overlapping intervals : Implementation

package com.company;


import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.Stack;

/**
 * Created by sangar on 8.4.18.
 */
public class OverlappingIntervals {
    public  static ArrayList<Interval>
        mergeOverlappingIntervals(ArrayList<Interval> intervals){

        ArrayList<Interval> mergedIntervals = new ArrayList<>();
        Stack<Interval> s = new Stack<>();

        //Sort the ArrayList of interval based on start time.
        Collections.sort(intervals, Comparator.comparing(p -> p.getStartTime()));
        for(Interval currentInterval : intervals){
            if(s.empty())s.push(currentInterval);
            else {
                Interval previousInterval = s.pop();
                if(previousInterval.getEndTime() > 
                     currentInterval.getStartTime()){
                    /*
                    If current interval's start time is less than end time of
                    previous interval, find max of end times of two intervals
                    and push new interval on to stack.
                     */
                    int endTime = Integer.max(previousInterval.getEndTime(),
                                              currentInterval.getEndTime());
                    /* Notice that we create a new interval and
                       do not update the old one.
                       This keeps Interval objects immutable.
                     */
                    s.push(new Interval(previousInterval.getStartTime(),
                                        endTime));
                }
                else{
                    s.push(previousInterval);
                    s.push(currentInterval);
                }
            }
        }
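        //A Stack iterates from bottom to top, so addAll keeps the merged
        //intervals sorted by start time; popping would reverse them.
        mergedIntervals.addAll(s);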

        return mergedIntervals;
    }

    public static void main(String[] args) {
        ArrayList<Interval> intervals = new ArrayList<>();

        intervals.add(new Interval(1,3));
        intervals.add(new Interval(2,4));
        intervals.add(new Interval(5,8));
        intervals.add(new Interval(6,9));
        ArrayList<Interval> mergedIntervals = mergeOverlappingIntervals(intervals);
        for (Interval interval : mergedIntervals){
            System.out.print("(" + interval.getStartTime() +"," + interval.getEndTime() + ")");
        }
    }
}

The complexity of the algorithm to merge overlapping intervals is O(n log n) due to sorting, with O(n) extra space for the stack; copying into the list to return also takes O(n) space.
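The listings in this post use an Interval class that is not shown. A minimal sketch of what it might look like is given below; only the two-argument constructor, getStartTime, getEndTime and setEndTime are taken from the calls made in this post, everything else (field names, toString) is an assumption made for illustration.

```java
// Hypothetical Interval class; only the constructor and the accessors
// called by the snippets in this post are modelled.
class Interval {
    private final int startTime;
    private int endTime; // not final: the list-based variant calls setEndTime

    public Interval(int startTime, int endTime) {
        this.startTime = startTime;
        this.endTime = endTime;
    }

    public int getStartTime() {
        return startTime;
    }

    public int getEndTime() {
        return endTime;
    }

    public void setEndTime(int endTime) {
        this.endTime = endTime;
    }

    @Override
    public String toString() {
        return "(" + startTime + "," + endTime + ")";
    }
}
```

If only the stack-based version is used, the setter can be dropped and endTime made final, which makes the class truly immutable.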

There is another way to implement the same function without using a stack. Here we use the fact that an ArrayList in Java is backed by an array, so getting an element at a particular index is an O(1) operation. The code looks more or less the same; however, there is no traversal of the stack at the end to create the list to return.

public List<Interval> mergeOptimized(List<Interval> intervals) {

        if(intervals.size() == 0) return intervals;

        //Integer.compare avoids the overflow that subtracting
        //two start times can cause.
        Collections.sort(intervals,
           (Interval a, Interval b) ->
               Integer.compare(a.getStartTime(), b.getStartTime()));

        List<Interval> mergedIntervals = new ArrayList<Interval>();
        for(Interval interval : intervals){

            /*If the merged list is empty, add the interval to it;
              otherwise check if the last interval in merged list overlaps.

              Remember the get function on ArrayList is O(1) operation
              because ArrayLists in Java are backed by arrays */
            if(mergedIntervals.isEmpty()
                    || mergedIntervals.get(mergedIntervals.size()-1).getEndTime() < 
                       interval.getStartTime() ){
                mergedIntervals.add(interval);
            }
            else {
                int lastEndTime = Math.max(
                        mergedIntervals.get(mergedIntervals.size()-1).getEndTime(),
                        interval.getEndTime()
                );
                mergedIntervals.get(mergedIntervals.size()-1).setEndTime(lastEndTime);
            }
        }

        return mergedIntervals;
    }

You can submit the above snippet of code for this LeetCode problem and it should be accepted.
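When the intervals arrive as pairs in an int[][] instead of Interval objects, the same list-based idea can be written as below. This is a sketch under that assumption, not the post's own code: the class name MergeIntervalsArrays is made up, each row is taken to be {start, end}, and, like the list-based version above, intervals that merely touch (end equal to the next start) are merged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MergeIntervalsArrays {
    // Each row of intervals is assumed to be {start, end}.
    static int[][] merge(int[][] intervals) {
        // Work on a sorted copy so the caller's array is left untouched.
        int[][] sorted = new int[intervals.length][];
        for (int i = 0; i < intervals.length; i++) {
            sorted[i] = intervals[i].clone();
        }
        Arrays.sort(sorted, (a, b) -> Integer.compare(a[0], b[0]));

        List<int[]> merged = new ArrayList<>();
        for (int[] current : sorted) {
            int[] last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            if (last == null || last[1] < current[0]) {
                // No overlap with the last merged interval: start a new one.
                merged.add(current);
            } else {
                // Overlap: extend the last merged interval if needed.
                last[1] = Math.max(last[1], current[1]);
            }
        }
        return merged.toArray(new int[0][]);
    }

    public static void main(String[] args) {
        int[][] result = merge(new int[][] {{1, 3}, {2, 4}, {5, 8}, {6, 9}});
        System.out.println(Arrays.deepToString(result)); // prints [[1, 4], [5, 9]]
    }
}
```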

Please share if there is something missing or wrong. Also, please reach out to us at communications@algorithmsandme.com if you want to contribute to the website and help others to learn by sharing your knowledge. If you are preparing for an interview and need some coaching to prepare for it, please sign up for the free session with us.

Subarray with sum zero


Given an array of positive and negative integers, find a subarray with sum zero in that array. For example, in the array given below, there are two subarrays whose elements sum to zero.

[Figure: input array, with the subarrays that add up to zero highlighted]

The brute force method to find a subarray with sum zero is to consider every subarray of the array and add up its elements to see if any subarray adds up to zero. An array of size n has n(n+1)/2 non-empty subarrays, so, if a running sum is kept while each subarray is extended, the complexity of the brute force solution is O(n^2).

package com.company;

import java.util.Arrays;
import java.util.HashMap;

/**
 * Created by sangar on 3.12.18.
 */
public class SubarrayWithZeroSum {
    public int [] findSubarrayWithZeroSumBrute(int[] a){
        int len = a.length;

        for(int i=0; i<len; i++){
            int  sum  = 0;
            for(int j=i; j<len; j++){
                sum += a[j];
                if(sum == 0){
                    return Arrays.copyOfRange(a,i,j+1);
                }
            }
        }
        return new int[0];
    }
}

Test cases

package test;

import com.company.SubarrayWithZeroSum;
import org.junit.jupiter.api.Test;

import java.util.Arrays;

import static org.junit.jupiter.api.Assertions.assertEquals;

/**
 * Created by sangar on 23.9.18.
 */
public class SubarrayWithSumZeroTest {

    SubarrayWithZeroSum tester = new SubarrayWithZeroSum();

    @Test
    public void subarrayWithZeroSumBruteTest() {

        int[] a = {2, -3, -1, 4};
        int [] output = {-3, -1, 4};
        assertEquals(Arrays.toString(output),
              Arrays.toString(tester.findSubarrayWithZeroSumBrute(a)));
    }

    @Test
    public void subarrayWithZeroSumBruteNoSubArrayTest() {

        int[] a = {2, -3, -2, 4};
        int [] output = {};
        assertEquals(Arrays.toString(output),
              Arrays.toString(tester.findSubarrayWithZeroSumBrute(a)));
    }

    @Test
    public void subarrayWithZeroSumBruteOneElementTest() {

        int[] a = {2, 0, -1, 4};
        int [] output = {0};
        assertEquals(Arrays.toString(output),
              Arrays.toString(tester.findSubarrayWithZeroSumBrute(a)));
    }
}

Find subarray with sum zero: thoughts

A subarray is a contiguous part of an array. Let’s say we find the sum of the subarray starting at 0 and ending at any index i. So, T[i] represents the sum of the subarray A[0..i].

What if we have two indices i and j such that i < j and T[i] = T[j]? In this case, all the elements from index i+1 to index j add up to zero, and that is our subarray with sum zero. Likewise, if T[i] itself is zero, the subarray A[0..i] adds up to zero. For example, for A = [2, -3, -1, 4], T = [2, -1, -2, 2]; T[0] = T[3], so A[1..3] = [-3, -1, 4] adds up to zero.
The length of the subarray with sum zero will be j-i.

Implementation

package com.company;

import java.util.Arrays;
import java.util.HashMap;

/**
 * Created by sangar on 3.12.18.
 */
public class SubarrayWithZeroSum {
    public int [] findSubarrayWithZeroSum(int[] a){

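        int len = a.length;
        if(len == 0) return new int[0];

        //T[i] is the sum of subarray a[0..i]
        int [] T = new int[len];

        T[0] = a[0];
        for(int i=1; i<len; i++){
            T[i] = T[i-1] + a[i];
        }

        //Complexity of below code is O(n^2)

        for(int i=0; i<len; i++){
            //A prefix sum of zero means a[0..i] itself adds up to zero.
            if(T[i] == 0){
                return Arrays.copyOfRange(a, 0, i+1);
            }
            for(int j=i+1; j<len; j++){
                if(T[i]== T[j]){
                    return Arrays.copyOfRange(a, i+1, j+1);
                }
            }
        }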
        return new int[0];
    }
}

Test cases

package test;

import com.company.SubarrayWithZeroSum;
import org.junit.jupiter.api.Test;

import java.util.Arrays;

import static org.junit.jupiter.api.Assertions.assertEquals;

/**
 * Created by sangar on 23.9.18.
 */
public class SubarrayWithSumZeroTest {

    SubarrayWithZeroSum tester = new SubarrayWithZeroSum();

    @Test
    public void subarrayWithZeroSumTest() {

        int[] a = {2, -3, -1, 4};
        int [] output = {-3, -1, 4};
        assertEquals(Arrays.toString(output),
                Arrays.toString(tester.findSubarrayWithZeroSum(a)));
    }

    @Test
    public void subarrayWithZeroSumNoSubArrayTest() {

        int[] a = {2, -3, -2, 4};
        int [] output = {};
        assertEquals(Arrays.toString(output),
                Arrays.toString(tester.findSubarrayWithZeroSum(a)));
    }

    @Test
    public void subarrayWithZeroSumOneElementTest() {

        int[] a = {2, 0, -1, 4};
        int [] output = {0};
        assertEquals(Arrays.toString(output),
                Arrays.toString(tester.findSubarrayWithZeroSum(a)));
    }
}

The complexity of the algorithm to find a subarray with zero sum in a given array of integers is O(n^2), with an additional space complexity of O(n) to store the sum till each index i.

We can optimize it further by creating a hash of all the sums which we see while adding. When we add the element at index i to the already calculated sum till index i-1, we check if the new sum is zero. If yes, then the subarray from 0 to index i adds up to zero. If a sum equal to the current sum is already present in the hash, then there is a subarray with sum zero starting just after the index where we saw that sum and ending at the current index.

package com.company;

import java.util.Arrays;
import java.util.HashMap;

/**
 * Created by sangar on 3.12.18.
 */
public class SubarrayWithZeroSum {

    public int [] findSubarrayWithZeroSumOptimized(int[] a){

        int len = a.length;

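        HashMap<Integer, Integer> T = new HashMap<Integer, Integer>();
        //Sum of the empty prefix; without it a zero-sum subarray
        //starting at index 0 would be missed.
        T.put(0, -1);

        int sum  = 0 ;
        for(int i=0; i<len; i++){
            sum  += a[i];
            if(T.get(sum) != null){
                return Arrays.copyOfRange(a,T.get(sum)+1, i+1);
            }
            T.put(sum, i);
        }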

        return new int[0];
    }
}

Test cases

package test;

import com.company.SubarrayWithZeroSum;
import org.junit.jupiter.api.Test;

import java.util.Arrays;

import static org.junit.jupiter.api.Assertions.assertEquals;

/**
 * Created by sangar on 23.9.18.
 */
public class SubarrayWithSumZeroTest {

    SubarrayWithZeroSum tester = new SubarrayWithZeroSum();

    @Test
    public void subarrayWithZeroSumOptimizedTest() {

        int[] a = {2, -3, -1, 4};
        int [] output = {-3, -1, 4};
        assertEquals(Arrays.toString(output),
          Arrays.toString(tester.findSubarrayWithZeroSumOptimized(a)));
    }

    @Test
    public void subarrayWithZeroSumOptimizedNoSubArrayTest() {

        int[] a = {2, -3, -2, 4};
        int [] output = {};
        assertEquals(Arrays.toString(output),
          Arrays.toString(tester.findSubarrayWithZeroSumOptimized(a)));
    }

    @Test
    public void subarrayWithZeroSumOptimizedOneElementTest() {

        int[] a = {2, 0, -1, 4};
        int [] output = {0};
        assertEquals(Arrays.toString(output),
          Arrays.toString(tester.findSubarrayWithZeroSumOptimized(a)));
    }

}

The complexity of this method is O(n) with additional space of O(n) in worst case.
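If the positions of the subarray are wanted rather than a copy of its elements, the same hashing idea can return an index range. The following is a rough sketch, not the post's implementation: the names ZeroSumRange, find and firstIndexOfSum are made up, the running sum is kept in a long (a choice made here to avoid int overflow on large inputs), and null is returned when no zero-sum subarray exists.

```java
import java.util.HashMap;
import java.util.Map;

class ZeroSumRange {
    // Returns {from, to} (inclusive indices) of a zero-sum subarray,
    // or null if the array has none.
    static int[] find(int[] a) {
        Map<Long, Integer> firstIndexOfSum = new HashMap<>();
        // Sum of the empty prefix, so that a[0..i] itself can be found.
        firstIndexOfSum.put(0L, -1);
        long sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i];
            Integer earlier = firstIndexOfSum.get(sum);
            if (earlier != null) {
                // Same prefix sum seen before: a[earlier+1..i] adds up to zero.
                return new int[] {earlier + 1, i};
            }
            firstIndexOfSum.put(sum, i);
        }
        return null;
    }
}
```

For the array {2, -3, -1, 4} this returns {1, 3}, the range holding -3, -1 and 4; for {1, -1} it returns {0, 1}, which is the case the seeded entry for sum 0 takes care of.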

Please share if there is something wrong or missing. If you are preparing for an interview, please signup for free interview kit.