## Subarrays with given sum

We are given an unsorted array containing positive and negative numbers, we need to find the number of subarrays with the given sum, k. For example,

```Input:
arr = [2, 4, -5, -5, 6] and k = -10
Output:
1
Explanation:
We can see that there is only one subarray (index 2 to 3) whose sum is -10. ```

## Thought process

If the sum of elements in the range [0, i] and [0, j] is the same, then we can say that the sum between [i + 1, j] will be zero. This statement will be more clear after understanding the below example.

Let’s take arr = [2, 4, 4, -10, 4, 6, 5, 6]. Sum of elements in the range [0, 2] = 2 + 4 + 4 = 10
Sum of elements in the range [0, 5] = 2 + 4 + 4 + (-10) + 4 + 6 = 10

The above two sums are the same. Here,i= 2 and j = 5. Let’s check whether the sum in the range [3, 5] is 0 or not. Sum of elements in the range [3, 5] = -10 + 4 + 6 = 0

Hence, the statement explained above is true.

Read more here: subarray with sum zero

We can extend this idea to this question. The modified statement is as follows if the difference of the sum of elements in the range [0, j]and [0, i] is k, then we can say that the sum between [i + 1, j] is k. Mathematically,

```If Sum[0, j] - Sum[0, i] = k, where j > i
Then, Sum[i + 1, j] = k or Sum[0, j] - k = Sum[0, i]```

For each index j in the array, if we have seen Sum[0, j] – k in the past, then we have found the subarray. But how to achieve this?

Let’s number of elements in an array is n. So, we need to maintain sum[0, 0], sum[0, 1], sum[0,2], sum[0, 3] … sum[0, n].
These sums are known as prefix sum. If the same sum occurs many times, we store the frequency of occurrence. We can use a hashmap where the key will denote the sum and value will denote the frequency of occurrence.

Any corner cases? What if k = sum[0, j], where j is an index. For this, sum[0, j] – k is 0 and we will try to find if we have seen 0 as the sum in the past. To handle this case, we will insert one entry to map as a map[0] = 1.

## Algorithm

```1. For each index in the array.
1.1 sum = sum in the range [0, i]
1.2 Check if sum - k is already encountered in the array,
i.e have we encountered any array in the past whose
the sum is sum - k.
1.3 If yes, add the frequency of sum - k to answer.
2. Finally, put the sum in the map with frequency as 1 if the sum is not seen in the past.
3. If the sum is seen in the past, update frequency as old frequency + 1.

### Show me the implementation

```// nums = given array
// k = sum
int subarraySum(vector<int>& nums, int k) {
int ans = 0;   // to store the final count
int sum = 0;  // to store sum

// to store prefix sum along with frequency
map<int, int> map;
map[0] = 1;  // for case like k = sum[0, i]

for(int i = 0; i < nums.size(); i++)
{
// calculating sum in the range[0, i]
sum += nums[i];
// if sum - k is alread seen
if(map.find(sum - k) != map.end())
ans += map[sum - k]; // add frequency

// if sum is not seen in the past
if(map.find(sum) == map.end())
map[sum] = 1;
else
map[sum] = map[sum] + 1;
}

return and;
}

```

Time Complexity of the implementation is O(n) along with the space Complexity of O(n).

## Three Sum Problem

Given an array nums of n integers, are there elements a, b,c in nums such that a+ b + c= 0? Find all unique triplets in the array which gives the sum of zero.

The solution set must not contain duplicate triplets.

This problem is commonly known as three sum problem.

```Input: = [-1, 0, 1, 2, -1, -4],
Output:
[
[-1, 0, 1],
[-1, -1, 2]
]
```

## Three Sum problem : thought process

Before going into the details of find three numbers with a given sum, do you know how to find two numbers with a given sum s? The idea is simple, we keep a map of all the numbers in we already seen, if see a number a[i], we see in the map if S-a[i] is present or not. If yes, then we found the pair, if not, we put a[i] and move forward.

This solution has linear time complexity i.e O(n) with a space complexity of O(n) as well. There is a way to avoid space complexity by sorting the input array first and using two pointer approach.

Details of 2 sum problem, you can read here: 2 Sum problem: Find pair with given sum in array

Can we use our understanding of the two-sum problem to solve this three-sum problem?

Hint 1:
What happens if we take one number from the array and say that this number will be one of the three numbers which add up to zero? What happens to our problem statement now?

Hint 2:
If you have decided that A[0] will be part of three numbers which add up to zero, all we need is two find out two numbers from index 1..len-1 which add up to OA[0]. What does that mean? It looks like a 2-sum problem, doesn’t it?

We will start with index i from 0 and len-1, each time claiming that A[i] is part of the solution. Once, we fixed A[i] in solution, we will find two numbers which add to 0-A[i] solution to that problem we already know. If we find those two numbers let’s say A[j] and A[k], then A[i],A[j] and A[k] from the triplet. If we do not find those two numbers, discard A[i] and move forward.

### Show me the implementation of three sum problem

```class Solution {

public List<List<Integer>> threeSum(int[] nums) {

List<List<Integer>> result = new ArrayList<>();
//O(n log n)
Arrays.sort(nums);

for(int i=0; i<nums.length; i++){ if (i > 0 && nums[i] == nums[i - 1]) {
continue;
}
if(nums[i] > 0) continue;
int j = i+1;
int k = nums .length -1;

int target = 0-nums[i];
/*
Below code is exactly the same to find
two numbers with a given sum using two pointers
*/
while(j<k){
if(nums[j] + nums[k] == target ){
/* If find j and k, such that element at
index i, j and K add to zero, put them in result
*/
ArrayList<Integer> triplet = new ArrayList<Integer>(
Arrays.asList(nums[i], nums[j], nums[k]));
j++;
k--;
while (j < k && nums[j] == nums[j - 1]) j++;  // skip same result
while (j < k && nums[k] == nums[k + 1]) k--; // skip same result //since we want only unique triplets, using set result.add(triplet); } else if(nums[j] + nums[k] > target){
k--;
}
else{
j++;
}

}
}
return  result;
}
}
```

If you do not come up with the idea of moving i, j and k to get unique triplets, it is OK. You can always use HashSet to filter out duplicates and then copy the HashSet to the result list. It will take more time but will give the correct result.

``` if(uniqueSet.add(triplet)){
}
```

The time complexity of the solution is O(n2) plus the complexity of sorting algorithm (never forget that to mention!!). However, sorting complexity is O(nlogn), hence the overall complexity is dominated by the quadratic expression.

Can you solve the following problems on the leetcode?
1. 3SUM
2. 3Sum Closest
3. Four sum problem.

## Arrays With Elements In The Range 1 to N

When an array has all its elements in the range of 1 to N ( where N is the length of the array ) we can use the indices to store the ordered state of the elements in the array. This ordered-state can in-turn be used to solve a variety of problems which we’ll explore soon. First, a very simple demonstration of this property.

Here is an array which has unique elements in the range of 1 to N.
Given array (A) : 5,3,1,4,2
Indices:                0,1,2,3,4

Sort in Linear Time

The first use-case of this unique property is being able to sort in O(N) time i.e. a special-case(all unique elements) of the Counting Sort. The crux of this sort is to check whether an element is at its corresponding index and swap it to its correct index if it’s not. Following is a demonstration of this logic:

Given array (A) : 5,3,1,4,2
Indices:                0,1,2,3,4

For each A[i] check if A[A[i] – 1] equals A[i] or not. If they are not equal then swap element at A[A[i] – 1] with A[i]. Basically the correct value for any index i is when A[i] contains i+1.

A[A[0] – 1] or A[5-1] orA[4] which is 2 and A[0] = 5. This means that A[A[i] – 1] is not equal to A[i] and hence not in its correct position. So we need to swap in order to put A[0] -> 5 to its correct position which is index 4 and A[0] will hold 4 after the swap. Similarly, we need to repeat this check & swap for all the elements.

What if we cancel-out the common terms and modify the check from  `A[i] != A[A[i] - 1]` to `i != A[i]-1` ?

Find The Missing Integer

A similar approach can help us find the smallest missing positive-integer in a given array. By smallest missing positive-integer, we just mean the smallest positive integer that does not exist in the given list of numbers. For example:

Given Array: `-2, 3, 0, 1, 3`
In the above case, the smallest missing positive integer is 2.

If we were to apply the usual sorting techniques and then scan the array for the smallest positive integer absent it would imply a time-complexity of O(NLog(N)) + O(N). We can definitely do better than this! At first glance, it seems that this problem does not lie within the unique property of elements being in the range of 1 to N since the numbers in the given array are well outside the range, but to solve this problem we still only need to figure out whether we have elements from 1 to N present in the given array or not.

How do we know whether the given array has elements from 1 to N? We can use the counting-sort discussed earlier to put each element in its “correct position”, i.e index 0 should hold 1, index 1 should hold 2 and so on. The smallest index that does not hold its correct element is the missing integer.

If we sort the given array using counting sort described above, we will get: `1, 0, 3, -2, 3`. And the smallest index `i` to not hold its correct value i.e. `i+1` will give us the answer to the smallest missing positive integer. In this case, that index is 1 since it does not hold 2, thus the smallest positive missing integer is 2.

Find The Duplicate Element

The third use-case of this property is to figure out the duplicate elements without using any extra space. We can iterate over the array A and mark the corresponding index of the encountered element as negative – unless it has already been marked negative! For example: if A[1] = 3 (or -3 ) then mark `A[ Abs[3] - 1]` as negative, this way whenever we encounter 3 (or -3) again in any of the A[i] we will know that the value 3 has been visited before since A[3-1] will be negative.

Given array (A) : 5,3,1,4,3
Indices:                0,1,2,3,4

When we encounter A[0] i.e. 5, we make A[5-1] i.e. A[4] negative, so the array becomes:
`5,3,1,4,-3`
Next, we encounter A[1] i.e. 3, we make A[3-1] i.e. A[2] negative, so the array becomes:
`5,3,-1,4,-3`
Next, we encounter A[2] i.e. -1, we make A[1-1] i.e. A[0] negative, so the array becomes:
`-5,3,-1,4,-3`
Next, we encounter A[3] i.e. 4, we make A[4-1] i.e. A[3] negative, so the array becomes:
`-5,3,-1,-4,-3`
Next, we encounter A[4] i.e. -3, we want to make A[3-1] i.e. A[2] negative, but in this case, A[2] is already negative thus we know that A[2] has been visited before! Which means `Abs(A[4])` i.e 3 is the duplicate element.

Here is a snippet to demonstrate the code for sorting an array in linear time as per the above approach. The exact same approach can be used to solve the other two applications i.e. Finding the Duplicate and Finding The Missing Integer.

```        int swap=0;

for(int i = 0; i < nums.length;){

if(nums[i] > 0 && nums[i] < nums.length) {

if(nums[nums[i]-1] != nums[i]){
swap = nums[i];
nums[i] = nums[nums[i] - 1];
nums[swap - 1] = swap;
}else{
i++;
}

}else{
i++;
}
}
```

If you are preparing for a technical interview in companies like Amazon, Facebook, etc and want help with preparation, please register for a coaching session with us.

# Range sum query- Immutable array

Write a service which given an integer array, returns the sum of the elements between indices i and j (i ≤ j), inclusive. Example: nums = [-2, 0, 3, -5, 2, -1]
sumRange(0, 2) -> 1
sumRange(2, 5) -> -1
sumRange(0, 5) -> -3

Also, the input set does not change during the calls to the sumRange(i,j).

The brute force solution is to calculate the sum of all the elements A[i] to A[j] whenever a sumRange(i,j) is called. This method has time complexity of O(n). It is OK to have this solution for small scale but as the number of queries goes up, processing of all the numbers from i to j would be inefficient. Also, imagine a case where the array itself is very large, then `O(n)` complexity for each query will lead to choking of your service.

## Range sum query- Immutable array : thoughts

There are two hints for optimization is in the question, first, the array is immutable, it does not change. Second, we have to build a service, that means we have a server with us. These two things allow us to pre-compute and store results even before the query is made.

Now, the question is what do we pre-compute and how do we store it? We can precompute the sum of all the elements between each i and j and store them in a two-dimensional array. range[i][j] stores the sum of all the elements between index i and j. It will use `O(n2)` additional memory, however, the response time for each sumRange query will be constant. Also, the preprocessing step is `O(n2)`

Can we optimize for space as well? If I know the sum of all the elements from index 0 to index i and sum of all the elements from 0 to j, can I find the sum of all the elements between i and j? Of course, we can do it.

` Sum(i,j) = Sum(0,j) - Sum(0,i) + A[i]. `

Below diagram explains it.

However, integer array is not passed in the query request, we cannot use it while calculating the sum. Instead, we will use formula like: Sum(i,j) = Sum(0,j) – Sum(0,i-1), which is equivalent to the above.

We will pre-calculate the sum of all the elements between index 0 and j for all j>=0 and jImplementation

```class NumArray {

int[] rangeSum;
public NumArray(int[] nums) {
rangeSum = new int[nums.length];

if(nums.length>0){
rangeSum[0] = nums[0];
for(int i=1; i<nums.length; i++){
rangeSum[i] = rangeSum[i-1] + nums[i];
}
}
}

public int sumRange(int i, int j) {
if(i==0) return rangeSum[j];
return rangeSum[j] - rangeSum[i-1];
}
}
```

Now, the preprocessing step is `O(n)`. N additional space is used. At the same time query response time is `O(1)`.

Please share if there is anything wrong or missing in the post. If you are preparing for an interview and needs coaching to prepare for it, please book a free demo session with us.

## Maximum area rectangle in a histogram

A histogram is a diagram consisting of rectangles whose area is proportional to the frequency of a variable and whose width is equal to the class interval. Below is an example of a histogram.

Given a histogram, whose class interval is 1, find maximum area rectangle in it. Let me explain the problem in more details.

In the histogram above, there are at least 6 rectangles with areas 2, 1,5,6,2, and 3. Are there more rectangles? Yes, we can make more rectangles by combining some of these rectangles. A few are shown below.

Apparently, the largest area rectangle in the histogram in the example is 2 x 5 = 10 rectangle. The task is to find a rectangle with maximum area in a given histogram. The histogram will be given as an array of the height of each block, in the example, input will be [2,1,5,6,2,3].

## Maximum area rectangle: thoughts

First insight after looking at the rectangles above is: block can be part of a rectangle with a height less than or equal to its height. For each block of height h[i], check what all blocks on the left can be part of a rectangle with this block. All the blocks on the left side with a height greater than the current block height can be part of such a rectangle.
Similarly, all the blocks on the right side with a height greater than the current block height can be part of such a rectangle.
Idea is to calculate leftLimit and rightLimit and find the area `(rightLimit - leftLimit) * h[i]`.
Check if this area is greater than previously known area, then update the maximum area else, continue to the next block.

```class Solution {
public int largestRectangleArea(int[] heights) {

if(heights.length == 0) return 0;
int maxArea = Integer.MIN_VALUE;

for(int i=0; i<heights.length; i++){
//Find the left limit for current block
int leftLimit = findLeftLimit(heights, i);

//Find the right limit for current block
int rightLimit = findRightLimit(heights, i);

int currentArea = (rightLimit - leftLimit-1) * heights[i];
maxArea = Integer.max(maxArea, currentArea);
}

return maxArea;
}

private int findLeftLimit(int [] heights, int index){
int j = index-1;
while (j >= 0 && heights[j] >= heights[index]) j--;

return j;
}

private int findRightLimit(int [] heights, int index){
int j = index+1;
while (j < heights.length && heights[j] >= heights[index])
j++;

return j;
}
}
```

The time complexity of the implementation is O(n2); we will left and right of each block which will take n operations, we do it for n blocks and hence the complexity is quadratic. Can we optimize the time complexity?

If `heights[j] >= heights[i]` and leftLimit of index j is already known, can we safely say that it will also be the leftLimit of index i as well?
Can we say the same thing for rightLimit well? Answers to all the questions are yes. If we store the left and right limit for all indices already seen, we can avoid re-calculating them.

```class Solution {
public int largestRectangleArea(int[] heights) {

if(heights.length == 0) return 0;

int maxArea = Integer.MIN_VALUE;

//Finds left limit for each index, complexity O(n)
int [] leftLimit = getLeftLimits(heights);
//Find right limit for each index, complexity O(n)
int [] rightLimit = getRightLimits(heights);

for(int i=0; i<heights.length; i++){
int currentArea =
(rightLimit[i] - leftLimit[i] -1) * heights[i];
maxArea = Integer.max(maxArea, currentArea);
}

return maxArea;
}

private int[] getLeftLimits(int [] heights){

int [] leftLimit = new int[heights.length];
leftLimit[heights.length-1] = -1;

for(int i=0; i<heights.length; i++) {
int j = i - 1;
while (j >= 0 && heights[j] >= heights[i]) {
j = leftLimit[j];
}
leftLimit[i] = j;
}
return leftLimit;
}

private int[] getRightLimits (int [] heights){

int [] rightLimit = new int[heights.length];
rightLimit[heights.length-1] = heights.length;

for(int i=heights.length-2; i>=0; i--){
int j = i+1;
while(j<heights.length
&& heights[j] > heights[i]){
j = rightLimit[j];
}
rightLimit[i] = j;
}
return rightLimit;
}
}
```

The array `leftLimit`contains at index i the closest index j to the left of i such that `height[j] < height[i]`. You can think about each value of the array as a pointer (or an arrow) pointing to such `j` for every i. How to calculate `leftLimit[i]`? Just point the arrow one to the left and if necessary just follow the arrows from there, until you get to proper j. The key idea here to see why this algorithm runs in O(n) is to observe that each arrow is followed at most once.

### Largest area rectangle: stack-based solution

There is a classic method to solve this problem using the stack as well. Let’s see if we can build a stack-based solution using the information we already have. Let’s we do not calculate the area of the rectangle which includes the bar when we are processing it. When should we process it? Where should this bar be put on? If we want to create a rectangle with a height of this bar, we should find the left and right boundaries of such a rectangle. We should put this bar on a stack.
Now when you are processing bar j if height[j] is less than the bar on the top of the stack, we pop out the bar at the top. Why? Because this is the first bar on the right which has a height less than the height of the bar at top of the stack. This means if we want to make a rectangle with a height of the bar at the top of the stack, this index means the right boundary. This also gives away that all the blocks on the stack are in increasing order, as we never put a block which has a height less than the height of block at the top on to the stack. It means the next bar on the stack is the first bar which has a height lower than the bar at the top. To calculate the area of the rectangle with height as h[top], we need to take width as current index `j - stack.peek() - 1`

So the idea is that:

1. For each bar, take its height as the rectangle’s height. Then find the left and right boundaries of this rectangle.
2. The second top bar in the stack is always the first bar lower than the top bar on the stack on the left.
3. The bar that j points to is always the first bar lower than the top bar in the stack on the right.
4. After step 2 and 3, we know the left and right boundaries, then know the width, then know the area.
```private int maxAreaUsingStack(int[] heights){

Stack<Integer> s = new Stack<>();

int maxArea = 0;
for(int i=0; i<=heights.length; i++){
//Handling the last case
int h = i == heights.length ? 0 : heights[i];
while(!s.empty() && h < heights[s.peek()]){
int top = s.pop();
int leftLimit = s.isEmpty() ? -1 : s.peek();
int width = i-leftLimit-1;

int area = width * heights[top];
maxArea = Integer.max(area, maxArea);
}
s.push(i);
}
return maxArea;
}
```
The time complexity of the code is `O(n)` with an additional space complexity of `O(n)` If you are preparing for a technical interview in companies like Amazon, Facebook, etc and want help with preparation, please register for a coaching session with us.

# Intersection of two arrays

Given two unsorted arrays of integers, find intersection of these two arrays. Intersection means common elements in the given two arrays. For example, A = [1,4,3,2,5,6] B = [3,2,1,5,6,7,8,10] intersection of A and B is [ 1,3,2,5,6 ].

Sort array and then use binary search
As given arrays are unsorted, sort one of the arrays, preferably the larger one. Then search each element of another array in the sorted array using binary search. If the element is present, put it into the intersection array.

```class Solution {
public int[] intersection(int[] nums1, int[] nums2) {

int len1 = nums1.length;
int len2 = nums2.length;
Set<Integer> result = new HashSet<>();

for(int i=0; i<len2; i++){
if(binarySearch(nums1, nums2[i]) != -1){
}
}
int i = 0;
int[] resultArray = new int[result.size()];
for(Integer num : result){
resultArray[i++] = num ;
}

return resultArray;
}

private int binarySearch(int[] a, int key) {

for(int i=0; i<a.length; i++){
if(a[i] == key) return i;
}

return -1;
}
}
```

The time complexity of binary search method to find intersection is `O(nlogn)` for sorting and then `O(mlogn)` for searching. Effective time complexity becomes `O((n+m)logn)` which is not optimal.

Sort and use merge to find common elements
Again in this method, sort two arrays first. Then use two pointers to scan both arrays simultaneously. (Please refer to merge part of merge sort ). The difference is we will put only common elements, instead of all.

The time complexity of merge sort method is `O(nlogn) + O(mlogm)` for sorting and then `O(m+n)` for scanning both arrays. It is worst than the binary search method.

Use hash
Create a hash with key as elements from the smaller array (saves space). Then scan through other array and see if the element is present in hash. If yes, put into intersection array else do not.

```package AlgorithmsAndMe;

import com.sun.org.apache.xpath.internal.operations.Bool;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IntersectionTwoArrays {

public List<Integer> findIntersecton(int[] a, int[] b) {
List<Integer> result = new ArrayList<>();
Map<Integer, Boolean> existingElements = new HashMap<>();

for (int i = 0; i < a.length; i++) {
existingElements.put(a[i], true);
}

for (int i = 0; i < b.length; i++) {
if (existingElements.containsKey(b[i])) {
}
}
return result;
}
}
```

Test case

```package Test;

import AlgorithmsAndMe.DuplicatesInArray;
import AlgorithmsAndMe.IntersectionTwoArrays;

import java.util.List;
import java.util.Set;

public class IntersectonTwoArraysTest {

IntersectionTwoArrays intersectionTwoArrays
= new IntersectionTwoArrays();

@org.junit.Test
public void testIntersectionTwoArrays() {
int [] a = {1,6,3};
int [] b = {1,2,3};
List<Integer> result = intersectionTwoArrays.findIntersecton(a,b);

result.forEach(s -> System.out.println(s));
}
}

```

This method has the complexity of `O(n)` where n is the number of elements in the larger array and extra space complexity of O(m) where m is the number of elements in the smaller array.

These methods to find the intersection of two arrays do not work when there are duplicate elements in any of the array as they will be part of intersection array only once.

Please share if there is something wrong or missing. we would love to hear from you.

# Find all duplicate numbers in array

Given an array of positive integers in range 0 to N-1, find all duplicate numbers in the array. The array is not sorted. For example:
A = [2,4,3,2,1,5,4] Duplicate numbers are 2,4 whereas in A = [4,1,3,2,1,1,5,5] duplicate numbers are 1,5.

Brute force solution would be to keep track of every number which is already visited. The basic idea behind the solution is to keep track that whether we have visited the number before or not. Which data structure is good for quick lookups like this? Of course a map or hash.
The time complexity of this solution is `O(n)` but it has an additional space complexity of `O(n)`.

To reduce space requirement, a bit array can be used, where ith index is set whenever we encounter number i in the given array. If the bit is set already, its a duplicate number. It takes `O(n)` extra space which is actually less than earlier `O(n)` as only bits are used. The time complexity remains `O(n)`

## Find duplicate numbers in an array without additional space

Can we use the given array itself to keep track of the already visited numbers? How can we change a number in an array while also be able to get the original number back whenever needed? That is where reading the problem statement carefully comes. Since array contains only positive numbers, we can negate the number at the index equal to the number visited. If ever find a number at any index negative, that means we have seen that number earlier as well and hence should be a duplicate.

Idea is to make the number at ith index of array negative whenever we see number i in the array. If the number at ith index is already negative, it means we have already visited this number and it is duplicate. Limitation of this method is that it will not work for negative numbers.

### Duplicate numbers implementation

```package AlgorithmsAndMe;

import java.util.HashSet;
import java.util.Set;

public class DuplicatesInArray {

public Set<Integer> getAllDuplicates(int[] a )
throws IllegalArgumentException {

Set<Integer> result = new HashSet<>();

if(a == null) return result;

for(int i=0; i<a.length; i++) {
//In case input is wrong
if(Math.abs(a[i]) >= a.length ){
throw new IllegalArgumentException();
}

if (a[Math.abs(a[i])] < 0) {
} else {
a[Math.abs(a[i])] = -a[Math.abs(a[i])];
}
}
return result;
}
}

```

Test cases

```package Test;

import AlgorithmsAndMe.DuplicatesInArray;
import java.util.Set;

public class DuplicatesInArrayTest {

DuplicatesInArray duplicatesInArray = new DuplicatesInArray();

@org.junit.Test
public void testDuplicatesInArray() {
int [] a = { 1,2,3,4,2,5,4,3,3};
Set<Integer> result = duplicatesInArray.getAllDuplicates(a);

result.forEach(s -> System.out.println(s));
}

@org.junit.Test
public void testDuplicatesInArrayWithNullArray() {
Set<Integer> result = duplicatesInArray.getAllDuplicates(null);

result.forEach(s -> System.out.println(s));
}

//This case should generate an exception as 3 is greater than the size.
@org.junit.Test
public void testDuplicatesInArrayWithNullArray() {
int [] a = { 1,2,3};
try{
Set<Integer> result = duplicatesInArray.getAllDuplicates(a);
} catch (IllegalArgumentException  e){
System.out.println("invalid input provided");
}
}
}

```

The complexity of the algorithm to find duplicate elements in an array is `O(n)`.

## Missing number in array

Given an array of N integers, ranging from 1 to N+1, find the missing number in that array. It is easy to see that with N slots and N+1 integers, there must be a missing number in the array.
This problem is very similar to another problem: find repeated number in an array
For example,

```Input:
A = [1,2,5,4,6] N = 5 range 1 to 6.
Output:
3.

Input:
A = [1,5,3,4,7,8,9,2] N = 8 range 1 to 9.
Output:
6
```

## Approaches to find absent number

Using hash
Create a hash with the size equal to N+1. Scan through elements of the array and mark as true in the hash. Go through the hash and find a number that is still set to false. That number will be the missing number in the array.
The complexity of this method is O(n) with additional O(n) space complexity.

Using mathmatics
We know that the sum of N consecutive numbers is N*(N+1)/2. If a number is missing, the sum of all numbers will not be equal to N*(N+1)/2. The missing number will be the difference between the expected sum and the actual sum.

Our missing number should be = (N+2) * (N+1) /2 – Actual sum; N+1 because the range of numbers is from 1 to N+1
Complexity is O(n). However, there is a catch: there may be an overflow risk if N is big enough.

Using XOR
There is a beautiful property of XOR, that is: if we XOR a number with itself, the result will be zero. How can this property help us to find the missing number? In the problem, there are two sets of numbers: the first one is the range 1 to N+1, and others which are actually present in the array. These two sets differ by only one number and that is our missing number. Now if we XOR first set of numbers with the second set of numbers, all except the absent number will cancel each other. The final result will be the output of our problem.

### Algorithm using XOR

1. Scan through the entire array and XOR all elements. Call it aXOR
2. Now XOR all numbers for 1 to N+1. Call it eXOR
3. Now XOR aXOR and eXOR, the result is the missing number

Let’s take an example and see if this works

A = [1,3,4,5] Here N = 4, Range is 1 to 5.

XORing bit representations of actual numbers
001 XOR 011 = 010 XOR 100 = 110 XOR 101 = 011 (aXOR)

XORing bit representation of expected numbers
001 XOR 010 = 011 XOR 011 = 000 XOR 100 = 100 XOR 101 = 001 (eXOR)

Now XOR actualXOR and expectedXOR;
011 XOR 001 = 010 = 2 is our answer.

### Implementation

```    public int missingNumber(int[] nums) {

int n =  nums.length;

int nXOR = 0;
for(int i=0; i<=n; i++){
nXOR ^= i;
}

int aXOR = 0;
for(int i=0; i<n; i++){
aXOR ^= nums[i];
}

return aXOR ^ nXOR;
}
```

The complexity of the XOR method is O(n) with no additional space complexity.

If you want to contribute to this blog in any way, please reach out to us: Contact. Also, please share if you find something wrong or missing. We would love to hear what you have to say.

## Segregate 0s and 1s in an array

Given an array of 0s and 1s, segregate 0s and 1s in such as way that all 0s come before 1s. For example, in the array below,

The output will be as shown below.

This problem is very similar to Dutch national flag problem

## Different methods to segregate 0s and 1s in an array

Counting 0s and 1s.
The first method is to count the occurrence of 0s and 1s in the array and then rewrite o and 1 onto original array those many times. The complexity of this method is `O(n)` with no added space complexity. The only drawback is that we are traversing the array twice.

```package com.company;

/**
* Created by sangar on 9.1.19.
*/
public class SegregateZerosAndOnes {

public void segregate(int[] a) throws IllegalArgumentException{

if(a == null) throw new IllegalArgumentException();
int zeroCount = 0;
int oneCount = 0;

for (int i = 0; i < a.length; i++) {
if (a[i] == 0) zeroCount++;
else if (a[i] == 1) oneCount++;
else throw new IllegalArgumentException();
}

for (int i = 0; i < zeroCount; i++) {
a[i] = 0;
}

for (int i = zeroCount; i < zeroCount + oneCount; i++) {
a[i] = 1;
}
}
}

```

Using two indices.
the second method is to solve this problem in the same complexity, however, we will traverse the array only once. Idea is to maintain two indices, left which starts from index 0 and right which starts from end (n-1) where n is number of elements in the array.
Move left forward till it encounters a 1, similarly decrement right until a zero is encountered. If left is less than right, swap elements at these two indice and continue again.

1. Set left = 0 and right = n-1
2. While left < right 2.a if a[left] is 0 then left++
2.b if a[right] is 1 then right– ;
2.c if left < right, `swap(a[left], a[right])`

### segregate 0s and 1s implementation

```public void segregateOptimized(int[] a) throws IllegalArgumentException{

if(a == null) throw new IllegalArgumentException();
int left = 0;
int right = a.length-1;

while(left < right){
while(left < a.length && a[left] == 0) left++;
while(right >= 0 && a[right] == 1) right--;

if(left >= a.length || right <= 0) return;

if(a[left] > 1 || a[left] < 0 || a[right] > 1 || a[right] < 0)
throw new IllegalArgumentException();

if(left < right){
a[left] = 0;
a[right] = 1;
}
}
}
```

The complexity of this method to segregate 0s and 1s in an array is O(n) and only one traversal of the array happens.

### Test cases

```package test;

import com.company.SegregateZerosAndOnes;
import org.junit.*;
import org.junit.rules.ExpectedException;

import java.util.Arrays;

import static org.junit.jupiter.api.Assertions.assertEquals;

/**
* Created by sangar on 28.8.18.
*/
public class SegregateZerosAndOnesTest {

SegregateZerosAndOnes tester = new SegregateZerosAndOnes();

@Test
public void segregateZerosAndOnesOptimizedTest() {

int[] a = {0,1,0,1,0,1};
int[] output = {0,0,0,1,1,1};

tester.segregateOptimized(a);
assertEquals(Arrays.toString(output), Arrays.toString(a));

}

@Test
public void segregateZerosAndOnesAllZerosOptimizedTest() {

int[] a = {0,0,0,0,0,0};
int[] output = {0,0,0,0,0,0};

tester.segregateOptimized(a);
assertEquals(Arrays.toString(output), Arrays.toString(a));

}

@Test
public void segregateZerosAndOnesAllOnesOptimizedTest() {

int[] a = {1,1,1,1,1};
int[] output = {1,1,1,1,1};

tester.segregateOptimized(a);
assertEquals(Arrays.toString(output), Arrays.toString(a));

}

@Test(expected=IllegalArgumentException.class)
public void segregateZerosAndOnesOptimizedIllegalArgumentTest() {

int[] a = {1,1,1,1,2};
tester.segregateOptimized(a);
}

@Test(expected=IllegalArgumentException.class)
public void segregateZerosAndOnesOptimizedNullArrayTest() {

tester.segregateOptimized(null);
}

}

```

## Difference between array and linked list

In last post : Linked list data structure, we discussed basics of linked list, where I promised to go in details what is difference between array and linked list. Before going into post, I want to make sure that you understand that there is no such thing called one data structure is better than other. Based on your requirements and use cases, you chose one or the other. It depends on what is most frequent operation your algorithm would perform in it’s lifetime. That’s why they have data structure round in interview process to understand if you can chose the correct one for the problem.

What is an array?
Array is linear, sequential and contiguous collection of elements which can be addressed using index.

Linked list is linear, sequential and non-contiguous collection of nodes, each node store the reference to next node. To understand more, please refer to Linked list data structure.

## Difference between arrays and linked list

### Static Vs dynamic size

Size of an array is defined statically at the compile time where as linked list grows dynamically at run time based on need. Consider a case where you know the maximum number of elements algorithm would ever have, then you can confidently declare it as array. However, if you do not know, the linked list is better. There is a catch : What if there is a rare chance that number of elements will reach maximum, most of the time it will be way less than maximum? In this case, we would unnecessary allocating extra memory for array which may or may not be used.

### Memory allocation

An array is given contiguous memory in system. So, if you know the address of any of the element in array, you can access other elements based position of the element.

Linked list are not store contiguous on memory, nodes are scattered around on memory. So you may traverse forward in linked list, given node (using next node reference), but you can not access nodes prior to it.

Contiguous allocation of memory required sufficient memory before hand for an array to be stored, for example if want to store 20 integers in an array, we would required 80 bytes contiguous memory chunk. However, with linked list we can start with 8 bytes and request more memory as when required, which may be wherever. Contiguous allocation of memory makes it difficult to resize an array too. We have to look for different chunk of memory, which fits the new size, move all existing elements to that location. Linked list on other hand are dynamically size and can grow much faster without relocating existing elements.

### Memory requirement

It’s good to have non-contiguous memory then? It comes with a cost. Each node of linked list has to store reference to next node in memory. This leads to extra payload of 4 bytes in each node. On the other hand, array do not require this extra payload. You  have to trade off extra space with advantages you are getting. Also, sometime, spending extra space is better that have cumbersome operations like shifting, adding and deleting operation on array. Or value stored in node is big enough to make these 4 bytes negligible in analysis.

### Operation efficiency

We do operations of data structure to get some output. There are four basic operations we should be consider : read, search, insert/update and delete.

Read on array is O(1) where you can directly access any element in array given it’s index. By O(1), read on array does not depend on size of array.
Whereas, time complexity of read on linked list is O(n) where n is number of nodes. So, if you have a problem, which requires more random reads, array will over-weigh linked list.

Given the contiguous memory allocation of array, there are optimized algorithms like binary search to search elements on array which has complexity of O(log n). Search on linked list on other hand requires O(n).

Insert on array is O(1) again, if we are writing within the size of array. In linked list, complexity of insert depends where do you want to write new element at. If insert happens at head, then it O(1), on the other hand if insert happens at end, it’s O(n).

Update means here, changing size of array or linked list by adding one more element. In array it is costly operation, as it will require reallocation of memory and copying all elements on to it. Does not matter if you add element at end or start, complexity remains O(1).
For linked list, it varies, to update at end it’s O(n), to update at head, it’s O(1).
In same vain, delete on array requires movement of all elements, if first element is deleted, hence complexity of O(n). However, delete on linked list O(1), if it’s head, O(n) if it’s tail.

To see the difference between O(1) and O(n), below graph should be useful.

#### Key difference between array and linked list are as follows

• Arrays are really bad at insert and delete operation due to internal reallocation of memory.
• Statically sized at the compile time
• Memory allocation is contiguous,  which make access elements easy without any additional pointers. Can jump around the array without accessing all the elements in between.
• Linked list almost have same complexity when insert and delete happens at the end, however no memory shuffling happens
• Search on linked list is bad.=, usually require scan with O(n) complexity
• Dynamically sized on run time.
• Memory allocation is non-contiguous, additional pointer is required to store neighbor node reference. Cannot jump around in linked list.

Please share if there is something wrong or missing. If you wan to contribute to website, please reach out to us at [email protected]