Intersection of two arrays

Intersection of two arrays

Given two unsorted arrays of integers, find intersection of these two arrays. Intersection means common elements in the given two arrays. For example, A = [1,4,3,2,5,6] B = [3,2,1,5,6,7,8,10] intersection of A and B is [ 1,3,2,5,6 ].

Sort array and then use binary search
As given arrays are unsorted, sort one of the arrays, preferably the larger one. Then search each element of another array in the sorted array using binary search. If the element is present, put it into the intersection array.

class Solution {
    public int[] intersection(int[] nums1, int[] nums2) {
        
        int len1 = nums1.length;
        int len2 = nums2.length;
        Set<Integer> result = new HashSet<>();
        
        for(int i=0; i<len2; i++){
            if(binarySearch(nums1, nums2[i]) != -1){
                result.add(nums2[i]);
            }
        }
        int i = 0;
        int[] resultArray = new int[result.size()];
        for(Integer num : result){
            resultArray[i++] = num ;
        }
        
        return resultArray;
    }
    
    private int binarySearch(int[] a, int key) {
        
        for(int i=0; i<a.length; i++){
            if(a[i] == key) return i;
        }
        
        return -1;
    }
}

The time complexity of binary search method to find intersection is O(nlogn) for sorting and then O(mlogn) for searching. Effective time complexity becomes O((n+m)logn) which is not optimal.

Sort and use merge to find common elements
Again in this method, sort two arrays first. Then use two pointers to scan both arrays simultaneously. (Please refer to merge part of merge sort ). The difference is we will put only common elements, instead of all.

The time complexity of merge sort method is O(nlogn) + O(mlogm) for sorting and then O(m+n) for scanning both arrays. It is worst than the binary search method.

Use hash
Create a hash with key as elements from the smaller array (saves space). Then scan through other array and see if the element is present in hash. If yes, put into intersection array else do not.

package AlgorithmsAndMe;

import com.sun.org.apache.xpath.internal.operations.Bool;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IntersectionTwoArrays {

    public List<Integer> findIntersecton(int[] a, int[] b) {
        List<Integer> result = new ArrayList<>();
        Map<Integer, Boolean> existingElements = new HashMap<>();

        for (int i = 0; i < a.length; i++) {
            existingElements.put(a[i], true);
        }

        for (int i = 0; i < b.length; i++) {
            if (existingElements.containsKey(b[i])) {
                result.add(b[i]);
            }
        }
        return result;
    }
}

Test case

package Test;

import AlgorithmsAndMe.DuplicatesInArray;
import AlgorithmsAndMe.IntersectionTwoArrays;

import java.util.List;
import java.util.Set;

public class IntersectonTwoArraysTest {


    IntersectionTwoArrays intersectionTwoArrays
             = new IntersectionTwoArrays();

    @org.junit.Test
    public void testIntersectionTwoArrays() {
        int [] a = {1,6,3};
        int [] b = {1,2,3};
        List<Integer> result = intersectionTwoArrays.findIntersecton(a,b);

        result.forEach(s -> System.out.println(s));
    }
}

This method has the complexity of O(n) where n is the number of elements in the larger array and extra space complexity of O(m) where m is the number of elements in the smaller array.

These methods to find the intersection of two arrays do not work when there are duplicate elements in any of the array as they will be part of intersection array only once.

Please share if there is something wrong or missing. we would love to hear from you.

First non repeated character in string

First non repeated character in string

Given a string, find first non repeated character in a stringFor example, string is abcbdbdebab, the first non repeating character would be c. Even though e is also non repeating in string, c is output as it is first non repeating character.

Non repeating character : thoughts

What does it mean to be non-repeating character? Well, the character should occur in string only once. How about we scan the string and find what is count for each character? Store character and count in map as key value pair.
Now, that we have <character, count> key value pair for all unique characters in string, how can we find first non repeating character? Refer back to original string; scan the string again and for each character, check the corresponding count and if it is 1 return the character.

package com.company;

import java.util.HashMap;

/**
 * Created by sangar on 4.10.18.
 */
public class FirstNonRepeatingChar {

    HashMap<Character, Integer> characterCount = new HashMap<>();

    public char firstNonRepeatingCharacter(String s){
        //Best to discuss it with interviewer, what should we return here?
        if(s == null) return ' ';

        if(s.length() == 0) return ' ';

        for (char c: s.toCharArray()){
            if(!characterCount.containsKey(c)){
                characterCount.put(c,1);
            }
            else {
                characterCount.put(c, characterCount.get(c) + 1);
            }
        }
        for (char c: s.toCharArray()) {
            if(characterCount.get(c) == 1) return c;
        }

        return ' ';
    }
}
package test;

import com.company.FirstNonRepeatingChar;
import org.junit.jupiter.api.Test;

import static org.junit.jupiter.api.Assertions.assertEquals;

/**
 * Created by sangar on 28.8.18.
 */
public class FirstNonRepeatingCharTest {

    FirstNonRepeatingChar tester = new FirstNonRepeatingChar();

    @Test
    public void firstNonRepeatingCharExists() {
        String s = "abcbcbcbcbad";
        assertEquals('d', tester.firstNonRepeatingCharacter(s));
    }

    @Test
    public void firstNonRepeatingCharDoesNotExists() {
        String s = "abcbcbcbcbadd";
        assertEquals(' ', tester.firstNonRepeatingCharacter(s));
    }

    @Test
    public void firstNonRepeatingCharWithEmptyString() {
        String s = "";
        assertEquals(' ', tester.firstNonRepeatingCharacter(s));
    }

    @Test
    public void firstNonRepeatingCharWithNull() {
        assertEquals(' ', tester.firstNonRepeatingCharacter(null));
    }
}

Complexity of this method to find first non-repeating character in a string is O(n) along with space complexity of O(1) to store the character to count map.

There is always some confusion about space complexity of above method, we think as 256 characters are used, should it not be counted as space complexity? Definitely. But in asymptotic notations, this space is independent of size of input, so space complexity remains O(1).

One more thing, even though time complexity is O(n), input string is scanned twice, first time to get count of characters and second time to find first non repeating character.

Optimization

Consider a case when string is too large with millions of characters, most of them repeated, above solution may turn slow in last where we look for character with count 1 in map.  How can we avoid scanning array second time?
How about we store some information with character in map along with count, so that we can figure out if the character is first non repeating or not.
Or we can have two maps, one stores the count and other stores the first index of character.

Once, we have created two maps as mentioned above, go through the first map and find the all characters with count 1. For each of these characters, check which one has the minimum index on second map and return that character.

Complexity of algorithm remains same, however, second scan of string is now not required. In other words, second scan is now independent of size of input as it depends on the size of first map, which is constant to 256 as that’s the number of unique 8 bit characters possible.

Find first non repeating character : Implementation

package com.company;

import java.util.HashMap;

/**
 * Created by sangar on 4.10.18.
 */
public class FirstNonRepeatingChar {
    public char firstNonRepeatingCharacterOptimized(String s){
        HashMap<Character, Integer> characterCount = new HashMap<>();
        HashMap<Character, Integer>characterIndex = new HashMap<>();
        //Best to discuss it with interviewer, what should we return here?
        if(s == null) return ' ';

        if(s.length() == 0) return ' ';

        for (int i=0; i<s.length(); i++){
            char c  = s.charAt(i);
            if(!characterCount.containsKey(c)){
                characterCount.put(c,1);
                characterIndex.put(c,i);
            }
            else {
                characterCount.put(c, characterCount.get(c) + 1);
            }
        }
        char nonRepeatedCharacter = ' ';
        int prevIndex = s.length();
        for (char c : characterCount.keySet()) {
            if(characterCount.get(c) == 1 
			&& characterIndex.get(c) < prevIndex){
                prevIndex = characterIndex.get(c);
                nonRepeatedCharacter = c;
            }
        }
        return nonRepeatedCharacter;
    }
}
package test;

import com.company.FirstNonRepeatingChar;
import org.junit.jupiter.api.Test;

import static org.junit.jupiter.api.Assertions.assertEquals;

/**
 * Created by sangar on 28.8.18.
 */
public class FirstNonRepeatingCharTest {

    FirstNonRepeatingChar tester = new FirstNonRepeatingChar();

    @Test
    public void firstNonRepeatingOptimizedCharExists() {
        String s = "aebcbcbcbcbad";
        assertEquals('e', tester.firstNonRepeatingCharacterOptimized(s));
    }

    @Test
    public void firstNonRepeatingCharOptimizedDoesNotExists() {
        String s = "abcbcbcbcbadd";
        assertEquals(' ', tester.firstNonRepeatingCharacterOptimized(s));
    }

    @Test
    public void firstNonRepeatingCharOptimizedWithEmptyString() {
        String s = "";
        assertEquals(' ', tester.firstNonRepeatingCharacterOptimized(s));
    }

    @Test
    public void firstNonRepeatingCharOptimizedWithNull() {
        assertEquals(' ', tester.firstNonRepeatingCharacterOptimized(null));
    }
}

Please share if there is anything wrong or missing. If you are interested in taking personalized coaching sessions by our expert teachers, please signup to website and get first session free.


First non repeated character in stream

First non repeated character in stream

In last post, we discussed first non repeated character in string, where all characters where available in memory beforehand. Let’s take another variation of the problem : Given a stream of characters, find first non repeated character in stream in stat a given point of time.

For example, if stream till now is a,a,b,c,c,d; and it is asked what is first non repeated character at this point, it will be b. Next character in stream is let’s say b, then first non repeated character in stream at this point will be d.

Non repeated character in stream : thoughts

If set of characters is already present in memory, then it is very simple as we discussed here. Idea is to create a hash map with character as key and number of times is occurs in string as value; increase the count when character is seen during scan of string.
Once all characters are processed, go through string again and return first character which has count 1. However, this approach does not work with continuous stream of characters.

Why a queue?

Problem is to return first non repeating character at a given point of time. A character which was unique at a certain point of time, may not be unique at another point of time in future, that character may have be duplicated in stream coming in during those instances of time. We need to store characters in order of their arrival, however remove them as they occur more than once. At any specific time, we want the first arrived character till that point and has only one occurrence. Are you able to pattern?

First in first out pattern and queues are the best data structure to implement first in first out pattern.

One thing to be noted that even though first in first out pattern applies, removal of character is ad-hoc between two queries when stream flows in.  Characters which happen to occur twice or more between queries will be removed from the queue. Why mentioned this because then we cannot have array based queue. What data structure is best to remove a random node? Well, it’s doubly linked list. Given a node, it’s O(1) operation to remove it.

Why a hash map?

Another problem is, how can we map a character coming out of stream to a node in queue? We scan the doubly linked list based queue and delete the node which contains the character. Even though it will be O(1) operation (as max queue can grow 256 nodes why?), it is more expensive computationally to do for all the characters coming from stream.

What can quickly map a node to character, obviously a hash map. So, key value pair is store in hash map with character as key and node address as value.
Hash map actually now servers another purpose too, which is to maintain the count of occurrence. If a character is present in hash, that means it has corresponding node in queue, which implies that character has already been seen once. So we remove it from the queue if there is value for character in map.
If there is not value for the character in hash map, that implies, this is the first time we have seen this element and should be added to queue.

First non repeated character in stream : Algorithm

  1. Process next character from stream.
  2. Check in hash map if there is corresponding node in queue.
  3. If there is node already in hash map
    1. Delete the node corresponding node from queue
    2. go to step 1.
  4. If there is no node in hash map for this character
    1. Create a node and push that at the tail of queue.
    2. Update hash map with char, node address as key value pair

Algorithm looks fine, but there is an implementation problem here, can you see? We are freeing the node from queue however, we are keeping it in hash, to  indicate that character is already seen.
In ideal world, we should not store references to free memory which can be
accessed accidentally and cause segmentation fault. Idea would be to separate storing node pointer and seen functionalities with two hashes.

Create an additional hash map, character as key and bool as value, which indicates if character is seen twice already in stream before. With this hash map, we can remove the entry from other hash map which store node address of particular character, so no dangling pointers.

Modified algorithm

  1. Process next character from stream.
  2. Check in visited hash map if character was ever seen twice before.
  3. If true, go to step 1.
  4. If false, check the node address hash map, if there is node address for character
    1. Delete the node corresponding node from queue and remove entry from hash.
    2. Update visited hash map for this char as true.
    3. go to step 1.
  5. If there is no node in hash map for this character
    1. Create a node and push that at the tail of queue.
    2. Update hash map with char, node address as key value pair
package com.company;

import java.util.HashMap;

/**
 * Created by sangar on 8.10.18.
 */
public class NonRepeatedCharInStream {

    private HashMap<Character, Boolean> visited =  new HashMap<>();
    private HashMap<Character, ListNode> address =  new HashMap<>();
    private Queue<Character> queue = new QueueImplementation<>();

    public char getNonRepeatedCharacter(){
        return queue.isEmpty() ? ' ' : queue.peek().getData();
    }

    public void process(char c){
        if(visited.containsKey(c)) return;

        if(address.containsKey(c)){
            queue.deleteNode(address.get(c));
            address.remove(c);
            visited.put(c, true);
        }
        else{
            address.put(c,queue.enqueue(c));
        }
    }
}

Queue implementation using Doubly linked list

package com.company;

/**
 * Created by sangar on 8.10.18.
 */
public class QueueImplementation<E> implements Queue<E>{
    ListNode<E> head;
    ListNode<E> tail;

    public QueueImplementation(){
        head = null;
        tail = null;
    }

    @Override
    public ListNode<E> deleteNode(ListNode<E> node){
        if(this.isEmpty()) {
            return null;
        }

        if(this.head == node){
            if(this.head.getNext() != null) this.head.getNext().setPrev(null);
            this.head = this.head.getNext();
            return node;
        }

        if(this.tail == node){
            if(this.tail.getPrev() != null) this.tail.getPrev().setNext(null);
            this.tail = this.tail.getPrev();
            return node;
        }
        /*
            We are deleting node in between. So following things happen
            1. If node has prev, set node.prev.next = node.next.
            2. If node has next, set node.next.prev = node.prev
        */
        if(node.getPrev() != null) node.getPrev().setNext(node.getNext());
        if(node.getNext() != null) node.getNext().setPrev(node.getPrev());

        return node;
    }


    @Override
    public ListNode peek() {
        if(this.isEmpty()) {
            return null;
        }
        return this.head;
    }

    @Override
    public ListNode remove() {
        if(this.isEmpty()) {
            return null;
        }
        /*
            We are deleting node at head. So following things happen
            1. Set temporary node point to head.
            2. Move head to next of node.
            3. Set prev of new head to NULL.
            4. Free the temp node.
          */
        ListNode<E> tempNode = this.head;
        this.head.setPrev(null);
        this.head = this.head.getNext();

        return tempNode;
    }

    @Override
    public ListNode enqueue(E data) {
        if(this.isEmpty()) {
            this.head = new ListNode<E>(data);
            this.tail = this.head;
            return this.head;
        }
        ListNode<E> newNode = new ListNode<E>(data,null, this.tail);
        this.tail.setNext(newNode);
        this.tail = newNode;

        return newNode;
    }

    @Override
    public boolean isEmpty() {
        return this.head == null;
    }
}

package com.company;

/**
 * Created by sangar on 8.10.18.
 */
public interface Queue<E> {
    public ListNode<E> peek();
    public ListNode<E> remove();
    public ListNode<E> enqueue(E data);
    public ListNode<E> deleteNode(ListNode<E> node);
    public boolean isEmpty();
}

Test cases (add more and see if that works)

package test;

import com.company.NonRepeatedCharInStream;
import org.junit.jupiter.api.Test;

import static org.junit.jupiter.api.Assertions.assertEquals;

/**
 * Created by sangar on 23.9.18.
 */
public class NonRepeatedCharInStreamTest {

    NonRepeatedCharInStream tester = new NonRepeatedCharInStream();

    @Test
    public void testNormalCase() {

        tester.process('a');
        tester.process('b');
        tester.process('c');

        assertEquals('a', tester.getNonRepeatedCharacter());

        tester.process('a');
        assertEquals('b', tester.getNonRepeatedCharacter());

        tester.process('c');
        tester.process('d');
        tester.process('b');

        assertEquals('d', tester.getNonRepeatedCharacter());
    }

    @Test
    public void testNoUniqueCharCase() {

        tester.process('a');
        tester.process('b');
        tester.process('c');
        tester.process('a');
        tester.process('b');
        tester.process('c');

        assertEquals(' ', tester.getNonRepeatedCharacter());
    }
}

Let’s see how this algorithm works with an example. To start with both hash maps (visited and address) and queue are empty. Stream starts with character c, new node is added to queue, also, entry is entered into the address hash map to store address of node in queue.

first character added to queue and hash map

Next d comes from stream,  we check if it is on visited map, it is not, so we check if it on address map. It is not there, it means, we saw c first time. Add it to queue and store pointer to node in address node.

d is added to address map and queue

Next character is e, same process followed as d and it gets added to queue and address map.

Imagine that stream puts d out now. We will check in visited map, it is not, however, in address map, it is present. This is where we know that d has occurred twice. We delete it from queue and address map. However, we add it to visited map.

 first non repeated character at this moment is c 

Now comes the character f, it is not present in visited map nor in address map, so we add it to queue and store pointer in the address map.

Next comes the character d, we check visited map it true and so do not add it anywhere and skip it.

What if next character is c,  it is present in address map, so remove it from queue and map. Add it to visited map.

first non repeated character in stream
At this point of time, first non repeated character is e

Complexity to find first non repeated character in stream is O(1) as number of characters processed from stream does not impact the processing time. Space complexity is also O(1) as it independent of number of characters processed.

Please share if there is something wrong or missing. Please reach out to us at communications@algorithmsandme.com if you have doubt or need any help with your interview preparations.