Longest Substring Without Repeating Characters

Tags: , , ,

Given a string, find the longest substring without repeating characters in it. For example,

Input:
S = "abcaabaca" 
Output:
3
Explanation:
The longest substring without repeating characters will be "abc"

Input: 
"bbbbb"
Output:
1
Explanation:
The answer is "b", with a length of 1.

A brute force solution will be to scan all substrings of the given string and check which one has the longest length and no repeating characters. For a string with size n, there will be n * (n-1) substrings, and to check it each for unique characters, it will take n comparison in the worst case. So, the worst-case complexity of this algorithm is O(n3) with additional space of O(n). The code is simple enough.

package com.company;

import java.util.HashMap;

/**
 * Created by sangar on 1.1.18.
 */
public class NonRepeatingCharacters {

     boolean allUniqueCharacters(String s, int start, int end) {

        HashMap<Character, Boolean> characters = new HashMap<>();

        for (char c : s.substring(start, end).toCharArray()) {
            if(characters.containsKey(c)) return false;
            characters.put(c, Boolean.TRUE);
        }
        return true;
    }

    int longestSubstringWithoutRepeatingCharacters(String s) {
        int len = s.length();
        int maxLength = 0;
          
        for (int i =0; i < len; i++){
            for (int j=i+1; j<len; j++){
                int length = j-i;
                if (allUniqueCharacters(s, i, j)){
                    maxLength = Integer.max(maxLength, length);
                }
            }
        }
        return maxLength;
    }

    public static void main(String[] args) {
        String s = "abcdabcbb";
        System.out.println(longestSubstringWithoutRepeatingCharacters(s));
    }
}

Sliding window approach

A sliding window is an abstract concept commonly used in array/string problems. A window is a range of elements in array/string which defined by start and end indices. A sliding window is a window which “slides” its two boundaries in a certain direction.
Read fundamentals and template for a sliding window to understand more about it and how it is applied to problems.

In the brute force approach, we repeatedly checked each substring for unique characters. Do we need to check each substring? If a substring s[i,j-1] contains non repeating characters, while adding jth character, check if that character is already present in substring s[i,j-1]. Since we scan substring to ascertain the uniqueness of new characters, the complexity of this algorithm is O(n2).

How about optimizing the scanning part? What if a hash is used to store characters which are already seen in substring s[i,j-1]. In that case, checking the uniqueness of a new character is done in O(1) and overall algorithm complexity becomes linear.

 public  static int longestSubstringWithoutRepeatingCharacters(String s) {
        int len = s.length();
        HashMap<Character, Boolean> characters = new HashMap<>();

        int maxLength = 0;
        int start = 0;
        int  end = 0;
        while (start < len && end < len) {
            //Check only the last character.
            if(!characters.containsKey(s.charAt(end))){
                characters.put(s.charAt(end), Boolean.TRUE);
                end++;
            }
            else {
                int currentLength = end-start;
                maxLength = Integer.max(maxLength, currentLength);
                //Move start of window one position ahead.
                characters.remove(s.charAt(start));
                start++;
            }
        }
        return maxLength;
    }

If a character already present in substring s[i,j-1], that means, it cannot be added to the longest substring. Find the length of substring (j-i) and compare it with the current maximum length. if it is greater, the max length of the longest substring without repeating characters is (j-i).
At last move the window to the position of duplicate.

Below is an example execution of the above code.
longest substring without repeating characters

Longest substring without repeating characters : 3

There is a small optimization that helps us to skip more characters when repeating character is found instead of skipping one at a time. Store the index of each character seen in substring [i,j-1].  While processing jth character, if it is already in the hash, we know the index k where that character is in the string. There is no way that any substring can contain unique characters till k and j are in it. So, we skip all indices from i to k and start from k+1 instead of i+1 as in the above method.

Show me the optimized code

  public static int longestSubstringWithoutRepeatingCharacters3(String s) {
        int len = s.length();
        HashMap<Character, Integer> characters = new HashMap<>();

        int maxLength = 0;

        for (int start=0, end = 0; end <len; end++) {
            if (characters.containsKey(s.charAt(end))) {
                //find the index of duplicate character.
                int currentIndex = characters.get(s.charAt(end));
                start = Integer.max(currentIndex, start) + 1;
            }
            int currentLength = end - start;
            maxLength = Integer.max(maxLength, currentLength);
            //Update new location of duplicate character
            characters.put(s.charAt(end), end );
        }
        return maxLength;
    }

Complexity of find longest substring without repeating characters is hence O(n) with additional space complexity of O(n).
Please share if something is wrong or missing. We would love to hear from you.