Word break problem

This problem is commonly asked in Google and Amazon interviews. We all know that if a string typed into the Google search box does not make sense, Google breaks it into meaningful words and asks whether we meant those words instead of a single word. This post discusses how to find whether a given string can be broken into meaningful dictionary words. For example, if I type algorithmsandme and the given dictionary is [“algorithms”, “and”, “me”], the string can be broken into meaningful words. But if the string is algorithmsorme, it cannot be broken into meaningful words. You can find this problem for practice at leetcode.

Word break problem : thoughts

We start with the first character of the string and check whether the character itself is a word in the dictionary. If yes, then our problem reduces to a smaller problem: checking whether the substring from index 1 to s.length is breakable.
If not, or if that smaller problem turns out not to be breakable, we check the first two characters, then three characters, and so on until we have checked the whole string. With every character included, the problem reduces in size but remains the same, which makes it an ideal case for a recursive implementation.

package AlgorithmsAndMe;

import java.util.List;

public class WordBreak {

    public boolean wordBreak(String s, List<String> wordDict) {
        return wordBreakUtil(s, wordDict, 0);
    }

    // Returns true if the suffix of s starting at index can be broken into dictionary words.
    private boolean wordBreakUtil(String s, 
                                   List<String> wordDict, 
                                   int index) {

        // The empty suffix is trivially breakable.
        if (index == s.length()) return true;

        boolean isBreakable = false;
        for(int i=index; i<s.length(); i++) {
            // Try the prefix s[index..i]; recurse on the rest only if the prefix is a word.
            isBreakable = isBreakable 
                   || wordDict.contains(s.substring(index, i+1))
                    && wordBreakUtil(s, wordDict, i + 1);
        }

        return isBreakable;
    }
}
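To get a feel for how much work this recursion can do, here is a minimal sketch that counts the calls on a deliberately bad input. The class name WordBreakCallCounter, the static calls counter and the countCalls helper are hypothetical names added for illustration; the recursion itself is the same as above.

```java
import java.util.Arrays;
import java.util.List;

class WordBreakCallCounter {

    // Hypothetical counter, not part of the original solution.
    static long calls = 0;

    // Same recursion as wordBreakUtil, with one extra line to count invocations.
    static boolean wordBreakUtil(String s, List<String> wordDict, int index) {
        calls++;
        if (index == s.length()) return true;

        boolean isBreakable = false;
        for (int i = index; i < s.length(); i++) {
            isBreakable = isBreakable
                    || wordDict.contains(s.substring(index, i + 1))
                    && wordBreakUtil(s, wordDict, i + 1);
        }
        return isBreakable;
    }

    // Resets the counter, runs the recursion from index 0 and reports the number of calls.
    static long countCalls(String s, List<String> wordDict) {
        calls = 0;
        wordBreakUtil(s, wordDict, 0);
        return calls;
    }

    public static void main(String[] args) {
        // Every run of 'a' is a word, but the trailing 'b' makes the string unbreakable,
        // so the recursion explores every possible split before giving up.
        List<String> dict = Arrays.asList("a", "aa", "aaa", "aaaa", "aaaaa");
        System.out.println(countCalls("aaab", dict));   // 8
        System.out.println(countCalls("aaaab", dict));  // 16
        System.out.println(countCalls("aaaaab", dict)); // 32
    }
}
```

On this input the number of calls doubles with every extra character, even though there are only as many distinct suffixes as there are characters.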

If you look closely, the recursive function wordBreakUtil solves the same subproblems again and again. How can we avoid those repeated calculations? The best way is to save the already solved subproblems in a cache, so that we can check the cache to see whether a subproblem has already been solved. If it has, we do not solve it again and use the cached value instead. This approach is called the top-down approach and uses memoization to avoid solving subproblems repeatedly.

package AlgorithmsAndMe;

import java.util.Arrays;
import java.util.List;

public class WordBreak {

    public boolean wordBreak(String s, List<String> wordDict) {
        // table[i]: -1 = not computed yet, 0 = not breakable, 1 = breakable
        int [] table =  new int[s.length()];
        Arrays.fill(table, -1);
        return wordBreakUtilTopDown(s, wordDict, 0, table);
    }

    private boolean wordBreakUtilTopDown(String s, 
                            List<String> wordDict,
                            int index,
                            int[] table) {

        if (index == s.length()) return true;

        // Solve the subproblem only if it is not in the cache yet.
        if(table[index] < 0) {
            boolean isBreakable = false;
            for (int i = index; i < s.length(); i++) {
                isBreakable = isBreakable 
                        || wordDict.contains(s.substring(index, i + 1))
                        && wordBreakUtilTopDown(s, wordDict, i + 1, table);
            }
            table[index] = isBreakable ? 1 : 0;
        }
        return table[index] == 1;
    }
}

If you run the first solution, it will exceed the time limit on leetcode; the second implementation, however, should be accepted, with a run time of about 4ms. Now you can appreciate the efficiency gained by memoization.

Word break problem using dynamic programming

In the last two implementations, two things are evident: first, the solution of a subproblem leads to the solution of the original problem. Second, there are overlapping subproblems. These are the two must-have conditions for applying dynamic programming. We already saw the memoized, top-down approach of DP, which avoids solving subproblems repeatedly. How can we do it bottom up?

What if we store, for each index i, whether the string up to index i is breakable? What will be the base case? The string before index 0 is always breakable, as it is the empty string, so table[0] is always true. To check whether the string up to index i is breakable, we look at every index j from 0 to i-1 and check whether the string up to j is breakable. If it is, we just check whether the substring from index j to i is in the dictionary; if so, table[i] becomes true.

package AlgorithmsAndMe;

import java.util.List;

public class WordBreak {

    public boolean wordBreak(String s, List<String> wordDict) {
        return wordBreakUtilBottomUp(s, wordDict);
    }

    private boolean wordBreakUtilBottomUp(String s, List<String> wordDict){

        if(s == null || s.length() == 0) return false;

        // table[i] is true if the first i characters of s can be broken into words.
        boolean[] table  = new boolean[s.length()+1];

        // Base case: the empty prefix is always breakable.
        table[0] = true;
        for (int i = 1; i <= s.length(); i++) {
            for (int j = i - 1; j >= 0; j--) {
                // s[0..j) is breakable and s[j..i) is a word, so s[0..i) is breakable.
                if (table[j] && wordDict.contains(s.substring(j, i))) {
                    table[i] = true;
                    break;
                }
            }
        }
        return table[s.length()];
    }
}

The time complexity of the above implementation of the word break problem is O(n^2), counting each dictionary lookup as a constant-time operation.
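The quadratic bound above counts the iterations of the two loops. Each iteration also calls contains on a List, which scans the whole dictionary. A minimal sketch of one way to make the lookups cheaper is to copy the dictionary into a HashSet and to stop the inner loop once the candidate substring is longer than the longest word. The class name WordBreakWithSet and the variable maxLen are assumptions made for this illustration, not part of the original solution.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class WordBreakWithSet {

    static boolean wordBreak(String s, List<String> wordDict) {
        // Same convention as the bottom-up code above: null or empty input is not breakable.
        if (s == null || s.isEmpty()) return false;

        // A hash set gives expected constant-time membership checks.
        Set<String> words = new HashSet<>(wordDict);

        // No substring longer than the longest word can be in the dictionary.
        int maxLen = 0;
        for (String w : words) maxLen = Math.max(maxLen, w.length());

        // table[i] is true if the first i characters of s can be broken into words.
        boolean[] table = new boolean[s.length() + 1];
        table[0] = true;
        for (int i = 1; i <= s.length(); i++) {
            for (int j = i - 1; j >= 0 && i - j <= maxLen; j--) {
                if (table[j] && words.contains(s.substring(j, i))) {
                    table[i] = true;
                    break;
                }
            }
        }
        return table[s.length()];
    }

    public static void main(String[] args) {
        List<String> dict = Arrays.asList("algorithms", "and", "me");
        System.out.println(wordBreak("algorithmsandme", dict)); // true
        System.out.println(wordBreak("algorithmsorme", dict));  // false
    }
}
```

The table and the recurrence are unchanged; only the lookup structure and the bound on the inner loop differ.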

If you want to store all the sentences that can be generated by breaking a given string, below is the code.

package AlgorithmsAndMe;

import java.util.*;

public class WordBreak2 {

    public List<String> wordBreak(String s, List<String> wordDict) {
        Map<String, List<String>> map = new HashMap<>();
        return wordBreakUtil2(s, wordDict, map);
    }

    private List<String> wordBreakUtil2(String s,
                                        List<String> wordDict,
                                        Map<String, List<String>> map) {

        if(map.containsKey(s)){
            return map.get(s);
        }

        List<String> result = new ArrayList<String>();
        if (wordDict.contains(s)){
            result.add(s);
        }

        for(int i=1; i<=s.length(); i++) {
            String prefix = s.substring(0, i);
            if(wordDict.contains(prefix)){
                List<String> returnStringsList = wordBreakUtil2(s.substring(i), wordDict, map);

                for(String returnString :returnStringsList ){
                    result.add(prefix + " " + returnString);
                }
            }
        }
        map.put(s,result);

        return result;
    }
}
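As a quick usage sketch, the following self-contained demo restates the same recursion in condensed form and runs it on two inputs. The class name WordBreak2Demo, the helper name split and the sample dictionary are hypothetical and chosen only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class WordBreak2Demo {

    static List<String> wordBreak(String s, List<String> wordDict) {
        return split(s, wordDict, new HashMap<>());
    }

    // Same idea as wordBreakUtil2: memo caches the list of sentences for each suffix.
    private static List<String> split(String s, List<String> wordDict,
                                      Map<String, List<String>> memo) {
        if (memo.containsKey(s)) return memo.get(s);

        List<String> result = new ArrayList<>();
        // The whole string is itself a dictionary word.
        if (wordDict.contains(s)) result.add(s);

        // Try every proper prefix that is a word and combine it with
        // every sentence that can be built from the remaining suffix.
        for (int i = 1; i < s.length(); i++) {
            String prefix = s.substring(0, i);
            if (wordDict.contains(prefix)) {
                for (String rest : split(s.substring(i), wordDict, memo)) {
                    result.add(prefix + " " + rest);
                }
            }
        }
        memo.put(s, result);
        return result;
    }

    public static void main(String[] args) {
        List<String> dict = Arrays.asList("cat", "cats", "and", "sand", "dog");
        System.out.println(wordBreak("catsanddog", dict)); // [cat sand dog, cats and dog]
        System.out.println(wordBreak("catsandog", dict));  // []
    }
}
```

An empty list means the string cannot be broken into dictionary words at all.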

Please share if something is wrong or missing. If you are preparing for an interview and need any help with preparation, please reach out to us or book a free session.