Boggle Search Problem in Java

The Boggle game is a popular word search puzzle where players attempt to find words hidden within a grid of letters. The goal is to trace paths through adjacent letters to form words according to predefined rules. In programming terms, solving the Boggle game involves efficiently searching for all valid words from a dictionary that can be constructed from a given grid.


Note: Word must have at least three letters.

So, in the above boggle board or grid, we can see the following words.

EAR (Index 2, 3 and 4)

EARN (Index 2, 3, 4 and 7)

RUN (Index 5, 6, 7)

WINE (Index 11, 12, 8, 2)

WIN (Index 11, 12, 8)

MAT (Index 10, 14, 15)

MATE (Index 10, 14, 15, 16)

Example 1:

Input:

Output:

 
["ABCCED", "SEE"]   

Solving the Boggle Search Problem

The Boggle search problem can be solved with several approaches, including brute force, depth-first search, an optimized per-word search, a trie, and backtracking.

Given a 4x4 matrix of letters and a dictionary, find all the valid words in the matrix. The following conditions must be met:

  1. Once a letter (cell) has been used, it cannot be used again within the same word.
  2. The word path can go in any direction.
  3. There has to be a path of letters forming the word (in other words, each letter in the word must be adjacent to the next one).

Depth First Traversal

Depth First Traversal explores each branch of a tree or graph structure as deeply as possible before backtracking. It prioritizes depth over breadth and is often implemented recursively, which makes it a good fit for pathfinding and exhaustive search tasks such as the Boggle Search Problem.

In this approach, consider every character as a starting character and find all words starting with it. All words starting from a character can be found using Depth First Traversal. We do depth-first traversal starting from every cell. We keep track of visited cells to make sure that a cell is considered only once in a word.
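The traversal described above can be sketched as follows. This is a minimal illustration, not the BoggleDFS.java listing itself: the class name BoggleDfsSketch, the 3x3 board, and the dictionary are all assumptions made for the example, so its output differs from the output listed further down.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class BoggleDfsSketch {
    // Starts a depth-first traversal from every cell and records each
    // dictionary word (3+ letters) every time a path spells it.
    static List<String> findWords(char[][] board, Set<String> dictionary) {
        List<String> found = new ArrayList<>();
        boolean[][] visited = new boolean[board.length][board[0].length];
        for (int r = 0; r < board.length; r++) {
            for (int c = 0; c < board[0].length; c++) {
                dfs(board, visited, r, c, new StringBuilder(), dictionary, found);
            }
        }
        return found;
    }

    private static void dfs(char[][] board, boolean[][] visited, int r, int c,
                            StringBuilder current, Set<String> dictionary, List<String> found) {
        visited[r][c] = true;               // a cell is used at most once per word
        current.append(board[r][c]);
        if (current.length() >= 3 && dictionary.contains(current.toString())) {
            found.add(current.toString());
        }
        // Try all 8 neighbours; the current cell itself is skipped because it is visited.
        for (int dr = -1; dr <= 1; dr++) {
            for (int dc = -1; dc <= 1; dc++) {
                int nr = r + dr;
                int nc = c + dc;
                boolean inside = nr >= 0 && nr < board.length && nc >= 0 && nc < board[0].length;
                if (inside && !visited[nr][nc]) {
                    dfs(board, visited, nr, nc, current, dictionary, found);
                }
            }
        }
        current.deleteCharAt(current.length() - 1); // backtrack
        visited[r][c] = false;
    }

    public static void main(String[] args) {
        // Hypothetical board and dictionary, chosen only for illustration.
        char[][] board = {
            {'C', 'A', 'T'},
            {'X', 'R', 'E'},
            {'Z', 'Y', 'A'}
        };
        Set<String> dictionary = new HashSet<>(Arrays.asList("CAT", "CAR", "TEA", "EAR", "DOG"));
        for (String word : findWords(board, dictionary)) {
            System.out.println(word);
        }
    }
}
```

On this assumed board, TEA and EAR are each reported twice, because either of the two A cells can complete them. The sketch reports a word once per path rather than once per word.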

File Name: BoggleDFS.java

Output:

 
Following words of dictionary are present IN Boggle Grid:
EAR
EARN
RUN
EAR
EARN
MAT
MATE
WIN
WINE
WINE   

Note that the above solution may print the same word multiple times when the word can be traced along more than one path. In the output above, for example, "EAR", "EARN", and "WINE" each appear more than once. To avoid this, we can use hashing to keep track of the words that have already been printed.
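One way to apply the hashing idea is to pass the found words through a hash-based set before printing. The sketch below assumes the words have already been collected into a list; the class name PrintOnceSketch and the method uniqueWords are hypothetical, and the sample list simply repeats the output shown above.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class PrintOnceSketch {
    // Keeps only the first occurrence of each word, preserving discovery order.
    static Set<String> uniqueWords(List<String> foundWords) {
        return new LinkedHashSet<>(foundWords);
    }

    public static void main(String[] args) {
        List<String> found = Arrays.asList(
            "EAR", "EARN", "RUN", "EAR", "EARN", "MAT", "MATE", "WIN", "WINE", "WINE");
        for (String word : uniqueWords(found)) {
            System.out.println(word); // each word is printed once
        }
    }
}
```

A LinkedHashSet is used here so that the words keep the order in which they were found; a plain HashSet would also remove duplicates but gives no ordering guarantee.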

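Time complexity: a depth-first traversal is started from every position in the grid, so the total cost is n*m × (time for one DFS). If a single DFS is bounded by O(|V| + |E|), where |V| is the total number of nodes and |E| is the total number of edges, both proportional to n*m, we get the bound below. Note that because cells are unmarked on backtracking, one DFS can revisit the same cells along many different paths, so this bound is optimistic and the worst case grows exponentially with the path length.

Time Complexity: O(N^2 × M^2)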

Auxiliary Space: O(N*M)

Optimized Approach

Step 1: Initialize the Boggle board (board), the dictionary (dictionary), a visited array to track visited cells, and a foundWords list to store the valid words found on the board.

Step 2: Iterate through each word in the dictionary and check whether it can be formed on the board using the canFormWord() method. Add the word to foundWords if it can be formed.

Step 3: Inside canFormWord(), iterate through each cell on the board and call the dfs() method from that cell to check whether the word can be formed starting there.

Step 4: In dfs(), if index equals the length of the word, the entire word has been matched, so return true. Return false if the cell is out of bounds, already visited, or does not match the current character. Otherwise:

  • Mark the current cell (board[x][y]) as visited (visited[x][y] = true).
  • Recursively explore the neighbouring cells in all 8 directions ((x-1, y), (x+1, y), (x, y-1), (x, y+1), and the four diagonals), continuing to match the subsequent characters of the word.
  • If a valid path is found, the dfs() function returns true.

Step 5: Backtrack by marking the current cell as not visited (visited[x][y] = false) before returning.

Let's implement the above algorithm in a Java program.

File Name: BoggleSolver.java
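A minimal sketch of Steps 1 to 5 is given here. The names board, visited, foundWords, canFormWord() and dfs() come from the steps above; the class name BoggleSolverSketch, the 3x3 board, and the dictionary are assumed values, chosen so that a run is consistent with the output listed below.

```java
import java.util.ArrayList;
import java.util.List;

class BoggleSolverSketch {
    private final char[][] board;
    private final boolean[][] visited;

    BoggleSolverSketch(char[][] board) {
        this.board = board;
        this.visited = new boolean[board.length][board[0].length];
    }

    // Step 2: keep every dictionary word that can be formed on the board.
    List<String> findWords(String[] dictionary) {
        List<String> foundWords = new ArrayList<>();
        for (String word : dictionary) {
            if (canFormWord(word)) {
                foundWords.add(word);
            }
        }
        return foundWords;
    }

    // Step 3: try every cell as the starting point for this word.
    boolean canFormWord(String word) {
        for (int x = 0; x < board.length; x++) {
            for (int y = 0; y < board[0].length; y++) {
                if (dfs(word, 0, x, y)) {
                    return true;
                }
            }
        }
        return false;
    }

    // Steps 4 and 5: match word[index] at (x, y), recurse, then backtrack.
    private boolean dfs(String word, int index, int x, int y) {
        if (index == word.length()) {
            return true; // every character has been matched
        }
        if (x < 0 || x >= board.length || y < 0 || y >= board[0].length
                || visited[x][y] || board[x][y] != word.charAt(index)) {
            return false;
        }
        visited[x][y] = true;
        boolean found = false;
        for (int dx = -1; dx <= 1 && !found; dx++) {
            for (int dy = -1; dy <= 1 && !found; dy++) {
                if (dx != 0 || dy != 0) {
                    found = dfs(word, index + 1, x + dx, y + dy);
                }
            }
        }
        visited[x][y] = false; // backtrack before returning
        return found;
    }

    public static void main(String[] args) {
        // Assumed board and dictionary, for illustration only.
        char[][] board = {
            {'a', 'b', 'c'},
            {'d', 'e', 'f'},
            {'g', 'h', 'i'}
        };
        String[] dictionary = {"abc", "beh", "cfi", "aba", "xyz"};
        System.out.println("Words found:");
        for (String word : new BoggleSolverSketch(board).findWords(dictionary)) {
            System.out.println(word);
        }
    }
}
```

In this assumed dictionary, "aba" is rejected because it would need the same 'a' cell twice, and "xyz" is rejected because none of its letters are on the board.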

Output:

 
Words found:
abc
beh
cfi   

Time Complexity: O(N × R × C × 8^W), where N is the number of words in the dictionary, R is the number of rows, C is the number of columns in the board, and W is the average length of words.

Auxiliary Space Complexity: O(R × C + N + W), where R × C is for the visited array, N is for storing found words, and W is for the recursive stack depth during DFS traversal.

Recursion + Binary Search Approach

In this approach, we recursively go through the board (using backtracking) and generate all possible words. For each word that is three or more letters long, we check whether it is in the dictionary. If it is, we have a match. Here are the steps of the algorithm:

  1. Read the dictionary into a container in memory.
  2. Sort the dictionary.
  3. Using backtracking, search the board for words.
  4. If a word is constructed and it contains 3 or more letters, do a binary search on the dictionary to see if the word is there.
  5. If the search in the previous step was successful, print the word.
  6. Repeat steps 3-5 as long as there are more words to construct on the board.
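The steps above can be sketched as follows, assuming the dictionary is already in memory as a String array. The class name BinarySearchBoggleSketch, the board, and the dictionary are hypothetical; the lookup uses the standard Arrays.sort and Arrays.binarySearch methods, and words are collected in a list instead of being printed directly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BinarySearchBoggleSketch {
    static List<String> findWords(char[][] board, String[] dictionary) {
        String[] sorted = dictionary.clone();
        Arrays.sort(sorted); // step 2: binary search needs a sorted dictionary
        List<String> found = new ArrayList<>();
        boolean[][] used = new boolean[board.length][board[0].length];
        for (int r = 0; r < board.length; r++) {
            for (int c = 0; c < board[0].length; c++) {
                search(board, used, r, c, new StringBuilder(), sorted, found);
            }
        }
        return found;
    }

    // Step 3: backtracking search that builds every possible letter sequence.
    private static void search(char[][] board, boolean[][] used, int r, int c,
                               StringBuilder word, String[] sorted, List<String> found) {
        if (r < 0 || r >= board.length || c < 0 || c >= board[0].length || used[r][c]) {
            return;
        }
        used[r][c] = true;
        word.append(board[r][c]);
        // Step 4: look up candidates of 3+ letters with a binary search.
        if (word.length() >= 3 && Arrays.binarySearch(sorted, word.toString()) >= 0) {
            found.add(word.toString()); // step 5: a match (may repeat per path)
        }
        for (int dr = -1; dr <= 1; dr++) {
            for (int dc = -1; dc <= 1; dc++) {
                search(board, used, r + dr, c + dc, word, sorted, found);
            }
        }
        word.setLength(word.length() - 1); // backtrack
        used[r][c] = false;
    }

    public static void main(String[] args) {
        // Hypothetical 3x3 board; larger boards make this exhaustive search very slow.
        char[][] board = {
            {'C', 'A', 'T'},
            {'X', 'R', 'E'},
            {'Z', 'Y', 'A'}
        };
        String[] dictionary = {"TEA", "CAT", "DOG", "EAR", "CAR"};
        System.out.println(findWords(board, dictionary));
    }
}
```

Each lookup costs O(log W) string comparisons for a dictionary of W words, but nothing limits how many candidate strings the board search generates.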

Complexity

In this solution, the dictionary lookup is handled well: with binary search, we can quickly determine whether a word is in the dictionary. The real bottleneck is searching the board for words. For an N x N board, the search space is O((N*N)!).

Pruned Recursion + Prefix Tree (also known as a Trie) Approach

In the previous approach, our major concern was the enormous search space on the board. Fortunately, using a prefix tree (trie) data structure, we can significantly cut down this search space. The reasoning behind this improvement is that if "abc" does not occur as a prefix of any word in the dictionary, there is no reason to keep searching for words after we encounter "abc" on the board. This cuts the run time down considerably. Here are the steps of the algorithm.

  1. Read a word from the dictionary file.
  2. Insert it into a prefix tree data structure in memory.
  3. Repeat steps 1-2 until all words in the dictionary have been inserted into the prefix tree.
  4. Using backtracking, search the board for words.
  5. If a word is found and it contains 3 or more letters, search the prefix tree for the word.
  6. If searching was *not* successful in the previous step, return from this branch of the backtracking stage. (There is no point in continuing along this branch, since the prefix tree shows that nothing in the dictionary starts this way.)
  7. If searching was successful in step 5, continue searching by constructing more words along this branch of backtracking, and stop when a leaf node has been reached in the prefix tree (at that point there is nothing more to search).
  8. Repeat steps 4-7 as long as there are more words to search in the backtracking.
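A minimal sketch of the pruned search is shown below. It is a common variant of the steps above, not a literal transcription: it walks the prefix tree one letter at a time alongside the board search, so a branch is abandoned as soon as its prefix is missing. The class name TrieBoggleSketch, the Node type, the board, and the dictionary are all assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class TrieBoggleSketch {
    static class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean isWord; // true if a dictionary word ends at this node
    }

    // Steps 1-3: insert every dictionary word into the prefix tree.
    static Node buildTrie(String[] dictionary) {
        Node root = new Node();
        for (String word : dictionary) {
            Node node = root;
            for (char ch : word.toCharArray()) {
                node = node.children.computeIfAbsent(ch, k -> new Node());
            }
            node.isWord = true;
        }
        return root;
    }

    static Set<String> findWords(char[][] board, String[] dictionary) {
        Node root = buildTrie(dictionary);
        Set<String> found = new TreeSet<>(); // a set, so each word appears once
        boolean[][] used = new boolean[board.length][board[0].length];
        for (int r = 0; r < board.length; r++) {
            for (int c = 0; c < board[0].length; c++) {
                search(board, used, r, c, root, new StringBuilder(), found);
            }
        }
        return found;
    }

    private static void search(char[][] board, boolean[][] used, int r, int c,
                               Node parent, StringBuilder word, Set<String> found) {
        if (r < 0 || r >= board.length || c < 0 || c >= board[0].length || used[r][c]) {
            return;
        }
        Node node = parent.children.get(board[r][c]);
        if (node == null) {
            return; // prune: no dictionary word has this prefix
        }
        used[r][c] = true;
        word.append(board[r][c]);
        if (node.isWord && word.length() >= 3) {
            found.add(word.toString());
        }
        if (!node.children.isEmpty()) { // stop at a leaf of the prefix tree
            for (int dr = -1; dr <= 1; dr++) {
                for (int dc = -1; dc <= 1; dc++) {
                    search(board, used, r + dr, c + dc, node, word, found);
                }
            }
        }
        word.setLength(word.length() - 1); // backtrack
        used[r][c] = false;
    }

    public static void main(String[] args) {
        // Hypothetical board and dictionary.
        char[][] board = {
            {'C', 'A', 'T'},
            {'X', 'R', 'E'},
            {'Z', 'Y', 'A'}
        };
        System.out.println(findWords(board, new String[]{"CAT", "CAR", "TEA", "EAR", "DOG"}));
    }
}
```

Because the board search only follows letters that exist in the prefix tree, paths starting at X, Z, or Y in this assumed board are abandoned immediately.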

Complexity

This approach significantly improves on the previous one. Building a prefix tree out of the dictionary words is O(W * L), where W is the number of words in the dictionary and L is the maximum length of a word in the dictionary.

Searching the board will be of the same order as the dictionary since we are not really searching words that are not in the dictionary. But in reality it will be more work than that as we still need to backtrack along the board to construct new words until we can consult the dictionary prefix tree to know whether it exists or not.

No search space + Dynamic Programming Approach

The 2nd approach mentioned above was good enough until the board size was 5. Unfortunately with a board size of 6, that too was taking forever to complete!

It got me thinking: "Dang, this search space is still too big to search! Can I just get rid of it entirely?" And then this idea popped into my mind: instead of randomly constructing word after word in this infinite ocean of words, why don't I take a word from the dictionary and somehow magically check whether it is available on the board?

It turns out, we can use a nifty dynamic programming technique to quickly check whether a word (from the dictionary in this case) can be constructed from the board or not!

Here is the core point of the dynamic programming idea:

For a word of length k to be found (end location) at the [i, j]-th location of the board, the k-1'th letter of that word must be located in one of the adjacent cells of [i, j].

The base case is k = 1.

A word of length 1 will be found (end location) in the [i, j]-th cell of the board if the only letter in the word matches the letter in the [i, j]-th location of the board.

Once our dynamic programming table is populated with the base case, we can build on top of that for any word of length k, k > 1.

Here is some sample code for this:
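The recurrence can be sketched as a table that is rebuilt for each prefix length k, where entry [i][j] records whether the first k letters of the word can end at cell [i, j]. The class name BoggleDpSketch and the method canSpell are hypothetical. One important assumption: this table does not track which cells a path has already used, so it does not enforce the no-reuse rule; a word it accepts may still need a backtracking check if that rule matters.

```java
class BoggleDpSketch {
    // Returns true if the word can be spelled by stepping between adjacent cells.
    // Assumption: reuse of a cell within one word is NOT detected by this table.
    static boolean canSpell(char[][] board, String word) {
        if (word.isEmpty()) {
            return false;
        }
        int rows = board.length;
        int cols = board[0].length;

        // Base case, k = 1: the prefix ends wherever the first letter occurs.
        boolean[][] prev = new boolean[rows][cols];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                prev[i][j] = board[i][j] == word.charAt(0);
            }
        }

        // k > 1: the prefix of length k ends at [i, j] if the letter matches and
        // the prefix of length k - 1 ends in one of the adjacent cells.
        for (int k = 1; k < word.length(); k++) {
            boolean[][] cur = new boolean[rows][cols];
            for (int i = 0; i < rows; i++) {
                for (int j = 0; j < cols; j++) {
                    cur[i][j] = board[i][j] == word.charAt(k) && hasMarkedNeighbour(prev, i, j);
                }
            }
            prev = cur;
        }

        for (boolean[] row : prev) {
            for (boolean cell : row) {
                if (cell) {
                    return true;
                }
            }
        }
        return false;
    }

    private static boolean hasMarkedNeighbour(boolean[][] table, int i, int j) {
        for (int di = -1; di <= 1; di++) {
            for (int dj = -1; dj <= 1; dj++) {
                int ni = i + di;
                int nj = j + dj;
                if ((di != 0 || dj != 0) && ni >= 0 && ni < table.length
                        && nj >= 0 && nj < table[0].length && table[ni][nj]) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical board.
        char[][] board = {
            {'a', 'b', 'c'},
            {'d', 'e', 'f'},
            {'g', 'h', 'i'}
        };
        System.out.println(canSpell(board, "beh")); // true
        System.out.println(canSpell(board, "aca")); // false: 'a' and 'c' are not adjacent
        System.out.println(canSpell(board, "aba")); // true, although it reuses the 'a' cell
    }
}
```

For a word of length L on an R x C board, filling the tables takes O(L × R × C) steps with a constant factor of 8 neighbour checks, which is why checking each dictionary word this way avoids the board's path search space.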