Matrix and graph traversals

Today, we will discuss an approach that can be used to solve quite a few problems with matrices. Problem structure is quite similar, you have a matrix, where cells represent real-world entities like land and water or oranges or walls and gates, etc. and we are asked to find something in that matrix. An example problem statement would be island problem. (taken from leetcode):

Given a 2d grid map of ‘1’s (land) and ‘0’s (water), count the number of islands. An island is surrounded by water and is formed by connecting adjacent lands horizontally or vertically. You may assume all four edges of the grid are all surrounded by water.

island problem graph dfs
In this example, there are three islands.

Most of the time these problems have two entities, in the above example, its water and land. At first sight, this problem does not give any hint that it is a graph problem. For a problem to be a graph problem, it must have nodes and edges.

What are the nodes and edges of the graph?

One hint is to the relationship between the cells of the matrix. For example, in the island problem, 0s are water, and 1s are land. Land can be connected to land only which is adjacent to the land and a piece of land is isolated if surrounded by water from all four sides.

In this case, nodes seem to be the cells with value 1(land). Most of the time, in these kinds of problems, each cell of the matrix with a certain value is a node. If we have an n x m matrix, we are looking at n x m possible nodes. Also, here be aware that we are not interested in all the cells. Based on the problem statement, we can discard some cells. For example, in the island problem, we are interested in the cells which represent the land.

Now, what will be the edges? In these kinds of problems, a statement is given like an island is surrounded by water and is formed by connecting adjacent lands horizontally or vertically. It means that a cell with value 1 is connected to neighbor cells only if they also contain 1.

Now that we know that cells containing 1 in the matrix are nodes of the graphs and connected to neighbors in they are also 1. To count the number of islands, all we have to do is find connected components in the graph. We reduced a matrix problem to a graph problem after all. Should we create an entire graph? Well no, at any point in time, we do not need the entire graph.

Traversals of graphs like DFS or BFS need a node and its neighbors.

If we are at a particular cell of a matrix, we already know the node. With the problem definition, we can know which other cells can be a neighbor of this node. So, we do not need to actually convert the matrix to a graph but just mimic the getNeighbor(Node u) function of a graph.

Number of islands

The problem statement is given above as an example, we already identified the nodes and edges of the graph implicit in the matrix. We also know that to find the number of islands in the matrix, we have to find the number of connected components in the graphs as explained above. Which traversal can we use to find the number of connected components of a graph? It is a depth-first search problem.

  public int numIslands(char[][] grid) {
    int count = 0;
    Set<Integer> visited = new HashSet<>();

    int n = grid.length;
    
    if(n <=0) return 0;
    int m = grid[0].length;

    for(int i=0; i<n; i++){
        for(int j=0; j<m; j++){
            int cell = i * m + j;
            if(!visited.contains(cell) && grid[i][j] == '1'){
                count++;
                dfs(i, j, visited, grid);
            }
        }
    }
    return count;
 }

void dfs(int i, int j, Set<Integer> visited, char[][] a){

    //Check if the node we are traversing is actually a valid node.
    if(i<0 || i>=a.length 
          || j<0 || j>=a[0].length 
          || a[i][j] == '0') return;

    int cell = i * a[0].length + j;
    if(visited.contains(cell)) return;
    visited.add(cell); 

    //Visit neighbors
    dfs(i+1, j, visited, a);
    dfs(i-1, j, visited, a);
    dfs(i, j+1, visited, a);
    dfs(i, j-1, visited, a);
}

Number of closed islands

There is another leetcode problem called Number of closed islands, it goes like:

Given a 2D grid consists of 0s (land) and 1s (water). An island is a maximal 4-directionally connected group of 0s and a closed island is an island totally (all left, top, right, bottom) surrounded by 1s. Return the number of closed islands.

Again, the similar structure of the problem, a matrix, cells representing the land and water. The only difference is now we have to find which land pieces are connected to the corners. Cells with 0s are lands, so cells with value 0 are our nodes. We will do a DFS starting from each of these nodes and see if we hit boundaries of the matrix, in that case, the node all these nodes cannot be closed islands.

  int numClosedIslands(int[][] g){
        int count = 0; 
        for(int i = 0;i < g.length;i++){
           for(int j = 0;j < g[0].length;j++){
                /* Our node is all the cells with 
                   value zero, we will do a DFS from each node
                */
                if(g[i][j] == 0){
                       count += dfs(g,i,j);
                }
           }
        }
        return count;
    }

    private int dfs(int[][] g,int row,int col){
         //if we meet the edge return 0;
         if(row < 0 || row >= g.length 
                || col < 0 || col >= g[0].length){
                 return 0;
         }
         //if we meet 1 return 1;
         if(g[row][col] == 1){
             return 1;
         }

         /* This is to make sure that 
             we do not visit same node again and again
         */
         g[row][col] = 1;

         /* Make sure that we do not reach to the boundaries 
         doing DFS. */
         int up = dfs(g,row-1,col);
         int down = dfs(g,row+1,col);
         int left = dfs(g,row,col-1);
         int right = dfs(g,row,col+1);

         //any of the sides meet the edge, is not an island;
         return up & down & left & right;
    }

Walls and gates problem

There is another problem which falls under the same criteria, the problem is known as walls and gates problem

Given a m x n 2D grid initialized with these three possible values. -1 – A wall or an obstacle. 0 – A gate. INF – Infinity means an empty room. We use the value 231 – 1 = 2147483647 to represent INF as you may assume that the distance to a gate is less than 2147483647.

Fill each empty room with the distance to its nearest gate. If it is impossible to reach a gate, it should be filled with INF.

walls-and-gates

walls-and-gates-solution
Walls and gates solutions

In this problem, we have to find the distance to the room from gates and fill it with the minimum distance. Question number one, what are our nodes? In this problem, all the cells except the walls are nodes of the graph. Why walls are not nodes? Because one cannot go through the obstacles, so the cannot be part of the path of nodes, hence they cannot be considered as a node.

We can traverse horizontal and vertical, that means nodes at (i+1, j), (i-1, j), (i, j+1) and (i, j-1) are neighbors of cell (i,j) if they are a room or a gate.

To find the minimum distance from gates to a room, start a DFS from each gate and update the distance of the room with minimum distance seen so far. All we have to do it keep track of distance while starting from a gate.

package AlgorithmsAndMe;

import java.util.HashSet;
import java.util.Set;

public class WallsAndGates {

    public int[][] wallsAndGates(int[][] grid){

        int n = grid.length;
        if(n<=0) return new int[0][0];

        int m = grid[0].length;

        Set<Integer> visited = new HashSet<>();

        for(int i=0; i<n; i++){
            for(int j=0; j<m; j++){
                //This is our gate
                if(grid[i][j] == 0){
                    dfs(i, j, grid, 0, visited);
                }
            }
        }

        return grid;
    }

    private void dfs(int x, int y, int[][] grid,
                     int distance, Set<Integer> visited){

        int cellId = x*grid[0].length + y;
        if(x<0 || x>=grid.length
                || y<0 || y>grid[0].length
                || grid[x][y] == -1
                || visited.contains(cellId)){
            return;
        }

        //Mark as visited for this round.
        visited.add(cellId);

        //update the distance
        grid[x][y] = Math.min(grid[x][y], distance);

        dfs(x+1, y, grid, distance+1,visited);
        dfs(x-1, y, grid, distance+1, visited);
        dfs(x, y+1, grid, distance+1, visited);
        dfs(x, y-1, grid, distance+1, visited);

        //Mark unvisited for the next round.
        visited.remove(cellId);

    }
}

Surrounded Regions

Given a 2D board containing ‘X’ and ‘O’ (the letter O), capture all regions surrounded by ‘X’. A region is captured by flipping all ‘O’s into ‘X’s in that surrounded region.

In this problem, again all the cells with zero are nodes, and edges are between the horizontal and vertical neighbors nodes. To capture all the nodes which are not connected to the edge of the matrix, first, find out which all are actually connected to the edge and mark them. Once we know all the nodes which are connected to the edge of the matrix, we can capture the nodes which are not connected.
To do so, we will start with nodes at the edge of the matrix and find the connected component of it and mark all the nodes in that connected component.

 public void solve(char[][] grid) {
        int n = grid.length;
        if(n<=0) return;

        int m = grid[0].length;

        //Cover the first and last rows.
        for(int j=0; j<m; j++){
            if(grid[0][j] == 'O') dfs(0, j, grid);
            if(grid[n-1][j] == 'O') dfs(n-1, j, grid);
        }

        //Cover the first and last columns.
        for(int i=0; i<n; i++){
            if(grid[i][0] == 'O') dfs(i, 0, grid);
            if(grid[i][m-1] == 'O') dfs(i, m-1, grid);
        }

        for(int i=0; i<n; i++){
            for(int j=0; j<m; j++){
                if(grid[i][j] == 'O') grid[i][j] = 'X';
                if(grid[i][j] == '1') grid[i][j] = 'O';
            }
        }

        return;
    }

    private void dfs(int x, int y, char[][] grid){

        /* If we reach the edge of the node, return 
           Or we hit 'X', in that case, we cannot move
           further in that direction.
        */
        if(x<0 || x>=grid.length
                || y<0 || y>=grid[0].length
                || grid[x][y] == 'X'){
            return;
        }

        if(grid[x][y] == 'O') {
            grid[x][y] = '1';

            dfs(x + 1, y, grid);
            dfs(x - 1, y, grid);
            dfs(x, y + 1, grid);
            dfs(x, y - 1, grid);
        }
    }  
 

The time complexity of all the solutions mentioned above is O(n * m) where n and m are rows and columns of the matrix. Because we will visit each cell at least once.
Please share if there is something wrong or missing. If you are preparing for an interviews and need help with it, please book a free session with us.