Breadth First traversal

Breadth First traversal

In the last post, we discussed depth first traversal of a graph. Today, we will discuss another way to traverse a graph, which is breadth first traversal. What is breadth first traversal? Unlike depth-first traversal, where we go deep before visiting neighbors, in breadth-first search, we visit all the neighbors of a node before moving a level down. For example, breadth first traversal of the graph shown below will be [1,2,5,3,4,6]

breadth first traversal

In breadth first search, we finish visiting all the nodes at a level before going further down the graph. For example, the graph used in the above example can be divided into three levels as shown.

breadth first search

We start with a node in level 1 which is node(1). Then visit all the nodes which are one level below node(1) which are node(2) and node(5). Then we visit all the node at level 3 which are node(3), node(4) and node(6).

Breadth First Traversal: algorithm

  1. Start with the given node u, put node u to queue
  2. While queue is not empty, repeat below steps:
    1. Dequeue fro queue and print node u.
    2. For each neighbor of u, node v
    3. If v is not visited already: add v to the queue
    4. mark v as visited

Let’s take an example and see how it works. Below is the graph and we have to find BFS for this graph.
breadth first traversal

We start from node(1), and put it in the queue.
breadth first traversal of graph

While the queue is not empty, we should pop from it and print the node. In this case, node(1) will be printed. Next, we go through all the neighbors of node(1) and put all the unvisited node on the queue. node(2) and node(5) will go on to the queue and marked as visited. Traversal = {1}

breadth first search

Again, we dequeue from the queue and this time we get node(2). We print it and go for all the neighbor node, node(3) and node(4) and mark them as visited. Traversal = {1,2}

node(5) is dequeued next and printed. Here, even though node(4) is a neighbor of node(5), it is already visited and hence not put on to the queue again. But node(6) is not yet visited, so put it on to the queue. Traversal = {1,2,5}

Now, we pop node(3) and print it, however, node(4) is already visited. Hence, nothing is added to the queue. Traversal = {1,2,5,3}

Next, node(4) is taken out from queue and printed, nothing goes on to queue. Traversal = {1,2,5,3,4}

Last, we pop node(6) and print it. Traversal = {1,2,5,3,4,6}.

At this point, the queue is empty and we stop traversal.

Breadth first traversal: implementation

public ArrayList<Integer> breadthFirstTraversal(){

        boolean[] visited = new boolean[this.G.length];
        ArrayList<Integer> traversal = new ArrayList<>();

        Queue<Integer> q = new LinkedList<>();

        //This is start node
        q.add(1);
        visited[1] = true;

        while(!q.isEmpty()){
            int u = (int)q.remove();
            traversal.add(u);

            for(int i=1; i< this.G[1].length; i++){
                if(this.G[u][i] && !visited[i]){
                    q.add(i);
                    visited[i]= true;
                }
            }
        }
        System.out.println(traversal);
        return traversal;

    }

The complexity of this code is O(V2) as at least V nodes will go in queue and for each nodes internal for loop runs V times.

Implementation of breadth-first search on graph represented by adjanceny list

  public ArrayList<Integer> breadthFirstTraversal(){

        boolean[] visited = new boolean[this.G.size()];
        ArrayList<Integer> traversal = new ArrayList<>();

        Queue<Integer> q = new LinkedList<>();

        //This is start node
        q.add(1);
        visited[1] = true;

        //This loop will run for V times, once for each node.
        while(!q.isEmpty()){
            int u = (int)q.remove();
            traversal.add(u);

            /*This loop has a worst-case complexity of O(V), where 
               node has an edge to every other node, but 
               the total number of times this loop will run is E times 
               where E number of edges.
             */
            for(int v : this.G.get(u)){
                if(!visited[v]){
                    q.add(v);
                    visited[v]= true;
                }
            }
        }
        System.out.println(traversal);
        return traversal;

    }

The complexity of Breadth First Search is O(V+E) where V is the number of vertices and E is the number of edges in the graph.

The complexity difference in BFS when implemented by Adjacency Lists and Matrix occurs due to this fact that in Adjacency Matrix, to tell which nodes are adjacent to a given vertex, we take O(|V|) time, irrespective of edges. Whereas, in Adjacency List, it is immediately available to us, takes time proportional to adjacent vertices itself, which on summation over all vertices |V| is |E|. So, BFS by Adjacency List gives O(|V| + |E|).

StackOverflow

When a graph is strongly connected, O(V + E) is actually O(V2)

Applications of Breadth first traversal

  1. To find shortest path between two nodes u and v
  2. To test bipartite-ness of a graph
  3. To find all nodes within one connected component

Please share if there is something wrong or missing. If you are preparing for an interview and want a free coaching session to guide you through it, please reach out to us at communications@algorithmsandme.com

Topological sorting

Topological sorting in a graph

Given a directed acyclic graph G (V,E), list all vertices such that for all edges (v,w), v is listed before w. Such an ordering is called topological sorting and vertices are in topological order. For example, topological sort for below graph would be: 1,2,3,5,4,6

topological sort

A topological ordering is not unique and a DAG can have more than one topological sort. For example, for above graph, 1,5,2,3,6,4 is also correct topological sort.

How to do a topological sort on a graph?

To start topological sort, we need a node which has zero incoming edges. If there is no such node in the graph, the graph is not directed acyclic graph, DAG. DAG will have at least one such node. Let’s G0 is the graph and V0 is the vertex with zero incoming node. So, V0 becomes the first vertex in order.
Now if we take out V0 and all the edges coming out from it, it leaves a graph G1, there must be a vertex V1 in G1 which has zero incoming edges. V1 is added to topological order.
TheV1 is removed from G1 which leaves us with G2 with TheV2 as candidate node. This goes on until there is no nodes remaining in the original graph G.

At any point in time, we cannot move forward, when there is no node with zero incoming nodes, it means there is a cycle in the graph and given graph is not a DAG.

Above description actually gives away the implementation detail too. We look for a vertex with zero incoming edges and take that. Then we remove all the edges coming out of that node from the graph, which will decrease the number of edges coming on to the neighbor nodes. Again select a node which has zero incoming edges and continues by removing edges.

Let’s take an example and work it out.

topological sorting

Topological sort: implementation

package com.company.Graphs;

import java.util.*;

/**
 * Created by sangar on 21.12.18.
 */
public class AdjacencyMatrix {
    private boolean [][] G;
    private boolean isDirected;

    public AdjacencyMatrix(int numNodes, boolean isDirected){
        this.G  = new boolean[numNodes+1][numNodes+1];
        this.isDirected = isDirected;
    }

    public void addEdge(int start, int dest){
        this.G[start][dest] = true;
        if(!isDirected){
            this.G[dest][start] = true;
        }
    }

    public boolean isEdge(int start, int dest){
        return this.G[start][dest];
    }

    public ArrayList<Integer> topologicalSort(){

        int[] inDegree = new int[this.G.length];
        boolean[] visited = new boolean[this.G.length];
        ArrayList<Integer> toplogicalOrder = new ArrayList<>();

        //Complexity of O(n^2)
        for(int i=1; i<this.G.length; i++) {
            for (int j = 1; j < this.G.length; j++) {
                if (this.G[i][j]) {
                    inDegree[j] += 1;
                }
            }
        }
        /* We will traverse each node */
        for(int i=1; i<this.G.length; i++){
            /* find a node with zero in degree */
            int u  = findNextNode(inDegree, visited);
            /* If there is no node with zero in degree,
            topological sort not possible */
            if(u != -1){
                toplogicalOrder.add(u);
                visited[u] = true;
                /* decrease in degree of each node adjacent
                to node wih zero in degree */
                for(int v=1; v<this.G.length; v++) {
                    if (this.G[u][v]) {
                        inDegree[v]--;
                    }
                }
            }
            else{
                System.out.println("\n Topological sort no possible");
                return toplogicalOrder;
            }
        }
        return toplogicalOrder;
    }

    private int findNextNode(int indegree[], boolean[] visited){
        for( int i=1; i< this.G.length; i++){
            if(indegree[i] == 0 && !visited[i]){
                return i;
            }
        }
        return -1;
    }
}

The complexity of topological sort implementation with adjacency matrix representation is O(V2). For each vertex we find the vertex with zero in-degree, hence the quadratic time.

Topological sort adjacency list represented graph

package com.company.Graphs;

import java.util.*;

/**
 * Created by sangar on 21.12.18.
 */
public class AdjacencyList {
    private Map<Integer, ArrayList<Integer>> G;
    boolean isDirected;

    public AdjacencyList(boolean isDirected){
        this.G = new HashMap<>();
        this.isDirected = isDirected;
    }

    public void addEdge(int start, int dest){

        if(this.G.containsKey(start)){
            this.G.get(start).add(dest);
        }else{
            this.G.put(start, new ArrayList<>(Arrays.asList(dest)));
        }

        if(!this.G.containsKey(dest)) {
            this.G.put(dest, new ArrayList<>());
        }
        //In case graph is undirected
        if(!this.isDirected) {
                this.G.get(dest).add(start);
        }
    }

    public boolean isEdge(int start, int dest){
        if(this.G.containsKey(start)){
            return this.G.get(start).contains(dest);
        }

        return false;
    }

    public ArrayList<Integer> topologicalSort(){

        int[] inDegree = new int[this.G.size()+1];
        boolean[] visited = new boolean[this.G.size()+1];
        ArrayList<Integer> topologicalOrder = new ArrayList<>();

        //Complexity of O(V + E)
        for(int u: this.G.keySet()){
            for (int v : this.G.get(u)){
                    inDegree[v] += 1;
            }
        }


        /* We will traverse each node */
        for(int i=1; i<this.G.size()+1; i++){
            /* find a node with zero in degree */
            int u  = findNextNode(inDegree, visited);
            /* If there is no node with zero in degree,
            topological sort not possible */
            if(u != -1){
                topologicalOrder.add(u);
                visited[u] = true;
                /* decrease in degree of each node adjacent
                to node wih zero in degree */
                for(int v : this.G.get(u)) {
                    inDegree[v]--;
                }
            }
            else{
                System.out.println("\n Topological sort no possible");
                return topologicalOrder;
            }
        }
        return topologicalOrder;
    }

    private int findNextNode(int inDegree[], boolean[] visited){
        for( int i=1; i< this.G.size()+1; i++){
            if(inDegree[i] == 0 && !visited[i]){
                return i;
            }
        }
        return -1;
    }
}

The complexity of this implementation is also O(V2) as still we scan all vertices to find the vertex with zero indegree.

Whenever we are updating the in-degree of all the adjacent node, we can store all the vertices for which in-degree becomes zero in a queue. So we don’t need to separately search for the node with zero in-degree, we can simply take the front of the queue, which will reduce the time complexity to O(V + E) with an additional space complexity of O(V). This is called Kahn’s algorithm.

Below is the code for topological sorting using Depth First traversal. We are using a global variable here count. Idea is that all the nodes which are descendant of node u will come after vertex u in topological order. If we fill the array in reverse order, all the descendants will be filled up first and then the starting node.

    private void topologicalSortDFS( int u, boolean[] visited,
                                    int[] topologicalOrder) {
        // Do a DFS starting from u
        if ( ! visited[u] ) {
            visited[u] = true;
            for (int v: this.G.get(u)) {
                topologicalSortDFS(v, visited, topologicalOrder);
            }
            this.count++;
            topologicalOrder[ this.G.size() + 1 - this.count ] = u;
        }
    }

    public void topologicalOrder() {

        boolean[] visited = new boolean[this.G.size() + 1];
        int[] topologicalOrder = new int[this.G.size() + 1];

        for (int i = 1; i < this.G.size() + 1; i++) {
            if (!visited[i]) {
                topologicalSortDFS(i, visited, topologicalOrder);
            }
        }
        Arrays.stream(topologicalOrder).forEach(i -> System.out.println(i));
    }

The complexity of above implementation is O(V + E) with adjacency list represented graph.

Please share if there is something wrong or missing. If you are preparing for an interview and want to have personalized coaching for your preparation, please signup for free demo class.

Graphs : Word ladder or word golf

Word ladder or word golf

Let me explain problem in detail first. We are given two words A and B and a dictionary of words. Task is to transform first word into second word by changing one character at a time. Character should be changed in such a way that the new word should be in dictionary. Find minimum number of such hops from one word to reach the second word.

For example, lets say given word are  CAT and DOG

The sequence of steps will be CAT –> COT–> COG–> DOG, hence three substitutions and we are done.
To convert COLD to WARM (from Wikipedia)
Sequence of steps will be COLD–>CORD–>CARD–>WARD–>WARM.

There are two strings/words given as input. And hoping can be done only if there is one condition : the destination of hop should differ at most one character with source. So if two words differ only in one character, we have a relationship between them.
So if take those two words as two nodes and connect them only if they differ by one character, we get a graph with two nodes and one edge.
Apply same principle to all words in dictionary and create a graph, nodes representing words and edges the relationship.
Once we have created graph as per above conditions, start breadth first traversing from the node which has values as first word and go on scanning till we get a node with value as second word. That’s all. Simple analysis! Implementation is bit tricky. Let’s look at implementation. For each word, if we go an scan every word in dictionary to figure out if they differ by one character in order to create edge will be too inefficient with complexity of O(n2). If we create buckets of words with one character of that replaced with _ and put all similar words into that bucket, we will segregate words with one character different. Let’s take an example.

Dictionary given is  {“CAT”, “BAT”, “COT”, “COG”, “COW”, “RAT”, “BUT”,”CUT”, “DOG”, “WEB”}. Now for word CAT, there are three buckets possible.

Now take word BAT, if we replace B with _ it becomes _AT, for which we already have a bucket, so we can put CAT and BAT in same bucket. Similarly when we replace second character of COT, it becomes C_T, for which we already have a bucket, hence CAT and COT are placed in once bucket. With above given words buckets will be like below.

Now connect all words in same buckets with each other. Final graph will be like below:

#include<stdio.h>
#include<stdlib.h>
#include<string.h>

#define NUM_WORDS 10
#define NUM_CHAR 4 
#define true 1
#define false 0

typedef struct node_s{
        char value[NUM_CHAR];
        int wt;
        struct node_s *next;
}Node_S;

Node_S *graph_s[NUM_WORDS + 1];

Node_S * create_node_s(char *temp){

    Node_S * new_node = (Node_S *)malloc(sizeof(Node_S));
    if(new_node){
        strcpy(new_node->value, temp);
        new_node->next = NULL;
    }
    else{
        printf("\n Node cannot be allocated");
    }
    return new_node;
}
void add_edge_3(int i, char *str){

    Node_S * temp = graph_s[i];
    if(temp == NULL){
        graph_s[i] = create_node_s(str);
    }
    else{
        while(temp->next){
            temp = temp->next;
        }
        temp->next = create_node_s(str);
    }
}
/* Check if there is already exisitng bucket */
int search_buckets(char buckets[NUM_WORDS * NUM_CHAR][NUM_CHAR], int count, char *temp){
    int i;
    for(i=0; i<count; i++){
        if(!strcmp(buckets[i], temp))
            return i;
    }
    return -1;
}

/* This adds string to a bucket */
void  add_string(Node_S *pre_graph[], int index, char * str){

    Node_S * temp = pre_graph[index];

    if(!temp){
        pre_graph[index] = create_node_s(str);
    }
    else{
        while(temp->next){
            temp  = temp->next;
        }
        temp->next = create_node_s(str);
    }
}

/* This function finds the index of the word in dictionary
Inefficient, can be done using binary search as words are in sorted 
order in dictionary */

int find_index(char dict[NUM_WORDS][NUM_CHAR], char * temp){
    int i;
    for(i=0; i<NUM_WORDS; i++){
        if(!strcmp(dict[i], temp))
            return i;
        }
    return -1;
}

/* This function links all words in buckets to create edges of graph */

void create_graph(Node_S *pre_graph[NUM_WORDS * NUM_CHAR],
                        char dict[NUM_WORDS][NUM_CHAR], int count){
    int i;
    int index, index_1, index_2;
    for(i=0; i<count; i++){
        Node_S * current =  pre_graph[i];
        while(current){
            index  = find_index(dict, current->value);
                if(!graph_s[index]){
                    add_edge_3(index, current->value);
                }

                next = current->next;
                while(next){
                    index_1  = find_index(dict, next->value);
                    if(!graph_s[index_1])
                        add_edge_3(index_1, next->value);
                    add_edge_3(index_1, current->value);
                    add_edge_3(index, next->value);
                    next = next->next;
                }
                current = current->next;
        }
    }
}

/*This function creates buckets and using those buckets also
creates graph where all neighbour nodes differ by at most one
character */
void create_buckets(char buckets[NUM_WORDS * NUM_CHAR][NUM_CHAR],
                        char dict[NUM_WORDS][NUM_CHAR]){

    int i,j;
    int count = 0;
    Node_S * pre_graph[NUM_WORDS * NUM_CHAR];

    int key[NUM_WORDS * NUM_CHAR];

    for(i=0; i<NUM_WORDS * NUM_CHAR; i++){
        pre_graph[i] = NULL;
    }
    for(i=0; i<NUM_WORDS; i++){
        for(j = 0; j<NUM_CHAR-1; j++ ){
            char temp[NUM_CHAR];
            strcpy(temp, dict[i]);
            temp[j] = '_';

            int index = search_buckets(buckets, count, temp);
            if(index != -1){
                add_string(pre_graph, index, dict[i]);
            }
            else{
                strcpy(buckets[count], temp);
                add_string(pre_graph, count,dict[i]);
                count++;
            }
        }
    }
    create_graph(pre_graph, dict, count);
    // Adjancancy matrix based representation       
    for(i=0; i<NUM_WORDS; i++){
        printf("Nodes adjacent to %d : ", i);

        Node_S * current  = graph_s[i];
        while(current){
            printf("%s ", current->value);
            current = current->next;
        }
        printf("\n");
    }
}


//Driver program
int main(){
    char buckets[NUM_WORDS * NUM_CHAR][NUM_CHAR];
    char dict[NUM_WORDS][NUM_CHAR] = {"CAT", "BAT", "COT", "COG", "COW", 
                                        "RAT", "BUT","CUT", "DOG", "WEB"};
    
    create_buckets(buckets, dict);

    bfs(graph_s[0]);
    return 0;
}

This above code needs optimization, please provide suggestions to improve. Breadth First Search implementation is as follows

typedef struct queue{
        Node_S *front;
        Node_S *rear;
}Queue;

int is_empty(Queue *q){
        if(q->front == NULL && q->rear == NULL) return true;
        return false;
}

void enqueue(Queue *q, char value[]){
        Node_S * temp = (Node_S *)malloc(sizeof(Node_S));
        strcpy(temp->value, value);
        if(is_empty(q)){
                q->front = temp;
                q->rear = temp;
        }
        else{
                if(q->rear){
                        q->rear->next = temp;
                        q->rear = temp;
                }
                else{
                        printf("\n Error ");
                }
        }
}

void dequeue(Queue *q, char str[]){
        if(!is_empty(q)){
                strcpy(str, q->front->value);
                Node_S * curr = q->front;
                if(q->front == q->rear){
                        q->rear = NULL;
                        q->front = NULL;
                }
                else{
                        q->front = curr->next;
                }
                free(curr);
        }
}
void init_queue(Queue **q){
        *q = (Queue *)malloc(sizeof(Queue));
        (*q)->front = NULL;
        (*q)->rear = NULL;
}

void bfs_1(Node_S * current, char *second, char dict[NUM_WORDS][NUM_CHAR]){

        Queue *q = NULL;

        init_queue(&q);
        char str[NUM_CHAR];
        int index ;
        if(!current) return;

        enqueue(q, current->value);
        while(!is_empty(q)){
                dequeue(q, str);
                printf("\n%s ", str);
                index = find_index(dict, str);
                visited[index] = true;
                if(!strcmp(str, second)){
                        return;
                }
                Node_S *temp = graph_s[index]->next;
                while(temp){

                        index = find_index(dict, temp->value);
                        if(visited[index] == false){
                                visited[index] = true;
                                enqueue(q, temp->value);
                        }
                        temp = temp->next;
                }
        }

}