## Topological sorting in a graph

Given a directed acyclic graph G (V,E), list all vertices such that for all edges (v,w), v is listed before w. Such an ordering is called topological sorting and vertices are in topological order. For example, topological sort for below graph would be: 1,2,3,5,4,6 A topological ordering is not unique and a DAG can have more than one topological sort. For example, for above graph, 1,5,2,3,6,4 is also correct topological sort.

### How to do a topological sort on a graph?

To start topological sort, we need a node which has zero incoming edges. If there is no such node in the graph, the graph is not directed acyclic graph, DAG. DAG will have at least one such node. Let’s G0 is the graph and V0 is the vertex with zero incoming node. So, V0 becomes the first vertex in order.
Now if we take out V0 and all the edges coming out from it, it leaves a graph G1, there must be a vertex V1 in G1 which has zero incoming edges. V1 is added to topological order.
TheV1 is removed from G1 which leaves us with G2 with TheV2 as candidate node. This goes on until there is no nodes remaining in the original graph G.

At any point in time, we cannot move forward, when there is no node with zero incoming nodes, it means there is a cycle in the graph and given graph is not a DAG.

Above description actually gives away the implementation detail too. We look for a vertex with zero incoming edges and take that. Then we remove all the edges coming out of that node from the graph, which will decrease the number of edges coming on to the neighbor nodes. Again select a node which has zero incoming edges and continues by removing edges.

Let’s take an example and work it out. #### Topological sort: implementation

```package com.company.Graphs;

import java.util.*;

/**
* Created by sangar on 21.12.18.
*/
private boolean [][] G;
private boolean isDirected;

this.G  = new boolean[numNodes+1][numNodes+1];
this.isDirected = isDirected;
}

public void addEdge(int start, int dest){
this.G[start][dest] = true;
if(!isDirected){
this.G[dest][start] = true;
}
}

public boolean isEdge(int start, int dest){
return this.G[start][dest];
}

public ArrayList<Integer> topologicalSort(){

int[] inDegree = new int[this.G.length];
boolean[] visited = new boolean[this.G.length];
ArrayList<Integer> toplogicalOrder = new ArrayList<>();

//Complexity of O(n^2)
for(int i=1; i<this.G.length; i++) {
for (int j = 1; j < this.G.length; j++) {
if (this.G[i][j]) {
inDegree[j] += 1;
}
}
}
/* We will traverse each node */
for(int i=1; i<this.G.length; i++){
/* find a node with zero in degree */
int u  = findNextNode(inDegree, visited);
/* If there is no node with zero in degree,
topological sort not possible */
if(u != -1){
visited[u] = true;
/* decrease in degree of each node adjacent
to node wih zero in degree */
for(int v=1; v<this.G.length; v++) {
if (this.G[u][v]) {
inDegree[v]--;
}
}
}
else{
System.out.println("\n Topological sort no possible");
}
}
}

private int findNextNode(int indegree[], boolean[] visited){
for( int i=1; i< this.G.length; i++){
if(indegree[i] == 0 && !visited[i]){
return i;
}
}
return -1;
}
}
```

The complexity of topological sort implementation with adjacency matrix representation is O(V2). For each vertex we find the vertex with zero in-degree, hence the quadratic time.

Topological sort adjacency list represented graph

```package com.company.Graphs;

import java.util.*;

/**
* Created by sangar on 21.12.18.
*/
private Map<Integer, ArrayList<Integer>> G;
boolean isDirected;

this.G = new HashMap<>();
this.isDirected = isDirected;
}

public void addEdge(int start, int dest){

if(this.G.containsKey(start)){
}else{
this.G.put(start, new ArrayList<>(Arrays.asList(dest)));
}

if(!this.G.containsKey(dest)) {
this.G.put(dest, new ArrayList<>());
}
//In case graph is undirected
if(!this.isDirected) {
}
}

public boolean isEdge(int start, int dest){
if(this.G.containsKey(start)){
return this.G.get(start).contains(dest);
}

return false;
}

public ArrayList<Integer> topologicalSort(){

int[] inDegree = new int[this.G.size()+1];
boolean[] visited = new boolean[this.G.size()+1];
ArrayList<Integer> topologicalOrder = new ArrayList<>();

//Complexity of O(V + E)
for(int u: this.G.keySet()){
for (int v : this.G.get(u)){
inDegree[v] += 1;
}
}

/* We will traverse each node */
for(int i=1; i<this.G.size()+1; i++){
/* find a node with zero in degree */
int u  = findNextNode(inDegree, visited);
/* If there is no node with zero in degree,
topological sort not possible */
if(u != -1){
visited[u] = true;
/* decrease in degree of each node adjacent
to node wih zero in degree */
for(int v : this.G.get(u)) {
inDegree[v]--;
}
}
else{
System.out.println("\n Topological sort no possible");
}
}
}

private int findNextNode(int inDegree[], boolean[] visited){
for( int i=1; i< this.G.size()+1; i++){
if(inDegree[i] == 0 && !visited[i]){
return i;
}
}
return -1;
}
}

```

The complexity of this implementation is also `O(V2)` as still we scan all vertices to find the vertex with zero indegree.

Whenever we are updating the in-degree of all the adjacent node, we can store all the vertices for which in-degree becomes zero in a queue. So we don’t need to separately search for the node with zero in-degree, we can simply take the front of the queue, which will reduce the time complexity to `O(V + E)` with an additional space complexity of `O(V)`. This is called Kahn’s algorithm.

Below is the code for topological sorting using Depth First traversal. We are using a global variable here count. Idea is that all the nodes which are descendant of node u will come after vertex u in topological order. If we fill the array in reverse order, all the descendants will be filled up first and then the starting node.

```    private void topologicalSortDFS( int u, boolean[] visited,
int[] topologicalOrder) {
// Do a DFS starting from u
if ( ! visited[u] ) {
visited[u] = true;
for (int v: this.G.get(u)) {
topologicalSortDFS(v, visited, topologicalOrder);
}
this.count++;
topologicalOrder[ this.G.size() + 1 - this.count ] = u;
}
}

public void topologicalOrder() {

boolean[] visited = new boolean[this.G.size() + 1];
int[] topologicalOrder = new int[this.G.size() + 1];

for (int i = 1; i < this.G.size() + 1; i++) {
if (!visited[i]) {
topologicalSortDFS(i, visited, topologicalOrder);
}
}
Arrays.stream(topologicalOrder).forEach(i -> System.out.println(i));
}
```

The complexity of above implementation is `O(V + E)` with adjacency list represented graph.

Please share if there is something wrong or missing. If you are preparing for an interview and want to have personalized coaching for your preparation, please signup for free demo class.

# Dijkstra’s Algorithm to find shortest path

Given a graph, directed or undirected and two nodes, find the shortest path between these two nodes.
This is a standard problem and we don’t need to figure out what to do. We will have an adjacency list representation of the graph. The algorithm is widely published and is as below.

1. Initialize distance of all nodes from start node as INFINITE and all nodes as not finalized.
2. Take source node to start with, let’s say u.  Distance from source or start node to itself will be zero.
3. Mark u as considered and distance finalized.
4. Now for all neighbor nodes v of it, update the distance if the current distance is more than the distance of u + weight of (u,v).
For example, if the current node A is marked with a distance of 6, and the edge connecting it with a neighbor B has length 2, then the distance to B (through A) will be 6 + 2 = 8. If B was previously marked with a distance greater than 8 then change it to 8. Otherwise, keep the current value. {From Wikipedia}
5. Now select a node for which distance is not finalized and distance is minimum till now and make it u. Go to step 3.

Let’s work out an example and see how it works. Input graph is

To figure out the path which was followed to reach the destination from source, we can have an array to keep track of the parent node whenever distance to the node is updated. By reverse tracking parents from destination to source, we can figure out the path.

## Dijkstrat’s algorithm to find shortest path : Implementation

```#include<stdio.h>
#include<stdlib.h>
#include<string.h>

#define NUM_NODE 7
#define NUM_WORDS 10
#define NUM_CHAR 4
#define true 1
#define false 0

#define INFINITE 1000

typedef struct node{
int value;
int wt;
struct node *next;
}Node;

Node *graph[NUM_NODE + 1];

void add_edge_2(int i, int j, int wt);

int find_minimum_node(int visited[], int dist[]){
int min = INFINITE;
int index = -1;
int i;
for(i=1; i<= NUM_NODE; i++){
if(visited[i] == false && min>dist[i]){
min = dist[i];
index = i;
}
}
return index;
}

void dijstras(Node * graph[], int start, int end ){
int i;
int parent[NUM_NODE +1];
int distance[NUM_NODE+1];
int visited[NUM_NODE+1];

for(i=1; i<=NUM_NODE; i++){
visited[i] = false;
distance[i] = INFINITE;
parent[i] = -1;
}
// Mark distance of start as 0.
distance[start] =0;
for(i=1; i<=NUM_NODE; i++){
int index  = find_minimum_node(visited, distance);
if(index != -1){
Node * node = graph[index];
Node * current  = node->next;

while(current){
/*If neihbour node is not visited and its current distance is
more than distance of current node + cost of edge between
current node and this node, update the distance */
if(visited[current->value] == false && distance[current->value] >
distance[node->value] + current->wt){

distance[current->value] = distance[node->value] + current->wt;
parent[current->value] = node->value;
}
current = current->next;
}
visited[node->value] = true;
if(node->value == end)
break;
}
else{
break;
}
}

printf("\nDistance between %d and %d : %d", start , end, distance[end]);

// Printing path in reverse order,using stack, we can print it normal order to.
printf("\nPath is  (In reverse order): ");
int cur_parent =0;
while(cur_parent != -1){
printf("%d ", end );
cur_parent = parent[end];
end = cur_parent;
}
printf("\n");
}

Node *createNode(int j, int wt){

Node * new_node = (Node *)malloc(sizeof(Node));
if(new_node){
new_node->value = j;
new_node->next = NULL;
new_node->wt = wt;
}
else{
printf("\n Node cannot be allocated");
}
return new_node;
}

void addEdge(int i, int j, int wt){

Node * temp = graph[i];
if(temp == NULL){
graph[i] = createNode(j, wt);
}
else{
while(temp->next){
temp = temp->next;
}
temp->next = createNode(j, wt);
}
}

//driver program
int main(){

int i,j;
for(i=1; i<=NUM_NODE; i++){
graph[i] = createNode(i,0);
}
// creating graph with weighted edges.

dijstras(graph, 1, 6);

return 0;
}
```

The complexity of the above code is O(V2) where V is the number of vertices in the graph. This can be reduced to O(E log V) by using heaps. Heaps will reduce the complexity of searching minimum weight cost from O(V) to O(log V).
Limitation of algorithm1. It does not work with negative weights.

Please share if there is anything wrong or missing.

Posted on Categories Algorithms, Data Structures, Graph, Greedy1 Comment on Dijkstra’s Algorithm to find shortest path

## Word ladder or word golf

Let me explain problem in detail first. We are given two words A and B and a dictionary of words. Task is to transform first word into second word by changing one character at a time. Character should be changed in such a way that the new word should be in dictionary. Find minimum number of such hops from one word to reach the second word.

For example, lets say given word are  CAT and DOG The sequence of steps will be CAT –> COT–> COG–> DOG, hence three substitutions and we are done.
To convert COLD to WARM (from Wikipedia)
Sequence of steps will be COLD–>CORD–>CARD–>WARD–>WARM.

There are two strings/words given as input. And hoping can be done only if there is one condition : the destination of hop should differ at most one character with source. So if two words differ only in one character, we have a relationship between them.
So if take those two words as two nodes and connect them only if they differ by one character, we get a graph with two nodes and one edge.
Apply same principle to all words in dictionary and create a graph, nodes representing words and edges the relationship.
Once we have created graph as per above conditions, start breadth first traversing from the node which has values as first word and go on scanning till we get a node with value as second word. That’s all. Simple analysis! Implementation is bit tricky. Let’s look at implementation. For each word, if we go an scan every word in dictionary to figure out if they differ by one character in order to create edge will be too inefficient with complexity of O(n2). If we create buckets of words with one character of that replaced with _ and put all similar words into that bucket, we will segregate words with one character different. Let’s take an example.

Dictionary given is  {“CAT”, “BAT”, “COT”, “COG”, “COW”, “RAT”, “BUT”,”CUT”, “DOG”, “WEB”}. Now for word CAT, there are three buckets possible. Now take word BAT, if we replace B with _ it becomes _AT, for which we already have a bucket, so we can put CAT and BAT in same bucket. Similarly when we replace second character of COT, it becomes C_T, for which we already have a bucket, hence CAT and COT are placed in once bucket. With above given words buckets will be like below. Now connect all words in same buckets with each other. Final graph will be like below: ```#include<stdio.h>
#include<stdlib.h>
#include<string.h>

#define NUM_WORDS 10
#define NUM_CHAR 4
#define true 1
#define false 0

typedef struct node_s{
char value[NUM_CHAR];
int wt;
struct node_s *next;
}Node_S;

Node_S *graph_s[NUM_WORDS + 1];

Node_S * create_node_s(char *temp){

Node_S * new_node = (Node_S *)malloc(sizeof(Node_S));
if(new_node){
strcpy(new_node->value, temp);
new_node->next = NULL;
}
else{
printf("\n Node cannot be allocated");
}
return new_node;
}

Node_S * temp = graph_s[i];
if(temp == NULL){
graph_s[i] = create_node_s(str);
}
else{
while(temp->next){
temp = temp->next;
}
temp->next = create_node_s(str);
}
}
/* Check if there is already exisitng bucket */
int search_buckets(char buckets[NUM_WORDS * NUM_CHAR][NUM_CHAR], int count, char *temp){
int i;
for(i=0; i<count; i++){
if(!strcmp(buckets[i], temp))
return i;
}
return -1;
}

/* This adds string to a bucket */
void  add_string(Node_S *pre_graph[], int index, char * str){

Node_S * temp = pre_graph[index];

if(!temp){
pre_graph[index] = create_node_s(str);
}
else{
while(temp->next){
temp  = temp->next;
}
temp->next = create_node_s(str);
}
}

/* This function finds the index of the word in dictionary
Inefficient, can be done using binary search as words are in sorted
order in dictionary */

int find_index(char dict[NUM_WORDS][NUM_CHAR], char * temp){
int i;
for(i=0; i<NUM_WORDS; i++){
if(!strcmp(dict[i], temp))
return i;
}
return -1;
}

/* This function links all words in buckets to create edges of graph */

void create_graph(Node_S *pre_graph[NUM_WORDS * NUM_CHAR],
char dict[NUM_WORDS][NUM_CHAR], int count){
int i;
int index, index_1, index_2;
for(i=0; i<count; i++){
Node_S * current =  pre_graph[i];
while(current){
index  = find_index(dict, current->value);
if(!graph_s[index]){
}

next = current->next;
while(next){
index_1  = find_index(dict, next->value);
if(!graph_s[index_1])
next = next->next;
}
current = current->next;
}
}
}

/*This function creates buckets and using those buckets also
creates graph where all neighbour nodes differ by at most one
character */
void create_buckets(char buckets[NUM_WORDS * NUM_CHAR][NUM_CHAR],
char dict[NUM_WORDS][NUM_CHAR]){

int i,j;
int count = 0;
Node_S * pre_graph[NUM_WORDS * NUM_CHAR];

int key[NUM_WORDS * NUM_CHAR];

for(i=0; i<NUM_WORDS * NUM_CHAR; i++){
pre_graph[i] = NULL;
}
for(i=0; i<NUM_WORDS; i++){
for(j = 0; j<NUM_CHAR-1; j++ ){
char temp[NUM_CHAR];
strcpy(temp, dict[i]);
temp[j] = '_';

int index = search_buckets(buckets, count, temp);
if(index != -1){
}
else{
strcpy(buckets[count], temp);
count++;
}
}
}
create_graph(pre_graph, dict, count);
for(i=0; i<NUM_WORDS; i++){
printf("Nodes adjacent to %d : ", i);

Node_S * current  = graph_s[i];
while(current){
printf("%s ", current->value);
current = current->next;
}
printf("\n");
}
}

//Driver program
int main(){
char buckets[NUM_WORDS * NUM_CHAR][NUM_CHAR];
char dict[NUM_WORDS][NUM_CHAR] = {"CAT", "BAT", "COT", "COG", "COW",
"RAT", "BUT","CUT", "DOG", "WEB"};

create_buckets(buckets, dict);

bfs(graph_s);
return 0;
}

```

This above code needs optimization, please provide suggestions to improve. Breadth First Search implementation is as follows

```typedef struct queue{
Node_S *front;
Node_S *rear;
}Queue;

int is_empty(Queue *q){
if(q->front == NULL && q->rear == NULL) return true;
return false;
}

void enqueue(Queue *q, char value[]){
Node_S * temp = (Node_S *)malloc(sizeof(Node_S));
strcpy(temp->value, value);
if(is_empty(q)){
q->front = temp;
q->rear = temp;
}
else{
if(q->rear){
q->rear->next = temp;
q->rear = temp;
}
else{
printf("\n Error ");
}
}
}

void dequeue(Queue *q, char str[]){
if(!is_empty(q)){
strcpy(str, q->front->value);
Node_S * curr = q->front;
if(q->front == q->rear){
q->rear = NULL;
q->front = NULL;
}
else{
q->front = curr->next;
}
free(curr);
}
}
void init_queue(Queue **q){
*q = (Queue *)malloc(sizeof(Queue));
(*q)->front = NULL;
(*q)->rear = NULL;
}

void bfs_1(Node_S * current, char *second, char dict[NUM_WORDS][NUM_CHAR]){

Queue *q = NULL;

init_queue(&q);
char str[NUM_CHAR];
int index ;
if(!current) return;

enqueue(q, current->value);
while(!is_empty(q)){
dequeue(q, str);
printf("\n%s ", str);
index = find_index(dict, str);
visited[index] = true;
if(!strcmp(str, second)){
return;
}
Node_S *temp = graph_s[index]->next;
while(temp){

index = find_index(dict, temp->value);
if(visited[index] == false){
visited[index] = true;
enqueue(q, temp->value);
}
temp = temp->next;
}
}

}
```

# Minimum spanning tree Prim’s algorithm

In the last post, we discussed how to find minimum spanning tree in a graph using Kruskal algorithm. Let’s take a look at another algorithm to find the minimum spanning tree in a graph, this algorithm is called Prim’s algorithm. Problem statement remains the same as the Kruskal algorithm, given a graph with weighted edges, find MST in it.

## Prim’s algorithm for minimum spanning tree

Basic idea pf Prim’s algorithm is to maintain two sets of nodes of the graph, one contains all the nodes which are already included into the minimum spanning tree and the other contains all nodes which are not included in MST yet. Now find an edge with least weight which connects any node in the first set to any node in the second set. Such an edge is commonly called as cut.

Let two sets be A which includes all vertices of the graph which are included in MST and V is set which contains all vertices which are not included in MST yet. Let S be set of all vertices of a graph.

To start with A will be empty and V will be equal to S. Start with any one vertex and add it to A. Also, the cost of reaching any node initially is `INFINITE`. Once we add the vertex to A, update the cost of all neighbor vertex based on cost to reach vertex added in A and cost of edge between this vertex and neighboring vertices. Remove the vertex from V.

Now, select an edge which is minimum and connects one vertex in A with a vertex in V, it does not form a cycle. (If both the ends of an edge are in same component of the graph there are bound form a cycle.)

Let’s see prim’s algorithm and then we will see how can we implement it.

1. Initialize cost of each vertex with `INFINITE`. Except from the first vertex v where we will start.
2. For each neighbor u, `cost(u) < cost(v) + w(u,v), update the cost(u).`
3. ``` Select a minimum cost neighbor vertex among these vertex and see if edge forms a cycle. If it forms a cycle take another vertex till we considered every neighboring vertices. If it does not form cycle, add it to MST and go to step 2. Repeat all above steps till all the vertices are not included in tree. ```
``` See the working of algorithm.    Minimum spanning tree algorithm's implementation. First thing we need to do is to keep track of two sets. We can do so by keeping track of each node in MST. Create an array and every time we add the node in MST, mark it as added. While considering any edge in future, just check if MST[i] is true, if it is then node is already added in MST and should not be added again.Second thing is to keep track of the cost of each node. Initially all nodes are at INFINITE cost. As we traverse the graph, update each node's cost from the current node. Keep an array of integers initialized with INFINITE and every time we add a node in MST, update distance of neighboring nodes based on cost of edge from current node to them. Last step is to find the minimum cost edge between two sets. We can scan through the distance array which we discussed in above paragraph and get a node which has least cost and not already in MST (using the array values in discussed in paragraph 1) Finding minimum can be implemented using min heap or priority queues, that would reduce the complexity of overall algorithm. #include<stdio.h> #include<stdlib.h> #define NUM_NODE 6 #define MAX_VALUE 999 #define true 1 #define false 0 //Definition of node in adjacency list formation typedef struct node{ int value; int wt; struct node *next; }Node; //Edge definition for sorting and storing minimum spanning tree typedef struct edge{ int start; int end; int wt; }Edge; Node *graph[NUM_NODE + 1]; //This function creates a node in adjancency list representation with weight Node * createNode(int j, int wt){ Node *newNode = (Node *)malloc(sizeof(Node)); if(newNode){ newNode->value = j; newNode->next = NULL; newNode->wt = wt; } else{ printf("\n Node cannot be allocated"); } return newNode; } //Adds edge in graph void addEdge(int i, int j, int wt){ Node *currentNode = graph[i]; if(currentNode == NULL){ graph[i] = createNode(j, wt); } else{ while(currentNode->next){ currentNode = currentNode->next; } currentNode->next = createNode(j, wt); } } void updateWeights(int currentVertex, int weights[], int predecessors[], int visited[]){ Node *currentNode = graph[currentVertex]; if(currentNode){ //update distance for all neighbor nodes Node * itr = currentNode->next; while(itr){ if(weights[itr->value] > itr->wt && visited[itr->value] == false ){ weights[itr->value] = itr->wt; predecessors[itr->value] = currentVertex; } itr = itr->next; } } } int getMinimumVertex( int weights[], int visited[], int numNodes ){ int currentMinWeight = MAX_VALUE; int currentMinVertex = 1; for( int i = 2; i<=numNodes; i++ ){ if( !visited[i] && weights[i] < currentMinWeight ){ currentMinWeight = weights[i]; currentMinVertex = i; } } return currentMinVertex; } //Real implementation of prims algorithm int primsMSTAlgorithm(Node * graph[], int numNodes){ //Step 1 : Initialize all arrays int weights[numNodes+1]; int visited[numNodes+1]; int predecessors[numNodes+1]; for( int i=1; i<=numNodes; i++ ){ weights[i] = MAX_VALUE; } for( int i=1; i<=numNodes; i++ ){ visited[i] = 0; } for( int i=1; i<=numNodes; i++ ){ predecessors[i] = MAX_VALUE; } weights = 0; int totalCost = 0; //Step 2 : for( int i=0; i <=numNodes; i++ ){ //Greedy strategy int closestVertex = getMinimumVertex( weights, visited, numNodes ); totalCost+= weights[closestVertex]; visited[closestVertex] = true; updateWeights( closestVertex, weights, predecessors, visited ); } return totalCost; } //Driver program int main(){ int i; for(i=1; i<=NUM_NODE; i++){ graph[i] = createNode(i,0); } addEdge(1,2,4); addEdge(2,1,4); addEdge(2,3,3); addEdge(3,2,3); addEdge(4,3,2); addEdge(3,4,2); addEdge(4,1,3); addEdge(1,4,3); addEdge(4,5,3); addEdge(5,4,3); addEdge(6,5,1); addEdge(5,6,1); addEdge(1,6,5); addEdge(6,1,5); addEdge(2,5,4); addEdge(5,2,4); int totalCost = primsMSTAlgorithm(graph, NUM_NODE); printf("Cost of minimum spanning tree : %d", totalCost); return 0; } The complexity of Prim's algorithm to find minimum spanning tree in graph is O(V2). Please share if there is anything is wrong or missing. If you are preparing for an interview and need 1 to 1 coaching for preparation, please reach out to us at [email protected] or book a free session with us. ```
```          Author jitsceaitPosted on June 15, 2014May 17, 2020Categories GraphTags graphs, minimum spanning tree, msp in graph, prims algorithm to find minimum spanning treeLeave a comment on Minimum spanning tree Prim’s algorithm ```
``` Minimum Spanning Tree – Kruskal Algorithm Minimum Spanning Tree – Kruskal Algorithm Given a weighted connected undirected graph, find a minimum spanning tree in the graph. Let’s first understand what is a spanning tree? A spanning tree T of an undirected graph G is a subgraph that is a tree which includes all of the vertices of G, with the minimum possible number of edges. Wikipedia When the graph is weighted i.e each edge of the graph has some weight to move from one node to another, a spanning tree with minimum cost is called the minimum spanning tree. If all the edges contain distinct weights, there will be a unique minimum spanning tree for the graph, however, if two edges can have same weight then there can be more than one minimum spanning tree for the given graph. For the graph shown below, second figure shows spanning trees of it.    Applications of minimum spanning trees 1. Broadcast tree preparation for broadcasting on the internet. It is implemented using spanning tree protocol. 2.  Preparing telecommunication networks, water supply networks etc. There are two widely used algorithms to find a minimum spanning tree for a graph, both based on the greedy algorithm. The basic idea of the greedy algorithm is that at every step we chose the option which gives us maximum profit at that step without worrying about future steps. These algorithms are: 1. Kruskal’s algorithm: Basic idea of the kruskal algorithm to find the minimum spanning tree in the graphs is that we take each edge one by one in increasing order of their weights. For each edge check if it makes a cycle in the existing tree? If it does not create a cycle, add it to the minimum spanning tree formed till now. If it forms a cycle, discard the edge and move to the next edge. Repeat above steps until all nodes are added in the spanning tree. Resulting tree will be the minimum spanning tree. Kruskal algorithm to find minimum spanning tree Sort all edges based on weights Start with minimum cost edge. Add it to T. For each edge in graph, repeat following steps. If current edge forms a cycle, discard the edge. If current edge does not form a cycle, add it to T. Kruskal algorithm: implementation #include<stdio.h> #include<stdlib.h> #define NUM_NODE 5 #define true 1 #define false 0 //Definition of a node in adjacency list formation typedef struct node{ int value; int wt; struct node *next; }Node; //Edge definition for sorting and storing minimum spanning tree typedef struct edge{ int start; int end; int wt; }Edge; Node *graph[NUM_NODE + 1]; //This function creates a node in adjancency list representation with weight Node * createNode(int j, int wt){ Node * new_node = (Node *)malloc(sizeof(Node)); if(new_node){ new_node->value = j; new_node->next = NULL; new_node->wt = wt; } else{ printf("\n Node cannot be allocated"); } return new_node; } //Adds edge in graph void addEdge(int i, int j, int wt){ Node * currentNode = graph[i]; if(!currentNode){ graph[i] = createNode(j, wt); } else{ while(currentNode->next){ currentNode = currentNode->next; } currentNode->next = createNode(j, wt); } } //Below functions are used for Union and Find method on the disjoint set int find(int rep[], int value){ if(rep[value] == -1 || rep[value] == value) return value; else find(rep, rep[value]); } void unionSet(int rep[], int x, int y){ int xroot = find(rep, x); int yroot = find(rep, y); rep[xroot] = yroot; } void makeSet(int rep[]){ int i; for(i=0; i<= NUM_NODE; i++){ rep[i] = -1; } } //This copies all edges to array so as to sort them on weight void copyEdges(Edge edges[], Node *graph[]){ int i, count =0; for(i=1; i<= NUM_NODE; i++){ Node * u = graph[i]; if(u){ Node * v = u->next; while(v){ edges[count].start = u->value; edges[count].end =v->value; edges[count].wt = v->wt; count++; v = v->next; } } } } //Implementation of quick sort void swap(Edge edges[], int i, int j){ int temp_start = edges[i].start; int temp_end = edges[i].end; int wt = edges[i].wt; edges[i].start = edges[j].start ; edges[i].end = edges[j].end ; edges[i].wt = edges[j].wt; edges[j].start = temp_start; edges[j].end = temp_end ; edges[j].wt = wt; } int partition(Edge edges[], int start, int nEdges){ int i,j; int pivot = edges[start].wt; i = start+1; j = nEdges; while(i<=j){ while(i<=nEdges && edges[i].wt < pivot) i++; while(j>start && edges[j].wt >= pivot) j--; if(i<j){ swap(edges, i, j); } } if(j>start) swap(edges,start, j); return j; } void quicksort(Edge edges[], int start, int nEdges){ if(start >=nEdges) return; int pivot = partition(edges, start, nEdges); quicksort(edges, start, pivot); quicksort(edges, pivot+1, nEdges); } //Real implementation of krutskal algorithm void kruskal(Node * graph[], int num_edges){ int i, count =0, wt=0; Edge edges[num_edges]; Edge mst[num_edges]; copyEdges(edges, graph); quicksort(edges, 0, num_edges-1); int rep[NUM_NODE + 1]; makeSet(rep); for(i=0; i<num_edges; i++){ //Check if edge creates cycle in the tree, If not add that to tree if(find(rep, edges[i].start) != find(rep, edges[i].end)){ unionSet(rep, edges[i].start, edges[i].end); mst[count++] = edges[i]; wt += edges[i].wt; } } // Print the minimum spanning tree for(i=0; i<count; i++){ printf("(%d, %d)\n", mst[i].start, mst[i].end); } } //Driver program int main(){ int i; for(i=1; i<=NUM_NODE; i++){ graph[i] = createNode(i,0); } addEdge(1,3,10); addEdge(1,4,7); addEdge(3,4,9); addEdge(4,2,32); addEdge(4,5,23); kruskal(graph, 5); return 0; } The complexity of the above algorithm is dominated by the sorting which takes O(E log E) where E is the number of edges. To detect whether edge forms a cycle or not, we can use the Union and Find method. 2.Prim’s algorithm: It is also based on the greedy algorithm, discussed in the next post Please share if there is anything is wrong or missing. If you are preparing for an interview and need 1 to 1 coaching for preparation, please reach out to us at [email protected] or book a free session with us.          Author jitsceaitPosted on June 14, 2014February 21, 2019Categories GraphTags graphs, kruskal algorithm, krustal minimum spanning tree algorithm, minimum spanning tree, msp in graph1 Comment on Minimum Spanning Tree – Kruskal Algorithm Connected components of graph Given an undirected graph, find the number of connected components in it. A connected component of an undirected graph is a maximal set of nodes such that each pair of nodes is connected by a path Every vertex of the graph lines in a connected component that consists of all the vertices that can be reached from that vertex, together with all the edges that join those vertices. If an undirected graph is connected, there is only one connected component. Every graph has at least one connected component that is the graph itself. For example in below graph, there are two connected components {1,2,3,4} and {5, 6}.  How to find the number of connected components in a graph? We can find the number of Connected components in a graph by traversing the graph, either depth first traversal or breadth first traversal. If we start from a node v and do a traversal of the graph, we will be able to reach all the nodes which are connected to this node through any path. These are the vertices that are part of this connected component. Now, if there are some vertices still unvisited, then there must be another component of the graph. We start traversal again with the first unvisited vertex. We continue this until all the vertices are visited. In the following implementation, we are using the Depth First traversal of the graph to find the number of components. Implementation with adjacency list representation of graph. Show me the implementation package com.company.Graphs; import java.util.*; /** * Created by sangar on 21.12.18. */ public class AdjacencyList { private Map<Integer, ArrayList<Integer>> G; boolean isDirected; public AdjacencyList(boolean isDirected){ this.G = new HashMap<>(); this.isDirected = isDirected; } public void addEdge(int start, int dest){ if(this.G.containsKey(start)){ this.G.get(start).add(dest); }else{ this.G.put(start, new ArrayList<>(Arrays.asList(dest))); } //In case graph is undirected if(!this.isDirected) { if (this.G.containsKey(dest)) { this.G.get(dest).add(start); } else { this.G.put(dest, new ArrayList<>(Arrays.asList(start))); } } } private void DFS(int node, boolean[] visited, ArrayList<Integer> traversal){ if(!visited[node]){ traversal.add(node); visited[node] = true; this.G.get(node).forEach(v -> DFS(v, visited, traversal)); } } public int connectedComponents(){ boolean[] visited = new boolean[this.G.size()]; int components = 0; for(int i=1; i<this.G.size(); i++) { ArrayList<Integer> traversal = new ArrayList<>(); if(!visited[i]) { DFS(i, visited, traversal); System.out.println(traversal); components++; } } return components; } } Implementation using adjacency matrix representation of a graph package com.company.Graphs; import java.util.ArrayList; import java.util.LinkedList; import java.util.Queue; import java.util.Stack; /** * Created by sangar on 21.12.18. */ public class AdjacencyMatrix { private boolean [][] G; private boolean isDirected; public AdjacencyMatrix(int numNodes, boolean isDirected){ this.G = new boolean[numNodes+1][numNodes+1]; this.isDirected = isDirected; } public void addEdge(int start, int dest){ this.G[start][dest] = true; if(!isDirected){ this.G[dest][start] = true; } } private void DFS(int node, boolean[] visited, ArrayList<Integer> traversal){ if(!visited[node]){ traversal.add(node); visited[node] = true; for(int i=1; i< this.G.length; i++){ if(this.G[node][i]){ DFS(i, visited, traversal); } } } } public int connectedComponents(){ boolean[] visited = new boolean[this.G.length]; int components = 0; for(int i=1; i<this.G.length; i++) { ArrayList<Integer> traversal = new ArrayList<>(); if(!visited[i]) { DFS(i, visited, traversal); System.out.println(traversal); components++; } } return components; } } If an adjacency list representation is used, the for loop will run for i from 0 to V each time it is executed, so the runtime for the adjacency matrix representation is O(V2). For an adjacency list representation of a graph, run time for the algorithm is O(V + E). Please share if there is something wrong or missing. If you are preparing for an interview and want to get a free coaching session, please reach out to us at [email protected]          Author jitsceaitPosted on June 13, 2014June 7, 2020Categories Amazon Interview questions, Data Structures, GraphTags components of graph, connected components, depth first traversal, graphs4 Comments on Connected components of graph Detect cycle in undirected graph Given an undirected graph, detect if there is a cycle in the undirected graph. For example, the below graph has cycles as 2->3->4->2 and 5->4->6->5 and a few more.  A simple definition of a cycle in an undirected graph would be: If while traversing the graph, we reach a node which we have already traversed to reach the current node, then there is a cycle in the graph. How can we detect the above condition? It’s an application of Depth First Search. Do a depth-first traversal of a graph. In depth-first traversal, we visit the current node and go deep to one of its unvisited neighbors. If a neighbor of the current node is already visited, that means there is a possibility of a cycle. However, make sure that the already visited neighbor is not the parent of the current node, otherwise, there will be always a cycle of two nodes in an undirected graph. While depth-first traversing a graph, keep track of the nodes which are on the stack at present. If there is an edge from the current node to any one of the nodes which are on recursion stack, we say there is a cycle in undirected graph. To avoid additional space, we can pass the parent pointer to the recursive call and check if the visited node is the parent. Detect cycle in undirected graph: implementation package com.company.Graphs; import java.util.*; /** * Created by sangar on 21.12.18. */ public class AdjacencyList { private Map<Integer, ArrayList<Integer>> G; private boolean isDirected; private int count; public AdjacencyList(boolean isDirected){ this.G = new HashMap<>(); this.isDirected = isDirected; } public void addEdge(int start, int dest){ if(this.G.containsKey(start)){ this.G.get(start).add(dest); }else{ this.G.put(start, new ArrayList<>(Arrays.asList(dest))); } if(!this.G.containsKey(dest)) { this.G.put(dest, new ArrayList<>()); } //In case graph is undirected if(!this.isDirected) { this.G.get(dest).add(start); } } public boolean isEdge(int start, int dest){ if(this.G.containsKey(start)){ return this.G.get(start).contains(dest); } return false; } private boolean isCycleUtil(int u, boolean[] visited, int parent){ for(int v: this.G.get(u)) { if(!visited[v]){ visited[v] = true; return isCycleUtil(v, visited, u); } else if(v != parent){ return true; } } return false; } public boolean isCycle(){ boolean[] visited = new boolean[this.G.size() + 1]; for(int i = 1; i < this.G.size() + 1; i++) { if (!visited[i]) { return isCycleUtil(i, visited, -1); } } return false; } } The complexity of the DFS approach to find cycle in an undirected graph is O(V+E) where V is the number of vertices and E is the number of edges. Please share if there is something wrong or missing. If you are preparing for an interview, please singup for free interview preparation material. Related articles Connected components of a graph Depth first traversal of graph Graph representations Breadth First traversal Topological sorting          Author jitsceaitPosted on June 12, 2014May 20, 2020Categories Amazon Interview questions, Data Structures, GraphTags cycle detection in undirected graph, cycle in undirected graph, depth first, depth first search, depth first traversalLeave a comment on Detect cycle in undirected graph Depth first traversal of graph In the last post, we discussed how to represent a graph in a programmatic way using adjacency matrix and adjacency list representation. Let’s discuss now, how to traverse a graph. There are two ways to traverse a graph: Depth First traversal, commonly called DFS and Breadth First traversal, commonly called as BFS. What is Depth First Traversal (DFS) ? Depth First search or traversal is a technique in which we go as deep in the graph as possible and explore nodes, that is we move down the graph from one node to another till we can. Once we reach a node where we cannot go further down, we backtrack and move to node one before and so on. A depth-first traversal of a graph is a very good example to understand how backtracking works. Let’s see how do we traverse a graph depth first. Start with node u. If u is already visited, return. Mark visited[u] = true. For all neighbors v of u. v is now new u, so u=v and go back to step 2 When all neighbors are processed, backtrack to parent of current u. When we reach at start node S again and there is no edge to be explored at S, that would be the end of traversal. Let’s take an example and see how it works. Given the graph below, let’s trace how depth first search will go.  We start with node(1), and mark it as visited.  Now, we pick neighbor of node(1), and go down to node(2) and mark it as visited too.  Next, we move to the neighbor of node(2), which is node(3).  After marking node(3) visited, we move to node(4).  Next, we move to node(6). At this point, we have no neighbors to visit from node(6). So, we start backtracking.  At node(2), we have another neighbor called node(4), however, node(4) is already visited when we went deep from node(3). So nothing happens, we move further back. At node(1), we have neighbor node(5). Since node(5) is not visited, we mark it as visited.  At this point, there is no node with a neighbor not traversed, so end of depth-first traversal. Implementation package com.company.Graphs; import java.util.ArrayList; /** * Created by sangar on 21.12.18. */ public class AdjacencyMatrix { private boolean [][] G; private boolean isDirected; public AdjacencyMatrix(int numNodes, boolean isDirected){ this.G = new boolean[numNodes+1][numNodes+1]; this.isDirected = isDirected; } public void addEdge(int start, int dest){ this.G[start][dest] = true; if(!isDirected){ this.G[dest][start] = true; } } public boolean isEdge(int start, int dest){ return this.G[start][dest]; } private void DFS(int node, boolean[] visited, ArrayList<Integer> traversal){ if(!visited[node]){ traversal.add(node); visited[node] = true; //Complexity of this loop is O(V * E) for(int i=1; i< this.G.length; i++){ if(this.G[node][i]){ //This will be called E times (number of edges) DFS(i, visited, traversal); } } } } public ArrayList<Integer> depthFirstTraversal(){ boolean[] visited = new boolean[this.G.length]; ArrayList<Integer> traversal = new ArrayList<>(); DFS(1, visited, traversal); System.out.println(traversal); return traversal; } } The complexity of the above implementation is O(V * E). Depth-first traversal when the graph is represented using adjacency list. Depth-first traversal adjacency list representation package com.company.Graphs; import java.util.ArrayList; import java.util.Arrays; import java.util.HashMap; import java.util.Map; /** * Created by sangar on 21.12.18. */ public class AdjacencyList { private Map<Integer, ArrayList<Integer>> G; boolean isDirected; public AdjacencyList(boolean isDirected){ this.G = new HashMap<>(); this.isDirected = isDirected; } public void addEdge(int start, int dest){ if(this.G.containsKey(start)){ this.G.get(start).add(dest); }else{ this.G.put(start, new ArrayList<>(Arrays.asList(dest))); } //In case graph is undirected if(!this.isDirected) { if (this.G.containsKey(dest)) { this.G.get(dest).add(start); } else { this.G.put(dest, new ArrayList<>(Arrays.asList(start))); } } } public boolean isEdge(int start, int dest){ if(this.G.containsKey(start)){ return this.G.get(start).contains(dest); } return false; } private void DFS(int node, boolean[] visited, ArrayList<Integer> traversal){ if(!visited[node]){ traversal.add(node); visited[node] = true; this.G.get(node).forEach(v -> DFS(v, visited, traversal)); } } public ArrayList<Integer> depthFirstTraversal(){ boolean[] visited = new boolean[this.G.size()+1]; ArrayList<Integer> traversal = new ArrayList<>(); DFS(1, visited, traversal); System.out.println(traversal); return traversal; } } The complexity of the above implementation is O(V + E). Both the above implementations are recursive implementation. We can use a stack to replace recursion. The algorithm remains the same, for each node visited, put all the adjacent nodes into the stack. Take the next node from the stack, mark it as visited, and put all the adjacent nodes to stack. Repeat the process till the stack is not empty. public ArrayList<Integer> depthFirstTraversalNonRecursive(){ boolean[] visited = new boolean[this.G.length]; ArrayList<Integer> traversal = new ArrayList<>(); Stack s = new Stack(); s.push(1); visited = true; while(!s.isEmpty()){ int currentNode = (int)s.pop(); traversal.add(currentNode); for(int i=1; i< this.G.length; i++){ if(this.G[currentNode][i] && !visited[i]){ s.push(i); visited[i] = true; } } } System.out.println(traversal); return traversal; } This implementation of dept first traversal or DFS works only if the graph has only one connected component. If there are multiple connected components in the graph, this will not work. It can be easily fixed. All we need to do is that we try to start DFS from each node of the graph. Check if a node is not already visited for earlier DFS run, then start another DFS from that node. Below is the changed code. private void DFS(int node, boolean[] visited, ArrayList<Integer> traversal){ if(!visited[node]){ traversal.add(node); visited[node] = true; for(int i=1; i< this.G.length; i++){ if(this.G[node][i]){ DFS(i, visited, traversal); } } } } public ArrayList<Integer> depthFirstTraversal(){ boolean[] visited = new boolean[this.G.length]; ArrayList<Integer> traversal = new ArrayList<>(); for(int i=1; i<this.G.length; i++) { if(!visited[i]) DFS(i, visited, traversal); } System.out.println(traversal); return traversal; } Applications of DFS Minimum spanning tree To check if graph has a cycle. Topological sorting To find strongly connected components of graph To find bridges in graph Please share if there something wrong or missing. Please reach out to [email protected] if you need personalized coaching for an interview.          Author jitsceaitPosted on June 8, 2014June 7, 2020Categories Algorithms, Data Structures, GraphTags depth first search, depth first traversal, depth first traversal of graph, graphs8 Comments on Depth first traversal of graph Graph representations Before understanding graph representation in computer languages, let’s understand what is a graph? A graph is collection of edges E : {e1,e2,33…..en} and vertices V : {v1,v2,v3…vn}. Mathematically, graph is represented as G = (V,E). An edge is pair of vertices (vi, vj) where i,j<=n and vertices are the nodes at the two ends of edges. Below is an example of a graph, where vertices are (1,2,3,4,5,6) and edges are [ (1,2), (1,4), (2,3), (3,5), (5,4), (4,6) ]  There are two types of graphs in general: directed and undirected. The graph you saw in the above example is a directed graph. A directed graph is where an edge is one way from one vertex to another, whereas the undirected graph has two-way edges, that is, there is no arrowhead at the end of the edge. In representations, if there is an edge from vertex x to vertex y, in an undirected graph, there will be an edge from vertex y to vertex x. Graphs representations There are two ways to represent graphs in programming constructs: adjacency matrix representation and adjacency list representation. Based on the need of algorithm and problem at hand, we decide which way to represent a graph. There are three criteria, which are used to evaluate any representation of a graph. How much space a particular representation will take? How much time it will take to find if a given edge exists in the graph? How long will it take to find all the neighbors of a particular node? Adjacency matrix representation of a graph In this representation, we use two dimensional array or matrix G. G(i,j) is set to true if there is an edge between vertex i and vertex j. The advantage of the adjacency matrix representation of a graph is that it is very simple to implement. However, when the number of edges between vertices is sparse, a lot of cells in array or matrix remain empty. The complexity of graph algorithms is measured in terms of E and V where E is the number of edges and V is the number of vertices. In adjacency matrix representation, memory used to represent graph is O(v2). If the graph is undirected then when there is an edge between (u,v), there is also an edge between (v,u). So transpose of the adjacency matrix is the same as the original. So we can save half the space when representing an undirected graph using adjacency matrix. Moreover, if there are no weighted edges in the graph, instead of using one byte, we can use one bit to indicate an edge. To determine if a given edge (u,v) is present, we have to look at G[u][v], if it is true, then there is an edge else not. Time complexity is O(1). To find all the neighbors of a node, we have to scan the entire row, which leads to the complexity of O(n). Adjacency matrix representation of the graph shown above will be: adjacency matrix representation of graph Adjacency matrix package com.company.Graphs; /** * Created by sangar on 21.12.18. */ public class AdjacencyMatrix { private boolean [][] G; private boolean isDirected; public AdjacencyMatrix(int numNodes, boolean isDirected){ this.G = new boolean[numNodes+1][numNodes+1]; this.isDirected = isDirected; } public void addEdge(int start, int dest){ this.G[start][dest] = true; if(!isDirected){ this.G[dest][start] = true; } } public boolean isEdge(int start, int dest){ return this.G[start][dest]; } } Tests package test.Graphs; import com.company.Graphs.AdjacencyMatrix; import org.junit.jupiter.api.Test; import static org.junit.jupiter.api.Assertions.assertEquals; /** * Created by sangar on 21.12.18. */ public class AdjacencyMatrixTest { @Test public void createGraphTest() { AdjacencyMatrix tester = new AdjacencyMatrix(5,false); tester.addEdge(1,5); tester.addEdge(3,4); tester.addEdge(2,4); tester.addEdge(1,3); tester.addEdge(3,5); tester.addEdge(5,2); assertEquals(true, tester.isEdge(3,4)); assertEquals(false, tester.isEdge(1,4)); } } Adjacency list In adjacency list representation, we have a table of all vertices of the graph. Each entry in that table will have the list of neighbors which this vertex is connected to. Space required for adjacency list representation of the graph is O(V +E). To find if there is an edge (u,v), we have to scan through the whole list at node(u) and see if there is a node(v) in it. In the worst case, it will take O(E) time, where E is the maximum number of edges in the graph. To find all the neighbors of a node, it is just returning all the nodes in the list, which is again of O(E) time complexity. Adjacency list representation can be easily extended to represent graphs with weighted edges. Each node contains another parameter weight. For the edge, (u,v)  node in the adjacency list of u will have the weight of the edge. Above graph can be represented in adjacency list as adjacency list representation of graph Adjacency list implementation package com.company.Graphs; import java.util.ArrayList; import java.util.HashMap; import java.util.Map; /** * Created by sangar on 21.12.18. */ public class AdjacencyList { private Map<Integer, ArrayList<Integer>> G; boolean isDirected; public AdjacencyList(boolean isDirected){ this.G = new HashMap<>(); this.isDirected = isDirected; } public void addEdge(int start, int dest){ if(this.G.containsKey(start)){ this.G.get(start).add(dest); }else{ this.G.put(start, new ArrayList<>(Arrays.asList(dest))); } //In case graph is undirected if(!this.isDirected) { if (this.G.containsKey(dest)) { this.G.get(dest).add(start); } else { this.G.put(dest, new ArrayList<>(Arrays.asList(start))); } } } public boolean isEdge(int start, int dest){ if(this.G.containsKey(start)){ return this.G.get(start).contains(dest); } return false; } } Tests package test.Graphs; import com.company.Graphs.AdjacencyList; import org.junit.jupiter.api.Test; import static org.junit.jupiter.api.Assertions.assertEquals; /** * Created by sangar on 21.12.18. */ public class AdjacencyListTest { @Test public void createGraphTest() { AdjacencyList tester = new AdjacencyList(false); tester.addEdge(1,5); tester.addEdge(3,4); tester.addEdge(2,4); tester.addEdge(1,3); tester.addEdge(3,5); tester.addEdge(5,2); assertEquals(true, tester.isEdge(3,4)); assertEquals(false, tester.isEdge(1,4)); } } Quick follow up question: Given a densely connected graph, there will be millions of queries to find if there is an edge between two vertices, which representation is best suited for this use case? The first hint is that it is densely connected graph, so there would not be much difference in space used in case of both representation. Finding if there is an edge between two nodes is O(1) operation in adjacency matrix representation whereas it O(E) operation in adjacency list representation. So, in this use case, we would choose the adjacency matrix representation. Please share if there is something wrong or missing. If you are preparing for an interview, please signup for free interview kit.          Author jitsceaitPosted on May 31, 2014May 17, 2020Categories Data Structures, GraphTags adjacency list, adjacency list representation of graph, adjacency matrix, adjacency matrix representation of graph, graph representation3 Comments on Graph representations Posts navigation Previous page Page 1 Page 2 ```