Minimum Spanning Tree – Kruskal Algorithm

Minimum Spanning Tree – Kruskal Algorithm

Given a weighted connected undirected graph, find a minimum spanning tree in the graph. Let’s first understand what is a spanning tree?

A spanning tree T of an undirected graph G is a subgraph that is a tree which includes all of the vertices of G, with the minimum possible number of edges.

Wikipedia
When the graph is weighted i.e each edge of the graph has some weight to move from one node to another, a spanning tree with minimum cost is called the minimum spanning tree. If all the edges contain distinct weights, there will be a unique minimum spanning tree for the graph, however, if two edges can have same weight then there can be more than one minimum spanning tree for the given graph.

For the graph shown below, second figure shows spanning trees of it.

Applications of minimum spanning trees

1. Broadcast tree preparation for broadcasting on the internet. It is implemented using spanning tree protocol.
2.  Preparing telecommunication networks, water supply networks etc.

There are two widely used algorithms to find a minimum spanning tree for a graph, both based on the greedy algorithm. The basic idea of the greedy algorithm is that at every step we chose the option which gives us maximum profit at that step without worrying about future steps. These algorithms are:
1. Kruskal’s algorithm: Basic idea of the kruskal algorithm to find the minimum spanning tree in the graphs is that we take each edge one by one in increasing order of their weights. For each edge check if it makes a cycle in the existing tree? If it does not create a cycle, add it to the minimum spanning tree formed till now. If it forms a cycle, discard the edge and move to the next edge. Repeat above steps until all nodes are added in the spanning tree. Resulting tree will be the minimum spanning tree.

Kruskal algorithm to find minimum spanning tree

  1. Sort all edges based on weights
  2. Start with minimum cost edge. Add it to T.
  3. For each edge in graph, repeat following steps.
  4. If current edge forms a cycle, discard the edge.
  5. If current edge does not form a cycle, add it to T.

Kruskal algorithm: implementation

#include<stdio.h>
#include<stdlib.h>

#define NUM_NODE 5
#define true 1
#define false 0

//Definition of a node in adjacency list formation
typedef struct node{
    int value;
    int wt;
    struct node *next;
}Node;

//Edge definition for sorting and storing minimum spanning tree
typedef struct edge{
    int start;
    int end;
    int wt;
}Edge;

Node *graph[NUM_NODE + 1];

//This function creates a node in adjancency list representation with weight 
Node * createNode(int j, int wt){

    Node * new_node = (Node *)malloc(sizeof(Node));
    if(new_node){
        new_node->value = j;
        new_node->next = NULL;
        new_node->wt = wt;
    }
    else{
        printf("\n Node cannot be allocated");
    }
    return new_node;
}
//Adds edge in graph
void addEdge(int i, int j, int wt){

    Node * currentNode = graph[i];
    if(!currentNode){
        graph[i] = createNode(j, wt);
    }
    else{
        while(currentNode->next){
            currentNode = currentNode->next;
        }
        currentNode->next = createNode(j, wt);
    }
}

//Below functions are used for Union and Find method on the disjoint set
int find(int rep[], int value){
    if(rep[value] == -1 || rep[value] == value)
        return value;
    else find(rep, rep[value]);
}

void unionSet(int rep[], int x, int y){

    int xroot = find(rep, x);
    int yroot = find(rep, y);

    rep[xroot] = yroot;
}

void makeSet(int rep[]){
    int i;
    for(i=0; i<= NUM_NODE; i++){
        rep[i] = -1;
    }
}
//This copies all edges to array so as to sort them on weight
void copyEdges(Edge edges[], Node *graph[]){
    int i, count =0;

    for(i=1; i<= NUM_NODE; i++){
        Node * u = graph[i];
        if(u){
            Node * v = u->next;
            while(v){
                edges[count].start =  u->value;
                edges[count].end =v->value;
                edges[count].wt = v->wt;
                count++;
                v = v->next;
            }
        }
    }
}

//Implementation of quick sort
void swap(Edge edges[], int i, int j){
    int temp_start = edges[i].start;
    int temp_end = edges[i].end;
    int wt = edges[i].wt;

    edges[i].start  = edges[j].start ;
    edges[i].end  = edges[j].end ;
    edges[i].wt  = edges[j].wt;

    edges[j].start  = temp_start;
    edges[j].end  = temp_end ;
    edges[j].wt  = wt;
}
int partition(Edge edges[], int start, int nEdges){
    int i,j;

    int pivot =  edges[start].wt;
    i = start+1;
    j = nEdges;
    while(i<=j){
        while(i<=nEdges && edges[i].wt < pivot)
            i++;
        while(j>start && edges[j].wt >= pivot)
            j--;

        if(i<j){
            swap(edges, i, j);
        }
    }
    if(j>start)
        swap(edges,start, j);
    
    return j;
}

void quicksort(Edge edges[], int start, int nEdges){

	if(start >=nEdges) return;
	int pivot = partition(edges, start, nEdges);

	quicksort(edges, start, pivot);
	quicksort(edges, pivot+1, nEdges);

}

//Real implementation of krutskal algorithm
void kruskal(Node * graph[], int num_edges){

    int i, count =0, wt=0;

    Edge edges[num_edges];

    Edge mst[num_edges];

    copyEdges(edges, graph);

    quicksort(edges, 0, num_edges-1);

    int rep[NUM_NODE + 1];

    makeSet(rep);
    for(i=0; i<num_edges; i++){
        //Check if edge creates cycle in the tree, If not add that to tree
        if(find(rep, edges[i].start) != find(rep, edges[i].end)){
            unionSet(rep, edges[i].start, edges[i].end);
            mst[count++] = edges[i];
            wt += edges[i].wt;
        }
    }
    // Print the minimum spanning tree
    for(i=0; i<count; i++){
        printf("(%d, %d)\n", mst[i].start, mst[i].end);
    }
}
//Driver program
int main(){

    int i;
    for(i=1; i<=NUM_NODE; i++){
        graph[i] = createNode(i,0);
    }
    addEdge(1,3,10);
    addEdge(1,4,7);
    addEdge(3,4,9);
    addEdge(4,2,32);
    addEdge(4,5,23);

    kruskal(graph, 5);

    return 0;
}

The complexity of the above algorithm is dominated by the sorting which takes O(E log E) where E is the number of edges. To detect whether edge forms a cycle or not, we can use the Union and Find method.

2.Prim’s algorithm: It is also based on the greedy algorithm, discussed in the next post

Please share if there is anything is wrong or missing. If you are preparing for an interview and need 1 to 1 coaching for preparation, please reach out to us at communications@algorithmsandme.com or book a free session with us.