Tarjan Algorithm is a very useful algorithm in finding **strongly connected components** in a directed graph. What is a strongly connected component?

## What is Strongly Connected Component?

A directed graph is said to be strongly connected if we can reach any node from every other node in the graph. A strongly connected component of a directed graph is a subset of the nodes in the graph such that any two nodes of this subset are reachable from each other.

For any two nodes `u` and `v` in graph, if they are part of a strongly connected component, there exists a path from `u` to `v` and vice-a-versa.

For example, in the graph given below, there are 4 strongly connected components.

There are two algorithms to strongly connected components one is Kosaraju’s algorithm and another one is the Tarjan algorithm.

Tarjan algorithm requires only one depth-first search traversal to find out all strongly connected components present in the graph. It is efficient as compared to Kosaraju’s algorithm as the other requires two DFS traversal.

An important thing to known to understand Tarjan’s algorithm is that depth-first traversal creates a DFS tree, meaning when we traverse a graph, it does traversal in a manner that a node is not visited again.

## Tarjan’s algorithm to find Strongly Connected Component?

Fundamental is very simple: From a node `u` if we can reach to any of its ancestors for any of its descendants (including itself), then there must be a cycle containing this node, all these nodes should be part of the same connected component.

For example, if `x` is the root of the tree and `y` is the descendant of `x` and there is an edge from `y` to another node `z` that is an ancestor of `x`, then we can say that edge `y – Z` together with the paths in the DFS tree from `z` to `x` and from `x` to `z` form a cycle, which must all be part of the same component.

The edge from

ytozis calledback edge.

How can we know if there is a back-edge from a node to another? We keep track of two things for each node `v` of the graph:

tin[v]: Represents the number of node visited beforevin DFS.low[v]: Represents smallest tin[x] of a node reachable by a back edge from the subtree ofv.

The below figure displays `tin[i]` for each node. In the simplest way, it can be calculated as in-time of the node in depth-first traversal.

The interesting part is how to calculate `low[i]`? To start with, `low[i]` is the node the node `i` itself. Going forward, for each node `u`, go through its neighbors:

1. If the neighbor v not already visited, that means, it is not a back edge. We will go forward and do the DFS on the `v`. Once DFS is finished, we will update the `low[u]` as follows:

low[u] = min(low[u], low[v]).

2. If the node `v` is already visited, that means there is a back edge from node `v` to any of the ancestor of node `u`. In that case, `low[u]` will be defined as follows

low[u] = min(low[u], tin[v])

For node v, if `tin[v] == low[v]`, `v` is then the root of the next connected component. If you want to print nodes in strongly connected components, keep track of them on the stack. As soon as we get a node with low[i] == tin[i], pop all the nodes from the stack till we get the node `i`. All of these nodes belong to same SCC.

If low[v] > tin[u], we can say that node u and v belong to different SCCs and edge **(u,v)** is a bridge edge.

### Tarjan's algorithm to find strongly connected components

package AlgorithmsAndMe; import java.util.*; public class TarjansAlgorithm { Set<Integer> visited = new HashSet<>(); /* This map stores the time when the current node is visited */ int [] tin; /* low will store minimum on tin[v] tin[p] for all p for which (v,p) is a back edge low[to] for all to for which (v,to) is a tree edge */ int [] low; //To maintain monotonic increasing order. int timer; void dfs(Graph g, int u, int parent, Stack<Integer> stack) { visited.add(u); //Put the current timer. tin[u] = timer; //Low is the time of entry to start with low[u] = timer; stack.push(u); timer++; /* Go through all the neighbors */ for (int to : g.getNeighbors(u)) { //If it is parent, nothing to be done if (to == parent) continue; /* If the neighbor was already visited get the minimum of the neighbor entry time or the current low of the node. */ if (visited.contains(to)) { low[u] = Math.min(low[u], tin[to]); } else { //Else do the DFS dfs(g, to, u,stack); /* Normal edge scenario, take the minimum of low of the parent and the child. */ low[u] = Math.min(low[u], low[to]); } } System.out.println(); if (low[u] == tin[u]){ //Print the stack. while(!stack.isEmpty() && stack.peek() != u){ System.out.print(stack.pop() + " "); } System.out.print(u + " "); stack.pop(); } } public void findConnectedComponents(Graph g) { timer = 0; Stack<Integer> stack = new Stack<>(); Iterator it = g.getNodes().iterator(); tin = new int[g.getSize() + 1]; low = new int[g.getSize() + 1]; Arrays.fill(tin, Integer.MAX_VALUE); Arrays.fill(low, Integer.MAX_VALUE); while(it.hasNext()){ int i = (int) it.next(); if (!visited.contains(i)) dfs(g, i, -1, stack); } } }

As we can see from the code that we only need one DFS traversal,therefore compexity of the algorithm will be `O(|V| + |E|)`, where `V` and `E` are total number of vertex and edges present in the graph.

### Problems solved using algorithm

1. Critical connection in a network.

Please write to us if you need coaching for your interview preparations.