## Find friend groups

There are N students in a class. Some of them are friends, while some are not. Friendship is transitive: if A is a direct friend of B, and B is a direct friend of C, then A is an indirect friend of C. A friend circle is a group of students who are direct or indirect friends.

You are given an N*N matrix M representing the friend relationships between students in the class. If M[i][j] = 1, then the `i`th and `j`th students are direct friends with each other; otherwise they are not. You have to output the total number of friend circles among all the students. For example,

```
Input:
[[1,1,0,0,0,0],
 [1,1,0,0,0,0],
 [0,0,1,1,0,0],
 [0,0,1,1,1,0],
 [0,0,0,1,1,0],
 [0,0,0,0,0,1]
]
Output: 3
```

## Thought process

When we talk about connections or relationships, we immediately think of the graph data structure. Each node is a person and each edge is a relationship between two persons. First, we have to figure out whether the graph is directed or undirected. In this problem, friendship is a mutual relationship, thus the graph is undirected.

Reading this problem, the concept of connected components should come to mind. Why? If A is a friend of B, and B is a friend of C, then A is indirectly connected to C. Every student in a circle can reach every other student in that circle through a path of direct friendships. Because the graph is undirected, such a group is exactly a connected component (the term strongly connected component applies to directed graphs). So all friends, direct and indirect, who belong to one friend circle are also in one connected component of the graph.

A particular group of friends is a single component. In this problem, we have to find out how many components there are in the graph.

Drawn as a graph (see the figure below), the example matrix has three connected components: students {1, 2}, {3, 4, 5}, and {6}, numbering the students from 1. That means there are 3 friend circles.

We can solve this problem using two methods: depth-first search and the disjoint set method.

## Using Depth first traversal method

Finding connected components of an undirected graph is easy. We can run either BFS or DFS starting from every unvisited vertex; each such run discovers exactly one connected component.

```
1. Initialize all nodes as not visited.
2. Initialize variable count as 0.
3. For every vertex 'v':
   (i) If 'v' is unvisited
       Call DFS(v)
       count = count + 1

DFS(v)
1. Mark 'v' as visited.
2. Do the following for every unvisited neighbor `u`:
   recursively call DFS(u)
```

### DFS based approach implementation

```
#include <vector>
using namespace std;

class Solution {
public:
    // Marks every student reachable from `node` as visited.
    // `visited` is passed by reference so the marks survive the call.
    void DFS(int node, const vector<vector<int>>& edges, vector<bool>& visited)
    {
        visited[node] = true;
        for (int i = 0; i < (int)edges[node].size(); i++)
        {
            if (edges[node][i] == 1 && node != i && !visited[i])
            {
                DFS(i, edges, visited);
            }
        }
    }

    // Main method
    int findCircleNum(const vector<vector<int>>& edges) {
        int n = edges.size();
        int count = 0;
        // All nodes start as unvisited.
        vector<bool> visited(n, false);

        for (int i = 0; i < n; i++)
        {
            if (!visited[i])
            {
                // i starts a new, unexplored friend circle.
                DFS(i, edges, visited);
                count = count + 1;
            }
        }

        return count;
    }
};
```

Complexity

• Time Complexity: Every node is visited once, and visiting a node scans its whole row of the matrix, so the total is O(n^2) for n students.
• Space Complexity: O(n), to store the visited flags (the recursion stack can also grow to O(n)).
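
As noted earlier, BFS works just as well as DFS for this problem. Below is a minimal sketch of the BFS variant, assuming the same adjacency-matrix input; the function name `countCirclesBfs` is made up for this illustration and is not part of the original solution.

```cpp
#include <queue>
#include <vector>

// Counts connected components of the friendship matrix using BFS.
// Hypothetical helper name, used only for this sketch.
int countCirclesBfs(const std::vector<std::vector<int>>& edges) {
    const int n = static_cast<int>(edges.size());
    std::vector<bool> visited(n, false);
    int count = 0;
    for (int start = 0; start < n; ++start) {
        if (visited[start]) continue;
        ++count;  // `start` belongs to a circle we have not explored yet
        std::queue<int> pending;
        pending.push(start);
        visited[start] = true;
        while (!pending.empty()) {
            int node = pending.front();
            pending.pop();
            // Scan the row of `node` for direct friends not seen so far.
            for (int next = 0; next < n; ++next) {
                if (edges[node][next] == 1 && !visited[next]) {
                    visited[next] = true;
                    pending.push(next);
                }
            }
        }
    }
    return count;
}
```

The queue replaces the recursion stack, which avoids deep recursion when one circle contains many students; the time and space bounds stay the same.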

## Using Disjoint Sets(Union Find)

So, how do we recognize that this problem can be solved with disjoint sets (the union-find algorithm)?

The answer is simple: we need to keep track of a set of elements (here, friends) partitioned into a number of non-overlapping subsets, and disjoint sets (union-find) do this very efficiently. We will use the union by rank algorithm to solve this problem.

To join two nodes, we compare the ranks of the roots (top-most parents) of both nodes.

• If the ranks are equal, we can make either root the parent of the other and increment the rank of the new parent by 1.
• If the ranks differ, the root with the greater rank becomes the parent of the other root.

Let’s start solving this.

Union(1,2): 1 is a parent of itself and 2 is a parent of itself. As they have different parents, we have to connect them, and either parent can become the root; in this case we choose 1 and make it the parent of 2.

Union(2,1): 1 is a parent of itself and 1 is the parent of 2. Both have the same parent, so they are already joined.

Union(3,4): 3 is a parent of itself and 4 is a parent of itself. They have different parents, so we need to join them; 3 becomes the parent of 4.

Union(4,3): 3 is a parent of itself and 3 is the parent of 4. Both have the same parent, so they are already joined.

Union(4,5): 3 is the parent of 4 and 5 is the parent of itself. Since the parents are different, we compare their ranks. 3 has a higher rank than 5, so 3 becomes the parent of 5 (path compression is applied during the lookup), as shown in the figure below.

Union(5,4): 4 and 5 now have the same parent, so they are already joined. Last is node 6, which is connected only to itself, so there is nothing to do. At the end of this exercise, there are three different sets in the graph, and that is our answer: the number of groups of friends in this graph or matrix.
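
The walkthrough above can be replayed in a few lines. This is a minimal array-based sketch of union by rank, not the implementation that follows; the names `RankedSets`, `unite` and `replayWalkthrough` are invented for illustration, and students 1..6 are mapped to indices 0..5.

```cpp
#include <numeric>
#include <utility>
#include <vector>

// Array-based disjoint sets with union by rank (illustrative names).
struct RankedSets {
    std::vector<int> parent, rank;
    int sets;  // number of disjoint sets that currently exist

    explicit RankedSets(int n) : parent(n), rank(n, 0), sets(n) {
        std::iota(parent.begin(), parent.end(), 0);  // every node is its own root
    }

    int find(int x) const {
        while (parent[x] != x) x = parent[x];
        return x;
    }

    // Returns true if two different sets were merged.
    bool unite(int a, int b) {
        int ra = find(a), rb = find(b);
        if (ra == rb) return false;             // already joined
        if (rank[ra] < rank[rb]) std::swap(ra, rb);
        parent[rb] = ra;                        // lower rank goes under higher rank
        if (rank[ra] == rank[rb]) ++rank[ra];   // equal ranks: the new root grows
        --sets;
        return true;
    }
};

// Replays the unions from the walkthrough (student k is index k - 1).
int replayWalkthrough() {
    RankedSets s(6);
    s.unite(0, 1); s.unite(1, 0);   // Union(1,2), Union(2,1)
    s.unite(2, 3); s.unite(3, 2);   // Union(3,4), Union(4,3)
    s.unite(3, 4); s.unite(4, 3);   // Union(4,5), Union(5,4)
    return s.sets;                  // remaining sets: {1,2}, {3,4,5}, {6}
}
```

Keeping a running `sets` counter is a design choice of this sketch: it avoids the final pass over all nodes that counts roots.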

### Disjoint set based approach implementation

```
#include <map>
#include <vector>
using namespace std;

class Solution {
public:
    class Node
    {
    public:
        int data;
        Node* parent;
        int rank = 0;
    };

    map<int, Node*> mapy;

    // Make a set with only one element.
    void make(int data)
    {
        Node* node = new Node();
        node->data = data;
        node->parent = node;
        node->rank = 0;
        mapy[data] = node;
    }

    // Return the address of the node having data as `data`,
    // creating the node first if it does not exist yet.
    Node* find(int data)
    {
        if (mapy.find(data) == mapy.end())
        {
            make(data);
        }
        return mapy[data];
    }

    /* Find the representative (root) recursively and do path compression
       as well. */
    Node* parents(Node* node)
    {
        if (node->parent == node)
        {
            return node;
        }
        return node->parent = parents(node->parent);
    }

    // Main method
    int findCircleNum(vector<vector<int>> edges) {
        int m = edges.size();
        for (int i = 0; i < m; i++)
        {
            // Create the node for i up front so isolated students are counted.
            Node* A = find(i);
            int n = edges[i].size();
            for (int j = 0; j < n; j++)
            {
                if (edges[i][j] != 1) continue;

                Node* B = find(j);
                Node* PA = parents(A);
                Node* PB = parents(B);
                if (PA == PB) continue;   // already in the same set

                if (PA->rank >= PB->rank)
                {
                    // Increment rank if both sets have the same rank.
                    if (PA->rank == PB->rank) PA->rank++;
                    PB->parent = PA;
                }
                else
                {
                    PA->parent = PB;
                }
            }
        }

        // Every node that is its own parent represents one friend circle.
        int number = 0;
        for (auto& k : mapy)
        {
            if (k.second->parent == k.second)
            {
                number = number + 1;
            }
        }
        return number;
    }
};
```

Complexity

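• Time Complexity: We scan all n^2 cells of the matrix, and for each 1 we find two roots and possibly do a union. With union by rank each find takes O(log n), and each map lookup also takes O(log n), so the total is O(n^2 log n) in the worst case.
• Space Complexity: O(n), for the map that stores one node (with its parent pointer and rank) per student.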

This post is contributed by Monika Bhasin

## Cycle in undirected graph using disjoint set

In the post on the disjoint set data structure, we discussed the basics of disjoint sets. One of the applications of that data structure is to find whether there is a cycle in an undirected graph.

In graph theory, a cycle is a path of edges and vertices wherein a vertex is reachable from itself.

For example, in the graph shown below, there is a cycle formed by path : 1->2->4->6->1.

Disjoint-set data structure has two operations: union and find. Union operation merges two sets into one, whereas find operation finds the representative of the set a given element belongs to.

### Using disjoint set to detect a cycle in an undirected graph

How can we use this data structure and its operations to find whether a given undirected graph contains a cycle?

We use an array A, which stores the parent of each node. Initialize each entry with the element itself, so that to start with every node is the parent of itself.

Now, process each `edge(u,v)` in the graph, and for each edge do the following: get the roots of both vertices u and v of the edge. If the roots differ, merge the two sets by making the root of u the parent of the root of v. If the roots are the same, both vertices already belong to the same set, and hence this edge creates a cycle.

How can we find the root of a vertex? As we know, A[i] represents the parent of i; we start with i = u and go up until we find A[i] = i. That means i has no other parent, and hence i is the root of the tree to which u belongs.

Let’s take an example and see how it works. Below is the given undirected graph, and we have to find whether there is a cycle in it.

Now, we process the edges of the graph one by one. First is `edge(1,2)`. The root of `node(1)` is 1 and the root of `node(2)` is 2. Since the roots of the two vertices are different, we update the parent of the root of 2, which is A[2], to the root of 1, which is 1.

Next, we process `edge(2,3)`. Here the root of `node(2)` is 1, whereas the root of `node(3)` is 3. Again they differ, hence we update A[root of 3] with the root of 2, i.e. A[3] = 1.

Now, process `edge(2,4)`; it will end up with A[4] = 1. Can you deduce why? Similarly, `edge(4,6)` leads to the update A[6] = 1.

Now, we process `edge(6,1)`. Here, the root of `node(6)` is 1 and the root of `node(1)` is also 1. Both nodes have the same root, which means there is a cycle in the undirected graph.

#### To detect a cycle in an undirected graph: Implementation

```
package com.company.Graphs;

import java.util.*;

/**
* Created by sangar on 21.12.18.
*/
// The class declaration and constructor header are missing from this listing;
// the class name Graph is an assumption.
public class Graph {
    private Map<Integer, ArrayList<Integer>> G;
    private boolean isDirected;
    private int count;

    public Graph(boolean isDirected){
        this.G = new HashMap<>();
        this.isDirected = isDirected;
    }

    public void addEdge(int start, int dest){

        if(this.G.containsKey(start)){
            this.G.get(start).add(dest);
        }else{
            this.G.put(start, new ArrayList<>(Arrays.asList(dest)));
        }

        if(!this.G.containsKey(dest)) {
            this.G.put(dest, new ArrayList<>());
        }
        //In case graph is undirected
        if(!this.isDirected) {
            this.G.get(dest).add(start);
        }
    }

    public boolean isEdge(int start, int dest){
        if(this.G.containsKey(start)){
            return this.G.get(start).contains(dest);
        }

        return false;
    }

    //Assumes the vertices are numbered 1..V.
    public boolean isCycleWithDisjointSet() {
        int[] parent = new int[this.G.size() + 1];
        //To start with, every node is the parent of itself.
        for (int k = 0; k < parent.length; k++) parent[k] = k;

        for (int u = 1; u < this.G.size() + 1; u++) {
            /*This loop will run for O(E) times for all
            the vertices combined. */
            for(int v: this.G.get(u)){
                //An undirected edge is stored in both lists; process it once.
                if(!this.isDirected && u > v) continue;

                //Find roots of u and v. Worst complexity is O(V)
                int i, j;
                for(i=u; i != parent[i]; i = parent[i]);
                for(j=v; j != parent[j]; j = parent[j]);

                if(i == j){
                    System.out.println("Cycle detected at (" + u + "," + v + ")");
                    return true;
                }

                //Root of u becomes the parent of the root of v.
                parent[j] = i;
            }
        }
        return false;
    }
}
```

Test cases

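```
package test.Graphs;

import com.company.Graphs.Graph;
import org.junit.jupiter.api.Test;

import static org.junit.jupiter.api.Assertions.assertEquals;

/**
* Created by sangar on 21.12.18.
*/
// The test class declaration and the graph setup are missing from this listing;
// the class names and the edges below are assumptions chosen to match the assertions.
public class GraphTest {

    Graph tester = new Graph(false);

    @Test
    public void detectCycleInDirectedGraphTest() {
        tester.addEdge(1,2);
        tester.addEdge(2,3);
        tester.addEdge(3,4);
        tester.addEdge(4,5);
        tester.addEdge(5,2);

        assertEquals(true, tester.isEdge(3,4));
        assertEquals(false, tester.isEdge(1,4));

        assertEquals(true, tester.isCycleWithDisjointSet());

    }
}
```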

The complexity of this algorithm is `O(EV)`, where E is the number of edges and V is the number of vertices: finding a root in the disjoint set can take time linear in the number of vertices, and it may run once for every edge.
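
For comparison, the edge-by-edge walkthrough above can be condensed into a short C++ sketch that works directly on an edge list. It assumes vertices are numbered 1..`vertexCount` and that each undirected edge is listed exactly once; the names `findRoot` and `hasCycle` are hypothetical and do not come from the Java class.

```cpp
#include <utility>
#include <vector>

// Follows parent links until it reaches a node that is its own parent.
int findRoot(const std::vector<int>& parent, int i) {
    while (parent[i] != i) i = parent[i];
    return i;
}

// Returns true if the undirected graph contains a cycle.
// Assumption: vertices are 1..vertexCount, each edge appears once.
bool hasCycle(int vertexCount,
              const std::vector<std::pair<int, int>>& edgeList) {
    std::vector<int> parent(vertexCount + 1);
    for (int i = 0; i <= vertexCount; ++i) parent[i] = i;  // own parent

    for (const auto& edge : edgeList) {
        int rootU = findRoot(parent, edge.first);
        int rootV = findRoot(parent, edge.second);
        if (rootU == rootV) return true;  // ends already connected: cycle
        parent[rootV] = rootU;            // same direction as the walkthrough
    }
    return false;
}
```

Called with the edges (1,2), (2,3), (2,4), (4,6), (6,1) from the example, the function follows the same parent updates as the walkthrough and reports a cycle on the last edge.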

Please share if there is something wrong or missing. If you are preparing for interview and interested in personalized coaching, please signup for free demo class.

## Disjoint set data structure

A disjoint set data structure, also known as union-find, maintains a collection 𝑆 = { 𝑆1, 𝑆2, ⋯ , 𝑆𝑛 } of disjoint dynamic sets. Sets are said to be disjoint if their intersection is empty. For example, the sets {1,2,3} and {4,5,6} are disjoint, but {1,2,3} and {1,3,5} are not, as their intersection is {1,3}, which is not empty. Another important property of the disjoint set is that every set is represented by a member of that set, called its representative.

Operations on this disjoint set data structure:
1. Make Set: Creates a new set with one element x. Since the sets are disjoint, we require that x not already be in any of the existing sets.
2. Union: Merges the two sets containing x and y, say Sx and Sy, and destroys the original sets.
3. Find: Returns the representative of the set to which an element belongs.

Let’s take an example and see how disjoint sets can be used to find the connected components of an undirected graph.

To start with, we make a set for each vertex using the make-set operation.

```
for each vertex v in G(V)
    do makeSet(v)
```

Next, process all the edges (u,v) in the graph and connect set(u) and set(v) if the representatives of the set that contains u and the set that contains v are not the same.

```
for each edge (u,v) in G(E)
    do if findSet(u) != findSet(v)
        then union(u, v)
```

Once the above preprocessing steps have run, we can easily answer whether two vertices u and v are part of the same connected component.

```
boolean isSameComponent(u, v)
    if findSet(u) == findSet(v)
        return True
    else
        return False
```

To find how many components are there, we can look at how many disjoint sets are there and that will give us the number of connected components in a graph. Let’s take an example and see how it works.

The table below shows the processing of each edge in the graph shown in the figure above.

Now, how can we implement sets and quickly do union and find operations? There are two ways to do it.

### Disjoint set representation using an array

A simple implementation of a disjoint set uses an array that stores the representative of element i in A[i]. For this implementation to work, all the elements must be in the range 0 to N-1, where N is the size of the array.

Initially, the makeSet() operation sets `A[i]=i` for each i between 0 and N-1, creating the initial versions of the sets.

```
for (int i=0; i<N; i++) A[i] = i;
```

To union the sets that contain integers u and v, we scan the array A and change all the elements that have the value A[u] to have the value A[v]. For example, to connect 2 and 1 in the above set with union(2, 1), the operation replaces A[2] with A[1].

Now, suppose we want to add an edge between 3 and 1. In this case, u = 3 and v = 1, with A[3] = 3 and A[1] = 1. So we replace every entry of A whose value is A[3] = 3 with A[1] = 1. The final array looks like this.

Similarly, we can add an edge from 6 to 7.

```
//change all elements from A[u] to A[v].
void union(int A[], int u, int v){
    int temp = A[u];
    for(int i=0; i<A.length; i++){
        if(A[i] == temp)
            A[i] = A[v];
    }
}
```

findSet(v) operation returns the value of A[v].

```
int findSet(int A[], int v){
    return A[v];
}
```

The complexity of the makeSet() operation is `O(n)`, as it initializes the entire array. Each union operation takes `O(n)` time, so connecting n nodes takes `O(n^2)` operations. The findSet() operation has constant time complexity.

We can represent a disjoint set using a linked list too. In that case, each set is a linked list, and the head of the linked list is the representative element. Each node contains two pointers: one to the next element in the set and one to the representative of the set.

To initialize, each element is placed in its own linked list. To union (u, v), we append the linked list that contains u to the end of the linked list that contains v and change the representative pointer of each moved node to point to the representative of the list that contained v.

The complexity of the union operation is again `O(n)`, while the find operation is `O(1)`, as it simply returns the stored representative.
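
The linked-list representation has no code in this post, so here is a minimal C++ sketch of it. To stay short it stores the "pointers" as indices into arrays rather than as real pointers; that choice, the extra `tail` array, and the names `ListSets` and `unite` are assumptions of this sketch.

```cpp
#include <vector>

// Linked-list disjoint sets over elements 0..n-1, stored as index arrays.
// next[i]: element after i in its list (-1 at the end of the list).
// rep[i]:  head (representative) of the list that contains i.
// tail[h]: last element of the list whose head is h (valid for heads only).
struct ListSets {
    std::vector<int> next, rep, tail;

    explicit ListSets(int n) : next(n, -1), rep(n), tail(n) {
        for (int i = 0; i < n; ++i) rep[i] = tail[i] = i;  // singleton lists
    }

    // O(1): the representative is stored with every element.
    int findSet(int x) const { return rep[x]; }

    // Appends the list containing u to the end of the list containing v.
    void unite(int u, int v) {
        int headU = rep[u], headV = rep[v];
        if (headU == headV) return;      // already in the same list
        next[tail[headV]] = headU;       // splice u's list after v's tail
        tail[headV] = tail[headU];
        // Re-point every moved element: linear in the length of u's list.
        for (int i = headU; i != -1; i = next[i]) rep[i] = headV;
    }
};
```

The loop at the end of `unite` is where the `O(n)` cost of union comes from, while `findSet` is a single array read.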

### Disjoint set forest

The disjoint-set forest is implemented by changing the interpretation of the elements of array A. Now each A[i] represents an element of a set and points to another element of that set. The root element points to itself. In short, A[i] now points to the parent of i.

The makeSet operation does not change: to start with, each element is the parent of itself.
The union operation does change: if we want to connect u and v with an edge, we update A[root of u] with the root of v. How do we find the root of an element? As A[i] is the parent of i, we can move up the chain from v until we find a case where `A[i] == i`; in that case, i is the root of v.

```
//finding root of an element
int root(int A[], int i){
    while(A[i] != i){
        i = A[i];
    }
    return i;
}

/*Changed union function where we connect
the elements by changing the root of
one of the elements
*/
void union(int A[], int u, int v){
    int rootU = root(A, u);
    int rootV = root(A, v);
    A[rootU] = rootV;
}
```

This implementation has a worst-case complexity of `O(n)` for the union function, and the worst-case complexity of the findSet operation also becomes `O(n)`, because a tree can degenerate into a long chain.

However, we can rank the trees being connected by size and always make the root of the smaller tree point to the root of the bigger tree, which keeps tree heights logarithmic in the number of elements.

```
//sz[i] holds the size of the tree rooted at i; every entry starts at 1.
void union(int[] A, int[] sz, int u, int v){

    //Finding roots
    int i, j;
    for (i = u; i != A[i]; i = A[i]) ;
    for (j = v; j != A[j]; j = A[j]) ;

    if (i == j) return;
    //Comparing size of tree to put smaller tree root under
    // bigger tree's root.
    if (sz[i] < sz[j]){
        A[i] = j;
        sz[j] += sz[i];
    }
    else {
        A[j] = i;
        sz[i] += sz[j];
    }
}
```
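
Size-based union is often paired with path compression, a separate optimisation that this section does not cover. As a hedged illustration only, the following C++ sketch shows one way to compress paths while finding a root in the same parent array A; the function name `findCompressed` is hypothetical.

```cpp
#include <vector>

// Returns the root of i and points every node on the path directly at it.
int findCompressed(std::vector<int>& A, int i) {
    int root = i;
    while (A[root] != root) root = A[root];  // first pass: locate the root
    while (A[i] != root) {                   // second pass: rewire the path
        int next = A[i];
        A[i] = root;
        i = next;
    }
    return root;
}
```

After one call on a deep node, later lookups for any node on that path reach the root in a single step.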

In the next few posts, we will be discussing applications of this method to solve different problems on graphs.