Disjoint set data structure

The disjoint set data structure is also known as the union-find data structure or merge-find set. It is a data structure that maintains a collection of disjoint, or non-overlapping, sets. In other words, a set is partitioned into disjoint subsets. Various operations can be performed on these subsets: we can add new sets, merge sets, and find the representative member of a set. It also lets us check efficiently whether two elements are in the same set.

Two sets are disjoint when they have no element in common. Let's understand disjoint sets through an example.

s1 = {1, 2, 3, 4}

s2 = {5, 6, 7, 8}

We have two subsets named s1 and s2. The subset s1 contains the elements 1, 2, 3, 4, while s2 contains the elements 5, 6, 7, 8. Since the two sets have no common element, their intersection is empty. Such sets are known as disjoint sets. Now the question arises of how we can perform operations on them. The two operations we use here are find and union.

The find operation determines which set an element belongs to. For the two sets s1 and s2 above, finding 4 gives s1 and finding 8 gives s2.

Suppose we want to perform the union operation on these two sets. First, we have to check whether the elements on which we are performing the union belong to different sets or to the same set. If they belong to different sets, we can perform the union operation; otherwise, we cannot. For example, suppose we want to perform the union operation between 4 and 8. Since 4 and 8 belong to different sets, we apply the union operation. Once the union is performed, an edge is added between 4 and 8.

When the union operation is applied, the set would be represented as:

s1 ∪ s2 = {1, 2, 3, 4, 5, 6, 7, 8}

Suppose we add one more edge between 1 and 5. Now the final set can be represented as:

s3 = {1, 2, 3, 4, 5, 6, 7, 8}

Before this edge is added, 1 and 5 already belong to the same set. An edge between two vertices that are already in the same set means that a cycle exists in the graph.
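The find and union steps described above can be sketched in Python. This is only a minimal illustration that stores each set as a plain Python set inside a dictionary; the function names `find` and `union` and the dictionary layout are assumptions made for this example, and the sketch assumes every element passed in belongs to some set.

```python
def find(sets, x):
    """Return the name of the set that contains x, or None if no set does."""
    for name, members in sets.items():
        if x in members:
            return name
    return None


def union(sets, a, b):
    """Merge the sets containing a and b.

    Returns False (and changes nothing) when a and b are already in the
    same set, which is the situation that signals a cycle.
    """
    name_a, name_b = find(sets, a), find(sets, b)
    if name_a == name_b:
        return False
    merged = sets.pop(name_b)   # remove the second set ...
    sets[name_a] |= merged      # ... and fold its members into the first
    return True


sets = {"s1": {1, 2, 3, 4}, "s2": {5, 6, 7, 8}}
print(find(sets, 4), find(sets, 8))  # s1 s2
print(union(sets, 4, 8))             # True: 4 and 8 were in different sets
print(union(sets, 1, 5))             # False: 1 and 5 are already together
```

Scanning every set in `find` is slow for large inputs; the tree and array representations discussed later avoid that scan.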

How can we detect a cycle in a graph?

We will understand this concept through an example. Consider the following example, which detects a cycle with the help of disjoint sets.

U = {1, 2, 3, 4, 5, 6, 7, 8}

Each edge is labelled with some weight. There is a universal set containing the 8 vertices. We will consider the edges one by one and form the sets.

First, we consider vertices 1 and 2. Both belong to the universal set, so we perform the union operation between elements 1 and 2. We add the elements 1 and 2 to a set s1 and remove them from the universal set, as shown below:

s1 = {1, 2}

The vertices that we consider now are 3 and 4. Both vertices belong to the universal set, so we perform the union operation between elements 3 and 4. We form the set s2 containing elements 3 and 4 and remove them from the universal set, as shown below:

s2 = {3, 4}

The vertices that we consider now are 5 and 6. Both vertices belong to the universal set, so we perform the union operation between elements 5 and 6. We form the set s3 containing elements 5 and 6 and remove them from the universal set, as shown below:

s3 = {5, 6}

The vertices that we consider now are 7 and 8. Both vertices belong to the universal set, so we perform the union operation between elements 7 and 8. We form the set s4 containing elements 7 and 8 and remove them from the universal set, as shown below:

s4 = {7, 8}

The next edge that we take is (2, 4). Vertex 2 is in set s1, and vertex 4 is in set s2, so the two vertices are in different sets. Applying the union operation forms the new set shown below:

s5 = {1, 2, 3, 4}

The next edge that we consider is (2, 5). Vertex 2 is in set s5, and vertex 5 is in set s3, so the two vertices are in different sets. Applying the union operation forms the new set shown below:

s6 = {1, 2, 3, 4, 5, 6}

Now, two sets are left which are given below:

s4 = {7, 8}

s6 = {1, 2, 3, 4, 5, 6}

The next edge is (1, 3). Since both vertices, 1 and 3, belong to the same set s6, this edge forms a cycle. We will not include this edge.

The next edge is (6, 8). Since vertex 6 belongs to s6 and vertex 8 belongs to s4, the two vertices are in different sets, so we perform the union operation. The union operation forms the new set shown below:

s7 = {1, 2, 3, 4, 5, 6, 7, 8}

The last remaining edge is (5, 7). Since both vertices belong to the same set s7, this edge also forms a cycle.
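The walkthrough above can be reproduced with a short Python sketch. It is an illustration rather than a definitive implementation: the function name `find_cycle_edges` is hypothetical, and instead of removing vertices from a universal set, the sketch starts with every vertex in its own one-element set, which leads to the same merges.

```python
def find_cycle_edges(vertices, edges):
    """Return the edges that would close a cycle, processing edges in order."""
    # Assumption: every vertex starts in its own singleton set.
    set_of = {v: {v} for v in vertices}
    cycle_edges = []
    for u, v in edges:
        if set_of[u] is set_of[v]:
            # Both endpoints are already in the same set: cycle detected.
            cycle_edges.append((u, v))
            continue
        # Different sets: merge them and repoint every member at the result.
        merged = set_of[u] | set_of[v]
        for member in merged:
            set_of[member] = merged
    return cycle_edges


edges = [(1, 2), (3, 4), (5, 6), (7, 8), (2, 4), (2, 5), (1, 3), (6, 8), (5, 7)]
print(find_cycle_edges(range(1, 9), edges))  # [(1, 3), (5, 7)]
```

The two reported edges match the ones rejected in the walkthrough: (1, 3) and (5, 7).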

How can we represent the sets graphically?

We have a universal set which is given below:

U = {1, 2, 3, 4, 5, 6, 7, 8}

We will consider the edges one by one and represent each of them graphically.

First, we consider the vertices 1 and 2, i.e., the edge (1, 2), and represent them graphically as a small tree.

In this tree, vertex 1 is the parent of vertex 2.

Now we consider the vertices 3 and 4, i.e., the edge (3, 4), and represent them graphically in the same way.

In this tree, vertex 3 is the parent of vertex 4.

Consider the vertices 5 and 6, i.e., the edge (5, 6), and represent them graphically in the same way.

In this tree, vertex 5 is the parent of vertex 6.

Now, we consider the vertices 7 and 8, i.e., the edge (7, 8), and represent them graphically in the same way.

In this tree, vertex 7 is the parent of vertex 8.

Now we consider the edge (2, 4). Since 2 and 4 belong to different sets, we need to perform the union operation. We observe that vertex 1 is the parent of vertex 2, whereas vertex 3 is the parent of vertex 4. When we perform the union operation on the two sets s1 and s2, vertex 1 becomes the parent of vertex 3.

The next edge is (2, 5), which has weight 6. Since 2 and 5 are in two different sets, we perform the union operation. We make vertex 5 a child of vertex 1.

We have chosen vertex 5 as a child of vertex 1 because the tree rooted at vertex 1 contains more vertices than the tree rooted at vertex 5.

The next edge is (1, 3), which has weight 7. Both vertices 1 and 3 are in the same set, so there is no need to perform any union operation. Since both vertices belong to the same set, this edge forms a cycle. Having detected the cycle, we skip this edge and go on to consider the remaining edges.
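The parent relationships built up in this section can be sketched in Python as follows. This is a minimal illustration, assuming the trees are stored in two dictionaries named `parent` and `size` and that, when both trees have the same size, the root found for the first vertex of the edge stays the parent; the names `find_root` and `link` are hypothetical.

```python
parent = {v: v for v in range(1, 9)}  # every vertex starts as its own root
size = {v: 1 for v in range(1, 9)}    # number of vertices in each root's tree


def find_root(v):
    """Walk up the parent links until a vertex that is its own parent."""
    while parent[v] != v:
        v = parent[v]
    return v


def link(u, v):
    """Union the trees of u and v; return False if they share a root."""
    ru, rv = find_root(u), find_root(v)
    if ru == rv:
        return False
    # Attach the smaller tree under the larger one (ties keep ru as parent).
    if size[ru] < size[rv]:
        ru, rv = rv, ru
    parent[rv] = ru
    size[ru] += size[rv]
    return True


for u, v in [(1, 2), (3, 4), (5, 6), (7, 8), (2, 4), (2, 5)]:
    link(u, v)

print(parent)      # {1: 1, 2: 1, 3: 1, 4: 3, 5: 1, 6: 5, 7: 7, 8: 7}
print(link(1, 3))  # False: 1 and 3 share the root 1, so (1, 3) closes a cycle
```

After the edge (2, 5), vertices 3 and 5 are both children of vertex 1, which matches the trees described above.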

How can we detect a cycle with the help of an array?

Consider the graph below:

The graph contains 8 vertices, so we represent all 8 vertices in a single array. Here, the indices represent the 8 vertices. Each index initially holds the value -1. A negative value means that the vertex is the parent of itself, and its magnitude is the number of vertices in the set, which is 1 at the start.

Now we will see how we can represent the sets in an array.

First, we consider the edge (1, 2). When we find 1 in the array, we observe that 1 is the parent of itself. Similarly, vertex 2 is the parent of itself, so we make vertex 2 the child of vertex 1. We store 1 at index 2 because 2 is the child of 1. We store -2 at index 1, where the '-' sign indicates that vertex 1 is the parent of itself and 2 is the number of vertices in the set.

The next edge is (3, 4). When we find 3 and 4 in the array, we observe that each vertex is the parent of itself. We make vertex 4 the child of vertex 3, so we store 3 at index 4 in the array. We store -2 at index 3.

The next edge is (5, 6). When we find 5 and 6 in the array, we observe that each vertex is the parent of itself. We make vertex 6 the child of vertex 5, so we store 5 at index 6 in the array. We store -2 at index 5.

The next edge is (7, 8). Since each vertex is the parent of itself, we make vertex 8 the child of vertex 7. We store 7 at index 8 and -2 at index 7 in the array.

The next edge is (2, 4). The parent of vertex 2 is 1, and the parent of vertex 4 is 3. Since the two vertices have different parents, we make vertex 3 the child of vertex 1. We store 1 at index 3. We store -4 at index 1 because the set now contains 4 vertices.

Graphically, it can be represented as

The next edge is (2, 5). When we find vertex 2 in the array, we observe that 1 is the parent of vertex 2 and vertex 1 is the parent of itself. When we find 5 in the array, we find the value -2, which means vertex 5 is the parent of itself. Now we have to decide whether vertex 1 or vertex 5 becomes the parent. The set rooted at vertex 1 has 4 vertices (stored as -4), which is more than the 2 vertices (stored as -2) in the set rooted at vertex 5, so when we apply the union operation, vertex 5 becomes a child of vertex 1.

In the array, we store 1 at index 5 because vertex 1 is now the parent of vertex 5. We store -6 at index 1 because two more nodes have been added to the set rooted at node 1.

The next edge is (1, 3). When we find vertex 1 in the array, we observe that 1 is the parent of itself. When we find 3 in the array, we observe that 1 is the parent of vertex 3. Therefore, both vertices have the same parent, so including the edge (1, 3) would form a cycle.

The next edge is (6, 8). When we find vertex 6 in the array, we observe that vertex 5 is the parent of vertex 6 and vertex 1 is the parent of vertex 5. When we find 8 in the array, we observe that vertex 7 is the parent of vertex 8 and 7 is the parent of itself. The set rooted at vertex 1 has 6 vertices (stored as -6), which is more than the 2 vertices (stored as -2) in the set rooted at vertex 7, so we make vertex 7 the child of vertex 1.

We store 1 at index 7 because vertex 7 becomes a child of vertex 1. We store -8 at index 1 because the set rooted at vertex 1 now contains 8 vertices.

The last edge to be considered is (5, 7). When we find vertex 5 in the array, we observe that vertex 1 is the parent of vertex 5. Similarly, when we find vertex 7 in the array, we observe that vertex 1 is the parent of vertex 7. Therefore, both vertices have the same parent, i.e., 1. It means that including the edge (5, 7) would form a cycle.
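The array steps above can be traced with the following Python sketch. It is a minimal illustration under a few assumptions: index 0 of the list is left unused so that indices match the vertex numbers, ties in size keep the root of the first vertex as the parent, and the names `find` and `weighted_union` are chosen for this example.

```python
def find(arr, v):
    """Follow parent links until reaching a root (an entry below zero)."""
    while arr[v] > 0:
        v = arr[v]
    return v


def weighted_union(arr, u, v):
    """Union by size on the array; return False if u and v share a root."""
    ru, rv = find(arr, u), find(arr, v)
    if ru == rv:
        return False  # same set: this edge would close a cycle
    # A root stores -(set size), so the more negative entry is the larger set.
    if arr[ru] > arr[rv]:
        ru, rv = rv, ru
    arr[ru] += arr[rv]  # add the two (negative) sizes together
    arr[rv] = ru        # the smaller root now points at the larger root
    return True


arr = [0] + [-1] * 8  # index 0 unused; vertices 1..8 start as roots of size 1
edges = [(1, 2), (3, 4), (5, 6), (7, 8), (2, 4), (2, 5), (1, 3), (6, 8), (5, 7)]
cycles = [(u, v) for u, v in edges if not weighted_union(arr, u, v)]

print(arr[1:])  # [-8, 1, 1, 3, 1, 5, 1, 7]
print(cycles)   # [(1, 3), (5, 7)]
```

The final array holds -8 at index 1, matching the walkthrough, and the two rejected edges are the ones identified above as forming cycles.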

Till now, we have learnt the weighted union, where we perform the union operation according to the weights of the sets, i.e., the number of vertices they contain. The root of the larger set becomes the parent, and the root of the smaller set becomes its child. The disadvantage of this approach is that some nodes need several steps to reach the root of their set. For example, in the above graph, to find the root for vertex 6 we first move to vertex 5, which is the parent of vertex 6, and then to vertex 1, which is the parent of vertex 5. To overcome this problem, we use the concept of 'collapsing find'.

How does the collapsing find technique work?

Consider the above example. Once we know that the root for vertex 6 is vertex 1, we link vertex 6 directly to vertex 1. We also update the array: we store 1 at index 6 because the parent of 6 is now 1. This process of linking a node directly to the root of its set is known as collapsing find. Similarly, we can link nodes 8 and 4 directly to node 1.
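A collapsing find can be sketched in Python as below. This is only an illustration: the function name `collapsing_find` is an assumption, and the starting array is the one reached at the end of the array walkthrough (index 0 unused, a negative entry marks a root and stores the set size).

```python
def collapsing_find(arr, v):
    """Return the root of v and link every vertex on the path directly to it."""
    # First pass: locate the root (the entry that is below zero).
    root = v
    while arr[root] > 0:
        root = arr[root]
    # Second pass: repoint each vertex on the path straight at the root.
    while v != root:
        next_vertex = arr[v]
        arr[v] = root
        v = next_vertex
    return root


arr = [0, -8, 1, 1, 3, 1, 5, 1, 7]  # state after all the unions above
print(collapsing_find(arr, 6))      # 1, and index 6 now stores 1
collapsing_find(arr, 8)
collapsing_find(arr, 4)
print(arr[1:])                      # [-8, 1, 1, 1, 1, 1, 1, 1]
```

After these three calls every vertex points directly at vertex 1, so any later find on these vertices takes a single step.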