Minimum Spanning Tree using Kruskal's Algorithm in C++

Introduction to Kruskal's Algorithm:

Algorithms are central to solving hard computational problems. Kruskal's algorithm is a simple and effective one from graph theory: it finds the cheapest set of edges that connects every vertex of a weighted graph.

A minimum spanning tree (MST) of a connected, weighted, undirected graph is a subset of its edges that connects all vertices, contains no cycles, and has the least possible total edge weight. MSTs are key in network design, infrastructure planning, and other optimization problems. Kruskal's algorithm, named after the mathematician Joseph Kruskal, who published it in 1956, is a greedy algorithm: it makes the cheapest available choice at each step, and for this problem those local choices add up to a globally optimal tree.

In this article, we'll look at how Kruskal's algorithm works, implement it in C++, compare it with Prim's algorithm, and review where it is handy in real life, along with its advantages and disadvantages.

Implementation in C++

Program:

Let us take an example to implement the minimum spanning tree using Kruskal's algorithm in C++:

#include <iostream>
#include <vector>
#include <algorithm>

using namespace std;

// An undirected, weighted edge between vertices src and dest
struct Edge {
    int src, dest, weight;
};

// One entry per vertex in the union-find forest
struct Subset {
    int parent, rank;
};

// Compare function for sorting edges based on their weights
bool compareEdges(const Edge &a, const Edge &b) {
    return a.weight < b.weight;
}

// Find the representative of the set containing i (with path compression)
int find(Subset subsets[], int i) {
    if (subsets[i].parent != i)
        subsets[i].parent = find(subsets, subsets[i].parent);
    return subsets[i].parent;
}

// Merge the sets containing x and y (union by rank)
void Union(Subset subsets[], int x, int y) {
    int xroot = find(subsets, x);
    int yroot = find(subsets, y);

    if (subsets[xroot].rank < subsets[yroot].rank)
        subsets[xroot].parent = yroot;
    else if (subsets[xroot].rank > subsets[yroot].rank)
        subsets[yroot].parent = xroot;
    else {
        subsets[yroot].parent = xroot;
        subsets[xroot].rank++;
    }
}

// Kruskal's algorithm to find Minimum Spanning Tree
void Kruskal(vector<Edge> &edges, int V) {
    // Sort the edges in non-decreasing order of their weights
    sort(edges.begin(), edges.end(), compareEdges);

    // Allocate memory for creating V subsets
    Subset *subsets = new Subset[V];

    // Create V subsets with single elements
    for (int i = 0; i < V; ++i) {
        subsets[i].parent = i;
        subsets[i].rank = 0;
    }

    // Initialize result
    vector<Edge> result;

    // A spanning tree has V-1 edges. Also stop when the edges run out,
    // which happens only if the graph is not connected.
    size_t edgeIndex = 0;
    while (static_cast<int>(result.size()) < V - 1 && edgeIndex < edges.size()) {
        // Pick the smallest edge and increment the index for the next iteration
        Edge nextEdge = edges[edgeIndex++];

        int x = find(subsets, nextEdge.src);
        int y = find(subsets, nextEdge.dest);

        // If including this edge doesn't cause a cycle, add it to the result
        if (x != y) {
            result.push_back(nextEdge);
            Union(subsets, x, y);
        }
    }

    // Print the result
    for (const Edge &edge : result)
        cout << edge.src << " -- " << edge.dest << " == " << edge.weight << endl;

    // Cleanup
    delete[] subsets;
}

int main() {
    // Example usage
    int V = 4; // Number of vertices
    vector<Edge> edges = {
        {0, 1, 10},
        {0, 2, 6},
        {0, 3, 5},
        {1, 3, 15},
        {2, 3, 4}
    };

    Kruskal(edges, V);

    return 0;
}

Output:

2 -- 3 == 4
0 -- 3 == 5
0 -- 1 == 10

Explanation:

1. Sort Edges by Weight:

Kruskal's algorithm begins by sorting all the edges of the graph in ascending order of weight. This ordering is essential to the greedy strategy, because the algorithm always considers the cheapest remaining edge first.

2. Initialize Subsets:

Create subsets for each vertex in the graph. Initially, each vertex is in its own subset.

3. Iterate Over Sorted Edges:

Starting with the smallest edge, iterate through the sorted list of edges.

4. Check for Cycle:

For each edge, check whether including it in the growing spanning tree would create a cycle. This is done by checking whether the source and destination vertices of the edge belong to the same subset. If they don't, adding this edge won't create a cycle.

5. Union Operation:

If adding the edge doesn't create a cycle, include it in the minimum spanning tree. Update the subsets by performing the union operation, merging the subsets of the source and destination vertices.

6. Repeat Until Spanning Tree is Formed:

Continue this process until the minimum spanning tree has V-1 edges, where V is the number of vertices. At this point, the tree spans all vertices without forming any cycles. If the edge list is exhausted before that, the graph is not connected and has no spanning tree.

7. Result:

The result is a minimum spanning tree that connects all vertices with the minimum total edge weight.
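
To make the steps above concrete, the following sketch replays them on the four-vertex example graph and records the decision taken for each edge. The function name traceKruskal, the helper findRoot, and the log format are assumptions made for this illustration; they are not part of the program above. For brevity, the union-find here is a plain parent array with path halving and no rank.

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <vector>

struct Edge {
    int src, dest, weight;
};

// Follow parent links to the representative of x's subset,
// shortening the path along the way (path halving).
int findRoot(std::vector<int> &parent, int x) {
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}

// Hypothetical helper: runs the steps listed above and returns one log
// line per edge examined, such as "2-3 (4): accept".
std::vector<std::string> traceKruskal(std::vector<Edge> edges, int V) {
    // Step 1: sort the edges by weight.
    std::sort(edges.begin(), edges.end(),
              [](const Edge &a, const Edge &b) { return a.weight < b.weight; });

    // Step 2: every vertex starts in its own subset.
    std::vector<int> parent(V);
    std::iota(parent.begin(), parent.end(), 0);

    std::vector<std::string> log;
    int accepted = 0;

    // Steps 3 to 6: scan the edges until V-1 of them have been accepted.
    for (const Edge &e : edges) {
        if (accepted == V - 1) break;

        int a = findRoot(parent, e.src);
        int b = findRoot(parent, e.dest);

        std::string line = std::to_string(e.src) + "-" + std::to_string(e.dest) +
                           " (" + std::to_string(e.weight) + "): ";
        if (a != b) {
            parent[a] = b;  // union: merge the two subsets
            ++accepted;
            line += "accept";
        } else {
            line += "reject";  // both ends already connected, would form a cycle
        }
        log.push_back(line);
    }
    return log;
}
```

For the example graph with V = 4, the log contains four lines: edges 2-3 and 0-3 are accepted, edge 0-2 is rejected because vertices 0 and 2 are already connected through vertex 3, and edge 0-1 is accepted. Edge 1-3 is never examined because the tree is already complete.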

Approach 2:

Let us take another example to implement the minimum spanning tree using Kruskal's algorithm in C++. This version wraps the union-find logic in a class:

#include <iostream>
#include <vector>
#include <algorithm>

using namespace std;

struct Edge {
    int src, dest, weight;
};

// Helper class for Union-Find with path compression
class UnionFind {
public:
    vector<int> parent, rank;

    UnionFind(int size) {
        parent.resize(size);
        rank.resize(size, 0);
        for (int i = 0; i < size; ++i) {
            parent[i] = i;
        }
    }

    int find(int x) {
        if (parent[x] != x) {
            parent[x] = find(parent[x]); // Path compression
        }
        return parent[x];
    }

    void unite(int x, int y) {
        int rootX = find(x);
        int rootY = find(y);

        if (rootX != rootY) {
            if (rank[rootX] < rank[rootY]) {
                parent[rootX] = rootY;
            } else if (rank[rootX] > rank[rootY]) {
                parent[rootY] = rootX;
            } else {
                parent[rootY] = rootX;
                rank[rootX]++;
            }
        }
    }
};

bool compareEdges(const Edge &a, const Edge &b) {
    return a.weight < b.weight;
}

void kruskalMST(vector<Edge> &edges, int V) {
    sort(edges.begin(), edges.end(), compareEdges);

    UnionFind uf(V);
    vector<Edge> result;

    for (const Edge &edge : edges) {
        int rootSrc = uf.find(edge.src);
        int rootDest = uf.find(edge.dest);

        if (rootSrc != rootDest) {
            result.push_back(edge);
            uf.unite(rootSrc, rootDest);
        }
    }

    for (const Edge &edge : result) {
        cout << edge.src << " -- " << edge.dest << " == " << edge.weight << endl;
    }
}

int main() {
    int V = 4;
    vector<Edge> edges = {
        {0, 1, 10},
        {0, 2, 6},
        {0, 3, 5},
        {1, 3, 15},
        {2, 3, 4}
    };

    kruskalMST(edges, V);

    return 0;
}

Output:

2 -- 3 == 4
0 -- 3 == 5
0 -- 1 == 10

Explanation:

Sorting Edges:

The algorithm starts by sorting all the edges in the graph based on their weights in non-decreasing order.

In C++, this is typically a single call to std::sort. Sorting takes O(E log E) time for E edges and dominates the running time of Kruskal's algorithm.
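
One detail worth knowing is that std::sort is not guaranteed to be stable, so edges with equal weights may come out in any relative order. The total weight of the result is unaffected, but which of several equally cheap trees is produced can vary. As a sketch, a comparator that breaks ties on the endpoints makes the order deterministic. The names compareEdgesWithTieBreak and sortEdgesDeterministic are assumptions for this example, not part of the program above.

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

struct Edge {
    int src, dest, weight;
};

// Orders edges by weight first, then by src, then by dest.
// std::tie builds tuples of references that compare lexicographically.
bool compareEdgesWithTieBreak(const Edge &a, const Edge &b) {
    return std::tie(a.weight, a.src, a.dest) < std::tie(b.weight, b.src, b.dest);
}

// Hypothetical wrapper: sorts the edge list into a fully determined order.
void sortEdgesDeterministic(std::vector<Edge> &edges) {
    std::sort(edges.begin(), edges.end(), compareEdgesWithTieBreak);
}
```

With this comparator, two runs over the same edge list always examine the edges in the same sequence, which helps when results need to be reproducible.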

Union-Find Data Structure:

Kruskal's algorithm uses a data structure called Union-Find to efficiently detect cycles in the graph.

It keeps track of disjoint sets and allows quick checks for whether adding an edge would create a cycle.
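
As a small illustration of the cycle check on its own, the sketch below uses a hypothetical DisjointSets class. The class name, the sameSet helper, and the choice to have unite return a bool are assumptions for this example. It also swaps in union by size and an iterative find with path halving, in place of the union by rank and recursive path compression used in the program above.

```cpp
#include <numeric>
#include <utility>
#include <vector>

class DisjointSets {
public:
    explicit DisjointSets(int n) : parent(n), setSize(n, 1) {
        std::iota(parent.begin(), parent.end(), 0);  // each element is its own root
    }

    // Returns the representative of x's set, shortening the path as it goes.
    int find(int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];  // path halving
            x = parent[x];
        }
        return x;
    }

    // True if a and b are already connected. In Kruskal's algorithm this
    // means that an edge a-b would close a cycle.
    bool sameSet(int a, int b) { return find(a) == find(b); }

    // Merges the sets of a and b. Returns false if they were already in
    // the same set, so the caller can test and merge in one call.
    bool unite(int a, int b) {
        int ra = find(a), rb = find(b);
        if (ra == rb) return false;
        if (setSize[ra] < setSize[rb]) std::swap(ra, rb);
        parent[rb] = ra;  // attach the smaller tree under the larger one
        setSize[ra] += setSize[rb];
        return true;
    }

private:
    std::vector<int> parent, setSize;
};
```

On the example graph, after uniting 2 with 3 and 0 with 3, sameSet(0, 2) returns true, so the edge 0 -- 2 would form a cycle and is skipped, while sameSet(0, 1) returns false, so the edge 0 -- 1 can be added.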

Edge Selection:

Starting from the smallest edge, the algorithm iterates through the sorted edges.

For each edge, it checks if adding it to the growing MST would create a cycle by using the Union-Find data structure.

Adding Edges to MST:

If adding the edge does not create a cycle, it is added to the MST.

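For a connected graph, the MST is complete once it has V-1 edges, where V is the number of vertices in the graph. The loop in the program above does not stop early at that point. It keeps scanning the remaining edges, but each of them is rejected because all vertices already belong to one set.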

Kruskal's Algorithm vs Prim's Algorithm:

Kruskal's Algorithm

Strategy:

Greedy: Kruskal's algorithm follows a greedy strategy, selecting the smallest available edge at each step.

Edge Selection:

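Independent of Vertices: It always picks the cheapest remaining edge anywhere in the graph, regardless of where that edge lies. The intermediate result is therefore a forest of separate trees that gradually merge into one.

Data Structures:

Disjoint-Set (Union-Find): It typically uses a disjoint-set data structure to efficiently check for cycles and perform union operations.

Complexity:

Edge Sorting: It involves sorting all the edges based on their weights, which gives an overall running time of O(E log E) and can be time-consuming in dense graphs.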

Suitability:

Sparse Graphs: It generally performs well on sparse graphs where the number of edges is significantly less than the maximum possible.

Parallelization:

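Partly Parallelizable: The edge sort parallelizes well, but in the basic algorithm the scan that accepts or rejects edges is sequential, because each decision depends on the edges accepted before it.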

Prim's Algorithm:

Strategy:

Greedy: Prim's algorithm is also a greedy algorithm. At each step it selects the smallest edge that connects the tree built so far to a vertex not yet in it.

Edge Selection:

Dependent on a Starting Vertex: It selects edges based on a starting vertex and then grows the MST from that starting point.

Data Structures:

Priority Queue: It typically uses a priority queue to efficiently select the smallest edge connected to the growing MST.

Complexity:

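No Sorting: It does not require sorting of all edges up front. With a binary heap it runs in O(E log V), and with an adjacency matrix and a simple array it runs in O(V^2), making it potentially more efficient in dense graphs.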

Suitability:

Dense Graphs: It can perform well on dense graphs due to its avoidance of sorting all edges.

Parallelization:

Limited Parallelization: It is more challenging to parallelize because every step depends on the tree grown so far and on a shared priority queue.

Comparison Overview:

Edge Sorting: Kruskal's sorts all edges up front in O(E log E) time, while Prim's orders candidate edges incrementally through a priority queue in O(E log V) time with a binary heap. For dense graphs, an array-based Prim's running in O(V^2) may have an advantage.
Parallelization: Kruskal's sorting step parallelizes well, although its edge scan is sequential in the basic algorithm. Prim's can be more challenging to parallelize because every step depends on the tree grown so far.
Space Complexity: Both need O(V + E) memory. Kruskal's keeps an explicit edge list plus O(V) for the disjoint-set arrays, while Prim's keeps an adjacency structure plus the priority queue.
Starting Vertex: Kruskal's does not rely on a starting vertex. Prim's requires one, but the choice does not affect the total weight of the result. It can only change which tree is returned when the graph has several minimum spanning trees.
Applications: Both algorithms are used in various applications, such as network design, but the choice between them often depends on the specific characteristics of the graph and the problem requirements.
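
For comparison, here is a minimal sketch of Prim's algorithm built on std::priority_queue. It is the "lazy" variant, which may leave outdated entries in the queue and skips them when they are popped. The function name primTotalWeight, the adjacency-list layout, and the decision to return only the total weight (or -1 for a disconnected graph) are assumptions made for this illustration.

```cpp
#include <functional>
#include <queue>
#include <utility>
#include <vector>

struct Edge {
    int src, dest, weight;
};

// Returns the total weight of a minimum spanning tree grown from `start`,
// or -1 if not every vertex can be reached (the graph is disconnected).
long long primTotalWeight(const std::vector<Edge> &edges, int V, int start = 0) {
    typedef std::pair<int, int> WeightVertex;  // (edge weight, vertex)

    // Build an undirected adjacency list from the edge list.
    std::vector<std::vector<WeightVertex> > adj(V);
    for (const Edge &e : edges) {
        adj[e.src].push_back(WeightVertex(e.weight, e.dest));
        adj[e.dest].push_back(WeightVertex(e.weight, e.src));
    }

    std::vector<bool> inTree(V, false);

    // Min-heap: the cheapest candidate edge is always on top.
    std::priority_queue<WeightVertex, std::vector<WeightVertex>,
                        std::greater<WeightVertex> > pq;
    pq.push(WeightVertex(0, start));

    long long total = 0;
    int visited = 0;
    while (!pq.empty() && visited < V) {
        WeightVertex top = pq.top();
        pq.pop();
        int u = top.second;
        if (inTree[u]) continue;  // outdated entry: u was reached more cheaply

        inTree[u] = true;
        total += top.first;
        ++visited;

        // Offer every edge leaving the tree through u as a candidate.
        for (const WeightVertex &next : adj[u]) {
            if (!inTree[next.second]) pq.push(next);
        }
    }
    return visited == V ? total : -1;
}
```

On the example graph used earlier, this returns 19 whichever vertex it starts from, the same total as the 4 + 5 + 10 found by Kruskal's algorithm. Note that no global sort is needed, but the graph must be available in adjacency form.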

Advantages of Kruskal's Algorithm:

Some main advantages of Kruskal's algorithm are as follows:

Simplicity: Kruskal's algorithm is relatively simple to understand and implement. Its straightforward logic and ease of implementation make it accessible for both learning and practical use.
Greedy Approach: The algorithm follows a greedy approach by selecting the smallest edge at each step. This local optimization strategy leads to a globally optimal solution, ensuring that the overall minimum spanning tree is efficiently identified.
Efficiency: Kruskal's algorithm is efficient when implemented with a disjoint-set (union-find) data structure. With union by rank and path compression, each find or union takes nearly constant amortized time, so the overall O(E log E) running time is dominated by the sort.
Optimality: The algorithm guarantees the optimality of the solution. The minimum spanning tree produced by Kruskal's algorithm always has the smallest total edge weights, making it suitable for scenarios where minimizing the overall cost or weight is crucial.
Distributed Computing: Kruskal's algorithm can be adapted to distributed computing environments. It works on a forest of disconnected components that are merged over time, which fits scenarios where data and computation are spread across multiple nodes.
Versatility: Kruskal's algorithm is not limited to specific types of graphs. It can be applied to both dense and sparse graphs, making it versatile for a wide range of applications.
No Initial Assumptions: Unlike some other algorithms, Kruskal's algorithm does not require any assumptions about the starting point. It starts with the smallest edge and builds the spanning tree incrementally, ensuring that the choice of a specific vertex as the starting point does not influence the solution.
Parallelization: The sorting phase, which dominates the running time, parallelizes well and can improve performance on large edge lists. The edge-selection scan itself is sequential in the basic algorithm.

Kruskal's algorithm is a powerful tool for efficiently solving the minimum spanning tree problem in graphs, providing a balance between simplicity, optimality, and versatility.

Applications of Kruskal's Algorithm:

Some main applications of Kruskal's algorithm are as follows:

Network Design: Kruskal's algorithm is widely used in designing communication networks, such as computer networks, telecommunication networks, and transportation networks. It helps in connecting all nodes at the lowest total link cost.
Circuit Design: In electronic circuit design, Kruskal's algorithm can be applied to minimize the total wire length while ensuring that all components are connected. It is particularly important in the design of integrated circuits and printed circuit boards.
Urban Planning: Kruskal's algorithm can be utilized in urban planning to optimize the layout of roads, utilities, and infrastructure. It assists in creating a network that efficiently connects different parts of a city or region.
Resource Management: In resource management and allocation, Kruskal's algorithm helps optimize the distribution of resources by identifying the most efficient pathways or connections. It can be applied in scenarios such as water distribution, energy transmission, or pipeline networks.
Wireless Sensor Networks: In wireless sensor networks, where sensors need to be connected with the least amount of communication overhead, Kruskal's algorithm can be used to establish efficient communication links while minimizing overall energy consumption.
Image Segmentation: In image processing, Kruskal's algorithm can be employed for image segmentation. It helps identify the most significant edges or connections in an image, contributing to tasks such as object recognition and computer vision.
Robotics: In robotics, particularly in the coordination of multiple agents, Kruskal's algorithm can be used to build a low-cost backbone of connections between waypoints or robots. Note that an MST does not give the shortest route between two given points.
Biology and DNA Sequencing: In computational biology, minimum spanning trees can be applied to analyze biological data, for example to cluster related sequences or to sketch evolutionary relationships among species.
VLSI Design (Very Large-Scale Integration): Kruskal's algorithm is useful in VLSI design for optimizing the layout of components on a chip. It helps minimize the total wire length, reducing delays and improving overall performance.
Game Design: Kruskal's algorithm can be used in game development to generate mazes (by running it on randomly weighted grid edges) and road networks. It assists in generating game environments in which every location is reachable without redundant connections.

These applications highlight the versatility of Kruskal's algorithm in solving optimization problems related to connectivity and resource utilization across various domains.

Disadvantages of Kruskal's Algorithm:

Some main disadvantages of Kruskal's algorithm are as follows:

Inefficiency with Dense Graphs: Kruskal's algorithm can be inefficient when dealing with dense graphs where the number of edges is close to the maximum possible. With E close to V^2, sorting costs O(V^2 log V), whereas an array-based Prim's algorithm runs in O(V^2).
Space Complexity: The algorithm needs the complete edge list in memory so that it can be sorted, in addition to O(V) for the disjoint-set arrays. For dense graphs, the edge list alone takes O(V^2) space, which can be a drawback where memory usage is a concern.
Disconnected Graphs Need Care: A spanning tree exists only for a connected graph. On a disconnected graph, Kruskal's procedure yields a minimum spanning forest with one tree per component, so an implementation that loops until it has collected V-1 edges must also stop when the edges run out.
Not Suitable for Dynamic Graphs: The algorithm is designed for static graphs and does not naturally adapt to changes in the graph structure. If the graph is dynamic and edges are added or removed frequently, Kruskal's algorithm may not be the most efficient choice.
Equal Weight Edges: If the graph contains edges with equal weights, it may have several minimum spanning trees, and Kruskal's algorithm may return different ones for different implementations or sorting methods. All of them have the same total weight, but the lack of a unique result can be a disadvantage when reproducible output is needed.
Single Weight per Edge: Kruskal's algorithm considers only one numeric weight per edge. In practical applications where the cost of a connection depends on several factors, those factors must first be combined into a single weight.
May Not Match the Real Objective: Kruskal's algorithm guarantees the minimum total edge weight, but that is not always what an application needs. An MST does not minimize the distance between any particular pair of vertices, so problems about shortest routes call for a shortest-path algorithm instead.
Limited Parallelization: Although the sort can be parallelized, the edge-selection loop processes edges one at a time in weight order, so the basic algorithm does not parallelize end to end. This can limit its performance in certain high-performance computing environments.

Despite these disadvantages, Kruskal's algorithm remains a powerful and widely used method for finding minimum spanning trees in various applications. Understanding its limitations is important when choosing an algorithm for a specific problem or application.
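
The disconnected-graph limitation listed above can be handled with little extra code. The sketch below scans the edge list to the end when necessary, so on a disconnected graph it returns a minimum spanning forest, and the caller can detect that case from the component count. The names kruskalForest, ForestResult, and findRoot are assumptions for this illustration, and the union-find is a plain parent array with path halving and no rank.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

struct Edge {
    int src, dest, weight;
};

// Hypothetical result type for this sketch.
struct ForestResult {
    std::vector<Edge> edges;    // accepted edges, in order of acceptance
    long long totalWeight = 0;  // sum of the accepted weights
    int components = 0;         // number of trees left (1 means connected)
};

// Representative of x's set, with path halving.
int findRoot(std::vector<int> &parent, int x) {
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}

ForestResult kruskalForest(std::vector<Edge> edges, int V) {
    std::sort(edges.begin(), edges.end(),
              [](const Edge &a, const Edge &b) { return a.weight < b.weight; });

    std::vector<int> parent(V);
    std::iota(parent.begin(), parent.end(), 0);

    ForestResult result;
    result.components = V;  // every vertex starts as its own tree

    for (const Edge &e : edges) {
        int a = findRoot(parent, e.src);
        int b = findRoot(parent, e.dest);
        if (a == b) continue;  // would close a cycle

        parent[a] = b;  // merge the two trees
        result.edges.push_back(e);
        result.totalWeight += e.weight;
        --result.components;

        if (result.components == 1) break;  // spanning tree complete
    }
    return result;
}
```

If components is 1 after the call, the accepted edges form a minimum spanning tree. Otherwise the graph is disconnected and the result is one minimum spanning tree per component.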

Conclusion:

In summary, Kruskal's algorithm solves the Minimum Spanning Tree (MST) problem with a greedy strategy. Its simplicity and effectiveness stem from sorting edges by weight and using the Union-Find data structure for cycle detection. By repeatedly selecting the smallest edge that does not form a cycle, it constructs an MST that connects all vertices with minimal total edge weight. This makes it applicable in diverse fields such as network design, urban planning, and circuit layout. Its O(E log E) running time is dominated by the edge sort, which makes it a good fit for sparse graphs, and it remains an essential tool for optimizing connectivity in various scenarios.
