Merge Overlapping Intervals in C++

Merging overlapping intervals is a common computational problem that arises in various domains, including computer science, mathematics, and real-world applications like scheduling, calendar management, and data analysis. The goal is to take a collection of intervals, where each interval represents a range of values, and merge any overlapping intervals into a single consolidated interval. This process simplifies the representation of data and helps with tasks such as finding available time slots, optimizing schedules, or reducing the amount of data to process.

Interval Representation: An interval is typically represented as a pair of values [start, end], where start is the beginning of the interval, and end is the endpoint. It signifies a closed interval, meaning both the start and end points are included in the interval.
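
As a minimal sketch of this representation, a closed interval can be stored as a std::pair<int, int>, and two closed intervals overlap exactly when each one starts no later than the other ends. The alias Interval and the helper overlaps are names chosen here for illustration.

```cpp
#include <cassert>
#include <utility>

// Illustrative alias: first is the start, second is the end (both inclusive).
using Interval = std::pair<int, int>;

// Returns true when the closed intervals a and b share at least one point.
// Because the endpoints are included, [1, 3] and [3, 5] count as overlapping.
bool overlaps(const Interval& a, const Interval& b) {
    assert(a.first <= a.second && b.first <= b.second);  // start must not exceed end
    return a.first <= b.second && b.first <= a.second;
}
```

With this definition, overlaps({1, 3}, {3, 5}) is true, while overlaps({1, 3}, {4, 5}) is false.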

Sorting: To merge overlapping intervals efficiently, first sort them by their starting values. Sorting ensures that intervals with the earliest start times are considered first, so each interval only needs to be compared with the most recently merged one. In C++, std::sort from <algorithm> is the usual choice, and it runs in O(n log n) time.
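
As a small illustration of this step, the sketch below sorts a vector of intervals by start value with std::sort and a comparator. The helper name sortByStart is an assumption made for this example. Calling std::sort without a comparator would also work for std::pair, since pairs compare by first and then by second.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Illustrative helper: sorts intervals in ascending order of their start value.
void sortByStart(std::vector<std::pair<int, int>>& intervals) {
    auto byStart = [](const std::pair<int, int>& a, const std::pair<int, int>& b) {
        return a.first < b.first;  // compare start values only
    };
    std::sort(intervals.begin(), intervals.end(), byStart);
    // Sanity check: the vector is now ordered by start value.
    assert(std::is_sorted(intervals.begin(), intervals.end(), byStart));
}
```

The relative order of intervals that share a start value is unspecified here, which does not matter for merging because such intervals always overlap each other.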

Merging Process:

Initialize an empty result set to store the merged intervals.
Iterate through the sorted intervals one by one, starting from the first interval.
For each interval, check if it overlaps with the previously merged interval (if any) by comparing its start value with the end value of the last merged interval.
If there is an overlap (i.e., current_start <= last_merged_end), merge the current interval with the last merged interval. To do this, update the end value of the last merged interval to the maximum of its current end value and the end value of the current interval.
If there is no overlap, add the current interval to the result set as a new merged interval.
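
The overlap check and the merge in the last two steps can be sketched as a single helper. The name addOrMerge is hypothetical, and the sketch assumes the intervals are passed in ascending order of start value.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Illustrative helper for one step of the loop: folds `current` into `result`,
// assuming intervals arrive in ascending order of start value.
void addOrMerge(std::vector<std::pair<int, int>>& result, const std::pair<int, int>& current) {
    assert(result.empty() || result.back().first <= current.first);  // sorted input
    if (!result.empty() && current.first <= result.back().second) {
        // Overlap: keep the larger end. std::max matters when `current` lies
        // entirely inside the last merged interval, e.g. [1, 10] then [2, 3].
        result.back().second = std::max(result.back().second, current.second);
    } else {
        // No overlap (or empty result): start a new merged interval.
        result.push_back(current);
    }
}
```

Feeding [1, 10], [2, 3], and [12, 13] through this helper in that order leaves [1, 10] and [12, 13] in the result. Simply overwriting the end instead of taking the maximum would shrink [1, 10] to [1, 3].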

Result: After processing all intervals, the result set will contain a set of non-overlapping intervals that cover the same range as the original intervals but with overlapping intervals merged into one.
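
One way to sanity-check such a result is sketched below, assuming the merged intervals are returned in ascending order of start value, as the sorting-based programs in this article do. The function name isMergedResult is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Illustrative check: true when the intervals are in ascending order and no
// two neighbours overlap. The intervals are closed, so sharing an endpoint
// counts as an overlap.
bool isMergedResult(const std::vector<std::pair<int, int>>& intervals) {
    for (std::size_t i = 0; i < intervals.size(); ++i) {
        assert(intervals[i].first <= intervals[i].second);  // well-formed interval
        if (i > 0 && intervals[i].first <= intervals[i - 1].second) {
            return false;  // starts before the previous interval has ended
        }
    }
    return true;
}
```

Checking only neighbouring intervals is enough here: if each interval starts after the previous one ends, it also starts after every earlier interval ends.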

This process simplifies the representation of intervals, making it easier to work with and analyze data. It's important to note that the efficiency of this operation depends on the sorting step, which typically has a time complexity of O(n log n) for n intervals. Once sorted, the merging step is linear, making the overall time complexity of the merging algorithm O(n log n) due to the sorting step.

Program-1: Brute-Force Approach (Quadratic Time)

This approach involves comparing each interval with every other interval to check for overlaps. For each interval, iterate through all other intervals and merge overlapping ones. It has a time complexity of O(n^2) but is straightforward to implement.

#include <algorithm>
#include <cstddef>
#include <iostream>
#include <utility>
#include <vector>
std::vector<std::pair<int, int>> mergeOverlappingIntervals(const std::vector<std::pair<int, int>>& intervals) {
    std::vector<std::pair<int, int>> mergedIntervals;
    // used[i] becomes true once interval i is part of a merged interval,
    // so that no interval is added to the result twice.
    std::vector<bool> used(intervals.size(), false);
    for (std::size_t i = 0; i < intervals.size(); ++i) {
        if (used[i]) {
            continue;
        }
        used[i] = true;
        std::pair<int, int> merged_interval = intervals[i];
        // Compare against every other interval. Repeat the scan while the
        // merged interval keeps growing, because a wider interval can reach
        // intervals that did not overlap it on an earlier scan.
        bool merged = true;
        while (merged) {
            merged = false;
            for (std::size_t j = 0; j < intervals.size(); ++j) {
                if (used[j]) {
                    continue;
                }
                const auto& other_interval = intervals[j];
                if (merged_interval.first <= other_interval.second && merged_interval.second >= other_interval.first) {
                    // There is an overlap, merge the intervals.
                    merged_interval.first = std::min(merged_interval.first, other_interval.first);
                    merged_interval.second = std::max(merged_interval.second, other_interval.second);
                    used[j] = true;
                    merged = true;
                }
            }
        }
        mergedIntervals.push_back(merged_interval);
    }
    return mergedIntervals;
}
int main() {
    std::vector<std::pair<int, int>> intervals = {{1, 3}, {2, 6}, {8, 10}, {15, 18}};
    std::vector<std::pair<int, int>> merged = mergeOverlappingIntervals(intervals);
    for (const auto& interval : merged) {
        std::cout << "[" << interval.first << ", " << interval.second << "] ";
    }
    return 0;
}

Output:

[1, 6] [8, 10] [15, 18]

Explanation:

Function Definition:

The code defines a function named mergeOverlappingIntervals.
It takes a vector of pairs of integers, representing intervals, as input.
The function returns a vector of merged intervals.

Algorithm Overview:

The function uses a brute-force approach to merge overlapping intervals.
It iterates through each interval in the input vector and checks for overlaps with all other intervals.
If an overlap is found, it merges the intervals.
The merged intervals are stored in a new vector and returned.

Initialization:

A vector called mergedIntervals is created to store the merged intervals.

Main Loop:

The code iterates through each interval in the input vector one by one.
It uses a nested loop to compare the current interval with all other intervals.

Overlap Detection:

It checks if there is an overlap between the current interval and the other interval using start and end values.

Merging Overlapping Intervals:

If an overlap is detected, the code updates merged_interval, a working copy of the current interval, so that it covers both intervals.
The merged interval has its start set to the minimum of the two intervals' starts, and its end set to the maximum of the two intervals' ends.

Adding Merged Interval to Result:

After the current interval has been checked against the other intervals, the resulting interval is added to the mergedIntervals vector.
The code ensures that merged intervals are not duplicated in the result.

Return Result:

The function returns the mergedIntervals vector containing the merged intervals.

Complexity Analysis

Time Complexity:

The main loop iterates through each interval in the input vector, resulting in O(n) iterations, where n is the number of intervals.
For each interval in the main loop, there is a nested loop that iterates through all other intervals. In the worst case, this nested loop compares the current interval with every other interval, resulting in O(n) comparisons for each interval in the main loop.
Therefore, the overall time complexity is O(n^2), where n is the number of intervals.

Space Complexity:

The space complexity is primarily determined by the storage of the merged intervals in the mergedIntervals vector.
In the worst case, if no intervals can be merged, the mergedIntervals vector will contain the same number of intervals as the input vector, resulting in a space complexity of O(n).
Additionally, there are some auxiliary variables used for comparisons and temporary storage, but their space usage is relatively small and doesn't significantly impact the overall space complexity.

In summary, the provided brute-force approach has a time complexity of O(n^2) and a space complexity of O(n), where n is the number of intervals in the input vector. This approach is less efficient compared to sorting-based approaches, which can achieve a time complexity of O(n log n) with a space complexity of O(n).

Program-2: Sorting and Merging (Linearithmic Time)

Step 1: Sorting the Intervals

In this method, the first step is to sort the input intervals based on their start values.
Sorting is essential because it helps group overlapping intervals together, making it easier to identify and merge them.
Any comparison sort such as quicksort or mergesort can be used here, with a typical time complexity of O(n log n). The example in this section uses std::sort.

Step 2: Merging Overlapping Intervals

After sorting the intervals, you iterate through them in the sorted order.
You maintain a variable, often called mergedIntervals or something similar, to store the merged intervals.

Iterative Merging Process:

Start with the first interval (the one with the smallest start value) and consider it the "current interval".
Move to the next interval and check whether it overlaps with the current interval.
To do this, compare the start of the next interval with the end of the current interval.
If the start of the next interval is less than or equal to the end of the current interval, the two intervals overlap.
If they overlap, merge the two intervals by updating the end of the current interval to be the maximum of its current end and the end of the next interval.
If there's no overlap, add the current interval to the result (mergedIntervals) because you've found the end of this merged interval, and make the next interval the new current interval.
Continue this process, moving through the sorted intervals one by one and merging overlapping intervals when encountered. When the input is exhausted, add the final current interval to the result.

Example:

#include <iostream>
#include <vector>
#include <algorithm>
std::vector<std::pair<int, int>> mergeOverlappingIntervals(const std::vector<std::pair<int, int>>& intervals) {
    if (intervals.empty()) {
        return {};  // Return an empty vector if there are no intervals.
    }
    // Sort the intervals based on their start values.
    std::vector<std::pair<int, int>> sortedIntervals = intervals;
    std::sort(sortedIntervals.begin(), sortedIntervals.end());
    std::vector<std::pair<int, int>> mergedIntervals;
    mergedIntervals.push_back(sortedIntervals[0]);
    for (auto it = sortedIntervals.begin() + 1; it != sortedIntervals.end(); ++it) {
        const std::pair<int, int>& currentInterval = *it;
        std::pair<int, int>& lastMergedInterval = mergedIntervals.back();
        if (currentInterval.first <= lastMergedInterval.second) {
            // Merge the overlapping intervals.
            lastMergedInterval.second = std::max(lastMergedInterval.second, currentInterval.second);
        } else {
            // No overlap, add the current interval to the result.
            mergedIntervals.push_back(currentInterval);
        }
    }
    return mergedIntervals;
}
int main() {
    std::vector<std::pair<int, int>> intervals = {{1, 3}, {2, 6}, {8, 10}, {15, 18}};
    std::vector<std::pair<int, int>> merged = mergeOverlappingIntervals(intervals);
    for (const auto& interval : merged) {
        std::cout << "[" << interval.first << ", " << interval.second << "] ";
    }
    return 0;
}

Output:

[1, 6] [8, 10] [15, 18]

Explanation:

In this example, a function named mergeOverlappingIntervals is defined, which takes a vector of intervals as input.
The code checks if the input vector of intervals is empty. If it's empty, the function returns an empty vector, as there are no intervals to merge.
The intervals are sorted based on their start values using std::sort. Sorting ensures that overlapping intervals are adjacent, simplifying the merging process.
A new vector called mergedIntervals is created to store the merged intervals.
The first interval (the one with the smallest start value) is added to mergedIntervals as the initial merged interval.
After that, the code iterates through the sorted intervals starting from the second interval.
It checks if there is an overlap with the last merged interval (the one at the end of the mergedIntervals vector).
If there is an overlap, the code merges the current interval with the last merged interval by extending the end of the last merged interval if necessary.
If there is no overlap, the current interval is added to the mergedIntervals vector as a new merged interval.
The mergedIntervals vector now contains the merged intervals with no overlaps. These merged intervals represent the non-overlapping ranges obtained from the input intervals.
The function returns the mergedIntervals vector as the result.

Complexity Analysis

Time Complexity:

The time complexity of this approach is mainly determined by the sorting step, which has a time complexity of O(n log n), where n is the number of intervals.
The merging step, which follows, is linear in time complexity because it processes each interval once.
Overall, the time complexity of this method is O(n log n) due to the sorting step, and it is considered efficient for handling large datasets.

Space Complexity:

The space complexity primarily depends on the space required to store the sorted intervals and the merged intervals.
The example sorts a copy of the input (sortedIntervals), which takes O(n) additional space. Sorting the original vector in place would avoid this copy, at the cost of modifying the caller's data.
The merged intervals are stored in a separate vector or data structure, which has a space complexity of O(n), as it may contain all intervals if there are no overlaps.
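
If the caller's vector may be modified, the sorted copy and the separate result vector can both be avoided. The following is a minimal sketch of such an in-place variant, not the method used in the program above. The name mergeInPlace is chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Illustrative in-place variant: sorts and compacts `intervals` itself, so no
// second vector is allocated. The caller's data is overwritten.
void mergeInPlace(std::vector<std::pair<int, int>>& intervals) {
    if (intervals.empty()) {
        return;
    }
    std::sort(intervals.begin(), intervals.end());
    std::size_t last = 0;  // index of the last merged interval written so far
    for (std::size_t i = 1; i < intervals.size(); ++i) {
        if (intervals[i].first <= intervals[last].second) {
            // Overlap: extend the last merged interval.
            intervals[last].second = std::max(intervals[last].second, intervals[i].second);
        } else {
            // No overlap: the next merged interval goes into the next free slot.
            ++last;
            intervals[last] = intervals[i];
        }
    }
    assert(last < intervals.size());
    intervals.resize(last + 1);  // drop the slots that were merged away
}
```

Apart from whatever working memory std::sort needs internally, this variant uses only a constant amount of extra space.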

In summary, the "Sorting and Merging" method efficiently merges overlapping intervals with a time complexity of O(n log n) due to the sorting step. It is suitable for handling large datasets and is a standard approach for solving this problem in practice.

Program-3: Stack-based Approach (Linearithmic Time)

Sort the intervals based on their start values.
Initialize an empty stack to store merged intervals.
Iterate through the sorted intervals, pushing intervals onto the stack if they don't overlap with the top interval, and merging them if they do.

Example:

#include <iostream>
#include <vector>
#include <stack>
#include <algorithm>
std::vector<std::pair<int, int>> mergeOverlappingIntervals(const std::vector<std::pair<int, int>>& intervals) {
    // Check if the input intervals are empty.
    if (intervals.empty()) {
        return {};
    }
    // Sort the intervals based on their start values.
    std::vector<std::pair<int, int>> sortedIntervals = intervals;
    std::sort(sortedIntervals.begin(), sortedIntervals.end());
    // Create a stack to hold merged intervals.
    std::stack<std::pair<int, int>> intervalStack;
    intervalStack.push(sortedIntervals[0]);
    // Iterate through the sorted intervals and merge overlapping ones.
    for (auto it = sortedIntervals.begin() + 1; it != sortedIntervals.end(); ++it) {
        const std::pair<int, int>& currentInterval = *it;
        std::pair<int, int> topInterval = intervalStack.top();
        if (currentInterval.first <= topInterval.second) {
            // Overlapping intervals, merge them.
            intervalStack.pop();
            topInterval.second = std::max(topInterval.second, currentInterval.second);
            intervalStack.push(topInterval);
        } else {
            // Non-overlapping interval, push it onto the stack.
            intervalStack.push(currentInterval);
        }
    }
    // Extract merged intervals from the stack.
    std::vector<std::pair<int, int>> mergedIntervals;
    while (!intervalStack.empty()) {
        mergedIntervals.push_back(intervalStack.top());
        intervalStack.pop();
    }
    // Reverse the order to get intervals in the correct order.
    std::reverse(mergedIntervals.begin(), mergedIntervals.end());
    return mergedIntervals;
}
int main() {
    std::vector<std::pair<int, int>> intervals = {{1, 3}, {2, 6}, {8, 10}, {15, 18}};
    std::vector<std::pair<int, int>> merged = mergeOverlappingIntervals(intervals);
    for (const auto& interval : merged) {
        std::cout << "[" << interval.first << ", " << interval.second << "] ";
    }
    return 0;
}

Output:

[1, 6] [8, 10] [15, 18]

Explanation:

A function named mergeOverlappingIntervals is defined to take a vector of intervals as input.
The code checks if the input vector of intervals is empty. If it's empty, the function returns an empty vector, indicating that there are no intervals to merge.
The intervals are sorted based on their start values using std::sort. Sorting ensures that overlapping intervals are adjacent, making it easier to merge them.
A stack called intervalStack is created to hold intervals during the merging process.
The code iterates through the sorted intervals starting from the second interval.
It checks if there is an overlap with the interval at the top of the stack (the last merged interval).
If there is an overlap, the code merges the current interval with the interval at the top of the stack by extending the end of the last merged interval if necessary.
If there is no overlap, the current interval is pushed onto the stack as a new interval to be merged.
After processing all intervals, the merged intervals are extracted from the stack and stored in a vector called mergedIntervals.
The code reverses the order of intervals in the mergedIntervals vector to obtain the merged intervals in the correct order (from left to right).
The function returns the mergedIntervals vector, which now contains the merged non-overlapping intervals.

In summary, this stack-based approach efficiently merges overlapping intervals by iterating through the sorted intervals and using a stack to manage the merging process. The result is a vector of merged intervals with no overlaps.

Complexity Analysis

Time Complexity:

Sorting the intervals initially takes O(n log n) time, where 'n' is the number of intervals.
The subsequent iteration through the sorted intervals is linear, i.e., O(n).
Each interval triggers at most one pop and one push during the loop, and the final extraction pops each remaining interval once, so all stack operations together take O(n) time.
Overall, the time complexity is dominated by the sorting step, resulting in a time complexity of O(n log n).

Space Complexity:

The space complexity primarily depends on the usage of additional data structures:

A sorted copy of the intervals is created, which requires O(n) additional space.
A stack is used to hold intervals during merging, which may store up to n intervals in the worst case.
The mergedIntervals vector stores the merged intervals, which can also have a maximum size of O(n).
Therefore, the space complexity is O(n) due to the additional data structures.
The stack-based approach for merging overlapping intervals has a time complexity of O(n log n) and a space complexity of O(n), making it an efficient method for handling large datasets of intervals.

Program-4: Binary Search Tree Approach

One alternative approach to merging overlapping intervals in C++ is to use a Binary Search Tree (BST) data structure. Here's a high-level explanation of this approach:

Define a Structure for Interval Nodes: Create a structure to represent interval nodes, which contain the start and end values of an interval and a reference to the left and right child nodes in the BST.

Insert Intervals into the BST: Insert each interval into the BST based on its start value. When inserting, you need to handle cases where the new interval may overlap with existing nodes. In such cases, update the end value of the overlapping interval node to the maximum of the new interval's end value and the existing node's end value.

Traverse the BST: In-order traversal of the BST will give you the intervals in sorted order. As you traverse the tree, you can accumulate merged intervals, updating them as needed when overlaps are encountered.

Collect Merged Intervals: During the in-order traversal, you can maintain a stack to keep track of the merged intervals. As you visit each interval node, compare it with the top of the stack to identify overlaps. If there's an overlap, merge the intervals; otherwise, push the current interval onto the stack.

Example:

#include <algorithm>
#include <iostream>
#include <vector>
struct Interval {
    int start;
    int end;
    Interval(int s, int e) : start(s), end(e) {}
};
struct IntervalNode {
    Interval interval;
    IntervalNode* left;
    IntervalNode* right;
    IntervalNode(const Interval& i) : interval(i), left(nullptr), right(nullptr) {}
};
void insertInterval(IntervalNode*& root, const Interval& newInterval) {
    if (!root) {
        root = new IntervalNode(newInterval);
        return;
    }
    if (newInterval.start < root->interval.start) {
        // Smaller start value: go left so the tree stays ordered by start.
        insertInterval(root->left, newInterval);
    } else if (newInterval.start > root->interval.end) {
        insertInterval(root->right, newInterval);
    } else {
        // The new interval starts inside this node's interval, so only the
        // end can grow. The start (the BST key) never changes.
        root->interval.end = std::max(root->interval.end, newInterval.end);
    }
}
void inOrderTraversal(IntervalNode* root, std::vector<Interval>& mergedIntervals) {
    if (!root) {
        return;
    }
    inOrderTraversal(root->left, mergedIntervals);
    // Nodes arrive in ascending order of start, so an overlap can only be
    // with the last interval collected so far.
    if (!mergedIntervals.empty() && root->interval.start <= mergedIntervals.back().end) {
        mergedIntervals.back().end = std::max(mergedIntervals.back().end, root->interval.end);
    } else {
        mergedIntervals.push_back(root->interval);
    }
    inOrderTraversal(root->right, mergedIntervals);
}
void deleteTree(IntervalNode* root) {
    if (!root) {
        return;
    }
    deleteTree(root->left);
    deleteTree(root->right);
    delete root;
}
std::vector<Interval> mergeOverlappingIntervals(const std::vector<Interval>& intervals) {
    IntervalNode* root = nullptr;
    for (const Interval& interval : intervals) {
        insertInterval(root, interval);
    }
    std::vector<Interval> mergedIntervals;
    inOrderTraversal(root, mergedIntervals);
    deleteTree(root);  // Release the nodes allocated with new.
    return mergedIntervals;
}
int main() {
    std::vector<Interval> intervals = {Interval(1, 3), Interval(2, 6), Interval(8, 10), Interval(15, 18)};
    std::vector<Interval> merged = mergeOverlappingIntervals(intervals);
    for (const Interval& interval : merged) {
        std::cout << "[" << interval.start << ", " << interval.end << "] ";
    }
    return 0;
}

Output:

[1, 6] [8, 10] [15, 18]

Explanation:

In this example, the code begins by including the standard headers it relies on, such as iostream for input/output and vector for storing the intervals.

It defines two custom data structures:

Interval: Represents an interval with a start and end value.

IntervalNode: Represents a node in a Binary Search Tree (BST) that holds an interval, as well as left and right child nodes.

The insertInterval function is responsible for inserting intervals into the BST while ensuring that overlapping intervals are merged. If a new interval overlaps with an existing interval in the tree, it updates the existing interval's boundaries to encompass both intervals.
The inOrderTraversal function performs an in-order traversal of the BST, which is essential for collecting merged intervals. It appends intervals to the mergedIntervals vector in sorted order.

The mergeOverlappingIntervals function orchestrates the entire process:

It initializes an empty BST (root).
It iterates through the input intervals, inserting them into the BST.
It calls the inOrderTraversal function to populate the mergedIntervals vector with merged intervals.
It returns the merged intervals as a vector.
In the main function, sample intervals are provided, and the code prints the merged intervals to the console.
The program concludes by returning 0 to indicate successful execution.
This approach uses a Binary Search Tree to efficiently merge overlapping intervals, resulting in a vector of non-overlapping intervals. The core idea is to maintain intervals in sorted order using the BST, which simplifies the merging process and produces the desired merged intervals.

Complexity Analysis:

Time Complexity:

The time complexity for inserting n intervals into a BST depends on the shape of the tree. The tree in this example is not self-balancing: while it stays reasonably balanced, the insertions take O(n log n) time in total, but when it degenerates into a linear chain (for example, when the intervals arrive already sorted and do not overlap), they take O(n^2) time.
The in-order traversal of a BST takes O(n) time since it processes each node once.
Overall, the time complexity of the code is dominated by the BST insertion step, which is O(n log n) on average and O(n^2) in the worst case.
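
If the worst case matters, one option is to let the standard library maintain a balanced tree. The sketch below assumes std::map, whose insertions are guaranteed to take logarithmic time, keyed by start value. The function name mergeWithMap and the variable names are illustrative, and this is a swapped-in alternative rather than the hand-written BST above.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Sketch: std::map keeps its keys ordered and guarantees logarithmic
// insertion, so building it costs O(n log n) even for already-sorted input.
std::vector<std::pair<int, int>> mergeWithMap(const std::vector<std::pair<int, int>>& intervals) {
    std::map<int, int> endByStart;  // start -> largest end seen for that start
    for (const auto& interval : intervals) {
        assert(interval.first <= interval.second);  // well-formed interval
        auto inserted = endByStart.emplace(interval.first, interval.second);
        if (!inserted.second) {
            // This start value is already present: keep the larger end.
            inserted.first->second = std::max(inserted.first->second, interval.second);
        }
    }
    std::vector<std::pair<int, int>> merged;
    for (const auto& entry : endByStart) {  // visited in ascending order of start
        if (!merged.empty() && entry.first <= merged.back().second) {
            merged.back().second = std::max(merged.back().second, entry.second);
        } else {
            merged.emplace_back(entry.first, entry.second);
        }
    }
    return merged;
}
```

The map plays the role of the tree and the final loop plays the role of the in-order traversal, so the overall structure stays the same while the O(n^2) case disappears.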

Space Complexity:

The space complexity for storing the intervals in the BST is O(n) because, in the worst case, the BST may contain 'n' nodes (one for each interval).
The mergedIntervals vector stores the merged intervals, which can also have a maximum size of 'n'.
The recursive insertion and traversal use call-stack space proportional to the height of the tree, which is at most n, so this does not change the overall bound.
The overall space complexity of the code is O(n) due to the storage of intervals in the BST and the mergedIntervals vector.

Benefits of Merging Overlapping Intervals:

Merging overlapping intervals has several benefits. The main ones are as follows:

Data Reduction: Merging overlapping intervals reduces the amount of data to manage. It is particularly useful in scenarios where you have a large dataset of intervals, such as scheduling, where the merged intervals represent consolidated time slots, saving memory and computational resources.

Improved Visualization: Merged intervals simplify data visualization. Instead of dealing with numerous individual intervals, you can represent the data with a smaller set of merged intervals, making it easier to interpret and display in graphs or charts.

Simplified Analysis: Merged intervals simplify data analysis tasks. When intervals are merged, you can focus on the broader time periods rather than analyzing individual events or intervals. This simplification can lead to more efficient and straightforward analysis.

Drawbacks of Merging Overlapping Intervals:

Merging overlapping intervals also has several drawbacks. The main ones are as follows:

Loss of Precision: Merging intervals may lead to a loss of precision. When intervals are merged, you lose information about the boundaries of the original intervals and about which events fall within the merged range. This can be a limitation in applications where precise timing matters.

Complexity: Implementing interval merging algorithms can be complex, especially when dealing with many intervals or when additional constraints need to be considered. The sorting and merging process may introduce computational overhead.

Application-Specific: The benefits and limitations of interval merging are often application-specific. What works well in one context may not work as effectively in another. Understanding the specific requirements of your problem is crucial for choosing the right approach.
