Merge Sort on Singly Linked Lists

This article explains how to implement merge sort on singly linked lists: finding the middle node, recursively sorting the left and right halves, and merging the sorted sublists. It also analyzes time and space complexity, and is useful for engineers working with linked lists.

Linked lists allow efficient insertion and deletion but can be tricky to sort. Merge sort uses a divide-and-conquer technique to split the list into sublists, sort the sublists, and merge them back in order.


The Merge Sort

Merge sort is an algorithm that uses a divide-and-conquer approach. It solves a problem by breaking it down into subproblems, solving them individually, and then combining the results. When it comes to sorting a linked list, merge sort follows the steps below:

Algorithm

To divide the linked list into two parts, you can follow these steps:

  1. Start with two pointers, slow and fast, both pointing to the linked list's head node.
  2. Move the slow pointer one node at a time and the fast pointer two nodes at a time.
  3. When the fast pointer reaches the end of the linked list, the slow pointer will be at the middle node (the last node of the first half), which is where the list is split (see the sketch below).
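
For instance, a small helper along the following lines locates that split point (a minimal sketch; the names Node and find_middle are illustrative, and the list is assumed to have at least one node):

class Node:
    # A singly linked list node holding a value and a link to the next node.
    def __init__(self, data):
        self.data = data
        self.next = None

def find_middle(head):
    # slow advances one node per step while fast advances two. The loop stops
    # while fast still has at least one node ahead of it, so slow ends up on
    # the last node of the first half and the list can be split after slow.
    slow, fast = head, head
    while fast.next is not None and fast.next.next is not None:
        slow = slow.next
        fast = fast.next.next
    return slow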

Next, recursively sort each sublist in merge sort fashion:

  1. Apply merge sort recursively to both the left and right halves of the linked list (see the skeleton after this list).
  2. The base cases for recursion are when there are either 0 or 1 nodes in a sublist, which means they are already sorted.
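
Putting the two ideas together, the recursive routine itself is only a few lines. This sketch reuses the illustrative find_middle helper above and assumes a merge helper like the one sketched after the merging steps below:

def merge_sort(head):
    # Base case: an empty or single-node list is already sorted.
    if head is None or head.next is None:
        return head
    middle = find_middle(head)      # last node of the first half
    right = middle.next
    middle.next = None              # split the list into two halves
    left_sorted = merge_sort(head)
    right_sorted = merge_sort(right)
    return merge(left_sorted, right_sorted)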

Finally, merge the sorted sublists (a sketch follows this list):

  1. Create a dummy node, and set a tail pointer to keep track of the last node in the new merged list.
  2. Use left and right pointers to traverse through each sorted sublist.
  3. Compare the nodes pointed to by left and right, and append the smaller value to the merged list at its tail.
  4. Advance whichever pointer (left or right) supplied the value just appended.
  5. Repeat this process until one of our sublists is fully traversed.
  6. Once one sublist is fully traversed, append all remaining nodes from the other sublist to the merged list.
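
A sketch of this merging step, using the illustrative Node class from above (merge is the same helper assumed by the merge_sort sketch):

def merge(left, right):
    # The dummy node anchors the result; tail always points to the last node
    # of the merged list, so each append is O(1).
    dummy = Node(0)
    tail = dummy
    while left is not None and right is not None:
        if left.data <= right.data:   # <= keeps equal keys in order (stable)
            tail.next = left
            left = left.next
        else:
            tail.next = right
            right = right.next
        tail = tail.next
    # One list is exhausted; append whatever remains of the other.
    tail.next = left if left is not None else right
    return dummy.next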

Additionally, here are some optimizations that can enhance performance when using merge sort on linked lists:

  1. Keep track of a tail pointer for each sublist instead of traversing from the head every time we want to append an element to it.
  2. Instead of finding the middle point by walking two pointers every time, count all nodes in advance and split after n/2 of them for more balanced sublists, as shown in the sketch below.
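
As an illustration of the second optimization, the split can be done by counting instead of walking two pointers (split_by_count is a hypothetical helper; it assumes the list has at least two nodes, since shorter lists are handled by the recursion's base case):

def split_by_count(head):
    # Count the nodes, then walk n // 2 - 1 links to reach the last node of
    # the first half and cut the list there.
    n, node = 0, head
    while node is not None:
        n += 1
        node = node.next
    node = head
    for _ in range(n // 2 - 1):
        node = node.next
    right = node.next
    node.next = None
    return head, right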

Stack space can also be saved with a natural (bottom-up) merge sort that starts from the sorted runs already present in the list instead of recursing all the way down.

Analysis

  • Time Complexity - O(n log n) - Splitting the list produces about log n levels, and the merging work at each level costs O(n) time.
  • Space Complexity - O(log n) - The merge step relinks existing nodes rather than copying them, so the only extra space is the recursion stack, which is about log n deep. A naive implementation that copies nodes into new sublists would use O(n) extra space.
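
These bounds follow from the recurrence T(n) = 2T(n/2) + cn for the running time: the splitting produces about log n levels of recursion, and the merging at each level touches every node once, which resolves to T(n) = O(n log n).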

Advantages of Merge Sort

Efficiency - The O(n log n) time complexity of merge sort makes it highly efficient on large datasets. The divide and conquer strategy helps break the problem into smaller subproblems that can be solved independently.

Adaptability - Merge sort works well on different linked list structures like singly, doubly, and circular linked lists. The node traversal logic must be adapted based on the linked list type.

Stability - Merge sort is a stable sorting algorithm, meaning the original ordering of equal keys is preserved after sorting. This stability property can be important for data sets with multiple records with the same key.

In-place sorting - On linked lists, merge sort can be implemented without extra space for a list copy; the merging step works by modifying links rather than creating new nodes.

Parallelism - The independent subproblems created by divide and conquer can be solved in parallel, improving speed on multi-core systems.

Recursion - The recursive implementation of merge sort aligns well with the pointer-linked structure of linked lists. No random access is required as in array sorts.

Debugging ease - The step-wise nature of merge sort with clear divide, sort, and merge steps makes debugging simpler than iterative sorting techniques.

Low overhead - Merge sort on a linked list needs little machinery beyond its function calls: a few pointers, no index arithmetic, and no extra data structures.

Disadvantages of Merge Sort on Linked Lists

  1. Recursive overhead - The recursive implementation of merge sort can lead to significant overhead on the call stack for large input sizes, resulting in stack overflow errors. Iterative merge sort can avoid this issue.
  2. Not in place for naive implementation - The merge step often requires creating a separate merged list, temporarily doubling the memory usage. This can be optimized with in-place merging.
  3. Difficult to parallelize - The sequential nature of merge sort and its dependence on prior steps make it tricky to parallelize without additional coordination overhead.
  4. Linked list manipulation overhead - The pointer manipulation to split and merge lists adds overhead compared to array indexing in languages like C/C++.
  5. Requires full linked list traversal - Merge sort always requires traversing the full linked list, which can be slow for large lists if early termination is needed.
  6. Not cache-friendly - Linked lists have poor locality of reference, so merge sort loses the speedups from cache hits that array-based sorting gets.
  7. Insertion sort faster for small lists - The overhead of merge sort recursion and merging is not optimal for sorting very small lists where insertion sort is usually faster.
  8. Difficult to tune - Merge sort is not easily tuned or adapted like quicksort's pivot selection or insertion sort's early termination.

So, in summary, factors like recursive overhead, pointer-manipulation cost, lack of locality, and difficulty parallelizing can make merge sort's O(n log n) performance less compelling in practice than alternatives such as quicksort. The linked list structure also removes some of the caching benefits merge sort enjoys on arrays.

Python Implementation
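
Below is a minimal sketch of the implementation described in the Explanation section that follows: a Node class, a mergeSort function that splits the list with slow and fast pointers, a merge helper built around a dummy node and tail pointer, and a printList helper exercised by driver code. The sample values in the driver (4, 2, 1, 3) are illustrative, and the output shown under Output corresponds to them.

class Node:
    # A singly linked list node holding a value and a link to the next node.
    def __init__(self, data):
        self.data = data
        self.next = None


def mergeSort(head):
    # Base case: an empty or single-node list is already sorted.
    if head is None or head.next is None:
        return head

    # Find the last node of the first half using slow and fast pointers.
    slow, fast = head, head
    while fast.next is not None and fast.next.next is not None:
        slow = slow.next
        fast = fast.next.next

    # Break the list into two halves at the middle node.
    right = slow.next
    slow.next = None

    # Recursively sort each half, then merge the two sorted halves.
    left_sorted = mergeSort(head)
    right_sorted = mergeSort(right)
    return merge(left_sorted, right_sorted)


def merge(left, right):
    # Dummy node anchors the merged list; tail tracks its last node.
    dummy = Node(0)
    tail = dummy
    while left is not None and right is not None:
        if left.data <= right.data:   # <= keeps equal keys in order (stable)
            tail.next = left
            left = left.next
        else:
            tail.next = right
            right = right.next
        tail = tail.next
    # Append the remaining nodes of whichever list is not exhausted.
    tail.next = left if left is not None else right
    return dummy.next


def printList(head):
    # Print the list as "a -> b -> c".
    values = []
    while head is not None:
        values.append(str(head.data))
        head = head.next
    print(" -> ".join(values))


if __name__ == "__main__":
    # Driver code: build the sample list 4 -> 2 -> 1 -> 3, sort it, print it.
    head = Node(4)
    head.next = Node(2)
    head.next.next = Node(1)
    head.next.next.next = Node(3)

    print("Merge Sort on Singly Linked Lists")
    head = mergeSort(head)
    printList(head)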

Output:

Merge Sort on Singly Linked Lists
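1 -> 2 -> 3 -> 4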

Explanation

  1. Node class defines the nodes of the linked list with data and next pointer.
  2. mergeSort function takes head of the linked list as argument.
  3. Base cases - if the head is None or the list has only one node, it is already sorted.
  4. Find the middle node using the slow and fast pointer technique.
  5. Break the list into two halves at the middle node using slow.next = None.
  6. Recursively call mergeSort on the left and right halves.
  7. Call the merge function to merge the two sorted lists.
  8. The merge function takes the sorted left and right lists as arguments.
  9. Create a dummy node to build a sorted list. Initialize tail to dummy.
  10. Loop until left or right is None:
    • Compare the data at the left and right nodes.
    • Append the node with the smaller data via tail.next.
    • Advance the pointer of the list whose node was appended.
    • Update tail to the new last node.
  11. Append the remaining nodes of any non-null list.
  12. Return dummy.next, which now points to the sorted list head.
  13. printList prints the linked list.
  14. Driver code tests mergeSort on sample input.

So, in summary, finding the middle, recursively sorting halves, and merging using dummy node and tail pointer techniques are the key aspects of implementing merge sort on linked lists in Python.

Conclusion

In summary, merge sort is an efficient, stable sorting algorithm well-suited for linked lists. By leveraging a divide-and-conquer approach and recursively splitting the list into smaller sorted sublists, it can sort a linked list in O(n log n) time. The merging phase combines the sorted sublists efficiently using a dummy node and tail pointer. Implementing merge sort on linked lists requires adapting the pointer manipulation logic but provides fast, in-place sorting.





