Python Solution of Median of Two Sorted Arrays

In this problem, we are given two arrays, array1 and array2. Both arrays are sorted and can have different sizes. Let n be the number of elements in array1 and m be the number of elements in array2. The task is to find the median of all n + m elements taken together.

Let us see some examples to understand the problem.

Input: array1 = [-5, 2, 7, 9, 11], array2 = [-13, -10, -6, 3, 6, 13]

Output: The median of the two arrays is 3.

If we merge both arrays, the new array will be sorted = [-13, -10, -6, -5, 2, 3, 6, 7, 9, 11, 13]

Since the merged array has 11 elements, i.e., an odd number of elements, the median is the single middle element. It sits at the 0-based index (n + m - 1) / 2 = 5, which is the 6th element: sorted[5] = 3.

Input: array1 = [3, 6, 9, 10], array2 = [4, 7, 10, 11, 12, 13]

Output: The median of two sorted arrays is 9.50

If we merge both arrays, the new array will be sorted = [3, 4, 6, 7, 9, 10, 10, 11, 12, 13]

Since the merged array has 10 elements, i.e., an even number of elements, the median is the average of the two middle elements, which sit at the 0-based indices ((n + m) / 2) - 1 = 4 and (n + m) / 2 = 5. Therefore, the median will be (sorted[4] + sorted[5]) / 2 = (9 + 10) / 2 = 9.50.
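Both examples can be checked quickly with Python's standard library. The snippet below is only a sanity check, not one of the approaches discussed later: it concatenates and sorts the lists and then calls statistics.median. The variable names are chosen here for illustration.

```python
from statistics import median

# First example: 11 elements in total, so the median is the single middle value.
merged_odd = sorted([-5, 2, 7, 9, 11] + [-13, -10, -6, 3, 6, 13])
print(median(merged_odd))   # 3

# Second example: 10 elements in total, so the median is the mean
# of the two middle values, 9 and 10.
merged_even = sorted([3, 6, 9, 10] + [4, 7, 10, 11, 12, 13])
print(median(merged_even))  # 9.5
```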

Approach - 1

The most basic approach to solve this problem is to create a third array. This third array will store the concatenation of the two arrays; hence, the total size of this array will be n + m. We will sort this new array and find the median element. We must check if n + m is odd or even to find the median. If the total length of the third array is odd, then the median is present at index (n + m - 1) / 2 of the sorted array. However, if the length of the third array is even, the median is the average of the two elements present at the indices ((n + m) / 2) - 1 and (n + m) / 2 of the sorted array. All indices here are 0-based.

The idea is very similar to the example shown above. We will follow these steps to solve the problem.

  • Create a new array, array3. array3 will contain all the elements present in array1 and array2. We can do this with the + operator, which concatenates Python lists rather than adding them element by element.
  • Then, we will sort array3 using the built-in sorted() function.
  • The next step is to check if the length of array3 is even or odd.
  • If the length of array3 is odd, then find the median as array3[(n + m - 1) // 2]. If the length of array3 is even, then we will find the average of array3[(n + m) // 2] and array3[((n + m) // 2) - 1].

Now we will see the Python code for the above approach:

Code
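The steps above can be sketched as follows. This is a minimal sketch rather than a definitive listing: the function name median_by_sorting and the ValueError for empty input are assumptions made here for illustration. It prints its results in the same format as the output shown below.

```python
def median_by_sorting(array1, array2):
    """Concatenate, sort, and pick the middle element(s)."""
    # '+' concatenates the two lists; sorted() returns a new sorted list.
    array3 = sorted(array1 + array2)
    total = len(array3)  # n + m
    if total == 0:
        raise ValueError("both arrays are empty")  # assumed handling
    print("The sorted array is:\n", array3)
    if total % 2 == 1:
        # Odd length: the single middle element.
        return array3[(total - 1) // 2]
    # Even length: average of the two middle elements.
    return (array3[total // 2 - 1] + array3[total // 2]) / 2


print("The median of the sorted array is:",
      median_by_sorting([-5, 2, 7, 9, 11], [-13, -10, -6, 3, 6, 13]))
print("The median of the sorted array is:",
      median_by_sorting([3, 6, 9, 10], [4, 7, 10, 11, 12, 13]))
```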

Output:

The sorted array is:
 [-13, -10, -6, -5, 2, 3, 6, 7, 9, 11, 13]
The median of the sorted array is: 3
The sorted array is:
 [3, 4, 6, 7, 9, 10, 10, 11, 12, 13]
The median of the sorted array is: 9.5

Time Complexity: We have to sort the third array of size (n + m). Sorting it takes O((n + m) * log(n + m)) time, which dominates the O(n + m) cost of concatenating the two arrays.

Auxiliary Space: We are using extra memory space to store the third array, making the space complexity O(n + m).

Approach - 2

We will use a more efficient approach that merges the two arrays instead of sorting them, decreasing the time complexity.

The two arrays that are given to us here are sorted. We can use this property and merge the two arrays more efficiently instead of concatenating and sorting them later. We will maintain a count of the elements that we have put in order, and when we reach half the total number of elements in both arrays, we will have the median element. In this approach, too, we have to deal with the two cases that appear based on the sum of the lengths of the two arrays.

If the sum of lengths is odd, then the median is the value present at the 0-based index (m + n) // 2 after merging the two arrays.

If the sum of lengths is even, then the median is the average of the values at the 0-based indices (m + n) // 2 - 1 and (m + n) // 2 after merging the two arrays.

Let us have a dry run to understand this approach.

  • Let us take an example of two sorted arrays array1 = [3, 7] and array2 = [1, 2, 4, 5, 6].
  • Here n = length of array1 = 2 and m = length of array2 = 5
  • The sum of the lengths = 7, which means the median is the 4th value.
  • We will run a loop from 0 to 3. In each iteration, we pick the smaller of the two current elements and store it in m1.
  • In the first iteration, we will compare the 0th elements of the two arrays, 3 and 1. Since 1 < 3, m1 = 1
  • In the second iteration, we will compare the 0th element of array1 (3) and the first element of array2 (2). Since 2 < 3, m1 = 2
  • In the third iteration, we will compare the 0th element of array1 (3) and the second element of array2 (4). Since 3 < 4, m1 = 3
  • In the final iteration, we will compare the first element of array1 (7) and the second element of array2 (4). Since 4 < 7, m1 = 4.
  • The final value of m1 is the median of the two arrays, which is 4.
  • We don't need to go further because the total length is odd.

Below is the Python code for this approach.

Code
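A minimal sketch of this merge-and-count idea is given here. The function name median_by_merging, the helper variable m2 (the value merged just before m1, needed for the even case), and the ValueError for empty input are assumptions introduced for illustration.

```python
def median_by_merging(array1, array2):
    """Walk both sorted arrays in merged order up to the middle."""
    n, m = len(array1), len(array2)
    total = n + m
    if total == 0:
        raise ValueError("both arrays are empty")  # assumed handling
    i = j = 0        # next unread index in array1 and array2
    m1 = m2 = None   # m1: latest merged value, m2: the one before it
    for _ in range(total // 2 + 1):
        m2 = m1
        # Take from array1 when array2 is exhausted or array1 holds
        # the smaller (or equal) current value.
        if j == m or (i < n and array1[i] <= array2[j]):
            m1 = array1[i]
            i += 1
        else:
            m1 = array2[j]
            j += 1
    if total % 2 == 1:
        return m1             # odd total: the middle value itself
    return (m1 + m2) / 2      # even total: average of the middle pair


print("The median of the sorted array is:",
      median_by_merging([-5, 2, 7, 9, 11], [-13, -10, -6, 3, 6, 13]))
print("The median of the sorted array is:",
      median_by_merging([3, 6, 9, 10], [4, 7, 10, 11, 12, 13]))
```

For the dry-run arrays, median_by_merging([3, 7], [1, 2, 4, 5, 6]) runs the loop four times and returns 4.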

Output:

The median of the sorted array is: 3
The median of the sorted array is: 9.5

Time Complexity: We do not need to sort the merged array in this approach. We find the median using a single for loop that runs about (m + n) / 2 times. Therefore, the time complexity of this approach is O(m + n).

Auxiliary Space: Unlike the previous approach, we do not store the merged array anywhere. Therefore, the space complexity of this approach is constant, i.e., O(1).

Approach - 3

We will use binary search in this approach.

As we saw in the previous approach, the property that the given two arrays are sorted can be utilized to design an efficient approach to find the median of two arrays. One of the most commonly used approaches for sorted arrays is binary search. We will use binary search to divide the two arrays without explicitly merging and sorting them.

We must find a point that can divide the merged array into two equal parts to find the median. We can use binary search to find a point in both the arrays individually. When we merge the two arrays, those points will merge too and will be the point that divides the merged array into two equal parts.

Let us take an example: array1 = [-5, 2, 7, 9, 11], array2 = [-13, -10, -6, 3, 6, 13]

  • We have to find two cuts on these arrays. Let the cut1 be between 2 and 7 in the first array and cut2 be between 3 and 6 in the second array. When we merge the two halves individually, they will be half1 = [-13, -10, -6, -5, 2, 3] and half2 = [6, 7, 9, 11, 13].
  • As we can see, 3 is the median of this array, and the two cuts, when merged, form two halves of the merged array. Hence, we have to find the index of cut1 and cut2 for the two sorted arrays through binary search.
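The effect of the two cuts in this example can be checked directly. In the snippet below, the cut positions are written as slice indices (cut1 = 2 and cut2 = 4, the number of elements to the left of each cut); sorted() is used only to display the halves, since the binary search itself never builds them.

```python
array1 = [-5, 2, 7, 9, 11]
array2 = [-13, -10, -6, 3, 6, 13]
cut1, cut2 = 2, 4  # cut1 lies between 2 and 7, cut2 between 3 and 6

# Elements to the left of both cuts form half1, the rest form half2.
half1 = sorted(array1[:cut1] + array2[:cut2])
half2 = sorted(array1[cut1:] + array2[cut2:])
print(half1)  # [-13, -10, -6, -5, 2, 3]
print(half2)  # [6, 7, 9, 11, 13]

# Every value in half1 is <= every value in half2, so for an odd total
# the median is the largest value of the left half.
print(max(half1) <= min(half2), max(half1))  # True 3
```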

Now, we will see the steps we must follow to solve this problem using Python.

  • We will apply binary search on the array that has a bigger length.
  • To find the median, we should know how many elements lie in the left half of the merged array. As we have seen in previous examples, the left half contains (m + n + 1) // 2 elements, and for an odd total its last element is the median. We will call this count median_position.
  • We will find the mid-point of the array that has a bigger length. Let this be array1, and the length be n. The other array will be array2, and its length will be m. The midpoint of array1 will be the cut of the first array. Let the mid-point be called cut1. The number of elements on the left side of cut1 is equal to the value of cut1. The remaining elements of the left half of the merged array, median_position - cut1 of them, must come from the second array. Therefore, the cut of the second array is cut2 = median_position - cut1, and cut1 must be chosen so that cut2 stays between 0 and m.
  • After the two cuts, there are four parts of the two arrays.
    • L1 = last element of the left part of cut1
    • L2 = last element of the left part of cut2
    • R1 = first element of the right part of cut1
    • R2 = first element of the right part of cut2
  • But to reach the final answer, we need to confirm if the partitions made are correct or not. To validate a partition, we need to check if L1 <= R2 and L2 <= R1. If a part is empty, we treat its L value as negative infinity and its R value as positive infinity. If the condition holds, we will check if the total length (m + n) is odd or even. If the total length is even, we will return the average of max(L1, L2) and min(R1, R2). We will return max(L1, L2) if the total length is odd.
  • However, if the condition does not hold true, we will use the binary search conditions. If the value of L1 is greater than the value of R2, cut1 is too far to the right, so we will move the upper bound of the search to cut1 - 1. Otherwise, L2 is greater than R1, cut1 is too far to the left, and we will move the lower bound to cut1 + 1.

Code
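One way to write these steps is sketched below. It follows the text in searching over the longer array. The function name median_by_partition, the use of math.inf as a sentinel for an empty side, and the clamping of the search range so that cut2 always stays within 0 and m are assumptions made here for illustration.

```python
from math import inf


def median_by_partition(array1, array2):
    """Binary-search a cut in the longer array; derive the other cut."""
    # Per the text, array1 is taken to be the longer array.
    if len(array1) < len(array2):
        array1, array2 = array2, array1
    n, m = len(array1), len(array2)
    if n == 0:
        raise ValueError("both arrays are empty")  # assumed handling
    median_position = (n + m + 1) // 2  # size of the merged left half
    # Keep cut2 = median_position - cut1 inside [0, m].
    low, high = max(0, median_position - m), min(n, median_position)
    while low <= high:
        cut1 = (low + high) // 2
        cut2 = median_position - cut1
        # Boundary values around each cut; infinities stand in for empty sides.
        l1 = array1[cut1 - 1] if cut1 > 0 else -inf
        r1 = array1[cut1] if cut1 < n else inf
        l2 = array2[cut2 - 1] if cut2 > 0 else -inf
        r2 = array2[cut2] if cut2 < m else inf
        if l1 <= r2 and l2 <= r1:  # valid partition found
            if (n + m) % 2 == 1:
                return max(l1, l2)
            return (max(l1, l2) + min(r1, r2)) / 2
        if l1 > r2:
            high = cut1 - 1  # cut1 is too far right
        else:
            low = cut1 + 1   # cut1 is too far left
    raise ValueError("inputs must be sorted")


print("The median of the sorted array is:",
      median_by_partition([-5, 2, 7, 9, 11], [-13, -10, -6, 3, 6, 13]))
print("The median of the sorted array is:",
      median_by_partition([3, 6, 9, 10], [4, 7, 10, 11, 12, 13]))
```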

Output:

The median of the sorted array is: 3
The median of the sorted array is: 9.5

Time Complexity: We are applying binary search on the array with greater length. Therefore, the time complexity will be O(max(log m, log n)). Choosing the longer array also covers the cases where the length of one of the arrays is 0.

Auxiliary Space: We have not used extra space to store the array. Therefore, the space complexity is constant, hence, O(1).

Approach - 4

In this approach, we will use a priority queue (min-heap) to find the median value.

We will push all the elements of both arrays into the priority queue and then pop elements from it, smallest first, until we reach the median index.

Let us understand the algorithm with the help of an illustration.

array1 = [-5, 2, 7, 9, 11], array2 = [-13, -10, -6, 3, 6, 13], n = 5, m = 6

(n = length of array1, m = length of array2)

Step 1: We will initialize a priority queue p_q. We will add the elements of the first array to the priority queue.

p_q.push(-5)

p_q.push(2)

p_q.push(7)

p_q.push(9)

p_q.push(11)

When we add all the first array elements to the priority queue, the final queue will be [-5, 2, 7, 9, 11].

Step 2: Now we will add the elements of the second array to the priority queue.

p_q.push(-13)

p_q.push(-10)

p_q.push(-6)

p_q.push(3)

p_q.push(6)

p_q.push(13)

When we add the elements of the second array to the priority queue, the final queue, listed in the order in which its elements will be popped, will be p_q = [-13, -10, -6, -5, 2, 3, 6, 7, 9, 11, 13]

Step 3: The final step is to find the median value. How we find it depends on whether the length of the priority queue is even or odd.

If the length of the priority queue is odd

Then, we will pop an element from the queue (n + m) // 2 times.

After these pops, the top element of the queue will be the median.

If the length of the priority queue is even

Then we will pop an element from the queue ((n + m) // 2) - 1 times. The next two elements at the top of the queue are the ones at indices ((n + m) // 2) - 1 and (n + m) // 2 of the merged order. We will pop the first of them, read the new top, and return the average of these two values.

Here is the Python code for this approach.

Code
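The three steps can be sketched with Python's heapq module, which implements a min-heap on top of a plain list. The function name median_by_heap and the ValueError for empty input are assumptions made here for illustration; p_q is the queue name used in the walkthrough above.

```python
import heapq


def median_by_heap(array1, array2):
    """Push everything into a min-heap, then pop up to the middle."""
    p_q = []
    # Steps 1 and 2: push the elements of both arrays.
    for value in array1:
        heapq.heappush(p_q, value)
    for value in array2:
        heapq.heappush(p_q, value)
    total = len(p_q)  # n + m
    if total == 0:
        raise ValueError("both arrays are empty")  # assumed handling
    # Step 3: pop the smallest elements until the middle is on top.
    if total % 2 == 1:
        for _ in range(total // 2):
            heapq.heappop(p_q)
        return p_q[0]             # top of the heap is the median
    for _ in range(total // 2 - 1):
        heapq.heappop(p_q)
    lower = heapq.heappop(p_q)    # lower of the two middle values
    upper = p_q[0]                # the new top is the upper middle value
    return (lower + upper) / 2


print("The median of the sorted array is:",
      median_by_heap([-5, 2, 7, 9, 11], [-13, -10, -6, 3, 6, 13]))
print("The median of the sorted array is:",
      median_by_heap([3, 6, 9, 10], [4, 7, 10, 11, 12, 13]))
```

Note that the list backing the heap is not fully sorted internally; only the order in which elements are popped is guaranteed to be ascending.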

Output:

The median of the sorted array is: 3
The median of the sorted array is: 9.5

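Time Complexity: Each push or pop on a priority queue holding up to n + m elements takes O(log(n + m)) time. We push all n + m elements and pop about half of them, so the time complexity is O((n + m) * log(n + m)), which is of the same order as O(max(n, m) * log(max(n, m))).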

Auxiliary Space: We have used extra space to store the priority queue. Hence, the space complexity will equal the length of the priority queue, i.e., O(n + m).