Interpolation Search

In this article, we will explore Interpolation Search in detail, discussing its principles, advantages, limitations, and practical applications.

Introduction

Interpolation Search is a searching algorithm that uses an interpolation formula to estimate the position of the target value in a sorted array or list.

Unlike binary search, which always selects the middle element, Interpolation Search makes a more intelligent guess based on the distribution of the data. It uses a formulaic approach to determine the position of the target element within the array.

It is particularly effective when the elements are uniformly distributed.

A dataset is uniformly distributed when the gaps between consecutive elements are roughly equal, with no large jumps between neighbouring values.
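One rough way to judge whether a sorted list is close to uniform is to look at the gaps between neighbouring elements. The sketch below is only an illustration; the helper name gaps and the second sample list are assumptions made for this example.

```python
def gaps(sorted_values):
    """Return the differences between consecutive elements."""
    return [b - a for a, b in zip(sorted_values, sorted_values[1:])]


uniform_like = [2, 4, 7, 9, 12, 15, 18, 20, 23, 25]
skewed = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]  # hypothetical non-uniform data

print(gaps(uniform_like))  # [2, 3, 2, 3, 3, 3, 2, 3, 2]
print(gaps(skewed))        # [1, 2, 4, 8, 16, 32, 64, 128, 256]
```

The first list has gaps of only 2 or 3, so it can be treated as uniform. The second has gaps that double at every step, so it cannot.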

How Interpolation Search Works:

The algorithm treats the values as if they grow linearly with their index. From the values at the two ends of the current search range, it interpolates the index at which the target would sit if that assumption held.

Probe position = low + ((X - A[low]) * (high - low)) / (A[high] - A[low])

low - Left pointer
high - Right pointer
A[low] - Element at the left pointer
A[high] - Element at the right pointer
X - Target element to be searched
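As a small worked example of the formula, the sketch below evaluates it once on a perfectly uniform list. The function name probe_position and the sample data are assumptions made for illustration, and integer (floor) division is used so that the result is a valid index.

```python
def probe_position(arr, low, high, target):
    """Estimate the index of target by interpolating between arr[low] and arr[high]."""
    # Assumes arr[low] != arr[high]; otherwise the division would fail.
    return low + ((target - arr[low]) * (high - low)) // (arr[high] - arr[low])


data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

# (70 - 10) * (9 - 0) // (100 - 10) = 540 // 90 = 6
print(probe_position(data, 0, len(data) - 1, 70))  # 6
```

Because the values are evenly spaced, the first estimate lands exactly on the index of 70.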

This estimation guides the algorithm to narrow down the search range and thus achieve faster retrieval.

Algorithm:

The algorithm can be summarized in the following steps:

1. Initialize the low and high indices to the start and end of the array, respectively.
2. Calculate the probe position using the interpolation formula.
3. Compare the probe element with the target element:
   a. If they are equal, the search is successful.
   b. If the probe element is greater, update the high index to the probe position minus one.
   c. If the probe element is smaller, update the low index to the probe position plus one.
4. Repeat steps 2-3 until the target element is found, the low index exceeds the high index, or the target falls outside the range between the elements at low and high.

Python Implementation:

# Function
def interpolation_search(arr, target):
    
    low = 0  # Starting index
    high = len(arr) - 1  # Ending index

    # Loop until low <= high and
    # target is between arr[low] and arr[high]
    while low <= high and arr[low] <= target <= arr[high]:

        if low == high:
            # If the target is at the low index
            if arr[low] == target:
                return low
            
            # Element not found
            return -1

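        # Guard against division by zero: if both ends hold the same value,
        # the loop condition guarantees that this value is the target
        if arr[low] == arr[high]:
            return low

        # Estimate the position of the target element using the interpolation formula
        pos = low + ((target - arr[low]) * (high - low)) // (arr[high] - arr[low])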

        if arr[pos] == target:
            return pos
        
        elif arr[pos] < target:
            # Search in the right portion of the array
            low = pos + 1
        else:
            # Search in the left portion of the array
            high = pos - 1

    # If the search key is not found in the array, return -1
    return -1

# Example usage:
sorted_list = [2, 4, 7, 9, 12, 15, 18, 20, 23, 25]
print("List =", sorted_list)
# Target element
target_element = 20
print("Target Element =", target_element)

# Calling the interpolation_search function
result = interpolation_search(sorted_list, target_element)

# Printing the result
if result != -1:
    print("Element found at index:", result)
else:
    print("Element not found")

Output:

List = [2, 4, 7, 9, 12, 15, 18, 20, 23, 25]
Target Element = 20
Element found at index: 7

Explanation:

We start with the array [2, 4, 7, 9, 12, 15, 18, 20, 23, 25]. The gaps between consecutive elements are all 2 or 3, so the data can be treated as uniformly distributed.

Let's find the target element 20 in this array.

We call interpolation_search with the array and the target, and the returned index is stored in result.

target_element = 20

First Iteration:

low = 0, arr[low] = 2

high = 9, arr[high] = 25

Here, low <= high and the target = 20 is between arr[low] = 2 and arr[high] = 25.

Hence, both conditions of the while loop are satisfied.

We then check whether low is equal to high, that is, whether the search range has shrunk to a single element.

if low == high:
    # If the target is at the low index
    if arr[low] == target:
        return low

    # Element not found
    return -1

If it has, we check whether arr[low] is equal to the target and, if so, return the low index. Otherwise, we return -1 to indicate that the target element is not found.

Here low = 0 and high = 9, so this check is skipped.

Now, calculate the probable position using the interpolation formula.

pos = low + ((target - arr[low]) * (high - low)) // (arr[high] - arr[low])
    = 0 + ((20 - 2) * (9 - 0)) // (25 - 2)
    = 0 + (18 * 9) // 23
    = 162 // 23
    = 7

Check if the element at pos is equal to the target or not.

if arr[pos] == target:
    return pos

Here, arr[pos] = arr[7] = 20, which is equal to the target, so we return pos = 7.

result = 7

Finally, we print the result to the console.

if result != -1:
    print("Element found at index:", result)
else:
    print("Element not found")

Output: Element found at index: 7
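To follow the walkthrough for other targets, it can help to record every probed index. The variant below is a sketch for illustration only: the name interpolation_search_trace and the extra return value are assumptions, not part of the implementation above.

```python
def interpolation_search_trace(arr, target):
    """Interpolation search that also returns the list of probed indices."""
    low, high = 0, len(arr) - 1
    probes = []

    while low <= high and arr[low] <= target <= arr[high]:
        if arr[low] == arr[high]:
            # All remaining values are equal, so arr[low] must be the target
            probes.append(low)
            return low, probes

        pos = low + ((target - arr[low]) * (high - low)) // (arr[high] - arr[low])
        probes.append(pos)

        if arr[pos] == target:
            return pos, probes
        if arr[pos] < target:
            low = pos + 1
        else:
            high = pos - 1

    return -1, probes


sorted_list = [2, 4, 7, 9, 12, 15, 18, 20, 23, 25]
print(interpolation_search_trace(sorted_list, 20))  # (7, [7])
print(interpolation_search_trace(sorted_list, 7))   # (2, [1, 2])
print(interpolation_search_trace(sorted_list, 10))  # (-1, [3])
```

The target 20 is found on the first probe, 7 needs two probes, and the search for 10 stops after one probe because 10 no longer lies between arr[low] and arr[high].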

Time Complexity Analysis:

Average Case: O(log log n) - When the data is uniformly distributed.

Worst Case: O(n) - When the data is far from uniform (for example, values that grow exponentially), each probe may eliminate only one element, making it slower than binary search's O(log n).
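The difference between the two cases can be seen by counting probes. The sketch below is a rough illustration rather than a benchmark; the helper names interpolation_probes and binary_probes and both sample datasets are assumptions made for this example.

```python
def interpolation_probes(arr, target):
    """Count the loop iterations interpolation search needs for target."""
    low, high, count = 0, len(arr) - 1, 0
    while low <= high and arr[low] <= target <= arr[high]:
        count += 1
        if arr[low] == arr[high]:
            break
        pos = low + ((target - arr[low]) * (high - low)) // (arr[high] - arr[low])
        if arr[pos] == target:
            break
        if arr[pos] < target:
            low = pos + 1
        else:
            high = pos - 1
    return count


def binary_probes(arr, target):
    """Count the loop iterations binary search needs for target."""
    low, high, count = 0, len(arr) - 1, 0
    while low <= high:
        count += 1
        mid = (low + high) // 2
        if arr[mid] == target:
            break
        if arr[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return count


uniform = list(range(0, 3000, 3))      # 1000 evenly spaced values
skewed = [2 ** i for i in range(64)]   # 64 exponentially growing values

print(interpolation_probes(uniform, 1500), binary_probes(uniform, 1500))      # 1 9
print(interpolation_probes(skewed, 2 ** 32), binary_probes(skewed, 2 ** 32))  # 33 6
```

On the evenly spaced data the first estimate is exact, while on the exponential data every estimate falls at the low end of the range, so the search advances one element at a time.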

C++ Implementation:

// Interpolation Search Algorithm
#include <iostream>
using namespace std;

// Search Function
int interpolation_search(int arr[], int n, int target)
{
    int low = 0; // Left Pointer
    int high = n - 1; // Right Pointer

    // If there is only one element
    if (low == high)
    {
        // If it is the target element
        if (arr[low] == target)
        {
            return low;
        }
        // If target Not Found
        return -1;
    }

    int pos = -1;

    // Loop until low <= high and
    // target is between arr[low] and arr[high]
    while (low <= high && arr[low] <= target && target <= arr[high])
    {
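        // Guard against division by zero: if both ends hold the same value,
        // the loop condition guarantees that this value is the target
        if (arr[low] == arr[high]){
            return low;
        }

        // Estimate the position of the target element using the interpolation formula
        // (computed in long long so the product cannot overflow int)
        long long offset = (static_cast<long long>(target) - arr[low]) * (high - low)
                           / (static_cast<long long>(arr[high]) - arr[low]);
        pos = low + static_cast<int>(offset);

        if (arr[pos] == target){
            return pos;
        }

        else if (arr[pos] < target){
            // Search in the right portion of the array
            low = pos + 1;
        }

        else{
            // Search in the left portion of the array
            high = pos - 1;
        }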
    }
    // Element not found
    return -1;
}

// Driver Function
int main()
{
    // An array
    int arr[10] = {2, 4, 7, 9, 12, 15, 18, 20, 23, 25};
    int n = 10;
    // Print the array
    cout << "Array = [ ";
    for (int i = 0; i < n; i++)
    {
        cout << arr[i] << ", ";
    }
    cout << "]" << endl;
    
    // Target element
    int target_element = 7;
    cout << "Target Element = " << target_element << endl;
    
    // Calling the interpolation_search function 
    int result = interpolation_search(arr, n, target_element);

    // Printing the result.
    if (result != -1){
        cout << "Element found at index = " << result;
    }
    else{
        cout << "Element not found.";
    }
    return 0;
}

Output:

Array = [ 2, 4, 7, 9, 12, 15, 18, 20, 23, 25, ]
Target Element = 7
Element found at index = 2

C Implementation:

// Interpolation Search Algorithm
#include <stdio.h>

// Search Function
int interpolation_search(int arr[], int n, int target)
{
    int low = 0; // Left Pointer
    int high = n - 1; // Right Pointer

    // If there is only one element
    if (low == high)
    {
        // If it is the target element
        if (arr[low] == target)
        {
            return low;
        }
        // If target Not Found
        return -1;
    }

    int pos = -1;

    // Loop until low <= high and
    // target is between arr[low] and arr[high]
    while (low <= high && arr[low] <= target && target <= arr[high])
    {
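        // Guard against division by zero: if both ends hold the same value,
        // the loop condition guarantees that this value is the target
        if (arr[low] == arr[high]){
            return low;
        }

        // Estimate the position of the target element using the interpolation formula
        // (computed in long long so the product cannot overflow int)
        long long offset = ((long long)target - arr[low]) * (high - low)
                           / ((long long)arr[high] - arr[low]);
        pos = low + (int)offset;

        if (arr[pos] == target){
            return pos;
        }

        else if (arr[pos] < target){
            // Search in the right portion of the array
            low = pos + 1;
        }

        else{
            // Search in the left portion of the array
            high = pos - 1;
        }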
    }
    // Element not found
    return -1;
}

// Driver Function
int main()
{
    // An array
    int arr[10] = {2, 4, 7, 9, 12, 15, 18, 20, 23, 25};
    int n = 10;
    // Print the array
    printf("Array = [ ");
    for (int i = 0; i < n; i++)
    {
        printf("%d, ", arr[i]);
    }
    printf("]\n");
    
    // Target element
    int target_element = 18;
    printf("Target Element = %d\n", target_element);
    
    // Calling the interpolation_search function 
    int result = interpolation_search(arr, n, target_element);

    // Printing the result.
    if (result != -1){
       printf("Element found at index = %d", result);
    }
    else{
        printf("Element not found.");
    }
    return 0;
}

Output:

Array = [ 2, 4, 7, 9, 12, 15, 18, 20, 23, 25, ]
Target Element = 18
Element found at index = 6

Advantages of Interpolation Search:

Faster Search - It uses the distribution of the data to narrow the search space, so on uniformly distributed data each probe lands close to the target.
Outperforms Binary Search - On large, uniformly distributed datasets it needs fewer comparisons than binary search: O(log log n) on average versus O(log n).
Efficient for Uniformly Distributed Dataset - The whole idea of this algorithm is based on data distribution. It performs exceptionally well when the dataset is uniformly distributed, reducing the average time complexity to O(log log n).

Limitations of Interpolation Search:

The main disadvantage is that it relies on a uniformly distributed dataset. On non-uniform datasets it performs poorly: the worst-case time complexity degrades to O(n), the same as linear search and worse than binary search's O(log n).

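Also, the formula is only safe when the target lies between the elements at low and high. Without that check it can produce a position outside the valid range, and in languages with fixed-width integers the intermediate product can overflow when the values are large.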

Another disadvantage is that each probe requires extra arithmetic (a multiplication and a division), making each step more expensive than a step of binary search.
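One way an out-of-range position can arise is when the formula is applied to a target that lies outside the values at the two ends of the range. The sketch below illustrates this under that assumption; the names raw_probe and in_search_range are hypothetical helpers written for this example.

```python
def raw_probe(arr, target):
    """Apply the interpolation formula over the whole array with no range check."""
    low, high = 0, len(arr) - 1
    return low + ((target - arr[low]) * (high - low)) // (arr[high] - arr[low])


def in_search_range(arr, target):
    """Return True if target lies between the first and last elements."""
    return len(arr) > 0 and arr[0] <= target <= arr[-1]


values = [2, 4, 7, 9, 12, 15, 18, 20, 23, 25]

print(raw_probe(values, 100))        # 38, past the last index (9)
print(raw_probe(values, -50))        # -21, before the first index (0)
print(in_search_range(values, 100))  # False
print(in_search_range(values, 20))   # True
```

This is why the implementations above test arr[low] <= target <= arr[high] in the loop condition before computing the position.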

Practical Applications of Interpolation Search:

Databases - It can be used to search sorted data in databases, reducing retrieval time.
Scientific Data Analysis - It can be used in scientific data analysis for large datasets with a uniform distribution.
Time-Sensitive Applications - It can be used in time-sensitive applications where fast retrieval is a crucial factor.

Conclusion:

Interpolation search is a fast search algorithm that can be a more efficient alternative to linear and binary search when the data is sorted and uniformly distributed. It estimates the position of the target element using the interpolation formula and shrinks the search space around that estimate.

While it may have limitations for non-uniform datasets, it remains a valuable tool in various applications where search speed and efficiency play a crucial role.
