What is Hashing in C

In the C programming language, hashing is a technique that converts data of arbitrary size into a fixed-size value known as a hash. The hash is generated by a hash function, which maps the input data to an output hash. The resulting hash value can then be used to efficiently search, retrieve, and compare data within large data sets.

Hashing is commonly used in data structures such as hash tables, which are arrays that store data in a way that allows for quick insertion, deletion, and retrieval of data. The hash function used to generate the hash value maps the key (or the data to be stored) to an index within the hash table. This index is then used to store the data in the corresponding location within the array.

Hashing is useful for several reasons. Firstly, it lets large or variable-length keys be represented by a small fixed-size value, so comparisons and index computations work on the hash rather than on the full data. Secondly, it can improve the performance of algorithms by allowing faster searching and retrieval of data. Finally, it can help detect duplicate or altered data, since identical inputs always produce the same hash. Hashing does not prevent collisions (when two different keys map to the same index); a hash table has to handle them explicitly.

The process of hashing involves three main steps: creating the hash function, generating the hash value, and storing the data in the hash table.

Creating the hash function involves designing an algorithm that maps the input data to a fixed-size value. This algorithm should be designed to distribute the data evenly across the hash table to reduce the likelihood of collisions. A good hash function should also be fast, simple, and deterministic (i.e. it should always produce the same output for the same input).
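
As an illustration of these properties, here is a minimal sketch of a string hash function. It uses the widely circulated djb2 scheme (usually attributed to Daniel J. Bernstein), which is an assumption chosen for the example and not something this article prescribes. The names hash_string and hash_index are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: djb2-style string hash.
   Start from 5381 and fold in each byte as hash * 33 + byte.
   Unsigned arithmetic wraps around on overflow, which is well defined in C,
   so the same input always yields the same output (deterministic). */
unsigned long hash_string(const char *key)
{
    unsigned long hash = 5381UL;

    assert(key != NULL);
    while (*key != '\0') {
        hash = hash * 33UL + (unsigned char)*key;
        key++;
    }
    return hash;
}

/* Hypothetical helper: reduce the hash to a valid index
   for a table with table_size slots. */
size_t hash_index(const char *key, size_t table_size)
{
    assert(table_size > 0);
    return (size_t)(hash_string(key) % table_size);
}
```

For example, hash_string("a") is 5381 * 33 + 97 = 177670, so hash_index("a", 7) evaluates to 177670 % 7 = 3.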

Once the hash function is created, the next step is to generate the hash value for the data. This involves passing the data through the hash function, which returns a fixed-size hash value. This value, usually reduced to the range of valid positions (for example, by taking the remainder after division by the table size), is then used as an index within the hash table to store the data.

Storing the data in the hash table involves placing the data in the corresponding location within the array. If a collision occurs (i.e. if two different keys map to the same index), the hash table may use a technique called chaining to store both keys in the same index. In chaining, a linked list is created for each index, and the keys are added to the linked list.
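
The chaining approach described above could be sketched as follows for integer values. This is only one possible layout: the names TABLE_SIZE, struct node, bucket_of, chain_insert, chain_contains, and chain_remove are hypothetical, and new values are placed at the head of each list purely for simplicity.

```c
#include <assert.h>
#include <stdlib.h>

#define TABLE_SIZE 7   /* assumed table size for this sketch */

/* One entry in a bucket's linked list. */
struct node {
    int value;
    struct node *next;
};

/* File-scope array is zero-initialized, so every chain starts empty. */
static struct node *buckets[TABLE_SIZE];

/* Map a value to a bucket index in 0..TABLE_SIZE-1 (handles negatives). */
static int bucket_of(int value)
{
    int idx = value % TABLE_SIZE;
    if (idx < 0)
        idx += TABLE_SIZE;
    assert(idx >= 0 && idx < TABLE_SIZE);
    return idx;
}

/* Add value at the head of its chain. Returns 1 on success, 0 if out of memory. */
int chain_insert(int value)
{
    int idx = bucket_of(value);
    struct node *n = malloc(sizeof *n);

    if (n == NULL)
        return 0;
    n->value = value;
    n->next = buckets[idx];
    buckets[idx] = n;
    return 1;
}

/* Returns 1 if value is stored in its chain, 0 otherwise. */
int chain_contains(int value)
{
    struct node *n;

    for (n = buckets[bucket_of(value)]; n != NULL; n = n->next)
        if (n->value == value)
            return 1;
    return 0;
}

/* Unlink and free the first node holding value. Returns 1 if one was removed. */
int chain_remove(int value)
{
    struct node **link = &buckets[bucket_of(value)];

    while (*link != NULL) {
        if ((*link)->value == value) {
            struct node *dead = *link;
            *link = dead->next;
            free(dead);
            return 1;
        }
        link = &(*link)->next;
    }
    return 0;
}
```

With this layout, the values 10 and 3 both hash to index 3 when the table has 7 slots, yet both can be stored and found because they share one chain.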

A hash function in C can be built using several different methods, including the division method, the multiplication method, and the folding method. The division method takes the remainder of the key divided by the size of the hash table as the index. The multiplication method multiplies the key by a constant between 0 and 1, takes the fractional part of the result, multiplies it by the table size, and rounds down to get the index. The folding method breaks the key into several parts, adds them together, and then reduces the sum (for example, modulo the table size) to get the index.
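
The three methods can be sketched for unsigned integer keys as shown below. The function names are hypothetical, and two details are assumptions made for the example: the multiplication constant (sqrt(5) - 1) / 2, a commonly suggested choice, and folding over groups of two decimal digits. For the multiplication method, the fractional part is scaled by the table size m so that the result is a valid index.

```c
#include <assert.h>

/* Division method: index = key mod m. */
unsigned int hash_division(unsigned int key, unsigned int m)
{
    assert(m > 0);
    return key % m;
}

/* Multiplication method: index = floor(m * frac(key * A)), with 0 < A < 1. */
unsigned int hash_multiplication(unsigned int key, unsigned int m)
{
    const double A = 0.6180339887;  /* assumed constant: (sqrt(5) - 1) / 2 */
    double product = key * A;
    /* product is non-negative, so truncating it gives its integer part */
    double frac = product - (double)(unsigned long)product;

    assert(m > 0);
    return (unsigned int)(m * frac);
}

/* Folding method: split the key into two-digit decimal groups,
   add the groups, then reduce the sum modulo m. */
unsigned int hash_folding(unsigned int key, unsigned int m)
{
    unsigned int sum = 0;

    assert(m > 0);
    while (key > 0) {
        sum += key % 100;   /* lowest two decimal digits */
        key /= 100;
    }
    return sum % m;
}
```

With m = 7 and key 10, the division method gives 10 % 7 = 3, and the multiplication method gives floor(7 * 0.180339887) = 1. Folding the key 123456 gives 56 + 34 + 12 = 102, and 102 % 7 = 4.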

Implementation of a hash table in C using arrays:

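#include <stdio.h>
#define size 7   /* number of slots in the hash table */

/* Each slot holds one value; -1 marks an empty slot,
   so -1 itself cannot be stored as data. */
int array[size];

/* Mark every slot as empty. */
void init(void)
{
    int i;
    for(i = 0; i < size; i++)
        array[i] = -1;
}

/* Division-method hash: map val to a slot index in 0..size-1.
   In C, % can give a negative result for a negative val, so adjust it. */
int hash(int val)
{
    int key = val % size;
    if(key < 0)
        key += size;
    return key;
}

/* Store val if its slot is free; a collision is reported, not resolved. */
void insert(int val)
{
    int key = hash(val);

    if(array[key] == -1)
    {
        array[key] = val;
        printf("%d inserted at array[%d]\n", val, key);
    }
    else
    {
        printf("Collision : array[%d] has element %d already!\n", key, array[key]);
        printf("Unable to insert %d\n", val);
    }
}

/* Empty the slot for val, but only if it really holds val. */
void del(int val)
{
    int key = hash(val);
    if(array[key] == val)
        array[key] = -1;
    else
        printf("%d not present in the hash table\n", val);
}

/* Report whether val is stored in its slot. */
void search(int val)
{
    int key = hash(val);
    if(array[key] == val)
        printf("Search Found\n");
    else
        printf("Search Not Found\n");
}

/* Print every slot, including empty ones (-1). */
void print(void)
{
    int i;
    for(i = 0; i < size; i++)
        printf("array[%d] = %d\n", i, array[i]);
}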

int main()
{
    init();
    insert(10); 
    insert(4);  
    insert(2);
    insert(3); 

    printf("Hash table\n");
    print();
    printf("\n");

    printf("Deleting value 10..\n");
    del(10);
    printf("After the deletion hash table\n");
    print();
    printf("\n");

    printf("Deleting value 5..\n");
    del(5);
    printf("After the deletion hash table\n");
    print();
    printf("\n");

    printf("Searching value 4..\n");
    search(4);
    printf("Searching value 10..\n");
    search(10);

    return 0;
}

Output

10 inserted at array[3]
4 inserted at array[4]
2 inserted at array[2]
Collision : array[3] has element 10 already!
Unable to insert 3
Hash table
array[0] = -1
array[1] = -1
array[2] = 2
array[3] = 10
array[4] = 4
array[5] = -1
array[6] = -1

Deleting value 10..
After the deletion hash table
array[0] = -1
array[1] = -1
array[2] = 2
array[3] = -1
array[4] = 4
array[5] = -1
array[6] = -1

Deleting value 5..
5 not present in the hash table
After the deletion hash table
array[0] = -1
array[1] = -1
array[2] = 2
array[3] = -1
array[4] = 4
array[5] = -1
array[6] = -1

Searching value 4..
Search Found
Searching value 10..
Search Not Found

Hashing is a technique used in computer programming to quickly search and retrieve data from large datasets. In C programming, hashing is often used to implement hash tables or associative arrays. The main uses, advantages, and disadvantages of hashing in C are listed below:

Usage:

Hashing can be used to implement efficient data lookup operations, such as searching for a specific value in a large array or table.
Hashing can be used to implement data structures like hash tables, which provide constant-time lookup, insertion, and deletion on average, provided collisions are kept rare.

Advantages:

Hashing provides fast data retrieval and search times, making it useful for large datasets where performance is a concern.
Hashing is relatively simple to implement in C and can be used to build complex data structures like hash tables or hash maps.
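Hash functions are also used for data security purposes, such as password storage and integrity checks. These uses rely on cryptographic hash functions, which are different from the simple functions used for hash tables, and hashing is not the same as encryption, because a hash cannot be reversed to recover the original data.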

Disadvantages:

Hashing collisions can occur, which can lead to reduced performance and longer search times.
Hashing requires a good hash function that can evenly distribute the data across the hash table. Creating a good hash function can be challenging and time-consuming.
Hash tables can consume extra memory, because the array is usually kept larger than the number of stored items to keep collisions rare, and chaining adds a pointer for every stored item.
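
To make the point about distribution concrete, the sketch below counts collisions for a set of keys placed with the division method. The function name count_collisions and the limit MAX_SLOTS are hypothetical, and keys are assumed to be non-negative.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SLOTS 64   /* assumed upper bound on the table size in this sketch */

/* Counts how many keys land in a slot that an earlier key already used
   when indices are computed as key % table_size. */
size_t count_collisions(const int *keys, size_t n, int table_size)
{
    int used[MAX_SLOTS] = {0};   /* 0 = slot free, 1 = slot taken */
    size_t collisions = 0;
    size_t i;

    assert(table_size > 0 && table_size <= MAX_SLOTS);
    for (i = 0; i < n; i++) {
        int idx = keys[i] % table_size;
        assert(idx >= 0);        /* holds for non-negative keys */
        if (used[idx])
            collisions++;
        else
            used[idx] = 1;
    }
    return collisions;
}
```

For the keys 10, 20, 30, 40, 50, and 60, a table of size 10 sends every key to index 0, giving 5 collisions, while a table of size 7 spreads them over indices 3, 6, 2, 5, 1, and 4 with no collisions. This is one reason a prime table size is often preferred with the division method.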

In summary, hashing is a useful technique for quickly searching and retrieving data in large datasets, but it has some limitations such as collisions, the need for a good hash function, and high memory consumption.

Conclusion:

Hashing in C is a powerful technique that allows for efficient searching, retrieval, and comparison of data within large data sets. It involves creating a hash function that maps input data to a fixed-size hash value, which is then reduced to an index within a hash table to store the data. By using hashing, programmers can improve the performance of algorithms that search large data sets, usually at the cost of some extra memory for the table.
