Hash Function in Data Structure

By some industry estimates, roughly 150 zettabytes of data are generated each year, which is equivalent to 150 trillion gigabytes. With data growing this fast, it has to be stored effectively and efficiently, meaning in a way that lets us retrieve it in minimal time, because the longer an operation takes, the more it costs. To reduce the cost of an operation, we therefore need to reduce the time it takes to retrieve the data it works on. Hash functions and hash tables are a common solution to this problem.

A hash function maps a data item to a hash value, and that hash value is then used as an index (or key) for storing the item in a hash table. The main benefit of a hash table is that the data stored in it can be read in constant time on average, that is, with a time complexity of O(1), provided the hash function spreads the items evenly and collisions are rare. In this way, hash tables drastically reduce the time required to read data. A hash table needs a hash function to work, so let us see what a hash function is and how it works.

A hash function can be defined as an algorithm that converts data of arbitrary size or length into a hash value that lies within a fixed range. The input passed to a hash function is the data item that needs to be mapped to a hash value.
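To make the idea of mapping inputs of any length to a fixed range concrete, here is a minimal Java sketch. It assumes string keys and a table of ten slots; the class name, the method name and the table size are hypothetical choices made only for this illustration.

```java
class HashIndexDemo {
    // Hypothetical table size chosen for illustration.
    static final int TABLE_SIZE = 10;

    // Maps a key of any length to an index in the range [0, TABLE_SIZE).
    static int indexFor(String key) {
        // String.hashCode() may be negative, so Math.floorMod is used
        // to keep the resulting index non-negative.
        return Math.floorMod(key.hashCode(), TABLE_SIZE);
    }

    public static void main(String[] args) {
        String[] keys = {"a", "hash", "a much longer key than the others"};
        for (String key : keys) {
            System.out.println("\"" + key + "\" -> slot " + indexFor(key));
        }
    }
}
```

However long the key is, the result always falls between 0 and 9, which is what allows it to be used as an index into a ten-slot table.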
The output of a hash function is the hash value (or hash) associated with that input. Hash functions work together with hash tables: the hash table actually stores the data in memory, and the hash function is used only to map values to positions in the table. The hash value returned for a data item is used as the index at which that item is stored, which is what makes later retrieval of the item easy and efficient.

An ideal hash function should satisfy two basic properties so that it delivers good results within a bounded computation time. These two properties are commonly stated as follows: the function should be quick to compute, and it should distribute the input values evenly across the table so that collisions are rare.
These two conditions need to be satisfied by a hash function when it generates hash values in order for it to remain efficient.

In conjunction with hash tables, hash functions are used to store and retrieve data items or records. The hash function translates each datum or record associated with a key into a hash code, which is used to index the hash table. When an item is to be added to the table, the hash code may index an empty slot (also called a bucket), in which case the item is added to the table there.

The way occupied slots are handled gives rise to different types of hashing. In this article, we look at two main types, each with its own benefits and drawbacks: the chained hashing method and the open address hashing method.

In chained hashing, each slot of the hash table acts as the head node of a linked list that holds the elements whose hash value is that slot's index. If the head node at that index is empty, the data is stored there; if some data is already present, the new data is appended after it. In short, the indexes of the hash table act as the head nodes of linked lists. For instance, if a hash table has ten indexes, from 0 (zero) to 9 (nine), then it holds ten separate linked lists, with the head node of each stored at one index of the table. The hash function is then used to decide which of these linked lists each value is stored in. The major benefit of chained hashing is that it can store any amount of data.
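The chained scheme described above can be sketched in Java roughly as follows. This is only an illustration: it assumes integer keys and a modulus hash function, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class ChainedHashTable {
    // One linked list (chain) per slot of the table.
    private final List<LinkedList<Integer>> slots = new ArrayList<>();

    ChainedHashTable(int size) {
        for (int i = 0; i < size; i++) {
            slots.add(new LinkedList<>());
        }
    }

    // Assumed hash function for this sketch: key modulo the table size.
    private int hash(int key) {
        return Math.floorMod(key, slots.size());
    }

    // Appends the key to the end of the chain at its slot.
    void insert(int key) {
        slots.get(hash(key)).addLast(key);
    }

    // Walks only the chain at the key's slot, not the whole table.
    boolean contains(int key) {
        return slots.get(hash(key)).contains(key);
    }

    int chainLength(int index) {
        return slots.get(index).size();
    }

    public static void main(String[] args) {
        ChainedHashTable table = new ChainedHashTable(10);
        for (int key : new int[] {21, 31, 41, 7}) {
            table.insert(key);
        }
        // 21, 31 and 41 all hash to slot 1, so that chain holds three keys.
        System.out.println("Length of chain at slot 1: " + table.chainLength(1));
        System.out.println("Contains 31? " + table.contains(31));
    }
}
```

Because colliding keys are simply appended to a chain, the table never fills up, but a lookup has to walk the chain at its slot.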
To store more data, we simply append it to the end of the linked list at the index that the hash function returns for it. However, storing more data in a chained hash table reduces its search and retrieval efficiency. For instance, if the linked list at index 1 holds n elements, then searching for or retrieving the last element of that list takes O(n) time, which is far more than the time needed when an item is found directly in its own slot, as in an open addressing hash table without collisions.

In open addressing hash tables, the hash value is calculated and the input data is placed in the table itself, at the index returned by the hash function. If that slot is already occupied, open addressing schemes typically probe other slots of the table until a free one is found. The major difference between the two methods is that chained hashing can hold any amount of data, whereas an open addressing table can hold at most as many items as it has indexes. For instance, a hash table with ten indexes, from 0 (zero) to 9 (nine), can store only ten data items. One of the major benefits of open addressing hash tables is that, as long as collisions are rare, retrieving stored data takes constant time.

Hash functions themselves also come in different types, depending on the computation used to produce the hash value. Commonly taught types include the division (modulus) method, the mid-square method, the folding method and the multiplication method.
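As a rough illustration, four hash functions that are commonly taught (the division, mid-square, folding and multiplication methods) could be sketched in Java as below. The class name, the choice of two-digit groups, and the constant used in the multiplication method are assumptions made for this sketch; textbooks and real implementations differ in these details.

```java
class HashMethods {
    // Division method: the remainder after dividing the key by the table size.
    static int division(int key, int tableSize) {
        return Math.floorMod(key, tableSize);
    }

    // Mid-square method: square the key, take (up to) two digits from the
    // middle of the square, then reduce the result to the table size.
    static int midSquare(int key, int tableSize) {
        String digits = Long.toString((long) key * key);
        int start = Math.max(0, digits.length() / 2 - 1);
        int end = Math.min(digits.length(), start + 2);
        return Integer.parseInt(digits.substring(start, end)) % tableSize;
    }

    // Folding method: split the decimal digits into groups of two,
    // add the groups together, then reduce the sum to the table size.
    static int folding(int key, int tableSize) {
        long remaining = Math.abs((long) key);
        long sum = 0;
        while (remaining > 0) {
            sum += remaining % 100; // take two digits from the right
            remaining /= 100;
        }
        return (int) (sum % tableSize);
    }

    // Multiplication method: multiply the key by a constant A with 0 < A < 1,
    // keep the fractional part, and scale it by the table size.
    static int multiplication(int key, int tableSize) {
        double a = 0.6180339887; // (sqrt(5) - 1) / 2, a commonly suggested constant
        double product = Math.abs((long) key) * a;
        double fractional = product - Math.floor(product);
        return (int) (tableSize * fractional);
    }

    public static void main(String[] args) {
        System.out.println("division(493, 10)       = " + division(493, 10));
        System.out.println("midSquare(60, 100)      = " + midSquare(60, 100));
        System.out.println("folding(123456, 10)     = " + folding(123456, 10));
        System.out.println("multiplication(123, 10) = " + multiplication(123, 10));
    }
}
```

All four functions return an index between 0 and tableSize - 1; they differ only in how the bits or digits of the key are mixed before that final reduction.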
Apart from the hash functions mentioned above, users can implement any hashing logic they want and create a hash function that suits their needs.

Let us understand hashing, and the role the hash function plays in it, with the help of an example. Assume we have a hash table with ten slots, indexed from zero (0) to nine (9). The hash function in this example is the modulus hash function: the input data passed to the hash function undergoes a modulus operation (here, modulo the table size of 10), and the result is returned as the index, or slot key, at which that input data is stored in the hash table. Initially, all the slots in the hash table are empty:

Slot   0    1    2    3    4    5    6    7    8    9
Data   -    -    -    -    -    -    -    -    -    -
Now let us assume the input data is 25. We pass this input data to the hash function, which performs the modulus operation: 25 mod 10 = 5. So the hash value returned for the input data 25 is 5, and the value 25 is stored in slot number 5 of the hash table. After adding the data at slot number 5, the hash table looks like this:

Slot   0    1    2    3    4    5    6    7    8    9
Data   -    -    -    -    -    25   -    -    -    -
Next, let us assume the input data is 1. We pass this input data to the hash function, which performs the modulus operation: 1 mod 10 = 1. So the hash value returned for the input data 1 is 1, and the value 1 is stored in slot number 1 of the hash table. After adding the data at slot number 1, the hash table looks like this:

Slot   0    1    2    3    4    5    6    7    8    9
Data   -    1    -    -    -    25   -    -    -    -
Next, let us assume the input data is 493. We pass this input data to the hash function, which performs the modulus operation: 493 mod 10 = 3. So the hash value returned for the input data 493 is 3, and the value 493 is stored in slot number 3 of the hash table. After adding the data at slot number 3, the hash table looks like this:

Slot   0    1    2    3    4    5    6    7    8    9
Data   -    1    -    493  -    25   -    -    -    -
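The three insertions walked through above can be reproduced with a short Java sketch. It assumes non-negative integer keys and a table of ten slots; the class and method names are hypothetical, and reporting an occupied slot by returning false is just one possible choice for this illustration.

```java
import java.util.Arrays;

class ModulusTableDemo {
    static final int SLOTS = 10;

    // Modulus hash function from the example; assumes non-negative keys.
    static int hash(int key) {
        return key % SLOTS;
    }

    // Stores the key at the slot chosen by the hash function.
    // Returns false, without storing anything, if that slot is already taken.
    static boolean insert(Integer[] table, int key) {
        int slot = hash(key);
        if (table[slot] != null) {
            return false;
        }
        table[slot] = key;
        return true;
    }

    public static void main(String[] args) {
        Integer[] table = new Integer[SLOTS]; // every slot starts out empty (null)
        for (int key : new int[] {25, 1, 493}) {
            insert(table, key);
            System.out.println("After inserting " + key + ": " + Arrays.toString(table));
        }
    }
}
```

Empty slots are printed as null, so each printed line corresponds to one of the tables in the walkthrough.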
After these three insertions, slots 1, 3 and 5 of the hash table hold 1, 493 and 25, and the remaining slots are empty. Now let us assume the input data is 975. We pass this input data to the hash function, which performs the modulus operation: 975 mod 10 = 5. So the hash value returned for the input data 975 is 5, and the value 975 should be stored in slot number 5 of the hash table. But slot number 5 is already occupied by the value 25. This situation is called a collision. In a full open addressing scheme, the collision would be resolved by probing for another free slot, but the table can still never hold more items than it has slots. This is the constraint of the open addressing hashing technique: only a limited amount of data can be stored in the hash table.

Once the data is stored, the next step is to retrieve it from the hash table. The search operation uses the same hash function that was used for storage. The search key is passed to the hash function, the slot number or index is calculated, and the data at that index is fetched and compared with the search key. If the search key and the fetched data match, the search is successful; otherwise, it is unsuccessful.

For real-life applications of hashing, we need to implement hash tables and hash functions programmatically. To understand how a hash table can be coded with the help of a hash function, let us look at a sample Java program that simulates the functioning of a hash table.

Code:

Output:

Please Choose one of the Operations::
1. To Insert Data in the Hash Table.
2. To Insert Data from the Hash Table.
3. To Search Data in the Hash Table.
4. To Remove or Delete Data From the Hash Table.
1
Enter the key and value that you want to add to the Hash Table::
0 10
Data Added Successfully.
Type [N or n] to terminate the program.
Type [Y or y] to continue the program.
Y
Please Choose one of the Operations::
1. To Insert Data in the Hash Table.
2. To Insert Data from the Hash Table.
3. To Search Data in the Hash Table.
4. To Remove or Delete Data From the Hash Table.
1
Enter the key and value that you want to add to the Hash Table::
1 85
Data Added Successfully.
Type [N or n] to terminate the program.
Type [Y or y] to continue the program.
Y
Please Choose one of the Operations::
1. To Insert Data in the Hash Table.
2. To Insert Data from the Hash Table.
3. To Search Data in the Hash Table.
4. To Remove or Delete Data From the Hash Table.
1
Enter the key and value that you want to add to the Hash Table::
3 47
Data Added Successfully.
Type [N or n] to terminate the program.
Type [Y or y] to continue the program.
Y
Please Choose one of the Operations::
1. To Insert Data in the Hash Table.
2. To Insert Data from the Hash Table.
3. To Search Data in the Hash Table.
4. To Remove or Delete Data From the Hash Table.
1
Enter the key and value that you want to add to the Hash Table::
7 149
Data Added Successfully.
Type [N or n] to terminate the program.
Type [Y or y] to continue the program.
Y
Please Choose one of the Operations::
1. To Insert Data in the Hash Table.
2. To Insert Data from the Hash Table.
3. To Search Data in the Hash Table.
4. To Remove or Delete Data From the Hash Table.
2
Contents of the hash table are::
Key Associated Value
0 10
1 85
3 47
7 149
Type [N or n] to terminate the program.
Type [Y or y] to continue the program.
Y
Please Choose one of the Operations::
1. To Insert Data in the Hash Table.
2. To Insert Data from the Hash Table.
3. To Search Data in the Hash Table.
4. To Remove or Delete Data From the Hash Table.
3
Enter key for which the data you want::
3
The value associated with the entered key is = 47
Type [N or n] to terminate the program.
Type [Y or y] to continue the program.
Y
Please Choose one of the Operations::
1. To Insert Data in the Hash Table.
2. To Insert Data from the Hash Table.
3. To Search Data in the Hash Table.
4. To Remove or Delete Data From the Hash Table.
4
Enter the key that you want to Delete::
1
Data deleted Successfully.
Type [N or n] to terminate the program.
Type [Y or y] to continue the program.
Y
Please Choose one of the Operations::
1. To Insert Data in the Hash Table.
2. To Insert Data from the Hash Table.
3. To Search Data in the Hash Table.
4. To Remove or Delete Data From the Hash Table.
2
Contents of the hash table are::
Key Associated Value
0 10
3 47
7 149
Type [N or n] to terminate the program.
Type [Y or y] to continue the program.
N

In the run shown above, we first added data to the hash table four times and then confirmed the insertions by printing the contents of the hash table. After that, we searched the hash table for a value by its search key and printed the result. Finally, a delete operation was performed to remove one entry (key 1) from the hash table, and the updated hash table was printed by calling the print_data() function.

Hash tables can also be implemented in other languages such as C++, Python and JavaScript. Let us see a sample C++ program that performs all the basic operations on a hash table.

C++ Code:

Output:

Please Choose one of the Operations::
1. To Insert Data in the Hash Table.
2. To Insert Data from the Hash Table.
3. To Search Data in the Hash Table.
4. To Remove or Delete Data From the Hash Table.
1
Enter the key and value that you want to add to the Hash Table::
101 56
Data Added Successfully.
Type [N or n] to terminate the program.
Type [Y or y] to continue the program.
Y
Please Choose one of the Operations::
1. To Insert Data in the Hash Table.
2. To Insert Data from the Hash Table.
3. To Search Data in the Hash Table.
4. To Remove or Delete Data From the Hash Table.
1
Enter the key and value that you want to add to the Hash Table::
102 87
Data Added Successfully.
Type [N or n] to terminate the program.
Type [Y or y] to continue the program.
Y
Please Choose one of the Operations::
1. To Insert Data in the Hash Table.
2. To Insert Data from the Hash Table.
3. To Search Data in the Hash Table.
4. To Remove or Delete Data From the Hash Table.
1
Enter the key and value that you want to add to the Hash Table::
104 97
Data Added Successfully.
Type [N or n] to terminate the program.
Type [Y or y] to continue the program.
y
Please Choose one of the Operations::
1. To Insert Data in the Hash Table.
2. To Insert Data from the Hash Table.
3. To Search Data in the Hash Table.
4. To Remove or Delete Data From the Hash Table.
2
Contents of the hash table are::
Key Associated Value
101 56
102 87
104 97
Type [N or n] to terminate the program.
Type [Y or y] to continue the program.
y
Please Choose one of the Operations::
1. To Insert Data in the Hash Table.
2. To Insert Data from the Hash Table.
3. To Search Data in the Hash Table.
4. To Remove or Delete Data From the Hash Table.
3
Enter key for which the data you want::
102
The value associated with the entered key is = 87
Type [N or n] to terminate the program.
Type [Y or y] to continue the program.
y
Please Choose one of the Operations::
1. To Insert Data in the Hash Table.
2. To Insert Data from the Hash Table.
3. To Search Data in the Hash Table.
4. To Remove or Delete Data From the Hash Table.
4
Enter the key that you want to Delete::
101
Data deleted Successfully.
Type [N or n] to terminate the program.
Type [Y or y] to continue the program.
y
Please Choose one of the Operations::
1. To Insert Data in the Hash Table.
2. To Insert Data from the Hash Table.
3. To Search Data in the Hash Table.
4. To Remove or Delete Data From the Hash Table.
2
Contents of the hash table are::
Key Associated Value
102 87
104 97
Type [N or n] to terminate the program.
Type [Y or y] to continue the program.
n

This program follows the same sequence of operations: first the insertions, then the search, and finally the delete operation. The result of each operation is verified by printing the contents of the hash table after the operation completes.

Because inserting and searching data in a hash table take constant time on average, hash tables are used in many areas of computing. Common examples include database indexing, caches, symbol tables in compilers, and the associative arrays (maps, dictionaries and sets) built into many programming languages.
This article gave a brief overview of what a hash function is, how a hash function is used to add data to a hash table, what the major benefits and use cases of hash tables are, and how a hash table can be implemented in programming languages such as Java and C++.
