Hashing Algorithm in Python

An Introduction

Hashing is a crucial concept in computer science and cryptography. It refers to taking input data, also known as a message, and applying a mathematical function or algorithm to it. This process generates a fixed-size value, usually displayed as a hexadecimal string, known as the hash value.

  • Input: Any text
  • Output: Fixed-length text (hash value)

Properties of Good Hash Functions

  • Non-reversibility: It is vital for cryptographic applications that the input that generates a particular output is difficult to determine.
  • Fixed-size output: The function produces an output of the same size regardless of the input size, which makes hash values easy to compare and store.
  • Deterministic: A hash function must be deterministic, meaning the output should always be the same for the same input.

Examples of hash functions

Some popular hash functions include:

  • SHA-1
  • SHA-2
  • SHA-3
  • MD5
  • Blake2
  • BLAKE3

Hash functions in Python

Python supports hashing in two ways: the built-in hash() function and the hashlib module described later.

Python's hash() function returns a hash value (a fixed-size integer) for an object. Equal objects must have equal hash values, but the value is not unique, because different objects can share the same hash. Data structures such as sets and dictionaries use this value to compare and look up objects quickly.

The hash() function works only on hashable objects. Mutable built-in containers such as lists, dictionaries, and sets cannot be hashed, so it is important to understand which objects can be hashed in Python.

Hashable objects

  1. Strings
  2. Integers (as well as other numeric types such as float and complex)
  3. Tuples (if all of their elements are hashable)
  4. Frozen sets
  5. Bytes (but not bytearray, which is mutable and therefore unhashable)
  6. Custom objects with a properly implemented __hash__() method
  7. Other immutable built-in values, such as the bool constants True and False, and None

Unhashable objects

The following objects cannot be hashed. The built-in ones are mutable, which means that they can be changed after they are created:

  1. Lists
  2. Dictionaries
  3. Sets
  4. Bytearrays
  5. Instances of user-defined classes that define __eq__() without also defining __hash__()
  6. Custom objects whose __hash__ attribute is set to None or whose __hash__() method raises an exception

Functions, file objects, network sockets, and instances of ordinary user-defined classes are hashable by default, even when they have mutable attributes. Their default hash is based on object identity rather than on their contents.

Example 1:

Let us consider an example.

Program
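
The original listing is not shown, so the following is a minimal sketch of what such a program could look like. The three input values are assumptions chosen for illustration, so the numbers it prints will differ from the output listed here.

```python
# Hypothetical inputs: one string, one tuple, and one float.
text = "Hello, Python"
point = (1, 2, 3)
ratio = 3.14

# hash() returns an integer for any hashable object.
print("The hash value for string", hash(text))
print("The hash value for tuple", hash(point))
print("The hash value for float", hash(ratio))
```

Within a single run, calling hash() twice on equal objects always returns the same integer.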

Output:

The hash value for string -8608911714887531465
The hash value for tuple 590899387183067792
The hash value for float 922337203685490786

Explanation:

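The built-in hash() function takes the object and returns an integer hash value. By default, Python randomizes the hashing of strings and bytes for each process, so the value for a string changes from run to run unless the PYTHONHASHSEED environment variable is set. Hash values of numbers stay the same between runs.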

Example 2:

Let us consider a program in which hash() cannot produce a hash value because the object is unhashable.

Program
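
A minimal sketch of such a program is given here, assuming a small list as the unhashable object. Unlike the original run, this version catches the exception so that the script keeps going; without the try/except block, Python prints a traceback like the one listed in the output.

```python
# A list is mutable, so Python refuses to hash it.
my_list = [1, 2, 3]

try:
    hash(my_list)
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)

# Converting the list to a tuple gives a hashable equivalent.
print(hash(tuple(my_list)))
```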

Output:

ERROR!
Traceback (most recent call last):
	File "<string>", line 4, in <module>
TypeError: unhashable type: 'list'

Explanation:

The hash() function raises a TypeError for mutable objects such as lists, dictionaries, and sets.

Example 3:
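
The listing for this example is missing, so here is a sketch of a class that fits the explanation. The class name MyHashableClass and the value 89 come from the explanation; the bodies of __eq__() and __hash__() are assumptions, and the number printed depends on how __hash__() is written, so it need not match the output listed.

```python
class MyHashableClass:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        # Two instances are equal when they hold the same value.
        return isinstance(other, MyHashableClass) and self.value == other.value

    def __hash__(self):
        # Assumed implementation: equal objects must return equal hashes,
        # so the hash is derived from the same field that __eq__ compares.
        return hash(self.value)


obj = MyHashableClass(89)
print(hash(obj))
```

Because __eq__() and __hash__() agree, two instances holding the same value count as one element in a set or one key in a dictionary.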

Output:

1844674407370948697

Explanation:

The code defines a custom class named MyHashableClass and creates an instance of that class with the value 89. It then calculates and prints the hash value of that instance using the hash() function.

Hashlib library

Python offers a range of hashing algorithms that one can use based on their requirements. The standard library provides cryptographic hash functions in the "hashlib" module, which can be used for hashing data. Let's quickly look at some commonly used hashing algorithms and learn how to use them.

SHA-1 (Secure Hash Algorithm 1):

SHA-1 produces a 160-bit hash value and was commonly used in the past. However, practical collision attacks against it have been demonstrated, so it is now considered deprecated for cryptographic purposes.

Example program
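
The original listing is not shown. A minimal sketch using hashlib.sha1() follows; the input string is a placeholder chosen for illustration, so the digest it prints will not match the output listed.

```python
import hashlib

# Hypothetical input; hashlib works on bytes, so the text is encoded first.
message = "Hello, SHA-1!"
sha1_hash = hashlib.sha1(message.encode("utf-8")).hexdigest()

print(sha1_hash)
```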

Output:

1477e90a7106add0a379f738f823e4c810cfceec

Explanation:

SHA-1 takes the input and produces a 160-bit output, shown here as 40 hexadecimal characters.

SHA-256 (Secure Hash Algorithm 256):

SHA-256 is part of the SHA-2 family and produces a 256-bit hash value. It is suitable for most cryptographic applications.

SHA-256 is one of the most widely used hashing algorithms and, unlike MD5, has no known practical collision attacks.

Example program
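
A sketch of how such a program might be written is given here. It assumes the input is the text "SHA 256" mentioned in the explanation; whether the digest matches the output listed depends on the exact bytes the original program used. The sketch also feeds the data in two pieces to show that update() can be called repeatedly.

```python
import hashlib

# Feed the data in pieces; this is useful for large files or streams.
hasher = hashlib.sha256()
hasher.update(b"SHA ")
hasher.update(b"256")
sha256_hash = hasher.hexdigest()

# Hashing the whole input in one call gives the same digest.
one_shot = hashlib.sha256(b"SHA 256").hexdigest()

print(sha256_hash)
print(sha256_hash == one_shot)
```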

Output:

0d5bad3f01155a5ec3e352d2925eee7700af0225f7891beccd1dc1ddef50393f

Explanation:

The SHA-256 function has taken the input "SHA 256" and produced a 256-bit output in hexadecimal format.

SHA-3 (Secure Hash Algorithm 3):

SHA-3 is the newest family of Secure Hash Algorithms. It provides robust security and is built on a different internal design (the Keccak sponge construction) than SHA-2. It is available in several output lengths, such as SHA3-256 for a 256-bit hash.

Example program
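
The following sketch uses hashlib.sha3_256(), assuming the input "SHA 3" named in the explanation. The exact bytes of the original input are not shown, so the digest may differ from the output listed.

```python
import hashlib

data = "SHA 3".encode("utf-8")

# sha3_256 produces a 256-bit (32-byte) digest.
sha3_hash = hashlib.sha3_256(data).hexdigest()
print(sha3_hash)

# SHA-2 and SHA-3 give unrelated digests for the same input.
print(sha3_hash == hashlib.sha256(data).hexdigest())
```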

Output:

367f56e5e185665949cf91a86f88058ba5e724a0ddc7f91ef85a50b893ff828c

Explanation:

The SHA-3 function has taken the input "SHA 3" and produced a 256-bit output in hexadecimal format.

MD5 (Message Digest 5):

MD5 produces a 128-bit hash value and is commonly used for checksums. However, it is not recommended for cryptographic purposes because collisions can be produced easily.

Example program
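
A minimal sketch with hashlib.md5() is shown here. It assumes the input is exactly the lowercase text "message digest"; a different capitalization or a trailing newline yields a completely different digest, which would explain any mismatch with the output listed.

```python
import hashlib

data = "message digest".encode("utf-8")

# MD5 is fine for non-security checksums, not for cryptographic use.
md5_hash = hashlib.md5(data).hexdigest()
print(md5_hash)
```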

Output:

9b10c9985311d8a19afc271140d7258e

Explanation:

The MD5() function takes the input "message digest" and generates a 128-bit hash value.

SHA-384:

SHA-384 is a member of the Secure Hash Algorithm 2 (SHA-2) family and is designed for cryptographic security. It generates a 384-bit (48-byte) hash value, making it a reliable hash function for data integrity verification, digital signatures, and other cryptographic applications.

Example Program
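
The sketch here assumes the input "SHA384 Hashing" from the explanation; the original bytes are not shown, so the digest may not match the output listed. It also illustrates the integrity check mentioned earlier by comparing a stored digest with a freshly computed one. The variable names are placeholders.

```python
import hashlib
import hmac

data = "SHA384 Hashing".encode("utf-8")
sha384_hash = hashlib.sha384(data).hexdigest()
print(sha384_hash)

# Integrity check: recompute the digest of the received data and compare.
received = "SHA384 Hashing".encode("utf-8")
is_intact = hmac.compare_digest(hashlib.sha384(received).hexdigest(), sha384_hash)
print("Data intact:", is_intact)
```

hmac.compare_digest() is used instead of == so that the comparison takes the same time whether or not the digests match.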

Output:

7bc670b88b26aabf94f6102c66ff3a28e6580174addb091675a3e034ea06968abe428c477a674a832d0ae44177398d76

Explanation:

The SHA-384 function takes the input "SHA384 Hashing" and generates a 384-bit (48-byte) hash.

SHA-224:

The SHA-224 function belongs to the SHA-2 family of hash functions and generates a 224-bit (28-byte) hash value. It is a secure hash function suitable for data integrity checks and other cryptographic purposes. It fits situations where a shorter hash is sufficient and a 256-bit or 512-bit hash would be longer than needed.

To calculate the SHA-224 hash of a string, here is an example:

Example Program
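
A short sketch using hashlib.sha224() follows, with the input "Hello, world!" taken from the explanation and assumed to be UTF-8 encoded.

```python
import hashlib

text = "Hello, world!"

# sha224 returns a 224-bit digest, i.e. 56 hexadecimal characters.
sha224_hash = hashlib.sha224(text.encode("utf-8")).hexdigest()
print(sha224_hash)
```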

Output:

8552d8b7a7dc5476cb9e25dee69a8091290764b7f2a64fe6e78e9568

Explanation:

The SHA-224 function takes the input "Hello, world!" and produces a 224-bit output.

SHA-512:

SHA-512 (Secure Hash Algorithm 512) is a member of the SHA-2 family that converts input of any length into a fixed-size output of 512 bits (64 bytes). It is a secure and widely used hash function for various cryptographic and security applications.

Example Program
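
The sketch here assumes the input "GOOD" named in the explanation. It also shows the two ways of reading the result: digest() returns raw bytes, while hexdigest() returns the same value as hexadecimal text.

```python
import hashlib

hasher = hashlib.sha512("GOOD".encode("utf-8"))

raw_digest = hasher.digest()        # 64 raw bytes
sha512_hash = hasher.hexdigest()    # 128 hexadecimal characters

print(sha512_hash)
print(len(raw_digest), "bytes")
```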

Output:

9592e8c12960603d57a43f5367177c44f0f8c08e4040a9cf380cf6cdef35b4cfbfa01eed16fea0c18206994afeb29d1091e658a310d16cd906d94d84a88a036b

Explanation:

The SHA-512 function takes the input "GOOD" and produces a 512-bit output.

Blake2

Blake2 is a fast and secure cryptographic hash function that can produce hash outputs of different lengths. Its two main variants, Blake2b and Blake2s, have different maximum output lengths.

Blake2b: This variant can generate hash outputs of any length from 1 to 64 bytes, such as 256 bits (32 bytes), 384 bits (48 bytes), and 512 bits (64 bytes). The desired output length is specified with the digest_size argument when creating the Blake2b hash object.

Example Program
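
The listing is missing, so the following sketch is written to fit the explanation: three hashlib.blake2b objects with digest sizes of 32, 48, and 64 bytes, each updated with "Hello, Blake2b!". The loop and the dictionary are choices made for this sketch, not necessarily how the original was structured.

```python
import hashlib

data = "Hello, Blake2b!".encode("utf-8")

# Map output length in bits to the resulting hexadecimal digest.
blake2b_hashes = {}
for digest_size in (32, 48, 64):  # sizes in bytes
    hasher = hashlib.blake2b(digest_size=digest_size)
    hasher.update(data)
    blake2b_hashes[digest_size * 8] = hasher.hexdigest()

for bits, value in blake2b_hashes.items():
    print(f"Blake2b {bits}-bit Hash: {value}")
```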

Output:

Blake2b 256-bit Hash: 4fa69a156f24d1f62e01757dcd998048ae036aed66cf72f002ca28143cddfc8f
Blake2b 384-bit Hash: 537747f30e72fe46f91ded3e6a33953ecbc708408ca839ebfd1a2074b3a03cb13abf399d87ca0e95a6a67c52cab4969e
Blake2b 512-bit Hash: 7c8f1ef9d911109531ddd5a990178c6568efe091c3f97648ece2c0ca0a526652074c7f3e13f23ea098786423485da832f4e60f912bfc179e8782a997df27e78c

Explanation:

In this program, we create three instances of Blake2b hash objects, each with a different output length: 256 bits, 384 bits, and 512 bits. We then provide an input string "Hello, Blake2b!" encoded in UTF-8 to update each hash object. After that, we obtain the hexadecimal representation of the hash values for all the objects. Finally, we print the hash values associated with each output length.

Blake2s: This variant is designed for shorter hash lengths of up to 32 bytes. Typical output lengths are 128 bits (16 bytes) and 256 bits (32 bytes). When creating a Blake2s hash object, you can specify the desired output length with the digest_size argument.

Let us consider a simple program that produces different length outputs.

Example Program
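
A sketch matching the explanation is given here: two hashlib.blake2s objects with digest sizes of 16 and 32 bytes, both updated with "Hello, Blake2s!". The variable names are placeholders.

```python
import hashlib

data = "Hello, Blake2s!".encode("utf-8")

# 128-bit variant: digest_size is given in bytes.
hasher_128 = hashlib.blake2s(digest_size=16)
hasher_128.update(data)
blake2s_128 = hasher_128.hexdigest()

# 256-bit variant: 32 bytes is the maximum for Blake2s.
hasher_256 = hashlib.blake2s(digest_size=32)
hasher_256.update(data)
blake2s_256 = hasher_256.hexdigest()

print("Blake2s 128-bit Hash:", blake2s_128)
print("Blake2s 256-bit Hash:", blake2s_256)
```

The shorter digest is not a prefix of the longer one, because the digest size is part of the Blake2 parameters that feed into the computation.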

Output:

Blake2s 128-bit Hash: d19972b915240fb371cf10c264473eaf
Blake2s 256-bit Hash: 98fe241a6c678f49d0513a0cc7c0f18ab8dff75ef0733d70440c7ca5d9a992eb

Explanation:

In this program, we utilize the hashlib.blake2s function to generate two Blake2s hash objects with different output lengths, i.e., 128 bits (16 bytes) and 256 bits (32 bytes). We then update both hash objects with the same input string, "Hello, Blake2s!" that is UTF-8 encoded. Finally, we acquire the hexadecimal representations of the hash values for both the 128-bit and 256-bit versions and print them.

RIPEMD

RIPEMD, which stands for RACE Integrity Primitives Evaluation Message Digest, is a group of cryptographic hash functions utilized for various security and cryptographic applications. The most widely known versions of RIPEMD are RIPEMD-160 and RIPEMD-128.

RIPEMD-160: RIPEMD-160 produces a 160-bit hash value. It is used for data integrity checks and in cryptocurrencies such as Bitcoin.

Example Program
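
The sketch here follows the steps in the explanation using hashlib.new('ripemd160'). The input string is a placeholder, so the digest will not match the output listed. RIPEMD-160 comes from the OpenSSL library behind hashlib and is missing from some builds, so the sketch guards the call; that guard is an addition for robustness.

```python
import hashlib

data = "Hello, RIPEMD-160!".encode("utf-8")  # hypothetical input

try:
    hasher = hashlib.new("ripemd160")
    hasher.update(data)
    ripemd160_hash = hasher.hexdigest()
    print("RIPEMD-160 Hash:", ripemd160_hash)
except ValueError:
    # Raised when the OpenSSL build does not provide the algorithm.
    ripemd160_hash = None
    print("ripemd160 is not available in this Python build")
```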

Output:

RIPEMD-160 Hash: 2704a6e10487112bb98f615643b3c689c331f39c

Explanation:

In this example, we first import the hashlib library. Then, we create a RIPEMD-160 hash object by using hashlib.new('ripemd160'). Next, we update the hash object with the data we want to hash. Finally, we retrieve the hexadecimal representation of the hash using the hexdigest() method.

RIPEMD-128: RIPEMD-128 produces a 128-bit hash value but is less commonly used and less widely supported than RIPEMD-160.

Example Program
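
A guarded sketch of the same steps for RIPEMD-128 is given here. Whether hashlib.new('ripemd128') succeeds depends on the OpenSSL build behind hashlib, and many builds do not include this algorithm, so the helper name try_hash and its fallback are assumptions added for this sketch.

```python
import hashlib


def try_hash(algorithm, data):
    """Return the hex digest, or None if the algorithm is unavailable."""
    try:
        hasher = hashlib.new(algorithm)
    except ValueError:
        return None
    hasher.update(data)
    return hasher.hexdigest()


ripemd128_hash = try_hash("ripemd128", b"Hello, RIPEMD-128!")  # hypothetical input

if ripemd128_hash is None:
    print("ripemd128 is not available in this Python build")
else:
    print(ripemd128_hash)
```

Checking hashlib.algorithms_available beforehand is another way to see which names the current build accepts.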

Output:

e999f30fc21ab12ab9ed89f55ba6a1de

Explanation:

The program begins with the import of the hashlib library. After that, we create a hash object of RIPEMD-128 using hashlib.new('ripemd128'). This call succeeds only if the OpenSSL build behind hashlib provides the algorithm; otherwise, it raises a ValueError. Then, we add the data we want to hash to the hash object. Lastly, we obtain the hexadecimal representation of the hash using the hexdigest() method.

Bcrypt Hashing Algorithm

Bcrypt is a widely used secure password hashing algorithm for protecting user passwords. It is deliberately slow and adds a random salt to each password, which makes brute-force attacks costly. Once the password is hashed, it cannot be reversed to its original form. It is a key component in securing user authentication systems.

The output of bcrypt includes:

  1. A prefix indicating the hash algorithm (e.g., "$2a$", "$2b$", "$2y$", etc.).
  2. The cost factor and salt.
  3. The actual hashed password.

Example Program
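
The original listing is not shown. The sketch here uses the third-party bcrypt package (installed with pip install bcrypt), which is an assumption about the library the original used. The password is a placeholder, and the import guard is added only so that the sketch does not crash where the package is missing.

```python
try:
    import bcrypt  # third-party package, not part of the standard library
except ImportError:
    bcrypt = None

password = b"my_secret_password"  # hypothetical password, as bytes

if bcrypt is not None:
    # gensalt() creates a new random salt; rounds is the cost factor.
    salt = bcrypt.gensalt(rounds=12)
    hashed = bcrypt.hashpw(password, salt)
    print(hashed.decode("utf-8"))

    # Verification re-hashes the candidate using the salt stored in the hash.
    matches = bcrypt.checkpw(password, hashed)
    print("Password matches:", matches)
else:
    hashed = None
    matches = None
    print("bcrypt is not installed")
```

To verify a login attempt, store only the hashed value and call checkpw() with the password the user typed.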

Output:

$2b$12$g47LIm7P7q9NthP62I0MUOae6OWTS19ys1EL9NeNzF8le/eK3TRei

Explanation:

The above program takes the password as input and produces an output of 60 characters. Because a new random salt is generated on each run, it produces a different output every time you run the program.

The output of bcrypt includes:

  1. A prefix indicating the hash algorithm (here "$2b$").
  2. The cost factor and salt (in the above output, "12" is the cost factor, and the 22 characters after the next "$" are the salt).
  3. The actual hashed password (the remaining 31 characters).

Advantages

Hashing is an important process for securely storing passwords. It allows for efficient data retrieval and ensures data integrity. Hashing is essential for various security protocols, such as message authentication codes (MACs), digital signatures, and data integrity verification. Additionally, hashing algorithms produce consistent and deterministic output for a given input, making them reliable for data security.

Conclusion

Hash functions are essential tools in computer science and security. They are crucial in ensuring data integrity, efficiency, and cryptographic security. Hash functions take input data and create a fixed-size hash value representing the input. Because even the slightest change in the original data produces a different hash value, such changes are easy to detect. Hash functions are widely used in applications such as data verification, password storage, digital signatures, and cryptocurrencies like Bitcoin. Strong cryptographic hash functions, such as SHA-256 and SHA-3, play a pivotal role in ensuring the security of digital systems by providing properties like preimage resistance and collision resistance. The choice of hash function depends on the specific requirements of the application. Staying vigilant is crucial for adapting to evolving security threats and advancements in computing technology.