Struct Module in Python

In this tutorial, we will learn about Python's struct module and its functions.

The struct module in Python provides tools for working with C-style data structures and binary data. It's used for packing and unpacking data to/from binary representations according to specified formats. This is especially useful when dealing with low-level binary data formats, such as those used in networking protocols, file formats, and more.

The module provides functions to create and interpret packed binary data, allowing us to work with data at the byte level. It's commonly used when we need to read or write binary files, send or receive binary data over a network, or interact with low-level hardware interfaces.

Struct Functions

Let's understand the functions of the struct module.

struct.pack() - The struct.pack() function packs values into a bytes object according to a specified format. This function is particularly useful when you need to convert Python data types into a binary representation that can be written to a file, sent over a network, or used in low-level data manipulations.

The syntax of struct.pack() is as follows:

Syntax -

struct.pack(format, v1, v2, ...)

  • format: A format string that specifies the desired binary layout of the packed data.
  • v1, v2, ...: Values to be packed into the binary format.

Let's understand the following example -

Example -
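A minimal sketch of such a call, assuming the values 42, 3.14, and b'Hello' and the format string 'i f 10s' named in the explanation; the variable name packed_data is illustrative. Note that the 's' format code expects a bytes object, not a str.

```python
import struct

# 'i' = 4-byte int, 'f' = 4-byte float, '10s' = 10-byte bytes field.
# Whitespace between format codes is ignored.
packed_data = struct.pack('i f 10s', 42, 3.14, b'Hello')

print("Packed data:", packed_data)
```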

Output:

Packed data: b'*\x00\x00\x00\xc3\xf5H@Hello\x00\x00\x00\x00\x00'

Explanation -

In this output, each value has been packed according to the format string 'i f 10s':

  • *\x00\x00\x00: The packed integer 42 (0x2A is the ASCII code for '*'). The low byte comes first because the native byte order on most machines is little-endian.
  • \xc3\xf5H@: The packed floating-point number 3.14 (IEEE 754 single-precision format).
  • Hello\x00\x00\x00\x00\x00: The bytes of b'Hello', padded with five null bytes to fill the 10 bytes specified by '10s' in the format.
  • The result is a sequence of 18 bytes that represents the packed data in binary format.
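The byte order noted above depends on the machine when the format string has no prefix. A minimal sketch of forcing it explicitly with the standard '<' (little-endian) and '>' (big-endian) prefixes; the variable names are illustrative:

```python
import struct

# '<' forces little-endian, '>' forces big-endian. Both prefixes also
# switch to standard sizes with no alignment padding.
little = struct.pack('<i', 42)
big = struct.pack('>i', 42)

print(little)  # b'*\x00\x00\x00'
print(big)     # b'\x00\x00\x00*'
```

An explicit prefix is usually the safer choice when the bytes will be read on a different machine.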

struct.unpack() - The struct.unpack() function is used to unpack binary data into a tuple of values according to a specified format. It takes a format string and a bytes-like object (usually obtained from reading a binary file or similar source) and returns a tuple containing the unpacked values.

Syntax:

struct.unpack(format, buffer)

  • format - A string that specifies the format of the binary data.
  • buffer - The bytes-like object containing the packed binary data.

Example:
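A minimal sketch, assuming the buffer was produced by packing 42, 3.14, and b'Hello' with the same 'i f 10s' format; the final decoding step is an optional addition for illustration.

```python
import struct

# Pack first so the example is self-contained.
packed_data = struct.pack('i f 10s', 42, 3.14, b'Hello')

# The buffer length must equal struct.calcsize('i f 10s').
unpacked = struct.unpack('i f 10s', packed_data)
print("Unpacked values:", unpacked)

# The 's' field comes back with its null-byte padding; strip it if needed.
text = unpacked[2].rstrip(b'\x00').decode('ascii')
print(text)
```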

Output:

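Unpacked values: (42, 3.140000104904175, b'Hello\x00\x00\x00\x00\x00')

In this example:

  • `42` is the unpacked integer value.
  • `3.140000104904175` is the unpacked floating-point value. It differs slightly from 3.14 because the value was stored as a 4-byte single-precision float.
  • `b'Hello\x00\x00\x00\x00\x00'` is the unpacked bytes object, still carrying the null-byte padding of the 10-byte field.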

The struct.unpack() function interprets the binary data in the provided format and returns a tuple of unpacked values.

struct.calcsize() - The struct.calcsize() function is used to calculate the size (in bytes) required to store packed data according to a given format string. It takes a format string as an argument and returns the size required for packing data in that format.

Syntax:

struct.calcsize(format)

  • format - A string that specifies the format of the binary data.

Example -
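A minimal sketch, assuming the same 'i f 10s' format string used in the earlier examples:

```python
import struct

# Ask how many bytes a struct with this layout occupies.
size = struct.calcsize('i f 10s')

print("Size required:", size, "bytes")
```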

Output:

Size required: 18 bytes

Explanation -

In this example, the format string `'i f 10s'` indicates that you are packing an integer (`i`), a floating-point number (`f`), and a 10-byte string (`10s`). On typical platforms the integer and the float each take 4 bytes and the string takes 10, so the total size required to store data packed according to this format string is 4 + 4 + 10 = 18 bytes.

The struct.calcsize() function is useful when you need to allocate memory or determine the storage requirements for packed binary data before actually performing the packing operation.
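One common use is reading fixed-size records from a binary stream. The sketch below assumes a hypothetical record layout of one little-endian int followed by one float, and uses io.BytesIO as a stand-in for a real file:

```python
import io
import struct

record_format = '<i f'                        # hypothetical record layout
record_size = struct.calcsize(record_format)  # 8 bytes per record

# Stand-in for a binary file that holds two records.
stream = io.BytesIO(
    struct.pack(record_format, 1, 0.5) + struct.pack(record_format, 2, 1.5)
)

records = []
while True:
    chunk = stream.read(record_size)
    if len(chunk) < record_size:
        break  # end of stream (or a truncated trailing record)
    records.append(struct.unpack(record_format, chunk))

print(records)  # [(1, 0.5), (2, 1.5)]
```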

struct.pack_into() - The struct.pack_into() function in Python's struct module is used to pack values according to a given format string into a mutable buffer. This function allows you to pack data directly into a pre-allocated buffer, which can be useful for optimizing memory usage and reducing unnecessary memory copies.

Syntax:

struct.pack_into(format, buffer, offset, value1, value2, ...)

  • format - A string that specifies the format of the binary data.
  • buffer - A writable buffer-like object (e.g., bytearray) where the packed data will be stored.
  • offset - The starting position in the buffer where the packed data should be placed.
  • value1, value2, ... - The values to be packed into the buffer according to the given format.

Example:
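A minimal sketch, assuming a 10-byte bytearray named data_buffer and an offset of 2, as described in the explanation:

```python
import struct

# Pre-allocate a 10-byte writable buffer, initialised to zeros.
data_buffer = bytearray(10)

# Write an int and a float into the buffer, starting at byte offset 2.
struct.pack_into('i f', data_buffer, 2, 42, 3.14)

print("Packed data buffer:", data_buffer)
```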

Output:

Packed data buffer: bytearray(b'\x00\x00*\x00\x00\x00\xc3\xf5H@')

Explanation -

In this example, the integer value `42` and the float value `3.14` are packed into the `data_buffer` bytearray starting at index 2 (offset). The first two bytes of the buffer are left untouched, and the packed data occupies the 8 bytes that follow.

Remember that the buffer should be writable and large enough to store the packed data. Also, the specified offset should be within the valid range of the buffer.
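A short sketch of what happens when the buffer is too small, assuming a hypothetical 4-byte buffer and the 8-byte 'i f' format; the struct module reports the problem by raising struct.error:

```python
import struct

small_buffer = bytearray(4)  # too small for 'i f', which needs 8 bytes
error_message = None

try:
    struct.pack_into('i f', small_buffer, 0, 42, 3.14)
except struct.error as exc:
    error_message = str(exc)
    print("Packing failed:", error_message)
```

Checking the buffer length against struct.calcsize() plus the offset beforehand avoids this error.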

struct.unpack_from() - The struct.unpack_from() function is used to unpack data from a buffer according to a given format string, starting from a specified offset. This function allows you to extract data from a specific position in a binary buffer without unpacking the entire buffer.

Syntax:

struct.unpack_from(format, buffer, offset=0)

  • format - A string that specifies the format of the binary data.
  • buffer - A readable buffer-like object (e.g., bytes, bytearray) from which data will be unpacked.
  • offset - The starting position in the buffer from which unpacking should begin. Default is 0.

Example -
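A minimal sketch, assuming a 10-byte buffer named packed_data whose payload (an int and a float) starts at offset 2:

```python
import struct

# Build a 10-byte buffer whose first two bytes are not part of the payload.
packed_data = bytearray(10)
struct.pack_into('i f', packed_data, 2, 42, 3.14)

# Skip the first two bytes and decode an int and a float.
unpacked = struct.unpack_from('i f', packed_data, 2)

print("Unpacked values:", unpacked)
```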

Output:

Unpacked values: (42, 3.140000104904175)

In this example, the struct.unpack_from() function is used to extract an integer and a float from the packed_data buffer starting at index 2 (offset). The resulting values are unpacked and printed.

The specified offset should be within the valid range of the buffer. Also, the format string used for unpacking should match the format used for packing the data into the buffer.

Conclusion

The struct module in Python provides powerful tools for packing and unpacking binary data in various formats. It's particularly useful for working with binary data that needs to be exchanged between different systems or written to files. The module allows you to specify a format string that defines the structure of the data, including the data types and their sizes.