rarfile module in Python

In this tutorial, we will discuss the rarfile module of the Python programming language. We will look at the different classes of the rarfile module along with some examples. So, let's get started.

Understanding the Python rarfile module

The rarfile module in Python is used to read RAR archives. Its interface is made as zipfile-like as possible. The basic functionalities of the rarfile module:
Now, before we start working with the module, let us install it.

How to install the rarfile module in Python?

To install the rarfile module, we will use the pip installer with the command shown below:

Syntax:

$ pip install rarfile

To verify that the module is installed properly, we can create a new file and add the import statement to see whether it raises any errors.

File: verify.py

import rarfile

Now, save the Python file and run it from the command prompt:

Syntax:

$ python verify.py

If the above Python file does not raise an import error, we are good to go and can start working with the rarfile module. However, if it does raise an exception, it is recommended to reinstall the module and refer to its official documentation.

Now, let us understand the basics of the rarfile module.

Classes of the rarfile module

The rarfile module provides multiple classes that we can use as per the requirements. These classes are:

1. RarFile
2. RarInfo
3. RarExtFile
4. nsdatetime
We will discuss these classes in brief.

Understanding the RarFile class

The RarFile class of the rarfile module parses the RAR structure and provides access to the files in the archive. The syntax of the RarFile class is shown below:

Syntax:

Some of the methods and attributes of the RarFile class are shown below:

1. comment = None
This attribute holds the archive comment. The value is either a Unicode string or None.

2. filename = None
This attribute holds the file name of the archive, if available. The value is either a Unicode string or None.

3. __enter__()
This method is used to open the context.

4. __exit__(type, value, traceback)
This method is used to exit the context.

5. __iter__()
This method is used to iterate over the members.

6. setpassword(pwd)
This method sets the password used during extraction.

7. needs_password()
This method returns True if any archive entries need a password for extraction.

8. namelist()
This method returns a list containing the names of the files in the archive.

9. infolist()
This method returns the RarInfo objects for all files and directories in the archive.

10. volumelist()
This method returns the file names of the archive volumes. If the archive consists of only a single volume, the list contains just the name of the main archive file.

11. getinfo(name)
This method returns the RarInfo object for the given file.

12. open(name, mode = 'r', pwd = None)
This method returns a file-like object (RarExtFile) from which the data can be read. The object implements the io.RawIOBase interface, so we can further wrap it with io.BufferedReader and io.TextIOWrapper. In older versions of Python, where the io module is not available, it implements only the read(), seek(), tell(), and close() methods. The object is seekable, although seeking is fast only on uncompressed files. On compressed files, seeking is implemented by reading ahead and restarting the decompression.

Parameters:
13. read(name, pwd = None)
This method returns the uncompressed data for an archive entry. For larger files, it is recommended to use the open() method instead.

Parameters:
14. close()
This method releases open resources.

15. printdir(file = None)
This method prints a list of the files in the archive to stdout or to a given file.

16. extract(member, path = None, pwd = None)
This method extracts a single file into the current directory, or into path when one is given.

Parameters:
17. extractall(path = None, members = None, pwd = None)
This method extracts all files into the current directory, or into path when one is given.

Parameters:
18. testrar(pwd = None)
This method reads all files and tests their CRC.

19. strerror()
This method returns an error string if parsing failed, or None if no problem occurred.

Understanding the RarInfo class

The RarInfo class of the rarfile module represents an entry in the RAR archive. Timestamps given as datetime objects are without a time zone in RAR3 archives and in the UTC zone in RAR5 archives. The syntax of the RarInfo class is shown below:

Syntax:

Some of the methods and attributes of the RarInfo class are shown below:

1. filename
This attribute contains the name of the file with its relative path. The value is always a Unicode string, with '/' as the path separator.

2. date_time
This attribute holds the file modification timestamp as a tuple of (year, month, day, hour, minute, second). RAR5 allows archives where it is missing, in which case the value is None.

3. comment
This attribute holds the optional file comment field. The value is a Unicode string. (RAR3-only)

4. file_size
This attribute specifies the uncompressed size.

5. compress_size
This attribute specifies the compressed size.

6. compress_type
This attribute specifies the compression method: one of the RAR_M0, …, RAR_M5 constants.

7. extract_version
This attribute holds the minimal version of RAR that is required for decompression, encoded as (major*10 + minor), so 2.9 is 29.
RAR3: 10, 20, 29
RAR5 does not have such a field in the archive, so it is set to 50.

8. host_os
This attribute specifies the host OS type, one of the RAR_OS_* constants.
RAR3: RAR_OS_WIN32, RAR_OS_UNIX, RAR_OS_MSDOS, RAR_OS_OS2, RAR_OS_BEOS
RAR5: RAR_OS_WIN32, RAR_OS_UNIX

9. mode
This attribute specifies the file attributes. They may be either DOS-style or Unix-style, depending on host_os.

10. mtime
This attribute specifies the time of the file modification.
The value is the same as the date_time attribute, but as a datetime object with extended precision.

11. ctime
This attribute is an optional time field specifying the time of creation, as a datetime object.

12. atime
This attribute is an optional time field specifying the time of last access, as a datetime object.

13. arctime
This attribute is an optional time field specifying the archival time, as a datetime object. (RAR3-only)

14. CRC
This attribute specifies the CRC-32 of the uncompressed file. The value is an unsigned int. RAR5: may be None.

15. blake2sp_hash
This attribute specifies the Blake2SP hash over the decompressed data. (RAR5-only)

16. volume
This attribute specifies the volume number, starting from 0.

17. volume_file
This attribute specifies the name of the volume file where the file begins.

18. file_redir
If not None, the file is a link of some sort. The value is a tuple of (type, flags, target). (RAR5-only)

Type is one of the constants:
Flags may contain bits:
19. is_dir()
This method returns True if the entry is a directory. New in version 4.0.

20. is_symlink()
This method returns True if the entry is a symlink. New in version 4.0.

21. is_file()
This method returns True if the entry is a normal file. New in version 4.0.

22. needs_password()
This method returns True if the data is stored password-protected.

23. isdir()
This method returns True if the entry is a directory. Deprecated since version 4.0.

Understanding the RarExtFile class

The RarExtFile class of the rarfile module is the base class for the file-like objects that RarFile.open() returns.

Bases: io.RawIOBase

The syntax of the RarExtFile class is shown below:

Syntax:

This class provides the public methods and common CRC checking.

Behavior:
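To give a rough picture of what CRC checking on a read stream involves, here is a minimal sketch. It is not rarfile's actual implementation: it uses zlib.crc32 from the standard library, and the helper name read_with_crc_check is our own. The expected value plays the role of the checksum stored with an archive entry (compare the CRC attribute of RarInfo).

```python
import io
import zlib


def read_with_crc_check(stream, expected_crc, chunk_size=8192):
    """Read a binary stream to the end and verify its CRC-32.

    Hypothetical helper for illustration only; it is not part of rarfile.
    """
    crc = 0
    chunks = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # zlib.crc32 accepts the running value, so the checksum can be
        # updated chunk by chunk while the data is being read.
        crc = zlib.crc32(chunk, crc)
        chunks.append(chunk)
    if crc != expected_crc:
        raise ValueError(
            f"CRC mismatch: got {crc:#010x}, expected {expected_crc:#010x}"
        )
    return b"".join(chunks)


payload = b"Hello Python learners!"
data = read_with_crc_check(io.BytesIO(payload), zlib.crc32(payload), chunk_size=4)
print(data == payload)
```

The checksum is only compared once the end of the stream is reached, so a mismatch is reported after the data has been read.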
Some attributes and methods of the RarExtFile class are shown below:

1. name = None
This attribute specifies the file name of the archive entry.

2. read(n = -1)
This method reads all or a specified amount of data from the archive entry.

3. close()
This method closes open resources.

4. readinto(buf)
This method performs a zero-copy read directly into the buffer. It returns the number of bytes read.

5. tell()
This method returns the current reading position in the uncompressed data.

6. seek(offset, whence = 0)
This method seeks in the data. On uncompressed files, the seeking works by an actual seek, so it is fast. On compressed files, it is slow: forward seeking happens by reading ahead, backward seeking by re-opening and decompressing from the start.

7. readable()
This method returns True.

8. writable()
This method returns False, as writing is not supported.

9. seekable()
This method returns True, as seeking is supported, although it is slow on compressed files.

10. readall()
This method reads all the remaining data.

11. fileno()
This method returns the underlying file descriptor if one exists. OSError is raised if the IO object does not use a file descriptor.

12. isatty()
This method returns whether this is an 'interactive' stream. It returns False if this can't be determined.

13. readline()
This method reads and returns a line from the stream. The line terminator is always b'\n' for binary files; for text files, we can use the newlines argument to open() in order to select the line terminator(s) recognized. If a size is given, at most size bytes will be read.

14. readlines()
This method returns a list of lines from the stream. We can specify a hint to control the number of lines read: no more lines will be read if the total size (in bytes/characters) of all lines so far exceeds the hint.

Understanding the nsdatetime class

The nsdatetime class of the rarfile module represents a datetime that carries nanoseconds.
Arithmetic on this class is not supported and will lose the nanoseconds.

Bases: datetime.datetime

New in version 4.0.

The syntax of the nsdatetime class is shown below:

Syntax:

Some attributes and methods of the nsdatetime class are shown below:

1. nanosecond
This attribute holds the number of nanoseconds, ranging from 0 to 999999999.

2. isoformat(sep = 'T', timespec = 'auto')
This method formats the timestamp, with nanosecond precision by default.

3. astimezone(tz = None)
This method converts the timestamp to a new time zone.

4. replace(year = None, month = None, day = None, hour = None, minute = None, second = None, microsecond = None, tzinfo = None, *, fold = None, nanosecond = None)
This method returns a new timestamp with the given fields replaced.

Functions of the rarfile module

Some of the functions of the rarfile module are as follows:
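One function that the module provides is is_rarfile(), which reports whether a file looks like a RAR archive. The sketch below wraps it in a small helper; the helper name looks_like_rar is our own, and the example assumes that rarfile is installed.

```python
def looks_like_rar(path):
    """Return True if rarfile recognises the file at path as a RAR archive.

    Hypothetical wrapper for illustration; only is_rarfile() belongs to rarfile.
    """
    # Imported inside the function so that a missing installation shows up
    # as an ImportError only when the helper is actually called.
    import rarfile  # third-party: pip install rarfile

    return rarfile.is_rarfile(path)
```

For an existing file that is not a RAR archive, such as a plain text file, the helper returns False, so it can be used as a quick check before opening a file with the RarFile class.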
Constants of the rarfile module

Some of the constants of the rarfile module are as follows:
Warnings and Exceptions of the rarfile module

Some of the warnings and exceptions of the rarfile module are as follows:
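As one illustration, rarfile.Error is the base class of the exceptions that the module raises, so catching it covers problems such as a file that is not a valid RAR archive. The following is a minimal sketch, assuming rarfile is installed; the helper name list_names_or_explain is hypothetical.

```python
def list_names_or_explain(path):
    """Return the member names of a RAR archive, or None if it cannot be read.

    Hypothetical helper for illustration; it is not part of rarfile.
    """
    import rarfile  # third-party: pip install rarfile

    try:
        with rarfile.RarFile(path) as archive:
            return archive.namelist()
    except rarfile.Error as exc:
        # rarfile.Error is the common base class, so more specific
        # rarfile exceptions are caught here as well.
        print(f"Could not read {path!r}: {type(exc).__name__}: {exc}")
        return None
```

A missing file is reported by Python itself as an OSError rather than as a rarfile exception, so this helper deliberately lets that error propagate.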
Working of the rarfile module

Let us now consider the following example demonstrating the working of the rarfile module.

Example:

Output:

myfolder/helloWorld.py 101
myfolder/image.jpg 281466
myfolder/readme.txt 49
b'Hello Python learners!\r\nWelcome to Javatpoint.com'
myfolder/ 0

Explanation:

In the above snippet of code, we have imported the required module. We have then used the RarFile class to open the RAR archive. Next, we have used a for-loop to iterate through the entries present in the archive and print their file names along with their sizes. Inside the loop, we have used an if conditional statement to check whether the entry is the readme.txt file. At last, we have read that file using the read() method.
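The steps described above can be sketched as follows. This is only our reconstruction of such a program, not the original listing: the archive name sample.rar and the function names are placeholders. Note that reading compressed entries relies on an external extraction tool such as unrar being installed.

```python
def print_archive_summary(archive, wanted="myfolder/readme.txt"):
    """Print each entry's name and size, plus the content of one entry.

    archive is expected to be an opened rarfile.RarFile. Because the
    interface is zipfile-like, an opened zipfile.ZipFile works here too.
    """
    content = None
    for info in archive.infolist():
        print(info.filename, info.file_size)
        if info.filename == wanted:
            # read() accepts either an entry name or its info object.
            content = archive.read(info)
            print(content)
    return content


def main(path="sample.rar"):  # "sample.rar" is a placeholder archive name
    import rarfile  # third-party: pip install rarfile

    with rarfile.RarFile(path) as archive:
        print_archive_summary(archive)
```

Calling main() with the path of an existing RAR archive prints one line per entry, followed by the bytes of readme.txt right after that entry's line, which matches the order of the output shown above.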