Tar command in Linux/Unix with Examples

The tar command, short for "tape archive", creates and extracts archive files in Linux. It is one of the essential Linux commands for archiving. We can use it to create uncompressed or compressed archive files, and to modify and maintain them.

Tar is a software utility that collects several files into a single archive file, often called a tarball, for backup and distribution purposes. The name is derived from "tape archive" because it was originally developed to write data to sequential I/O devices that have no file system of their own. The archives made by tar record many file system parameters, such as name, timestamps, ownership, file access permissions, and directory organization. POSIX abandoned tar in favor of pax, yet tar still sees widespread use.

The utility was first introduced in Version 7 Unix in January 1979, replacing the tp program. The file structure used to store the data was standardized in POSIX.1-1988 and later in POSIX.1-2001, and became a format supported by almost all modern file archiving systems. Unix-like operating systems generally include tools for handling tar files, as well as utilities commonly used to compress them, like gzip and bzip2.
Since the Windows 10 April 2018 Update, BSD tar has been included in Microsoft Windows, and several third-party tools can read and write these formats on Windows.

File formats of tar

There are many tar file formats, both current and historical. POSIX defines two of them: ustar and pax.

Header

The file header record contains the metadata of a file. The information in the header record is encoded in ASCII to ensure portability across architectures with different byte orderings. Hence, if every file in an archive is an ASCII text file and has an ASCII name, the archive itself is essentially an ASCII text file (containing many NUL characters).

The following table lists the fields defined by the original Unix tar format. The link indicator/file type table includes a few modern extensions. A field is filled with NUL bytes if it is unused.

Pre-POSIX.1-1988 tar header:

Field                                   Size (bytes)   Offset (bytes)
File name                               100            0
File mode                               8              100
Numeric user ID of the owner            8              108
Numeric group ID of the owner           8              116
File size in bytes                      12             124
Last modification time (Unix time)      12             136
Checksum for the header record          8              148
Link indicator                          1              156
Linked file name                        100            157

UStar format

Almost every modern tar program reads and writes archives in the UStar format, which was introduced by the POSIX IEEE P1003.1 standard in 1988. It added extra header fields. Older tar programs ignore the additional information, while newer programs check for the presence of the "ustar" string to decide whether the newer format is in use.

Field                                         Size (bytes)   Offset (bytes)
Same fields as in the pre-POSIX format        156            0
Type flag                                     1              156
Linked file name (as in the pre-POSIX format) 100            157
UStar indicator "ustar", then NUL             6              257
UStar version "00"                            2              263
User name of the owner                        32             265
Group name of the owner                       32             297
Major number of the device                    8              329
Minor number of the device                    8              337
File name prefix                              155            345
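
The offsets in the two tables can be checked against a real archive. The following is a minimal sketch, assuming a tar that accepts --format=ustar (GNU tar and BSD tar do); the file ten.txt and the helper function field are hypothetical names introduced only for illustration.

```shell
cd "$(mktemp -d)"                       # scratch directory
printf '0123456789' > ten.txt           # exactly 10 bytes of content

# Force the ustar format so the header layout matches the tables.
tar --format=ustar -cf ten.tar ten.txt

# field OFFSET COUNT: print COUNT bytes at OFFSET, dropping NUL padding.
field() { dd if=ten.tar bs=1 skip="$1" count="$2" 2>/dev/null | tr -d '\0'; }

name=$(field 0 100)                         # file name field
size_octal=$(field 124 12 | tr -d ' ')      # size field, octal ASCII digits
magic=$(field 257 6)                        # UStar indicator

# A leading 0 makes printf interpret the digits as an octal number.
size=$(printf '%d' "0$size_octal")
echo "name=$name size=$size magic=$magic"
```

The size field is stored as octal text, which is why it is converted before printing.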

POSIX.1-2001/pax

In 1997, Sun proposed a method for adding extensions to the tar format. It was later accepted for the POSIX.1-2001 standard. The format is known as the pax format or extended tar format. The POSIX standard defines several tags, including atime, mtime, linkpath, uname, gname, size, uid, gid, and a character set definition for path names and user/group names.

Key implementations of tar

The key implementations, in order of origin, are:

  • Solaris tar, based on the original Unix V7 tar, is the default on the Solaris OS.
  • GNU tar is the default on almost every Linux distribution. It is based on the public domain implementation pdtar, which began in 1987. Recent versions can use several formats, such as GNU, pax, v7, and ustar.
  • FreeBSD tar has become the default tar on almost every Berkeley Software Distribution-based OS, including Mac OS X.
  • Schily tar is historically important because a few of its extensions were popular. It was first released in April 1997.
  • Python tarfile supports multiple tar formats, including gnu, pax, and ustar. It can read, but not create, the V7 and SunOS tar extended formats. pax is its default format for archive creation. It has been available since 2003.

Additionally, most cpio and pax implementations can read and create several types of tar files.

Syntax of tar command:
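
As a rough sketch of the usual shape of an invocation (not an exhaustive grammar), the options come first, -f names the archive, and the files to act on follow. The file names notes.txt and notes.tar are placeholders.

```shell
# General shape:
#   tar [OPTIONS] -f ARCHIVE [FILE...]
cd "$(mktemp -d)"               # scratch directory
printf 'sample\n' > notes.txt   # placeholder input file

tar -cf notes.tar notes.txt     # OPTIONS = -c (create), ARCHIVE = notes.tar
tar -tf notes.tar               # OPTIONS = -t (list the archive's members)
```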

Options in the tar command

Various options in the tar command are listed below:

  1. -c: This option creates a new archive.
  2. -f: This option specifies the name of the archive file to create or operate on.
  3. -x: This option extracts files from an archive.
  4. -u: It appends only those files that are newer than the copies already in the archive.
  5. -t: It lists the files inside an archive.
  6. -A: This option concatenates archives by appending the members of one archive to another.
  7. -v: It shows verbose information.
  8. -j: It filters the archive through bzip2.
  9. -z: It filters the archive through gzip.
  10. -r: This option appends files or directories to the end of an existing .tar file.
  11. -W: This option verifies the archive after writing it.

Introduction to Archive File

An archive file is a file that contains multiple files together with their metadata. It collects more than one data file into a single file for easier storage and portability. An archive can also be compressed so that it consumes less storage space.

Examples of tar command

Some important and widely used examples of the tar command are as follows:

1. Making an uncompressed tar archive with -cvf option

The -cvf options create an uncompressed tar file named file.tar. It is the archive of every .txt file inside the mydir directory.

The command is as follows:
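
A minimal sketch, run in a scratch directory; the two sample .txt files are placeholders created only so the command has something to archive.

```shell
cd "$(mktemp -d)"
mkdir mydir
printf 'first\n'  > mydir/hello1.txt
printf 'second\n' > mydir/hello2.txt

# c = create, v = print each member as it is added, f = archive name follows.
tar -cvf file.tar mydir/*.txt
```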


2. Extracting files from the archive with -xvf option

The -xvf options extract the files stored in an archive.

The command is as follows:
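
A minimal sketch with placeholder file names; the original file is removed first so that the effect of the extraction is visible.

```shell
cd "$(mktemp -d)"
printf 'first\n' > hello1.txt
tar -cf file.tar hello1.txt
rm hello1.txt                 # remove the original copy

# x = extract, v = print each member as it is extracted, f = archive name follows.
tar -xvf file.tar
```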


3. gzip compression of a tar archive with -z option

The -z option creates a gzip-compressed tar file, here named file.tar.gz. It is the archive of every .txt file in the current directory.

The command is as follows:
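
A minimal sketch, assuming gzip is installed; the sample files are placeholders.

```shell
cd "$(mktemp -d)"
printf 'first\n'  > hello1.txt
printf 'second\n' > hello2.txt

# z = filter the archive through gzip while creating it.
tar -czvf file.tar.gz *.txt
```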


4. Extracting the gzip tar archive with -xvzf option

The -xvzf options extract the files from the gzip-compressed archive file.tar.gz.

The command is as follows:
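
A minimal sketch, assuming gzip is installed; the archive is built first from a placeholder file so the example is self-contained.

```shell
cd "$(mktemp -d)"
printf 'first\n' > hello1.txt
tar -czf file.tar.gz hello1.txt
rm hello1.txt                 # remove the original copy

# x = extract, z = decompress with gzip, v = verbose, f = archive name follows.
tar -xvzf file.tar.gz
```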


5. Making compressed tar files with the -j option

The -j option creates an archive and compresses it with bzip2. Both compression and decompression take more time than with gzip.

The command is as follows:
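
A minimal sketch, assuming the bzip2 program is installed (tar calls it for -j); the names hello1.txt and file.tar.bz2 are placeholders.

```shell
cd "$(mktemp -d)"
printf 'first\n' > hello1.txt

# j = filter the archive through bzip2 while creating it.
tar -cjvf file.tar.bz2 hello1.txt

# Listing (or extracting) a bzip2 archive uses the same -j flag.
tar -tjf file.tar.bz2
```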


6. Untar a single specified file or directory in Linux

A file can be extracted into the current directory, or into another directory specified with the -C option.

The command is as follows:
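
A minimal sketch showing both variants; the directories current and target are hypothetical names, and the directory passed to -C must already exist.

```shell
cd "$(mktemp -d)"
printf 'first\n'  > hello1.txt
printf 'second\n' > hello2.txt
tar -cf file.tar hello1.txt hello2.txt
mkdir current target

# Extract one named member into the current working directory.
(cd current && tar -xvf ../file.tar hello1.txt)

# Or extract a member into another directory chosen with -C.
tar -xvf file.tar -C target hello2.txt
```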


7. Untar multiple files from .tar, .tar.gz, and .tar.bz2 archives in Linux

We can extract more than one file from a tar, tar.gz, or tar.bz2 archive by listing the file names after the archive name.

The example of this option is as follows:
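
A minimal sketch for the .tar and .tar.gz cases, with the .tar.bz2 variant shown as a comment because it needs bzip2; all file and directory names are placeholders.

```shell
cd "$(mktemp -d)"
printf '1\n' > hello1.txt
printf '2\n' > hello2.txt
printf '3\n' > hello3.txt
tar -cf  file.tar    hello1.txt hello2.txt hello3.txt
tar -czf file.tar.gz hello1.txt hello2.txt hello3.txt
mkdir from_tar from_gz

# Name each wanted member after the archive; the others are skipped.
tar -xvf  file.tar    -C from_tar hello1.txt hello2.txt
tar -xzvf file.tar.gz -C from_gz  hello1.txt hello2.txt

# bzip2 equivalent (requires bzip2):
#   tar -xjvf file.tar.bz2 hello1.txt hello2.txt
```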


8. Check the size of an existing tar, tar.gz, or tar.bz2 file

The command shows the size of the archive file, for example in kilobytes (KB).

The command is as follows:
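
One possible way to do this, sketched with standard tools rather than tar itself: wc -c reports the size in bytes, and du -k reports disk usage in kilobytes. The archive names are placeholders.

```shell
cd "$(mktemp -d)"
printf 'first\n' > hello1.txt
tar -cf  file.tar    hello1.txt
tar -czf file.tar.gz hello1.txt

# Size in bytes: count the bytes in each archive file.
wc -c < file.tar
wc -c < file.tar.gz

# Disk usage in kilobytes.
du -k file.tar file.tar.gz
```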

9. Update the existing tar file

In Linux, the command for updating an existing tar file is as follows:
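
A minimal sketch using the -r option on an uncompressed archive; the file names are placeholders.

```shell
cd "$(mktemp -d)"
printf 'first\n' > hello1.txt
tar -cf file.tar hello1.txt

printf 'second\n' > hello2.txt
# r = append files to the end of an existing, uncompressed archive.
tar -rvf file.tar hello2.txt

tar -tf file.tar              # list the members to confirm the addition
```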

10. List the contents of the tar file with the -tf option

The -tf options list the entire contents of an archive file. We can also list particular content inside any tar file.

The command is as follows:
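
A minimal sketch with placeholder file names.

```shell
cd "$(mktemp -d)"
printf 'first\n'  > hello1.txt
printf 'second\n' > hello2.txt
tar -cf file.tar hello1.txt hello2.txt

# t = list the archive's members, f = archive name follows.
tar -tf file.tar
```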


11. View the archive with the -tvf option

In Linux, we can use the -tvf options to view the archive's contents along with details such as permissions, owner, size, and modification time.

The command is as follows:
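
A minimal sketch with a placeholder file; adding v to the listing produces one detailed line per member.

```shell
cd "$(mktemp -d)"
printf 'first\n' > hello1.txt
tar -cf file.tar hello1.txt

# t = list, v = include mode, owner, size, and time, f = archive name follows.
tar -tvf file.tar
```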


12. Pass the file name as an argument to look for a file in the tar archive

When a file name is passed after the archive name, only that archived file is listed, along with its information.

The command is as follows:
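
A minimal sketch with placeholder names; only the member named on the command line is listed.

```shell
cd "$(mktemp -d)"
printf 'first\n'  > hello1.txt
printf 'second\n' > hello2.txt
tar -cf file.tar hello1.txt hello2.txt

# List details for hello2.txt only; other members are not shown.
tar -tvf file.tar hello2.txt
```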


13. Using a pipe to the grep command to search for what we are looking for

Piping the listing through grep shows only the archived files whose entries match the given text or image name.

The command is as follows:
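
A minimal sketch showing two variants; logo.png is a placeholder text file, not a real image.

```shell
cd "$(mktemp -d)"
printf 'first\n' > hello1.txt
printf 'not really an image\n' > logo.png
tar -cf file.tar hello1.txt logo.png

# Detailed listing, filtered to lines that mention "hello".
tar -tvf file.tar | grep 'hello'

# Names only, filtered to members that end in .png.
tar -tf file.tar | grep '\.png$'
```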

Introduction to Wildcards

In Linux, a wildcard is also called a wildcard character or wild character. It is a symbol that represents or replaces one or more characters.

Typically, a wildcard is either a question mark (?), which represents a single character, or an asterisk (*), which represents any number of characters.

Example:

14. Find a .png format image

It extracts only the files with the .png extension from the file.tar archive. The --wildcards option tells tar to interpret wildcards in the names of the files to be extracted.

The file name pattern (*.png) is enclosed in single quotes to protect the wildcard (*) from being expanded prematurely by the shell.

The command is as follows:
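
A minimal sketch, assuming GNU tar (--wildcards is a GNU tar option); logo.png is a placeholder text file and out is a hypothetical target directory.

```shell
cd "$(mktemp -d)"
printf 'text\n' > notes.txt
printf 'not really an image\n' > logo.png
tar -cf file.tar notes.txt logo.png
mkdir out

# The quotes keep the shell from expanding *.png before tar sees it.
tar -xvf file.tar -C out --wildcards '*.png'
```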


Note: In the above command, the "*" symbol stands in for the file name, so every file in the archive whose name ends in .png is matched.

15. Delete files from the tar archive

We can use the --delete option to remove files from a tar archive.

The command is as follows:
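
A minimal sketch, assuming GNU tar, where --delete works only on uncompressed archives; the file names are placeholders.

```shell
cd "$(mktemp -d)"
printf 'first\n'  > hello1.txt
printf 'second\n' > hello2.txt
tar -cf file.tar hello1.txt hello2.txt

# Remove one member from the archive in place.
tar --delete -f file.tar hello1.txt

tar -tf file.tar              # list what remains in the archive
```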

After the command runs, the hello1.txt file is no longer part of the file.tar archive.
