Awk Command Usage

Awk is a general-purpose scripting language designed for advanced data manipulation and report generation. It is widely used as a reporting and analysis tool. Unlike most programming languages, which are procedural, awk is data-driven: users define a set of operations to perform against the input text.

An awk program does not need to be compiled, and it lets users work with variables, string functions, numeric functions, and logical operators. It takes data as input, performs the specified actions, and sends the result to standard output.

Awk lets a programmer write small but effective programs as statements that define text patterns. Each pattern is tested against every line of the input, and when a line matches, the associated action is applied.

The awk command is widely used for pattern scanning and processing. It searches one or more files and checks whether they contain the specified pattern. If the pattern matches a line, awk performs the corresponding action.

How did awk get its name?

The name awk comes from the initials of the surnames of the three people who wrote its original version in 1977: Alfred Aho, Peter Weinberger, and Brian Kernighan. All three were from the AT&T Bell Laboratories Unix pantheon. Since the original 1977 version, many others have contributed, and awk has continued to evolve.

Awk is a complete scripting language and a complete text manipulation toolkit that allows manipulating text files from the command line.

What can we do with the awk command?

Using the awk command, we can perform the following activities:

1. Awk operations:

  1. Scan a file line by line.
  2. Perform action(s) when a match is found in a line.
  3. Split each input line into fields.
  4. Compare input lines/fields to patterns.

2. Useful for:

  1. Generate formatted reports
  2. Transform data files

3. Programming constructs:

  1. Arithmetic and String operations
  2. Format output lines
  3. Apply conditions and loops

Rules, patterns, and actions of awk

An awk program is built from rules made up of patterns and actions. A pattern selects the records to act on, and the action, which is enclosed within curly braces ({ }), is performed on the text that matches the pattern. A pattern together with its action is called a rule. On the command line, the awk program is usually written inside single quotes (') so that the shell does not interpret it.

When a rule doesn't contain a pattern, every record (line) is considered a match. An action can contain more than one statement, separated by semicolons (;) or newlines. If a rule has no action, awk prints the whole matching record by default.
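
The three kinds of rule described above can be sketched as follows. This is only an illustration: the three input lines are made-up sample data, not part of the original tutorial.

```shell
# Hypothetical sample input: a word and a number per line.
data='apple 3
banana 12
cherry 7'

# Pattern and action: print the first field of lines that contain "an".
printf '%s\n' "$data" | awk '/an/ { print $1 }'

# Action only: runs for every record; two statements separated by a semicolon.
printf '%s\n' "$data" | awk '{ total += $2; print $1, total }'

# Pattern only: the default action prints the whole matching record.
printf '%s\n' "$data" | awk '$2 > 5'
```

The first command prints only "banana", the second prints a running total next to each word, and the third prints the two records whose second field is greater than 5.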

Commonly used awk statements.

There are different types of statements in the awk command, including input, conditions, expressions, output statements, and lots more. However, the most commonly used awk statements are:

  • exit: It stops the execution of the whole program. If an END rule is defined, it still runs before awk exits.
  • next: It stops processing the current record and moves on to the next input record.
  • print: It prints fields, variables, records, and other custom text.
  • printf: It gives users more control over the output format, like printf in C and bash.

Comment (#): Any text after a hash mark (#) up to the end of the line is treated as a comment. Comments help readers understand the program and play no role in its execution.

A backslash (\): A backslash (\) at the end of a line lets us continue a long line on the next line.
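
A small sketch that combines the statements above with a comment and a backslash continuation. The input lines and the variable name count are assumptions made for illustration.

```shell
# Hypothetical input: a name and a score per line, plus lines to skip or stop on.
printf 'alice 90\n# skipped line\nbob 75\nSTOP\ncarol 60\n' |
awk '
    /^#/    { next }                        # next: skip this record
    /^STOP/ { exit }                        # exit: stop reading input; END still runs
            { printf "%-6s %3d\n", $1, $2   # printf: formatted output
              count++ }
    END     { print "records printed:", \
              count }                       # the backslash continues the line
'
```

Only alice and bob are printed: the line starting with # is skipped by next, and exit stops processing before carol is read, after which the END rule reports 2 records.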

Let's see a simple awk command to print a string:

Simply type the following awk command on the console and press enter.
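
The original command was shown as an image, so the following is a sketch of what such a command can look like. It uses a BEGIN rule so that awk prints the string without waiting for input, which is an assumption about the original form.

```shell
# Print a fixed string; BEGIN runs before any input is read.
awk 'BEGIN { print "Welcome to awk command" }'
```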


In the above command, the print statement prints the given string "Welcome to awk command" on the screen. Text enclosed in double quotes (") is a string value that is printed as is.

Awk special field identifiers

A special field identifier returns the value stored at the position it names. These identifiers represent specific fields and data locations in a line.

  • $0: Represents the entire line of a record.
  • $1: Represents the first field of a line.
  • $2: Represents the second field of a line.
  • $5: Represents the fifth field of a line.
  • $15: Represents the 15th field of a line.
  • $NF: NF holds the "number of fields", so $NF represents the last field of the line.

Consider a text file "awk_file.txt" that contains the following sentence: "Awk is a general-purpose scripting language that is designed for advanced data manipulating and reports generating." Now, using the awk special field identifiers, we will retrieve the corresponding values.

Input text file in command screen

Before reading and manipulating data from a file, we have to make the file available on the command line, either by working in the directory that contains "awk_file.txt" or by giving its path.


In this example, /home/jtp1234 is the directory where the text file "awk_file.txt" is located. Now, to read and print values from the text file using the awk special field identifiers and the print statement, use the below command:
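
A sketch of such a command, assuming awk_file.txt sits in the current directory and holds only the sentence quoted earlier (the file is recreated here so the example runs on its own):

```shell
# Recreate awk_file.txt with the sentence quoted in this section.
printf '%s\n' 'Awk is a general-purpose scripting language that is designed for advanced data manipulating and reports generating.' > awk_file.txt

# Print the 1st, 4th, and 6th whitespace-separated fields of each line.
awk '{ print $1, $4, $6 }' awk_file.txt
```

With this input the command prints "Awk general-purpose language".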


The print statement in the above command prints the 1st ($1), 4th ($4), and 6th ($6) fields of each line in the "awk_file.txt" file. To print the last field of a line, use the identifier $NF. The below command prints the last field of each line in the file:
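
A sketch using $NF, again assuming awk_file.txt holds only the sentence quoted earlier:

```shell
# Recreate awk_file.txt with the sentence quoted in this section.
printf '%s\n' 'Awk is a general-purpose scripting language that is designed for advanced data manipulating and reports generating.' > awk_file.txt

# NF is the number of fields in the line; $NF is the value of the last field.
awk '{ print NF, $NF }' awk_file.txt
```

For this sentence, NF is 16 and $NF is "generating." (the trailing period belongs to the last word).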


The special patterns BEGIN and END Rules.

Awk includes two special patterns known as BEGIN and END. A BEGIN rule is executed once, before awk reads any input. An END rule is executed once, after all input records have been processed. A program can contain multiple BEGIN and END rules, and they are executed in the order in which they are defined.

For example: Let's print the "Process Starting" string at the beginning and "Process End" at the end of the text file "awk_file.txt".
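
One way to write this, sketched under the assumption that awk_file.txt holds the single sentence quoted earlier:

```shell
# Recreate awk_file.txt with the sentence quoted earlier in this tutorial.
printf '%s\n' 'Awk is a general-purpose scripting language that is designed for advanced data manipulating and reports generating.' > awk_file.txt

# BEGIN runs before the first record, the middle rule runs once per record,
# and END runs after the last record.
awk 'BEGIN { print "Process Starting" } { print $0 } END { print "Process End" }' awk_file.txt
```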


If an awk program includes only a BEGIN rule, awk executes its action and does not read any input.

If an awk program includes only an END rule, awk reads all the input before performing the rule's action.
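
Both cases can be sketched with a small, made-up input file (sample.txt and its contents are assumptions for illustration):

```shell
# Hypothetical three-line input file.
printf 'one\ntwo\nthree\n' > sample.txt

# BEGIN only: the action runs and sample.txt is never read.
awk 'BEGIN { print "no input needed" }' sample.txt

# END only: awk reads every record first, so NR holds the total count.
awk 'END { print "records read:", NR }' sample.txt
```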

Awk Built-in Variables

Awk has several built-in variables that hold information about the input and allow us to control program execution. Below are some of the most commonly used built-in awk variables:

  • NF - contains the number of fields in the current record.
  • NR - contains the number of the current record (line).
  • FILENAME - contains the name of the input file currently in use.
  • FS - the input field separator.
  • RS - the input record separator.
  • OFS - the output field separator.
  • ORS - the output record separator.

Let's see an example to print the file name which is currently in use and the total number of lines available in it:
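
A sketch of such a command, assuming the one-line awk_file.txt used earlier:

```shell
# Recreate awk_file.txt with the sentence quoted earlier in this tutorial.
printf '%s\n' 'Awk is a general-purpose scripting language that is designed for advanced data manipulating and reports generating.' > awk_file.txt

# In the END rule, FILENAME is the last file read and NR is the total record count.
awk 'END { print FILENAME, NR }' awk_file.txt
```

Because the file holds one line, this prints "awk_file.txt 1".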


Awk Built-in Functions

Awk contains several built-in functions that we can use and call in our programs. We can use these built-in functions from both the command shell and script program. Some of the built-in awk functions are given below:

  • Numeric functions: These functions work with numbers, such as int(), atan2(), sin(), and rand().
  • String functions: These functions are used for string manipulation, such as match() for string matching, split() for string splitting, and sprintf() for formatting.
  • Time functions: These functions deal with timestamps.
  • I/O functions: These functions deal with files and shell commands.
  • Bitwise functions: These functions perform bitwise operations.
  • Calling Built-in: It defines how to call built-in awk functions.
  • Type functions: These functions return type information about their arguments.
  • I18N functions: These functions are used for string translation.

Look at the basic way to call awk functions in a command shell. In the example, we will use some of the numeric functions.

Awk Numeric Functions:

The following is a list of the built-in numeric functions. In syntax descriptions, optional parameters are shown enclosed in square brackets ([]).

  • int(x): returns the integer part of x, truncated toward zero, that is, the nearest integer between 0 (zero) and x.
  • sqrt(x): returns the positive square root of x.
  • sin(x): returns the sine value of x.
  • cos(x): returns the cosine value of x.
  • exp(x): returns the exponential value of x (e ^ x); if x is out of range, it reports an error. The range of x depends on your device's floating-point representation.
  • log(x): returns the natural logarithm of x; if x is negative, it returns NaN ("not a number").
  • atan2(y, x): returns the arctangent value of y / x in radians.
  • rand(): returns a random number. The value of the rand() function is uniformly distributed between zero and one.

Example of awk Numeric Functions:

Print integer number using int() function:
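
A minimal sketch; the input values 7.9 and -7.9 are arbitrary choices, not taken from the original.

```shell
# int() truncates toward zero, so the fractional part is dropped.
awk 'BEGIN { print int(7.9), int(-7.9) }'
```

This prints "7 -7".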


Calculate the square root of 100 using sqrt() function:
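
A minimal sketch of the call:

```shell
# sqrt() returns the positive square root of its argument.
awk 'BEGIN { print sqrt(100) }'
```

This prints 10.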


Print sine value using sin() function:
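
A minimal sketch; the angle pi/2 is an arbitrary choice, and pi is computed with atan2() because awk has no built-in constant for it.

```shell
# sin() takes its argument in radians.
awk 'BEGIN { pi = atan2(0, -1); printf "%.4f\n", sin(pi / 2) }'
```

This prints 1.0000.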


Print cosine value using cos() function:
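
A minimal sketch; the angle pi is an arbitrary choice, computed with atan2().

```shell
# cos() takes its argument in radians.
awk 'BEGIN { pi = atan2(0, -1); printf "%.4f\n", cos(pi) }'
```

This prints -1.0000.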


Print exponential value using exp() function:
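
A minimal sketch; the argument 1 is an arbitrary choice that yields the constant e.

```shell
# exp(1) is e; print shows six significant digits by default.
awk 'BEGIN { print exp(1) }'
```

This prints 2.71828.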


Print logarithm value using log() function; if the input value is negative, it reports a warning by displaying nan.
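
A minimal sketch; the arguments 1 and -1 are arbitrary choices. How the negative case is displayed (for example nan or -nan, with or without a warning) depends on the awk implementation.

```shell
# log() is the natural logarithm: log(1) is 0.
awk 'BEGIN { print log(1) }'

# A negative argument has no real logarithm; the exact output varies by implementation.
awk 'BEGIN { print log(-1) }'
```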


Print arctangent value using atan2() function. The function atan2(y, x) returns the arctangent of y / x in radians. In this command, we calculate atan2(0, -1), which equals the mathematical constant pi.
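
A minimal sketch of the call with the arguments 0 and -1:

```shell
# atan2(0, -1) evaluates to pi; print shows six significant digits by default.
awk 'BEGIN { print atan2(0, -1) }'
```

This prints 3.14159.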


Print random value between 0 and 1 using rand() function. The value of the rand() function is equally distributed between zero and one.
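
A minimal sketch. Calling srand() first is an addition not mentioned in the text: it seeds the generator from the current time so that different runs can give different values.

```shell
# srand() seeds the generator; rand() then returns a value r with 0 <= r < 1.
awk 'BEGIN { srand(); print rand() }'
```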

Awk Scripts

If the command line becomes awkward, especially for longer programs, and you are familiar with traditional scripts, you can move your awk commands into a script file.

In our scripting example, we are going to perform all of the following operations:

  • Tell the shell which executable to use to run the script.
  • Set the FS (field separator) variable so that awk reads input fields separated by colons (:).
  • Set the OFS (output field separator) variable so that fields in the output are separated by colons (:).
  • Initialize a counter to 0 (zero).
  • Set the second field ($2) of each line of text to blank.
  • Print the line with the modified second field.
  • Increment the counter.
  • Print the value of the counter.

The BEGIN rule carries out the initial steps, while the END rule prints the counter value. The middle rule has no name and no pattern, so it matches every line; it modifies the second field and increments the counter.

The script is provided as text so that you can copy, paste, and run it:
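
The script itself did not survive in this copy of the tutorial, so the following is a reconstruction from the steps listed above, not the original text. The counter name accounts, the interpreter path /usr/bin/awk, the use of an empty string for "blank", and the two sample lines in passwd.sample are all assumptions.

```shell
# Write the script described above to omit.awk.
cat > omit.awk <<'EOF'
#!/usr/bin/awk -f
# The first line names the interpreter; the path may differ on your system.

BEGIN {
    FS = ":"        # input fields are separated by colons
    OFS = ":"       # output fields are joined with colons
    accounts = 0    # initialize the counter
}

{
    $2 = ""         # blank the second field
    print $0        # print the modified line
    accounts++      # count this line
}

END {
    print accounts " accounts."
}
EOF

# Make the script executable so it can also be started as ./omit.awk.
chmod +x omit.awk

# Two sample lines in /etc/passwd format stand in for the real file here.
printf 'root:x:0:0:root:/root:/bin/bash\ndaemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n' > passwd.sample
awk -f omit.awk passwd.sample
```

Assigning to $2 makes awk rebuild the line with OFS, so each output line keeps its colons while the second field is empty, and the END rule prints "2 accounts." for the two sample lines.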

Save the above script in a file named omit.awk and make it executable with the chmod command (for example, chmod +x omit.awk).


Now, we run this script and pass the /etc/passwd file to it. The omit.awk script then processes the "passwd" file.


The script processes the file and displays each line with its second field blanked, followed by the counter value.

Let's see another example that uses expressions and control flow statements to print the squares of the numbers from 1 to 5:
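
A sketch of such a one-liner; the wording of the output text is an assumption.

```shell
# A for loop inside BEGIN: no input file is needed.
awk 'BEGIN { for (i = 1; i <= 5; i++) print "Square of", i, "is", i * i }'
```

The last line printed is "Square of 5 is 25".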

If one-line commands like the above are difficult to read, write, and understand, you can write a separate, longer script file and run it with the awk command.

Let's see how the above square-printing command is written as a script and run. Write the program as a script and save it in the file program.awk.

program.awk

Execute the above script by passing the file name program.awk to the awk interpreter:
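
A sketch of program.awk and the command that runs it. The script body is a reconstruction of the square-printing program, not the original listing.

```shell
# Write the square-printing program to program.awk.
cat > program.awk <<'EOF'
BEGIN {
    for (i = 1; i <= 5; i++) {
        print "Square of", i, "is", i * i
    }
}
EOF

# The -f option tells awk to read the program from a file.
awk -f program.awk
```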


We can also run an awk script as an executable program by adding an interpreter directive (the #! line) that names the awk interpreter:

program2.awk

Save the above script as program2.awk, make it executable, and execute the below command to run the program:
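
A sketch of program2.awk, assuming awk is installed at /usr/bin/awk (the path in the first line is an assumption and may differ on your system). The script body repeats the square-printing program.

```shell
# Write the script with an interpreter directive on its first line.
cat > program2.awk <<'EOF'
#!/usr/bin/awk -f
BEGIN {
    for (i = 1; i <= 5; i++) {
        print "Square of", i, "is", i * i
    }
}
EOF

# Mark the file as executable; it can then be started as ./program2.awk
# when awk lives at the path named in the first line.
chmod +x program2.awk

# Running it through awk -f works wherever awk is installed.
awk -f program2.awk
```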


Some other awk commands:

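Consider a text file "employee.txt" containing the following data, on which we will apply some awk commands.

employee.txt

John Manager Account 48000
Michel Content Developer 35000
Ashutosh Content Developer 30000
James Manager Sales 50000
Akash Software Developer 40000
John Manager Marketing 45000
Mike Product Manager 40000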

1. Print a file:

With no pattern, the print statement of awk prints every record (line) of the input file.
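
A sketch of the command. The file is recreated here from the records shown in the output of this example so that the block runs on its own.

```shell
# Recreate employee.txt from the records listed in this section.
cat > employee.txt <<'EOF'
John Manager Account 48000
Michel Content Developer 35000
Ashutosh Content Developer 30000
James Manager Sales 50000
Akash Software Developer 40000
John Manager Marketing 45000
Mike Product Manager 40000
EOF

# No pattern: the action runs for every record, and print outputs the whole line.
awk '{ print }' employee.txt
```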


Output

John Manager Account 48000
Michel Content Developer 35000
Ashutosh Content Developer 30000
James Manager Sales 50000
Akash Software Developer 40000
John Manager Marketing 45000
Mike Product Manager 40000

2. Print the lines that match a pattern:

Awk command to print all the lines that contain the word 'Content'.
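
A sketch of the command, with employee.txt recreated from the records listed in this section:

```shell
# Recreate employee.txt from the records listed in this section.
cat > employee.txt <<'EOF'
John Manager Account 48000
Michel Content Developer 35000
Ashutosh Content Developer 30000
James Manager Sales 50000
Akash Software Developer 40000
John Manager Marketing 45000
Mike Product Manager 40000
EOF

# /Content/ is a regular-expression pattern; the action runs only on matching lines.
awk '/Content/ { print }' employee.txt
```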


Output

Michel Content Developer 35000
Ashutosh Content Developer 30000

3. Print all the lines that don't match a pattern:

Awk command to print all the lines that don't contain the word 'Content'.
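
A sketch of the command, with employee.txt recreated from the records listed in this section:

```shell
# Recreate employee.txt from the records listed in this section.
cat > employee.txt <<'EOF'
John Manager Account 48000
Michel Content Developer 35000
Ashutosh Content Developer 30000
James Manager Sales 50000
Akash Software Developer 40000
John Manager Marketing 45000
Mike Product Manager 40000
EOF

# The ! operator negates the pattern; with no action, matching lines are printed.
awk '!/Content/' employee.txt
```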


Output

John Manager Account 48000
James Manager Sales 50000
Akash Software Developer 40000
John Manager Marketing 45000
Mike Product Manager 40000

4. Splitting a line into fields:

By default, awk splits each record (line) into fields wherever whitespace occurs and stores them in the $n variables. For example, if a line contains five words, awk stores them in $1, $2, $3, $4, and $5, respectively, and $0 represents the complete line. Let's split each line and print $1 and $4, which hold the Name and Salary fields, respectively.
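
A sketch of the command, with employee.txt recreated from the records listed in this section:

```shell
# Recreate employee.txt from the records listed in this section.
cat > employee.txt <<'EOF'
John Manager Account 48000
Michel Content Developer 35000
Ashutosh Content Developer 30000
James Manager Sales 50000
Akash Software Developer 40000
John Manager Marketing 45000
Mike Product Manager 40000
EOF

# The comma in print inserts the output field separator (a space by default).
awk '{ print $1, $4 }' employee.txt
```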


Output

John 48000
Michel 35000
Ashutosh 30000
James 50000
Akash 40000
John 45000
Mike 40000

5. Use of NR built-in variables (Display Line Number)

Awk command with NR built-in variable prints all the lines along with the line number.
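
A sketch of the command, with employee.txt recreated from the records listed in this section:

```shell
# Recreate employee.txt from the records listed in this section.
cat > employee.txt <<'EOF'
John Manager Account 48000
Michel Content Developer 35000
Ashutosh Content Developer 30000
James Manager Sales 50000
Akash Software Developer 40000
John Manager Marketing 45000
Mike Product Manager 40000
EOF

# NR is the number of the record being processed, starting at 1.
awk '{ print NR, $0 }' employee.txt
```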


Output

1 John Manager Account 48000
2 Michel Content Developer 35000
3 Ashutosh Content Developer 30000
4 James Manager Sales 50000
5 Akash Software Developer 40000
6 John Manager Marketing 45000
7 Mike Product Manager 40000

6. Use of NF built-in variables (Display Last Field)

Awk command with NF built-in variable prints the first field along with the last field, which is the salary in each record.
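
A sketch of the command, with employee.txt recreated from the records listed in this section. Printing $1 next to $NF is inferred from the output shown for this example.

```shell
# Recreate employee.txt from the records listed in this section.
cat > employee.txt <<'EOF'
John Manager Account 48000
Michel Content Developer 35000
Ashutosh Content Developer 30000
James Manager Sales 50000
Akash Software Developer 40000
John Manager Marketing 45000
Mike Product Manager 40000
EOF

# $NF is the last field of each record, whatever the number of fields is.
awk '{ print $1, $NF }' employee.txt
```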


Output

John 48000
Michel 35000
Ashutosh 30000
James 50000
Akash 40000
John 45000
Mike 40000

7. Another use of NR built-in variables (Display Line From 2 to 5)

awk 'NR==2, NR==5 {print NR,$0}' employee.txt

Output

2 Michel Content Developer 35000
3 Ashutosh Content Developer 30000
4 James Manager Sales 50000
5 Akash Software Developer 40000

Consider another file, test.txt containing the following data.

test.txt

1) Print the first item of each row along with the row number (NR), separated by "-", from test.txt:


Output

1 - James
2 - Shiv
3 - Ratan

2) Print the second item of each row from test.txt:


Output

A12
B6
M42

3) Find the length of the longest line present in the file:


Output

12

4) To count the total lines in a file:


Output

3
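
The four test.txt commands can be sketched together as follows. The contents of test.txt are not available in this copy, so the file below is hypothetical: its first two columns follow the outputs shown above, and the third column is invented so that the longest line has 12 characters.

```shell
# Hypothetical test.txt; only the first two columns come from the outputs above.
cat > test.txt <<'EOF'
James A12 11
Shiv B6 22
Ratan M42 33
EOF

# 1) Row number and first item, separated by " - ".
awk '{ print NR " - " $1 }' test.txt

# 2) Second item of each row.
awk '{ print $2 }' test.txt

# 3) Length of the longest line: keep the largest length seen so far.
awk '{ if (length($0) > max) max = length($0) } END { print max }' test.txt

# 4) Total number of lines: NR in the END rule is the final record count.
awk 'END { print NR }' test.txt
```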




