AWK Command

The awk command is used for text processing in Linux. The sed command is also used for text processing, but it has some limitations, so awk is a handy alternative. It provides powerful control over the data.

Awk is a powerful scripting language used for text processing. It can search and replace text, and it can sort, validate, and index a database. It is one of the most widely used tools among programmers, who write small but effective programs in the form of statements that define text patterns and the actions applied to them. It acts as a filter in Linux. On Linux it is also referred to as gawk (GNU awk).

AWK is a domain-specific language developed for text processing and used as a reporting and data extraction tool. It is a data-driven scripting language composed of a set of actions to be taken against streams of textual data, either run directly on files or used as part of a pipeline, with the aim of transforming and extracting data, for example to generate formatted reports. The language uses regular expressions, associative arrays, and the string datatype. It is Turing-complete, and early AWK users at Bell Labs often wrote well-structured AWK programs, even though AWK has a restricted application domain and was designed mainly to support one-liner programs.

How is it named as AWK?

The command is named after the initials of the three people who wrote its original version in 1977: Alfred Aho, Peter Weinberger, and Brian Kernighan, all of AT&T Bell Laboratories.

Features of AWK command

Various features of the Awk command are as follows:
Syntax: The Awk command is used as follows: The options can be:
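As an illustration of the general invocation, the sketch below uses three standard awk options (-F, -v, and -f). The sample data, the temporary files, and the variable names (data, prog, stream, cs_students, marks) are hypothetical and chosen only for this example.

```shell
# Hypothetical sample data in the form name:stream:marks
data=$(mktemp)
printf '%s\n' 'Sam:CS:80' 'Ana:EE:65' 'Raj:CS:90' > "$data"

# General shape: awk [options] 'pattern { action }' input-file
# -F sets the input field separator; -v assigns a variable before the program runs.
cs_students=$(awk -F: -v stream=CS '$2 == stream { print $1 }' "$data")
echo "$cs_students"

# -f reads the awk program from a file instead of the command line.
prog=$(mktemp)
printf '%s\n' '{ print $3 }' > "$prog"
marks=$(awk -F: -f "$prog" "$data")
echo "$marks"

# Remove the temporary files created for this sketch.
rm -f "$data" "$prog"
```

Keeping the program in single quotes stops the shell from expanding $1, $2, and so on before awk sees them.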
AWK program structure

An AWK program is a collection of pattern-action pairs, each written as condition { action }, where the condition is an expression and the action is a collection of commands. The input is divided into records; by default, records are separated by newline characters, so the input is split into lines. The program tests every record against every condition and runs the action for each condition that is true. Either the condition or the action may be omitted. The default condition matches every record. The default action is to print the record. This is the same pattern-action structure that sed uses.

In addition to the common logical and arithmetic operators, AWK expressions include the tilde operator (~), which matches a regular expression against a string. As convenient syntactic sugar, /regexp/ without the tilde operator matches against the current record. This syntax derives from sed, which in turn inherited it from the ed editor, where / is used for searching. The convention of using slashes as regular expression delimiters was later adopted by Perl and ECMAScript and is now common. Perl also adopted the tilde operator.

Implementations and versions

AWK was originally written in 1977 and distributed with Version 7 Unix. In 1985 the AWK authors began expanding the language, most importantly by adding user-defined functions. The language is described in the book "The AWK Programming Language", published in 1988, and its implementation was made available in releases of UNIX System V. This version was sometimes called nawk, or "new awk", to avoid confusion with the incompatible older version. The implementation was released under a free software license in 1996 and is still maintained by Brian Kernighan.
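The pattern-action structure described in this section can be sketched with a few one-liners. The three input records and the shell variable names below are hypothetical; the sketch only shows the default condition, the default action, the tilde operator, and the bare /regexp/ shorthand.

```shell
# Hypothetical input: three records, one per line
input=$(printf '%s\n' 'alpha 1' 'beta 2' 'gamma 3')

# Condition only: the default action prints each matching record.
cond_only=$(printf '%s\n' "$input" | awk '$2 > 1')

# Action only: the default condition matches every record.
action_only=$(printf '%s\n' "$input" | awk '{ print $1 }')

# Tilde operator: match a regular expression against a single field.
tilde=$(printf '%s\n' "$input" | awk '$1 ~ /^b/ { print $2 }')

# A bare /regexp/ is shorthand for $0 ~ /regexp/ (the whole current record).
bare=$(printf '%s\n' "$input" | awk '/mm/')

printf '%s\n' "$cond_only" "$action_only" "$tilde" "$bare"
```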
How to define AWK Script?

To define an awk script, use the awk command followed by curly braces {} enclosed in single quotation marks ''. Run without an input file, such a command reads from standard input and prints each line back as it is entered. Press CTRL+D to terminate the program.

AWK Command Examples

To better understand the Awk command, have a look at the examples below. First, create some data on which to apply the various awk operations: student data from different streams. Create the file with the cat command, type in the data, and press CTRL+D to save the file and return to the prompt. The awk command will operate on this student data.

Example 1: List students with the specified pattern. When a pattern is specified, awk prints only the lines that match it.

Example 2: Default behaviour of the awk command. If no pattern is specified, awk displays all lines of the file.

Example 3: Print the specified columns. If column numbers are specified, awk prints only those columns. Printing columns 1 and 5 shows the first and fifth fields of each line; if column 5 does not exist in the file, only column 1 is printed. Likewise, printing columns 1 and 2 lists the first two columns.

Built-in variables in AWK command

The Awk command supports many built-in variables. The field variables $1, $2, and so on break each line of the file into individual fields.

NR: It holds the current count of lines read. The awk command performs its action once for each line. These lines are called records.
NF: It holds the number of fields in the current record.
FS: It holds the field separator character used to divide input lines into fields.
OFS: It stores the output field separator, which separates the output fields.
ORS: It stores the output record separator, which separates the output records. Awk prints the content of ORS automatically after each output record.

Example 4: Print the output with line numbers. To display the line number in the output, use the NR variable with the Awk command.

Example 5: Print the last field of each line. To display the last field, use the NF variable with the Awk command.

Example 6: Separate the output in a specified format. To separate the output records with a symbol such as a hyphen (-), a colon (:), or an underscore (_), assign that symbol to the ORS variable.

Example 7: Print the squares of the numbers from 1 to 8. Looping over the numbers 1 to 8 and printing each number with its square gives the following output:

square of 1 is 1
square of 2 is 4
square of 3 is 9
square of 4 is 16
square of 5 is 25
square of 6 is 36
square of 7 is 49
square of 8 is 64

Example 8: Calculate the sum of a particular column. First create some students' marks data on which to apply the sum operation: run the cat command, type in the data, and press CTRL+D to save the file. The created data can be checked by displaying it with the cat command. Summing the third column of this data gives:

Output: 600

Example 9: Find the sum of individual records. The command for this example prints each student's name together with the sum of that student's marks from the StudentMarks file.

Example 10: Find the value of exp 8.
To find the value of exp 8, print the result of awk's exp function with 8 as its argument. The command prints the value of exp 8, that is, e raised to the power 8.
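As a rough sketch of how the built-in variables and the later examples might look in practice, the commands below run against a small hypothetical marks file. The file contents and the shell variable names are assumptions made for illustration and are not the original student data, so the totals differ from the outputs quoted in the examples.

```shell
# Hypothetical marks file: a name followed by three subject marks
marks=$(mktemp)
printf '%s\n' 'Sam 70 80 90' 'Ana 60 75 85' 'Raj 50 65 95' > "$marks"

# NR holds the current record (line) number.
numbered=$(awk '{ print NR, $1 }' "$marks")

# NF is the number of fields, so $NF is the last field of each record.
last_field=$(awk '{ print $NF }' "$marks")

# ORS replaces the newline normally printed after each record.
joined=$(awk 'BEGIN { ORS = "-" } { print $1 }' "$marks")

# Sum one column across all records and print the total at the end.
col_sum=$(awk '{ sum += $3 } END { print sum }' "$marks")

# Sum the marks within each record.
row_sums=$(awk '{ print $1, $2 + $3 + $4 }' "$marks")

# A BEGIN block runs without reading input, so it suits pure computation.
squares=$(awk 'BEGIN { for (i = 1; i <= 8; i++) print "square of", i, "is", i * i }')
exp8=$(awk 'BEGIN { print exp(8) }')

printf '%s\n' "$numbered" "$last_field" "$joined" "$col_sum" "$row_sums" "$squares" "$exp8"

# Remove the temporary file created for this sketch.
rm -f "$marks"
```

With the default output format, exp(8) is printed to six significant digits.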