awk


Overview

Like sed, the awk utility applies a set of commands to one or more data files. Run from the command line or from a shell script, awk code looks something like this:

awk 'BEGIN {
    statement
    statement
    statement
}
/regular expression/ {
    statement
    statement
    statement
}
END {
    statement
    statement
    statement
}' inputfile.xyz | statement(s) > outputfile.xyz

BEGIN and END each execute once: BEGIN before the first input record is read, END after the last one. The middle block runs once for every input record that matches the regular expression, and its statements are applied to that record.
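
A minimal sketch of this flow, assuming three invented input lines and a hypothetical counter named count (neither comes from a real data file):

```shell
# Sample input is made up for illustration only.
out=$(printf 'apple 3\nbanana 5\ncherry 7\n' | awk '
BEGIN {
    # Runs once, before the first record is read
    print "start"
}
/an/ {
    # Runs for every record that matches the regular expression
    count++
    print "match: " $1
}
END {
    # Runs once, after the last record has been read
    print "matched " count " of " NR " records"
}')
echo "$out"
```

Only the banana line contains "an", so the middle block fires once while NR still counts all three records.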

Here is a snippet of code that will parse various parameters from a WebLogic config.xml file:

 ###
 ### Get WebLogic configuration information 
 ### /usr/local/username/config/domain/config.xml
 ###
 
 awk 'BEGIN {
     # Set the separators before the first record is read
     FS="<"
     RS=">"
 }
 {
     for ( i=1; i<=NF; i++ )
     {
         if ($i ~ /^Server /)
         {
            #print $0
            num = split ($0,word," ")
            for (x=num; x >= 1; x--)
            {
               #print word[x]
               if(word[x] ~ /Machine/)
               {
                  Machine=word[x]
               }
               if(word[x] ~ /Name/)
               {
                  Name=word[x]
               }
               if(word[x] ~ /ListenPort/)
               {
                  ListenPort=word[x]
               }
               if(word[x] ~ /ListenAddress/)
               {
                  ListenAddress=word[x]
               }
 
               if(word[x] ~ /<Server/)
                   print Name ":" ListenAddress ":" ListenPort ":" Machine
            }
         }
     }
 }' "$BASE/config.xml" | sort > "$LOGDIR/domaininfo.out"

 


Hello World

###
### hello.sh
###
### Hello world
###

awk 'BEGIN {
    print "hello world"
}'

 


Print and sort selected fields of a file

###
### print1.sh
###
### Print 2nd and 4th fields
###
### Usage:
###    print1.sh filename
###

xfile=$1 

awk '{
    print $2, $4
}' "$xfile" | sort

 


Derive number of records in a file

###
### recnt.sh
###
### Usage:
###    recnt.sh filename
###
### Derive number of records in a file
###

xfile=$1

awk 'BEGIN {
    print ""
    print "Number of records"
    print ""
}
{
    print $2, $1
}
END {
    # A bare print would emit $0 (the last record in POSIX awk), not a blank line
    print ""
    print "Number of records: " NR
}' "$xfile"

 


Derive ratios between columns

###
### ratio.sh
###
### Usage:
###
###    ratio.sh filename
###
### filename has 2 columns of numbers.
### For each row where column 1 is less than
### column 2, print columns 1 and 2 as well as
### a third column which is the ratio of
### the numbers in columns 1 and 2.
###

xfile=$1

awk '$1 < $2 && $2 != 0 {
    print $0, $1/$2
}' "$xfile"

 


If pattern does not match

###
### notmatch.sh
###
### Usage:
###
###    notmatch.sh filename
###
###
file=$1

awk '$0 !~ /a href/ {
        print $0
    }' "$file"

 


Print hyperlinks in an html file

###
### htmlpattern.sh
###
### Usage:
###
###    htmlpattern.sh filename.html
###
### Print hyperlinks in a file
###
file=$1

awk 'BEGIN {
        RS="<"
        FS=">"
    } $1 ~ /a href/ {
        print "Found " $1
    }' "$file"

 


Increment

###
### relational.sh
###
### Usage:
###
###    relational.sh filename
###
###
file=$1

awk '$0 ~ /a href/ {
        num++          # count of matching records so far
        xyz += num     # running sum of those counts
        print num ", " xyz
    }' "$file"

 


Logical

###
### logical.sh
###
### Usage:
###
###    logical.sh /etc/passwd
###
###  If the third field is both greater than or equal to 100
###  and less than 200, print the record
###
file=$1

awk 'BEGIN { 
         FS=":" 
     }
     $3 >= 100 && $3 < 200 {
         print $0
     }' "$file"

 


Print Arguments

###
### printargs.sh
###
### Usage:
###
###    printargs.sh arg1 arg2 ...
###
###  Print each command-line argument passed to awk
###
awk 'BEGIN {
     for (i = 1; i < ARGC; i++)
        printf "%s ", ARGV[i]
     printf "\n"
     }' "$@"

 


Variables

###
### variables.sh
###
### Usage:
###
###    variables.sh filename keyword1 keyword2
###
### Print value of various variables
###

awk -v key1="$2" -v key2="$3" '

     $0 ~ key1 || $0 ~ key2 {

     printf "\n\n---------------------------------------------\n\n"

     printf "Current input record: 			\n\n\t\"%s\"", $0

     printf "\n\n"

     print "Key phrase: " key1 ", " key2 

     printf "\n\nRecords read so far (NR):			\t\t\"%s\"", 	NR
     printf "\nFields in the current record (NF):		\t\t\"%s\"", 	NF
     printf "\nValue of 1st field ($1):				\t\"%s\"", 	$1
     printf "\nValue of 2nd field ($2):				\t\"%s\"", 	$2
     printf "\nOutput field separator (OFS):			\t\t\"%s\"", 	OFS
     printf "\nOutput record separator (ORS):			\t\t\"%s\"", 	ORS
     printf "\nFilename of current input file (FILENAME):	\t\t\"%s\"", 	FILENAME
     printf "\nPrint format for floating point (OFMT):		\t\t\"%s\"", 	OFMT
     printf "\nNumber of command-line arguments (ARGC):		\t\"%s\"", 	ARGC
     printf "\nFirst command-line argument (ARGV[1]):		\t\t\"%s\"", 	ARGV[1]
     printf "\nRecord number in current file (FNR):		\t\t\"%s\"", 	FNR
     printf "\nLength of string matched (RLENGTH):		\t\t\"%s\"", 	RLENGTH
     printf "\nStart of string matched (RSTART):		\t\t\"%s\"", 	RSTART
     printf "\nSubscript separator (SUBSEP):			\t\t\"%s\"", 	SUBSEP
     printf "\nHOME Env Variable (ENVIRON[\"HOME\"]): 		\t\t\"%s\"", 	ENVIRON["HOME"]
     printf "\nHOSTNAME Env Variable (ENVIRON[\"HOSTNAME\"]):	\t\t\"%s\"", 	ENVIRON["HOSTNAME"]
     printf "\nInput field separator (FS):			\t\t\"%s\"", 	FS
     printf "\nInput record separator (RS):			\t\t\"%s\"", 	RS

     printf "\n\n"

}' "$1"


 


Built-in variables

NR Number of records read so far
NF Number of fields in the current record
FS Input field separator. Default is whitespace
RS Input record separator. Default is newline
$0 Current input record
$n Value of nth field of current input record
$1 Value of 1st field of current input record
$2 Value of 2nd field of current input record
OFS Output field separator. Default is a space.
ORS Output record separator. Default is a newline.
FILENAME Filename of current input file
OFMT Output format for numbers printed by print. Default is %.6g, which prints up to six significant digits.
ARGC Number of command-line arguments
ARGV Array of command-line arguments
FNR Record number in current file
RLENGTH Length of the string matched by the match function
RSTART Position of the first character of the string matched by the match function
SUBSEP Subscript separator used to join multidimensional array keys
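
As a rough illustration of several of these variables working together, the sketch below runs awk over two invented colon-separated lines. The input data and the string "config.xml" passed to match are assumptions chosen for the example.

```shell
# Input lines are made up for illustration.
out=$(printf 'alpha:beta:gamma\none:two\n' | awk '
BEGIN {
    FS = ":"      # split input fields on colons
    OFS = "-"     # join print arguments with a dash
}
{
    # NR is the running record count, NF the field count of this record
    print NR, NF, $NF
}
END {
    # match() sets RSTART and RLENGTH as a side effect
    if (match("config.xml", /\.xml$/))
        print RSTART, RLENGTH
    # A non-integer number is converted with OFMT (default %.6g)
    print 22 / 7
}')
echo "$out"
```

Setting FS and OFS in BEGIN ensures they apply from the first record onward.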

 


Operators

+ Add
- Subtract
* Multiply
/ Divide
% Modulo
== Equality
!= Not equal
> Greater than
< Less than
>= Greater than or equal
<= Less than or equal
++ Increase by one
-- Decrease by one
+= Add and assign
-= Subtract and assign
*= Multiply and assign
/= Divide and assign
%= Modulo and assign
&& Both expressions must be true
|| Either expression can be true
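
A short sketch exercising a few of these operators in a BEGIN block, so no input file is needed. The variable n and its starting value are arbitrary choices for the example.

```shell
out=$(awk 'BEGIN {
    n = 17
    print n % 5          # modulo: remainder of 17 divided by 5
    n++                  # increase by one: n is now 18
    n -= 3               # subtract and assign: n is now 15
    print n
    # && requires both comparisons to hold
    if (n >= 10 && n < 20) {
        print "in range"
    }
    # || requires at least one comparison to hold
    if (n == 0 || n != 15) {
        print "unexpected"
    } else {
        print "n is 15"
    }
}')
echo "$out"
```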


 
