awk


Overview

Like sed, the awk utility applies a set of commands to one or more data files. Run from the command line or from a shell script, awk code looks something like this:

awk 'BEGIN {
    statement
    statement
    statement
}
/regular expression/ {
    statement
    statement
    statement
}
END {
    statement
    statement
    statement
}' inputfile.xyz | statement(s) > outputfile.xyz

The BEGIN block runs once, before any input is read, and the END block runs once, after all input has been processed. Each input record is tested against the regular expression; when a record matches, the statements in that block are applied to it.
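As a minimal sketch of that structure, the following wraps an awk program with all three kinds of block in a shell function. The function name count_matches, the pattern /error/ and the sample input are made up for illustration only.

```shell
# count_matches: hypothetical helper that reports records containing "error".
count_matches() {
    awk 'BEGIN {
        # Runs once, before the first record is read.
        print "scanning for: error"
    }
    /error/ {
        # Runs for every record that matches the pattern.
        count++
        print NR ": " $0
    }
    END {
        # Runs once, after the last record has been read.
        print "matches: " (count + 0)
    }' "$@"
}

# Sample input invented for this example.
printf 'ok\nerror one\nok\nerror two\n' | count_matches
```

With the sample input above, the BEGIN line is printed first, then the two matching records with their record numbers, then the total from the END block.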

Here is a snippet of code that will parse various parameters from a WebLogic config.xml file:

 ###
 ### Get WebLogic configuration information 
 ### /usr/local/username/config/domain/config.xml
 ###
 
 awk 'BEGIN {
     # Set the separators before any input is read:
     # records end at ">" and fields are split on "<".
     FS="<"
     RS=">"
 }
 {
     for ( i=1; i<=NF; i++ )
     {
         if ($i ~ /^Server /)
         {
            #print $0
            num = split ($0,word," ")
            for (x=num; x >= 1; x--)
            {
               #print word[x]
               if(word[x] ~ /Machine/)
               {
                  Machine=word[x]
               }
               if(word[x] ~ /Name/)
               {
                  Name=word[x]
               }
               if(word[x] ~ /ListenPort/)
               {
                  ListenPort=word[x]
               }
               if(word[x] ~ /ListenAddress/)
               {
                  ListenAddress=word[x]
               }
 
               if(word[x] ~ /<Server/)
                   print Name ":" ListenAddress ":" ListenPort ":" Machine
            }
         }
     }
 }' "$BASE/config.xml" | sort > "$LOGDIR/domaininfo.out"
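The snippet relies on treating ">" as the record separator and "<" as the field separator, so that each XML tag arrives as its own record. A reduced sketch of that technique, assuming a one-line sample document and a hypothetical helper named list_tags, could look like this:

```shell
# list_tags: hypothetical helper that prints the name of each opening tag.
list_tags() {
    awk 'BEGIN {
        RS = ">"    # each record ends where a tag closes
        FS = "<"    # text before the tag is $1, the tag body is $2
    }
    NF >= 2 && $2 !~ /^\// {
        # Skip closing tags such as </Domain, then take the first word.
        split($2, part, " ")
        print part[1]
    }' "$@"
}

# Sample XML invented for this example.
printf '<Domain Name="d1"><Server Name="s1" ListenPort="7001"/></Domain>\n' | list_tags
```

For the sample line this prints Domain and Server, one per line. It is only a sketch: attribute values that contain ">" or "<" would confuse this approach, and a real XML parser is the safer choice for anything beyond simple files.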

 


Hello World

###
### hello.sh
###
### Hello world
###

awk 'BEGIN {
    print "hello world"
}'

 


Print selected fields and sort the output

###
### print1.sh
###
### Print 2nd and 4th fields
###
### Usage:
###    print1.sh filename
###

xfile=$1

awk '{
    print $2, $4
}' "$xfile" | sort

 


Derive number of records in a file

###
### recnt.sh
###
### Usage:
###    recnt.sh filename
###
### Derive number of records in a file
###

xfile=$1

awk 'BEGIN {
    print ""
    print "Number of records"
    print ""
}
{
    print $2, $1
}
END {
    # A bare print here would repeat the last record, so print an empty string.
    print ""
    print "Number of records: " NR
}' "$xfile"

 


Derive ratios between columns

###
### ratio.sh
###
### Usage:
###
###    ratio.sh filename
###
### filename has 2 columns of numbers.
### For each row where column 1 is less
### than column 2, print columns 1 and 2
### followed by the ratio column 1 / column 2.
###

xfile=$1

awk '$1 < $2 && $2 != 0 {
    print $0, $1/$2
}' "$xfile"

 


If pattern does not match

###
### notmatch.sh
###
### Usage:
###
###    notmatch.sh filename
### Print every record that does not contain "a href"
###
file=$1

awk '$0 !~ /a href/ {
        print $0
    }' "$file"

 


Print hyperlinks in an html file

###
### htmlpattern.sh
###
### Usage:
###
###    htmlpattern.sh filename.html
###
### Print hyperlinks in a file
###
file=$1

awk 'BEGIN {
        RS="<"
        FS=">"
    } $1 ~ /a href/ {
        print "Found " $1
    }' "$file"

 


Increment

###
### relational.sh
###
### Usage:
###
###    relational.sh filename
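### For each record containing "a href", print a running
### count of matches and a running sum of those counts
###
file=$1

awk '$0 ~ /a href/ {
        num++
        xyz += num
        print num ", " xyz
    }' "$file"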
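When working with counters it helps to keep in mind that a bare name such as n is the value of a variable, while $n is the field whose number is stored in n. A small sketch of the difference, using a hypothetical helper field_vs_value and made-up input:

```shell
# field_vs_value: hypothetical helper contrasting a variable with a field reference.
field_vs_value() {
    awk '{
        n = 2
        # n is the number 2; $n is the second field of the current record.
        print "n=" n, "$n=" $n
    }' "$@"
}

# Sample input invented for this example.
printf 'alpha beta gamma\n' | field_vs_value
```

For the sample record this prints n=2 followed by $n=beta, since beta is the second field.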

 


Logical

###
### logical.sh
###
### Usage:
###
###    logical.sh /etc/passwd
###
###  If the third field is both greater than or equal to 100
###  and less than 200, print the record
###
file=$1

awk 'BEGIN { 
         FS=":" 
     }
     $3 >= 100 && $3 < 200 {
         print $0
     }' "$file"

 


Print Arguments

###
### Print the command-line arguments passed to awk
###
awk 'BEGIN {
     for (i = 1; i < ARGC; i++)
        printf "%s ", ARGV[i]
     printf "\n"
     }' "$@"

 


Variables

###
### variables.sh
###
### Usage:
###
###    variables.sh filename keyword1 keyword2
###
### Print value of various variables
###

awk -v key1="$2" -v key2="$3" '

     $0 ~ /key phrase/ {

     printf "\n\n---------------------------------------------\n\n"

     printf "Current input record: 			\n\n\t\"%s\"", $0

     printf "\n\n"

     print "Key phrase: " key1 ", " key2 

     printf "\n\nRecords read so far (NR):			\t\t\"%s\"", 	NR
     printf "\nFields in the current record (NF):		\t\t\"%s\"", 	NF
     printf "\nValue of 1st field ($1):				\t\"%s\"", 	$1
     printf "\nValue of 2nd field ($2):				\t\"%s\"", 	$2
     printf "\nOutput field separator (OFS):			\t\t\"%s\"", 	OFS
     printf "\nOutput record separator (ORS):			\t\t\"%s\"", 	ORS
     printf "\nFilename of current input file (FILENAME):	\t\t\"%s\"", 	FILENAME
     printf "\nPrint format for floating point (OFMT):		\t\t\"%s\"", 	OFMT
     printf "\nNumber of command-line arguments (ARGC):		\t\"%s\"", 	ARGC
     printf "\nFirst command-line argument (ARGV[1]):		\t\t\"%s\"", 	ARGV[1]
     printf "\nRecord number in current file (FNR):		\t\t\"%s\"", 	FNR
     printf "\nLength of string matched (RLENGTH):		\t\t\"%s\"", 	RLENGTH
     printf "\nStart of string matched (RSTART):		\t\t\"%s\"", 	RSTART
     printf "\nSubscript separator (SUBSEP):			\t\t\"%s\"", 	SUBSEP
     printf "\nHOME Env Variable (ENVIRON[\"HOME\"]): 		\t\t\"%s\"", 	ENVIRON["HOME"]
     printf "\nHOSTNAME Env Variable (ENVIRON[\"HOSTNAME\"]):	\t\t\"%s\"", 	ENVIRON["HOSTNAME"]
     printf "\nInput field separator (FS):			\t\t\"%s\"", 	FS
     printf "\nInput record separator (RS):			\t\t\"%s\"", 	RS

     printf "\n\n"

}' "$1"


 


Built-in variables

NR Number of records read so far
NF Number of fields in the current record
FS Input field separator. Default is a single space, which splits fields on runs of blanks and tabs
RS Input record separator. Default is newline
$0 Current input record
$n Value of nth field of current input record
$1 Value of 1st field of current input record
$2 Value of 2nd field of current input record
OFS Output field separator. Default is a space.
ORS Output record separator. Default is a newline.
FILENAME Filename of current input file
OFMT Output print format for floating point. Default is %.6g, which outputs a value with up to six significant digits.
ARGC Number of command-line arguments
ARGV Array of command-line arguments
FNR Record number in current file
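RLENGTH Length of string matched by match function (-1 if there was no match)
RSTART Starting position of string matched by match function (0 if there was no match)
SUBSEP Subscript separator used to join the indices of multidimensional array keys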
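A short sketch may make several of these variables more concrete. The helper name show_vars and the colon-separated sample input are assumptions made only for this example.

```shell
# show_vars: hypothetical helper that exercises a few built-in variables.
show_vars() {
    awk 'BEGIN {
        FS = ":"        # split input fields on colons
        OFS = " | "     # join output fields with " | "
    }
    {
        # Assigning to a field rebuilds $0 using OFS.
        $1 = $1
        print NR, NF, $0
    }
    END {
        # match() sets RSTART and RLENGTH as a side effect.
        if (match("hello world", /wor/))
            print "RSTART=" RSTART, "RLENGTH=" RLENGTH
        # Non-integer numbers are printed using OFMT (%.6g by default).
        print 3.14159265
    }' "$@"
}

# Sample input invented for this example.
printf 'a:b:c\nd:e\n' | show_vars
```

The first two output lines show the record number, the field count and the record rebuilt with OFS. The last two show the match position and length, and a number shortened to six significant digits.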

 


Operators

+ Add
- Subtract
* Multiply
/ Divide
% Modulo
== Equality
!= Not equal
> Greater than
< Less than
>= Greater than or equal
<= Less than or equal
++ Increase by one
-- Decrease by one
+= Add and assign
-= Subtract and assign
*= Multiply and assign
/= Divide and assign
%= Modulo and assign
&& Both expressions must be true
|| Either expression can be true
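The following sketch combines a few of these operators in one program. The helper name summarize, the range 5 to 10 and the sample numbers are assumptions chosen only for illustration.

```shell
# summarize: hypothetical helper combining arithmetic, relational and logical operators.
summarize() {
    awk '{
        total += $1                  # add and assign
        if ($1 % 2 == 0)             # modulo, then equality
            even++                   # increase by one
        if ($1 >= 5 && $1 <= 10)     # both comparisons must be true
            mid++
    }
    END {
        # Adding 0 forces a numeric 0 when a counter was never incremented.
        print "total=" total, "even=" (even + 0), "mid=" (mid + 0)
    }' "$@"
}

# Sample input invented for this example.
printf '4\n7\n10\n' | summarize
```

For the three sample numbers the output is total=21 even=2 mid=2.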


 
