awk
Overview
Like sed, the awk utility can apply a set of commands to one or more data files. Run from the command line or from a shell script, an awk program looks something like this:
    awk 'BEGIN { statement; statement; statement }
         /regular expression/ { statement; statement; statement }
         END { statement; statement; statement }' inputfile.xyz | command(s) > outputfile.xyz

The BEGIN and END blocks are each executed once: BEGIN before any input is read, and END after all input has been processed. The regular expression is tested against each record of the input stream, and the statements in its block are applied to every record that matches.
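As a minimal sketch of this BEGIN / pattern / END structure, the following script counts the lines that contain "error" in a small made-up input. The sample data and the names count and out are assumptions for illustration only.

```shell
# Hypothetical sample input: three log lines, two of which contain "error".
out=$(printf 'error: disk full\nok\nerror: timeout\n' | awk '
BEGIN   { count = 0 }                  # runs once, before any input is read
/error/ { count++ }                    # runs for each record matching the regex
END     { print "matches: " count }    # runs once, after all input is read
')
printf '%s\n' "$out"
```

Only the middle block is evaluated per record; the BEGIN and END blocks bracket the whole run.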
Here is a snippet of code that will parse various parameters from a WebLogic config.xml file:
    ###
    ### Get WebLogic configuration information
    ### /usr/local/username/config/domain/config.xml
    ###
    awk 'BEGIN {
        # Set the separators before the first record is read:
        # records end at ">" and fields are split at "<".
        FS = "<"
        RS = ">"
    }
    {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^Server /) {
                # Split the tag into space-separated attributes.
                num = split($0, word, " ")
                # Walk backwards so the attributes are collected
                # before the "<Server" token triggers the print.
                for (x = num; x >= 1; x--) {
                    if (word[x] ~ /Machine/)       { Machine = word[x] }
                    if (word[x] ~ /Name/)          { Name = word[x] }
                    if (word[x] ~ /ListenPort/)    { ListenPort = word[x] }
                    if (word[x] ~ /ListenAddress/) { ListenAddress = word[x] }
                    if (word[x] ~ /<Server/)
                        print Name ":" ListenAddress ":" ListenPort ":" Machine
                }
            }
        }
    }' "$BASE/config.xml" | sort > "$LOGDIR/domaininfo.out"
Hello World:
    ###
    ### hello.sh
    ###
    ### Hello world
    ###
    awk 'BEGIN { print "hello world" }'
Sort contents of a file
    ###
    ### print1.sh
    ###
    ### Print 2nd and 4th fields, then sort the output
    ###
    ### Usage:
    ###     print1.sh filename
    ###
    xfile=$1
    awk '{ print $2, $4 }' "$xfile" | sort
Derive number of records in a file
    ###
    ### recnt.sh
    ###
    ### Usage:
    ###     recnt.sh filename
    ###
    ### Derive number of records in a file
    ###
    xfile=$1
    awk 'BEGIN {
        print ""
        print "Number of records"
        print ""
    }
    { print $2, $1 }
    END {
        # print "" gives a blank line; a bare print here could
        # repeat the last record instead.
        print ""
        print "Number of records: " NR
    }' "$xfile"
Derive ratios between columns
    ###
    ### ratio.sh
    ###
    ### Usage:
    ###
    ###     ratio.sh filename
    ###
    ### filename has 2 columns of numbers.
    ### For each record where column 1 is less than column 2,
    ### print columns 1 and 2 as well as a third column
    ### which is the ratio of the numbers in columns 1 and 2.
    ###
    xfile=$1
    awk '$1 < $2 { print $0, $1/$2 }' "$xfile"
If pattern does not match
    ###
    ### notmatch.sh
    ###
    ### Usage:
    ###
    ###     notmatch.sh filename
    ###
    ### Print every record that does not contain "a href"
    ###
    file=$1
    awk '$0 !~ /a href/ { print $0 }' "$file"
Print hyperlinks in an html file
    ###
    ### htmlpattern.sh
    ###
    ### Usage:
    ###
    ###     htmlpattern.sh filename.html
    ###
    ### Print hyperlinks in a file
    ###
    file=$1
    awk 'BEGIN {
        RS = "<"
        FS = ">"
    }
    $1 ~ /a href/ { print "Found " $1 }' "$file"
Increment
    ###
    ### relational.sh
    ###
    ### Usage:
    ###
    ###     relational.sh filename
    ###
    file=$1
    awk '$0 ~ /a href/ {
        num++          # count of matching records so far
        xyz += num     # running total of those counts
        print num ", " xyz
    }' "$file"
Logical
    ###
    ### logical.sh
    ###
    ### Usage:
    ###
    ###     logical.sh /etc/passwd
    ###
    ### If the third field is both greater than or equal to 100
    ### and less than 200, print the record
    ###
    file=$1
    awk 'BEGIN { FS = ":" }
    $3 >= 100 && $3 < 200 { print $0 }' "$file"
Print Arguments
    ###
    ### Print the command-line arguments passed to awk,
    ### separated by spaces, on a single line
    ###
    awk 'BEGIN {
        for (i = 1; i < ARGC; i++)
            printf "%s ", ARGV[i]
        printf "\n"
    }' "$@"
Variables
    ###
    ### variables.sh
    ###
    ### Usage:
    ###
    ###     variables.sh filename keyword1 keyword2
    ###
    ### Print value of various variables
    ###
    awk -v key1="$2" -v key2="$3" '
    $0 ~ /key phrase/ {
        printf "\n\n---------------------------------------------\n\n"
        printf "Current input record: \n\n\t\"%s\"", $0
        printf "\n\n"
        print "Key phrase: " key1 ", " key2
        printf "\n\nRecords read so far (NR): \t\t\"%s\"", NR
        printf "\nFields in the current record (NF): \t\t\"%s\"", NF
        printf "\nValue of 1st field ($1): \t\"%s\"", $1
        printf "\nValue of 2nd field ($2): \t\"%s\"", $2
        printf "\nOutput field separator (OFS): \t\t\"%s\"", OFS
        printf "\nOutput record separator (ORS): \t\t\"%s\"", ORS
        printf "\nFilename of current input file (FILENAME): \t\t\"%s\"", FILENAME
        printf "\nPrint format for floating point (OFMT): \t\t\"%s\"", OFMT
        printf "\nNumber of command-line arguments (ARGC): \t\"%s\"", ARGC
        printf "\nFirst command-line argument (ARGV[1]): \t\t\"%s\"", ARGV[1]
        printf "\nRecord number in current file (FNR): \t\t\"%s\"", FNR
        printf "\nLength of string matched (RLENGTH): \t\t\"%s\"", RLENGTH
        printf "\nStart of string matched (RSTART): \t\t\"%s\"", RSTART
        printf "\nSubscript separator (SUBSEP): \t\t\"%s\"", SUBSEP
        printf "\nHOME Env Variable (ENVIRON[\"HOME\"]): \t\t\"%s\"", ENVIRON["HOME"]
        printf "\nHOSTNAME Env Variable (ENVIRON[\"HOSTNAME\"]): \t\t\"%s\"", ENVIRON["HOSTNAME"]
        printf "\nInput field separator (FS): \t\t\"%s\"", FS
        printf "\nInput record separator (RS): \t\t\"%s\"", RS
        printf "\n\n"
    }' "$1"
Built-in variables
    NR          Number of records read so far
    NF          Number of fields in the current record
    FS          Input field separator. Default is whitespace.
    RS          Input record separator. Default is newline.
    $0          Current input record
    $n          Value of nth field of current input record
    $1          Value of 1st field of current input record
    $2          Value of 2nd field of current input record
    OFS         Output field separator. Default is a space.
    ORS         Output record separator. Default is a newline.
    FILENAME    Filename of current input file
    OFMT        Output print format for floating point. Default is %.6g,
                which outputs a value with up to six significant digits.
    ARGC        Number of command-line arguments
    ARGV        Array of command-line arguments
    FNR         Record number in current file
    RLENGTH     Length of string matched by match function
    RSTART      Start of string matched by match function
    SUBSEP      Subscript separator
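A short sketch of a few of these variables in use may help. It assumes a made-up, colon-separated input shaped like /etc/passwd; the sample data and the shell variable names out and m are hypothetical.

```shell
# Hypothetical colon-separated sample data, similar in shape to /etc/passwd.
out=$(printf 'root:x:0\nalice:x:1000\n' | awk '
BEGIN { FS = ":"; OFS = "-" }    # split input on ":", join output with "-"
      { print NR, NF, $1 }       # record number, field count, first field
END   { print "total", NR }      # NR holds the final record count here
')
printf '%s\n' "$out"

# RSTART and RLENGTH are set by the match function.
m=$(printf 'hello world\n' | awk '{ if (match($0, /wor/)) print RSTART, RLENGTH }')
printf '%s\n' "$m"
```

The commas in print are what cause the output field separator to be inserted between the values; concatenating the values with spaces instead would bypass it.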
Operators
    +     Add
    -     Subtract
    *     Multiply
    /     Divide
    %     Modulo
    ==    Equality
    !=    Not equal
    >     Greater than
    <     Less than
    >=    Greater than or equal
    <=    Less than or equal
    ++    Increase by one
    --    Decrease by one
    +=    Add and assign
    -=    Subtract and assign
    *=    Multiply and assign
    /=    Divide and assign
    %=    Modulo and assign
    &&    Both expressions must be true
    ||    Either expression can be true
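Several of these operators can be combined in one small program. The sketch below, which uses no input file and whose variable names are chosen purely for illustration, counts and sums the even numbers between 4 and 10.

```shell
# Count and sum the even numbers from 4 to 10, using only a BEGIN block.
out=$(awk 'BEGIN {
    for (i = 1; i <= 10; i++) {          # ++ increases i by one
        if (i % 2 == 0 && i >= 4) {      # % is modulo, && needs both tests true
            sum += i                     # += adds and assigns
            n++
        }
    }
    print n, sum
}')
printf '%s\n' "$out"
```

Because everything happens in BEGIN, awk exits without waiting for input.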