Using AWK to get stuff done

This entry was posted on Saturday, 17 April, 2010.

AWK is a very powerful scripting language. It can really help you get stuff done. An example is the parsing of the output files of simulation software and deriving summaries, averages, confidence intervals and the like.

Generic awk statement:
awk '{if([pattern])print [something]}' [inputfile]

For this you can use the simple ‘print’ statement. It works perfectly for the following cases (among many others):

print $0 (prints entire line)
print $1 (prints only column one)
print $1-$3 "\t" sqrt($1)/sqrt($2) "\t" 25-$2

(prints some results of mathematical operations, separated by tabs)
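As a quick sanity check, the three print forms above can be tried on a single input line. This is only a sketch: the values 16, 4 and 7 and the shell variable names are made up for illustration.

```shell
# Made-up sample line with three whitespace-separated columns.
line='16 4 7'

# $0 is the whole line, $1 is the first column.
whole=$(printf '%s\n' "$line" | awk '{print $0}')
first=$(printf '%s\n' "$line" | awk '{print $1}')

# Arithmetic results joined by tabs: 16-7, sqrt(16)/sqrt(4) and 25-4.
calc=$(printf '%s\n' "$line" | awk '{print $1-$3 "\t" sqrt($1)/sqrt($2) "\t" 25-$2}')

printf '%s\n' "$whole" "$first" "$calc"
```

The last line should show 9, 2 and 21 separated by tabs.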

However, the latter can become a problem when a value in the list starts with a minus sign of its own, for example -$2 instead of 25-$2. awk’s print statement then seems to behave strangely: the minus character eats up the tab!

file = [ 12 -23.5 ]
awk '{print $1 "\t" -$2}' file
expect: 12 [tab] 23.5
get: 1223.5

This has to do with how awk parses the ‘-’ (dash or minus): subtraction binds more tightly than string concatenation, so "\t" -$2 is read as the subtraction "\t" - $2. The tab is converted to the number 0 and never reaches the output. Wrap the negated value in parentheses, as in "\t" (-$2), or, if you need more advanced printing, use awk’s printf function (similar to printf in C and C++):

awk '{printf "%d \t %f \n",$1,$2}' file

result: 12 [tab] -23.500000 (prints an integer and a float, separated by a tab and ending with a newline)
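The disappearing tab is easy to reproduce. The sketch below uses the same sample line and compares a plain field, a negated field, a parenthesised negated field and printf. The shell variable names are made up for illustration.

```shell
# Sample line with a negative second column.
line='12 -23.5'

# Printing the field itself keeps the tab.
plain=$(printf '%s\n' "$line" | awk '{print $1 "\t" $2}')

# A minus sign right after a string is parsed as subtraction:
# "\t" - $2 is 0 - (-23.5), so the tab disappears.
eaten=$(printf '%s\n' "$line" | awk '{print $1 "\t" -$2}')

# Parentheses force the unary minus and bring the tab back.
fixed=$(printf '%s\n' "$line" | awk '{print $1 "\t" (-$2)}')

# printf with an explicit format string avoids the ambiguity altogether.
formatted=$(printf '%s\n' "$line" | awk '{printf "%d\t%.1f\n", $1, -$2}')

printf '%s\n' "$plain" "$eaten" "$fixed" "$formatted"
```

Only the second variant loses its tab. The other three keep the two columns apart.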

Suppose we have a file with two columns: column 1 is an index and column 2 a value. To filter out all lines whose value equals 0:
awk '{if ($2 != 0) print $0}' out.matlab
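The same filter can also be written as a bare pattern, because awk prints the line by default when a pattern has no action. A small sketch on made-up data (the four sample lines are an assumption, not real simulation output):

```shell
# Made-up two-column sample: index and value.
data=$(printf '1 0.5\n2 0\n3 1.25\n4 0\n')

# Explicit if, as in the generic statement.
with_if=$(printf '%s\n' "$data" | awk '{if ($2 != 0) print $0}')

# Same filter as a bare pattern; the default action prints the line.
with_pattern=$(printf '%s\n' "$data" | awk '$2 != 0')

printf '%s\n' "$with_pattern"
```

Both forms keep only the lines with index 1 and 3.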

Say you have a (config) file and want to replace every occurrence of FOO with BAR:

awk '{gsub(/FOO/,"BAR"); print}' config.txt > new_config.txt

Or, from a bash script we want to write a variable to the location marked with placeholder FOO:

awk -v bar="${BARS[$CURRENT_BAR]}" '{gsub(/FOO/, bar); print}' config.txt > new_config.txt
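One way to hand a shell value to awk without fiddling with quotes is the -v option, which sets an awk variable before the program runs. A minimal sketch, with a hypothetical template line and replacement value:

```shell
# Hypothetical template line and replacement value, for illustration only.
template='server=FOO port=80'
replacement='example.org'

# -v passes the shell value to awk as the awk variable "bar";
# gsub edits $0 in place and print writes the changed line.
result=$(printf '%s\n' "$template" | awk -v bar="$replacement" '{gsub(/FOO/, bar); print}')

printf '%s\n' "$result"
```

Keep in mind that an & in the replacement stands for the matched text in gsub, so values containing & or backslashes need extra escaping.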

Find an average of a file with numbers (one per line):

awk 'BEGIN{sum=0}{sum+=($1)}END{print sum/NR}' delays.txt > avg_delay.txt
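The same accumulate-then-report pattern extends to other summaries. As a sketch, the version below also tracks the sum of squares to get the population standard deviation next to the mean. The eight sample values are made up, and the NR check is an added guard against empty input.

```shell
# Made-up delay samples, one number per line.
samples=$(printf '2\n4\n4\n4\n5\n5\n7\n9\n')

# Accumulate the sum and the sum of squares, then derive the mean and
# the population standard deviation in the END block.
stats=$(printf '%s\n' "$samples" | awk '
    { sum += $1; sumsq += $1 * $1 }
    END {
        if (NR > 0) {
            mean = sum / NR
            sd = sqrt(sumsq / NR - mean * mean)
            printf "%.2f\t%.2f\n", mean, sd
        }
    }')

printf '%s\n' "$stats"
```

For these samples the mean is 5.00 and the standard deviation 2.00.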

… and there are many more applications of awk …

