█████ ██ ██ ██ ██
██ ██ ██ ██ ██ ██
███████ ██ █ ██ █████
██ ██ ██ ███ ██ ██ ██
██ ██ ███ ███ ██ ██
awk programming language awk command line tool essential
- Text manipulation
- Automation
- OS:
ubuntu
- IDE:
vim
,VS Code
- Shell:
bash
- Style guidelines:
shellcheck
awk
use the
awk -f
flag followed by the file containing the awk command
vagrant@ubuntu-xenial:~/awk$ cat names.txt
Igor Mc Gregor
Tomas Steele
Wayne Johnson
Molly Black
Levy Woodstock
vagrant@ubuntu-xenial:~/awk$ cat swap.awk
{print $2, $1}
vagrant@ubuntu-xenial:~/awk$ awk -f swap.awk names.txt
Mc Igor
Steele Tomas
Johnson Wayne
Black Molly
Woodstock Levy
vagrant@ubuntu-xenial:~/awk$
field separator with
-F
followed by the separator string or character
agrant@ubuntu-xenial:~/awk$ cat comma.txt
one,two,three,four,five
vagrant@ubuntu-xenial:~/awk$ awk -F , '{print $2}' comma.txt
two
vagrant@ubuntu-xenial:~/awk$
interactive mode
vagrant@ubuntu-xenial:~/awk$ awk -F - '{print $2}'
one-two-three
two
^C
vagrant@ubuntu-xenial:~/awk$
set variable with
-v
flag
vagrant@ubuntu-xenial:~/awk$ awk -v hi=HELLO '{print hi, $0}' names.txt
HELLO Igor Mc Gregor
HELLO Tomas Steele
HELLO Wayne Johnson
HELLO Molly Black
HELLO Levy Woodstock
vagrant@ubuntu-xenial:~/awk$
FS
input field separator
RS
input record separator
OFS
output field separator
ORS
output record separator
NR
number of records
NF
number of fields
FILENAME
Filename
FNR
filename record
- usefull if concatenates more than one file as input
$0
Entire lineNF
the nth field - awk '{print $NF}' text.txt // print the last field
- awk '{print $(NF-1)}' text.txt // print the last -1 field
$
can asign a value to a field - awk '{$1="value"; print $0} // print "value" in field 1 and the followed fields
variables are case sensitive all are global scope
vagrant@ubuntu-bionic:~$ awk '{hello=$1; bye=$2; print hello, bye}'
hello goodbye
hello goodbye
^C
vagrant@ubuntu-bionic:~$
apply operators to variables
vagrant@ubuntu-bionic:~$ awk '{a=1; b=2; print a + b}'
3
^C
vagrant@ubuntu-bionic:~$ awk '{a=1; b=2; print a b}'
12
^C
vagrant@ubuntu-bionic:~$ awk '{a=1; b=2; print a, b}'
1 2
^C
vagrant@ubuntu-bionic:~$
- Math operators
-
-
-
- /
- %
- ^
-
- Increment and decrement operators
- ++
- --
- accept pre increment/decrement and post increment/decrement
vagrant@ubuntu-bionic:~$ awk '{a=1; b=2; print a++, ++b}'
1 3
- Assignment operators
- =
- +=
- -=
- *=
- /=
- %=
- ^=
- Comparision operators
- ==
- !=
- <
- <=
- >
- >= return 1 or true, 0 or false
- Regular expression comparison
- ~
- !~
- Array subscript
- []
vagrant@ubuntu-bionic:~$ awk '{a[1]=$1; a[2]=$2; a[3]=$3; print a[1], a[2], a[3]}'
one two three
one two three
^C
vagrant@ubuntu-bionic:~$
vagrant@ubuntu-xenial:~/awk$ awk -v hi=HELLO '{print hi, $0}' names.txt names2.txt
HELLO Igor Mc Gregor
HELLO Tomas Steele
HELLO Wayne Johnson
HELLO Molly Black
HELLO Levy Woodstock
HELLO Ronald Rivero
HELLO Walt Whitman
vagrant@ubuntu-xenial:~/awk$
regular expresions must be enclosed in quotes
accept the ,!:;
characters as separators
vagrant@ubuntu-xenial:~/awk$ awk -F '[,!:;]' '{print $4}'
one two,three:four;five
five
^C
vagrant@ubuntu-xenial:~/awk$
/abc/ matches "abc", "xxabcxx"" does not match "abxxc", "ab", "ABC"
vagrant@ubuntu-xenial:~/awk$ awk '/on/{print $0}' feaw_greatnesses.txt
Great the risen and fallen nations, and their poets,
Helmsmen of nations, choose your craft! where
Great the daring and venture of sailors on new
explorations.
vagrant@ubuntu-xenial:~/awk$
/a.c/ matches "abc", "axc" does not match "ac"
\ removes special meaning /a\c/ matches "a.c" does not match "ac"
\\ escape the \
/^abc/ matches that begins with "abc" does not match "dabc"
/abc$/ matches that ends with "abc" does not match "abcd"
[] matches any character in the set
/a[xyz]c/ matches any single character in the set, but only one matches "axc", "ayc", "azc" does not match "abc" does not match "axyzc"
ranges
/a[a-z]c/ matches a range of characters from a to z matches "abc", ... "axc", "ayc", "azc"
combined ranges /a[a-zA-Z]c/ matches a range of characters from a to z and A to Z matches "abc", ... "axc", "ayc", "azc", "aAc", "aYc", "aZc"
/a[^a-z]c/ matches any single character not in the set matches "a.c", "aBc" does not match "abc"
zero o more repeats /ab*c/ matches one or more repeats of "ab" matches "abc", "abbbc", "ac"
one or more repeats /ab+c/ matches one or more repeats of "ab" matches "abc", "abbbc" does not match "ac"
zero or one repeats /ab?c/ matches zero or one repeats of "ab" matches "abc", "ac" does not match "abbbc"
specific number of repeats {n} matches exactly n repeats /ab{3}c/ matches "abbbc" does not match "abbbbc"
{n,} matches at least n repeats /ab{3,}c/ matches "abbbc", "abbbbc"
{n, m} repeats from n to m /ab{3,4}/ matches "abbbc", "abbbbc" does not match "abbbbbbc"
/(ab)+c/ matches one or more repeats of "ab" matches "abc", "ababc", "abababc"
greediness
Control the flow execution
if for
or line separator is specified by RS
separate the fields by ,
and the records (lines) by !
vagrant@ubuntu-xenial:~/awk$ cat bigline.txt
one,two,three!four,five,six!seven,eight,nine!ten,eleven,twelve
vagrant@ubuntu-xenial:~/awk$ awk 'BEGIN{RS="!";FS=","} {print $2}' bigline.txt
two
five
eight
eleven
vagrant@ubuntu-xenial:~/awk$
RS=""
empty string as separator takes the line as one field
vagrant@ubuntu-xenial:~/awk$ cat address.txt
Alex Wolf
873 New Hamps Blvd
New York, NY 3334
Young Cheng
1253 Craws Street
Rockford, IL 4356
Marck Zoom
8 Washersmith Ave
San Francisco, CA 18636
vagrant@ubuntu-xenial:~/awk$ awk 'BEGIN {RS="";FS="\n"} {name=$1; address=$2; zip=$3; print "name:" name, "address:" address, "city:" zip} ' address.txt
name:Alex Wolf address:873 New Hamps Blvd city:New York, NY 3334
name:Young Cheng address:1253 Craws Street city:Rockford, IL 4356
name:Marck Zoom address:8 Washersmith Ave city:San Francisco, CA 18636
vagrant@ubuntu-xenial:~/awk$ ;
default field separator is
space default record separator is
\n
new lineORS
output record separatorOFS
output field separator
vagrant@ubuntu-xenial:~/awk$ cat names2.txt
Ronald Rivero
Walt Whitman
vagrant@ubuntu-xenial:~/awk$ awk 'BEGIN{OFS=", ";ORS=" / "} {print $2, $1}' names2.txt
Rivero, Ronald / Whitman, Walt / vagrant@ubuntu-xenial:~/awk$
The comma between $2 and $1 means that awk will use the OFS