Download the
bash-basics.zip archive.
This contains the sample programs discussed here.
Use the unzip tool to extract the contents:
$ unzip bash-basics.zip
Then change directory to bash-basics and run each of the
files.
Executing a bash script file
Bash script files can be named as you like.
The ".sh" extension used
in these examples is merely a convention which can assist editors.
Unlike Windows systems, the extension is not an essential feature
which determines the usage.
In fact, the UNIX command set contains many Bash scripts
which have no file suffix whatsoever.
All scripts can be executed explicitly
through bash:
$ bash SOME-SCRIPT.sh
In some circumstances you can omit the "bash" call and the script
can act as a standalone executable. This is what needs to happen:
The file itself must be executable by you. This means
that you need "x" permission on the script.
Check the permissions by doing a long-listing of the file
$ ls -l SOME-SCRIPT.sh (on many systems, ll is an alias for ls -l)
If you are
the owner of the script you can add that permission with
statements like:
$ chmod +x SOME-SCRIPT.sh executable by any user
$ chmod 700 SOME-SCRIPT.sh executable only by owner
The script must either
be identified by its path prefix or have its containing
directory in the PATH variable. A full path
to the script might be:
/usr/local/bin/SOME-SCRIPT.sh
If the script is in the shell's current directory, this relative path also identifies it:
./SOME-SCRIPT.sh
The ability to invoke a script "by itself" in the latter case:
SOME-SCRIPT.sh
usually means that "." is in the PATH.
The file must identify itself as to how it is to be executed.
The determination of how a file is "executed" in Linux
is by the first two bytes in the file,
a 16-bit value called the magic number of a file.
If the first two characters are #!,
this indicates that the file
is a text-based script file, and that
the remaining portion of the first line provides the
program to run the script.
Thus, a Bash script begins with this first line:
#!/bin/bash
The "#" character is also understood as a comment, and so if
we run this file by the explicit call:
$ bash scalars.sh
then the first line is simply taken as a comment line with no further
significance. This first line of a file, when starting with #!
is sometimes called the shebang line.
If the file has no identifiable magic number, e.g.,
there is no shebang line, Bash runs it as a Bash script.
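The three requirements above can be tried out in one go. The following is only a minimal sketch, assuming the mktemp command is available and that the temporary directory permits execution; the script name hello.sh is hypothetical and chosen purely for illustration.

```bash
#!/bin/bash
# Sketch: create a tiny script, make it executable, and run it by path.
# The name hello.sh and the temporary directory are illustrative only.
dir=$(mktemp -d)                  # scratch directory for the demonstration
cat > "$dir/hello.sh" <<'EOF'
#!/bin/bash
echo "hello from a script"
EOF
chmod +x "$dir/hello.sh"          # requirement 1: the "x" permission
out=$("$dir/hello.sh")            # requirement 2: identified by its path
echo "$out"                       # requirement 3: the shebang line selected bash
rm -r "$dir"                      # clean up
```

Removing the chmod line should make the direct call fail with a "Permission denied" message, while an explicit bash "$dir/hello.sh" would still work.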
Where to put scripts
Initially when you're developing a script, it can be in any directory
since you'll be running it from that directory. Once you've written
the script and want to use it like this
SOME-SCRIPT
regardless of the directory that you're in, there are two common places
to put this script:
~/bin/:
Use this location for your personal scripts.
/usr/local/bin/:
Use this location for scripts that other users can use.
If you're the only user on the system, it doesn't make much
of a difference.
Prior to relocation of the script,
double-check that there is no such script already in use with
that name by doing:
$ which SOME-SCRIPT
and observing that there is no conflict.
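The same kind of check can be made with the Bash builtin command -v, used here in place of which because it also sees aliases and functions defined in the current shell. This is a sketch only; the function name name_in_use and the sample names are hypothetical.

```bash
#!/bin/bash
# name_in_use NAME: succeed if NAME already resolves to some command.
name_in_use() {
    command -v "$1" > /dev/null 2>&1
}

# "ls" should exist everywhere; the second name is assumed not to exist.
for name in ls no-such-script-xyz; do
    if name_in_use "$name"; then
        echo "$name: already in use"
    else
        echo "$name: free to use"
    fi
done
```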
The Bash Language
The Bash language has three main functions:
execute commands interactively
extend the set of commands via scripts
build up, via sourcing,
the user environment with variables, aliases, functions
In particular, Bash, per se, is not a general-purpose
scripting language like, say, Perl, Python or TCL.
Its main orientation is towards executing the
standard UNIX command set and Bash scripts rely heavily on
the standard UNIX commands.
Interactive Execution
When a shell is run
interactively the lines of the bash program
are created
statement-by-statement instead of creating the entire
script before running it.
From the bash language point-of-view, a shell
is interactive if the prompt variable, PS1, is defined,
since all statements receive this prompt before entry.
Furthermore, interactive execution sources
the statements
instead of
executing them (see below). Interactive execution also permits
many user-friendly control features not necessary in script
execution such as:
line repeat control with up and down arrows
line editing and extension features
tab-based command and filename completion
Variables and Values
The program scalars.sh illustrates basic principles of Bash
variables and values. In particular, there are effectively no
data types in Bash. Every value can be regarded as a string.
Values are created in several ways:
within uninterpolated quotes: ' '
within interpolated quotes: " "
the output of a command within shell evaluated
back quotes
` ` or a command within the syntax $( )
a bareword which is not a Bash reserved word
and contains no special operator characters
The most basic operation on strings is concatenation, which, in Bash,
is simply juxtaposition. When a value is used without quotes,
whitespace sequences within it
are collapsed into a single blank, and
whitespace sequences at the ends of
the string are truncated (i.e., trimmed).
Variables are defined using the assign operator "=" in a very strict sort
of way. Once a variable, v, is defined, its value is automatically used
with the expression $v.
A double-quoted variable's value, like "$y", can
behave differently from $y
when the value has internal whitespace. If there is any doubt,
it is recommended to always use double quotes.
A newline is interpreted as a statement terminator.
A semicolon (;) can also be used as a statement terminator if you want
two or more statements on the same line.
scalars.sh
#!/bin/bash
# file: scalars.sh
x='aa' # no spaces allowed around the "=" operator
y="bb BB"
z=`date` # these are back-quotes (found in upper left of keyboard)
w=$(pwd) # alternative to backquotes
u="${x}QQ$y" # interpolated, the brackets "{ }" protect the value of $x
v='${x}QQ$y' # not-interpolated
echo '$x =' "$x"
echo '$y =' $y # compare $y unquoted
echo '$y =' "$y" # versus "$y"
echo '$z =' "$z"
echo '$w =' "$w"
echo '$u =' "$u"
echo '$v =' "$v"
echo "-------------------------------"
echo first second # barewords allowed
echo "first second" # use quotes to preserve space
echo "-------------------------------"
# numerical strings can be treated numerically inside $(( .. ))
echo $((12+13-7))
n=7
n=$((n*2))
echo $n
echo "-------------------------------"
# echo always terminates with new-line, use "-n" to avoid that
echo -n Type a line to be read "=> " # careful to quote "=>"
read x # read string from standard input
echo "You typed this =>" "|$x|" # double-quotes preserve internal
# spacing, but string is trimmed
Command-line arguments
As mentioned above,
one of the primary purposes of the bash language is to extend the
set of commands. For this reason Bash provides simple access to the
command-line parameters.
Bash uses the variables "$1", "$2", etc.
The expression "$0" is the command name itself.
For the most general usage, they should be double-quoted in order
to maintain the integrity of any internal blanks
within an argument.
Try running this next program these ways:
$ args.sh
$ args.sh a b c
$ args.sh "a b" c
args.sh
#!/bin/bash
# file: args.sh
echo command: "$0" # This is the command (as invoked, possibly with a path)
# to get the simple "last part" of the full command,
# call external basename operation
echo command: $(basename "$0")
echo
# The syntax "$#" gives the number of command-line arguments.
echo number of command line args: "$#"
echo
echo first: "$1"
echo second: "$2"
echo third: "$3"
# "$@" represents all the command line arguments, each kept
# as a separate word; "$*" joins them into a single string
echo all: "$@"
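The difference between "$@", "$*" and an unquoted $@ only shows up when an argument contains blanks. The following sketch makes it visible by counting arguments; the helper name count_args is hypothetical, and set -- is used to simulate a command line.

```bash
#!/bin/bash
# count_args prints how many arguments it received.
count_args() { echo "$#"; }

set -- "a b" c                  # simulate the command line: "a b" c
n_at=$(count_args "$@")         # each argument kept intact
n_star=$(count_args "$*")       # all arguments joined into one string
n_bare=$(count_args $@)         # unquoted: re-split on whitespace
echo "quoted @: $n_at, quoted *: $n_star, unquoted @: $n_bare"
```

The three counts come out as 2, 1 and 3, which is why "$@" is the form to use when passing arguments along to another command.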
if-else statements
The bash if-else syntax is unusual compared to other languages.
The format looks like this:
if ...
then
some statements
elif ...
then
some statements
else
some statements
fi
The "..." sections represent boolean "tests".
The chained elif and
the else parts are optional. The "then" syntax is often
written on the same line as the if portion like this:
if ...; then
In Bash, a newline acts as a statement terminator, so placing then
on the same line as the if portion requires a semicolon separator.
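As a concrete instance of this format, here is a minimal sketch with one elif branch. The function name classify is hypothetical, and the numerical tests -lt and -eq it uses are among the binary test operators covered later.

```bash
#!/bin/bash
# classify N: describe the sign of an integer using if/elif/else.
classify() {
    if [ "$1" -lt 0 ]; then
        echo negative
    elif [ "$1" -eq 0 ]; then
        echo zero
    else
        echo positive
    fi
}

classify -5
classify 0
classify 12
```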
Program exit status
As an example, consider
pingtest.sh
host="$1"
[ "$host" ] || { echo usage: `basename $0` "<host or ip>"; exit 1; }
if ping -w 2 -c 1 "$host" > /dev/null
then
echo status=$?
echo Can ping "$host"
else
echo status=$?
echo Cannot ping "$host"
fi
Try running it with a host that responds and with one that does not.
What is happening is that the
ping operation with the options used is a single ping
which can either succeed or
fail within 2 seconds with these two possible outcomes:
it succeeds with exit status 0: the
test is true and the if part is executed.
it fails with non-zero exit status: the
test is false and the else part is executed.
The $? construct used in
echo status=$?
is a bash special variable which
gives the exit status of the most recent command (and so it has to come
before the second echo statement).
Observe that the notion of true and false in these
bash tests can be counterintuitive: an exit status of 0 means true,
non-zero means false.
In a C++ or Java program, you
can set the exit status with the exit function (System.exit in Java):
exit(0) (default) means success,
exit(1) means failure.
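The exit statuses themselves can be inspected without any network access by using the standard commands true and false. In this sketch each status is captured on the same line as the command, using the && and || operators that are described next; the variable names are hypothetical.

```bash
#!/bin/bash
# true and false are standard commands that only set an exit status.
# Each status is captured on the same line, right after the command.
true  && status_true=$?      # $? is 0 here: success, "true" in a test
false || status_false=$?     # $? is 1 here: failure, "false" in a test
echo "true -> $status_true, false -> $status_false"

# A subshell (like a script) sets its own status with exit.
( exit 3 ) || status_sub=$?
echo "subshell -> $status_sub"
```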
The && and || operators
The && and || operators have pretty much the same sense as in other
languages. Both are short-circuit operations. In bash they can be
used to express the chaining of operations based on success or
failure. A good example is:
c++ myprog.cc && ./a.out
in which we only run the compiled program if the compilation succeeds.
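The short-circuit behavior can be observed directly with the true and false commands. This is a small sketch in which the variable result simply records which right-hand sides were run.

```bash
#!/bin/bash
# && runs the right side only on success, || only on failure.
result=""
true  && result="${result}A"     # runs: left side succeeded
false && result="${result}B"     # skipped: left side failed
false || result="${result}C"     # runs: left side failed
true  || result="${result}D"     # skipped: left side succeeded
echo "$result"
```

Only the letters A and C are collected.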
Boolean expressions in test statements
A boolean expression in an if test
commonly uses this syntax:
if [ BOOLEAN-EXPRESSION ]; then
statements ...
fi
The only value regarded as false
is the empty string.
Bash does not recognize any numerical types per se, only strings
used in a numerical context.
An undefined value is, for the purposes of these tests, equivalent to the empty string
in Bash. You have to be careful about using an undefined
variable in a script since it may be an exported variable
and, thereby, implicitly defined. You can always explicitly undefine
a variable x by:
unset x
You can verify the values of false by running this sample script:
falsetest.sh
x=0; [ "$x" ] && echo x is true 1;
x=""; [ "$x" ] && echo x is true 2;
x=" "; [ "$x" ] && echo x is true 3;
unset x; [ "$x" ] && echo x is true 4;
x=false; [ "$x" ] && echo x is true 5;
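An example usage is this line in pingtest.sh:
[ "$host" ] || { echo usage: `basename $0` "<host or ip>"; exit 1; }
In this example
host is the first parameter; if it is undefined or empty, a "usage" message is given and the script exits.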
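A test like [ "$x" ] cannot tell an unset variable from one that is set to the empty string. When that distinction matters, one option is the ${x+word} expansion, which yields word only if x is defined. A minimal sketch, with hypothetical variable names:

```bash
#!/bin/bash
# ${x+set} expands to "set" if x is defined (even as ""), else to nothing.
unset x
a="[${x+set}]"       # x undefined
x=""
b="[${x+set}]"       # x defined but empty
x=0
c="[${x+set}]"       # x defined and non-empty
echo "unset: $a  empty: $b  zero: $c"
```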
Unary file information operators
A number of common Bash constructions use the unary "-" prefix
file test operators, e.g.,
-e NAME: NAME exists as a file (of some type)
-f NAME: NAME exists as a regular file
-d NAME: NAME exists as a directory
An example of this appears in the ~/.bashrc startup script:
if [ -f ~/.bash_aliases ]; then
. ~/.bash_aliases
fi
There are many good usages for these operators for error checking
in scripts,
such as:
[ -d /etc/X11 ] && { cd /etc/X11; ls -l; }
[ -f "$input_file" ] || { echo no such file "$input_file"; exit 1; }
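To see the three operators side by side, the sketch below applies them to a scratch directory. It assumes mktemp is available; the function name kind_of and the file name notes.txt are hypothetical.

```bash
#!/bin/bash
# Sketch: apply -d, -f and -e to a scratch directory and a file inside it.
dir=$(mktemp -d)                 # temporary directory for the demonstration
touch "$dir/notes.txt"           # an empty regular file

kind_of() {                      # kind_of NAME: classify NAME with the file tests
    if [ -d "$1" ]; then echo directory
    elif [ -f "$1" ]; then echo "regular file"
    elif [ -e "$1" ]; then echo other
    else echo missing
    fi
}

k1=$(kind_of "$dir")
k2=$(kind_of "$dir/notes.txt")
k3=$(kind_of "$dir/no-such-file")
echo "$k1 / $k2 / $k3"
rm -r "$dir"                     # clean up
```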
Binary test operators
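The if operator (and other test operators)
can be used with boolean expressions along with appropriate syntax.
The expressions to be tested
are normally within single brackets [ .. ].
Within these we have these operator usages:
A = B, A != B: string equality and inequality
A -eq B, A -ne B: numerical equality and inequality
A -lt B, A -le B: numerical less than, less than or equal
A -gt B, A -ge B: numerical greater than, greater than or equal
However, both double brackets [[ .. ]]
and double parentheses (( .. ))
can also serve as delimiters.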
The operators < and >
normally represent "file redirection",
but can be used for lexicographic comparison,
within
[[ .. ]]
and numerical comparison within
(( .. )).
Here are some examples:
test_values.sh
#!/bin/bash
x=15
y=6
z="aa a"
w="aa b"
u=02
v=2
echo -n "$u equals $v lexicographically:" $'\t'
if [ "$u" = "$v" ]; then echo "yes"; else echo "no"; fi
echo -n "$u not equals $v lexicog.:" $'\t'
if [ "$u" != "$v" ]; then echo "yes"; else echo "no"; fi
echo -n "$u equals $v numerically:" $'\t'
if [ "$u" -eq "$v" ]; then echo "yes"; else echo "no"; fi
echo -n "$u not equals $v numeric.:" $'\t'
if [ "$u" -ne "$v" ]; then echo "yes"; else echo "no"; fi
echo -n "$x is less than $y numeric.:" $'\t'
if [ "$x" -lt "$y" ]; then echo "yes"; else echo "no"; fi
echo -n "$x less than $y numerically:" $'\t'
if (( "$x" < "$y" )); then echo "yes"; else echo "no"; fi
echo -n "$x less than $y lexicog.:" $'\t'
if [[ "$x" < "$y" ]]; then echo "yes"; else echo "no"; fi
echo -n "'$z' less than '$w' lex.:" $'\t'
if [[ "$z" < "$w" ]]; then echo "yes"; else echo "no"; fi
Other good usages are error checking based on number of script
command line arguments:
[ $# -eq 0 ] && { echo "must have at least one arg."; exit 1; }
Subtle syntax issues
The way Bash deals with strings has certain
unexpected consequences. Consider this program:
errors.sh
x="a"
y="a b"
["$x" ] && echo "non-empty"
[ "$x"] && echo "non-empty"
[ $x ] && echo "non-empty"
[ $y ] && echo "non-empty"
When executed, 3 of the 4 test lines are flagged as errors:
line 4: [a: command not found
line 5: [: missing `]'
line 7: [: a: unary operator expected
The first two mistakes were caused by having the expression
"$x" touch a bracket.
The last was caused by the missing quotes around
the $y expression: Bash split the inserted
value "a b" into two words, and the test interpreted them
as the operator a with argument b.
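For comparison, here is a sketch of the same tests written so that none of them is flagged. The counter variable count is hypothetical and only records how many tests succeeded.

```bash
#!/bin/bash
x="a"
y="a b"
count=0
# Brackets separated from the expression by blanks, values double-quoted.
[ "$x" ] && count=$((count+1))
[ "$y" ] && count=$((count+1))     # "a b" stays a single argument
# Inside [[ .. ]] Bash does not split unquoted values, so this is safe too.
[[ $y ]] && count=$((count+1))
echo "tests passed: $count"
```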
String patterns and the case statement
Bash uses primitive globbing patterns for various matching
operations. The most common is the usage of "*" which
matches any sequence of characters. Less common is "?"
which matches any single character and even less common
are character sets, such as "[A-Z]" and "[^0-9]".
These types of expressions stand in contrast to more powerful
regular expression patterns which, in Bash,
are mostly used through auxiliary commands (Bash itself supports them only in the =~ operator within [[ .. ]]).
Glob patterns are simple, familiar patterns such as
those used commonly in file listing:
$ ls *.html # all HTML files (not starting with ".")
$ ls .??* # dot files with names of 3 or more characters (so not "." or "..")
$ ls test[0-3] # "test0", "test1", "test2", "test3"
The Bash case statement distinguishes itself from
if/else constructions primarily by its
ability to test its cases by matching the argument against
glob patterns. The syntax is like this:
case $file in
*.txt) # treat $file like a text file
;;
*.gif) # treat it like a GIF file
;;
*) # catch-all
;;
esac
Unlike C++ or Java syntax, each case ends with ;; and a
break statement exits an enclosing loop rather than the
particular case.
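A runnable version of this kind of case statement might look like the following sketch. The function name file_kind and the sample file names are hypothetical; no files need to exist, since only the names are matched.

```bash
#!/bin/bash
# file_kind NAME: classify a file name by glob patterns in a case statement.
file_kind() {
    case "$1" in
        *.txt)          echo text ;;
        *.gif | *.png)  echo image ;;      # alternatives are separated by |
        .*)             echo hidden ;;
        *)              echo unknown ;;    # catch-all
    esac
}

for f in notes.txt logo.png .bashrc Makefile; do
    echo "$f: $(file_kind "$f")"
done
```

The patterns are tried in order and only the first match is used, so a name like .notes.txt would be reported as text rather than hidden.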
Loops
Bash has both for and while loops.
However, the type of
control for these is typically not numerical.
The most common looping structure in Bash is the for/in structure
like this:
for x in ...
do
statements involving $x
done
The "..." is a list of things generated in a number of ways.
The x
is the loop variable which iterates through each item in the list.
For example, try running this program in your home directory
fileinfo.sh
for x in *; do
file "$x"
done
In this case the things iterated are the files in the current directory.
One can also do numerical looping
with double parentheses like those used
for numerical comparison above,
as follows:
for ((i=1; i<=10; ++i)); do
echo $i
done
Another common list of "things" are the command-line arguments.
Consider these examples:
loopargs-for.sh
for i in "$@"; do
echo "$i"
done
loopargs-while.sh
while [ "$1" ]; do
echo "$1"
shift
done
Run these programs with some command-line arguments, such as:
$ loopargs-for.sh a b c
$ loopargs-while.sh a b c
Although these two versions appear to function identically, the
latter version is, in fact, more general because the argument
advancing is more under control of the program.
The while loop also has an advantage in its ability to read
"live" input.
For example, this simple program reads and echoes input lines:
while read line; do
echo $line
done
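A variation on this loop that is often seen uses IFS= and the -r option of read, so that leading blanks and backslashes in the input survive. In this sketch the input comes from a here-document rather than the keyboard, purely so that it can be run unattended; the sample lines are made up.

```bash
#!/bin/bash
# Number each input line; IFS= and -r keep blanks and backslashes as typed.
n=0
while IFS= read -r line; do
    n=$((n+1))
    echo "$n: $line"
done <<'EOF'
first line
  second line, indented
third\line
EOF
echo "lines read: $n"
```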
Deferred evaluation
Bash, like most script languages, unlike C++ or Java,
has the ability to do deferred or late evaluation in
which an expression is constructed and "executed" using the
eval operation. This has subtle usages.
The following program
illustrates the initialization of a list of variables in a loop,
something
that would be impossible (at least, extremely awkward)
to do in C++ or Java:
eval-example.sh
n=1
# set x, y, z, w, u to the squares of 2, 3, 4, 5, 6, resp.
for v in x y z w u; do
((++n)) # we do not need $((++n))
eval "$v=$(($n*$n))" # eval "$v=$((n*n))" also works
done
echo $x $y $z $w $u
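Because eval executes whatever string it is handed, it is worth knowing that this particular job can also be done without it. The sketch below is an alternative, not the method used above: it assigns with the printf -v form of the printf builtin and reads the values back with the ${!v} indirect expansion.

```bash
#!/bin/bash
n=1
# set x, y, z, w, u to the squares of 2, 3, 4, 5, 6, resp., without eval
for v in x y z w u; do
    ((++n))
    printf -v "$v" '%d' $((n*n))     # assign to the variable named by $v
done
echo $x $y $z $w $u

# ${!v} reads the value of the variable whose name is stored in v
for v in x y z w u; do
    echo "$v = ${!v}"
done
```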
Processing command-line options
Command-line arguments commonly consist of option arguments
beginning with a "-".
Consider, for example, the unzip command:
$ unzip -q -o FILE.zip -d /usr/local
which extracts FILE.zip into /usr/local, doing so with no output
(-q) and overwriting existing files (-o).
The FILE.zip portion is the argument and the others are options.
Some options, like -d, take an argument themselves.
The unzip command takes many more options (mostly prior to the
argument).
The bash
builtin operation getopts is meant to assist in extracting these
options from the command line. Consider this program:
getopts-test.sh
# get first group of options
while getopts "noqs" flag
do
echo $flag $OPTIND $OPTARG
done
echo $flag $OPTIND
# shift arguments out and look for a non-option argument
shift $((OPTIND-1))
echo $1
shift
# start over after the non-option argument
OPTIND=1
while getopts "d:" flag
do
echo $flag $OPTIND $OPTARG
done
echo $flag $OPTIND
Running this command
$ getopts-test.sh -q -o FILE.zip -d /usr/local
yields the output:
q 2
o 3
? 3
FILE.zip
d 3 /usr/local
? 3
The while loop
while getopts "noqs" flag
runs through the arguments looking for
-n, -o, -q, -s options.
OPTIND gives the (1-based) position of the next argument to be processed. When
a non-option argument is encountered the while loop terminates
with flag set to ?.
We can keep on going by shifting everything out and resetting
OPTIND back to 1.
The second part of the option search uses:
while getopts "d:" flag
The "d:" syntax
indicates that the d option also takes an argument. In
this case, the $OPTARG expression captures that value.
What is really useful about getopts is that it can capture common
"compressed form" argument usage like this:
$ getopts-test.sh -qos FILE.zip -d/usr/local
and treat it the same as if this were entered:
$ getopts-test.sh -q -o -s FILE.zip -d /usr/local
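In practice, getopts is usually combined with a case statement so that each option sets a variable. The following is a sketch of that common pattern under simplifying assumptions: all options come before the file argument, and the function name parse_opts, the option letters and the variable names are hypothetical.

```bash
#!/bin/bash
# parse_opts: a common getopts pattern, with plain options and options
# taking an argument handled in one loop. Results are left in the
# global variables quiet, dest and file.
parse_opts() {
    local flag OPTIND=1 OPTARG       # local OPTIND lets the function be re-run
    quiet=no; dest=.
    while getopts "qd:" flag; do
        case $flag in
            q) quiet=yes ;;
            d) dest=$OPTARG ;;
            *) echo "usage: parse_opts [-q] [-d DIR] FILE" >&2; return 1 ;;
        esac
    done
    shift $((OPTIND-1))              # drop the options that were processed
    file=$1                          # first non-option argument
}

parse_opts -q -d/usr/local FILE.zip
echo "quiet=$quiet dest=$dest file=$file"
```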
String Processing in the Bash Language
The Bash language in
itself has very primitive, unintuitive string-processing operations.
Typically these operations are augmented by various
standard UNIX string processing operations such as
sed, awk and tr which we'll see later.
string_processing.sh
#!/bin/bash
str="abcdefghijkl"
echo '$str' $'\t\t' $str
echo '${str:0:6}' $'\t' ${str:0:6} # substring starting at 0 of length 6
echo '${str:8}' $'\t' ${str:8} # substring starting at position 8
echo "---------------------------------------------"
file="/home/rkline/bin/test.cpp"
echo '$file' $'\t\t' $file
echo '${file#/*/}' $'\t' ${file#/*/} # remove smallest top match to /*/
echo '${file##/*/}' $'\t' ${file##/*/} # remove largest top match to /*/
echo '${file%.cpp}' $'\t' ${file%.cpp} # remove bottom .cpp match
relpath=${file#/}
echo '${file#/}' $'\t' $relpath '(relpath)'
echo '${relpath%%/*}' $'\t' ${relpath%%/*} # remove largest bottom match to /*
echo '${relpath%/*}' $'\t' ${relpath%/*} # remove smallest bottom match to /*
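A typical practical use of these operators is splitting a path into its parts. The following sketch reuses the sample path from string_processing.sh; the variable names dir, base, stem and ext are hypothetical.

```bash
#!/bin/bash
# Split a path into directory, file name, stem and extension
# using only the ${..#..} and ${..%..} expansions.
file="/home/rkline/bin/test.cpp"
dir=${file%/*}          # remove smallest bottom match to /*  -> directory part
base=${file##*/}        # remove largest top match to */      -> file name
stem=${base%.*}         # remove smallest bottom match to .*  -> name without extension
ext=${base##*.}         # remove largest top match to *.      -> extension
echo "dir=$dir base=$base stem=$stem ext=$ext"
```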
A sample utility script
The following Bash utility script, zipit,
illustrates many features of the above sections. This
script is intended to simplify the command-line usage
of the zip operation for zipping directories as well
as regular files.
zipit
#!/bin/bash
ZIP=/usr/bin/zip
RM=/bin/rm
[ -x $ZIP ] || { echo "No such executable: $ZIP"; exit 1; }
[ -x $RM ] || { echo "No such executable: $RM"; exit 1; }
if [ $# -eq 0 ]; then
echo usage: $(basename "$0") "<file_or_dir1> ..."
exit 1
fi
for i in "${@}"; do # process all command-line args
echo "processing: $i"
if [ ! -d "$i" -a ! -f "$i" ]; then # not directory, not file
echo " $i is not directory, nor file, skipping"
continue
else
i="${i%/}" # remove trailing "/" if any
target="$i.zip" # define the target
if [ -f "$target" ]; then
echo " remove old $target"
$RM "$target" # remove target if exists
fi
if [ -d "$i" ]; then
$ZIP -r "$target" "$i" # zip directory recursively
else
$ZIP "$target" "$i" # zip a file
fi
fi
done
A few points to make about this script:
Usage information is provided when no arguments are given
in order to facilitate
user understanding of the operation's intended syntax.
It provides ample error checking.
Line-by-line comments are used to aid understanding of the
obscure Bash syntax.
Executables are specified by full paths and defined by variables.
Functions
Functions represent a generalization of aliases which can use parameters
in non-trivial ways.
In Bash, functions must be defined before being used.
In practice, they are often grouped into files of functions
and sourced before usage.
Functions are supposed to emulate the way commands work.
They do not return values in the usual way; in fact,
any value sent back by
the return statement must be an integer which
acts like the exit code of an executable.
functions.sh
#!/bin/bash
function foo # or: function foo(), or simply: foo()
{
[ $# -ne 0 ] || {
echo "*** foo: must have at least 1 arg."
return 1
}
echo "$1"
echo "$@"
# "return 0" implicit
}
echo "---> call: foo"
if foo; then
echo success: $?
else
echo failure: $?
fi
echo
echo '---> call: foo aa bb cc'
if foo aa bb cc; then
echo success: $?
else
echo failure: $?
fi
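Since return can only send back an integer status, a function that needs to produce a string usually writes it to standard output, and the caller captures it with $( ). The sketch below shows both styles side by side; the function names upper and is_number are hypothetical.

```bash
#!/bin/bash
# upper STRING: "return" a string by writing it to standard output.
upper() {
    echo "$1" | tr a-z A-Z
}

# is_number STRING: return status 0 if STRING is all digits, 1 otherwise.
is_number() {
    case "$1" in
        '' | *[!0-9]*) return 1 ;;
        *)             return 0 ;;
    esac
}

shout=$(upper "hello there")         # capture the output, as with any command
echo "$shout"
if is_number 42; then echo "42 is a number"; fi
if ! is_number 4x2; then echo "4x2 is not a number"; fi
```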
Lists
Bash creates a list simply enough using the parentheses
with whitespace separators, e.g.:
L=(aa bb cc)
Unfortunately, the remaining
syntax for list access operations is unintuitive and basically
hard to remember when you're not using it frequently. Every usage
is surrounded by
${ .. }
For example,
L=(aa bb cc dd ee ff)
${#L[@]} size of the list
"${L[@]}" iterator in for loop
"${L[2]}" element #2 (0 based)
"${L[@]:1:4}" elements in positions 1-4
K=(xx yy zz)
("${L[@]}" "${K[@]}") concatenation of L & K
Here is a relatively simple example:
list_examples1.sh
# capture the output of a command in a list
klinewords=( $(grep kline /usr/share/dict/words) )
# print all the output elements
num=1
for i in "${klinewords[@]}"; do
echo "$num: $i"
num=$((num+1))
done
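The access operations listed above can be exercised without any external command. A minimal sketch follows, in which the variable names size, third, slice and both are hypothetical; it also uses the += form for appending, which is not in the list above.

```bash
#!/bin/bash
L=(aa bb cc dd ee ff)
K=(xx yy zz)
size=${#L[@]}                    # size of the list
third=${L[2]}                    # element #2 (0 based)
slice=("${L[@]:1:4}")            # elements in positions 1-4, as a new list
both=("${L[@]}" "${K[@]}")       # concatenation of L & K
L+=(gg)                          # append an element to L
echo "size=$size third=$third slice=${slice[*]} both=${#both[@]} now=${#L[@]}"
```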
String Processing with commands
The Bash language relies heavily on the UNIX-like environment
in which it resides in order to create utility scripts.
This environment includes
standard UNIX string processing operations such as these:
sed:
(stream editor) for regular-expression substitution
grep:
can be used to perform match testing with
-c (count) option;
patterns are regular expressions rather than glob patterns (the -E option selects extended regular expressions)
awk:
captures the fields of a line (separated by whitespace) and
does operations on these fields;
tr: translate from
one list of characters to another;
often used to convert case
of a string
These external operations are used in Bash via standard I/O.
All the above operations act on standard input when given no file name;
sed, grep and awk also act on text files given as parameters (tr reads only standard input).
A common bash expression
which uses an external OPERATION
to compute some internal value
looks something like this:
result=`echo "input string" | OPERATION`
or
result=$(echo "input string" | OPERATION)
The pipe operator "|" is crucial for
passing the input string to OPERATION via echo.
The following program illustrates some
of these external operations.
string_operations.sh
#!/bin/bash
str="Hello /there: 1 22 33 Testing/";
echo str $'\t\t\t'"$str"
result=$(echo "$str" | tr a-z A-Z) # this uses tr in a typical way
echo 'lower-to-upper case' $'\t'"$result"
echo 'sub T by ** (1)' $'\t'"$(echo "$str" | sed 's/T/**/')"
echo 'sub T by ** (2)' $'\t'"$(echo "$str" | sed 's/T/**/gi')"
echo 'sub blank seq. by _' $'\t'"$(echo "$str" | sed 's/[ ]\+/_/g')"
echo 'remove trailing /' $'\t'"$(echo "$str" | sed 's/\/$//')"
# the "\1" is a "back-reference" to a match surrounded by parentheses
echo 'surround digits' $'\t'"$(echo "$str" | sed 's/\([0-9]\+\)/*\1*/g')"
# the "grep -q pattern" reports success or failure of finding a match
# through its exit status
echo -n 'str matches digit' $'\t'
if echo "$str" | grep -q '[0-9]'; then echo yes; else echo no; fi
echo -n 'str matches "ee"' $'\t'
if echo "$str" | grep -q 'ee'; then echo yes; else echo no; fi
# this uses awk in a very simplistic way
echo 'second chunk of str' $'\t'"$(echo "$str" | awk '{ print $2 }')"