Bash Basics
(last updated: Jun 11, 2009)

Download the bash-basics.zip archive. This contains the sample programs discussed here. Use the unzip tool to extract the contents:
$ unzip bash-basics.zip
Then change directory to bash-basics and run each of the files.

Executing a bash script file

Bash script files can be named as you like. The ".sh" extension used in these examples is merely a convention which can assist editors. Unlike Windows systems, the extension is not an essential feature which determines the usage. In fact, the UNIX command set contains many Bash scripts which have no file suffix whatsoever.

All scripts can be executed explicitly through bash:
$ bash SOME-SCRIPT.sh
In some circumstances you can omit the "bash" call and the script can act as a standalone executable. This is what needs to happen:
  1. The file itself must be executable by you. This means that you need "x" permission on the script. Check the permissions by doing a long-listing of the file
    $ ls -l SOME-SCRIPT.sh             (ll is an alias for ls -l)
    
    If you are the owner of the script you can add that permission with statements like:
    $ chmod +x SOME-SCRIPT.sh          executable by any user
    $ chmod 700 SOME-SCRIPT.sh         executable only by owner
    
  2. The script must either be identified by its path prefix or have its containing directory in the PATH variable. A full path to the script might be:
    /usr/local/bin/SOME-SCRIPT.sh 
    
    If the script is in the shell's current directory, a relative path with an explicit "./" prefix also identifies it:
    ./SOME-SCRIPT.sh
    
    The ability to invoke a script "by itself" in the latter case:
    SOME-SCRIPT.sh
    
    usually means that "." is in the PATH.
  3. The file must identify itself as to how it is to be executed. The determination of how a file is "executed" in Linux is by the first two bytes in the file, a 16-bit value called the magic number of a file. If the first two characters are #!, this indicates that the file is a text-based script file, and that the remaining portion of the first line provides the program to run the script. Thus, a Bash script begins with this first line:
    #!/bin/bash
    
    The "#" character is also understood as a comment, and so if we run this file by the explicit call:
    $ bash scalars.sh
    
    then the first line is simply taken as a comment line with no further significance. This first line of a file, when starting with #!, is sometimes called the shebang line. If the file has no identifiable magic number, e.g., there is no shebang line, Bash assumes that it is a shell script and runs it as one.
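The three requirements above can be tried out in a scratch directory. The following is only a minimal sketch: the file name hello.sh and the temporary directory are hypothetical, and it assumes the temporary directory allows files in it to be executed.

```shell
#!/bin/bash
# Sketch: create a tiny script, then run it explicitly and as a standalone executable.
# hello.sh and the scratch directory are hypothetical names used only for illustration.
dir=$(mktemp -d)

cat > "$dir/hello.sh" <<'EOF'
#!/bin/bash
echo "hello from $(basename "$0")"
EOF

via_bash=$(bash "$dir/hello.sh")   # explicit call: no "x" permission needed

chmod +x "$dir/hello.sh"           # requirement 1: make it executable
via_path=$("$dir/hello.sh")        # requirements 2 and 3: path prefix + shebang line

echo "$via_bash"
echo "$via_path"
rm -r "$dir"
```

Both calls should print the same line, since the shebang line selects the same interpreter that the explicit call names.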

Where to put scripts

Initially when you're developing a script, it can be in any directory since you'll be running it from that directory. Once you've written the script and want to use it like this
SOME-SCRIPT
regardless of the directory that you're in, there are two common places to put this script: If you're the only user on the system, it doesn't make much of a difference. Prior to relocation of the script, double-check that there is no such script already in use with that name by doing:
$ which SOME-SCRIPT
and observing that there is no conflict.

The Bash Language

The Bash language has three main functions: In particular, Bash, per se, is not a general purpose programming script language like, say, Perl, Python or TCL. Its main orientation is towards executing the standard UNIX command set and Bash scripts rely heavily on the standard UNIX commands.

Interactive Execution

When a shell is run interactively the lines of the bash program are created statement-by-statement instead of creating the entire script before running it. From the bash language point-of-view, a shell is interactive if the prompt variable, PS1 is defined, since all statements receive this prompt before entry.

Furthermore, interactive execution sources the statements instead of executing them (see below). Interactive execution also permits many user-friendly control features not necessary in script execution such as:

Variables and Values

The program scalars.sh illustrates basic principles of Bash variables and values. In particular, there are effectively no data types in Bash. Every value can be regarded as a string. Values are created in several ways:
  1. within uninterpolated quotes: ' '
  2. within interpolated quotes: " "
  3. the output of a command within shell evaluated back quotes ` `
    or a command within the syntax $( )
  4. a bareword which is not a Bash reserved word and contains no special operator characters

The most basic operation on strings is concatenation, which, in Bash, is simply juxtaposition. In general, whitespace sequences are collapsed into a single blank; whitespace sequences at the ends of strings are truncated (i.e., trimmed).

Variables are defined using the assign operator "=" in a very strict sort of way. Once a variable, v, is defined, its value is automatically used with the expression $v. A double-quoted variable's value, like "$y", can behave differently from $y when the value has internal whitespace. If there is any doubt, it is recommended to always use double quotes.

A newline is interpreted as a statement terminator. A semicolon (;) can also be used as a statement terminator if you want two or more statements on the same line.
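The difference between $y and "$y" is easiest to see by counting the words that Bash passes on. This is a small sketch; count_words is a hypothetical helper that is not part of the sample programs.

```shell
#!/bin/bash
# Sketch: how quoting changes the number of words Bash sees.
y="bb    BB"                  # four blanks between the two words

count_words() { echo $#; }    # hypothetical helper: prints its argument count

unquoted=$(count_words $y)    # value is split on whitespace into 2 words
quoted=$(count_words "$y")    # value is passed along as 1 word

echo "unquoted: $unquoted word(s):" $y
echo "quoted:   $quoted word(s): $y"
```

The unquoted form also collapses the internal blanks when echoed, which is the behavior described above.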

scalars.sh
#!/bin/bash
# file: scalars.sh

x='aa'            # no spaces allowed around the "=" operator
y="bb BB"
z=`date`          # these are back-quotes (found in upper left of keyboard)
w=$(pwd)          # alternative to backquotes
u="${x}QQ$y"      # interpolated, the brackets "{ }" protect the value of $x
v='${x}QQ$y'      # not-interpolated

echo '$x =' "$x"
echo '$y =' $y    # compare $y unquoted
echo '$y =' "$y"  # versus "$y"
echo '$z =' "$z"
echo '$w =' "$w"
echo '$u =' "$u"
echo '$v =' "$v"
echo "-------------------------------"
echo first     second     # barewords allowed
echo "first second"       # use quotes to preserve space
echo "-------------------------------"

# numerical strings can be treated numerically inside $(( .. ))
echo $((12+13-7))
n=7
n=$((n*2))
echo $n
echo "-------------------------------"

# echo always terminates with new-line, use "-n" to avoid that
echo -n Type a line to be read "=> "   # careful to quote "=>"
read x                                 # read string from standard input
echo "You typed this =>" "|$x|"        # double-quotes preserve internal
                                       # spacing, but string is trimmed

Command-line arguments

As mentioned above, one of the primary purposes of the Bash language is to extend the set of commands. For this reason Bash provides simple access to the command-line parameters. Bash uses the variables "$1", "$2", etc. The expression "$0" is the command name itself. For the most general usage, they should be double-quoted in order to maintain the integrity of any internal blanks within an argument.

Try running this next program these ways:
$ args.sh 
$ args.sh  a     b    c
$ args.sh "a     b"   c

args.sh
#!/bin/bash
# file: args.sh

echo command: "$0"     # This is the command (with full pathname)

# to get the simple "last part" of the full command,
# call external basename operation
echo command: $(basename "$0")
echo

# The syntax "$#" gives the number of command-line arguments.
echo number of command line args: "$#"
echo

echo first: "$1"
echo second: "$2"
echo third: "$3"

# "$@" represents all the command-line arguments, each as a separate word;
# "$*" joins them into a single string
echo all: "$@"
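To see how the different ways of writing "all arguments" behave, the sketch below counts the words each form produces. The helper count_args is hypothetical, and "set --" is used here only to simulate a command line inside a single script.

```shell
#!/bin/bash
# Sketch: compare "$@", "$*" and unquoted $@ using a hypothetical helper.
count_args() { echo $#; }

set -- "a     b" c            # simulate: args.sh "a     b" c

n_at=$(count_args "$@")       # each argument kept intact
n_star=$(count_args "$*")     # all arguments joined into one string
n_bare=$(count_args $@)       # unquoted: split again on whitespace

echo "\"\$@\": $n_at   \"\$*\": $n_star   \$@: $n_bare"
```

In most scripts "$@" is the form that preserves what the user typed.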

if-else statements

The bash if-else syntax is unusual compared to other languages. The format looks like this:
if ...
then
  some statements
elif ...
then
  some statements
else
  some statements
fi
The "..." sections represent boolean "tests". The chained elif and the else parts are optional. The "then" syntax is often written on the same line as the if portion like this:
if ...; then
In Bash, a newline acts as a statement terminator, so joining the if test and the "then" on one line requires a semicolon separator.
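As a concrete instance of this format, here is a minimal sketch of a full if/elif/else chain. The function name sign_of is hypothetical, and the numerical operators -lt and -eq used in the tests are described in a later section.

```shell
#!/bin/bash
# Sketch: a complete if/elif/else chain; sign_of is a hypothetical function name.
sign_of() {
  if [ "$1" -lt 0 ]; then
    echo negative
  elif [ "$1" -eq 0 ]; then
    echo zero
  else
    echo positive
  fi
}

sign_of -5
sign_of 0
sign_of 12
```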

Program exit status

As an example, consider

pingtest.sh
host="$1"
[ "$host" ] || { echo usage: `basename $0` "<host or ip>"; exit 1; }

if ping -w 2 -c 1 "$host" > /dev/null
then
  echo status=$?
  echo Can ping "$host"
else
  echo status=$?
  echo Cannot ping "$host"
fi
Try these:
$ pingtest.sh taz.cs.wcupa.edu
$ pingtest.sh www.wcupa.edu
What is happening is that the ping operation with the options used is a single ping which can either succeed or fail within 2 seconds with these two possible outcomes: The $? construct used in
echo status=$?
is a bash special variable which gives the exit status of a previous command (and so it has to come before the second echo statement).

Observe that the notion of true and false in these bash tests can be counterintuitive: an exit status of 0 means true, non-zero means false. In a C++ or Java program, you can set the exit status with the exit function: exit(0) (default) means success, exit(1) means failure.
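The same idea can be tried without a network connection. In this sketch, describe is a hypothetical helper that runs any command as an if test and reports the exit status it saw.

```shell
#!/bin/bash
# Sketch: report the exit status of any command; describe is a hypothetical helper.
describe() {
  if "$@" > /dev/null 2>&1; then
    echo "true (status $?)"
  else
    echo "false (status $?)"
  fi
}

r_true=$(describe true)                 # the "true" command always exits with 0
r_false=$(describe false)               # the "false" command always exits with 1
r_three=$(describe bash -c 'exit 3')    # a script can choose its own status

echo "$r_true"
echo "$r_false"
echo "$r_three"
```

Any non-zero status, not just 1, takes the else branch.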

The && and || operators

The && and || operators have pretty much the same sense as in other languages. Both are short-circuit operations. In bash they can be used to express the chaining of operations based on success or failure. A good example is:
c++ myprog.cc && ./a.out
in which we only run the compiled program if the compilation succeeds.
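A sketch of both operators, using mkdir as a command that succeeds the first time and fails the second. The directory names are hypothetical.

```shell
#!/bin/bash
# Sketch: chaining with && and ||; the directory names are hypothetical.
dir=$(mktemp -d)

first="" second=""
mkdir "$dir/out" && first="created"                       # right side runs on success
mkdir "$dir/out" 2> /dev/null || second="already there"   # right side runs on failure

echo "first attempt:  $first"
echo "second attempt: $second"
rm -r "$dir"
```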

Boolean expressions in test statements

A boolean expression in an if test uses this syntax:
if [ BOOLEAN-EXPRESSION ]; then
  statements ...
fi
The only value regarded as false is the empty string. Bash does not recognize any numerical types per se, only strings used in a numerical context. An undefined value is, in every way, equivalent to the empty string in Bash. You have to be careful about using an undefined variable in a script since it may be an exported variable and, thereby, implicitly defined. You can always explicitly undefine a variable x by:
unset x
You can verify the values of false by running this sample script:

falsetest.sh
x=0;     [ "$x" ] && echo x is true 1;
x="";    [ "$x" ] && echo x is true 2;
x=" ";   [ "$x" ] && echo x is true 3;
unset x; [ "$x" ] && echo x is true 4;
x=false; [ "$x" ] && echo x is true 5;
An example usage is this line in pingtest.sh:
[ "$host" ] || { echo usage: `basename $0` "<host or ip>"; exit 1; }
In this example host is the first parameter; if undefined, give a "usage" message.

Unary file information operators

A number of common Bash constructions use the unary file test operators, which carry a "-" prefix, e.g., -f (is a regular file) and -d (is a directory). An example of this appears in the ~/.bashrc startup script:
if [ -f ~/.bash_aliases ]; then
    . ~/.bash_aliases
fi
There are many good usages for these operators for error checking in scripts, such as:
[ -d /etc/X11 ] && { cd /etc/X11; ls -l; }
[ -f "$input_file" ] || { echo no such file "$input_file"; exit 1; }
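The following sketch combines the -d and -f tests in one place. The helper kind_of and the file names are hypothetical; a scratch directory is used so that nothing outside it is touched.

```shell
#!/bin/bash
# Sketch: classify a path with the -d and -f tests; kind_of is a hypothetical helper.
kind_of() {
  if [ -d "$1" ]; then
    echo directory
  elif [ -f "$1" ]; then
    echo file
  else
    echo missing
  fi
}

dir=$(mktemp -d)
touch "$dir/notes.txt"

k_dir=$(kind_of "$dir")
k_file=$(kind_of "$dir/notes.txt")
k_none=$(kind_of "$dir/no-such-entry")
echo "$k_dir $k_file $k_none"
rm -r "$dir"
```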

Binary test operators

The if operator (and other test operators) can be used with boolean expressions along with appropriate syntax. The expressions to be tested are normally within single brackets [ .. ]. Within these we have these operator usages:
=, !=                             (lexicographic comparison)
-eq, -ne, -lt, -le, -gt, -ge      (numerical comparison)
However, both double brackets [[ .. ]] and double parentheses (( .. )) can serve as delimiters. The operators < and > normally represent "file redirection", but can be used for lexicographic comparison within [[ .. ]] and for numerical comparison within (( .. )). Here are some examples:

test_values.sh
#!/bin/bash
x=15
y=6
z="aa a"
w="aa b"
u=02
v=2

echo -n "$u equals $v lexicographically:" $'\t'
if [ "$u" = "$v" ]; then echo "yes"; else echo "no"; fi

echo -n "$u not equals $v lexicog.:" $'\t'
if [ "$u" != "$v" ]; then echo "yes"; else echo "no"; fi

echo -n "$u equals $v numerically:" $'\t'
if [ "$u" -eq "$v" ]; then echo "yes"; else echo "no"; fi

echo -n "$u not equals $v numeric.:" $'\t'
if [ "$u" -ne "$v" ]; then echo "yes"; else echo "no"; fi

echo -n "$x is less than $y numeric.:" $'\t'
if [ "$x" -lt "$y" ]; then echo "yes"; else echo "no"; fi

echo -n "$x less than $y numerically:" $'\t'
if (( "$x" < "$y" )); then echo "yes"; else echo "no"; fi

echo -n "$x less than $y lexicog.:" $'\t'
if [[ "$x" < "$y" ]]; then echo "yes"; else echo "no"; fi

echo -n "'$z' less than '$w' lex.:" $'\t'
if [[ "$z" < "$w" ]]; then echo "yes"; else echo "no"; fi
Other good usages are error checking based on number of script command line arguments:
[ $# -eq 0 ] && { echo "must have at least one arg."; exit 1; }

Subtle syntax issues

The way Bash deals with strings has certain unexpected consequences. Consider this program:

errors.sh
x="a"
y="a b"

["$x" ] && echo "non-empty"
[ "$x"] && echo "non-empty"
[ $x ] && echo "non-empty"
[ $y ] && echo "non-empty"
When executed, 3 of the 4 test lines are flagged as errors:
line 4: [a: command not found
line 5: [: missing `]'
line 7: [: a: unary operator expected
The first two mistakes were caused by having the expression "$x" touch a bracket. The last was caused by the missing quotes around the $y expression in which case it interpreted the inserted expression "a b" as the operator a with argument b.
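For comparison, here is a sketch of several ways to write the same non-empty test so that none of these errors occurs. The counter passed is a hypothetical variable added only to make the outcome visible.

```shell
#!/bin/bash
# Sketch: forms of the non-empty test that avoid the errors above.
x="a"
y="a b"
passed=0

[ "$x" ] && passed=$((passed+1))     # blanks inside both brackets
[ "$y" ] && passed=$((passed+1))     # quotes keep "a b" as a single word
[ -n "$y" ] && passed=$((passed+1))  # explicit "non-empty" test
[[ $y ]] && passed=$((passed+1))     # [[ .. ]] does not split $y into words

echo "passed: $passed of 4"
```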

String patterns and the case statement

Bash uses primitive globbing patterns for various matching operations. The most common is the usage of "*" which matches any sequence of characters. Less common is "?" which matches any single character and even less common are character sets, such as "[A-Z]" and "[^0-9]".

These types of expressions stand in contrast to more powerful regular expression pattern generators which, in Bash, are mostly used through auxiliary commands (the =~ operator inside [[ .. ]] is the exception). Glob patterns are simple, familiar patterns such as those used commonly in file listing:
$ ls *.html      # all HTML files (not starting with ".")
$ ls .??*        # all dot files except "." and ".."
$ ls test[0-3]   # "test0", "test1", "test2", "test3"
The Bash case statement distinguishes itself from if/else constructions primarily by its ability to test its cases by matching the argument against glob patterns. The syntax is like this:
case $file in
  *.txt)  # treat $file like a text file
          ;;
  *.gif)  # treat it like a GIF file
          ;;
  *) # catch-all
     ;;
esac

Unlike in C++ or Java, a break statement here exits an enclosing loop rather than the particular case; each case is ended by the ";;" terminator.
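A runnable sketch of the case syntax follows. The function name kind_of_file and the sample names are hypothetical; the for loop used to drive it is described in the next section.

```shell
#!/bin/bash
# Sketch: a case statement driven by glob patterns; kind_of_file is hypothetical.
kind_of_file() {
  case "$1" in
    *.txt)        echo text ;;
    *.gif|*.png)  echo image ;;      # "|" separates alternative patterns
    test[0-3])    echo test-data ;;  # character set, as in the ls example
    *)            echo other ;;      # catch-all
  esac
}

for f in notes.txt logo.gif test2 test7 Makefile; do
  echo "$f: $(kind_of_file "$f")"
done
```

Only the first matching pattern is used, so the catch-all must come last.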

Loops

Bash has both for and while loops. However, the type of control for these is typically not numerical. The most common looping structure in Bash is the for/in structure like this:
for x in ...
do
  statements involving $x
done
The "..." is a list of things generated in a number of ways. The x is the loop variable which iterates through each item in the list. For example, try running this program in your home directory

fileinfo.sh
for x in *; do
  file "$x"
done
In this case the things iterated are the files in the current directory; the quotes around $x protect file names that contain blanks. Numerical looping is also possible, using double parentheses like those used for numerical comparison above:
for ((i=1; i<=10; ++i)); do
  echo $i
done
Another common list of "things" are the command-line arguments. Consider these examples:

loopargs-for.sh
for i in "$@"; do
  echo "$i"
done

loopargs-while.sh
while [ "$1" ]; do
  echo "$1"
  shift
done
Run these programs with some command-line arguments, such as:
$ loopargs-for.sh a b c
$ loopargs-while.sh a b c
Although these two versions appear to function identically, the latter is, in fact, more general because advancing through the arguments is under the control of the program. Note, however, that the test [ "$1" ] also ends the loop at the first empty argument.
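One way to see what "under the control of the program" can mean is an option that consumes the argument following it. The sketch below is an illustration under assumptions: the function parse_args, the "-n" option and the variables name and rest are all hypothetical, and rest uses the list syntax covered in the Lists section.

```shell
#!/bin/bash
# Sketch: a while/shift loop in which "-n" consumes the argument that follows it.
# parse_args, name and rest are hypothetical names.
parse_args() {
  name=""
  rest=()
  while [ $# -gt 0 ]; do        # counting arguments also copes with empty ones
    case "$1" in
      -n) name="$2"
          shift 2 || shift ;;   # drop "-n" and its value (or just "-n" if it is last)
      *)  rest+=("$1")
          shift ;;
    esac
  done
}

parse_args a -n "Jane Doe" b
echo "name=$name rest=${rest[*]}"
```

A for/in loop over "$@" cannot skip ahead like this without extra bookkeeping.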

The while loop also has an advantage in its ability to read "live" input. For example, this simple program reads and echoes input lines:
while read line; do
  echo $line
done
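The same loop works on any input stream, not just the keyboard. As a sketch, the version below reads from a here-document so that it can run unattended; the variables count and longest are hypothetical additions.

```shell
#!/bin/bash
# Sketch: read lines from a here-document instead of the keyboard.
count=0
longest=""
while read -r line; do             # -r keeps backslashes literal
  count=$((count+1))
  if [ ${#line} -gt ${#longest} ]; then
    longest="$line"
  fi
done <<'EOF'
one
twenty-two
three
EOF

echo "lines: $count, longest: $longest"
```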

Deferred evaluation

Bash, like most script languages, unlike C++ or Java, has the ability to do deferred or late evaluation in which an expression is constructed and "executed" using the eval operation. This has subtle usages.

The following program illustrates the initialization of a list of variables in a loop, something that would be impossible (at least, extremely awkward) to do in C++ or Java:

eval-example.sh
n=1
# set x, y, z, w, u to the squares of 2, 3, 4, 5, 6, resp.
for v in x y z w u; do
  ((++n))                 # we do not need $((++n))
  eval "$v=$(($n*$n))"    # eval "$v=$((n*n))" also works
done
echo $x $y $z $w $u
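Variables set by name can also be read back by name. The sketch below repeats the eval loop and then uses Bash's indirect expansion ${!v}, a feature not used in the sample programs, to collect the values; the variable squares is hypothetical.

```shell
#!/bin/bash
# Sketch: set variables by name with eval, then read them back by name.
n=1
for v in x y z w u; do
  ((++n))
  eval "$v=$((n*n))"            # e.g. the first pass executes: x=4
done

squares=""
for v in x y z w u; do
  squares="$squares $v=${!v}"   # ${!v}: value of the variable whose name is in $v
done
squares="${squares# }"          # drop the leading blank
echo "$squares"
```

Because eval executes whatever string it is given, it should only be used on strings the script itself constructs.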

Processing command-line options

Command-line arguments commonly consist of option arguments beginning with a "-". Consider, for example, the unzip command:
$ unzip -q -o FILE.zip -d /usr/local
which extracts FILE.zip into /usr/local, doing so with no output (-q) and overwriting existing files (-o). The FILE.zip portion is the argument and the others are options. Some options, like -d, take an argument themselves. The unzip command takes many more options (mostly prior to the argument).

The bash builtin operation getopts is meant to assist in extracting these options from the command line. Consider this program:

getopts-test.sh
# get first group of options
while getopts "noqs" flag
do
  echo $flag $OPTIND $OPTARG
done
echo $flag $OPTIND

# shift arguments out and look for a non-option argument
shift $((OPTIND-1))
echo $1
shift

# start over after the non-option argument
OPTIND=1
while getopts "d:" flag
do
  echo $flag $OPTIND $OPTARG
done
echo $flag $OPTIND
Running this command
$ getopts-test.sh -q -o FILE.zip -d /usr/local
yields the output:
q 2
o 3
? 3
FILE.zip
d 3 /usr/local
? 3
The while loop
while getopts  "noqs" flag
runs through the arguments looking for -n, -o, -q, -s options. OPTIND gives the (1-based) position of the next argument to be processed. When a non-option argument is encountered the while loop terminates with flag set to ?. We can keep on going by shifting everything out and resetting OPTIND back to 1.

The second part of the option search uses:
while getopts  "d:" flag
The "d:" syntax indicates that the d option also takes an argument. In this case, the $OPTARG expression captures that value.

What is really useful about getopts is that it can capture common "compressed form" argument usage like this:
$ getopts-test.sh -qos FILE.zip -d/usr/local
and treat it the same as if this were entered:
$ getopts-test.sh -q -o -s FILE.zip -d /usr/local
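In practice getopts is usually paired with a case statement that records each option. The following is a sketch under assumptions: the function parse_opts and the variables quiet, overwrite, dest and remaining are hypothetical, and, unlike the unzip example, all options are placed before the non-option argument so that a single loop suffices.

```shell
#!/bin/bash
# Sketch: getopts combined with a case statement. parse_opts and the variables
# quiet, overwrite, dest and remaining are hypothetical names.
parse_opts() {
  local flag OPTIND=1 OPTARG
  quiet=no overwrite=no dest="."
  while getopts "qod:" flag; do
    case "$flag" in
      q) quiet=yes ;;
      o) overwrite=yes ;;
      d) dest="$OPTARG" ;;      # "d:" means -d takes an argument
      *) return 1 ;;            # unknown option
    esac
  done
  shift $((OPTIND-1))           # discard the options that were processed
  remaining=("$@")              # what is left are the non-option arguments
}

parse_opts -qo -d/usr/local FILE.zip
echo "quiet=$quiet overwrite=$overwrite dest=$dest remaining=${remaining[*]}"
```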

String Processing in the Bash Language

The Bash language in itself has very primitive, unintuitive string-processing operations. Typically these operations are augmented by various standard UNIX string processing operations such as sed, awk and tr which we'll see later.

string_processing.sh
#!/bin/bash
str="abcdefghijkl"
echo '$str' $'\t\t' $str
echo '${str:0:6}' $'\t' ${str:0:6}          # substring starting at 0 of length 6
echo '${str:8}' $'\t' ${str:8}              # substring starting at position 8
echo "---------------------------------------------"

file="/home/rkline/bin/test.cpp"
echo '$file' $'\t\t' $file
echo '${file#/*/}' $'\t' ${file#/*/}        # remove smallest top match to /*/
echo '${file##/*/}' $'\t' ${file##/*/}      # remove largest top match to /*/
echo '${file%.cpp}' $'\t' ${file%.cpp}      # remove bottom .cpp match

relpath=${file#/}
echo '${file#/}' $'\t' $relpath '(relpath)'
echo '${relpath%%/*}' $'\t' ${relpath%%/*}  # remove largest bottom match to /*
echo '${relpath%/*}' $'\t' ${relpath%/*}    # remove smallest bottom match to /*
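A common use of these operators is splitting a path into its parts without calling external commands. A minimal sketch, in which the variable names dir, base, stem and ext are hypothetical:

```shell
#!/bin/bash
# Sketch: split a path into its parts using only Bash pattern removal.
file="/home/rkline/bin/test.cpp"

dir="${file%/*}"      # remove smallest bottom match to /*  -> /home/rkline/bin
base="${file##*/}"    # remove largest top match to */      -> test.cpp
stem="${base%.*}"     # remove smallest bottom match to .*  -> test
ext="${base##*.}"     # remove largest top match to *.      -> cpp

echo "dir=$dir base=$base stem=$stem ext=$ext"
```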

A sample utility script

The following Bash utility script, zipit, illustrates many features of the above sections. This script is intended to simplify the command-line usage of the zip operation for zipping directories as well as regular files.

zipit
#!/bin/bash
ZIP=/usr/bin/zip
RM=/bin/rm

[ -x $ZIP ] || { echo "No such executable: $ZIP"; exit 1; }
[ -x $RM ] || { echo "No such executable: $RM"; exit 1; }

if [ $# -eq 0 ]; then
  echo usage: $(basename $0) "<file_or_dir1> ..."
  exit 1
fi

for i in "${@}"; do                      # process all command-line args
  echo "processing: $i"
  if [ ! -d "$i" -a ! -f "$i" ]; then    # not directory, not file
    echo " $i is not directory, nor file, skipping"
    continue
  else
    i="${i%/}"                           # remove trailing "/" if any
    target="$i.zip"                      # define the target
    if [ -f "$target" ]; then
      echo " remove old $target"
      $RM "$target"                      # remove target if exists
    fi
    if [ -d "$i" ]; then
      $ZIP -r "$target" "$i"             # zip directory recursively
    else
      $ZIP "$target" "$i"                # zip a file
    fi
  fi
done
A few points to make about this script:

Functions

Functions represent a generalization of aliases which can use parameters in non-trivial ways. In Bash, functions must be defined before being used. In practice, they are often grouped into files of functions and sourced before usage.

Functions are supposed to emulate the way commands work. They do not return values in the usual way; in fact, any value sent back by the return statement must be an integer which acts like the exit code of an executable.

functions.sh
#!/bin/bash

function foo      # or: function foo(), or simply: foo()
{
  [ $# -ne 0 ] || {
    echo "*** foo: must have at least 1 arg."
    return 1
  }
  echo "$1"
  echo "$@"
  # "return 0" implicit
}

echo "---> call: foo"
if foo; then
  echo success: $?
else
  echo failure: $?
fi
echo
echo '---> call: foo aa bb cc'
if foo aa bb cc; then
  echo success: $?
else
  echo failure: $?
fi
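Since return can only pass back an integer status, a function that needs to hand back a string typically prints it and lets the caller capture the output with $( ). A sketch, with hypothetical function names square and is_even:

```shell
#!/bin/bash
# Sketch: a function "returns" a string by printing it; the caller captures it
# with $( ). square and is_even are hypothetical function names.
square() {
  echo $(( $1 * $1 ))
}

is_even() {
  [ $(( $1 % 2 )) -eq 0 ]     # status of the last command becomes the return status
}

result=$(square 7)
echo "square of 7: $result"

if is_even "$result"; then parity=even; else parity=odd; fi
echo "$result is $parity"
```

Here square behaves like a command that produces output, while is_even behaves like a command that only succeeds or fails.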

Lists

Bash creates a list simply enough using the parentheses with whitespace separators, e.g.:
L=(aa bb cc)
Unfortunately, the remaining syntax for list access operations is unintuitive and basically hard to remember when you're not using it frequently. Every usage is surrounded by
${ .. }
For example,
L=(aa bb cc dd ee ff)
${#L[@]}                size of the list
"${L[@]}"               iterator in for loop
"${L[2]}"               element #2 (0 based)
"${L[@]:1:4}"           elements in positions 1-4
K=(xx yy zz)
("${L[@]}" "${K[@]}")   concatenation of L & K
Here is a relatively simple example:

list_examples1.sh
# capture the output of a command in a list
klinewords=( $(grep kline /usr/share/dict/words) )

# print all the output elements
num=1
for i in "${klinewords[@]}"; do
  echo "$num: $i"
  num=$((num+1))
done
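The access operations listed above can be checked with a small script that does not depend on a dictionary file. This is a sketch; the variable names size, third, slice and M are hypothetical.

```shell
#!/bin/bash
# Sketch: the list operations above applied to two small lists.
L=(aa bb cc dd ee ff)
K=(xx yy zz)

size=${#L[@]}                 # 6
third="${L[2]}"               # cc (0-based)
slice=("${L[@]:1:4}")         # bb cc dd ee
M=("${L[@]}" "${K[@]}")       # concatenation: 9 elements

echo "size=$size third=$third"
echo "slice=${slice[*]}"
echo "M has ${#M[@]} elements, last is ${M[${#M[@]}-1]}"
```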

String Processing with commands

The Bash language relies heavily on the UNIX-like environment in which it resides in order to create utility scripts. This environment includes standard UNIX string processing operations such as sed, awk, tr and grep. These external operations are used in Bash via standard I/O. All of them act on text files when given a file name as a parameter, or act on standard input when given no file name. A common bash expression which uses an external OPERATION to compute some internal value looks something like this:
result=`echo "input string" | OPERATION`
or
result=$(echo "input string" | OPERATION)
The pipe operator "|" is crucial for passing the input string to OPERATION via echo. The following program illustrates some of these external operations.

string_operations.sh
#!/bin/bash
str="Hello /there: 1 22 33 Testing/";
echo str $'\t\t\t'"$str"

result=$(echo "$str" | tr a-z A-Z)    # this uses tr in a typical way
echo 'lower-to-upper case' $'\t'"$result"

echo 'sub T by ** (1)' $'\t'"$(echo "$str" | sed 's/T/**/')"
echo 'sub T by ** (2)' $'\t'"$(echo "$str" | sed 's/T/**/gi')"
echo 'sub blank seq. by _' $'\t'"$(echo "$str" | sed 's/[ ]\+/_/g')"
echo 'remove trailing /' $'\t'"$(echo "$str" | sed 's/\/$//')"

# the "\1" is a "back-reference" to a match surrounded by parentheses
echo 'surround digits' $'\t'"$(echo "$str" | sed 's/\([0-9]\+\)/*\1*/g')"

# the "grep -q pattern" reports success or failure of finding a match
# through its exit status
echo -n 'str matches digit' $'\t'
if echo "$str" | grep -q '[0-9]'; then echo yes; else echo no; fi
echo -n 'str matches "ee"' $'\t'
if echo "$str" | grep -q 'ee'; then echo yes; else echo no; fi

# this uses awk in a very simplistic way
echo 'second chunk of str' $'\t'"$(echo "$str" | awk '{ print $2 }')"


© Robert M. Kline