Bash Basics

Editor Choices

When you choose an editor/IDE application, some desirable features are syntax colorization, multi-file editing, and remembering state between sessions. Considering the vim (full version) and nano shell editors, both support syntax colorization based on the file type specified by the first line. Additionally, vim can remember where you were editing when a file is re-opened.

The gedit editor, in addition to supporting file-type syntax colorization, supports multi-file editing.

Geany IDE

Going beyond basic text editors means invoking some IDE which manages and maintains multiple opened files in some way, like NetBeans or Eclipse. At the moment these higher level IDEs are actually more than what we need and introduce an unnecessary level of complexity.

A very good simple IDE is geany, which supports syntax highlighting and multi-file editing while maintaining your opened files from one invocation to another. It has a somewhat primitive notion of multi-file projects, where only one project at a time can be active. A very useful feature is that it is easy to open new files within the same folder as other already-opened files.

Install it by:
$ sudo apt-get install geany  
Once installed, you can access it through the menu system:
Applications ⇾ Programming ⇾ Geany
All you really have to do is to open any file to get going. There is very little initial configuration you have to do. These are possibilities:
  1. From the View menu, I remove Message Window and Side Bar.
  2. Also from the View menu, the first choice gives the option of changing the font size (and/or other features); perhaps do this later after you have opened a file.
  3. Other features can be changed from Edit ⇾ Preferences. For example, from the Editor tab, I uncheck Code Folding.

Geany Projects

If you want to use geany for files in more than one unrelated directory, it is probably a good idea to separate them into distinct projects. Like any project, the information about it is stored in a file within your home directory. Geany makes this explicit, and so it is easiest to use if you prepare by:
$ mkdir ~/projects
When you create a new project, any opened one must be closed. A popup menu gives:
     Name:
 Filename: (the file maintaining your project information)
Base path: (the starting directory holding the files of your project)
It is easiest to set the Name field first. This will automatically set the corresponding Filename in ~/projects. Then use the navigation button to the right of the Base path field to set the project's starting directory.

Install bash_basics

Download the source archive bash_basics.zip. You can unzip it anywhere, such as into your home folder, which is what we'll assume. From the shell:
$ cd
$ unzip Downloads/bash_basics.zip
Once extracted, have Geany open any one of the files, say, scalars.sh. You can make this into a project now or later, or not at all; it's up to you. From the command shell, change directory to where you've installed it:
$ cd ~/bash_basics

Execution options

For the most part we will execute scripts directly from an external terminal shell, like this:
$ cd ~/bash_basics
$ ./scalars.sh
The script expects user input, so when you see this prompt, key in something and hit Enter:
Type a line to be read => 
Geany, like other IDEs, provides an option of running scripts through the IDE. One limitation of such execution is that it takes more effort than it's worth to provide run-time arguments, which are commonly used by shell scripts.

Bash manual

There is far too much content in the Bash language to be covered in any single document like this one, a tutorial, or even an introductory textbook. Inevitably, if you need to write programs in Bash, you will have to consult the on-line manual:
$ man bash
The man command displays its content inside a pager; for the most part, only a few of the pager's features (paging, searching, quitting) are ever needed. You can create a high-quality PDF document containing the contents of this man page with this simple command:
$ man -Tps bash | ps2pdf - bash.pdf
What is happening is that man -Tps renders the man page as PostScript, which ps2pdf (reading standard input via "-") converts into the PDF file bash.pdf. Double-clicking the file will open it using the built-in PDF reader which can also be summoned from the command shell by:
$ gnome-open bash.pdf &
The standard document viewer on most Ubuntu systems is evince. MATE uses a clone called atril. One great feature of atril/evince is that it remembers and displays the last page you were viewing in the document, allowing better continuity while researching a topic from a man page.

Executing a bash script file

Bash script files can be named as you like. Unlike Windows systems, the extension is not an essential feature which determines the usage. As we mentioned above, the ".sh" extension is merely a convention which can assist editor recognition. All scripts can be executed explicitly using the bash executable:
$ bash SOME-SCRIPT.sh
For the most part, we want to treat these scripts as if they were standalone executables and omit the explicit "bash" call. This is what needs to happen:
  1. The file itself must be executable by you. This means that you need "x" permission on the script. Check the permissions by doing a long listing of the file:
    $ ls -l SOME-SCRIPT.sh             (ll is an alias for ls -l)
    
    If you are the owner of the script you can add that permission with statements like:
    $ chmod +x SOME-SCRIPT.sh          executable by any user
    
    or
    $ chmod 700 SOME-SCRIPT.sh         executable only by owner
    
  2. The file must either be locatable by its path prefix or have its containing directory in the PATH variable. A full path to the script might be:
    /usr/local/bin/SOME-SCRIPT.sh 
    
    If the script is in the shell's current directory, this is also a full path:
    ./SOME-SCRIPT.sh
    
    The ability to invoke a script "by itself" in the latter case:
    SOME-SCRIPT.sh
    
    means that the current working directory "." must be in the PATH (see the sketch after this list).
  3. The file must identify itself as to how it is to be executed. The determination of how a file is "executed" in Linux is by the first two bytes in the file, a 16-bit value called the magic number of a file. If the first two characters are #!, this indicates that the file is a text-based script file, and that the remaining portion of the first line provides the program to run the script. Thus, a Bash script begins with this first line:
    #!/bin/bash
    
    The "#" character is also understood as a comment, and so if we run this file by the explicit call:
    $ bash scalars.sh
    
    then the first line is simply taken as a comment line with no further significance. This first line of a file, when starting with #!, is sometimes called the shebang line. If the file has no identifiable magic number, e.g., there is no shebang line, the shell falls back to executing the file as a Bash script.
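As a quick sketch of the PATH mechanics from point 2 above (the ~/bin location is only an example of a personal script directory):
$ echo "$PATH"                     # inspect the current search path
$ export PATH="$HOME/bin:$PATH"    # prepend a personal bin directory
$ export PATH="$PATH:."            # or append the current directory (use with caution)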

What to name scripts

The ".sh" extension is not at all necessary in Linux. Generally Linux systems do not use extensions for their system scripts. As explained above, a script's function is technically specified in the shebang line. Some editors, like Eclipse with shell support, require the .sh for recognition; none of the editors mentioned so far do.

Try this experiment:
  1. Make two copies of scalars.sh without any extension:
    $ cp scalars.sh scalars1
    $ cp scalars.sh scalars2
    
  2. Open Geany on scalars1: you still get the syntax highlighting.
  3. Open Geany on scalars2, delete the top shebang line, then close and re-open it. Now you lose the syntax highlighting.
  4. Repeat with nano and gedit to prove the point:
    $ nano scalars1
    $ nano scalars2
    $ gedit scalars1
    $ gedit scalars2
    

Where to put scripts

When you're developing a script, it can be in any directory and you can run it from that directory. Once you've written the script and want to make general use of it like this
SOME-SCRIPT
regardless of the directory that you're in, there are two common places to put this script: your personal ~/bin directory or the system-wide /usr/local/bin. If you're the only user on the system, the issue becomes whether you want root to easily access your script or not. Before relocating the script within the system, double-check that there is no script already in use with that name by doing:
$ which SOME-SCRIPT
and observing that there will be no conflict.

The Bash Language

The Bash language has three main functions: interactive command execution, script execution, and extending the standard command set. In particular, Bash, per se, is not a general-purpose scripting language like, say, Perl, Python or TCL. Its main orientation is towards executing the standard UNIX command set, and Bash scripts rely heavily on the standard UNIX commands.

Interactive Execution

When a shell is run interactively, the lines of a Bash program are entered one by one. Shell code usually considers the session to be interactive if the prompt variable PS1 is defined, since all statements receive this prompt before entry. In interactive execution, Bash will source each statement, which is a form of execution in which all variable settings are retained. Interactive execution also permits many user-friendly control features not necessary in script execution, such as command history, tab completion, and job control.
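As a minimal sketch of the PS1 test just described (a common idiom in ~/.bashrc files):
if [ "$PS1" ]; then
  echo "interactive shell"
else
  echo "non-interactive shell"
fi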

Variables and Values

The program scalars.sh illustrates basic principles of Bash variables and values. In particular, the only scalar data type is a string. Values are created in several ways:
  1. within uninterpolated quotes: ' '
  2. within interpolated quotes: " "
  3. the output of a command within shell-evaluated back quotes ` ` or within $( )
  4. a bareword which is not a Bash reserved word and contains no special operator characters

The most basic operation on strings is concatenation, which, in Bash, is simply juxtaposition. In general, whitespace sequences are collapsed into a single blank; whitespace sequences at the ends of strings are truncated (i.e., trimmed).
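For example, a quick sketch of concatenation by juxtaposition:
a="foo"
b="bar"
c="$a$b"          # concatenation is simply juxtaposition
echo "$c"         # prints: foobar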

Variables are defined using the assignment operator "=" in a very strict sort of way (no spaces around it). Once a variable v is defined, its value is retrieved with the expression $v. A double-quoted variable's value, like "$y", can behave differently from $y when the value has internal whitespace. If there is any doubt, it is recommended to always use double quotes.

A newline is interpreted as a statement terminator. A semicolon (;) can also be used as a statement terminator if you want two or more statements on the same line.
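For example, these two statements can share one line:
n=7; echo "$n"    # two statements separated by ";"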

scalars.sh
#!/bin/bash
# file: scalars.sh
 
x='aa'        # no spaces allowed around the "=" operator
y="bb    BB"
z=`date`      # these are back-quotes (found in upper left of keyboard)
w=$(pwd)      # alternative to backquotes
u="${x}QQ$y"  # interpolated, the brackets "{ }" protect the value of $x
v='${x}QQ$y'  # not-interpolated
 
echo '$x =' "$x"
echo '$y =' $y          # compare $y unquoted
echo '$y =' "$y"        # versus "$y"
echo '$z =' "$z"
echo '$w =' "$w"
echo '$u =' "$u"
echo '$v =' "$v"
echo "-------------------------------"
 
echo   first     second     # barewords allowed
echo  "first     second"    # use quotes to preserve space
echo "-------------------------------"
 
# numerical strings can be treated numerically inside $(( .. ))
echo $((12+13-7))
n=7
n=$((n*2))
echo $n
echo "-------------------------------"
 
# echo always terminates with new-line, use "-n" to avoid that
 
echo -n Type a line to be read "=> "  # careful to quote "=>"
read x                                # read string from standard input
echo "You typed this =>" "|$x|"       # double-quotes preserve internal 
                                      # spacing, but string is trimmed

echo and printf

Although echo is the most common output statement, Bash also supports the C-style printf statement, e.g.,
printf "num=%05d\n" 27
echo AFTER
prints:
num=00027
AFTER
There is an equivalent to sprintf (printf to a variable) in the form of
printf -v num "%05d" 27
For most situations, echo is more common. It is easy to use and, for the most part does what you want in a simple manner. One "problem" spot is printing control characters like \t for tab. The bash syntax for this control character has the cumbersome form:
$'\t'
For example, these two statements generate the same output:
echo   $'\t'foo
printf "\tfoo\n"
As you can imagine, the printf version is more memorable. One feature available to echo which is not available to printf is colorization. When used with the -e flag, echo interprets certain convoluted escape sequences as an indication to change the color of the output. For example, this prints "HELLO" in bold red followed by "THERE" in (normal) black:
echo -e "\033[01;31m HELLO \033[0m THERE"
The output need not be separated like this; we are simply making it easier to see.

Other types and declarations

Bash, just as other languages, supports additional structured data types in the form of lists and maps (associative lists). It also provides a way of assigning a type to a variable through the declare statement. Here is an example script:

scalar-declares.sh
# "-r" means cannot be reset, so is a constant
declare -r CONSTANT="cannot be reset"
 
# "-i" means act like an integer and evaluate expressions accordingly
declare -i num_only
 
# -u/-l means convert value strings to upper/lower case
declare -u uppercaseval
declare -l lowercaseval
 
# -x means export variable (make available to subshells)
declare -x exportedval="some exported value"
 
CONSTANT="something else"  # this will fail with error message
echo $CONSTANT
 
num_only=5*4+3
echo $num_only
 
uppercaseval="some STRING"
lowercaseval="some STRING"
 
echo $uppercaseval
echo $lowercaseval

Command-line arguments

One of the primary purposes of the Bash language is to extend the set of commands. For this reason, Bash provides simple access to the command-line parameters using the variables "$1", "$2", etc. The expression "$0" is the command name itself. They should be double-quoted. Use these test runs:
$ ./args.sh 
$ ./args.sh  a     b    c
$ ./args.sh "a     b"   c

args.sh
#!/bin/bash
# file: args.sh
 
echo command: "$0" # This is the command (with full pathname)
 
# to get the simple "last part" of the full command,
# call external basename operation
echo command: $(basename "$0")  
echo 
 
# The syntax "$#" gives the number of command-line arguments.
echo number of command line args: "$#"
echo 
 
echo first:  "$1"
echo second: "$2"
echo third:  "$3"
 
# "$@" (and "$*") is used to represent 
# all the command line arguments as a single string
 
echo all: "$@"

if-else statements

The bash if-else syntax is unusual compared to other languages. The format looks like this:
if ...
then
  some statements
elif ...
then
  some statements
else
  some statements
fi
The "..." sections represent boolean "tests". The chained elif and the else parts are optional. The "then" syntax is often written on the same line as the if portion like this:
if ...; then

Program exit status

As an example, consider

pingtest.sh
host="$1"
[ "$host" ] || { echo usage: `basename $0` "<host or ip>"; exit 1; }
if ping -w 2 -c 1 "$host" > /dev/null
then 
  echo status=$?
  echo Can ping "$host"
else
  echo status=$?
  echo Cannot ping "$host"
fi
Try these:
$ ./pingtest.sh 
$ ./pingtest.sh 8.8.8.8
$ ./pingtest.sh 1.1.1.1
What is happening is that the ping operation with the options used is a single ping (-c 1) which must succeed or fail within 2 seconds (-w 2), with these two possible outcomes: exit status 0 if the host responds, non-zero if it does not. The notion of true and false in these bash tests can be counter-intuitive: an exit status of 0 means true, non-zero means false. The $? construct used in
echo status=$?
is a Bash special variable which gives the exit status of the previous command (and so it has to come before the second echo statement).
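A quick sketch illustrates why the capture must come immediately:
false             # a command which always fails
echo $?           # prints 1: the exit status of false
echo $?           # prints 0: now it reports on the previous (successful) echo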

The && and || operators

The && and || operators work in much the same sense as in other languages, using short-circuit evaluation. In Bash they are often used to express the chaining of operations based on success or failure. A good example is:
c++ myprog.cc && ./a.out
in which we only run the compiled program if the compilation succeeds.
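The || operator expresses the complementary idiom: run the second command only if the first fails. A common sketch (the directory name is just a placeholder):
cd /some/dir || exit 1    # abort the script if the cd fails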

Boolean expressions in test statements

A boolean expression in an if test uses this syntax:
if [ BOOLEAN-EXPRESSION ]; then
  statements ...
fi
The only value regarded as false is the empty string. Bash does not recognize any numerical types per se, only strings used in a numerical context. An undefined value is, in every way, equivalent to the empty string in Bash. You have to be careful about using an undefined variable in a script since it may be an exported variable and, thereby, implicitly defined. You can always explicitly undefine a variable x by:
unset x
You can verify which values are regarded as false by running this sample script:

falsetest.sh
x=0;     [ "$x" ] && echo x is true 1; 
x="";    [ "$x" ] && echo x is true 2; 
x=" ";   [ "$x" ] && echo x is true 3; 
unset x; [ "$x" ] && echo x is true 4; 
x=false; [ "$x" ] && echo x is true 5;
An example usage is this line in pingtest.sh:
[ "$host" ] || { echo usage: $(basename $0) "<host or ip>"; exit 1; }
In this example host is the first parameter; if undefined, give a "usage" message.

Unary file information operators

A number of common Bash constructions use the unary "-" prefix file test operators, e.g., -e (file exists), -f (regular file exists), -d (directory exists), and -x (file is executable). An example of this appears in the ~/.bashrc startup script:
if [ -f ~/.bash_aliases ]; then
    . ~/.bash_aliases
fi
There are many good usages for these operators for error checking in scripts, such as:
[ -d /etc/X11 ] && { cd /etc/X11; ls -l; }
[ -f "$input_file" ] || { echo no such file "$input_file"; exit 1; }

Binary test operators

The if operator (and other tests) can be used with boolean expressions using appropriate syntax. The test expressions are normally within single brackets [ .. ]. Within these we have these operator usages:
=, !=                             (lexicographic comparison)
-eq, -ne, -lt, -le, -gt, -ge      (numerical comparison)
However, both double brackets [[ .. ]] and double parentheses (( .. )) can serve as delimiters. The operators < and > normally represent "file redirection", but can be used for lexicographic comparison within [[ .. ]] and numerical comparison within (( .. )).

The following program illustrates some examples:

test-values.sh
#!/bin/bash
x=15
y=6
z="aa a"
w="aa b"
u=02
v=2
 
printf "$u equals $v lexicographically:\t"
if [ "$u" = "$v" ]; then echo "yes"; else echo "no"; fi
 
printf "$u not equals $v lexicog.:\t"
if [ "$u" != "$v" ]; then echo "yes"; else echo "no"; fi
 
printf "$u equals $v numerically:\t"
if [ "$u" -eq "$v" ]; then echo "yes"; else echo "no"; fi 
 
printf "$u not equals $v numeric.:\t"
if [ "$u" -ne "$v" ]; then echo "yes"; else echo "no"; fi 
 
printf "$x is less than $y numeric.:\t"
if [ "$x" -lt "$y" ]; then echo "yes"; else echo "no"; fi
 
printf "$x less than $y numerically:\t"
if (( "$x" < "$y" )); then echo "yes"; else echo "no"; fi 
 
printf "$x less than $y lexicog.:\t"
if [[ "$x" < "$y" ]]; then echo "yes"; else echo "no"; fi
 
printf "'$z' less than '$w' lex.:\t"
if [[ "$z" < "$w" ]]; then echo "yes"; else echo "no"; fi
Other good usages include error checking based on the number of script command-line arguments:
[ $# -eq 0 ] && { echo "must have at least one arg."; exit 1; }

Subtle syntax issues

The way Bash deals with strings has certain unexpected consequences. Consider this program:

errors.sh
x="a"
y="a b"
 
["$x" ] && echo "non-empty"
[ "$x"] && echo "non-empty"
[ $x ] && echo "non-empty"
[ $y ] && echo "non-empty"
When executed, 3 of the 4 test lines are flagged as errors:
line 4: [a: command not found
line 5: [: missing `]'
line 7: [: a: unary operator expected
The first two mistakes were caused by having the expression "$x" touch a bracket. The last was caused by the missing quotes around the $y expression, in which case the expanded value "a b" was interpreted as the unary operator a with argument b.

String patterns and the case statement

Bash uses primitive globbing patterns for various matching operations. The most common is the usage of "*" which matches any sequence of characters. Less common is "?" which matches any single character and even less common are character sets, such as "[A-Z]" and "[^0-9]".

These types of expressions stand in contrast to more powerful regular expression patterns which, in Bash, are only available through auxiliary commands. Glob patterns are simple, familiar patterns such as those commonly used in file listings:
$ ls *.html      # all HTML files (not starting with ".")
$ ls .??*        # all dot files except "." and ".."
$ ls test[0-3]   # "test0", "test1", "test2", "test3"
The Bash case statement distinguishes itself from an if/else construction primarily by its ability to test its cases by matching the argument against glob patterns. The syntax is like this:
case "$file" in
  *.txt)  # treat "$file" like a text file
          ;;
  *.gif)  # treat it like a GIF file
          ;;
  *) # catch-all
     ;;
esac 

Unlike in C++ or Java syntax, break exits an enclosing loop; it does not exit the particular case (each case is terminated by ;;).
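As a minimal runnable sketch (the file name is only an illustration):
file="notes.txt"
case "$file" in
  *.txt)  echo "text file" ;;
  *.gif)  echo "GIF image" ;;
  *)      echo "something else" ;;
esac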

Loops

Bash has both for and while loops. However, the type of control for these is typically not numerical. The most common looping structure in Bash is the for/in structure like this:
for x in ...
do
  statements involving $x
done
The "..." is a list of things generated in a number of ways. The x is the loop variable which iterates through each item in the list. For example, try running this program in your home directory

fileinfo.sh
for x in *; do
  file "$x"
done
In this case the things iterated are the files in the current directory. One can use numerical-style looping with double parentheses, like those used for numerical comparison above:
for ((i=1; i<=10; ++i)); do
  echo $i
done 
Another common list of "things" is the command-line arguments. Consider these examples:

loopargs-for.sh
for i in "$@"; do
  echo "$i"
done

loopargs-while.sh
while [ "$1" ]; do
  echo "$1"
  shift
done
Run this program with some command-line arguments, such as:
$ ./loopargs-for.sh a b c
$ ./loopargs-while.sh a b c
Although these two versions appear to function identically, the latter version is, in fact, more general because the argument advancing is more under the program's control.

Reading lines in Bash

The while loop also has an advantage in its ability to read live input. For example, this simple program reads and echoes input lines:
while read line; do
  echo "$line"
done
In a programmatic setting, it is often useful to process lines generated from the output of some command. Say we want to process all words starting with "my" in the system dictionary (/usr/share/dict/words) by removing the initial "my" part.

The following two scripts represent two possible ways of doing so:

process-lines-1.sh
#!/bin/bash
 
count=0
while read line; do
  ((++count))
  printf "%2d: %s\n" "$count" "${line#my}"
done < <(grep ^my /usr/share/dict/words)
echo "==> $count words processed"

process-lines-2.sh
#!/bin/bash
 
count=0
grep ^my /usr/share/dict/words | while read line; do
  ((++count))
  printf "%2d: %s\n" "$count" "${line#my}"
done
echo "==> $count words processed"
The command
grep ^my /usr/share/dict/words
is used to generate the target information. The two respective approaches to processing this are:
  1. input redirection into the "while ... done" loop using the manufactured "input device"
    <(grep ^my /usr/share/dict/words)
  2. piping (i.e., |) the command into the "while ... done" loop.
It turns out that only the former method works as we want it to. The problem with the latter method is that the count variable is being manipulated in a subshell created by the pipe operation and so its value cannot be used upon exiting the while loop.

In contrast, the former method, with the odd "<(..)" syntax (called process substitution), turns out to be more useful.
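A small sketch makes the subshell issue concrete:
count=0
echo hello | while read line; do ((++count)); done
echo "count=$count"        # prints count=0: incremented only in the subshell

count=0
while read line; do ((++count)); done < <(echo hello)
echo "count=$count"        # prints count=1: no subshell involved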

Command-line options

Command-line arguments commonly consist of option arguments beginning with a "-". Consider, for example, the unzip command:
$ unzip -q -o FILE.zip -d /usr/local
which extracts FILE.zip into /usr/local, doing so with no output (-q) and overwriting existing files (-o). The FILE.zip portion is the argument and the others are options. Some options, like -d, take an argument themselves. The unzip command takes many more options (mostly placed before the argument).

The options can be "compressed" under certain circumstances. For example, this is an equivalent call:
$ unzip -qo FILE.zip -d /usr/local
The bash built-in operation getopts is meant to assist in extracting these options from the command line. Consider this program:

getopts-test.sh
# get first group of options
while getopts  "noqs" flag
do
  echo $flag $OPTIND $OPTARG
done
echo $flag $OPTIND
 
# shift arguments out and look for a non-option argument
shift $((OPTIND-1))
echo $1
shift
 
# start over after the non-option argument
OPTIND=1
while getopts  "d:" flag
do
  echo $flag $OPTIND $OPTARG
done
echo $flag $OPTIND
Running this command
$ ./getopts-test.sh -q -o FILE.zip -d /usr/local
yields the output:
q 2
o 3
? 3
FILE.zip
d 3 /usr/local
? 3
The while loop
while getopts "noqs" flag
runs through the arguments looking for the -n, -o, -q, -s options. OPTIND gives the index (1-based) of the next argument to be processed. When a non-option argument is encountered, the while loop terminates with flag set to ?. We can keep on going by shifting everything out and resetting OPTIND back to 1.

The second part of the option search uses:
while getopts "d:" flag
The "d:" syntax indicates that the d option also takes an argument. In this case, the $OPTARG expression captures that value.

Setting option flag variables

A useful style of option sensing is to set "option flag" variables like this:

optflags.sh
#!/bin/bash
 
unset opt_a; unset opt_b; unset opt_c; unset opt_d
unset optarg_c; unset optarg_d;
 
while getopts "abc:d:" flag; do
  [ "$flag" = "?" ] && exit 1;
  eval "opt_$flag=1"
  if [ "$flag" = "c" -o "$flag" = "d" ]; then
    eval "optarg_$flag=$OPTARG"
  fi
done
shift $((OPTIND-1))
 
echo "opt_a=$opt_a; opt_b=$opt_b; opt_c=$opt_c;"
echo "optarg_c=$optarg_c;"
echo "optarg_d=$optarg_d;"
 
echo "non-opts=$@"
Try these:
$ ./optflags.sh
$ ./optflags.sh -abc foo -d bar foobar barfoo
What is happening is that the variables opt_a, opt_b, and opt_c are being created through deferred evaluation using the Bash eval function. The actual $flag, say "b", substitutes into the evaluated expression:
eval "opt_$flag=1"
thus defining opt_b and setting it. We can later test for the presence of the "b" flag by:
if [ "$opt_b" ]; then ...

Built-in string processing operations

The Bash language itself has built-in string-processing operations, although their syntax is quite unintuitive. Later we'll see how to use UNIX commands to do string processing.

string-processing.sh
#!/bin/bash
 
str="abcdefghijkl"
 
printf "%s\t\t%s\n" '$str'       "$str"
printf "%s\t\t%s\n" '${#str}'    "${#str}"     # string length
printf "%s\t%s\n"   '${str:0:6}' "${str:0:6}"  # substring starting at 0 of length 6
printf "%s\t%s\n"   '${str:8}'   "${str:8}"    # substring starting at position 8
echo "----------------------------------"
 
file="/home/rkline/bin/test.cpp"
 
printf "%s\t\t%s\n"  '$file'          "$file"
printf "%s\t%s\n"    '${file#/*/}'    "${file#/*/}"   # remove mimimal top match to /*/
printf "%s\t%s\n"    '${file##/*/}'   "${file##/*/}"  # remove maximal top match to /*/
 
printf "%s\t%s\n"    '${file%.cpp}'   "${file%.cpp}"  # remove bottom .cpp match 
echo "----------------------------------"
 
relpath=${file#/}
 
printf "%s\t%s %s\n" '${file#/}'      "$relpath"        ' = relpath'
printf "%s\t%s\n"    '${relpath%/*}'  "${relpath%/*}"    # remove mimimal bottom match to /*
printf "%s\t%s\n"    '${relpath%%/*}' "${relpath%%/*}"   # remove maximal bottom match to /*

The zipit utility script

The following Bash script, zipit.sh, illustrates many features from the above sections. This script is intended to simplify the command-line usage of the zip operation for archiving regular files and directories.

zipit
#!/bin/bash
 
ZIP=/usr/bin/zip
RM=/bin/rm
 
[ -x $ZIP ] || { echo "No such executable: $ZIP"; exit 1; }
[ -x $RM ] || { echo "No such executable: $RM"; exit 1; }
 
if [ $# -eq 0 ]; then
  echo usage: $(basename $0) "<file_or_dir1> ..."
  exit 1
fi
 
for i in "${@}"; do		      # process all command-line args
  echo "processing: $i"
  if [ ! -d "$i" -a ! -f "$i" ]; then     # not directory, not file
    echo "  $i is not directory, nor file, skipping"
    continue
  else
    i="${i%/}"			      # remove trailing "/" if any
    target="$i.zip"		      # define the target
    if [ -f "$target" ]; then
      echo "  remove old $target"   	     
      $RM $target   		      # remove target if exists
    fi
    if [ -d "$i" ]; then
      $ZIP -r "$target" "$i"   # zip directory recursively
    else
      $ZIP "$target" "$i"      # zip a file
    fi
  fi
done
A few points about this script are noted in its comments. To make this script usable by all users, move it into a common path location:
$ sudo cp zipit /usr/local/bin/
It is common to drop the .sh extension for system commands on Linux systems. Our intended usage is (for you or root):
zipit <FILE OR DIR>

Functions

Functions offer an improvement over aliases. They must be defined before being used. In practice, they are often grouped into Bash files which are sourced within the script which uses them.

Functions are supposed to emulate the way commands work. They do not return values in the usual way; any value sent back by the return statement must be an integer which acts like the exit code of an executable.
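Since return only passes back an exit status, the conventional way to get a string out of a function is to write it to standard output and capture it with command substitution. A minimal sketch (the greet function is just an illustration, not part of functions.sh):
function greet {
  echo "hello, $1"
}
msg="$(greet world)"       # capture the function's output
echo "$msg"                # prints: hello, world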

functions.sh
#!/bin/bash
 
function foo   # or: function foo(), or simply: foo()
{
  [ $# -ne 0 ] || { 
     echo "*** foo: must have at least 1 arg." 
     return 1
  }
  echo "$1"
  echo "$@"
  # "return 0" implicit
}
 
echo "---> call: foo"
if foo; then
  echo success: $?
else
  echo failure: $?
fi
echo
echo '---> call: foo aa bb cc'
if foo aa bb cc; then
  echo success: $?
else
  echo failure: $?
fi

Lists and Maps

Bash supports both indexed lists and associative lists (maps). Bash versions 3 and below do not support maps. One can make explicit declarations of both:
declare -a myList
declare -A myMap
The former myList declaration is unnecessary, as it is implicit in common initialization statements like this (no spaces around the "="):
myList=( aa bb cc )
The latter myMap declaration is required.

Indexed Lists

Indexed lists are zero-based, and elements can be assigned positionally. The above myList declaration is equivalent to:
myList=( [0]=aa  [1]=bb  [2]=cc )
We can access an element by index:
"${myList[2]}"
The list can be appended or modified by syntax like this:
myList[3]=dd
and existing elements can be removed like this:
unset myList[2]
Accessing a list by positions can be problematic as unassigned positions are simply ignored. Therefore, the ordering of elements (aa bb cc) could conceivably have been created by the declaration:
myList=(  [9]=cc  [2]=aa  [5]=bb )
The preferred method of producing the list values is via the complicated syntax:
"${myList[@]}"
For example, the way to iterate through the list of values is this:
for elt in "${myList[@]}"; do
  echo "$elt"
done
This syntax is also used to append/prepend to a list or concatenate two lists, e.g.,
myList=( "${myList[@]}" "$appended_elt" )
myList=( "${myList[@]}" "${anotherList[@]}" )
The following is a relatively simple example. The list is created by dumping the elements within the parentheses.

show-words.sh
# capture the output of a command in a list
match=test
matchwords=( $(grep "$match" /usr/share/dict/words) )
 
numwords=${#matchwords[@]}  # number of elements in list
width=${#numwords}          # length of the $numwords string
 
num=1
for word in "${matchwords[@]}"; do 
  printf "%${width}d: %s\n" "$num" "$word"
  ((++num))
done

Maps

As indicated above, a map (associative list) must be declared. Here is a sample initialization:
declare -A ages=([john]=12 [joe]=13 [jim]=15)
The declaration need not have an initialization, and we can build the map with individual key/value assignments:
ages[carla]=44
ages[dave]=22
Like the indexed list, elements can be removed with the unset operation:
unset ages[john]
The internal structure is a hash which is not "linked" and so the order of key/value pairs is effectively random. Like the indexed list, we can retrieve the values with
"${ages[@]}"
However, when using maps, it is usually of interest to obtain the keys as well, which can be done by this syntax:
"${!ages[@]}"
This syntax even works with indexed lists, producing the sequence of underlying indices. The key sequencing provides the most common way of iterating through a map:
for key in "${!ages[@]}"; do
  value="${ages[$key]}"
  # ...
done

System command string processing

The Bash language relies heavily on the UNIX-like environment in which it resides in order to create utility scripts. This environment includes many standard UNIX string-processing operations such as tr, sed, awk, and grep. These external operations are used in Bash via standard I/O. All the above operations act on text files when given a file name as a parameter, or act on standard input when given no file arguments. A common Bash expression which uses an external OPERATION to compute some internal value looks something like this:
result="$(echo "input string" | OPERATION)"
The pipe operator "|" is crucial for passing the input string to OPERATION via echo. The following program illustrates some of these external operations.

string-operations.sh
#!/bin/bash
 
str="Hello /there@  1  22  33    Testing/";
 
printf "test string:\t\t%s\n"           "$str"
 
result="$(echo "$str" | tr a-z A-Z)"    # using tr in a typical way
 
printf "%s\t\t%s\n" 'lower-to-upper'    "$result"
printf "%s\t\t%s\n" 'sub T by ** (1)'   "$(echo "$str" | sed 's/T/**/')"
printf "%s\t\t%s\n" 'sub T by ** (2)'   "$(echo "$str" | sed 's/T/**/gi')"
 
printf "%s\t%s\n" 'sub blank seq. by _' "$(echo "$str" | sed 's/[ ]\+/_/g')"
 
printf "%s\t%s\n" 'remove trailing /'   "$(echo "$str" | sed 's/\/$//')"
 
# "\1" is a "back-reference" to a matched substring surrounded by parentheses
 
printf "%s\t%s\n" 'digit seq.'   "$(echo "$str" | sed 's/\([0-9]\+\)/[\1]/g')"
 
# awk can easily find portions of a string separated by blank space
 
printf "%s\t%s\n" 'second chunk of str:' "$(echo "$str" | awk '{ print $2 }')"
 
# the "grep -q pattern" reports match success or failure via exit status
 
printf "%s\t" 'str matches digit:' 
if echo "$str" | grep -q '[0-9]'; then echo yes; else echo no; fi
 
printf "%s\t" 'str matches "ee":' 
if echo "$str" | grep -q 'ee'; then echo yes; else echo no; fi


© Robert M. Kline