Python Basics

Eclipse and PyDev

In preparation for installing Eclipse, we need a Java JDK installed on the system. In some special cases, only the Oracle-created version will do, but the OpenJDK version works fine for all of our needs. Install it as a package by:
$ sudo apt-get install openjdk-8-jdk

Install Eclipse

Ubuntu provides Eclipse and PyDev as packages, but I could not get them to provide the full PyDev functionality, so I use the packages available from the Eclipse home site:
http://www.eclipse.org
Yet another issue is that the latest Eclipse (4.5, Mars) appears to have some issues with the GTK3 version on Ubuntu, so I am using the earlier version (4.4, Luna).

Within the Eclipse downloads available from this site there are many choices. We will use the basic Java Developers version. You can download it here from the Computer Science server:
eclipse-java-luna-SR2-linux-gtk-x86_64.tar.gz
The archive extracts to the directory eclipse and we want it moved to
/usr/local/eclipse
Assuming it was downloaded into the Downloads directory, you can achieve the desired outcome with this one-liner:
$ sudo tar xzf ~/Downloads/eclipse-java-luna-SR2-linux-gtk-x86_64.tar.gz -C /usr/local/
In particular you, the system admin, are the owner of the target folder /usr/local/eclipse.

Create a launcher and start it

Create the file (as root):

/usr/share/applications/eclipse.desktop
[Desktop Entry]
Type=Application
Version=1.0
Name=Eclipse
Exec=/usr/local/eclipse/eclipse %F
Icon=/usr/local/eclipse/icon.xpm
Terminal=false
Categories=GTK;Development;IDE;
StartupNotify=true
Eclipse should then appear in your Applications ⇾ Programming menu.

Start it up. It asks for a project directory, the standard one being:
workspace
Always take this choice to match up with the usage in the documents.

Eclipse-based plugin installation

In Eclipse, plugins are available through one of these choices:
Help ⇾ Install New Software
Help ⇾ Eclipse Marketplace
In the former, you get a drop-down entry box:
Work with:
to which you add a URL which summons up the relevant software. The Marketplace is easier because it will find the relevant software from a keyword search.

Install Python Pydev

Open Eclipse Marketplace and enter this into the Find box:
Find:
One entry should come up:
PyDev - Python IDE for Eclipse.
Click Install and then follow through:
  1. Two software choices are selected. Click Confirm.
  2. Review Licenses:
    I accept ...
    Finish.
  3. Security warning: OK.
  4. Selection Needed popup. Check the checkbox:
    Do you trust these certificates?
    Brainwy Software; Pydev; Brainwy
    Click OK.
  5. Restart Eclipse.

PyDev Configuration

Go through Window ⇾ Preferences and select
PyDev ⇾ Interpreters ⇾ Python Interpreter
Click the button.

Add this information:
Interpreter Name:       python
Interpreter Executable: /usr/bin/python
Keying in the correct executable will cause immediate recognition of the python installation.

Click Apply to activate.

Click OK to leave.

Every time Python software is added, you must go through the last step of this procedure, which reconstructs the PYTHONPATH environment variable and thereby makes the new software recognized by PyDev.

Control-C Copying

I had to do one odd step to make Control-C successfully copy text in PyDev:
Window ⇾ Preferences ⇾ Java ⇾ Editor ⇾ Typing
Uncheck Update imports and restart Eclipse.

Tab settings for Linux Editors

In Eclipse PyDev a TAB is, by default, equivalent to using 4 spaces, so I recommend that you ensure that all the editors you use adhere to this specification.

Python on Linux systems

Python is a standard part of Linux systems. It exists in two forms: Python 2 and Python 3. The compatibility issue means that Python 2 programs are likely not to work under Python 3, even for the most basic things like the print operation. You have to explicitly refer to python3 to use it. We'll stick with Python 2 since some key software used is still written in that version.

In contrast to Bash, Python is a complete programming language intended to work on any platform. Importable modules can be employed to extend Python's capabilities to handle any programming need. Python's most singular syntactic feature is that program blocks are understood solely through indentation rather than through ending tokens. Python bears some syntactic similarities to JavaScript. Python is a pure object-oriented language in that everything in Python is an object, including the scalar types.

Python syntax is very clean, effective and fairly minimal. In comparison, Perl (although I like this language) is bloated with syntax and operators on the order of Bash. Moreover, Perl's notion of object-oriented construction gives the impression of a "completely hacked add-on."

Python indentation

Dealing with Python's indentation takes some planning. In Python, the first line of a block defines the "indentation" for that block (possibly empty). All subsequent statements within that block must use the same indentation. Thus both of these are incorrect:
x = 3
  y = 2

x = 3
if x == 3:
    y = 2
  z = 4
In the latter case, a fix would align the "z = 4" with either the "if" or the "y = 2", giving two possible programs with different behaviors. Thus you can see why there cannot be a "Format the Source" feature in a Python Editor! In contrast, despite any visual defects, the following script is OK:
x = 3
if x == 3:
        y = 2
else:
  z = 4
A more serious issue is that correct indentation is not just visual. For example, if a tab is of size 4, but is not replaced by blank spaces, then this will be wrong:
x = 3
if x == 3:
    y = 2    # type TAB to statement
    z = 4    # type 4 spaces to statement
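One way to see these rules in action without creating files is to hand small source strings to the built-in compile function. The following is only a minimal sketch: the helper name check and the snippet variable names are made up for illustration, and the single-argument print(...) form is used so that it should run unchanged under both Python 2 and Python 3.

```python
def check(source):
    """Return "ok" if the source text compiles, else "IndentationError"."""
    try:
        compile(source, "<snippet>", "exec")
        return "ok"
    except IndentationError:      # TabError is a subclass of IndentationError
        return "IndentationError"

# the two incorrect examples from the text
bad_indent = "x = 3\n  y = 2\n"
bad_unindent = "x = 3\nif x == 3:\n    y = 2\n  z = 4\n"

# visually ugly, but each block is internally consistent
ugly_but_ok = "x = 3\nif x == 3:\n        y = 2\nelse:\n  z = 4\n"

# first statement indented with a TAB, second with 4 spaces
tab_vs_space = "x = 3\nif x == 3:\n\ty = 2\n    z = 4\n"

results = {}
for name, src in [("bad_indent", bad_indent),
                  ("bad_unindent", bad_unindent),
                  ("ugly_but_ok", ugly_but_ok),
                  ("tab_vs_space", tab_vs_space)]:
    results[name] = check(src)
    print("{}: {}".format(name, results[name]))
```

Only the snippet with consistent indentation inside each block compiles; the other three are rejected before any statement runs.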

Running Python as an interpreter

One very important characteristic which makes Python similar to Bash is its ability to run as an interactive interpreter as well as a script language. See:
python docs: interpreter
The interpreter is activated by calling "python" with no arguments. This is the favorite manner to illustrate the behavior of python language features in documentation. The interpreter acts in "echo mode", in that values of expressions are automatically printed without need of a print statement. You can use this as a simple on-line calculator. Here is a sample run:
$ python
  ..........
>>> 2+3
5
>>> "hello world"
'hello world'
>>> print "hello world"
hello world
>>> x = 5
>>> x
5
>>> if x == 5:
...   print "yes:5"
... 
yes:5
>>> if x == 6:
...      print "yes:6"
... 
>>> quit()                   (or Ctrl-D)

Internal Documentation

Python itself provides much of the documentation on what operations are available. For example, we can use the dir function with this interpreted content:
dir(int)            # member operations available to the int type
dir(float)          # likewise for the float type
dir(str)            # likewise for the str type
import os
dir(os)             # importable entities from the os module
Going further, we have additional module documentation:
print os.__doc__    # basic documentation
help(os)            # extensive man-page documentation using a pager
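The same introspection can be done from a script. This is a small sketch, not part of the course archive; the variable names are illustrative, and print is called with a single argument so that the code should behave the same under Python 2 and Python 3.

```python
import os

# public (non-underscore) operations of the str type
public_str_ops = [name for name in dir(str) if not name.startswith("_")]
print(", ".join(public_str_ops[:8]))

# first line of the os module's documentation string
summary = os.__doc__.strip().splitlines()[0]
print(summary)

# individual functions carry their own documentation string
print(os.getcwd.__doc__)
```

Filtering out the names that start with an underscore leaves the operations you would normally call directly, such as upper and split.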

Running a Python script

The most basic "hello world" python script, hello.py, is this one-liner:
print "Hello World"
The print statement functions like the Bash echo statement in that a newline is automatically appended. We can run this explicitly with the interpreter:
$ python hello.py
With preparation, we can also run this script by itself:
$ ./hello.py
What is necessary for this to happen are the following modifications:
  1. add an initial "shebang" line. The favorite is this:
    #!/usr/bin/env python
    print "Hello World"                      
    but we could just as well use this
    #!/usr/bin/python
    print "Hello World"                      
    The presumed advantage of the former is that /usr/bin/env will find the "preferred" python installation as the first occurrence in the system PATH.
  2. Make the script executable. The simplest way:
    $ chmod +x hello.py
    

Install Python Basics

Download the source archive python_basics.zip. Extract the archive into the workspace directory:
$ unzip ~/Downloads/python_basics.zip -d ~/workspace/
Install python_basics as a PyDev project in Eclipse. Right-click on the Package Explorer window:
  1. New ⇾ Project ⇾ PyDev ⇾ PyDev Project, then Next.
  2. Set Project Name to python_basics. Should see:
    Project location contains existing Python files. The created project will include them.
    Click Finish.
  3. Open Associated Perspective? Check
    Remember my decision
    Then click Yes.
The python files within a PyDev project can be executed through PyDev by selecting the file through right-click and choosing Run As ⇾ Python Run. Alternatively, simply navigate a terminal shell to the directory and use that to run scripts, because the built-in run feature doesn't work well if the script needs arguments.

Modules

PyDev refers to python files as modules. There is also the notion of a package, which we'll see later. If you right-click on the python_basics project line, you'll see these two choices in the New submenu.

The python import is a way of making available the code from other python modules to augment the functionality. One sees imports typically in one of two forms: (a) import my_module, or (b) from my_module import my_entity. Python refers to the entities imported as attributes.

Like Java's CLASSPATH, Python can refer to an external PYTHONPATH to find directories where modules are to be located. Python can also look in a sys.path variable for such directories.

On an import, python looks for either the compiled version my_module.pyc or the source version, my_module.py. If the source version has a later modification date (recently edited), then it compiles the .py file and replaces the older .pyc file. After this determination, the module my_module is included into the program in such a way that it is actually only included once despite possibly being referenced multiple times through other imports.

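The difference between the two import styles, import my_module and from my_module import my_entity, is how one would use my_entity: the first requires the qualified name my_module.my_entity, whereas the second lets you write my_entity directly.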
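As a concrete illustration of the two import styles, here is a short sketch that uses the standard os.path module in place of the hypothetical my_module. The file path is made up, and print is called with a single argument so that the code should run under both Python 2 and Python 3.

```python
import os.path                  # import the module, use qualified names
from os.path import basename    # import one attribute, use it directly
import sys

qualified = os.path.basename("/tmp/workspace/hello.py")
direct = basename("/tmp/workspace/hello.py")
print(qualified)
print(direct)

# sys.path is the list of directories searched on an import;
# directories named in PYTHONPATH are added to it at startup
print(isinstance(sys.path, list))

# a module is loaded only once; later imports reuse the sys.modules entry
print("os.path" in sys.modules)
```

Both calls reach the same function; only the name used at the call site differs.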

Importing vs. Executing

Like Java and other languages, the same file types serve for executable programs as well as repositories for classes and other data which can be imported into other scripts. Here are two Python scripts which illustrate the difference between execution and importation. The main difference is that, on import, the special variable __name__ holds the name of the imported module, whereas for execution, __name__ is "__main__". Recognizing these differences allows the module script to behave differently when executed versus imported.

hello_module.py
#!/usr/bin/env python
 
def saySomething():
    print "calling saySomething function"
 
print "Activation module: " + __name__ 
 
if __name__ == "__main__":
    print __file__ + " running as a script"

call_hello_module.py
#!/usr/bin/env python
 
from hello_module import saySomething
 
saySomething()

Try running both:
$ python hello_module.py
Activation module: __main__
hello_module.py running as a script

$ python call_hello_module.py
Activation module: hello_module
calling saySomething function
As indicated, the "main" section is only activated when the module is run directly. When the module is imported, the main block is not executed.
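The import side of this behavior can also be checked from a single self-contained script. The sketch below writes a throwaway module into a temporary directory and imports it; the module name demo_name_module and its two variables are invented for the illustration and are not part of python_basics.

```python
import importlib
import os
import sys
import tempfile

# write a tiny throwaway module into a temporary directory
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "demo_name_module.py"), "w") as f:
    f.write("NAME_SEEN = __name__\n"
            "RAN_AS_MAIN = (__name__ == '__main__')\n")

sys.path.insert(0, tmpdir)            # make the directory importable
mod = importlib.import_module("demo_name_module")

print(mod.NAME_SEEN)      # the module's own name, not "__main__"
print(mod.RAN_AS_MAIN)    # False, since the module was imported
```

Because the module was imported rather than executed, any code guarded by the __name__ == "__main__" test inside it would be skipped.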

Packages

A Python module is more-or-less what we consider to be a program or script. When imported it gives access to attributes defined within via a "dotted-access" format, like this:
my_module.attribute()
Functions are one sort of attribute which can be defined in a module.

A Python package represents a group of modules within a directory, or within subdirectories. Access to each module or sub-package also uses the "." construction. See:
https://docs.python.org/2/tutorial/modules.html#packages
In this example, we have the following directory structure:
mypack/
  __init__.py
  moduleA.py
  moduleB.py
The __init__.py files are required. They indicate that the directory is a Python package which provides access to submodules:
mypack.moduleA
mypack.moduleB
If all you want is access to these submodules, then __init__.py can be empty. If, however, you want the name mypack to act like a module, the code for it goes into the __init__.py module.

In this example all 3 files, including __init__.py, have identical code:
def saySomething():
    print "calling saySomething from " + __name__
If you open __init__.py, you'll see that PyDev recognizes this file as something special and names the editor tab according to the directory, thereby avoiding confusion if more than one such file is open.

The file which activates all of them is this:

call_mypack.py
#!/usr/bin/env python
 
#import mypack        # not necessary with next two
import mypack.moduleA
import mypack.moduleB
 
mypack.saySomething()            # from __init__.py
mypack.moduleA.saySomething()
mypack.moduleB.saySomething()
The test run goes like this:
$ python call_mypack.py 
calling saySomething from mypack
calling saySomething from mypack.moduleA
calling saySomething from mypack.moduleB
What gets confusing in package usage is that the import statement must still refer to a module, and the module, by default, is used through its full dotted name. For example, these two are both wrong:
  1. This one fails with a NameError, because the name moduleA by itself is never defined:
    import mypack.moduleA
     
    moduleA.saySomething()
    The module must use the full name mypack.moduleA unless an alias is given.
  2. Although syntactically correct, this one fails at runtime with an AttributeError.
    import mypack
     
    mypack.moduleA.saySomething()
    We actually need to import moduleA per se, not the package.
You can, however, simplify the usage presentation like this with an alias:
import mypack.moduleA as moduleA
 
moduleA.saySomething()
or like this to get the attribute directly:
from mypack.moduleA import saySomething
 
saySomething()
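The package behavior described above can be reproduced in one self-contained script. This sketch builds a small package in a temporary directory and imports it; the package name demo_pack is invented for the illustration (it mirrors mypack but returns its message instead of printing it), and print is called with a single argument so that the code should run under both Python 2 and Python 3.

```python
import os
import sys
import tempfile

body = ('def saySomething():\n'
        '    return "calling saySomething from " + __name__\n')

# build demo_pack/__init__.py and demo_pack/moduleA.py
root = tempfile.mkdtemp()
pack_dir = os.path.join(root, "demo_pack")
os.mkdir(pack_dir)
for fname in ("__init__.py", "moduleA.py"):
    with open(os.path.join(pack_dir, fname), "w") as f:
        f.write(body)
sys.path.insert(0, root)

import demo_pack                       # runs only __init__.py
try:
    demo_pack.moduleA.saySomething()   # the submodule was never imported
    outcome = "no error"
except AttributeError:
    outcome = "AttributeError"
print(outcome)

import demo_pack.moduleA as modA       # import the submodule, with an alias
print(demo_pack.saySomething())
print(modA.saySomething())
```

Importing the package alone does not load its submodules; each one has to be imported explicitly, after which the alias keeps the call sites short.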

Subpackages

The package idea extends into subpackages, i.e., we could continue:
mypack/
  subpack/
    __init__.py
    moduleC.py
giving us access to the sub-package via the syntax:
mypack.subpack
and submodules of subpack by:
mypack.subpack.moduleC

Scalars

Python has numeric types: int, float, long, complex. One difference from C/Java is that long has unlimited precision. The float type has at least double precision (like double in C/Java). Python also has a bool type with constants True and False.

Python has a special object None which acts like null in JavaScript and other languages; however, x = None does not act at all as if x were undefined. Python has no explicit test for "definedness" of a variable; it is considered unusual in Python to test whether a variable is defined or not.

Python strings are created by single, double, triple-single, or triple-double quotes. These are good sites:
http://docs.python.org/library/string.html
http://docs.python.org/library/stdtypes.html#numeric-types-int-float-long-complex
Try these little experiments. Start a Python interpreter, select, copy/paste + return:

type(1)
type(1L)
type(1.1)
type(True)
type(None)

x = 'a string'
y = "another 'string'"
z = '''yet another 
string'''
w = """a 'fourth' "example" \ of a string\n
string"""
x
y
z
w
print x
print y
print z
print w
All quotes are equal, except the triple versions permit embedded newlines (very useful). In Python, triple-quoted strings can also serve to comment out regions, replacing the C-style /* ... */.

Substrings (and individual characters) use bracketing with ranges like x[2] and x[2:8]. String concatenation is done with the "+" operator, except that Python does not coerce numeric types to string; for example, these are errors:
"x" + 2
"y" + 3.3
You can use explicit casting to disambiguate "+". In general it is a good idea to use Python's excellent string format operation if you want to combine the values of variables into a single string. Here is a demo program:

scalars.py
#!/usr/bin/env python
 
a = 'abcdefghijklm'
x = "7"
y = 5
 
print "a =", a
print "a[2] =", a[2]
print "a[2:] =", a[2:]
print "a[2:8] =", a[2:8]
print
print "x =", x, "   : ", type(x)
print "y =", y, "   : ", type(y)
print 'x + str(y) = ', x + str(y)
print 'int(x) + y = ', int(x) + y
print
print '"|{}|{}|".format(x,y) =         ', "|{}|{}|".format(x,y)
print '"|{:05d}|{: >5}|".format(y,x) = ', "|{:05d}|{: >5}|".format(y,x)
 
import sys
 
print "------------------------------------------"
print "type something: ",      # note the trailing comma to avoid newline
line = sys.stdin.readline()
print "|{}|".format(line)
print "------------------------------------------"
print "|{}|".format(line.strip())
Regarding Python's printf-like string format operation, the {...} expressions represent argument insertion points. A literal "{" or "}" is obtained by "{{" or "}}", respectively. The minimal form is simply {}, but you can use a more complex version like this:
{position:format-info}
Without the position, arguments are taken in order (0,1,...), but we can take them out of order by making the position explicit. For example, the outcomes of the two in each group are equal:
"|{:05d}|{: >5}|".format(y,x)
"|{1:05d}|{0: >5}|".format(x,y)

"{}--{}".format('foo',33)
"{1}--{0}".format(33,'foo')

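The format rules above can be checked with a few lines. This is a minimal sketch reusing the x and y values from scalars.py; the other variable names are illustrative, and print is called with a single argument so that the code should run under both Python 2 and Python 3.

```python
x = "7"
y = 5

# arguments taken in order versus by explicit position
in_order = "|{:05d}|{: >5}|".format(y, x)
by_position = "|{1:05d}|{0: >5}|".format(x, y)
print(in_order)
print(by_position)

# doubled braces give literal braces around an insertion point
braces = "{{{}}}".format("key")
print(braces)

# an explicit position can be used more than once
repeated = "{0}-{1}-{0}".format("ab", 33)
print(repeated)

# "+" never converts a number to a string implicitly
try:
    "x" + 2
    concat = "no error"
except TypeError:
    concat = "TypeError"
print(concat)
```

The first two results are identical, which is the point of explicit positions: the format string, not the argument order, decides where each value lands.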
Controls

Python's control structure syntax is quite different from C-style. See:
http://docs.python.org/tutorial/controlflow.html
The tokens used to delimit sections are similar to those used in Bash:
if expression:
   ...
elif expression:
   ...
else:
   ...
The biggest difference is the indentation requirements. The if/elif/else tokens must be at the same indentation level and there is no "fi" equivalent. Here is a demo program:

controls.py
#!/usr/bin/env python
 
a = 0; b = ""; c = False; d = 0.0; e = None; f = 0L; g = "0"
 
if a: print "a true"
else: print "a false"
 
if b: print "b true"
else:  print "b false"
 
if c: print "c true"
else: print "c false"
 
if d: print "d true"
else: print "d false"
 
if e: print "e true"
else: print "e false"
 
if f: print "f true"
else: print "f false"
 
if g: print "g true"
else: print "g false"
 
x = '123'; y = '57'
if x < y: 
    print "true: {} < {} as strings".format(x,y)
else: 
    print "false: {} < {} as strings".format(x,y)
 
u = int(x); v = int(y)
if u < v:   
    print "true: {} < {} as ints".format(u,v)
else: 
    print "false: {} < {} as ints".format(u,v)
 
if '2' == 2: 
    print 'yes'
else: 
    print 'no'
 
if '57' > 57: 
    print 'yes'
else: 
    print 'no'
 
a = 22; b = 33
if a < 50 and b > 50:
    print "1"
elif a > 50 or b > 50:
    pass
else:
    print "3"
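The truth tests in controls.py can be summarized with the built-in bool function. The sketch below is an illustration only; it leaves out the Python 2 long literal 0L so that it should run under both Python 2 and Python 3, and the variable names are made up.

```python
samples = [0, "", False, 0.0, None, "0", [], [0]]
truth = [bool(value) for value in samples]
for value, flag in zip(samples, truth):
    print("{!r:>6} -> {}".format(value, flag))

# strings compare character by character, numbers by value
as_strings = "123" < "57"           # "1" sorts before "5"
as_ints = int("123") < int("57")
print("as strings: {}".format(as_strings))
print("as ints: {}".format(as_ints))

# a string is never equal to a number, even if it looks like one
same = ("2" == 2)
print("'2' == 2: {}".format(same))
```

Zero values, empty containers, and None count as false; any non-empty string, including "0", counts as true.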

Lists

Like other programming languages, Python supplies data structures for lists. Usually we think of lists as indexed lists, i.e., the association of an element to an integer position. However, like Java, Python provides additional list-like structures through its collections module. Regarding indexed lists, there are two kinds: the tuple, which is immutable, and the list, which is mutable. Here is a demo program:

lists.py
#!/usr/bin/env python
 
t = ( 44, 'x', 3.5, )
s = 1,2,3,55,66,77,88
l = [ 44, 'x', 3.5, ]
r = range(19,31)          # inclusive/exclusive on arguments
q = range(19,31,2)        # interval skip
 
print "t =", t, type(t), len(t)
print "s =", s, type(s), len(s)
print "l =", l, type(l), len(l)
print "r =", r, type(r), len(r)
print "q =", q
print "-------------------------------------------"
print t[1], l[2]
l[2] = 3.7
print "-------------------------------------------"
print s + t
print l + r
print "-------------------------------------------"
for x in s: 
    print x
print "-------------------------------------------"
for x in l: 
    print x
print "-------------------------------------------"
print r[:7], r[3:7]    # list slices
print s[2:5]           # tuple slice
 
x,y = t[:2]
print "multiple assignments from tuple/list: {},{}".format(x,y)
 
r[3:7] = ['hello']
print r
print "-------------------------------------------"
print [str(x) for x in l]
print "__".join([str(x) for x in l])
print "-------------------------------------------"
print "l =", l, l.pop(), "l =", l
l.append('hello')
print l
l.insert(0, 'test')
print l
l.insert(2, 'again')
print l
del l[1]
print l
del l[1:3]
print l
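The main practical difference between the two indexed types is that a tuple cannot be changed in place. Here is a brief sketch of that, along with slice assignment, using values borrowed from lists.py; range is wrapped in list(...) so that the code should behave the same under Python 2 and Python 3.

```python
t = (44, 'x', 3.5)
l = list(t)              # a mutable copy of the tuple's elements

l[2] = 3.7               # fine: lists are mutable
try:
    t[2] = 3.7           # tuples do not support item assignment
    tuple_result = "assigned"
except TypeError:
    tuple_result = "TypeError"
print(tuple_result)

r = list(range(19, 31))  # 19 up to, but not including, 31
r[3:7] = ['hello']       # replace four elements by one
print(r)

first, second = t[:2]    # multiple assignment from a slice
joined = "__".join([str(v) for v in l])
print(joined)
```

Slice assignment may change the length of a list, since the replacement need not have the same number of elements as the slice it replaces.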

Maps (Dict)

In Python, a map (associative list) is referred to as a dictionary. Internally these are simple hash tables (like Java's HashMap). Python also supports collections.OrderedDict, equivalent to the Java LinkedHashMap, which retains the entry order of the key/value pairs for iterations over the map. Here is a demo program:

dicts.py
#!/usr/bin/env python
 
grade = { "A": 4.0, "B": 3.1, "C": 2.0, "D": 1.0 }
print 'grade =', grade
print grade["A"]
grade["B"] = 3.0
print 'grade =', grade
print grade.keys()
print grade.values()
for key in grade.keys():
    print key 
for value in grade.values():
    print value
 
for key,value in grade.iteritems():
    print "{}: {}".format(key,value)
 
print "-----------------------------------------"
weight = dict()        # or weight = {}
weight['John'] = 180
weight['Ellen'] = 135
weight['Joe'] = 185
print 'weight =', weight
del weight['Joe']
print 'weight =', weight
weight.update( {'Dave':200,'Mary':140} )
print 'weight =', weight
 
for key in ('Ellen','Jane','Joe','Mary'):
    if key in weight: 
        print 'contains', key
    else: 
        print 'not contains', key
 
print "-----------------------------------------"
age = dict( John=50, Joe=34, Ellen=15, Marty=44 )  # using keyword args
age['Paul'] = 18
age['Joan'] = 33
print 'age =', age
for key,value in age.iteritems():
    print "%s : %s" % (key,value)
 
print "-----------------------------------------"
import collections
age_ordered = collections.OrderedDict([
  ( 'John', 50 ),
  ( 'Joe', 34 ),
  ( 'Ellen', 15 ),
  ( 'Marty', 44 ),
])
 
age_ordered['Paul'] = 18
age_ordered['Joan'] = 33
 
print 'age_ordered =', age_ordered
 
for key,value in age_ordered.iteritems():
    print "%s : %s" % (key,value)
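Key points about Python dict objects, as the demo program shows: values are read and written with the d[key] syntax; del removes an entry; update merges in another dict; the in operator tests for the presence of a key; and a plain dict makes no promise about iteration order, whereas an OrderedDict iterates in insertion order.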
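Two details that dicts.py does not exercise are what happens on a missing key and how to supply a fallback. The sketch below illustrates both, plus the ordering guarantee of OrderedDict; the variable names are made up, and items() is used in place of the Python 2 iteritems() so that the code should run under both Python 2 and Python 3.

```python
import collections

grade = {"A": 4.0, "B": 3.0}

# indexing with a missing key raises KeyError
try:
    grade["F"]
    lookup = "found"
except KeyError:
    lookup = "KeyError"
print(lookup)

# get() returns a fallback value instead of raising
fallback = grade.get("F", 0.0)
print(fallback)

# an OrderedDict iterates in insertion order
ordered = collections.OrderedDict([("John", 50), ("Joe", 34), ("Ellen", 15)])
ordered["Paul"] = 18
names = list(ordered.keys())
print(names)

# sort the items when a predictable order is needed from a plain dict
for key, value in sorted(grade.items()):
    print("{}: {}".format(key, value))
```

Using get with a default is usually tidier than testing with in before every lookup.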

Regular expression operations

These web pages give useful descriptions of Python regular expression usage:
http://www.tutorialspoint.com/python/python_reg_expressions.htm   (basic)
http://docs.python.org/howto/regex.html   (more detailed)
There are five principal operations which use regular expressions:
  1. match a string with a pattern, giving a boolean (yes/no) result
  2. extract portions of a string which match a pattern
  3. substitute portions of a string which match a pattern by replacement string
  4. split a string into an array by removing portions which match a pattern
  5. extract (grep) a subarray of array elements which match a pattern
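Each of the five operations has a short counterpart in Python's re module. The following sketch runs all five on one made-up test string; the variable names are illustrative, and print is called with a single argument so that the code should run under both Python 2 and Python 3.

```python
import re

text = "id=42, name=ann, id=7"

# 1. match: yes/no result
is_match = bool(re.search(r"\d+", text))

# 2. extract: parenthesized groups of the first match
first = re.search(r"(\w+)=(\d+)", text)
extracted = first.groups()

# 3. substitute: replace every run of digits
replaced = re.sub(r"\d+", "#", text)

# 4. split: remove the separators matching the pattern
parts = re.split(r"\s*,\s*", text)

# 5. grep: keep the list elements which match a pattern
grepped = [p for p in parts if re.search(r"^id=", p)]

for result in (is_match, extracted, replaced, parts, grepped):
    print(result)
```

There is no dedicated grep function in re; a list comprehension with re.search does the job.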
In Python the re module holds all the functionality for regular expression usage. The first demo program has three parts: a matching-only comparison of match and search, an extraction illustration using the group member function, and replacement.

regex1.py
#!/usr/bin/env python
 
import re
 
print """\
--------------------------------------
match vs. search
--------------------------------------
"""
 
pattern = r'\w\d{2}'
 
tests = [ "ABCD474", "A474" ]
 
for str_to_match in tests:
    # "match" must match at beginning
    if re.match( pattern, str_to_match ):
        print "match: '{}' with pattern {}: yes" . format(str_to_match, pattern)
    else:
        print "match: '{}' with pattern {}: no" . format(str_to_match, pattern)
 
print "==============================="
 
for str_to_match in tests:
    # "search" can match anywhere in the string
    if re.search( pattern, str_to_match ):
        print "search: '{}' with pattern {}: yes" . format(str_to_match, pattern)
    else:
        print "search: '{}' with pattern {}: no" . format(str_to_match, pattern)
 
print """
--------------------------------------
extract
--------------------------------------
"""
 
pattern = r"(a+)\s*:\s*(\d+)\s*(\w+)"
 
str_to_match = " AaA: 272xy7-88"
 
match = re.search( pattern, str_to_match, re.IGNORECASE )
if match:
    print "'{}' matches '{}'".format(str_to_match, pattern)
    print match.group()
    print match.group(1)
    print match.group(2)
    print match.group(3)
    print match.groups()
    print match.group(1,3)
else:
    print "no match"
 
print """
--------------------------------------
replacement
--------------------------------------
"""
 
str_to_match = " 234aaAA  22bbbb  3cc  "
 
print "string_to_match = '{}'".format(str_to_match)
 
#----------------------------
 
news = re.sub( string = str_to_match, pattern = r"\d+", repl="***" )
print "(0) news = '{}'".format(news)
 
#----------------------------
 
def repfunc1(match):
    return "[{}]".format(match.group(0))
 
news = re.sub( string = str_to_match, pattern = r"\d+", repl=repfunc1 )
print "(1) news = '{}'".format(news)
 
#----------------------------
 
def repfunc2(match):
    return "[{}]".format( int(match.group(0)) + 1 )
 
news = re.sub( string = str_to_match, pattern = r"\d+", repl=repfunc2 )
print "(2) news = '{}'".format(news)
 
#----------------------------
 
def repfunc3(match):
    return "[{}]{}".format(int(match.group(1)) + 1, "@" * len(match.group(2)))
 
news = re.sub( string = str_to_match, pattern = r"(\d+)(a+)", repl=repfunc3, 
               flags=re.IGNORECASE )
print "(3) news = '{}'".format(news)

Compiled regular expressions

Programming languages which support regular expression pattern matching do so with an internal compiled version of the pattern. In Python, this compilation is implicit for functions called at the module level (re.match, re.search, re.sub), but Python (like Java) gives the user an explicit object representing a compiled version.

As an example of explicit regular expression compilation, suppose the module-level call were:
match = re.search( r"d+", "a Bc Dd e ddd f DDD", re.IGNORECASE )
then the equivalent usage with explicit compilation is:
cp = re.compile( "d+", re.IGNORECASE )
match = cp.search( "a Bc Dd e ddd f DDD" )
The point is that using the compiled version is more efficient when the same regular expression is used multiple times in a program.
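The equivalence of the two usages can be verified directly. This is a small sketch using the same pattern and test string as above; the variable names are illustrative, and the findall call is added only to show the compiled object being reused.

```python
import re

text = "a Bc Dd e ddd f DDD"

# module-level call: the pattern is compiled implicitly
module_level = re.search(r"d+", text, re.IGNORECASE).group()

# explicit compilation: the flags go to compile, not to search
cp = re.compile(r"d+", re.IGNORECASE)
compiled = cp.search(text).group()

# the same compiled object can be reused for other operations
all_runs = cp.findall(text)

print(module_level)
print(compiled)
print(all_runs)
```

Note that once the pattern is compiled, the flags are part of the object and are not passed again to search or findall.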

The second program repeats the first two parts of the first program, except using a compiled pattern. In particular the code in the first treats pattern as a regular expression per se:
pattern = r"-a-regular-expression"
The second version treats pattern as an object compiled from a separate regular expression:
regex = r"-a-regular-expression"
pattern = re.compile(regex, ...)
Here is the program:

regex2.py
#!/usr/bin/env python
 
import re
 
print """\
--------------------------------------
match vs. search
--------------------------------------
"""
 
regex = r'\w\d{2}'
pattern = re.compile(regex)
 
tests = [ "ABCD474", "A474" ]
 
for str_to_match in tests:
    if pattern.match( str_to_match ):
        print "match: '{}' with pattern {}: yes" . format(str_to_match, regex)
    else:
        print "match: '{}' with pattern {}: no" . format(str_to_match, regex)
 
 
print "==============================="
 
for str_to_match in tests:
    # "search" can match anywhere in the string
    if pattern.search( str_to_match ):
        print "search: '{}' with pattern {}: yes" . format(str_to_match, regex)
    else:
        print "search: '{}' with pattern {}: no" . format(str_to_match, regex)
 
 
print """
--------------------------------------
extract
--------------------------------------
"""
 
regex = r"(a+)\s*:\s*(\d+)\s*(\w+)"
pattern = re.compile(regex, re.IGNORECASE)
 
str_to_match = " AaA: 272xy7-88"
 
match = pattern.search( str_to_match )
if match:
    print "'{}' matches '{}'".format(str_to_match, regex)
    print match.group()
    print match.group(1)
    print match.group(2)
    print match.group(3)
    print match.groups()
    print match.group(1,3)
else:
    print "no match"
A third demo program illustrates using a compiled regular expression to match against all the lines of a file, which in this case, is the previous "controls.py" program.

regex3.py
#!/usr/bin/env python
 
import re
 
pattern = r"[a-d]\s*(true|false)"
 
test_file = 'controls.py'
 
print "match lines in file '{}' to pattern '{}'".format(test_file, pattern)
 
# compiled pattern
cp = re.compile(pattern = pattern, flags = re.IGNORECASE)
 
with open(test_file) as f:     # the file is closed automatically
    lines = f.readlines()
 
matching = []
for line in lines:
    line = line.rstrip("\n")
    if cp.search(line):
        matching.append(line)
 
print "\n----- MATCHING LINES -----"
print "\n".join(matching)
The lines which match the regular expression are put into a list which is printed out.

Functions and Classes

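Some key points about Python functions are illustrated in this demo program: arguments can be passed by position or by keyword, parameters can have default values, and a function must declare a variable global in order to assign to it.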

funcs.py
#!/usr/bin/env python
 
def F(a,b):
    print "*** F: a = {}, b = {}".format(a,b)
 
def G(a,b=100,c=200):
    print "*** G: a = {}, b = {}, c = {}".format(a,b,c)
 
F(3,5)
F(b=7, a=10)
 
G(5)
G(5, 10)
G(5, 10, 15)
G(5, c=20, b=30)
#==================================
 
x = 77
 
def F2():
    print "*** F2: x = {}".format(x)
 
def F3():
    x = 33
    print "*** F3: x = {}".format(x)
 
def F4():
    global x
    x = 99
    print "--> F4: x = {}".format(x)
 
print "x = {}".format(x)
F2()
 
F3()
print "x = {}".format(x)
 
F4()
print "x = {}".format(x)
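The same ideas can be put in a form that returns values instead of printing them, which makes the effects easier to check. This is a minimal sketch; the function and variable names are made up for the illustration, and print is called with a single argument so that the code should run under both Python 2 and Python 3.

```python
def G(a, b=100, c=200):
    """Return the arguments as received, defaults filled in."""
    return (a, b, c)

print(G(5))               # both defaults used
print(G(5, c=20))         # skip b by naming c
print(G(c=1, b=2, a=3))   # keywords may come in any order

counter = 0

def set_local():
    counter = 10          # creates a new local variable
    return counter

def set_global():
    global counter        # assignments now target the module-level name
    counter = 99
    return counter

local_result = set_local()
after_local = counter     # the module-level counter is unchanged
global_result = set_global()
print("after set_local: {}".format(after_local))
print("after set_global: {}".format(counter))
```

Without the global declaration, an assignment inside a function always creates or rebinds a local name, even when a module-level variable of the same name exists.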
In Python classes, the usual "this" found in other languages is replaced by "self". Furthermore, it requires explicit usage (like PHP, say). Data members initialized in the class body are static and can also be referred to with "self". Python can employ both a static and non-static usage of the same variable name. Here is a demo program:

classes.py
#!/usr/bin/env python
 
g = 5
 
class Foo(object):
    s1 = 66         # static
    s2 = g          # init from global
    __h = g + 10    # static hidden
 
    def __init__(self, a=200):  # constructor
        self.mem1 = 100         # non-static member
        self.mem2 = self.s1     # init from static
        self.mem3 = g           # init from global
        self.__hmem = a         # hidden
        self.show()             # call show member function
 
    def show(self):
        print "  show: s1 =", self.s1
        print "  show: __h =", self.__h
        print "  show: mem1 =", self.mem1
        print "  show: __hmem =", self.__hmem
 
    def bar(self):
        self.s1 += 1000     # this separates dynamic and static occurrences
        self.__h += 1000    # same here
        self.__hmem += 1000
        self.mem1 += 1000
 
print 'g =', g
print 'Foo.s1 =', Foo.s1
print 'Foo.s2 =', Foo.s2
print "--------------------"
 
print 'create foo:'
foo = Foo()    # instantiate, no "new"
print 'foo.mem1 =', foo.mem1 
print 'foo.mem2 =', foo.mem2
print 'foo.mem3 =', foo.mem3
 
try:
    print foo.__hmem
except Exception as err:
    print err
 
print vars(foo)
print "------------------------------------------------"
print 'create foo1:'
foo1 = Foo(333)
 
print "foo1.bar()"
foo1.bar()
 
print "foo1.show()"
foo1.show()
 
# static occurrences are separate from dynamic ones for same member name
 
print 'Foo.s1 =', Foo.s1
print 'foo1.s1 =', foo1.s1
print vars(foo)
 
print "------------------------------------------------"
 
class Person(object): 
    pass
 
joe = Person()
joe.fname = "Joe"
joe.lname = "Jones"
joe.age = 33
 
print vars(joe)
The other feature shown in this program is exception handling:
try:
    # Exception-generating code
except Exception as err:
    print err
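Two points in classes.py are easy to miss: assigning through self creates a separate instance member, and the double-underscore names are hidden by renaming rather than by access control. The sketch below isolates both; the class name Counter and its members are invented for the illustration.

```python
class Counter(object):
    total = 0              # static (class) member
    __secret = "hidden"    # stored under the name _Counter__secret

    def bump(self):
        # reads Counter.total, then stores the sum as an instance member
        self.total += 1

c = Counter()
c.bump()
print("Counter.total = {}".format(Counter.total))
print("c.total = {}".format(c.total))
print(vars(c))

# outside the class, the double-underscore name is not found
try:
    c.__secret
    access = "visible"
except AttributeError:
    access = "AttributeError"
print(access)

# the renamed member is still reachable if you insist
mangled = c._Counter__secret
print(mangled)
```

After bump, the static member is untouched and the instance carries its own total, which is the separation that the comments in classes.py refer to.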

Command-line arguments

The Python argparse module offers a complete solution to the problem of dealing with command-line arguments. See
http://docs.python.org/library/argparse.html
According to argparse usage, command-line arguments are thought of as non-option positional arguments (the positioning matters) and option arguments (the positioning of these doesn't matter, although the order may matter). Option arguments are either of the single-dash short style such as "-v" or the double-dash long style such as "--verbose". In addition, a built-in mechanism is provided for creating a help synopsis and usage information on incorrect option usage.

The parser is created by:
parser = argparse.ArgumentParser()
You add argument descriptors to it one-by-one with
parser.add_argument(...)
The first parameter used dictates whether it is an optional argument (starts with a "-") or a positional argument (does not). When all argument descriptors are defined, you put them into effect by parsing the arguments with:
args = parser.parse_args()
Afterwards the member names of the args object are used to provide the values (or perhaps just indication of presence) of the command arguments. The parameters available to parser.add_argument are varied and complex with many, many capabilities of which our simple examples only touch the surface.

args1.py
#!/usr/bin/env python
 
import argparse
 
parser = argparse.ArgumentParser()
parser.add_argument('arg1')
parser.add_argument('arg2', nargs='?', default='foo')
parser.add_argument('arg3', nargs='?')
parser.add_argument('-t', '--test', action='store_true')
 
args = parser.parse_args()
 
print args
print args.arg1, args.arg2, args.arg3, args.test
Try out these test usages:
$ ./args1.py -h
$ echo $?                   (success status on correct option usage)
$ ./args1.py
$ echo $?                   (failure status on incorrect option usage)
$ ./args1.py aaa
$ ./args1.py aaa bbb
$ ./args1.py aaa bbb ccc
$ ./args1.py -t aaa bbb
$ ./args1.py --test aaa bbb
$ ./args1.py aaa bbb -t
$ ./args1.py aaa -t bb       (options cannot be in the middle of positionals)
Another example is the following which permits a variable number of positional arguments.

args2.py
#!/usr/bin/env python
 
import argparse
 
synopsis = """This command takes an arbitrary
number of positional parameters with optional
file output
"""
 
parser = argparse.ArgumentParser(description = synopsis)
 
parser.add_argument('infile', nargs='*')
parser.add_argument('-o', metavar='outfile')
 
args = parser.parse_args()
 
print args
Try out these test usages:
$ ./args2.py -h                  (metavar shows up here)       
$ ./args2.py aaa
$ ./args2.py aaa bbb -o fff
$ ./args2.py -o fff aaa bbb ccc
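For experimenting without a shell, parse_args also accepts an explicit list of strings in place of the real command line. The sketch below uses that to try a parser similar to args1.py; the prog name args_demo is made up, and print is called with a single argument so that the code should run under both Python 2 and Python 3.

```python
import argparse

parser = argparse.ArgumentParser(prog="args_demo")   # prog name is illustrative
parser.add_argument('arg1')
parser.add_argument('arg2', nargs='?', default='foo')
parser.add_argument('-t', '--test', action='store_true')

# parse an explicit argument list instead of sys.argv
args = parser.parse_args(['aaa', '-t'])
print("{} {} {}".format(args.arg1, args.arg2, args.test))

# a missing required positional makes argparse print a usage
# message to stderr and exit; catch the exit to see the status
try:
    parser.parse_args([])
    status = 0
except SystemExit as exc:
    status = exc.code
print("exit status: {}".format(status))
```

The status captured here is the same value that echo $? reports after an incorrect invocation from the shell.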


© Robert M. Kline