Perl has a special array variable @ARGV (0-based)
which holds the command-line parameters.
As in Bash, $0 holds the name of the script as it was invoked
(not necessarily a full path); it is not an element of @ARGV.
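As a small illustration, the following sketch prints the script name and lists each parameter with its 0-based index. The helper name describe_args is hypothetical and was introduced only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: return one descriptive line per parameter,
# labeled with its 0-based index in the list.
sub describe_args {
    my @args = @_;
    return map { "ARGV[$_] = $args[$_]" } 0 .. $#args;
}

print "script name (as invoked): $0\n";
print "number of parameters: ", scalar(@ARGV), "\n";
print "$_\n" foreach describe_args(@ARGV);
```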
If the parameters are files, Perl provides a simple syntax
for accessing the contents of the files, line by line, using the syntax:
<ARGV>
In this usage, ARGV is called a file handle, and the
bracket syntax <..>
represents "the next line" from the files in the
command-line parameter list. Consider this Perl script:
dump-files.pl
#!/usr/bin/perl
foreach (<ARGV>) {
    print $_;
}
Try this execution:
$ dump-files.pl dump*
The following version instead accesses the lines
of standard input using the STDIN file handle:
dump-stdin.pl
#!/usr/bin/perl
foreach (<STDIN>) {
    print $_;
}
Try these executions:
Because the line-by-line processing of input is such a standard
way to manipulate text files, many Perl scripts use this structure
as an outline.
foreach vs. while
The foreach loops can be replaced by while loops
with similar, but slightly different, behavior. For example,
consider this program compared to the "foreach" version above:
dump-stdin-while.pl
#!/usr/bin/perl
while (<STDIN>) {
    print $_;
}
The difference is that the while version processes input immediately,
one line at a time, whereas the foreach version evaluates <STDIN> in
list context, so standard input must be "complete" (end-of-file reached)
before any line is processed.
Compare the executions to illustrate the difference:
$ dump-stdin.pl
First
Second
Ctrl-D (terminate)
vs.
$ dump-stdin-while.pl
First
Second
Ctrl-D (terminate)
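The same difference can be observed without a terminal. The sketch below reads from an in-memory file handle and checks, inside the first loop iteration, whether the input is already exhausted. The sub name at_eof_in_first_iteration and the sample text are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: report whether the input is already exhausted
# when the loop body runs for the first time.
sub at_eof_in_first_iteration {
    my ($loop, $text) = @_;
    open my $fh, '<', \$text or die "can't open in-memory file: $!\n";
    my $exhausted = 0;
    if ($loop eq 'foreach') {
        foreach my $line (<$fh>) {      # list context: reads all lines first
            $exhausted = eof($fh) ? 1 : 0;
            last;
        }
    } else {
        while (my $line = <$fh>) {      # scalar context: reads one line
            $exhausted = eof($fh) ? 1 : 0;
            last;
        }
    }
    close $fh;
    return $exhausted;
}

my $input = "First\nSecond\n";
print "foreach: ", at_eof_in_first_iteration('foreach', $input), "\n";
print "while:   ", at_eof_in_first_iteration('while',   $input), "\n";
```

With the foreach loop the whole input has been consumed before the body runs, so the helper reports 1; with the while loop a line is still pending, so it reports 0.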
Capturing options with getopts
Perl also has a getopts function which extracts the command-line
options; it is provided by the standard module Getopt::Std.
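As a minimal sketch of the call itself: getopts takes a string of option letters (a letter followed by a colon expects an argument), fills a hash with the options it finds, and removes them from @ARGV. The helper name parse_args, the option letters, and the sample arguments below are chosen only for illustration.

```perl
use strict;
use warnings;
use Getopt::Std;

# Hypothetical helper: parse a list of arguments with the given option
# string and return the options hash plus the remaining arguments.
sub parse_args {
    my ($optstring, @args) = @_;
    local @ARGV = @args;            # getopts always works on @ARGV
    my %options;
    getopts($optstring, \%options) or die "*** options error\n";
    return (\%options, [@ARGV]);
}

my ($opts, $rest) = parse_args('qd:', '-q', '-d', '3', 'file.txt');
print "-q is set\n" if $opts->{q};
print "-d value: $opts->{d}\n" if defined $opts->{d};
print "remaining: @$rest\n";
```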
Here is a sample test program which illustrates these features:
arg-opts.pl
#!/usr/bin/perl -w
use strict;
use File::Basename;
use Getopt::Std;
use Data::Dumper;
$Data::Dumper::Terse = 1;
$Data::Dumper::Indent = 0;
print "this script: ", basename($0), "\n";
print "arguments: @ARGV\n";
# getopts returns false if it sees an unknown option; otherwise it returns true
my %options;
getopts( "qos", \%options ) or die("*** options error\n");
print "\n%options = ", Dumper(\%options), "\n";
# the presence of an option can be tested as follows:
print "\n-q option: ";
if ( defined $options{q} ) {
    print "defined\n";
} else {
    print "not defined\n";
}
# getopts automatically removes the options from @ARGV,
# up to the first non-option argument
print "\narguments after options: @ARGV\n";
my $non_opt_arg = shift @ARGV;
if (defined $non_opt_arg) {
    print "\nextract non-option argument: $non_opt_arg\n";
} else {
    exit;
}
print "\nafter non-opt. arg., remain: @ARGV\n";
# for an option which expects an argument, getopts returns
# false if the argument is not present
%options = ();
getopts( "d:", \%options ) or die("*** options error\n");
print "\n%options = ", Dumper(\%options), "\n";
print "\nafter options, remaining arguments: @ARGV\n";
To test it, try running these commands:
Perl allows access to commands in the underlying operating system in
several ways:
the system function: execute a command (in a subshell)
backquotes (`command`): execute a command in a shell and capture its output
the exec function: execute a command; do not
return to the calling program
For the first two, the exit status is available, and,
as in Bash, the variable $? holds it. Unlike Bash, however, Perl's $? is
the full wait status: it is 0 on success, and the command's exit code
is $? >> 8.
In contrast,
a successful exec never returns to the program.
system.pl
#!/usr/bin/perl -w
use strict;
my $command;
$command = "ls";
print "==> $command\n";
# $? is the raw wait status; the exit code is its upper byte ($? >> 8)
if (system("$command > /dev/null 2>&1") == 0) {
    print " success, exit status = ", ($? >> 8), "\n";
} else {
    print " failure, exit status = ", ($? >> 8), "\n";
}
$command = "ls xxxxx";
print "==> $command\n";
my $output = `$command`;
if ( $? == 0 ) { # you can also use this terse syntax: if ( !$? )
    print " success, exit status = ", ($? >> 8), "\n";
} else {
    print " failure, exit status = ", ($? >> 8), "\n";
}
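The exit code of a command occupies the upper byte of $?, so it is obtained with $? >> 8. The sketch below checks this by running a child Perl process that exits with a known code. It relies on the special variable $^X (the path of the running perl) and on the list form of system, which bypasses the shell; the helper name exit_code_of is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command (list form, no shell involved)
# and return its exit code, or -1 if it could not be started.
sub exit_code_of {
    my @command = @_;
    system(@command);
    return -1 if $? == -1;   # the command failed to start
    return $? >> 8;          # upper byte of the wait status
}

# $^X is the perl interpreter running this script.
my $code = exit_code_of($^X, '-e', 'exit 3');
print "child exit code: $code\n";
```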
Keep in mind that a command run through system
executes in a separate child process (a shell), so changes it makes to
its own environment have no effect on the Perl script or on subsequent commands.
For example, in a shell script
you can change the script's working directory by executing:
cd /home
whereas in Perl, if you were to use
system "cd /home";
it would have no effect on the working directory of the Perl script.
What you mean to use is Perl's own chdir function:
chdir "/home";
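The effect can be checked with getcwd from the standard Cwd module. The sketch below assumes a UNIX-like system and uses the root directory / instead of /home, since / always exists there. The "&& pwd" part is added so that the command is certainly passed to a shell and so that the child prints its own (changed) directory.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);

my $start = getcwd();

# The cd runs in a child shell: the child prints "/", but the
# working directory of this script is unchanged.
system("cd / && pwd");
my $after_system = getcwd();

# chdir changes the working directory of the Perl process itself.
chdir "/" or die "can't chdir to /: $!\n";
my $after_chdir = getcwd();

print "start:        $start\n";
print "after system: $after_system\n";
print "after chdir:  $after_chdir\n";
```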
File Access
Assuming we're running in a UNIX-like environment, one way to read a file,
say the file "system.pl", into a Perl program is as follows:
$contents = `cat system.pl`;
Once we have these contents, we can iterate through the lines
of the file as follows:
@lines = split /\n/, $contents;
foreach my $line (@lines)
{
    # do something with $line
}
The only problem with this approach is that it is system dependent, i.e.,
it won't work on, say, a Windows system where there is no
cat command.
Perl, of course, does permit system-independent file access.
The operation to open system.pl for read access goes like this:
open F, "system.pl" or die( "can't open system.pl\n" );
The symbol F is called a file handle,
and is one of the few accepted
usages of a bareword. The success of the
open operation (especially an open for reading)
should always be checked. With F we could do:
$line = <F>; # read a line from F
or read all the lines from the file by using <F> in array context:
@lines = <F>; # read all lines from F into the @lines array
The open operation can also be used to open a file for writing, appending,
etc. by prefacing the file name with the characters
">", ">>", etc., respectively.
If we open a file for writing, say,
open F, ">some_file";
then we can use the print function to write to it, as follows:
print F "some line\n";
Note that there must be no comma after the file handle,
F.
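The bareword, two-argument form shown above is the classic style. Perl also accepts a lexical variable as the file handle and a three-argument open, in which the mode is given separately from the file name. The sketch below uses that form to write, append to, and read back a file. It works on a temporary file created with File::Temp so that no existing file is touched, and the sample lines are made up.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file; tempfile returns an open handle and the name.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close $tmp_fh;

# Open for writing: the mode ">" is a separate argument.
open my $out, '>', $filename or die "can't write $filename: $!\n";
print $out "first line\n";     # no comma after the file handle
print $out "second line\n";
close $out;

# Open for appending with ">>".
open $out, '>>', $filename or die "can't append to $filename: $!\n";
print $out "third line\n";
close $out;

# Open for reading and read all lines in list context.
open my $in, '<', $filename or die "can't read $filename: $!\n";
my @lines = <$in>;
close $in;

print "read ", scalar(@lines), " lines\n";
print @lines;
```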
A file handle also supports a read operation which reads a given number
of bytes rather than a line at a time, and so is more appropriate for
a binary file. The following program illustrates several usages of
reading and writing files using file handles.
read-write.pl
#!/usr/bin/perl -w
use strict;
my $input_file = shift;
die "missing file argument\n" unless defined $input_file && -f $input_file;
my $output_file = "SampleOut";
my $contents;
open F, "$input_file" or die("can't open $input_file");
read F, $contents, -s "$input_file";
close F;
print "\n===>> printing $input_file to standard output\n";
print $contents;
print "\n===>> printing $input_file to $output_file\n";
open G, ">$output_file" or die("can't open $output_file");
print G $contents;
close G;
open F, "$input_file" or die("can't open $input_file");
print "\n===>> enumerate $input_file to standard output\n";
my $lineno = 0;
foreach my $line (<F>) {
    printf "%03d: ", ++$lineno; # add line number in front
    print $line;
}
close F;
A simple test-run of this program with minimal output is:
$ read-write.pl dump-any.pl
$ cat SampleOut
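For binary data, read is typically called in a loop with a fixed buffer size, and binmode is applied to both handles so that no newline translation takes place on platforms that perform it. The following is a sketch under those assumptions; the sub name copy_binary and the 4096-byte buffer size are arbitrary choices, and the demonstration uses scratch files from File::Temp.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: copy $source to $target in fixed-size chunks
# and return the number of bytes copied.
sub copy_binary {
    my ($source, $target) = @_;
    open my $in,  '<', $source or die "can't open $source: $!\n";
    open my $out, '>', $target or die "can't open $target: $!\n";
    binmode $in;
    binmode $out;
    my $total = 0;
    my $buffer;
    # read returns the number of bytes read, 0 at end of file,
    # and undef on error
    while (my $count = read($in, $buffer, 4096)) {
        print $out $buffer;
        $total += $count;
    }
    close $in;
    close $out or die "can't close $target: $!\n";
    return $total;
}

# Demonstration on a scratch file containing non-text bytes.
my ($src_fh, $src) = tempfile(UNLINK => 1);
binmode $src_fh;
print $src_fh "\x00\x01\xff\n" x 2000;
close $src_fh;
my (undef, $dst) = tempfile(UNLINK => 1);
print "copied ", copy_binary($src, $dst), " bytes\n";
```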
Bash gives the ability to redirect
standard I/O with the > and < operators.
The equivalent can be done in Perl by reopening the special
file handles STDIN, STDOUT, STDERR.
Here is a sample program controlling standard output.
redirect-stdio.pl
#!/usr/bin/perl
use strict;
print "One\n";
print "Two\n";
# reopen STDOUT; subsequent prints go to TestFile
open STDOUT, ">TestFile" or die("can't open TestFile\n");
print "Three\n";
print "Four\n";
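Once STDOUT has been reopened, all further output goes to the file. If the original destination is needed again, a common approach is to duplicate STDOUT first with the ">&" mode of open and restore it afterwards. The sketch below does this; it writes to a temporary file from File::Temp rather than to TestFile, and the helper name capture_stdout is hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: run a code block with STDOUT redirected to
# $filename, then restore the original STDOUT.
sub capture_stdout {
    my ($filename, $code) = @_;
    open my $saved, '>&', \*STDOUT or die "can't duplicate STDOUT: $!\n";
    open STDOUT, '>', $filename    or die "can't redirect STDOUT: $!\n";
    $code->();
    close STDOUT;
    open STDOUT, '>&', $saved      or die "can't restore STDOUT: $!\n";
    close $saved;
}

my (undef, $file) = tempfile(UNLINK => 1);

print "One\n";                                   # original STDOUT
capture_stdout($file, sub { print "Two\n" });    # goes to the file
print "Three\n";                                 # original STDOUT again
```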
String operations
There are many functions and operators
for creating and manipulating strings:
length: the string length
x: string repetition operator, e.g.:
"A" x 5 is "AAAAA"
index & rindex: search for substrings
substr: extract and replace substrings
sprintf: create a string by inserting values into a format string; same
idea as the C function of the same name
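A few of these in action, as a short sketch with made-up sample values:

```perl
use strict;
use warnings;

my $word  = "Perl";
my $len   = length $word;                   # number of characters
my $rule  = "-" x 10;                       # repetition operator
my $label = sprintf "%03d: %s", 7, $word;   # formatted string, as in C

print "$len\n";
print "$rule\n";
print "$label\n";
```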
Below is an example using the index function.
The call:
index $string, $substr, $startindex
returns the position of the first occurrence of the substring $substr
within $string, at or after position $startindex (default is 0).
If there is no such occurrence,
-1 is returned.
The rindex function
does the same operation in reverse: it returns the position of the
last occurrence, searching backward from the end of the string
(or from a given position).
Here is a sample usage that finds all occurrences of a substring
within a string using repeated calls to index.
search.pl
#!/usr/bin/perl -w
use strict;
my $line = "The rain in Spain falls mainly on the plain.";
my $search = "ain";
my $ind;
print "search string: $line\n\n";
$ind = index $line, $search;
while ($ind != -1) { # a failed search returns -1
    print "next occurrence of '$search' is at position $ind\n";
    $ind = index $line, $search, $ind+1;
}
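The same search can be run backward with rindex. The sketch below collects the positions from the end of the string toward the beginning; the sub name positions_from_end is hypothetical. The explicit check for position 0 guards against searching again from a negative position.

```perl
use strict;
use warnings;

# Hypothetical helper: return the positions of all occurrences of
# $substr in $string, last occurrence first.
sub positions_from_end {
    my ($string, $substr) = @_;
    my @positions;
    my $ind = rindex($string, $substr);
    while ($ind != -1) {
        push @positions, $ind;
        last if $ind == 0;      # nothing can start before position 0
        $ind = rindex($string, $substr, $ind - 1);
    }
    return @positions;
}

my $line = "The rain in Spain falls mainly on the plain.";
print join(", ", positions_from_end($line, "ain")), "\n";
```

For this sample line the positions printed are 40, 25, 14, 5.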
The substr function
returns the substring of a given string that starts at a given
position and has a given length:
substr $line, $start, $length;
If $length is omitted,
the substring extends to the end of the string.
The substr function can also be used to replace
the extracted portion by another string
simply by assigning a new string to the function call. From the perspective
of programming language
theory, we say that substr can be used as an lvalue (location value),
and so can be assigned to. Here is an example program which illustrates
these points.
substr-replace.pl
#!/usr/bin/perl -w
use strict;
my $line = "abcdefghijklmnopqrstuvwxyz";
print "\$line: $line\n\n";
print "substr(\$line,0,6)=", substr($line, 0, 6), "\n";
print "substr(\$line,6,3)=", substr($line, 6, 3), "\n";
print "substr(\$line,9) =", substr($line, 9), "\n";
my $replace = "+++++++++";
print "\$replace=$replace\n";
# Replace, at position 8, a substring of the replacement's length
substr($line, 8, length($replace)) = $replace;
print "after replacement, \$line=$line\n";
# Delete a substring of length 4 at position 3
substr($line, 3, 4) = "";
print "after another replacement, \$line=$line\n";
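Besides assignment, substr also accepts the replacement string as a fourth argument, in which case it returns the portion that was replaced. A negative start position counts from the end of the string. The following sketch shows both, reusing the alphabet string from the example above.

```perl
use strict;
use warnings;

my $line = "abcdefghijklmnopqrstuvwxyz";

# Four-argument form: replace 9 characters starting at position 8
# and return the characters that were replaced.
my $removed = substr($line, 8, 9, "+++++++++");

# Negative start position: the last 3 characters.
my $tail = substr($line, -3);

print "removed: $removed\n";
print "line:    $line\n";
print "tail:    $tail\n";
```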