A module, simply put, is a set of subroutines and variables that you put in a file so that they can be used and re-used by many programs. There are “traditional modules” and “object-oriented modules”; in this tutorial, we’ll concentrate on the traditional kind.
Let’s construct a few subroutines to determine basic statistics. These will use a subset of the POD commenting style:
=begin comment
    starts a block of comments
=end comment
    ends a block of comments
=cut
    takes you out of POD mode and back to Perl
use strict;

=begin comment

Find the average of an array, passed by reference.
If the array is empty, return zero.

=end comment

=cut

sub average {
    my $array_ref = shift;
    my $sum       = 0;
    my $result    = 0;
    my $n         = scalar @{$array_ref};
    my $item;

    foreach $item (@{$array_ref}) {
        $sum += $item;
    }
    if ($n > 0) {
        $result = $sum / $n;
    }
    return $result;
}

=begin comment

Calculate the maximum and minimum values of a set of numbers
passed as an array reference. If there are zero numbers,
return zero for both maximum and minimum.

=end comment

=cut

sub min_max {
    my $array_ref = shift;
    my $n         = scalar @{$array_ref};
    my $min;
    my $max;
    my $item;

    if ($n == 0) {
        $min = $max = 0;
    }
    else {
        $min = $max = ${$array_ref}[0];
        foreach $item (@{$array_ref}) {
            $min = $item if ($item < $min);
            $max = $item if ($item > $max);
        }
    }
    return ($min, $max);
}

=begin comment

Calculate the variance of a set of numbers with n items
passed as an array reference.

Algorithm:
    Calculate the sum of all the numbers (sum)
    Calculate the sum of the squares of the numbers (sum_sq)
    Variance is (n * sum_sq - (sum**2)) / (n * (n-1) )

If n <= 1, return zero.

=end comment

=cut

sub variance {
    my $array_ref = shift;
    my $n         = scalar @{$array_ref};    # declared once; the original declared $n twice
    my $result    = 0;
    my $item;
    my $sum    = 0;
    my $sum_sq = 0;

    foreach $item (@{$array_ref}) {
        $sum    += $item;
        $sum_sq += $item * $item;
    }
    if ($n > 1) {
        $result = (($n * $sum_sq) - $sum**2) / ($n * ($n - 1));
    }
    return $result;
}
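The shortcut formula in the variance comment can be cross-checked against the usual definition of the sample variance (the sum of squared deviations from the mean, divided by n - 1). The following is a minimal sketch and not part of the module; the names variance_shortcut and variance_by_definition are made up for this illustration.

```perl
use strict;
use warnings;

# Shortcut form used in the tutorial:
# (n * sum_sq - sum**2) / (n * (n - 1))
sub variance_shortcut {
    my $array_ref = shift;
    my $n         = scalar @{$array_ref};
    return 0 if $n <= 1;

    my ($sum, $sum_sq) = (0, 0);
    foreach my $item (@{$array_ref}) {
        $sum    += $item;
        $sum_sq += $item * $item;
    }
    return ($n * $sum_sq - $sum**2) / ($n * ($n - 1));
}

# Definition: sum of squared deviations from the mean, over (n - 1).
sub variance_by_definition {
    my $array_ref = shift;
    my $n         = scalar @{$array_ref};
    return 0 if $n <= 1;

    my $sum = 0;
    $sum += $_ foreach @{$array_ref};
    my $mean = $sum / $n;

    my $squared_dev = 0;
    $squared_dev += ($_ - $mean)**2 foreach @{$array_ref};
    return $squared_dev / ($n - 1);
}

my @data       = (5, 22, 4, 13);
my $shortcut   = variance_shortcut(\@data);
my $definition = variance_by_definition(\@data);
print "shortcut: $shortcut, definition: $definition\n";
```

For the data set 5, 22, 4, 13 the sum is 44 and the sum of squares is 694, so the shortcut gives (4 * 694 - 1936) / 12 = 70, which agrees with the definition.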
If you want to use these routines in several different programs, you could
copy and paste them into each one. This is not a good solution, though;
if you decide to add or modify a subroutine, you have to change it in all
the files. Instead, you can put them in a file with a .pm
extension and make one crucial addition. At the end of the file, you
must put:
1;
You need to add this because, when you use a module, Perl compiles and runs the file, and the last statement it evaluates must return a “true” value to signal that the module loaded successfully. After adding the line, save the code in a file named Statistics.pm. I know I’ve warned you about using lowercase only for file names, but the convention for Perl modules is to uppercase the first letter.
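To see what happens when the true value is missing, here is a small experiment, offered as a sketch only. It writes two throwaway modules into a temporary directory and tries to load each one. GoodDemo and BadDemo are hypothetical names invented for the demonstration, and the module ending in 0; stands in for a module whose last statement is false.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Temporary directory that is removed when the program exits.
my $dir = tempdir( CLEANUP => 1 );

# Helper (hypothetical, for this demo only): write a module file into $dir.
sub write_module {
    my ($name, $body) = @_;
    open my $fh, '>', "$dir/$name.pm" or die "Cannot write $name.pm: $!";
    print {$fh} $body;
    close $fh or die "Cannot close $name.pm: $!";
}

# One module ends with a true value, the other with a false one.
write_module('GoodDemo', "package GoodDemo;\nsub hello { return 'hi' }\n1;\n");
write_module('BadDemo',  "package BadDemo;\nsub hello { return 'hi' }\n0;\n");

# Let Perl search the temporary directory for modules.
unshift @INC, $dir;

my $good_ok   = eval { require GoodDemo; 1 } ? 1 : 0;
my $bad_ok    = eval { require BadDemo;  1 } ? 1 : 0;
my $bad_error = $@;

print "GoodDemo loaded: $good_ok\n";
print "BadDemo loaded:  $bad_ok\n";
print "Error was: $bad_error" unless $bad_ok;
```

The second require fails with a message saying that BadDemo.pm did not return a true value, which is the error the final 1; guards against.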
Now that you have created the module file, you can use it as in the following test file:
#!/usr/bin/perl
use strict;
use Statistics;

my @data  = ( 5, 22, 4, 13 );
my @data2 = ( 11, 10, 12 );
my $avg = average(\@data);
my $var = variance(\@data);
my ($small, $big) = min_max(\@data);

print "Average of first data set is $avg\n";
print "Variance is $var\n";
print "Range is $small to $big\n";
print "-" x 40, "\n";

$avg = average( \@data2 );
$var = variance( \@data2 );
($small, $big) = min_max(\@data2);

print "Average of second data set is $avg\n";
print "Variance is $var\n";
print "Range is $small to $big\n";
A namespace is an area where Perl keeps a list of all the names of variables and subroutines that are available to a program. Unless you specify otherwise, all your variables and subroutines go into the “main” namespace.
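As a quick illustration of the default namespace, the sketch below defines a subroutine without any package statement and then calls it both ways. The subroutine name greet is invented for the example.

```perl
use strict;
use warnings;

# No package statement has been seen yet, so this lands in "main".
sub greet { return "hello from " . __PACKAGE__ }

my $short = greet();          # unqualified call
my $full  = main::greet();    # same subroutine, fully qualified name

print "$short\n";
print "$full\n";
```

Both calls reach the same subroutine, and __PACKAGE__ reports main as the current package.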
This is all well and good, but there’s a problem: subroutine names like average are quite common. If you are using a lot of people’s modules, there’s a distinct possibility of name collision if you just dump all the names into one namespace. To avoid this, you make a module part of a package, and that sets up a new namespace.
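The rule in Perl is that the package name must match the module name (use Statistics loads Statistics.pm and then asks the Statistics package to import its names), so we have to add this at the top of the module: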
package Statistics;
Now the subroutines average, min_max, and variance belong to the Statistics package instead of the default package (named main). If you run the test program again, you will get this error message:
Undefined subroutine &main::average called at stats.pl line 7.
To fix this, you must now specify the namespace as well as the subroutine name (and do this throughout the program):
my $avg = Statistics::average(\@data);
my $var = Statistics::variance(\@data);
# etc.
This solves the problem. If you were to use another package, say, BuildingCode, which also had a variance subroutine (which finds out how much a variance for a certain type of construction project costs), you could distinguish them:
use Statistics;
use BuildingCode;

my @data   = (2, 6, 13);
my $number = Statistics::variance( \@data );
my $cost   = BuildingCode::variance( "electrical" );
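The same idea can be tried in a single file, without creating any module files. In this sketch, StatsDemo and PermitDemo are hypothetical stand-ins for Statistics and BuildingCode, and the subroutine bodies are placeholders that only return a label.

```perl
use strict;
use warnings;

{
    package StatsDemo;    # hypothetical stand-in for Statistics
    sub variance { return "statistical variance" }
}

{
    package PermitDemo;    # hypothetical stand-in for BuildingCode
    sub variance { return "zoning variance" }
}

# Back in package main: both subroutines coexist under different prefixes.
my $number = StatsDemo::variance();
my $cost   = PermitDemo::variance();

# An unqualified call looks for main::variance, which does not exist,
# so it dies at run time; eval catches the error for inspection.
my $unqualified_ok = eval { variance(); 1 } ? 1 : 0;
my $error          = $@;

print "$number / $cost\n";
print "unqualified call failed: $error" unless $unqualified_ok;
```

The two variance subroutines never clash because each lives in its own namespace, while the unqualified call fails with the same kind of "Undefined subroutine &main::..." message shown earlier.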
The problem is now solved, but at the expense of having to qualify every subroutine call. Luckily, it is possible to import names from a package’s namespace into your own—if the package lets you. To allow the export of all your subroutines, add this to the module:
package Statistics;
require Exporter;
our @ISA       = ("Exporter");
our @EXPORT_OK = qw(average variance min_max);
Then, explicitly import them into the main namespace within your program:
use Statistics qw( average variance min_max );
Note: even if a package does not list a subroutine as OK to export, you can always call it directly with its fully qualified name. The @EXPORT_OK array gives the names of subroutines that a program may import so that they can be used without being fully qualified.
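Here is a hedged, single-file sketch of that behaviour. Because everything lives in one file, the package is set up inside a BEGIN block and import is called by hand, which is roughly what use does after loading a real module file. StatsDemo and internal_note are names invented for this illustration.

```perl
use strict;
use warnings;

BEGIN {
    package StatsDemo;    # hypothetical single-file stand-in for Statistics.pm
    require Exporter;
    our @ISA       = ('Exporter');
    our @EXPORT_OK = qw(average);    # only average may be imported

    sub average {
        my $array_ref = shift;
        my $n         = scalar @{$array_ref};
        return 0 if $n == 0;
        my $sum = 0;
        $sum += $_ foreach @{$array_ref};
        return $sum / $n;
    }

    # Deliberately left out of @EXPORT_OK.
    sub internal_note { return "not exportable" }
}

# Roughly what `use StatsDemo qw(average);` does once the module is loaded.
BEGIN { StatsDemo->import(qw(average)) }

my $avg  = average([5, 22, 4, 13]);       # imported, so no prefix is needed
my $note = StatsDemo::internal_note();    # the fully qualified name always works

# Asking for a name that is not in @EXPORT_OK makes Exporter die
# (it also prints a warning to STDERR); eval catches the failure.
my $import_refused = eval { StatsDemo->import('internal_note'); 1 } ? 0 : 1;

print "average: $avg\n";
print "note: $note\n";
print "import of internal_note refused: $import_refused\n";
```

The import list works as a request that the module checks against @EXPORT_OK; anything not listed there stays reachable only through its fully qualified name.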
The person writing a package can also specify that certain names are exported whenever someone uses the package. For example, if you put this in the module:
package Statistics;
require Exporter;
our @ISA       = ("Exporter");
our @EXPORT    = qw(average min_max);    # export by default
our @EXPORT_OK = qw(variance);           # export on request
@EXPORT and @EXPORT_OK
Given the preceding setup, if a program is only going to use the average and min_max functions, it can just use those two functions; they were exported by default:
#!/usr/bin/perl
use Statistics;

my @data = (1, 20, 3);
my $avg  = average(\@data);
my ($small, $big) = min_max(\@data);
If the program wants to use variance as well, it cannot simply do the following, because once you give an import list, only the items in the qw list get imported.
#!/usr/bin/perl
use Statistics qw(variance);    # imports variance ONLY

my @data = (1, 20, 3);
my $var  = variance(\@data);    # this is ok...
my $avg  = average(\@data);     # but this now causes an error
my ($small, $big) = min_max(\@data);
How do you get the default exports as well as the specific one that you wanted? Add :DEFAULT to the import list:
#!/usr/bin/perl
use Statistics qw(:DEFAULT variance);

my @data = (1, 20, 3);
my $var  = variance(\@data);            # this is ok...
my $avg  = average(\@data);             # ...and this works also
my ($small, $big) = min_max(\@data);    # ...as does this.
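The three import styles can be compared side by side in one file. This is only a sketch under a few assumptions: StatsDemo is a hypothetical stand-in for Statistics.pm with stub subroutines, and the packages PlainUse, OnlyVariance, and WithDefault each play the role of a separate program. Calling import by hand inside a BEGIN block approximates what use does after loading a real module file.

```perl
use strict;
use warnings;

BEGIN {
    package StatsDemo;    # hypothetical stand-in for Statistics.pm
    require Exporter;
    our @ISA       = ('Exporter');
    our @EXPORT    = qw(average min_max);    # export by default
    our @EXPORT_OK = qw(variance);           # export on request

    # Stub bodies: only the names matter for this demonstration.
    sub average  { return 'average' }
    sub min_max  { return 'min_max' }
    sub variance { return 'variance' }
}

# Each package below plays the role of a separate program.
{ package PlainUse;     BEGIN { StatsDemo->import() } }
{ package OnlyVariance; BEGIN { StatsDemo->import('variance') } }
{ package WithDefault;  BEGIN { StatsDemo->import(':DEFAULT', 'variance') } }

# Report which of the three names each "program" can call unqualified.
foreach my $pkg (qw(PlainUse OnlyVariance WithDefault)) {
    my @visible = grep { $pkg->can($_) } qw(average min_max variance);
    print "$pkg sees: @visible\n";
}
```

PlainUse ends up with average and min_max, OnlyVariance with variance alone, and WithDefault with all three, which mirrors the three test programs above.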