Let's start our review of Perl by looking at the way Perl lets you use numbers and strings (sequences of characters). These are the most elementary types of data that you can work with in Perl. These two datatypes are generally known as scalars.
A number in Perl may be written in several formats:
Number | Meaning |
---|---|
2.3E4 | 2.3 times ten to the fourth power, or 23000 |
-4.1e3 | -4.1 times ten to the third power, or -4100 |
78e-3 | 78 times ten to the minus third power, or 0.078 |
It is possible to write integers in base 8 (octal) by preceding them with a zero. You may write integers in base 16 (hexadecimal) notation by preceding them with 0x or 0X. These aren't used frequently, but are part of the language, and you may see them in other people's code. You will have to identify them correctly on the midterm, but will not have to convert them from one form to another.
Integer | Base | Equivalent in base 10 |
---|---|---|
0253 | octal | 2*64 + 5*8 + 3 = 171 |
0x47 | hexadecimal | 4*16 + 7 = 71 |
0x1a4 | hexadecimal | 1*256 + 10*16 + 4 = 420 |
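Here are the numeric operations, listed from highest priority to lowest. Exponentiation is done first; multiplication, division, and modulo share the next level; addition and subtraction share the lowest level.
Operation | Means | Example | Result |
---|---|---|---|
** | Exponentiation (raising to a power) | 3**4 | 81 |
* | Multiplication | 5 * 4 | 20 |
/ | Division | 7 / 2 | 3.5 |
% | Modulo (remainder) | 53 % 7 | 4 |
+ | Addition | 3 + 8.2 | 11.2 |
- | Subtraction | 19.5 - 6 | 13.5 |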
All of the examples above are called expressions, or more technically, arithmetic expressions.
In an expression that combines operations, they're done in priority order. Thus multiplication is done before addition:
3 + 5 * 4 gives 23
Operations at the same priority level are done left to right. Thus,
10 / 5 * 2 gives 4
You may use parentheses to force the operations to be done in the order you want:
(3 + 5) * 4 gives 32
10 / (5 * 2) gives 1
You don't have to leave spaces around the operators, but it's a good idea to do so. It makes expressions like 3 - -5 much easier to read.
At this point, we know enough to write a small Perl program that prints the cube of 12. Perl traces its heritage to the Unix operating system, and adopts many of its conventions. One of these conventions is that a script for the operating system has a special first line that tells which program should be used to run the script. On most systems, the Perl program is in a file named /usr/bin/perl. So, we start Perl programs with
#!/usr/bin/perl
The next lines in our program will be comments that explain what the program is. Comments are for humans only; Perl ignores them. Everything following the number sign (#) on a line is considered to be a comment. Our program now looks like this:

    #!/usr/bin/perl
    # Print the cube of twelve.
    # Written by E.V. College
The heart of this program is the print function; we will see many flavors of this function. To print a given expression on the screen, you simply say this Perl statement:
print ( expression );
Note the semicolon at the end of the line. Every statement in Perl must end with a semicolon! Our complete program now looks like this:
    #!/usr/bin/perl
    # Print the cube of twelve.
    # Written by E.V. College
    print (12 ** 3);
If you run this program, you'll see the number 1728 printed on your screen. OK, so it doesn't solve differential equations or run a mega-corporation, but hey, it's a start.
Besides the simple operations on numbers, there are also functions which operate on numbers. Here are a few of them.
The abs function converts a number to absolute value.
abs( 4.6 ) gives 4.6
abs( -3.2 ) gives 3.2
The number in parentheses is called the function's argument. The function takes the argument, does some sort of calculation on it, and returns a value that we can use later on.
The int function gives only the integer part of a number and discards any fractional part after the decimal point.
int( 4.5) gives 4
int( -7.9) gives -7
The sqrt function returns the square root of a number.
sqrt(2) gives 1.41421
sqrt(15.5) gives 3.937
An example of using this function would be to find the hypotenuse of a right triangle that has legs of length 3 and 4:
print( sqrt( 3**2 + 4**2 ) ); # will print 5
exp(1) gives 2.71828182846
exp(0.5) gives 1.64872
Note: The ** exponentiation operation needs two numbers; it raises the first number to the second number's power (2**4 raises two to the fourth power). The exponential form of a number, such as 1.3e5, means to multiply 1.3 times ten to the fifth power. The exp function (not explained here) takes one number, and raises the mathematical constant e to that power. These are three different concepts with unfortunately similar names!
Being able to calculate and print numbers is good, but it would certainly be better if we could label our output (e.g., "The cube of 12 is 1728") instead of having some raw number pop up with no explanation. To do this labelling, we'll need to have strings. (We'll use them for a lot of other things, too.) A string is a sequence of characters, such as a word or a sentence. You enclose a sequence of characters in double quote marks or single quote marks, depending upon your needs.
If you put a string inside of single quotes, each character is taken literally. Examples are 'Perl' or 'Evergreen Valley College'. The beginning and ending single quote are not part of the string; they serve to show where the string begins and ends.
What if you want to include a single quote in your string, such as the name John O'Hara? The solution is to put a backslash before the quotemark:
'John O\'Hara' prints as John O'Hara
This brings up another problem: how do you put a backslash in a single-quoted string? The answer: put two backslashes in a row:
'File C:\\system' prints as File C:\system
We can now use this information to label our output from the previous program. To save space, we'll avoid repeating the special starting line and comments. We can use two print statements in a row:
    print('The cube of twelve is ');
    print(12 ** 3);
Or we can put a series of items in the parentheses, separated by commas.
print('The cube of twelve is ', 12**3);
Note that, in both of these cases, we needed to place a blank inside the single quotes after the word is to separate the words from the number upon output.
This is an improvement, but if you run this program on some systems, you won't get a new line; the next prompt will appear right at the end of the output line. We need some way to say, "print a new line," and this is one of the roles of double quotes. If you put double quotes around a string, then some letters take on special meaning if they are preceded by a backslash. For example, \n will not print a backslash and a letter n; the combination will be taken to mean "print a new line". \t will not print a backslash and a letter t; the combination will be taken to mean "print a tab character." These are the most common combinations that you will use.
In addition, as we will see later, the characters $ and @ take on special meaning inside a double-quoted string. If you want to print a dollar sign or an at sign, you must precede it with a backslash. You can also put a double-quote inside a double-quoted string by preceding it with a backslash.
"The cost is \$7.95." prints as The cost is $7.95
"email fred\@ibm.com" prints as email fred@ibm.com
"John \"Red\" Adair" prints as John "Red" Adair
Double quotes are generally more used and more useful than single quotes, so we'll rewrite our program as follows to produce the new line after our output:
print("The cube of twelve is ", 12 ** 3, "\n");
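The same labelled-output pattern can be used to check the number formats and priority rules from earlier in this section. The following is only a sketch; the label text is made up for illustration, and each comment shows the value Perl should print after the label.

```perl
#!/usr/bin/perl
# Print labelled results for number formats and operator priority.
# The label text is illustrative only.
print("Octal 0253 is ", 0253, "\n");           # 171
print("Hex 0x1a4 is ", 0x1a4, "\n");           # 420
print("2.3E4 is ", 2.3E4, "\n");               # 23000
print("3 + 5 * 4 is ", 3 + 5 * 4, "\n");       # 23
print("(3 + 5) * 4 is ", (3 + 5) * 4, "\n");   # 32
print("10 / 5 * 2 is ", 10 / 5 * 2, "\n");     # 4
```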
The most-used operation on strings is the concatenation operator, which sticks strings together. You use a period . for this operation. These examples seem a bit useless; you'll see that concatenation becomes more useful when you learn about variables.
"door" . "bell" is the same as "doorbell"
"the " . "world" is the same as "the world" (note the blank inside the quote marks)
"a" . "b" . "c" is the same as "abc"
The next operator, the repetition operator, is a strange one. It requires a string on the left side, and a number on the right side:
"ring" x 3 is the same as "ringringring"
"what " x 2 is the same as "what what " (note the blank inside the quote marks is repeated!)
Before we proceed further, we have to examine what happens when you mix strings and numbers. Here's the general rule:
If an operation is a numeric operation, such as addition or multiplication, its operands are converted to numbers before the operation is performed. If the operation is a string operation, conversion to strings takes place before the operation is performed. [The same goes for inputs to a function.]
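To see the general rule in action, the short sketch below applies different operators to the same kinds of values. The values are made up for illustration; in each line it is the operator, not the way the value was written, that decides whether Perl treats the operands as numbers or as strings.

```perl
print("12" + "30", "\n");   # numeric addition: prints 42
print("12" . "30", "\n");   # string concatenation: prints 1230
print(12 . 30, "\n");       # numbers become strings: prints 1230
print("12" x 3, "\n");      # repetition: prints 121212
print("5" * "5", "\n");     # strings become numbers: prints 25
```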
Converting a number to a string is easy; it acts as if the number were printed out to the screen and enclosed in quote marks. Note that this conversion changes the way some numbers look, as in the last two examples.
88 . "keys" gives "88keys"
3 x 7 gives "3333333"
3.2e3 . "?!" gives "3200?!"
"take" . 2.0 gives "take2"
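Converting a string to a number is a bit trickier. The steps are as follows: Perl skips any blanks at the start of the string, then takes as many of the following characters as form a valid number, and ignores everything after that. If no number is found at the start of the string, the result is zero.
Examples:
String | Numeric value |
---|---|
" 54 degrees" | 54 |
"-13.8?" | -13.8 |
"woohoo!" | 0 |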
Here are some numeric operations on strings and numbers:
3 + "76trombones" gives 79
" 2 by 4 " * 5 gives 10
"thing one" + "thing two" gives 0
So far so good, but there's room for improvement. If you wanted to find the cube of 11, you'd have to rewrite the program. For you, that's no problem, since you are now officially a programmer. But we'd like to have a more generally useful program that anyone could use; it would ask them for a number and give them back the cube of their desired number. The number wouldn't always be twelve; it would vary or change. It's no coincidence that Perl uses something called a variable to hold information that changes.
You can think of a scalar variable as a mailbox with a person's name on it. Inside the mailbox is a slip of paper that gives the current numeric or string value. In Perl, a variable is a chunk of memory that we give a name to; and inside that chunk of memory is the current numeric or string value that we want to work with. The name of a scalar variable begins with a dollar sign and is followed by a series of letters, digits, and underscores. The first character after the dollar sign must be a letter or an underscore, and names have a limit of 255 characters. These names should be descriptive of the information contained in them. Here are some examples of variable names: $price, $cubed_number, $message, $tax_amount. Variable names are case-sensitive, which means that $price, $Price, and $PRICE are three totally different variables occupying three totally different chunks of memory.
You put a value into a variable with an assignment statement. It has this generic form, with these examples:
    $variable = expression;
    $price = 13.95;
    $message = "Tax is 8.5%";
    $tax_amount = $price * 0.085;

Perl figures out the expression on the right hand side of the equal sign, and then places the result into the variable on the left-hand side of the equal sign. The last example is particularly interesting. When a variable appears on the right hand side of an equal sign, or appears in a function or print statement, Perl "looks into the mailbox" and uses the variable's current value. After these statements, $price holds 13.95, $message holds the string Tax is 8.5%, and $tax_amount holds 1.18575 (that is, 13.95 times 0.085).
Note: The equal sign is not the same as the equal sign in algebra. Read $a = $b as "Set variable a equal to variable b" rather than as a mathematical equation. Consider this:
    $x = 5;
    $x = $x + 1;
In this example, the first line puts the number 5 into the variable $x. The second line is processed as follows: Perl looks at the right hand side of the equal sign. $x currently has the value five, so the computer adds five plus one. The result, six, is stored back into the chunk of memory used by $x.
We can now rewrite the cube-a-number program to use variables:
    $number = 12;
    $cube = $number ** 3;
    print "The cube of ", $number, " is ", $cube, "\n";
This may seem to be a step backwards; we've just taken a one-line program and expanded it to three lines, and it still calculates only the cube of twelve. Let's replace the first line with this:
    $number = <STDIN>;
The right-hand side is a special notation that tells Perl to read one line of text from the standard input device (the keyboard) and put that string into the variable on the left side of the equal. Now we have the program we want; the program waits for the user to type a number, and then cubes it and prints the result. The only problem is that there's no "prompting"; the person who uses the program will just see a blinking cursor and have no idea what kind of entry she is supposed to make. Let's show the entire program, rewritten to prompt for and read input. Notice that we've also changed the comments to reflect the new capabilities of this program.
    #!/usr/bin/perl
    # Prompt for a number, read it, cube it, and
    # print the result.
    #
    # Written by E.V. College
    print("Enter a number to be cubed ");
    $number = <STDIN>;
    $cube = $number ** 3;
    print("The cube of ", $number, " is ", $cube, "\n");
Actually, we've taken a slight shortcut here - the line that is read in from the keyboard has a "newline" character at the end of it (the Enter key on your keyboard puts it there). In this program, it doesn't matter, since the string-to-number conversion will ignore any trailing newline characters. However, if you're reading a string to be used as words, you may have a problem. Consider this program:
    $userName = <STDIN>;
    print "The ", $userName, " household may already be a winner!\n";
If the user types the name Fred Doakes and presses the Enter key, the output will be:
    The Fred Doakes
     household may already be a winner!
We need some easy way to get rid of the newline so that all the output appears on one line. We'll find out how to do this later with a string function.
Just like their numeric counterparts, string functions take arguments in parentheses, and return results for later use. Possibly the most useful string function is length, which tells you how long a string is. Its argument is a string or variable, and the result is a number that tells how many characters are in the string.
    $word = "triboluminescent";
    $len = length($word);
    print($word, " has ", $len, " letters");
    # output: triboluminescent has 16 letters
You will often want to put user input in all upper case or all lower case. The uc and lc functions do this. They affect only letters; anything that's already in the proper case or is not a letter remains untouched.
    $x = "Take heed!";
    $y = uc($x);
    print($y);    # prints TAKE HEED!
    $x = "[QUIET PLEASE]";
    $y = lc($x);
    print($y);    # prints [quiet please]
You may change the first letter of a string to upper case or lower case by using the ucfirst and lcfirst functions.
    $x = "abc!";
    $y = ucfirst($x);
    print($y);    # prints Abc!
    $x = "E.e. cummings";
    $y = lcfirst($x);
    print($y);    # prints e.e. cummings
Remember a while ago, when we had the problem of that extra newline when you use <STDIN> to read input from the keyboard, and how we wanted to get rid of it? To solve this problem, we will use the chomp function, which is designed to remove the last character of a variable, if and only if that character is a newline.
    $x = "Fred\n";
    chomp($x);
    print($x);    # prints Fred
    $x = "Sally";
    chomp($x);
    print($x);    # prints Sally
chomp is a new, improved version of the older chop function, which takes off the last character of a variable, no matter what it is:
    $x = "Fred\n";
    chop($x);
    print($x);    # prints Fred
    $x = "Sally";
    chop($x);
    print($x);    # prints Sall

Note that chomp and chop are not like the other string functions we've used. The other functions leave their argument untouched and return a value that is normally put into another variable. chomp and chop work directly on the variable argument, and the argument must be a variable!

    $x = "Fred\n";
    $y = chomp($x);
    print($x, " ", $y, "\n");
    $a = "Sally";
    $b = chop($a);
    print($a, " ", $b, "\n");
The chomp function returns the number of characters that it removed from the string. The chop function returns the character that was removed from the string. Thus, the output from the preceding example will be:
    Fred 1
    Sall y
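With chomp in hand, the household example from earlier can be fixed. In this sketch a fixed string stands in for the line that <STDIN> would return, so that the program can be run without typing anything; that substitution is an assumption made for illustration.

```perl
$userName = "Fred Doakes\n";   # stands in for: $userName = <STDIN>;
chomp($userName);              # remove the trailing newline
print("The ", $userName, " household may already be a winner!\n");
# prints: The Fred Doakes household may already be a winner!
```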
Up to this point we've been using print with a comma-separated list of the items we want to print. We could, of course, put items together into one big long string with the concatenation operator. The following two are equivalent:
    print("Age is ", $age, " and salary is ", $wages, ".\n");
    print("Age is " . $age . " and salary is " . $wages . ".\n");
If the items are strings and simple scalars, you may dispense with the commas and the periods, and simply enclose the entire list in a single set of double quotes. Within double quotes, and only double quotes, the $ indicates that Perl should look for a following variable name and substitute it in. In fact, this interpolation is actually a shortcut for string concatenation.
print("Age is $age and salary is $wages.\n");
This works only with simple scalars; if you do any arithmetic, you can't use this trick. The following cannot be done with interpolation!
print("Average is ", $sum/$n, ".\n");
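One way around this limit, sketched here with made-up values, is to store the result of the arithmetic in its own variable first and then interpolate that variable. The name $average is hypothetical.

```perl
$sum = 255;
$n = 3;
$average = $sum / $n;              # do the arithmetic first
print("Average is $average.\n");   # prints: Average is 85.
```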
Ordinarily, Perl will be able to figure out where your interpolated variable name ends. You must enclose the variable name in curly braces to resolve cases that might be ambiguous, such as the following one. We need to put braces around the n, because $nt and $nth are also valid Perl variable names, and Perl cannot decide which you want.
    $n = 5;
    print("This is your ${n}th purchase.\n");
Note: You may also use interpolation to construct strings for use on the right hand side of an assignment. The following are equivalent:
    $message = "You have " . $n . " vacation days.";
    $message = "You have $n vacation days.";
The following two are the same, but the first one is easier to read and more direct. The second one is an unnecessary use of interpolation.
    $x = $y;
    $x = "$y";
If you use a variable that has never been given a value, it has the special value undef (undefined), which means, in essence, "never been used before." If you use an undefined variable in a numeric operation, it works like zero. If you use it in a string operation, undef works like an empty string.
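A short sketch of this behaviour, using the hypothetical variables $count and $name, which are deliberately never given a value:

```perl
$total = $count + 10;           # $count is undef, so it works like 0
$label = "Name: " . $name;      # $name is undef, so it works like ""
print("total is $total\n");     # prints: total is 10
print("label is [$label]\n");   # prints: label is [Name: ]
```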
If the variable on the left side of an equal sign is the same as the first variable on the right side of the equal, you can use a shortcut. Note the last example; in the shortcut method, the whole right hand side is treated as if it were in parentheses.
Long form | Shortcut |
---|---|
$x = $x + 1; | $x += 1; |
$z = $z . "a"; | $z .= "a"; |
$h = $h / (2 + $w); | $h /= (2 + $w); or $h /= 2 + $w; |
If you want to add 1 to a variable or subtract 1 from a variable, there's a quick and easy form:
$x++; or ++$x; means $x = $x + 1;
$x--; or --$x; means $x = $x - 1;
When a ++ or -- appears in an assignment, then it does make a difference whether you place the operation before or after the variable name. The rule is: if the operation comes before the variable, you do the addition or subtraction before the assignment. If the operation comes after the variable, you do the addition or subtraction after the assignment.
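The sketch below shows the rule with made-up variable names; the comments give the values after each statement.

```perl
$x = 5;
$y = ++$x;    # add first, then assign: $x is 6 and $y is 6

$p = 5;
$q = $p++;    # assign first, then add: $q is 5 and $p is 6

print("x=$x y=$y p=$p q=$q\n");   # prints: x=6 y=6 p=6 q=5
```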
If you have more than one ++ or -- in an expression, you can work out the result by rewriting it one step at a time. Consider this:

    $a = 7;
    $b = 5;
    $c = $a++ * --$b;

Since the -- precedes the $b, we'll do the subtraction before the statement, and get rid of the --.

    $a = 7;
    $b = 5;
    $b = $b - 1;
    $c = $a++ * $b;

Since the ++ follows the $a, we'll do the addition after the statement and get rid of the ++.

    $a = 7;
    $b = 5;
    $b = $b - 1;
    $c = $a * $b;
    $a = $a + 1;

Now we have five simple statements; when we carry them out, the final value of $a is 8, $b is 4, and $c is 28:

    $a = 7;
    $b = 5;
    $b = $b - 1;     # $a is now 7, and $b is now 4
    $c = $a * $b;    # thus $c is now 28 (7 * 4)
    $a = $a + 1;     # and finally $a becomes 8
Every time we run one of the programs so far, it does all the instructions in the same way. The input data may be different, but the calculations never change. This, however, is not the way the real world works. For example, if you are depositing money into a bank account, you might get a warning if you try to deposit more than US$5,000. You'll have the money added to your balance in any case, but you just get the extra warning for this particular circumstance. If you're withdrawing money, and you try to withdraw more than you have, not only do you get a warning, the transaction can't happen. If you're using an ATM, you have a menu of choices for your transactions; deposit, withdrawal, or viewing the balance. We will need some extra concepts to handle these sorts of problems.
The first case, with big deposits, is the job of the if statement, which has this general format:
    if (condition)
    {
        do some action
    }
The condition asks a yes-or-no, true-or-false question. If the answer to the question is "yes," the action specified in the curly braces is performed. If the answer is "no," the action is skipped. Here's our deposit example. Note that the action between the curly braces is indented to make the program easier to read.
    print("Enter deposit amount ");
    $deposit = <STDIN>;
    chomp($deposit);
    if ($deposit > 5000)
    {
        print("Your deposit of \$$deposit is over \$5000.00.\n");
        print("It may take five business days to clear.\n");
    }
    $balance = $balance + $deposit;
The second case, with withdrawals, requires the if-else statement, which has this general format:
    if (condition)
    {
        do "yes" action
    }
    else
    {
        do "no" action
    }
Once again, the condition asks a yes-or-no, true-or-false question. If the answer to the question is "true," the action specified in the first set of curly braces is performed. If the answer is "false," the action in the second set of curly braces is performed. Here's an example:
    print("Enter withdrawal amount ");
    chomp($withdrawal = <STDIN>);    # a shortcut
    if ($withdrawal <= $balance)
    {
        $balance -= $withdrawal;
    }
    else
    {
        print("Sorry, but your withdrawal of \$$withdrawal\n");
        print("exceeds your balance of \$$balance.\n");
    }

The last two examples have fairly obvious expressions for the condition. Before proceeding to the third example case, let's examine what kinds of true-or-false questions we can ask. First, we can do numeric comparisons with operators like < (less than), <= (less than or equal), > (greater than), >= (greater than or equal to). The operation for "not equal" is !=, and, read this carefully, the operation that asks if two items are "equal" is ==. Yes, that's two equal signs in a row! If you use only one equal sign, you will not get the result you expect, as we will explain later.
If you are comparing strings, you may not use the numeric comparison symbols such as <, >, and ==. Instead, you must use lt for less than, le for less than or equal, gt for greater than, ge for greater than or equal, ne for not equal, and eq for testing equality. These comparisons are done in ASCII order, which means that "a" lt "Z" comes back as "false", because the ASCII code for lower case a is larger than the ASCII code for capital Z.
Hint: if you want to do comparisons with user input strings, you should use the uc or lc functions to convert the user's input to a single case.
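The sketch below illustrates these points with made-up values: ASCII order, the difference between a numeric and a string comparison of the same values, and the lc hint. The variable $answer is hypothetical and stands in for something the user typed.

```perl
# ASCII order: capital letters sort before lower case letters.
if ("apple" lt "Zebra")
{
    print("apple comes first\n");
}
else
{
    print("Zebra comes first\n");          # this branch runs
}

# The same two values can compare differently as numbers and as strings.
if (10 < 9)
{
    print("10 is less than 9\n");          # never printed
}
if ("10" lt "9")
{
    print("\"10\" sorts before \"9\"\n");  # printed: "1" comes before "9"
}

# Converting to one case makes the comparison ignore capitalization.
$answer = "YES";                           # stands in for user input
if (lc($answer) eq "yes")
{
    print("The user said yes\n");          # printed
}
```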
You may also combine conditions. For example, you may want to hire only people who are over 21 and have less than five years experience by using the && operator:
    if ($age > 21 && $experience < 5)
    {
        print("You're hired!\n");
    }

You can use the or operation, which is represented by two vertical bars ||, to express conditions such as "I will buy anything that is priced under $5.00 or is marked 'sale'."
    if ($price < 5.00 || $marking eq "sale")
    {
        print("I'll buy it!\n");
    }
Finally, you may use the not operator, represented by an exclamation point. It's usually used in very complex conditions; for simple conditions it's easier to write things directly as $age < 21 rather than !($age >= 21).
Important: If you want to check to see if a word is equal to one of two choices, you must write it as:
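    if ($word eq "choice1" || $word eq "choice2") ...

You may never write it as:

    if ($word eq "choice1" || "choice2") ...

While this will not generate any error messages, it will absolutely not do what you want: the string "choice2" standing on its own always counts as true, so the condition succeeds no matter what $word contains.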
When Perl evaluates a combined condition, it works from left to right and stops as soon as it knows the final result. For example, in an && combination, if the first condition is false, there's no need to go further; with and, the whole expression is false if either part is. In an || combination, if the first condition is true, there's no need to go further; with or, the whole expression is true if either part is.
You will usually see this used to great advantage in expressions like:
    if ($number_of_people > 0 && $sum / $number_of_people > 70)
    {
        print("Above a C average.\n");
    }
Let's say that the number of people is 6. Since that's greater than zero, Perl must check the second condition of the && to see if that is also true. If the number of people is 0, then the whole expression will turn out false, so Perl will never do the second condition of the &&, and will not produce a divide-by-zero error.
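A runnable sketch of the zero-people case, with values made up for illustration. Because the first condition is false, the division is never attempted and the program does not stop with an error; the text of the second message is an assumption.

```perl
$sum = 0;
$number_of_people = 0;

if ($number_of_people > 0 && $sum / $number_of_people > 70)
{
    print("Above a C average.\n");
}
else
{
    # This branch runs; the division above was skipped.
    print("Not above a C average, or nobody to average.\n");
}
```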
Returning to our last example above, with the ATM that can do several kinds of transactions, we want to have a decision that corresponds to the English, "If the transaction code is D, we do a deposit; otherwise, check if the transaction code is W to do a withdrawal; otherwise, check if the transaction code is B to do an account balance; otherwise, it's a bad choice." (Whew!)
Here's the outline of the code in Perl:
    print("Enter transaction code: D)eposit, W)ithdraw, B)alance ");
    chomp($transaction = <STDIN>);
    $transaction = uc($transaction);    # make sure it's uppercase
    if ($transaction eq "D")
    {
        # do deposit
    }
    elsif ($transaction eq "W")
    {
        # do withdrawal
    }
    elsif ($transaction eq "B")
    {
        # do account balance
    }
    else
    {
        print("Invalid transaction, sorry.\n");
    }
The last else is not necessary, but we recommend that you always put one in your program to catch any cases that weren't caught by the previous elsif conditions. Note that elsif is one word; those of you from other programming language backgrounds will be tempted to write else if, and it won't work.
We already have the code for deposits and withdrawals; we can simply put it inside the framework that we just showed. The idea of if statements within if statements is called a nested if statement. Here's the full ATM program:
    $balance = 15602.75;    # make up a starting number
    print("Enter transaction code: D)eposit, W)ithdraw, B)alance ");
    chomp($transaction = <STDIN>);
    $transaction = uc($transaction);    # make sure it's uppercase
    if ($transaction eq "D")
    {
        print("Enter deposit amount ");
        $deposit = <STDIN>;
        chomp($deposit);
        if ($deposit > 5000)
        {
            print("Your deposit of \$$deposit is over \$5000.00.\n");
            print("It may take five business days to clear.\n");
        }
        $balance = $balance + $deposit;
    }
    elsif ($transaction eq "W")
    {
        print("Enter withdrawal amount ");
        chomp($withdrawal = <STDIN>);    # a shortcut
        if ($withdrawal <= $balance)
        {
            $balance -= $withdrawal;
        }
        else
        {
            print("Sorry, but your withdrawal of \$$withdrawal\n");
            print("exceeds your balance of \$$balance.\n");
        }
    }
    elsif ($transaction eq "B")
    {
        print("Your current balance is \$$balance.\n");
    }
    else
    {
        print("Invalid transaction, sorry.\n");
    }

Before we leave this topic, you should know how conditions are actually handled in Perl. Unlike other languages, which have built-in constants named true and false (also called Booleans), Perl says that any value that works out to undefined (undef), the number zero, the empty string, or the string consisting of exactly one zero ("0") is considered to be false. Anything else is considered to be true.
A true expression like (3 < 5) produces the value 1. A false expression like (3 > 5) produces the null string, so you ordinarily don't need to be concerned about this.
However, some programmers will use a shortcut. Since any non-zero number is considered to be "true", they will say:
    if ($num_people)    # instead of ($num_people != 0)
    {
        $avg = $sum / $num_people;
        print "The average is $avg\n";
    }
    else
    {
        print "Cannot find an average.\n";
    }
Depending upon whom you talk to, this is either the height of programming cleverness or a really awful idea. In any event, this effect is the reason that we warned you about the need to put two equal signs in a row to test for equality. If you use two equal signs to compare two numbers, the result will be Perl's automatic 1 or empty string (true/false) values. If you use only one equal sign, then you have an assignment, whose value is whatever the right hand side of the equal sign worked out to! Here's an example:
    $a = 22;
    if ($a = 15)    # $a gets value 15, which is non-zero, and thus TRUE
    {
        print("The true action\n");
    }
    else            # this case never occurs, since 15 is not FALSE.
    {
        print("The false action\n");
    }
    print("The result is $a\n");    # always prints 15
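To see which values Perl counts as true or false, a small sketch along these lines may help. The square brackets are only there to make the empty string visible in the output.

```perl
# Comparisons produce 1 for true and the empty string for false.
print("3 < 5 gives [", (3 < 5), "]\n");    # prints: 3 < 5 gives [1]
print("3 > 5 gives [", (3 > 5), "]\n");    # prints: 3 > 5 gives []

# The string "0" is false, but "0.0" is a different string, so it is true.
if ("0")
{
    print("\"0\" is true\n");
}
else
{
    print("\"0\" is false\n");             # this branch runs
}
if ("0.0")
{
    print("\"0.0\" is true\n");            # this branch runs
}
```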
The bank example works, but it only does one transaction. It would make more sense to be able to repeat any number of transactions until the user decides to quit. To do this sort of repetition, we will use a loop, so called, because if you were to draw a line showing which statements the Perl program was doing, that line would form a loop as the program comes back to repeat instructions.
The while loop has this general form:

    while (condition)
    {
        action
    }

It works as follows: Perl tests the condition. If it is true, Perl performs the action and then goes back to test the condition again. When the condition tests false, Perl skips the action and continues with the statement after the closing brace.
Here's the bank program with a while loop that will repeat until the user enters Q to quit. We've cut out the details of each individual transaction to save space.

    $balance = 15602.75;    # make up a starting number
    print("Enter transaction code: D)eposit, W)ithdraw, B)alance, Q)uit ");
    chomp($transaction = <STDIN>);
    $transaction = uc($transaction);    # make sure it's uppercase
    while ($transaction ne "Q")
    {
        if ($transaction eq "D")
        {
            # handle deposit
        }
        elsif ($transaction eq "W")
        {
            # handle withdrawal
        }
        elsif ($transaction eq "B")
        {
            print("Your current balance is \$$balance.\n");
        }
        else
        {
            print("Invalid transaction, sorry.\n");
        }
        print("Next transaction: D)eposit, W)ithdraw, B)alance, Q)uit ");
        chomp($transaction = <STDIN>);
        $transaction = uc($transaction);
    }
    print("Thank you for doing business with us.\n");
Notice that we ask the question twice; once before we enter the loop, so that $transaction has a value to test. We ask again at the bottom of the loop so that we'll have our new value when we return to the test at the top. Because a while loop does its testing at the top, it's possible for the loop to do nothing. In the example above, if the user entered Q right away, the condition would test false, and Perl would continue with the last print statement.
Sometimes you want to ensure that a loop happens at least once. For example, if you're asking someone his age, and you want to make sure it's between 5 and 120, you need to get at least one input from the user. The do-while loop will do this for you, because it does its test at the bottom of the loop. Note: the semicolon at the end of the condition is required!
    do
    {
        action
    } while (condition);
In the following example, we don't need to chomp the input, because we're doing numeric comparisons, and the string-to-number conversion will ignore the trailing newline.
    print("Please enter your age: ");
    do
    {
        $age = <STDIN>;
    } while ($age < 5 || $age > 120);
User interface design note: The preceding example isn't ideal; if you continue to enter wrong data, there's no error message that tells you why. Here's a version that is better, though it will require an if to test the condition again:
    print("Please enter your age (between 5 and 120): ");
    do
    {
        $age = <STDIN>;
        if ($age < 5 || $age > 120)
        {
            print("Please use digits to enter a number between\n");
            print("5 and 120 for your age.\n");
        }
    } while ($age < 5 || $age > 120);
Both while and do-while are "open-ended" loops; you normally use them when there's an indeterminate number of trips through the loop. Sometimes, though, you will want to do something a fixed number of times. For example, you might want to find the sum of the squares of the numbers one through ten. We can do it with a while loop:
    $sum = 0;
    $number = 1;
    while ($number < 11)
    {
        $sum = $sum + $number**2;
        $number = $number + 1;
    }
    print("The sum of squares is $sum\n");
But Perl provides a loop designed specifically for counting; the for loop, which has this general model:
    for (initialize; condition; update)
    {
        action;
    }
The preceding example with the sum of squares becomes:
$sum = 0; for ($number = 1; $number < 11; $number = $number + 1) { $sum = $sum + $number ** 2; } print("The sum of squares is $sum\n");
You must separate the initialize, condition, and update with semicolons. Don't use commas; that won't work at all. The update step in a for loop usually uses the ++ shortcut; instead of $number = $number + 1, you will most often see the much more compact $number++.
There's no law that says you have to count by one in the update, of course. If you wanted the sum of squares of all the odd numbers from one to ten, your for would look like this:
for ($number = 1; $number < 11; $number += 2)
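Putting the two update styles side by side, the following sketch computes both sums. The name $odd_sum is introduced here only for the example.

```perl
# Count by one, using the ++ shortcut in the update step.
$sum = 0;
for ($number = 1; $number < 11; $number++) {
    $sum = $sum + $number ** 2;
}
print("Sum of squares from 1 to 10: $sum\n");          # 385

# Count by two, so $number takes the values 1, 3, 5, 7, 9.
$odd_sum = 0;
for ($number = 1; $number < 11; $number += 2) {
    $odd_sum = $odd_sum + $number ** 2;
}
print("Sum of squares of odd numbers: $odd_sum\n");    # 165
```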
There's one other loop construct that we'll look at, but we'll first have to investigate lists and arrays.
To this point, we've had all of our variables stored in scalars, each of which contains one string or number. Sometimes, data comes in related batches. For example, the names of all the people in a company department, or a list of their ID numbers. Now we could create separate scalars:
$name1 = "Fred"; $name2="Teresita"; $name3="Sven"; # etc.
but for any large company, this would become very painful; there'd be no easy way to go through the entire list of names or salaries. To the rescue comes the concept of a list and an array in Perl. Just as we thought of a scalar as a value on a slip of paper that went into an individual family's mailbox (a variable), we can think of a list as a series of slips of paper, each of which has a value written on it. And, instead of a single mailbox, an array is like those mailboxes you see at an apartment complex; there's Royal Apartments [Apartment 1], Royal Apartments [Apartment 2], etc.
To specify a list, you enclose the values in parentheses, separated by commas. Each element can itself refer to a variable or an expression:
("Fred", "Teresita", "Sven", "Harmeet") ( 1725, 240, 165, 458 ) ($i, $i*2, $i*3, $i*4, $i*5)
There's no law that says that all the entries in a list have to be the same kind of data; you can put all the values for names and ID numbers into one list:
("Fred", 1725, "Teresita", 240, "Sven", 165, "Harmeet", 458)
One thing you can't do is have a list inside a list; lists are automatically "flattened." Thus, the following two lists are identical:
( 2, 4, (3, 5, 7), 6) ( 2, 4, 3, 5, 7, 6)
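One way to check the flattening for yourself is sketched below. It borrows two things the text has not covered at this point: array variables (explained later in this section) and the built-in join function, which glues list items together with a separator. The names @nested and @flat are made up for the example.

```perl
# Store each list in an array so we can examine it.
@nested = (2, 4, (3, 5, 7), 6);
@flat   = (2, 4, 3, 5, 7, 6);

# join() builds one string from the items, separated by ", ".
$nested_text = join(", ", @nested);
$flat_text   = join(", ", @flat);

print("$nested_text\n");    # 2, 4, 3, 5, 7, 6
print("$flat_text\n");      # 2, 4, 3, 5, 7, 6
print("Items in the nested list: ", scalar(@nested), "\n");   # 6
```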
If all of your list entries are strings without blanks in them, you can use the qw list constructor. Items are presumed to be separated by whitespace, and don't need quotes. The first example is a very common use. The second example shows something you need to be careful about; even though we want "Phone Number" to be one entry, the whitespace between the words forces qw to split it into two entries.
This list | Is the same as |
---|---|
qw(Monday Tuesday Wednesday Thursday Friday Saturday Sunday) | ("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday") |
qw(Name Date Phone Number) | ("Name", "Date", "Phone", "Number") |
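The whitespace pitfall is easy to confirm by counting the entries. This sketch stores the lists in array variables and uses scalar() and join(), which are either introduced later in the text or not covered by it; the names @fields and @wanted are hypothetical.

```perl
@days   = qw(Monday Tuesday Wednesday Thursday Friday Saturday Sunday);
@fields = qw(Name Date Phone Number);

# qw splits on whitespace, so "Phone" and "Number" are separate entries.
$field_count = scalar(@fields);
print("Days: ", join(", ", @days), "\n");
print("Number of fields: $field_count\n");     # 4, not 3

# If an entry needs a blank in it, use an ordinary quoted list instead.
@wanted = ("Name", "Date", "Phone Number");
$wanted_count = scalar(@wanted);
print("Number of wanted fields: $wanted_count\n");   # 3
```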
You can also specify several entries at a time by using a range of numbers, separated by two periods (..). The first number must not be greater than the second number, and any decimal part of either number is ignored. If the first number is greater than the second number, the result is an empty list.
This list | Is the same as |
---|---|
(1..7) | (1, 2, 3, 4, 5, 6, 7) |
(3, 5..9, 12) | (3, 5, 6, 7, 8, 9, 12) |
(6.8 .. 9.3) | (6, 7, 8, 9) |
(4, 5, 9..7, 6) | (4, 5, 6) |
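The rows of this table can be reproduced with a short sketch. It relies on the built-in join function, which the text does not cover, to turn each list into a printable string; the scalar names are made up for the example.

```perl
# join() turns each list into a string with ", " between the items.
$simple    = join(", ", (1..7));
$mixed     = join(", ", (3, 5..9, 12));
$truncated = join(", ", (6.8 .. 9.3));      # decimal parts are ignored
$backward  = join(", ", (4, 5, 9..7, 6));   # 9..7 contributes nothing

print("$simple\n");       # 1, 2, 3, 4, 5, 6, 7
print("$mixed\n");        # 3, 5, 6, 7, 8, 9, 12
print("$truncated\n");    # 6, 7, 8, 9
print("$backward\n");     # 4, 5, 6
```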
Just as you could put a scalar variable on the left side of an equal sign to assign it a value from the right side, you can put a list of variables on the left side of an equal sign, and assign it a series of values from the right side. If there are fewer items on the left than on the right, the excess items on the right are ignored. If there are fewer items on the right than on the left, the extra variables on the left become undefined:
This | Is the same as |
---|---|
($a, $b) = (3, 7); | $a = 3; $b = 7; |
($a, $b) = (3, 7, 11); | $a = 3; $b = 7; |
($a, $b, $c) = (3, 7); | $a = 3; $b = 7; $c = undef; |
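Here is a sketch of the two uneven cases. It uses the built-in defined function, which the text has not introduced, to test whether a variable received a value; the variable names are chosen only for this illustration.

```perl
# More values than variables: the extra value (11) is ignored.
($first, $second) = (3, 7, 11);
print("first = $first, second = $second\n");    # first = 3, second = 7

# More variables than values: the leftover variable gets no value.
($x, $y, $z) = (3, 7);
if (defined($z)) {
    print("\$z has the value $z\n");
} else {
    print("\$z is undefined\n");                # this branch runs
}
```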
If you assign a list to a scalar, or use a list in a context that requires a scalar, only the last item in the list is used. Thus:
This | Is the same as |
---|---|
$a = (5, 6, 22); | $a = 22; |
$a = 3 + (2, 4, 6); | $a = 3 + 6; |
Just as a number or string becomes much more useful when you have a scalar variable to hold it, so a list is more useful when you put it in an array variable. Scalar variable names are preceded by a $; array variable names are preceded by an at sign @. Here's an example of assigning a list to an array.
@arr = (8.5, 6.4, 12.2, 9);
When you assign a value to a scalar variable, any previous value it might have held is overwritten. In a similar way, when you assign one array to another, the old value of the array is thrown out.
@a = (3, 7); @b = ("x", 5); @a = @b;
@a now contains the list ("x", 5); the old value (3, 7) is gone.
While it is often useful to treat arrays as one huge block of information, we will also need to extract individual elements from the array and work with them. Earlier, we talked about an array as if it were a set of mailboxes at an apartment house. Unlike apartments, which are numbered starting with one, array elements are numbered starting at zero. Here are the individual elements of the array @arr (assigned above as (8.5, 6.4, 12.2, 9)), with their names:
Element | Value |
---|---|
$arr[0] | 8.5 |
$arr[1] | 6.4 |
$arr[2] | 12.2 |
$arr[3] | 9 |
The number of each item, also called the index, is enclosed in square brackets following the array name. You might be surprised that we use a dollar sign in $arr[0]; after all, @arr is an array, isn't it? Well yes, @arr is the entire array. But each individual item is a scalar, so we must indicate that with a $. So, if you want to print the first element (number zero) of the array, you would write print($arr[0]);.
You can use more than just an integer within the square brackets; you can use any Perl expression you like. All of the following will print the third item (index number two) of the array:
print($arr[2]); print($arr[1 + 1]); $i = 2; print $arr[$i]; $j = 4; print $arr[$j - 2];
Note: you can also use a negative number for an array index. $arr[-1] is the last item, $arr[-2] is the next to last item, etc.
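As a quick check of these indexing rules, the sketch below pulls several elements out of @arr by positive, computed, and negative index. The scalar names are made up for the example.

```perl
@arr = (8.5, 6.4, 12.2, 9);

$first        = $arr[0];        # index zero is the first item
$third        = $arr[1 + 1];    # any expression can serve as the index
$last         = $arr[-1];       # negative indexes count from the end
$next_to_last = $arr[-2];

print("First: $first\n");                  # 8.5
print("Third: $third\n");                  # 12.2
print("Last: $last\n");                    # 9
print("Next to last: $next_to_last\n");    # 12.2
```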
for loops and arrays were made for each other. You'll very often want to step through an array one item at a time. For example, let's refer to the lists of employee names and ID numbers that we had earlier, and write a program to print them out, in order.
@name_array = ("Fred", "Teresita", "Sven", "Harmeet");
@id_array = ( 1725, 240, 165, 458 );
for ($i = 0; $i < 4; $i++) {
    print("Employee $name_array[$i] has ID $id_array[$i]\n");
}
Notice that the loop test asks if $i is less than four. This is not a mistake; even though the array has four elements, they are numbered zero through three.
This program works, but it's not very flexible. If we were to add five more employee names and IDs, we'd have to change the 4 in the loop's test to a 9. We'd really like to be able to find the number of items in the array, and use that in the test of the for loop. There are two ways to find the length of an array: scalar(@arr) gives the number of items in the array; $#arr (note: there's no at sign!) gives the index of the last item in the array. Thus, the following are equivalent:
$arr_len = scalar(@arr); $arr_len = $#arr + 1; # add one because index numbers start at zero
Of the two notations, your humble author far prefers the first. We can now change the preceding program to use this new information; the change is in the loop's test, which now uses scalar(@name_array) instead of 4. This depends, of course, on both arrays having exactly the same number of elements.
@name_array = ("Fred", "Teresita", "Sven", "Harmeet"); @id_array = ( 1725, 240, 165, 458 ); for ($i = 0; $i < scalar(@name_array); $i++) { print("$name_array[$i] has ID $id_array[$i]\n"); }
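Since the loop depends on the two arrays being the same length, one possible safeguard is to compare the lengths first. The sketch below does that, and also shows the $#name_array form of the loop test for comparison; the counter $printed is a hypothetical addition used only to record how many lines were printed.

```perl
@name_array = ("Fred", "Teresita", "Sven", "Harmeet");
@id_array   = (1725, 240, 165, 458);
$printed = 0;

if (scalar(@name_array) != scalar(@id_array)) {
    print("The name and ID arrays have different lengths.\n");
} else {
    # $#name_array is the last index (3 here), so the test uses <= instead of <.
    for ($i = 0; $i <= $#name_array; $i++) {
        print("$name_array[$i] has ID $id_array[$i]\n");
        $printed++;
    }
}
```

With the data shown, the lengths match and the loop prints four lines.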
We can also put both names and ID numbers in the same array; the even indexed elements are the names and the odd indexed elements are the ID numbers. If we do this, then we have only one array to deal with, but we must now change the for loop's update to take a step size of two.
@employee = ("Fred", 1725, "Teresita", 240, "Sven", 165, "Harmeet", 458); for ($i = 0; $i < scalar(@employee); $i += 2) { print("$employee[$i] has ID $employee[$i+1]\n"); }
A Perl array expands to fit its data. As you store data into the array's elements, those elements are created. To shrink an array, you set the "last index" number ($#a for an array named @a) to the index of the last element you want to keep, which is one less than the new length. The following code shows how an array expands and contracts; the right column shows the equivalent list.
Statement | Equivalent list |
---|---|
@a = (12, 17, 15, 9); | (12, 17, 15, 9) |
$a[4] = 11; | (12, 17, 15, 9, 11) |
$a[7] = 14; | (12, 17, 15, 9, 11, undef, undef, 14) |
$#a = 2; | (12, 17, 15) |
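To watch the array grow and shrink, the following sketch records the length after each step. The $len_... names are made up for the example, and defined is a built-in function that the text has not introduced.

```perl
@a = (12, 17, 15, 9);
$len_start = scalar(@a);            # 4

$a[4] = 11;                         # storing at index 4 adds one element
$len_after_append = scalar(@a);     # 5

$a[7] = 14;                         # indexes 5 and 6 are created as undef
$len_after_gap = scalar(@a);        # 8
if (defined($a[5])) {
    print("Element 5 has a value\n");
} else {
    print("Element 5 is undefined\n");     # this branch runs
}

$#a = 2;                            # last index 2 means three elements remain
$len_after_shrink = scalar(@a);     # 3

print("$len_start $len_after_append $len_after_gap $len_after_shrink\n");   # 4 5 8 3
```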
If you assign a list to a scalar, you get the last item in the list. If you assign an array to a scalar, you get the length of the array. Note the difference between the two:
@arr = (7, 9, 14);
$x = (7, 9, 14); # same as $x = 14;
$x = @arr; # same as $x = 3;
If you assign a scalar to an array, you get an array with one element in it.
@arr = 15; is the same as @arr = (15);
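The three cases can be compared in one short sketch. The names $last_item, $length, and @single are hypothetical and used only here.

```perl
@arr = (7, 9, 14);

$last_item = (7, 9, 14);    # a list assigned to a scalar: the last item
$length    = @arr;          # an array assigned to a scalar: its length

@single = 15;               # a scalar assigned to an array: a one-element array
$single_len = scalar(@single);

print("Last item of the list: $last_item\n");    # 14
print("Length of the array: $length\n");         # 3
print("Elements in \@single: $single_len\n");    # 1
```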
To get a whole array, you use @array_name. To get an individual element, you use $array_name[index]. Sometimes, though, you'll want to get a subset of the array elements; you do this by specifying an array slice. An array slice is a list of the elements that you specify. You can specify an individual element, a list of elements, or a range of elements. Here are some example slices of this array:
@days = ("Mon", "Tue", "Wed", "Thu", "Fri");
Slice | Equivalent List |
---|---|
@days[3] | ("Thu") |
@days[2..4] | ("Wed", "Thu", "Fri") |
@days[0, 2, 3] | ("Mon", "Wed", "Thu") |
@days[2, 0, 1] | ("Wed", "Mon", "Tue") |
@days[4, 2..3, 0] | ("Fri", "Wed", "Thu", "Mon") |
Important: @days[3] is a list that has one item in it; $days[3] is a scalar value. If you assign either one of them to another scalar or to an array, it won't make a difference; Perl does the right thing with it. However, there are places where there can be a crucial difference between a list context and a scalar context, so you should be careful to write what you actually intended.
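The sketch below copies a few slices of @days into new arrays and prints them. The array names on the left are made up for this example, and join is a built-in function that the text does not cover.

```perl
@days = ("Mon", "Tue", "Wed", "Thu", "Fri");

@midweek  = @days[2..4];            # a range of indexes
@shuffled = @days[4, 2..3, 0];      # indexes may appear in any order

@one_item  = @days[3];              # a one-item list, stored in an array
$one_value = $days[3];              # a single scalar value

print(join(", ", @midweek), "\n");      # Wed, Thu, Fri
print(join(", ", @shuffled), "\n");     # Fri, Wed, Thu, Mon
print("$one_item[0] and $one_value\n"); # Thu and Thu
```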
Slices can also appear on the left hand side of an equal sign; they work pretty much as you expect them to. The values on the right side are assigned to the array entries that you specified in your slice, as shown in these examples with this array:
@arr = (15, 17, 23, 14, 80, 9);
These assignments are done in sequence as if they occurred one after another in a program.
Assignment | New contents of @arr |
---|---|
@arr[2] = (44); | (15, 17, 44, 14, 80, 9) |
@arr[3..5] = (66, 77, 88); | (15, 17, 44, 66, 77, 88) |
@arr[0, 2, 3] = ("zero", "two", "three"); | ("zero", 17, "two", "three", 77, 88) |
@arr[2, 0, 1] = (222, 0, 111); | (0, 111, 222, "three", 77, 88) |
@arr[4, 2..3, 0] = (44, 22, 33, "z"); | ("z", 111, 22, 33, 44, 88) |
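Running the assignments from the table in order, as a sketch, confirms the final contents. The scalars $after_third and $final are hypothetical names, and join is a built-in function that the text does not cover.

```perl
@arr = (15, 17, 23, 14, 80, 9);

@arr[2] = (44);
@arr[3..5] = (66, 77, 88);
@arr[0, 2, 3] = ("zero", "two", "three");
$after_third = join(", ", @arr);

@arr[2, 0, 1] = (222, 0, 111);
@arr[4, 2..3, 0] = (44, 22, 33, "z");
$final = join(", ", @arr);

print("$after_third\n");    # zero, 17, two, three, 77, 88
print("$final\n");          # z, 111, 22, 33, 44, 88
```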
Important: Remember, an array slice is a list, not an array! If you assign a slice to a scalar, the result is the last element in the list. The following code will print the number 42.
@arr = (15, 18, 42, 56, 12); $x = @arr[0..2]; print "$x\n";
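For contrast, this sketch assigns both a slice and a whole array to scalars, then copies the slice into an array before asking for its length. The names $from_slice, $from_array, @copy, and $copy_len are made up for the example.

```perl
@arr = (15, 18, 42, 56, 12);

$from_slice = @arr[0..2];   # a slice is a list: the last element, 42
$from_array = @arr;         # an array: the number of elements, 5

# To count the items in a slice, copy it into an array first.
@copy = @arr[0..2];
$copy_len = @copy;          # 3

print("$from_slice $from_array $copy_len\n");    # 42 5 3
```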