CIT 042 Index > Databases and Perl

Databases and Perl

A Short guide to DBI (The Perl Database Interface Module) summarizes the mechanics of how the DBI is set up. These notes will go over a complete example, creating tables in a database with a command-line file and then accessing those tables from a CGI script.

The Data

The input data is a list of California USA Wrestling tournaments (yes, it’s real data). Each line in the file contains as tab-separated values the event date, the sponsoring organization, location, street address, city, and zip code. State isn’t necessary; everything happens in California.

Creating Tables - the pseudocode

Here’s the way most database-filler programs work. In our case, thre is only one table to deal with, because I really didn’t want to deal with a fully relational database with multiple tables.

  1. Connect to the database.
  2. For each table:
    1. Drop the table if it exists (make sure old data doesn’t stick around).
    2. Create the table and its fields.
    3. Read the file and fill the table.

Connecting to a Database

Perl programs that use DBI must connect to the database through a database handle, which is established as follows:

use DBI;
use strict;
use CGI qw(:standard -debug);
use CGI::Carp qw(fatalsToBrowser);

my $dbh;	# database "handle"

$dbh = DBI->connect('DBI:database_driver:database_name',
	username, password)
	or die "Couldn't connect to database: " . DBI->errstr;

To do statements that have no user-supplied information, you can use the do function, which takes an SQL statement as its parameter and returns the number of rows affected in the database. If there is an error, the return value is 0. If n rows were affected, do returns 0E0, which evaluates to zero numerically but is treated as “true” by Perl. Here’s a statement which drops a table if it already exists.

my $stmt = "drop table if exists tablename";
my $result = $dbh->do( $stmt ) or
	die("Can't execute $stmt: $dbh->errstr\n");

At this point, we can create the new table. Here’s the general model for creating a table:

$stmt = "create table table_name
	(column_name type,
	 column_name type ... )
	 PRIMARY KEY (column_name_list)";

Among the more popular column types are:

TypeDescription
INT A number between -2147483648 to 2147483647. If you add the attribute AUTO_INCREMENT, each row in the table will get a unique value.
DOUBLE A floating point number.
DECIMAL(width,dec) A decimal number with a display width and dec decimals.
DATE A date in the form yyyy-mm-dd
CHAR(length) A character field of the given length, padded on the right with spaces. (max. length 255)
VARCHAR(length) A variable-length string of maximum length (up to 255).
TEXT A variable-length text field with a maximum length of 65535 characters.

The Program So Far

Here’s what the calendar-maker program looks like so far after dropping/creating the table:

#!/usr/bin/perl

use DBI;
use strict;
use CGI qw(:standard -debug);
use CGI::Carp qw(fatalsToBrowser);

my $dbh;
my $stmt;
my $result;

print "Content-type: text/plain\n\n";

print "About to connect...\n";

$dbh = DBI->connect( "DBI:mysql:cit042", "cit042", "cit042" )
    or die("Couldn't connect to database: " . DBI->errstr);
 
print "About to drop old table...\n";  
$stmt = "drop table if exists calendar";
$result = $dbh->do( $stmt ) or die("Cannot drop table: " . $dbh->errstr);

print "About to create table...\n";
$stmt = <<"CREATE_TABLE";
CREATE TABLE calendar (
    id INTEGER AUTO_INCREMENT,
    date DATE,
    association CHAR(8),
    location VARCHAR(60),
    address VARCHAR(60),
    city VARCHAR(30),
    zip CHAR(5),
    PRIMARY KEY ( id )
)
CREATE_TABLE

$result = $dbh->do( $stmt ) or die("Cannot create table: " . $dbh->errstr);

Creating Tables from the Input File

At this point, we will read the input file (redirected to STDIN because I’m being lazy) and split it into fields. To insert values, you could use a statement like the following, but it would be a bad thing to do.

$stmt =
"INSERT INTO calendar ( date, association, location, address, city, zip)
VALUES ('$date', '$assn', '$location', '$address', '$city', '$zip')"

If any of the values contain a semicolon or a single quote, you will get a MySQL syntax error when you do the statement at best; at worst, you could have a valid MySQL statement that does something completely unintended. Further, every time you go through the loop, the database driver has to re-parse the statement.

Instead, you should use the prepare() function, which “pre-compiles” the statement for later execution. Before entering the while loop, we do the following, where each ? stands for a value that will be supplied when the statement is executed. The variable $stmth is used as the statement handle for the parepared statement.

$stmt = <<"INSERT_STMT";
INSERT INTO calendar ( date, association, location, address,
	city, zip) VALUES (?, ?, ?, ?, ?, ?)
INSERT_STMT

$stmth = $dbh->prepare( $stmt );

Inside the loop, you will execute the statement via its statement handle using the execute() function. You must give this function exactly as many parameters as you had question marks in the statement you are about to execute. When you finish the loop, you should use the finish function to release the resources that the statement handle uses. Here is the remaining code in the program:

while (<STDIN>)
{
    chomp;
    ($date, $assn, $location, $addr, $city, $state) =
        split(/\t/);
    $result = $stmth->execute( $date, $assn, $location,
        $addr, $city, $state );
	print "Added $date $assn $location\n";
    if (!$result)
    {
        die("Could not add: " . $stmth->errstr );
    }
}
$stmth->finish;

$rc = $dbh->disconnect;
print "Table created.\n";

Note: instead of using individual variables, we could just as easily have split into an array and used it as the argument to execute() as follows:

@data = split(/\t/);
$result = $stmth->execute( @data );

Now you will upload this CGI script to the server and run it by typing in its URL. After the CGI script runs, you will have the calendar table in the cit042 database.

Retrieving the Data

At this point, we need an HTML form that connects to a CGI script. There’s no law that says you have to make every field searchable; in this case there is no real need to search by address or location.

Once the data gets to the CGI script, we will check it for validity; if the date formats are wrong or the fields are all empty/blank, we will give an error message. If there is data, we will connect to the database and issue a SELECT query, which has the general form:

SELECT expr1, expr2..
	FROM table1, table2...
	WHERE criteria

The expression list will be the list of column names, and there’s only one table, but the fun begins with the WHERE clause. This will change depending on what the user enters, so the last part of the statement must be built dynamically.

# Search by city only
WHERE city LIKE ?

#Search by ending date and association
WHERE date <= ? AND association = ?

There are probably a thousand ways to do this. I do it by creating two arrays as I go through the form input. The first array contains the things to look for (this will become the @data array passed to execute), and the second array contains the sub-clauses which will eventually get handed off to:

$where_clause = join(" AND ", @subclauses)

However, none of that is germane to the issue. Presume that the correct SELECT statement has been prepared and executed. You retrieve the results, if any, through the statement handle.

$rows_found = $sth->rows;

#
#	You may retrieve rows as an array...
#
while (@row = $sth->fetchrow_array)
{
	print "$row[0] $row[1]...\n";
}

#
#	or as an array reference...
#
while ($row_ref = $sth->fetchrow_arrayref)
{
	print $row_ref->[0], " ", $row_ref->[1] ...\n";
}

You can see the entire script here.