Commentary, 10 December 2000: Anatomy of a Software Mess
This is how it starts. It's 1985, and you're helping with the scorekeeping at amateur wrestling tournaments. You realize how difficult it is to keep track of everything by hand, and you decide to write a program to do the work. There are programs out there already, but they're expensive and not very flexible.
You write a set of programs in Pascal for the Apple II:
This program lets you draw the championship and consolation brackets on screen. As you put each line into the bracket, you add information such as "this line contains the name of the winner of match 17, and it's the top line of match 22." The output of this program is a bracket data file that all the other programs use. [This program solves the flexibility problem -- you're not stuck with only a standard eight, sixteen, or thirty-two person bracket; you can use any kind of bracket that you can describe.]
This program displays the bracket file and lets you enter the names and schools of all the participants in a tournament. This goes into two data files, one for schools and one for people's names.
This program is purposely kept separate from the scorekeeping program; you don't want people who are entering scores to inadvertently change a person's team name.
This is a text file that gives initialization information, such as the names of the weight classes, printer initialization strings, advancement and placement points, number of points awarded for a decision, major decision, or pin, etc.
This program lets you enter scores for all the matches in the tournament. It uses the information in the above data files to keep track of the participants' names and schools, makes sure that winners and losers go to the correct places in the bracket, and keeps track of team scores.
The program prints the bracket sheets, and also prints mailing labels that you stick on the bout sheets and wall charts.
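To make the bracket data file idea concrete, here is a minimal sketch in Python (not the original Pascal). The field names and the record layout are assumptions made up for illustration; the real file format is not described here. Each bracket line records which match feeds it and which match it belongs to, which is enough for a scorekeeping program to move a winner to the right place.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BracketLine:
    """One line of a bracket. Hypothetical layout, not the original file format."""
    line_id: int
    feeds_from: Optional[int]  # match whose winner is written here; None for first-round lines
    match: Optional[int]       # match this line belongs to; None for the champion's line
    position: str              # "top" or "bottom" within that match
    name: str = ""


def record_winner(lines: List[BracketLine], match_no: int, winner: str) -> BracketLine:
    """Write the winner's name on the line fed by match_no and return that line."""
    for line in lines:
        if line.feeds_from == match_no:
            line.name = winner
            return line
    raise ValueError(f"no line is fed by match {match_no}")


# The example from the text: a line holding the winner of match 17 that is
# the top line of match 22. The second line (match 18 feeding the bottom of
# match 22) is an assumption added to round out the example.
lines = [
    BracketLine(line_id=1, feeds_from=17, match=22, position="top"),
    BracketLine(line_id=2, feeds_from=18, match=22, position="bottom"),
]

advanced = record_winner(lines, 17, "A. Wrestler")
print(advanced.match, advanced.position)
```

Because the bracket is just a list of such lines, nothing in the sketch depends on the bracket being a standard eight, sixteen, or thirty-two person shape.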
You use this program successfully at small tournaments. Eventually you're using it to keep track of scores at the Central Coast Section (CCS) tournament, which has a 32-person bracket.
As the PC gains in popularity, you rewrite the program in C. It's about 1,700 lines long, and it uses global variables almost exclusively. Remember, this was written a long time ago. Nonetheless, it works, running on a PC with four megabytes of RAM and ten megabytes of hard disk. No, that is not a misprint.
Time passes, and more requests for additions come in. First, some tournaments are seeded. It would be nice to put all the participant names in a file, mark the ones that are seeded, and have a program that constructs the initial school names and participant name files. So you write the program, and it, too, works nicely.
Some years later, you have to run the Sierra Nevada Classic, which uses a 64-person bracket. You write a special version of the program that can handle that many people. This special version requires two computers so that data entry and printing can occur simultaneously. This lets you keep entering the data without having to wait for the Hewlett-Packard Deskjet Plus printer to complete its printing.
This is where the mess begins. You now fold the Sierra Nevada changes and improvements back into the CCS version, and the source code starts to grow. Every time you add a new function, you find yourself saving copies of global variables into local variables, and restoring them before you exit. The program, in short, is becoming less manageable.
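The save-and-restore habit looks roughly like this. The sketch is in Python rather than the original C, and every name in it (current_weight_class, print_bracket, and so on) is made up for illustration. It only shows the pattern, next to the parameter-passing alternative that avoids it.

```python
# Hypothetical global, standing in for the program's many global variables.
current_weight_class = "103"


def print_bracket():
    """Reads the global, as most of the old functions would have."""
    return f"bracket for {current_weight_class}"


def print_all_brackets_with_globals(weight_classes):
    """Borrow the global, then put it back before returning."""
    global current_weight_class
    saved = current_weight_class          # save a copy in a local
    pages = []
    try:
        for wc in weight_classes:
            current_weight_class = wc     # clobber the global for each page
            pages.append(print_bracket())
    finally:
        current_weight_class = saved      # restore before exit
    return pages


def render_bracket(weight_class):
    """Same output, but the weight class arrives as a parameter."""
    return f"bracket for {weight_class}"


def print_all_brackets(weight_classes):
    """No global is touched, so there is nothing to save or restore."""
    return [render_bracket(wc) for wc in weight_classes]
```

Every new function written the first way has to know which globals it disturbs; forget one restore and some unrelated screen starts showing the wrong weight class.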
Every year after CCS, the big event of the year, you promise yourself that you'll rewrite the program and document it. But you just never find the time. However, the demands for improvement continue to arrive, so you add some utility programs.
Possibly the best program you've ever written. At almost every tournament, coaches come up to you and claim that you've done the scoring wrong. Spending ten minutes printing brackets and tracing through them by hand is no fun, so you write this program that, given a team name, displays every match won by that team along with the number of team points awarded. Now, when coaches ask you questions, you take two minutes to make a printout, give it to them, and they go away. Ninety percent of the time, they don't come back, and if they do, there really is an error. This really is the best program you've ever written.
Entering the names and schools of all the participants one bracket at a time is difficult. You write a program that takes a text file of participant and school names and drops seeded wrestlers into the proper places in the bracket. The remaining wrestlers are placed at random. You then go to the data entry program to make any changes (too many byes in a half-bracket, two people from the same geographic area meeting in the first round, etc.). This, of course, requires you to change the data entry program to make name swapping easier. And the complexity of the project grows.
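The draw step (seeds go to fixed places, everyone else is placed at random) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the original program: the function name, the tuple layout, and the seed-to-slot table are all hypothetical, and the table in the test case simply puts seed 1 on the top line and seed 2 on the bottom line.

```python
import random


def draw(entries, bracket_size, seed_slots, rng=None):
    """Place entries into bracket slots.

    entries:    list of (name, school, seed) tuples; seed is None if unseeded
    seed_slots: dict mapping seed number -> slot index (an assumed table)
    Returns a list of length bracket_size; None marks a bye.
    """
    if len(entries) > bracket_size:
        raise ValueError("more entries than bracket slots")
    rng = rng or random.Random()
    slots = [None] * bracket_size
    unseeded = []
    for name, school, seed in entries:
        if seed is None:
            unseeded.append((name, school))
        else:
            slots[seed_slots[seed]] = (name, school)
    # Shuffle the slots that are still open and hand them out in order.
    open_slots = [i for i, occupant in enumerate(slots) if occupant is None]
    rng.shuffle(open_slots)
    for entry, slot in zip(unseeded, open_slots):
        slots[slot] = entry
    return slots
```

The sketch makes no attempt to balance byes or to keep wrestlers from the same area apart, which is exactly why the manual fix-up pass in the data entry program was still needed.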
The scorekeeping program keeps adding features as well. If you have a sixteen-person bracket and twenty schools, you create what are called out brackets for the extra people. This means you need to use a thirty-two person bracket with lots of empty brackets. You don't want the empty ones to show up, so you add a whole lot of code to make those matches invisible.
Meanwhile, a lot of obsolete code stays around. The very first version for CCS kept track of which league a team was in. The data entry program still creates the LEAGUE.DAT file, even though nobody else uses it any more.
Furthermore, there's a lot of duplication of code among the files. Thus, if you want to change the formatscore function in one file, you'd better make sure it's the same in all of them.
Did I mention documentation? There still isn't any. At least, not in the source files. You haven't written a document that tells anyone else how to use the programs, either.
Fate intervenes. You will be out of town one weekend when a coach is running a tournament. He wants to use the program, and he is sure that he can find someone to do all the data entry. This forces you to write a user document. The tournament goes well, so it seems you've written something that is generally usable.
You learn Perl, and find it useful for writing small programs that help CCS run more smoothly. At the end of the first day you can print a list of all the people who will be in semifinals the next day. You can now enter all the names and teams by team, and a Perl program will split them up into the weight classes for input to the draw plan program. Another script will take the databases and output HTML so that you can put results on the web.
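The split-by-weight-class step is a simple grouping job. The original scripts were Perl; the sketch below is Python, and the input format (one "school,weight,name" line per wrestler, entered team by team) is an assumption made for the example.

```python
from collections import defaultdict


def split_by_weight(lines):
    """Group team-ordered entry lines into per-weight-class lists.

    Each line is assumed to look like "school,weight,name".
    Returns a dict mapping weight class -> list of (name, school).
    """
    by_weight = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between teams
        school, weight, name = [field.strip() for field in line.split(",", 2)]
        by_weight[weight].append((name, school))
    return dict(by_weight)


# Hypothetical input, entered one team at a time.
roster = [
    "North,103,Able",
    "North,112,Baker",
    "",
    "South,103,Cole",
]
classes = split_by_weight(roster)
print(sorted(classes))
```

Each per-weight list can then be written out as the text file that the draw program reads.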
More and more code accumulates, but, amazingly, this is all still running on the same four-megabyte PC. Finally, though, the chickens come home to roost. You are doing scorekeeping for Oak Grove's Blossom Hill tournament, which uses a sixteen-person bracket. You copy the code from the thirty-two person CCS version, since it has the latest and greatest features. You copy the bracket and data files from the Blossom Valley Athletic League (BVAL) tournament, which is a sixteen-person tournament. And they don't work together. Some frantic investigation shows that:
The first two problems are fixed easily; you simply copy over the code from BVAL so that the code is in sync with the data. The third problem requires you to make some fast code changes; again, undocumented, since you have to start the tournament on time.
Congratulations; you now have your mess! There are now two distinct versions of the program (CCS and BVAL), with a major variant of the BVAL version rapidly heading toward becoming a third.
The tournament goes well, and after it's all over, you promise yourself that you're going to do a total rewrite; perhaps converting it all to Perl and releasing it under the GPL. There's not enough time to do this before CCS in February, but after that, well, this year for sure.
- o -
And that's how I spent my weekend. If you want to see the code in question, last year's CCS version is available as a ZIP file download.