Testing large software with Perl
Erwan Lemonnier
Swedish Premium Pension Authority (PPM)
$Revision: 1.4 $
The bottom line
- Testing is good
- Testing is easy (TAP, Test::More...)
- The CPAN way of testing modules scales well for 1 module
- But real life is more complex
Let's talk about test...
Menu
- When TAP is not enough...
- Building your own smoke suite
- Testing database driven code
When TAP is not enough...
A real life example:
Pluto
The core platform of the Swedish Premium Pension Authority
Pluto, as of 2007-04 (1/2)
- 5.5 million users
- 700+ funds, many currencies
- 250 000 000 000 SEK (23 billion euros)
- 320 000 lines of Perl code (.pl, .pm, .t)
- 120 000 lines of SQL, shell and HTML
- 230 database tables
- 500 000 000 rows in the largest table
- 750 gigabytes of data in an Oracle database
- Up to 30 developers over 7 years
Pluto (2/2)
- as of 2004-04:
- less than 10 test scripts
- 0 unit tests
- tests are manual
- as of 2007-04:
- automated TAP tests
- 430 test files
- 10 000 unit tests
- nightly builds in multiple code branches
- 30 test modules
What we learned:
- To test a large/complex system, you need a test framework that fits your specific system
- Make complex tests easy to write
- Slow start, but the unit-test count grows exponentially
Building a test framework
Writing your own test modules
With Test::Builder:
package My::Test;
use base 'Exporter';
our @EXPORT = qw( is_one );
use Test::Builder;
my $test = Test::Builder->new();
sub is_one {
    # report one unit test, just like Test::More's ok()
    $test->ok($_[0] =~ /^1$/, "is one");
}
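A test file can then use the new function next to Test::More, since both talk to the same Test::Builder object underneath (a minimal sketch; the file name and plan are made up):

#!/usr/bin/perl
# t/one.t
use Test::More tests => 2;
use My::Test;

is_one(1);        # ok 1 - is one
ok(2 == 1 + 1);   # ok 2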
Writing your own smoker 1/3
use Test::Harness::Straps;

my $straps = Test::Harness::Straps->new();
foreach my $file (@testfiles) {
    my @output  = run_file($file);   # run_file() is sketched below
    my %results = $straps->analyze($file, \@output);
}
- %results has the keys:
- passing, exit, max, seen, ok, skip, skip_all, todo...
- details: details about each individual unit test
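run_file() is not part of Test::Harness::Straps. One way to write it is to run the test file with the current perl and capture its TAP output, one array element per line, as analyze() expects (a minimal sketch):

sub run_file {
    my $file = shift;
    my @output = `$^X $file 2>&1`;   # $^X = the running perl binary
    chomp @output;
    return @output;
}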
Writing your own smoker 2/3
Optionally:
- Generate a test report
- Mail results
- Store test results in a database (sketched below)
- GUI toward the test database (with nice stats)
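Storing one row per test file per run makes trends easy to graph later. A sketch using DBI; the smoke_result table layout is invented:

use DBI;

my $dbh = DBI->connect('dbi:Oracle:TESTDB', $user, $pass,
                       { RaiseError => 1 });
# hypothetical table: smoke_result(run_date, file, passing, seen, ok)
$dbh->do(
    'INSERT INTO smoke_result (run_date, file, passing, seen, ok)
     VALUES (SYSDATE, ?, ?, ?, ?)',
    undef,
    $file, $results{passing}, $results{seen}, $results{ok},
);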
Writing your own smoker 3/3
Give it a command line interface:
plutosmoke.pl [-h] [-a] [-v] [-r] [-c] [-p {name}]
              {path1} {path2}...
-a         run all tests
-p {name}  run all tests in a sub project
-c         make coverage statistics
-r         run test files in random order
-v         verbose
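Parsing those flags is a few lines of Getopt::Long (a sketch matching the interface above):

use Getopt::Long;

GetOptions(
    'h'   => \my $help,       # print usage
    'a'   => \my $run_all,    # run all tests
    'v'   => \my $verbose,
    'r'   => \my $random,     # run test files in random order
    'c'   => \my $coverage,   # collect coverage statistics
    'p=s' => \my $project,    # run all tests in one sub project
) or die "usage: plutosmoke.pl [-h] [-a] [-v] [-r] [-c] [-p name] paths...\n";

my @testfiles = @ARGV;        # remaining arguments are paths/files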
The next steps
- Created a cronjob: check out all the code and run the smoke program
- Every day, in every branch...
- Continuous integration server?
Organizing test files 1/2
- Running one test file may require configuration of external components:
- Cleaning up a test database
- Filling database with default data
- Starting daemons
- You can do it explicitly from the test code...
Organizing test files 2/2
- ...or let your smoke program do it
- Put metadata in your test files (parsing sketched below):
# @author your name
# @description whatever this test file does
# @system pluto
# @function base
# @function maths
# @option need_clean_db
(Hi Claes!)
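The smoke program reads these headers before running each file and sets up whatever the options ask for. A minimal parsing sketch, assuming the metadata sits in the leading comment block (clean_test_db() is hypothetical):

# Collect '# @key value' lines; keys like @function may repeat.
sub parse_metadata {
    my $file = shift;
    my %meta;
    open(my $fh, '<', $file) or die "open $file: $!";
    while (my $line = <$fh>) {
        last if $line !~ /^#/;               # end of the comment block
        push @{ $meta{$1} }, $2
            if $line =~ /^#\s*\@(\w+)\s+(.*\S)/;
    }
    close($fh);
    return %meta;
}

my %meta = parse_metadata($file);
clean_test_db() if grep { $_ eq 'need_clean_db' } @{ $meta{option} || [] };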
Testing database driven code
Testing with a database
- Code runs against one database
- The database is too complex to be mocked
- The database is too complex to fill with minimal test data
- The database is too large to be dumped beside your test file and loaded into a test database on each run
A Solution
- Dump a small part of the production database into a storable format (a tarball) containing just the data you want to test
- Check this tarball in together with the test file that uses it
- Each time the test runs, it fills a test database with the data from the tarball
Example
- Pluto: the central object in the database is a person
- A person has funds, transactions, payments, private information, etc.
- Funds have prices, currencies, etc.
Partial database dumps 1/2
- Dump/extract one person from production database
- Something like:
foreach my $table (@list) {
    $dump->add_rows(from         => $table,
                    where_person => $person);
}
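add_rows() itself can be a thin wrapper around DBI that selects the matching rows and keeps them in memory, one list of hashrefs per table (a hypothetical sketch; it assumes every dumped table has a person_id column):

sub add_rows {
    my ($self, %args) = @_;
    my $rows = $self->{dbh}->selectall_arrayref(
        "SELECT * FROM $args{from} WHERE person_id = ?",
        { Slice => {} },           # one hashref per row
        $args{where_person},
    );
    push @{ $self->{data}{ $args{from} } }, @$rows;
}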
Partial database dumps 2/2
- Add fund/currency prices relevant for this person
foreach my $fund (@funds_owned_by_person) {
    $dump->add_rows(from       => 'fund_price',
                    where_fund => $fund);
}
The dump format
- A text file, compressed: *.tgz
- 'print Dumper($dump)' output (Data::Dumper calls the top variable $VAR1):
$VAR1 = {
    table1 => [ { column1 => ..., # row 1
                  column2 => ...,
                },
                { column1 => ..., # row 2
                  column2 => ...,
                },
              ],
    ...
};
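Producing that file is little more than Data::Dumper plus compression (a sketch; the file naming is invented):

use Data::Dumper;

open(my $fh, '>', "$name.dump") or die "write $name.dump: $!";
print $fh Dumper($dump);       # emits '$VAR1 = {...};'
close($fh);
system('tar', 'czf', "$name.tgz", "$name.dump") == 0
    or die "tar failed: $?";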
To read the dump back
- Uncompress *.tgz
- Use eval:
open(FILE, $path) or die $!;
my $data = join("", <FILE>);   # slurp the whole dump
close(FILE);
my $VAR1;
eval $data;                    # sets $VAR1 to the dumped structure
Command line tools
prsdump.pl [-v] [-h]
[-d {database} -u {user} -p {pass}]
{personid}
[--fund] [--period yyyy-mm-dd:YYYY-MM-DD...]
[-f {filename}]
prsinject.pl [-v] [-h]
[-d {database} -u {user} -p {pass}]
{filename}
Module for injecting
use Test::Pluto::PrsDump;
inject_person(prsid => $person_id);
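Internally, injecting boils down to replaying every dumped row as an INSERT (a hypothetical sketch of what inject_person() could do, given $dump and a $dbh connected to the test database):

foreach my $table (keys %$dump) {
    foreach my $row (@{ $dump->{$table} }) {
        my @cols = sort keys %$row;
        my $sql  = sprintf("INSERT INTO %s (%s) VALUES (%s)",
                           $table,
                           join(',', @cols),
                           join(',', ('?') x @cols));
        $dbh->do($sql, undef, @{$row}{@cols});
    }
}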
Bugs are now reproducible!
An error occurs against one database!
Bugs are now reproducible!
Dump the guilty person from this database
$ prsdump.pl -d FSTA1 195402061234
-> connect to database [FSTA1] as user [erwlem]
-> dumping general data about [195402061234]
-> dumping aggregated tables
-> storing into file [195402061234.20070425-1202.FSTA1.dmpprs]
-> done.
Bugs are now reproducible!
Write a short test sequence:
use Test::Pluto::PrsDump;
use My::Faulty::Module;
inject_person(prsid => '195402061234');
my_faulty_sub(); # expect a crash here
Problems
- Database constraints
- Large dump = slow to inject = slow test
- Updating checksum tables
Remaining issues
- Testing parallel processes synchronized by semaphores
- Test coverage
- Continuous integration
- Test result GUI and reports
Conclusion
- It took 2.5 years to get a test framework up and running for Pluto
- Over those years, the number of unit tests has grown exponentially
- Now we can at last start to refactor in peace...
Questions?
Thank you!