Testing large software with Perl
Erwan Lemonnier
Swedish Premium Pension Authority (PPM)
$Revision: 1.4 $
The bottom line
- Testing is good
- Testing is easy (TAP, Test::More...)
- The CPAN way of testing modules scales well for 1 module
- But real life is more complex
Let's talk about test...
Menu
- When TAP is not enough...
- Building your own smoke suite
- Testing database driven code
When TAP is not enough...
A real life example:
Pluto
The core platform of the Swedish Premium Pension Authority
Pluto, as of 2007-04 (1/2)
- 5.5 million users
- 700+ funds, many currencies
- 250 000 000 000 SEK (23 billion euros)
- 320 000 lines of Perl code (.pl, .pm, .t)
- 120 000 lines of SQL, shell and HTML
- 230 database tables
- 500 000 000 rows in the largest table
- 750 gigabytes of data in an Oracle database
- Up to 30 developers over 7 years
Pluto (2/2)
- as of 2004-04:
- less than 10 test scripts
- 0 unit tests
- tests are manual
- as of 2007-04:
- automated TAP tests
- 430 test files
- 10 000 unit tests
- nightly builds in multiple code branches
- 30 test modules
What we learned:
- To test a large/complex system, you need a test framework that fits your specific system
- Make complex tests easy to write
- Slow start, but the unit-test count grows exponentially
Building a test framework
Writing your own test modules
With Test::Builder:
package My::Test;
use base 'Exporter';
our @EXPORT = qw( is_one );
use Test::Builder;
my $test = Test::Builder->new();
sub is_one {
    # report one unit test, just like Test::More's ok()
    $test->ok($_[0] =~ /^1$/, "is one");
}
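A test file can then use the new function next to Test::More, since both talk to the same Test::Builder object underneath (a minimal sketch; the file name and plan are made up):

#!/usr/bin/perl
# t/one.t
use Test::More tests => 2;
use My::Test;

is_one(1);        # ok 1 - is one
ok(2 == 1 + 1);   # ok 2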
Writing your own smoker 1/3
use Test::Harness::Straps;

my $straps = Test::Harness::Straps->new();
foreach my $file (@testfiles) {
    my @output  = run_file($file);   # run_file() is sketched below
    my %results = $straps->analyze($file, \@output);
}
- %results has the keys:
- passing, exit, max, seen, ok, skip, skip_all, todo...
- details: details about each individual unit test
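run_file() is not part of Test::Harness::Straps. One way to write it is to run the test file with the current perl and capture its TAP output, one array element per line, as analyze() expects (a minimal sketch):

sub run_file {
    my $file = shift;
    my @output = `$^X $file 2>&1`;   # $^X = the running perl binary
    chomp @output;
    return @output;
}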
Writing your own smoker 2/3
Optionally:
- Generate a test report
- Mail results
- Store test results in a database (sketched below)
- GUI toward the test database (with nice stats)
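Storing one row per test file per run makes trends easy to graph later. A sketch using DBI; the smoke_result table layout is invented:

use DBI;

my $dbh = DBI->connect('dbi:Oracle:TESTDB', $user, $pass,
                       { RaiseError => 1 });
# hypothetical table: smoke_result(run_date, file, passing, seen, ok)
$dbh->do(
    'INSERT INTO smoke_result (run_date, file, passing, seen, ok)
     VALUES (SYSDATE, ?, ?, ?, ?)',
    undef,
    $file, $results{passing}, $results{seen}, $results{ok},
);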
Writing your own smoker 3/3
Give it a command line interface:
plutosmoke.pl [-h] [-a] [-v] [-r] [-c] [-p {name}]
              {path1} {path2}...
-a         run all tests
-p {name}  run all tests in a sub project
-c         make coverage statistics
-r         run test files in random order
-v         verbose
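Parsing those flags is a few lines of Getopt::Long (a sketch matching the interface above):

use Getopt::Long;

GetOptions(
    'h'   => \my $help,       # print usage
    'a'   => \my $run_all,    # run all tests
    'v'   => \my $verbose,
    'r'   => \my $random,     # run test files in random order
    'c'   => \my $coverage,   # collect coverage statistics
    'p=s' => \my $project,    # run all tests in one sub project
) or die "usage: plutosmoke.pl [-h] [-a] [-v] [-r] [-c] [-p name] paths...\n";

my @testfiles = @ARGV;        # remaining arguments are paths/files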
The next steps
- Created a cronjob: check out all the code and run the smoke program
- Every day, in every branch...
- Continuous integration server?
Organizing test files 1/2
- Running one test file may require configuration of external components:
- Cleaning up a test database
- Filling database with default data
- Starting daemons
- You can do it explicitly from the test code...
Organizing test files 2/2
- ...or let your smoke program do it
- Put metadata in your test files (parsing sketched below):
# @author your name
# @description whatever this test file does
# @system pluto
# @function base
# @function maths
# @option need_clean_db
(Hi Claes!)
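The smoke program reads these headers before running each file and sets up whatever the options ask for. A minimal parsing sketch, assuming the metadata sits in the leading comment block (clean_test_db() is hypothetical):

# Collect '# @key value' lines; keys like @function may repeat.
sub parse_metadata {
    my $file = shift;
    my %meta;
    open(my $fh, '<', $file) or die "open $file: $!";
    while (my $line = <$fh>) {
        last if $line !~ /^#/;               # end of the comment block
        push @{ $meta{$1} }, $2
            if $line =~ /^#\s*\@(\w+)\s+(.*\S)/;
    }
    close($fh);
    return %meta;
}

my %meta = parse_metadata($file);
clean_test_db() if grep { $_ eq 'need_clean_db' } @{ $meta{option} || [] };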
Testing database driven code
Testing with a database
- Code runs against one database
- The database is too complex to be mocked
- The database is too complex to fill with minimal test data
- The database is too large to be dumped beside your test file and loaded into a test database on each run
A Solution
- Dump a small part of the production database into a storable format (a tarball) containing just the data you want to test
- Check this tarball in together with the test file that uses it
- Each time the test runs, it fills a test database with the data from the tarball
Example
- Pluto: the central object in the database is a person
- A person has funds, transactions, payments, private information, etc.
- Funds have prices, currencies, etc.
Partial database dumps 1/2
- Dump/extract one person from production database
- Something like:
foreach my $table (@list) {
    $dump->add_rows(from         => $table,
                    where_person => $person);
}
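add_rows() itself can be a thin wrapper around DBI that selects the matching rows and keeps them in memory, one list of hashrefs per table (a hypothetical sketch; it assumes every dumped table has a person_id column):

sub add_rows {
    my ($self, %args) = @_;
    my $rows = $self->{dbh}->selectall_arrayref(
        "SELECT * FROM $args{from} WHERE person_id = ?",
        { Slice => {} },           # one hashref per row
        $args{where_person},
    );
    push @{ $self->{data}{ $args{from} } }, @$rows;
}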
Partial database dumps 2/2
- Add fund/currency prices relevant for this person
foreach my $fund (@funds_owned_by_person) {
    $dump->add_rows(from       => 'fund_price',
                    where_fund => $fund);
}
The dump format
- A text file, compressed: *.tgz
- 'print Dumper($dump)' output (Data::Dumper calls the top variable $VAR1):
$VAR1 = {
    table1 => [ { column1 => ..., # row 1
                  column2 => ...,
                },
                { column1 => ..., # row 2
                  column2 => ...,
                },
              ],
    ...
};
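Producing that file is little more than Data::Dumper plus compression (a sketch; the file naming is invented):

use Data::Dumper;

open(my $fh, '>', "$name.dump") or die "write $name.dump: $!";
print $fh Dumper($dump);       # emits '$VAR1 = {...};'
close($fh);
system('tar', 'czf', "$name.tgz", "$name.dump") == 0
    or die "tar failed: $?";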
To read the dump back
- Uncompress *.tgz
- Use eval:
open(FILE, $path) or die $!;
my $data = join("", <FILE>);   # slurp the whole dump
close(FILE);
my $VAR1;
eval $data;                    # sets $VAR1 to the dumped structure
Command line tools
prsdump.pl [-v] [-h]
[-d {database} -u {user} -p {pass}]
{personid}
[--fund] [--period yyyy-mm-dd:YYYY-MM-DD...]
[-f {filename}]
prsinject.pl [-v] [-h]
[-d {database} -u {user} -p {pass}]
{filename}
Module for injecting
use Test::Pluto::PrsDump;
inject_person(prsid => $person_id);
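Internally, injecting boils down to replaying every dumped row as an INSERT (a hypothetical sketch of what inject_person() could do, given $dump and a $dbh connected to the test database):

foreach my $table (keys %$dump) {
    foreach my $row (@{ $dump->{$table} }) {
        my @cols = sort keys %$row;
        my $sql  = sprintf("INSERT INTO %s (%s) VALUES (%s)",
                           $table,
                           join(',', @cols),
                           join(',', ('?') x @cols));
        $dbh->do($sql, undef, @{$row}{@cols});
    }
}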
Bugs are now reproducible!
An error occurs against one database!
Bugs are now reproducible!
Dump the guilty person from this database
$ prsdump.pl -d FSTA1 195402061234
-> connect to database [FSTA1] as user [erwlem]
-> dumping general data about [195402061234]
-> dumping aggregated tables
-> storing into file [195402061234.20070425-1202.FSTA1.dmpprs]
-> done.
Bugs are now reproducible!
Write a short test sequence:
use Test::Pluto::PrsDump;
use My::Faulty::Module;
inject_person(prsid => '195402061234');
my_faulty_sub(); # expect a crash here
Problems
- Database constraints
- Large dump = slow to inject = slow test
- Updating checksum tables
Remaining issues
- Testing parallel processes synchronized by semaphores
- Test coverage
- Continuous integration
- Test result GUI and reports
Conclusion
- It took 2.5 years to get a test framework up and running for Pluto
- Over those years, the number of unit tests has grown exponentially
- Now we can at last start to refactor in peace...
Questions?
Thank you!