I am a big fan of Test Driven Development (TDD) and tools like Hudson/Jenkins to automate the process of having a continuous integration build system are key.

On my current project we recently started moving things to Amazon EC2, and rather than put everything on one big server, I thought I’d follow the best practices in cloud computing and make a number of small special purpose servers to take care of the project’s needs.

We’ve had a Jenkins server running for a bit, so rather than reinventing the wheel, I figured I could copy my Jenkins configuration to a new server and get things up and running.

I fired up a new Tomcat server on Amazon Elastic Beanstalk, and loaded up the Jenkins WAR file, which quickly got me to a working Jenkins server. This project is written in PHP, so I had to install PHP after that, which meant logging in to the server and running through the whole PHP and PHPUnit setup.

Once that was done, I scp’d the Jenkins folders from the old server, edited the Tomcat startup files to include the environment variable to point Jenkins to the right place, changed a few permissions, and everything appeared to be working.

I could log in, I fired up the build, and it appeared to be running – very cool.

But that was when the flaw in the design of the PHP unit tests was exposed ….

I was watching the output of the phpunit tests, and noticed two things:

  1. The tests seemed to be taking a really long time
  2. Every test was failing

Watching the console, each time a test would fail, the little “E” would print, then a few seconds would go by and another “E” would appear. Finally after many minutes (because we have a LOT of classes to test) the error output appeared, and looked something like this for EVERY test:

And of course there were 5297 of these … I did some Google searches for the PHP_Invoker_TimeoutException which mostly pointed to issues with upgrade from one version of PHPUnit to another, but the versions on the old server and this one were the same.

So my next step was debugging an individual tests. Running the test from the command line gave me the same error, odd. But then I ran the test using php instead of the phpunit call, and found the problem: I was getting a timeout trying to open a database connection.

The issue as it turns out, is a design flaw in our code that hadn’t showed up before: all the classes invoke a database connection class that sets up the connection to the database as soon as they are loaded.

Since the Elastic Beanstalk server was in a different security group than was allowed to connect to the RDS database, it was unable to connect at all, and PHPUnit would simply timeout before the connection failed (by default phpunit sets 1 second as the acceptable for a test to run in order to catch endless loops).

Now in theory our tests shouldn’t be hitting the database (at least not for these unit tests since we don’t want them updating anything on the backend), so this problem turned out to be very fortuitous. Because the Jenkins server couldn’t reach the database, it exposed a flaw in our unit tests: we weren’t mocking all the things we needed to, so the tests were actually opening connections to the database.

With some refactoring of the test classes to mock the database access layer, the tests all succeeded. Next we’ll need to do the actual DBUnit tests for the database, and Selenium or HTTPUnit tests for all the front-end and AJAX stuff.