Sunday, 1 June 2014

How to run Selenium tests in parallel with xUnit

I’m a big fan of automated testing. Unfortunately, end-to-end tests, like the ones that use Selenium, have always been a problem. They are slow, and their total execution time very quickly goes beyond what most teams would accept as a reasonable feedback cycle. On my current project our full CI build takes nearly 2 hours. Every project of significant size I know of has this problem; some of them have 4h+ builds. Assuming you have only one test environment, that means you can test at most 4 changes a day. Not a lot when daily deployments to Production are the ultimate goal. There are many solutions to this problem, but most of them are based on splitting the list of tests into smaller subsets and executing them on separate machines (e.g. Selenium Grid). This works, but in my experience it has two major drawbacks:

  • it is complex to setup and gather results
  • it can’t be replicated on a single machine, which means
    • I can’t easily debug tests that fail only because they are run together with some other tests
    • I can’t quickly run all tests locally to make sure my changes do not break anything

Another added benefit of running tests in parallel is that it becomes a very basic load test. This isn’t going to be good enough for sites that need to handle thousands of concurrent users, but it might be just what you need when the number of users is in the low hundreds.

In the past I used MbUnit with VS 2010 and it worked great. I was able to run 20 Firefox browsers on a single VM. Unfortunately, the project seems to be abandoned and it doesn’t work with VS 2013. Until recently, none of the major testing frameworks supported running tests in parallel. As far as I remember, even xUnit 2.0 initially wasn’t meant to support this feature, but that has changed and now it does.

So where is the catch? After spending some time with the xUnit code, my impression is that the core of xUnit and the newly added parallelism were built with unit testing in mind. There is nothing wrong with this approach and I fully understand why it was done this way, but it makes running tests that rely on limited resources (e.g. browser windows) challenging.

The rest of this blog post focuses on how to overcome those challenges. A sample app that implements most of the described ideas is available on GitHub: https://github.com/pawelpabich/SeleniumWithXunit. The app is based on a custom fork of xUnit: https://github.com/pawelpabich/xunit.

How to run tests in parallel

To execute the tests in the sample app in parallel in VS 2013 (tested with Update 2), you need to install the xUnit runner for Visual Studio, and Firefox. Once this is done, compile the app, open Test Explorer (TEST –> Windows), select all tests and run them. You should see two browsers open at more or less the same time.
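For illustration, here is a minimal sketch (not the actual code from the sample app) of two such tests. Because they live in separate classes they end up in separate test collections and can run at the same time; the URLs are made up.

using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Firefox;
using Xunit;

// Two classes -> two test collections under the default collection-per-class
// behaviour, so the tests below can run in parallel.
public class HomePageTests : IDisposable
{
    private readonly IWebDriver _browser = new FirefoxDriver();

    [Fact]
    public void Home_page_has_a_title()
    {
        _browser.Navigate().GoToUrl("http://localhost:8080/"); // made-up URL
        Assert.False(string.IsNullOrEmpty(_browser.Title));
    }

    public void Dispose()
    {
        _browser.Quit();
    }
}

public class AboutPageTests : IDisposable
{
    private readonly IWebDriver _browser = new FirefoxDriver();

    [Fact]
    public void About_page_has_a_title()
    {
        _browser.Navigate().GoToUrl("http://localhost:8080/about"); // made-up URL
        Assert.False(string.IsNullOrEmpty(_browser.Title));
    }

    public void Dispose()
    {
        _browser.Quit();
    }
}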

WARNING: If you want to run Selenium tests on a machine that has text size set to any value other than 100%, stick to Selenium 2.37.0 and Firefox 25, as this is the only combo that works well in such a setup. This problem should be solved in the next version of Selenium, which is 2.42.0: https://code.google.com/p/selenium/issues/detail?id=6112

xUnit unit of work

xUnit executes test collections in parallel. A test collection is a group of tests that belong either to a single class or to a single assembly. Tests that belong to a single test collection are executed serially. Collection behaviour can be specified in VS settings or as a global setting in code via the CollectionBehavior attribute. On top of that, we have the ability to control the level of concurrency by setting MaxParallelThreads. This threshold is very important, as a single machine can handle only a limited number of browsers.
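In code, the global setting is just an assembly-level attribute. A minimal example (the thread count of 4 is an arbitrary value; pick one that matches how many browsers your machine can handle):

using Xunit;

// Put this once per test assembly, e.g. in AssemblyInfo.cs.
// CollectionPerClass is the default grouping; MaxParallelThreads caps how many
// test collections execute at the same time.
[assembly: CollectionBehavior(CollectionBehavior.CollectionPerClass, MaxParallelThreads = 4)]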

At the moment a single test can’t become a test collection, which means that a class with a few long-running tests can significantly extend the total execution time, especially if it happens to be run at the end of a test run. I added this feature to my fork of xUnit: https://github.com/pawelpabich/xunit/commit/7fa0f32a1f831f8aef55a8ff7ee6db0dfb88a8cd and the sample app uses it. Both tests belong to the same class, yet they are executed in parallel.

Isolate tests, always

Tests should not share any state, and tests that run in parallel simply must not. Otherwise you are asking for trouble that will lead to long and frustrating debugging sessions. The way to achieve this is to make sure that all state used by a test is created in the constructor of the test and disposed once the test is finished. For this we can use Autofac and create a new LifetimeScope for each test, which then gets disposed when the test object gets disposed. xUnit doesn’t support IoC, so we need to inject all dependencies ourselves in the constructor, which is not a big deal.
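A minimal sketch of what this can look like; BrowserTestsSetup.Container is a hypothetical container built once per test run, and the URL is made up:

using System;
using Autofac;
using OpenQA.Selenium;
using Xunit;

public class CheckoutTests : IDisposable
{
    private readonly ILifetimeScope _scope;
    private readonly IWebDriver _browser;

    public CheckoutTests()
    {
        // Each test object gets its own scope, so no state leaks between tests.
        _scope = BrowserTestsSetup.Container.BeginLifetimeScope();
        _browser = _scope.Resolve<IWebDriver>();
    }

    [Fact]
    public void Can_open_the_checkout_page()
    {
        _browser.Navigate().GoToUrl("http://localhost:8080/checkout");
        Assert.Contains("Checkout", _browser.Title);
    }

    public void Dispose()
    {
        // Disposing the scope disposes everything that was resolved from it.
        _scope.Dispose();
    }
}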

Shared resources, yes, they exist

Technically speaking, each test could launch a new instance of Firefox, but this takes quite a bit of time and can be easily optimized. What we need is a pool of browsers that tests can take browsers from and return them to once they are done. In most cases shared resources can be initialized once per test run. xUnit doesn’t have a concept of global initialization, which is why the test run setup happens in the constructor of the base class that all other tests inherit from. This isn’t a great solution, but it works. You might be able to move it to a static constructor, as long as the setup code doesn’t use Threads and Tasks, because they will cause deadlocks. The lack of global initialization means there is no global clean-up either, but this can be worked around by subscribing to the AppDomain.Unload event and performing the clean-up there. In my experience this only works most of the time, so I would rather have a proper abstraction in the framework.
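A rough sketch of such a pool, assuming a fixed number of Firefox instances created up front; taking from an empty pool simply blocks until another test returns a browser:

using System.Collections.Concurrent;
using OpenQA.Selenium;
using OpenQA.Selenium.Firefox;

public class BrowserPool
{
    private readonly BlockingCollection<IWebDriver> _browsers = new BlockingCollection<IWebDriver>();

    public BrowserPool(int size)
    {
        // Created once per test run, e.g. from the base class constructor mentioned above.
        for (var i = 0; i < size; i++)
        {
            _browsers.Add(new FirefoxDriver());
        }
    }

    public IWebDriver Take()
    {
        // Blocks when all browsers are in use.
        return _browsers.Take();
    }

    public void Return(IWebDriver browser)
    {
        _browsers.Add(browser);
    }

    public void ShutDown()
    {
        // A candidate for the AppDomain.Unload handler mentioned above.
        foreach (var browser in _browsers)
        {
            browser.Quit();
        }
    }
}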

Don't block synchronously

When MaxParallelThreads is set to a non-zero value, xUnit creates a dedicated Task Scheduler with a limited number of threads. This works great as long as you use async and await. But if you need to block synchronously, you might cause a deadlock, as there might be no free threads to execute the continuation. In such a case the safest way is to execute the code on the default .NET Task Scheduler.
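A rough sketch of both options; the work being waited on here is just a Task.Delay standing in for real test code:

using System.Threading;
using System.Threading.Tasks;
using Xunit;

public class SchedulerExamples
{
    [Fact]
    public async Task Prefer_async_all_the_way_down()
    {
        // The safe option: await instead of blocking a scarce thread.
        await Task.Delay(100);
    }

    [Fact]
    public void If_you_really_have_to_block()
    {
        // Push the blocking work onto the default scheduler so it does not
        // occupy one of the threads xUnit reserved for running tests.
        var work = Task.Factory.StartNew(
            () => Task.Delay(100).Wait(),
            CancellationToken.None,
            TaskCreationOptions.None,
            TaskScheduler.Default);

        work.Wait();
    }
}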

I need to know when something goes wrong

If a test implements IDisposable, then xUnit will call the Dispose method when the test is finished. Unfortunately, the test doesn’t know whether it failed or succeeded, and knowing this is a must-have for browser-based tests, because we want to take a screenshot when something goes wrong. That’s why I added this feature to my fork of xUnit.
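The hook itself lives in the fork, so I won’t reproduce it here, but the Selenium side of taking the screenshot is standard and could look roughly like this (the class and method names are made up):

using System.IO;
using OpenQA.Selenium;

public static class FailureScreenshot
{
    // Called from the "test finished" hook when the test has failed.
    public static void Capture(IWebDriver browser, string testName)
    {
        var screenshot = ((ITakesScreenshot)browser).GetScreenshot();
        var file = Path.Combine(Path.GetTempPath(), testName + ".png");
        File.WriteAllBytes(file, screenshot.AsByteArray);
    }
}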

What about R#, I won’t code without it!

There is an alpha version of the xUnit plugin for R#. It runs xUnit tests fine, but it doesn’t run them in parallel at the moment. This might change in the future, so keep an eye on https://github.com/xunit/resharper-xunit/pull/1.

Async, async everywhere

Internally, xUnit uses async and await extensively. Every piece of the execution pipeline (Create test object –> Execute Before Filter –> Execute test –> Execute After Filter –> Dispose test) can end up as a separate Task. This doesn’t work well with tests that rely on limited resources, because new tests can be started before already finished tests are disposed and their resources returned to a shared pool. In such a case we have two options: either the new tests are blocked, or more resources are created, which is not always viable. This is why in my fork of xUnit the whole execution pipeline is wrapped in a single Task.

What about Continuous Integration

There is a console runner that can output TeamCity service messages, and this is what I use. It works great; the only complaint I have is that TeamCity doesn’t display messages coming concurrently from multiple flows well in its web interface. This should be fixed in the future: http://youtrack.jetbrains.com/issue/TW-36214.
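For reference, the runner output is just plain TeamCity service messages written to the console, roughly like these (test names and durations are made up); messages from tests running at the same time are distinguished by their flowId:

##teamcity[testStarted name='HomePageTests.Home_page_has_a_title' flowId='1']
##teamcity[testStarted name='AboutPageTests.About_page_has_a_title' flowId='2']
##teamcity[testFinished name='AboutPageTests.About_page_has_a_title' duration='2900' flowId='2']
##teamcity[testFinished name='HomePageTests.Home_page_has_a_title' duration='3200' flowId='1']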

Can I tell you which tests to execute first?

In an ideal world we would be able to run failed tests first (e.g. we can discover them via an API call to the build server; thanks Jason Stangroome for the idea) and then the rest of the tests, ordered by their execution time in descending order. xUnit lets us order tests within a single test collection, but there is no abstraction to provide a custom sort order of test collections, which is what I added to my fork. The orderer is specified in the same way a test orderer would be.
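For illustration, a collection orderer could look roughly like the sketch below. The interface name is my assumption, modelled on xUnit’s test case orderer; the fork may name things differently, and the recorded durations are made up:

using System.Collections.Generic;
using System.Linq;
using Xunit.Abstractions;

public class SlowestFirstCollectionOrderer : ITestCollectionOrderer
{
    // Hypothetical execution times (in seconds) taken from a previous run or the build server.
    private static readonly Dictionary<string, int> PreviousDurations = new Dictionary<string, int>
    {
        { "Test collection for HomePageTests", 45 },
        { "Test collection for CheckoutTests", 120 }
    };

    public IEnumerable<ITestCollection> OrderTestCollections(IEnumerable<ITestCollection> collections)
    {
        // Longest-running collections first; collections we know nothing about go last.
        return collections.OrderByDescending(c =>
        {
            int duration;
            return PreviousDurations.TryGetValue(c.DisplayName, out duration) ? duration : 0;
        });
    }
}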

It works but it has some sharp edges

All in all, I’m pretty happy with the solution and it has been working great for me over the last couple of weeks. The overall execution time of our CI build dropped from 1:41h to 21 minutes.

The work is not finished, and I hope that together we can make it even easier to use and, ideally, make it part of the xUnit project. I started a discussion about this here: https://github.com/xunit/xunit/issues/107. Please join and share your thoughts. It’s time to put the problem of slow tests to bed so we can focus on more interesting challenges :).