Testing Julia packages


Author: Oscar Dowson (@odow)

This post documents how the JuMP developers structure tests in Julia packages. We use this pattern extensively across the jump-dev package ecosystem (although only a few packages currently use ParallelTestRunner.jl; it’s a work in progress).

Background

Julia ships with the Test standard library, which provides macros such as @test, @testset, and @test_throws. A typical test layout looks like this:

# test/runtests.jl
using Test
include("A.jl")
include("B.jl")

# test/A.jl
using LinearAlgebra
@testset "A" begin
    @testset "test_plus" begin
        @test 1 + 1 == 2
    end
    @testset "test_sq" begin
        _test_sq_helper(x) = x^2
        N = 4
        @testset "$x" for x in 1:N
            @test _test_sq_helper(x) == x * x
        end
    end
end

# test/B.jl
@testset "B" begin
    @test norm([3, 4]) == 5
end

In our experience, this structure has a number of drawbacks:

  1. Poor isolation. Nested @testsets make it hard to run individual tests in isolation. To debug specific problems it’s often easiest to copy-paste the relevant parts of the tests, along with any required global state.
  2. Hidden global state. Test files can silently depend on global state from previous files (in the above example, test/B.jl depends on using LinearAlgebra from test/A.jl, and they both depend on using Test from test/runtests.jl). This makes refactoring the tests hard because a small change in one file can break other files in the test suite.
  3. No parallelism. The files are included one after another in a single process, which becomes a bottleneck for large or computationally heavy test suites.
  4. Unclear structure. There is no clear rule for when to start a new @testset or to start a new file. This often leads to new tests being retrofitted to existing testsets when it would have been better to start a new testset.
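The hidden dependency described in the second drawback can be reproduced without any test files. The sketch below is illustrative only: the module names WithoutLinearAlgebra and WithLinearAlgebra and the function uses_norm are hypothetical, standing in for test/B.jl run on its own versus run after test/A.jl.

```julia
# Stand-in for test/B.jl run on its own: nothing has loaded LinearAlgebra.
module WithoutLinearAlgebra
uses_norm() = norm([3, 4])
end

# Stand-in for test/B.jl run after test/A.jl: `norm` is in scope.
module WithLinearAlgebra
using LinearAlgebra
uses_norm() = norm([3, 4])
end

println(WithLinearAlgebra.uses_norm())

try
    WithoutLinearAlgebra.uses_norm()
catch err
    # `norm` is not defined here, so the call fails.
    println("Failed in isolation: ", typeof(err))
end
```

The same code succeeds or fails depending only on what happened to be loaded earlier, which is what makes this kind of test suite fragile to reorder or refactor.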

Over time, we have evolved a testing style that addresses these issues. Our design goals were:

  1. To make it easy to run single tests
  2. To make it easy to run groups of tests
  3. To make it easy to parallelise the tests
  4. To make it easy to add new tests

Each test is a function

To satisfy our first design goal, we make all tests functions.

For example:

using Test
function test_plus()
    @test 1 + 1 == 2
    return
end

The main benefit of this structure is that it is easy to run a single test: just copy-paste the function into the REPL and call it.
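As a minimal sketch of what that looks like in a REPL session, the function can be called directly, or wrapped in a @testset to get the usual summary table. The testset label used here is arbitrary.

```julia
using Test

function test_plus()
    @test 1 + 1 == 2
    return
end

# Call the test directly. A failing @test outside a testset throws an error.
test_plus()

# Or wrap the call in a testset to get a pass/fail summary.
@testset "test_plus" begin
    test_plus()
end
```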

Group tests into files

To satisfy our second design goal, we group the test functions in each file into a module, and we use a special runtests function that automatically finds and runs all of the tests in that module.

For example:

# test/test_A.jl
module TestA

using Test

function runtests()
    is_test(f::Symbol) = startswith("$f", "test_")
    for name in filter(is_test, names(@__MODULE__; all = true))
        @testset "$(name)" begin
            getfield(@__MODULE__, name)()
        end
    end
    return
end

function test_plus()
    @test 1 + 1 == 2
    return
end

_test_sq_helper(x) = x^2

function test_sq()
    N = 4
    for x in 1:N
        @test _test_sq_helper(x) == x * x
    end
    return
end

end  # module

TestA.runtests()

# test/test_B.jl
module TestB

using Test
using LinearAlgebra

function runtests()
    is_test(f::Symbol) = startswith("$f", "test_")
    for name in filter(is_test, names(@__MODULE__; all = true))
        @testset "$(name)" begin
            getfield(@__MODULE__, name)()
        end
    end
    return
end

function test_norm()
    @test norm([3, 4]) == 5
    return
end

end  # module

TestB.runtests()
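To see what the name-based discovery inside runtests picks up, the filter can be applied to a module from the outside. This is a sketch only: the module TestDiscovery and the helper _helper are hypothetical names, and the is_test predicate is copied from the files above.

```julia
module TestDiscovery

using Test

function test_plus()
    @test 1 + 1 == 2
    return
end

# Helpers start with an underscore, so the filter does not treat them as tests.
_helper(x) = x^2

function test_sq()
    @test _helper(3) == 9
    return
end

end  # module

is_test(f::Symbol) = startswith("$f", "test_")

# `all = true` is needed because the test functions are not exported.
discovered = filter(is_test, names(TestDiscovery; all = true))
println(discovered)

# A single test can still be called directly through the module.
TestDiscovery.test_sq()
```

Because discovery relies only on the test_ prefix, adding a new test requires no registration step beyond defining the function.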

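This structure has two important advantages. First, each file is self-contained: it loads its own dependencies inside its own module, so it cannot silently depend on global state from another file. Second, each file ends by calling its own runtests function, so a single file can be run on its own with include.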

Utility files

Shared setup code should live in utility files whose names do not start with test_, so that they are not mistaken for test files. These can be included where needed:

# test/utility.jl
const DATA = 1

# test/test_A.jl
module TestA
include("utility.jl")
# ...
end  # module
TestA.runtests()
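The layout above can be tried end to end in a temporary directory. The following is a sketch under stated assumptions: the test function test_data is hypothetical, and the files are written with mktempdir purely so that the example is self-contained.

```julia
dir = mktempdir()

# The utility file: its name does not start with `test_`.
write(joinpath(dir, "utility.jl"), "const DATA = 1\n")

# A test file that includes the utility file inside its own module.
write(
    joinpath(dir, "test_A.jl"),
    """
    module TestA

    using Test
    include("utility.jl")

    function test_data()
        @test DATA == 1
        return
    end

    end  # module
    """,
)

include(joinpath(dir, "test_A.jl"))
TestA.test_data()
```

Because include is called inside the module, DATA is defined in TestA rather than in Main, so every test module that includes the utility file gets its own copy.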

Parallelise the tests with ParallelTestRunner

To satisfy our third design goal, we use ParallelTestRunner.jl. Because tests are isolated by file and module, parallelisation is straightforward.

The only custom logic required is to construct a test suite mapping file names to include expressions:

# test/runtests.jl
import MyPackage
import ParallelTestRunner
is_test_file(f) = startswith(f, "test_") && endswith(f, ".jl")
testsuite = Dict{String,Expr}(
    file => :(include($file))
    for (root, dirs, files) in walkdir(@__DIR__)
    for file in joinpath.(root, filter(is_test_file, files))
)
ParallelTestRunner.runtests(MyPackage, ARGS; testsuite)
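To check which files the test suite construction picks up, the same comprehension can be run against a scratch directory without loading ParallelTestRunner. This is only a sketch: the directory layout and file names below are made up for illustration.

```julia
dir = mktempdir()
mkpath(joinpath(dir, "subdir"))
for path in ("test_A.jl", "utility.jl", "test_notes.txt", joinpath("subdir", "test_B.jl"))
    touch(joinpath(dir, path))
end

is_test_file(f) = startswith(f, "test_") && endswith(f, ".jl")

# Same construction as above, but walking `dir` instead of `@__DIR__`.
testsuite = Dict{String,Expr}(
    file => :(include($file))
    for (root, dirs, files) in walkdir(dir)
    for file in joinpath.(root, filter(is_test_file, files))
)

for file in sort!(collect(keys(testsuite)))
    println(relpath(file, dir), " => ", testsuite[file].head, " expression")
end
```

Only test_A.jl and subdir/test_B.jl end up in the dictionary; utility.jl and test_notes.txt are filtered out by is_test_file.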

Load balancing

ParallelTestRunner parallelises at the file level. More files mean more opportunities for parallelism, but each file carries a fixed startup cost. Aim for files that are neither very large nor very small.

A good practice is to inspect timing output and refactor so that files take roughly comparable amounts of time. The worst case is a few very slow files alongside many trivial ones.
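If timing output is not to hand, a rough serial estimate can be obtained by timing each file with @elapsed. The helper time_test_files below is hypothetical and is not part of ParallelTestRunner; it includes every file in the current process, so the numbers include compilation time and are only indicative.

```julia
# Hypothetical helper: include each file in turn and record how long it takes.
function time_test_files(files)
    timings = Pair{String,Float64}[]
    for file in files
        seconds = @elapsed include(file)
        push!(timings, file => seconds)
    end
    # Slowest files first, since they are the candidates for splitting.
    return sort!(timings; by = last, rev = true)
end

# Two stand-in test files: one slow, one trivial.
dir = mktempdir()
slow = joinpath(dir, "test_slow.jl")
fast = joinpath(dir, "test_fast.jl")
write(slow, "sleep(0.5)\n")
write(fast, "1 + 1\n")

for (file, seconds) in time_test_files([slow, fast])
    println(rpad(basename(file), 16), round(seconds; digits = 2), " s")
end
```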

Rules for adding new tests

Our design leads to simple, scalable rules for adding new tests:

  1. Add each new test as a test_* function in an appropriate module.
  2. Split files as needed to keep parallel workloads balanced.

Complications

A drawback of our approach is that Julia allows method redefinition: if two test_* functions in one file share the same name, the second definition silently overwrites the first, and the first test never runs.
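The effect is easy to reproduce. In this sketch, the module TestOverwrite and the function test_value are hypothetical names chosen for illustration.

```julia
module TestOverwrite

using Test

function test_value()
    @test 1 + 1 == 2
    return
end

# An accidental duplicate, for example from a copy-paste that was not renamed.
function test_value()
    @test 2 + 2 == 4
    return
end

end  # module

# Only one method survives, so only the second body is ever run.
println(length(methods(TestOverwrite.test_value)))
```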

Although Julia emits a warning for each overwritten method when run with --warn-overwrite=yes (the default during testing), these warnings are easy to miss in CI logs. As a safeguard, we add a test that scans files for duplicate test_* definitions:

function _test_method_redefinition(filename)
    contents = read(filename, String)
    functions = Set{String}()
    for regex in (r"^function (test\_.+?)\(.*?\)"m, r"^(test\_.+?)\(.*?\) \= "m)
        for m in eachmatch(regex, contents)
            fn_name = String(m[1])
            if fn_name in functions
                error("In $filename: overwritten method: $fn_name")
            end
            push!(functions, fn_name)
        end
    end
    return
end

# This function tests that no Julia file in `/test` redefines a `test_*` method.
function test_method_redefinition()
    test_dir = @__DIR__  # replace as needed
    for (root, dirs, files) in walkdir(test_dir)
        for file in filter(endswith(".jl"), files)
            _test_method_redefinition(joinpath(root, file))
        end
    end
    return
end
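When a file contains several duplicates, it can be convenient to collect all of them rather than stop at the first. The variant below is a sketch, not the check used in the packages: find_duplicate_tests is a hypothetical helper that uses the same two regular expressions but takes the file contents as a string and returns the duplicated names.

```julia
# Hypothetical variant: return every duplicated `test_*` name in `contents`.
function find_duplicate_tests(contents::AbstractString)
    seen = Set{String}()
    duplicates = String[]
    for regex in (r"^function (test_.+?)\(.*?\)"m, r"^(test_.+?)\(.*?\) = "m)
        for m in eachmatch(regex, contents)
            name = String(m[1])
            if name in seen
                push!(duplicates, name)
            else
                push!(seen, name)
            end
        end
    end
    return duplicates
end

contents = """
function test_plus()
    @test 1 + 1 == 2
    return
end

function test_plus()
    @test 2 + 2 == 4
    return
end

test_short() = @test true
"""

println(find_duplicate_tests(contents))
```

Like the original check, this only sees definitions that start at the beginning of a line, so indented or macro-generated definitions are not detected.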

Packages to study

If you want to implement this design pattern, here are some good packages to look at:

MathOptInterface is particularly instructive: its tests are sharded across directories, each directory runs in a separate CI job, and ParallelTestRunner is used within each shard for further parallelism.