Go Testing Patterns
This page documents patterns that we have found to be generally useful when writing tests in Go. It is intended for readers who are already familiar with the Go testing package and with writing tests in Go. To get started with testing in Go, check out the go.dev tutorial.
The patterns documented on this page are too small to be useful as a library; often, abstracting them into generic library code would make them harder to use. They are intended to be copied, or applied, in any package where they are used.
Know of a common Go testing pattern that you would like to add to this page? Please open an issue to suggest adding it!
Run steps
testing.T.Run is often used to run independent test cases, but it can also be useful for running multi-step test cases where each step must pass before running the next.
The runStep function accomplishes this by wrapping t.Run and failing the parent test with t.FailNow if the subtest fails. Use it in place of t.Run for a multi-step test.
The behaviour of runStep is similar to a test without any subtests, but t.Run gives the test additional scope. The scope makes test failures more obvious by including details about the functionality being tested in the test name, and allows any t.Cleanup functions to run.
func runStep(t *testing.T, name string, fn func(t *testing.T)) {
	if !t.Run(name, fn) {
		t.FailNow()
	}
}
Example
runStep(t, "create a resource", func(t *testing.T) {
...
})
runStep(t, "list resources includes the new resource", func(t *testing.T) {
...
})
runStep(t, "delete the resource", func(t *testing.T) {
...
})
runStep(t, "get a deleted resource returns an error", func(t *testing.T) {
...
})
Table driven tests
Table driven tests are a common pattern in Go, and a great way of organizing many highly related test cases into a single test function.
The pattern documented here refines table driven tests further by reordering the sections to make them easier to read from top to bottom. The table test is organized into the following sections:
- A type testCase struct is always first, mostly out of necessity: the type must be defined before it is used. For more complex tests you may want to use a func(t *testing.T, ...) as the type of some fields (for example setup func(t *testing.T, req *http.Request)); see the sketch after the example below. Using a function gives you lots of flexibility when constructing test cases.
- A run(t *testing.T, tc testCase) function which will be called for each case. This function contains all of the logic for the test. This run function is the most relevant part of the test function: it tells us which function is being tested, and how the testCase fields will be used. By putting it near the top of the test function it becomes the first thing we see when jumping to the function definition.
- A list of test cases follows. The list defines the inputs and expected values for each case. The list may be a []testCase or a map[string]testCase, where the map version uses the key as the name for the test case instead of a struct field. Always try to choose descriptive and unique names for each case, to make it easier to find the failing case.
- Finally there is some boilerplate at the bottom of the function to iterate over all of the test cases and call run from t.Run for each.
func TestSplit(t *testing.T) {
	type testCase struct {
		name     string
		input    string
		sep      string
		expected []string
	}

	run := func(t *testing.T, tc testCase) {
		actual := strings.Split(tc.input, tc.sep)
		assert.DeepEqual(t, actual, tc.expected)
	}

	testCases := []testCase{
		{
			name:     "multiple splits",
			input:    "a/b/c",
			sep:      "/",
			expected: []string{"a", "b", "c"},
		},
		{
			name:     "wrong separator",
			input:    "a/b/c",
			sep:      ",",
			expected: []string{"a/b/c"},
		},
		{
			name:     "no separator",
			input:    "abc",
			sep:      "/",
			expected: []string{"abc"},
		},
		{
			name:     "trailing separator",
			input:    "a/b/c/",
			sep:      "/",
			expected: []string{"a", "b", "c", ""},
		},
	}

	for _, tc := range testCases {
		t.Run(tc.name, func(t *testing.T) {
			run(t, tc)
		})
	}
}
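As a sketch of the func-typed fields mentioned in the list above, here is what a testCase with a setup field might look like when testing an HTTP handler. The handler function and status codes are hypothetical; httptest refers to the standard library net/http/httptest package.
func TestHandler(t *testing.T) {
	type testCase struct {
		name     string
		setup    func(t *testing.T, req *http.Request)
		expected int
	}

	run := func(t *testing.T, tc testCase) {
		req := httptest.NewRequest(http.MethodGet, "/resource", nil)
		if tc.setup != nil {
			tc.setup(t, req)
		}
		rec := httptest.NewRecorder()
		handler(rec, req) // handler is the hypothetical function under test
		assert.Equal(t, rec.Code, tc.expected)
	}

	testCases := []testCase{
		{
			name:     "no auth header",
			expected: http.StatusUnauthorized,
		},
		{
			name: "authorized",
			setup: func(t *testing.T, req *http.Request) {
				req.Header.Set("Authorization", "Bearer token")
			},
			expected: http.StatusOK,
		},
	}

	for _, tc := range testCases {
		t.Run(tc.name, func(t *testing.T) {
			run(t, tc)
		})
	}
}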
Example: Generic runTestCases
Using generics and the map[string]testCase version of the test case list, the boilerplate at the bottom of the function can be replaced with a call to runTestCases.
func TestSplit(t *testing.T) {
	type testCase struct { ... }

	run := func(t *testing.T, tc testCase) { ... }

	testCases := map[string]testCase{ ... }

	runTestCases(t, run, testCases)
}

func runTestCases[TC any](t *testing.T, run func(*testing.T, TC), testCases map[string]TC) {
	for name, tc := range testCases {
		t.Run(name, func(t *testing.T) {
			run(t, tc)
		})
	}
}
See Test suites and Run case for some other variations of table driven tests.
Run case
Table driven tests work well to group related tests together. If the number of test cases in the table grows beyond some point, especially when each case is complex, it can become difficult to work with the test.
The usual method of searching for a test case by name can become difficult with large numbers of test cases when:
- test names are too similar, or there are long common prefixes on many tests
- there are other strings in the file that match the test names
- the translated test name contains many _ characters, which replaced runs of non-alphabetic characters
There are a few ways to address this problem. Often the test cases can be split up into different test functions. If that is not an option, runCase can help make extremely large test functions easier to work with.
To address the problem, the []testCase or map[string]testCase in the table test can be replaced with calls to a run function that uses runCase.
func runCase(t *testing.T, name string, run func(t *testing.T)) {
	t.Helper()
	t.Run(name, func(t *testing.T) {
		t.Helper()
		t.Log("case:", name)
		run(t)
	})
}
This extra call to t.Log, and the two calls to t.Helper, add another line to the test output. The case: <test case name> log message will be prefixed with the filename and line number of the test case that failed. IDEs and text editors provide an easy way to jump to that file and line number, so when a test fails, the time to jump to each failing test case can be significantly reduced.
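For example, a failing case might produce output along these lines (the file name and line number here are only illustrative):
=== RUN   TestArgs/defaults
    args_test.go:24: case: defaults
--- FAIL: TestArgs/defaults (0.00s)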
Converting a test function from a list of test cases to run calls can often be automated with find and replace.
Example
func TestArgs(t *testing.T) {
	type testCase struct {
		opts     options
		expected []string
	}

	run := func(t *testing.T, name string, tc testCase) {
		t.Helper()
		runCase(t, name, func(t *testing.T) {
			actual := Args(tc.opts)
			assert.DeepEqual(t, actual, tc.expected)
		})
	}

	run(t, "defaults", testCase{
		opts:     options{},
		expected: []string{},
	})

	run(t, "no change", testCase{
		opts: options{
			args: []string{"./script", "-test.timeout=20m"},
		},
		expected: []string{"./script", "-test.timeout=20m"},
	})

	...
}
Test suites
A test suite is a group of tests that share a function to set up dependencies, or share a dependency that takes a while to start. Sharing a dependency that is slow to start helps reduce the overall run time of the tests in a package. The tests in a suite will generally be closely related. They may all test the same function, the methods on a struct, or a set of related functions in a package.
The Go testing package in the standard library provides all the tools necessary to create a test suite. The testing.T.Cleanup function allows a setup function to register any cleanup that is required once the test ends. This helps keep related code together in one place.
Example
A test suite can have many forms, but will generally look something like the test below. The comments are for example purposes only, and each section is optional.
func TestAPI(t *testing.T) {
	// Start any shared dependencies, sometimes known as "SetupSuite".
	// Any functions registered using t.Cleanup will act as "TearDownSuite".
	srv := startServer(t)

	// Define a setup function that will be run for each test, sometimes
	// known as "SetupTest".
	// Any functions registered in setup using t.Cleanup will act
	// as "TearDownTest".
	setup := func(t *testing.T, tc testCase) {
		...
	}

	// Test cases follow
	...
}
The test cases can be defined in different ways. They may be:
- sequential steps using runStep, where each step can call setup (see the sketch after this list).
- a table driven test, calling setup in run, or the runCase variation.
- a list of test functions that accept a testCase, and that may be called by other suites to test a contract or implementation of some interface.
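As a rough sketch of the first variation, assuming the startServer helper from the example above and the runStep function from earlier on this page (the client type and its methods are hypothetical):
func TestAPI(t *testing.T) {
	srv := startServer(t)

	// setup runs for each step
	setup := func(t *testing.T) *client {
		c := newClient(t, srv.URL)
		t.Cleanup(c.Close)
		return c
	}

	runStep(t, "create a resource", func(t *testing.T) {
		c := setup(t)
		assert.NilError(t, c.Create("name"))
	})
	runStep(t, "get the resource", func(t *testing.T) {
		c := setup(t)
		_, err := c.Get("name")
		assert.NilError(t, err)
	})
}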
Test doubles
Mocks, stubs, spies, and fakes are all kinds of test doubles. They are used in tests to replace the implementation of an interface to make code either easier to test, or to allow the test to run faster.
Using a test double can make writing tests easier and running tests faster, but overusing them can reduce the effectiveness of tests, and make tests more expensive to maintain. Always try to avoid a test double, and only use one when the alternative would significantly slow down the test.
Test doubles come in many forms, but all of them have one thing in common. They all implement some interface, and will be used in a test to replace the production implementation of that interface.
This common property of test doubles allows us to look at creating a test double in two stages.
First we need to generate the method definitions to implement the interface.
There are plenty of libraries that provide either code generation, or building blocks for creating fakes and mocks. While those tools may be useful for faking large interfaces, it is often better to instead reduce the size of the interface. Most interfaces in the standard library have 3 or fewer methods. Reducing the size of the interface that needs to be faked is good for loose coupling of code, but it also makes the interface easier to implement.
A small interface can be written out by hand, but most IDEs (GoLand, VS Code, gopls) provide one or more ways of easily generating the initial method stubs for a fake implementation. The code generation provided by the IDE often makes external tools unnecessary, but they remain an option.
IDE refactoring tools will also take care of updating the method signatures when an interface changes, removing the need for external tools to update the method signatures of the test double.
Now that the method definitions exist, we need to write the implementation of those methods.
A fake is a full working implementation, not something any tool can help implement.
A stub returns predefined return values, again not something any tool can help you implement. Add a few fields to the struct to store the predefined return values, and set the values as part of test setup.
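For example, a stub for a small hypothetical Emitter interface might look like the following, where the test controls the return value by setting emitErr:
type Emitter interface {
	Emit(name string, at time.Time) error
}

// stubEmitter returns the predefined values set by the test
type stubEmitter struct {
	emitErr error
}

func (s *stubEmitter) Emit(name string, at time.Time) error {
	return s.emitErr
}
A test of the error path could then construct &stubEmitter{emitErr: fmt.Errorf("connection refused")} and pass it wherever an Emitter is expected.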
A mock registers expectations about what calls will be made. Don't use mocks. If you need to record something about calls being made, use a spy instead and make the assertions about calls as you would any other assertion in a test.
A spy records both the method calls and all the arguments passed to those calls. It can also return a value using the same technique as a stub or fake.
Example: Spy
A test spy can easily be implemented without any library or framework using the following struct and function.
type FunctionCall struct {
	Name string
	Args []interface{}
}

func Record(method interface{}, args ...interface{}) FunctionCall {
	name := runtime.FuncForPC(reflect.ValueOf(method).Pointer()).Name()
	return FunctionCall{Name: name, Args: args}
}
From each fake method, you call Record, and at the end of the test case, use assert.DeepEqual to compare the recorded calls against the expected calls.
type testDouble struct {
	calls []FunctionCall
}

func (d *testDouble) Emit(arg1 string, arg2 time.Time) error {
	d.calls = append(d.calls, Record(d.Emit, arg1, arg2))
	return nil
}

...

func TestEmit(t *testing.T) {
	d := &testDouble{}
	...
	expected := []FunctionCall{
		Record(d.Emit, "metric", now),
	}
	assert.DeepEqual(t, d.calls, expected)
}
This pattern removes the need for a library, and ensures that assertion failures from the test double look and feel the same as other assertions in the test.
Shims
A shim is a variable that is introduced into the program entirely for the purpose of being able to change it during testing. Shims are often used to replace a static function call to some external package with a different implementation of the function. Shims are only necessary when the dependency is static (i.e. the function being called is a package-level function, not a method on some type). Shims should be used sparingly: if you can pass in an interface as a dependency, prefer that over a shim, as it is easier to pass in a fake implementation of the interface.
Once the shim is added, we can use a monkey patch function in a test to replace the shim with a fake implementation. The patch function can accept either a static value to return, or a full replacement for the function being patched. The examples below show both options.
Note: since patching is used to replace static values, it is never safe to use t.Parallel with patching.
Example: Generic Patch function
func Patch[S any](t *testing.T, target *S, replacement S) {
	original := *target
	*target = replacement
	t.Cleanup(func() {
		*target = original
	})
}
Example: time.Now
It is common to want to patch time.Now so that it returns a predictable value. This can make it much easier to test code that uses the current time.
In the live code, introduce a shim.
// timeNow is a shim for testing
var timeNow = time.Now

// something is a function that uses time.Now
func something() string {
	now := timeNow()
	...
}
In the test function, patch timeNow. The patch function uses t.Cleanup to reset the shim to its original value when the test function exits.
func TestSomething(t *testing.T) {
	now := time.Date(2020, 1, 2, 3, 4, 5, 6, time.UTC)
	patchTimeNow(t, now)

	// the result should be predictable because we set a predictable value for now
	result := something()
	...
}

func patchTimeNow(t *testing.T, now time.Time) {
	orig := timeNow
	timeNow = func() time.Time {
		return now
	}
	t.Cleanup(func() {
		timeNow = orig
	})
}
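With the generic Patch function from the first example, the patchTimeNow helper is not needed at all; the shim can be patched directly:
func TestSomething(t *testing.T) {
	now := time.Date(2020, 1, 2, 3, 4, 5, 6, time.UTC)
	Patch(t, &timeNow, func() time.Time { return now })
	...
}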
Example: Patching a function
Similar to the example above, the patch function could also accept a full replacement, instead of only a static return value. This allows each test case to customize the return value, or capture the arguments passed to the function.
In the live code, introduce a shim.
// runServer is a shim for testing
var runServer = server.Run
In the test code, patch the shim to intercept the call, and use t.Cleanup to reset the shim when the test function exits.
func TestServerCmd(t *testing.T) {
	var actual server.Options
	patchRunServer(t, func(opts server.Options) error {
		actual = opts
		return nil
	})
	...
}

func patchRunServer(t *testing.T, fn func(opts server.Options) error) {
	orig := runServer
	runServer = fn
	t.Cleanup(func() {
		runServer = orig
	})
}