Combinatorial testing

10 October 2022

software-development

Have you recently tried to unit test a function which has many combinations of possible inputs and expected outputs?

An increasingly common way of writing such a test is to utilize a data-driven test. The problem with data-driven tests is that they can quickly grow to be large and unwieldy.

In this article, I want to introduce a technique for generating data-driven tests without having to spell out every individual combination of inputs/outputs in code.

But first, a quick refresher on data-driven tests...

Limitations of data-driven tests

You may be familiar with data-driven testing. Basically, you write a table of inputs and expected outputs in which each test case is entered as a row of data. Data-driven testing is now supported by many popular test frameworks (Jest, JUnit and NUnit, to name a few).

One problem with data-driven testing is that we sometimes have so many combinations to cover that a comprehensive test table becomes lengthy and difficult to read or maintain.

Imagine, for example, trying to write a data-driven unit test for a function which returns the number of days in a given month.

The function takes two parameters, year and month, and returns the number of days. Each parameter can take a range of values, and the number of combinations to test is the product of the number of values for each parameter, which quickly becomes large.

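it.each([
  { year: 2021, month: "january", expectedDays: 31 },
  { year: 2021, month: "february", expectedDays: 28 },
  { year: 2021, month: "march", expectedDays: 31 },
  { year: 2021, month: "april", expectedDays: 30 },

  // and so on and so on ... 😓
])("$month in $year should have $expectedDays days", ({ year, month, expectedDays }) => {
  expect(getDaysInMonth(month, year)).toBe(expectedDays);
});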

That number of combinations, though easy for a computer to process, is not easy for us to wrap our human minds around!
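To make the growth concrete, here is a small sketch of the arithmetic. The function name countRows and the figures (twelve months, four years) are illustrative assumptions, not part of any test framework.

```javascript
// Illustrative helper (hypothetical name): the number of rows in an exhaustive
// data-driven table is the product of the number of values per parameter.
function countRows(...valueCounts) {
  return valueCounts.reduce((total, count) => total * count, 1);
}

const monthsPerYear = 12;
const yearsToCover = 4; // e.g. 2020 through 2023

console.log(countRows(monthsPerYear, yearsToCover)); // 48
```

Adding a third parameter with only a handful of values would multiply that figure again, which is why hand-written tables become hard to maintain so quickly.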

Perhaps the solution is to express the combinations in a more concise manner, as grouped ranges of values, rather than spelling out every single combination.

Combinator function to the rescue!

What is a combinator?

In the world of functional programming, the term "combinator" informally refers to a pattern...

"where complex structures are built by defining a small set of very simple 'primitives', and a set of 'combinators' for combining them into more complicated structures"
– Combinator Pattern • wiki.haskell.org/Combinator_pattern

The combinator I present in this article is more specific. It takes as input an object whose properties each have an array as their value. It then generates every combination that takes one value from each array (the Cartesian product of the arrays) and returns the resulting objects to the caller as an array.

For example, suppose we provide an input object having a single property, "color", whose value is an array containing elements "red" and "blue":

{
  color: ["red", "blue"],
}

The combinator will return us an array having the following objects:

  1. an object having a "color" property whose value is "red" and
  2. an object having a "color" property whose value is "blue"

Like this:

[
  {
    color: "red",
  },
  {
    color: "blue",
  },
];

Suppose we provide an additional property in our input object, "brightness", whose value is an array containing elements 100 and 200:

{
  color: ['red', 'blue'],
  brightness: [100, 200]
}

The combinator will return us an array having every specified combination of "color" and "brightness":

  1. an object having a "color" property whose value is "red" and a property "brightness" whose value is 100 and
  2. an object having a "color" property whose value is "red" and a property "brightness" whose value is 200 and
  3. an object having a "color" property whose value is "blue" and a property "brightness" whose value is 100 and
  4. an object having a "color" property whose value is "blue" and a property "brightness" whose value is 200

Like this:

[
  {
    color: "red",
    brightness: 100,
  },
  {
    color: "red",
    brightness: 200,
  },
  {
    color: "blue",
    brightness: 100,
  },
  {
    color: "blue",
    brightness: 200,
  },
];

Providing a definition object as input, we can get a large set of results as output.
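For readers curious how such a function might work, here is a minimal sketch. It is an assumption about one possible implementation, written to reproduce the ordering shown in the examples above (the first property varies slowest), and is not taken from the source of any published package.

```javascript
// Hypothetical sketch of a combinate-style function.
// Takes an object whose property values are arrays and returns one object
// per combination (the Cartesian product of the arrays).
function combinate(definition) {
  // Start with a single empty object, then extend every partial result
  // with each value of the next property.
  return Object.entries(definition).reduce(
    (partials, [key, values]) =>
      partials.flatMap((partial) =>
        values.map((value) => ({ ...partial, [key]: value }))
      ),
    [{}]
  );
}

const results = combinate({
  color: ["red", "blue"],
  brightness: [100, 200],
});

console.log(results.length); // 4
console.log(results[0]); // { color: 'red', brightness: 100 }
```

Note that in this sketch a property with an empty array yields no combinations at all, since there is no value to pick for it.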

Here's a high-level diagram:

▲ UML diagram depicting combinatorial test definition and results

Let's apply this combinator to a slightly more "real world" example.

An example: days in a month

For historical reasons, determining the number of days in a month in the Western calendar is complicated.

The following short rhyme tries to summarize the rules in a memorable way:

Thirty days have September,

April, June and November.

All the rest have thirty-one,

except February alone, which has

twenty-eight days each year

and twenty-nine days each leap-year

Suppose we wanted to unit-test a function, getDaysInMonth, which takes month and year as input and returns a number of days.
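The article does not show getDaysInMonth itself. To have something concrete to test, here is one hypothetical implementation. It assumes lowercase, full English month names, the argument order (month, year) and the Gregorian leap-year rule; the helper isLeapYear and both constant names are made up for this sketch.

```javascript
// Hypothetical implementation of the function under test.
const THIRTY_DAY_MONTHS = ["april", "june", "september", "november"];
const THIRTY_ONE_DAY_MONTHS = [
  "january", "march", "may", "july", "august", "october", "december",
];

// Gregorian rule: divisible by 4, except century years not divisible by 400.
function isLeapYear(year) {
  return (year % 4 === 0 && year % 100 !== 0) || year % 400 === 0;
}

function getDaysInMonth(month, year) {
  if (month === "february") return isLeapYear(year) ? 29 : 28;
  if (THIRTY_DAY_MONTHS.includes(month)) return 30;
  if (THIRTY_ONE_DAY_MONTHS.includes(month)) return 31;
  // Reject anything that is not a recognised month name.
  throw new RangeError(`Unknown month: ${month}`);
}

console.log(getDaysInMonth("february", 2024)); // 29
console.log(getDaysInMonth("april", 2021)); // 30
```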

We could simply list every month of every year we care about in the unit test and assert on the number of days for each. As discussed earlier, that would mean maintaining a long table by hand (perhaps with some fiddling in Excel) and would result in a very long and not very human-readable test file.

Instead, let's try to tackle this problem with a combinator.

Starting with the first two lines of the rhyme:

Thirty days have September,

April, June and November.

We can express this "thirty days" combination set programmatically, like this:

const thirtyDays = combinate({
  year: range(2020, 2023),
  month: ["april", "june", "september", "november"],
  expectedDays: [30],
});

The result can easily be passed into a data-driven test in Jest:

it.each(thirtyDays)(
  "$month in $year should have $expectedDays days",
  ({ month, year, expectedDays }) => {
    expect(getDaysInMonth(month, year)).toBe(expectedDays);
  }
);

On running the unit test, the following test cases will be generated and executed:

✓ april in 2020 should have 30 days (3 ms)
✓ june in 2020 should have 30 days
✓ september in 2020 should have 30 days
✓ november in 2020 should have 30 days
✓ april in 2021 should have 30 days
✓ june in 2021 should have 30 days (1 ms)
✓ september in 2021 should have 30 days
✓ november in 2021 should have 30 days (1 ms)
✓ april in 2022 should have 30 days
✓ june in 2022 should have 30 days
✓ september in 2022 should have 30 days
✓ november in 2022 should have 30 days
✓ april in 2023 should have 30 days
✓ june in 2023 should have 30 days
✓ september in 2023 should have 30 days
✓ november in 2023 should have 30 days

Notice how we can use a small amount of code (in this example, 5 lines for the combinate call) to generate a much larger set of test cases (16). This gives our test code more leverage.
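The range helper used above is not defined in this article. Judging by the generated test cases (2020 through 2023), it appears to include both endpoints. A minimal sketch under that assumption:

```javascript
// Hypothetical helper: returns the integers from start to end, inclusive.
// Returns an empty array when end is less than start.
function range(start, end) {
  return Array.from({ length: end - start + 1 }, (_, index) => start + index);
}

console.log(range(2020, 2023)); // [ 2020, 2021, 2022, 2023 ]
```

With four years and four months, the definition above expands to 4 × 4 = 16 test cases, which matches the output listed.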

Covering the remaining lines of the rhyme:

All the rest have thirty-one,

const thirtyOneDays = combinate({
  year: range(2020, 2023),
  month: ["january", "march", "may", "july", "august", "october", "december"],
  expectedDays: [31],
});

The following test cases will be generated:

✓ january in 2020 should have 31 days (2 ms)
✓ march in 2020 should have 31 days (1 ms)
✓ may in 2020 should have 31 days (1 ms)
✓ july in 2020 should have 31 days (1 ms)
✓ august in 2020 should have 31 days
✓ october in 2020 should have 31 days
... etc ...

except February alone, which has twenty-eight days each year

const februaryDays = combinate({
  year: [2023],
  month: ["february"],
  expectedDays: [28],
});

✓ february in 2023 should have 28 days (2 ms)

and twenty-nine days each leap-year

const februaryLeapYearDays = combinate({
  year: [2024],
  month: ["february"],
  expectedDays: [29],
});

✓ february in 2024 should have 29 days (2 ms)
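The two February definitions above each cover a single year. As a possible extension (not part of the original test), the same approach could also cover the Gregorian century rule, under which 1900 and 2100 are not leap years but 2000 is. The combinate function here is a hypothetical stand-in so that the snippet runs on its own, and the variable names are illustrative.

```javascript
// Hypothetical stand-in for the combinate helper used in this article.
const combinate = (definition) =>
  Object.entries(definition).reduce(
    (partials, [key, values]) =>
      partials.flatMap((partial) =>
        values.map((value) => ({ ...partial, [key]: value }))
      ),
    [{}]
  );

// Common years, including century years that are not divisible by 400.
const februaryCommonYears = combinate({
  year: [1900, 2023, 2100],
  month: ["february"],
  expectedDays: [28],
});

// Leap years, including a century year that is divisible by 400.
const februaryLeapYears = combinate({
  year: [2000, 2020, 2024],
  month: ["february"],
  expectedDays: [29],
});

const februaryCases = [...februaryCommonYears, ...februaryLeapYears];
console.log(februaryCases.length); // 6
```

The resulting array could be passed to it.each in the same way as the other definitions.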

Finally, putting it all together, here is the complete unit test:

describe("getDaysInMonth", () => {
  const thirtyDays = combinate({
    year: range(2020, 2023),
    month: ["april", "june", "september", "november"],
    expectedDays: [30],
  });

  const thirtyOneDays = combinate({
    year: range(2020, 2023),
    month: ["january", "march", "may", "july", "august", "october", "december"],
    expectedDays: [31],
  });

  const twentyEightDays = combinate({
    year: [2023],
    month: ["february"],
    expectedDays: [28],
  });

  const twentyNineDays = combinate({
    year: [2024],
    month: ["february"],
    expectedDays: [29],
  });

  it.each([
    ...thirtyDays,
    ...thirtyOneDays,
    ...twentyEightDays,
    ...twentyNineDays,
  ])(
    "$month in $year should have $expectedDays days",
    ({ month, year, expectedDays }) => {
      expect(getDaysInMonth(month, year)).toBe(expectedDays);
    }
  );
});

Notice that we can assign meaningful names to each of the variables, increasing the readability of the test code.

I'm sure you would agree that this test code, using a combinator, is more concise and readable than a large table of numbers and strings!

In closing, I encourage you to use combinatorial testing to shorten and sweeten your data-driven tests, so that you can cover more input combinations with less test code.

Introducing combinator-util

If you'd like to add a little combinatorial goodness to your unit tests, please check out this re-usable, open-source NPM package:

https://github.com/jonathanconway/combinator

Contributions welcome!

© 2024 Jonathan Conway