Most Magik developers have heard about MUnit testing, although far too many still don’t write proper MUnit tests – or any at all. However, I’d hazard a guess that hardly any Magik developers have heard about Property Based Testing (PBT).
Which is too bad, really, because PBT combined with MUnit brings automated testing to an entirely new level.
And while that may sound like hype, it isn’t. PBT gives you the power to write tests that automatically exercise your code in ways no example-based test suite (such as MUnit) can.
You’ll write less testing code, achieve higher levels of code coverage and automatically find corner cases you may not have thought about, regardless of how well you know your code.
In addition, PBT allows you to write powerful integration tests that can find bugs no other testing paradigm would find.
If you’re intrigued, read on and I’ll demonstrate how to boost your testing in ways that make your code better: of higher quality, easier to unit and regression test, and supported by low-maintenance integration testing.
At this point you’re most likely thinking, this is too good to be true.
But you’re wrong… sort of.
The benefits I’ve outlined above are real; however, there is no such thing as a free lunch. You will have to learn to think differently, and writing good property-based tests requires you to think more about your code and what it does. As we’ll see, this is something most developers don’t actually do, which means you might initially struggle to learn that skill.
But the results are definitely worth the effort. In fact, while PBT can be used on an existing code base, the fact that it forces you to reason about your code means you’ll start to develop code differently: your designs will begin to incorporate testing from the start, rather than simply as an afterthought. And that results in code that’s cleaner, easier to maintain and enhance, as well as far more testable. These attributes combine to create high-quality software that maintains its quality from initial implementation to sunset.
MUnit Testing
Before we dive into PBT, let’s review MUnit testing for those that need a refresher.
MUnit uses an example-based testing paradigm. That means we hard code specific inputs, execute the code and then check to ensure the output is as expected. Therefore we need to come up with a set of carefully crafted specific inputs that will properly exercise the code.
If you can think of enough examples that will test all the possible states of your code, and you write enough of them, then you’ll have a good test suite.
Unfortunately that is usually easier said than done, because most developers don’t understand their code well enough to determine all the permutations of inputs that could result in failed tests. The consequence is that this type of testing covers a small (sometimes insignificant) portion of the code.
Additionally, many edge cases are missed because the developer did not think of them – and therefore did not create a test to validate them. So if you can understand the types of bugs that may affect your code, you can create example-based tests for them. However, if you can’t anticipate a particular class of bug, you won’t think to write a test for it and that bug will elude your test suite.
On the plus side, example-based tests are easy to write. Just manually create the inputs and check that the output is correct. But a comprehensive test suite may involve hundreds, or thousands, of manually created tests, and this is not something developers are usually champing at the bit to do. So the usual case, at least in the Smallworld realm, is that a minimal set of MUnit tests is written to satisfy a project’s requirements.
In too many cases it becomes not about writing tests to find bugs, but about writing tests to check boxes in order to fulfill a Project Manager’s checklist or contractual obligations.
So example-based tests suffer from two major problems. First, they only test for potential bugs developers can envision and, second, they are tedious to write – so most developers don’t write lots of them.
But that doesn’t mean MUnit tests are bad. Quite the opposite. Properly constructed MUnit tests are very useful and provide an automated way to unit and regression test a codebase. However, we can significantly improve our testing by combining MUnit tests with PBT.
In such a scenario, an MUnit test encapsulates a property-based test and drives the testing process. We continue to write intelligently constructed, example-based MUnit tests, but then we plug in property-based tests to cover the mundane, repetitive and tedious parts. When the property-based test completes, it returns a result (true or false) to the MUnit test that invoked it, and that triggers the MUnit test’s assertion.
Of course we can run property-based tests by themselves, but I feel the most comprehensive testing is achieved by combining the two testing paradigms.
Property Based Testing
With our MUnit refresher out of the way, it’s time to delve into PBT. The paradigm was originally created to automatically test Haskell programs, and the granddaddy of all PBT libraries is QuickCheck. You can find the original paper describing it here.
The idea behind PBT is we focus on properties rather than on specific tests. So instead of manually listing specific examples, we generalize what our code is doing and come up with rules that capture the essence of that code. These rules, if defined correctly, should return TRUE for all valid inputs passed to the code under test.
A testing library then generates the appropriate random data and passes them to the rules, which run the tests. This is quite different from example-based tests, where the examples you write indirectly imply what rules your code follows. With PBT, you explicitly define the rules and test by having the test library throw tons of data at them.

But as I mentioned, encoding rules that define a piece of code’s behaviour is very difficult for most people. However if you’re able to define these rules, everything else is done automatically and you can run a million tests just as easily as you run one.
Now since the core idea, and most difficult part, of PBT is writing good rules, let’s take a moment to discuss them.
A rule is simply some code that should always remain true for the code under test. Once rules are defined, the PBT library generates data using built-in or custom generators, passes the generated arguments to the code under test and records the results.
But because rules are tough to create, sometimes it’s worthwhile to start by writing a number of standard example-based tests and looking for general patterns. Once we spot a pattern, we simply encode it and we have a rule.
Another potential trap: make sure the rule does not use the same algorithm as the code under test. We want to ensure the results of the code under test are correct, and to do that effectively we need to confirm them using a different method of computing the result from the same arguments.
It’s similar to taking a measurement with both a tape measure and a laser distance-measuring tool. If the two agree, we gain confidence the measurement is right. But if we take both measurements with the same tool and that tool is calibrated incorrectly, the measurements will be wrong and we won’t know it, because they will still agree.
So we write rules that should be true for our code, generate tons of random data and execute the code under test. If a rule fails, we determine if the rule is wrong or if the code is wrong. Then we modify the incorrect piece and repeat the process. Writing property-based tests is an iterative process that logically sneaks up on the final testing solution.
Properties
When most people learn a skill, they’re usually bad at it to begin with. Sure, there are some geniuses or phenoms born with an innate ability, but they are few and far between. Most of us have to work to develop the skill.
Writing properties is no different.
You might be a person who can look at some code and immediately create an associated property… but I doubt it.
No, in all probability, you’ll struggle to create properties at first. But, as with most things, as you gain experience, the task will become easier. However you have to put in the work in order to reap the rewards.
So how do you go about moving from having a vague idea of how your code behaves to encoding specific behaviour into well-defined properties?
There are a few tricks of the trade, so to speak.
You can:
- Generalize properties based on traditional example-based tests.
- Model your code.
- Find invariants.
Let’s look at each of these in more detail.
Generalizing Example Based Tests
As I previously mentioned, when you’re stuck, simply start writing example-based tests. Then look for patterns that emerge – such as common steps you took to come up with each example test. Once you recognize a pattern, encode it into a predicate function (we’ll get to predicates in a moment).
Each time we encode a pattern, we increase our code coverage because once we have a property, the library can then generate as many test cases as we want (even thousands or millions) rather than simply testing the relatively few examples we came up with.
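For instance, here’s a hedged sketch of how that generalization might look. The name concat_size_test_pred is made up, and the predicate is written in the MagikCheck style we’ll see shortly (ignore the verdict argument for now; it’s explained below). Examples such as "ab" + "c" = "abc" and "x" + "yz" = "xyz" generalize into a rule: the size of a concatenation equals the sum of the sizes.
#
# CONCATENATION SIZE PROPERTY (sketch):
# for all strings a and b, (a + b).size = a.size + b.size
#
concat_size_test_pred <<
    _proc @concat_size_test_pred(verdict, a, b)
        # the predicate returns its Boolean result via the verdict procedure.
        _return verdict((a + b).size = a.size + b.size)
    _endproc
$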
Modelling
In this instance you write a simple implementation of the code under test, using a different algorithm. That code doesn’t have to be efficient or pretty; it just needs to work – so the simpler the better.
You then feed both implementations the same arguments and check to see if they produce identical results.
Once your model is running, you can optimize the code under test as much as necessary and, as long as the results match, you can be fairly confident your optimized code is correct.
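For example, here’s a sketch only: fast_sum() is a hypothetical, optimized procedure under test that totals the elements of a vector, and the model is the most naive loop imaginable.
#
# MODEL TEST (sketch): compare a hypothetical, optimized fast_sum()
# against a naive reference implementation of the same calculation.
#
sum_model_test_pred <<
    _proc @sum_model_test_pred(verdict, p_vector)
        # the model: the simplest thing that could possibly work.
        _local l_model_total << 0
        _for l_elem _over p_vector.elements()
        _loop
            l_model_total +<< l_elem
        _endloop
        # both implementations must agree for every generated vector.
        _return verdict(fast_sum(p_vector) = l_model_total)
    _endproc
$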
A form of modelling, called a Test Oracle, can be used when upgrading systems. Let’s say we are upgrading Smallworld GIS from version 4.3 to 5.x. We can use the current version (4.3) as the model and compare results against the 5.x version. The current implementation acts as a pre-written (and supposedly battle-tested) model.
So when we upgrade the code, we randomly generate the appropriate data, feed them to both systems, and check that the results are identical.
Invariants
When faced with a complex system, or even a simple one that is difficult to reason about, it usually makes sense to break it down into simpler components. Then we can write simpler properties for these components. We call the simplest of these properties invariants. Once we’ve individually tested all of them, we assume the entire system works correctly.
Note this may not always be the case, since the act of combining components may introduce errors (that’s what integration tests are for). However, it gets us a good deal of the way towards a solution and is a great deal better than spinning our wheels, not knowing how to start, because we can’t reason about the code as a whole.
For non-trivial pieces of code it may be difficult to come up with one or two properties that cover the entire code so, just like when developing software, decompose the problem and solve one sub-problem at a time.
For example, if we’re trying to test a complex interface that exposes a REST API endpoint via GSS, retrieves a JSON payload from an HTTP body, processes that payload using some sort of business logic and then writes the resulting records to the VMDS, it might be difficult to come up with properties that apply to the interface.
However if we break the interface into its logical components then we may be able to write properties to test that the REST API can be invoked, ensure the JSON payload can be retrieved from the HTTP body, validate the business logic and confirm when we write a record to the VMDS we can read it back and it’s the same as what was inserted.
Once we’ve tested each component, our confidence rises that the entire interface works as expected. We view the interface as a chain of components, so if we can validate each component in the chain, we can reasonably assume the chain is not broken.
Of course we should write multiple properties, each representing an invariant, for each component. Then, if they all pass, we gain more and more confidence that our components are correct and thus our interface is correct. In effect, we build an effective test suite by creating many simple properties that, when acting together, form a nearly impenetrable barrier against bugs. While each individual property might not guard against most bugs, all of them working together do.
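The write-then-read-back check mentioned above is a classic round-trip invariant. As a sketch (encode_payload() and decode_payload() are hypothetical stand-ins for whatever serialization or persistence step your component performs), the property is simply that decoding what we just encoded gives back the original value.
#
# ROUND-TRIP INVARIANT (sketch): whatever we write out we must be able
# to read back unchanged; encode_payload()/decode_payload() are
# hypothetical stand-ins for the component under test.
#
round_trip_test_pred <<
    _proc @round_trip_test_pred(verdict, p_value)
        _return verdict(decode_payload(encode_payload(p_value)) = p_value)
    _endproc
$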
MagikCheck
Okay, with that preamble out of the way, let’s get to the meat of the matter and specifically look at how we can use PBT with Magik. The first thing we need is a Magik PBT library.
MagikCheck is such a library. I wrote it to enhance testing in my projects but it can be used in a variety of different ways.
First, let’s define a simple piece of code to test.
add << _proc @add(a, b)
    _return a + b
_endproc
$
See? I wasn’t kidding was I? That’s about as simple as you can get. Yet this code is about to teach you how to define properties.
Side Note
I’m using this simple procedure because it’s, well, simple. However keep in mind you would not normally test something so simple because…
(1) it’s simple enough to fully understand so we just assume it’s correct and,
(2) the + operator is built into Magik. Therefore we assume it’s been fully tested.
But it makes for a good example exactly because of this simplicity.
Onward…
MagikCheck takes a completely different approach compared to example-based testing. Rather than hardcoding specific inputs, developers make claims about the code under test that should be true for all possible inputs.
But what claim can we make about add()?
Well, if we think back to what I wrote earlier, let’s try to generalize a few example-based tests by listing specific cases and seeing if we can discern a pattern.
Here goes…
1 + 2 = 3
2 + 1 = 3
45 + 5 = 50
5 + 45 = 50
100 + 10 = 110
10 + 100 = 110
Now I don’t know about you, but I’m beginning to sense a pattern here… it appears the order of the operands doesn’t matter, so let’s encode that into a property.
#
# COMMUTATIVE PROPERTY TEST:
#
add_commutative_test_pred <<
    _proc @add_commutative_test_pred(verdict, a, b)
        # a + b = b + a
        _return verdict(add(a, b) = add(b, a))
    _endproc
$
What we’ve done is claim that for all inputs (a and b), add(a,b) = add(b,a) — or, in regular notation, a + b = b + a. If this claim fails, then we know our add() procedure is wrong.
If the claim holds for all inputs we pass in, we say the claim is possibly true. If the claim fails for at least one set of inputs, we say the claim is definitely false. The more inputs we pass to the claim without invalidating it, the more confidence we have that our claim is true — and by extension, that our code is correct. So testing a million sets of inputs gives us more confidence the claim is true than testing just one set.
In MagikCheck, claims depend on properties that are implemented by procedures we call predicates (these are the rules we were talking about earlier). Predicates are just Magik procedures that can do anything allowed in Magik. The only requirement is they return TRUE if the property test passes and FALSE if it fails.
Also note the predicate takes a verdict procedure as its first argument. This is used in the _return statement to return the Boolean result from the predicate. I’m not going to say anything else about the verdict because you don’t need to understand it right now. Just keep in mind your predicate procedure must use the verdict when returning its result, and the argument to the verdict must be (or evaluate to) a Boolean value.
Side Note
Since predicate procedures are just Magik procedures, they can be named anything; however, the convention is to append the _test_pred suffix to the name.
I’d strongly recommend following this convention because it makes understanding (when you’re looking through dozens of tests) much easier.
MagikCheck Library
We’ve now defined the code we want to test as well as a predicate that implements our property.
But how do we run the actual test?
That’s where the library comes in. MagikCheck is delivered as a module that is part of the MagikFP library. When you load MagikCheck, MagikFP will automatically be loaded — which will define the global FP object used to access all MagikFP functionality.
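As a minimal sketch of what loading looks like at the prompt (the module name :magik_check is an assumption — check the product definition in your environment):
# load MagikCheck; MagikFP is pulled in automatically as described above.
# the module name :magik_check is an assumption.
sw_module_manager.load_module(:magik_check)
$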
Let’s now define a procedure that contains the code necessary to run our tests.
mc_run <<
    _proc @magik_check_run()
        _constant B << beeble_object.new()

        # logging procedure
        B.console <<
            _proc @magik_check_console(p_value)
                print(p_value)
                write(newline_char)
            _endproc

        # configuration.
        B.config << beeble_object.new({
            # level of detail to report: 0 = none: no report.
            #                            1 = terse: show pass scores.
            #                            2 = failures: show individual cases that fail.
            #                            3 = classification: show all classification summaries.
            #                            4 = verbose: show all cases.
            :detail, 4,
            # callback for the report.
            :on_report, B.console,
            # callback for each passing trial.
            :on_pass, B.console,
            # callback for each failing trial.
            :on_fail, B.console,
            # callback for each trial that didn't deliver on time.
            :on_lost, B.console,
            # callback for the summary.
            :on_result, B.console,
            # time (in ms) each trial must complete by or it will be considered LOST.
            :time_limit, 500,
            # number of trials performed for each claim.
            :nr_trials, 10
        })

        # NOTE: magik_check_constructor is defined on the MAGIK_CHECK object,
        #       so we inherit its methods by setting our prototype to that object.
        B.prototype << FP.magik_check

        # get an instance of magik_check by calling the constructor.
        _local l_chk << B.magik_check_constructor()

        # register the claim.
        l_chk.claim("Check for Commutative Property", add_commutative_test_pred, {l_chk.integer(10000), l_chk.integer(10000)})

        # run the check.
        l_chk.check(B.config)
    _endproc
$
Note that both MagikCheck and MagikFP make liberal use of the lightweight, prototypal beeble_object, so if you’re not familiar with it, feel free to read about it here.
The claim() call registers the claim (encoded in the add_commutative_test_pred procedure). This tells the library to use this predicate when it’s generating random data and running tests.
And speaking of data, you’ll notice the third argument to claim() is a simple_vector. This holds handles to the generator functions used to create new random data for each trial. We’ll discuss generators later, but for now, just know these particular generators create random integers between 1 and 10,000 inclusive.
The call to check() is what actually kicks off the execution of our tests. Note the check() method is passed the configuration object we defined at the top of the procedure. Also note the configuration specifies we want 10 trials run (:nr_trials), so MagikCheck will generate 10 sets of random data and invoke the code under test 10 times — once with each set of data.
And that’s it. We’ve done enough to run our first property-based test.
Magik> mc_run()
property_list:
:args "3801, 7901"
:claim proc the_claim(p_register,p_serial)
:name "Check for Commutative Property"
:predicate proc add_commutative_test_pred(verdict,a,b)
:serial 1
:signature "proc integer, proc integer"
:verdict proc verdict(p_result)
:pass True
property_list:
:args "2701, 3001"
:claim proc the_claim(p_register,p_serial)
:name "Check for Commutative Property"
:predicate proc add_commutative_test_pred(verdict,a,b)
:serial 2
:signature "proc integer, proc integer"
:verdict proc verdict(p_result)
:pass True
property_list:
:args "3101, 9701"
:claim proc the_claim(p_register,p_serial)
:name "Check for Commutative Property"
:predicate proc add_commutative_test_pred(verdict,a,b)
:serial 3
:signature "proc integer, proc integer"
:verdict proc verdict(p_result)
:pass True
property_list:
:args "9101, 701"
:claim proc the_claim(p_register,p_serial)
:name "Check for Commutative Property"
:predicate proc add_commutative_test_pred(verdict,a,b)
:serial 4
:signature "proc integer, proc integer"
:verdict proc verdict(p_result)
:pass True
property_list:
:args "8301, 3701"
:claim proc the_claim(p_register,p_serial)
:name "Check for Commutative Property"
:predicate proc add_commutative_test_pred(verdict,a,b)
:serial 5
:signature "proc integer, proc integer"
:verdict proc verdict(p_result)
:pass True
property_list:
:args "4801, 7701"
:claim proc the_claim(p_register,p_serial)
:name "Check for Commutative Property"
:predicate proc add_commutative_test_pred(verdict,a,b)
:serial 6
:signature "proc integer, proc integer"
:verdict proc verdict(p_result)
:pass True
property_list:
:args "8101, 5401"
:claim proc the_claim(p_register,p_serial)
:name "Check for Commutative Property"
:predicate proc add_commutative_test_pred(verdict,a,b)
:serial 7
:signature "proc integer, proc integer"
:verdict proc verdict(p_result)
:pass True
property_list:
:args "7001, 7401"
:claim proc the_claim(p_register,p_serial)
:name "Check for Commutative Property"
:predicate proc add_commutative_test_pred(verdict,a,b)
:serial 8
:signature "proc integer, proc integer"
:verdict proc verdict(p_result)
:pass True
property_list:
:args "1201, 7801"
:claim proc the_claim(p_register,p_serial)
:name "Check for Commutative Property"
:predicate proc add_commutative_test_pred(verdict,a,b)
:serial 9
:signature "proc integer, proc integer"
:verdict proc verdict(p_result)
:pass True
property_list:
:args "3401, 3701"
:claim proc the_claim(p_register,p_serial)
:name "Check for Commutative Property"
:predicate proc add_commutative_test_pred(verdict,a,b)
:serial 10
:signature "proc integer, proc integer"
:verdict proc verdict(p_result)
:pass True
property_list:
:pass 10
:fail 0
:lost 0
:total 10
:ok True
As you can see, MagikCheck ran 10 different test cases. In the first property_list, :args shows the random arguments generated for case #1 (i.e. the integers 3801 and 7901).
:name shows the name of the property test that was run (this was specified when we registered the claim), :predicate shows the predicate procedure (also specified during claim registration) and :pass shows the result of the test (in this case it passed because 3801 + 7901 = 7901 + 3801).
The format is the same for each of the 10 cases, but note the randomly generated arguments are different in each case and serial is a counter that indicates the case number being run.
Finally, a summary is displayed: :pass shows the number of passing tests, :total shows the total number of tests executed and :ok shows whether all tests passed (in this case they did).
Of course we could have run a thousand tests, or a million, just by changing nr_trials in the configuration object. And therein lies the beauty of PBT: once you’ve defined your properties (i.e. predicates), you can run as many or as few test cases as you want. And each time you run another set of tests, new random data are generated and passed to the code under test.
Defining Multiple Properties
So our tests all passed and we’re good, right?
Not so fast bucko!
We defined one property that appears to hold true for our add() procedure. However that isn’t enough because this property does not uniquely test addition. There are other operators that are also commutative (such as multiplication).
Take a look at what happens when we change add() to return an incorrect result.
add << _proc @add(a, b)
    _return a * b
_endproc
$
We’ve changed the procedure to return the product of its arguments rather than the sum. This is clearly wrong for something called add(). So let’s run the tests again and see what happens.
Magik> mc_run()
property_list:
:args "8301, 4401"
:claim proc the_claim(p_register,p_serial)
:name "Check for Commutative Property"
:predicate proc add_commutative_test_pred(verdict,a,b)
:serial 1
:signature "proc integer, proc integer"
:verdict proc verdict(p_result)
:pass True
property_list:
:args "4201, 8801"
:claim proc the_claim(p_register,p_serial)
:name "Check for Commutative Property"
:predicate proc add_commutative_test_pred(verdict,a,b)
:serial 2
:signature "proc integer, proc integer"
:verdict proc verdict(p_result)
:pass True
.
.
.
property_list:
:pass 10
:fail 0
:lost 0
:total 10
:ok True
I only showed the first two of the 10 trials, but as you can see from the summary, all tests passed because multiplication is also commutative. This is unacceptable. add() is clearly returning an incorrect result, but the tests all pass.
What should we do?
At first glance you might think we need to redefine our property and make it more complex in order to properly test add().
But…
Remember when we spoke about invariants?
Ahhh, yes… that might be a better way to go.
So rather than modifying our first test — which is still valid because addition is commutative and this test verifies commutativity — we could add an additional test.
But what test should we add?
If you’re drawing a blank, go back to our idea of generating examples and looking for patterns, perhaps something like this…
0 + 2 = 2
2 + 0 = 2
0 + 5 = 5
5 + 0 = 5
100 + 0 = 100
0 + 100 = 100
Once again a pattern emerges. This time we note that adding 0 to any other number simply gives us that other number. We also note this doesn’t work for multiplication (for example 0 * 2 = 0, not 2). So we might be on to something here.
Let’s encode this in a predicate…
#
# IDENTITY PROPERTY TEST:
#
add_identity_test_pred <<
    _proc @add_identity_test_pred(verdict, a)
        # a + 0 = 0 + a = a
        _local l_sum << add(a, 0)
        _local l_ident? << l_sum = add(0, a)
        _return verdict(l_sum = a _and l_ident?)
    _endproc
$
…and register another claim…
.
.
.
        # register the claims.
        l_chk.claim("Check for Commutative Property", add_commutative_test_pred, {l_chk.integer(10000), l_chk.integer(10000)})
        l_chk.claim("Check for Identity Property", add_identity_test_pred, {l_chk.integer(10000)})
.
.
.
Now when we run MagikCheck, we see the following (again, edited for brevity).
Magik> mc_run()
property_list:
:args "7501, 2401"
:claim proc the_claim(p_register,p_serial)
:name "Check for Commutative Property"
:predicate proc add_commutative_test_pred(verdict,a,b)
:serial 1
:signature "proc integer, proc integer"
:verdict proc verdict(p_result)
:pass True
.
.
.
property_list:
:args "3301"
:claim proc the_claim(p_register,p_serial)
:name "Check for Identity Property"
:predicate proc add_identity_test_pred(verdict,a)
:serial 11
:signature "proc integer"
:verdict proc verdict(p_result)
:pass False
property_list:
:args "201"
:claim proc the_claim(p_register,p_serial)
:name "Check for Identity Property"
:predicate proc add_identity_test_pred(verdict,a)
:serial 12
:signature "proc integer"
:verdict proc verdict(p_result)
:pass False
property_list:
:args "9601"
:claim proc the_claim(p_register,p_serial)
:name "Check for Identity Property"
:predicate proc add_identity_test_pred(verdict,a)
:serial 13
:signature "proc integer"
:verdict proc verdict(p_result)
:pass False
.
.
.
property_list:
:pass 10
:fail 10
:lost 0
:total 20
:ok False
That’s much better. The overall run fails (:ok is False in the summary) because all the identity property tests failed. The 10 commutative tests all passed, but the 10 identity tests all failed (:pass is 10 and :fail is 10 in the summary). Since at least one test in our suite failed, the suite as a whole failed.
If we had encapsulated this in an MUnit test, that overall result (i.e. false) would be returned to the MUnit test, causing the MUnit test to fail too.
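As a sketch of what that wiring could look like (everything here is illustrative: it assumes MUnit’s test_case exemplar with an assert_true() assertion, and that mc_run() is tweaked to _return the value of l_chk.check(B.config) so the overall Boolean result comes back to the caller):
# Illustrative sketch only: an MUnit test that drives the property-based run.
# Assumes mc_run() has been modified to _return the overall result of
# l_chk.check(B.config), and that test_case provides assert_true().
def_slotted_exemplar(:add_property_tests, {}, {:test_case})
$

_method add_property_tests.test_add_properties
    # run all registered claims; l_ok? is true only if every trial passed.
    _local l_ok? << mc_run()
    self.assert_true(l_ok?, "add() property-based checks failed")
_endmethod
$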
Now let’s revert our add() procedure to the correct implementation.
add << _proc @add(a, b)
    _return a + b
_endproc
$
…and re-run our test suite.
Magik> mc_run()
property_list:
:args "1, 9401"
:claim proc the_claim(p_register,p_serial)
:name "Check for Commutative Property"
:predicate proc add_commutative_test_pred(verdict,a,b)
:serial 1
:signature "proc integer, proc integer"
:verdict proc verdict(p_result)
:pass True
.
.
.
property_list:
:args "5201"
:claim proc the_claim(p_register,p_serial)
:name "Check for Identity Property"
:predicate proc add_identity_test_pred(verdict,a)
:serial 11
:signature "proc integer"
:verdict proc verdict(p_result)
:pass True
property_list:
:args "801"
:claim proc the_claim(p_register,p_serial)
:name "Check for Identity Property"
:predicate proc add_identity_test_pred(verdict,a)
:serial 12
:signature "proc integer"
:verdict proc verdict(p_result)
:pass True
.
.
.
property_list:
:pass 20
:fail 0
:lost 0
:total 20
:ok True
All our tests passed! Therefore we’re gaining confidence our code is correct.
But… can we do more?
Of course. Let’s see if we can add another claim based on the modelling example we discussed earlier.
Recall that we want to write a different implementation of the code under test so we can run both and compare the results.
So let’s do that now.
#
# SUCCESSOR PROPERTY TEST:
#
add_successor_test_pred <<
    _proc @add_successor_test_pred(verdict, a, b)
        # Wikipedia states: within the context of integers, addition of one also
        # plays a special role: for any integer a, the integer (a + 1) is the least
        # integer greater than a, also known as the successor of a.
        # Because of this succession, the value of a + b can also be seen as the
        # bth successor of a, making addition iterated succession.
        _if b < a
        _then
            # ensure a <= b.
            (a, b) << (b, a)
        _endif
        _local l_successor << 0
        # count the successions from a up to b (i.e. b - a of them).
        _for i _over range(a + 1, b)
        _loop
            l_successor +<< 1
        _endloop
        # b = l_successor + a
        _return verdict(a + l_successor + a = add(a, b))
    _endproc
$
I’ll spare you the details of showing how I registered the claim (it’s the same as previously shown), but here’s a summary of the results…
Magik> mc_run()
.
.
.
property_list:
:args "101, 5401"
:claim proc the_claim(p_register,p_serial)
:name "Check for Successor Property"
:predicate proc add_successor_test_pred(verdict,a,b)
:serial 21
:signature "proc integer, proc integer"
:verdict proc verdict(p_result)
:pass True
property_list:
:args "6900, 9901"
:claim proc the_claim(p_register,p_serial)
:name "Check for Successor Property"
:predicate proc add_successor_test_pred(verdict,a,b)
:serial 22
:signature "proc integer, proc integer"
:verdict proc verdict(p_result)
:pass True
.
.
.
property_list:
:pass 30
:fail 0
:lost 0
:total 30
:ok True
And just like that we have a strong suite of tests that can be automatically run to test add().
But here’s the interesting thing… our three predicates consist of less than 30 lines of code, yet they combine to thoroughly test add(). Even if we decided to run 10,000 or more tests, we would still only need those 30 lines.
I’d also like you to notice how we used the various tricks of generalizing example-based tests, modelling and invariants to create our three properties. Each property alone would not be sufficient to properly test add(), but when used together, they create a formidable test suite.
Generators
The quality of testing depends on the quality of the predicate procedures as well as the volume of randomized input data.
Even stellar predicates won’t help if we use too little data. By default MagikCheck runs 100 trials for each property test, but you may want to adjust this number depending on the code under test.
We’ve already covered predicates, so let’s now turn our attention to automatically generating high-quality random data.
To support this, MagikCheck provides a comprehensive set of built-in generator functions, which are simply Magik procedures that generate random data of a particular type. Here’s a list of them…
Magik> print(fp.magik_check.generators)
{
"sequence":"proc sequence_generator_fn(p_seq)",
"boolean":"proc boolean_generator_fn( optional p_bias)",
"number":"proc number_generator_fn( optional p_from,p_to)",
"string":"proc string_generator_fn( gather p_args)",
"falsy":"proc falsy_generator_fn",
"simple_vector":"proc simple_vector_generator_fn( optional p_first,p_value)",
"rwo":"proc rwo_generator_fn(p_rwo)",
"date":"proc date_generator_fn( optional p_year_start,p_year_end)",
"integer":"proc integer_generator_fn( optional p_i,p_j)",
"any":"proc any",
"character":"proc character_generator_fn( optional p_i,p_j)",
"one_of":"proc one_of_generator_fn(p_collection, optional p_weights)"
}
These are usually sufficient for most testing, but if you need something more customized, you can simply write your own (because they’re just Magik procedures).
One thing to note…
Generators are procedures that return procedures. That may be a new concept to you, but it’s used often in functional programming. I won’t get into why it’s done this way, but just understand MagikCheck requires this format.
The upshot is that to use generators from the Magik prompt, you need to add another set of parentheses at the end in order to invoke the returned procedure. That’s why the extra parentheses are there in the following examples.
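For example, here’s a minimal sketch of a custom generator (even_integer_generator_fn is a made-up name) following that same convention: a procedure that returns a procedure, composed from the built-in integer generator so that every value produced is even.
# sketch of a custom generator: a procedure that returns a procedure,
# composed from the built-in integer generator so every value is even.
even_integer_generator_fn <<
    _proc @even_integer_generator_fn()
        _local l_int_gen << fp.magik_check.generators.integer(5000)
        _return _proc @even_integer()
                    _import l_int_gen
                    # doubling a random integer (1 to 5000) always yields an even number.
                    _return 2 * l_int_gen()
                _endproc
    _endproc
$
As with the built-in generators, you’d invoke it from the prompt with an extra set of parentheses: even_integer_generator_fn()().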
Alright, let’s look at how the built-in generators work.
Character
The character generator returns a single random character by default. You can also provide arguments to constrain the returned characters (as in the last two calls below, which restrict the output to lowercase letters).
Magik> gen << fp.magik_check.generators
a beeble_object
Magik> gen.character()()
%F
Magik> gen.character()()
%?
Magik> gen.character()()
%g
Magik> gen.character("a","z")()
%d
Magik> gen.character("a","z")()
%x
Magik>
String
The string generator returns random strings. You can get strings of a specific length by passing an integer as the first argument. Note that generators can be composed (in the calls below we compose the string and character generators).
Magik> gen.string(8, gen.character("beeblebroxZAPHOD"))()
"oeOZeZHb"
Magik> gen.string(8, gen.character("beeblebroxZAPHOD"))()
"OelHbleo"
Magik> gen.string(8, gen.character("beeblebroxZAPHOD"))()
"eADDObZl"
Magik> gen.string(8, gen.character("beeblebroxZAPHOD"))()
"erHbbOee"
Magik>
In the calls using gen.character("abcdef") below, we compose the integer and character generators (within the string generator) to create random-length strings (1 to 10 characters long) containing characters from the “abcdef” set.
Magik> gen.string()()
"g[7()gUnC"
Magik> gen.string(gen.integer(200,500))()
"449"
Magik> gen.string(gen.integer(1,10), gen.character("abcdef"))()
"cde"
Magik> gen.string(gen.integer(1,10), gen.character("abcdef"))()
"ebebbabbd"
Magik> gen.string(gen.integer(1,10), gen.character("abcdef"))()
"eccdbbaf"
Magik> gen.string(gen.integer(1,10), gen.character("abcdef"))()
"cceaa"
Magik>
Integer
By default the integer generator returns a prime number. However, you can supply arguments to shape the returned integers: a single argument constrains the values to the range 1 to that bound (gen.integer(100) below), while two arguments generate integers between the bounds, inclusive (gen.integer(100,200) and gen.integer(-100,100) below).
Magik> gen.integer()()
101
Magik> gen.integer()()
17
Magik> gen.integer(100)()
23
Magik> gen.integer(100,200)()
193
Magik> gen.integer(100,200)()
138
Magik> gen.integer(-100,100)()
16
Magik> gen.integer(-100,100)()
-30
Magik>
Boolean
By default the Boolean generator returns TRUE 50% of the time and FALSE 50% of the time. However it accepts an argument that biases the return in favour of FALSE if it’s less than 0.5 and in favour of TRUE if greater than 0.5. The lower the value, the more probable FALSE will be returned. The higher the value, the more probable TRUE will be returned.
Magik> gen.boolean()()
False
Magik> gen.boolean()()
True
Magik> gen.boolean(0.2)()
False
Magik> gen.boolean(0.2)()
False
Magik> gen.boolean(0.2)()
False
Magik> gen.boolean(0.2)()
True
Magik> gen.boolean(0.2)()
False
Magik> gen.boolean(0.8)()
True
Magik> gen.boolean(0.8)()
False
Magik> gen.boolean(0.8)()
True
Magik> gen.boolean(0.8)()
True
Magik> gen.boolean(0.8)()
True
Magik>
Falsy
Each time it is invoked, the falsy generator randomly returns one of the values that can be considered false for various data types.
Magik> gen.falsy()()
unset
Magik> gen.falsy()()
unset
Magik> gen.falsy()()
False
Magik> gen.falsy()()
0
Magik> gen.falsy()()
0
Magik> gen.falsy()()
""
Magik>
Simple Vector
The simple vector generator returns a randomly sized simple vector containing integers by default.
Magik> print(gen.simple_vector()())
simple_vector(1,1):
1 139
Magik> print(gen.simple_vector()())
simple_vector(1,1):
1 467
Magik> print(gen.simple_vector()())
simple_vector(1,1):
1 311
Magik> print(gen.simple_vector()())
simple_vector(1,3):
1 89
2 863
3 743
Magik> print(gen.simple_vector()())
simple_vector(1,2):
1 647
2 23
Magik>
This generator can also be composed with others. In the first call below, we generate a simple vector containing a random integer, floating point number and string.
Magik> print(gen.simple_vector({gen.integer(), gen.number(42), gen.string(10, gen.character("A","Z"))})())
simple_vector(1,3):
1 167
2 5.040
3 "CWUNPYIHHZ"
Magik> print(gen.simple_vector({gen.integer(), gen.number(42), gen.string(10, gen.character("A","Z"))})())
simple_vector(1,3):
1 691
2 20.58
3 "AQWPCZFFFK"
Magik> print(gen.simple_vector(5, gen.integer())())
simple_vector(1,5):
1 23
2 283
3 719
4 37
5 43
Magik>
Sequence
The sequence generator is initially seeded with a sequence of values. Then each time it’s invoked, it returns the next value in the sequence. When the end is reached, it wraps back to return the first value and continues indefinitely.
Magik> seq << gen.sequence({:one, :two, "three", 4})
proc sequence
Magik> seq()
:one
Magik> seq()
:two
Magik> seq()
"three"
Magik> seq()
4
Magik> seq()
:one
Magik>
Date
The date generator returns a random date. It also accepts arguments to constrain the date range (as shown in the last two calls below).
Magik> gen.date()()
date(23/02/2010)
Magik> gen.date()()
date(07/03/1983)
Magik> gen.date(2000,2010)()
date(18/01/2009)
Magik> gen.date(2000,2010)()
date(26/05/2002)
Magik>
One Of
The one_of generator randomly returns one of its arguments. Note it can be composed with other generators to return random values of supported types.
Magik> gen.one_of({gen.integer(), gen.number(), gen.string()})()
977
Magik> gen.one_of({gen.integer(), gen.number(), gen.string()})()
"]>u<9z%"
Magik> gen.one_of({gen.integer(), gen.number(), gen.string()})()
"3%\Y"
Magik> gen.one_of({gen.integer(), gen.number(), gen.string()})()
101
Magik> gen.one_of({gen.integer(), gen.number(), gen.string()})()
0.4100
Magik>
Any
The any generator returns a random value of a random type.
Magik> gen.any()()
"a'nt"
Magik> gen.any()()
date(29/11/1990)
Magik> gen.any()()
0.5500
Magik> gen.any()()
3.142
Magik>
Number
The number generator behaves like the integer generator but returns floating point numbers.
Magik> gen.number()()
0.8100
Magik> gen.number()()
0.5700
Magik> gen.number()()
0.2700
Magik> gen.number(10,20)()
19.20
Magik> gen.number(10,20)()
10.20
Magik> gen.number(-10,20)()
-1.000
Magik>
Summary
We’ve covered quite a bit of ground in this article so hopefully you feel equipped to start testing the waters of PBT.
Remember, the goal of any testing suite (PBT or otherwise) is to ensure we deliver high quality code. Sometimes that means we use PBT but, at other times, we might use a different testing methodology (including example-based or even, dare I say it… manual testing). Just because we have access to a testing library, it doesn’t mean we have to use it in all situations. Common sense should be your guide.
If you take one thing away, it should be this: testing is very important. It is not an afterthought and it should be considered before you start writing your first line of code.
If you write code that is easy to test, you will properly test it — code that’s difficult to test won’t be properly tested and that will lead to lower quality software.
I’ll wrap up with three thoughts…
- Testing can prove applications have bugs, but never that they don’t. The more property-based tests we run, the higher our confidence level becomes that our code is good — but we can never be sure it has zero bugs.
- When a test suite passes, testing fails because the goal is not to write passing tests, but to find bugs. And all non-trivial code has bugs.
- If we run enough passing property-based tests, we gain confidence in the quality of our code.