I have some good news and some bad news. What was that? You want the bad news first? Okay…
There is no way to generate truly random numbers outside of quantum physics.
But what about rolling dice?
Nope.
Rolling dice only appears random because we can’t measure all the various physical factors that produce one number or another. If we could measure a die’s exact shape, the angle it’s thrown at, its acceleration, friction, surface attributes of what it will land on and myriad other aspects, then we would be able to predict the outcome. It’s the same for other physical events too, such as lottery balls and casino games.
As Henri Poincare said, “If we know the causes we can predict the effects. What is chance for the ignorant is not chance for the scientist. Chance is only the measure of our ignorance.”
Even things like atmospheric noise (used by sites such as random.org) or radioactive decay are not truly random, we just don’t have the knowledge and technology to model these phenomena.
And that brings us to the good news.
Since we are ignorant in so many ways, things that aren’t truly random appear random enough to be useful for important things such as cryptography and lotteries.
So let’s explore ways to create pseudo-random numbers in Magik.
Pseudo-random Number Generators
Every computer system worth its salt has some sort of pseudo-random number generator (PRNG) that’s usually good enough for us.
PRNGs depend on an initial value called the seed. If you give it the same seed, it will produce the same sequence of, “random,” numbers. Take a look at the code below.
_global test_seed
test_seed <<
_proc @test_seed(p_seed)
_constant PRNG << random.new(100, p_seed)
_for i _over range(1,10)
_loop
write(PRNG.get())
_endloop
_endproc
$
This code takes a seed argument and uses it to generate a PRNG. Then it writes the first 10 values from the PRNG.
Here are the results of a few runs.
Magik> test_seed(10)
13
80
93
90
46
56
97
88
81
14
Magik> test_seed(10)
13
80
93
90
46
56
97
88
81
14
Magik> test_seed(11)
38
68
11
55
33
7
40
33
93
7
Magik> test_seed(11)
38
68
11
55
33
7
40
33
93
7
Magik> test_seed(10)
13
80
93
90
46
56
97
88
81
14
Magik>
$
See how the sequence of numbers generated are identical when we use the same seed (i.e. 10) in lines 1, 13 and 49? Also note the sequences are identical when we use a seed of 11 in lines 25 and 37.
That’s because the PRNG algorithm uses the seed to generate numbers in a deterministic fashion. What’s nice, however, is the PRNG’s generated sequences have properties that approximate random numbers – basically each element in the sequence appears equally probable and is independent of the previous elements.
But the important point to remember is to avoid using the same seed if we want better random numbers.
So given its importance, how can we generate an appropriate seed?
Fortunately Magik has built in methods we can combine to give us a good seed.
Here’s some code.
_global test_seed_gen
test_seed_gen <<
_proc @test_seed_gen()
_constant DIGEST << system.sha256_digest(date_time.now(:milliseconds))
_constant SEED << write_string("16R", DIGEST.slice(1,9)).evaluate()
_constant PRNG << random.new(100, SEED)
_for i _over range(1,10)
_loop
write(PRNG.get())
_endloop
_endproc
$
Notice how we use the date_time.now() method with millisecond resolution, in line 5, to generate a value that can be used as input into system.sha256_digest().
I’ve discussed sha256 elsewhere, so I won’t get into it here, but check out this article if you’re interested in the details.
It’s important to use the output of date_time.now() directly as the input to sha256_digest() without first converting it to a string – because, by default, converting to a string results in one-second resolution and thus invoking the code multiple times within a second will give the same output.
Fortunately millisecond resolution is enough in all Smallworld environments, outside of running on a Supercomputer, to produce different outputs even when invoking in a tight loop.
In line 6 we then convert the digest (which is a 64-character hexadecimal string) to a decimal number using the evaluate() method (but using only the first 9 digits to avoid getting a bignum) and use it to seed the PRNG in line 7.
Here are the results of running the procedure.
Magik> test_seed_gen()
41
0
30
41
96
64
51
24
72
47
Magik> test_seed_gen()
23
91
28
48
29
68
55
67
48
86
Magik>
$
As you can see, now that we’re using different seeds, the sequences of generated numbers are different.
This is sufficient to produce pseudo-random numbers in one thread of execution. However if we want to ensure our numbers don’t collide when generated for different threads, servers and users, we can add these attributes when the seed is generated.
Generating Unique IDs
Since we want our seed to be unique, let’s start by generating a unique value. A side effect of this is that we can also use this code for other purposes, such as creating unique IDs or keys for example.
So let’s create a procedure to generate a unique ID.
_global unique_id
unique_id <<
_proc @unique_id(_optional p_prefix)
# P_PREFIX: a prefix that will prepended to the result.
_if ~(l_prefix << p_prefix.default("")).empty?
_then
l_prefix << write_string(l_prefix, "_")
_endif
_constant HIGH_ORDER << system.sha256_digest(date_time.now(:milliseconds)).write_string.slice(1,32)
_constant LOW_ORDER << system.sha256_digest(write_string(system.host_name, system.user_name, _thisthread.name)).slice_to_end(33)
_return write_string(l_prefix, HIGH_ORDER, LOW_ORDER)
_endproc
$
In line 12 we generate a sha256 digest from date_time.now(), with millisecond resolution, and then in line 14 generate another digest from the other attributes of interest. Each digest produces a 64-hexadecimal-digit value.
We then take the high order 32 hex digits from the first digest and the low order 32 hex digits from the second digest and combine them to form a new 64-digit hex number, which is our unique ID (note there’s also an option to provide a prefix, but we’ll ignore that for the moment).
So now we have a value that is unique across servers (if you want more uniqueness, add additional attributes to the LOW_ORDER input). And since it’s unique, using it as the seed for our PRNG ensures we get a different stream of, “random,” numbers each time the code is invoked.
Here’s the code for a PRNG factory that returns a procedure that generates pseudo-random numbers.
_global prng_factory <<
_proc @prng_factory(_optional p_range)
_constant AS_DECIMAL? << _if p_range _is _unset
_then >> _true
_else >> _false
_endif
_constant SEED << write_string("16R", unique_id().slice(28,36)).evaluate()
_constant PRNG << random.new(p_range.default(100), SEED)
_return _proc @prng()
_import PRNG
_import AS_DECIMAL?
_if AS_DECIMAL?
_then
_return (PRNG.get() / 100).as_float
_else
_return PRNG.get()
_endif
_endproc
_endproc
$
In lines 4, 9 and 11 we create private constants to hold the SEED (line 9) and PRNG (line 11). We also set up a flag in line 4 to indicate when we need to return decimal values between 0 and 0.99, inclusive.
Note these constants are held in a closure and PRNG plus the AS_DECIMAL? flag are accessed in the prng procedure (using the _import statement) that is returned from the factory in line 14.
We create a unique seed, in line 9, by converting the hexadecimal value (base 16) to decimal using the evaluate() method.
However if we do the conversion on the entire 64-digit hex value, returned from unique_id(), the result will be a bignum rather than an integer. And since there is a maximum limit to the seed accepted by random.new(), we take a subset of the hex value (ensuring it contains part of the high order and part of the low order digits generated by unique_id()) in order to scale the seed to an appropriate integer.
This results in a PRNG that should produce pseudo-random numbers that are different regardless of where and when you call it.
Here’s a procedure to test our PRNG.
_global test_random <<
_proc @test_random(_optional p_range)
_constant PRNG << prng_factory(p_range)
_for i _over range(1,10)
_loop
write(PRNG())
_endloop
_endproc
$
And here’s the output from a few executions.
Magik> test_random(100000)
4190
6256
33000
36760
85615
66752
59416
12812
13751
4259
Magik> test_random(100000)
72623
7491
75477
99000
68408
10411
73917
42179
51513
46758
Magik> test_random()
0.05000
0.1500
0.8700
0.8700
0.9000
0.03000
0.9500
0.7100
0.4600
0.4200
Magik> test_random()
0.7900
0.1300
0.6500
0.6900
0.1200
0.8100
0.4900
0.5700
0.2800
0.3200
Magik> test_random(1000)
51
615
289
308
292
703
585
183
357
154
Magik> test_random(1000)
66
50
560
959
827
58
24
370
508
197
Magik>
$
Note the sequences of pseudo-random numbers are different — even when the same range is requested. That’s because we’re using different seeds.
Of course if you save the seed, you can reproduce the sequence. And sometimes that’s what you want to do. For example, if you’re running property-based tests and want to ensure your, “randomly,” generated attributes can be replicated in the future, just save the generated seed and you’ll be able to exactly reproduce the sequence of numbers upon which your attributes were generated.
Truly Random Numbers
And if you need even more randomness, random.org exposes an API over the Internet.
You can generate as close to truly random numbers as possible as long as your Smallworld session can use the HTTP POST method (e.g. via GSS or the HTTP Magik wrapper class).
For most use cases, the techniques we’ve discussed for PRNGs should suffice. For more challenging requirements, using an external site to source your random numbers will cover a wider range of applications.