The Problem with rand(3)
Not all random numbers are created equal
Some random numbers are less random than others, and most are less random then you think. This entry outlines some of the flaws in the standard ‘c’ libraries random number generators.
rand(3), rand48(3), random(3)
The trivial ‘c’ approach to generating a random number is to use a function in the standard library. The rand(3)
, rand48(3)
, and random(3)
functions, however all suffer from the same basic flaws: the generator must be seeded, and the output is less random and unpredictable than it appears. These issues among others make rand
and rand
-like functions largely unsuitable for security applications.
Seeding
One of the greatest flaws of rand
and rand
-like functions is their need to be seeded. Seeding a these generators involves using srand(3)
, srand48(3)
, seed48(3)
, or srandom(3)
, which appears straightforward until one considers what initial value to seed these functions with. A common approach is to seed with a static value (0, 1, 42, or 0xdeadbeef
are popular choices), but this produces the same ‘random’ stream of numbers every time the program is run, making the output completely predictable.
A slightly better approach is to seed with time(NULL)
or getpid(2)
. However, time(NULL)
is not only predictable but on the second scale, and getpid
only changes once per run depending on how it is run and what operating system it is used with.
Unfortunately, regardless of what seeding method is used, the algorithm used to generate the actual numbers is still inherently predictable.
How Much Random
Another human-factor concern with rand
and rand
-like functions is how much randomness they actually generate. At worst rand
generates a ‘random’ int
in the range [0
, 0x7fff
], while rand48
and random
a long
in [0
, 0x7fffffff
]. However, this still poses a problem. One would reasonably expect a function which returns a random int
or long
to have its returned value fall in the full range of that datatype, but this isn’t the case. This means for unsigned types rand
will produce only half the possible values, and for signed types it will only produce positive values.
According to research performed by Michael Ligh and presented at DEFCON 17, the Conficker.B Worm used the rand
function to generate target IP addresses. To do this it needed to generate two random numbers in the range [0
, 0xffff
] but due to the deficiencies of rand
instead generated a number in the range [0
, 0x7fff
]. These two random numbers were then used for the first and last half of a target IP address. This error in expectation meant that no IP address with the first and third octets in [0x80
, 0xff
] would be generated. This left approximately half of possible IP addresses un-targetable.
Quality
Traditional random-number generators are entirely deterministic, meaning that if the seed is known every number in the output sequence can be predicted. If the generator is seeded with the time an attacker can estimate the moment (within a second of accuracy) in which srand
was called and generate the same ‘random’ sequence of numbers. Similarly, most operating systems don’t randomize their pid
’s, making this form of seeding entirely predictable.
Even if the pid
is randomized, like in OpenBSD, the quality of traditional algorithms leaves much to be desired. Many implementations of rand
and all implementations of rand48 use a linear-shift feedback register which is fast and easy, but produces predictable output. In the case of rand
it is also known that in many implementations there exists a predictable cyclic pattern to the least significant bits. In my research I discovered a similar pattern regarding the most significant bits, observing little change between generations and the first generated random number of nearby seeds on OS X. For example srand(0x00ff)
through srand(0x00f0)
generates the following for the first call to rand:
0x3d8c90
,0x3dce37
,0x3e0fde
,0x3e5185
,0x3e932c
,0x3ed4d3
,0x3f167a
,0x3f5821
,0x3f99c8
,0x3fdb6f
The most significant bits remain mostly unchanged, with all variations being incremental and minor. This phenomenon occurs with almost any sequential seed, and is unlikely unique to OS X.
Solution?
The best way to seed a random function is to use /dev/urandom
or /dev/random
if the urandom
device isn’t available, though this method has is own problems. If a program runs out of file descriptors (or, more importantly, is made to run out of file descriptors) or is in a chroot(2)
and /dev/random
isn’t available, it is impossible to seed or generate cryptographically secure numbers. Worse still, it is possible to accidentally set the seed as -1 or 0x7fffffff
, which is non-random, and the desired effect for a knowledgeable attacker.
When using OpenBSD, OS X, or other UNIX’s, arc4random(3)
may be available. This function does not suffer from any of the above issues, and is the best solution I have thus far found.
Further Resources
rand(3)
man page, iOS/OS X & OpenBSDrand48(3)
man page, iOS/OS X & OpenBSDrandom(3)
man page, iOS/OS X & OpenBSD- Examples of bad random
- Theo’s Random Hunt
- DEFCON 17 - Making fun of your Malware by Michael Ligh
License
Any source in this article is released under the ISC License.