Attacking Go's Lagged Fibonacci Generator
Dylan Katz
In 1969, Donald Knuth published the second volume of The Art of Computer Programming. One of the topics covered was random number generation, in which Knuth described several algorithms that take varying approaches to generating pseudo-random numbers. Knuth described an unpublished additive generator suggested by G.J. Mitchell and D.P. Moore in 1958 (Knuth, 1997, p. 27).
In the late 90s, Jim Reeds and Don Mitchell developed an interesting extension of this algorithm for an early version of UNIX. Eventually, the algorithm found its way into Plan 9 (Mitchell & Reeds, n.d.), and finally landed as Goβs default random source in 2008.
More than a decade later, I stumbled upon a program using Goβs random function to generate secrets. I wanted to determine if I could predict these values, and what the impact of using Goβs default non-secure random source was. However, I found no documentation on what algorithm was used aside from a vague comment in the source:
βalgorithm by DP Mitchell and JA Reedsβ
This led me down a rabbit hole that left me with two things: the seed of a random number generator and a story about a very old algorithm.
Go explicitly warns against the practice of using their standard pseudo-random number generator (PRNG) for security-sensitive contexts. Like most languages, it provides both a standard PRNG (the βmath/randβ package), and a cryptographically secure PRNG (the βcrypto/randβ package) intended for security-sensitive operations.
To dissuade misuse and encourage good practice, itβs important for the security and software development community to discuss the implications of using standard PRNGs for security-sensitive contexts. Many PRNGs expose enough information for their internal state to be reconstructed using a small set of outputs. Knowledge of an algorithmβs state may allow observers to recreate βrandomβ numbers generated by the algorithm. This allows for a class of attacks known as state recovery attacks.
Go uses an algorithm known as a lagged Fibonacci generator (LFG). The algorithm has been around for quite some time. The algorithm has some long-known statistical flaws that make it less appealing to use when compared to other PRNGs (Marsaglia, 1984), (Ziff, 1998). In addition to these flaws, LFGs expose internal state as output, creating security flaws (as youβd expect in a non-cryptographically secure PRNG).
Background
Before attacking anything, it is important to understand how it works. Letβs revisit the algorithm by Mitchell and Moore mentioned earlier:
βG.J. Mitchell and D.P. Moore β¦ suggested the somewhat unusual sequence defined by:
where m is even, and where are arbitrary integers not all even.β (Knuth, 1997, p. 27)
As Knuth explained, the algorithm begins with a set of 55 arbitrary (but not all even) integers, then generates values by summing two previous values in the sequence. This creates a relationship comparable to the Fibonacci sequence (where each value is the sum of the two previous values). Because the operation is addition, the algorithm is an example of an Additive Lagged Fibonacci Generator (ALFG). LFGs can also use multiplication or binary exclusive OR (XOR) as their operation. If youβre familiar with them, you can think of LFGs as large, two-tap linear feedback shift registers. The π and π, respectively). They have a significant impact on the behavior of the algorithm and are generally selected from a set of known good values. Go uses a pair of (π=273, π=607), a modulus of 263β1, and addition as the operation, so the ALFG used by Go is defined by this function:
For the initial state of π0,β¦π607 values, Go uses a vector of pre-generated values from a linear congruential generator after 780β1010 iterations (Reeds J. , 2014).
If a new seed is set for the LFG, a linear congruential generator (LCG) is created with the provided seed, then the state values are transformed with outputs from the LCG. This is a Lehmer RNG using Schrageβs method (Schrage, 1979) to generate a set of values over which an exclusive OR (XOR) operation is performed with the pre-generated values. This use of an LCG seeded with a user-provided number to manipulate the state of the LFG appears to be a novel technique developed by Reeds and Mitchell.
After the initial state is defined and seeded, we can begin to generate values. As mentioned, the algorithm has two indices, π and π, which are initialized to π = 0 and π = 334. Then, when a new value is requested, Go subtracts 1 from these indices, setting them back to 607 if they reach zero, and performs the addition/modulus operations for each set of two values in the state at indices π and π.
Impact of State Disclosure
Many PRNGs expose information about their internal state with varying degrees of security impact. Not all state disclosure is of the same significance, and some algorithms do a better job of concealing internal state. There are a few factors that influence the impact and practical applicability of state disclosure attacks such as repeatability, invertibility, and the amount of state disclosed per value.
ΒRepeatability is the ability to recreate the sequence generated from a specific set of inputs. If state is modified with values that arenβt exposed to observers, it may become significantly more difficult to reconstruct it. For instance, if the internal state of an algorithm is frequently modified with numbers from a true random source, any exposed state would only be known to observers for a short time. This may mean that itβs not possible to recover the entire state, or that it would only be useful for a short time.
Another factor is invertibility. If an algorithm can be inverted, this grants observers the ability to generate historical values given the current state of the algorithm. This increases the amount of information disclosed and could lead to attackers reconstructing generated secrets before a system was compromised. (O'Neill, 2014)
Finally, the number and continuity of outputs required to disclose state may significantly influence the exploitability of state disclosure attacks. Particularly when observers are accessing an algorithm over the network or are attempting to recover the state of algorithms that are updated frequently, recording several sequential outputs may not be feasible or may take significantly more resources.
ΒAttacks Against StateΒ
Okay, wow, thatβs a lot of information! Now that weβve discussed the history and behavior of this system, we can talk about how it breaks down. State recovery attacks against ALFGs are less complicated than the algorithm itself.
If π is the size of our state, after the initial π values the state of Goβs ALFG is composed solely of past outputs. This is just the nature of a Fibonacci sequence. All values are eventually the sum of two previous values. Goβs algorithm uses a π value of 607, which means 607 sequential outputs reveal the entire state of the algorithm. All we need to do to recover the state is record all those numbers.
Itβs possible this could be optimized further to require fewer values or correct for non-sequential values, perhaps by exploiting a relationship between values or brute forcing missing state values (although the latter would take quite some time). An algorithm known as the Berlekamp-Massey algorithm could also be used, but would require significantly more outputs since the LFG runs over elements in a Galois Field of 264. (Reeds & Sloane, 1985)
Once the current state has been recovered, generating new values is simply a matter of continuing the sequence. The algorithm is repeatable, which allows for all future values to be generated without any further interaction with the source generator. Unless explicitly re-seeded, Goβs RNG has a stable state.
Since this algorithm is also deterministic and invertible, previous values can be determined using the inverse of the operation chosen. Go uses an additive LFG (ALFG), so subtraction can be used to determine previous values. For any index π in lagged Fibonacci generator π, with taps π and π, if π β₯ π, then ππβπ = ππ β ππβπ. Running this recursively yields all historically generated values. Using this to get the original state of the algorithm is more difficult if the number of values generated isnβt known.
Additive lagged Fibonacci generators (ALFGs) are both repeatable and reversible after π values. For Go, this means after observing 607 outputs, every future and past value ever generated with the default random algorithm can be calculated. Given the ease with which weβre able to attack this algorithm, we can conclude that Goβs documentation is correct, and we should not use this algorithm for generating secrets. itβs not designed to be cryptographically secure. This holds true regardless of how itβs seeded or used.
Finding the Seed
Once the original state of Goβs lagged Fibonacci Generator has been completely recovered, the seed used to generate the sequence can be cracked. Before explaining this, itβs important to provide a little more detail on Reeds and Mitchellβs algorithm for seeding. As mentioned before, when the seed function is called, Go uses a linear congruential generator (LCG) with the provided seed to manipulate the current state of the lagged Fibonacci generator.
When a seed is applied, a new LCG is created, defined by πΏπΆπΊ(π₯)=48271π₯ πππ (231 β 1), where π₯ is the user-provided seed. For each value π£ in the ALFGβs current state, the LCG is used to generate three 32-bit values: π₯1, π₯2, and π₯3. These are combined into a single 64-bit output value, π which is generated using π = (π₯1 βͺ 40) β (π₯2 βͺ 20)β π₯3, where β is a binary exclusive OR (XOR) operation. Finally, π£ is updated to π β π£.
To fully recover the initially generated seed, weβll need two full values from the LCGβs state. However, the XOR operation destroys some of the information. Since the least amount of information is preserved for π₯1 it makes sense to target π₯2 and π₯3. Because these values were generated with an LCG, a pair of two state values, (π₯πβ1, π₯π) is likely valid if πΏπΆπΊ(π₯πβ1)=π₯π.
The first step in recovering these state values is to run a series of brute-force checks to determine the unknown bits. While only 11 bits of π₯2 are exposed, most of π₯2 was obscured by the XOR operation performed with π₯3. This means brute force checking candidates for π₯3 then reversing the XOR operation exposes the lowest 22 bits of candidates for π₯2. This significantly reduces the overall search space from 221 β 212 to 210 β 212 permutations.
The guess-and-check strategy described previously may not be completely accurate if thereβs more than one valid pair of (π₯πβ1,π₯π) that satisfies the check, but this becomes significantly less likely as more values of π₯ are checked. Once a valid pair of values is found, calculating the multiplicative modular inverse of π₯π under modulo π can be used to reverse the LCG to its seed. We know a fixed number of values were generated by this algorithm, as the state of the LFG is a constant number. This means we can roll the LCG back and know weβve found the originally provided seed.
SummaryΒ
As Go states in their documentation, the default lagged Fibonacci random source is not suitable for security-sensitive purposes. The methods Iβve developed and described here, while complex for me, would be simple for skilled cryptographers or mathematicians. The algorithm itself was never designed to be secure. I hope that my observations will prove valuable both in recording the extensive history of this algorithm (to the best of my ability) and assisting security experts in assessing risk, should you find developers misusing Goβs math/rand package.
Go is currently looking at alternatives to the lagged Fibonacci generator for their PRNG. More modern algorithms such as OβNeillβs permuted congruential generators strike a better balance between statistical qualities, speed, and predictability. Itβs encouraging to see security and predictability being considered as factors, even for non-cryptographically secure algorithms.
Code and notes generated during my research can found at my GitHub.
Acknowledgements
Thanks to Leviathan Security Group for providing me with the time and support to do this research (particularly, thank you to Dani Cronce and Hannah Silva), Edwin A. Robledo for continued discussion and advice, and the Go Contributors for the documentation and public discussion on the topic.
References
Chao-Liang Liu, G. H.-Y. (2008). Computing the modular inverses is as simple as computing the GCDs. Finite Fields and Their Applications, 65-75.
Knuth, D. E. (1997). The Art of Computer Programming (3rd ed., Vol. 2). Reading, Massachusetts: Addison Wesley Longman.
Marsaglia, G. (1984). A Current View of Random Number Generators. Computer Science and Statistics: 16th Symposium on the Interface (pp. 2-3). Atlanta: Elsevier Press.
Mitchell, D. P., & Reeds, J. A. (n.d.). Plan 9 from Bell Labs. Retrieved from https://9p.io/sources/plan9/sys/src/cmd/cwfs/lrand.c.
O'Neill, M. E. (2014). PCG: A Family of Simple Fast Space-Efficient Statistically Good Algorithms for Random Number Generation. Harvey Mudd College, Claremont, CA. Retrieved from https://www.cs.hmc.edu/tr/hmc-cs-2014-0905.pdf
Reeds, J. (2014, February 27). (N. Craig-Wood, Interviewer) Retrieved from https://groups.google.com/g/golang-nuts/c/RZ1G3_cxMcM/m/_7J7tnHhsU4J
Reeds, J. A., & Sloane, N. J. (1985). Shift Register Synthesis (Modulo n). SIAM Journal on Computing.
Schrage, L. (1979). A More Portable Fortran Random Number Generator. ACM Transactions on Mathematical Software, 132--139.
The Go Authors. (2022, Oct 4). math/rand. Retrieved from Go Packages: https://pkg.go.dev/math/rand#pkg-overview
Ziff, R. M. (1998). Four-tap shift-register-sequence random-number generators. Computers in Physics, 385-392. doi:10.1063/1.168692