Attacking Go's Lagged Fibonacci Generator
Dylan Katz
In 1969, Donald Knuth published the second volume of The Art of Computer Programming. One of the topics covered was random number generation, in which Knuth described several algorithms that take varying approaches to generating pseudo-random numbers. Knuth described an unpublished additive generator suggested by G.J. Mitchell and D.P. Moore in 1958 (Knuth, 1997, p. 27).
In the late 90s, Jim Reeds and Don Mitchell developed an interesting extension of this algorithm for an early version of UNIX. Eventually, the algorithm found its way into Plan 9 (Mitchell & Reeds, n.d.), and finally landed as Go’s default random source in 2008.
More than a decade later, I stumbled upon a program using Go’s random function to generate secrets. I wanted to determine if I could predict these values, and what the impact of using Go’s default non-secure random source was. However, I found no documentation on what algorithm was used aside from a vague comment in the source:
“algorithm by DP Mitchell and JA Reeds”
This led me down a rabbit hole that left me with two things: the seed of a random number generator and a story about a very old algorithm.
Go explicitly warns against the practice of using their standard pseudo-random number generator (PRNG) for security-sensitive contexts. Like most languages, it provides both a standard PRNG (the “math/rand” package), and a cryptographically secure PRNG (the “crypto/rand” package) intended for security-sensitive operations.
To dissuade misuse and encourage good practice, it’s important for the security and software development community to discuss the implications of using standard PRNGs for security-sensitive contexts. Many PRNGs expose enough information for their internal state to be reconstructed using a small set of outputs. Knowledge of an algorithm’s state may allow observers to recreate “random” numbers generated by the algorithm. This allows for a class of attacks known as state recovery attacks.
Go uses an algorithm known as a lagged Fibonacci generator (LFG). The algorithm has been around for quite some time. The algorithm has some long-known statistical flaws that make it less appealing to use when compared to other PRNGs (Marsaglia, 1984), (Ziff, 1998). In addition to these flaws, LFGs expose internal state as output, creating security flaws (as you’d expect in a non-cryptographically secure PRNG).
Background
Before attacking anything, it is important to understand how it works. Let’s revisit the algorithm by Mitchell and Moore mentioned earlier:
“G.J. Mitchell and D.P. Moore … suggested the somewhat unusual sequence defined by:
where m is even, and where are arbitrary integers not all even.” (Knuth, 1997, p. 27)
As Knuth explained, the algorithm begins with a set of 55 arbitrary (but not all even) integers, then generates values by summing two previous values in the sequence. This creates a relationship comparable to the Fibonacci sequence (where each value is the sum of the two previous values). Because the operation is addition, the algorithm is an example of an Additive Lagged Fibonacci Generator (ALFG). LFGs can also use multiplication or binary exclusive OR (XOR) as their operation. If you’re familiar with them, you can think of LFGs as large, two-tap linear feedback shift registers. The 𝑗 and 𝑘, respectively). They have a significant impact on the behavior of the algorithm and are generally selected from a set of known good values. Go uses a pair of (𝑗=273, 𝑘=607), a modulus of 263−1, and addition as the operation, so the ALFG used by Go is defined by this function:
For the initial state of 𝑋0,…𝑋607 values, Go uses a vector of pre-generated values from a linear congruential generator after 780∗1010 iterations (Reeds J. , 2014).
If a new seed is set for the LFG, a linear congruential generator (LCG) is created with the provided seed, then the state values are transformed with outputs from the LCG. This is a Lehmer RNG using Schrage’s method (Schrage, 1979) to generate a set of values over which an exclusive OR (XOR) operation is performed with the pre-generated values. This use of an LCG seeded with a user-provided number to manipulate the state of the LFG appears to be a novel technique developed by Reeds and Mitchell.
After the initial state is defined and seeded, we can begin to generate values. As mentioned, the algorithm has two indices, 𝑗 and 𝑘, which are initialized to 𝑗 = 0 and 𝑘 = 334. Then, when a new value is requested, Go subtracts 1 from these indices, setting them back to 607 if they reach zero, and performs the addition/modulus operations for each set of two values in the state at indices 𝑗 and 𝑘.
Impact of State Disclosure
Many PRNGs expose information about their internal state with varying degrees of security impact. Not all state disclosure is of the same significance, and some algorithms do a better job of concealing internal state. There are a few factors that influence the impact and practical applicability of state disclosure attacks such as repeatability, invertibility, and the amount of state disclosed per value.
Repeatability is the ability to recreate the sequence generated from a specific set of inputs. If state is modified with values that aren’t exposed to observers, it may become significantly more difficult to reconstruct it. For instance, if the internal state of an algorithm is frequently modified with numbers from a true random source, any exposed state would only be known to observers for a short time. This may mean that it’s not possible to recover the entire state, or that it would only be useful for a short time.
Another factor is invertibility. If an algorithm can be inverted, this grants observers the ability to generate historical values given the current state of the algorithm. This increases the amount of information disclosed and could lead to attackers reconstructing generated secrets before a system was compromised. (O'Neill, 2014)
Finally, the number and continuity of outputs required to disclose state may significantly influence the exploitability of state disclosure attacks. Particularly when observers are accessing an algorithm over the network or are attempting to recover the state of algorithms that are updated frequently, recording several sequential outputs may not be feasible or may take significantly more resources.
Attacks Against State
Okay, wow, that’s a lot of information! Now that we’ve discussed the history and behavior of this system, we can talk about how it breaks down. State recovery attacks against ALFGs are less complicated than the algorithm itself.
If 𝑘 is the size of our state, after the initial 𝑘 values the state of Go’s ALFG is composed solely of past outputs. This is just the nature of a Fibonacci sequence. All values are eventually the sum of two previous values. Go’s algorithm uses a 𝑘 value of 607, which means 607 sequential outputs reveal the entire state of the algorithm. All we need to do to recover the state is record all those numbers.
It’s possible this could be optimized further to require fewer values or correct for non-sequential values, perhaps by exploiting a relationship between values or brute forcing missing state values (although the latter would take quite some time). An algorithm known as the Berlekamp-Massey algorithm could also be used, but would require significantly more outputs since the LFG runs over elements in a Galois Field of 264. (Reeds & Sloane, 1985)
Once the current state has been recovered, generating new values is simply a matter of continuing the sequence. The algorithm is repeatable, which allows for all future values to be generated without any further interaction with the source generator. Unless explicitly re-seeded, Go’s RNG has a stable state.
Since this algorithm is also deterministic and invertible, previous values can be determined using the inverse of the operation chosen. Go uses an additive LFG (ALFG), so subtraction can be used to determine previous values. For any index 𝑛 in lagged Fibonacci generator 𝑋, with taps 𝑗 and 𝑘, if 𝑛 ≥ 𝑘, then 𝑋𝑛−𝑘 = 𝑋𝑛 − 𝑋𝑛−𝑗. Running this recursively yields all historically generated values. Using this to get the original state of the algorithm is more difficult if the number of values generated isn’t known.
Additive lagged Fibonacci generators (ALFGs) are both repeatable and reversible after 𝑘 values. For Go, this means after observing 607 outputs, every future and past value ever generated with the default random algorithm can be calculated. Given the ease with which we’re able to attack this algorithm, we can conclude that Go’s documentation is correct, and we should not use this algorithm for generating secrets. it’s not designed to be cryptographically secure. This holds true regardless of how it’s seeded or used.
Finding the Seed
Once the original state of Go’s lagged Fibonacci Generator has been completely recovered, the seed used to generate the sequence can be cracked. Before explaining this, it’s important to provide a little more detail on Reeds and Mitchell’s algorithm for seeding. As mentioned before, when the seed function is called, Go uses a linear congruential generator (LCG) with the provided seed to manipulate the current state of the lagged Fibonacci generator.
When a seed is applied, a new LCG is created, defined by 𝐿𝐶𝐺(𝑥)=48271𝑥 𝑚𝑜𝑑 (231 − 1), where 𝑥 is the user-provided seed. For each value 𝑣 in the ALFG’s current state, the LCG is used to generate three 32-bit values: 𝑥1, 𝑥2, and 𝑥3. These are combined into a single 64-bit output value, 𝑜 which is generated using 𝑜 = (𝑥1 ≪ 40) ⊕ (𝑥2 ≪ 20)⊕ 𝑥3, where ⊕ is a binary exclusive OR (XOR) operation. Finally, 𝑣 is updated to 𝑜 ⊕ 𝑣.
To fully recover the initially generated seed, we’ll need two full values from the LCG’s state. However, the XOR operation destroys some of the information. Since the least amount of information is preserved for 𝑥1 it makes sense to target 𝑥2 and 𝑥3. Because these values were generated with an LCG, a pair of two state values, (𝑥𝑛−1, 𝑥𝑛) is likely valid if 𝐿𝐶𝐺(𝑥𝑛−1)=𝑥𝑛.
The first step in recovering these state values is to run a series of brute-force checks to determine the unknown bits. While only 11 bits of 𝑥2 are exposed, most of 𝑥2 was obscured by the XOR operation performed with 𝑥3. This means brute force checking candidates for 𝑥3 then reversing the XOR operation exposes the lowest 22 bits of candidates for 𝑥2. This significantly reduces the overall search space from 221 ∗ 212 to 210 ∗ 212 permutations.
The guess-and-check strategy described previously may not be completely accurate if there’s more than one valid pair of (𝑥𝑛−1,𝑥𝑛) that satisfies the check, but this becomes significantly less likely as more values of 𝑥 are checked. Once a valid pair of values is found, calculating the multiplicative modular inverse of 𝑥𝑛 under modulo 𝑚 can be used to reverse the LCG to its seed. We know a fixed number of values were generated by this algorithm, as the state of the LFG is a constant number. This means we can roll the LCG back and know we’ve found the originally provided seed.
Summary
As Go states in their documentation, the default lagged Fibonacci random source is not suitable for security-sensitive purposes. The methods I’ve developed and described here, while complex for me, would be simple for skilled cryptographers or mathematicians. The algorithm itself was never designed to be secure. I hope that my observations will prove valuable both in recording the extensive history of this algorithm (to the best of my ability) and assisting security experts in assessing risk, should you find developers misusing Go’s math/rand package.
Go is currently looking at alternatives to the lagged Fibonacci generator for their PRNG. More modern algorithms such as O’Neill’s permuted congruential generators strike a better balance between statistical qualities, speed, and predictability. It’s encouraging to see security and predictability being considered as factors, even for non-cryptographically secure algorithms.
Code and notes generated during my research can found at my GitHub.
Acknowledgements
Thanks to Leviathan Security Group for providing me with the time and support to do this research (particularly, thank you to Dani Cronce and Hannah Silva), Edwin A. Robledo for continued discussion and advice, and the Go Contributors for the documentation and public discussion on the topic.
References
Chao-Liang Liu, G. H.-Y. (2008). Computing the modular inverses is as simple as computing the GCDs. Finite Fields and Their Applications, 65-75.
Knuth, D. E. (1997). The Art of Computer Programming (3rd ed., Vol. 2). Reading, Massachusetts: Addison Wesley Longman.
Marsaglia, G. (1984). A Current View of Random Number Generators. Computer Science and Statistics: 16th Symposium on the Interface (pp. 2-3). Atlanta: Elsevier Press.
Mitchell, D. P., & Reeds, J. A. (n.d.). Plan 9 from Bell Labs. Retrieved from https://9p.io/sources/plan9/sys/src/cmd/cwfs/lrand.c.
O'Neill, M. E. (2014). PCG: A Family of Simple Fast Space-Efficient Statistically Good Algorithms for Random Number Generation. Harvey Mudd College, Claremont, CA. Retrieved from https://www.cs.hmc.edu/tr/hmc-cs-2014-0905.pdf
Reeds, J. (2014, February 27). (N. Craig-Wood, Interviewer) Retrieved from https://groups.google.com/g/golang-nuts/c/RZ1G3_cxMcM/m/_7J7tnHhsU4J
Reeds, J. A., & Sloane, N. J. (1985). Shift Register Synthesis (Modulo n). SIAM Journal on Computing.
Schrage, L. (1979). A More Portable Fortran Random Number Generator. ACM Transactions on Mathematical Software, 132--139.
The Go Authors. (2022, Oct 4). math/rand. Retrieved from Go Packages: https://pkg.go.dev/math/rand#pkg-overview
Ziff, R. M. (1998). Four-tap shift-register-sequence random-number generators. Computers in Physics, 385-392. doi:10.1063/1.168692