Fast skipping in a linear congruential generator
Introduction
A linear congruential generator (LCG) is a pseudorandom number generation (PRNG) algorithm. Its behaviour is defined by the this recurrence relation:
\(x_0 = \text{seed}\)
\(x_{k+1} = (a x_k + b) \mod m \text{, for $k \in \mathbb{N}$}\)
The constants \(a\), \(b\), \(m\) are chosen carefully to give good properties, such as having a full period of \(m - 1\), having \(m\) be a power of 2 (which is efficient on computers), or having \(b\) be 0 (to save an operation). The seed value can be chosen (almost) arbitrarily for each random number generator instance. The sequence of \(x\) values are the pseudorandom numbers that are generated. They can be used as-is, or they can be filter and have their distrubtion altered.
Fast skipping
Suppose we know \(x_0\) and we want to determine \(x_n\) for some \(n\). Clearly it is possible to iterate the recurrence formula \(n\) times to get our desired result, which takes \(\theta(n)\) time. But with some clever arithmetic, we can get the result much faster, in fact in \(\theta(\log n)\) time.
Consider the special case where \(b = 0\). Then \(x_n = a^n x_0 \mod m\). We can compute \(a^n x_0 \mod m\) quickly using the well known modular exponentiation algorithm, which is exponentiation by squaring with a reduction of each intermediate result modulo \(m\).
As for the general case, first consider the recurrence relation \(y_{k+1} = a y_k + b\), which is the same as the LCG recurrence but without the modulo. The closed-form solution to this first-order linear recurrence is \(y_k = a^n y_0 + \frac{a^n - 1}{a - 1} b\). (Finding this solution requires some ingenuity or a table or a CAS. But now that the solution has been stated, you can prove its correctness by induction quite easily.)
Reinserting the modulos, we have \(x_k = ((a^n x_0 \mod m) + (\frac{a^n - 1}{a - 1} b)) \mod m\). The first term, \(a^n x_0 \mod m\), is the modular exponentiation discussed above. The second term is trickier. The quotient \(\frac{a^n - 1}{a - 1}\) is always an integer, because it is equal to \(a^{n-1} + a^{n-2} + \cdots + a^1 + a^0\). (Proof left to the reader.) To compute \(\frac{a^n - 1}{a - 1} \mod m\), we compute \((a^n - 1) \mod (a - 1) m\), then non-modular-divide by \(a - 1\). (I will not justify why this works.)
Altogether, we have: \(x_k = ((a^n x_0 \mod m) + (\frac{(a^n - 1) \mod (a -1) m}{a - 1} b)) \mod m\)
Backward iteration
By rearranging the recurrence relation, we get \(x_k = a^{-1} (x_{k+1} - b) \mod m = a^{-1} x_{k+1} + (a^{-1})(-b)\). As long as \(a\) and \(m\) are coprime, the inverse of \(a\) exists and can be computed by the extended Euclidean algorithm.
Incidentally, we can do fast backward skipping by using \((a', b', m)\) with \(a' = a^{-1} \mod m\) and \(b' = -(a^{-1} b) \mod m\).
Source code
To back up my claims about fast seeking and backward iteration, here is live Java code that implements the concepts discussed in the article: LcgRandom.java