Forcing a file’s CRC to any value

By modifying any 4 consecutive bytes in a file, you can force the file’s CRC-32 hash to any value you choose. This technique is useful for vanity purposes, and shows that CRC-32 is extremely malleable and unsuitable for protecting a file from intentional modifications. This page explains the mathematics behind the technique, and provides programs to demonstrate this technique.

Source code

HxD hex editor

Java: forcecrc32.java
Usage: javac forcecrc32.java; java forcecrc32 FileName ByteOffset NewCrc32Value
Example: java forcecrc32 picture.bmp 31337 DEADBEEF

Python: forcecrc32.py
Usage: python forcecrc32.py FileName ByteOffset NewCrc32Value
Example: python forcecrc32.py document.txt 94 CAFEF00D

To find the appropriate byte offset in the file to make the modification, it is recommended to use a hex editor (e.g. HxD on Windows, GHex on Linux).

The program updates your specified file in place, but only changes the 4 bytes starting at ByteOffset. If you are unsure of what you’re doing, make a backup of the target file before running the program.

Review of CRC-32

These are the steps to calculate the CRC-32 of a file or message:

  1. Append the bytes 00 00 00 00 to the message.

  2. XOR the initial 4 bytes by FF FF FF FF.

  3. Interpret the sequence of bytes as a list of polynomial coefficients. Earlier bytes are assigned higher powers. Less significant bits are assigned higher powers. So for example, the hexadecimal byte sequence 01 C0 represents the sequence of polynomial coefficients 10000000 00000011, which represents the polynomial \(x^{15} + x^1 + x^0\).

  4. Divide this preprocessed message polynomial by the CRC-32 generator polynomial[0] in \(\text{GF}(2)\) and keep only the remainder.

  5. Interpret the remainder as a 32-bit integer by mapping less significant bits to higher powers. For example, the polynomial \(x^{31} + x^3 + x^1 + x^0\) is represented by the integer 0xD0000001.

  6. XOR the value by 0xFFFFFFFF.

Forcing the CRC

The core of the CRC algorithm (including CRC-32 in particular) is polynomial division, where the polynomials have coefficients drawn from the finite field \(\text{GF}(2)\).

But to do more advanced analysis on the properties of CRCs, we need to adopt a more formal view of things. The underlying field for the coefficients is always \(\text{GF}(2)\). The generator polynomial defines a quotient ring[1], and now every polynomial we work with is an element of this ring. So the goal is to take the preprocessed message polynomial \(M(x)\) (which may have a degree greater than the generator polynomial’s degree) and reduce it to its canonical form by doing polynomial division. Remember that from now on, we treat each polynomial as a single object, and the result of each arithmetic operation is a polynomial object.

To force the CRC to any value we desire, read the following observations and derivation:

  1. Let \(G(x)\) be the CRC generator polynomial.

  2. Let \(n\) be the bit length of the original message. Let \(k\) be the chosen bit offset from the start of the message where the message will be modified.

  3. Let \(M(x)\) be the preprocessed message polynomial.

  4. Let \(R(x) \equiv M(x) \!\mod\! G(x)\) be the remainder polynomial, having degree less than \(G\)’s degree.

  5. Let \(R'(x)\) be the new remainder chosen by the user.

  6. Let \(D(x) \equiv R'(x) - R(x) \!\mod\! G(x)\). This is the delta for the remainder.

  7. The act of modifying degree-of-\(G\) bits of the message \(M(x)\) at bit offset \(k\) means XORing those bits by some polynomial \(E(x)\).

  8. Let \(M'(x)\) be the preprocessed message after modification, which we will compute.

  9. Let \(E(x)\) be the delta applied to the message bits that will be modified, which we will compute.

  10. Algebraically, we have that \(M'(x) \equiv M(x) + E(x) \, x^{n - k} \!\mod\! G(x)\).

  11. We want \(M'(x) \equiv R'(x) \!\mod\! G(x)\). But we know \(M(x) \equiv R(x) \!\mod\! G(x)\).

  12. Substitute and subtract to get \(E(x) \equiv D(x) \, (x^{n-k})^{-1} \mod G(x)\).

To summarize, in order to compute how to modify the message bytes, we need to calculate the difference between the current CRC and the desired CRC, calculate a polynomial power mod \(G(x)\), calculate a reciprocal mod \(G(x)\), and calculate a product mod \(G(x)\).

Notes

More info



Feedback

Question? Comment? Contact me

ProjectNayuki: Like, comment, follow updates on Facebook