Project Nayuki


Optimizing brainfuck compiler

Brainfuck Mandelbrot output

This Python script translates a brainfuck source file into a C/Java/Python source file. Along the way, it performs optimizations on the brainfuck program, described below. Afterward, you use your C/Java/Python compiler or interpreter on the newly generated source code.

Download: bfc.py

Usage: python bfc.py BrainfuckFile OutputFile.c/java/py

Full example:
python bfc.py mandelbrot.b.txt mandelbrot.c gcc -o mandelbrot mandelbrot.c ./mandelbrot

Tip: It can be tedious to compile and run a brainfuck program using this tool. For example for C, you might want to put all the commands on one line: python bfc.py myprog.b myprog.c && gcc -o myprog myprog.c && ./myprog. Alternatively, you can write a Makefile with rules for compiling, running, cleaning, etc.

Example and benchmark

mandelbrot.c in Notepad2

Benchmark:

Unoptimized Optimized
C, -O0 17.20 s 2.77 s
C, -O1 1.09 s 0.64 s
C, -O2 1.02 s 0.66 s
C, -O3 1.05 s 0.66 s
Java 41.40 s 19.20 s
Python 490.00 s 190.00 s

All benchmarks above were performed on an Intel Xeon E3-1270 CPU (single-threaded). The non-optimized source codes are not published because they are uninteresting. Numbers are reported to 3 significant figures. Software-wise, GCC 4.8.2, Oracle Java 1.7.0_51, CPython 2.7.6, and Linux 3.12.6 x86-64 were used.

Optimizations

I will use this notation: unsigned char *p points to the current cell (which corresponds to the C output from the script).

Fusing increments/decrements

A real computer can add by not just one, but by any number. So for example, we can translate +++ into p[0] += 3;.

Fusing left/right movements

Similarly, we can move the pointer by more than one – for example, we can translate >> into p += 2;.

Fusing movements into adds

We can add/subtract from cells other than the current one. For example, we can translate >++< into p[1] += 2;, which no longer contains pointer movements.

Postponing movements

In a block of brainfuck commands, if the block consists of only increments/decrements, left/right, and input/output, then we can keep track of the virtual movement and then perform it in one shot at the end of the block. For example, we can translate >+>-> into p[1]+=1; p[2]-=1; p+=3;. Note that we have to actually perform the movements before entering a loop/if and before exiting a loop/if.

Simple loops

A moment’s thought should make it clear that [-] translates to p[0] = 0;. But we can generalize even further than that. If the loop body has no subloops and no input/output, all the movements add up to 0, and all the increments/decrements at p[0] add up to −1, then we are basically running the loop body p[0] times. So for all the increments/decrements outside of p[0], some multiple of p[0] is added to that location. For example, we can translate [->+>---<<] into p[1]+=p[0]; p[2]-=p[0]*3; p[0]=0;.

Assign followed by add

Obviously, p[i] = 0; followed by p[i] += v; can be replaced by p[i] = v;. As well, p[i] = 0; followed by p[i] += p[j] * v; can be replaced by p[i] = p[j] * v;.

Complex loops

Extending the simple loop optimization, if a location other than the loop counter is used as the source for multiply-add and this location is cleared to 0 within the loop, then the loop can be turned into an if-statement. For example, we can optimize while(p[0]!=0){ p[0]--; p[1]+=2; p[2]+=p[3]*5; p[3]=0; } into if(p[0]!=0){ p[1]+=p[0]*2; p[0]=0; p[2]+=p[3]*5; p[3]=0; }.

Notes