This release, 0.3.0, is the first to focus on performance. In the 0.2.x series, Nuitka achieved feature parity with CPython 2.6, which was very important, but now it is time to make it really useful.
Optimization has been one of the main points, although I also looked ahead a bit to Python 2.7 language constructs. This is the first release where I really started to measure things and removed the most important bottlenecks.
Fast global variables: Reads of global variables were already fast thanks to a trick. That trick is now also used to check whether they are set and to do a much quicker update if they are.
Fast break/continue statements: To make sure these statements execute the finally handlers when inside a try, they used C++ exceptions that were caught by try/finally in while or for loops.
This was very slow. Now it is checked whether this is necessary at all, and it is only done in the rare case where a break/continue really is inside the tried block. Otherwise it is translated to a C++ break/continue, which the C++ compiler handles more efficiently.
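To illustrate the distinction drawn above, here is a small Python sketch of the two cases. The function names `run_loop` and `plain_loop` are made up for this example and are not part of Nuitka. The first loop has a break inside a tried block, so the finally handler has to run on the way out. The second has no try at all, which is the common case that can be turned into a plain C++ break.

```python
def run_loop(items):
    """Record the order of events when break sits inside try/finally."""
    events = []
    for item in items:
        try:
            if item == "stop":
                events.append("break")
                break  # inside the tried block: finally must still run
            events.append("body:" + item)
        finally:
            events.append("finally")
    return events


def plain_loop(items):
    """A break outside any try block: no handler has to run on exit."""
    seen = []
    for item in items:
        if item == "stop":
            break  # the common case, no finally handler involved
        seen.append(item)
    return seen


if __name__ == "__main__":
    print(run_loop(["a", "stop", "b"]))
    print(plain_loop(["a", "stop", "b"]))
```

Only the first form needs the more expensive mechanism. The second never has a handler to execute, so nothing is lost by leaving the loop directly.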
Added unlikely() compiler hints to all error handling cases to allow the C++ compiler to generate more efficient branch code.
The for loop code was using an exception handler to make sure the iterated value was released. It now uses PyObjectTemporary for that instead, which should lead to better generated code.
Constant dictionaries are now created once and copied from, instead of being built at run time even when the contents were constant.
Pystone(1.1) time for 50000 passes = 0.65 This machine benchmarks at 76923.1 pystones/second
Pystone(1.1) time for 50000 passes = 0.52 This machine benchmarks at 96153.8 pystones/second
That's a 25% speedup now and clearly a good start. It's not yet in the range where I want it to be, but there is always room for more. And the break/continue exception fix addressed an important performance regression.
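As a quick check of the arithmetic, the 25% figure follows from the two pystone timings quoted above. The variable names in this sketch are chosen for illustration only.

```python
PASSES = 50000
before_seconds = 0.65  # first pystone run quoted above
after_seconds = 0.52   # second pystone run quoted above

# pystones per second, as reported by the benchmark
before_rate = PASSES / before_seconds
after_rate = PASSES / after_seconds

# Relative gain in throughput and relative reduction in run time.
speedup = after_rate / before_rate - 1.0
time_saved = 1.0 - after_seconds / before_seconds

if __name__ == "__main__":
    print("before: %.1f pystones/second" % before_rate)
    print("after:  %.1f pystones/second" % after_rate)
    print("throughput gain: %.0f%%" % (speedup * 100))
    print("run time saved:  %.0f%%" % (time_saved * 100))
```

Note that the 25% refers to throughput (pystones per second). Measured as run time for the same 50000 passes, the same change is a 20% reduction.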