19 March 2011
Nuitka Release 0.3.7
This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler, “download now”.
This is about the new release with focus on performance and cleanups. It indicates significant progress with the milestone this release series really is about as it adds a compiled_method type.
So far functions, generator functions, and generator expressions were compiled objects, but in the context of classes, functions were wrapped in CPython instancemethod objects. The new compiled_method is specifically designed for wrapping compiled_function and is therefore more efficient at it.
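To make the wrapping concrete, here is a small sketch of what plain CPython does when a function is looked up through an instance. The class and method names are made up for illustration, and this shows CPython behavior only, not Nuitka's compiled_method implementation. Under Python 2 the wrapper type is called instancemethod, under Python 3 it is called method.

```python
import types


class Greeter(object):
    # Hypothetical example class, not part of Nuitka.
    def greet(self):
        return "hello"


g = Greeter()
bound = g.greet

# Attribute access on an instance wraps the plain function in a method object.
# The type name is "instancemethod" on Python 2 and "method" on Python 3.
print(type(bound).__name__)

# The wrapper remembers both the instance and the underlying function.
print(isinstance(bound, types.MethodType))
print(bound.__self__ is g)
print(bound.__func__ is Greeter.__dict__["greet"])
```

A compiled method type plays the same role as this wrapper, but it can know that the wrapped object is a compiled function.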
Bug Fixes
When using Python or Nuitka.py to execute some script, the exit code in case of “file not found” was not the same as CPython. It should be 2, not 1.
The exit code of the created programs (--deep mode) in case of an uncaught exception was 0, now it is an error exit with value 1, as CPython does it.
Exception tracebacks created inside with statements could contain duplicate lines, this was corrected.
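The CPython exit codes referred to above can be checked with a short script. This is only a sketch for Python 3 (it uses subprocess.DEVNULL), the helper name exit_code and the missing file name are made up, and it runs the CPython interpreter itself rather than Nuitka.

```python
import os
import subprocess
import sys
import tempfile


def exit_code(args):
    """Run the current interpreter with the given arguments, return its exit code."""
    return subprocess.call(
        [sys.executable] + args,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )


# A path inside a fresh, empty directory, so the script certainly does not exist.
missing_path = os.path.join(tempfile.mkdtemp(), "no_such_script.py")

missing_script = exit_code([missing_path])
uncaught = exit_code(["-c", "raise RuntimeError('boom')"])

print("file not found:", missing_script)
print("uncaught exception:", uncaught)
```

The first value is what a “file not found” run should report, the second is what a program dying from an uncaught exception should report.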
Optimization
Global variable assignments now also use assign0 where no reference exists. The assignment code for module variables is actually faster if it does not need to drop the reference, but clearly the code shouldn’t bother to take it on the outside just for that. This variant existed, but wasn’t used as much so far.
The instance method objects are now Nuitka’s own compiled type too. This should make things slightly faster by itself.
Our new compiled method objects support dedicated method parsing code, where self is passed directly, allowing calls to take a fast path in parameter parsing. This avoids allocating and freeing a tuple object per method call, which reduced ticks in the “PyStone” benchmark by 3%, so that’s significant.
Solved a TODO of BUILTIN_RANGE to change it to pre-allocating the list in the final size as we normally do everywhere else. This was a tick reduction of 0.4% in the “PyStone” benchmark, but the measurement method normalizes on loop speed, so it’s not visible in the numbers output.
Parameter variables cannot possibly be uninitialized at creation and most often they are never subject to a del statement. Adding dedicated C++ variable classes gave a big speedup, around 3% of “PyStone” benchmark ticks.
Some abstract object operations were re-implemented, which allows avoiding function calls, e.g. in the ITERATOR_NEXT case. This gave a few percent on “PyStone” as well.
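The point about parameter variables can be illustrated in plain Python. The two functions below are made-up examples: the first is the common case, where the parameter is bound on entry and stays bound, the second is the rare case where a del statement unbinds it, which is the only way a parameter can end up uninitialized.

```python
def no_del(value):
    # Common case: the parameter is always bound while the function runs,
    # so no "is it initialized?" check is ever needed when reading it.
    return value * 2


def with_del(value):
    # Rare case: after "del", reading the parameter must raise an error,
    # so the variable needs to track whether it currently holds a value.
    del value
    try:
        return value
    except UnboundLocalError:
        return "unbound"


print(no_del(21))
print(with_del(1))
```

Only functions of the second kind need the more general variable handling. How Nuitka's C++ variable classes distinguish the two is not shown here.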
Cleanups
New package nuitka.codegen to contain all code generation related stuff, moved nuitka.templates to nuitka.codegen.templates as part of that.
Inside the nuitka.codegen package the MainControl module no longer reaches into Generator for simple things, but goes through CodeGeneration for everything now.
The Generator module uses almost no tree nodes anymore, but instead gets information passed in function calls. This allows for a cleanup of the interface towards CodeGeneration. It gives a cleaner view of the C++ code generation, and generally furthers the goal of supporting language backends other than C++.
More “PyLint” work, many of the reported warnings have been addressed, but it’s not yet happy.
Defaults for yield and return are None and these values are now already added (as constants) during tree building so that no such special cases need to be dealt with in CodeGeneration and future analysis steps.
Parameter parsing code has been unified even further, now the whole entry point is generated by one of the functions in the new nuitka.codegen.ParameterParsing module.
Split variable, exception, built-in helper classes into separate header files.
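The None defaults for yield and return mentioned above are ordinary Python semantics, which a short sketch can show. The function names are made up for illustration. A bare return and a bare yield behave exactly like their explicit None forms, which is why the value can be filled in as a constant early on.

```python
def bare_return():
    return


def explicit_return():
    return None


def bare_yield():
    yield


def explicit_yield():
    yield None


# Both spellings of each statement produce the same result.
print(bare_return() is None and explicit_return() is None)
print(list(bare_yield()) == list(explicit_yield()) == [None])
```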
New Tests
The exit codes of CPython execution and Nuitka compiled programs are now compared as well.
Error messages of methods are now covered by the ParameterErrors test as well.
Organizational
A new script “benchmark.sh” (now called “run-valgrind.py”) now starts “kcachegrind” to display the valgrind result directly.
One can now use it to execute a test and inspect valgrind information right away, then improve it. Very useful to discover methods for improvements, test them, then refine some more.
The “check-release.sh” script needs to unset NUITKA_EXTRA_OPTIONS or else the reflection test will trip over the changed output paths.
Numbers
python 2.6:
Pystone(1.1) time for 50000 passes = 0.65
This machine benchmarks at 76923.1 pystones/second
Nuitka 0.3.7 (driven by python 2.6):
Pystone(1.1) time for 50000 passes = 0.28
This machine benchmarks at 178571 pystones/second
This is a 132% speedup of 0.3.7 compared to CPython, up from 109% with the previous release. This is another small increase that can be fully attributed to milestone 2 measures, i.e. not analysis, but purely more efficient C++ code generation and the new compiled method type.
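For reference, the percentage follows from the two pystones/second figures quoted above. The variable names in this small calculation are made up, the numbers are taken from the benchmark output.

```python
# Figures copied from the PyStone output above.
cpython_rate = 76923.1   # pystones/second, python 2.6
nuitka_rate = 178571.0   # pystones/second, Nuitka 0.3.7

ratio = nuitka_rate / cpython_rate
gain_percent = (ratio - 1.0) * 100.0

print("%.2fx, +%.0f%%" % (ratio, gain_percent))  # prints "2.32x, +132%"
```

The same ratio results from the run times, 0.65 divided by 0.28.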
One can now safely assume that it is at least twice as fast, but I will try and get the PyPy or Shedskin test suite to run as benchmarks to prove it.
No milestone 3 work in this release. I believe it’s best to finish with milestone 2 first, because these are quite universal gains that we should have covered.