This release focuses on performance and cleanups. It marks significant progress on the milestone this release series is really about, as it adds a compiled_method type.
So far, functions, generator functions, and generator expressions were compiled objects, but in the context of classes, functions were wrapped in CPython instancemethod objects. The new compiled_method is specifically designed for wrapping compiled_function and is therefore more efficient at it.
- When using Python or Nuitka.py to execute some script, the exit code in case of "file not found" was not the same as CPython. It should be 2, not 1.
- The exit code of the created programs (--deep mode) in case of an uncaught exception was 0; now it is an error exit with value 1, as CPython does it.
- Exception tracebacks created inside with statements could contain duplicate lines; this was corrected.
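The CPython exit codes referred to above can be observed with a short script. This is a minimal sketch that only exercises the CPython interpreter running it; the helper names `exit_code_of` and `exit_code_of_source` are made up for illustration and are not part of Nuitka.

```python
import os
import subprocess
import sys
import tempfile


def exit_code_of(args):
    """Run the current interpreter with the given arguments, return its exit code."""
    completed = subprocess.run(
        [sys.executable] + list(args),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return completed.returncode


def exit_code_of_source(source):
    """Write source to a temporary script, run it, and return the exit code."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "script.py")
        with open(path, "w") as handle:
            handle.write(source)
        return exit_code_of([path])


# A script path that is guaranteed not to exist inside a fresh directory.
with tempfile.TemporaryDirectory() as tmp:
    missing_code = exit_code_of([os.path.join(tmp, "no_such_script.py")])

uncaught_code = exit_code_of_source("raise ValueError('boom')\n")
clean_code = exit_code_of_source("pass\n")

print(missing_code, uncaught_code, clean_code)
```

A compiled program or the Nuitka driver would be expected to report the same values as CPython in these three situations.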
Global variable assignments now also use assign0 where no reference exists.
The assignment code for module variables is actually faster if it need not drop the reference, but clearly the code shouldn't bother to take a reference on the outside just for that. This variant existed, but wasn't used as much so far.
The instance method objects are now Nuitka's own compiled type too. This should make things slightly faster by itself.
Our new compiled method objects support dedicated method parsing code, where self is passed directly, so that calls can take a fast path in parameter parsing.
This avoids allocating and freeing a tuple object per method call, which reduced ticks in the "PyStone" benchmark by 3%, so that's significant.
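The difference between the two calling styles can be sketched in pure Python. This is only an illustration of the shape of the call, not Nuitka's C++ implementation; `GenericMethod`, `FastMethod`, `Account`, and `deposit` are hypothetical names, and the Python interpreter itself still packs arguments internally in both variants.

```python
class GenericMethod:
    """Bound method that builds a combined argument tuple on every call."""

    def __init__(self, function, instance):
        self.function = function
        self.instance = instance

    def __call__(self, *args):
        # A new tuple (self, arg1, ...) is allocated per call, then unpacked again.
        full_args = (self.instance,) + args
        return self.function(*full_args)


class FastMethod:
    """Bound method that hands self to the function as a separate argument."""

    def __init__(self, function, instance):
        self.function = function
        self.instance = instance

    def __call__(self, *args):
        # No combined tuple is built here; self travels next to the other arguments.
        return self.function(self.instance, *args)


class Account:
    def __init__(self, balance):
        self.balance = balance


def deposit(account, amount):
    account.balance += amount
    return account.balance


account = Account(10)
generic_result = GenericMethod(deposit, account)(5)
fast_result = FastMethod(deposit, account)(5)
print(generic_result, fast_result)
```

Both wrappers give the same results; the second one simply skips the intermediate tuple, which is the kind of per-call work the dedicated parsing code is described as avoiding.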
Resolved a TODO in BUILTIN_RANGE: it now pre-allocates the list at its final size, as we normally do everywhere else. This was a tick reduction of 0.4% in the "PyStone" benchmark, but the measurement method normalizes on loop speed, so it's not visible in the numbers output.
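The pre-allocation idea can be sketched in Python as follows. This is an illustration under assumptions, not the BUILTIN_RANGE code itself: the function name `range_preallocated` is hypothetical, and only integer arguments with a non-zero step are handled.

```python
def range_preallocated(start, stop, step=1):
    """Build the list a Python 2 style range() returns, sized up front."""
    if step == 0:
        raise ValueError("step must not be zero")

    # Compute the final length first, so the list never has to grow.
    if step > 0:
        length = max(0, (stop - start + step - 1) // step)
    else:
        length = max(0, (start - stop - step - 1) // (-step))

    # Allocate at the final size, then fill the slots in place.
    result = [None] * length
    value = start
    for index in range(length):
        result[index] = value
        value += step
    return result


print(range_preallocated(0, 10, 3))
print(range_preallocated(10, 0, -3))
```

The alternative of appending element by element forces the list to be resized repeatedly, which is the cost the pre-sized allocation avoids.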
Parameter variables cannot possibly be uninitialized at creation and most often they are never subject to a del statement. Adding dedicated C++ variable classes gave a big speedup, around 3% of "PyStone" benchmark ticks.
Some abstract object operations were re-implemented, which avoids function calls, e.g. in the ITERATOR_NEXT case. This gave a few percent on "PyStone" as well.
- New package nuitka.codegen to contain all code generation related stuff, moved nuitka.templates to nuitka.codegen.templates as part of that.
- Inside the nuitka.codegen package the MainControl module no longer reaches into Generator for simple things, but goes through CodeGeneration for everything now.
- The Generator module uses almost no tree nodes anymore, but instead gets information passed in function calls. This allows for a cleanup of the interface towards CodeGeneration, gives a cleaner view on the C++ code generation, and generally furthers the goal of supporting backends for languages other than C++.
- More "PyLint" work, many of the reported warnings have been addressed, but it's not yet happy.
- Defaults for yield and return are None and these values are now already added (as constants) during tree building so that no such special cases need to be dealt with in CodeGeneration and future analysis steps.
- Parameter parsing code has been unified even further; the whole entry point is now generated by one of the functions in the new nuitka.codegen.ParameterParsing module.
- Split variable, exception, built-in helper classes into separate header files.
- The exit codes of CPython execution and Nuitka compiled programs are now compared as well.
- Error messages of methods are now covered by the ParameterErrors test as well.
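The idea of adding the None defaults for yield and return during tree building can be illustrated with the standard library `ast` module. This is a sketch that uses `ast` as a stand-in for Nuitka's own tree, whose node classes are not shown in these notes; the class name `ExplicitNoneDefaults` is hypothetical.

```python
import ast


class ExplicitNoneDefaults(ast.NodeTransformer):
    """Replace the missing value of a bare return or yield with a None constant."""

    def visit_Return(self, node):
        self.generic_visit(node)
        if node.value is None:
            node.value = ast.Constant(value=None)
        return node

    def visit_Yield(self, node):
        self.generic_visit(node)
        if node.value is None:
            node.value = ast.Constant(value=None)
        return node


source = """
def f():
    return

def g():
    yield
"""

tree = ExplicitNoneDefaults().visit(ast.parse(source))
tree = ast.fix_missing_locations(tree)

# After the rewrite, later passes never see a return or yield without a value.
remaining = [
    node
    for node in ast.walk(tree)
    if isinstance(node, (ast.Return, ast.Yield)) and node.value is None
]

namespace = {}
exec(compile(tree, "<example>", "exec"), namespace)
print(len(remaining), namespace["f"](), list(namespace["g"]()))
```

With the default made explicit once, at tree building time, every later step can treat `return` and `yield` as always carrying a value expression.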
The "benchmark.sh" script (now called "run-valgrind.py") now starts "kcachegrind" to display the valgrind result directly.
One can now use it to execute a test and inspect valgrind information right away, then improve it. Very useful to discover methods for improvements, test them, then refine some more.
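A workflow of this kind could look roughly like the sketch below. It is not the content of "run-valgrind.py"; the function names, the output file name, and the choice of the callgrind tool are assumptions made for illustration (kcachegrind displays callgrind output files).

```python
import subprocess


def build_commands(program_args, out_file="callgrind.out"):
    """Return the valgrind and kcachegrind command lines as argument lists."""
    valgrind_cmd = [
        "valgrind",
        "--tool=callgrind",                  # assumed tool, not stated in the notes
        "--callgrind-out-file=" + out_file,  # fixed name so the viewer can find it
    ] + list(program_args)
    viewer_cmd = ["kcachegrind", out_file]
    return valgrind_cmd, viewer_cmd


def profile_and_view(program_args, out_file="callgrind.out"):
    """Run the program under valgrind, then open the result in kcachegrind."""
    valgrind_cmd, viewer_cmd = build_commands(program_args, out_file)
    subprocess.check_call(valgrind_cmd)
    subprocess.check_call(viewer_cmd)


valgrind_cmd, viewer_cmd = build_commands(["./compiled_test.exe"])
print(" ".join(valgrind_cmd))
print(" ".join(viewer_cmd))
```

Calling `profile_and_view` requires valgrind and kcachegrind to be installed; `build_commands` only assembles the command lines and can be inspected without them.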
The "check-release.sh" script needs to unset NUITKA_EXTRA_OPTIONS or else the reflection test will trip over the changed output paths.
Pystone(1.1) time for 50000 passes = 0.65
This machine benchmarks at 76923.1 pystones/second
Nuitka 0.3.7 (driven by python 2.6):
Pystone(1.1) time for 50000 passes = 0.28
This machine benchmarks at 178571 pystones/second
This makes 0.3.7 132% faster than CPython, up from 109% for the previous release. This is another small increase that can be fully attributed to milestone 2 measures, i.e. not analysis, but purely more efficient C++ code generation and the new compiled method type.
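The 132% figure follows from the two benchmark results quoted above. A short calculation, using only those numbers (the variable names are chosen here for illustration):

```python
cpython_rate = 76923.1   # pystones/second, the 0.65 second run
nuitka_rate = 178571.0   # pystones/second, Nuitka 0.3.7

# How many times faster the compiled program runs.
speedup_factor = nuitka_rate / cpython_rate

# The same ratio expressed as "percent faster than CPython".
percent_faster = (speedup_factor - 1.0) * 100.0

# Cross-check using the wall-clock times for 50000 passes.
time_ratio = 0.65 / 0.28

print(round(speedup_factor, 2), round(percent_faster), round(time_ratio, 2))
```

A factor of about 2.32 corresponds to roughly 132% more pystones per second than the CPython run.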
One can now safely assume that it is at least twice as fast, but I will try and get the PyPy or Shedskin test suite to run as benchmarks to prove it.
No milestone 3 work in this release. I believe it's best to finish with milestone 2 first, because these are quite universal gains that we should have covered.