Nuitka Release 0.5.10
This release has a focus on code generation optimization. Doing major changes away from “C++-ish” code to “C-ish” code, many constructs are now faster or got looked at and optimized.
Compatibility: The variable name in locals for the iterator provided to the generator expression should be
.0, now it is.
Generators could leak frames until program exit, these are now properly freed immediately.
Faster exception save and restore functions that might be in-lined by the backend C compiler.
Faster error checks for many operations, where these errors are expected, e.g. instance attribute lookups.
Do not create traceback and locals dictionary for frame when
GeneratorExitare raised. These tracebacks were wasted, as they were immediately released afterwards.
Closure variables to functions and parameters of generator functions are now attached to the function and generator objects.
The creation of functions with closure taking was accelerated.
The creation and destruction of generator objects was accelerated.
The re-formulation for in-place assignments got simplified and got faster doing so.
In-place operations of
strwere always copying the string, even if was not necessary.
a += b # Was not re-using the storage of "a" in case of strings
Python2: Additions of
intfor Python2 are now even faster.
Access to local variable values got slightly accelerated at the expense of closure variables.
Added support for optimizing the
Removing unused temporary and local variables as a result of optimization, these previously still allocated storage.
The use of C++ classes for variable objects was removed. Closure variables are now attached as
PyCellObjectto the function objects owning them.
The use of C++ context classes for closure taking and generator parameters has been replaced with attaching values directly to functions and generator objects.
The indentation of code template instantiations spanning multiple was not in all cases proper. We were using emission objects that handle it new lines in code and mere
listobjects, that don’t handle them in mixed forms. Now only the emission objects are used.
Some templates with C++ helper functions that had no variables got changed to be properly formatted templates.
The internal API for handling of exceptions is now more consistent and used more efficiently.
The printing helpers got cleaned up and moved to static code, removing any need for forward declaration.
The use of
INCREASE_REFCOUNT_Xwas removed, it got replaced with proper
Py_XINCREFusages. The function was once required before “C-ish” lifted the need to do everything in one function call.
The use of
INCREASE_REFCOUNTgot reduced. See above for why that is any good. The idea is that
Py_INCREFmust be good enough, and that we want to avoid the C function it was, even if in-lined.
assertObjectfunction that checks if an object is not
NULLand has positive reference count, i.e. is sane, got turned into a preprocessor macro.
Deep hashes of constant values created in
--debugmode, which cover also mutable values, and attempt to depend on actual content. These are checked at program exit for corruption. This may help uncover bugs.
Speedcenter has been enhanced with better graphing and has more benchmarks now. More work will be needed to make it useful.
Updates to the Developer Manual, reflecting the current near finished state of “C-ish” code generation.
New reference count tests to cover generator expressions and their usage got added.
Many new construct based tests got added, these will be used for performance graphing, and serve as micro benchmarks now.
Again, more basic tests are directly executable with Python3.
This is the next evolution of “C-ish” coming to pass. The use of C++ has for all practical purposes vanished. It will remain an ongoing activity to clear that up and become real C. The C++ classes were a huge road block to many things, that now will become simpler. One example of these were in-place operations, which now can be dealt with easily.
Also, lots of polishing and tweaking was done while adding construct benchmarks that were made to check the impact of these changes. Here, generators probably stand out the most, as some of the missed optimization got revealed and then addressed.
Their speed increases will be visible to some programs that depend a lot on generators.
This release is clearly major in that the most important issues got addressed, future releases will provide more tuning and completeness, but structurally the “C-ish” migration has succeeded, and now we can reap the benefits in the coming releases. More work will be needed for all in-place operations to be accelerated.
More work will be needed to complete this, but it’s good that this is coming to an end, so we can focus on SSA based optimization for the major gains to be had.