Nuitka Release 0.6.17
This release has a focus on performance improvements, while also polishing plugins and adding many new features.
Fix, plugins were not catching being used on packages not installed. Fixed in 0.6.16.2 already.
macOS: Fix weaknesses in the
otoolparsing to determine DLL dependency parsing. Fixed in 0.6.16.2 already.
Linux: Allow onefile program args with spaces contained to be properly passed. Fixed in 0.6.16.3 already.
Windows: Avoid using less portable C function for
%PID%formatting, which restores compilation on Windows 7 with old toolchains. Fixed in 0.6.16.3 already.
Standalone: Added support for
fstringspackage. Fixed in 0.6.16.3 already.
Compatibility: Fix, need to import
sitemodule, not before. This was causing crashes on CentOS7 with Python2. Fixed in 0.6.16.3 already.
Compatibility: Fix, when extension modules failed to load, in some cases the
ImportErrorwas lost to a
KeyError. Fixed in 0.6.16.3 already.
Fix, linker resource modes
linkerwere not working anymore, but are needed with LTO mode at least. Fixed in 0.6.16.3 already.
Standalone: Bytecode modules with null bytes in standard library, typically from disk corruption, were not handled properly. Fixed in 0.6.16.3 already.
.throw()into generators could cause corruption. Fixed in 0.6.16.4 already.
Python2: Fix, the bytecode compilation didn’t respect the
--python-flag=no_assertsmode. Fixed in 0.6.16.4 already.
Fix, calls were not annotating their arguments as escaped, causing corruption of mutable in static optimization. Fixed in 0.6.16.5 already.
Fix, some sequence objects, e.g.
numpy.arrayactually implement in-place add operations that need to be called. Fixed in 0.6.16.5 already.
Windows: Fix, onefile binaries were not working after being signed. This now works.
Standalone: Added missing implicit dependency for
Compatibility: Modules giving
SyntaxErrorfrom source were not properly handled, giving run time
ImportError. Now they are giving
Fix, the LTO mode has issues with
incbinusage on older gcc, so use
linkermode when it is enabled.
Python3: Fix, locals dict codes were not properly checking errors that the mapping might raise when setting values.
Fix, modules named
entrywere causing compile time errors in the C stage.
macOS: Never include files from OS private frameworks in standalone mode.
Fix, the python flag
--python-flag=no_warningwasn’t working on all platforms.
Compatibility: Fix, the main code of the
sitemodule wasn’t executing, so that its added builtins were not there. Of course, you ought to use
--python-flag=no_siteto not have it in the normal case.
Python2: Added code path to handle edited standard library source code which then has no valid bytecode file.
Anaconda: In module mode, the CondaCC wasn’t recognized as form of gcc.
Fix, bytecode modules could shadow compiled modules of the same name.
Onefile: Fix, expansion of
%PID%wasn’t working properly on non-Windows, making temp paths less unique. The time stamp is not necessarily enough.
multiprocessingerror exits from slave processes were not reporting tracebacks.
xcbglintegrationsto the list of sensible Qt plugins to include by default, otherwise rendering will be inferior.
platformthemesto the list of sensible Qt plugins to include by default, otherwise file dialogs on non-Windows would be inferior.
.pyifiles were not ordered deterministically.
Standalone: Added support for
Fix, namespace packages were not using run time values for their
Python3.7+: Fix, was leaking
AttributeErrorexceptions during name imports.
Fix, standard library detection could fail for relative paths.
Added experimental support for C level PGO (Profile Guided Optimization), which runs your program and then uses feedback from the execution. At this time only gcc is supported, and only C compiler is collecting feedback. Check the User Manual for a table with current results.
macOS: Added experimental support for creating application bundles. For these, icons can be specified and console can be disabled. But at this time, onefile and accelerated mode are not yet usable with it, only standalone mode works.
Plugins: Add support for
pkg_resources.requirecalls to be resolved at compile time. These are not working at run time, but this avoids the issue very nicely.
Plugins: Massive improvements to the
anti-bloatplugin, it can now make
matplotlibuse much less packages and has better error handling.
anti-bloatability ability to append code to a module, which might get used in the future by other plugins that need some sort of post load changes to be applied.
Plugins: Added ability to replace code of functions at parse time, and use this in
anti-bloatplugin to replace functions that do unnecessary stuff with variants that often just do nothing. This is illustrated here.
gevent._util: description: "remove gevent release framework" change_function: "prereleaser_middle": "'(lambda data: None)'" "postreleaser_before": "'(lambda data: None)'"
This example is removing
geventcode that loads dependencies used for their CI release process, that need not be part of normal programs.
Added ability to persist source code changes done by plugins in the Python installation. This is considered experimental and needs write access to the Python installation, so this is best done in a virtualenv and it may confuse plugins.
Added support for
multiprocessing.trackerand spawn mode for all platforms. For non-default modes outside of Windows, you need to
--enable-plugin=multiprocessingto use these.
Plugins: Allow multiple entry points to be provided by one or several plugins for the same modules. These are now merged into one automatically.
Standalone: Fix for numpy not working when compiling with
Fix, method calls were not respecting descriptors provided by types with non-generic attribute lookups.
Windows: Add support for using self-compiled Python3 from the build folder too.
Added support for Nuitka-Python 2.7, which will be our faster Python fork.
Colorized output for error outputs encountered in Scons, these are now yellow for better recognition.
Faster threading code was used for Python3.8 or higher, and this has been extended to 3.7 on Windows, but we won’t be able to have it other platforms and not on earlier Python3 versions.
Faster calls esp. with keyword arguments. Call with keywords no longer create dictionaries if the call target supports that, and with 3.8 or higher, non-compiled code that allows vectorcall is taken advantage of.
Faster class creation that avoids creation of argument tuples and dictionaries.
Faster attribute check code in case of non-present attributes.
Faster unbound method calls, unlike bound methods calls these were not optimized as well yet.
Type shapes for star arguments are now known and used in optimization.
def f(*args, **kwargs): type(args) # Statically known to be tuple type(kwargs) # Statically known to be dict
Python2: Faster old-style class creation. These are classes that do not explicitly inherit from
Python2: Faster string comparisons for Python by specializing for the
strtype as well.
Python3: Added specialization for
bytescomparisons too. These are naturally very much the same as
strcomparisons in Python2.
Added specialization for
listcomparisons too. We had them for
tuplesonly so far.
Faster method calls when called from Python core, our
tp_callslot wasn’t as good as it can be.
Optimization: Faster deep copies of constants. This can speed up constant calls with mutable types. Before it was checking the type too often to be fast.
Allow using static linking with Debian Python giving much better performance with the system Python. This is actually a huge improvement as it makes things much faster. So far it’s only automatically enabled for Python2, but it seems to work for Python3 on Debian too. Needs more tweaking in the future.
functoolsmodule to the list of hard imports in preparation of optimizing
functools.partialto work better with compiled functions.
Python2: Demote to
xrangewhen iterating over
rangecalls, even for small ranges, they are always faster. Previously this was only done for values with at least 256 values.
Enable LTO automatically for Debian Python, this also allows more optimization.
Enable LTO automatically for Anaconda with CondaCC on non-Windows, also allowing more optimization.
Added section in the User Manual on how to deal with memory issues and C compiler bugs. This is a frequent topic and should serve as a pointer for this kind of issue.
--ltooption was changed to require an argument, so that it can also be disabled. The default is
autowhich is the old behaviour where it’s enabled if possible.
--no-progressbarin order to make it more clear what it’s about. Previously it was possible to relate it to
No longer require specific versions of dependencies in our
requirements.txtand relegate those to only being in
requirements-devel.txtsuch that by default Nuitka doesn’t collide with user requirements on those same packages which absolutely all the time don’t really make a difference.
Added ability to check all unpushed changes with pylint with a new
./bin/check-nuitka-with-pylint --unpushedoption. Before it was only possible to make the check (quickly) with
--diff, but that stopped working after commits are made.
Revived support for
vmprofbased analysis of compiled programs, but it requires a fork of it now.
Make Windows specific compiler options visible on all platforms. There is no point in them being errors, instead warnings are given when they are specified on non-Windows.
Added project variable
Commercialfor use in Nuitka project syntax.
Consistent use of metavars for nicer help output should make it more readable.
asttree dumps in case of
KeyboardInterruptexceptions, they are just very noisy. Also not annotate where Nuitka was in optimization when a plugin is asking to
Encoding names for UTF-8 in calls to
.encode()were used inconsistent with and without dashes in the source code, added cleanup to auto-format that picks the one blessed.
Cleanup taking of run time traces of DLLs used in preparation for using it in main code eventually, moving it to a dedicated module.
Avoid special names for Nuitka options in test runner, this only adds a level of confusion. Needs more work in future release.
Unify implementation to create modules into single function. We had 3 forms, one in recursion, one for main module, and one for plugin generated code. This makes it much easier to understand and use in plugins.
Further reduced code duplication between the two Scons files, but more work will be needed there.
Escaped variables are still known to be assigned/unassigned rather than unknown, allowing for many optimizations to still work on them., esp. for immutable value
Enhanced auto-format for rest documents, bullet list spacing is now consistent and spelling of organisational is unified automatically.
Moved icon conversion functionality to separate module, so it can be reused for other platforms more easily.
reflectedtest, because of Nuitka special needs to restart with variable Python flags. This could be reverted though, since Nuitka no longer needs anything outside inline copies, and therefore no longer loads from site packages.
anti-bloatplugin in standalone tests of Numpy, Pandas and tests to reduce their compile times, these have become much more manageable now.
Enhanced checks for used files to use proper below path checks for their ignoring.
Remove reflected test, compiling Nuitka with Nuitka has gotten too difficult.
Verify constants integrity at program end in debug mode again, so we catch corruption of them in tests.
This release is one of the most important ones in a long time. The PGO and LTO, and static libpython work make a big different for performance of created binaries.
The amount of optimization added is also huge, calls are much faster now, and object creations too. These avoiding to go through actual dictionaries and tuples in most cases when compiled code interacts gives very significant gains. This can be seen in the increase of pystone performance.
The new type specializations allow many operations to be much faster.
More work will follow in this area and important types,
int do not have specialized comparisons for Python3, holding it back
somewhat to where our Python2 performance is for these things.
For scalability, the
anti-bloat work is extremely valuable, and this
plugin should become active by default in the future, for now it must be
strongly recommended. It needs more control over what parts you want to
deactivate from it, in case of it causing problems, then we can and
should do it.
The support for macOS has been enhanced a lot, and will become perfect in the next release (currently develop). The bundle mode is needed for all kinds of GUI programs to not need a console. This platform is becoming as well supported as the others now.
Generally this release marks a huge step forward. We hope to add Python level PGO in the coming releases, for type knowledge retrofitted without any annotations used. Benchmarks will become more fun clearly.