11 November 2021
Nuitka Release 0.6.17
This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler, “download now”.
This release has a focus on performance improvements, while also polishing plugins and adding many new features.
Bug Fixes
Fix, plugins were not catching being used on packages not installed. Fixed in 0.6.16.2 already.
macOS: Fix weaknesses in the
otool
parsing to determine DLL dependency parsing. Fixed in 0.6.16.2 already.Linux: Allow onefile program args with spaces contained to be properly passed. Fixed in 0.6.16.3 already.
Windows: Avoid using less portable C function for
%PID%
formatting, which restores compilation on Windows 7 with old toolchains. Fixed in 0.6.16.3 already.Standalone: Added support for
fstrings
package. Fixed in 0.6.16.3 already.Compatibility: Fix, need to import
.pth
files aftersite
module, not before. This was causing crashes on CentOS7 with Python2. Fixed in 0.6.16.3 already.Compatibility: Fix, when extension modules failed to load, in some cases the
ImportError
was lost to aKeyError
. Fixed in 0.6.16.3 already.Fix, linker resource modes
code
andlinker
were not working anymore, but are needed with LTO mode at least. Fixed in 0.6.16.3 already.Standalone: Bytecode modules with null bytes in standard library, typically from disk corruption, were not handled properly. Fixed in 0.6.16.3 already.
Fix, failed
.throw()
into generators could cause corruption. Fixed in 0.6.16.4 already.Python2: Fix, the bytecode compilation didn’t respect the
--python-flag=no_asserts
mode. Fixed in 0.6.16.4 already.Fix, calls were not annotating their arguments as escaped, causing corruption of mutable in static optimization. Fixed in 0.6.16.5 already.
Fix, some sequence objects, e.g.
numpy.array
actually implement in-place add operations that need to be called. Fixed in 0.6.16.5 already.Windows: Fix, onefile binaries were not working after being signed. This now works.
Standalone: Added missing implicit dependency for
sklearn
.Compatibility: Modules giving
SyntaxError
from source were not properly handled, giving run timeImportError
. Now they are givingSyntaxError
.Fix, the LTO mode has issues with
incbin
usage on older gcc, so uselinker
mode when it is enabled.Python3: Fix, locals dict codes were not properly checking errors that the mapping might raise when setting values.
Fix, modules named
entry
were causing compile time errors in the C stage.macOS: Never include files from OS private frameworks in standalone mode.
Fix, the python flag
--python-flag=no_warning
wasn’t working on all platforms.Compatibility: Fix, the main code of the
site
module wasn’t executing, so that its added builtins were not there. Of course, you ought to use--python-flag=no_site
to not have it in the normal case.Python2: Added code path to handle edited standard library source code which then has no valid bytecode file.
Anaconda: In module mode, the CondaCC wasn’t recognized as form of gcc.
Fix, bytecode modules could shadow compiled modules of the same name.
Onefile: Fix, expansion of
%PID%
wasn’t working properly on non-Windows, making temp paths less unique. The time stamp is not necessarily enough.Fix,
multiprocessing
error exits from slave processes were not reporting tracebacks.Standalone: Added
xcbglintegrations
to the list of sensible Qt plugins to include by default, otherwise rendering will be inferior.Standalone: Added
platformthemes
to the list of sensible Qt plugins to include by default, otherwise file dialogs on non-Windows would be inferior.Fix, created
.pyi
files were not ordered deterministically.Standalone: Added support for
win32file
.Fix, namespace packages were not using run time values for their
__path__
value.Python3.7+: Fix, was leaking
AttributeError
exceptions during name imports.Fix, standard library detection could fail for relative paths.
New Features
Added experimental support for C level PGO (Profile Guided Optimization), which runs your program and then uses feedback from the execution. At this time only gcc is supported, and only C compiler is collecting feedback. Check the User Manual for a table with current results.
macOS: Added experimental support for creating application bundles. For these, icons can be specified and console can be disabled. But at this time, onefile and accelerated mode are not yet usable with it, only standalone mode works.
Plugins: Add support for
pkg_resources.require
calls to be resolved at compile time. These are not working at run time, but this avoids the issue very nicely.Plugins: Massive improvements to the
anti-bloat
plugin, it can now makenumpy
,scipy
,skimage
,pywt
, andmatplotlib
use much less packages and has better error handling.Plugins: Added
anti-bloat
ability ability to append code to a module, which might get used in the future by other plugins that need some sort of post load changes to be applied.Plugins: Added ability to replace code of functions at parse time, and use this in
anti-bloat
plugin to replace functions that do unnecessary stuff with variants that often just do nothing. This is illustrated here.gevent._util: description: "remove gevent release framework" change_function: "prereleaser_middle": "'(lambda data: None)'" "postreleaser_before": "'(lambda data: None)'"
This example is removing
gevent
code that loads dependencies used for their CI release process, that need not be part of normal programs.Added ability to persist source code changes done by plugins in the Python installation. This is considered experimental and needs write access to the Python installation, so this is best done in a virtualenv and it may confuse plugins.
Added support for
multiprocessing.tracker
and spawn mode for all platforms. For non-default modes outside of Windows, you need to--enable-plugin=multiprocessing
to use these.Plugins: Allow multiple entry points to be provided by one or several plugins for the same modules. These are now merged into one automatically.
Standalone: Fix for numpy not working when compiling with
--python-flag=no_docstrings
.Fix, method calls were not respecting descriptors provided by types with non-generic attribute lookups.
Windows: Add support for using self-compiled Python3 from the build folder too.
Added support for Nuitka-Python 2.7, which will be our faster Python fork.
Colorized output for error outputs encountered in Scons, these are now yellow for better recognition.
Optimization
Faster threading code was used for Python3.8 or higher, and this has been extended to 3.7 on Windows, but we won’t be able to have it other platforms and not on earlier Python3 versions.
Faster calls esp. with keyword arguments. Call with keywords no longer create dictionaries if the call target supports that, and with 3.8 or higher, non-compiled code that allows vectorcall is taken advantage of.
Faster class creation that avoids creation of argument tuples and dictionaries.
Faster attribute check code in case of non-present attributes.
Faster unbound method calls, unlike bound methods calls these were not optimized as well yet.
Type shapes for star arguments are now known and used in optimization.
def f(*args, **kwargs): type(args) # Statically known to be tuple type(kwargs) # Statically known to be dict
Python2: Faster old-style class creation. These are classes that do not explicitly inherit from
object
.Python2: Faster string comparisons for Python by specializing for the
str
type as well.Python3: Added specialization for
bytes
comparisons too. These are naturally very much the same asstr
comparisons in Python2.Added specialization for
list
comparisons too. We had them fortuples
only so far.Faster method calls when called from Python core, our
tp_call
slot wasn’t as good as it can be.Optimization: Faster deep copies of constants. This can speed up constant calls with mutable types. Before it was checking the type too often to be fast.
Allow using static linking with Debian Python giving much better performance with the system Python. This is actually a huge improvement as it makes things much faster. So far it’s only automatically enabled for Python2, but it seems to work for Python3 on Debian too. Needs more tweaking in the future.
Optimization: Added
functools
module to the list of hard imports in preparation of optimizingfunctools.partial
to work better with compiled functions.Python2: Demote to
xrange
when iterating overrange
calls, even for small ranges, they are always faster. Previously this was only done for values with at least 256 values.Enable LTO automatically for Debian Python, this also allows more optimization.
Enable LTO automatically for Anaconda with CondaCC on non-Windows, also allowing more optimization.
Organizational
Added section in the User Manual on how to deal with memory issues and C compiler bugs. This is a frequent topic and should serve as a pointer for this kind of issue.
The
--lto
option was changed to require an argument, so that it can also be disabled. The default isauto
which is the old behaviour where it’s enabled if possible.Changed
--no-progress
to--no-progressbar
in order to make it more clear what it’s about. Previously it was possible to relate it to--show-progress
.No longer require specific versions of dependencies in our
requirements.txt
and relegate those to only being inrequirements-devel.txt
such that by default Nuitka doesn’t collide with user requirements on those same packages which absolutely all the time don’t really make a difference.Added ability to check all unpushed changes with pylint with a new
./bin/check-nuitka-with-pylint --unpushed
option. Before it was only possible to make the check (quickly) with--diff
, but that stopped working after commits are made.Revived support for
vmprof
based analysis of compiled programs, but it requires a fork of it now.Make Windows specific compiler options visible on all platforms. There is no point in them being errors, instead warnings are given when they are specified on non-Windows.
Added project variable
Commercial
for use in Nuitka project syntax.Consistent use of metavars for nicer help output should make it more readable.
Avoid
ast
tree dumps in case ofKeyboardInterrupt
exceptions, they are just very noisy. Also not annotate where Nuitka was in optimization when a plugin is asking tosysexit
.
Cleanups
Encoding names for UTF-8 in calls to
.encode()
were used inconsistent with and without dashes in the source code, added cleanup to auto-format that picks the one blessed.Cleanup taking of run time traces of DLLs used in preparation for using it in main code eventually, moving it to a dedicated module.
Avoid special names for Nuitka options in test runner, this only adds a level of confusion. Needs more work in future release.
Unify implementation to create modules into single function. We had 3 forms, one in recursion, one for main module, and one for plugin generated code. This makes it much easier to understand and use in plugins.
Further reduced code duplication between the two Scons files, but more work will be needed there.
Escaped variables are still known to be assigned/unassigned rather than unknown, allowing for many optimizations to still work on them., esp. for immutable value
Enhanced auto-format for rest documents, bullet list spacing is now consistent and spelling of organizational is unified automatically.
Moved icon conversion functionality to separate module, so it can be reused for other platforms more easily.
Tests
Removed
reflected
test, because of Nuitka special needs to restart with variable Python flags. This could be reverted though, since Nuitka no longer needs anything outside inline copies, and therefore no longer loads from site packages.Use
anti-bloat
plugin in standalone tests of Numpy, Pandas and tests to reduce their compile times, these have become much more manageable now.Enhanced checks for used files to use proper below path checks for their ignoring.
Remove reflected test, compiling Nuitka with Nuitka has gotten too difficult.
Verify constants integrity at program end in debug mode again, so we catch corruption of them in tests.
Summary
This release is one of the most important ones in a long time. The PGO and LTO, and static libpython work make a big different for performance of created binaries.
The amount of optimization added is also huge, calls are much faster now, and object creations too. These avoiding to go through actual dictionaries and tuples in most cases when compiled code interacts gives very significant gains. This can be seen in the increase of pystone performance.
The new type specializations allow many operations to be much faster.
More work will follow in this area and important types, str
and
int
do not have specialized comparisons for Python3, holding it back
somewhat to where our Python2 performance is for these things.
For scalability, the anti-bloat
work is extremely valuable, and this
plugin should become active by default in the future, for now it must be
strongly recommended. It needs more control over what parts you want to
deactivate from it, in case of it causing problems, then we can and
should do it.
The support for macOS has been enhanced a lot, and will become perfect in the next release (currently develop). The bundle mode is needed for all kinds of GUI programs to not need a console. This platform is becoming as well supported as the others now.
Generally this release marks a huge step forward. We hope to add Python level PGO in the coming releases, for type knowledge retrofitted without any annotations used. Benchmarks will become more fun clearly.