26 March 2023
Nuitka Release 1.5
This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler, “download now”.
This release contains the long awaited 3.11 support, even if only on an experimental level. This means that where 3.10 code is used, it is expected to work equally well, but the Python 3.11 specific new features have not yet been done.
There are plenty of new features in Nuitka, e.g. much enhanced reports, Windows ARM native compilation support, the usual slew of anti-bloat updates, and newly supported packages.
Bug Fixes
Standalone: Added implicit dependencies for charset_normalizer package. Fixed in 1.4.1 already.
Standalone: Added platform DLLs for sounddevice package. Fixed in 1.4.1 already.
Plugins: The info from Qt bindings about other Qt bindings being suppressed for import was spanning multiple lines, breaking tests. Merged to a single line until we do text wrap for info messages as well. Fixed in 1.4.1 already.
Plugins: Fix, removeDllDependencies was broken and could no longer be used to remove DLLs from inclusion. Fixed in 1.4.1 already.
Fix, assigning methods of lists and calling them that way could crash at runtime. The same was true of dict methods, but had never been observed. Fixed in 1.4.2 already.
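The list method fix concerns code of roughly the following shape, where a bound method is stored in a variable and called through that name. This is a minimal sketch of the pattern only, not a test case taken from Nuitka, and the function names are hypothetical.

```python
def collect_squares(limit):
    """Store a bound list method in a variable, then call it through that name."""
    squares = []
    append = squares.append  # assign the bound method instead of calling it directly
    for value in range(limit):
        append(value * value)
    return squares


def lookup_with_default(mapping, key):
    """The same pattern with a dict method, which the notes say was affected too."""
    get = mapping.get
    return get(key, "missing")


print(collect_squares(4))
print(lookup_with_default({"a": 1}, "b"))
```

Both functions are expected to behave the same whether compiled or run by the interpreter.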
Standalone: Added DLL dependencies for onnxruntime. Fixed in 1.4.2 already.
Standalone: Added implicit dependencies for textual package. Fixed in 1.4.2 already.
Fix, boolean tests of lists could be optimized to a wrong result when list methods got recognized, due to not annotating the escape during that pass properly. Fixed in 1.4.3 already.
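As an illustration of why the escape annotation matters for boolean tests, consider a list that starts out empty and is then changed through one of its methods. The exact trigger inside Nuitka is not described in these notes, so the following is only a sketch of the kind of code such a wrong optimization could affect; the function name is hypothetical.

```python
def has_items_after_append():
    values = []          # known to be empty at this point
    values.append("x")   # the method call modifies the list in place
    if values:           # must reflect the list state after the append
        return True
    return False


print(has_items_after_append())
```

If the method call were not treated as changing the list, the test could wrongly be decided from the empty literal alone.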
Standalone: Added missing implicit dependency of apsw. Fixed in 1.4.3 already.
Note
Currently apsw only works with manual workarounds and only in limited ways, there is an import level incompatibility with __init__ being an extension module, that Nuitka does not yet handle.
Python3: Fix, for range arguments that fail to divide their difference, the code would have crashed. Fixed in 1.4.3 already.
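A range whose step does not evenly divide the distance between start and stop looks like this. The sketch only shows the semantics that compiled code has to match; the helper name is hypothetical.

```python
def describe_range(start, stop, step):
    """Return the length and the values of a range."""
    values = range(start, stop, step)
    return len(values), list(values)


# 10 - 0 is not a multiple of 3, so the last step stops short of the end.
print(describe_range(0, 10, 3))
# The same situation with a negative step.
print(describe_range(10, 0, -3))
```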
Standalone: Fix, added support for newer pkg_resources with another vendored package. Fixed in 1.4.4 already.
Standalone: Fix, added support for newer shapely 2.0 versions. Fixed in 1.4.4 already.
Plugins: Fix, some yaml package configurations with DLLs by code didn’t work anymore, notably old shapely 1.7.x versions were affected. Fixed in 1.4.4 already.
Fix, for onefile final result the --output-dir option was ignored. Fixed in 1.4.4 already.
Standalone: Added mozilla-ca package data file. Fixed in 1.4.4 already.
Standalone: Fix, added missing implicit dependency for newer gevent. Fixed in 1.4.4 already.
Scons: Accept an installed Python 3.11 for Scons execution as well. Fixed in 1.4.4 already.
Python3.7: Some importlib.resources nodes asserted against use in 3.7, expecting it to be 3.8 or higher, but this interface is present in 3.7 already. Fixed in 1.4.5 already.
Standalone: Fix, Python DLLs installed to the Windows system folder were not included, causing the result to be not portable. Fixed in 1.4.5 already.
Python3.9+: Fix, metadata.resources files method joinpath in some contexts is expected to accept a variable number of arguments. Fixed in 1.4.5 already.
Standalone: Workaround for customtkinter data files on non-Windows. Fixed in 1.4.5 already.
Standalone: Added support for overrides package. Fixed in 1.4.6 already.
Standalone: Added data files for strawberry package. Fixed in 1.4.7 already.
Fix, anti-bloat plugin caused crashes when attempting to warn about packages coming from --include-package by the user. Fixed in 1.4.7 already.
Windows: Fix, main program filenames with an extra dot apart from the .py suffix, had the part beyond that wrongly trimmed. Fixed in 1.4.7 already.
Fix, list methods didn’t properly annotate value escape during their optimization, which could lead to wrong optimization for boolean tests. Fixed in 1.4.7 already.
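For the Windows filename fix, the intended behavior is that only the final .py suffix is removed from the main program name. The following sketch shows that expectation with the standard library; it is not Nuitka's own code, and the helper and file names are hypothetical.

```python
import os


def program_basename(filename):
    """Strip only the final suffix; earlier dots belong to the program name."""
    name, _suffix = os.path.splitext(os.path.basename(filename))
    return name


print(program_basename("my.tool.py"))
print(program_basename("main.py"))
```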
Standalone: Added support for imagej, scyjava, jpype packages. Fixed in 1.4.8 already.
Fix, using --include-package on extension module names was not working. Fixed in 1.4.8 already.
Standalone: Added support for tensorflow.keras namespace as well.
Distutils: Fix, namespace packages were not including their contained modules properly with regards to __file__ properties, making relative file access impossible.
Onefile: On Windows the onefile binary did lock itself, which could fail with certain types of AV software. This is now avoided.
Accessing files using the top level metadata.resources files object was not working properly, this is now supported too.
MSYS2: Make sure mixing POSIX and Windows slashes causes no issues by hard-coding the onefile archive to use the subsystem slash rather than what MSYS prefers to use internally.
Standalone: Added missing dependencies of newer imageio.
Fix, side effect nodes didn’t annotate their non-exception raising nature properly, if that was the case.
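Several fixes in this list concern the files objects used for resource access. Assuming these are the objects returned by the standard library function importlib.resources.files() (an assumption, the notes do not name it), the calls in question look roughly like this. The email package is used purely for illustration and stands in for any package with data files.

```python
from importlib.resources import files  # available with Python 3.9 or higher

# Standard library package, chosen only because it is always installed.
package_files = files("email")

# joinpath called with more than one path component.
text_module = package_files.joinpath("mime", "text.py")
print(text_module.is_file())

# Using the top level files object directly, here to list its entries.
entry_names = sorted(entry.name for entry in package_files.iterdir())
print("mime" in entry_names)
```

Compiled programs are expected to give the same answers as the interpreter for such calls.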
New Features
Added experimental support for Python 3.11, for 3.10 language level code it should be fully usable, but the CPython311 test suite has not even been started to check newly added or changed features.
Windows: Support for native Python on Windows ARM64, which needs 3.11 or higher, but standalone and therefore onefile do not yet work, due to lack of any form of binary dependency analysis tool.
This platform is relatively new in Python and generally. For the time being standalone and onefile should be done with Intel based Python, they would also be ARM64 only, whereas 32/64 Bit binaries can be run on all Windows ARM platforms.
Reports: Write compilation report even in case of Nuitka being interrupted or crashing. This then includes the exception, and a status like completed or interrupted. At this time this happens only when --report= was specified, but in the future we will likely write one in case of Nuitka crashes.
Reports: Now the details of the used Python version, its flavor, the OS and the architecture are included. This is crucial information for analysis and can make --version output unnecessary.
Reports: License reports now handle UNKNOWN license by falling back to checking the classifiers, and therefore include the correct license e.g. with setuptools. Also in case no license text is found, do not create an empty block. Added in 1.4.4 already.
Reports: In case the distribution name and the contained package names differ, output the list of packages included from a distribution. Added in 1.4.4 already.
Reports: Include data file sizes in report. Added in 1.4.7 already.
Reports: Include memory usage into the compilation report as well.
macOS: Add support for downloading ccache on arm64 (M1/M2) too. Added in 1.4.4 already.
UI: Allow --output-filename for standalone mode again. Added in 1.4.3 already.
Standalone: Improved isolation with Python 3.8 or higher. Using new init mechanisms of Python, we now achieve that the scan for pyvenv.cfg in the current directory and above is not done, as using it would be unwanted.
Python2: Expose __loader__ for modules and register with pkg_resources too which expects these to be present for custom resource handling.
Python3.9+: The metadata.resources files objects method iterdir was not implemented yet. Fixed in 1.4.5 already.
Python3.9+: The metadata.resources files objects method absolute was not implemented yet.
Added experimental ability to create virtualenv from an existing compilation report with new --create-environment-from-report option. It attempts to create a requirements file with the used packages and their versions. However, sometimes it seems not to be possible due to conflicts.
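To get a compilation report in the first place, the --report= option has to be given. The sketch below only assembles such a command line and does not run it; the script name and the report file name are hypothetical, and the helper is not part of Nuitka.

```python
import sys


def build_report_command(main_script, report_path):
    """Assemble a Nuitka command line that also writes a compilation report."""
    return [
        sys.executable,
        "-m",
        "nuitka",
        "--standalone",
        "--report=" + report_path,  # report file name chosen by the user
        main_script,
    ]


command = build_report_command("main.py", "compilation-report.xml")
print(" ".join(command))
```

The resulting list could be passed to subprocess.run on a machine where Nuitka is installed.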
Optimization
Onefile: Use memory mapping for calculating the checksum of files on all platforms. This is faster and simpler code. So far it had only been done this way on Windows, but other platforms also benefit a lot from it.
Onefile: Use memory mapping for accessing the payload rather than file operations. This avoids differences to macOS payload handling and is much faster too.
Anti-Bloat: Avoid using dask in joblib.
Note
Newer versions of joblib do not currently work yet due to their own form of multiprocessing spawn not being supported yet.
Anti-Bloat: Adapt for newer pandas package.
Anti-Bloat: Remove more IPython usages in newer tensorflow.
Use dedicated class bodies for Python2 and Python3, where the former has a static dict type shape, and with Python3 this needs to be traced in order to tell what the meta class put in there.
Compile time optimize dict in / not in and dict.has_key operations statically where the keys of a dict are known. As a result, the class declarations of Python3 no longer create code for both branches, the one with metaclass = in the class declaration and without. That means also a big scalability improvement.
For the Python3 class bodies, the usage of locals() was not recognized as not locally escaping all the variables, leading to variable traces where each class variable was marked as escaped for no good reason.
Added support for dict.fromkeys method, making the code generation understand and handle static methods as well.
Added support for os.listdir and os.path.basename. Added in 1.4.5 already for use in implementing the iterdir method, but they are also now optimized by themselves.
Added support for trusted constant values of the os module. These are curdir, pardir, sep, extsep, altsep, pathsep, linesep which may enable some minor compile time optimization to happen and completes this aspect of the os module.
Faster digit size checks during float code generation for better compile time performance.
Faster list operations due to using PyList_CheckExact everywhere this is applicable, this mostly makes debug operations faster, but also deep copying list values, or extending lists with iterables, etc.
Optimization: Collect module usages of the given module during its abstract execution. This avoids a full tree visit afterwards only to find them. It is much cheaper to collect them while we go over the tree. This enhances the scalability of large compilations by ca. 5%.
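The dict and os items in this section refer to expressions of the following kind. This is only a sketch of what such code looks like in a program; whether a given occurrence is decided at compile time depends on Nuitka's analysis, and the variable names are hypothetical.

```python
import os

class_keywords = {"metaclass": type}

# Membership tests on a dict whose keys are known.
has_metaclass = "metaclass" in class_keywords
lacks_bases = "bases" not in class_keywords

# dict.fromkeys is called on the type itself, not on an instance.
defaults = dict.fromkeys(("a", "b"), 0)

# os.path.basename and the constant values of the os module.
data_name = os.path.basename(os.path.join("pkg", "data.txt"))
parent_reference = os.pardir + os.sep + data_name

print(has_metaclass, lacks_bases, defaults, data_name, parent_reference)
```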
Optimization: Faster determination of loop variables. Rather than using a generic visitor, we use the children having generator codes to add traversal code that emits relevant variables to the user directly.
Cache extra search paths in order to avoid repeated directory operations as these are known to be slow at times.
Standalone: Do not include py.typed data files, these indicator files are for IDEs, but not needed at run time ever.
Make sure that the generic attribute code optimization is also effective in cases where a Python DLL is used. Previously this was only guaranteed to be used with static libpython.
Faster list constant usage
Small immutable constants get their own code that is much faster for small sizes.
Medium sized lists get code that just is hinted the size, but takes items from a source list, still a lot faster.
For repeated lists where all elements are the same, we use a dedicated helper for all sizes, that is even faster except for small ones with LTO enabled, where the C compiler may already do that effectively.
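List constants need dedicated code because every evaluation of a list literal has to produce a fresh, independent list object. The sketch below shows that requirement with a small literal and a repeated-element literal; the function names are hypothetical and the sizes are arbitrary, not Nuitka's actual thresholds.

```python
def small_list():
    return [1, 2, 3]


def repeated_list():
    # All elements are the same value.
    return [0, 0, 0, 0, 0, 0, 0, 0]


first = small_list()
second = small_list()
first.append(4)  # modifying one result must not affect the other

print(first, second)
print(repeated_list())
```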
Added optimization for os.path.abspath and os.path.isabs which of course have not as much potential for compile time optimization, but we needed them for providing .absolute() for the meta path loader files implementation.
Faster class dictionary propagation decision. Instead of checking for trace types, let the trace object decide. Also abort immediately on first inhibit, rather than checking all variables. This improves Python2 compile time, and Python3 where this code is now starting to get used when the class dictionary is shown to have dict type.
Specialize type method __prepare__ which is used in the Python3 re-formation of class bodies to initialize the class dictionary. Where the metaclass is resolved, we can use this to decide that the standard empty dictionary is used statically, enabling class dictionary propagation for best scalability.
At this time this only happens with classes without bases, but we expect to soon do this with all compile time known base classes. For these optimizations to become effective, we need to optimize meta class selection from base classes, as well as modification of base classes with __mro_entries__ methods.
The bool built-in on boolean values is now optimized away.
Since it’s used also for conditions being extracted, this is actually somewhat relevant, since it could keep code alive in side effects at least for no good reason and this allows a proper reduction.
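The role of __prepare__ in class creation can be seen in plain Python. With the default metaclass type, the class body starts from an empty dict, which is what makes the static decision described in this section possible; a custom metaclass may return something else. The class names below are hypothetical and the sketch is unrelated to Nuitka's internals.

```python
# With the default metaclass, the class namespace starts as an empty dict.
default_namespace = type.__prepare__("Plain", ())
print(type(default_namespace) is dict, default_namespace)


class RecordingMeta(type):
    """A metaclass that pre-populates the class namespace."""

    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        return {"created_by": mcls.__name__}


class Custom(metaclass=RecordingMeta):
    pass


# The value placed in the namespace by __prepare__ became a class attribute.
print(Custom.created_by)
```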
Organizational
Project: Require the useful stuff for installation of Nuitka already. These are things we cannot inline really, but otherwise will frequently be warned about, e.g. zstandard for onefile and ordered-set for fast operation, but we do not require packages that might fail to install.
User Manual: Added section about virus scanners and how to avoid false reports.
User Manual: Enhanced description for plugin module loading, the old code was too complicated and actually working only for a mode of including plugin code that is discouraged.
User Manual: Fix section for standalone finding files on wrong level.
Windows: Using the console on Python 3.4 to 3.7 is not working very well with e.g. many Asian systems. Nuitka fails to setup the encoding for stdin and stdout on this platform. It can then produce exceptions on input or output of unicode data that doesn’t overlap with UTF-8.
We now inform the users of these older Python versions with a warning and mnemonic, to either disable the console or to upgrade to Python 3.8 or higher, which normally won’t be much of an issue for most users.
Read more on the info page for detailed information. Added in 1.4.1 already.
Debugging: Fixup debugging reference count output with Python3.4. For Python 3.11 compatibility tests, actually it was useful to compare with a version that doesn’t have coroutines yet. Never tell me, supporting old versions is not good.
Deprecating support for Python 3.3, there is no apparent use of this version, and it has gained specific bugs, that are indeed not worth our time. Python 2.6 and Python 2.7 will continue to be supported probably indefinitely.
Recommend ordered-set for Python 3.7 to 3.9 as well, and not only for 3.10+, because on Windows, to install orderedset MSVC needs to be installed, whereas ordered-set has a wheel for ready use.
Actually zstandard requirement is for a minimal version, added that to the requirement files.
Debugging: Do not re-execute Nuitka in case we are debugging it from Visual Code.
Debugging: Include the .pdb files in Windows standalone mode for proper C tracebacks should that be necessary.
UI: Detect the GitHub flavor of Python as well.
Quality: Check the clang-format version to avoid older ones with bugs that made it switch whitespace for one file. Using the one from Visual Code C extension is a good idea, since it will often be available. Running the checks on newer Ubuntu GitHub Actions runner to have the correct version available.
Quality: Updated the version of rstfmt and isort to the latest versions.
GitHub: Added commented out section for enabling ssh login, which we occasionally need to git bisect problems specific to GitHub Python flavor.
Plugins: Report problematic plugin name with module name or DLL name when these raise exceptions.
Use ordered-set package for Python3.7+ rather than only Python3.10+ because it doesn’t need any build dependency on Windows.
UI: When showing source changes, also display the module name with the changed code.
UI: Use function intended for user query when asking about downloads too.
UI: Do not report usage of ccache for linking from newer version, that is not relevant.
Onefile: Make sure we have proper error codes when reporting IO errors.
MSVC: Detect a version for developer prompts too. This version is needed for use in enabling version specific features.
Started UML diagrams with plantuml that will need to be completed before using them in the new and more visual parts of Nuitka documentation.
UI: Check icon conversion capability at start of compilation rather than error exiting at the very end informing the user about required imageio packages to convert to native icons.
Quality: Enhanced autoformat on Windows, which was susceptible to tools introducing Windows new lines before other steps were performed, that then could be confused, also enforcing use of UTF-8 encoding when working with Nuitka source code for formatting.
Cleanups
The delvewheel plugin was still using a zmq class name from its original implementation, adapted that.
Use common template for generator frames as well. This made them also work with 3.11, by avoiding duplication.
Applied code formatting to many more files in tests, etc.
Removed a few micro benchmarks that are instead to be covered by construct-based tests now.
Enhanced code generation for specialized in-place operations to avoid unused code for operations that do not have any shortcuts where the operation would be actual in-place of a reference count 1 object.
Better code generation for module variable in-place operations with proper indentation and no repeated calls.
Plugins: Use the namedtuple factory that we created for informational tuples from plugins as well.
Make details of download utils module more accessible for better reuse.
Remove last remaining Python 3.2 version check in C code, for us this is just Python3 with 3.2 being unsupported.
Cleanup, name generated call helper file properly, indicating that it is a generated file.
Tests
Made the CPython3.10 test suite largely executable with Python 3.11 and running that with CI now.
Allow measuring constructs without writing the code diff again. Was crashing when no filename was given.
Make Python3.11 test execution recognized by generally accepting partially supported versions to execute the tests with.
Handle also newfstat directory checks in file usage scan. These are used on newer Linux systems.
GitHub: In actions use --report for coverage and upload the reports as artifacts.
Use no-qt plugin to avoid warnings in matplotlib test rather than disabling the warnings about Qt bindings.
macOS: Detect if the machine can take runtime traces, which on Apple Silicon by default it cannot.
macOS: Cover all APIs for file tracing, rather than just one for extended coverage.
Fix, distutils test was not installing the built wheels, but the source archive, and therefore compiling that a second time.
For the pyproject.toml using tests, Nuitka was always downloaded from PyPI rather than using the version under test.
Ignore ld info output about mismatching architecture libraries being ignored. Fixed in 1.4.1 already.
Summary
With this release an important new avenue for scalability has been started. While for Python2 class bodies were very often reduced to just that dictionary creation, with Python3 that was not the case, due to the many new complexities. This release makes a start, and we will be able to continue on this path towards much more scalable class creation codes. And while the performance does not really matter all that much for these, knowing them will ultimately lead us to “compiled classes” as our own type, and “compiled objects” that may well perform much faster.
Already now, the enhancements to class creation codes will result in smaller binaries, but much more is expected the more this is completed.
The majority of the work was of course to become Python3.11 compatible, and unfortunately the attribute lookups are not as optimized as for 3.10 yet, which may cause disappointing results for performance initially. We will need to complete that before benchmarks will make much sense.
For the next release, full Python 3.11 support is planned. I believe it should be usable. Problems with 3.11 may get hotfixes, but ultimately the develop version is probably the one to recommend when using 3.11 with Nuitka, as there will be the whole set of fixes, since not everything will be ported back.
The new reports should be used in bug reporting soon. We foresee that for issue reports, these may well become mandatory. Together with the ability to create a virtualenv from the reports, this may make reproducing issues a breeze, but first tries on complex projects were also highlighting that it may not be as simple.