24 January 2024
Nuitka Package Configuration Part 3
This is the third part of a post series under the tag package_config that explains the Nuitka Package Configuration in more detail. To recap, Nuitka package configuration is the way Nuitka learns about hidden dependencies, needed DLLs, data files, and just generally avoids bloat in the compilation. The details are here on a dedicate page on the web site in Nuitka Package Configuration but reading on will be just fine.
Problem Package
Each post will feature one package that caused a particular problem. In
this case, we are talking about the package toga
.
Problems like with this package are typically encountered in standalone mode only, but they also affect accelerated mode, since it doesn’t compile all the things desired in that case. Some packages, and in this instance look at what OS they are running on, environment variables, etc. and then in a relatively static fashion, but one that Nuitka cannot see through, loads a what it calls “backend” module.
We are going to look at that in some detail, and will see a workaround
applied with the anti-bloat
engine doing code modification on the
fly that make the choice determined at compile time, and visible to
Nuitka is this way.
Initial Symptom
The initial symptom reported was that toga
did suffer from broken
version lookups and therefor did not work, and we encountered even two
things, that prevented it, one was about the version number. It was
trying to do int
after resolving the version of toga by itself to
None
.
Traceback (most recent call last):
File "C:\py\dist\toga1.py", line 1, in <module>
File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
File "C:\py\dist\toga\__init__.py", line 1, in <module toga>
File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
File "C:\py\dist\toga\app.py", line 20, in <module toga.app>
File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
File "C:\py\dist\toga\widgets\base.py", line 7, in <module toga.widgets.base>
File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
File "C:\py\dist\travertino\__init__.py", line 4, in <module travertino>
File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
File "C:\py\dist\setuptools_scm\__init__.py", line 7, in <module setuptools_scm>
File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
File "C:\py\dist\setuptools_scm\_config.py", line 15, in <module setuptools_scm._config>
File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
File "C:\py\dist\setuptools_scm\_integration\pyproject_reading.py", line 8, in <module setuptools_scm._integration.pyproject_reading>
File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
File "C:\py\dist\setuptools_scm\_integration\setuptools.py", line 62, in <module setuptools_scm._integration.setuptools>
File "C:\py\dist\setuptools_scm\_integration\setuptools.py", line 29, in _warn_on_old_setuptools
ValueError: invalid literal for int() with base 10: 'unknown'
So, this is clearly something that we consider bloat in the first place,
to runtime lookup your own version number. The use of setuptools_scm
is implying the use of setuptools, for which the version cannot be
determined, and that’s crashing.
Step 1 - Analysis of initial crashing
So first thing, we did was to repair setuptools
, to know its
version. It is doing it a bit different, because it cannot use itself.
Our compile time optimization failed there, but also would be overkill.
We never came across this, since we avoid setuptools
very hard
normally, but it’s not good to be incompatible.
- module-name: 'setuptools.version'
anti-bloat:
- description: 'workaround for metadata version of setuptools'
replacements:
"pkg_resources.get_distribution('setuptools').version": "repr(__import__('setuptools.version').version.__version__)"
We do not have to include all metadata for setuptools
here, just to
get that one item, so we chose to make a simple string replacement here,
that just looks the value up at compile time and puts it into the source
code automatically. That removes the
pkg_resources.get_distribution()
call entirely.
With that, setuptools_scm
was not crashing anymore. That’s good. But
we don’t really want it to be included, since it’s good for dynamically
detecting the version from git, and what not, but including the
framework for building C extensions, not a good idea in the general
case. Nuitka therefore said this:
Nuitka-Plugins:WARNING: anti-bloat: Undesirable import of 'setuptools_scm' (intending to
Nuitka-Plugins:WARNING: avoid 'setuptools') in 'toga' (at
Nuitka-Plugins:WARNING: 'c:\3\Lib\site-packages\toga\__init__.py:99') encountered. It may
Nuitka-Plugins:WARNING: slow down compilation.
Nuitka-Plugins:WARNING: Complex topic! More information can be found at
Nuitka-Plugins:WARNING: https://nuitka.net/info/unwanted-module.html
So that’s informing the user to take action. And in the case of optional
imports, i.e. ones where using code will handle the ImportError
just
fine and work without it, we can use do this.
- module-name: 'toga'
anti-bloat:
- description: 'remove setuptools usage'
no-auto-follow:
'setuptools_scm': ''
when: 'not use_setuptools'
He we say, no not automatically follow setuptools_scm
reports,
unless there is other code that still does it. In that way, the
import still happens if some other part of the code imports the module,
but only then. We no longer enforce the non-usage of a module here, we
just make that decision based on other uses being present.
With this the bloat warning, and the inclusion of setuptools_scm
into the compilation is removed, and you always want to make as small as
possible and remove those packages that do not contribute anything but
overhead, aka bloat.
The next thing discovered was that toga
needs the toga-core
distribution to version check. For that, we use the common solution, and
tell that we want to include the metadata of it, for when toga
is
part of a compilation.
- module-name: 'toga'
data-files:
include-metadata:
- 'toga-core'
So that moved the entire issue of version looks to resolved.
Step 2 - Dynamic Backend dependency
Now on to the backend issue. What remained was a need for including the platform specific backend. One that can even be overridden by an environment variable. For full compatibility, we invented something new. Typically what we would have done is to create a toga plugin for the following snippet.
- module-name: 'toga.platform'
variables:
setup_code: 'import toga.platform'
declarations:
'toga_backend_module_name': 'toga.platform.get_platform_factory().__name__'
anti-bloat:
- change_function:
'get_platform_factory': "'importlib.import_module(%r)' % get_variable('toga_backend_module_name')"
There is a whole new thing here, a new feature that was added
specifically for this to be easy to do. And with the backend selection
being complex and partially dynamic code, we didn’t want to hard code
that. So we added support for variables
and their use in Nuitka
Package Configuration.
The first block variables
defines a mapping of expressions in
declarations
that will be evaluated at compile time given the setup
code under setup_code
.
This then allows us to have a variable with the name of the backend that
toga
decides to use. We then change the very complex function
get_platform_factory
that we used used, for compilation, to be
replacement that Nuitka will be able to statically optimize and see the
backend as a dependency and use it directly at run time, which is what
we want.
Final remarks
I am hoping you will find this very helpful information and will join
the effort to make packaging for Python work out of the box. Adding
support for toga
was a bit more complex, but with the new tool, once
identified to be that kind of backend issue, it might have become a lot
more easy.
Lessons learned. We should cover packages that we routinely remove from
compilation, like setuptools, but e.g. also IPython. This will have to
added, such that setuptools_scm
cannot cloud the vision to actual
issues.