Nuitka Release 0.3.4

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler. Please see the page "What is Nuitka?" for an overview.

This new release of Nuitka has a focus on re-organizing the Nuitka generated source code and a modest improvement on the performance side.

For a long time now, Nuitka has generated a single C++ file and asked the C++ compiler to translate it to an executable or shared library for CPython to load. This was done even when embedding many modules into one (the "deep" compilation mode, option --deep).

This was simple to do and in theory ought to allow the compiler to do the most optimization. But for large programs, the resulting source code could have exponential compile time behavior in the C++ compiler. At least for the GNU g++ this was the case, others probably as well. In the end this is of course a scalability issue of Nuitka, which has now been addressed.

So the major advancement of this release is to make the --deep option useful. But there have also been performance improvements, which end up giving us another boost for the "PyStone" benchmark.

Bug fixes

  • Imports of modules local to packages now work correctly, closing the small compatibility gap that was there.
  • Modules with a "-" in their name are allowed in CPython through dynamic imports. This led to wrong C++ code being created. (Thanks to Li Xuan Ji for reporting and submitting a patch to fix it.)
  • There were warnings about the wrong format being used for the Ssize_t type of CPython. (Again, thanks to Li Xuan Ji for reporting and submitting the patch to fix it.)
  • When a wrong exception type is raised, the traceback should still be the one of the original exception.
  • Set and dict contractions (Python 2.7 features) declared local variables for global variables used. This went unnoticed, because list contractions don't generate code for local variables at all, as they cannot have such.
  • Using the type() built-in to create a new class could attribute it to the wrong module. This is now corrected.
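
To illustrate the dash-in-name case from the list above, here is a minimal sketch in plain Python. The module name "my-module" and the helper function are hypothetical and only serve the illustration; nothing here is Nuitka-specific.

```python
import os
import sys
import tempfile


def import_dashed_module():
    """Create a throwaway module named "my-module" and import it dynamically."""
    with tempfile.TemporaryDirectory() as directory:
        # "my-module" is a hypothetical name used only for this illustration.
        with open(os.path.join(directory, "my-module.py"), "w") as handle:
            handle.write("VALUE = 42\n")

        sys.path.insert(0, directory)
        try:
            # "import my-module" would be a SyntaxError, but the name is
            # accepted when it is passed as a string.
            return __import__("my-module")
        finally:
            sys.path.remove(directory)
            sys.modules.pop("my-module", None)


module = import_dashed_module()
print(module.VALUE)
```

A compiler that derives C++ identifiers from module names has to escape such characters, which is presumably where the wrong code came from.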

New Features

  • Uses Scons to execute the actual C++ build, giving some immediate improvements.
  • Now caches build results and Scons will only rebuild as needed.
  • The direct use of __import__() with a constant module name as parameter is also followed in "deep" mode. With time, non-constants may still become predictable, right now it must be a real CPython constant string.

New Optimization

  • Added optimization for the built-ins ord() and chr(). These normally require a module and built-in module lookup, then parameter parsing. Now these are really quick with Nuitka.
  • Added optimization for the type() built-in with one parameter. As above, using it from the builtin module can be very slow. Now it is instantaneous.
  • Added optimization for the type() built-in with three parameters. It's rarely used, but providing our own variant allowed us to fix the bug mentioned above.
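
For reference, the built-ins named above behave as follows in plain Python. This is only a sketch of the semantics that the optimized variants have to preserve; the class name "Point" is made up for the example.

```python
# ord() and chr() convert between a character and its code point.
code_point = ord("a")
character = chr(97)

# One-parameter form of type(): ask for the type of an object.
type_of_number = type(3)

# Three-parameter form of type(): create a new class dynamically from a
# name, a tuple of base classes and a dictionary of attributes.
Point = type("Point", (object,), {"x": 1, "y": 2})

print(code_point, character, type_of_number)
# The module a dynamically created class is attributed to is what the bug
# fix mentioned above is about.
print(Point.__name__, Point.__module__)
```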

Cleanups

  • Using Scons is a big cleanup of the way C++ compiler related options are applied. It also makes it easier to re-build without Nuitka, e.g. if you were using Nuitka in your packages, you can easily build in the same way as Nuitka does.
  • Static helpers source code has been moved to ".hpp" and ".cpp" files, instead of being in ".py" files. This makes C++ compiler messages more readable and allows us to use C++ mode in Emacs etc., making it easier to write things.
  • Generated code for each module ends up in a separate file per module or package.
  • Constants etc. go to their own file (although not named sensibly yet, likely going to change too).
  • Module variables are now created by the CPythonModule node only and are unique. This makes optimization of these feasible and is a pre-step to module variable optimization.

New Tests

  • Added "ExtremeClosure" from my Python quiz, it was not covered by existing tests.
  • Added test case for program that imports a module with a dash in its name.
  • Added test case for main program that starts with a dash.
  • Extended the built-in tests to cover type() as well.

Organizational

  • There is now a new environment variable NUITKA_SCONS which should point to the directory with the SingleExe.scons file for Nuitka. The scons file could be named better, because it is actually one and the same file that builds extension modules and executables.

  • There is now a new environment variable NUITKA_CPP which should point to the directory with the C++ helper code of Nuitka.

  • The script "create-environment.sh" can now be sourced (if you are in the top level directory of Nuitka) or be used with eval. In either case it also reports what it does.

    Update

    The script has become obsolete now, as the environment variables are no longer necessary.

  • To clean up the many "Program.build" directories, there is now a "clean-up.sh" script for your use. It can be handy, but if you use git, you may prefer its clean command.

    Update

    The script has become obsolete now, as Nuitka test executions now by default delete the build results.

Numbers

python 2.6:

Pystone(1.1) time for 50000 passes = 0.65
This machine benchmarks at 76923.1 pystones/second

Nuitka 0.3.4:

Pystone(1.1) time for 50000 passes = 0.34
This machine benchmarks at 147059 pystones/second

This is a 91% speedup for 0.3.4, up from 80% before.
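
The percentages quoted in these release notes can be reproduced from the pystones/second figures. The sketch below uses the numbers reported for each release; the helper name is made up for the illustration. Judging from the arithmetic, the quoted figures appear to be truncated rather than rounded (80.6 is reported as 80, for example).

```python
# CPython 2.6 baseline, in pystones/second, as reported in the "Numbers" sections.
BASELINE = 76923.1

# Nuitka results, in pystones/second, as reported per release.
RESULTS = [
    ("0.3.0", 96153.8),
    ("0.3.1", 121951.0),
    ("0.3.2", 128205.0),
    ("0.3.3", 138889.0),
    ("0.3.4", 147059.0),
]


def speedup_percent(baseline, compiled):
    """Return how much faster 'compiled' is than 'baseline', in percent."""
    return (compiled / baseline - 1.0) * 100.0


for version, pystones in RESULTS:
    print("Nuitka %s: %.1f%% faster" % (version, speedup_percent(BASELINE, pystones)))
```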

Nuitka Pre-Release 0.3.4pre1

This pre-release of Nuitka has a focus on re-organizing the Nuitka generated source code. Please see the page "What is Nuitka?" for clarification of what it is now and what it wants to be.

For a long time, Nuitka has generated a single C++ file, even when embedding many modules into one. And it has always shown that the GNU g++ compiler clearly has exponential compile time behavior when translating these into the executable.

This is no longer the case. So this pre-release is mainly about making the "--deep" feature useful. Before the release, I may look into optimizations for speed again. Right now time is very short due to day job reasons, so this pre-release is also about allowing people to use the improvements that I have made and get some feedback about it.

Bug fixes

  • None at all. Although I am sure that there may be regressions on the options side. The tests of CPython 2.7 all pass still, but you may find some breakage.

Cleanups

  • Static helpers source code has been moved to ".hpp" and ".cpp" files, instead of being in ".py" files.
  • Generated code for each module is now a separate file.
  • Constants etc. go to their own file (although not named sensibly yet).

New Features

  • Uses Scons to make the build.

New Tests

  • I have added ExtremeClosure from the Python quiz. I feel it was not covered by existing tests yet.

Organizational

  • There is now a new environment variable "NUITKA_SCONS" which should point to the directory with the Scons file for Nuitka.
  • The create-environment.sh can now be sourced (if you are in the top level directory of Nuitka) or be used with eval. In either case it also reports what it does.

Numbers

None at this time. It likely didn't change much at all. And I am not yet using the link time optimization feature of the g++ compiler, so if anything, it could be worse than before.

This release will be inside the "git" repository only. Check out the latest version here to get it.

Yours, Kay Hayen

Nuitka needs you - a call for help

Hello everybody,

Update

Python3 support was added, and has reached 3.3 in the meantime. The doctests are extracted by a script indeed. But exception stack correctness is an ongoing struggle.

my Python compiler Nuitka has come a long way, and currently I have little to no time to spend on it, due to day job reasons, so it's going to mostly stagnate for about 2 weeks from my side. But that's coming to an end, and still I would like to expand what we currently have, with your help.

Note: You can check the page What is Nuitka? for clarification of what it is now and what it wants to be.

As you will see, covering all the CPython 2.6 and 2.7 language features is already something. Other projects are far, far away from that. But going ahead, I want to secure that base. And this is where there are several domains where you can help:

  • Python 3.1 or higher

    I did some early testing. The C/API changed in many ways, and my current working state has a couple of fixes for it. I would like somebody else to devote some time to fixing this up. Please contact me if you can help here, esp. if you are competent in the C/API changes of Python 3.1. Even if the CPython 3.1 doesn't matter as much to me, I believe the extended coverage from the new tests in its test suite would be useful. The improved state is not yet released. I would make a release to the person(s) that want to work on it.

  • Doctests support

    I have started to extract the doctests from the CPython 2.6 test suite. There is a script that does it, and you basically only need to expand it with more of the same. No big issue there, but it could find issues with Nuitka that we would like to know about. Of course, it should also be expanded to the CPython 2.7 test suite and ultimately also CPython 3.1.

  • Exception correctness

    I noted some issues with the stacks when developing with the CPython 2.7 tests, or now failing 2.6 tests, after some merge work. But what would be needed would be tests to cover all the situations, where exceptions could be raised, and stack traces should ideally be identical for all. This is mostly only accuracy work and the CPython test suite is bad at covering it.

Help in all these areas would be significant, and does not necessarily or at all require any Nuitka inside knowledge. You should also subscribe to the mailing list if you consider helping, so we can discuss things in the open.

If you choose to help me, before going even further into optimization, in all likelihood it's only going to make things more solid. The more tests we have, the fewer wrong paths we can take. This is why I am asking for things, which all point into that direction.

Thanks in advance, Kay Hayen

Nuitka Release 0.3.3

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler. Please see the page "What is Nuitka?" for an overview.

This release of Nuitka continues the focus on performance. It also cleans up a few open topics. One is "doctests", these are now extracted from the CPython 2.6 test suite more completely. The other is that the CPython 2.7 test suite is now passed completely. There is some more work ahead though, to extract all of the "doctests" and to do that for both versions of the tests.

This means an even higher level of compatibility has been achieved, along with performance improvements and an ever cleaner structure.

Bug fixes

Generators

  • Generator functions tracked references to the common and the instance context independently. Now the common context is not released before the instance contexts are.
  • Generator functions didn't check the arguments to throw() the way CPython does. Now they do.
  • Generator functions didn't trace exceptions to "stderr" if they occurred while closing unfinished ones in "del".
  • Generator functions used slightly different wordings for some error messages.

Function Calls

  • Extended call syntax with ** allows the use of a mapping, and it is now checked if it really is a mapping and if its contents have string keys.
  • Similarly, extended call syntax with * allows a sequence, and it is now checked if it really is a sequence.
  • Error messages for duplicate keyword arguments or too few arguments now describe the duplicate parameter and the callable the same way CPython does.
  • The keyword argument list is now checked first, before considering the parameter counts. This is slower in the error case, but more compatible with CPython.
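
The checks described above mirror what CPython itself does for extended call syntax. A minimal sketch, written for a current Python 3 interpreter; the "Settings" class and the "connect" function are hypothetical names for the illustration.

```python
from collections.abc import Mapping


class Settings(Mapping):
    """A minimal mapping that is not a dict (hypothetical, for illustration)."""

    def __init__(self, **values):
        self._values = values

    def __getitem__(self, key):
        return self._values[key]

    def __iter__(self):
        return iter(self._values)

    def __len__(self):
        return len(self._values)


def connect(host, port=80):
    return "%s:%d" % (host, port)


def raises_type_error(call):
    """Return True if calling 'call' raises TypeError."""
    try:
        call()
    except TypeError:
        return True
    return False


# Any mapping with string keys is accepted for "**", not only a dict.
address = connect(**Settings(host="example.org", port=8080))

# Non-string keys, a non-iterable for "*", and duplicate arguments are all
# rejected with a TypeError.
bad_keys = raises_type_error(lambda: connect(**{1: "example.org"}))
not_a_sequence = raises_type_error(lambda: connect(*5))
duplicate = raises_type_error(lambda: connect("example.org", host="other"))
```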

Classes

  • The "locals()" built-in when used in the class scope (not in a method) now is correctly writable and writes to it change the resulting class.
  • Name mangling for private identifiers was not always done entirely correctly.
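
Both class-related behaviours can be observed in CPython directly. A short sketch, with a made-up class name:

```python
class Example(object):
    # In the class body, locals() is the namespace the class is built from,
    # so writing to it adds an attribute to the resulting class.
    locals()["injected"] = 42

    def __init__(self):
        # Private identifiers are mangled: this is stored on the instance
        # as "_Example__secret".
        self.__secret = "hidden"


instance = Example()
print(Example.injected)
print(sorted(vars(instance)))
```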

Others

  • Exceptions didn't always have the correct stack reported.
  • The pickling of some tuples showed that "cPickle" can have non-reproducible results, so "pickle" is now used to stream constants.

New Optimization

  • Access to instance attributes has become faster by writing specific code for the case. This is done in a JIT-like way, attempting at run time to optimize attribute access for instances.
  • Assignments now often consider what's cheaper for the other side, instead of taking a reference to a global variable, just to have to release it.
  • The function call code built argument tuples and dictionaries as constants. Now that is true for every tuple usage.

Cleanups

  • The static helper classes, and the prelude code needed have been moved to separate C++ files and are now accessed via "#include". This puts the code inside C++ files as opposed to a Python string and therefore makes it easier to read and change.

New Features

  • The generator functions and generator expressions have the attribute "gi_running" now. It indicates if they are currently running.
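
What "gi_running" reports can be seen with a generator that inspects itself. This is a sketch of the CPython behaviour only; the "holder" list is just a way to hand the generator a reference to itself.

```python
def observe(holder):
    # While the body executes, the generator reports itself as running.
    yield holder[0].gi_running


holder = []
generator = observe(holder)
holder.append(generator)

before = generator.gi_running  # not started yet
during = next(generator)       # value observed from inside the body
after = generator.gi_running   # suspended at the yield

print(bool(before), bool(during), bool(after))
```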

New Tests

  • The script to extract the "doctests" from the CPython test suite has been rewritten entirely and works with more doctests now. Running these tests increased the test coverage a lot.
  • The Python 2.7 test suite has been added.

Organizational

  • One can now run multiple "compare_with_cpython" instances in parallel, which enables background test runs.
  • There is now a new environment variable "NUITKA_INCLUDE" which needs to point to the directory Nuitka's C++ includes live in. Of course the "create-environment.sh" script generates that for you easily.

Numbers

python 2.6:

Pystone(1.1) time for 50000 passes = 0.65
This machine benchmarks at 76923.1 pystones/second

Nuitka 0.3.3:

Pystone(1.1) time for 50000 passes = 0.36
This machine benchmarks at 138889 pystones/second

This is an 80% speedup for 0.3.3, up from 66% before.

Nuitka Release 0.3.2

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler. Please see the page "What is Nuitka?" for an overview.

This release of Nuitka continues the focus on performance. But this release also revisits the topic of feature parity. Before, feature parity had been reached "only" with Python 2.6. This is of course a big thing, but you know there is always more, e.g. Python 2.7.

With the addition of set contractions and dict contractions in this very release, Nuitka is approaching Python support for 2.7, and then there are some bug fixes.

Bug fixes

  • Calling a function with ** and using a non-dict for it was leading to wrong behavior. Now a mapping is good enough as input for the ** parameter and it's checked.
  • Deeply nested packages "package.subpackage.module" were not found and gave a warning from Nuitka, with the consequence that they were not embedded in the executable. They now are.
  • Some error messages for wrong parameters didn't match literally. For example "function got multiple..." as opposed to "function() got multiple..." and alike.
  • Files that ended in a line with a "#" but without a new line gave an error from "ast.parse". As a workaround, a new line is added to the end of the file if it's "missing".
  • More correct exception locations for complex code lines. I noted that the current line indication should not only be restored when the call at hand failed, but in any case. Otherwise sometimes the exception stack would not be correct. It now is - more often. Right now, this has no systematic test.
  • Re-raised exceptions didn't appear on the stack if caught inside the same function. These are now correct.
  • For exec the globals argument needs to have "__builtins__" added, but the check was performed with the mapping interface. That is not how CPython does it, and so e.g. the mapping could use a default value for "__builtins__" which could lead to incorrect behavior. Clearly a corner case, but one that works in a fully compatible way now.
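
The "__builtins__" behaviour of exec mentioned in the last item is easy to observe in CPython. A minimal sketch, using the function-call form of exec so that it also runs on Python 3:

```python
# Start with an empty globals dictionary.
namespace = {}

# CPython inserts "__builtins__" into the globals before running the code,
# which is why len() can be resolved here.
exec("length = len('abc')", namespace)

print(sorted(namespace))
```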

New Optimization

  • The local and shared local variable C++ classes have a flag "free_value" to indicate if a "Py_DECREF" needs to be done when releasing the object. But the code still used "Py_XDECREF" (which allows "NULL" values to be ignored) when releasing the object. Now the inconsistency of using "NULL" as the "object" value with "free_value" set to true has been removed.
  • Tuple constants were needlessly copied before use. They are immutable anyway.

Cleanups

  • Improved more of the indentation of the generated C++ which was not very good for contractions so far. Now it is. Also assignments should be better now.
  • The generation of code for contractions was made more general and templates split into multiple parts. This enabled reuse of the code for list contractions in dictionary and set contractions.
  • The with statement has its own template now and got cleaned up regarding indentation.

New Tests

  • There is now a script to extract the "doctests" from the CPython test suite and it generates Python source code from them. This can be compiled with Nuitka and output compared to CPython. Without this, the doctest parts of the CPython test suite are mostly useless. Solving this improved test coverage, leading to many small fixes. I will dedicate a later posting to the tool, maybe it is useful in other contexts as well.
  • Reference count tests have been expanded to cover assignment to multiple assignment targets, and to attributes.
  • The deep program test case now also has a module in a sub-package to cover this case as well.
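
The idea of turning doctests into plain Python source can be sketched with the standard library "doctest" module. This is not the script mentioned above, whose details are not shown here; the sample text and function names are assumptions for the illustration.

```python
import doctest

# A made-up docstring with two embedded examples.
TEXT = '''
Some documentation with embedded examples.

>>> x = 1 + 1
>>> print(x)
2
'''


def extract_examples(text):
    """Return (source, expected output) pairs for each doctest example."""
    parser = doctest.DocTestParser()
    return [(example.source, example.want) for example in parser.get_examples(text)]


def to_script(text):
    """Join the example sources into plain Python source code."""
    return "".join(source for source, _ in extract_examples(text))


script = to_script(TEXT)
print(script)
```

The resulting source can then be run under both CPython and the compiled program, and the two outputs compared.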

Organizational

  • The gitweb interface might be considered an alternative to downloading the source if you want to provide a pointer, or want to take a quick glance at the source code. You can already download with git, follow the link below to the page explaining it.

  • The "README.txt" has documented more of the differences and I consequently updated the Differences page. There is now a distinction between generally missing functionality and things that don't work in --deep mode, where Nuitka is supposed to create one executable.

    I will make it a priority to remove the (minor) issues of --deep mode in the next release, as this is only relatively little work, and not a good difference to have. We want these to be empty, right? But for the time being, I document the known differences there.

Numbers

python 2.6:

Pystone(1.1) time for 50000 passes = 0.65
This machine benchmarks at 76923.1 pystones/second

Nuitka 0.3.2:

Pystone(1.1) time for 50000 passes = 0.39
This machine benchmarks at 128205 pystones/second

This is a 66% speedup for 0.3.2, slightly up from the 58% of 0.3.1 before. The optimizations done were somewhat fruitful, but as you can see, they were also more cleanups, not the big things.

Tough Week

This week was clearly a tough week for my family.

My 6-year-old son was hurt when he fell off some stone steps during day care after school. He broke his arm twice and it was really bad. Things didn't go optimally with that, and in the clinic, during the surgery, one of the risks materialized, and so he got content from the stomach into his lungs, which is really serious. So he ended up in intensive care in the hospital.

This meant he could not breathe without assistance anymore, but fortunately that went over quite well, so he was only one night in intensive care and then was sent to the normal kids section of the hospital.

My wife had 3 days of exams all the while, which would have been very problematic to interrupt. So I drove her to the exams very early in the morning, then went to the hospital to play with my son, and later picked her up, so she could see him too. My other son, 3 years, was meanwhile in Kindergarten from where I picked him up to allow the kids to play together.

They play together each day, and they missed one another very badly. The little one was very concerned with what the big one was doing.

You may like to see this picture from a happier day.

On top of that, we had the final acceptance of the new house yesterday, just barely after the son was released from hospital. The new house will also consume a lot of time for a while. I am preparing the move in about 2 weeks.

For Nuitka this meant of course that no time at all was available for it and that there will not be much in the near future. I missed a week in the day job and will need to catch up. Family and day job are priorities of course.

Luckily small things already make an improvement and then I had some things done already, so there still will be some progress. And of course, I do need some recreation in any case, and Nuitka is a lot of fun to me. I expect to make a Nuitka release 0.3.2 this weekend, and it will again be a good step ahead.

Kay Hayen

Nuitka Release 0.3.1

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler. Please see the page "What is Nuitka?" for an overview.

This release of Nuitka continues the focus on performance and contains only cleanups and optimization. Most go into the direction of more readable code, some aim at making the basic things faster, with good results as to performance as you can see below.

New Optimization

  • Constants in conditions of conditional expressions (a if cond else d), if/elif or while are now evaluated to true or false directly. Before there would be a temporary Python object created from it which was then checked if it had a truth value.

    All of that is obviously overhead only. And it hurts the typical while 1: infinite loop case badly.

  • Do not generate code to catch BreakException or ContinueException unless a break or continue statement in a try: finally: block inside that loop actually requires this.

    Even though uncaught exceptions are cheap, it is still an improvement worthwhile and it clearly improves the readability for the normal case.

  • The compiler more aggressively prepares tuples, lists and dicts from the source code as constants if their contents are "immutable" instead of building them at run time. An example of a "mutable" tuple would be ({},) which is not safe to share, and therefore will still be built at run time.

    For dictionaries and lists, copies will be made, under the assumption that copying a dictionary will always be faster than making it from scratch.

  • The parameter parsing code was dynamically building the tuple of argument names to check if an argument name was allowed by checking the equivalent of name in argument_names. This was of course wasteful and now a pre-built constant is used for this, so it should be much faster to call functions with keyword arguments.

  • There are new template files and also actual templates now for the while and for loop code generation. And I started work on having a template for assignments.
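
The distinction between a tuple that is safe to share and one like ({},) can be sketched as a recursive check. This is an illustration of the idea only, not Nuitka's actual code; the function name and the set of accepted types are assumptions.

```python
# Types whose values can never change once created (assumed list).
IMMUTABLE_SCALARS = (type(None), bool, int, float, complex, str, bytes)


def is_shareable_constant(value):
    """Return True if 'value' can be shared without copying it first."""
    if isinstance(value, IMMUTABLE_SCALARS):
        return True
    if isinstance(value, (tuple, frozenset)):
        # A tuple is only as immutable as the elements it contains.
        return all(is_shareable_constant(element) for element in value)
    # Lists, dicts, sets and anything unknown are treated as mutable.
    return False


print(is_shareable_constant((1, "a", (2.0, None))))
print(is_shareable_constant(({},)))
```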

Cleanups

  • Do not generate code for the else of while and for loops if there is no such branch. This uncluttered the generated code somewhat.

  • The indentation of the generated C++ was not very good and whitespace was often trailing, or e.g. a real tab was used instead of "\t". Some things didn't play well together here.

    Now much of the generated C++ code is much more readable and white space cleaner. For optimization to be done, the humans need to be able to read the generated code too. Mind you, the aim is not to produce usable C++, but on the other hand, it must be possible to understand it.

  • To the same end of readability, the empty else {} branches are avoided for if, while and for loops. While the C++ compiler can be expected to remove these, they seriously cluttered up things.

  • The constant management code in Context was largely simplified. Now the code is using the Constant class to find its way around the problem that dicts, sets, etc. are not hashable, or that complex is not orderable; this was necessary to allow deeply nested constants, but it is also simpler code now.

  • The C++ code generated for functions now has two entry points, one for Python calls (arguments as a list and dictionary for parsing) and one where this has happened successfully. In the future this should allow for faster function calls avoiding the building of argument tuples and dictionaries altogether.

  • For every function there was a "traceback adder" which was only used in the C++ exception handling before exit to CPython to add to the traceback object. This is now in-lined, as it won't be shared ever.

Numbers

python 2.6:

Pystone(1.1) time for 50000 passes = 0.65
This machine benchmarks at 76923.1 pystones/second

Nuitka 0.3.1:

Pystone(1.1) time for 50000 passes = 0.41
This machine benchmarks at 121951 pystones/second

This is a 58% speedup for 0.3.1, up from the 25% before. So it's getting somewhere. As always you will find its latest version here.

Nuitka Release 0.3.0

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler. Please see the page "What is Nuitka?" for an overview.

This release 0.3.0 is the first release to focus on performance. In the 0.2.x series Nuitka achieved feature parity with CPython 2.6 and that was very important, but now it is time to make it really useful.

Optimization has been one of the main points, although I was also a bit forward looking to Python 2.7 language constructs. This release is the first where I really started to measure things and removed the most important bottlenecks.

New Features

  • Added the option --debug to control debug information. With this option the C++ debug information is present in the file, otherwise it is not. This will give much smaller ".so" and ".exe" files than before.
  • Added option --no-optimization to disable all optimization. It enables C++ asserts and compiles with less aggressive C++ compiler optimization, so it can be used for debugging purposes.
  • Support for Python 2.7 set literals has been added.

Performance Enhancements

  • Fast global variables: Reads of global variables were fast already. This was due to a trick that is now also used to check them and to do a much quicker update if they are already set.

  • Fast break/continue statements: To make sure these statements execute the finally handlers if inside a try, these used C++ exceptions that were caught by try/finally in while or for loops.

    This was very slow. Now it is checked if this is at all necessary and then it's only done for the rare case where a break/continue really is inside the tried block. Otherwise it is now translated to a C++ break/continue which the C++ compiler handles more efficiently.

  • Added unlikely() compiler hints to all error handling cases to allow the C++ compiler to generate more efficient branch code.

  • The for loop code was using an exception handler to make sure the iterated value was released. It now uses PyObjectTemporary for that instead, which should lead to better generated code.

  • Constant dictionaries are now used and copied from, instead of building them at run time even when the contents were constant.
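
The case that makes break and continue expensive is the one where they sit inside a try with a finally handler, because the handler must still run. A small sketch of the semantics that have to be preserved; the function name is made up.

```python
def loop_with_break():
    events = []
    for number in range(3):
        try:
            if number == 1:
                # Leaving the loop from inside "try" still runs "finally".
                break
            events.append("body %d" % number)
        finally:
            events.append("finally %d" % number)
    return events


print(loop_with_break())
```

A break outside of any try block has no such obligation, so a plain C++ break is enough for it.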

New Tests

  • Merged some bits from the CPython 2.7 test suite that do not harm 2.6, but generally it's a lot due to some unittest module interface changes.
  • Added CPython 2.7 tests test_dictcomps.py and test_dictviews.py which both pass when using Python 2.7.
  • Added another benchmark extract from "PyStone" which uses a while loop with break.

Numbers

python 2.6:

Pystone(1.1) time for 50000 passes = 0.65
This machine benchmarks at 76923.1 pystones/second

Nuitka 0.3.0:

Pystone(1.1) time for 50000 passes = 0.52
This machine benchmarks at 96153.8 pystones/second

That's a 25% speedup now and a good start clearly. It's not yet in the range of where I want it to be, but there is always room for more. And the break/continue exception was an important performance regression fix.

Release Nuitka 0.2.4

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler. Please see the page "What is Nuitka?" for an overview.

This release 0.2.4 is likely the last 0.2.x release, as it's the one that achieved feature parity with CPython 2.6, which was the whole point of the release series, so time to celebrate. I have stayed away (mostly) from any optimization, so as to not be premature.

From now on speed optimization is going to be the focus though. Because right now, frankly, there is not much of a point to use Nuitka yet, with only a minor run time speed gain in trade for a long compile time. But hopefully we can change that quickly now.

New Features

  • The use of exec in a local function now adds local variables to the scope it is in.
  • The same applies to from module_name import * which is now compiled correctly and adds variables to the local variables.

Bug Fixes

  • Raises UnboundLocalError when deleting a local variable with del twice.
  • Raises NameError when deleting a global variable with del twice.
  • Reads of uninitialized closure variables gave NameError, but UnboundLocalError is correct and raised now.
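
The two double-deletion cases can be reproduced in CPython as follows. This is a minimal sketch with made-up function and variable names; it catches NameError in both cases, because UnboundLocalError is a subclass of it.

```python
def delete_local_twice():
    value = 1
    del value
    try:
        del value  # the local name is already unbound
    except NameError as error:
        return type(error).__name__


def delete_global_twice():
    global scratch
    scratch = 1
    del scratch
    try:
        del scratch  # the global name no longer exists
    except NameError as error:
        return type(error).__name__


print(delete_local_twice())
print(delete_global_twice())
```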

Cleanups

  • There is now a dedicated pass over the node tree right before code generation starts, so that some analysis can be done as late as that. Currently this is used for determining which functions should have a dictionary of locals.
  • Checking the exported symbols list, fixed all the cases where a static was missing. This reduces the "module.so" sizes.
  • With gcc the "visibility=hidden" is used to avoid exporting the helper classes. Also reduces the "module.so" sizes, because classes cannot be made static otherwise.

New Tests

  • Added "DoubleDeletions" to cover behaviour of del. It seems that this is not part of the CPython test suite.
  • The "OverflowFunctions" (those with dynamic local variables) now has an interesting test, exec on a local scope, effectively adding a local variable while a closure variable is still accessible, and a module variable too. This is also not in the CPython test suite.
  • Restored the parts of the CPython test suite that did local star imports or exec to provide new variables. Previously these had been removed.
  • Also "test_with.py" which covers PEP 343 has been reactivated, the with statement works as expected.

Python Exec in nested Functions Quiz

Quiz Question

Say you have the following module code:

a_global = 7

def deepExec():
    for_closure = 3

    def execFunction():
        code = "f=2"

        # Can fool it to nest
        exec code

        print "Locals now", locals()

        print "Closure was taken", for_closure
        print "Globals still work", a_global
        print "Added local from code", f

    execFunction()

deepExec()

Can you overcome the SyntaxError this gives in CPython? Normally exec like this is not allowed for nested functions. Well, think about it, the solution is in the next paragraph.

Solution

The correct answer is that you need to add "in None, None" to the exec and you are fine. The exec is now allowed and behaves as expected. You can see it in the locals, "f" was indeed added to it, the closure value is correct, and the global still works.

It seems the "SyntaxError" tries to avoid such code, but on the other hand, exec is not forbidden when it has parameters, and those imply defaults when they are None.

Now, I had this strange realization when implementing the "exec" behaviour for my Python compiler Nuitka which in its next version (due later this week) will be able to handle this type of code as well. :-)

Kay Hayen