Nuitka Release 0.5.24

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler. Please see the page "What is Nuitka?" for an overview.

This release is again focusing on optimization, this time very heavily on the generator performance, which was found to be much slower than CPython for some cases. Also there is the usual compatibility work and improvements for Pure C support.

Bug Fixes

  • Windows: The 3.5.2 coroutine new protocol implementation was using the wrapper from CPython, but it's not part of the ABI on Windows. Have our own instead. Fixed in 0.5.23.1 already.
  • Windows: Fixed second compilation with MSVC failing. The files renamed to be C++ files already existed, crashing the compilation. Fixed in 0.5.23.1 already.
  • Mac OS: Fixed creating extension modules with .so suffix. This is now properly determined by looking at the importer details, leading to correct suffix on all platforms. Fixed in 0.5.23.1 already.
  • Debian: Don't depend on a C++ compiler primarily anymore, the C compiler from GNU or clang will do too. Fixed in 0.5.23.1 already.
  • Pure C: Adapted scons compiler detection to properly consider C11 compilers from the environment, and to report problems more gracefully.

Optimization

  • Python2: Generators were saving and restoring exceptions, updating the variables sys.exc_type for every context switch, making it really slow, as these are 3 dictionary updates, normally not needed. Now it's only doing it if it means a change.
  • Sped up creating generators and coroutines by attaching the closure variable storage directly to the object, using one variable size allocation instead of two, one of which was a standard malloc. This makes creating them easier and avoids maintaining the closure pointer entirely.
  • Using dedicated compiled cell implementation similar to PyCellObject but fully under our control. This allowed for smaller code generated, while still giving a slight performance improvement.
  • Added free list implementation to cache generator, coroutine, and function objects, avoiding the need to create and delete these kinds of objects in a loop.
  • Added support for the built-in sum, with optimizations that make it much faster when iterating over lists and tuples, as well as a fast long sum for Python2, and much faster bool sums too. This is using a prototype version of a "qiter" concept.
  • Provide type shape for xrange calls that are not constant too, allowing for better optimization related to those.
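
The free list item above can be illustrated with a minimal sketch. Nuitka implements this in C for its own object types; the FreeList class, its max_size parameter, and the use of dict as a stand-in object are assumptions made purely for illustration.

```python
class FreeList:
    """Keep released objects around for reuse instead of discarding them."""

    def __init__(self, factory, max_size=100):
        self._factory = factory      # creates a fresh object when nothing is cached
        self._free = []              # released objects waiting for reuse
        self._max_size = max_size    # cap so the cache cannot grow without bound
        self.allocations = 0         # how often a new object had to be created

    def acquire(self):
        # Reuse a cached object if there is one, otherwise allocate.
        if self._free:
            return self._free.pop()
        self.allocations += 1
        return self._factory()

    def release(self, obj):
        # Only keep the object if the cache is not full yet.
        if len(self._free) < self._max_size:
            self._free.append(obj)


pool = FreeList(factory=dict)
for _ in range(1000):
    frame = pool.acquire()
    frame.clear()          # reset state before reuse
    pool.release(frame)

print(pool.allocations)
```

Creating and releasing an object in a loop, as generator heavy code does, then touches the allocator only for the first iteration.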

Tests

  • Added workarounds for locks being held by Virus Scanners on Windows to our test runner.
  • Enhanced constructs that test generator expressions to more clearly show the actual construct cost.
  • Added construct tests for the sum built-in on various types of int containers, making sure we can do all of those really fast.

Summary

This release improves very heavily on generators in Nuitka. The memory allocator is used more cleverly, and free lists all around save a lot of interactions with it. More work lies ahead in this field, as these are not yet as fast as they should be. However, at least Nuitka should be faster than CPython for these kinds of uses now.

Also, proper pure C support in Scons is relatively important to cover more of the rarer use cases, where the C compiler is too old.

The most important part is actually how sum optimization is staging a new kind of approach for code generation. This could become the standard code for iterators in loops eventually, making for loops even faster. This will be for future releases to expand.

Nuitka Release 0.5.23

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler. Please see the page "What is Nuitka?" for an overview.

This release is focusing on optimization, the most significant part for the users being enhanced scalability due to memory usage, but also breakthrough structural improvements for static analysis of iterators and the debut of type shapes and value shapes, giving way to "shape tracing".

Bug Fixes

  • Fix support for Python 3.5.2 coroutine changes. The checks got added for improved mode for older 3.5.x; the new protocol is only supported when run with that version or higher.

  • Fix, was falsely optimizing away unused iterations for non-iterable compile time constants.

    iter(1) # needs to raise.
    
  • Python3: Fix, eval must not attempt to strip memoryviews. This was preventing it from working with that type.

  • Fix, calling type without any arguments was crashing the compiler. Also the exception raised for anything but 1 or 3 arguments was claiming that only 3 arguments were allowed, which is not the compatible thing.

  • Python3.5: Fix, follow enhanced error checking for complex call handling of star arguments.

  • Compatibility: The from x import x, y re-formulation was doing two __import__ calls instead of re-using the module value.
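
The last item can be checked against CPython with a small sketch that counts how often __import__ is invoked for one import statement. The helper count_imports is hypothetical and only exists for this illustration; it temporarily replaces builtins.__import__ and restores it afterwards.

```python
import builtins


def count_imports(statement, module_name):
    """Run an import statement and count __import__ calls made for module_name."""
    original_import = builtins.__import__
    calls = []

    def counting_import(name, *args, **kwargs):
        if name == module_name:
            calls.append(name)
        return original_import(name, *args, **kwargs)

    builtins.__import__ = counting_import
    try:
        exec(statement, {})
    finally:
        # Always restore the real import function.
        builtins.__import__ = original_import
    return len(calls)


print(count_imports("from os import path, sep", "os"))
```

CPython performs a single __import__ call here and takes both names from the returned module, which is the behaviour the re-formulation has to match.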

Optimization

  • Uses only about 66% of the memory compared to last release, which is a very important step for scalability independent of re-loading. This was achieved by making sure to break loop traces and their reference cycle when they become unused.

  • Properly detect the len of multiplications at compile time from newly introduced value shapes, so that this is e.g. statically optimized.

    print(len("*" * 10000000000))
    
  • Due to newly introduced type shapes, len and iter now properly detect more often if values will raise or not, and warn about detected raises.

    iter(len(something)) # Will always raise
    
  • Due to newly introduced "iterator tracing", we can now properly detect if the length of an unpacking matches its source or not. This allows to remove the check of the generic re-formulations of unpackings at compile time.

    a, b = b, a    # Will never raise due to unpacking
    a, b = b, a, c # Will always raise, 3 items cannot unpack to 2
    
  • Added support for optimization of the xrange built-in for Python2.

  • Python2: Added support for xrange iterable constant values, pre-building those constants ahead of time.

  • Python3: Added support for range iterable constant values, pre-building those constants ahead of time. This brings optimization support for Python3 ranges to what was available for Python2 already.

  • Avoid having a special node variant for range with no arguments, but create the exception raising node directly.

  • Specialized constant value nodes are using less generic implementations to query e.g. their length or iteration capabilities, which should speed up many checks on them.

  • Added support for the format built-in.

  • Python3: Added support for the ascii built-in.
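
The run time behaviour that these compile time predictions have to preserve can be observed in plain Python. The following is only a sketch; the helper names outcome, swap, and bad_unpack are made up for this illustration, and a much shorter string than in the item above is used to keep memory usage small.

```python
def outcome(func, *args):
    """Return the call result, or the exception type name if the call raises."""
    try:
        return func(*args)
    except Exception as exc:
        return type(exc).__name__


def swap(a, b):
    a, b = b, a        # two values into two targets, can never raise
    return a, b


def bad_unpack(a, b, c):
    a, b = b, a, c     # three values into two targets, always raises
    return a, b


print(outcome(iter, 1))              # iterating a non-iterable constant
print(outcome(len, "*" * 1000))      # length is known from the multiplication
print(outcome(swap, 1, 2))
print(outcome(bad_unpack, 1, 2, 3))
```

A compiler that knows these outcomes statically can drop the unpacking length check in swap and replace bad_unpack's assignment with a direct raise.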

Organizational

  • The movement to pure C got the final big push. All C++-only idioms were removed, and everything works with C11 compilers. A C++03 compiler can be used as a fallback, in case of MSVC or too old gcc for instance.
  • Using pure C, MinGW64 6x is now working properly. The latest version had problems with hypot related changes in the C++ standard library. Using C11 solves that.
  • This release also prepares Python 3.6 support, it includes full language support on the level of CPython 3.6.0b1.
  • The CPython 3.6 test suite was run with Python 3.5 to ensure bug level compatibility, and had a few findings of incompatibilities.

Cleanups

  • The last holdouts of classes in Nuitka were removed, and many C++ idioms are no longer used.
  • Moved range related helper functions to a dedicated include file.
  • Using str is not bytes to detect Python3 str handling or actual bytes type existence.
  • Trace collections were using a mix-in that was merged with the base class that every user of it was having.

Tests

  • Added more static optimization tests, a lot more has become feasible to decide at compile time, and is now done. These are to detect regressions in that domain.
  • The CPython 3.6 test suite is now also run with CPython 3.5 which found some incompatibilities.

Summary

This release marks a huge step forward. We are having the structure for type inference now. This will expand in coming releases to cover more cases, and there are many low hanging fruits for optimization. Specialized codes for variable versions of certain known shapes seem feasible now.

Then there is also the move towards pure C. This will make the backend compilation lighter, but due to using C11, we will not suffer any loss of convenience compared to "C-ish". The plan is to continue to use C++ for compilation for compilers not capable of supporting C11.

The amount of static analysis done in Nuitka is now going to quickly expand, with more and more constructs predicted to raise errors or simplified. This will be an ongoing activity, as many types of expressions need to be enhanced, and only one missing will not let it optimize as well.

Also, it seems about time to add dedicated code for specific types to be as fast as C code. This opens up vast possibilities for acceleration and will lead us to zero overhead C bindings eventually. But initially the drive is towards enhanced import analysis, to become able to know the precise module expected to be imported, and derive type information from this.

The coming work will start to attack whole program optimization, as well as enhanced local value shape analysis and specialized type code generation, which will let Nuitka improve speed.

Nuitka Release 0.5.22

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler. Please see the page "What is Nuitka?" for an overview.

This release is mostly an intermediate release on the way to the large goal of having per module compilation that is cachable and requires far less memory for large programs. This is currently in progress, but required many changes that are in this release, more will be needed.

It also contains a bunch of bug fixes and enhancements that are worth releasing, and the next changes are going to be more invasive.

Bug Fixes

  • Compatibility: Classes with decorated __new__ functions could miss out on the staticmethod decorator that is implicit. It's now applied always, unless of course it's already done manually. This corrects an issue found with Pandas. Fixed in 0.5.22.1 already.
  • Standalone: For at least Python 3.4 or higher, it could happen that the locale needed was not importable. Fixed in 0.5.22.1 already.
  • Compatibility: Do not falsely assume that not expressions cannot raise on boolean expressions, since those arguments might raise during creation. This could lead to wrong optimization. Fixed in 0.5.22.2 already.
  • Standalone: Do not include system specific C libraries in the distribution created. This would lead to problems for some configurations on Linux in case the glibc is no longer compatible with newer or older kernels. Fixed in 0.5.22.2 already.
  • The --recurse-directory option didn't check with decision mechanisms for module inclusion, making it impossible to avoid some things.
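
The first item, about the implicit staticmethod for __new__, can be reproduced with CPython in a few lines. This is a sketch only; the decorator counted and the class Point are hypothetical names chosen for the illustration.

```python
import functools


def counted(func):
    """Hypothetical decorator that counts calls and returns a plain function."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)

    wrapper.calls = 0
    return wrapper


class Point:
    @counted
    def __new__(cls, x):
        # No explicit @staticmethod here, the class creation adds it implicitly.
        self = object.__new__(cls)
        self.x = x
        return self


point = Point(3)
print(type(Point.__dict__["__new__"]).__name__, point.x, Point.__new__.calls)
```

Because the decorator still returns a function, the class creation wraps it in staticmethod, and a compiler has to do the same for decorated __new__ functions.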

Optimization

  • Introduced specialized constant classes for empty dictionaries and other special constants, e.g. "True" and "False", so that they can have more hard coded properties and save memory by sharing constant values.
  • The "technical" sharing of a variable is only considered for variables that had some sharing going in the first place, speeding things up quite a bit for that still critical check.
  • Memory savings coming from enhanced trace storage are already visible at about 1%. That is not as much as the reloading will mean, but still helpful to use less overall.

Cleanups

  • The global variable registry was removed. It was in the way of unloading and reloading modules easily. Instead variables are now attached to their owner and referenced by other users. When they are released, these variables are released.
  • Global variable traces were removed. Instead each variable has a list of the traces attached to it. For non-shared variables, this allows to sooner tell attributes of those variables, allowing for sooner optimization of them.
  • No longer trace all initial users of a variable, just merely if there were such and if it constitutes sharing syntactically too. Not only does this save memory, it avoids useless references of the variable to functions that stop using it due to optimization.
  • Create constant nodes via a factory function to avoid non-special instances where variants exist that would be faster to use.
  • Moved the C string functions to a proper nuitka.utils.CStrings package as we use it for better code names of functions and modules.
  • Made functions an explicit child node of modules, which makes their use more generic, esp. for re-loading modules.
  • Have a dedicated function for building frame nodes, making it easier to see where they are created.

Summary

This release is the result of a couple of months of work, and it means that proper re-loading of cached results is coming into sight. The reloading of modules still fails for some things, and more changes will be needed, but with that out of the way, Nuitka's footprint is about to drop, making it absolutely scalable. This is considered very important before starting to trace more information about values.

The next big thing ought to be the one thing that structurally holds Nuitka back from generating C level performance code with, say, integer operations.

Nuitka Release 0.5.21

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler. Please see the page "What is Nuitka?" for an overview.

This release focused on scalability work. Making Nuitka more usable in the common case, and covering more standalone use cases.

Bug Fixes

  • Windows: Support for newer MinGW64 was broken by a workaround for older MinGW64 versions.
  • Compatibility: Added support for the (unofficial) C-Python API Py_GetArgcArgv that was causing prctl module to fail loading on ARM platforms.
  • Compatibility: The proper error message template for complex call arguments is now detected at compile time. There are changes coming that are already in some pre-releases of CPython.
  • Standalone: Wasn't properly ignoring Tools and other directories in the standard library.

New Features

  • Windows: Detect the MinGW compiler arch and compare it to the Python arch. In case of a mismatch, the compiler is not used. Otherwise compilation or linking gives hard to understand errors. This also rules out MinGW32 as a compiler that can be used, as its arch doesn't match MinGW64 32 bits variant.
  • Compile modules in two passes with the option to specify which modules will be considered for a second pass at all (compiled without program optimization) or even become bytecode.
  • Installing Nuitka in develop mode with the command pip install -e nuitka_git_checkout_dir is now supported too.

Optimization

  • Popular modules known to not be performance relevant are no longer C compiled, e.g. numpy.distutils and many others frequently imported (from some other module), but mostly not used and definitely not performance relevant.

Cleanups

  • The progress tracing and the memory tracing are now more clearly separate and therefore more readable.
  • Moved RPM related files to new rpm directory.
  • Moved documentation related files to doc directory.
  • Converted import sorting helper script to Python and made it run fast.

Organizational

  • The Buildbot infrastructure for Nuitka was updated to Buildbot 0.8.12 and is now maintained up to date with Ansible.
  • Upgraded the Nuitka bug tracker to Roundup 1.5.1 to which I had previously contributed security fixes already active.
  • Added SSL certificates from Let's Encrypt for the web server.

Summary

This release advances the scalability of Nuitka somewhat. The two pass approach does not yet carry all possible fruits. Caching of single pass compiled modules should follow for it to become consistently fast.

More work will be needed to achieve fast and scalable compilation, and that is going to remain the focus for some time.

Nuitka Release 0.5.20

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler. Please see the page "What is Nuitka?" for an overview.

This release is mostly about catching up with issues. Most address standalone problems with special modules, but there are also some general compatibility corrections, as well as important fixes for Python3.5 and coroutines and to improve compatibility with special Python variants like AnaConda under the Windows system.

Bug Fixes

  • Standalone Python3.5: The _decimal module at least is using a __name__ that doesn't match the name at load time, causing programs that use it to crash.
  • Compatibility: For Python3.3 the __loader__ attribute is now set in all cases, and it needs to have a __module__ attribute. This makes inspection as done by e.g. flask working.
  • Standalone: Added missing hidden dependencies for Tkinter module, adding support for this to work properly.
  • Windows: Detecting the Python DLL and EXE used at compile time and preserving this information for use during backend compilation. This should make sure we use the proper ones, and avoids hacks for specific Python variants, enhancing the support for AnaConda, WinPython, and CPython installations.
  • Windows: The --python-debug flag now properly detects if the run time is supporting things and error exits if it's not available. For a CPython3.5 installation, it will switch between debug and non-debug Python binaries and DLLs.
  • Standalone: Added plug-in for the Pwm package to properly combine it into a single file, suitable for distribution.
  • Standalone: Packages from standard library, e.g. xml now have proper __path__ as a list and not as a string value, which breaks code of e.g. PyXML. Issue#183.
  • Standalone: Added missing dependency of twisted.protocols.tls. Issue#288.
  • Python3.5: When finalizing coroutines that were not finished, a corruption of its reference count could happen under some circumstances.
  • Standalone: Added missing DLL dependency of the uuid module at run time, which uses ctypes to load it.

New Features

  • Added support for AnaConda Python on Linux. Both accelerated and standalone mode work now. Issue#295.
  • Added support for standalone mode on FreeBSD. Issue#294.
  • The plug-in framework was expanded with new features to allow addressing some specific issues.

Cleanups

  • Moved memory related stuff to dedicated utils package nuitka.utils.MemoryUsage as part of an effort to have more topical modules.
  • Plug-ins now have a dedicated module through which the core accesses the API, which was partially cleaned up.
  • No more "early" and "late" import detections for standalone mode. We now scan everything at the start.

Summary

This release focused on expanding plugins. These were then used to enhance the success of standalone compatibility. Eventually this should lead to a finished and documented plug-in API, which will open up the Nuitka core to easier hacks and more user contribution for these topics.

Nuitka Release 0.5.19

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler. Please see the page "What is Nuitka?" for an overview.

This release brings optimization improvements for dictionary using code. This is now lowering subscripts to dictionary accesses where possible and adds new code generation for known dictionary values. Besides this there is the usual range of bug fixes.

Bug Fixes

  • Fix, attribute assignments or deletions where the assigned value or the attribute source was statically raising crashed the compiler.
  • Fix, the order of evaluation during optimization was considered in the wrong order for attribute assignments source and value.
  • Windows: Fix, when g++ is in the path, it was not used automatically, but now it is.
  • Windows: Detect the 32 bits variant of MinGW64 too.
  • Python3.4: The finalize of compiled generators could corrupt reference counts for shared generator objects. Fixed in 0.5.18.1 already.
  • Python3.5: The finalize of compiled coroutines could corrupt reference counts for shared generator objects.

Optimization

  • When a variable is known to have dictionary shape (assigned from a constant value, result of dict built-in, or a general dictionary creation), or the branch merge thereof, we lower subscripts from expecting mapping nodes to dictionary specific nodes. These generate more efficient code, and some are then known to not raise an exception.

    def someFunction(a,b):
        value = {a : b}
        value["c"] = 1
        return value
    

    The above function is not yet fully optimized (dictionary key/value tracing is not yet finished), however it at least knows that no exception can raise from assigning value["c"] anymore and creates more efficient code for the typical result = {} functions.

  • The use of "logical" sharing during optimization has been replaced with checks for actual sharing. So closure variables that were written to in dead code no longer inhibit optimization of the then no more shared local variable.

  • Global variable traces are now faster to decide definite writes without need to check traces for this each time.
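
Why the dictionary shape matters for the subscript lowering described above can be shown with a small sketch. The class ReadOnlyMapping and the function store are hypothetical and exist only to contrast a known dictionary with an arbitrary object.

```python
class ReadOnlyMapping:
    """Hypothetical mapping that supports lookups but not item assignment."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]


def store(container):
    # Shape of "container" is unknown here, so this assignment may raise.
    container["c"] = 1
    return container


def some_function(a, b):
    value = {a: b}
    # "value" is known to be a dictionary, a constant string key cannot fail.
    value["c"] = 1
    return value


print(some_function("a", "b"))
try:
    store(ReadOnlyMapping({}))
except TypeError as exc:
    print("raised:", type(exc).__name__)
```

For store the generic subscript code with its error exit is required, while some_function can use a dictionary specific operation without one.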

Cleanups

  • No longer using "logical sharing" allowed us to remove that function entirely.
  • Using "technical sharing" less often for decisions during optimization and instead rely more often on proper variable registry.
  • Connected variables with their global variable trace statically, avoiding the need to check the variable registry for it.
  • Removed old and mostly unused "assume unclear locals" indications, we use global variable traces for this now.

Summary

This release aimed at dictionary tracing. As a first step, the value assigned is now traced to have a dictionary shape, and this is then used to lower the operations which used to be normal subscript operations on mappings, but can now be more specific.

Making use of the dictionary values knowledge, tracing keys and values is not yet inside the scope, but expected to follow. We got the first signs of type inference here, but to really take advantage, more specific shape tracing will be needed.

Nuitka Progress in 2015

For quite a bit, there have been no status posts, not for lack of news, but a lot has happened indeed. I just seem to post a lot more to the mailing list than I do here. Especially about unfinished stuff, which is essentially for a project like Nuitka everything that's going on.

Like I previously said, I am shy to make public postings about unfinished stuff and that's going to continue. But I am breaking it, to keep you up to date with where Nuitka has been going lately.

And with release focuses, I have been making some actual changes that I think are worth talking about.

SSA (Static Single Assignment Form)

The SSA using release was made last summer. Recent releases have lifted more and more restrictions on where and how it is applied and made sure the internal status is consistent and true. And that trend is going to continue even more.

For shared variables (closure variables and module variables), Nuitka is still too conservative to make optimization. Code does annotate value escapes, but it's not yet trusting it. The next releases will focus on lifting that kind of restriction, and for quality of result, that will mean making a huge jump ahead once that works, so module variables used locally a lot will become even faster to use then and subject to static optimization too.

Function Inlining

When giving my talk at EuroPython 2015, I was demoing exactly that, and indeed, what a breakthrough. The circumstances under which it is done are still far too limited though. Essentially that ability is there, but will not normally be noticeable yet because other optimization is still missing, e.g. functions are most often module variables and not local to the using function.

More code generation improvements will be needed to be able to inline functions that might raise an exception. Also the "cost" of inlining a function is also very much an unsolved issue. It will become the focus again, once the SSA use as indicated above expands to module variables, as then inlining other things than local functions will be possible too.

So there is a lot of things to do for this to really make a difference to your programs. But it's still great to have that part solved so far.

Scalability

Parameter Parsing

Recent releases have replaced some of the oldest code of Nuitka, the one that generated special argument parsing for each function individually, now replaced with generic code, that surprisingly is often even faster, although quick entry points were tough to beat.

That gives the C backend compiler a much easier time. Previously 3 C functions were created per Python level function, two of which could get really big with many arguments, and these are no more.

Variable Error Messages

Something similar was going on with variable error messages. Each had their exception value pre-computed and created at module load time. Most of these are of course unused. This has been replaced with code that generates it on the fly, resulting in a lot less constants code.
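
The idea of creating the exception value only when needed can be sketched as follows. This is not Nuitka's generated C code; the template text and the helpers make_unbound_local_error and read_local are assumptions for illustration, and the real message wording depends on the Python version.

```python
# Hypothetical message template; the real wording depends on the Python version.
TEMPLATE = "local variable '%s' referenced before assignment"


def make_unbound_local_error(variable_name):
    """Create the exception value only at the moment it is needed."""
    return UnboundLocalError(TEMPLATE % variable_name)


def read_local(frame_locals, variable_name):
    """Look up a variable, building the error on the fly if it is unset."""
    try:
        return frame_locals[variable_name]
    except KeyError:
        raise make_unbound_local_error(variable_name) from None


print(read_local({"a": 1}, "a"))
```

Only the shared template has to exist up front; no per variable constant is created at module load time.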

Code Objects

And another thing was to look after code objects, of which there often were two for each Python level function. The one used for the frame during run time and the one used in the function object often differed, sometimes by small things like flags or local variable names.

That of course was just the result of not passing the code object along, but instead creating cached objects with hopefully the same options, which did not always hold true.

Resolving that, and sharing the code object used for function creation with the frame, gives less complex C code too.

Optimization

The scalability of Nuitka also depends much on generated code size. With the optimization becoming more clever, less code is generated, and that trend will continue as more structural optimizations are applied.

Every time e.g. an exception is identified to not happen, this removes the corresponding error exits from the C code, which then makes it easier for the C compiler. Also more specialized code, as we now have for dictionaries, is often less complex to it.

Compatibility

Important things have happened here. Full compatibility mode is planned to not be the default anymore in upcoming releases, but that will only mean to not be stupid compatible, but to e.g. have more complete error messages than CPython, more correct line numbers, or for version differences, the best Python version behaviour.


The stable release has full support for Python 3.5, including the new async and await functions. So recent releases can pronounce it as fully supported which was quite a feat.

I am not sure, if you can fully appreciate the catch up game needed to play here. CPython clearly implements a lot of features, that I have to emulate too. That's going to repeat for every major release.

The good news is that the function type of Nuitka is now specialized to the generators and classes, and that was a massive cleanup of its core that was due anyway. The generators no longer have their own function creation stuff and that has been helpful with a lot of other stuff.

Another focus driven from Python3 is to get ahead with type shape tracing and type inference of dictionary, and value tracing. To fully support Python3 classes, we need to work on something that is a dictionary a-like, and that will only ever be efficient if we have that. Good news is that the next release is making progress there too.
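
The reason Python3 class bodies need a dictionary a-like can be seen with the __prepare__ hook of metaclasses, which lets a class body execute against a custom mapping. The following sketch uses made-up names (RecordingNamespace, Meta, Example) and only demonstrates the language semantics, not Nuitka's implementation.

```python
class RecordingNamespace(dict):
    """Dictionary a-like that records every name assigned in a class body."""

    def __init__(self):
        super().__init__()
        self.assigned = []

    def __setitem__(self, key, value):
        self.assigned.append(key)
        super().__setitem__(key, value)


class Meta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # The class body runs with this object as its namespace.
        return RecordingNamespace()

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, dict(namespace))
        # Keep only the names written by the body itself, not dunder entries.
        cls.assigned_names = tuple(
            key for key in namespace.assigned
            if not (key.startswith("__") and key.endswith("__"))
        )
        return cls


class Example(metaclass=Meta):
    first = 1

    def method(self):
        return self.first

    second = 2


print(Example.assigned_names)
```

Every assignment in the class body goes through the mapping's __setitem__, so treating class level names as plain local variables is not fully compatible.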

Performance

Graphs and Benchmarks

I also presented this weak point to EuroPython 2015 and my plan on how to resolve it. Unfortunately, nothing really happened here. My plan is still to use what the PyPy people have developed as vmprof.

So that is not progressing, and I could need help with that definitely. Get in contact if you think you can.

Standalone

The standalone mode of Nuitka was pretty good, and continued to improve further, but I don't care much.

Other Stuff

EuroPython 2015

This was a blast. Meeting people who knew Nuitka but not me was a regular occurrence. And many people well appreciate my work. It felt much different than the years before.

I was able to present Nuitka's function in-lining indeed there, and this high goal that I set myself, quite impressed people.

Also I made many new contacts, largely with the scientific community. I hope to find work with data scientists in the coming years. More and more it looks like my day job should be closer to Nuitka and my expertise in Python.

Funding

Nuitka receives the occasional donation and those make me very happy. As there is no support from organizations like the PSF, I am all on my own there.

This year I want to travel to EuroPython 2016. It would be sweet if aside from my free time it wouldn't also cost me money. So please consider donating some more, as these kinds of events are really helpful to Nuitka.

Collaborators

Nuitka is making more and more breakthrough progress. And you can be a part of it. Now.

You can join and should do so now, just follow this link or become part of the mailing list and help me there with requests I make, e.g. review posts of mine, test out things, pick up small jobs, answer questions of newcomers, you know the drill probably.

Videos

There is a Youtube channel of mine with all the videos of Nuitka so far and I have been preparing myself with proper equipment to make Videos of Nuitka, but so far nothing has come out of that.

I do however really want to change that. Let's see if it happens.

Twitter

I have started to use my Twitter account on occasions. You are welcome to follow me there. I will highlight interesting stuff there.

Future

So, there are multiple things going on:

  • Type Inference

    With SSA in place, Nuitka starts to recognize types, and treats things assigned from {} or the dict built-in with special nodes and code.

    That's going to be a lot of work. For float and list there are very important use cases, where the code can be much better. But dict is the hardest case, and to get the structure of shape tracing right, we are going there first.

  • Shape Analysis

    The plan for types, is not to use them, but the more general shapes, things that will be more prevalent than actual type information in a program. In fact the precise knowledge will be rare, but more often, we will just have a set of operations performed on a variable, and be able to guess from there.

    Shape analysis will begin though with concrete types like dict. The reason is that some re-formulations like Python3 classes should not use locals, but dictionary accesses throughout for full compatibility. Tracing that correctly to be effectively the same code quality will allow to make that change.

  • Plug-ins

    Something I wish I could have shown at EuroPython was plug-ins to Nuitka. It has become more complete, and some demo plug-ins for say Qt plugins or multiprocessing, are starting to work, but it's not progressing recently. The API will need work and of course documentation. Hope is for this to expand Nuitka's reach and appeal to get more contributors.

    It would be sweet, if there were any takers, aiming to complete these things.

  • Nested frames

    One result of in-lining will be nested frames still present for exceptions to be properly annotated, or locals giving different sets of locals and so on.

    Some cleanup of these will be needed for code generation and SSA to be able to attach variables to some sort of container, and for a function to be able to reference different sets of these.

Let me know, if you are willing to help. I really need that help to make things happen faster. Nuitka will become more and more important only. And with your help, things will be there sooner.

Release Focus

One thing I have started recently is to make changes to Nuitka focused on just one goal, and to only deal with the rare bug in other fields, but not much else at all. So instead of across the board improvements in just about everything, I have e.g. in the last release added type inference for dictionaries and special nodes and their code generation for dictionary operations.

This progresses Nuitka in one field. And the next release then e.g. will only focus on making the performance comparison tool, and not continue much in other fields.

That way, more "flow" is possible and more visible progress too. As an example of this, these are the focuses of last releases.

  • Full Python 3.5 on a clean base with generators redone so that coroutines fit in nicely.
  • Scalability of C compilation with argument parsing redone
  • Next release soon: Shape analysis of subscript usages and optimization to exact dictionaries
  • Next release thereafter: Comparison benchmarking (vmprof, resolving C level function identifiers easier)

Other focuses will also happen, but that's too far ahead. Most likely some usability improvements will be the focus of a release some day. Focus is for things that are too complex to attack as a side project, and therefore never happen although surely possible.

Digging into Python3.5 coroutines and their semantics was hard enough, and the structural changes needed to integrate them properly with not too much special casing, but rather removing existing special cases (generator functions) were just too much work to ever happen while also doing other stuff.

Summary

So I am very excited about Nuitka. It feels like the puzzle is coming together finally, with type inference becoming a real thing. And should dictionaries be sorted out, the really important types, say float for scientific use cases, or int, list for others, will be easy to make.

With this, and then harder import association (knowing what other modules are), and module level SSA tracing that can be trusted, we can finally expect Nuitka to be generally fast and deserve to be called a compiler.

That will take a while, but it's likely to happen in 2016. Let's see if I will get the funding to go to EuroPython 2016, that would be great.

Nuitka Release 0.5.18

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler. Please see the page "What is Nuitka?" for an overview.

This release mainly has a scalability focus. While there are a few compatibility improvements, the larger goal has been to make Nuitka compilation and the final C compilation faster.

Bug Fixes

  • Compatibility: The nested arguments functions can now be called using their keyword arguments.

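    # Python2 only, nested arguments do not exist in Python3.
    def someFunction(a, (b, c)):
        return a, b, c

    # The nested argument is reachable under the implicit name ".1".
    someFunction(a=1, **{".1": (2, 3)})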
    
  • Compatibility: Generators with Python3.4 or higher now also have a __del__ attribute, and therefore properly participate in finalization. This should improve their interactions with garbage collection reference cycles, although no issues had been observed so far.

  • Windows: Was outputting command line arguments debug information at program start. Issue#284. Fixed in 0.5.17.1 already.
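
The generator finalization point can be checked with a small script. This is only a sketch of the behavior on CPython 3.4 or higher, which compiled generators are meant to match; the names events and producer are made up for the example.

```python
import gc

events = []


def producer():
    try:
        yield 1
        yield 2
    finally:
        # Runs when the suspended generator gets finalized.
        events.append("finalized")


gen = producer()
next(gen)  # Suspend the generator inside the try block.

# Python3.4 or higher exposes the finalizer as "__del__".
has_del = hasattr(gen, "__del__")

del gen  # Drop the last reference to the suspended generator.
gc.collect()  # Not needed with reference counting, but harmless.

print(has_del, events)
```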

Optimization

  • Code generated for parameter parsing is now a lot less verbose. Python level loops and conditionals to generate code for each variable have been replaced with C level generic code. This will speed up the backend compilation by a lot.
  • Function calls with constant arguments were sped up specifically, as their call is now fully prepared, and yet using less code. Variable arguments are also faster, and all defaulted arguments are also much faster. Method calls are not affected by these improvements though.
  • Nested argument functions now have a quick call entry point as well, making them faster to call too.
  • The slice built-in, and internal creation of slices (e.g. in re-formulations of Python3 slices as subscripts) cannot raise. Issue#262.
  • Standalone: Avoid inclusion of bytecode of unittest.test, sqlite3.test, distutils.test, and ensurepip. These are not needed, but simply bloat the amount of bytecode used on e.g. MacOS. Issue#272.
  • Speed up compilation with Nuitka itself by avoiding copying and constructing variable lists as much as possible, using an always accurate variable registry.
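
The slice point can be illustrated in plain Python. On Python3 a slicing like x[a:b] is a subscript with a slice object, and creating that object does not inspect the values given, so it cannot raise. The class ShowSubscript below is a made up helper, not part of Nuitka.

```python
class ShowSubscript(object):
    """Return whatever subscript object the interpreter hands in."""

    def __getitem__(self, subscript):
        return subscript


x = ShowSubscript()

# Slicing syntax and an explicit slice object give the same subscript.
assert x[1:2] == slice(1, 2)
assert x[::3] == slice(None, None, 3)

# Creating a slice accepts any values, errors can only appear on use.
odd = slice("a", 2.5, [])
print(odd.start, odd.stop, odd.step)
```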

Cleanups

  • Nested argument functions of Python2 are now re-formulated into a wrapping function that directly calls the actual function body with the unpacking of nested arguments done in nodes explicitly. This allows for better optimization and checks of these steps and potential in-lining of these functions too.
  • Unified slice object creation and built-in slice nodes, these were two distinct nodes before.
  • The code generation for all statement kinds is now done via dispatching from a dictionary instead of long elif chains.
  • Named nodes more often consistently, e.g. all loop related nodes start with Loop now, making them easier to group.
  • Parameter specifications got simplified to work without variables where it is possible.
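
The dictionary based dispatch mentioned above follows a common pattern, sketched here. All names (generateReturnCode, statement_generators, the kind strings) are hypothetical and only stand in for Nuitka's real node kinds and code generation helpers.

```python
def generateReturnCode(statement):
    return "return %s;" % statement["value"]


def generateLoopBreakCode(statement):
    return "goto loop_end;"


# One lookup replaces a long "if kind == ... elif kind == ..." chain.
statement_generators = {
    "STATEMENT_RETURN": generateReturnCode,
    "STATEMENT_LOOP_BREAK": generateLoopBreakCode,
}


def generateStatementCode(statement):
    try:
        generator = statement_generators[statement["kind"]]
    except KeyError:
        raise NotImplementedError("No code generation for %r" % statement["kind"])

    return generator(statement)


print(generateStatementCode({"kind": "STATEMENT_RETURN", "value": "result"}))
```

Adding a statement kind then means adding one dictionary entry, and the cost of the lookup does not grow with the number of kinds.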

Organizational

  • Nuitka is now available on the social code platform GitLab as well.

Summary

Long standing weaknesses have been addressed in this release, and quite a few structural cleanups have been performed. For example, strengthening the role of the variable registry to always be accurate lays the ground for further improvement of optimization.

However, this release cycle was mostly dedicated to performance of the actual compilation, and more accurate information was needed to e.g. not search for information that should be instant.

Upcoming releases will focus on usability issues and further optimization. It was nice however to see speedups of created code even from these scalability improvements.

Nuitka Release 0.5.17

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler. Please see the page "What is Nuitka?" for an overview.

This release is a major feature release, as it adds full support for Python3.5 and its coroutines. In addition, in order to properly support coroutines, the generator implementation got enhanced. On top of that, there is the usual range of corrections.

Bug Fixes

  • Windows: Command line arguments that are unicode strings were not properly working.
  • Compatibility: Fix, only the code object attached to exceptions contained all variable names, but not the one of the function object.
  • Python3: Support for virtualenv on Windows was using non-portable code and therefore failing. Issue#266.
  • The tree displayed with --display-tree duplicated all functions and did not resolve source lines for functions. It also displayed unused functions, which is not helpful.
  • Generators with parameters leaked C level memory for each instance of them leading to memory bloat for long running programs that use a lot of generators. Fixed in 0.5.16.1 already.
  • Don't drop positional arguments when called with --run, also make it an error if they are present without that option.

New Features

  • Added full support for Python3.5, coroutines work now too.

Optimization

  • Optimized frame access of generators to not use both a local frame variable and the frame object stored in the generator object itself. This gave about 1% speed up to setting them up.
  • Avoid having multiple code objects for functions that can raise and have local variables. Previously one code object would be used to create the function (with parameter variable names only) and when raising an exception, another one would be used (with all local variable names). Creating them both at start-up was wasteful and also needed two tuples to be created, thus more constants setup code.
  • The entry point for generators is now shared code instead of being generated for each one over and over. This should make things more cache local and also results in less generated C code.
  • When creating frame codes, avoid working with strings, but use proper emission for less memory churn during code generation.
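
The code object change can be observed from Python. With a single code object per function, the one seen in a traceback frame is the one attached to the function, and it carries parameter and local variable names alike. The sketch below shows the CPython behavior that this matches; failing is a made up function.

```python
import sys


def failing(param):
    local_value = param * 2
    raise ValueError(local_value)


try:
    failing(21)
except ValueError:
    tb = sys.exc_info()[2]
    # The first traceback entry is the calling code, the next is "failing".
    frame_code = tb.tb_next.tb_frame.f_code

print(frame_code.co_varnames)
print(frame_code is failing.__code__)
```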

Organizational

  • Updated the key for the Debian/Ubuntu repositories to remain valid for 2 more years.
  • Added support for Fedora 23.
  • MinGW32 is no longer supported. Use MinGW64 in the 32 bits variant, which has fewer issues.

Cleanups

  • Detecting the function type ahead of time allows handling generators differently from normal functions immediately.
  • Massive removal of code duplication between normal functions and generator functions. The latter are now normal functions creating generator objects, which makes them much more lightweight.
  • The return statement in generators is now immediately set to the proper node as opposed to doing this in variable closure phase only. We can now use the ahead knowledge of the function type.
  • The nonlocal statement is now immediately checked for syntax errors as opposed to doing that only in variable closure phase.
  • The name of contraction making functions is no longer skewed to empty, but the real thing instead. The code name is solved differently now.
  • The local_locals mode for function node was removed, it was always true ever since Python2 list contractions stopped using pseudo functions.
  • The outline nodes allowed to provide a body when creating them, although creating that body required using the outline node already to create temporary variables. Removed that argument.
  • Removed PyLint false positive annotations no longer needed for PyLint 1.5 and solved some TODOs.
  • Code objects are now mostly created from specs (not yet complete) which are attached and shared between statement frames and function creation nodes, in order to have less guess work to do.
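
From the outside, a generator function already behaves like a normal function whose only job is to create a generator object: calling it runs none of the body. The sketch shows that in plain Python, which is the behavior the cleaned up implementation follows; countUp and trace are made up names.

```python
import inspect

trace = []


def countUp(limit):
    trace.append("body started")
    for value in range(limit):
        yield value


# The call only creates the generator object, no body code runs yet.
gen = countUp(3)
created_without_running = trace == []

# Only iterating the generator object executes the body.
result = list(gen)

print(inspect.isgeneratorfunction(countUp), created_without_running, result)
```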

Tests

  • Added the CPython3.5 test suite.
  • Updated generated doctests to fix typos and use common code in all CPython test suites.

Summary

This release continues to address technical debt. Adding support for Python3.5 was the major driving force, while at the same time removing obstacles to the changes that were needed for coroutine support.

With Python3.5 sorted out, it will be time to focus on general optimization again, but there is more technical debt related to classes, so the cleanup has to continue.

Nuitka Release 0.5.16

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler. Please see the page "What is Nuitka?" for an overview.

This is a maintenance release, largely intended to put out improved support for new platforms and minor corrections. It should improve the speed for standalone mode, and compilation in general for some use cases, but this is mostly to clean up open ends.

Bug Fixes

  • Fix, the len built-in could give false values for dictionary and set creations with the same element.

    # This was falsely optimized to 2 even if "a is b and a == b" was true.
    len({a, b})
    
  • Python: Fix, the gi_running attribute of generators is no longer an int, but bool instead.

  • Python3: Fix, the int built-in with two arguments, value and base, raised UnicodeDecodeError instead of ValueError for illegal bytes given as value.

  • Python3: Using tokenize.open to read source code, instead of reading manually and decoding from tokenize.detect_encoding. This handles corner cases in a more compatible way.

  • Fix, the PyLint warnings plug-in could crash in some cases, make sure it's more robust.

  • Windows: Fix, the combination of AnaConda Python, MinGW 64 bits and mere acceleration was not working. Issue#254.

  • Standalone: Preserve not only namespace packages created by .pth files, but also make the imports done by them. This makes it more compatible with uses of it in Fedora 22.

  • Standalone: The extension modules could be duplicated, turned this into an error and cache finding them during compile time and during early import resolution to avoid duplication.

  • Standalone: Handle "not found" from ldd output, on some systems not all the libraries wanted are accessible for every library.

  • Python3.5: Fixed support for namespace packages, these were not yet working for that version.

  • Python3.5: Fixes lack of support for unpacking in normal tuple, list, and set creations.

    [*a] # this has become legal in 3.5 and now works too.
    

    Now also gives compatible SyntaxError for earlier versions. Python2 was good already.

  • Python3.5: Fix, need to reduce compiled functions to __qualname__ value, rather than just __name__ or else pickling methods doesn't work.

  • Python3.5: Fix, added gi_yieldfrom attribute to generator objects.

  • Windows: Fixed harmless warnings for Visual Studio 2015 in --debug mode.
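
Several of the fixes above concern behavior that can be compared against CPython directly. The following sketch collects a few of them as checks for Python3.5 or higher. It only uses standard built-ins, and the helper names inner and outer are made up. Running it uncompiled and compiled should give the same result.

```python
def inner():
    yield 1


def outer():
    yield from inner()


# len() of a set creation must consider equal elements.
a = b = 1.0
assert len({a, b}) == 1

# gi_running is a bool, and gi_yieldfrom exposes the delegated generator.
gen = outer()
assert gen.gi_running is False
assert gen.gi_yieldfrom is None
next(gen)
assert gen.gi_yieldfrom is not None

# int() with a base raises exactly ValueError for illegal bytes, and not
# its subclass UnicodeDecodeError.
try:
    int(b"\xff", 16)
except ValueError as error:
    int_error_type = type(error)

# Unpacking in list, tuple, and set creations.
values = (1, 2)
unpacked = [*values], (*values, 3), {*values}

print(int_error_type.__name__, unpacked)
```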

Optimization

  • Re-formulate exec and eval to default to globals() as the default for the locals dictionary in modules.
  • The try node was making a description of nodes moved to the outside when shrinking its scope, which was using a lot of time, just to not be output. Now these can be postponed.
  • Refactored how freezing of bytecode works. Uncompiled modules are now explicit nodes too, and in the registry. We only have one or the other of it, avoiding to compile both.
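
The exec and eval re-formulation rests on the documented rule that the locals dictionary defaults to the globals dictionary when only the latter is given, and at module level locals() is globals() anyway. Here is a minimal sketch of that rule with an explicit namespace; namespace and separate_locals are just example names.

```python
namespace = {"base": 40}

# Only globals given: it is used for locals as well, so assignments land there.
exec("result = base + 2", namespace)
print(namespace["result"])

# eval follows the same rule and reads names from the one dictionary.
doubled = eval("result * 2", namespace)
print(doubled)

# With distinct dictionaries, assignments go to the locals one instead.
separate_locals = {}
exec("other = base", namespace, separate_locals)
print("other" in namespace, separate_locals["other"])
```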

Tests

  • When strace or dtruss are not found, give a proper error message, so people know what to do.
  • The doc tests extracted and then generated for CPython3 test suites were not printing the expressions of the doc test, leading to largely decreased test coverage here.
  • The CPython 3.4 test suite is now also using common runner code, and avoids ignoring all Nuitka warnings, instead more white listing was added.
  • Started to run CPython 3.5 test suite almost completely, but coroutines are blocking some parts of that, so these tests that use this feature are currently skipped.
  • Removed more CPython tests that access the network and are generally useless to testing Nuitka.
  • When comparing outputs, normalize typical temporary file names used on posix systems.
  • Coverage tests have made some progress, and some changes were made due to its results.
  • Added a test to cover the too complex code module of the idna module.
  • Added Python3.5 only test for unpacking variants.
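
Normalizing temporary file names before comparing outputs can be pictured as below. This is a sketch, not the code of the test runner; the pattern and the function name normalizeTempNames are assumptions and only cover names as created by the tempfile module under /tmp.

```python
import re

# Assumed pattern: "/tmp/tmp" followed by the random part tempfile adds.
_temp_name_pattern = re.compile(r"/tmp/tmp[A-Za-z0-9_]+")


def normalizeTempNames(output):
    """Replace random temporary names with a fixed placeholder."""
    return _temp_name_pattern.sub("/tmp/tmpxxxxxxx", output)


# Two runs of the same program that differ only in the random name.
run1 = "Writing to /tmp/tmpa1B2c3_d/out.txt\n"
run2 = "Writing to /tmp/tmpZ9y8x7wq/out.txt\n"

print(normalizeTempNames(run1) == normalizeTempNames(run2))
```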

Cleanups

  • Prepare plug-in interface to allow suppression of import warnings to access the node doing it, making the import node accessible.
  • Have dedicated class function body object, which is a specialization of the function body node base class. This allowed removing class specific code from that class.
  • The use of "win_target" as a scons parameter was useless. Make more consistent use of it as a flag indicator in the scons file.
  • Compiled types were mixing uses of compiled_ prefixes, sometimes with a space, sometimes with an underscore.

Organizational

  • Improved support for Python3.5, adding missing compatibility with new language features.
  • Updated the Developer Manual with changes that SSA is now a fact.
  • Added Python3.5 Windows MSI downloads.
  • Added repository for Ubuntu Wily (15.10) for download. Removed Ubuntu Utopic package download, no longer supported by Ubuntu.
  • Added repository with RPM packages for Fedora 22.

Summary

So this release is mostly to lower the technical debt that holds Nuitka back from making more interesting changes. Upcoming releases may have to continue that trend for some time.

This release is mostly about catching up with Python3.5, to make sure we did not miss anything important. The new function body variants will make it easier to implement coroutines, and help with optimization and compatibility problems that remain for Python3 classes.

Ultimately it will be nice to require a lot fewer checks for when function in-lining is going to be acceptable. Also code generation will need a continued push to use the new structure in preparation for making type specific code generation a reality.