Nuitka Release 0.4.1

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler. Please see the page "What is Nuitka?" for an overview.

This release is the first follow-up with a focus on optimization. The major highlight is progress towards SSA form in the node tree.

Also, a lot of cleanups have been performed. Tree building is now considered mostly finished and will only be reviewed from here on, while the optimization part has seen a large amount of changes.

New Features

  • Python 3.3 experimental support

    • Now compiles many basic tests. Ported the dictionary quick access and update code to a more generic and useful interface.
    • Added support for __qualname__ to classes and functions.
    • Small compatibility changes. Some exceptions changed, absolute imports are now default, etc.
    • For comparison tests, the hash randomization is disabled.
  • Python 3.2 support has been expanded.

    The Python 3.2 on Ubuntu does not provide a helper function that was used by Nuitka, so it was replaced with our own code.
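To illustrate the __qualname__ attribute mentioned in the Python 3.3 list above, here is a minimal sketch that runs on plain CPython 3.3 or later. The class and function names are made up for the example; a compatible compiler has to produce the same values.

```python
class Outer:
    class Inner:
        def method(self):
            pass


def factory():
    def local_function():
        pass

    return local_function


# Nested definitions carry their full dotted path, unlike __name__.
print(Outer.Inner.method.__name__)      # method
print(Outer.Inner.method.__qualname__)  # Outer.Inner.method
print(factory().__qualname__)           # factory.<locals>.local_function
```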

Bug fixes

  • Default values were not "is" identical.

    def defaultKeepsIdentity(arg = "str_value"):
        print arg is "str_value"
    
    defaultKeepsIdentity()
    

    This now prints "True" as it does with CPython. The solution is actually a general code optimization, see below. Issue#55

  • Usage of unicode built-in with more than one argument could corrupt the encoding argument string.

    An implementation error in the unicode built-in was releasing references to arguments converted to the default encoding, which could corrupt the string.

  • Assigning Python3 function annotations could cause a segmentation fault.

New Optimization

  • Improved propagation of exception raise statements, eliminating more code. They are now also propagated from all kinds of expressions. Previously this was more limited. An assertion added will make sure that all raises are propagated. Also finally, raise expressions are converted into raise statements, but without any normalization.

    # Now optimizing:
    raise TypeError, 1/0
    # into (minus normalization):
    raise ZeroDivisionError, "integer division or modulo by zero"
    
    # Now optimizing:
    (1/0).something
    # into (minus normalization):
    raise ZeroDivisionError, "integer division or modulo by zero"
    
    # Now optimizing:
    function(a, 1/0).something
    # into (minus normalization), notice the side effects of first checking
    # function and a as names to be defined, these may be removed only if
    # they can be demonstrated to have no effect.
    function
    a
    raise ZeroDivisionError, "integer division or modulo by zero"
    

    There are more examples where the raise propagation is new, but you get the idea.

  • Conditional expression nodes are now optimized according to the truth value of the condition, and not only for compile time constants. This covers e.g. container creations, and other things.

    # This was already optimized, as it's a compile time constant.
    a if ("a",) else b
    a if True else b
    
    # These are now optimized, as their truth value is known.
    a if (c,) else b
    a if not (c,) else b
    

    This is simply taking advantage of infrastructure that now exists. Each node kind can overload "getTruthValue" and benefit from it. Help would be welcome to review which ones can be added.

  • Function creations only have side effects when their defaults or annotations (Python3) do. This allows removing them entirely, should they be found to be unused.

  • Code generation for constants now shares element values used in tuples.

    The general case is currently too complex to solve, but we now make sure constant tuples (as e.g. used in the default value for the compiled function), and string constants share the value. This should reduce memory usage and speed up program start-up.
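The conditional expression optimization described above can be pictured with a toy node tree. This is only a sketch under stated assumptions: the classes, the method name get_truth_value and the function optimize_conditional are hypothetical stand-ins, not Nuitka's actual node classes, and the sketch ignores the side effects of evaluating the condition, which a real optimizer has to preserve.

```python
class Node:
    def get_truth_value(self):
        """Return True or False if known at compile time, else None."""
        return None


class Variable(Node):
    def __init__(self, name):
        self.name = name


class TupleCreation(Node):
    def __init__(self, *elements):
        self.elements = elements

    def get_truth_value(self):
        # A non-empty tuple is true whatever its elements evaluate to,
        # an empty one is false.
        return bool(self.elements)


class Not(Node):
    def __init__(self, operand):
        self.operand = operand

    def get_truth_value(self):
        inner = self.operand.get_truth_value()
        return None if inner is None else not inner


class Conditional(Node):
    def __init__(self, condition, yes, no):
        self.condition = condition
        self.yes = yes
        self.no = no


def optimize_conditional(node):
    # Hypothetical helper: pick a branch when the truth value is known.
    truth = node.condition.get_truth_value()
    if truth is None:
        return node
    return node.yes if truth else node.no


a, b, c = Variable("a"), Variable("b"), Variable("c")
print(optimize_conditional(Conditional(TupleCreation(c), a, b)).name)       # a
print(optimize_conditional(Conditional(Not(TupleCreation(c)), a, b)).name)  # b
```

A node kind that knows nothing about its truth value simply inherits the default and the conditional is left alone.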

Cleanups

  • Optimization was initially designed around visitors that each did one thing, and did it well. It turns out though that this approach is unnecessary, and that constraint collection allows for the most consistent results. All remaining optimization has been merged into constraint collection.
  • The names of modules containing node classes were harmonized to always be plural. In the beginning, the singular was used to convey the information that only a single node kind would be contained, but that has long changed, and is unimportant information.
  • The class names of nodes were stripped of the "CPython" prefix. Originally the intent was to express strict correlation to CPython, but with increasing amounts of re-formulations, this was not used at all, and it's also not important enough to dominate the class name.
  • The re-formulations performed in tree building have moved out of the "Building" module into modules with names like "ReformulationClasses", so they are easier to locate and review. Helpers for node building are now in a separate module, and generally it's much easier to find the content of interest now.
  • Added new re-formulation of print statements. The conversion to strings is now made explicit in the node tree.

New Tests

  • Added test to cover default value identity.

Organizational

  • The upload of Nuitka to PyPI has been repaired and now properly displays project information again.

Summary

This quicker release is mostly a consolidation effort, without actual performance progress. The progress towards SSA form matters a lot on the outlook front. Once this is finished, standard compiler algorithms can be added to Nuitka which go beyond the current peephole optimization.

PyStone Comparison Nuitka, Cython, and CPython

As you all know, Nuitka (see "what is Nuitka?" ) has recently completed a milestone. Always short on time, I am not doing a whole lot of benchmarking yet, and focus on development. But here is an interesting submission from Dave Kierans (CTO of iPowow! Ltd):

➜  ~  python pystone.py 1000000
Pystone(1.1) time for 1000000 passes = 10.2972
This machine benchmarks at 97113.5 pystones/second
➜  ~  cython --embed pystone.py;gcc pystone.c -I/usr/include/python2.6 -L /usr/lib/ -lpython2.6 -o ./pystone.cython;./pystone.cython 1000000
Pystone(1.1) time for 1000000 passes = 8.20789
This machine benchmarks at 121834 pystones/second
➜  ~  nuitka-python pystone.py 1000000
Pystone(1.1) time for 1000000 passes = 4.06196
This machine benchmarks at 246187 pystones/second

This is a nice result for Nuitka, even if we all know that pystone is not a really good benchmark at all, and that its results definitely do not translate to other software. It definitely makes me feel good. With the 0.4.x series kicked off, it will be exciting to see where these numbers can indeed go, once Nuitka actually applies standard compiler techniques.

Nuitka Release 0.4.0

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler. Please see the page "What is Nuitka?" for an overview.

This release brings massive progress on all fronts. The big highlight is of course: Full Python3.2 support. With this release, the test suite of CPython3.2 is considered passing when compiled with Nuitka.

Then lots of work on optimization and infrastructure. The major goal of this release was to get in shape for actual optimization. This is also why for the first time, it is tested that some things are indeed compile time optimized, to spot regressions more easily. And we now have performance diagrams, even if weak ones.

New Features

  • Python3.2 is now fully supported.
    • The metaclass = semantics are now fully supported. They had been working somewhat previously, but now all the corner cases are covered too.
    • Keyword only parameters.
    • Annotations of functions return value and their arguments.
    • Exception causes, chaining, automatic deletion of exception handlers as values.
    • Added support for starred assigns.
    • Unicode variable names are also supported, although it's of course ugly, to find a way to translate these to C++ ones.

Bug fixes

  • Checking compiled code with isinstance(some_function, types.FunctionType) as "zope.interfaces" does, was causing compatibility problems. Now this kind of check passes for compiled functions too. Issue#53

  • The frame of modules had an empty locals dictionary, which is not compatible to CPython which puts the globals dictionary there too. Also discussed in Issue#53

  • For nested exceptions and interactions with generator objects, the exceptions in "sys.exc_info()" were not always fully compatible. They now are.

  • The range builtin was not raising exceptions if given arguments appeared to not have side effects, but were still illegal, e.g. range([], 1, -1) was optimized away if the value was not used.

  • Don't crash on imported modules with syntax errors. Instead, the attempted recursion is simply not done.

  • Doing a del on __defaults__ and __module__ of compiled functions was crashing. This was noticed by a Python3 test for __kwdefaults__ that exposed this weakness of compiled functions.

  • Duplicate arguments were not detected if one of them was not a plain argument. Star arguments could collide with normal ones.

  • The __doc__ of classes is now only set, where it was in fact specified. Otherwise it only polluted the name space of locals().

  • When a return from the tried statements of a try/finally block was overridden by a return in the final block, a reference was leaked. Example code:

    try:
        return 1
    finally:
        return 2
    
  • Raising exception instances with a value was leaking references, and was not raising the TypeError it is supposed to raise.

  • When raising with multiple arguments, the evaluation order of them was not enforced, it now is. This fixes a reference leak when raising exceptions, where building the exception was raising an exception.
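Two of the behaviors restored by the fixes above can be checked on plain CPython with a short script. This is a sketch in Python3 syntax with made-up helper names; it shows what the reference interpreter does, which is what compiled code is supposed to match.

```python
import types


def some_function():
    return 42


def is_plain_function(obj):
    # The kind of check that has to pass for compiled functions too.
    return isinstance(obj, types.FunctionType)


def range_rejects_bad_arguments():
    try:
        range([], 1, -1)  # the result is deliberately unused
    except TypeError:
        return True
    return False


print(is_plain_function(some_function))  # True
print(range_rejects_bad_arguments())     # True
```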

New Optimization

  • Optimizing attribute access to compile time constants for the first time. The old registry had no actual user yet.
  • Optimizing subscript and slices for all compile time constants beyond constant values, made easy by using inheritance.
  • Built-in references now convert to strings directly, e.g. when used in a print statement. Needed for the testing approach "compiled file contains only prints with constant value".
  • Optimizing calls to constant nodes directly into exceptions.
  • Optimizing built-in bool for arguments with known truth value. This would be creations of tuples, lists, and dictionaries.
  • Optimizing a is b and a is not b based on aliasing interface, which at this time effectively is limited to telling that a is a is true and a is not a is false, but this will expand.
  • Added support for optimizing hasattr, getattr, and setattr built-ins as well. The hasattr was needed for the class re-formulation of Python3 anyway.
  • Optimizing getattr with string argument and no default to simple attribute access.
  • Added support for optimizing isinstance built-in.
  • Was handling "BreakException" and "ContinueException" in all loops that used break or continue instead of only where necessary.
  • When catching "ReturnValueException", was raising an exception where a normal return was sufficient. Raising them now only where needed, which also means functions need not catch them ever.
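The rewrite of getattr with a constant string argument and no default into plain attribute access can be sketched with the standard ast module. This is not how Nuitka implements it; the class GetattrToAttribute and the function rewrite are hypothetical, the sketch assumes that the name getattr is not shadowed, and it needs Python 3.9 or later for ast.unparse.

```python
import ast
import keyword


class GetattrToAttribute(ast.NodeTransformer):
    """Turn getattr(obj, "name") into obj.name, leave other calls alone."""

    def visit_Call(self, node):
        self.generic_visit(node)  # rewrite nested calls first

        if (
            isinstance(node.func, ast.Name)
            and node.func.id == "getattr"
            and len(node.args) == 2
            and not node.keywords
            and isinstance(node.args[1], ast.Constant)
            and isinstance(node.args[1].value, str)
            and node.args[1].value.isidentifier()
            and not keyword.iskeyword(node.args[1].value)
        ):
            attribute = ast.Attribute(
                value=node.args[0], attr=node.args[1].value, ctx=ast.Load()
            )
            return ast.copy_location(attribute, node)

        return node


def rewrite(source):
    tree = GetattrToAttribute().visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))


print(rewrite("value = getattr(obj, 'name')"))  # value = obj.name
print(rewrite("getattr(obj, 'name', None)"))    # getattr(obj, 'name', None)
```

The variant with a default is left untouched, because it has different semantics when the attribute is missing.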

Cleanups

  • The handling of classes for Python2 and Python3 have been re-formulated in Python more completely.
    • The calling of the determined "metaclass" is now in the node tree, so it may be possible to in-line this call in the future. This eliminated some static C++ code.
    • Passing of values into dictionary creation function is no longer using hard coded special parameters, but temporary variables can now have closure references, making this normal and visible to the optimization.
    • Class dictionary creation functions are therefore no longer as special as they used to be.
    • There is no class creation node anymore, it's merely a call to type or the metaclass detected.
  • Re-formulated complex calls through helper functions that process the star list and dict arguments and do merges, checks, etc.
    • Moves much C++ code into the node tree visibility.
    • Will allow optimization to eliminate checks and to compile time merge, once in-line functions and loop unrolling are supported.
  • Added "return None" to function bodies without an aborting statement at the end, and removed the hard coded fallback from function templates. Makes it explicit in the node tree and available for optimization.
  • Merged C++ classes for frame exception keeper with frame guards.
    • The exception is now saved in the compiled frame object, making it potentially more compatible to start with.
    • Aligned module and function frame guard usage, now using the same class.
    • There is now a clear difference in the frame guard classes. One is for generators and one is for functions, allowing to implement their different exception behavior there.
  • The optimization registries for calls, subscripts, slices, and attributes have been replaced with attaching them to nodes.
    • The ensuing circular dependency has been resolved by more local imports for created nodes.
    • The package "nuitka.transform.optimization.registries" is no more.
    • New per node methods "computeNodeCall", "computeNodeSubscript", etc. dispatch the optimization process to the nodes directly.
  • Use the standard frame guard code generation for modules too.
    • Added a variant "once", that avoids caching of frames entirely.
  • The variable closure taking has been cleaned up.
    • Stages are now properly numbered.
    • Python3 only stage is not executed for Python2 anymore.
    • Added comments explaining things a bit better.
    • Now an early step done directly after building a tree.
  • The special code generation used for unpacking from iterators and catching "StopIteration" was cleaned up.
    • Now uses template, Generator functions, and proper identifiers.
  • The return statements in generators are now re-formulated into raise StopIteration for generators, because that's what they really are. Allowed to remove special handling of return nodes in generators.
  • The specialty of CPython2.6 yielding non-None values of lambda generators, was so far implemented in code generation. This was moved to tree building as a re-formulation, making it subject to normal optimization.
  • Mangling of attribute names in functions contained in classes, has been moved into the early tree building. So far it was done during code generation, making it invisible to the optimization stages.
  • Removed tags attribute from node classes. This was once intended to make up for non-inheritance of similar node kinds, but since we have function references, the structure got so clean, it's no more needed.
  • Introduced new package nuitka.tree, where the building of node trees, and operations on them live, as well as recursion and variable closure.
  • Removed nuitka.transform and move its former children nuitka.optimization and nuitka.finalization one level up. The deeply nested structure turned out to have no advantage.
  • Checks for the Python version were sometimes "> 300", where of course ">= 300" is the only thing that makes sense.
  • Split out helper code for exception raising from the handling of exception objects.
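The statement above that a return in a generator really is a raise of StopIteration can be observed from the outside. The following sketch uses made-up function names. Note that in current Python3 an explicit raise StopIteration written inside a generator body is converted to a RuntimeError (PEP 479), so the sketch only watches what the caller sees.

```python
def countdown(start):
    while start > 0:
        yield start
        start -= 1
    return  # ends the iteration, the caller sees StopIteration


def exhaust_by_hand(generator):
    values = []
    while True:
        try:
            values.append(next(generator))
        except StopIteration:
            return values


print(exhaust_by_hand(countdown(3)))  # [3, 2, 1]
```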

New Tests

  • The complete CPython3.2 test suite was adapted (no __code__, no __closure__, etc.) and is now passing, but only without "--debug", because otherwise some of the generated C++ triggers (harmless) warnings.
  • Added new test suite designed to prove that expressions that are known to be compile time constant are indeed so. This works using the XML output done with "--dump-xml" and then searching it to only have print statements with constant values.
  • Added new basic CPython3.2 test "Functions32" and "ParameterErrors32" to cover keyword only parameter handling.
  • Added tests to cover generator object and exception interactions.
  • Added tests to cover try/finally and return in one or both branches correctly handling the references.
  • Added tests to cover evaluation order of arguments when raising exceptions.

Organizational

  • Changed my email from GMX over to Gmail, the old one will still continue to work. Updated the copyright notices accordingly.
  • Uploaded Nuitka to PyPI as well.

Summary

This release marks a milestone. The support of Python3 is here. The re-formulation of complex calls, and the code generation improvements are quite huge. More re-formulation could be done for argument parsing, but generally this is now mostly complete.

The 0.3.x series had a lot of releases, many of which brought progress with re-formulations that aimed at making optimization easier or possible. Sometimes small things like making "return None" explicit. Sometimes bigger things, like making class creations normal functions, or getting rid of or and and. All of this was important ground work, to make sure that optimization doesn't deal with complex stuff.

So, the 0.4.x series begins with this. The focus from now on can be almost purely optimization. This release contains already some of it, with frames being optimized away, with the assignment keepers from the or and and re-formulation being optimized away. This will be about achieving goals from the "ctypes" plan as discussed in the developer manual.

Also the performance page will be expanded with more benchmarks and diagrams as I go forward. I have finally given up on "codespeed", and do my own diagrams.

Python3 support by Nuitka is upcoming

This is kind of semi-interesting news for you Python3 lovers. My "Python compiler Nuitka" has been supporting the language somewhat, but in the next release it's going to be complete.

Support for annotations, unicode variable names, keyword only arguments, starred assignments, exception chaining plus causes, and full support for new meta classes all have been added. So Python3.2 is covered now too and passing the test suite on about the same level as with Python2.6 and Python2.7 already.

I readied this while working on optimization, which is also seeing some progress that I will report on another day. Plus it's "available now on PyPI" too. At the time of this writing, there are pre-releases there, but after the next release, only stable releases will be published.

My quick take on the Python3 that I saw:

Keyword Only Arguments

Now seriously. I didn't miss that. And I am going to hate the extra effort it causes to implement argument parsing. And when I looked into default handling, I was just shocked:

def f( some_arg = function1(), *, some_key_arg = function2() ):
    pass

What would you expect the evaluation order to be for defaults? I raised a CPython "bug report" about it. And I am kind of shocked this could be wrong.
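To see which order a given interpreter actually uses, the two default expressions can record their evaluation. This is a small sketch with the function names from the example above; it makes no claim about the result, since that depends on the Python version it is run with.

```python
order = []


def function1():
    order.append("function1")
    return 1


def function2():
    order.append("function2")
    return 2


# Both defaults are evaluated once, when the def statement executes.
def f(some_arg=function1(), *, some_key_arg=function2()):
    return some_arg, some_key_arg


print(order)  # shows the evaluation order of the running interpreter
```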

Annotations

def f( arg1 : c_types.c_int, arg2 : c_types.c_int ) -> c_types.c_long:
   return arg1+arg2

That looks pretty promising. I had known about that already, and I guess, whatever "hints" idea we come up with for Nuitka, it should at least allow this notation. Unfortunately, evaluation order of annotations is not that great either. I would have expected them to come first, but they come last. Coming last, they kind of come too late to do anything with defaults.
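As a runnable variant of the notation above, the following sketch uses the standard ctypes module for the annotation values (an assumption, the original spells it c_types) and a made-up function name. It shows that the annotations are merely stored on the function and not enforced, which is what leaves room for a compiler to use them as hints.

```python
import ctypes


def add(arg1: ctypes.c_int, arg2: ctypes.c_int) -> ctypes.c_long:
    return arg1 + arg2


# The annotations are available for inspection at run time.
print(add.__annotations__)

# CPython does not check them, plain Python integers are accepted.
print(add(1, 2))  # 3
```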

Unicode Variables

Now great. While it's one step closer to "you can't re-enter any identifier without copy&paste", it is also a whole lot more consistent, if you come e.g. from German or French and need only a few extra letters. And there has to be at least one advantage to making everything unicode, right?

Exception Chaining

Gotta love that. Exception handlers are often subject to bit rot. They will themselves contain errors, when they are actually used, hiding the important thing that happened. No more. That's great.

The exception causes on the other hand, might be useful, but I didn't see them a lot yet. That's probably because they are too new.
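Both mechanisms can be seen in a few lines. This is a sketch with made-up function names: the first handler contains a bug and the original error survives as __context__, the second one sets an explicit cause with "raise ... from ...".

```python
def lookup_with_buggy_handler(mapping, key):
    try:
        return mapping[key]
    except KeyError:
        return 1 / 0  # the bit rot: the handler itself fails


def lookup_with_cause(mapping, key):
    try:
        return mapping[key]
    except KeyError as error:
        raise ValueError("unknown key %r" % key) from error


try:
    lookup_with_buggy_handler({}, "missing")
except ZeroDivisionError as error:
    implicit = error

try:
    lookup_with_cause({}, "missing")
except ValueError as error:
    explicit = error

print(repr(implicit.__context__))  # KeyError('missing')
print(repr(explicit.__cause__))    # KeyError('missing')
```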

Starred Assigns

I was wondering already, why that didn't work before.

a, *b = something()

Metaclass __prepare__

That's great stuff. Wish somebody had thought about meta classes that way from the outset.
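A minimal sketch of what __prepare__ makes possible: the metaclass supplies the namespace object that the class body is executed in, so it can react to every assignment. The names NoDuplicates, StrictMeta and defines_duplicate are made up for this example.

```python
class NoDuplicates(dict):
    """Class namespace that refuses to define the same name twice."""

    def __setitem__(self, key, value):
        if key in self:
            raise TypeError("duplicate definition of %r" % key)
        super().__setitem__(key, value)


class StrictMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # Called before the class body runs, the result is its namespace.
        return NoDuplicates()

    def __new__(mcls, name, bases, namespace, **kwargs):
        return super().__new__(mcls, name, bases, dict(namespace), **kwargs)


def defines_duplicate():
    try:

        class Broken(metaclass=StrictMeta):
            def method(self):
                return 1

            def method(self):  # silently replaces the first one normally
                return 2

    except TypeError:
        return True
    return False


print(defines_duplicate())  # True
```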

nonlocal

Yeah, big stuff. I love it. Also being able to write closure variables is great for consistency.
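For illustration, the classic case of writing to a closure variable, with made-up names:

```python
def make_counter():
    count = 0

    def increment():
        nonlocal count  # without this, the assignment would create a local
        count += 1
        return count

    return increment


counter = make_counter()
print(counter(), counter())  # 1 2
```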

Conclusion

Nothing ground breaking. Nothing that makes me give up on Python2.7 yet, but a bunch of surprises. And for Nuitka, this also found bugs and corner cases, where things were not yet properly tested. The Python3 test suite, inspired the finding of some obscure compatibility problems for sure.

Overall, it's possible to think of Python2 and Python3 as still the same language with only minor semantic differences. These can then be dealt with re-formulations into the simpler Python that Nuitka optimization deals with internally.

I for my part, am still missing the print statement. But well, now Python3 finds them when I accidentally check them in. So that's kind of a feature for as long as I develop in Python2.6/2.7 language.

Now if one of you would help out with Python3.3, that would be great. It would be about the new dictionary implementation mostly. And maybe finding a proper re-formulation for "yield from" too.

Static Compilation - That is the point

In a recent post, Stefan Behnel questioned the point of static compilation and suggests that in order to be useful, a static compiler needs to add something on top.

This is going to be a rebuttal.

Compatibility, I mean it

First of all, let me start out, by saying that Nuitka is intended to be the fully optimizing compiler of Python. The optimizing part is not yet true. Right now, it's probably a correct compiler of Python. Correct in the sense that it's compatible to CPython as far as possible.

As examples of what I mean with compatibility:

  • Nuitka will hold references to local variables of frames, when an exception is raised, and will release them only once the next exception is raised.
  • Nuitka will give the same error messages for all kinds of errors. For example, the parameter parsing of functions will be the same.
  • Nuitka provides all language constructs no matter how absurd or unused they are.

Compatibility != Slower

While generally Nuitka will have a hard time to be faster and compatible to CPython, I don't have much concern about that. Using guards between optimistic, and less optimistic variants of code, there is no doubt in my head, that for programs, lots of code will only need very minimal type annotation and still receive respectable speedups.
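The guard idea can be pictured in Python itself. This is purely an illustration with hypothetical function names, not generated code: a cheap exact type check selects the optimistic variant, and everything else falls back to the fully general, compatible path.

```python
def add_generic(a, b):
    # Fully general path: whatever "+" means for these objects.
    return a + b


def add_guarded(a, b):
    # Guard: only exact integers take the optimistic variant. A compiler
    # could emit native integer arithmetic here.
    if type(a) is int and type(b) is int:
        return int.__add__(a, b)

    # Less optimistic variant, still fully compatible.
    return add_generic(a, b)


print(add_guarded(1, 2))      # 3
print(add_guarded("a", "b"))  # ab
```

Because the fallback is always there, the result is the same as without the guard, only the speed differs.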

Of course, at this point, this is only speculation. But I somehow gather that the sentiment is that incompatible and fast need to go along. I totally dispute that.

Language Extensions

Now, the "in addition" stuff, that Stefan is talking about. I don't see the point at all. It is quite obvious that everything you can say with cdef syntax, could also be said with a more Pythonic syntax. And if it were missing, it could be added. And where it's a semantic change, it should be frowned upon.

For the Nuitka project, I never considered an own parser of Python. No matter how easy it would be, to roll your own, and I understand that Cython did that, it's always going to be wrong and it's work. Work that has no point. The CPython implementation exhibits and maintains the module ast that works just fine.

For Python, if this were so really useful, such language extensions should be added to Python itself. If there were missing meaningful things, I contend they would best be added there, not in a fork of it. Note how ctypes and cffi have been added. When I created bindings for Ada code (C exports) to Python, it was so easy to do that in pure Python with ctypes. I so much enjoyed that.

So, slow bindings are in my view really easy to create with plain Python now. Somebody ought to make a ".h" to ctypes/cffi declarations converter, once they are really faster to use (possibly due to Nuitka). For Nuitka it should be possible to accelerate these into direct calls and accesses. At which point, mixing generated C code and C include statements, will just be what it is, a source of bugs that Nuitka won't have.

Note

Further down, I will give examples of why I think that cdef is inferior to plain Python, even from a practical point of view.

Lack of Interpreter is bad

Static compilation vs. interpretation as a discussion has little merit to me. I find it totally obvious that you don't need static compilation, but 2 other things:

  1. You may need interpretation.
  2. And may need speed.

To me static code analysis and compilation are means to achieve that speed, but not intended to remove interpretation, e.g. plugins need to still work, no matter how deep they go.

For Cython syntax there is no interpreter, is there? That makes it lose an important point. So it has to have another reason for using it, and that would be speed and probably convenience. Now suppose Nuitka takes over with these benefits, what would it be left with? Right. Nothing. At all. Well, of course legacy users.

The original sin of Pyrex - that is now Cython - is nothing that Nuitka should repeat.

No Lock-in

The Cython language is so underspecified, I doubt anybody could make a compatible implementation. Should you choose to use it, you will become locked in. That means, if Cython breaks or won't work to begin with, you are stuck.

That situation I totally despise. It seems an unnecessary risk to take. Not only can you not just try another compiler if your program does not work. You also will never really know whether it's your fault or Cython's fault until you have found out whose fault it is. You will find yourself playing with removing, adding, or re-phrasing cdef statements, until things work.

Come on. I would rather use PyPy or anything else that can be checked with CPython. Should I ever encounter a bug with it, I can try CPython, and narrow down with it. Or I can try Jython, IronPython, or lo and behold, Nuitka.

I think, this totally makes it obvious, that static compilation of a non-Python language has no point to people with Python code.

What I will always admit, is that Cython is (currently) the best way to create fast bindings, because Nuitka is not ready yet. But from my point of view, Cython has no point long term if a viable replacement that is Pythonic exists.

Python alone is a point

So, if you leave out static compilation vs. interpretation and JIT compilation, what would be the difference between PyPy and Nuitka? Well, obviously PyPy people are a lot cooler and cleverer. Their design is really inspiring and impressive. My design and whole approach to Nuitka is totally boring in comparison.

But from a practical standpoint, is there any difference? What is the difference between Jython and PyPy? The target VM it is. PyPy's or Java's. And performance it is, of course.

So, with Python implementations all being similar, and just differing in targets, and performances, do they all have no point? I believe taken to the logical conclusion, that is what Stefan suggests. I of course think that PyPy, Nuitka, and Jython have as much of a point as CPython does.

Type Annotations done right

And just for fun. Here is a made-up use case for type annotations:

plong = long if python_version < 3 else int

@hints.signature( plong, plong )
def some_function( a ):
    return a ** 2

Notice how plong depends on an expression, that may become known during compile time or not. Should that turn out to be not possible, Nuitka can always generate code for both branches and branch when called.

Or more complex and useful like this:

def guess_signature( func ):
   types = [ None ]

   emit = types.append
   for arg in inspect.getargnames( func ):
      if arg == "l":
         emit( list )
      elif arg == "f":
         emit( float )
      elif arg == "i":
         emit( int )
      else:
         hints.warning( "Unknown type %s" % arg )
         emit( None )

   return hints.signature( *types )

def many_hints( func ):
   # Won't raise exception.
   hints.doesnot_raise( func )

   # Signature to be inferred by conventions
   guess_signature( func )( func )

   # No side effects
   hints.pure( func )

@many_hints
def some_func1( f ):
   return f + 2.0

@many_hints
def some_func2( i ):
   return i + 2

@many_hints
def some_func3( l ):
   return i + [ 2 ]

This is just a rough sketch, but hopefully you get the idea. Do this with Cython, can you?

The hints can be put into decorators, which may be discovered as inlinable, which then see more inlines. For this to work best, the loop over the compile time constant code object, needs to be unrolled, but that appears quite possible.

The signatures can therefore be done fully automatic. One could use prefix notation to indicate types.

Another way would put fixed types for certain variable names. In Nuitka code, "node", "code", "context", etc. have always the same types. I suspect many programs are the same, and it would be sweet, if you could plug something in and check such types throughout all of the package.

And then, what do you do then? Well, you can inspect these hints at run time as well, they work with CPython as well (though they won't make things faster, only will that find errors in your program), they will even work with PyPy, or at least not harm it. It will nicely JIT them away I suppose.
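How such hints could behave under plain CPython can be sketched as follows. Everything here is hypothetical: there is no hints module, the decorator name signature is borrowed from the examples above, and treating its arguments as the expected types of the positional arguments is an assumption made for this sketch. It makes nothing faster, it only records the hint for inspection and finds errors at run time.

```python
import functools


def signature(*types):
    """Hypothetical hint: expected positional argument types, None = any."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args):
            for position, (value, expected) in enumerate(zip(args, types)):
                if expected is not None and not isinstance(value, expected):
                    raise TypeError(
                        "argument %d of %s should be %s, got %s"
                        % (
                            position,
                            func.__name__,
                            expected.__name__,
                            type(value).__name__,
                        )
                    )
            return func(*args)

        # Keep the hint around, so tools can inspect it at run time.
        wrapper.__hint_signature__ = types
        return wrapper

    return decorator


@signature(int)
def square(a):
    return a ** 2


print(square(3))                  # 9
print(square.__hint_signature__)  # (<class 'int'>,)
```

A compiler could read the same decorator at compile time, while an interpreter just executes it.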

Your IDE will like the code. Syntax highlighting and auto indent will work. With every Python IDE. PyLint will find the bugs I made in that code up there. And Nuitka will compile it and benefit from the hints.

My point here really is, that cdef is not flexible, not standard, not portable. It should die. It totally is anti-Pythonic to me.

Elsewhere

In Java land, people compile to machine code as well. They probably also - like stupid me - didn't understand that static compilation would have no point. Why do they do it? Why am I using compiled binaries done with their compiler then?

And why didn't they take the chance to introduce ubercool cdef a-likes while doing it? They probably just didn't know better, did they?

No seriously. A compiler is just a compiler. It takes a source code in a language and turns it into a package to execute. That may be a required or an optional step. I prefer optional for development turn around. It should try and make code execute as fast as it can. But it should not change the language. With Cython I have to compile. With Nuitka I could.

In fact, I would be hard pressed to find another example of a compiler that extends the interpreted language compiled, just so there is a point in having it.

Conclusion

Nuitka has a point. On top of that I enjoy doing it. It's great to have the time to do this thing in the correct way.

So far, things worked out pretty well. My earlier experimentations with type inference had shown some promise. The "value friends" thing, and the whole plan, appears relatively sound, but likely is in need of an update. I will work on it in December. Up to now, and even right now I worked on re-formulations, that should have made it possible to get more release ready effects from this.

When I say correct way, I mean this. When I noticed that type inference was harder than it should be, I could take the time and re-architecture things so that it will be simpler. To me that is fun. This being my spare time allows me to do things this efficiently. That's not an excuse, it's a fact that explains my approach. It doesn't mean it makes less sense, not at all.

As for language compatibility, there is more progress with Python3. I am currently changing the class re-formulations for Python2 and Python3 (they need totally different ones due to metaclass changes) and then "test_desc.py" should pass with it too, which will be a huge achievement in that domain. I will do a post on that later.

Then, on the infrastructure side, I should complete the valgrind based benchmark automation. Numbers will become more important from now on. It starts to make sense to observe them. This is not entirely as fun. But with improving numbers, it will be good to show off.

And of course, I am going to document some more. The testing strategy of Nuitka is worth a look, because it's totally different from everything else people normally do.

Anyway. I am not a big fan of controversy. I respect Cython for all it achieved. I do want to go where it fails to achieve. I should not have to justify that, it's actually quite obvious, isn't it?

Yours, Kay

Nuitka Release 0.3.25

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler. Please see the page "What is Nuitka?" for an overview.

This release brings about changes on all fronts, bug fixes, new features. Also very importantly Nuitka no longer uses C++11 for its code, but mere C++03. There is new re-formulation work, and re-factoring of functions.

But the most important part is this: Mercurial unit tests are working. Nearly. With the usual disclaimer of me being wrong, all remaining errors are errors of the test, or minor things. Hope is that these unit tests can be added as release tests to Nuitka. And once that is done, the next big Python application can come.

Bug fixes

  • Local variables were released when an exception was raised that escaped the local function. They should only be released, after another exception was raised somewhere. Issue#39.

  • Identifiers of nested tuples and lists could collide.

    a = ((1, 2), 3)
    b = ((1,), 2, 3)
    

    Both tuples had the same name previously, now the end of the tuple is marked too. Fixed in 0.3.24.1 already.

  • The __name__ when used read-only in modules in packages was optimized to a string value that didn't contain the package name.

  • Exceptions set when entering compiled functions were unset at function exit.
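
The identifier collision for nested tuples described above can be illustrated with a small sketch. The two naming functions are hypothetical and are not Nuitka's actual naming scheme; they only show why marking the end of a tuple removes the ambiguity.

```python
def naive_name(value):
    """Hypothetical scheme that marks only the start of a tuple."""
    if isinstance(value, tuple):
        return "tuple_" + "_".join(naive_name(item) for item in value)
    return "int_%d" % value


def marked_name(value):
    """Same scheme, but the end of each tuple is marked as well."""
    if isinstance(value, tuple):
        return "tuple_" + "_".join(marked_name(item) for item in value) + "_end"
    return "int_%d" % value


a = ((1, 2), 3)
b = ((1,), 2, 3)

# Without an end marker, both constants flatten to the same identifier.
naive_collides = naive_name(a) == naive_name(b)

# With the end marker, the nesting structure stays visible in the name.
marked_collides = marked_name(a) == marked_name(b)
```

With the naive scheme both values get the same name, while the marked scheme keeps them apart.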

New Features

  • Compiled frames support. Before, Nuitka was creating frames with the standard CPython C/API functions, and tried its best to cache them. This involved some difficulties, but as it turns out, it is actually possible to instead provide a compatible type of our own, that we have full control over.

    This will become the base of enhanced compatibility. Keeping references to local variables attached to exception tracebacks is something we may be able to solve now.

  • Enhanced Python3 support, added support for nonlocal declarations and many small corrections for it.

  • Writable __defaults__ attribute for compiled functions, actually changes the default value used at call time. Changing the number of default parameters is not supported.
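
As a small illustration of the writable __defaults__ attribute, this is the plain CPython behavior that compiled functions are meant to match. The function name greet is made up for the example, and the number of defaults is kept the same, as changing it is the case described as unsupported.

```python
def greet(name="world"):
    return "Hello, %s" % name


before = greet()

# Replace the tuple of default values; the number of defaults stays the same.
greet.__defaults__ = ("Nuitka",)

# The new default is picked up at call time.
after = greet()
```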

Cleanups

  • Keep the functions along with the module and added "FunctionRef" node kind to point to them.
  • Reformulated or and and operators with the conditional expression construct which makes the "short-circuit" branch.
  • Access self in methods from the compiled function object instead of pointer to context object, making it possible to access the function object.
  • Removed "OverflowCheck" module and its usage, avoids one useless scan per function to determine the need for "locals dictionary".
  • Make "compileTree" of "MainControl" module to only do what the name says and moved the rest out, making the top level control clearer.
  • Don't export module entry points when building executable and not modules. These exports cause MinGW and MSVC compilers to create export libraries.

New Optimization

  • More efficient code for conditional expressions in conditions:

    if a if b else c

    See above, this code is now the typical pattern for each or and and, so this was much needed now.
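
The re-formulation of or and and as conditional expressions can be sketched in plain Python. This is only a model of the idea, not the node tree transformation itself: the helper names are hypothetical, and the operands are passed as callables so that the example can show which side gets evaluated.

```python
def or_reformulated(left, right):
    # "a or b" evaluates "a" once, and only evaluates "b" if "a" is false.
    tmp = left()
    return tmp if tmp else right()


def and_reformulated(left, right):
    # "a and b" evaluates "a" once, and only evaluates "b" if "a" is true.
    tmp = left()
    return right() if tmp else tmp


calls = []


def traced(name, value):
    """Return a callable that records its evaluation and yields value."""
    def thunk():
        calls.append(name)
        return value
    return thunk


# Left side is false, so the right side must be evaluated.
or_result = or_reformulated(traced("left", 0), traced("right", "fallback"))

# Left side is false, so the right side must not be evaluated.
and_result = and_reformulated(traced("left", 0), traced("right", "unused"))
```

The temporary variable is what keeps the left operand from being evaluated twice, which a naive "a if a else b" would do.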

Organizational

  • The remaining uses of C++11 have been removed. Code generated with Nuitka and complementary C++ code now compile with standard C++03 compilers. This lowers the Nuitka requirements and enables at least g++ 4.4 to work with Nuitka.

  • The usages of the GNU extension operation a ?: b have been replaced with standard C++ constructs. This is needed to support MSVC which doesn't have this.

  • Added examples for the typical use cases to the "User Manual".

  • The "compare_with_cpython" script has gained an option to immediately remove the Nuitka outputs (build directory and binary) if successful. Also the temporary files are now put under "/var/tmp" if available.

  • Debian package improvements, registering with "doc-base" the "User Manual" so it is easier to discover. Also suggest "mingw32" package which provides the cross compiler to Windows.

  • Partial support for MSVC (Visual Studio 2008 to be exact, the version that works with CPython2.6 and CPython2.7).

    All basic tests that do not use generators are working now, but those will currently cause crashes.

  • Renamed the --g++-only option to --c++-only.

    The old name is no longer correct after clang and MSVC have gained support, and it could be misunderstood to influence compiler selection, rather than causing the C++ source code to not be updated, so manual changes will be used. This solves Issue#47.

  • Catch exceptions for continue, break, and return only where needed for try/finally and loop constructs.

New Tests

  • Added CPython3.2 test suite as "tests/CPython32" from 3.2.3 and run it with CPython2.7 to check that Nuitka gives compatible error messages. It is not expected to pass yet on Python3.2, but work will be done towards this goal.
  • Make CPython2.7 test suite runner also execute the generated "doctest" modules.
  • Enabled tests for default parameters and their reference counts.

Summary

This release marks an important point. The compiled frames are exciting new technology, that will allow even better integration with CPython, while improving speed. Lowering the requirements to C++03 means, we will become usable on Android and with MSVC, which will make adoption of Nuitka on Windows easier for many.

Structurally the outstanding part is the function as references cleanup. This was a blocker for value propagation, because now function references can be copied, whereas previously this was duplicating the whole function body, which didn't work, and wasn't acceptable. Now, work can resume in this domain.

Also very exciting when it comes to optimization is the removal of special code for or and and operators, as these are now only mere conditional expressions. Again, this will make value propagation easier with two special cases less.

And then of course, with Mercurial unit tests running compiled with Nuitka, an important milestone has been hit.

For a while now, the focus will be on completing Python3 support, XML based optimization regression tests, benchmarks, and other open ends. Once that is done, and more certainty about Mercurial tests support, I may call it a 0.4 and start with local type inference for actual speed gains.

Letting go of C++11

This post is about how Nuitka, the Python compiler, started out using C++0x, which is now C++11, and then chose to stop using it.

In the Beginning

Very early on, when I considered how to generate code from the node tree, in a way, that mistakes should practically be impossible to make, I made the fundamental decision, that every Python expression, which produces temporary variables, should become an expression in the generated code too.

Note

That is my choice, I think it keeps code generation more simple, and easier to understand. There may come a separate post about how that played out.

That decision meant some trouble. Certain things were not easy, but generally, it was achievable for g++ relatively quickly, and then lots of helper functions would be needed. Think of MAKE_TUPLE and MAKE_DICT, but also other stuff needed that. Calling a Python built-in with variable number of parameters e.g. could be implemented that way easily.

Other nice things were enum classes, and generally good stuff. It was really quick to get Nuitka code generation off the ground this way.

Reality Strikes

But then, as time went on, I found that the order of evaluation was becoming an issue. It became apparent that for more and more things, I needed to reverse it, so it works. Porting to ARM, it then became clear, that it needs to be the other way around for that platform. And checking out clang, which is also a C++11 compiler, I noticed, this one uses yet a different one.

So, for normal functions, I found a solution that involves the pre-processor to reverse or not, both function definition and call sites, and then it is already correct.

This of course, doesn't work for C++11 variadic functions. So, there came a point, where I had to realize, that each of their uses was more or less causing evaluation order bugs, so that most of their uses had already been removed. And so I basically knew they couldn't stay that way.

Other Features

Also, things I initially assumed, e.g. that lambda functions of C++11 may prove useful, or even "auto", didn't turn out to be true. There seemingly is a wealth of new features besides variadic templates, but I didn't see how Nuitka would benefit from them at all.

New Wishes

Then, at Europython, I realized, that Android is still stuck with g++-4.4 and as such, that an important target platform will be unavailable to me. This platform will become even more important, as I intend to buy a device now.

Biting it

So what I did, was to remove all variadic functions and instead generate code for them as necessary. I just need to trace the used argument counts, and then provide those, simple enough.
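
The idea of tracing the used argument counts and then providing one helper per count can be sketched as follows. This is only an illustration under assumptions: the class HelperTracker, its method names, and the numbered helper names such as MAKE_TUPLE2 are hypothetical and not taken from Nuitka's code generation.

```python
class HelperTracker:
    """Hypothetical tracker of argument counts seen during code generation."""

    def __init__(self):
        self.used_counts = set()

    def get_call_code(self, arg_codes):
        # Remember the count, and emit a call to the fixed-arity helper.
        count = len(arg_codes)
        self.used_counts.add(count)
        return "MAKE_TUPLE%d( %s )" % (count, ", ".join(arg_codes))

    def get_helper_declarations(self):
        # Only the arities that were actually used get a declaration.
        declarations = []
        for count in sorted(self.used_counts):
            params = ", ".join(
                "PyObject *element%d" % index for index in range(count)
            )
            declarations.append(
                "static PyObject *MAKE_TUPLE%d( %s );" % (count, params)
            )
        return declarations


tracker = HelperTracker()
call_a = tracker.get_call_code(["a", "b"])
call_b = tracker.get_call_code(["x", "y", "z"])
call_c = tracker.get_call_code(["c", "d"])
declarations = tracker.get_helper_declarations()
```

Each helper then is a normal function with a fixed parameter list, so the evaluation order handling used for normal functions applies to it as well.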

Also, other things like deleted copy constructors, and so on, I had to give up on these a bit.

This change was probably suited to remove subtle evaluation order problems, although I don't recall seeing them.

The Present

The current stable release still requires C++11, but the next release will work on g++-4.4 and compiles fine with MSVC from Visual Studio 2008, although at this time, there is still the issue of generators not working yet, but I believe that ought to be solvable.

The new requirement is only C++03, which means, there is a good chance that supporting Android will become feasible. I know there is interest from App developers, because there, even the relatively unimportant 2x speedup, that Nuitka might give for some code, may matter.

Conclusion

So that is a detour, I have taken, expanding the base of Nuitka even further. I felt, this was important enough to write down the history part of it.

Nuitka Release 0.3.24

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler. Please see the page "What is Nuitka?" for an overview.

This release contains progress on many fronts, except performance.

The extended coverage from running the CPython 2.7 and CPython 3.2 (partially) test suites shows in a couple of bug fixes and general improvements in compatibility.

Then there is a promised new feature that allows compiling whole packages.

Also there is more Python3 compatibility, the CPython 3.2 test suite now succeeds up to "test_builtin.py", where it finds that str doesn't support the new parameters it has gained. Future releases will improve on this.

And then of course, more re-formulation work, in this case, class definitions are now mere simple functions. This and later function references, is the important and only progress towards type inference.

Bug fixes

  • The compiled method type can now be used with copy module. That means, instances with methods can now be copied too. Issue#40. Fixed in 0.3.23.1 already.

  • The assert statement as of Python2.7 creates the AssertionError object from a given value immediately, instead of delayed as it was with Python2.6. This makes a difference for the form with 2 arguments, and if the value is a tuple. Issue#41. Fixed in 0.3.23.1 already.

  • Sets written like this didn't work unless they were predicted at compile time:

    { value }
    

    This apparently rarely used Python2.7 syntax didn't have code generation yet and crashed the compiler. Issue#42. Fixed in 0.3.23.1 already.

  • For Python2, the default encoding for source files is ascii, and it is now enforced by Nuitka as well, with the same SyntaxError.

  • Corner cases of exec statements with nested functions now give proper SyntaxError exceptions under Python2.

  • The exec statement with a tuple of length 1 as argument, now also gives a TypeError exception under Python2.

  • For Python2, the del of a closure variable is a SyntaxError.
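
The difference in the assert statement with a tuple value, as described in the list above, can be modeled with two small functions. They are hypothetical helpers that only mimic how the exception object ends up being created in each case; the Python2.6 variant follows the rule that a tuple given as the value of a raise statement is used as the argument list of the exception class.

```python
def immediate(value):
    # Python2.7 style: the exception object is created from the value directly.
    return AssertionError(value)


def delayed(value):
    # Python2.6 style: the value is handed to the raise machinery, which
    # unpacks a tuple into the constructor arguments.
    if isinstance(value, tuple):
        return AssertionError(*value)
    return AssertionError(value)


message = ("first", "second")

immediate_args = immediate(message).args
delayed_args = delayed(message).args
```

For non-tuple values both variants give the same result, which is why the difference only shows with the 2 argument form and a tuple.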

New Features

  • Added support for creating compiled packages. If you give Nuitka a directory with an "__init__.py" file, it will compile that package into a ".so" file. Adding the package contents with --recurse-dir allows compiling complete packages now. Later there will likely be a cleaner interface, where the latter is automatic.
  • Added support for providing directories as main programs. It's OK if they contain a "__main__.py" file, then it's used instead, otherwise a compatible error message is given.
  • Added support for optimizing the super built-in. It was already working correctly, but not optimized on CPython2. But for CPython3, the variant without any arguments required dedicated code.
  • Added support for optimizing the unicode built-in under Python2. It was already working, but will become the basis for the str built-in of Python3 in future releases.
  • For Python3, lots of compatibility work has been done. The Unicode issues appear to be ironed out now. The del of closure variables is allowed and supported now. Built-ins like ord and chr work more correctly and attributes are now interned strings, so that monkey patching classes works.
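
As a reminder of the two forms of the super built-in mentioned above, here is a minimal example with made-up class names. The variant without arguments only exists in Python3, where the compiler has to make the defining class available to the method, which is likely why it needed dedicated code.

```python
class Base(object):
    def describe(self):
        return "Base"


class Explicit(Base):
    def describe(self):
        # Form that works on both Python2 and Python3.
        return "Explicit > " + super(Explicit, self).describe()


class Implicit(Base):
    def describe(self):
        # Python3 only: class and instance are found without arguments.
        return "Implicit > " + super().describe()


explicit_result = Explicit().describe()
implicit_result = Implicit().describe()
```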

Organizational

  • Migrated "bin/benchmark.sh" to Python as "misc/run-valgrind.py" and made it a bit more portable that way. Prefers "/var/tmp" if it exists and creates temporary files in a secure manner. Triggered by the Debian "insecure temp file" bug.

  • Migrated "bin/make-dependency-graph.sh" to Python as "misc/make-dependency-graph.py" and made it more portable and powerful that way.

    The filtering is done in a more robust way. Also it creates temporary files in a secure manner, also triggered by the Debian "insecure temp file" bug.

    And it creates SVG files and no longer PostScript as the first one is more easily rendered these days.

  • Removed the "misc/gist" git sub-module, which was previously used by "misc/make-doc.py" to generate HTML from "User Manual" and "Developer Manual".

    These are now done with Nikola, which is much better at it and it integrates with the web site.

  • Lots of formatting improvements to the change log, and manuals:

    • Marking identifiers with better suited ReStructured Text markup.
    • Added links to the bug tracker for all Issues.
    • Unified wordings, quotation, across the documents.
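
For the two migrated scripts above, preferring "/var/tmp" and creating temporary files in a secure manner could look roughly like this. It is a sketch with the standard tempfile module and hypothetical function names, not the code of the scripts themselves.

```python
import os
import tempfile


def get_temp_dir():
    """Prefer /var/tmp if it exists and is writable, else the system default."""
    if os.path.isdir("/var/tmp") and os.access("/var/tmp", os.W_OK):
        return "/var/tmp"
    return tempfile.gettempdir()


def create_temp_file(suffix):
    # mkstemp creates the file itself under a unique name, accessible only
    # by the current user, which avoids predictable file name attacks.
    handle, filename = tempfile.mkstemp(suffix=suffix, dir=get_temp_dir())
    os.close(handle)
    return filename


log_filename = create_temp_file(".log")
```

The caller is responsible for removing the file again, e.g. with os.unlink, once it is no longer needed.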

Cleanups

  • The creation of the class dictionaries is now done with normal function bodies, that only needed to learn how to throw an exception when directly called, instead of returning NULL.

    Also the assignment of __module__ and __doc__ in these has become visible in the node tree, allowing their proper optimization.

    These re-formulation changes made it possible to remove all sorts of special treatment of class code in the code generation phase, making things a lot simpler.

  • There was still a declaration of PRINT_ITEMS and uses of it, but no definition of it.

  • Code generation for "main" module and "other" modules are now merged, and no longer special.

  • The use of raw strings was found unnecessary and potentially still buggy and has been removed. The dependence on C++11 is getting less and less.

New Tests

  • Updated CPython2.6 test suite "tests/CPython26" to 2.6.8, adding tests for recent bug fixes in CPython. No changes to Nuitka were needed in order to pass, which is always good news.
  • Added CPython2.7 test suite as "tests/CPython27" from 2.7.3, making it public for the first time. Previously a private copy of some age, with many no longer needed changes had been used by me. Now it is up to par with what was done before for "tests/CPython26", so this pending action is finally done.
  • Added test to cover Python2 syntax error of having a function with closure variables nested inside a function that is an overflow function.
  • Added test "BuiltinSuper" to cover super usage details.
  • Added test to cover del on nested scope as syntax error.
  • Added test to cover exec with a tuple argument of length 1.
  • Added test to cover barry_as_FLUFL future import to work.
  • Removed "Unicode" from known error cases for CPython3.2, it's now working.

Summary

This release brought forward the most important remaining re-formulation changes needed for Nuitka. Removing class bodies, makes optimization yet again simpler. Still, making function references, so they can be copied, is missing for value propagation to progress.

Generally, as usual, a focus has been laid on correctness. This is also the first time I am releasing with a known bug though: That is Issue#39, which I now believe may be the root cause of the Mercurial tests not yet passing.

The solution will be involved and take a bit of time. It will be about "compiled frames" and be an (invasive) solution. It likely will make Nuitka faster too. But this release includes lots of tiny improvements, for Python3 and also for Python2. So I wanted to get this out now.

As usual, please check it out, and let me know how you fare.

Python 3 wonders - Breaking str

That just killed some hope inside of me, breaking code that uses str ought to be forbidden.

Update

Turns out, this is only a bug, and not intentional. And the bug is only in the doc string, so it's being fixed, and there is no inconsistency then any more.

Python 3:

Python 3.2.3 (default, Jun 25 2012, 23:10:56)
[GCC 4.7.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> print( str.__doc__ )
str(string[, encoding[, errors]]) -> str

Create a new string object from the given encoded string.
encoding defaults to the current default string encoding.
errors can be 'strict', 'replace' or 'ignore' and defaults to 'strict'.
>>> str( string = "a" )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'string' is an invalid keyword argument for this function

Python 2:

Python 2.7.3 (default, Jul 13 2012, 17:48:29)
[GCC 4.7.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> print str.__doc__
str(object) -> string

Return a nice string representation of the object.
If the argument is a string, the return value is the same object.
>>> str( object = "a" )
'a'

I do understand that it's in fact just the old unicode built-in. In fact, I made it work like that for Nuitka just now. But there is a difference, for Python2, it was well behaved.

Python 2:

>>> print unicode.__doc__
unicode(string [, encoding[, errors]]) -> object

Create a new Unicode object from the given encoded string.
encoding defaults to the current default string encoding.
errors can be 'strict', 'replace' or 'ignore' and defaults to 'strict'.
>>> unicode( string = "a" )
u'a'

Is Python3 supposed to be more clean or what? I think it is not happening.

Yours, Kay Hayen

Python 3 wonders - Barry BDFL

While writing Nuitka I get to see an absurd amount of CPython code. For a while now, it's also CPython3.2 that I look at. Checking out __future__ handling, I was surprised the other day though, this really works:

# Python 3.2.3 (default, Jun 25 2012, 23:10:56)
# [GCC 4.7.1] on linux2
>>> from __future__ import barry_as_FLUFL
>>> 1 <> 2
True
>>> 1 != 2
  File "<stdin>", line 1
    1 != 2
       ^
SyntaxError: invalid syntax

It's new in CPython3, and this is the code that makes it possible, from the Python parser:

if (type == NOTEQUAL) {
    if (!(ps->p_flags & CO_FUTURE_BARRY_AS_BDFL) &&
                    strcmp(str, "!=")) {
        PyObject_FREE(str);
        err_ret->error = E_SYNTAX;
        break;
    }
    else if ((ps->p_flags & CO_FUTURE_BARRY_AS_BDFL) &&
                    strcmp(str, "<>")) {
        PyObject_FREE(str);
        err_ret->text = "with Barry as BDFL, use '<>' "
                        "instead of '!='";
        err_ret->error = E_SYNTAX;
        break;
    }
}

Now who would think bad of that, would you? The fun aspect is that Nuitka easily supports it. By re-using the Python parser, it works out of the box, I only needed to add the flag value.

For fun, I tried to add a test that confirms it, and then noticed:

  • It doesn't really work for CPython3.2 already.
  • The flag is only used for eval and exec and not on the same level, so it's only inherited.
  • And "2to3" kindly removes that flag silently. It probably should raise an error.
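
The flag can also be passed explicitly when compiling code, which is one way to try this outside the interactive prompt. The sketch below assumes a CPython 3 where the __future__ module provides barry_as_FLUFL and compile accepts its compiler flag; the helper function names are made up.

```python
import __future__

# The compiler flag that the future import would set.
FLUFL_FLAG = __future__.barry_as_FLUFL.compiler_flag


def evaluate_with_flufl(source):
    """Compile and evaluate an expression with the flag enabled."""
    code = compile(source, "<flufl demo>", "eval", FLUFL_FLAG)
    return eval(code)


def is_syntax_error(source):
    """Tell whether the expression is rejected with the flag enabled."""
    try:
        evaluate_with_flufl(source)
    except SyntaxError:
        return True
    return False


diamond_result = evaluate_with_flufl("1 <> 2")
not_equal_rejected = is_syntax_error("1 != 2")
```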

So yeah, oddities in Python3!