16 April 2011

Looking where Nuitka stands

In case you wonder [what Nuitka is](/pages/overview.html), look there. Over the 0.3.x release cycle, I have mostly measured its performance with “pystone”. I merely wanted a target to look at, and to enjoy the progress we have made there.

Then, in the context of the Windows port, Khalid Abu Bakr ran pybench on Windows, and that got me interested. It’s a nice collection of micro benchmarks, quite obviously aimed at comparing CPython implementations only. As such, it’s quite good for checking where Nuitka is already good, and where it can still improve for the milestone 2 stuff.

Enhancements to PyBench

pybench refused to accept that Nuitka could use so little time on some tests, so I needed to hack it to allow that.
Then it raised “ZeroDivisionError” exceptions, because Nuitka does not run fully predictable code at all, which results in a time of 0ms and gives interesting factors.
Also, there are many results and we are going to care about regressions only, so there is now an option to output only the tests with negative values.
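The last two changes can be sketched in a few lines of Python. This is not the actual patch to pybench; the names `diff_percent` and `regressions_only` are made up for illustration, and the formula is an assumption inferred from the numbers in the table further down (for example 64ms against 610ms giving -89.5%).

```python
import math


def diff_percent(cpython_ms, nuitka_ms):
    """Hypothetical helper: how much faster Nuitka is, in percent.

    Assumed formula: (cpython / nuitka - 1) * 100, positive means faster.
    """
    if nuitka_ms == 0:
        # Fully optimized away code takes no measurable time; report
        # "infinity" instead of raising ZeroDivisionError.
        return math.inf
    return (cpython_ms / nuitka_ms - 1.0) * 100.0


def regressions_only(results):
    """Keep only the tests where Nuitka is slower (negative diff)."""
    return [
        (name, diff_percent(cpython_ms, nuitka_ms))
        for name, cpython_ms, nuitka_ms in results
        if diff_percent(cpython_ms, nuitka_ms) < 0
    ]


# A few (test name, min CPython ms, min Nuitka ms) rows from the table.
results = [
    ("CompareFloats", 79, 0),
    ("ForLoops", 47, 15),
    ("ListSlicing", 69, 70),
    ("TryRaiseExcept", 64, 610),
]

for name, diff in regressions_only(results):
    print(f"{name}: {diff:+.1f}%")
```

With these four rows, only ListSlicing and TryRaiseExcept are reported, while the 0ms case is quietly treated as an infinite improvement.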

The Interesting Parts

Nuitka currently has some fields where optimizations are already so effective as to render the whole benchmark pointless. Long term, most of PyBench will not be looked at anymore: where the factor becomes “infinity”, there is little point in looking at it. We will likely just use it as a test that optimizations didn’t suddenly regress. Publishing the numbers will not be as interesting.
Then there are slowdowns. These I take seriously, because of course I expect Nuitka to only ever be faster than CPython. Sometimes, though, Nuitka’s implementation of some rarely used features is subpar. I color coded these in red in the table below.
ComplexPythonFunctionCalls: These are twice as slow, which is a tribute to the fact that the code in this domain is only as good as it needs to be. Of course function calls are very important, and this needs to be addressed.
TryRaiseExcept: This is much slower because of the cost of the raise statement, which is currently extremely high. For every raise, a frame object with a specific code object is created, so that the traceback points to the correct location. This is very inefficient and wasteful. We need to be able to create code objects that can be used for all lines needed; then we can re-use them and have only one frame object per function, which can then be re-used itself. There is already some work on that in [current git](/doc/download.html) (0.3.9 pre 2), but it’s not yet complete at all.
WithRaiseExcept: Same problem as TryRaiseExcept; raising the exception is too expensive.
Note also that -90% is in fact much worse than +90%: the “diff” numbers from pybench make improvements look much better than regressions do. You can also check out the comparison on the new [benchmark pages](https://speedcenter.nuitka.net) that I am just creating. They are based on codespeed, which I will blog about separately.
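To make the asymmetry of the “diff” numbers concrete, here is a small sketch that converts a diff back into a run time ratio. It assumes that pybench computes the diff as (CPython time / Nuitka time - 1) * 100, which is consistent with the table values but is an inference, not taken from the pybench source; the function name `time_ratio` is hypothetical.

```python
def time_ratio(diff_percent):
    """Nuitka run time divided by CPython run time for a given diff.

    Assumes diff = (cpython / nuitka - 1) * 100, so that
    nuitka / cpython = 1 / (1 + diff / 100).
    """
    return 1.0 / (1.0 + diff_percent / 100.0)


# -90% means taking about ten times as long as CPython.
print(f"{time_ratio(-90.0):.2f}")  # 10.00

# +90% means taking only about half the time, not a tenth.
print(f"{time_ratio(90.0):.2f}")  # 0.53
```

Under that assumption a regression can never go below -100%, while an improvement is unbounded, which is why the slowdowns deserve a closer look than their numbers suggest.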

Look at this table of results as produced by pybench:

Benchmark Results

Test Name	min CPython	min Nuitka	diff
BuiltinFunctionCalls	76ms	54ms	+41.0%
BuiltinMethodLookup	57ms	47ms	+22.1%
CompareFloats	79ms	0ms	+inf%
CompareFloatsIntegers	75ms	0ms	+inf%
CompareIntegers	76ms	0ms	+inf%
CompareInternedStrings	68ms	32ms	+113.0%
CompareLongs	60ms	0ms	+inf%
CompareStrings	86ms	62ms	+38.2%
CompareUnicode	61ms	50ms	+21.9%
ComplexPythonFunctionCalls	86ms	179ms	-52.3%
ConcatStrings	98ms	99ms	-0.6%
ConcatUnicode	127ms	124ms	+2.3%
CreateInstances	76ms	52ms	+46.8%
CreateNewInstances	58ms	47ms	+22.1%
CreateStringsWithConcat	85ms	90ms	-6.5%
CreateUnicodeWithConcat	74ms	68ms	+9.5%
DictCreation	58ms	36ms	+60.9%
DictWithFloatKeys	67ms	44ms	+51.7%
DictWithIntegerKeys	64ms	30ms	+113.8%
DictWithStringKeys	60ms	26ms	+130.6%
ForLoops	47ms	15ms	+216.2%
IfThenElse	67ms	16ms	+322.5%
ListSlicing	69ms	70ms	-0.9%
NestedForLoops	72ms	25ms	+187.4%
NestedListComprehensions	87ms	42ms	+105.9%
NormalClassAttribute	62ms	77ms	-18.9%
NormalInstanceAttribute	56ms	24ms	+129.7%
PythonFunctionCalls	72ms	34ms	+116.1%
PythonMethodCalls	84ms	38ms	+120.0%
Recursion	97ms	56ms	+73.1%
SecondImport	61ms	47ms	+31.6%
SecondPackageImport	66ms	29ms	+125.4%
SecondSubmoduleImport	86ms	32ms	+172.0%
SimpleComplexArithmetic	74ms	62ms	+18.3%
SimpleDictManipulation	65ms	35ms	+89.7%
SimpleFloatArithmetic	77ms	56ms	+39.3%
SimpleIntFloatArithmetic	58ms	39ms	+48.3%
SimpleIntegerArithmetic	59ms	37ms	+57.7%
SimpleListComprehensions	75ms	33ms	+128.7%
SimpleListManipulation	57ms	27ms	+109.4%
SimpleLongArithmetic	68ms	57ms	+19.9%
SmallLists	69ms	41ms	+66.6%
SmallTuples	66ms	98ms	-32.2%
SpecialClassAttribute	63ms	49ms	+29.1%
SpecialInstanceAttribute	130ms	24ms	+434.5%
StringMappings	67ms	62ms	+8.5%
StringPredicates	69ms	59ms	+16.6%
StringSlicing	73ms	47ms	+54.8%
TryExcept	57ms	0ms	+3821207.1%
TryFinally	65ms	26ms	+153.4%
TryRaiseExcept	64ms	610ms	-89.5%
TupleSlicing	76ms	67ms	+12.7%
UnicodeMappings	88ms	91ms	-2.9%
UnicodePredicates	64ms	59ms	+8.8%
UnicodeProperties	69ms	63ms	+8.8%
UnicodeSlicing	80ms	68ms	+17.6%
WithFinally	84ms	26ms	+221.2%
WithRaiseExcept	67ms	1178ms	-94.3%
