
I did some work on a Python JIT in the past. The two biggest challenges were:

- Python is much, much more dynamic than Javascript. You can override just about anything in Python, including the meaning of accessing a property. You have overloaded operators (with pretty complex resolution rules), metaclasses, and more. And they're all used extensively. There are some Javascript equivalents to those things, but they either have fewer deoptimization cases or are features that aren't commonly used in practice (e.g. Proxy objects).

- Python has a ton of important libraries implemented as C extensions. These libraries tend to depend on undefined behavior of the CPython interpreter (e.g. destruction order which is more deterministic with ref counting) or do things that happen to work but are clearly not supposed to be done (e.g. defining a full Python object as a static variable).
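The first point, about dynamism, can be seen in a few lines of ordinary Python. This is just an illustrative toy (the class names are made up): property access, operators, and even class creation itself can all be intercepted, and a JIT has to be prepared to deoptimize on any of them.

```python
class Meta(type):
    # Metaclass: runs when the class object itself is created.
    def __new__(mcls, name, bases, ns):
        ns["created_by_meta"] = True
        return super().__new__(mcls, name, bases, ns)

class Sneaky(metaclass=Meta):
    def __getattr__(self, name):
        # Intercepts any attribute lookup that misses normally.
        return f"made up: {name}"

    def __add__(self, other):
        # Operator overloading: '+' can mean anything.
        return "not addition"

s = Sneaky()
print(s.anything)              # attribute access intercepted
print(s + 1)                   # '+' redefined
print(Sneaky.created_by_meta)  # class was modified by its metaclass
```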
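The destruction-order point in the second bullet shows up even in pure Python. Under CPython's reference counting, dropping the last reference runs `__del__` immediately and in a predictable order; code (including C extensions) sometimes leans on this even though the language doesn't guarantee it, and on a tracing GC like PyPy's the finalizer may run much later.

```python
log = []

class Resource:
    def __del__(self):
        # Finalizer: on CPython this runs as soon as the refcount hits zero.
        log.append("destroyed")

r = Resource()
del r                    # on CPython, __del__ runs right here
log.append("after del")  # so "destroyed" is already in the log
```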

I guess there are also economic incentives: there hasn't been an incentive for anybody to staff a 50-person project to build a Python JIT, given that it's cheaper to rewrite some or all of the application in C/C++/Rust/Go, whereas that's not an option in Javascriptland.



I completely agree with you and I'd argue that #2 and #3 are the two biggest reasons.

It is easy to forget the colossal amount of engineering resources that browser vendors have spent creating and rewriting their Javascript engines. And due to the nature of how their JITs work, all that work is tied down to the specific Javascript environment they were written for. (For example, you can't really reuse the v8 codebase to create a Python JIT)

And in Python's case a lot of the appeal of the language rests on the extensive library ecosystem, which has a significant number of extensions written in C. Generally speaking JIT compilers aren't very good at optimizing code that spends a lot of time inside or interacting with C extensions, even if we ignore the significant issues you mentioned regarding undefined behavior.


Why is Javascript slower than Java, despite both being garbage collected and V8 having more engineers than e.g. OpenJDK?


A couple of obvious reasons would be compiled vs interpreted, and static vs dynamic, both of which bear some inherent runtime performance cost.


Compiled vs interpreted is not a useful distinction. Until fairly recently, when Ignition was introduced, V8 had no interpreter. All the JS was compiled by the first tier, "full codegen".

The bigger difference is that the JVM is heavily optimized for performance after a long warmup and V8 needs to produce relatively fast code early during page loading.

Java being much more static certainly helps warmup time but ultimately doesn’t really affect final performance. LuaJIT can beat C in some cases once it has time to compile all traces needed.


> V8 needs to produce relatively fast code early during page loading

So for e.g. backend JS programs, is it possible to ask V8 to take more time to optimize?


Why would you want it to take longer to optimize?


To give it more time to generate better code, and thus a faster program (but a slower launch time).


Most VMs will recompile long-running hot code; I’d assume V8 does this as well.


Modern JS engines only interpret cold functions, and static vs dynamic doesn't matter once type information has been collected.

The real reason is that HotSpot has 10+ more years of work put into it than V8.


Static vs dynamic still matters if the program is doing "dynamic stuff", where "dynamic stuff" means anything that the JIT compiler is currently not able to optimize.


That's fair, but that's also stuff that you just can't really do in Java in general¹, so it's not useful for a comparison. The fact is that the vast majority of JavaScript code is pretty static and there's nothing preventing it from running as fast as Java other than man-decades of compiler engineering.

¹ Possibly excluding reflection. It's been a long time since I used the Java reflection APIs and I have no idea if you can do things like add class fields named after arbitrary strings at runtime. Even if you could, presumably this bails out of jitted code so the situation is basically the same as in JS.


Dynamic types are probably the first-order reason, with some unfortunate language choices being the second-order reason.


Dynamic typing means you pay the cost of trace recording or profiling to collect the type info. The actual code performance should ultimately be the same. JIT can remove the overhead of dynamic dispatch and replace it with a fixed call and a guard, for example. This isn’t possible with dynamically loaded C libraries.
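The guard-plus-fixed-call idea can be sketched in a few lines of Python. This is only a toy model of what a JIT emits as machine code; `jitted_add` and `deoptimize` are illustrative names, not any real JIT's API. After profiling has shown both operands are always `int`, the generic dispatch is replaced by a cheap type check guarding a direct, specialized call.

```python
def generic_add(a, b):
    # Full dynamic dispatch: __add__/__radd__ lookup, coercion rules, etc.
    return a + b

def deoptimize(a, b):
    # Guard failed: fall back to the generic slow path.
    return generic_add(a, b)

def jitted_add(a, b):
    if type(a) is int and type(b) is int:  # the guard
        return int.__add__(a, b)           # fixed, specialized call
    return deoptimize(a, b)
```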


> JIT can remove the overhead of dynamic dispatch and replace it with a fixed call and a guard, for example.

Only when the guard isn’t triggered constantly. With an actual type system you can remove many of these guards altogether instead of having them everywhere and falling back to the slow case when you get something unexpected.


There are actually two problems here: How to handle things being re-defined and how to handle an unexpected type after speculation.

In real high performance VMs the guard for redefinition is effectively a single instruction, which a CPU can easily branch-predict and handle with out-of-order execution: https://chrisseaton.com/truffleruby/low-overhead-polling/

With unexpected types we can use LuaJIT as an example: The type speculation guard will be turned into a conditional branch to a side trace. The slow path quickly becomes another fast path.
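A toy sketch of that side-trace idea, in Python rather than machine code (the trace table and `specialize` helper are illustrative inventions, not LuaJIT internals): the first time a type misses the guard, a specialized path is compiled for it, so repeated "misses" become a second fast path rather than a permanent slow path.

```python
traces = {}  # type -> specialized add function ("compiled traces")

def specialize(t):
    # Pretend-compile a trace specialized for operand type t.
    if t is int:
        return lambda a, b: int.__add__(a, b)
    if t is str:
        return lambda a, b: str.__add__(a, b)
    return lambda a, b: a + b  # generic fallback

def run_add(a, b):
    fn = traces.get(type(a))
    if fn is None:                  # guard "failure": no trace yet
        fn = specialize(type(a))    # compile a side trace for this type
        traces[type(a)] = fn
    return fn(a, b)                 # later calls with this type are fast
```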


> For example, you can't really reuse the v8 codebase to create a Python JIT

Actually, come to think of it... V8 also runs WASM right? Right now I think WASM is missing a few features (like garbage collection) which Python would need to be efficiently compiled to WASM, but once those are solved...


At a conceptual level that would be no different than having a Python implementation targeting the JVM or CLR runtimes (aka Jython and IronPython).

The usual situation for these alternate implementations is that they make it easier to interact with other code that targets those runtimes, but that they do not speed up the average speed of the interpreter. The previously-mentioned compatibility and performance issues for C extensions also remain.


There was a brief point in time where IronPython was faster than CPython on several major benchmarks, precisely because of the better CLR JIT and GC, plus some smart tricks implemented in the "DLR". (It's almost sad; IronPython and the "DLR" have been left to grow so many weeds since that time.)


I definitely should have been clearer in my other comment. Alternative interpreters such as IronPython can be faster but most of the time the speed stays in the same "order of magnitude" as the original C interpreter. On the other hand, good JIT can deliver a 10x or more speedup in the best case scenarios where it manages to get rid of the dynamic typing overhead. (For subtle technical reasons, running the language interpreter on top a jitting VM like the CLR is not enough. The underlying JIT has a hard time looking further than the IronPython interpreter itself and making optimizations at the "Python level")
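A toy illustration of that opacity problem (a hypothetical mini-bytecode, not the real IronPython/CLR machinery): when a guest interpreter runs on a jitting VM, the host JIT mostly sees one generic dispatch loop. Every guest-level operation funnels through the same few call sites, so the guest program's types and call targets look hopelessly polymorphic to it.

```python
def run(bytecode):
    # The host JIT can optimize this loop, but not the guest program
    # flowing through it: the '+' below is one megamorphic site shared
    # by every addition in every guest program.
    stack = []
    for op, arg in bytecode:
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
    return stack.pop()

program = [("PUSH", 2), ("PUSH", 3), ("ADD", None)]
```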


That's where some of the most "DLR" magic filled in (cached) and optimized a lot of the "Python level" in a way that the CLR JIT could take advantage. The DLR briefly was a huge bundle of hope for some really interesting business logic caching. In a past life I did some really wild stuff with DLR caching for a complicated business workflow tool. It's dark matter in an enterprise that I'm sure all of it is still running, but I'm not sure if the performance has kept up over time (and have no way to ask, and probably don't care) as the CLR declared "mission complete" on the DLR and maybe hasn't kept it quite as optimized since the IronPython heyday.


In Ruby land, JRuby is significantly faster than CRuby since 9.0. It has a proper IR for optimization and can inline etc.


Couldn't you compile the C extensions to WASM too? Then it'd just be WASM code interacting with WASM code.


I don't think it's possible to efficiently compile a dynamically-typed language like Python to statically-typed WASM.


Thought experiment: what about transpiling Python to JS? https://github.com/QQuick/Transcrypt looks like a nice implementation, but their readme just talks about deployment to browsers — I'm curious about outside of the browser whether Transcrypt + Node might be more efficient than CPython.

(Not even just CPython, but really any dynamic language implementation.)

And then of course wasm could still be used for C extensions.


Not arbitrary code, but code specifically written with type checking in mind should be doable. There was the asm.js subset, for example.


Calling asm.js a subset of Javascript is a bit of a stretch. It looks nothing like idiomatic Javascript, and was more of a low-level statically-typed language dressed up in Javascript clothing.

At some point they realized that representing this low-level code as Javascript text instead of as specially-designed bytecode added a significant amount of parsing and compilation overhead, which was one of the initial motivations for the creation of WebAssembly. If I had to sum up WASM in one sentence, it is that it is kind of like the JVM, except that its instruction set was designed specifically for running programs downloaded from the web. Special attention was paid to security and startup latency.


Erm... the JVM was also designed specifically for running programs downloaded from the web. Special attention was paid to security and startup latency. Javascript got its name because it was the only other language designed to be downloaded and executed in a browser!


RPython might be a better example.


You’d have to compile the CPython interpreter to WASM.


> And they're all used extensively.

That's the key. MicroPython is significantly less dynamic than full Python, and would be much easier (but still not easy) to write a fast JIT for. Unfortunately such a JIT wouldn't be very useful - MicroPython won't run much code that hasn't been written specifically for it. Without the dynamic features, MicroPython is essentially a different language from regular Python.


> I guess there are also economic incentives: there hasn't been an incentive for anybody to staff a 50-person project to build a Python JIT, given that it's cheaper to rewrite some or all of the application in C/C++/Rust/Go, whereas that's not an option in Javascriptland.

Ask Dropbox https://github.com/dropbox/pyston


Right, let's ask Dropbox:

https://blog.pyston.org/2017/01/31/pyston-0-6-1-released-and...

(And after that blog post, there are no commits in the github repo you linked to).


Yeah, I was excited when they announced this effort. Sad they 86ed it.


I think you have the first three points — dynamism, C extensions, and economics — right on the money. The fourth point I would add is that Python has a huge standard library. That's a very large surface area, all of which ends up needing optimization effort to get good performance across a wide variety of programs.


> You can override just about anything in Python, including the meaning of accessing a property

JavaScript has getters that do the same [0]. You can even redefine 'undefined' depending on version and mode [1].

Is operator overloading really that much 'worse' in Python?

[0] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...

[1] https://stackoverflow.com/a/8783528/
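For what it's worth, Python's operator resolution does more than call a single method: for `a + b` the interpreter may try `a.__add__(b)`, then `b.__radd__(a)`, and the reflected method of a subclass of the left operand's type is tried first. A small illustration (class names are made up):

```python
class A:
    def __add__(self, other):
        return "A.__add__"

class B(A):
    def __radd__(self, other):
        return "B.__radd__"

print(A() + A())  # left operand's __add__ wins
print(A() + B())  # subclass's reflected __radd__ is tried *before* A.__add__
```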


> You can override just about anything in Python, including the meaning of accessing a property.

Can't you do this in JS as well?


As far as I know, JS does not have the equivalent of Python's descriptor protocol[0]. This is notably different from setter/getter methods.

[0]: https://docs.python.org/3/howto/descriptor.html
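A minimal descriptor, to show what the protocol looks like (the `Positive` class here is just an illustration): a class attribute with `__get__`/`__set__` hooks intercepts attribute access on every instance, which is more general than a per-property getter/setter pair.

```python
class Positive:
    def __set_name__(self, owner, name):
        # Remember where to stash the real value on each instance.
        self.name = "_" + name

    def __get__(self, obj, objtype=None):
        return getattr(obj, self.name)

    def __set__(self, obj, value):
        if value < 0:
            raise ValueError("must be non-negative")
        setattr(obj, self.name, value)

class Account:
    balance = Positive()  # descriptor installed as a class attribute

a = Account()
a.balance = 10  # goes through Positive.__set__
```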



