
I did some work on a Python JIT in the past. The two biggest challenges were:

- Python is much, much more dynamic than Javascript. You can override just about anything in Python, including the meaning of accessing a property. You have overloaded operators (with pretty complex resolution rules), metaclasses, and more. And they're all used extensively. There are some Javascript equivalents to those things, but they either have fewer deoptimization cases or are features that aren't commonly used in practice (e.g. Proxy objects).

- Python has a ton of important libraries implemented as C extensions. These libraries tend to depend on undefined behavior of the CPython interpreter (e.g. destruction order which is more deterministic with ref counting) or do things that happen to work but are clearly not supposed to be done (e.g. defining a full Python object as a static variable).
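The first point, about dynamism, can be seen in a few lines of ordinary Python. This is just an illustrative toy (the class names are made up): property access, operators, and even class creation itself can all be intercepted, and a JIT has to be prepared to deoptimize on any of them.

```python
class Meta(type):
    # Metaclass: runs when the class object itself is created.
    def __new__(mcls, name, bases, ns):
        ns["created_by_meta"] = True
        return super().__new__(mcls, name, bases, ns)

class Sneaky(metaclass=Meta):
    def __getattr__(self, name):
        # Intercepts any attribute lookup that misses normally.
        return f"made up: {name}"

    def __add__(self, other):
        # Operator overloading: '+' can mean anything.
        return "not addition"

s = Sneaky()
print(s.anything)              # attribute access intercepted
print(s + 1)                   # '+' redefined
print(Sneaky.created_by_meta)  # class was modified by its metaclass
```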
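The destruction-order point in the second bullet shows up even in pure Python. Under CPython's reference counting, dropping the last reference runs `__del__` immediately and in a predictable order; code (including C extensions) sometimes leans on this even though the language doesn't guarantee it, and on a tracing GC like PyPy's the finalizer may run much later.

```python
log = []

class Resource:
    def __del__(self):
        # Finalizer: on CPython this runs as soon as the refcount hits zero.
        log.append("destroyed")

r = Resource()
del r                    # on CPython, __del__ runs right here
log.append("after del")  # so "destroyed" is already in the log
```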

I guess there are also economic incentives: there hasn't been an incentive for anybody to staff a 50-person project to build a Python JIT, given that it's cheaper to rewrite some or all of the application in C/C++/Rust/Go, whereas that's not an option in Javascriptland.



I completely agree with you and I'd argue that #2 and #3 are the two biggest reasons.

It is easy to forget the colossal amount of engineering resources that browser vendors have spent creating and rewriting their Javascript engines. And due to the nature of how their JITs work, all that work is tied down to the specific Javascript environment they were written for. (For example, you can't really reuse the v8 codebase to create a Python JIT)

And in Python's case a lot of the appeal of the language rests on the extensive library ecosystem, which has a significant number of extensions written in C. Generally speaking JIT compilers aren't very good at optimizing code that spends a lot of time inside or interacting with C extensions, even if we ignore the significant issues you mentioned regarding undefined behavior.


Why is Javascript slower than Java, despite both being garbage collected and V8 having more engineers than e.g. OpenJDK?


A couple of obvious reasons would be compiled vs interpreted, and static vs dynamic, both of which bear some inherent runtime performance cost.


Compiled vs interpreted is not a useful distinction. Until fairly recently, when Ignition was introduced, V8 had no interpreter. All the JS was compiled by the first tier, "full codegen".

The bigger difference is that the JVM is heavily optimized for performance after a long warmup and V8 needs to produce relatively fast code early during page loading.

Java being much more static certainly helps warmup time but ultimately doesn’t really affect final performance. LuaJIT can beat C in some cases once it has time to compile all traces needed.


> V8 needs to produce relatively fast code early during page loading

So for e.g. backend JS programs, is it possible to ask V8 to take more time to optimize?


Why would you want it to take longer to optimize?


To give it more time to generate better code, and thus a faster program (but a slower launch time).


Most VMs will recompile long-running hot code; I’d assume V8 does this as well.


Modern JS engines only interpret cold functions, and static vs dynamic doesn't matter once type information has been collected.

The real reason is that HotSpot has 10+ more years of work put into it than V8.


Static vs dynamic still matters if the program is doing "dynamic stuff", where "dynamic stuff" means anything that the JIT compiler is currently not able to optimize.


That's fair, but that's also stuff that you just can't really do in Java in general¹, so it's not useful for a comparison. The fact is that the vast majority of JavaScript code is pretty static and there's nothing preventing it from running as fast as Java other than man-decades of compiler engineering.

¹ Possibly excluding reflection. It's been a long time since I used the Java reflection APIs and I have no idea if you can do things like add class fields named after arbitrary strings at runtime. Even if you could, presumably this bails out of jitted code so the situation is basically the same as in JS.


Dynamic types are probably the first-order reason, with some unfortunate language choices being the second-order reason.


Dynamic typing means you pay the cost of trace recording or profiling to collect the type info. The actual code performance should ultimately be the same. JIT can remove the overhead of dynamic dispatch and replace it with a fixed call and a guard, for example. This isn’t possible with dynamically loaded C libraries.
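The guard-plus-fixed-call idea can be sketched in a few lines of Python. This is only a toy model of what a JIT emits as machine code; `jitted_add` and `deoptimize` are illustrative names, not any real JIT's API. After profiling has shown both operands are always `int`, the generic dispatch is replaced by a cheap type check guarding a direct, specialized call.

```python
def generic_add(a, b):
    # Full dynamic dispatch: __add__/__radd__ lookup, coercion rules, etc.
    return a + b

def deoptimize(a, b):
    # Guard failed: fall back to the generic slow path.
    return generic_add(a, b)

def jitted_add(a, b):
    if type(a) is int and type(b) is int:  # the guard
        return int.__add__(a, b)           # fixed, specialized call
    return deoptimize(a, b)
```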


> JIT can remove the overhead of dynamic dispatch and replace it with a fixed call and a guard, for example.

Only when the guard isn’t triggered constantly. With an actual type system you can remove many of these guards altogether instead of having them everywhere and falling back to the slow case when you get something unexpected.


There are actually two problems here: How to handle things being re-defined and how to handle an unexpected type after speculation.

In real high performance VMs the guard for redefinition is effectively a single instruction, which a CPU can easily branch-predict and handle with out-of-order execution: https://chrisseaton.com/truffleruby/low-overhead-polling/

With unexpected types we can use LuaJIT as an example: The type speculation guard will be turned into a conditional branch to a side trace. The slow path quickly becomes another fast path.
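A toy sketch of that side-trace idea, in Python rather than machine code (the trace table and `specialize` helper are illustrative inventions, not LuaJIT internals): the first time a type misses the guard, a specialized path is compiled for it, so repeated "misses" become a second fast path rather than a permanent slow path.

```python
traces = {}  # type -> specialized add function ("compiled traces")

def specialize(t):
    # Pretend-compile a trace specialized for operand type t.
    if t is int:
        return lambda a, b: int.__add__(a, b)
    if t is str:
        return lambda a, b: str.__add__(a, b)
    return lambda a, b: a + b  # generic fallback

def run_add(a, b):
    fn = traces.get(type(a))
    if fn is None:                  # guard "failure": no trace yet
        fn = specialize(type(a))    # compile a side trace for this type
        traces[type(a)] = fn
    return fn(a, b)                 # later calls with this type are fast
```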


> For example, you can't really reuse the v8 codebase to create a Python JIT

Actually, come to think of it... V8 also runs WASM right? Right now I think WASM is missing a few features (like garbage collection) which Python would need to be efficiently compiled to WASM, but once those are solved...


At a conceptual level that would be no different than having a Python implementation targeting the JVM or CLR runtimes (aka Jython and IronPython).

The usual situation for these alternate implementations is that they make it easier to interact with other code that targets those runtimes, but that they do not speed up the average speed of the interpreter. The previously-mentioned compatibility and performance issues for C extensions also remain.


There was a brief point in time where IronPython was faster than CPython on several major benchmarks, precisely because of the better CLR JIT and GC, plus some smart tricks implemented in the "DLR". (It's almost sad; IronPython and the "DLR" have been left to grow so many weeds since that time.)


I definitely should have been clearer in my other comment. Alternative interpreters such as IronPython can be faster but most of the time the speed stays in the same "order of magnitude" as the original C interpreter. On the other hand, good JIT can deliver a 10x or more speedup in the best case scenarios where it manages to get rid of the dynamic typing overhead. (For subtle technical reasons, running the language interpreter on top a jitting VM like the CLR is not enough. The underlying JIT has a hard time looking further than the IronPython interpreter itself and making optimizations at the "Python level")
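A toy illustration of that opacity problem (a hypothetical mini-bytecode, not the real IronPython/CLR machinery): when a guest interpreter runs on a jitting VM, the host JIT mostly sees one generic dispatch loop. Every guest-level operation funnels through the same few call sites, so the guest program's types and call targets look hopelessly polymorphic to it.

```python
def run(bytecode):
    # The host JIT can optimize this loop, but not the guest program
    # flowing through it: the '+' below is one megamorphic site shared
    # by every addition in every guest program.
    stack = []
    for op, arg in bytecode:
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
    return stack.pop()

program = [("PUSH", 2), ("PUSH", 3), ("ADD", None)]
```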


That's where some of the most "DLR" magic filled in (cached) and optimized a lot of the "Python level" in a way that the CLR JIT could take advantage. The DLR briefly was a huge bundle of hope for some really interesting business logic caching. In a past life I did some really wild stuff with DLR caching for a complicated business workflow tool. It's dark matter in an enterprise that I'm sure all of it is still running, but I'm not sure if the performance has kept up over time (and have no way to ask, and probably don't care) as the CLR declared "mission complete" on the DLR and maybe hasn't kept it quite as optimized since the IronPython heyday.


In Ruby land, JRuby is significantly faster than CRuby since 9.0. It has a proper IR for optimization and can inline etc.


Couldn't you compile the C extensions to WASM too? Then it'd just be WASM code interacting with WASM code.


I don't think it's possible to efficiently compile a dynamically-typed language like Python to statically-typed WASM.


Thought experiment: what about transpiling Python to JS? https://github.com/QQuick/Transcrypt looks like a nice implementation, but their readme just talks about deployment to browsers — I'm curious about outside of the browser whether Transcrypt + Node might be more efficient than CPython.

(Not even just CPython, but really any dynamic language implementation.)

And then of course wasm could still be used for C extensions.


Not arbitrary code, but code specifically written with type checking in mind should be doable. There was the asm.js subset, for example.


Calling asm.js a subset of Javascript is a bit of a stretch. It looks nothing like idiomatic Javascript, and was more of a low-level statically-typed language dressed up in Javascript clothing.

At some point they realized that representing this low-level code as Javascript text instead of as specially-designed bytecode added a significant amount of parsing and compilation overhead, which was one of the initial motivations for the creation of WebAssembly. If I had to sum up WASM in one sentence, it is that it is kind of like the JVM, except that its instruction set was designed specifically for running programs downloaded from the web. Special attention was paid to security and startup latency.


Erm... the JVM was also designed specifically for running programs downloaded from the web. Special attention was paid to security and startup latency. Javascript got its name because it was the only other language designed to be downloaded and executed in a browser!


RPython might be a better example.


You’d have to compile the CPython interpreter to WASM.


> And they're all used extensively.

That's the key. MicroPython is significantly less dynamic than full Python, and would be much easier (but still not easy) to write a fast JIT for. Unfortunately such a JIT wouldn't be very useful - MicroPython won't run much code that hasn't been written specifically for it. Without the dynamic features, MicroPython is essentially a different language from regular Python.


> I guess there are also economic incentives: there hasn't been an incentive for anybody to staff a 50-person project to build a Python JIT, given that it's cheaper to rewrite some or all of the application in C/C++/Rust/Go, whereas that's not an option in Javascriptland.

Ask Dropbox https://github.com/dropbox/pyston


Right, let's ask Dropbox:

https://blog.pyston.org/2017/01/31/pyston-0-6-1-released-and...

(And after that blog post, there are no commits in the github repo you linked to).


Yeah, I was excited when they announced this effort. Sad they 86ed it.


I think you have the first three points — dynamism, C extensions, and economics — right on the money. The fourth point I would add is that Python has a huge standard library. That's a very large surface area, all of which ends up needing optimization effort to get good performance across a wide variety of programs.


> You can override just about anything in Python, including the meaning of accessing a property

JavaScript has getters that do the same [0]. You can even redefine 'undefined' depending on version and mode [1].

Is operator overloading really that much 'worse' in Python?

[0] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...

[1] https://stackoverflow.com/a/8783528/
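For what it's worth, Python's operator resolution does more than call a single method: for `a + b` the interpreter may try `a.__add__(b)`, then `b.__radd__(a)`, and the reflected method of a subclass of the left operand's type is tried first. A small illustration (class names are made up):

```python
class A:
    def __add__(self, other):
        return "A.__add__"

class B(A):
    def __radd__(self, other):
        return "B.__radd__"

print(A() + A())  # left operand's __add__ wins
print(A() + B())  # subclass's reflected __radd__ is tried *before* A.__add__
```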


> You can override just about anything in Python, including the meaning of accessing a property.

Can't you do this in JS as well?


As far as I know, JS does not have the equivalent of Python's descriptor protocol[0]. This is notably different from setter/getter methods.

[0]: https://docs.python.org/3/howto/descriptor.html
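A minimal descriptor, to show what the protocol looks like (the `Positive` class here is just an illustration): a class attribute with `__get__`/`__set__` hooks intercepts attribute access on every instance, which is more general than a per-property getter/setter pair.

```python
class Positive:
    def __set_name__(self, owner, name):
        # Remember where to stash the real value on each instance.
        self.name = "_" + name

    def __get__(self, obj, objtype=None):
        return getattr(obj, self.name)

    def __set__(self, obj, value):
        if value < 0:
            raise ValueError("must be non-negative")
        setattr(obj, self.name, value)

class Account:
    balance = Positive()  # descriptor installed as a class attribute

a = Account()
a.balance = 10  # goes through Positive.__set__
```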



