Some information from Anandtech's deep dive into Apple's "big" Firestorm core.
>On the Integer side, whose in-flight instruction and renaming physical register file capacity we estimate at around 354 entries, we find at least 7 execution ports for actual arithmetic operations. These include 4 simple ALUs capable of ADD instructions, 2 complex units which also feature MUL (multiply) capabilities, and what appears to be a dedicated integer division unit. The core is able to handle 2 branches per cycle, which I think is enabled also by one or two dedicated branch forwarding ports, but I wasn’t able to 100% confirm the layout of the design here.
On the floating point and vector execution side of things, the new Firestorm cores are actually more impressive, as they feature a 33% increase in capabilities, enabled by Apple’s addition of a fourth execution pipeline. The FP rename registers here seem to land at 384 entries, which is again comparatively massive. The four 128-bit NEON pipelines thus on paper match the current throughput capabilities of desktop cores from AMD and Intel, albeit with smaller vectors. Floating-point operation throughput here is 1:1 with the pipeline count, meaning Firestorm can do 4 FADDs and 4 FMULs per cycle, with 3 and 4 cycle latencies respectively. That’s quadruple the per-cycle throughput of Intel CPUs and previous AMD CPUs, and still double that of the recent Zen3, albeit still running at a lower frequency. This might be one reason why Apple does so well in browser benchmarks (JavaScript numbers are floating-point doubles).
Vector abilities of the 4 pipelines seem to be identical, with the only instructions that see lower throughput being FP divisions, reciprocals, and square-root operations, which have a throughput of only 1, on one of the four pipes.
> This might be one reason why Apple does so well in browser benchmarks (JavaScript numbers are floating-point doubles).
Reminder that browsers try to avoid using doubles for the Number type, preferring integers with overflow checks. Much of layout uses fixed point for subpixels, too. Using doubles all the time would be a notable perf regression.
Where are you getting that? I thought Intel was at 180 physical integer registers for the same core microarchitecture shared by both desktops and servers.
If you have a source I'm happy to read it but otherwise I think you're confused. Especially about Intel client and server cores having different numbers of registers. The lowest level difference between them I've heard of that wasn't features being fused off is different L3 cache sizes.
One of the reasons Apple does so well in browser tests is that ARM now has instructions that increase the performance and decrease the power draw of JavaScript operations.
It’s simply matching the x86 float-to-int conversion, because JS specifies that behavior in the spec. All this instruction does is level the playing field; it isn’t some magic instruction that does more than x86 does.
At a logic level there are no changes to the expensive part of rounding, only changes to the overflow values in the result.
WebKit measured this to be an improvement of less than 2%. So it is certainly "one of the reasons", but certainly not a driving one. (Plus, it's ARMv8.3+.)
JSC didn't even use that instruction when most of the benchmarking was done. It has absolutely nothing to do with it. The idea was floated or amplified by Gruber / Daring Fireball, and I believe he never went back to correct it even after the facts were shown.
Much like his claim that the $149 AirPods were sold close to BOM cost. And that is how the whole world went on to believe all the wrong information.
https://www.anandtech.com/show/16226/apple-silicon-m1-a14-de...