We can see in this specific case there was better cache locality: more data was served from the L1 and L2 caches, and L3 cache misses dropped (with no L3 hits at all, since nothing had to be fetched from L3 in the first place).
Six cycles for bounds checks that the branch predictor never has to rewind on are nothing compared to saving a couple of trips to L3.
> We can see in this specific case there was better cache locality
Dramatically better L1 and L2 cache behavior. It seems clear that the additional instruction load of the Rust driver is partially made up by the excellent cache utilization.
This "Rust vs C" document is just one part of a larger analysis of network driver implementations in many languages: C, Rust, Go, C#, Java, OCaml, Haskell, Swift, JavaScript, and Python. Have a look at the top-level README.md of that GitHub repo.