Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

That % 63 at the end is integer division, that's horribly slow. Literally any other sensible approach will beat it, including a naive LUT.


Optimising compilers are smart enough to get rid of unsigned divisions by constants. See [1] (part one of a series; also [2] is neat). To quote [2], "libdivide's scalar code is up to 16 times faster for powers of 2, 10 times faster for non-powers of 2, compared to naive hardware division." In this case GCC (even with -O0) replaces % 63 with a complicated sequence of one integer multiplication, three subtractions, an addition, and three shifts, which I couldn't be bothered decompiling.

[1] http://ridiculousfish.com/blog/posts/labor-of-division-episo...

[2] http://libdivide.com/


You're right. I read somewhere that no compiler optimizes modulus by a constant. That's clearly very outdated. I tested with clang 3.2, gcc 4.5, and icc 13 (all on a 32bit platform) and they all quite happily replace the division with other instructions.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: