Hacker News

Agreed on ZLUDA being the practical choice. This project is more impressive as a "build a GPU compiler from scratch" exercise than as something you'd actually use for ML workloads. The custom instruction encoding without LLVM is genuinely cool though, even if the C subset limitation makes it a non-starter for most real CUDA codebases.


ZLUDA doesn't have full coverage, though, which means only a subset of CUDA codebases can be ported successfully. They've focused on 80/20 coverage for core math.

Specifically:

cuBLAS (limited/partial scope), cuBLASLt (limited/partial scope), cuDNN (limited/partial scope), cuFFT, cuSPARSE, NVML (very limited/partial scope)

Notably missing: cuSPARSELt, cuSOLVER, cuRAND, cuTENSOR, NPP, nvJPEG, nvCOMP, NCCL, OptiX

I'd estimate that works out to around 20% coverage of the CUDA libraries.
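
If you want a quick gut check on whether a given codebase falls inside that subset, something like the rough sketch below might help. The library names are copied from the lists in this comment, not from any official ZLUDA manifest, and the function name, the bucket labels, and the idea of passing in a plain list of dependency names are all my own assumptions for illustration.

```python
# Coverage buckets copied from the lists in this comment.
# This is NOT an authoritative ZLUDA manifest; treat it as illustrative.
LISTED = {"cufft", "cusparse"}  # listed without a "partial" qualifier
PARTIAL = {"cublas", "cublaslt", "cudnn", "nvml"}
MISSING = {
    "cusparselt", "cusolver", "curand", "cutensor",
    "npp", "nvjpeg", "nvcomp", "nccl", "optix",
}


def classify_dependencies(libs):
    """Group library names by the coverage status given in this thread.

    `libs` is any iterable of library names (hypothetical input format);
    matching is case-insensitive, and names are returned as passed in.
    """
    report = {"listed": [], "partial": [], "missing": [], "unknown": []}
    for name in libs:
        key = name.lower()
        if key in LISTED:
            report["listed"].append(name)
        elif key in PARTIAL:
            report["partial"].append(name)
        elif key in MISSING:
            report["missing"].append(name)
        else:
            # Not mentioned in the comment at all, so no claim either way.
            report["unknown"].append(name)
    return report


if __name__ == "__main__":
    deps = ["cuBLAS", "cuFFT", "NCCL", "Thrust"]
    for status, names in classify_dependencies(deps).items():
        print(f"{status}: {', '.join(names) or '-'}")
```

Anything that lands in "missing" is probably a blocker, and "partial" means you'd still have to check whether the specific calls you use are covered.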



