Hacker News

Is insanely-fast-whisper fast enough to actually run on the CPU and still transcribe in real time? I see that none of these are running quantized models; they're all still fp16. Seems like there's more speed left to be found.

Edit: I see it doesn't support CPU inference yet. Should be interesting once that's added.



insanely-fast-whisper mainly takes advantage of a GPU’s parallelism by increasing the batch size from 1 to N. I doubt it would meaningfully improve CPU performance unless you’re finding that running Whisper sequentially leaves a lot of your CPU cores idle or underutilized. It may be more complicated if you have a matrix co-processor available; I’m really not sure.
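To make the batch-size point concrete, here is a rough sketch of the bookkeeping only, not the project's actual code. It assumes the audio is cut into Whisper's 30-second windows and ignores the overlap a real chunked pipeline would add between windows. The helper names (`chunk_spans`, `make_batches`), the 10-minute clip, and the batch size of 8 are all made up for illustration.

```python
CHUNK_SECONDS = 30  # Whisper consumes audio in 30-second windows


def chunk_spans(duration_s, chunk_s=CHUNK_SECONDS):
    """Split [0, duration_s) into consecutive (start, end) windows.

    Hypothetical helper: a real pipeline would also overlap the windows.
    """
    spans = []
    start = 0.0
    while start < duration_s:
        spans.append((start, min(start + chunk_s, duration_s)))
        start += chunk_s
    return spans


def make_batches(spans, batch_size):
    """Group windows so each group can go through the model in one forward pass."""
    return [spans[i:i + batch_size] for i in range(0, len(spans), batch_size)]


duration = 600.0  # assumed 10-minute clip
spans = chunk_spans(duration)

sequential_passes = len(make_batches(spans, 1))  # batch size 1: one pass per window
batched_passes = len(make_batches(spans, 8))     # batch size 8: windows share a pass

print(sequential_passes, batched_passes)  # 20 3
```

Going from 20 passes to 3 only helps if a pass over 8 windows costs about the same as a pass over 1, which is roughly true on a GPU with spare parallel capacity. On a CPU where a single window already keeps the cores busy, each batched pass just takes proportionally longer, so I wouldn't expect much gain.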



