
Far, far less. Alpaca-7B's compute cost was around $60-$70 for Stanford, and around $0.60 (yes, 60 cents) for equivalent fine-tunes using the Parameter-Efficient Fine-Tuning (PEFT) strategy of Low-Rank Adapters (LoRA).

The repo above can be replicated for similar cost: easily less than $10 for models up to 30B using LoRA (which requires only 24 GB of VRAM for 30B/33B and smaller).
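For anyone wondering why LoRA is so much cheaper: instead of updating the full weight matrix W, you freeze it and learn a low-rank update B @ A, so only a tiny fraction of parameters gets trained. A minimal NumPy sketch (illustrative only, not the repo's actual code; the dimensions and rank are hypothetical choices):

```python
import numpy as np

# Sketch of a LoRA layer: W is frozen, only A and B are trainable.
d, k, r = 4096, 4096, 8  # hidden dims typical of a 7B model; rank r = 8 is an assumption

rng = np.random.default_rng(0)
W = rng.standard_normal((d, k)) * 0.01   # frozen pretrained weight
A = rng.standard_normal((r, k)) * 0.01   # trainable, r x k
B = np.zeros((d, r))                     # trainable, zero-init so the update starts at 0

def lora_forward(x, scale=1.0):
    # y = x W^T + scale * x (B A)^T -- gradients flow only into A and B
    return x @ W.T + scale * (x @ A.T) @ B.T

full_params = d * k                      # what full fine-tuning would train
lora_params = d * r + r * k              # what LoRA trains
print(f"trainable: {lora_params:,} vs {full_params:,} "
      f"({100 * lora_params / full_params:.2f}%)")
```

At rank 8 that's 65,536 trainable parameters against ~16.8M for the full matrix, under half a percent per layer, which is where the ~100x cost reduction comes from.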



I thought so too, but newcomers should expect to train the model a dozen times or so :-)


I am interested in this. What would it cost for a member of the public to train the best model possible?



