
Often most of the memory goes to the network's intermediate activations, not to the parameters used to compute them.
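A rough back-of-the-envelope sketch of that point, for a single convolutional layer during training. The function name `conv_layer_counts`, the layer shape, and the batch size are all made up for illustration; it assumes stride 1 with "same" padding (so the output keeps the input's height and width), float32 values, and that every output activation is kept around for the backward pass.

```python
def conv_layer_counts(c_in, c_out, kernel, height, width, batch_size):
    """Return (parameter_count, activation_count) for one conv layer.

    Hypothetical helper: assumes stride 1 and "same" padding, so the
    output feature map has the same height and width as the input.
    """
    # One kernel x kernel filter per (input channel, output channel) pair,
    # plus one bias per output channel.
    params = kernel * kernel * c_in * c_out + c_out
    # One stored output value per channel, per pixel, per example in the batch.
    activations = batch_size * c_out * height * width
    return params, activations


BYTES_PER_FLOAT32 = 4

# Illustrative numbers only: a 3x3 conv, 64 -> 64 channels, 224x224 input.
params, activations = conv_layer_counts(64, 64, 3, 224, 224, batch_size=32)
print("parameter bytes: ", params * BYTES_PER_FLOAT32)
print("activation bytes:", activations * BYTES_PER_FLOAT32)
print("activations / parameters:", activations / params)
```

The parameter count does not depend on the batch size or image resolution, while the activation count scales with both, which is why activations can end up dominating. For layers with many weights and small outputs (e.g. wide fully connected layers at small batch sizes) the balance can go the other way.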

