
Mixtral is also a MoE model, hence the name "Mixtral".


Despite both being MoEs, the architectures are different. DBRX has double the number of experts in the pool (16 vs. 8 for Mixtral) and double the number of active experts per token (4 vs. 2). A rough sketch of what that difference looks like is below.
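For intuition, here is a minimal top-k MoE routing sketch in PyTorch. It is not either model's actual code: the TopKMoE class, the linear gate, and the expert MLP shape are all illustrative assumptions; only the expert counts and active-expert counts come from the comparison above.

    # Minimal sketch of top-k mixture-of-experts routing (illustrative only).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TopKMoE(nn.Module):
        def __init__(self, dim, n_experts, k):
            super().__init__()
            self.k = k
            self.gate = nn.Linear(dim, n_experts)  # router: scores one logit per expert
            self.experts = nn.ModuleList(
                nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                              nn.Linear(4 * dim, dim))
                for _ in range(n_experts)
            )

        def forward(self, x):                          # x: (tokens, dim)
            scores = self.gate(x)                      # (tokens, n_experts)
            weights, idx = scores.topk(self.k, dim=-1) # keep k experts per token
            weights = F.softmax(weights, dim=-1)       # normalize over the selected k
            out = torch.zeros_like(x)
            for slot in range(self.k):
                for e in range(len(self.experts)):
                    mask = idx[:, slot] == e           # tokens whose slot-th pick is expert e
                    if mask.any():
                        out[mask] += weights[mask, slot:slot + 1] * self.experts[e](x[mask])
            return out

    # Mixtral-style config: 8 experts in the pool, 2 active per token.
    mixtral_like = TopKMoE(dim=64, n_experts=8, k=2)
    # DBRX-style config: 16 experts in the pool, 4 active per token.
    dbrx_like = TopKMoE(dim=64, n_experts=16, k=4)
    y = dbrx_like(torch.randn(10, 64))

With a fixed per-expert size, the larger pool grows total parameters while the top-k value sets how many experts actually run per token, which is why the two configs can differ in capacity at comparable active compute.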



