
Models trained on GPT output might be more distilled and specialized, but that wouldn't improve generalization.



I disagree with this. If you give GPT information that was not part of its dataset and ask it to make question-and-answer pairs from that information, you are adding higher-quality breadth to the training corpus.

Phi-2 seems like pretty good proof of that.


That's the point: they get less good at everything, but really good at one or a few things.

The real benefits here are:

1. It's much cheaper and faster to train a bunch of specialized models once you have a single good LLM.

2. You probably can't get the same capabilities from a specialized model by training it directly.



