models trained on gpt output might be more distilled and specialized but it woul...

lukeplato · on Dec 13, 2023

https://twitter.com/pfau/status/1674766269113937920

eightysixfour · on Dec 13, 2023

I disagree with this. If you give GPT information that was not part of its dataset and ask it to generate question-and-answer pairs from that information, you are adding higher-quality breadth to the training corpus.

Phi-2 seems like pretty good proof of that.
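A rough sketch of the Q&A-pair idea, for illustration only. The names `build_prompt`, `parse_pairs`, `make_training_pairs`, and `fake_teacher` are hypothetical, the `Q:`/`A:` output format is an assumption, and `generate` stands in for whatever call you use to reach the teacher model; no real API is shown here.

```python
from typing import Callable, List, Tuple


def build_prompt(passage: str, n_pairs: int = 3) -> str:
    """Build a prompt asking the teacher model for Q/A pairs grounded in the passage."""
    return (
        f"Read the passage below and write {n_pairs} question-and-answer pairs "
        "that can be answered from the passage alone.\n"
        "Format each pair as:\nQ: <question>\nA: <answer>\n\n"
        f"Passage:\n{passage}\n"
    )


def parse_pairs(text: str) -> List[Tuple[str, str]]:
    """Parse 'Q: ...' / 'A: ...' lines into (question, answer) tuples."""
    pairs: List[Tuple[str, str]] = []
    question = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Q:"):
            question = line[2:].strip()
        elif line.startswith("A:") and question is not None:
            pairs.append((question, line[2:].strip()))
            question = None  # wait for the next question
    return pairs


def make_training_pairs(
    passage: str, generate: Callable[[str], str]
) -> List[Tuple[str, str]]:
    """Turn one passage into training pairs using a teacher-model callable."""
    return parse_pairs(generate(build_prompt(passage)))


def fake_teacher(prompt: str) -> str:
    """Stand-in for a real teacher model (assumption: it follows the Q:/A: format)."""
    return "Q: What is X?\nA: Y.\nQ: Who made Z?\nA: W."


if __name__ == "__main__":
    for q, a in make_training_pairs("Some passage the model has not seen.", fake_teacher):
        print(q, "->", a)
```

In practice you would swap `fake_teacher` for a real model call and filter the resulting pairs before adding them to a training set.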

verdverm · on Dec 13, 2023

That's the point: they get worse at everything else, but really good at one or a few things.

The real benefits here are:

1. It's much cheaper and faster to train a bunch of specialized models once you have a single good LLM

2. You probably can't get the same capabilities from a specialized model by training it directly.