If you look at how they actually "translate" the fancy model to the simple one, it requires fully fitting the original model (and keeping a record of how the gradients evolve over training). So it wouldn't make training more efficient, but it could be useful for inference, or for probing the characteristics of the original model.
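A minimal sketch of what that ordering implies, in NumPy. Everything here is illustrative, not the paper's actual construction: a tiny "fancy" model is fully trained while its per-step gradients are recorded, and only afterward is a simple surrogate fit to it (here a plain least-squares linear fit to the trained net's outputs, as a stand-in for the gradient-based translation). The point is that the gradient record, and thus the translation, only exists after training has already been paid for.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two classes with a nonlinear (XOR-like) boundary.
X = rng.normal(size=(200, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(float)

# "Fancy" model: a one-hidden-layer net trained by plain gradient descent.
W1 = rng.normal(scale=0.5, size=(2, 8))
W2 = rng.normal(scale=0.5, size=(8, 1))
grad_history = []  # the per-step gradient record the translation would need

def forward(X, W1, W2):
    h = np.tanh(X @ W1)
    p = 1.0 / (1.0 + np.exp(-(h @ W2)))  # sigmoid output
    return p, h

for step in range(500):
    p, h = forward(X, W1, W2)
    err = p - y[:, None]                            # d(log-loss)/d(logit)
    gW2 = h.T @ err / len(X)
    gW1 = X.T @ ((err @ W2.T) * (1 - h**2)) / len(X)
    grad_history.append((gW1.copy(), gW2.copy()))   # track gradient evolution
    W1 -= 0.5 * gW1
    W2 -= 0.5 * gW2

# Only now, with training finished, can the "simple" model be built.
p_final, _ = forward(X, W1, W2)
Xb = np.hstack([X, np.ones((len(X), 1))])           # add a bias column
w_simple, *_ = np.linalg.lstsq(Xb, p_final, rcond=None)
surrogate_pred = Xb @ w_simple                       # cheap linear inference
```

The surrogate is then cheap to evaluate and to inspect, which is where the inference/probing benefit would come from; none of the training cost is avoided.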
This has always anecdotally appeared to be the case when investigating the predictions of neural nets, particularly when it comes time to answer the question “what does this model not handle?”