dkural on Dec 7, 2020 | on: Every Model Learned by Gradient Descent Is Approxi...

Wouldn't this one-layer network be a lot less "compressive" than the multi-layer net, and in some sense "duplicate" subnetworks in earlier layers?