I really wish people would stop evaluating a model's coding capability with one-...

I really wish people would stop evaluating a model's coding capability with one-shots.

The vast majority of coding energy is what comes next.

Even today, sonnet-3.5 is still the best "existing code base" model. Which is gratifying (to Anthropic) and/or alarming to everyone else