With the primary advancement over the past two years being chain of thought, which absolutely obliterates token counts, in what world would the "per token" value of a model be going up?
If you can cogently explain how you would instruct GPT-3.5, with ANY number of tokens, to do what Sonnet 4 is able to do, I am sure there are a lot of wealthy people who would be very interested in having a talk with you.