local models are 3 to 6 months behind SOTA models with the huge benefit of not n...

ghrl · 2026-05-26T16:04:35 1779811475

I would say that is highly unlikely if by SOTA models you mean not just coding benchmarks but also general-purpose ability and domain-specific knowledge. For example, Kimi 2.6, which is comparable to Opus 4.6, is roughly 500+ GB in size, and I don't see how that would run on consumer hardware anytime soon. Besides, this is not just a question of technical feasibility; it is also not economically viable. Why should consumer laptops be capable of running such models, sitting massively underutilized most of the time, when inference providers can produce the same results faster and cheaper?

henry2023 · 2026-05-26T16:50:44 1779814244

Because privacy has perceived value.

sourcecodeplz · 2026-05-26T16:57:29 1779814649

It runs right now on Macs and PCs with 512 GB of RAM.

Our_Benefactors · 2026-05-26T20:16:03 1779826563

It runs like shit though in terms of tokens/second, and it still has a reduced context window. By contrast, a single Claude prompt can easily get into 300k tokens without breaking a sweat.

I want local AI to be a thing, but the hardware isn’t here yet: the only options are a Mac Studio or DGX machines strapped together. RAM prices need to crash before local AI has a chance at actually competing.

zozbot234 · 2026-05-26T20:38:31 1779827911

The more recent Chinese models are no longer heavily limited by context size. The context can easily fit in RAM on a prosumer laptop. (You can also use swap space to extend that: context is only written to once per inference, so wear and tear is a relatively mild concern.)

Our_Benefactors · 2026-05-27T00:59:00 1779843540

Claude has a 1M context window for enterprise. 128k feels like a toy in comparison.

sourcecodeplz · 2026-05-27T01:10:59 1779844259

DeepSeek pro/flash both have 1M.

ATMLOTTOBEER · 2026-05-27T15:08:19 1779894499

You’re right, and it feels like the people saying otherwise either don’t use these tools professionally (and therefore can’t tell the difference between local and cloud models) or literally just haven’t tried running local models.

As soon as I can buy hardware for less than 5k that runs an Opus 4.6+/5.5 model locally, I will do it instantly.

fg137 · 2026-05-26T20:22:55 1779826975

"shady third party"

If Claude hosted on AWS Bedrock is not considered trustworthy, I have some bad news for you.

henry2023 · 2026-05-26T23:23:54 1779837834

Anthropic illegally downloaded virtually all copyrighted material in the world to train their models. What makes you think they will have even a little consideration for your IP?

fg137 · 2026-05-27T03:31:10 1779852670

Let's say, for the sake of argument, that Anthropic seizes every opportunity to steal IP.

How is that going to work on Bedrock, when they don't even manage the infrastructure?

linkregister · 2026-05-27T00:37:44 1779842264

*tortiously downloaded

lurking_swe · 2026-05-26T19:51:29 1779825089

The bigger issue is context windows. HUGE difference there.