MyUltiDev's comments

MyUltiDev · 2026-04-22T12:12:15 1776859935

The dependency-reduction numbers are the part that stood out to me more than the perf figures. Going from 192 to 1 in @rspack/dev-server and from 15 MB to 1.4 MB of install size is a shift in philosophy, not just a benchmark. Bundling dependencies into the npm package to control transitive versions is also a stronger supply-chain stance than most bundlers take at this stage.

Reading through the list of projects that already ship Rspack (Next.js, Nuxt, Docusaurus, Storybook, Nx, Angular Rspack, and so on), that's also the group most exposed to any 2.0 breakage. The Upgrade Guide and the migration Agent Skill are both listed in the post, which is further than most major-version releases go, but neither of those tells me which of those downstream integrations will need their own updates to stay working against 2.0 defaults. Is there a plan to publish a compatibility matrix for the listed ecosystem integrations as they adopt 2.0, or is that expected to surface through the individual projects' issue trackers?

MyUltiDev · 2026-04-22T12:07:12 1776859632

The GitHub App angle is the interesting half here. It is the one integration where rotation is genuinely free, because you get first-class refresh semantics rather than bolted-on PAT expiry (the 90-day-and-forget-on-vacation failure mode you describe is painfully familiar). For the plain-header case like the Stripe curl example earlier in the post, I've been running similar setups across a few cloud providers, and rotation is where it breaks in practice: proxies that don't hot-reload the injected credential when upstream issues a new one. The TLS termination piece tends to get most of the architectural attention but is usually the easier half once you're already owning the proxy.

For the integrations that aren't GitHub-style OAuth Apps, where upstream just ships a long-lived API key and someone still has to rotate it, how are you planning to handle the refresh lifecycle on the exe.dev side? Is that declared per-integration, or is the proxy expected to notice 401s and pull a fresh credential from somewhere upstream?

MyUltiDev · 2026-04-16T18:12:23 1776363143

The hardware-attested privacy path is the interesting part of this, but the economic side has a quieter risk the thread has not named: the load tax per request. MiniMax M2.5 239B from your catalog still has to load all 239B weights even though only 11B are active — that is roughly 120GB at Q4_K_M, and cold load from SSD on Apple Silicon is measurable in tens of seconds. Even the Qwen3.5 122B MoE lands around 65GB cold. If the coordinator routes request number two to a different idle Mac than request number one, or if the owner's machine spun the model out to free memory in between, each request pays that cold load before the first token. Keeping the model resident 24/7 solves the latency but eats into the power budget the operator is trying to amortize in the first place. How does the coordinator decide which provider to keep warm for which model? A 16GB or 32GB home Mac cannot host Qwen3.5 122B MoE at all, and the Mac Studios that can are a much smaller slice of the 100M machine estimate.
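
A back-of-envelope sketch of the cold-load arithmetic above. It assumes a flat 4 bits per weight, which is what reproduces the roughly 120GB figure (real Q4_K_M files mix quantization types and tend to come out somewhat larger), and a hypothetical sustained SSD read rate of 5 GB/s. Both numbers are illustrative assumptions, not measurements.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in GB (10^9 bytes)."""
    # params_billion * 1e9 weights * bits / 8 bytes, divided by 1e9 bytes per GB
    return params_billion * bits_per_weight / 8


def cold_load_seconds(size_gb: float, read_gb_per_s: float) -> float:
    """Lower bound on cold-load time if the load is purely read-bound."""
    return size_gb / read_gb_per_s


BITS_PER_WEIGHT = 4.0   # assumption: flat 4-bit quantization
SSD_READ_GB_S = 5.0     # assumption: hypothetical sustained read rate

for name, params in [("239B MoE", 239), ("122B MoE", 122)]:
    size = weights_gb(params, BITS_PER_WEIGHT)
    secs = cold_load_seconds(size, SSD_READ_GB_S)
    print(f"{name}: ~{size:.1f} GB on disk, at least ~{secs:.0f} s to read")
```

The active-parameter count does not appear anywhere in the calculation, which is the point: an MoE model saves compute per token, not bytes that have to be resident before the first token.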

MyUltiDev · 2026-04-16T18:08:57 1776362937

The attribution and lock-in arguments are the loud parts of this story, but the quieter production reason to move is concurrency. llama.cpp's server takes --parallel N, with continuous batching (cont-batching) enabled by default, which interleaves tokens from multiple requests inside a single batch and keeps the GPU busy. Ollama defaults its parallel slots low and the interaction is less transparent, so the first time three people share a single model instance you feel it before any of the ethics become relevant. For a 70B Q4_K_M on a workstation, the real ceiling is KV cache fragmentation, and you have to size the context window around the parallel count rather than around one user. What is the highest parallel value anyone here has kept stable on a 70B Q4_K_M before the cache eviction pattern starts hurting quality?
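
A minimal sketch of what sizing the context window around the parallel count means in numbers. It assumes a Llama-70B-style shape (80 layers, 8 KV heads under grouped-query attention, head dimension 128), an fp16 KV cache, and a total context that is split evenly across slots, which is how llama.cpp's server has traditionally behaved. The shape and the 32768-token total are assumptions for illustration.

```python
def kv_bytes_per_token(n_layers: int, n_kv_heads: int, head_dim: int,
                       bytes_per_elem: int = 2) -> int:
    """Bytes of KV cache one token occupies across all layers."""
    # K and V each store n_kv_heads * head_dim elements per layer.
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem


def per_slot_context(total_ctx: int, parallel: int) -> int:
    """Tokens each slot gets when the total context is split evenly."""
    return total_ctx // parallel


def kv_cache_gib(total_ctx: int, bytes_per_token: int) -> float:
    """Total KV cache size in GiB for the whole context."""
    return total_ctx * bytes_per_token / 2**30


# Assumed Llama-70B-style shape with an fp16 cache.
per_token = kv_bytes_per_token(n_layers=80, n_kv_heads=8, head_dim=128)
total_ctx = 32768

print(f"KV cache: {kv_cache_gib(total_ctx, per_token):.1f} GiB total")
for parallel in (1, 2, 4, 8):
    print(f"parallel={parallel}: {per_slot_context(total_ctx, parallel)} tokens per slot")
```

Under these assumptions the cache budget is fixed by the total context, so every extra slot shrinks what each user can hold rather than adding memory.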

MyUltiDev · 2026-04-15T18:44:16 1776278656

The Cloudinary fix that nobody in this thread is naming is actually two lines. Upload the asset with type set to authenticated instead of the default upload type, and generate a signed URL server-side with sign_url set to true whenever a logged-in user requests it. Once the asset is authenticated the public URL stops resolving entirely, so even the Google-indexed copies go cold. The reason Fiverr cannot just turn this on now is that they already have years of stored messages where every reference is the default public delivery type, and switching the existing media library from public to authenticated breaks every existing URL across the whole platform. That is the architectural brittleness someone upthread was pointing at, and it is also why the only realistic path forward for them is rotating new uploads to authenticated and accepting that the historical exposure is permanent. What would actually catch this category of mistake earlier: an SDK default that refused to upload anything as public unless you opt in?

MyUltiDev · 2026-04-15T18:41:29 1776278489

The trigger matrix here is actually the most interesting part. Schedule plus API plus GitHub event on the same routine unlocks some nice patterns, and the /fire endpoint returning a session URL means you can wire this into alerting tools or a CD pipeline from almost anywhere. The part that is not really covered in the docs is what state a routine is supposed to recover from if a previous run died halfway through a repository change. The protection around claude/-prefixed branches helps you not clobber main, but it does not tell the next run what the previous run actually finished. I run scheduled jobs against multi-step pipelines on my own infra and the failure mode that bites is not the crash, it is the run that returned success while a downstream side effect quietly broke. The /fire response returns a session URL and a session ID, which tells you the routine started, but how is a routine expected to notice when the downstream thing it kicked off (a CD pipeline, an alert follow-up, a library port PR like the one in the examples) quietly fell over five minutes after the session ended?

MyUltiDev · 2026-04-11T16:24:44 1775924684

The thing that bothers me most about this story is that the binary on the Chrome Web Store and the public source on the repo have no enforced relationship at all. The store accepts a packaged extension and trusts the developer to say it matches the public code. I tried to reproduce the published build for a few extensions I actually depend on, and in most cases I could not, even when the maintainer was clearly acting in good faith. Firefox AMO at least asks for source and runs a diff against a clean build before they let it through; Chrome does not. If reproducible builds plus a signed attestation tying a store version to a commit are not the right answer here, what would actually catch the silent pivot from benign to malicious before users start getting injected ads?
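
A rough sketch of the comparison step in such a reproduction attempt, assuming both the store package and the local build are available as plain zip archives. It hashes each file's contents rather than the archive bytes, so differing zip timestamps or compression settings do not register as changes. The helper names and the sample file names are hypothetical, and this is not how any store actually verifies submissions.

```python
import hashlib
import io
import zipfile


def file_hashes(package: bytes) -> dict[str, str]:
    """Map each file path inside a zip package to the SHA-256 of its contents."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        return {
            info.filename: hashlib.sha256(zf.read(info)).hexdigest()
            for info in zf.infolist()
            if not info.is_dir()
        }


def diff_packages(store: bytes, local: bytes) -> dict[str, list[str]]:
    """Report files that differ between the store package and a local build."""
    a, b = file_hashes(store), file_hashes(local)
    return {
        "only_in_store": sorted(a.keys() - b.keys()),
        "only_in_local": sorted(b.keys() - a.keys()),
        "changed": sorted(p for p in a.keys() & b.keys() if a[p] != b[p]),
    }


def make_zip(files: dict[str, bytes]) -> bytes:
    """Build an in-memory zip, used here only to demonstrate the diff."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


store_pkg = make_zip({"manifest.json": b"{}", "background.js": b"injectAds()"})
local_pkg = make_zip({"manifest.json": b"{}", "background.js": b"run()"})
print(diff_packages(store_pkg, local_pkg))
```

An empty result in all three lists is what a reproducible build would look like. In practice minifiers and bundlers make that hard to reach, which matches the experience described above.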

MyUltiDev · 2026-04-11T16:22:44 1775924564

A 20-year retrospective with no Hetzner or OVH numbers in sight is a bit of a tell. I run workloads across AWS, Hetzner, and a couple of smaller providers, and the gap is not subtle. For a small to medium web stack you are looking at roughly $350 a month on AWS versus 20 to 25 euros on Hetzner for similar specs, plus 20 TB of bandwidth included instead of being billed at 9 cents a gig after the first 100 GB. What AWS actually sells at this point is not compute, it is the IAM model, the global footprint, the deep integrations, and the org-chart consensus that nobody gets fired for picking it. That is a real product and worth a lot in some shops, but it is a very different product from what cloud meant in 2006. For the people who have actually moved a real workload off AWS recently, what was the part that turned out to be more painful than you expected?
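
To put the bandwidth line in numbers, here is a small sketch that prices the same 20 TB at the per-GB rate quoted above. It assumes a flat 9 cents per GB with no volume tiers, 100 GB free, and decimal units (1 TB = 1000 GB). These are simplifying assumptions, not a provider's actual price sheet.

```python
def egress_cost_usd(total_gb: float, free_gb: float = 100.0,
                    usd_per_gb: float = 0.09) -> float:
    """Metered egress cost under a flat per-GB rate after a free allowance."""
    billable_gb = max(0.0, total_gb - free_gb)
    return billable_gb * usd_per_gb


included_tb = 20
cost = egress_cost_usd(included_tb * 1000)
print(f"{included_tb} TB of metered egress: about ${cost:,.0f}")
```

Under those assumptions, fully using the included 20 TB would cost about $1,791 when metered, several times the $350 monthly figure, though few small stacks actually push that much traffic.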

MyUltiDev · 2026-04-11T16:20:28 1775924428

Reading this right after the Sashiko endorsement is a bit jarring. Greg KH greenlit an AI reviewer running on every patch a couple weeks back, and that direction actually seems to be helping, while here the conversation is still about whether contributors will take responsibility for AI code they submit. That feels like the harder side to police. The bugs that land kernel teams in trouble are race conditions, locking, lifetimes, the things models are most confidently wrong about. I have seen agents produce code that compiles cleanly, reads fine on a Friday review, then deadlocks under contention three weeks later. Is this contributor policy supposed to be the long term answer, or a placeholder until something Sashiko-shaped does the heavy filtering on the maintainer side too?