I don't think there's much for Amazon to gain from publishing these sorts of internal details. Amazon's services are used by developers who want to tightly optimize their usage, and if Amazon published detailed internals, folks would likely start optimizing applications around details that have the potential to change over time.
Secondly, I think that a lot of companies publish these "tech blogs" as a way to boost recruiting (look at the cool stuff that we're doing, don't you want to join us?). Amazon, of course, doesn't have a recruiting problem. If you want to work on the largest-scale systems, it's already a top destination for you.
I imagine (hope) that they are doing some kind of intelligent read-ahead in the frontend servers to optimize for sequential reads, which would keep this from looking terrible for applications.
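For anyone curious what that read-ahead could look like from the client side, here's a rough Python sketch using boto3 ranged GETs: prefetch the next byte range while the caller is still chewing on the current one. To be clear, this is not how AWS's frontend servers actually work (that isn't public), and the bucket/key names are placeholders.

    # Rough sketch of read-ahead over S3 using ranged GETs. Not AWS's actual
    # frontend implementation; it just illustrates prefetching the next chunk
    # while the caller is still consuming the current one.
    from concurrent.futures import ThreadPoolExecutor

    import boto3
    from botocore.exceptions import ClientError

    CHUNK = 8 * 1024 * 1024  # 8 MiB read-ahead window
    s3 = boto3.client("s3")

    def _fetch(bucket, key, offset):
        try:
            resp = s3.get_object(
                Bucket=bucket, Key=key,
                Range=f"bytes={offset}-{offset + CHUNK - 1}",
            )
        except ClientError as e:
            if e.response["Error"]["Code"] == "InvalidRange":
                return b""  # read past the end of the object
            raise
        return resp["Body"].read()

    def sequential_read(bucket, key):
        """Yield the object in order, prefetching the next range in the background."""
        with ThreadPoolExecutor(max_workers=1) as pool:
            offset = 0
            future = pool.submit(_fetch, bucket, key, offset)
            while True:
                data = future.result()
                if not data:
                    break
                offset += len(data)
                # Start the next ranged GET before the caller processes this chunk.
                future = pool.submit(_fetch, bucket, key, offset)
                yield data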
Notably, this is going to manage your data in its native format (i.e. you can actually read/write the files out of the S3 bucket as if they were actual objects, mapping 1:1 to each file). The ZFS backend is (almost certainly) a block-based format that is persisted to S3, meaning that you cannot use it for existing data in S3, and you cannot access data written through ZFS via S3.
This is pretty different from s3fs, which is a FUSE file system backed by S3.
This means that all of the non-atomic operations you might want to do on S3 (including edits to the middle of files, renames, etc.) are run on the machine running s3fs. As a result, if your machine crashes, it's not clear what will show up in your S3 bucket, or whether things end up corrupted.
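To make the rename case concrete: S3 has no rename primitive, so a client-side layer like s3fs has to emulate it, roughly as copy-then-delete. A sketch (bucket/key names are placeholders, and this isn't s3fs's literal code):

    # S3 can't rename an object, so a FUSE layer has to fake it client-side.
    # A crash between the two calls leaves both keys in the bucket.
    import boto3

    s3 = boto3.client("s3")

    def emulated_rename(bucket, src_key, dst_key):
        s3.copy_object(
            Bucket=bucket,
            Key=dst_key,
            CopySource={"Bucket": bucket, "Key": src_key},
        )
        # <-- a crash right here and you have two copies, with no record of the rename
        s3.delete_object(Bucket=bucket, Key=src_key)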
s3fs is also slow, because the next stop after your machine is S3 itself, and that latency isn't suitable for many file-based applications.
What AWS has built here is different: using EFS as the middle layer means that there's a safe, durable place for your file system operations to land while they're being assembled into object operations. It also means that the performance should be much better than s3fs (it's talking to SSDs where data is 1ms away instead of HDDs where data is 30ms away).
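In case it helps, here's a minimal sketch of that staging pattern: small, non-atomic edits land on a fast, durable path first, and whole objects get assembled and uploaded to S3 later. This is just the general shape of the idea, not AWS's actual implementation; the paths and bucket name are made up.

    # Durable staging layer in front of object storage: edits hit fast storage,
    # assembled objects get flushed to S3 afterwards. Paths/names are hypothetical.
    import os
    import boto3

    STAGING_DIR = "/mnt/efs/staging"   # fast, durable staging area (~1ms away)
    BUCKET = "my-bucket"               # eventual home of the data (~30ms away)
    s3 = boto3.client("s3")

    def write(path, offset, data):
        """Apply an edit to the staged copy of the file; cheap and durable."""
        staged = os.path.join(STAGING_DIR, path)
        os.makedirs(os.path.dirname(staged), exist_ok=True)
        mode = "r+b" if os.path.exists(staged) else "wb"
        with open(staged, mode) as f:
            f.seek(offset)
            f.write(data)

    def flush(path):
        """Later (on close/fsync/a timer), upload the assembled file as one object."""
        s3.upload_file(os.path.join(STAGING_DIR, path), BUCKET, path)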
Well, I think this is what our company, Archil, is working on. We basically built an SSD clustering layer that proxies, caches, and assembles requests into object storage so that you can run a POSIX file system directly on top.
There are also some really great projects like SlateDB in this space, which could be more like what you're looking for (a ~RocksDB-like API that runs on S3).
We just released a driver that allows users of just-bash to attach a full Archil file system, synced to S3. This lets you run just-bash in an environment where you don't have a full VM and still get high-performance access to data that's already in your S3 bucket, for things like greps or edits.
It's 100% because the number of operations happening on GitHub has likely 100x'd since the introduction of coding agents. They built GitHub for one kind of scale, and all of a sudden they've found themselves with a new kind of scale.
That doesn't normally happen to platforms of this size.
ISTR that the lift-n-shift started like ... 3 years ago? That much of it was already shifted to Azure ... 2 years ago?
The only thing that changed in the last year (if my two assertions above are correct, which they may not be) is a much-publicised switch to AI-assisted coding.