Go maps reuse memory on overwrites, which is why orcaman achieves 0 B/op for pure updates. xsync's custom bucket structure allocates 24 B/op per write even when overwriting existing keys.
At 1M writes/second with 90% overwrites: xsync allocates ~27 MB/s, orcaman ~6 MB/s. The trade is 24 bytes/op for 2x speed under contention. Whether this matters depends on whether your bottleneck is CPU or memory allocation.
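The zero-allocation overwrite claim above is easy to verify with `testing.AllocsPerRun` from the standard library; a minimal sketch (the key and counts here are illustrative, not from the benchmark suite):

```go
package main

import (
	"fmt"
	"testing"
)

// OverwriteAllocs reports the average number of heap allocations
// per overwrite of an existing key in a plain Go map.
func OverwriteAllocs() float64 {
	m := map[string]int{"key": 0}
	return testing.AllocsPerRun(1000, func() {
		m["key"]++ // pure overwrite: the map reuses the existing slot
	})
}

func main() {
	fmt.Println(OverwriteAllocs()) // prints 0
}
```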
Benchmark code: standard Go testing framework, 8 workers, 100k keys.
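A hedged sketch of what such a benchmark typically looks like with the standard testing framework: parallel workers via `RunParallel`, 100k pre-allocated keys, pure overwrites. The benchmark name and key scheme are assumptions, not the author's actual code; run with `go test -bench=Overwrite -benchmem`.

```go
package main

import (
	"fmt"
	"strconv"
	"sync"
	"testing"
)

const numKeys = 100_000

// BenchmarkOverwritePlainMap measures the plain map + RWMutex
// approach on a pure-overwrite workload (no new keys inserted).
func BenchmarkOverwritePlainMap(b *testing.B) {
	var mu sync.RWMutex
	m := make(map[string]int, numKeys)
	keys := make([]string, numKeys)
	for i := range keys {
		keys[i] = strconv.Itoa(i)
		m[keys[i]] = i
	}
	b.ReportAllocs()
	b.ResetTimer()
	b.RunParallel(func(pb *testing.PB) { // one goroutine per CPU by default
		i := 0
		for pb.Next() {
			k := keys[i%numKeys]
			mu.Lock()
			m[k]++ // overwrite only: expect 0 B/op
			mu.Unlock()
			i++
		}
	})
}

func main() {
	res := testing.Benchmark(BenchmarkOverwritePlainMap)
	fmt.Println(res.String(), res.MemString())
}
```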
A comparison of allocation rates is included. If your application mostly writes into the map, you should go with plain map + RWMutex (or orcaman/concurrent-map). But if, for instance, you're using the map as a cache, reads will dominate and better read scalability becomes important. As an example, the Otter cache library uses a modified variant of xsync.Map, not a plain map + RWMutex.
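For reference, the "plain map + RWMutex" baseline mentioned here is just this pattern (a minimal sketch with assumed names, string keys and int values for illustration):

```go
package main

import (
	"fmt"
	"sync"
)

// RWMap wraps a plain Go map with a sync.RWMutex. Readers share
// RLock; writers take the exclusive lock, which is where read
// scalability eventually suffers under contention.
type RWMap struct {
	mu sync.RWMutex
	m  map[string]int
}

func NewRWMap() *RWMap {
	return &RWMap{m: make(map[string]int)}
}

func (r *RWMap) Load(k string) (int, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	v, ok := r.m[k]
	return v, ok
}

func (r *RWMap) Store(k string, v int) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.m[k] = v // overwrites reuse the existing slot: 0 B/op
}

func main() {
	c := NewRWMap()
	c.Store("hits", 1)
	v, _ := c.Load("hits")
	fmt.Println(v)
}
```

The write path serializes all writers behind one lock, which is exactly the contention that sharding (orcaman) or lock-free buckets (xsync) are designed to avoid.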
I focused on B/op because it was the only apparent weakness I saw. My “reuse” note was about allocation behavior, not false sharing. We’re talking about different concerns.
One thing these benchmarks don't capture well is the GC pressure difference at scale. In a long-running service processing millions of requests, the allocation pattern matters more than the per-op cost. xsync's 24 B/op per write adds up to real GC pauses when you have hundreds of goroutines hitting the map concurrently.
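One way to see that allocation pattern directly is to diff `runtime.MemStats.TotalAlloc` around a burst of writes; a rough sketch (the pointer-valued map here is an assumption chosen to force per-write heap allocations, standing in for a layout that allocates on write):

```go
package main

import (
	"fmt"
	"runtime"
)

// AllocatedBytes measures how many heap bytes a burst of map writes
// allocates. TotalAlloc is cumulative, and its growth rate is what
// drives GC frequency in a long-running service.
func AllocatedBytes(writes int) uint64 {
	m := make(map[int]*int, writes)
	var before, after runtime.MemStats
	runtime.GC()
	runtime.ReadMemStats(&before)
	for i := 0; i < writes; i++ {
		v := i      // escapes to the heap: one allocation per write
		m[i] = &v
	}
	runtime.ReadMemStats(&after)
	return after.TotalAlloc - before.TotalAlloc
}

func main() {
	fmt.Println(AllocatedBytes(100_000), "bytes allocated by 100k writes")
}
```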
In practice I've found that for most Go services the bottleneck isn't the map implementation at all: it's the serialization of the values going in and out. If you're storing anything more complex than a string or an int, the JSON/protobuf marshal dominates the profile. Worth profiling your actual workload before optimizing the map.
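The standard way to check that is `runtime/pprof`: capture a CPU profile around the real workload and see whether marshaling outweighs map access. A sketch under assumed names (`Payload`, `FillMap`, and the JSON values are illustrative):

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"runtime/pprof"
)

// Payload stands in for a typical value stored in the map.
type Payload struct {
	ID   int               `json:"id"`
	Tags map[string]string `json:"tags"`
}

// FillMap marshals n payloads and stores them; in many services the
// json.Marshal here, not the map write, dominates the CPU profile.
func FillMap(n int) map[int][]byte {
	m := make(map[int][]byte, n)
	for i := 0; i < n; i++ {
		b, err := json.Marshal(Payload{ID: i, Tags: map[string]string{"region": "eu"}})
		if err != nil {
			panic(err)
		}
		m[i] = b
	}
	return m
}

func main() {
	f, err := os.Create("cpu.prof")
	if err != nil {
		panic(err)
	}
	defer f.Close()
	if err := pprof.StartCPUProfile(f); err != nil {
		panic(err)
	}
	defer pprof.StopCPUProfile()

	m := FillMap(10_000)
	fmt.Println("stored", len(m), "entries; inspect with: go tool pprof cpu.prof")
}
```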
Allocation rates are also compared. Long story short, a vanilla map + RWMutex (or a sharded variant of it like orcaman/concurrent-map) is the way to go if you want to minimize allocations. On the other hand, if reads dominate your workload, using one of the custom concurrent maps may be a good idea.
There are multiple GH issues asking for a better sync.Map, and xsync.Map is mentioned among the alternatives. But the Go core team doesn't seem interested in improving sync.Map (or adding a generic variant of it).
orcaman/concurrent-map is a very straightforward implementation (just sharded RW locks and backing maps), but it fixes the number of shards at 32. I wonder what the benchmarks would look like with the shard count increased to 64, 128, and so on.
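Lifting that limit is a small change to the same design; a hypothetical sketch with a configurable power-of-two shard count (all names here are mine, not orcaman's API, and the hash choice of FNV-1a is an assumption):

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// ShardedMap is the same sharded-RWMutex design as
// orcaman/concurrent-map, but with a configurable shard count.
type ShardedMap struct {
	shards []*shard
	mask   uint32
}

type shard struct {
	mu sync.RWMutex
	m  map[string]int
}

// NewShardedMap expects n to be a power of two (32, 64, 128, ...)
// so the shard selection is a cheap bitmask instead of a modulo.
func NewShardedMap(n int) *ShardedMap {
	s := &ShardedMap{shards: make([]*shard, n), mask: uint32(n - 1)}
	for i := range s.shards {
		s.shards[i] = &shard{m: make(map[string]int)}
	}
	return s
}

func (s *ShardedMap) shardFor(k string) *shard {
	h := fnv.New32a()
	h.Write([]byte(k))
	return s.shards[h.Sum32()&s.mask]
}

func (s *ShardedMap) Store(k string, v int) {
	sh := s.shardFor(k)
	sh.mu.Lock()
	sh.m[k] = v // overwrite into a plain backing map: 0 B/op
	sh.mu.Unlock()
}

func (s *ShardedMap) Load(k string) (int, bool) {
	sh := s.shardFor(k)
	sh.mu.RLock()
	defer sh.mu.RUnlock()
	v, ok := sh.m[k]
	return v, ok
}

func main() {
	m := NewShardedMap(128) // try 64, 128, ... as suggested above
	m.Store("key", 42)
	v, _ := m.Load("key")
	fmt.Println(v)
}
```

More shards means less write contention per lock, at the cost of a larger fixed footprint; past the point where contention disappears, extra shards stop helping.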
Would be great to see that - there are multiple GH issues for that. But so far, I'm not convinced that Google prioritizes community requests over its own needs.
Idk why but I tend to shy away from non std libs that use unsafe (like xsync). I'm sure the code is fine, but I'd rather take the performance hit I guess.
Unsafe usage in the recent xsync versions is very limited (runtime.cheaprand only). On the other hand, your point is valid and it'd be great to see standard library improvements.
I don't write Go, but respect to the author for listing trade-off considerations for each of the implementations tested, and not just proclaiming their own library the overall winner.
Thanks. There are downsides in each approach, e.g. if you care about minimal allocation rate, you should go with plain map + RWMutex. So yeah, no silver bullet.
Pure overwrite workload (pre-allocated values):

  xsync.Map:              24 B/op, 1 alloc/op,  31.89 ns/op
  orcaman/concurrent-map:  0 B/op, 0 allocs/op, 70.72 ns/op

Real-world mixed (80% overwrites, 20% new keys):

  xsync.Map:              57 B/op, 2 allocs/op, 218.1 ns/op
  orcaman/concurrent-map: 63 B/op, 3 allocs/op, 283.1 ns/op