The primary reason why you should be using 3x or higher replication is read throughput (which makes it really only relevant for magnetic storage). If the data is replicated 1.6x then there are only 1.6 magnetic disk heads per file byte; replicate it 6x and there are 6 disk heads per byte. At ~15x replication it becomes cheaper to store on SSD with ~1.5x Reed-Solomon/erasure-code overhead, since SSD has ~10x the per-byte cost of HDD.
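The ~15x break-even point falls out of the cost ratios directly. A back-of-the-envelope sketch (the ratios are the rough figures from this comment, not measurements):

```python
# Break-even replication factor for HDD replication vs SSD + erasure coding.
# Cost ratios are the rough figures from the comment above, not measurements.
HDD_COST = 1.0      # per-byte cost of HDD (normalized)
SSD_COST = 10.0     # SSD is ~10x HDD per byte
EC_OVERHEAD = 1.5   # ~1.5x Reed-Solomon / erasure-code storage overhead

def hdd_replicated_cost(replication_factor):
    return HDD_COST * replication_factor

def ssd_erasure_cost():
    return SSD_COST * EC_OVERHEAD

# Find the replication factor where replicated HDD stops being cheaper.
r = 1.0
while hdd_replicated_cost(r) < ssd_erasure_cost():
    r += 0.5
print(r)  # -> 15.0: at ~15x replication, SSD + erasure coding wins on cost
```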
(there are also effects on the tail latency of both read and write, because in a replicated encoding you are less likely to be affected by a single slow drive).
(also, for insane performance which is sometimes needed you can mlock() things into RAM; the per-byte cost of RAM is ~100x the cost of HDD and ~10x the cost of SSD).
Everything you just said is on point, but I think that's an orthogonal thing to what the paper is going for. Hot data should absolutely have a fully-materialized copy at the node where operations are made, and an arbitrary number of readable copies can be materialized for added performance in systems that don't rely on strong consistency as much.
However, for cold data there really hasn't been (or at least I'm unaware of) any system that achieves the combined durability of 1.5x Reed-Solomon codes and 3x replication with such a small storage-cost penalty.
Like you said though, it's definitely not the thing you'd be doing for things that prioritize performance as aggressively as the use-cases you've suggested.
~1.5x Reed-Solomon is the default these days, again, unless you need read-throughput performance. It is awesome :)
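For concreteness, a quick sketch of where the ~1.5x figure comes from and how RS fault tolerance compares to replication (the (k, m) parameters here are illustrative choices, not from any particular deployment):

```python
# Reed-Solomon RS(k, m): data is split into k fragments plus m parity
# fragments; any k of the k+m fragments suffice to reconstruct, so any
# m simultaneous fragment losses are survivable.
# Parameters below are illustrative, not from any specific system.

def rs_overhead(k, m):
    return (k + m) / k       # storage overhead vs raw data size

def rs_tolerated_failures(m):
    return m                 # any m fragments can be lost

# RS(10, 5): 1.5x overhead, survives any 5 simultaneous fragment losses.
print(rs_overhead(10, 5), rs_tolerated_failures(5))   # -> 1.5 5

# Compare 3x replication: 3.0x overhead, survives only 2 lost copies.
print(3.0, 2)
```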
Also, these days the storage of the data doesn't have to be on the same machine that processes the data. A lot of datacenter setups have basically zero transfer cost (or, alternatively, all the within-DC transfer cost is in the CAPEX required to build the DC in the first place), ultra-low latency, and essentially unlimited bandwidth for any within-datacenter communication. This doesn't hold for dc1->dc2 communication, and it is very far from true over long-distance links.
One way to think about the above is that datacenters have become the new supercomputers of the IBM era - it's free and really fast to exchange data within a single DC.
Also2, this is completely independent of consistency guarantees. At best it relates to durability guarantees, but that's something I want from all storage solutions. And yes, properly done Reed-Solomon has the same durability guarantees as a plain old replicated setup.
Also, to the above also2: single-DC solutions are never really durable, as the DC can simply burn down or meet some other tragic end. You need geographic replication if your data cannot be accidentally lost without serious consequences (a lot of data actually can be lost, in particular if it is some kind of intermediate data that can be regenerated from the "source" with some engineering effort). This is not just a theoretical concern; I've seen "acts of God" destroy single-DC setups' data, at least partially. It is pretty rare, though.
I'm confused, as you don't seem to be replying to any point I've made...
> ~1.5x reed solomon is the default these days, again, unless you need read throughput performance
I'm not surprised that Reed-Solomon is the "default these days" given that it has existed since the 1960s, and that the most widely available and deployed open-source distributed filesystem is HDFS (which uses Reed-Solomon). However, I don't see how that is to be taken as a blind endorsement of it, especially given that the paper in reference explicitly compares itself to Reed-Solomon-based systems, including concerns regarding reconstruction costs, performance, and reliability.
> Also, these days the storage of the data doesn't have to be at the same machine that processes the data
Even though what you said here is correct, I don't see how it's relevant to the referenced paper, nor do I think anything I said implied I hold a contrary belief.
> Also2, this is completely independent of consistency guarantees
My comment about consistency referred only to the fact that you cannot "simply" spin up more replicas to increase read throughput: consistent reads often have to acquire a lock in systems that enforce stronger consistency. So your comments regarding throughput are not universally true; there are many systems where reads cannot be made faster this way, because they are bottlenecked by design.
> Properly done Reed-Solomon has the same durability guarantees as plain old replicated setup
This is not true unless the fragments themselves are replicated across failure domains, which you seem to address in your next comment with "you need geographic replication if your data cannot be accidentally lost without serious consequences". All of this, however, is directly addressed in the paper as well:
> The advantage of erasure coding over simple replication is that it can achieve much higher reliability with the same storage, or it requires much lower storage for the same reliability. The existing systems, however, do not explore alternative erasure coding designs other than Reed-Solomon codes. In this work, we show that, under the same reliability requirement, LRC allows a much more efficient cost and performance tradeoff than Reed-Solomon.
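To make the tradeoff in that quote concrete, here's a sketch of the reconstruction-cost difference between RS and LRC. The parameters are roughly the LRC(12,2,2) example I recall from the Azure LRC work; treat the exact numbers as illustrative:

```python
# Reads needed to reconstruct ONE lost fragment:
# - RS(k, m): must read k surviving fragments.
# - LRC(k, l, r): k data fragments split into l local groups, each with
#   its own local parity, plus r global parities; a single loss is
#   repaired from its local group alone.
# Parameters are illustrative (roughly the Azure-style LRC(12,2,2)).

def rs_overhead(k, m):
    return (k + m) / k

def lrc_overhead(k, l, r):
    return (k + l + r) / k

def rs_repair_reads(k):
    return k              # read any k surviving fragments

def lrc_repair_reads(k, l):
    return k // l         # rest of the local group + its local parity

# RS(12, 4) vs LRC(12, 2, 2): same ~1.33x overhead...
print(rs_overhead(12, 4), lrc_overhead(12, 2, 2))
# ...but LRC repairs a single loss with half the reads.
print(rs_repair_reads(12), lrc_repair_reads(12, 2))  # -> 12 6
```

Same storage overhead, half the reconstruction I/O for the common single-failure case: that's the kind of tradeoff Reed-Solomon alone can't express.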