I'm confused, as you don't seem to be replying to any point I've made...
> ~1.5x reed solomon is the default these days, again, unless you need read throughput performance
I'm not surprised that Reed-Solomon is the "default these days": it has existed since the 1960s, and the most widely available and deployed open-source distributed filesystem, HDFS, uses it. But I don't see how that amounts to a blind endorsement, especially given that the referenced paper explicitly compares itself to Reed-Solomon based systems, including their reconstruction costs, performance, and reliability.
> Also, these days the storage of the data doesn't have to be at the same machine that processes the data
What you said here is correct, but I don't see how it's relevant to the referenced paper, nor did I imply a contrary belief in anything I said.
> Also2, this is completely independent of consistency guarantees
My comment about consistency referred only to the fact that you cannot "simply" spin up more replicas to increase read throughput: on systems that enforce stronger consistency, reads often have to acquire a lock. So your comments about throughput are not universally true; in many systems, reads cannot be made faster this way because they are bottlenecked by design.
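To illustrate the point about lock-bound reads, here's a toy model of my own (the numbers and function names are made up for illustration, not taken from the thread or any system): under weak consistency any replica can serve a read, so throughput scales with replica count; under linearizable reads funneled through a lock/lease/quorum coordinator, adding replicas cannot push throughput past that bottleneck.

```python
def eventual_read_throughput(replicas, per_replica_qps):
    # Eventually-consistent reads can be served by any replica,
    # so aggregate throughput scales roughly linearly with replica count.
    return replicas * per_replica_qps

def linearizable_read_throughput(replicas, per_replica_qps, coordinator_qps):
    # Linearizable reads must go through a lock/lease/quorum coordinator,
    # so throughput is capped by that coordinator no matter how many
    # replicas you add.
    return min(replicas * per_replica_qps, coordinator_qps)

print(eventual_read_throughput(3, 10_000))              # 30000
print(eventual_read_throughput(6, 10_000))              # 60000 -- doubles with replicas
print(linearizable_read_throughput(3, 10_000, 25_000))  # 25000
print(linearizable_read_throughput(6, 10_000, 25_000))  # 25000 -- still capped
```

Doubling the replicas doubles the weakly-consistent number but leaves the linearizable one unchanged, which is exactly the "bottlenecked by design" case.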
> Properly done Reed-Solomon has the same durability guarantees as plain old replicated setup
This is not true unless the fragments themselves are replicated across failure domains, which you seem to concede in your next comment: "you need geographic replication if your data cannot be accidentally lost without serious consequences". All of this, however, is directly addressed in the paper as well:
> The advantage of erasure coding over simple replication is that it can achieve much higher reliability with the same storage, or it requires much lower storage for the same reliability. The existing systems, however, do not explore alternative erasure coding designs other than Reed-Solomon codes. In this work, we show that, under the same reliability requirement, LRC allows a much more efficient cost and performance tradeoff than Reed-Solomon.
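To make that tradeoff concrete, here's a back-of-the-envelope sketch (my own illustration; the specific parameters are hypothetical, not the paper's): a (k, m) Reed-Solomon code stores k data fragments plus m parity fragments, tolerating any m losses at (k+m)/k storage overhead, but rebuilding one lost fragment requires reading k survivors, whereas a locally repairable code only needs to read its (smaller) local group.

```python
def rs_overhead(k, m):
    """Storage overhead of a (k data, m parity) Reed-Solomon code."""
    return (k + m) / k

def rs_reconstruction_reads(k):
    """Fragments read to rebuild one lost fragment under plain RS."""
    return k

def lrc_reconstruction_reads(local_group_size):
    """Fragments read to rebuild one lost fragment inside an LRC local group."""
    return local_group_size

# Triple replication: 3.0x storage, tolerates 2 lost copies.
# Hypothetical RS(10, 4): tolerates any 4 lost fragments at far lower overhead.
print(rs_overhead(10, 4))            # 1.4x storage vs 3.0x for replication
print(rs_reconstruction_reads(10))   # 10 fragments read per repair
print(lrc_reconstruction_reads(5))   # e.g. 5 reads with local groups of size 5
```

This is why "same durability" isn't the whole story: the repair cost per failure differs sharply between the schemes, which is the axis on which the paper's LRC claims its advantage over Reed-Solomon.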