
It is almost never reasonable to assume normality and make calculations like this, particularly when you are dealing with lifespan, which isn't even remotely normally distributed. The actual ranges are likely smaller than you are stating here, and variance is just not a practical or interpretable metric for such a skewed distribution.

We should be stating something like a probability density interval (i.e. the actual range that 95% of age-related deaths fall within), and then reframing how much of that range genetic variation can explain, or something along those lines. As presented in the headline / takeaway, the heritability estimate is almost impossible to translate into anything properly interpretable.

https://biology.stackexchange.com/questions/87850/why-isnt-l...
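
As a rough, entirely synthetic illustration of why the symmetric interval misleads here (the gamma shape/scale and the cap at 92 are arbitrary choices of mine, not a mortality model):

    import numpy as np

    rng = np.random.default_rng(0)

    # Fake, left-skewed "age at death" data: a long tail of early deaths
    # bunched up against an upper limit. Purely illustrative.
    ages = 92 - rng.gamma(shape=2.0, scale=6.0, size=100_000)

    mu, sd = ages.mean(), ages.std()
    normal_95 = (mu - 1.96 * sd, mu + 1.96 * sd)             # what "mean +/- 2 SD" implies
    empirical_95 = tuple(np.percentile(ages, [2.5, 97.5]))   # where 95% of the deaths actually fall

    # The symmetric interval reaches past ages that never occur in the data and
    # puts all of the mass it misses into the early-death tail.
    print("normal-theory 95% interval:", normal_95)
    print("empirical 95% interval:    ", empirical_95)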





Since lifespan can't follow a power law distribution, I suspect the error in variance from assuming it IS normally distributed is far less than you're suggesting.

Like even if I'm off by a factor of 2, then only ~3 years are explainable by environment/exercise/diet/etc. OK, that's really not that bad an error in this context. It also feels a little low to me; I'd have guessed around 5-8 years based on my experience with healthcare and life.


> I suspect the error in variance from assuming it IS normally distributed is far less than you're suggesting. [...] Like even if I'm off by a factor of 2 [...]

You would be deeply mistaken. Robust statistics texts (e.g. Wilcox) are full of examples of distributions that have zero skew and are nearly indistinguishable by eye from a Gaussian, but where the differences in variance, and thus in the conclusions drawn, are profound. Heck, a sample from a Cauchy distribution doesn't look too bad, but its variance is not even defined (or effectively infinite and thus meaningless).
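
To make the Cauchy point concrete, here is a quick simulation (the seed and sample sizes are arbitrary):

    import numpy as np

    rng = np.random.default_rng(0)

    # The sample variance of Cauchy draws never settles down as n grows, because
    # the population variance does not exist: each bigger sample just picks up
    # more extreme outliers. The normal sample's variance converges as expected.
    for n in (10**3, 10**5, 10**7):
        cauchy = rng.standard_cauchy(n)
        normal = rng.standard_normal(n)
        print(f"n={n:>8}  var(Cauchy)={cauchy.var():>14.1f}  var(Normal)={normal.var():.3f}")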

And even if you have enough data that statistical issues are not a concern, the problem is that most summary metrics (effect sizes, heritability, etc.) are developed under the assumptions of near-normality AND minimal skew, so that the effect size can be interpreted as saying something about the overlap and/or positioning of the bulks of the distributions. But when skew and long tails are involved, the bulk itself is what gets distorted, making most such metrics largely uninterpretable (a toy simulation of this is below).

I.e. it isn't just that variance is hard to estimate accurately here; even if estimated accurately, variance isn't actually a meaningful metric here.
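
For example (synthetic data; the lognormal shift of 1.05 is just tuned so both comparisons land near d of about 0.5):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 200_000

    def cohens_d(a, b):
        pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
        return (b.mean() - a.mean()) / pooled_sd

    def p_superiority(a, b):
        # P(a random draw from b exceeds a random draw from a), estimated by
        # pairing off the two (independent) samples element-wise.
        return float(np.mean(b > a))

    # Two Gaussian groups half a standard deviation apart...
    g1, g2 = rng.normal(0, 1, n), rng.normal(0.5, 1, n)
    # ...and two heavily skewed (lognormal) groups shifted to give a similar d.
    s1, s2 = rng.lognormal(0, 1, n), rng.lognormal(0, 1, n) + 1.05

    # The standardized effect size comes out about the same in both cases, but
    # "how often is a draw from one group bigger than a draw from the other"
    # does not -- the variance-based summary stops tracking the overlap.
    print(f"gaussian: d={cohens_d(g1, g2):.2f}  P(superiority)={p_superiority(g1, g2):.2f}")
    print(f"skewed:   d={cohens_d(s1, s2):.2f}  P(superiority)={p_superiority(s1, s2):.2f}")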

The few metrics that do remain interpretable in such cases tend to be things like the HPDI (highest posterior density interval) in Bayesian methods, which look at the actual shape of the distribution and try to quantify the bulk in a sensible location. Likewise, meaningful effect sizes for skewed and long-tailed data need to actually take distribution overlap in meaningful regions into account. Heritability does not do this, as it is an explained-variance metric.
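
For reference, the sample-based version of that kind of interval is only a few lines (a minimal sketch; "hdi" is my own helper name, not any particular library's API):

    import numpy as np

    def hdi(samples, mass=0.95):
        # Narrowest interval containing `mass` of the samples: slide a window of
        # fixed count across the sorted data and keep the shortest one.
        x = np.sort(np.asarray(samples))
        n = len(x)
        k = int(np.ceil(mass * n))
        widths = x[k - 1:] - x[:n - k + 1]
        i = int(np.argmin(widths))
        return x[i], x[i + k - 1]

    # On skewed data this hugs the bulk instead of assuming symmetry around the mean.
    rng = np.random.default_rng(0)
    print(hdi(92 - rng.gamma(2.0, 6.0, 100_000)))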



