
Why would the simplest theory be the most likely to be true?


I think there's a little more foundation to it than just "Occam said it". I think the point is that the simplest theory is a) a lower bound and b) the most valuably falsifiable.

Take a theory and add a bit about a unicorn, or a dragon. Now you have a more complicated theory. You can always arbitrarily contrive a more complicated solution.

But the simplest? You've found a local minimum. And because it's more stable, you can compare it to competing theories without trivial refutation. See: the 3-5 theories of dark matter.


Occam's razor is not about the simplest theory being true, but about how from a simple theory you can create infinitely many more just by adding an element.

If you have infinitely many possible theories, the only logical thing to do is to stay with the simplest one until it can't explain the world anymore, and then choose the next simplest.

There is also, I think, an information-theoretic argument: when creating a theory you are trying to compress all the observations. The better the compression, the better the theory.
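A toy illustration of that compression view (purely illustrative, with made-up data and a made-up "law"): a good theory is a short description that regenerates the observations, so it beats listing them verbatim.

    # Toy sketch: a "theory" is a short description that regenerates the data.
    observations = [2 * i for i in range(1000)]   # hypothetical observations

    raw_cost = len(repr(observations))            # characters needed to list every observation
    law = "y(i) = 2*i for i = 0..999"             # a one-line "law" generating the same data
    law_cost = len(law)

    print(raw_cost, law_cost)  # the law is a far shorter encoding of the same facts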

I don't understand why people down-vote you, by the way.


That's a big topic, and the definition of simple is subtle. I recommend David Deutsch's book _The Beginning of Infinity_ for an accessible introduction to what makes one theory better than another when they both fit the observations.


One theory might be better than another when they both fit the observations, but you can't say which one is truer. For example, had Newton been presented with GR but allowed only his contemporary evidence, he should not conclude that classical gravitation is truer than GR on the grounds that he has no evidence for the additional complexity.


I think you are wrong. If he is following the scientific method, and he doesn't have evidence, he should 'conclude' that.

The key is that in science, you never 'conclude' (in the sense of finalizing) anything. Everything is provisional until new evidence contradicts your current understanding.


Volumes have been written about this topic (principle of least assumption, parsimonious representation, Occam's razor, etc.). It underlies all statistical learning theory and information theory of which science is a special case.


Fewer edge cases mean it's less likely to break if we discover a new particle or something else unexpected.


I guess that's a good overview. It's related to the phenomenon of overfitting in machine learning: you can always easily find a sufficiently complex (or complicated, or large, if you prefer) theory that fits all data points. Because such a theory simply encodes each observed case (with progressively elaborate encodings), you naturally expect it to fail on unobserved cases -- it makes no effort at generalization. Simpler theories have a greater chance of generalizing; they're more likely to be the "true" mechanism of the process rather than simply encoding "edge cases", as you put it, and are thus more likely to also work on unobserved data.
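A minimal sketch of that overfitting point, with made-up data and arbitrary polynomial degrees: the high-degree fit can drive training error to essentially zero by encoding every observed point, yet it typically does worse on points it hasn't seen, while the simple fit generalizes.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical "observations": a simple underlying law plus noise.
    x_train = np.linspace(0, 1, 10)
    y_train = 2 * x_train + rng.normal(0, 0.1, x_train.size)
    x_test = np.linspace(0, 1, 100)
    y_test = 2 * x_test

    for degree in (1, 9):
        coeffs = np.polyfit(x_train, y_train, degree)   # "theory" of the given complexity
        train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
        test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
        print(f"degree {degree}: train MSE {train_err:.5f}, test MSE {test_err:.5f}")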

Honestly, I haven't seen attempts at making this process more rigorous when applied to physics. There's a large corpus of machine learning research that provides concrete results and even concrete comparison tools, but the times I've asked a physicist I've been dismissed. It seems incredibly valuable in the face of scarce or very costly experimental data, which is quite relevant today.


> Honestly, I haven't seen attempts at making this process more rigorous when applied to physics.

Marcus Hutter has expressed this idea quite well ( https://arxiv.org/pdf/0912.5434 ) arguing that (a) smaller/simpler theories have more predictive power and (b) the "size" of a theory includes the complexity of its equations and the parameters needed to specify some result. The latter is important because some theories trade off between these two: e.g. a multiverse theory might have simple equations ("every possibility happens somewhere") but require very precise "coordinates" to pin-point the actual possibility that we observe.
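A toy sketch of that bit-accounting, with every number invented purely for illustration: score each candidate theory by the bits needed to write down its equations plus the bits of parameter/"coordinate" precision needed to single out what we actually observe, and prefer the smaller total.

    # Toy two-part description-length comparison (all numbers are made up).
    theories = {
        # name: (bits to state the equations, bits of parameters/"coordinates"
        #        needed to pin down our actual observations)
        "compact equations, modest parameters": (5_000, 2_000),
        "'every possibility happens somewhere'": (100, 50_000),
    }

    for name, (equation_bits, coordinate_bits) in theories.items():
        total = equation_bits + coordinate_bits
        print(f"{name}: {equation_bits} + {coordinate_bits} = {total:,} bits")

    # The shorter *total* description wins, even if one part alone looks simpler.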

Not sure if other physicists know of or take it seriously though.


Interesting. Apparently he needs to assume a particular multiverse theory to prove it. While I don't object to those in principle, I don't believe they're needed to prove heuristic, good-enough versions of Ockham's Razor that work in the real world (albeit without guarantees), based on the arguments outlined in the previous comments.

> The latter is important because some theories trade off between these two: e.g. a multiverse theory might have simple equations ("every possibility happens somewhere") but require very precise "coordinates" to pin-point the actual possibility that we observe.

I think this is an important observation that's quite obvious to ML researchers et al but again seems to escape current physics discussions. An example is the endless drama about "fine tuning": if your new theory requires many fewer bits for the equation description, the fact that it requires fine tuning is irrelevant as long as the additional model parameter precision uses fewer bits -- then it should be the preferred candidate.

W.r.t. [computational] multiverse theories (and variants such as Tegmark's MUH, Schmidhuber's, and others), I do believe they're an inevitable progression of physics/philosophy. I just think it's a bit pretentious to have any certainty about a particular flavor. I feel there's still much philosophical and mathematical ground to be covered; it tests the limits of our imagination. It seriously feels like a very important step for humanity at large though -- finally approaching metaphysical theories that actually make sense, and explain the basis of much about humanity, existence, ethics, etc. I think it's an important void to be filled after the decline of religion, hopefully in conjunction with the spread of humanism.


Decrease the number of variables and conditions in any physical system, say to something binary, and you end up with a 50/50 argument: at worst you can always fall back on trial and error, and at best you might discover something you didn't see before. Simple theories are better than "holistic" theories because they act as a starting point for many more testable simple theories. Knowledge in its most primitive form is ultimately somewhat binary, completely testable and scalable.


This is a good question, and one that I don't have a complete or rigorous answer to. But I can describe my intuition:

A theory is just a list of assertions. In the absence of any evidence, each assertion is just as likely to be true as false. Therefore, treating the assertions as independent, the probability that a theory with n assertions is entirely true is 2^-n. So the more assertions there are, the lower the probability.
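A back-of-the-envelope version of that argument, assuming the assertions really are independent and each a coin flip absent evidence:

    # Prior probability that all n independent 50/50 assertions of a theory hold.
    def prior(n_assertions: int) -> float:
        return 0.5 ** n_assertions

    for n in (1, 2, 5, 10):
        print(n, prior(n))   # 0.5, 0.25, 0.03125, ~0.00098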


Scientific theories are not likely to be true: they are either true enough to be taken seriously or false enough to be dead. Accumulation of evidence and knowledge makes scientific theories descend a ladder of wrongness that goes from unthinkable, to utterly ridiculous, to wrong but respectably clever, to somewhat grounded in reality, to good enough approximations for some purposes, to best in class but not perfect, to positively agreeing with all available evidence.

Given equally true theories, scientists look to other properties to establish theory quality, and simplicity is a philosophically important one: it represents the belief that natural laws should be as simple and elegant as evidence allows them to be.



