The One True Razor

Occam's Razor

Occam's Razor refers to the idea that the shortest explanation is often the best one. The razor is useful in a variety of domains - including science, AI and rational discourse.

There have been various attempts to formalise the idea. One of the first issues that comes up is: the shortest explanation in which language.

Formalising the razor

Formalising the razor is often done using the idea of Kolmogorov complexity. Kolmogorov complexity is used to quantify arbitarary informational entities. The Kolmogorov complexity of an object refers to the length of the shortest program which produces that object as its output.

However the idea merely transforms the problem of which language into the problem of which programming language.

Which language?

Various solutions have been proposed. One is to use FORTRAN 77. FORTRAN 77 is a common, small, simple, completely-specified programming language, that many people have heard of.

Another is to use some kind of simple Turing machine.

Often such choices are justified by the argument that interpreters can be coded concisely in these languages - so using another language only costs you the size of the interpreter.

This argument is OK - provided the entities you want to compare are large by comparison with the size of the interpreter. However, Occam's Razor is often used to distinguish between scientific theories and logical arguments - which are often themselves relatively small.

In such cases, the argument that the choice of language doesn't matter much (because you can switch languages easily by writing an interpreter) doesn't wash - and it becomes important to pick a sensible language in the first place.

Choosing a language

So: what is the best programming language to pick?

Several possible answers have been proposed:

One is to forget about programming languages - and use physical machines.

People can argue all day about which is the best programming language to measure complexity with - but physical machines are real concrete things, that execute god's programming language: the physical laws of the universe - surely you can't argue with that.

Unfortunately for this plan, we don't know what the physical laws of the universe are. Nor can we convincingly build very tiny machines. So, this metric is not very well defined today. Our experience of shrinking machines suggests that their capabilities change qualitatively as scale changes - e.g. consider quantum computers - so such a metric is not very likely to produce results that are stable as technology progresses. Possibly, its usefulness may improve in the future - but this doesn't help us much now.

Another is to forget about the actual laws of physics, and use an abstract version of them - such as a three-dimensional reversible universal cellular automaton (RUCA).

Firstly, note that an abstract machine produces quite different estimates of the complexity of some things. Essentially, a physical machine has experimental access to the laws of physics - and their associated physical constants - while the abstraction does not. If you ask a physical machine to describe what happens when a gold atom smashes into a lead crystal, it can experimentally execute that scenario and watch the results, while an abstract simulation can't do that. Since many physical constants appear to be complex and incompressible, there can be big differences in the resulting complexities.

Defining what the laws of physics are deals with the issue of not knowing what they are - while introducting a certain level of arbitrariness and unreality. However, the next issue is which cellular automaton to pick.

This is not a trivial issue. In some cellular automata, machines can be constructed to expand when they require more workspace (or memory). In other cellular automata this is impossible: any workspace which a machine can use must be built into its initial configuration - or it remains forever inaccessible. Obviously this can make a big difference to the size of the initial configuration required to produce a specified output. Nor is it obvious which type most closely represents the laws of physics - mass-energy can apparently be created in theory (e.g. by inflation). However, this seems to be pretty difficult in practice.

At this point, our tortuous journey to find the one true razor seems to have got bogged down.

The one true razor

The truth is that there is no one true razor. Just as Kolmogorov complexity is language dependent, so Occam's Razor is description-dependent.

There really are different versions of Occam's Razor - and different versions are appropriate under different circumstances.

If you want to know if bitstring X is more probable that bitstring Y the answer depends on how those strings were produced - not simply on the properties of the strings themselves.

Putting it this way makes the result seem pretty obvious - but it seems that the quest for the one true razor is never short of knights eager to pursue it.

References


Tim Tyler | Contact