Sailing the ship of state
![]()
In fanciful moments I like to think of a running program as a little spaceship following a trajectory through the computer’s “state space”–the space of possible values of each of its memory locations. This is a big space (2 to the power of the number of bits of memory in the machine) so easy to get lost in. The problem of writing a correct program is how to keep the spaceship on course.
If the ship is on course it’s easy to move it in the right direction. If it’s gotten off course, we have to figure out where it is and make some correction to get it back on course. If we don’t realize it’s gotten off course we’ll move it in what we think is the right direction, and probably make things worse. If we get too far off course we might not be able to recover at all. We’re better off staying on course in the first place.
In programming terms, staying on course corresponds to maintaining an “invariant” on the state, something you expect always to be true about it. Another way to put it is that an invariant rules out chunks of the state space–call them no-fly zones. A simple invariant might be that a list is sorted. States where the list is not sorted are ruled out, so you don’t have to worry about recovering from that situation.
Good programming languages can help by maintaining some simple invariants for you, and, for more complex invariants, making it easier to convince yourself that your code maintains them.
If you’re programming in C or C++, anything at all can happen, so the whole state space is your territory. If you overrun a stack-allocated array or dereference a bad pointer, it’s like pressing the hyperspace button in Asteroids. Who knows where your ship will end up?
In a more civilized language (say, Python or Ruby), you can’t cause a segfault or overwrite arbitrary memory like you can in C or C++; all points in the state space with bad stacks or invalid pointers are ruled out. You can rely on the invariant that any variable holds a valid Python (or Ruby) object. However, you don’t know what type of object; most any operation can raise a runtime exception because you passed it objects of the wrong type.
In a pinnacle-of-civilization language like OCaml, you get the same invariant as Python, and also the invariant (enforced by the typechecker) that any variable holds a value of the same type as the variable. All states where values of the wrong type might be passed to an operation are ruled out.
This stronger invariant comes at very low cost to the programmer: types are inferred, so you hardly ever have to write them down. Compare to Java, where the heavyweight type annotations tend to make people want to become Python programmers instead.
Moreover, OCaml’s type system allows you to express more detailed invariants than does Java’s. A pet example is null pointers: in Java, a variable of some compound type (not a base type) contains either an object of that type or the null pointer. Just as in Python, most any operation can raise a runtime exception because you passed it a null pointer. In OCaml, for each type there is a separate “option” type which allows null pointers; if you have a value of non-option type you can be sure that it is not null.
Of course, you can’t expect the programming language to do everything. The invariant you have in mind when you’re writing a program can be very complex; it isn’t something the compiler can figure out. Still, the language can make it easier for you to reason about invariants on your own.
The main tool for reasoning of any kind is to break the problem into smaller pieces. In programming we accomplish this with modules (or classes, or abstract datatypes). The state inside a module is private, so we can reason about its invariant without worrying about code outside the module.
Modularity in C++ is a sham: you can declare a field to be private, but that won’t stop anyone from overwriting its memory. Python and Ruby take a lax attitude, providing classes but making it easy to circumvent their interfaces. OCaml provides a strong module system, along with excellent facilities for building larger modules out of smaller ones.
It is sometimes said that languages with static type systems are restrictive, that they don’t provide the freedom that programmers need to get the job done. But getting the job done means making software that works; total freedom in programming is the freedom to make any mistake. We should instead be humble programmers1, and use languages that keep us from getting lost in space.
- See Edsger Dijkstra’s Turing Award lecture on The Humble Programmer.
Hey Jake, nice article. You just pushed OCaml to the top of my “programming languages to learn” list.
Why, the state space is not *that* big, it’s just a 1 x 1 x … x 1 order-(2^32)-hypercube. It has unit volume as well!
Of course I agree with you on the rest. But perhaps keeping the space analogy that should be “Hubble programmers”?
Nice to read a sane summary of those languages. It’s the conclusion I’ve been coming to myself.
[...] love OCaml at Skydeck, as Jake has written about before, and we often hear from programmers with questions and comments about it. Some love OCaml, [...]
I’m curious to see a comparison of OCaml versus Oberon? Oberon also imposes very, very strong interface boundaries and I’ve enjoyed this language immensely. Currently, I hack in Haskell (see http://www.falvotech.com/content/cut/3), and I’ve had similar experiences with Haskell too.
i mock your current so-called higher-order languages with my dependently typed one(s)!

Excellent post. We are ourselves building a finance app in Scala (which is an excellent language with great FP support), but I have to say that I somehow envy you hacking away in OCaml.