OCaml for the recovering Java programmer, part 1: objects and subtyping
August 21, 2007 | 6 Comments

It’s said that the fox knows many tricks, but the hedgehog knows one big trick. If Java is the hedgehog, with objects as its one big trick, then OCaml is the fox, with lots of different tools for structuring code. Many of the things you’d use objects for in Java have simpler, cleaner, or safer alternatives in OCaml: tuples and records for structuring data, higher-order functions in place of one-method anonymous inner classes, parametric polymorphism for collections instead of pervasive downcasts (although Java has improved here with the introduction of generics), and functors and signatures in place of (compile-time) parameterization of code with interfaces.
Nonetheless, sometimes you want objects—as I did recently when interfacing with some object-oriented native code—and you can get them in OCaml too (objects are of course the O). But they aren’t quite the objects that you’re used to in Java. In Java, you can put two objects with a common superclass into a single List. I tried to do that in OCaml and got a mysterious type error. It took me some time, a little research, and a little profanity, but I got my code working and learned some things.
One difference between Java and OCaml is nominal vs. structural subtyping. In Java, one class is a subclass of another only if you declare it to be so (e.g. Cat extends Pet); what matters is the names of the classes involved (hence “nominal”). In OCaml what matters is the set of methods that an object supports (its structure, hence “structural”); if you declare classes pet with a legs:int method and cat with legs:int and snooty:bool methods, then cat is a subtype of pet even though you have declared no relationship between them (as with “duck typing”, but statically checked).
A second difference is that in Java subtyping coercions happen automatically, while in OCaml you must request them explicitly with the :> operator. In Java you can write
Pet p = new Cat();
while in OCaml you must write
let (p : pet) = (new cat : cat :> pet)
(In many cases you may omit the : cat part; see the manual for precise details.)
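To make both differences concrete, here is a minimal sketch. The class bodies are my own assumption (the post only names the methods legs and snooty), and no inheritance is declared between the two classes:

```ocaml
(* Hypothetical classes: only the method names come from the text above. *)
class pet = object
  method legs = 4
end

(* cat does not inherit from pet; it just happens to have a legs method too. *)
class cat = object
  method legs = 4
  method snooty = true
end

(* The coercion is accepted because cat has every method of pet. *)
let p : pet = (new cat :> pet)

(* After the coercion only the pet methods are visible: p#snooty would not type-check. *)
let () = Printf.printf "%d\n" p#legs
```

The explicit :> is the only place where the relationship between the two types is stated.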
This is why I couldn’t put my cats and pets in the same list. There’s only one type variable in 'a list; it can be instantiated with cat or pet but not both simultaneously. However, you can explicitly coerce the cats to pets and put them all in a pet list, and that’s what I ended up doing.
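A sketch of that workaround, again assuming hypothetical pet and cat classes with the methods described earlier (the names a_pet, a_cat, and total_legs are mine):

```ocaml
(* Hypothetical classes, as described in the text. *)
class pet = object
  method legs = 4
end

class cat = object
  method legs = 4
  method snooty = true
end

let a_pet = new pet
let a_cat = new cat

(* [a_pet; a_cat] is rejected: the element type cannot be both pet and cat. *)
(* Coercing the cat explicitly gives every element the type pet. *)
let pets : pet list = [a_pet; (a_cat :> pet)]

(* Anything that only needs the pet methods now works on the whole list. *)
let total_legs = List.fold_left (fun acc p -> acc + p#legs) 0 pets
```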
But hold on, this sounds kind of annoying. The whole point of subtyping is subsumption, the ability to pass a cat to a function expecting a pet. It would be a pain if you had to explicitly coerce the cat to a pet. In fact you don’t need the coercion in OCaml when making a function call, but the way this is accomplished is completely different from Java. In Java, the argument object is implicitly coerced to the supertype at the function call site; in OCaml the function is polymorphic, and a “row” variable is instantiated at the function call site.
Before we explain row variables, let’s review ordinary parametric polymorphism. Consider the identity function in OCaml:
let id o = o
The type of id is 'a -> 'a, where 'a is a type variable which may be instantiated with any type. If you write id 3, 'a is instantiated with int, and the type of id at this call is int -> int. So the type of the result is int (and in general will always be the same as the type of the argument).
Contrast with this similar Java function (leaving aside generics):
Object id(Object o) { return o; }
The type of this function (in OCaml terms) is Object -> Object. If you pass it an Integer it is implicitly coerced to Object at the call site. The type of the result is always Object no matter what the argument type is.
Now consider the following function:
let print_legs o = print_int o#legs; o
You can see that whatever o is, it must have a legs:int method. And because o is returned, the result type should be the same as the argument type.
Typing this into the top level shows the type of print_legs to be (<legs:int; ..> as 'a) -> 'a. The syntax .. indicates an anonymous row variable, which may be instantiated with any collection of methods, such as the empty collection, or foo:float; bar:(unit -> unit), or snooty:bool. The syntax as 'a names the entire argument type so it can be referred to in the result type.
Say we want to pass a cat to print_legs. The type cat is an abbreviation for <legs:int; snooty:bool>. The anonymous row variable may be instantiated with snooty:bool, giving print_legs the type <legs:int; snooty:bool> -> <legs:int; snooty:bool>, or equivalently cat -> cat. Or we can pass a pet by instantiating the row variable with the empty row, giving print_legs the type <legs:int> -> <legs:int>, or pet -> pet.
It’s important that the argument is not coerced; rather the row variable is instantiated at the call site to match the argument. So the result has the same type as the argument, just as with the identity function above. The Java equivalent:
Pet print_legs(Pet p) { System.out.print(p.legs()); return p; }
always returns a Pet even if you pass it a Cat.
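Here is a small sketch of the OCaml side, assuming the same hypothetical pet and cat classes as before. Because the result keeps the argument's type, a cat-only method can still be called on what comes back:

```ocaml
(* Hypothetical classes, as described in the text. *)
class pet = object
  method legs = 4
end

class cat = object
  method legs = 4
  method snooty = true
end

(* Inferred type: (< legs : int; .. > as 'a) -> 'a *)
let print_legs o = print_int o#legs; o

(* Row variable instantiated with snooty:bool; no coercion is written. *)
let c = print_legs (new cat)

(* The result is still a cat, so snooty is available. *)
let is_snooty = c#snooty

(* Row variable instantiated with the empty row. *)
let p = print_legs (new pet)
```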
One last thing: suppose we need to coerce a cat list (or some other parametric type containing cats). Clearly cat list should be a subtype of pet list (we are spared Java’s ArrayStoreException here because lists are immutable). But when the definition of a type 'a t is hidden, OCaml does not know how 'a t should vary with 'a. If 'a t = 'a list then it should be covariant (cat list is a subtype of pet list because cat is a subtype of pet); if 'a t = 'a -> unit then it should be contravariant (a pet -> unit function may be used anywhere a cat -> unit function is expected). For a type whose definition is visible, such as list, the compiler infers the variance itself; for an abstract type, OCaml lets us declare the variance of a type variable: type +'a t for covariance, type -'a t for contravariance. (Not every abstract type in the standard library carries a variance declaration, but you can add them for your own types.)
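A sketch of both cases follows. The Box module is purely hypothetical, invented here to show a variance declaration on an abstract type; the pet and cat classes are the same assumed ones as before:

```ocaml
(* Hypothetical classes, as described in the text. *)
class pet = object
  method legs = 4
end

class cat = object
  method legs = 4
  method snooty = true
end

(* The definition of list is visible, so its covariance is inferred. *)
let cats : cat list = [new cat; new cat]
let pets = (cats : cat list :> pet list)

(* Hypothetical abstract container; the + in the signature declares covariance. *)
module Box : sig
  type +'a t
  val make : 'a -> 'a t
  val get : 'a t -> 'a
end = struct
  type 'a t = { contents : 'a }
  let make x = { contents = x }
  let get b = b.contents
end

let cat_box : cat Box.t = Box.make (new cat)

(* Accepted only because of the + above; without it Box.t would be treated as invariant. *)
let pet_box = (cat_box : cat Box.t :> pet Box.t)
```

The compiler checks the declaration against the implementation, so declaring + on a type whose definition is not actually covariant (one with a mutable field, say) is rejected.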
I don’t expect that I’ll need the OCaml object system much but it’s nice to understand better how it works.
Posted by: Jake
6 Responses to “OCaml for the recovering Java programmer, part 1: objects and subtyping”
Michael Neumann on August 22nd, 2007 2:16 am
There is a German (?) story for children about rabbits and hedgehogs. They run a foot race. Of course the rabbit thinks he will win, because he can run faster. But the hedgehog is actually more intelligent, because his woman-hedgehog waits at the goal, and as the rabbit cannot distinguish her from her husband, the rabbit thinks that the hedgehog has won the race.
This proves that hedgehogs are more intelligent than rabbits. So I’d say, Java is a rabbit ;-). And Ruby is a hedgehog ;-)
Paul Butler on August 22nd, 2007 7:57 am
Good article.
As someone who is learning OCaml after learning imperative programming (mostly in Java), I am still not really comfortable with the idea of duck typing in a statically typed language, but the idea is growing on me, and this article helped with that.
Samuel A. Falvo II on August 22nd, 2007 12:06 pm
This article actually only confuses me further. Although I’m not coding in OCaml, I do code in Haskell, and I fail utterly to see the value of duck typing. In a sense, the closest I can think of is that it is “automatically generated type classes.”
But this is pretty bad — I can think of plenty of animals that have legs, but which are NOT pets, and therefore, should NEVER appear in a list of pets. Yet, by this article’s contents, it is not possible to enforce this constraint with OCaml’s static type system.
Haskell’s typeclasses are explicitly defined, and therefore give the programmer a much finer grain of control over what people can do, or perhaps more importantly, cannot do, with their types.
Ivan Jager on September 5th, 2007 9:39 am
It may be worth mentioning the subclass syntax. Rather than saying you could instead say #pet . This is especially useful when you have more complicated classes.
Ivan Jager on September 5th, 2007 9:41 am
That should be, “Rather than saying <legs:int; ..> you could instead say #pet.”
(I wasn’t sure how this thing would treat angle brackets.)
steve on September 29th, 2008 8:48 am
The main problem with OCaml is the lack of libraries to do various things. But take a look at Microsoft’s F#. Recently, F# became a first-class citizen of the .NET family of languages. F# compiles OCaml, and provides access to ALL of the .NET libraries, making it powerful. With full Visual Studio integration and scripting capabilities, there is nothing stopping it.