Note: If you are reading this on github.com, please read it on my blog, instead; it’s much nicer.
I was pleasantly surprised to discover that OCaml has been supporting modules as first-class objects since v3.12 (2011). Intuition suggests that first-class modules should be expressive enough to simulate Haskell-style typeclasses in OCaml. Turns out this is the route taken by Leo White et al to introduce adhoc polymorphism via Modular Implicits to OCaml. White et al’s paper describes their principled approach in detail, and is definitely worth a read. In this blog, I present a series of OCaml examples involving (first-class) modules that gradually build up to modular implicits. The full code (with directions to run) is available here.
First lets recall the basics of modules and module types
(signatures) in OCaml. Lets define a module type T1
:
module type T1 = sig
type a
val to_string: a -> string
val eg: a
end
Lets also define a couple of modules of type T1
:
module IntT1 : (T1 with type a = int) = struct
type a = int
let to_string = string_of_int
let eg = 0
end
(* module IntT1 : sig type a = int val to_string : a -> string end *)
module BoolT1 : (T1 with type a = bool) = struct
type a = bool
let to_string = string_of_bool
let eg = false
end
Below is a functor that creates a module of type T1
.
module XT1(X:sig
type t
val to_string: t -> string
val eg: t
end) : (T1 with type a = X.t) = struct
type a = X.t
let to_string = X.to_string
let eg = X.eg
end
The type of XT1
is:
module XT1 :
functor (X : sig type t val to_string : t -> string end) ->
sig type a = X.t val to_string : a -> string end
OCaml has first class modules. A first-class module is created by “packing” a module with a signature. Done using the “module” keyword.
let i = (module IntT1: T1)
let b = (module BoolT1: T1)
It is possible to create a list of first-class modules, each encapsulating a different type:
let mod_list = [i; b]
However, i and b cannot be used as modules themselves; they need to be unpacked first. Unpacking is done using the “val” keyword:
module I = (val i : T1)
module B = (val b : T1)
Note that the information that type a
in I
is int is lost while
packing it into a module of type T1
. Thus I.eg
is an abstract value
of type I.a
, as confirmed by utop:
utop # I.eg;;
- : I.a = <abstr>
Likewise, B.eg
is an <abstr>
value of type B.a
.
Alternatively, we could’ve packed IntT1
into a module of type (T1
with type a = int)
, but that makes it impossible to create [i;b]
.
It would have been clear by now that first-class modules are simply
variables with existential types. We pack a module to introduce an
existential type, and unpack it to eliminate the existential.
Like the values of other types, values of existential types are
first class. The t2_of_t1
function shown below takes a value of
type module T1
, which is an existential type, and returns a value
of type module T2
, which is another existential type:
module type T2 = sig
include T1
val print: a -> unit
end
(** val t2_of_t1: (module T1) -> (module T2) *)
let t2_of_t1 (x:(module T1)) =
let module X = (val x : T1) in
let module Y = struct
include X
let print a = print_string @@ to_string a;;
end in
let y = (module Y: T2) in
y
Unpacking an existential (i.e., conversion from a first-class
module to a normal module) is done automatically by OCaml during
pattern matches, so t2_of_t1
can be simplified as following:
let t2_of_t1 (module X: T1) =
let module Y = struct
include X
let print a = print_string @@ to_string a;;
end in
let y = (module Y: T2) in
y
We can now construct modules at run-time:
module I2 = (val (t2_of_t1 i):T2)
This feature lets us choose between various implementations of a signature at run-time, depending on the execution parameters. Another big advantage of first-class modules is that it helps us achieve qualified/bounded/adhoc polymorphism, and lets us obtain Haskell-style typeclasses in ML. For instance, lets consider a SERIALIZABLE signature that characterizes all serializable types:
module type SERIALIZABLE = sig
type t
val to_string: t -> string
val of_string: string -> t
end
(* For convenience *)
exception ConversionError
Abstract type t
is supposed to be instantiated with any concrete
serializable type. Base types such as int, float etc., are obviously
serializable:
module SInt : SERIALIZABLE with type t = int = struct
type t = int
let to_string = string_of_int
let of_string = int_of_string
end
module SFloat : SERIALIZABLE with type t = float = struct
type t = float
let to_string = string_of_float
let of_string = float_of_string
end
Polymorphic types like lists are serializable iff their type parameters are serializable:
module SList(A: SERIALIZABLE)
: (SERIALIZABLE with type t = A.t list) = struct
type t = A.t list
let to_string l = "["^(String.concat ";" @@
List.map A.to_string l)^"]"
let of_string s =
let l = String.length s in
let _ = if l>=2 && s.[0]='[' && s.[l-1]=']' then ()
else raise ConversionError in
let mid = String.sub s 1 (l-2) in
let tokens = Str.split (Str.regexp ";") mid in
let join p s = match p with "" -> s | _ -> p^";"^s in
let els = List.rev @@ fst @@ List.fold_left
(fun (els,p) s ->
try
let el = A.of_string @@ join p s in
(el::els,"")
with ConversionError -> (els, join p s))
([],"") tokens in
els
end
An aside: we implemented a serializable list as a functor above. An alternative approach is to implement it as a function. The second approach is more general than the first because functions can have polymorphic types but functors cannot (in OCaml).
(*
* val mk_SList :
* (module SERIALIZABLE with type t = 'a) ->
* (module SERIALIZABLE with type t = 'a list)
*)
let mk_SList (type a)
(module A : SERIALIZABLE with type t = a) =
let module SList = struct
module S = SList(A)
include S
end in
(module SList : SERIALIZABLE with type t = a list)
We now have multiple to_string
and of_string
functions from
SInt, SFloat, SList, and their compositions. Using first-class
modules, we can write to_string
and of_string
wrappers for
these functions that accept module parameters:
let to_string (type a)
(module S: SERIALIZABLE with type t = a) = S.to_string
let of_string (type a)
(module S: SERIALIZABLE with type t = a) = S.of_string
let "2" = to_string (module SInt) 2
let 3 = of_string (module SInt) "3"
let "[2;3]" = to_string (module SList(SInt)) [2;3];;
let [2;3] = of_string (module SList(SInt)) "[2;3]";;
let "[[2;3];[4;5]]" = to_string (module SList(SList(SInt))) [[2;3];[4;5]];;
let [[2;3];[4;5]] = of_string (module SList(SList(SInt))) "[[2;3];[4;5]]";;
Observe that in the above examples, single to_string
/of_string
function is being used for various SERIALIZABLE types. This is
exactly the behavior that typeclasses offer in Haskell! There is,
however, one significant difference. While GHC infers typeclasses
automatically, here we have to explicitly pass the module
parameters to to_string
/of_string
functions. This seems
redundant, given that such module parameters can be inferred from
the context. For example, when the to_string
function is applied
on [[2;3];[4;5]]
, the only module parameter that lets this
expression typecheck is SList(SList(SInt))
, which must be
inferable. This is infact the sort of reasoning performed by
Modular Implicits (Leo White et al) OCaml extension, which lets us
mark module parameters to a function as implicit, and later elide
them when applying the function. This is demonstrated below:
We first mark the modules that serve as typeclass instances with the
keyword implicit
:
implicit module SInt : SERIALIZABLE with type t = int = struct
type t = int
let to_string = string_of_int
let of_string = int_of_string
end
implicit module SFloat : SERIALIZABLE with type t = float = struct
type t = float
let to_string = string_of_float
let of_string = float_of_string
end
Functors can also be marked implicit, as long as their arguments are implicit (marked via the curly braces).
implicit module SList{A: SERIALIZABLE}
: (SERIALIZABLE with type t = A.t list) = struct
type t = A.t list
let to_string l = "["^(String.concat ";" @@
List.map A.to_string l)^"]"
let of_string s =
let l = String.length s in
let _ = if l>=2 && s.[0]='[' && s.[l-1]=']' then ()
else raise ConversionError in
let mid = String.sub s 1 (l-2) in
let tokens = Str.split (Str.regexp ";") mid in
let join p s = match p with "" -> s | _ -> p^";"^s in
let els = List.rev @@ fst @@ List.fold_left
(fun (els,p) s ->
try
let el = A.of_string @@ join p s in
(el::els,"")
with ConversionError -> (els, join p s))
([],"") tokens in
els
end
Lastly, we mark the module parameters of functions (for which adhoc polymorphism is intended) as implicit (again using curly braces):
let to_string (type a)
{S: SERIALIZABLE with type t = a} = S.to_string
let of_string (type a)
{S: SERIALIZABLE with type t = a} = S.of_string
Module parameters can now be elided:
let "2" = to_string 2
let 3 = of_string "3"
let "[2;3]" = to_string [2;3];;
let [2;3] = of_string "[2;3]";;
let "[[2;3];[4;5]]" = to_string [[2;3];[4;5]];;
let [[2;3];[4;5]] = of_string "[[2;3];[4;5]]";;
Observe that to_string
and of_string
do not take module
parameters. The code behaves as if both the functions are part of a
typeclass, for which int, float and list instances exist.
So, to conclude, First-class modules + implicits = typeclasses in ML!.