Before starting to write anything, we must know how to find if a code is allocating. The best way currently is to look at the Cmm intermediate representation. We can see it by calling `ocamlopt` with the `-dcmm` option:
`ocamlopt -c -dcmm test.ml`
let f x = (x,x)
Some excerpt from the output:
(function camlTest__f_4 (x_6/1204: val) (alloc 2048 x_6/1204 x_6/1204))
To improve readability, in this post we will clean a bit the variable names:
(function f (x: val) (alloc 2048 x x))
We see that the function f (named `camlTest__f_4`) is calling the `alloc` primitive, which obviously is an allocation. Here, this creates a size 2 block with tag 0 (2048 = 2 << 10 + 0) and containing two times the value `x_6/1204` which was `x` is the source. So we can detect if some code is allocating by doing `ocamlopt -c -dcmm test.ml 2>&1 | grep alloc` (obviously any function or variable named alloc will also appear).
It is possible to write some code that don’t allocate (in the heap) at all, but what are the limitations ? For instance the omnipresent fibonacci function does not allocate:
let rec fib = function | 0 -> 0 | 1 -> 1 | n -> fib (n-1) + fib (n-2)
(function fib (n: val) (if (!= n 1) (if (!= n 3) (let Paddint_arg (app "fib" (+ n -4) val) (+ (+ (app "fib" (+ n -2) val) Paddint_arg) -1)) 3) 1))
But quite a lot of common patterns do:
* Building structured values will allocate (tuple, records, sum types containing an element, …)
* Using floats, int64, … will allocate
* Declaring non-toplevel functions will allocate
Considering that, it can appear that it is almost impossible to write any non-trivial code without using those. But that’s not completely true.
There are some exceptions to those rules, where some normally allocating constructions can be optimised away. We will explain how to exploit them to be able to write some more code.
Maybe the most important one is the case of local references.
let fact n = let result = ref 1 in for i = 1 to n do result := n * !result done; !result
To improve readability, this has been cleaned and demangled
(function fact (n: val) (let (result 3) (seq (let (i 3 bound n) (catch (if (> i bound) (exit 3) (loop (assign result (+ (* (+ n -1) (>>s result 1)) 1)) (assign i (+ i 2)) (if (== i bound) (exit 3) ) with(3) )))) result)))
You can notice that allocation of the reference disappeared. The modifications were replaced by assignments (the `assign` operator) to the result variable. This transformation can happen when a reference is never used anywhere else than as an argument of the ! and := operator and does not appear in the closure of any local function like:
let counter () = let count = ref 0 in let next () = count := !count + 1; !count in next
This won’t happen in this case since count is in the closure of next.
The float, int32, int64 and nativeint types do not fit in the generic representation of values that can be stored in the OCaml heap, so they are boxed. This means that they are allocated and there is an annotation to tell the garbage collector to skip their content. So using them in general will allocate. But an important optimization is that local uses (some cases that obviously won’t go in the heap) are ‘unboxed’, i.e. not allocated.
Some 4.03 change also improve some cases of branching returning tuples
let positive_difference x y = let (min, max) = if x < y then (x, y) else (y, x) in max - min
(function positive_difference (x: val y: val) (catch (if (< x y) (exit 7 y x) (exit 7 x y)) with(7 max min) (+ (- max min) 1)))
You can do almost any control flow like that, but this is quite
unpractical and is still limited in many ways.
If you don’t want to write everything as for and while loops, you can
write functions for your control flow, but to prevent allocation you
will have to refrain from doing a few things. For instance, you should
not pass record or tupple as argument to functions of course, you
should pass each field separately as a different argument.
But what happens when you want to return multiple values ? There is
some ongoing project to try to optimise the allocations of some of
those cases away, but currently you can’t. Really ? NO !
Returning multiple values
If you bend a bit your mind, you may see that returning from a
function is almost the same thing as calling one… Or you can make it
that way. So let’s transform our code in ‘Continuation Passing Style’
For instance, let’s write a function that finds the minimum and the maximum of a list. That could be written like that:
let rec fold_left f init l = match l with |  -> init | h :: t -> let acc = f init h in fold_left f acc t let keep_min_max (min, max) v = let min = if v < min then v else min in let max = if v > max then v else max in min, max let find_min_max l = match l with |  -> invalid_arg "find_min_max" | h :: t -> fold_left keep_min_max (h, h) t
Continuation Passing Style
Transforming it to continuation passing style (CPS) replace every function return by a tail-call to a function representing ‘what happens after’. This function is usually called a continuation and a convention is to use the variable name ‘k’ for it.
Let’s start simply by turning only the keep_min_max function into continuation passing style.
let keep_min_max (min, max) v k = let min = if v < min then v else min in let max = if v > max then v else max in k (min, max) val keep_min_max : 'a * 'a -> 'a -> ('a * 'a -> 'b) -> 'b
That’s all here. But of course we need to modify a bit the function calling it.
let rec fold_left f init l = match l with |  -> init | h :: t -> let k acc = fold_left f acc t in f init h k val fold_left : ('a -> 'b -> ('a -> 'a) -> 'a) -> 'a -> 'b list -> 'a val find_min_max : 'a list -> 'a * 'a
Here instead of calling f then recursively calling fold_left, we prepare what we will do after calling f (that is calling fold_left) and then we call f with that continuation. find_min_max is unchanged and still has the same type.
But we can continue turning things in CPS, and a full conversion would result in:
let rec fold_left_k f init l k = match l with |  -> k init | h :: t -> let k acc = fold_left_k f acc t k in f init h k val fold_left_k : ('a -> 'b -> ('a -> 'c) -> 'c) -> 'a -> 'b list -> ('a -> 'c) -> 'c let keep_min_max_k (min, max) v k = let min = if v < min then v else min in let max = if v > max then v else max in k (min, max) val keep_min_max_k : 'a * 'a -> 'a -> ('a * 'a -> 'b) -> 'b let find_min_max_k l k = match l with |  -> invalid_arg "find_min_max" | h :: t -> fold_left_k keep_min_max (h, h) t k val find_min_max_k : 'a list -> ('a * 'a -> 'b) -> 'b let find_min_max l = find_min_max_k l (fun x -> x) val find_min_max : 'a list -> 'a * 'a
Where rectypes matter for performance reasons
That’s nice, we only have tail calls now, but we are not done removing allocation yet of course. We now need to get rid of the allocation of the closure in fold_left_k and of the couples in keep_min_max_k. For that, we need to pass everything that should be allocated as argument:
let rec fold_left_k2 f init1 init2 l k = match l with |  -> k init1 init2 | h :: t -> f init1 init2 h t fold_left_k2 k val fold_left_k2 : ('b -> 'c -> 'd -> 'd list -> 'a -> ('b -> 'c -> 'e) -> 'e) -> 'b -> 'c -> 'd list -> ('b -> 'c -> 'e) -> 'e as 'a let rec keep_min_max_k2 = fun min max v k_arg k k2 -> let min = if v < min then v else min in let max = if v > max then v else max in k keep_min_max_k2 min max k_arg k2 val keep_min_max_k2 : 'b -> 'b -> 'b -> 'c -> ('a -> 'b -> 'b -> 'c -> 'd -> 'e) -> 'd -> 'e as 'a let find_min_max_k2 l k = match l with |  -> invalid_arg "find_min_max" | h :: t -> fold_left_k2 keep_min_max_k2 h h t k val find_min_max_k2 : 'a list -> ('a -> 'a -> 'b) -> 'b
For some reason, we now need to activate ‘rectypes’ to allow functions to have a recursive type (the ‘as ‘a’) but we managed to completely get rid of allocations.
(function fold_left_k2 (f: val init1: val init2: val l: val k: val) (if (!= l 1) (app "caml_apply6" init1 init2 (load val l) l "fold_left_k2" k f val)) (app "caml_apply2" init1 init2 k val))) (function keep_min_max_k2 (min: val max: val v: val k: val k: val k2: val) (let (min (if (!= (extcall "caml_lessthan" v min val) 1) v min) max (if (!= (extcall "caml_greaterthan" v max val) 1) v max)) (app "caml_apply5" "keep_min_max_k2" min max k k2 k val))) (function find_min_max_k2 (l: val k: val) (if (!= l 1) (let h (load val l) (app "fold_left_k2" "keep_min_max_k2" h h t k val)) (raise "exception")))
So we can turn return points into call points and get rid of a lot of potential allocations like that. But of course there is no way to handle functions passing or returning sum types like that ! Well, I’m not so sure.
Let’s try with the option type for instance:
type 'a option = | None | Some of 'a let add_an_option_value opt v = match opt with | None -> v | Some n -> n + v let n1 = add_an_option_value (Some 3) 4 let n2 = add_an_option_value None 4
The case of the sum type tells us if there is some more values that we can get and their type. But there is another way to associate some type information with an actual value: GADTs
type ('a, 'b) option_case = | None' : ('a, unit) option_case | Some' : ('a, 'a) option_case let add_an_option_value (type t) (opt: (int, t) option_case) (n:t) v = match opt with | None' -> v | Some' -> n + v let n1 = add_an_option_value Some' 3 4 let n2 = add_an_option_value None' () 4
And voilà, no allocation anymore !
Combining that with the CPS transformation can get you quite far without allocating !
Now that we can manage almost any control flow without allocating, we need also to manipulate some values. That’s the point where we simply suggest to use the same approach as ASM.js: allocate a single large bigarray (this is some kind of malloc), consider integers as pointers and you can do anything. We won’t go into too much details here as this would require another post for that topic.
For some low level packed bitfield manipulation you can have a look at some more tricks
So if you want to write non allocating code in OCaml, turn everything in CPS, add additional arguments everywhere, turn your sum types in unboxed GADTs, manipulate a single large bigarrays. And enjoy !