Feb 5, 2014

Here is a short report on some of our activities last December and January!

A New Intel Backend for ocamlopt

With the support of LexiFi, we started working on a new Intel backend for the ocamlopt native code compiler. Currently, there are four Intel backends in ocamlopt: amd64/emit.mlp, amd64/emit_nt.mlp, i386/emit.mlp and i386/emit_nt.mlp, i.e. support for two processors (amd64 and i386) and two OS variants (Unices and Windows). These backends directly output assembly source files, on which the platform assembler is then called (gas on Unices, masm on Windows).

The current organisation makes these backends hard to maintain: code for a given processor has to be written in two almost identical files (Unix and Windows), with subtle differences in syntax. For example, the destination operand is the second parameter in AT&T syntax (used by gas), while it is the first one in Intel syntax (used by masm).

Our current work aims at merging, for each processor, the Unix and Windows backends by making them generate an abstract representation of the assembly. This representation is shared between the two processors ('amd64' and 'i386'), so that we only have to develop two printers, one for gas syntax and one for masm syntax. As a consequence, maintaining the backend will be much easier: while writing the assembly code, the developer does not need to care about the exact syntax. Moreover, the type-checker can verify that each assembly instruction is used with the correct number of well-formed operands.
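As a rough illustration of the idea (the types and instruction names below are invented for this sketch, not the ones in the branch), a single abstract instruction type can drive both syntax printers:

```ocaml
(* Toy abstract assembly: one representation, two concrete syntaxes. *)
type reg = RAX | RBX
type operand = Reg of reg | Imm of int

type instr =
  | Mov of operand * operand   (* Mov (src, dst) *)
  | Add of operand * operand   (* Add (src, dst) *)

let string_of_reg = function RAX -> "rax" | RBX -> "rbx"

(* AT&T syntax (gas): '%'/'$' prefixes, source first, destination second. *)
let gas_operand = function
  | Reg r -> "%" ^ string_of_reg r
  | Imm n -> "$" ^ string_of_int n

let gas = function
  | Mov (src, dst) ->
      Printf.sprintf "movq %s, %s" (gas_operand src) (gas_operand dst)
  | Add (src, dst) ->
      Printf.sprintf "addq %s, %s" (gas_operand src) (gas_operand dst)

(* Intel syntax (masm): no prefixes, destination first. *)
let masm_operand = function
  | Reg r -> string_of_reg r
  | Imm n -> string_of_int n

let masm = function
  | Mov (src, dst) ->
      Printf.sprintf "mov %s, %s" (masm_operand dst) (masm_operand src)
  | Add (src, dst) ->
      Printf.sprintf "add %s, %s" (masm_operand dst) (masm_operand src)

let () =
  let i = Mov (Imm 1, Reg RAX) in
  print_endline (gas i);   (* movq $1, %rax *)
  print_endline (masm i)   (* mov rax, 1 *)
```

Note how the operand-order difference between the two syntaxes is confined to the printers: code that builds instructions never has to think about it.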

Finally, our hope is that it will also be possible to write optimization passes directly on the assembly representation, such as peephole optimizations or instruction re-scheduling. This work is available in OCaml SVN, in the "abstract_x86_asm" branch.

OPAM, new Release 1.1.1

OPAM has moved from 1.1.0-RC to 1.1.1, with large stability and UI improvements. We put a lot of effort into improving the interface and into helping others build tools in the emerging ecosystem around OPAM. Louis visited OCamlLabs, which was a great opportunity to discuss the future of OPAM and the platform, and to contribute to their effort towards opam-in-a-box, a new way to generate pre-configured VirtualBox instances with all OCaml packages locally installable by OPAM, particularly convenient for computer classrooms.

The many plans and objectives on OPAM can be seen and discussed on the work-in-progress OPAM roadmap. Lots of work is ongoing for the next releases, including Windows support, binary packages, and allowing more flexibility by shifting the compiler descriptions to the packages.

ocp-index and its new Brother, ocp-grep

Continuing our efforts to improve the environment and tools for OCaml hackers, we also made some extensions to ocp-index which, in addition to completing and documenting the values from your libraries and using binary annotations to jump to value definitions, now comes with a tiny ocp-grep tool that can syntactically locate instances of a given identifier throughout your project -- handling open, local opens, module aliases, etc. In Emacs, C-c / will get the fully qualified version of the identifier under the cursor and find all its uses in your project. Simple, efficient and very handy for refactorings. The ocp-index query interface has also been made more expressive. Some documentation is online, and more will be available shortly with the upcoming 1.1 release.

ocp-cmicomp: Compression of Interface Files for Try-OCaml

While developing Try-OCaml, we noticed a problem with big compiled interface files (.cmi). In Try-OCaml, such files are embedded inside the JavaScript file by js_of_ocaml, resulting in huge code files to download on connection (about 12 MB when linking Dom_html from js_of_ocaml, and about 40 MB when linking Core_kernel), and the browser freezing for a few seconds when opening the corresponding modules.

To reduce this problem, we developed a tool, called ocp-cmicomp, to compress compiled interface files. A compiled interface file is just a huge OCaml data structure, marshalled using output_value. This data structure is often created by copying values from other interface files (types, names of values, etc.) during the compilation process. As this is done transitively, the data structure has a lot of redundancy, but has lost most of the sharing. ocp-cmicomp tries to recover this sharing: it iterates on the data structure, hash-consing the immutable parts of it, to create a new data structure with almost optimal sharing.
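The core idea can be sketched on a toy tree type (the real tool works on the marshalled compiler data structures; the type and function names here are purely illustrative):

```ocaml
(* Toy hash-consing: map every structurally-equal subtree to a single
   shared representative, recovering lost sharing bottom-up. *)
type tree = Leaf of string | Node of tree * tree

(* Table from a subtree to its unique shared representative; OCaml's
   Hashtbl uses structural hashing and equality by default, which is
   exactly what we need here. *)
let table : (tree, tree) Hashtbl.t = Hashtbl.create 97

let rec hashcons t =
  (* First hash-cons the children, so equal subtrees become physically
     equal before we look up the parent node. *)
  let t' = match t with
    | Leaf _ -> t
    | Node (l, r) -> Node (hashcons l, hashcons r)
  in
  match Hashtbl.find_opt table t' with
  | Some shared -> shared                  (* reuse the existing copy *)
  | None -> Hashtbl.add table t' t'; t'

let () =
  (* Two structurally equal subtrees that are distinct in memory... *)
  let a = Node (Leaf "x", Leaf "y") in
  let b = Node (Leaf "x", Leaf "y") in
  match hashcons (Node (a, b)) with
  (* ...become physically equal (==) after hash-consing. *)
  | Node (l, r) -> assert (l == r)
  | _ -> assert false
```

After this pass, marshalling the result with `output_value` serializes each shared subtree only once, which is where the size reduction comes from.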

To increase the opportunities for sharing, ocp-cmicomp also uses some heuristics: for example, it computes the most frequent methods in objects, and sorts the list of methods of each object type in increasing order of frequency. As a consequence, different object types are more likely to share the same tail. Finally, we also had to be very careful: the type-checker internally uses a lot of physical comparisons between types (especially with polymorphic variables and row variables), so we still had to prevent sharing of some immutable parts to avoid typing problems.
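This heuristic can be illustrated on a toy model where an object type is just a list of method names (the function name is ours, not ocp-cmicomp's):

```ocaml
(* Sort each object's method list by increasing global frequency, so
   that frequently-occurring methods end up in shared list tails. *)
let sort_by_frequency (objects : string list list) =
  (* Count how often each method name appears across all objects. *)
  let freq = Hashtbl.create 17 in
  List.iter
    (List.iter (fun m ->
         let n = try Hashtbl.find freq m with Not_found -> 0 in
         Hashtbl.replace freq m (n + 1)))
    objects;
  let f m = Hashtbl.find freq m in
  (* Rare methods first, frequent methods last. *)
  List.map (List.sort (fun a b -> compare (f a) (f b))) objects

let () =
  let sorted = sort_by_frequency [ ["print"; "id"]; ["id"; "copy"] ] in
  (* "id" occurs in both objects, so both sorted lists end with it,
     and the hash-consing pass can then share that tail. *)
  List.iter
    (fun l -> assert (List.nth l (List.length l - 1) = "id"))
    sorted
```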

The result is quite interesting. For example, dom_html.cmi was reduced from 2.3 MB to 0.7 MB (-71%, with a lot of object types), and the corresponding JavaScript file for Try-OCaml decreased from 12 MB to 5 MB. core_kernel.cmi was reduced from 13.5 MB to 10 MB (-26%, no object types), while the corresponding JavaScript decreased from 40 MB to 30 MB!

OCamlRes: Bundling Auxiliary Files at Compile Time

A common problem when writing portable software is locating the resources of the program and its predefined configuration files. The program has to know the system on which it is running, which can be determined the old-fashioned way by patching the source, by generating a set of globals at build time, or by detection at run-time. Either way, paths may vary depending on the system: for instance, they are often completely static on Unix, while they are partially forged in bundled MacOS apps or on Windows. Then there is always the task of bundling the binary with its auxiliary files, which also depends on the OS.

For big apps with lots of system interaction, this is something you have to undertake. For small apps, however, it is an unjustified burden. The alternative proposed by OCamlRes is to bundle these auxiliary files at compile time, as an OCaml module source. One can then compile the same (partially pre-generated) code for all platforms and distribute all-inclusive, standalone binaries. This also has the side advantage of turning run-time exceptions for nonexistent or invalid files into compile-time errors. OCamlRes is made of two parts:

  • an ocplib-ocamlres library to manipulate resources at run-time, scan input files to build resource trees, and dump resources in various formats;

  • a command-line tool ocp-ocamlres that reads the resources and bundles them into OCaml source files.

OCamlRes has several output formats, some more subtle than the default mechanism (which is to transform a directory structure on disk into a static OCaml tree where each file is replaced by its content), and can (and will) be extended. An example is detailed in the documentation file.
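For illustration, the default format could produce a module along these lines (this is a hand-written sketch of the idea, not actual ocp-ocamlres output; all file and module names are invented):

```ocaml
(* Sketch: a directory
     res/
       config/default.ini
       README.md
   becomes a static module tree where each file is an OCaml string. *)
module Res = struct
  module Config = struct
    let default_ini = "verbose = false\nretries = 3\n"
  end
  let readme_md = "# My app\nSee docs/ for details.\n"
end

(* Resources are ordinary values: a misspelled resource name is a
   compile-time error, and no run-time file lookup can fail. *)
let () = print_string Res.Config.default_ini
```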

Compiler optimisations

The last post mentioned improvements to our prototype compiler optimizations allowing the specialization of recursive functions. Some quite complicated corner cases required rethinking parts of the architecture. The first version of the patch was meant to be as simple as possible, so we chose to avoid, as much as possible, having to maintain non-trivially-checkable invariants on the intermediate language. That decision constrained what we allowed ourselves to do. One constraint that matters here is that we wanted every crucial piece of information (information whose loss breaks compilation) to be propagated following the scope. For instance, in a case like:

let x = let y = 1 in (y,y) in x

the information that y is an integer can escape its scope, but if that information is lost, the generated code is at worst suboptimal, yet still correct. Sometimes, however, some information about functions really matters:

let f x =
  let g y = x + y in
  g

let h a =
  let g = f a in
  g a

Let's suppose in this example that f cannot be inlined, but g can. Then, h becomes (with g.x being access to x in the closure of g):

let h a =
  let g = f a in
  a + g.x

Let's suppose that some other transformation elsewhere allowed f to be inlined now, then h becomes:

let h a =
  let x = a in
  let g y = x + y in (* and the code can be eliminated from the closure *)
  a + g.x

Here the closure of g changes: its code is eliminated, so only the x field is kept in the block, which changes the field's offset. This information about the closure (what is effectively available in it) must be propagated to the use point (g.x) to be able to generate the offset access into the block. If this information is lost, there is no way to compile that part. We originally avoided the problem by restricting a bit the cases where inlining is possible, so that this kind of information could always be propagated following the scope. But a few cases did not satisfy that property when inlining parameters coming from a different compilation unit.

So we undertook to rewrite some parts to ensure that this kind of information is correctly propagated, and we added assertions everywhere to avoid forgetting a case. The main difficulty was tracking down all the strange corner cases that almost never happen, or would not matter if they were compiled suboptimally, but must not lose any information if the assertions are to hold.

Alt-Ergo: More Confidence and More Features

Formalizing and Proving a Critical Part of the Core

Last month, we considered the formalization and the proof of a critical component of Alt-Ergo's core. This component handles equality solving to produce equivalent substitutions. However, since Alt-Ergo handles several theories (linear integer and rational arithmetic, enumerated datatypes, records, ...), a global routine that combines the solvers of these individual theories is needed to solve mixed equalities.

The example below shows one of the difficulties we faced when designing our combination algorithm: the solution of the equality r = {a = r.a + r.b; b = 1; c = C} cannot just be of the form r |-> {a = r.a + r.b; b = 1; c = C}, as the pivot r appears in the right-hand side of the solution. To avoid this kind of subtle occur-check, we have to solve an auxiliary and simpler conjunction of three equalities in our combination framework: r = {a = k1 + k2; b = 1; c = C}, r.a = k1 and r.b = k2 (where k1 and k2 are fresh variables). We will then deduce that k2 |-> 1 and that k1 + k2 = k1, which has no solution.

type enum = A | B | C
type t = { a : int ; b : int ; c : enum }
logic r : t

goal g: r = {a = r.a + r.b; b = 1; c = C} -> false

After implementing a new combination algorithm in Alt-Ergo a few months ago, we considered its formalization and proof, as we have done with most of the critical parts of Alt-Ergo. It was really surprising to see how the type information attached to Alt-Ergo terms helped us prove the termination of the combination algorithm, a crucial property that was hard to prove for our previous combination algorithms, and a challenging research problem as well.

Models Generation

On the development side, we conducted some preliminary experiments in order to extend Alt-Ergo with a model generation feature. This capability is useful to derive concrete test-cases that may either exhibit erroneous behaviors of the program being verified or bugs in its formal specification.

In a first step, we concentrated on model generation for the combination of the theories of uninterpreted functions and linear integer arithmetic. The following example illustrates this problem:

logic f : int -> int
logic x, y : int

goal g: 2*x >= 0 -> y >= 0 -> f(x) <> f(y) -> false

We have a satisfiable (but non-valid) formula, where x and y are in the integer intervals [0,+infinity[ and f(x) <> f(y). We would like to find concrete values for x, y and f that satisfy the formula. A first attempt to answer this question may be the following:

  • From an arithmetic point of view, x = 0 and y = 0 are possible values for x and y, so linear arithmetic suggests this partial model to the other theories.

  • The theory of uninterpreted functions cannot agree with this solution. In fact, x = y = 0 would imply f(x) = f(y), which contradicts f(x) <> f(y). More generally, x should be different from y.

  • Now, if linear arithmetic suggests x = 0 and y = 1, the theory of uninterpreted functions will agree. The next step is to find integer values for f(0) and f(1) such that f(0) <> f(1).
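The steps above can be mimicked by a naive bounded search (a toy OCaml sketch of the brute-force idea, not Alt-Ergo's actual procedure; all names are ours):

```ocaml
(* Look for integers x, y in a small range with 2*x >= 0, y >= 0 and
   x <> y; once x <> y holds, we are free to pick distinct values for
   f x and f y, represented here as a finite association list. *)
let find_model () =
  let rec try_pairs x y =
    if x > 3 then None
    else if y > 3 then try_pairs (x + 1) 0
    else if 2 * x >= 0 && y >= 0 && x <> y then
      (* x <> y removes the congruence conflict, so any two distinct
         integers work as f x and f y. *)
      Some (x, y, [ (x, 0); (y, 1) ])
    else try_pairs x (y + 1)
  in
  try_pairs 0 0

let () =
  match find_model () with
  | Some (x, y, f) ->
      assert (List.assoc x f <> List.assoc y f);
      Printf.printf "x = %d, y = %d\n" x y   (* x = 0, y = 1 *)
  | None -> print_endline "no model found"
```

Even this toy version shows why brute force does not scale: the theories must negotiate (here, the x <> y constraint comes from uninterpreted functions, the bounds from arithmetic), which is exactly what the 'divide and conquer' strategy below aims to organize.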

After implementing a brute-force technique that tries to construct such models, our main concern now is to find an elegant and more efficient 'divide and conquer' strategy that allows each theory to compute its own partial model, with the guarantee that this model will be coherent with the partial models of the other theories. It would then be immediate to automatically merge these partial solutions into a global one.