Aug 5, 2013

Once again, here is the summary of our activities for last month. The highlight this month is the release of ocaml-top, an interactive editor for education that works well under Windows and that we hope professors all around the world will use to teach OCaml to their students. We are also continuing our work on improving the performance of OCaml, with new inlining heuristics in the compiler and multicore support in the runtime.

Compiler updates

Last month, we started to get very nice results with our compiler performance improvements. First, Pierre Chambart polished the prototype implementation of his new flambda intermediate language and he started to get impressive micro-benchmark results, with improvements of around 20% to 30% on code using exceptions or functors. Following a discussion with our industrial users, he is currently focusing on improving the compilation of local recursive functions such as the very typical:

let f x y =
 let rec loop v =
   ... x ...
   loop z
 in
 loop x

A simple and reasonably efficient solution is to eta-expand the auxiliary function, i.e. add an intermediate function calling the loop with all closure parameters passed as variables. The hard part is then to add the right arguments to all the call sites: luckily enough, the new inlining engine already does that kind of analysis, so it can be re-used here. This means that these constructs will be compiled efficiently by the new inlining heuristics.
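To make the idea concrete, here is a hand-written sketch of the shape of that rewrite. It is not the output of the compiler: the function bodies and the names f', loop_lifted and acc are invented for the illustration. The variables captured by the closure become explicit parameters of a closed function, and a small intermediate function supplies them at the original call site:

```ocaml
(* Before: [loop] is a closure that captures [x] and [y]. *)
let f x y =
  let rec loop v acc =
    if v >= y then acc
    else loop (v + 1) (acc + x)   (* [x] and [y] come from the enclosing scope *)
  in
  loop 0 0

(* After: [loop_lifted] is closed, every captured variable is a parameter. *)
let rec loop_lifted x y v acc =
  if v >= y then acc
  else loop_lifted x y (v + 1) (acc + x)

(* The intermediate function passes the closure variables explicitly, so
   the call sites inside [f'] keep their original shape. *)
let f' x y =
  let loop v acc = loop_lifted x y v acc in
  loop 0 0
```

Both versions compute the same result; the second one gives the compiler a closed recursive function that no longer needs to read its environment on each iteration.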

Second, Luca Saiu has finished debugging the native thread support on top of his multi-runtime variant of OCaml, which has become quite usable and is pretty robust now. He has tentatively started adding support for vmthreads as well, concurrently cleaning up context finalization and solving other minor issues, such as configuration scripts for architectures that do not support the multi-runtime features yet. Then, after writing documentation and running a full pass over the sources to remove debugging stubs and prints which pollute the code after months of low-level experimentation, he is going to prepare patches for discussion and submission to the main OCaml compiler.

Çagdas Bozman continued to improve the implementation of his profiling tools for both native and byte-code programs. A great outcome of his recent work is that the location information is much more precise: using very different techniques for native and byte code, the program locations are now uniquely identified. Usability was improved as well, as the profiling location tables are now embedded directly into the programs. He also improved the post-mortem profiling tools to re-type dumped heaps, which also leads to much more accurate profiling information. Çagdas is now actively using these tools on why3 and he expects to get feedback very soon so that he can keep improving them.

Finally, Thomas Blanc is still working on whole program analysis for his internship, in order to find possibly uncaught exceptions. The project is moving along quite well and the month was spent on analyzing the lambda intermediate representation of the compilation step. With the help of Pierre Chambart, he is working on a 0-CFA library that should make it possible to compute the "possible" values for each variable at each point of the program. The idea is to build a directed hypergraph in which each hyperedge represents an instruction and each vertex a state of the program, and then to search for a fixpoint of the possible values propagated through the graph. This lets the compiler know, everywhere in the program, what possible values may be returned or what possible exceptions may be raised. In order to create a well-designed graph, we need a new intermediate representation that looks like Lambda except (mainly) that every expression gets an identifier. The next step is to specify a hypergraph construction for each primitive and control-flow construct.
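The fixpoint search can be pictured with a deliberately tiny sketch. This is not Thomas's library, and it is far simpler than a real 0-CFA (abstract values are plain integers, with no closures or exceptions); the types and names (hyperedge, transfer, fixpoint, join) are assumptions made for the illustration. Each vertex holds a set of possible values, each hyperedge reads several vertices and contributes values to others, and all edges are re-applied until nothing changes:

```ocaml
module IntSet = Set.Make (struct type t = int let compare = compare end)

(* A hyperedge stands for one instruction: it reads the abstract values of
   its source vertices and contributes values to its destination vertices. *)
type hyperedge = {
  srcs : int list;
  dsts : int list;
  transfer : IntSet.t list -> IntSet.t;
}

(* Re-apply every hyperedge until no vertex gains a new value.
   This terminates as long as the set of possible values is finite. *)
let fixpoint n_vertices init edges =
  let state = Array.make n_vertices IntSet.empty in
  List.iter (fun (v, s) -> state.(v) <- s) init;
  let changed = ref true in
  while !changed do
    changed := false;
    List.iter
      (fun e ->
        let inputs = List.map (fun v -> state.(v)) e.srcs in
        let out = e.transfer inputs in
        List.iter
          (fun d ->
            let joined = IntSet.union state.(d) out in
            if not (IntSet.equal joined state.(d)) then begin
              state.(d) <- joined;
              changed := true
            end)
          e.dsts)
      edges
  done;
  state

(* Transfer function that merges all its inputs, as a branch join would. *)
let join inputs = List.fold_left IntSet.union IntSet.empty inputs

(* Vertex 2 merges vertices 0 and 1; a back-edge feeds vertex 2 into vertex 0. *)
let example =
  fixpoint 3
    [ (0, IntSet.singleton 1); (1, IntSet.singleton 2) ]
    [ { srcs = [0; 1]; dsts = [2]; transfer = join };
      { srcs = [2]; dsts = [0]; transfer = join } ]
```

In this example, vertices 0 and 2 both end up with the possible values 1 and 2, while vertex 1 keeps only 2. A real analysis would use a worklist rather than re-running every edge, but the stopping condition is the same.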

Development Tools

Editors

This month, Louis Gesbert has been busy making the first release of ocaml-top, the simple graphical top-level for beginners and students. Together with the web-IDE, this project aims at lowering the entry barrier to OCaml. ocaml-top features a clean and accessible interface, yet offers advanced features like automatic semantic indentation, error marking, and integrated display of standard library types -- using the engines of ocp-indent and ocp-index of course. The biggest challenge was probably to make everything work as expected on Microsoft Windows, which is a requirement for most beginners and classrooms.

ocaml-top

The two main issues were:

  • Setting up the build environment: there are several versions of OCaml for Windows; we generally want to avoid any dependency on cygwin in the generated program, but it's very hard to avoid any need for it in the build chain. The easiest solution at the moment is to "cross-compile" from cygwin using the mingw32 gcc compiler. The hard part is to get all the needed libraries properly set up: this felt a lot like Linux 15 years ago, where you can find some binaries, but generally not properly configured, and there is no consistent packaging system (or at least you can't find what you want in it).

  • Process management: ocaml-top runs the OCaml toplevel as a sub-process, so as not to be impaired by any problem in the user program. Interacting with that process in a portable way is close to impossible, Windows having no POSIX signals, and read/write operations being very different in terms of blocking, etc. Some obscure C bindings were required to simulate a SIGINT that could tell the ocaml process to stop the current computation and return to the prompt. At this cost, however, ocaml-top can be run with any existing external OCaml toplevel.

Not to mention some gtk/lablgtk bugs that were often OS-specific. After having read horror stories about the most commonly used "Windows installer generator", NSIS, Louis opted for the Microsoft open source solution WiX, which turned out to be quite clean and efficient, although it uses a lot of XML. The only point that might be in favor of NSIS is that it can generate the installer from Linux, so it's much more convenient when you cross-compile, which is not the case here. Also worth mentioning: Xen and LVM are really great tools which save a lot of time when working and testing between two (or more) different OSes.

Still on the editor front, David and Pierrick have been working on a web-IDE for OCaml since the beginning of their internship two months ago. For now, the IDE includes the Ace editor, extended with some OCaml-specific features, particularly ocp-indent, made possible by js_of_ocaml, which compiles bytecode to JavaScript. It also includes a basic project manager that uses a server to store files for each user. Authentication is done using Mozilla's Persona. One particularly nice feature they are working on is client-side bytecode generation: this means users can ask their browser to produce the byte-code of the project they are working on without any connection to the server! Beware that this is still work in progress and the feature is not bug-free for the moment. The project (undocumented for now) is available on GitHub.

Tools

Meanwhile, most of my time last month has been spent preparing the next release of OPAM, with the help of Louis Gesbert. This new release will contain a lot of bug-fixes and an improved opam update mechanism: it will be much more flexible, faster and more stable than the one present in 1.0. A few months ago, I had already pushed a first batch of patches to the development version, which started to make things look much better. I decided last month to continue improving that feature and make it rock-solid: hence I have started a regression testing platform for OPAM, opam-rt, which is still young but already damn useful to stabilize my new set of patches. opam-rt is also written in OCaml: it generates random repositories with random packages, shuffles everything around and checks that OPAM correctly picks up the changes. In the future this will make it easier to test complex OPAM scenarios and will hopefully lead to a better OPAM.
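The testing loop described above can be sketched on a toy model. This is not opam-rt's code and it does not call OPAM at all: a repository is reduced to a map from package names to version numbers, and diff and apply are hypothetical stand-ins for the update mechanism under test. Every name below is invented for the illustration:

```ocaml
module M = Map.Make (String)

(* Build a random toy repository (package name -> version) with at most [n] packages. *)
let random_repo n =
  let rec go i acc =
    if i >= n then acc
    else
      let name = Printf.sprintf "pkg%d" (Random.int n) in
      go (i + 1) (M.add name (Random.int 5) acc)
  in
  go 0 M.empty

(* Shuffle things around: each package is removed, bumped or kept at random,
   then one package named "fresh" is added. *)
let shuffle repo =
  let mutated =
    M.fold
      (fun name v acc ->
        match Random.int 3 with
        | 0 -> acc                      (* package removed *)
        | 1 -> M.add name (v + 1) acc   (* version bumped *)
        | _ -> M.add name v acc)        (* unchanged *)
      repo M.empty
  in
  M.add "fresh" 0 mutated

type change = Added of string * int | Removed of string | Changed of string * int

(* Stand-in for the update step: compute what changed between two states. *)
let diff old_repo new_repo =
  let removed =
    M.fold
      (fun n _ acc -> if M.mem n new_repo then acc else Removed n :: acc)
      old_repo []
  in
  M.fold
    (fun n v acc ->
      match M.find_opt n old_repo with
      | None -> Added (n, v) :: acc
      | Some v' when v' <> v -> Changed (n, v) :: acc
      | Some _ -> acc)
    new_repo removed

(* Apply a list of changes to a local copy of the repository. *)
let apply repo changes =
  List.fold_left
    (fun acc -> function
      | Added (n, v) | Changed (n, v) -> M.add n v acc
      | Removed n -> M.remove n acc)
    repo changes

(* One randomized check: after applying the diff, local must equal remote. *)
let check_once () =
  let local = random_repo 10 in
  let remote = shuffle local in
  M.equal ( = ) (apply local (diff local remote)) remote
```

Running check_once many times with different seeds is the whole point of the approach: the generator explores combinations of additions, removals and version bumps that nobody would think of writing by hand.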

ocp-index has seen some progress, with lots of rough edges rounded, and much better performance on big cmi files (typically module packs, like Core.Std). While more advanced functionality is being planned, it is already very helpful, and problems seen in earlier development versions have been fixed. The upcoming release also greatly improves the experience from emacs, and might become the first "stable" release. The flow of bugs reported on ocp-indent is drying up, which means the tool is gaining some maturity. There are not many visible changes for the past month except for a few bug-fixes, but the library interface has been completely rewritten to offer much more flexibility while being friendlier. This has allowed it to be plugged into the Web-IDE (see above), which, being executed in JavaScript, has much tighter performance constraints -- the indent engine is only re-run where required after changes -- and into ocaml-top, where it is also used to detect top-level phrase bounds.

Community

We are proud to be well represented at the OCaml Developer Workshop 2013. This year it takes place in Boston, in September, co-located with the Commercial Users of Functional Programming conference. Both conferences will contain a lot of OCaml-related talks: I am especially excited to hear about PHP type-inference efforts at Facebook using OCaml! If you are in the area around the 22nd, 23rd and 24th of September and you want to chat about OCamlPro and OCaml, we will be around!