This is a very high-level summary of my experience attending S-REPLS 14. Feel free to look at the speakers' respective websites for more details. To keep this blog post concise, I use introductory examples where possible to illustrate the ideas rather than going into the actual theory behind them.

Graded types and algebraic effects

Graded type systems are based on the idea of linear types. Linear types restrict each variable to be used exactly once, which can be useful for managing resources such as IO handles. Graded types go further and allow one to specify the number of times a variable can be used. For example, in the following code1, the argument x must be used exactly twice, as demanded by the grade in a [2].

dup : forall {a : Type} . a [2] -> (a, a)
dup [x] = (x, x)

This gets more powerful when combined with dependent types. In the following example, we use a type-level natural number n to express the length of a vector, and with that we can enforce that the function given to map is applied exactly n times on a vector of length n.

map : forall {a : Type, b : Type, n : Nat} . (a -> b) [n] -> Vec n a -> Vec n b
map [_] Nil = Nil;
map [f] (Cons x xs) = Cons (f x) (map [f] xs)

Apart from fine-grained tracking of resource usage at compile time, Granule also has features for tracking security levels, e.g. enforcing that a resource is private and not exposed to the outside world, checked at compile time (similar to private members in OOP, except that returning a private resource fails at compile time rather than at runtime). It also has features for tracking effects, such as opening, reading, and writing files.
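Granule enforces these grades statically, at compile time. As a loose runtime analogue (my own sketch, not how Granule actually works — the class and function names here are made up), one can imagine a wrapper that counts uses of a value and fails when its grade is violated:

```python
class Graded:
    """Runtime analogue (hypothetical) of a graded value `a [n]`:
    the wrapped value must be consumed exactly `n` times."""

    def __init__(self, value, uses):
        self.value = value
        self.remaining = uses

    def use(self):
        # consuming the value more times than the grade allows is an error
        if self.remaining == 0:
            raise RuntimeError("value used more times than its grade allows")
        self.remaining -= 1
        return self.value

    def check_spent(self):
        # leaving copies unused is also a grade violation
        if self.remaining != 0:
            raise RuntimeError(f"{self.remaining} unused copies remain")


def dup(x):
    # mirrors Granule's `dup : a [2] -> (a, a)`: use x exactly twice
    pair = (x.use(), x.use())
    x.check_spent()
    return pair


g = Graded("hello", uses=2)
print(dup(g))  # ('hello', 'hello')
```

The crucial difference, of course, is that Granule rejects a grade violation before the program ever runs, whereas this sketch only detects it at runtime.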

Somewhat Dynamic Build Systems

This talk was about the experience of building Buck2 at Meta. Build systems can be broadly classified as static (like make) and dynamic (like Excel2). A static build system has all its dependencies known upfront, and they do not change during compilation; make, for example, has this information written down by the programmer and can construct the dependency graph from it. A dynamic build system takes into account changes that can happen at build time and adapts its build process as it goes.

The speaker argued that there are usually two graphs in a build system. One is the target graph, which is specified by the user in terms of dependencies. The other is the action graph, which is generated by the build system from the target graph and must respect its dependencies. There is usually more flexibility in the action graph: for example, the build system can schedule actions in parallel or serially, as long as the dependencies are respected.

In a large-scale system, working out all the dependencies upfront can take quite some time, so a dynamic action graph gives more flexibility and hence better performance. The way I think about this is that rather than statically finding all the dependencies bottom-up, we use a dynamic, top-down approach that discovers what to build on demand, saving us from examining every dependency from scratch each time.
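A minimal sketch of that top-down, demand-driven idea (my own toy, not how Buck2 is implemented — the targets and rules are made up): each rule fetches what it needs while it runs, so dependencies are discovered on demand rather than declared upfront, and results are memoised so shared dependencies are built once.

```python
results = {}  # memoised build outputs

rules = {
    # each rule computes its output by *fetching* what it needs,
    # so its dependencies are only discovered when the rule runs
    "app":  lambda fetch: fetch("lib") + "+app",
    "lib":  lambda fetch: fetch("core") + "+lib",
    "core": lambda fetch: "core",
}


def build(target):
    if target not in results:            # build only on demand, once
        results[target] = rules[target](build)
    return results[target]


print(build("app"))  # core+lib+app
```

Asking for "app" pulls in "lib", which pulls in "core" — the dependency graph is never written down anywhere; it emerges from the fetch calls, which is the essence of a dynamic build system.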

The Dafny Programming language and Static Verifier

I thought an example3 would help illustrate the use case of Dafny:

function fib(x: nat): nat
{
  if x == 0 then 0
  else if x == 1 then 1
  else fib(x - 1) + fib(x - 2)
}

method fib_m(n: nat) returns (y: nat)
  requires n > 0
  ensures y == fib(n)
{
  var x := 0;
  y := 1;
  var i := 1;

  while i < n
    invariant 0 < i <= n 
    invariant x == fib(i-1) 
    invariant y == fib(i)
  {
    x, y := y, x + y;
    i := i + 1;
  }

}

This is an example proof that the imperative method fib_m matches the ‘mathematical’ definition of the Fibonacci function. Dafny provides Hoare-logic-style specifications, where requires states the precondition and ensures states the postcondition. As with all Hoare-style proofs, the tricky part is loops: a loop invariant is generally needed to help Dafny prove properties of the program, and coming up with one requires manual effort. Otherwise, Dafny is quite automatic at generating proofs, compared to more manual theorem provers like Coq, where one would often need to step through the program line by line and prove that the pre- and postconditions hold at each step.

Note the general pattern of machine-checked proofs: we first need a specification of the system, in this case the mathematical definition of the Fibonacci numbers, and then an actual implementation, in this case the fib_m method. We then define an abstraction relation between the spec and the implementation; here y == fib(n) relates the state of the implementation to the spec. Finally, we prove that the implementation indeed simulates/refines the spec, by checking that if the abstraction relation holds in the initial state of the implementation (i.e. the precondition holds), then it also holds in the final state of the program (i.e. the postcondition holds). This means that at every step our implementation is abstracted by our specification, and therefore, by transitivity, if it is abstracted at the beginning it will be abstracted at the end.
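To see what the invariant is actually claiming, here is the same spec and implementation transliterated into Python (a sketch of mine, not anything Dafny generates), with the invariant checked dynamically on each iteration instead of being proved once and for all:

```python
def fib(n):
    # the "specification": the mathematical Fibonacci function
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def fib_m(n):
    # the "implementation", mirroring the Dafny method fib_m;
    # the asserts check at runtime what Dafny proves statically
    assert n > 0                                            # requires
    x, y, i = 0, 1, 1
    while i < n:
        assert 0 < i <= n and x == fib(i - 1) and y == fib(i)  # invariant
        x, y = y, x + y
        i += 1
    assert y == fib(n)                                      # ensures
    return y


print(fib_m(10))  # 55
```

Running this exercises the invariant on concrete inputs only; Dafny's achievement is discharging the same obligations for every n at compile time.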

In my opinion, the selling point of Dafny is its automation, which enables its usage at Amazon. This powerful automation can bring the proof-to-code ratio down to about 1:10, whereas with more manual theorem provers it is usually the other way round.

Finding cheaper straightline instruction sequences more cheaply

This talk was about developing a bytecode optimiser that performs optimisations on Ethereum Virtual Machine (EVM) bytecode. The motivation here is particularly strong, since one more instruction in the bytecode means more gas and hence more money :) The main approach is peephole optimisation of the bytecode to reduce the number of instructions, making the code more compact. The tool is based on the unbounded superoptimization paper, which provides a way of encoding program semantics so that they can be understood by an SMT solver; the satisfiability of the resulting SMT formula then translates to the existence of a program p that implements the original source code.
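The flavour of superoptimisation can be shown with a toy (a sketch of mine, not ebso: the instruction set is invented, and where ebso asks an SMT solver whether a shorter candidate is equivalent on all inputs, this sketch only tests candidates on a handful of example stacks):

```python
from itertools import product

# Tiny stack machine; a toy stand-in for EVM bytecode.
OPS = {
    "PUSH0": lambda s: s + [0],
    "DUP":   lambda s: s + [s[-1]],
    "ADD":   lambda s: s[:-2] + [s[-2] + s[-1]],
    "SWAP":  lambda s: s[:-2] + [s[-1], s[-2]],
}


def run(prog, stack):
    for op in prog:
        stack = OPS[op](stack)
    return stack


def superoptimise(target, inputs, max_len=3):
    """Search, shortest first, for a program agreeing with `target`
    on all test inputs (ebso instead proves equivalence via SMT)."""
    for n in range(max_len + 1):
        for cand in product(OPS, repeat=n):
            try:
                if all(run(list(cand), list(s)) == run(target, list(s))
                       for s in inputs):
                    return list(cand)
            except IndexError:
                continue  # candidate underflows the stack on some input
    return list(target)


# SWAP followed by ADD is equivalent to plain ADD, since + commutes
print(superoptimise(["SWAP", "ADD"], [[1, 2], [3, 5], [0, 7]]))  # ['ADD']
```

Every instruction shaved off here is only saved CPU time; on the EVM it is gas, i.e. money, which is why the search cost is worth paying.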

This tool could have saved millions of pounds had it been available when Ethereum was first introduced. ebso is the superoptimiser, implemented in OCaml.

Conclusion

Overall, this was a great event, with lots of well-prepared and engaging talks. It was nice to see all the developments in more advanced type theories, infrastructure (such as build systems), theorem provers, and language constructs.

Acknowledgement

I would like to acknowledge that the content of this post is merely a paraphrase of the work done by the speakers at S-REPLS 14, so please give all the credit to them.


  1. Both examples taken from Granule project ↩︎

  2. You might be surprised as to why Excel is a build system. Well, different cells can be thought of as different files, and formulas linking different cells are build rules. More details in Build Systems a la Carte ↩︎

  3. Adapted from the Dafny tutorial ↩︎