This is a very high-level summary of my experience attending S-REPLS 14. Feel free to look at the speakers' respective websites for more details. To keep this blog concise, I use introductory examples where possible to illustrate the ideas rather than going into the actual theory behind them.
Graded types and algebraic effects
Graded type systems are based on the idea of linear types. Linear types essentially
allow one to restrict a variable to being used exactly once, which can be useful
for resource management such as IO. Graded types go further than this and allow one
to specify the number of times a variable can be used. For example, in the following
code1, the variable a
needs to be used exactly twice.
dup : forall {a : Type} . a [2] -> (a, a)
dup [x] = (x, x)
This gets more powerful when combined with dependent types. In the following code, we use dependent types to express the length of a vector, and with that we can enforce that the function given to map is used exactly n times on a vector of length n.
map : forall {a : Type, b : Type, n : Nat} . (a -> b) [n] -> Vec n a -> Vec n b
map [_] Nil = Nil;
map [f] (Cons x xs) = Cons (f x) (map [f] xs)
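Granule checks these usage counts statically, at compile time. As a rough dynamic analogue (a Python sketch of my own, not anything from the talk), one can imagine a cell that enforces its grade at runtime instead — the `Graded` class and its API below are invented for illustration:

```python
class Graded:
    """A value that must be consumed exactly `grade` times.

    This is only a runtime analogue of what Granule enforces
    statically; the class name and API are made up for illustration.
    """

    def __init__(self, value, grade):
        self.value = value
        self.remaining = grade

    def use(self):
        if self.remaining == 0:
            raise RuntimeError("value used more times than its grade allows")
        self.remaining -= 1
        return self.value

    def check_fully_used(self):
        if self.remaining != 0:
            raise RuntimeError(f"value still has {self.remaining} uses left")


def dup(cell):
    # Mirrors the Granule `dup` above: the argument is used exactly twice.
    pair = (cell.use(), cell.use())
    cell.check_fully_used()
    return pair


print(dup(Graded(42, grade=2)))  # (42, 42)
```

The difference, of course, is that Granule rejects an over- or under-use at compile time, whereas this sketch only fails once the program runs.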
Apart from fine-grained tracking of resource usage at compile time, Granule also
has features for tracking security levels, e.g. enforcing that a resource is private
and not exposed to the outside world at compile time (similar to private
membership in OOP, except that here returning a private resource, for example,
fails at compile time). It also has features for tracking effects, such as
opening, writing, and reading files.
Somewhat Dynamic Build Systems
This talk is about the experience of building Buck2 at Meta. Build systems
can be broadly classified as static (like make) and dynamic (like Excel2).
A static build system has all the dependencies known upfront, and they will not
change during compilation. For example, make has this information written down by
the programmer and can hence construct the dependency graph from it. A dynamic
build system takes into account changes that could happen at build time
and adapts its build process as it goes.
The speaker argued that there are usually two graphs in a build system. One is the target graph, which is specified by the user in terms of dependencies. The other is the actual action graph, which is generated by the build system from the target graph and must respect the dependencies in the target graph. There is usually more flexibility in the action graph: for example, a build system can generate parallel or serial actions, as long as they respect the dependencies.
In a large-scale system, working out all the dependencies can take quite some time, so a dynamic action graph gives more flexibility and hence better performance. The way I think about this is that rather than statically finding all the dependencies bottom-up, we use a dynamic, top-down approach that discovers the files to build on demand, saving us from working out all the dependencies from scratch every time.
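This on-demand, top-down style can be sketched in a few lines of Python, in the spirit of the "suspending" build systems described in Build Systems à la Carte. Each task asks for its inputs while it runs (like a spreadsheet cell referring to other cells), so no dependency graph is needed upfront; the task definitions here are invented for illustration:

```python
def make_tasks():
    # Each task receives `fetch` and requests its inputs on demand,
    # like a spreadsheet cell referring to other cells.
    return {
        "A1": lambda fetch: 10,
        "A2": lambda fetch: 20,
        "B1": lambda fetch: fetch("A1") + fetch("A2"),
        "B2": lambda fetch: fetch("B1") * 2,
    }


def build(tasks, key, store=None):
    """Build `key` top-down, memoising results in `store`."""
    if store is None:
        store = {}

    def fetch(k):
        if k not in store:          # only build what is actually demanded
            store[k] = tasks[k](fetch)
        return store[k]

    return fetch(key)


print(build(make_tasks(), "B2"))  # 60
```

Note that building "A1" alone never touches "B1" or "B2" — dependencies are discovered, not declared, which is exactly the flexibility a dynamic action graph buys.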
The Dafny Programming Language and Static Verifier
I thought an example3 would help illustrate the use case of Dafny:
function fib(x: nat): nat
{
  if x == 0 then 0
  else if x == 1 then 1
  else fib(x - 1) + fib(x - 2)
}

method fib_m(n: nat) returns (y: nat)
  requires n > 0
  ensures y == fib(n)
{
  var x := 0;
  y := 1;
  var i := 1;
  while i < n
    invariant 0 < i <= n
    invariant x == fib(i - 1)
    invariant y == fib(i)
  {
    x, y := y, x + y;
    i := i + 1;
  }
}
This is an example proof that the fib_m imperative method matches the
‘mathematical’ definition of the fibonacci function fib. Dafny provides
Hoare-logic-style specifications, where requires specifies the precondition
and ensures specifies the postcondition.
As with all Hoare-style proofs, the tricky part is the loops. A loop invariant is
generally needed to help Dafny prove properties of the program, and coming up
with one requires manual effort. Otherwise, Dafny is generally quite automatic
at generating proofs, compared to more manual theorem provers like Coq,
where one would often need to step through the program line by line and prove
that the pre- and postconditions hold at each step.
Note the general pattern for making machine-checked proofs: we first need
a specification of the system, in this case the mathematical definition of the
fibonacci numbers, and then an actual implementation, in this case the fib_m
method. We then define an abstraction relation between the spec and the
implementation; in this case y == fib(i) relates the state of the implementation
to the spec. Finally, we prove that the implementation indeed simulates/refines
the spec, by checking that if the abstraction relation holds in the initial state
of the implementation (i.e. the precondition holds), then it also holds in the
final state of the program (i.e. the postcondition holds). This means that at
every step our implementation is abstracted by our specification, and therefore,
by induction, if it is abstracted at the beginning, it is abstracted at the end.
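The same spec-vs-implementation pattern can be mimicked dynamically. In the Python sketch below (my own analogue, with runtime assertions standing in for Dafny's compile-time proof), the recursive fib plays the role of the spec, the loop is the implementation, and the asserts play the role of the precondition, invariant, and postcondition:

```python
def fib(x):
    # The specification: a direct 'mathematical' definition.
    return x if x < 2 else fib(x - 1) + fib(x - 2)


def fib_m(n):
    # The implementation, mirroring the Dafny method above.
    assert n > 0                                  # precondition
    x, y, i = 0, 1, 1
    while i < n:
        assert x == fib(i - 1) and y == fib(i)    # abstraction relation
        x, y = y, x + y
        i += 1
    assert y == fib(n)                            # postcondition
    return y


print([fib_m(n) for n in range(1, 8)])  # [1, 1, 2, 3, 5, 8, 13]
```

The crucial difference is that these assertions are only checked for the inputs we happen to run, whereas Dafny proves them for all inputs at compile time.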
In my opinion, the selling point of Dafny is its automation, which enables its use at Amazon. This powerful automation can potentially bring the proof-to-code ratio down to about 1:10, whereas it is usually the other way round with more manual theorem provers.
Finding cheaper straightline instruction sequences more cheaply
This talk is about developing a bytecode optimiser that performs optimisations on Ethereum Virtual Machine (EVM) bytecode. The motivation here is particularly strong, since one more instruction in the bytecode means more gas and hence more money :) The main approach is to develop peephole optimisations on the bytecode to reduce the number of instructions, making the code more compact. The tool is based on the unbounded superoptimization paper, which provides a new way of encoding the program semantics so that it can be understood by an SMT solver; the satisfiability of the SMT formula can then be translated into the existence of a program p that implements the original source code.
This tool could have saved millions of pounds had it been in use from the beginning, when Ethereum was introduced. ebso is the superoptimiser, implemented in OCaml.
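To convey the flavour of superoptimisation without an SMT solver, here is a toy Python sketch of my own that brute-force searches for a shorter equivalent instruction sequence over a made-up three-instruction stack machine (the instruction set, names, and search strategy are all invented for illustration — ebso instead proves equivalence with a solver):

```python
from itertools import product

# A toy stack machine with a few EVM-flavoured instructions:
# PUSH0 pushes 0, ADD adds the top two items, DUP duplicates the top.
INSTRUCTIONS = ("PUSH0", "ADD", "DUP")


def run(program, stack):
    stack = list(stack)
    for op in program:
        if op == "PUSH0":
            stack.append(0)
        elif op == "ADD":
            if len(stack) < 2:
                return None          # stack underflow: invalid program
            stack.append(stack.pop() + stack.pop())
        elif op == "DUP":
            if not stack:
                return None
            stack.append(stack[-1])
    return stack


def superoptimise(program, test_stacks):
    """Find the shortest program agreeing with `program` on all test stacks.

    A real superoptimiser like ebso proves equivalence for all inputs
    with an SMT solver; here we only compare behaviour on sample stacks.
    """
    targets = [run(program, s) for s in test_stacks]
    for length in range(len(program) + 1):
        for candidate in product(INSTRUCTIONS, repeat=length):
            if all(run(candidate, s) == t
                   for s, t in zip(test_stacks, targets)):
                return list(candidate)
    return program


# PUSH0; ADD adds zero to the top of the stack — a no-op.
print(superoptimise(["PUSH0", "ADD"], [[1], [2, 3], [7]]))  # []
```

Every instruction removed this way is gas (and money) saved on every execution, which is why the economics of EVM superoptimisation are so compelling.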
Conclusion
Overall, this was a great event, with lots of well-prepared and engaging talks. It was nice to see all the developments in more advanced type theories, infrastructure (such as build systems), theorem provers, and language constructs.
Acknowledgement
I would like to acknowledge that the content of this post is merely a paraphrase of the work done by the speakers at S-REPLS 14, so please give all the credit to them.
Both examples are taken from the Granule project ↩︎
You might be surprised as to why Excel is a build system. Well, different cells can be thought of as different files, and formulas linking different cells are build rules. More details in Build Systems a la Carte ↩︎
Adapted from Dafny tutorial. ↩︎