Erlog: A Distributed Datalog Engine

Introduction

In this blog post we build a distributed Datalog engine that can process Datalog queries, such as the one below, across multiple nodes. The key idea mirrors dataflow programming models such as MapReduce: we shard the data and build a dataflow graph that specifies the computation we want.

    link("a","b").
    link("b","c").
    link("c","c").
    link("c","d").
    link("d","e").
    link("e","f").

    reachable(X, Y) :- link(X, Y).
    reachable(X, Y) :- link(X, Z), reachable(Z, Y).

We will build our engine from first principles, first looking at how Datalog queries are evaluated on a single node, and then extending that to multiple nodes....
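To make the single-node starting point concrete, here is a minimal Python sketch of how the two reachable rules could be evaluated to a fixpoint. It uses semi-naive evaluation, a standard Datalog technique in which each round joins only the newly derived facts against link. This excerpt does not show how Erlog itself evaluates queries, so the function name `reachable`, the `by_target` index, and the choice of Python are assumptions made for illustration only.

```python
def reachable(links):
    """Hypothetical helper: semi-naive evaluation of the reachable rules."""
    # Rule 1: reachable(X, Y) :- link(X, Y).
    total = set(links)
    delta = set(links)  # facts derived in the most recent round

    # Index link(X, Z) by Z so the join with reachable(Z, Y) is a dictionary lookup.
    by_target = {}
    for x, z in links:
        by_target.setdefault(z, set()).add(x)

    while delta:
        # Rule 2: reachable(X, Y) :- link(X, Z), reachable(Z, Y).
        # Only the new reachable(Z, Y) facts in delta are joined against link.
        derived = {(x, y) for z, y in delta for x in by_target.get(z, ())}
        delta = derived - total  # keep only facts not seen before
        total |= delta
    return total


links = {("a", "b"), ("b", "c"), ("c", "c"), ("c", "d"), ("d", "e"), ("e", "f")}
facts = reachable(links)
print(sorted(facts))
```

The loop stops once a round derives no new facts, which is the fixpoint of the two rules. For the six link facts above this yields 16 reachable facts, including reachable("a", "f") and the self-loop reachable("c", "c").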

April 16, 2023 · 7 min · Shuntian Liu