Bog

Bog is a database runtime that does as much work as possible, as early as possible, so that reads are fast. In practice, this means compiling queries into functions that eagerly update their output as mutations occur.

Every database runtime is a collection of components, pieced together and integrated by its builders to satisfy some set of constraints. In this book we describe several components we’ve built in our journey to Bog v0.1.0. These can be used independently of each other, and eventually they will be used in concert to power any particular Bog.

Fold

Fold, our take on an incremental programming framework, is the engine that powers Bog. It’s a Rust crate with iterator-like primitives for materializing a stream of ever-changing data into views. It is statically typed and very, very fast.

Extremely Static Embedding (ESE)

ESE is our first published take on a compiler-oriented approach to static embedding. It flattens a tokenizer and a map of embeddings into a perfect hash function. It’s also evidence that the approach is worth generalizing, and that there is much to rethink about how embedding runtimes currently function.
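To make the perfect hash idea concrete, here is a minimal sketch of one reading of it: a token string maps straight to its vector through a collision-free table, with no intermediate token id. This is not ESE's implementation. `StaticEmbedding`, `slot`, `build`, `lookup`, and the two-dimensional vectors are hypothetical names and values chosen for illustration, and the brute-force seed search stands in for whatever a real compiler step would do ahead of time.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Embedding width, kept tiny for illustration (an assumption).
const DIM: usize = 2;

/// Hash a token together with a seed into one of `slots` buckets.
fn slot(seed: u64, token: &str, slots: usize) -> usize {
    let mut hasher = DefaultHasher::new();
    seed.hash(&mut hasher);
    token.hash(&mut hasher);
    (hasher.finish() % slots as u64) as usize
}

/// Hypothetical flat table: exactly one slot per vocabulary entry.
struct StaticEmbedding {
    seed: u64,
    table: Vec<(&'static str, [f32; DIM])>,
}

impl StaticEmbedding {
    /// "Compile" step: search for a seed that gives every token its own slot.
    /// Returns None for an empty vocabulary or if no seed is found in the budget.
    fn build(vocab: &[(&'static str, [f32; DIM])]) -> Option<Self> {
        let n = vocab.len();
        if n == 0 {
            return None;
        }
        for seed in 0..10_000u64 {
            let mut table: Vec<Option<(&'static str, [f32; DIM])>> = vec![None; n];
            let mut collision = false;
            for &(token, vector) in vocab {
                let i = slot(seed, token, n);
                if table[i].is_some() {
                    collision = true;
                    break;
                }
                table[i] = Some((token, vector));
            }
            if !collision {
                // n tokens landed in n distinct slots, so every slot is filled.
                let table = table.into_iter().flatten().collect();
                return Some(StaticEmbedding { seed, table });
            }
        }
        None
    }

    /// Read path: one hash, one index, one key comparison.
    fn lookup(&self, token: &str) -> Option<&[f32; DIM]> {
        let (stored, vector) = &self.table[slot(self.seed, token, self.table.len())];
        // Compare the stored key so out-of-vocabulary tokens are rejected.
        if *stored == token {
            Some(vector)
        } else {
            None
        }
    }
}
```

All of the search cost sits in `build`, so `lookup` never probes or resolves collisions. That matches the "as early as possible" theme, under the assumptions stated above.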

Incremental computation

Fold is a datastream-oriented approach to incremental computation (IC). Incremental computation roughly means doing as little work as possible to recompute an output when an input to a program changes. It’s a well-established research area in computer science, applied in domains ranging from spreadsheets to compilers to web rendering frameworks.
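As a rough illustration of "as little work as possible", the sketch below maintains a per-key sum by applying each change as a delta, and compares it to recomputing the same output from scratch. It does not use Fold's API. `Delta`, `SumByKey`, and `recompute` are hypothetical names introduced only for this example.

```rust
use std::collections::HashMap;

/// One change to the input: a row is inserted or deleted.
#[derive(Clone, Copy)]
enum Delta {
    Insert(&'static str, i64),
    Delete(&'static str, i64),
}

/// Hypothetical materialized view: the sum of amounts per key.
#[derive(Default)]
struct SumByKey {
    // key -> (running sum, number of live rows)
    groups: HashMap<&'static str, (i64, i64)>,
}

impl SumByKey {
    /// Incremental path: touch only the group the changed row belongs to.
    fn apply(&mut self, delta: Delta) {
        let (key, amount, count) = match delta {
            Delta::Insert(k, v) => (k, v, 1),
            Delta::Delete(k, v) => (k, -v, -1),
        };
        let entry = self.groups.entry(key).or_insert((0, 0));
        entry.0 += amount;
        entry.1 += count;
        let empty = entry.1 == 0;
        if empty {
            // No live rows remain, so the group leaves the view.
            self.groups.remove(key);
        }
    }

    /// Read path: the answer is already computed.
    fn total(&self, key: &str) -> Option<i64> {
        self.groups.get(key).map(|&(sum, _)| sum)
    }

    /// Snapshot of the whole view, for comparison with `recompute`.
    fn totals(&self) -> HashMap<&'static str, i64> {
        self.groups.iter().map(|(&k, &(sum, _))| (k, sum)).collect()
    }
}

/// Non-incremental baseline: scan every row each time the output is needed.
fn recompute(rows: &[(&'static str, i64)]) -> HashMap<&'static str, i64> {
    let mut totals = HashMap::new();
    for &(key, amount) in rows {
        *totals.entry(key).or_insert(0) += amount;
    }
    totals
}
```

Each call to `apply` costs the same regardless of how many rows exist, while `recompute` grows with the size of the input. The row count is tracked alongside the sum so that a group whose amounts happen to total zero is not mistaken for an empty one.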

In the database field, IC is generally referred to as Incremental View Maintenance (IVM), and is less well established in practice. The academic activity around IVM is rich, but it has only started to permeate database projects and products in the past few years. Even then, it has mostly been leveraged in systems oriented toward big data workflows (Materialize) or caching portions of query sets (ReadySet). This seems to be because the research is fairly complex and existing database runtimes are built with architectures that make integrating IVM concepts difficult (although some IVM plugins for Postgres are gaining traction).

Fold is a new architecture. While it inherits from projects like Differential Dataflow, it is a distinct approach with a much simpler surface area.

ESE