When Compositionality Fails
The idea of abstraction is that we can take some complex thing and present it as some simpler thing. Where people can ignore the aspects of the thing not relevant to what they're actually trying to do...
A Sniff Test for Some Query Optimizers
One important part of query planning is performing transformations over queries. Today I want to see how a couple common databases perform on a completely made-up and unrepresentative benchmark. I have...
My First Distributed System
I can show you a picture of the first distributed system I ever used: (Not entirely accurate, I had a Game Boy Color.) When I was a kid, we'd spend summers up at "the lake," where all kinds of fun stuff...
The Geometry of SQL
Today I want to talk about a way to think about some relational algebra operations. First, let's start with this relation: This is just a handful of games that appeared on a handful of systems, but let's...
Heath's Theorem
We don't have all that much in the world of relational query planning that could be considered a "fundamental theorem," as in like, some central idea that everything else rests on. This is partly...
Not all Graphs are Trees
It's pretty easy to imagine how to represent relational algebra expressions as a tree: they are already structurally rooted trees where each operator has its inputs as children. Even in a language like...
The Official NULL BITMAP Glossary: Graph Theory Edition
Last week a post of mine made it to God’s favourite website and one thing I was struck by was how many people disagreed about basic graph theory terminology. Some people also seemed to complain about...
A Card Counting Trick
I used to call this thing a “game” but my friend Kevin (who is not the same Kevin as last week but who is also a mathematician) kept telling me it’s really more of a “trick.” So, it’s a trick, in the...
NULL BITMAP Builds a Database #1: The Log is Literally the Database
It is time to end the tyranny of people becoming interested in database implementation and building a BTree. Let us turn to the succor of immutable storage. Today we are starting a new series. I know I...
Avoiding Cross Products with the Query Graph
To compute the join of two relations, we find all pairs of their rows which have the same value for any columns with the same name. This is sometimes called the natural join in the software world,...
TPC-See?
One thing about concurrency control (“isolation”) in a transactional database is that it incurs costs, and there are broadly two kinds of such costs. The first such cost is the essential sacrifice of...
NULL BITMAP Builds a Database #2: Enter the Memtable
I didn't realize how hard it would be to bin-pack episodes of this thing into the roughly ~750 word chunks that I try to keep issues of this newsletter at. Lots of things are like, oh that's small, so...
In Codd we Trust (or not)
Get in loser, we’re doing another Codd philosophizing session. Codd’s paper introducing the relational model opens up like so: Future users of large data banks must be protected from having to know how...
Physical Properties #4
Previous parts of this series: Physical Properties #1, Physical Properties #2, Physical Properties #3. Relational query planning makes a distinction between "logical properties" and "physical properties."...
To Understand Correctness, You Must First Understand Incorrectness
Recently I was in a Discord channel where someone wrote something akin to “I have a question about linearizability” which had an attached thread with 80 replies. This is what game designers call...
Benchmarks That Aren't Your Friends
We’ve now talked twice about an important dimension of a benchmark: the openness of the loop. While there’s more subtlety to it, if what you take away is: open-loop is better for measuring latency,...
A Very Basic Decorrelator
Today we're going to begin implementing a simple query decorrelator. We're not going to finish it in this post, and I'm not sure how many posts it will take, and I'm not sure all the posts for it will...
CAP is Good, Actually
It seems like there are two main takes regarding the CAP theorem online: In introductory materials, it is presented as a deep, fundamental truth about distributed computation. This mystique is often...
So You Want to Generate SQL Queries (me too)
We have talked before about how to appropriately test query planners. I wrote there: I love metamorphic testing for SQL databases because it in large part reduces the problem of testing a database to...