When Compositionality Fails
The idea of abstraction is that we can take some complex thing and present it as some simpler thing, where people can ignore the aspects of the thing not relevant to what they're actually trying to do...
A Sniff Test for Some Query Optimizers
One important part of query planning is performing transformations over queries. Today I want to see how a couple common databases perform on a completely made-up and unrepresentative benchmark. I have...
My First Distributed System
I can show you a picture of the first distributed system I ever used: (Not entirely accurate, I had a Game Boy Color.) When I was a kid, we'd spend summers up at "the lake," where all kinds of fun stuff...
The Geometry of SQL
Today I want to talk about a way to think about some relational algebra operations. First, let's start with this relation: This is just a handful of games that appeared on a handful of systems, but let's...
Heath's Theorem
We don't have all that much in the world of relational query planning that could be considered a "fundamental theorem," as in like, some central idea that everything else rests on. This is partly...
Not all Graphs are Trees
It's pretty easy to imagine how to represent relational algebra expressions as a tree—they are already structurally rooted trees where each operator has its inputs as children. Even in a language like...
The Official NULL BITMAP Glossary: Graph Theory Edition
Last week a post of mine made it to God’s favourite website and one thing I was struck by was how many people disagreed about basic graph theory terminology. Some people also seemed to complain about...
A Card Counting Trick
I used to call this thing a “game” but my friend Kevin (who is not the same Kevin as last week but who is also a mathematician) kept telling me it’s really more of a “trick.” So, it’s a trick, in the...
NULL BITMAP Builds a Database #1: The Log is Literally the Database
It is time to end the tyranny of people becoming interested in database implementation and building a BTree. Let us turn to the succor of immutable storage. Today we are starting a new series. I know I...
Avoiding Cross Products with the Query Graph
To compute the join of two relations, we find all pairs of their rows which have the same value for any columns with the same name. This is sometimes called the natural join in the software world,...
TPC-See?
One thing about concurrency control (“isolation”) in a transactional database is that it incurs costs, and there’s broadly two kinds of such costs. The first such cost is the essential sacrifice of...
The Closed-Loop Benchmark Trap
Let's make a little mock database. It won't actually do anything but pretend to handle requests. I've had cause at work lately to write some Go so we're going to use...
Languages Without Abstraction
Implementing something like a compiler, there is the understanding that we want different representations of a program for different purposes. This is why we have stuff like a “control-flow graph” or...
SQL's Grammar Ambiguity
I was going to write a longer, more involved post this week, but I wound up getting Covid and didn't have it in me to work through all the stuff I had to for that. So here is a smaller issue. I know of...
Measuring Throughput
Note: I'm trying out enabling comments. Behave! I reserve the right to disable them again for any reason including "the very idea of someone commenting made me anxious." We talked recently about why...
The Three Places for Data in an LSM
We have talked before about how to conceptualize what an LSM does. I want to talk about another way to think through how we put together this data structure. One way to think about what we want from a...
Some of My Favourite Query Planning Papers
Something I've learned about programmers is that for some reason they absolutely love being recommended papers. They lose their minds for it. So here is an incomplete list of some of my favourite...
A Very Basic Decorrelator
Today we're going to begin implementing a simple query decorrelator. We're not going to finish it in this post, and I'm not sure how many posts it will take, and I'm not sure all the posts for it will...
CAP is Good, Actually
It seems like there are two main takes regarding the CAP theorem online: In introductory materials, it is presented as a deep, fundamental truth about distributed computation. This mystique is often...
So You Want to Generate SQL Queries (me too)
We have talked before about how to appropriately test query planners. I wrote there: I love metamorphic testing for SQL databases because it in large part reduces the problem of testing a database to...