Native queries for MongoDB
Once upon a time, object databases were a hot topic. They suffered from the same problem as modern databases: the query language was string based, which led to runtime errors after typos, code updates etc. One of the databases experimented with native queries for persistent objects in Java and C#. Native queries were much easier to work with, since they were written in the same language as the rest of the code and benefited from type checking, code completion etc.
Now fast forward to today. There are new languages and document databases, such as MongoDB, with the same inconvenience: queries are usually either string based or general dictionaries (such as JSON or BSON). In this post I'll try to implement native queries for MongoDB. I'll use Julia because it has nice macros. A similar solution would work for Rust (via macros) and Go (via its built-in code generation support).
We first define the type for persistent objects:
    @kwdef struct Person
        name::String
        age::Int
    end
Instances of Person will be converted into BSON when storing them and back from BSON when retrieving them.
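The post does not show the conversion code, so the following is only a minimal sketch of what the round trip could look like. To keep it runnable without a MongoDB driver, a plain Dict{String,Any} stands in for the BSON document, and bsonfromobject is a hypothetical helper name; objectfrombson mirrors the name used later in the post, but its body here is an assumption.

```julia
# Sketch only: a Dict{String,Any} stands in for a BSON document.
Base.@kwdef struct Person
    name::String
    age::Int
end

# Hypothetical helper: turn any struct into a document keyed by field name.
function bsonfromobject(obj)
    doc = Dict{String,Any}()
    for field in fieldnames(typeof(obj))
        doc[String(field)] = getfield(obj, field)
    end
    return doc
end

# Assumed implementation: rebuild an object of type T by reading its
# fields from the document in declaration order.
function objectfrombson(doc, ::Type{T}) where {T}
    values = [doc[String(field)] for field in fieldnames(T)]
    return T(values...)
end

john = Person(name = "John", age = 42)
doc = bsonfromobject(john)
restored = objectfrombson(doc, Person)
```

Because both helpers only rely on fieldnames, the same pair would work for any flat struct; nested objects and arrays would need extra handling.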
Let's now assume we have a MongoDB collection in the collection variable. We want to be able to use something like:
    for obj in run(
        collection,
        @query p::Person -> p.name == "John" && p.age >= 30
    )
        println(obj)
    end
The query looks like a usual lambda expression in Julia, but of course this can't be passed to the MongoDB server. Under the hood the @query macro translates the lambda expression into a BSON document. More specifically, it creates an instance of the following structure:
    struct Query
        doc::Mongoc.BSON
        type::Type
        fn::Function
    end
The doc field contains the corresponding BSON document, the type field contains the type of the objects we want to fetch and the fn field is the original lambda expression.
For example, in the above code the lambda expression is translated into the following BSON query:
{ "name" : "John", "age" : { "$gte" : 30 } }
The retrieved documents are then converted into instances of Person (the type is taken from p::Person) and an array of these instances is returned.
Under the hood
The macro is defined as follows:
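    macro query(fn)
        # The argument must be an anonymous function of the form var::Type -> body
        if !Meta.isexpr(fn, :(->))
            error("query must be an anonymous function")
        end
        local sig = fn.args[1]
        if !Meta.isexpr(sig, :(::))
            error("query function must take one typed argument")
        end
        local var = sig.args[1]
        local type = sig.args[2]
        # The body of a lambda is a block: a line number node followed by the expression
        local body = fn.args[2]
        if !Meta.isexpr(body, :block)
            error("query function must be a block")
        end
        local doc = queryfromast(body.args[2], var)
        quote
            # esc() makes the type and the lambda resolve in the caller's scope
            Query(Mongoc.BSON($doc), $(esc(type)), $(esc(fn)))
        end
    end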
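To make the indexing into fn.args easier to follow, here is a small sketch that quotes the example lambda and inspects the pieces the macro picks out. It uses only the standard Expr fields; the variable names are chosen for illustration.

```julia
# Quote the lambda instead of evaluating it, as a macro argument would be.
fn = :(p::Person -> p.name == "John" && p.age >= 30)

fn.head                 # the arrow symbol, so this is an anonymous function
sig = fn.args[1]        # the typed argument, p::Person
sig.head                # the :: symbol
var, type = sig.args    # the symbols p and Person

body = fn.args[2]       # lambda bodies are wrapped in a block
body.head               # :block
body.args[1]            # a LineNumberNode recording where the body starts
cond = body.args[2]     # the actual expression: p.name == "John" && p.age >= 30

cond.head               # :&&, with the two comparisons as its arguments
cond.args[1]            # a call expression: ==, p.name, "John"
cond.args[1].args[2]    # field access p.name, head is the dot symbol
```

This is why the macro reads args[2] of the block: args[1] is only the line number node.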
The queryfromast function converts an expression (for example, the body of the anonymous function) into an instance of Expr (an abstract syntax tree) representing the query BSON's attribute-value pairs.
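The body of queryfromast is not shown in the post, so what follows is a hedged sketch of one way it could work, not the original implementation. It handles only && and simple comparisons with the field on the left-hand side, and it builds a Dict{String,Any} rather than a Mongoc.BSON so that it runs without the driver. The names MONGO_OPS, isfieldof and @querydoc are hypothetical.

```julia
# Assumed mapping from Julia comparison operators to MongoDB query operators.
const MONGO_OPS = Dict(:(>) => "\$gt", :(>=) => "\$gte",
                       :(<) => "\$lt", :(<=) => "\$lte", :(!=) => "\$ne")

# True if `ex` is a field access on the query variable, e.g. p.age.
isfieldof(ex, var) =
    Meta.isexpr(ex, :., 2) && ex.args[1] == var && ex.args[2] isa QuoteNode

# Return an Expr that, when evaluated, builds the query document.
function queryfromast(ex, var)
    if Meta.isexpr(ex, :&&)
        # Both sides must hold, so merge their documents. If both sides
        # constrain the same field, the right side wins and the query
        # becomes less specific, never more.
        left = queryfromast(ex.args[1], var)
        right = queryfromast(ex.args[2], var)
        return :(merge($left, $right))
    elseif Meta.isexpr(ex, :call, 3) && isfieldof(ex.args[2], var)
        op, field, value = ex.args
        name = String(field.args[2].value)
        val = esc(value)  # evaluate the compared value in the caller's scope
        if op == :(==)
            return :(Dict{String,Any}($name => $val))
        elseif haskey(MONGO_OPS, op)
            mongo_op = MONGO_OPS[op]
            return :(Dict{String,Any}($name => Dict{String,Any}($mongo_op => $val)))
        end
    end
    # Anything unrecognised contributes no constraint.
    return :(Dict{String,Any}())
end

# Hypothetical test macro: expands to the query document only.
macro querydoc(fn)
    var = fn.args[1].args[1]
    body = fn.args[2].args[2]
    return queryfromast(body, var)
end

doc = @querydoc p::Person -> p.name == "John" && p.age >= 30

minage = 30
doc2 = @querydoc p::Person -> p.age >= minage
```

Returning an empty document for unsupported expressions errs on the side of fetching too much, which is safe as long as the fetched objects are checked against the original lambda afterwards.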
The run function is relatively simple:
    function run(collection::Mongoc.Collection, query::Query)
        println("running with query BSON: $(query.doc)")
        local objects = Vector{query.type}()
        for doc in Mongoc.find(collection, query.doc)
            local obj = objectfrombson(doc, query.type)
            if query.fn(obj)
                push!(objects, obj)
            end
        end
        return objects
    end
Note that the lambda expression (stored in query.fn) is evaluated for every fetched object. This is because the BSON query might be less specific than the lambda expression.
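As an illustration of why the second check matters, consider a condition such as isodd(p.age). Assuming the translation has no BSON equivalent for it, the document sent to the server would only contain the name constraint. The sketch below simulates that case: the fetched vector is a stand-in for what the server might return for { "name" : "John" }, not real query output.

```julia
struct Person
    name::String
    age::Int
end

# The original lambda, including a part assumed to be untranslatable.
fn = p -> p.name == "John" && isodd(p.age)

# Stand-in for the objects fetched with the less specific query
# { "name" : "John" }: every John comes back, regardless of age.
fetched = [Person("John", 30), Person("John", 41)]

# Re-applying the lambda on the client drops the objects that the
# server could not filter out.
objects = filter(fn, fetched)
```

Only the second object survives the client-side check, so the result matches the lambda even though the server-side query was broader.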