Relating collections in non-relational databases

This article discusses how I used the populate() method from Mongoose to link different collections within a MongoDB database to provide rich data, while not falling foul of document size limitations. It should be useful to relatively novice developers who know a bit about MongoDB and Mongoose, and are keen to go into a little more depth.

SQL and NoSQL databases

When I got to the point of learning about databases I was really excited. This felt like the point at which my web applications were suddenly going to have depth. I could have users, and my users could have data, and their data could persist - suddenly this massively opened up the possibilities for what I could build. This, more than any other point on my journey, made me feel like I was well on the way to becoming a software developer.

I soon learned that there are two main types of database: SQL (Structured Query Language) databases and NoSQL databases. SQL databases are "relational", i.e. data is spread across multiple tables which can be related together within a query, producing an output based on these relations. In contrast, NoSQL databases are "non-relational" - the data is not stored in tables with rows and columns. In a document database such as MongoDB, it is instead stored in "documents" which resemble JSON and comprise a series of key/value pairs. These documents are stored in various "collections", which sit within your database.

For whatever reason, my learning has focused on NoSQL databases, in the form of MongoDB. I've looked at SQL but not really learned how to use it, so when I build an application MongoDB is always the one I use.

Using MongoDB with your application

MongoDB is great. It's easy to configure, straightforward to use, and, because it doesn't use tables, you don't need to fully plan out what your data is going to look like in advance, so it's good for those of us who are at the stage where we're still figuring things out.

What's more, the Mongoose library makes working with MongoDB even more straightforward.

As readers will know, before saving documents to a collection, you first need to create a Schema, which is basically a template for the data, and a model built from that Schema. The Schema has a series of properties which represent the data it will hold; you give each property a name, and set its type (e.g. String, Number, Date etc).

For instance, I was making a fitness application with which users can track how much exercise they are doing, and my user Schema looked something like this:

const mongoose = require("mongoose");

const userSchema = new mongoose.Schema(
{
	email: String,
	password: String,
	username: String,
	// etc ...
})

module.exports.User = mongoose.model("User", userSchema);

An obstacle that I came up against quickly was the question of how to save a user's activities in this database.

Initially I thought I'd just have an activities array within each user, into which I'd save every activity a user recorded. However, MongoDB caps the size of an individual document (at 16 MB), and an array that grows with every recorded activity would eventually run into that limit, so this wasn't going to be an appropriate approach.
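To get a feel for the difference, here is a rough back-of-the-envelope sketch comparing an embedded activity object with a bare id. The sample activity and its field names are invented for illustration, and the JSON length is only a stand-in for the real BSON size, so treat the numbers as orders of magnitude rather than exact limits.

```javascript
// Rough size comparison: embedding whole activities vs storing only their ids.
const MAX_DOCUMENT_BYTES = 16 * 1024 * 1024; // MongoDB's per-document size limit
const OBJECT_ID_BYTES = 12; // an ObjectId value is 12 bytes

// Hypothetical activity; these field names are assumptions, not the app's schema.
const sampleActivity = {
	type: "run",
	date: "2021-03-01T07:30:00.000Z",
	durationMinutes: 42,
	distanceKm: 8.1,
	notes: "Easy pace along the canal",
};

// JSON length is used as an approximation of the stored size.
const embeddedBytes = Buffer.byteLength(JSON.stringify(sampleActivity), "utf8");

// Upper bounds only: both ignore per-element overhead and the user's other fields.
const maxEmbedded = Math.floor(MAX_DOCUMENT_BYTES / embeddedBytes);
const maxIds = Math.floor(MAX_DOCUMENT_BYTES / OBJECT_ID_BYTES);

console.log({ embeddedBytes, maxEmbedded, maxIds });
```

Even with these simplifications, an array of ids has far more headroom than an array of full activity objects.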

Instead, I needed a separate activities collection into which would go every activity from all users, which I would then relate back to the individual users somehow.

As you can probably see, this solution involves getting this non-relational database to act in a relational manner. Thanks to Mongoose, this is actually extremely simple!

Mongoose and populate()

I was able to achieve this relational behaviour using Mongoose's populate() method, which works as follows.

When creating my user Schema, I added a property called activities, which was an array of mongoose.Schema.Types.ObjectId values with a ref naming the model whose collection contains the activity documents, like so:

const userSchema = new mongoose.Schema(
{
	email: String,
	password: String,
	username: String,
	// Each entry is the _id of a document belonging to the "Activity" model
	activities: [{ type: mongoose.Schema.Types.ObjectId, ref: "Activity" }],
	// etc ...
});

When saving a user's activity to the activities collection, my app also takes the _id of the new activity document and adds it to that user's activities array. Thus the array doesn't get too huge as it contains only ids, rather than objects with the activity information. Instead these objects live in their own collection and are referred to in the user's document via their unique _id which is automatically created by MongoDB.
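The two-step save described above could be sketched roughly as follows. This is not the app's actual code: the function name recordActivity is hypothetical, and the User and Activity models are passed in as arguments so the snippet stands on its own. Activity.create() and User.findByIdAndUpdate() with $push are standard Mongoose model methods.

```javascript
// Sketch only: in a real app, User and Activity would be the Mongoose models.
// recordActivity is a hypothetical helper name.
async function recordActivity(User, Activity, userId, activityData) {
	// 1. Save the activity as a document in its own collection.
	const activity = await Activity.create(activityData);

	// 2. Add only the new document's _id to the user's activities array.
	await User.findByIdAndUpdate(userId, {
		$push: { activities: activity._id },
	});

	return activity;
}
```

Note that these are two separate writes, so a production app may want to handle the case where the first succeeds and the second fails.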

This is all well and good, but the next stage was to use this array to return information about the user's activities.

Doing this is simply a case of chaining the populate() method onto your find method, specifying the property that you want to populate. Thus, to get a user's information, I used this query:

const user = await User.findById(userId).populate({path: "activities"})

This tells Mongoose that I want to find a specific user by their id, and that having found them I want the entries in the activities array to be populated, i.e. for the complete documents to be returned within the array, rather than just the ids. By doing this I was able to get a complete picture of the activities the user has been doing, without storing all that bulky information in their user document. I was then able to use this to render information about their activities to the user's dashboard.
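Conceptually, population swaps each stored id for the document it points to. The snippet below is a simplified in-memory model of that idea in plain JavaScript, not Mongoose's implementation: the "collection" is just a Map, the ids are plain strings rather than ObjectIds, and the populatePath helper and sample data are invented for illustration.

```javascript
// A stand-in for the activities collection, keyed by _id.
const activitiesCollection = new Map([
	["a1", { _id: "a1", type: "run", date: new Date("2021-03-01") }],
	["a2", { _id: "a2", type: "swim", date: new Date("2021-03-03") }],
]);

// A user document as stored: the activities array holds ids only.
const storedUser = { _id: "u1", username: "max", activities: ["a1", "a2"] };

// Return a copy of doc in which doc[path] holds full documents instead of ids.
// Ids with no matching document are dropped in this sketch.
function populatePath(doc, path, collection) {
	const populated = doc[path]
		.map((id) => collection.get(id))
		.filter((found) => found !== undefined);
	return { ...doc, [path]: populated };
}

const populatedUser = populatePath(storedUser, "activities", activitiesCollection);
console.log(populatedUser.activities.map((a) => a.type)); // logs [ 'run', 'swim' ]
```

The stored user document is left untouched; only the returned copy carries the full activity objects, which mirrors the fact that populating a query result does not change what is saved in the database.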

And you can go beyond this too, adding options to your population (e.g. to order the results), and even populating multiple properties. In the following snippet I am populating the activities property, ordering the results by date in descending order, and then also populating the goals and suggestions properties:

const user = await User.findById(userId).populate([
	{ path: "activities", options: { sort: { date: -1 } } },
	{ path: "goals" },
	{ path: "suggestions" },
]);
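For anyone unsure what sort: { date: -1 } means, the -1 asks for descending order, so the most recent activity comes first. The plain JavaScript below reproduces that ordering on a small array; the sample activities and the type field are made up for illustration, and only the date field comes from the query above.

```javascript
// Hypothetical populated activities, deliberately out of order.
const activities = [
	{ type: "run", date: new Date("2021-03-01") },
	{ type: "swim", date: new Date("2021-03-03") },
	{ type: "cycle", date: new Date("2021-03-02") },
];

// Newest first, the same ordering that sort: { date: -1 } requests.
// Copy the array first so the original order is preserved.
const newestFirst = [...activities].sort((a, b) => b.date - a.date);

console.log(newestFirst.map((a) => a.type)); // logs [ 'swim', 'cycle', 'run' ]
```

Letting the query do the sorting, as in the populate() call above, saves doing this by hand after the data arrives.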

Thus, by using this extremely simple Mongoose method I was able to add a relational aspect to my database, in order to store and access my data in a smarter, more efficient manner.

Going forward

This experience helped me learn a lot about how databases work, and has allowed me to understand some of the difficulties inherent in working with data. The populate() method provides a neat way of working with non-relational databases that allows us to harness multiple collections. There is no doubt a lot more for me to learn about this area, and I can't wait to dig deeper. But for now, populate() is giving me the ability to make more complex applications at this relatively early stage of my coding adventure, and as such I'm glad to be able to share it with others in a similar position.

Written by Max Brookman-Byrne

Self-taught junior web developer, currently honing my MERN skills with a bootcamp at The Developer Academy, Sheffield.
