Validating JSON Field Data in Django

Motivation

For my fifth project for Code Institute, I built a full-stack application in which users could, among other things, share recipes. A recipe involves a list of steps and a list of ingredients, which would ideally be stored as arrays. Django, however, has no database-agnostic array field (ArrayField is available only on PostgreSQL), so I decided to store the lists in a Django JSONField. That, however, brings its own set of problems.

In this article, I will show you how to validate JSON field data on the backend using a simplified model as an example.

The Setup

I assume you have the Django REST framework up and running. For this example, I will not worry about things such as authentication and security. I will just focus on making sure the JSON data coming from the front end is in the format we want before it is allowed to be saved to the database.

I have a simple model with two fields:

class Book(models.Model):
    author = models.CharField(max_length=100)
    contributor = models.JSONField()

The author field is for the main author and is a simple CharField. The contributor field is a JSONField, since a book might have numerous contributors. There is also a BookView class that automatically handles the listing and saving of new Book objects, along with a basic serializer class.

An Example of What Can Go Wrong

I have set up a command-line program as a mock front end.

let url = "http://127.0.0.1:8000/book/";

let data = {author: "John Doe", contributor: {first_name: "Art", last_name: "Garfunkel", age: 23}};


postData(url, data).then((data) => {
    console.log(data);
});

Do not focus on the url or the postData method. Look at the data variable. It is a JavaScript object with an author field and a contributor field. The value of the contributor field is itself a JavaScript object with three entries: first_name, last_name, and age.

When the data is posted to the backend, it is saved without a problem.

This is returned from the server.
{
  id: 5,
  author: 'John Doe',
  contributor: { first_name: 'Art', last_name: 'Garfunkel', age: 23 }
}

Now let us post a different book with the following data:

let data = {author: "Ray Bradbury", contributor: {type: "novel", number_of_pages: 435, age: 56, summary: "Not too bad for a novel."}};


postData(url, data).then((data) => {
    console.log(data);
});

The server accepts the request without complaint:

{
  id: 6,
  author: 'Ray Bradbury',
  contributor: {
    type: 'novel',
    number_of_pages: 435,
    age: 56,
    summary: 'Not too bad for a novel.'
  }
}

Let us look at the two results:

    {
        "id": 5,
        "author": "John Doe",
        "contributor": {
            "first_name": "Art",
            "last_name": "Garfunkel",
            "age": 23
        }
    },
    {
        "id": 6,
        "author": "Ray Bradbury",
        "contributor": {
            "type": "novel",
            "number_of_pages": 435,
            "age": 56,
            "summary": "Not too bad for a novel."
        }
    }

Do you see the problem? A front end developer can shape the contributor data in any way they like: as long as it is valid JSON, the backend will accept it and save it. This could become a nightmare for any other developer using the API, because they have no guarantee that the data served to them is in a consistent format.
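To make the inconsistency concrete, here is a small sketch in plain Python (no Django needed) that compares the keys of the two contributor values saved above. The variable names are mine and exist only for this illustration.

```python
# The two contributor payloads that the unvalidated endpoint accepted.
first = {"first_name": "Art", "last_name": "Garfunkel", "age": 23}
second = {
    "type": "novel",
    "number_of_pages": 435,
    "age": 56,
    "summary": "Not too bad for a novel.",
}

# Keys a client can rely on are only those present in every record.
shared_keys = set(first) & set(second)
# Keys that appear in one record but not in the other.
inconsistent_keys = set(first) ^ set(second)

print(sorted(shared_keys))
print(sorted(inconsistent_keys))
```

The only key the two records have in common is age, so a client reading the API cannot count on a contributor having a name at all.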

Validating JSON Data

The key to solving our problem is to have the backend check the JSON data against a schema. If the JSON data fits the schema, it is valid and saved. Otherwise, an error is returned.

For this case, I would like the JSON data to look like the following:

contributor: [
    {name: "John Doe"},
    {name: "Jane Doe"},
    {name: "Joe Smith"},
    {name: "Mary Smith"}
]

In other words, we want an array of JSON objects, each with a key called name whose value is a string. I wanted the data to be in an array because I wanted several objects of the same type for my list of contributors.

Our first step is to install a library called jsonschema.

pip install jsonschema

We create a file called validators.py in the app that has our model.

In this file, we will add the following import statement and basic schema:

import jsonschema

schema = {
    "type": ...,
    "items": ...,
    "minItems": ...,
}

We have to supply values for each of these keys.

The type is array and the minItems will be 1.

So now we have this:

schema = {
    "type": "array",
    "items": ...,
    "minItems": 1,
}

For the items key, we will supply it with the following as a value:

{
    "type": "object",
    "properties": {
        "name": {"type": "string"}
    },
    "required": ["name"],
}

This means each item in our array must be of type object (i.e., a JSON object), and if an object has a name key, the value for that key must be a string. The required list names the keys that must be present for the data to be considered valid, and it can hold as many keys as we like. In this case, we are saying that the name key is not only allowed, it is required. Note that the schema as written does not forbid other keys: an object with extra keys still passes unless "additionalProperties" is set to False. So now we have this:

schema = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "name": {"type": "string"}
        },
        "required": ["name"],
    },
    "minItems": 1,
}
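Before wiring the schema into Django, it can be tried on its own. The sketch below uses jsonschema's Draft7Validator and its is_valid method to check a few sample values. The strict_schema variant, which adds "additionalProperties": False, is my own addition for comparison and is not part of the project; the sample data is made up as well.

```python
import jsonschema

schema = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {"name": {"type": "string"}},
        "required": ["name"],
    },
    "minItems": 1,
}

# Hypothetical stricter variant (not used in the article): reject unknown keys.
strict_schema = {
    **schema,
    "items": {**schema["items"], "additionalProperties": False},
}

validator = jsonschema.Draft7Validator(schema)
strict_validator = jsonschema.Draft7Validator(strict_schema)

good = [{"name": "John Doe"}, {"name": "Jane Doe"}]
extra_key = [{"name": "John Doe", "age": 5}]

print(validator.is_valid(good))              # a list of name objects
print(validator.is_valid([]))                # an empty list violates minItems
print(validator.is_valid(extra_key))         # extra keys pass by default
print(strict_validator.is_valid(extra_key))  # the strict variant rejects them
```

Whether to forbid extra keys is a design choice. The looser schema is easier to extend later, while the strict one gives API consumers a tighter guarantee.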

Now that we have the schema defined, we can define a method to use this schema. This method will also go in validators.py.

def validate_contributors(value):
    # Raises jsonschema.ValidationError if value does not match the schema.
    jsonschema.validate(value, schema)
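One thing to keep in mind: jsonschema.validate stops at a single error and raises jsonschema's own exception type. As far as I know, the REST framework only converts its own ValidationError (and Django's) into a 400 response, so a raw jsonschema exception may surface as a server error instead. A possible alternative, sketched here under that assumption, is a helper that collects every violation as a plain message. The name contributor_errors and the message format are hypothetical.

```python
import jsonschema

schema = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {"name": {"type": "string"}},
        "required": ["name"],
    },
    "minItems": 1,
}


def contributor_errors(value):
    """Return one message per schema violation; an empty list means valid."""
    validator = jsonschema.Draft7Validator(schema)
    messages = []
    for error in validator.iter_errors(value):
        # absolute_path locates the offending element, e.g. [1] or [0, "name"].
        location = "/".join(str(part) for part in error.absolute_path) or "(root)"
        messages.append(f"{location}: {error.message}")
    return messages


print(contributor_errors([{"name": "John Doe"}]))
print(contributor_errors([{"age": 5}, {"name": 5}]))
```

Inside a serializer's field validation method, a non-empty list could then be passed to serializers.ValidationError(messages), so the client receives a 400 response listing every problem at once.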

Hooking it All Up

Now we need to wire this all together.

In our serializers.py we define a serializer that uses our validator.

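from rest_framework import serializers

# Assumes serializers.py sits in the same app as models.py and validators.py.
from .models import Book
from .validators import validate_contributors


class JSONArraySerializer(serializers.ModelSerializer):
    contributor = serializers.JSONField()

    class Meta:
        model = Book
        fields = ['id', 'author', 'contributor']

    # DRF calls validate_<field_name> automatically for the contributor field.
    def validate_contributor(self, value):
        validate_contributors(value)
        return value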

And in our view to display the Book objects we have:

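# Assumes these imports at the top of views.py in the same app:
# from rest_framework import generics
# from .models import Book
# from .serializers import JSONArraySerializer
class BookView(generics.ListCreateAPIView):
    queryset = Book.objects.all()
    serializer_class = JSONArraySerializer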

And that is it! Now, the backend will only accept lists of JSON data in our required format.

Let's Check

So now, when I try to post new data:

let data = {author: "Raymond Killwill", contributor: [
    {name: "John Doe"},
    {name: "Jane Doe"},
    {name: "John Smith"},
    {name: "Jane Smith The Second"},
]};


postData(url, data).then((data) => {
    console.log(data);
});

I get a nice response:

{
  id: 14,
  author: 'Raymond Killwill',
  contributor: [
    { name: 'John Doe' },
    { name: 'Jane Doe' },
    { name: 'John Smith' },
    { name: 'Jane Smith The Second' }
  ]
}

And if I try to make a post with this as the data:

let data = {author: "Raymond Killwill", contributor: [
    {name: "John Doe"},
    {age: 5}
]};

I get this nice complaint from the server:

Failed validating 'required' in schema['items']:
    {'properties': {'name': {'type': 'string'}},
     'required': ['name'],
     'type': 'object'}

On instance[1]:
    {'age': 5}

Or if I try to sneak in the wrong data type for name:

let data = {author: "Raymond Killwill", contributor: [
    {name: 5},
]};

the server complains again:

Failed validating 'type' in schema['items']['properties']['name']:
    {'type': 'string'}

On instance[0]['name']:
    5
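The same two failures can be reproduced without a server. This sketch catches jsonschema.ValidationError and reads its validator, absolute_path, and message attributes, which correspond to the "Failed validating ..." and "On instance..." lines above. The helper name first_failure is made up for this example.

```python
import jsonschema

schema = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {"name": {"type": "string"}},
        "required": ["name"],
    },
    "minItems": 1,
}


def first_failure(value):
    """Return (keyword, path, message) for the reported error, or None if valid."""
    try:
        jsonschema.validate(value, schema)
    except jsonschema.ValidationError as error:
        # validator is the schema keyword that failed; absolute_path is
        # where in the submitted data the failure occurred.
        return error.validator, list(error.absolute_path), error.message
    return None


print(first_failure([{"name": "John Doe"}, {"age": 5}]))
print(first_failure([{"name": 5}]))
```

For the first payload the failing keyword is required at index 1, and for the second it is type at index 0 under the name key, which matches what the server reported.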

So now you know how to validate JSON data against a schema in the Django REST framework.


Written by tony-albanese
