Hello, and welcome back to the fifth post in my Flask series. This post continues from the previous one. If you missed any of the earlier posts, click any of the links below to catch up.
OUTLINE OF FLASK SERIES
- Structuring a Flask-Restful API for Production
- Creating Custom Error Pages and Handling Exceptions for Flask
- Flask authentication with JWT
- CRUD Operations with Flask
- Using Marshmallow to Simplify Parameter Validation in APIs (this article)
By the end of this post, you will be able to:
- Create a schema for serialization/deserialization
- Validate the data in a client request
- Perform data filtering before displaying the data to the client.
We need to install a few new dependencies for this post:
- Marshmallow: an ORM/ODM/framework-agnostic library for converting complex data types, such as objects, to and from native Python datatypes.
- Marshmallow-sqlalchemy: provides SQLAlchemy integration with the previously installed marshmallow library.
- Flask-marshmallow: integrates the previously installed marshmallow library with Flask applications and makes it easy to generate URL and hyperlink fields.
Activate your virtual environment and run pip install -r requirements.txt
Serialization versus Deserialization
Serialization is the process of transforming an object into a format that can be stored or transmitted. Typically in Flask apps, we use SQLAlchemy to handle our database management. However, there is one problem: front-end frameworks like React or Vue don’t understand how to read SQLAlchemy objects. This is where Marshmallow comes into play. Marshmallow transforms our SQLAlchemy objects into plain Python data that can be rendered as JSON. It also handles the reverse, deserialization: it turns incoming JSON data back into application-level objects.
In short, marshmallow schemas can be used to:
- Validate input data.
- Deserialize input data to app-level objects.
- Serialize app-level objects to primitive Python types. The serialized objects can then be rendered to standard formats such as JSON for use in an HTTP API.
Marshmallow Schemas
Marshmallow keeps track of the format of data through Schemas. A Schema is used to dictate the format of data sent to the server. It defines the fields that are accepted and validates the data types of the fields.
I will introduce you to schemas before we apply them to our User model.
Let’s start with a basic user “model”.
    import datetime as dt


    class User:
        def __init__(self, name, email, id=None):
            # id is optional here; a database would normally assign it
            self.id = id
            self.name = name
            self.email = email
            self.created_at = dt.datetime.now()
Create a schema by subclassing marshmallow.Schema and defining class attributes that map the field names in your data to Field objects.
    from marshmallow import Schema, fields


    class UserSchema(Schema):
        id = fields.Int()
        name = fields.Str()
        email = fields.Email()
        created_at = fields.DateTime()
The data type of each field is defined using the marshmallow fields. In the preceding example, the id field is an integer, the name field is a string, email is an email address, and created_at is a DateTime. Marshmallow provides a number of other field types, including Bool, Float, and so on.
With the schema specified, we can start doing object serialization and deserialization.
FIELD VALIDATION
We can also add field-level validation in the schema definition. Marshmallow runs this validation when deserializing (loading) data. For example, to make a field mandatory, pass the required=True argument.
    class UserSchema(Schema):
        id = fields.Int(required=True)
        name = fields.Str()
Serializing Objects (“Dumping”)
Serialize objects by passing them to your schema’s dump method, which returns the formatted result.
    user = User(name="Oluchi", email="Oluchi@python.org")
    schema = UserSchema()
    result = schema.dump(user)
Deserializing Objects (“Loading”)
The reverse of the dump method is load, which validates and deserializes an input dictionary to an application-level data structure.
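    schema = UserSchema()
    # load() needs the input data to validate; this dictionary is only an example
    result = schema.load({"id": 1, "name": "Oluchi"})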
USING MARSHMALLOW TO VALIDATE THE USER DATA.
Create a new file, schemas.py, in the api package (the registration resource later imports it as api.schemas) and add the following code to it.
    from marshmallow import (Schema, fields, post_dump, post_load, pre_load,
                             validate)
    from werkzeug.security import check_password_hash, generate_password_hash


    class UserSchema(Schema):
        class Meta:
            ordered = True

        id = fields.Int(dump_only=True)
        username = fields.String(required=True)
        email = fields.Email(required=True)
        # Deserialize-only field: the raw password is hashed on load
        password = fields.Method(
            required=True, deserialize="load_password"
        )
        created_at = fields.DateTime(dump_only=True)
        updated_at = fields.DateTime(dump_only=True)

        def load_password(self, value):
            return generate_password_hash(value)

        # Clean up data before validation
        @pre_load
        def process_input(self, data, **kwargs):
            # Only normalize the email if one was sent; a missing email
            # is reported by the required=True check instead of a KeyError
            if isinstance(data.get("email"), str):
                data["email"] = data["email"].lower().strip()
            return data


    user_schema = UserSchema()
In the UserSchema, we used @pre_load to process our user email. Data pre-processing and post-processing methods can be registered using the pre_load, post_load, pre_dump, and post_dump decorators.
In summary, the processing pipeline for deserialization is as follows:
1. @pre_load(pass_many=True) methods
2. @pre_load(pass_many=False) methods
3. load(in_data, many) (validation and deserialization)
4. @post_load(pass_many=True) methods
5. @post_load(pass_many=False) methods
The pipeline for serialization is similar, except that the pass_many=True processors are invoked after the pass_many=False processors.
1. @pre_dump(pass_many=False) methods
2. @pre_dump(pass_many=True) methods
3. dump(obj, many) (serialization)
4. @post_dump(pass_many=False) methods
5. @post_dump(pass_many=True) methods
Next, modify the user registration resource to use our UserSchema.
    import re
    from http import HTTPStatus

    from flask import request
    from flask_jwt_extended import (create_access_token, create_refresh_token,
                                    get_jwt_identity, get_raw_jwt, jwt_optional,
                                    jwt_refresh_token_required, jwt_required)
    from flask_restful import Api, Resource
    from marshmallow import ValidationError
    from webargs import validate
    from webargs.fields import Email, Str
    from webargs.flaskparser import use_args, use_kwargs
    from werkzeug.security import check_password_hash, generate_password_hash

    from api.models import User
    from api.schemas import user_schema

    api = Api()
    black_list = set()


    class UserRegistrationResource(Resource):
        """Define endpoints for user registration."""

        def post(self):
            """Create new user."""
            json_input = request.get_json()
            try:
                # Validate and deserialize the request body
                data = user_schema.load(json_input)
            except ValidationError as err:
                return {"errors": err.messages}, 422

            # Check if username and email exist before creation
            if User.get_by_username(data['username']):
                return {'message': 'username already exist'}, HTTPStatus.BAD_REQUEST
            if User.get_by_email(data['email']):
                return {'message': 'email already exist'}, HTTPStatus.BAD_REQUEST

            user = User(**data)
            user.save()

            # Serialize the saved user for the response
            data = user_schema.dump(user)
            data["message"] = "Successfully created a new user"
            return data, HTTPStatus.CREATED

    #----existing code
Schemas can be nested to represent relationships between objects (e.g. foreign key relationships). For example, our Note may have an author represented by a User object, like the example displayed below.
    from marshmallow import Schema, ValidationError, fields


    def must_not_be_blank(data):
        # Custom validator: reject empty values
        if not data:
            raise ValidationError("Data not provided.")


    class NoteSchema(Schema):
        id = fields.Integer(dump_only=True)
        title = fields.String(required=True, validate=must_not_be_blank)
        notes = fields.String()
        publish = fields.Boolean(dump_only=True)
        # UserSchema is the schema defined earlier
        user = fields.Nested(UserSchema)
        created_at = fields.DateTime(dump_only=True)
        updated_at = fields.DateTime(dump_only=True)

        class Meta:
            ordered = True
The link to the GitHub repository for this tutorial is here.
Until next time, happy coding.