The Async Side of SQLModel Relationships — Part 2

The Async Side of SQLModel Relationships — Part 2

So far, we've looked at how to use foreign keys and relationship attributes to create database relationships. We've also seen how doing an implicit SQL I/O throws errors. Next, we are going to cover the various ways in which we can fix the errors.

TL;DR

Eager Loading Approach

  1. Add sa_relationship_kwargs={"lazy": LAZY_STRATEGY} to Relationship(...).

    Values for LAZY_STRATEGY have been described in SQLAlchemy's docs.

Lazy Loading Approach

  1. Generate queries using select(User).where(User.id == 1).options(LOADER_API(User.pets)).

    Values for LOADER_API have been described in SQLAlchemy's docs on Relationship Loader API.

Goal

Our goal remains the same. We want to create a pet management app.

In part 3 of this blog series, we discussed how to create relationships between SQLModel models.

Setting Up the Environment

We use Poetry for dependency management.

Our project environment looks like this:

sqlmodel-alembic-async/
├── sqlmodel_alembic_async/
│   ├── __init__.py
│   ├── databases.py
│   └── models.py
├── poetry.lock
└── pyproject.toml

At the time of writing, the dependencies are:

# pyproject.toml
[tool.poetry.dependencies]
python = "^3.8"
sqlmodel = "^0.0.8"
alembic = "^1.9.2"
aiosqlite = "^0.18.0" # you can use some other async database engine

Make sure you activate the Poetry shell using poetry shell.

Recap

>>> # ...
>>> user_someone.pets
INFO sqlalchemy.engine.Engine SELECT pet.id AS pet_id, pet.name AS pet_name, pet.user_id AS pet_user_id 
FROM pet 
WHERE ? = pet.user_id
INFO sqlalchemy.engine.Engine [generated in 0.00017s] (3,)

MissingGreenlet: greenlet_spawn has not been called; can't call await_only() here. Was IO attempted in an unexpected place? (Background on this error at: https://sqlalche.me/e/14/xd2s)

According to the log message, SQLAlchemy is successfully constructing a query to retrieve pets with a foreign key called user_id equal to 3. However, on attempting to execute the query, an error occurs, indicating a failure in initiating a Greenlet coroutine, which is used for concurrent programming, most likely due to an implicit I/O operation not being performed in an async context.

Lazy or Not

To solve the problem, we can take two approaches:

  1. Load everything in one go. Also called eager loading.

  2. Load as required. Lazy loading, but with async support.

Load Everything in One Go

This is how the strategy works:

  1. You query the database for an instance of a User using the ORM.

     >>> db_user = await db.get(User, 1) # this means: select user with id = 1
     INFO sqlalchemy.engine.Engine BEGIN (implicit)
     INFO sqlalchemy.engine.Engine SELECT user.name AS user_name, user.id AS user_id 
     FROM user 
     WHERE user.id = ?
     INFO sqlalchemy.engine.Engine [generated in 0.00021s] (1,)
    

    (the logs continue below)

  2. The ORM issues an additional query to fetch the pets owned by the User.

     INFO sqlalchemy.engine.Engine SELECT pet.user_id AS pet_user_id, pet.id AS pet_id, pet.name AS pet_name 
     FROM pet 
     WHERE pet.user_id IN (?)
     INFO sqlalchemy.engine.Engine [generated in 0.00042s] (1,)
     >>> db_user.pets
     [...]
    

Different types of loading-related strategies in SQLAlchemy's ORM include lazy loading, select-in loading, joined loading, raise loading, subquery loading, write-only loading, and dynamic loading.

  • The default style is lazy loading, which loads a related reference on a single object at a time.

  • The select-in loading method loads all members of a linked collection/scalar references at the same time.

  • Joined loading loads rows that are related in the same result set.

  • Raise loading raises an exception if an unwanted lazy load occurs.

  • Subquery loading uses a subquery to load all members of a related collection/scalar reference.

  • Write-only loading only allows modification to a collection and requires a SELECT statement to query the collection.

  • Dynamic loading generates a Query object for the collection, however, it is less effective for managing big collections and is being phased out in favour of write-only loading.

The settings are fully explained in SQLAlchemy's documentation.

Also, the loading strategy you decide to use depends heavily on the use case. Each of them has its benefits and tradeoff. Start by looking into N+1 selects problem faced by an ORM.

We will stick with selectin loading strategy for our application.

# sqlmodel_alembic_async/models.py
# ...

class User(SQLModel, table=True):
    id: int | None = Field(default=None, primary_key=True)
    name: str

    pets: List["Pet"] = Relationship(
        back_populates="user",
+       sa_relationship_kwargs={"lazy": "selectin"},
    )

class Pet(SQLModel, table=True):
    id: int | None = Field(default=None, primary_key=True)
    name: str

    user_id: int | None = Field(default=None, foreign_key="user.id")
    user: Optional[User] = Relationship(
        back_populates="pets",
+       sa_relationship_kwargs={"lazy": "selectin"},
    )

And now you can query the database as usual.

>>> from sqlmodel import select
>>> stmt = select(Pet).where(Pet.id == 1)
>>> pet = (await db.exec(stmt)).one()
>>> pet
Pet(user_id=1, name='phroge', id=1, user=User(id=1, name='chonky', pets=[Pet(user_id=1, name='phroge', id=1, user=User(id=1, name='chonky', pets=[...]))]))
>>> pet.user
User(id=1, name='chonky', pets=[Pet(user_id=1, name='phroge', id=1, user=User(id=1, name='chonky', pets=[...]))])

Benefits

  • Very simple to implement.

Drawbacks

  • If the related objects are big, then loading them unnecessarily would waste CPU cycles.

  • It has been described as a code smell.

Load as Required

In this strategy, we will use the loader strategies described above within the query itself.

>>> from sqlmodel import select
>>> from sqlalchemy.orm import selectinload
>>>
>>> stmt = select(User).where(User.id == 5).options(selectinload(User.pets))
>>>
>>> db_user = (await db.exec(stmt)).one()
>>>
>>> db_user
User(id=5, name='waby', pets=[Pet(user_id=5, name='bb', id=4, user=User(id=5, name='waby', pets=[...]))])

Let's break down the above code.

>>> stmt = select(User).where(User.id == 5).options(selectinload(User.pets))

Here, we ask the database to select a User whose ID is equal to 1. Additionally, issue an additional query using selectinload strategy that loads the related objects called pets that is accessed via User.pets attribute.

Where did the selectinload() come from? It is a part of SQLAlchemy's Relationship Loader API.

>>> db_user = (await db.exec(stmt)).one()

Then we execute the statement and pick one item from the result.

And that's the "Load as Required" strategy.

Benefits

  • Truly lazy strategy: You load what you need.

Drawbacks

  • Complex to work with

A "Lazy" Tip

Should you choose to implement the "Load as Required" approach, it is possible to configure SQLAlchemy to raise an exception in the event of an unwanted implicit I/O operation being triggered by attribute access.

In the code below, we use the lazy="raise" strategy. Raise loading raises an exception if an unwanted lazy load occurs.

# sqlmodel_alembic_async/models.py
# ...

class User(UserBase, table=True):
    id: int | None = Field(default=None, primary_key=True)
    name: str

    pets: List["Pet"] = Relationship(
        back_populates="user",
+       sa_relationship_kwargs={"lazy": "raise"},
    )

class Pet(SQLModel, table=True):
    id: int | None = Field(default=None, primary_key=True)
    name: str

    user_id: int | None = Field(default=None, foreign_key="user.id")
    user: Optional[User] = Relationship(
        back_populates="pets",
+       sa_relationship_kwargs={"lazy": "raise"},
    )

What's next?

Now that we've learnt how to take responsibility for our relationships 😌, we will learn how to integrate it into a working FastAPI application.

Did you find this article valuable?

Support Arunanshu's Ramblings by becoming a sponsor. Any amount is appreciated!