So far, we've looked at how to use foreign keys and relationship attributes to create database relationships. We've also seen how doing an implicit SQL I/O throws errors. Next, we are going to cover the various ways in which we can fix the errors.
TL;DR
Eager Loading Approach
Add
sa_relationship_kwargs={"lazy": LAZY_STRATEGY}
toRelationship(...)
.Values for
LAZY_STRATEGY
have been described in SQLAlchemy's docs.
Lazy Loading Approach
Generate queries using
select(User).where(User.id == 1).options(LOADER_API(User.pets))
.Values for
LOADER_API
have been described in SQLAlchemy's docs on Relationship Loader API.
Goal
Our goal remains the same. We want to create a pet management app.
In part 3 of this blog series, we discussed how to create relationships between SQLModel models.
Setting Up the Environment
We use Poetry for dependency management.
Our project environment looks like this:
sqlmodel-alembic-async/
├── sqlmodel_alembic_async/
│ ├── __init__.py
│ ├── databases.py
│ └── models.py
├── poetry.lock
└── pyproject.toml
At the time of writing, the dependencies are:
# pyproject.toml
[tool.poetry.dependencies]
python = "^3.8"
sqlmodel = "^0.0.8"
alembic = "^1.9.2"
aiosqlite = "^0.18.0" # you can use some other async database engine
Make sure you activate the Poetry shell using
poetry shell
.
Recap
>>> # ...
>>> user_someone.pets
INFO sqlalchemy.engine.Engine SELECT pet.id AS pet_id, pet.name AS pet_name, pet.user_id AS pet_user_id
FROM pet
WHERE ? = pet.user_id
INFO sqlalchemy.engine.Engine [generated in 0.00017s] (3,)
MissingGreenlet: greenlet_spawn has not been called; can't call await_only() here. Was IO attempted in an unexpected place? (Background on this error at: https://sqlalche.me/e/14/xd2s)
According to the log message, SQLAlchemy is successfully constructing a query to retrieve pets with a foreign key called user_id
equal to 3. However, on attempting to execute the query, an error occurs, indicating a failure in initiating a Greenlet coroutine, which is used for concurrent programming, most likely due to an implicit I/O operation not being performed in an async context.
Lazy or Not
To solve the problem, we can take two approaches:
Load everything in one go. Also called eager loading.
Load as required. Lazy loading, but with async support.
Load Everything in One Go
This is how the strategy works:
You query the database for an instance of a User using the ORM.
>>> db_user = await db.get(User, 1) # this means: select user with id = 1 INFO sqlalchemy.engine.Engine BEGIN (implicit) INFO sqlalchemy.engine.Engine SELECT user.name AS user_name, user.id AS user_id FROM user WHERE user.id = ? INFO sqlalchemy.engine.Engine [generated in 0.00021s] (1,)
(the logs continue below)
The ORM issues an additional query to fetch the pets owned by the User.
INFO sqlalchemy.engine.Engine SELECT pet.user_id AS pet_user_id, pet.id AS pet_id, pet.name AS pet_name FROM pet WHERE pet.user_id IN (?) INFO sqlalchemy.engine.Engine [generated in 0.00042s] (1,) >>> db_user.pets [...]
Different types of loading-related strategies in SQLAlchemy's ORM include lazy loading, select-in loading, joined loading, raise loading, subquery loading, write-only loading, and dynamic loading.
The default style is lazy loading, which loads a related reference on a single object at a time.
The select-in loading method loads all members of a linked collection/scalar references at the same time.
Joined loading loads rows that are related in the same result set.
Raise loading raises an exception if an unwanted lazy load occurs.
Subquery loading uses a subquery to load all members of a related collection/scalar reference.
Write-only loading only allows modification to a collection and requires a SELECT statement to query the collection.
Dynamic loading generates a Query object for the collection, however, it is less effective for managing big collections and is being phased out in favour of write-only loading.
The settings are fully explained in SQLAlchemy's documentation.
Also, the loading strategy you decide to use depends heavily on the use case. Each of them has its benefits and tradeoff. Start by looking into N+1 selects problem faced by an ORM.
We will stick with selectin
loading strategy for our application.
# sqlmodel_alembic_async/models.py
# ...
class User(SQLModel, table=True):
id: int | None = Field(default=None, primary_key=True)
name: str
pets: List["Pet"] = Relationship(
back_populates="user",
+ sa_relationship_kwargs={"lazy": "selectin"},
)
class Pet(SQLModel, table=True):
id: int | None = Field(default=None, primary_key=True)
name: str
user_id: int | None = Field(default=None, foreign_key="user.id")
user: Optional[User] = Relationship(
back_populates="pets",
+ sa_relationship_kwargs={"lazy": "selectin"},
)
And now you can query the database as usual.
>>> from sqlmodel import select
>>> stmt = select(Pet).where(Pet.id == 1)
>>> pet = (await db.exec(stmt)).one()
>>> pet
Pet(user_id=1, name='phroge', id=1, user=User(id=1, name='chonky', pets=[Pet(user_id=1, name='phroge', id=1, user=User(id=1, name='chonky', pets=[...]))]))
>>> pet.user
User(id=1, name='chonky', pets=[Pet(user_id=1, name='phroge', id=1, user=User(id=1, name='chonky', pets=[...]))])
Benefits
- Very simple to implement.
Drawbacks
If the related objects are big, then loading them unnecessarily would waste CPU cycles.
It has been described as a code smell.
Load as Required
In this strategy, we will use the loader strategies described above within the query itself.
>>> from sqlmodel import select
>>> from sqlalchemy.orm import selectinload
>>>
>>> stmt = select(User).where(User.id == 5).options(selectinload(User.pets))
>>>
>>> db_user = (await db.exec(stmt)).one()
>>>
>>> db_user
User(id=5, name='waby', pets=[Pet(user_id=5, name='bb', id=4, user=User(id=5, name='waby', pets=[...]))])
Let's break down the above code.
>>> stmt = select(User).where(User.id == 5).options(selectinload(User.pets))
Here, we ask the database to select a User whose ID is equal to 1. Additionally, issue an additional query using selectinload
strategy that loads the related objects called pets
that is accessed via User.pets
attribute.
Where did the selectinload()
come from? It is a part of SQLAlchemy's Relationship Loader API.
>>> db_user = (await db.exec(stmt)).one()
Then we execute the statement and pick one item from the result.
And that's the "Load as Required" strategy.
Benefits
- Truly lazy strategy: You load what you need.
Drawbacks
- Complex to work with
A "Lazy" Tip
Should you choose to implement the "Load as Required" approach, it is possible to configure SQLAlchemy to raise an exception in the event of an unwanted implicit I/O operation being triggered by attribute access.
In the code below, we use the lazy="raise"
strategy. Raise loading raises an exception if an unwanted lazy load occurs.
# sqlmodel_alembic_async/models.py
# ...
class User(UserBase, table=True):
id: int | None = Field(default=None, primary_key=True)
name: str
pets: List["Pet"] = Relationship(
back_populates="user",
+ sa_relationship_kwargs={"lazy": "raise"},
)
class Pet(SQLModel, table=True):
id: int | None = Field(default=None, primary_key=True)
name: str
user_id: int | None = Field(default=None, foreign_key="user.id")
user: Optional[User] = Relationship(
back_populates="pets",
+ sa_relationship_kwargs={"lazy": "raise"},
)
What's next?
Now that we've learnt how to take responsibility for our relationships 😌, we will learn how to integrate it into a working FastAPI application.