Sean Sullivan
03/09/2022, 6:35 PMmodel User {
id Int @id @default(autoincrement())
primaryEmail String @unique @map("primary_email")
emails Email[]
@@map("users")
}
model Email {
id Int @id @default(autoincrement())
userId Int @map("user_id")
user User @relation(fields: [userId], references: [id])
email String @unique
@@map("emails")
}
And I'd like to get all users who don't have an email, per the docs, I'm running:
prisma.user.findMany({
where: { emails: { none: {} } }
});
This creates SQL along the lines of:
SELECT * FROM users
WHERE (users.id) NOT IN (
SELECT users.id
FROM users
INNER JOIN emails
ON (emails.user_id) = (users.id)
WHERE (users.id IS NOT NULL)
)
The query plan for this
EXPLAIN SELECT * FROM users
WHERE (users.id) NOT IN (
SELECT users.id
FROM users
INNER JOIN emails
ON (emails.user_id) = (users.id)
WHERE (users.id IS NOT NULL)
);
QUERY PLAN
----------------------------------------------------------------------------------------------
Seq Scan on users (cost=2057.26..59509346.61 rows=24028 width=26)
Filter: (NOT (SubPlan 1))
SubPlan 1
-> Materialize (cost=2057.26..4394.34 rows=55777 width=4)
-> Hash Join (cost=2057.26..3897.46 rows=55777 width=4)
Hash Cond: (emails.user_id = users_1.id)
-> Seq Scan on emails (cost=0.00..1069.77 rows=55777 width=4)
-> Hash (cost=1268.56..1268.56 rows=48056 width=4)
-> Seq Scan on users users_1 (cost=0.00..1268.56 rows=48056 width=4)
Filter: (id IS NOT NULL)
(10 rows)
is prohibitively expensive to run because it materializes a table of all users who have emails and then scans it for each row in users. Our users table is relatively small (~52,000 rows) but these queries take forever (i.e. they don't terminate) and started clogging up our db connection pool. I can write a handwritten query that joins a table of all users without an email that runs almost instantly.
I'm wondering if there is an explanation for this behavior so I can understand and avoid it in the future, or if this is sub-optimal performance that could be fixed with a PR?Vladi Stevanovic