Ran into a severe performance issue yesterday. I h...
# prisma-client
s
Ran into a severe performance issue yesterday. I have a model like
Copy code
model User {
  id           Int      @id @default(autoincrement())
  primaryEmail String   @unique @map("primary_email")
  emails        Email[]
  @@map("users")
}

model Email {
  id                 Int      @id @default(autoincrement())
  userId             Int      @map("user_id")
  user               User     @relation(fields: [userId], references: [id])
  email              String   @unique
  @@map("emails")
}
And I'd like to get all users who don't have an email, per the docs, I'm running:
Copy code
prisma.user.findMany({
  where: { emails: { none: {} } }
});
This creates SQL along the lines of:
Copy code
SELECT * FROM users
WHERE (users.id) NOT IN (
    SELECT users.id
    FROM users
    INNER JOIN emails
        ON (emails.user_id) = (users.id)
    WHERE (users.id IS NOT NULL)
)
The query plan for this
Copy code
EXPLAIN SELECT * FROM users
WHERE (users.id) NOT IN (
    SELECT users.id
    FROM users
    INNER JOIN emails
        ON (emails.user_id) = (users.id)
    WHERE (users.id IS NOT NULL)
);
                                          QUERY PLAN
----------------------------------------------------------------------------------------------
 Seq Scan on users  (cost=2057.26..59509346.61 rows=24028 width=26)
   Filter: (NOT (SubPlan 1))
   SubPlan 1
     ->  Materialize  (cost=2057.26..4394.34 rows=55777 width=4)
           ->  Hash Join  (cost=2057.26..3897.46 rows=55777 width=4)
                 Hash Cond: (emails.user_id = users_1.id)
                 ->  Seq Scan on emails  (cost=0.00..1069.77 rows=55777 width=4)
                 ->  Hash  (cost=1268.56..1268.56 rows=48056 width=4)
                       ->  Seq Scan on users users_1  (cost=0.00..1268.56 rows=48056 width=4)
                             Filter: (id IS NOT NULL)
(10 rows)
is prohibitively expensive to run because it materializes a table of all users who have emails and then scans it for each row in users. Our users table is relatively small (~52,000 rows) but these queries take forever (i.e. they don't terminate) and started clogging up our db connection pool. I can write a handwritten query that joins a table of all users without an email that runs almost instantly. I'm wondering if there is an explanation for this behavior so I can understand and avoid it in the future, or if this is sub-optimal performance that could be fixed with a PR?
v
@Sean Sullivan thank you for raising this! I've moved the convo to #C036EJQ7J6N