:wave: Hello. I have a need to get 5-15k records j...
# orm-help
r
đź‘‹ Hello. I have a need to get 5-15k records joining 5-6 tables and this is a bottleneck on the DB given how prisma does the "joins". For each join it creates queries with all the ids from the previous one. This is a non issue for the DB when using a join but it is an issue this way. It's not performant and the CPU is on fire. How are you dealing with this problem?
đź‘€ 1
o
I usually use raw queries with Prisma when handling batch jobs or any long/big queries.
r
Yes seems that it's the only alternative for now using prisma. Here's a link of an issue that kinda describes what I'm facing: https://github.com/prisma/prisma1/issues/4744
@Omar You then deal with the mapping of the records? Or you know if there's some method to send records and get them mapped to your schema? Eg: have joined tables under the nested field
n
Hey Ricardo đź‘‹ You can indeed use Raw Queries for complex database operations. On the note of prisma using select instead of joins, you might want to have a look at this conversation for the idea behind this https://prisma.slack.com/archives/CA491RJH0/p1646982182446919?thread_ts=1646799638.123979&cid=CA491RJH0
o
@Ricardo Seromenho I'm not sure I got your question. Did you mean typing the result of a raw query? if yes, then this is what you need: https://www.prisma.io/docs/concepts/components/prisma-client/raw-database-access#typing-queryraw-results
r
@Nurul I've also seens this video

https://www.youtube.com/watch?v=RAoXdyI_PH4â–ľ

sometime ago that also explains that. I dno't have real facts but the thruth is that those queries are killing the DB. It's firing 4/5 queries to related tables with thousands of id's in the in conditions, so I'm not sure how is this true for larger datasets. The article on the thread you mentioned is very good. Thank you for pointing that. On that article
Copy code
...However, this advice doesn't apply as much to MySQL, because it was designed to handle connecting and disconnecting very efficiently and to respond to small and simple queries very quickly...
I'm not sure what small means but the query is indeed simple but i don't think it's small. It something like
Copy code
SELECT foo FROM TABLE WHERE id IN (15 thousand id's here or more)
@Omar No I wasn't referring to the types but to map the record results to the schema. For example. The following schema us
Copy code
model User {
  id    Int    @id @default(autoincrement())
  posts Post[]
}

model Post {
  id       Int  @id @default(autoincrement())
  author   User @relation(fields: [authorId], references: [id])
  authorId Int  @map("author_id")// relation scalar field  (used in the `@relation` attribute above)
}
Doing this raw query:
Copy code
SELECT p.id, p.author_id FROM post p LEFT JOIN user u ON p.author_id = u.id
My question is, is there some method we could use from prisma to try to map the results to the schema? The goal have something like prisma does already eg: result = [{id: 1, author: { id: 1 }} ]
o
Thanks for explaining @Ricardo Seromenho Unfortunately, I haven't come across such a use case. But I think it's a nice feature if Prisma decide to support it in the future, since there was some talk before on Github about an improved Prisma client that supports advanced queries. Alternatively, It might be done as open source project, but it has to be written in Rust I guess like Prisma client, so it won't affect the performance of your app.
r
Bringing this up again. I'm having another issue with the way prisma does the joins. The big (but simple) query it is generating is not even going through. Getting
Copy code
Invalid `prisma.foo.findMany()` invocation in

Can't reach database server at `...`:`5432`

Please make sure your database server is running at `...`:`5432`.
clientVersion: '3.9.2',
Just to make sure the query was ok and nothing wrong with it I added
take: 5
and it works. Without the
take: 5
it fails. I believe it's because the query being so big with thousands of records.
o
Does it work when you use a local database? or is it locally to begin with? Because sometimes you need to raise the db instance class(in case of aws), t2 or t3 micro sometimes won't handle bigger queries, even for testing..
n
Hey Ricardo, thanks for testing and confirming by using the
take
argument. We would definitely like to fix this issue, would it be possible for you to provide us with a way so that we could reproduce this?
r
@Omar It's a t3.small. Increasing not sure it will have any performance improvement as the bottleneck is on the CPU. @Nurul Do you have already something in mind to fix this? What will be the approach? What do you suggest me to share so you can reproduce?
@Nurul I have created a repo with the reproduction of the performance (or error) with "joins" on a big data set here: https://github.com/seromenho/prisma-join-performance Added a failing test here: https://github.com/seromenho/prisma-join-performance/runs/6396852581?check_suite_focus=true As we can see find many fails.reated an issue to rtack this here: https://github.com/prisma/prisma/issues/13306
🥲 1
đź‘€ 1
j
The more I see stuff like this… the more I am disappointed in Prisma and pretty turned off on actually trying to get implemented. I love the idea of having a source of truth for models and first-class typescript support. But generating SQL from a
find()
method has already been solved by tools like Sequelize and Knex. Breaking “complex” queries into N “simple” queries is a great way to start blowing up your database/connections in an attempt to avoid writing SQL. There is no way NodeJS/Go Filtering + Network serialization is faster/lower resource than native SQL on Postgres/MySQL If a query is too complicated to be implemented in a single SQL call, I can always take the step of splitting up my
find()
calls manually. But when I explicitly want to combine a couple
JOINS
(like in that video linked, which returns only one result) it doesn’t seem like that is possible… I’ll admit I currently have not implemented Prisma yet and have mostly investigated it as an alternative to Sequelize/TypeORM so please let me know if I am missing something.
👍 1
n
Thanks Ricardo for providing the reproduction of the performance issue as well as creating a GitHub Issue. 🙌 I’ll share this with our engineering team so that we can look into it.
👍 1