sooo how do you guys deal with m to n relations in dynamodb Serverless Stack #random

sooo, how do you guys deal with m to n relations i...

Adrian Schweizer

11/16/2021, 9:22 PM

sooo, how do you guys deal with m to n relations in dynamodb? Make a separate mapping table? Or is there some trick I'm not thinking of?

Adrian Schweizer

11/16/2021, 9:22 PM

for example books and authors. An author can write many books, and a book can be written by multiple authors

Garret Harp

11/16/2021, 9:28 PM

It always depends on the exact use case and what kind of queries you want to be able to perform.

Jon Holman

11/16/2021, 9:28 PM

https://www.alexdebrie.com/posts/dynamodb-single-table/

Adrian Schweizer

11/16/2021, 9:30 PM

Yet data modeling with DynamoDB is tricky for those used to the relational databases that have dominated for the past few decades.

Ain't that the truth right there

Adrian Schweizer

11/16/2021, 9:37 PM

@Garret Harp Well, in the use case of books and authors, you would obviously want to be able to list all books of an author and also all authors of a book

Adrian Schweizer

11/16/2021, 9:38 PM

so, let's say you have a list of authors, then you click on an author, and it should list all the books he's written, and when you click on a book, it should then show book details including all authors of that particular book

Jon Holman

11/16/2021, 9:42 PM

I am not an expert on this, Alex Debrie and Rick Houlihan are. I'm wondering if you could have a GSI that is the primary key flipped.

Jon Holman

11/16/2021, 9:43 PM

another good article, https://www.alexdebrie.com/posts/dynamodb-one-to-many/

Adrian Schweizer

11/16/2021, 9:47 PM

thanks

Adrian Schweizer

11/16/2021, 9:50 PM

The main criticism of multi table approach seems to be network latency, but in many cases you could probably cache the mappings and then it would also be only a single lookup

Adrian Schweizer

11/16/2021, 9:51 PM

but yeah, I didn't even think of using compound sort keys with startsWith

Jon Holman

11/16/2021, 9:54 PM

yeah, it's a whole new way of thinking. You do not want to normalize your data. You want a single query to be able to get all of your relevant data.

Jon Holman

11/16/2021, 9:55 PM

and on the sort key you can do different logic like starts with

Adrian Schweizer

11/16/2021, 9:58 PM

i don't really see how many to many relations are done...the examples all seem to be one to many only, or maybe I just don't get it yet...

Adrian Schweizer

11/16/2021, 9:59 PM

Maybe you just duplicate the data?

Jon Holman

11/16/2021, 9:59 PM

I'm wondering if you can create global secondary index with the partition key and sort key flipped

Jon Holman

11/16/2021, 10:00 PM

or you could duplicate the data, but it would be better to not

Adrian Schweizer

11/16/2021, 10:03 PM

hmmm, yeah, I gotta try this

Adrian Schweizer

11/16/2021, 10:04 PM

I mean, I should have been able to come up with that, too, since I'm already doing it on the "mapping table" of the multi table approach

Adrian Schweizer

11/16/2021, 10:04 PM

my guts tell me it won't work, but my guts have been wrong many times before in this new (for me) serverless world

Adrian Schweizer

11/16/2021, 10:06 PM

hmmm, I can't come up with a way to update the table, though...in the case of many to many

Adrian Schweizer

11/16/2021, 10:15 PM

i found this: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-adjacency-graphs.html

Adrian Schweizer

11/16/2021, 10:19 PM

their example is super confusing for me, though...bills and invoices? I don't work in finance, god damn it

Adrian Schweizer

11/16/2021, 10:20 PM

what even is a bill that belongs to multiple invoices? I have no idea

Adrian Schweizer

11/16/2021, 10:22 PM

it doesn't even seem to be many to many, because apparently, the bills have different data when they belong to different invoices...

Adrian Schweizer

11/16/2021, 10:23 PM

I guess their example is when you have a bill in a restaurant for a table, and then multiple invoices for people who want to pay separately, so the bill gets spread over multiple users (invoices)

Adrian Schweizer

11/16/2021, 10:24 PM

but that is not technically many to many, at least not how I understand it

Adrian Schweizer

11/16/2021, 10:24 PM

because obviously, you can get away without data duplication in this example, but that's not true for all many to many examples

Adrian Schweizer

11/16/2021, 10:25 PM

eh, I'm probably just gonna do multi table and worry about it later when I have performance problems

Adrian Schweizer

11/16/2021, 10:26 PM

future me will hate me, but whatever, he's not here yet 🙂

Devin

11/16/2021, 10:37 PM

I purchased and worked through the DynamoDB Book. It’s really vey good. I learned a TON.

Devin

11/16/2021, 10:38 PM

that said I still feel totally lost doing single table design. What I feel like I really need is the Single Table Design workbook. That is just a bunch of exercises where you model, change update… over and over.

Devin

11/16/2021, 10:43 PM

In the mean time, I’m trying to do single table wherever I can and just do multi for the rest. I def have “fewer” tables

Simon Reilly

11/16/2021, 11:01 PM

I think in the adjacency list example that's shared, the invoice is equivalent to your book, so your book id would be pk, and then authors would be sk. Then you have gsi_pk_1 pointing at the sk column (author) so you can query that. It works because gsi_pk_1 can contain duplicates because GSI can have duplicate primary key. Since pk would by default be replicated to the GSI you can always query gsi_pk_1 with author name and get a list of book ids from the pk column, along with any other data you want to add to the GSI. I think the main drawback on Single Table Design is its in-flexible, it would be tough to update if your use case changed

Adrian Schweizer

11/16/2021, 11:21 PM

i don't see how that supports full many to many queries from both sides without data duplication, though

Adrian Schweizer

11/16/2021, 11:21 PM

or BatchGetItem

Adrian Schweizer

11/16/2021, 11:21 PM

because unlike in their example, author data does not change per book

Adrian Schweizer

11/16/2021, 11:23 PM

most articles I've read now either introduce data duplication or then use BatchGetItem after an initial query to load the rest of the data

Adrian Schweizer

11/16/2021, 11:25 PM

i think turning it around and storing author book bookdata items would be better in this case, because it would lead to only a few of the books having duplicated data, and not many times, so you could still get away with relatively cheap updates

Adrian Schweizer

11/16/2021, 11:26 PM

since the amount of books authors write is greater than the amount of authors books have (usually only 1)

Adrian Schweizer

11/16/2021, 11:27 PM

i will probably do it this way, because my real case is similarly lopsided

Simon Reilly

11/17/2021, 8:08 AM

Ooh nice logic, makes sense

Open in Slack

Previous Next