# hamilton-help
e
Welcome! I actually think that’s quite a clever solution, TBH. It depends a bit on how you’re doing it, e.g. how many map operations there are and how many filter operations you have. If you have a few filter operations on a few specific features, you can filter each of them, join the results into a dataframe, then use `extract_columns` on that. From there you can run the rest of the map operations.
That said, I’d actually keep doing it similarly to how you’re doing it: it makes a lot of sense for the operations to share the same index, especially if you filter on, say, some sentinel value. In that case, you could:
1. Write a custom `results_builder` that joins them and filters, handling the index specially (a minimal sketch follows below), or
2. Do that within a node, i.e. write a dataframe join that does the filtering.
You could pass along the filter values, or just use the index to do the join. Does that make sense?
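A minimal sketch of option 1, assuming the older-style `SimplePythonGraphAdapter` setup; the module name `features`, the node names, and the sentinel value are all hypothetical:

```python
import pandas as pd

from hamilton import base, driver

import features  # hypothetical module holding the map/filter feature functions


class SentinelFilteringResult(base.ResultMixin):
    """Joins the requested outputs on their shared index, then drops
    any row where a column still holds the sentinel value."""

    def __init__(self, sentinel: int = -1):
        self.sentinel = sentinel

    def build_result(self, **outputs) -> pd.DataFrame:
        # Concatenate all output series on the index so nothing is silently
        # dropped, then filter in one place, keeping the index intact.
        df = pd.concat(outputs, axis=1)
        return df[(df != self.sentinel).all(axis=1)]


adapter = base.SimplePythonGraphAdapter(SentinelFilteringResult(sentinel=-1))
dr = driver.Driver({}, features, adapter=adapter)
df = dr.execute(["feature_a", "feature_b"])  # hypothetical node names
```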
s
One clarifying question: does everything fit in memory? Or is that a concern?
m
Thank you both for the prompt responses! The data does fit into memory, so no complications expected there. I think a custom `results_builder` is exactly what I was missing. Thank you for the pointer!
s
Cool. I would recommend pairing the custom result builder with the new materializer functionality. I’ll link it once I’m at a keyboard, but check out the materialization example in the examples folder.
Here’s the example: https://github.com/DAGWorks-Inc/hamilton/tree/main/examples/materialization. This lets you create one driver and then swap out or add how the final dataframe is “materialized” more easily and flexibly.
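A minimal sketch of that pattern, reusing the hypothetical `features` module from above; the id, path, and node names are made up for illustration:

```python
from hamilton import base, driver
from hamilton.io.materialization import to

import features  # hypothetical module, as above

dr = driver.Driver({}, features)

# Each materializer describes one way to combine and save outputs; you can
# add or swap these per materialize() call without rebuilding the driver.
metadata, _ = dr.materialize(
    to.csv(
        id="final_df_to_csv",  # unique id for this materializer
        dependencies=["feature_a", "feature_b"],  # nodes to combine and save
        combine=base.PandasDataFrameResult(),  # or the custom result builder sketched above
        path="./final_df.csv",
    ),
)
```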