# hamilton-help
I'm basically trying to wrap some legacy stuff into Hamilton that will at some point get replaced, but I'm also trying to do it in a way that doesn't look particularly awful or make the flow hard to follow. I'm generating a bunch of dataframes and then writing them out to a number of Spark tables, but some of the tables depend on columns from the other tables. I do have it working by writing each table to Spark, returning the dataframe back to the flow, and then passing it into a new materializer for the dependent flow, but I can't help feeling I'm missing something.
```python
from hamilton import base
from hamilton.io.materialization import to

materializers = [
    to.spark(dependencies=["generate_customers"], id="custs_to_df", table_name=f"input.customers_{random_string}", spark=spark, combine=base.PandasDataFrameResult()),
    to.spark(dependencies=["generate_accounts"], id="accounts_to_df", table_name=f"input.accounts_{random_string}", spark=spark, combine=base.PandasDataFrameResult()),
    to.spark(dependencies=["generate_transactions"], id="transactions_to_df", table_name=f"input.transactions_{random_string}", spark=spark, combine=base.PandasDataFrameResult()),
    to.spark(dependencies=["generate_aml"], id="aml_to_df", table_name=f"input.aml_{random_string}", spark=spark, combine=base.PandasDataFrameResult()),
    to.spark(dependencies=["generate_entity_link_table"], id="entity_links_to_df", table_name=f"input.entity_links_{random_string}", spark=spark, combine=base.PandasDataFrameResult()),
    to.spark(dependencies=["generate_entity_table"], id="entities_to_df", table_name=f"input.entities_{random_string}", spark=spark, combine=base.PandasDataFrameResult()),
]
```
I'd like to do something like this, but generate_transactions depends on the generate_accounts output, and generate_entity_link_table likewise depends on accounts and customers, for example, so a flat list of materializers doesn't capture the ordering.
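Concretely, the round-trip I have now looks something like this (a rough sketch, not my exact code; `my_flow_module` and the override in the second call are stand-ins): run one materializer, pull the dataframe back out via `additional_vars`, then feed it into the next materialize call so the dependent table can see its columns.

```python
from hamilton import base, driver
from hamilton.io.materialization import to

# my_flow_module is a stand-in for the module holding the generate_* functions.
dr = driver.Builder().with_modules(my_flow_module).build()

# First pass: write accounts to Spark and also pull the dataframe back out of the run.
accounts_writer = to.spark(
    dependencies=["generate_accounts"],
    id="accounts_to_df",
    table_name=f"input.accounts_{random_string}",
    spark=spark,
    combine=base.PandasDataFrameResult(),
)
_, outputs = dr.materialize(accounts_writer, additional_vars=["generate_accounts"])

# Second pass: hand the written accounts back in so generate_transactions
# (which takes generate_accounts as a parameter) doesn't recompute it.
transactions_writer = to.spark(
    dependencies=["generate_transactions"],
    id="transactions_to_df",
    table_name=f"input.transactions_{random_string}",
    spark=spark,
    combine=base.PandasDataFrameResult(),
)
dr.materialize(transactions_writer, overrides={"generate_accounts": outputs["generate_accounts"]})
```

It works, but it splits what is logically one DAG into a chain of driver calls, which is the part that feels wrong.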