This message was deleted Hamilton Open Source #hamilton-help



# hamilton-help

Slackbot

08/28/2023, 12:50 AM

This message was deleted.

Stefan Krawczyk

08/28/2023, 1:03 AM

You’re halfway there, I think. So yes to passing a dict to the driver or as inputs. But in terms of what to run, you need to define a “flow” from those source DBs and name them explicitly, e.g.


# dataflow.py
import pandas as pd

def source_a(param1: str, param2: str) -> pd.DataFrame:
    df = ...  # placeholder: code to pull from source
    return df

def filtered_dataframe(source_a: pd.DataFrame) -> pd.DataFrame:
    """This depends on the output of source_a."""
    df = source_a  # ... logic to filter on source_a
    return df

and then in the driver:


import dataflow
from hamilton import driver

invariant_parameters = {"param1": ...}
dr = driver.Driver(invariant_parameters, dataflow)
df = dr.execute(["filtered_dataframe"], inputs={"param2": ...})

It will know to run source_a before filtered_dataframe because that’s how the flow is defined. This is where you’d request the “outputs” you want, and Hamilton will determine the path to take to get that output for you. Does that help? QQ: Are the two SQL servers interchangeable, or do you join the data from them?
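To make the "it will know to run source_a first" point concrete, here is a toy, standard-library-only sketch of name-based dependency resolution. It is not Hamilton's implementation; the `resolve` helper, the `filtered_rows` function, and the fake row data are all hypothetical and exist only to illustrate the idea that a parameter name pointing at another function defines the execution order.

```python
import inspect

def source_a(param1: str, param2: str) -> list:
    # Stand-in for "pull from source": returns made-up rows.
    return [f"{param1}-{param2}-{i}" for i in range(4)]

def filtered_rows(source_a: list) -> list:
    # The parameter name `source_a` declares a dependency on the function above.
    return [row for row in source_a if row.endswith(("0", "2"))]

def resolve(name, functions, inputs, cache, order):
    """Compute `name`, first computing whatever its parameters refer to."""
    if name in inputs:
        # Plain inputs (e.g. param1, param2) are supplied by the caller.
        return inputs[name]
    if name not in cache:
        fn = functions[name]
        kwargs = {
            param: resolve(param, functions, inputs, cache, order)
            for param in inspect.signature(fn).parameters
        }
        cache[name] = fn(**kwargs)
        order.append(name)  # record when each function actually ran
    return cache[name]

functions = {"source_a": source_a, "filtered_rows": filtered_rows}
execution_order = []
result = resolve(
    "filtered_rows",
    functions,
    {"param1": "db", "param2": "x"},
    {},
    execution_order,
)
print(execution_order)
print(result)
```

Only the final output is requested, yet the upstream function runs first because the requested function names it as a parameter. That is the same request-outputs-and-let-the-flow-decide pattern as the `dr.execute([...])` call above.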

Jarrod Hamilton

08/28/2023, 3:52 AM

Yes, I think so! I'll try to put the above into practice tomorrow morning. They are different SQL servers. I do join them, but we don't have a linked server to do the join on the server itself, so I do it in pandas.

👍 1

Stefan Krawczyk

08/28/2023, 4:22 AM

So then your code would probably look something like:


import pandas as pd

def source_a(param1: str, param2: str) -> pd.DataFrame:
    df = ...  # placeholder: code to pull from source
    return df

def source_b(param1: str, param2: str) -> pd.DataFrame:
    df = ...  # placeholder: code to pull from source
    return df

def joined_data(source_a: pd.DataFrame, source_b: pd.DataFrame) -> pd.DataFrame:
    df = ...  # placeholder: join dfs
    return df


def filtered_dataframe(joined_data: pd.DataFrame) -> pd.DataFrame:
    """This depends on the output of joined_data."""
    df = joined_data  # ... logic to filter...
    return df

etc
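A filled-in sketch of the same shape, for anyone who wants something runnable. Everything specific here is an assumption for illustration: the in-memory frames stand in for the two SQL pulls, and the `id`, `a_value`, and `b_value` columns, the inner join, and the filter threshold are all made up. The functions are called directly so the snippet needs only pandas.

```python
import pandas as pd

def source_a(param1: str, param2: str) -> pd.DataFrame:
    # Hypothetical stand-in for a query against the first SQL server.
    return pd.DataFrame({"id": [1, 2, 3], "a_value": [10, 20, 30]})

def source_b(param1: str, param2: str) -> pd.DataFrame:
    # Hypothetical stand-in for a query against the second SQL server.
    return pd.DataFrame({"id": [2, 3, 4], "b_value": ["x", "y", "z"]})

def joined_data(source_a: pd.DataFrame, source_b: pd.DataFrame) -> pd.DataFrame:
    # Join in pandas, since the servers are not linked; the key is assumed to be "id".
    return source_a.merge(source_b, on="id", how="inner")

def filtered_dataframe(joined_data: pd.DataFrame) -> pd.DataFrame:
    """This depends on the output of joined_data."""
    # Hypothetical filter: keep rows whose a_value exceeds 20.
    return joined_data[joined_data["a_value"] > 20].reset_index(drop=True)

# Manual wiring, in the order the parameter names imply.
a = source_a("server_a", "some_table")
b = source_b("server_b", "some_table")
joined = joined_data(a, b)
result = filtered_dataframe(joined)
print(result)
```

With Hamilton, these functions would live in a module and you would request only "filtered_dataframe" through the driver, as in the earlier snippet, rather than wiring the calls by hand.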
