# hamilton-help
s
You’re halfway there, I think. Yes, you can pass a dict to the driver or pass values as inputs. But in terms of what to run, you need to define a “flow” starting from those source DBs and name each step explicitly, e.g.:
# dataflow.py
import pandas as pd

def source_a(param1: str, param2: str) -> pd.DataFrame:
   df = ...  # code to pull from source
   return df

def filtered_dataframe(source_a: pd.DataFrame) -> pd.DataFrame:
   """this depends on the output of source_a"""
   df = source_a  # ... logic to filter on source_a
   return df
and then in the driver:
import dataflow
from hamilton import driver

invariant_parameters = {"param1": ...}
dr = driver.Driver(invariant_parameters, dataflow)
df = dr.execute(["filtered_dataframe"], inputs={"param2": ...})
It will know to run source_a before filtered_dataframe because that’s how the flow is defined: filtered_dataframe takes a parameter named source_a. The execute call is where you request the “outputs” you want, and Hamilton determines the path to take to get that output for you. Does that help? Quick question: are the two SQL servers interchangeable, or do you join the data from them?
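If the ordering feels a bit magical, here is a rough pure-Python sketch of the idea: a function's parameter names say what it depends on, so asking for one output pulls in everything upstream of it. This is only an illustration and not Hamilton's actual implementation. The resolve helper is hypothetical, and plain lists stand in for DataFrames so it runs without pandas or Hamilton.

```python
import inspect

# Toy stand-ins for the dataflow functions; lists replace DataFrames.
def source_a(param1: str, param2: str) -> list:
    return [f"{param1}-{param2}-{i}" for i in range(4)]

def filtered_dataframe(source_a: list) -> list:
    """Depends on source_a because its parameter is named source_a."""
    return [row for row in source_a if row.endswith(("0", "2"))]

def resolve(name, functions, inputs, cache, order):
    """Hypothetical helper: compute `name` after computing what its parameters name."""
    if name in inputs:  # supplied directly by the caller
        return inputs[name]
    if name not in cache:
        fn = functions[name]
        kwargs = {
            param: resolve(param, functions, inputs, cache, order)
            for param in inspect.signature(fn).parameters
        }
        cache[name] = fn(**kwargs)
        order.append(name)  # record the order in which functions actually ran
    return cache[name]

functions = {f.__name__: f for f in (source_a, filtered_dataframe)}
order = []
result = resolve(
    "filtered_dataframe", functions, {"param1": "db", "param2": "x"}, {}, order
)
print(order)   # ['source_a', 'filtered_dataframe']
print(result)  # ['db-x-0', 'db-x-2']
```

Only filtered_dataframe was requested, yet source_a ran first because it appears as a parameter name.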
j
Yes, I think so! I'll try to put the above into practice tomorrow morning. They are different SQL servers. I do join them, but we don't have a linked server to do the join on the server itself, so I do it in pandas.
s
So then your code would probably look something like:
import pandas as pd

def source_a(param1: str, param2: str) -> pd.DataFrame:
   df = ...  # code to pull from source
   return df

def source_b(param1: str, param2: str) -> pd.DataFrame:
   df = ...  # code to pull from source
   return df

def joined_data(source_a: pd.DataFrame, source_b: pd.DataFrame) -> pd.DataFrame:
   df = ...  # join dfs
   return df

def filtered_dataframe(joined_data: pd.DataFrame) -> pd.DataFrame:
   """this depends on the output of joined_data"""
   df = joined_data  # ... logic to filter...
   return df

etc