Slackbot
10/05/2022, 11:15 PM
Elijah Ben Izzy
10/05/2022, 11:17 PM
def df() -> pd.DataFrame:
    return pd.DataFrame.from_records([{'a' : 1}])

def a() -> pd.Series:
    return pd.Series([1])
Elijah Ben Izzy
10/05/2022, 11:17 PM
Stefan Krawczyk
10/05/2022, 11:29 PM
.execute() is the way to go.
Elijah Ben Izzy
10/05/2022, 11:44 PM
a
(edited a bit for clarity)
import pandas as pd
from hamilton.ad_hoc_utils import create_temporary_module
from hamilton.driver import Driver
from hamilton.function_modifiers import extract_columns

# Expose the columns "a" and "b" of this dataframe as their own nodes
@extract_columns('a', 'b')
def df() -> pd.DataFrame:
    return pd.DataFrame.from_records([{'a' : 1, 'b': 2}])

def c(a: pd.Series) -> pd.Series:
    return a*2

# This is smart enough to not run "a" and use the override instead
result = Driver({}, create_temporary_module(df, c)).execute(
    final_vars=['c', 'a'],
    overrides={'a' : pd.Series([2])})
Elijah Ben Izzy
10/05/2022, 11:46 PM
a in this case matches a passed-in series.
Elijah Ben Izzy
10/06/2022, 12:03 AM
a as an input it will not use that.
# This has no knowledge of the fact that the dataframe has the column `a` in it
result = Driver(
    dict(df=pd.DataFrame.from_records([{'a' : 10, 'b': 20}])),
    create_temporary_module(df, c)).execute(final_vars=['c'])
You’d have to pass the series in as an override to get it to use that. Hope this helps!
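For anyone reading along later, here is a toy, pure-Python sketch of why an override short-circuits the upstream function. It is not Hamilton's implementation: `execute`, `calls`, and the integer-valued `a` and `c` are hypothetical stand-ins. The only assumption is that overridden values are treated as already computed, so the function behind them is never called.

```python
import inspect
from typing import Any, Callable, Dict, List, Optional


def execute(funcs: List[Callable], final_vars: List[str],
            overrides: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Toy executor: node name = function name, dependencies = parameter names."""
    nodes = {f.__name__: f for f in funcs}
    # Overrides are seeded as already-computed results.
    computed = dict(overrides or {})

    def resolve(name: str) -> Any:
        # Only call the function if no value is known for this node yet.
        if name not in computed:
            fn = nodes[name]
            kwargs = {p: resolve(p) for p in inspect.signature(fn).parameters}
            computed[name] = fn(**kwargs)
        return computed[name]

    return {name: resolve(name) for name in final_vars}


calls: List[str] = []  # records which functions actually ran


def a() -> int:
    calls.append('a')
    return 1


def c(a: int) -> int:
    calls.append('c')
    return a * 2


without_override = execute([a, c], ['c', 'a'])
calls_without = list(calls)

calls.clear()
with_override = execute([a, c], ['c', 'a'], overrides={'a': 2})
calls_with = list(calls)
```

In the second run `calls_with` contains only `'c'`: the function `a` is skipped because its value was supplied up front, which is the behaviour the override example in this thread relies on.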