# hamilton-help
e
That’s a great question. Think I know but I want to double-check with a quick test. To clarify, it would be something like this:
def df() -> pd.DataFrame:
   return pd.DataFrame.from_records([{'a' : 1}])

def a() -> pd.Series:
   return pd.Series([1])
Then in the driver compute both, right?
s
@John Herr @Elijah Ben Izzy will show some code. But in short, the driver does not unpack a dataframe passed into it, so it should compute 'a' from the function definition. If you want to short-circuit computation, I think the overrides parameter on `.execute()` is the way to go.
e
OK, cool, an example to clarify. I think I may have misunderstood at first. To demonstrate overrides, this is how you might approach it. Note that I’m passing in an override for `a` (edited a bit for clarity):
import pandas as pd
from hamilton.ad_hoc_utils import create_temporary_module
from hamilton.driver import Driver
from hamilton.function_modifiers import extract_columns

# Expose columns 'a' and 'b' of the dataframe as their own nodes
@extract_columns('a', 'b')
def df() -> pd.DataFrame:
    return pd.DataFrame.from_records([{'a' : 1, 'b': 2}])

def c(a: pd.Series) -> pd.Series:
    return a*2

# This is smart enough to not compute "a" from df and to use the override instead
result = Driver({}, create_temporary_module(df, c)).execute(final_vars=['c', 'a'], overrides={'a' : pd.Series([2])})
This is because Hamilton thinks of inputs, etc. as distinct items. A dataframe is a dataframe unless you tell it to extract columns. Overrides allow you to short-circuit execution, but the names have to match up, e.g. `a` in this case matches the passed-in series.
Note that if you happen to pass a dataframe containing `a` in as an input, it will not use that column.
# this has no knowledge of the fact that the dataframe has the column `a` in it
result = Driver(
    dict(df=pd.DataFrame.from_records([{'a' : 10, 'b': 20}])), 
    create_temporary_module(df, c)).execute(final_vars=['c'])
You’d have to pass the series in as an override to get it to use that. Hope this helps!