# hamilton-help
f
Toy example 1, dataframe as input, perform a filter, return dataframe as output
import pandas as pd

numbers = range(10)
df1 = pd.DataFrame({
    'a': numbers,
})

# filter to even numbers
df1.query('a%2==0')
Toy example 2, two dataframes as input to be merged on a common key, some ops performed on the merged table
import numpy as np
import pandas as pd

numbers = np.arange(10)
df1 = pd.DataFrame({
    'a': numbers,
    'b': numbers,
})

df2 = pd.DataFrame({
    'a': numbers,
    'c': numbers*2,
})

# merge
df3 = df1.merge(df2, on='a')
# perform ops on merged df
df3['d'] = df3.eval('b+c')
w
You can work with data frames as args to your functions
Hamilton does not enforce working on series
e
Thanks! Happy you enjoyed it! Also thanks @Wit Jakuczun, appreciate the answer. To add more, this is how you would do it with series:
import pandas as pd

def numbers() -> pd.Series:
    return pd.Series(range(10))

def a(numbers: pd.Series) -> pd.Series:
    # filter the input series (the parameter), not the function `a` itself
    return numbers[numbers % 2 == 0]

...
driver.execute(['a']) # Will give you a dataframe with `a` in it
You can also do this at a dataframe level:
def df1() -> pd.DataFrame:
    return pd.DataFrame({
        'a': range(10),
    })

def df_filtered(df1: pd.DataFrame) -> pd.DataFrame:
    return df1.query('a%2==0')
IMO the first is a little more “hamiltonic”, but they both work well.
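Since Hamilton functions are ordinary Python functions, one way to sanity-check both versions is to call them directly, without a driver. A minimal sketch, assuming pandas is installed; the hand-wired calls and the final comparison are illustrative and not from the thread:

```python
import pandas as pd


def numbers() -> pd.Series:
    return pd.Series(range(10))


def a(numbers: pd.Series) -> pd.Series:
    # Series-level filter: keep the even values.
    return numbers[numbers % 2 == 0]


def df1() -> pd.DataFrame:
    return pd.DataFrame({'a': range(10)})


def df_filtered(df1: pd.DataFrame) -> pd.DataFrame:
    # DataFrame-level filter: same condition, expressed with query().
    return df1.query('a % 2 == 0')


# Wire the calls by hand, passing each function's output to its consumer.
series_result = a(numbers())
frame_result = df_filtered(df1())

print(series_result.tolist())      # [0, 2, 4, 6, 8]
print(frame_result['a'].tolist())  # [0, 2, 4, 6, 8]
```

Both approaches keep the same rows; the difference is only whether the unit you expose to the framework is a column or a whole table.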
For your second case, it's pretty natural:
from hamilton.function_modifiers import extract_columns

def df1() -> pd.DataFrame:
    return ...

def df2() -> pd.DataFrame:
    return ...

@extract_columns('a', 'b', 'c')
def df3(df1: pd.DataFrame, df2: pd.DataFrame) -> pd.DataFrame:
    out = df1.merge(df2, on='a')
    out['d'] = out.eval('b+c')
    return out
Then you can ask for ‘a’, ‘b’, and ‘c’, and the framework will put them together. `extract_columns` also exposes `df3` as a node, so you can use that too.
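To build some intuition for how the framework "puts them together": each function name becomes a node, and each parameter name points at the node it depends on. Below is a toy sketch of that idea using only the standard library. It is not Hamilton's implementation; `resolve` is a hypothetical helper, and plain lists of dicts stand in for dataframes:

```python
import inspect


def resolve(name, functions, cache=None):
    """Compute node `name` by first computing the nodes its parameters name."""
    cache = {} if cache is None else cache
    if name not in cache:
        fn = functions[name]
        # Each parameter name is treated as the name of an upstream function.
        kwargs = {
            param: resolve(param, functions, cache)
            for param in inspect.signature(fn).parameters
        }
        cache[name] = fn(**kwargs)
    return cache[name]


# Plain-Python stand-ins for df1, df2, and df3 (rows as dicts, no pandas).
def df1():
    return [{'a': i, 'b': i} for i in range(3)]


def df2():
    return [{'a': i, 'c': i * 2} for i in range(3)]


def df3(df1, df2):
    # Merge on 'a', then add d = b + c.
    c_by_a = {row['a']: row['c'] for row in df2}
    merged = [{**row, 'c': c_by_a[row['a']]} for row in df1 if row['a'] in c_by_a]
    return [{**row, 'd': row['b'] + row['c']} for row in merged]


functions = {fn.__name__: fn for fn in (df1, df2, df3)}
result = resolve('df3', functions)
print(result[2])  # {'a': 2, 'b': 2, 'c': 4, 'd': 6}
```

Asking for `df3` pulls in `df1` and `df2` automatically because their names appear in its signature, which is the same naming convention the snippets above rely on.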
f
Very slick. Okay I'll have to play with these and gain some better intuition but this is a great place to start. Thanks all!
e
Awesome! Feel free to ask us if you have any more Qs 🙂