Slackbot
11/16/2022, 9:46 PM
Filip Piasevoli
11/16/2022, 9:46 PM
import pandas as pd

numbers = range(10)
df1 = pd.DataFrame({
    'a': numbers,
})
# filter to even numbers
df1.query('a%2==0')
Filip Piasevoli
11/16/2022, 9:48 PM
import numpy as np
import pandas as pd

numbers = np.arange(10)
df1 = pd.DataFrame({
    'a': numbers,
    'b': numbers,
})
df2 = pd.DataFrame({
    'a': numbers,
    'c': numbers*2,
})
# merge
df3 = df1.merge(df2, on='a')
# perform ops on merged df
df3['d'] = df3.eval('b+c')
Wit Jakuczun
11/16/2022, 10:05 PM
Wit Jakuczun
11/16/2022, 10:06 PM
Elijah Ben Izzy
11/16/2022, 10:06 PM
def numbers() -> pd.Series:
    return pd.Series(range(10))

def a(numbers: pd.Series) -> pd.Series:
    # keep only the even values of the upstream `numbers` series
    return numbers[numbers % 2 == 0]
...
driver.execute(['a'])  # Will give you a dataframe with `a` in it
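To make the wiring concrete, here is a toy sketch of the idea behind that call: each function is a node named after itself, and its parameter names say which nodes it depends on. `toy_execute` is a hypothetical helper written for illustration only; it is not Hamilton's driver and skips everything the real framework does (type checking, config, decorators).

```python
import inspect

import pandas as pd


def numbers() -> pd.Series:
    return pd.Series(range(10))


def a(numbers: pd.Series) -> pd.Series:
    # keep only the even values
    return numbers[numbers % 2 == 0]


def toy_execute(functions, outputs):
    """Hypothetical stand-in for a driver: resolve nodes by parameter name."""
    registry = {fn.__name__: fn for fn in functions}
    cache = {}

    def compute(name):
        # compute each node once, after recursively computing its inputs
        if name not in cache:
            fn = registry[name]
            kwargs = {p: compute(p) for p in inspect.signature(fn).parameters}
            cache[name] = fn(**kwargs)
        return cache[name]

    # assemble the requested outputs into a single dataframe
    return pd.DataFrame({name: compute(name) for name in outputs})


result = toy_execute([numbers, a], ['a'])
print(result)
```

Asking for `a` pulls in `numbers` automatically because `a` names it as a parameter.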
You can also do this at a dataframe level:
def df1(numbers: pd.Series) -> pd.DataFrame:
    # `numbers` is declared as a parameter so it resolves to the `numbers` node
    return pd.DataFrame({
        'a': numbers,
    })

def df_filtered(df1: pd.DataFrame) -> pd.DataFrame:
    return df1.query('a%2==0')
IMO the first is a little more “hamiltonic”, but they both work well.
Elijah Ben Izzy
11/16/2022, 10:09 PM
def df1() -> pd.DataFrame:
    return ...

def df2() -> pd.DataFrame:
    return ...

@extract_columns('a', 'b', 'c')
def df3(df1: pd.DataFrame, df2: pd.DataFrame) -> pd.DataFrame:
    out = df1.merge(df2, on='a')
    out['d'] = out.eval('b+c')
    return out
Then you can ask for ‘a’, ‘b’, and ‘c’, and the framework will put them together. The extract_columns decorator also exposes df3 as a node, so you can use that too.
Filip Piasevoli
11/16/2022, 10:15 PM
Elijah Ben Izzy
11/16/2022, 10:16 PM