Slackbot
03/07/2024, 11:17 PM
Elijah Ben Izzy
03/07/2024, 11:47 PM
@extract_fields(
    {
        "A": pd.DataFrame,
        "B": pd.DataFrame,
        ...
    }
)
def dataframe_partition(partitions: list[str], df: pd.DataFrame) -> dict:
    # get a dict by grouping
    # return dict of name to dataframe
(b) how to apply a pipeline of operations: this uses @pipe to transform
def _rename_columns(df: pd.DataFrame, column_map: dict) -> pd.DataFrame:
    ...
def _filter_columns(df: pd.DataFrame, cols_to_drop: list[str]) -> pd.DataFrame:
    ...
@pipe(
    step(_rename_columns, column_map={"foo": "bar"}),
    step(_filter_columns, cols_to_drop=...),
    ...  # add as many as you'd like
)
def A_processed(A: pd.DataFrame) -> pd.DataFrame:
    print("I've just done a ton of transformations, each one of which is a node in the DAG")
    return A  # it gets passed the result of transforming them
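For readers who want to see what the two elided helpers might look like, here is a minimal sketch in plain pandas, without Hamilton. The helper bodies and the sample frame are assumptions for illustration, not code from the thread. The explicit call chain at the end mirrors the order in which the steps listed in @pipe would be applied to A before A_processed receives it.

```python
import pandas as pd


def _rename_columns(df: pd.DataFrame, column_map: dict) -> pd.DataFrame:
    # Assumed body: rename columns according to column_map.
    return df.rename(columns=column_map)


def _filter_columns(df: pd.DataFrame, cols_to_drop: list[str]) -> pd.DataFrame:
    # Assumed body: drop the listed columns.
    return df.drop(columns=cols_to_drop)


# Hypothetical input standing in for the "A" partition.
A = pd.DataFrame({"foo": [1, 2], "unused": [3, 4]})

# Steps run top to bottom; each one receives the previous step's output.
renamed = _rename_columns(A, column_map={"foo": "bar"})
A_transformed = _filter_columns(renamed, cols_to_drop=["unused"])

print(list(A_transformed.columns))
```

Both helpers return new frames rather than modifying their input, so the original A is left unchanged.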
Elijah Ben Izzy
03/07/2024, 11:48 PM
Nicolas Huray
03/07/2024, 11:48 PM
Nicolas Huray
03/07/2024, 11:49 PM
Elijah Ben Izzy
03/07/2024, 11:49 PM
Nicolas Huray
03/07/2024, 11:49 PM
Elijah Ben Izzy
03/07/2024, 11:50 PM
Nicolas Huray
03/08/2024, 1:06 AM
@extract_fields(
    {
        "A White Horse": pd.DataFrame,
        "On a cherry tree": pd.DataFrame,
        ...
    }
)
def dataframe_partition(partitions: list[str], df: pd.DataFrame) -> dict:
    # get a dict by grouping
    # return dict of name to dataframe
Nicolas Huray
03/08/2024, 1:07 AM
@pipe(
    step(_rename_columns, column_map={"foo": "bar"}),
    step(_filter_columns, cols_to_drop=...),
    ...  # add as many as you'd like
)
# A should be applied on "A White Horse" Dataframe
def A_processed(A: pd.DataFrame) -> pd.DataFrame:
    print("I've just done a ton of transformations, each one of which is a node in the DAG")
    return A  # it gets passed the result of transforming them
Elijah Ben Izzy
03/08/2024, 1:08 AM
A_processed function as was output by @extract_fields?
Nicolas Huray
03/08/2024, 1:08 AM
Nicolas Huray
03/08/2024, 1:10 AM
@extract_fields(
    {
        "col1 == 'A' and col2 =='B'": pd.DataFrame,
        "col1 == 'B' and col2 =='C'": pd.DataFrame,
        ...
    }
)
def dataframe_partition(partitions: list[str], df: pd.DataFrame) -> dict:
    # get a dict by grouping
    # return dict of name to dataframe
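If the field names are themselves filter expressions, as the keys above suggest, one possible body for dataframe_partition is to evaluate each expression with DataFrame.query. This is only a sketch under that assumption, shown without the @extract_fields decorator so that it runs with pandas alone; the sample data is made up.

```python
import pandas as pd


def dataframe_partition(partitions: list[str], df: pd.DataFrame) -> dict:
    # Assumption: every partition name is a valid DataFrame.query expression.
    return {expr: df.query(expr) for expr in partitions}


# Made-up data for illustration.
df = pd.DataFrame(
    {
        "col1": ["A", "B", "A"],
        "col2": ["B", "C", "C"],
        "value": [1, 2, 3],
    }
)
partitions = ["col1 == 'A' and col2 =='B'", "col1 == 'B' and col2 =='C'"]
parts = dataframe_partition(partitions, df)

for name, part in parts.items():
    print(name, len(part))
```

The dict keys are the expressions themselves, so they line up with the field names declared in the decorator above.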
Elijah Ben Izzy
03/08/2024, 1:15 AM
Stefan Krawczyk
03/08/2024, 1:16 AM
@inject?
ThreadScribe
03/08/2024, 1:16 AM
@inject?"
Let me see what I can find... 🤔
ThreadScribe
03/08/2024, 1:16 AM
@inject can be used in Hamilton:
@inject(nums=group(source('a'), value(10), source('b'), value(2)))
def a_plus_10_plus_b_plus_2(nums: List[int]) -> int:
    return sum(nums)
This example demonstrates how @inject allows you to replace parameters with values passed in, similar to a @parameterize call. The complete reference documentation for @inject can be found [here](https://hamilton.dagworks.io/en/latest/reference/decorators/inject/).
Stefan Krawczyk
03/08/2024, 1:18 AM
@inject or @parameterize* that can take in declarations of source("name with space").
Otherwise are the partitions fairly static?
Nicolas Huray
03/08/2024, 1:22 AM
Stefan Krawczyk
03/08/2024, 1:31 AM
@inject and @pipe don’t work together… 😕
Stefan Krawczyk
03/08/2024, 1:36 AM
_ for spaces/non-python variable characters.
Stefan Krawczyk
03/08/2024, 1:37 AM
dataframe_partition you could return a dictionary of the mappings, that you could request in the output to then post process things back.
Nicolas Huray
03/08/2024, 1:38 AM
Stefan Krawczyk
03/08/2024, 1:39 AM
Stefan Krawczyk
03/08/2024, 1:41 AM
@inject(A=source("col1 == 'A' and col2 =='B'"))
# A should be applied on "A White Horse" Dataframe
def A_unprocessed(A: pd.DataFrame) -> pd.DataFrame:
    return A
@pipe(
    step(_echo, v=1),
    step(_echo, v=2),
)
def A_processed2(A_unprocessed: pd.DataFrame) -> pd.DataFrame:
    print("I've just done a ton of transformations, each one of which is a node in the DAG")
    return A_unprocessed
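The earlier suggestion to use _ in place of spaces and other characters that are not valid in Python variable names, together with a mapping back to the original names, could be sketched with the standard library alone. The helper names here (to_identifier, build_name_mapping) are hypothetical and not part of Hamilton.

```python
import re


def to_identifier(name: str) -> str:
    # Replace every character that is not a letter, digit, or underscore.
    cleaned = re.sub(r"\W", "_", name)
    # Identifiers cannot start with a digit, so prefix an underscore if needed.
    if cleaned and cleaned[0].isdigit():
        cleaned = "_" + cleaned
    return cleaned


def build_name_mapping(names: list[str]) -> dict[str, str]:
    # Map each sanitized name back to the original partition name.
    mapping = {}
    for name in names:
        key = to_identifier(name)
        if key in mapping:
            raise ValueError(f"{name!r} and {mapping[key]!r} both map to {key!r}")
        mapping[key] = name
    return mapping


mapping = build_name_mapping(["A White Horse", "On a cherry tree"])
print(mapping)
```

The collision check matters because two different original names can sanitize to the same identifier; failing loudly is one option, appending a suffix would be another.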
Nicolas Huray
03/08/2024, 1:48 AM
Stefan Krawczyk
03/08/2024, 1:49 AM
Nicolas Huray
03/08/2024, 1:49 AM
Nicolas Huray
03/08/2024, 1:49 AM
Nicolas Huray
03/08/2024, 1:58 AM
Stefan Krawczyk
03/08/2024, 2:12 AM
Nicolas Huray
03/09/2024, 12:01 AM
Nicolas Huray
03/09/2024, 12:03 AM
Nicolas Huray
03/09/2024, 12:03 AM
Nicolas Huray
03/09/2024, 3:30 AM
Elijah Ben Izzy
03/09/2024, 3:49 AM