Slackbot
10/13/2023, 11:28 PM
Stefan Krawczyk
10/13/2023, 11:37 PM
Stefan Krawczyk
10/13/2023, 11:38 PM
Inputs are marked with dashed lines. Outputs are marked with a rectangular shape. So you see here, you can have inputs that are also outputs.
Stefan Krawczyk
10/13/2023, 11:39 PM
Stefan Krawczyk
10/13/2023, 11:43 PM
With the materializer functionality, what an output can be is visualized more explicitly.
from hamilton import base
from hamilton.io.materialization import to

# instead of execute you can do:
result, _ = dr.materialize(
    to.memory(
        id="example_df",
        dependencies=output_columns,
        combine=base.PandasDataFrameResult()
    ),
    inputs=initial_columns
)
# and then to visualize:
dr.visualize_materialization(
    to.memory(
        id="example_df",
        dependencies=output_columns,
        combine=base.PandasDataFrameResult()
    ),
    inputs=initial_columns
)
Where the visualization output will be:
JVial
10/14/2023, 8:40 AM
Stefan Krawczyk
10/15/2023, 3:47 AM
"if a value is marked with dashed lines then it is an input"
yes. It’ll also have Input: in the node value.
"if a value is marked as a rectangle it is an output"
correct.
"if a value is marked as a dashed rectangle it is an input and output"
correct 👍
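The three cases confirmed above can be written down as a small rule. This is only a sketch of the legend, not how Hamilton actually renders its graphs: node_style is a hypothetical helper, and the "style" and "shape" keys are borrowed from Graphviz attribute names purely for illustration.

```python
def node_style(is_input: bool, is_output: bool) -> dict:
    """Return drawing attributes for a node, following the legend above."""
    attrs = {}
    if is_input:
        attrs["style"] = "dashed"      # inputs get a dashed outline
    if is_output:
        attrs["shape"] = "rectangle"   # outputs get a rectangular shape
    return attrs


# The three cases from the conversation.
legend = {
    "input only": node_style(is_input=True, is_output=False),
    "output only": node_style(is_input=False, is_output=True),
    "input and output": node_style(is_input=True, is_output=True),
}
for role, attrs in legend.items():
    print(role, attrs)
```

A node that is neither an input nor an output gets no special attributes in this sketch.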
JVial
10/15/2023, 8:34 AM
JVial
10/15/2023, 9:11 AM
def customer_first_name(customer_first_name: pd.Series) -> pd.Series:
    return customer_first_name.str.replace('Lana', '', case=False)

from hamilton import driver, ad_hoc_utils
temp_module = ad_hoc_utils.create_temporary_module(
    customer_first_name)
config = {}
dr = driver.Driver(config, temp_module)
output_columns = [
    'customer_first_name'
]
input_data = customer_data_df.to_dict('series')
df = dr.execute(output_columns, inputs=input_data)

RecursionError: maximum recursion depth exceeded
Stefan Krawczyk
10/15/2023, 4:26 PM
Use _raw as a suffix for inputs.
def customer_first_name(customer_first_name_raw: pd.Series) -> pd.Series:
    # use the renamed parameter, not the function's own name
    return customer_first_name_raw.str.replace('Lana', '', case=False)
...
input_data = customer_data_df.to_dict('series')
input_data = {f"{c}_raw": v for c, v in input_data.items()} # add _raw suffix
df = dr.execute(output_columns, inputs=input_data)
Note: depending on the transforms you’re doing you might like this issue that should be done soon.
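For context on the RecursionError above: as far as I understand it, Hamilton names each node after its function and treats the function's parameters as that node's dependencies, so a function whose parameter shares its own name ends up depending on itself. The sketch below is plain Python rather than Hamilton code. find_self_dependencies and add_suffix are hypothetical helpers made up for illustration, and customer_last_name is an invented second column. It flags such a clash before the driver is built and applies the same _raw renaming to the input keys.

```python
import inspect


def find_self_dependencies(*functions):
    """Return the names of functions that list their own name as a parameter."""
    return [
        fn.__name__
        for fn in functions
        if fn.__name__ in inspect.signature(fn).parameters
    ]


def add_suffix(inputs, suffix="_raw"):
    """Rename every input key so it cannot collide with a function name."""
    return {f"{name}{suffix}": value for name, value in inputs.items()}


# Clashing shape: the parameter has the same name as the function.
def customer_first_name(customer_first_name):
    return customer_first_name


# Renamed shape: the input carries a _raw suffix, so there is no clash.
def customer_last_name(customer_last_name_raw):
    return customer_last_name_raw


self_dependent = find_self_dependencies(customer_first_name, customer_last_name)
renamed = add_suffix({"customer_first_name": ["Lana", "Bob"]})
print(self_dependent)
print(renamed)
```

Running a check like this over the functions passed to create_temporary_module would catch the name clash early, instead of surfacing it as a recursion error at execution time.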