Seth Stokes
03/27/2024, 7:41 PM
saved_formatted_data_output is typed as a dict?
@save_to.excel(
    path=source("path_to_save"),
    output_name_="saved_formatted_data_output",
    index=False
)
@config.when(data_product="gui")
@schema.output(
    ("Attribute_A", "str"),
    ("Attribute_B", "str"),
    ("Attribute_C", "str"),
)
def formatted_data_output__gui(data: pd.DataFrame) -> pd.DataFrame:
    ...
hamilton.function_modifiers.base.InvalidDecoratorException: Node saved_formatted_data_output has type typing.Dict[str, typing.Any] which is not a registered type for a dataset. Registered types are {'pandas': <class 'pandas.core.frame.DataFrame'>, 'polars': <class 'polars.dataframe.frame.DataFrame'>}. If you found this, either (a) ensure you have the right package installed, or (b) reach out to the team to figure out how to add yours.
Thierry Jean
03/27/2024, 7:48 PM
pd.DataFrame is actually just a dictionary of columns. I suspect @schema might interact with the returned value of formatted_data_output
Seth Stokes
03/27/2024, 7:51 PM
@schema got past the issue
Thierry Jean
03/27/2024, 7:51 PM
@schema does it work?
Thierry Jean
03/27/2024, 7:51 PM
Thierry Jean
03/27/2024, 8:01 PM
target_ as follows also fixes the issue
@schema.output(
    ("Attribute_A", "str"),
    ("Attribute_B", "str"),
    ("Attribute_C", "str"),
    target_="formatted_data_output",
)
Can you try it on your end?
Seth Stokes
03/27/2024, 8:25 PM
Elijah Ben Izzy
03/27/2024, 8:27 PM
Thierry Jean
03/27/2024, 8:30 PM
hamilton.function_modifiers.base function resolve_nodes(). More precisely, the issue is that it keeps track of a list of nodes associated with the function formatted_data_output, and in this case the materializer is pushed to position 0 in the list. Therefore, the schema actually receives the materializer output, a metadata dictionary, instead of the dataframe.
Thierry Jean
03/27/2024, 8:31 PM
Thierry Jean
03/27/2024, 8:34 PM
target_ should be a robust solution!
Elijah Ben Izzy
03/27/2024, 8:58 PM
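(Aside, not part of the original thread.) The explanation above, where the materializer node lands at position 0 so that @schema validates a metadata dictionary instead of the dataframe, can be illustrated with a dependency-free toy model. This is only a sketch under stated assumptions: Node, DataFrame, pick_node and validate_schema_target are hypothetical names, and the "take position 0 when no target is given" rule is an assumption made for illustration, not Hamilton's actual implementation.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


class DataFrame:
    """Hypothetical stand-in for pd.DataFrame so the sketch needs no installs."""


@dataclass
class Node:
    name: str
    type_: Any  # declared return type of the node


# Toy version of the "registered types" mentioned in the error message.
REGISTERED_DATAFRAME_TYPES = (DataFrame,)


def pick_node(nodes: List[Node], target: Optional[str] = None) -> Node:
    """Assumed selection rule: with no target, use whatever sits at position 0."""
    if target is None:
        return nodes[0]
    return next(n for n in nodes if n.name == target)


def validate_schema_target(node: Node) -> None:
    """Reject nodes whose declared type is not a registered dataframe type."""
    if node.type_ not in REGISTERED_DATAFRAME_TYPES:
        raise TypeError(
            f"Node {node.name} has type {node.type_} "
            "which is not a registered type for a dataset"
        )


# Order described in the thread: the materializer node is at position 0.
nodes = [
    Node("saved_formatted_data_output", Dict[str, Any]),
    Node("formatted_data_output", DataFrame),
]

# Without a target, the schema check hits the materializer's dict output.
try:
    validate_schema_target(pick_node(nodes))
    untargeted_ok = True
except TypeError as err:
    untargeted_ok = False
    print(err)

# Naming the target explicitly selects the dataframe node, so the check passes.
validate_schema_target(pick_node(nodes, target="formatted_data_output"))
targeted_ok = True
```

In this toy model, naming the node explicitly makes the check independent of list order, which is the same reason target_ was suggested as the robust fix.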