# hamilton-help
s
@JVial thanks for the question. I get the following diagram.
So yes, inputs can be outputs. Inputs are marked as “Input: X” with dashed lines. Outputs are marked with a rectangular shape. So you see here, you can have inputs that are also outputs.
To understand your question better: are you asking whether there is a better way to visualize things in the graph, or something else?
If you use the materializer functionality, the outputs can be shown more explicitly in the visualization.
from hamilton import base
from hamilton.io.materialization import to
# instead of execute you can do:
result, _ = dr.materialize(
  to.memory(
    id="example_df",
    dependencies=output_columns,
    combine=base.PandasDataFrameResult()
  ),
  inputs=initial_columns
)

# and then to visualize:
dr.visualize_materialization(
  to.memory(
    id="example_df",
    dependencies=output_columns,
    combine=base.PandasDataFrameResult()
  ),
  inputs=initial_columns
)
Where the visualization output will be:
j
Hello and thanks for the fast reply. Yeah, so my question is: can I visualize that an input is equal to an output (1:1 -> input == output)? If I understand it correctly, if a value in the graph is marked with dashed lines then it is an input. If a value is marked as a rectangle it is an output. And if a value is marked as a dashed rectangle it is an input and an output? Is my understanding correct?
s
“a value is marked with dashed lines then it is an input”: yes. It’ll also have “Input:” in the node value.
“a value is marked as a rectangle it is an output”: correct.
“if a value is marked as a dashed rectangle it is an input and output”: correct 👍
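
The legend confirmed above can be restated as a small, library-free sketch. It only mirrors the rules from this thread (input means dashed, output means rectangle, both means dashed rectangle) and is not Hamilton's rendering code. The function `node_styles` and the node names `signups`, `spend`, and `spend_per_signup` are hypothetical and chosen for illustration.

```python
from typing import Dict, Iterable, Set


def node_styles(
    user_inputs: Iterable[str], requested_outputs: Iterable[str]
) -> Dict[str, Set[str]]:
    """Map each node name to the style attributes described in the thread."""
    styles: Dict[str, Set[str]] = {}
    for name in user_inputs:
        # Anything passed in by the user is drawn with a dashed outline.
        styles.setdefault(name, set()).add("dashed")
    for name in requested_outputs:
        # Anything requested as an output is drawn as a rectangle.
        styles.setdefault(name, set()).add("rectangle")
    return styles


# Hypothetical example: "spend" is both supplied as an input and requested back.
styles = node_styles(
    user_inputs=["signups", "spend"],
    requested_outputs=["spend", "spend_per_signup"],
)
for name, attrs in sorted(styles.items()):
    print(name, sorted(attrs))
```

A node that appears in both lists picks up both attributes, which corresponds to the dashed rectangle discussed above.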
j
@Stefan Krawczyk Thank you for the explanation. Now I get it :)👏
OK, sorry 🙂 I have one more question. Is it possible to use the same name for an input and an output but transform it? I mean, I have a pd.Series named first_name and I have an output named first_name. I have a function called first_name which gets the pd.Series first_name as input and manipulates it (e.g. replaces a string).
import pandas as pd
from hamilton import driver, ad_hoc_utils

# The function and its only parameter share the name customer_first_name.
def customer_first_name(customer_first_name: pd.Series) -> pd.Series:
  return customer_first_name.str.replace('Lana', '', case=False)

temp_module = ad_hoc_utils.create_temporary_module(customer_first_name)
config = {}
dr = driver.Driver(config, temp_module)

output_columns = [
    'customer_first_name'
]

input_data = customer_data_df.to_dict('series')
df = dr.execute(output_columns, inputs=input_data)
RecursionError: maximum recursion depth exceeded
s
@JVial yep, that doesn’t work. What you’ve effectively defined there is a node with an edge to itself (i.e. a loop); Hamilton can’t tell the difference between input and output there. More broadly, Hamilton was created to make it really easy to debug an output, i.e. go from an output to the code that produced it and understand the order of computation leading to it. So we try to make it hard to “redefine” (i.e. “mutate”) the same thing twice. So you’ll need to name either the input or the function differently; it’s common to use _raw as a suffix for inputs.
def customer_first_name(customer_first_name_raw: pd.Series) -> pd.Series:
  # use the renamed parameter; the old name no longer exists in this scope
  return customer_first_name_raw.str.replace('Lana', '', case=False)

...

input_data = customer_data_df.to_dict('series')
input_data = {f"{c}_raw": v for c, v in input_data.items()}  # add _raw suffix
df = dr.execute(output_columns, inputs=input_data)
Note: depending on the transforms you’re doing you might like this issue that should be done soon.
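
To see why the first definition loops, here is a minimal, library-free sketch. It assumes a simplified model in which a function's name is the node name and its parameter names are its upstream nodes; it is an illustration of the self-edge, not Hamilton's actual graph-building code. The helpers `dependencies`, `self_referencing`, and `add_suffix` are hypothetical names introduced for this example.

```python
import inspect
from typing import Callable, Dict, List


def dependencies(fn: Callable) -> List[str]:
    """Treat parameter names as the names of upstream nodes."""
    return list(inspect.signature(fn).parameters)


def self_referencing(functions: List[Callable]) -> List[str]:
    """Return the names of functions that depend on a node with their own name."""
    return [fn.__name__ for fn in functions if fn.__name__ in dependencies(fn)]


def add_suffix(inputs: Dict[str, object], suffix: str = "_raw") -> Dict[str, object]:
    """Rename every input key so it cannot collide with a function name."""
    return {f"{name}{suffix}": value for name, value in inputs.items()}


def customer_first_name(customer_first_name):
    # Original shape: the parameter has the same name as the function.
    return customer_first_name


looping = self_referencing([customer_first_name])


def customer_first_name(customer_first_name_raw):
    # Renamed input: the node now depends on a differently named upstream node.
    return customer_first_name_raw


fixed = self_referencing([customer_first_name])
renamed_inputs = add_suffix({"customer_first_name": ["Lana", "Bob"]})

print(looping)               # ['customer_first_name']
print(fixed)                 # []
print(list(renamed_inputs))  # ['customer_first_name_raw']
```

Running a check like this before building the driver would flag the name collision up front instead of surfacing it as a RecursionError at execution time.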