This message was deleted Hamilton Open Source #hamilton-help


# hamilton-help

Slackbot

02/20/2024, 1:33 PM

This message was deleted.

Thierry Jean

02/20/2024, 1:57 PM

Hi Roel! You're talking about distinguishing "optional" from "required" inputs in the visualizations? Is there a way you would like to see it displayed? For the `hamilton.plugins.h_experiments.ExperimentTracker`, I implemented a utility function to get the default parameters of a node (i.e., the function has `a: int = 1` instead of `a: int`). (see here) If you're comfortable, you could start with that and try some custom viz styles?
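A minimal sketch of the idea behind that utility, using only the standard library's `inspect` module: a parameter counts as "optional" when its signature carries a default. The helper name `split_params` is hypothetical and is not part of Hamilton.

```python
import inspect


def split_params(fn):
    """Split a function's parameters into required and optional (hypothetical helper)."""
    required, optional = [], {}
    for name, param in inspect.signature(fn).parameters.items():
        if param.default is inspect.Parameter.empty:
            required.append(name)           # e.g. `a: int`
        else:
            optional[name] = param.default  # e.g. `a: int = 1`
    return required, optional


def with_default(a: int = 1) -> int:
    return a


def without_default(a: int) -> int:
    return a


print(split_params(with_default))     # ([], {'a': 1})
print(split_params(without_default))  # (['a'], {})
```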

Thierry Jean

02/20/2024, 2:00 PM

Otherwise, I've been thinking about a way to surface from the Driver all the:
• Config keys
• Required inputs
• Optional inputs

to automatically generate a sort of "configuration space" of all possible DAGs (without using overrides). Does that sound useful? It could help with ML experiments, or help DAG authors to expand/constrain the input space to only valid "configurations" for downstream users

Roel Bertens

02/21/2024, 6:07 PM

That definitely sounds useful. I also want to help the users with a quick and easy way to tell them which inputs they need to supply when they want to compute a certain (set of) node(s)
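As a rough illustration of what "which inputs do I need for these nodes" could look like, here is a standard-library sketch over a toy function-based DAG. It is not Hamilton's implementation: the `inputs_needed` helper, the toy functions, and the rule that any parameter not matching a function name is an external input are all assumptions made for the example.

```python
import inspect


def inputs_needed(functions, targets):
    """Return (required, optional) external input names for the requested targets.

    `functions` maps node name -> function; a parameter that is not a key of
    `functions` is treated as an external input (assumption of this sketch).
    """
    required, optional, seen = set(), set(), set()
    stack = list(targets)
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        for pname, param in inspect.signature(functions[name]).parameters.items():
            if pname in functions:
                stack.append(pname)  # upstream node, keep walking
            elif param.default is inspect.Parameter.empty:
                required.add(pname)
            else:
                optional.add(pname)
    # an input required by any function is required overall
    return required, optional - required


def raw(path: str) -> list:
    return [path]


def cleaned(raw: list, drop_na: bool = True) -> list:
    return raw


def report(cleaned: list, title: str) -> str:
    return title


funcs = {"raw": raw, "cleaned": cleaned, "report": report}
print(inputs_needed(funcs, ["cleaned"]))  # ({'path'}, {'drop_na'})
```

Requesting `report` instead would add `title` to the required set, since the walk also covers everything upstream of it.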

Elijah Ben Izzy

02/21/2024, 6:10 PM

@Roel Bertens will this work for you?
• https://hamilton.dagworks.io/en/latest/reference/drivers/Driver/#hamilton.driver.Driver.what_is_upstream_of

You can also tell which inputs are external, since they have this field set to `true`: https://github.com/DAGWorks-Inc/hamilton/blob/2cfe00c494db65120e45626f07aafba78ac47432/hamilton/graph_types.py#L28

💯 1

Roel Bertens

02/21/2024, 6:32 PM

That’s perfect

🔥 1

Roel Bertens

02/22/2024, 9:40 AM

@Thierry Jean I am indeed using `_get_default_input` now to check this, but since this method is not expected to be used outside the module, it might be good to make it clear that you can actually use it (i.e., make it a publicly available method).

Thierry Jean

02/22/2024, 12:47 PM

We move slower for public-facing API, but you can be confident with this one since it's based on the already public `graph_types.HamiltonNode`, and it simply inspects the underlying function signature. @Elijah Ben Izzy We could move it to `graph_utils`? Or make it a property of `HamiltonNode`? Similarly, we could move `hash_source_code()` to a property.

Thierry Jean

02/22/2024, 12:51 PM

To illustrate the problem @Elijah Ben Izzy, when the following function is called:

def A(external: int = 7) -> float:
  return external / 2

# call 1 
dr.execute(["A"])
# call 2 
dr.execute(["A"], inputs={"external": 5})

For call 1 and call 2, `external` should be `is_external_input` (from the dataflow's POV) and it will have value 7 and 5 respectively. In the experiment tracker, I disambiguate the two using

inputs = dict()  # from .execute()
logged_inputs = []

# graph is graph_types.HamiltonGraph
# node is graph_types.HamiltonNode
for node in graph.nodes:
    # keep external inputs that have an originating function;
    # this filters out config nodes, which are external but have no origin
    if node.is_external_input and node.originating_functions:
        logged_inputs.append(
            dict(
                name=node.name,
                value=inputs.get(node.name),  # value from execute
                default_value=_get_default_input(node),  # value from function signature
            )
        )

which would lead to call 1

{name: external, value: None, default_value: 7}

and call 2

{name: external, value: 5, default_value: 7}
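The logging logic can be tried without a driver. The sketch below uses a `SimpleNamespace` as a stand-in for `graph_types.HamiltonNode` (only the three fields used here), and `default_of` and `log_record` are hypothetical helpers that mirror the snippet above rather than Hamilton's own code.

```python
import inspect
from types import SimpleNamespace


def A(external: int = 7) -> float:
    return external / 2


# Stand-in for graph_types.HamiltonNode with only the fields used here
node = SimpleNamespace(name="external", is_external_input=True, originating_functions=(A,))


def default_of(node):
    """Default from the first originating function's signature, else None."""
    param = inspect.signature(node.originating_functions[0]).parameters[node.name]
    return None if param.default is inspect.Parameter.empty else param.default


def log_record(node, inputs):
    """Build one logged-input record from the inputs passed to execute."""
    return dict(name=node.name, value=inputs.get(node.name), default_value=default_of(node))


call_1 = log_record(node, {})               # dr.execute(["A"])
call_2 = log_record(node, {"external": 5})  # dr.execute(["A"], inputs={"external": 5})
print(call_1)
print(call_2)
```

In this sketch `value` only reflects what was passed to `execute`, so a record with no `value` and a populated `default_value` indicates that the default was used.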

Elijah Ben Izzy

02/22/2024, 2:41 PM

Ok, so yeah, we don’t expose it. I’d like to, but we don’t currently track the default values (they get applied at function time…). @Thierry Jean the `_get_default_input` is largely correct but might run into some issues (it assumes a 1:1 mapping of fn -> node)… I’d like to add in a `defaults` value. It’ll be a dict of `str` to `value`. @Roel Bertens in the meantime you can use the optional_dependencies to know which ones are defaults, and use the logic in `get_default_input`
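To make the 1:1 caveat concrete, here is a hedged sketch of a more defensive lookup that checks every originating function instead of only the first. How Hamilton actually populates `originating_functions` for an input shared by several functions is not confirmed here; `defaults_for` and the `SimpleNamespace` stand-in node are illustrative assumptions.

```python
import inspect
from types import SimpleNamespace


def defaults_for(node):
    """Collect defaults for `node.name` across all originating functions."""
    found = []
    for fn in node.originating_functions or ():
        # .get() avoids a KeyError when a function has no parameter with this name
        param = inspect.signature(fn).parameters.get(node.name)
        if param is not None and param.default is not inspect.Parameter.empty:
            found.append(param.default)
    return found


def f(x: int = 1) -> int:
    return x


def g(x: int = 2) -> int:
    return x


def h(y: int) -> int:
    return y


node = SimpleNamespace(name="x", originating_functions=(f, g, h))
print(defaults_for(node))  # [1, 2]
```

Returning a list makes conflicting defaults visible rather than silently picking one of them.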

Elijah Ben Izzy

02/22/2024, 3:03 PM

@Roel Bertens created https://github.com/DAGWorks-Inc/hamilton/issues/710 to track, feel free to comment

👍 1

Roel Bertens

02/23/2024, 9:06 AM

Using optional_dependencies works, thanks

🙌 1

Roel Bertens

03/01/2024, 1:48 PM

@Thierry Jean Hi, so I've been using this already to identify inputs with default values, which works. I use this now (feedback welcome):

import inspect


def _is_input_node_with_default(node) -> bool:
    """Return True if the input node has a default value specified

    NOTE: also return True if the default is None
    """
    # exclude config nodes, which have no originating function
    if node.is_external_input and node.originating_functions is not None:
        origin_function = node.originating_functions[0]
        param = inspect.signature(origin_function).parameters[node.name]
        # Parameter.empty is the public sentinel for "no default"
        return param.default is not inspect.Parameter.empty
    return False

Based on this I want to style these input nodes differently, but I notice they are not reachable via the `custom_style` function which I pass to `display_all_functions`. How can I change their look?

Thierry Jean

03/01/2024, 4:29 PM

Hi @Roel Bertens! I'm currently looking into it. Do you have any code I could test it on?

Roel Bertens

03/01/2024, 5:17 PM

%%cell_to_module -m my_module --display

def test(a: int, b: int = 2) -> int:
   return a+b

Roel Bertens

03/01/2024, 5:17 PM

where a and b should have a different style

Roel Bertens

03/01/2024, 5:28 PM

wrt styling I am using something like this

from typing import Optional, Tuple

from hamilton import graph_types


def custom_style_function(
    *, node: graph_types.HamiltonNode, node_class: str
) -> Tuple[dict, Optional[str], Optional[str]]:
    # highlight inputs that have a default value
    if _is_input_node_with_default(node):
        style = ({"fillcolor": "#c1f5cf"}, node_class, "optional")
    else:
        style = ({}, node_class, None)

    return style

And then run

dr.display_all_functions(custom_style_function=custom_style_function)

Thierry Jean

03/01/2024, 6:10 PM

On the left side, you see that one picture has grouped inputs and the other has ungrouped + styled inputs. The two available options would be styling the entire group or ungrouping nodes and styling individual input nodes

Roel Bertens

03/01/2024, 7:04 PM

Nice! Do you also have the code for that example?

Thierry Jean

03/01/2024, 7:06 PM

I would need to make a PR with some code change, but wanted to check with you if you preferred:
1. grouped inputs with style applied to whole group
2. ungrouped inputs with style applied to individual inputs

this feature request hits a few quirks, but we should be able to sort it for now until we do a greater overhaul of the visualization code

Roel Bertens

03/04/2024, 7:44 AM

@Thierry Jean From these options I would prefer (2) because I want to be able to show the difference between optional and non-optional inputs. Would an option 3 be possible where you group the optional and also group the non-optional inputs? Or maybe an option 4 where all inputs are grouped but within the node the font of optional inputs is different. So my preference in order: 4, 3, 2.

Roel Bertens

03/04/2024, 7:47 AM

There is also another thing that I was already considering which doesn't solve the same problem but still helps with creating clarity in the DAG. I would like a config setting to exclude the optional inputs from the DAG. Is that something useful to support or better to try and implement myself?

Thierry Jean

03/04/2024, 4:27 PM

In the short term, options 1 and 2 are most manageable, but other options aren't impossible. This is because of graphviz limitations. "Grouped inputs" are actually a single node that contains a table in the `label` attribute (see Graphviz docs). These tables use an HTML-like syntax specific to Graphviz. To style individual inputs of a grouped input, we would need to expose the node `label` attribute. On the other hand, creating an individual node for each input, and potentially using subgraphs to group them, generally creates messier layouts and a lot more entropy/variation between reruns of the same viz since it's non-deterministic.
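For readers unfamiliar with Graphviz HTML-like labels, the sketch below builds such a table label for a single grouped-input node and italicizes the optional rows, roughly what option 4 would need. It is not the markup Hamilton generates; `grouped_input_label` and its input format are assumptions for illustration.

```python
from html import escape


def grouped_input_label(inputs):
    """Build a Graphviz HTML-like table label for one grouped-input node.

    `inputs` is a list of (name, type_name, is_optional) tuples.
    """
    rows = []
    for name, type_name, is_optional in inputs:
        text = escape(name)
        if is_optional:
            text = f"<i>{text}</i>"  # italic marks an input that has a default
        rows.append(f"<tr><td>{text}</td><td>{escape(type_name)}</td></tr>")
    # the outer <...> pair tells Graphviz the label is HTML-like, not plain text
    return '<<table border="0">' + "".join(rows) + "</table>>"


label = grouped_input_label([("a", "int", False), ("b", "int", True)])
print(f"digraph {{ inputs [shape=rect, label={label}] }}")
```

The printed DOT source can be rendered with the `dot` command-line tool to check the result.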

Roel Bertens

03/04/2024, 4:41 PM

And would excluding optional inputs from the DAG be an option in the short term?

Thierry Jean

03/04/2024, 4:52 PM

Option #2 requires two changes:
1. support ungrouping input nodes:
   ◦ this isn't too difficult, and it would resolve another issue with `deduplicate_inputs`, which is sometimes unintuitive
2. create new logic to identify default inputs (you could style or hide them):
   ◦ in short, the ad hoc logic we shared earlier in this thread to identify default inputs wouldn't work. I can give you an example workaround, but the API will likely change when we decide how to support default values
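The two changes for option #2 can be pictured with a small DOT generator: one node statement per input, with inputs that have a default either styled or left out. This is only a sketch of the idea, not Hamilton's visualization code; `input_nodes_dot` and the `hide_optional` flag are hypothetical, and the fill color is borrowed from the style function shared earlier in the thread.

```python
def input_nodes_dot(inputs, hide_optional=False):
    """Emit one DOT node statement per input; `inputs` maps name -> has_default."""
    lines = []
    for name, has_default in inputs.items():
        if has_default and hide_optional:
            continue  # leave optional inputs out of the drawing
        attrs = 'style="filled,dashed", fillcolor="#c1f5cf"' if has_default else 'style="dashed"'
        lines.append(f'  "{name}" [{attrs}];')
    return "\n".join(lines)


inputs = {"a": False, "b": True}
print("digraph {\n" + input_nodes_dot(inputs) + "\n}")
```

Passing `hide_optional=True` drops `b` from the output, which corresponds to excluding optional inputs from the drawing.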

Thierry Jean

03/04/2024, 5:48 PM

@ThreadScribe can you summarize this discussion into a structured GitHub issue for me?

ThreadScribe

03/04/2024, 5:48 PM

🪄 Summoning the summary elves... one moment!

ThreadScribe

03/04/2024, 5:48 PM

Hey @Thierry Jean, your summary is ready!

### Clarifying Optional Inputs for Hamilton Driver

In a recent discussion, the team explored ways to visually distinguish optional inputs from required ones in a Hamilton driver. Here are the key points from the conversation:

- The request was to make it clear to the user that certain inputs to the driver are optional. This was in context of differentiating "optional" from "required" inputs in the visualizations.
- A utility function was implemented in the `hamilton.plugins.h_experiments.ExperimentTracker` to get the default parameters of a node. This function inspects the function signature to distinguish default values for inputs.
- The team also discussed the possibility of surfacing all the configuration keys, required inputs, and optional inputs to automatically generate a "configuration space" of all possible DAGs. This could potentially help with ML experiments or enable DAG authors to constrain the input space to valid configurations for downstream users.
- Discussions included moving certain methods to different modules or properties for better accessibility and usage.
- Sample code snippets were shared for identifying and styling input nodes with default values, along with the challenges and options for applying different styles to input nodes.
- Short-term solutions were discussed, including support for ungrouping input nodes and creating new logic to identify default inputs for styling or hiding.
- Graphviz limitations were highlighted, along with the potential challenges in implementing certain display options.
- It was noted that excluding optional inputs from the DAG displayed a complex set of challenges and limitations, particularly in relation to Graphviz and the current API logic.

The team also agreed to summarize the discussion into a structured GitHub issue to further track the progress of the potential changes. Feel free to refer to the complete discussion for additional context.

ThreadScribe

03/04/2024, 5:48 PM

You can also view your Scribe HERE 👀

Thierry Jean

03/04/2024, 5:48 PM

@Stefan Krawczyk this was pretty fast, only took seconds
