# hamilton-help
s
This message was deleted.
t
Hi Roel! You're talking about distinguishing "optional" from "required" inputs in the visualizations? Is there a way you would like to see it displayed? For the `hamilton.plugins.h_experiments.ExperimentTracker`, I implemented a utility function to get the default parameters of a node (i.e., the function has `a: int = 1` instead of `a: int`). (see here) If you're comfortable, you could start with that and try some custom viz styles?
Otherwise, I've been thinking about a way to surface from the Driver all the:
• Config keys
• Required inputs
• Optional inputs
to automatically generate a sort of "configuration space" of all possible DAGs (without using overrides). Does that sound useful? It could help with ML experiments, or help DAG authors expand/constrain the input space to only valid "configurations" for downstream users
r
That definitely sounds useful. I also want to help the users with a quick and easy way to tell them which inputs they need to supply when they want to compute a certain (set of) node(s)
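A minimal sketch of that "which inputs do I need for these nodes" lookup, assuming a toy dataflow of plain functions whose parameter names refer either to other functions or to external inputs. It only uses `inspect`; the helper `inputs_for` and the example functions are hypothetical and not part of Hamilton's API.

```python
import inspect
from typing import Callable, Dict, Iterable, Set, Tuple


def inputs_for(
    targets: Iterable[str], functions: Dict[str, Callable]
) -> Tuple[Set[str], Set[str]]:
    """Walk upstream from `targets`; return (required, optional) external inputs."""
    required: Set[str] = set()
    optional: Set[str] = set()
    seen: Set[str] = set()
    stack = list(targets)
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        for dep, param in inspect.signature(functions[name]).parameters.items():
            if dep in functions:
                stack.append(dep)  # computed node: keep walking upstream
            elif param.default is inspect.Parameter.empty:
                required.add(dep)  # external input without a default
            else:
                optional.add(dep)  # external input with a default
    # an input stays optional only if no consumer requires it
    return required, optional - required


def cleaned(raw: list, threshold: int = 0) -> list:
    return [x for x in raw if x > threshold]


def total(cleaned: list) -> int:
    return sum(cleaned)


def scaled(total: int, factor: int) -> int:
    return total * factor


functions = {"cleaned": cleaned, "total": total, "scaled": scaled}
required_total, optional_total = inputs_for(["total"], functions)
required_scaled, optional_scaled = inputs_for(["scaled"], functions)
print(sorted(required_total), sorted(optional_total))
print(sorted(required_scaled), sorted(optional_scaled))
```

Requesting `total` needs only `raw`, while requesting `scaled` also needs `factor`; `threshold` is optional in both cases.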
e
r
That’s perfect
🔥 1
@Thierry Jean I am indeed using `_get_default_input` now to check this, but since this method is not expected to be used outside the module, it might be good to make it clear that you can actually use it (i.e. make it a publicly available method).
t
We move slower for public-facing API, but you can be confident with this one since it's based on the already public `graph_types.HamiltonNode`, and it simply inspects the underlying function signature. @Elijah Ben Izzy We could move it to `graph_utils`? or make it a property of `HamiltonNode`? Similarly, we could move `hash_source_code()` from to a property.
To illustrate the problem @Elijah Ben Izzy, when the following function is called:
```python
def A(external: int = 7) -> float:
    return external / 2

# call 1
dr.execute(["A"])
# call 2
dr.execute(["A"], inputs={"external": 5})
```
For call 1 and call 2, `external` should be `is_external_input` (from the dataflow's POV) and they will have value 7 and 5 respectively. In the experiment tracker, I disambiguate the two using
```python
inputs = dict()  # from .execute()
logged_inputs = []

# graph is graph_types.HamiltonGraph
# node is graph_types.HamiltonNode
for node in graph.nodes:
    # skip config nodes: they are external inputs but have no originating functions
    if node.is_external_input and node.originating_functions:
        logged_inputs.append(
            dict(
                name=node.name,
                value=inputs.get(node.name),  # value from execute
                default_value=_get_default_input(node),  # value from function signature
            )
        )
```
which would lead to call 1 `{name: external, value: None, default_value: 7}` and call 2 `{name: external, value: 5, default_value: 7}`
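To make the disambiguation concrete, here is a minimal sketch that follows the field meanings in the snippet above (`value` comes from execute, `default_value` from the function signature). It works on a plain function with `inspect` rather than on a `HamiltonNode`; the helper `describe_input` and the extra `effective_value` field are assumptions added for illustration.

```python
import inspect
from typing import Any, Callable, Dict


def describe_input(fn: Callable, name: str, inputs: Dict[str, Any]) -> Dict[str, Any]:
    """Report the passed value, the signature default, and the value actually used."""
    default = inspect.signature(fn).parameters[name].default
    has_default = default is not inspect.Parameter.empty
    if name not in inputs and not has_default:
        raise ValueError(f"required input {name!r} was not provided")
    return dict(
        name=name,
        value=inputs.get(name),  # value passed at execution time, if any
        default_value=default if has_default else None,  # value from function signature
        effective_value=inputs[name] if name in inputs else default,
    )


def A(external: int = 7) -> float:
    return external / 2


call_1 = describe_input(A, "external", {})
call_2 = describe_input(A, "external", {"external": 5})
print(call_1)
print(call_2)
```

Note that a `default_value` of `None` is ambiguous here (no default versus a default of `None`), which is one reason a dedicated flag may be preferable in a real implementation.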
e
Ok, so yeah, we don’t expose it. I’d like to, but we don’t currently track the default values (they get applied at function time…) @Thierry Jean the `_get_default_input` is largely correct but might run into some issues (it assumes 1:1 mapping of fn -> node)… I’d like to add in a `defaults` value. It’ll be a dict of `str` -> `value`. @Roel Bertens in the meantime you can use the optional_dependencies to know which ones are defaults, and use the logic in `_get_default_input`.
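One plausible way the 1:1 assumption could break is an input that is consumed by several functions, each declaring a different default or none at all. The sketch below uses plain functions and `inspect` only; `defaults_by_consumer` and functions `B` and `C` are hypothetical and do not reflect how Hamilton stores this information.

```python
import inspect
from typing import Any, Callable, Dict, List


def defaults_by_consumer(input_name: str, functions: List[Callable]) -> Dict[str, Any]:
    """Map each consuming function's name to the default it declares for `input_name`."""
    found: Dict[str, Any] = {}
    for fn in functions:
        param = inspect.signature(fn).parameters.get(input_name)
        if param is not None and param.default is not inspect.Parameter.empty:
            found[fn.__name__] = param.default
    return found


def A(external: int = 7) -> float:
    return external / 2


def B(external: int) -> int:  # same input, but no default here
    return external + 1


def C(external: int = 3) -> int:  # same input, different default
    return external * 2


defaults = defaults_by_consumer("external", [A, B, C])
print(defaults)  # {'A': 7, 'C': 3}
```

Inspecting only the first consuming function would report a single default of 7 and miss both the conflicting default in `C` and the fact that `B` requires the input.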
@Roel Bertens created https://github.com/DAGWorks-Inc/hamilton/issues/710 to track, feel free to comment
👍 1
r
Using optional_dependencies works, thanks
🙌 1
@Thierry Jean Hi, so I've been using this already to identify inputs with default values, which works. I use this now (feedback welcome):
```python
import inspect


def _is_input_node_with_default(node) -> bool:
    """Return True if the input node has a default value specified.

    NOTE: also returns True if the default is None.
    """
    # config nodes are external inputs without originating functions; exclude them
    if node.is_external_input and node.originating_functions:
        origin_function = node.originating_functions[0]
        param = inspect.signature(origin_function).parameters[node.name]
        return param.default is not inspect.Parameter.empty
    return False
```
Based on this I want to style these input nodes differently, but I notice they are not reachable via the `custom_style` function which I pass to `display_all_functions`. How can I change their look?
t
Hi @Roel Bertens! I'm currently looking into it. Do you have any code I could test it on?
r
```python
%%cell_to_module -m my_module --display

def test(a: int, b: int = 2) -> int:
    return a + b
```
where a and b should have a different style
wrt styling I am using something like this
```python
from typing import Optional, Tuple

from hamilton import graph_types


def custom_style_function(
    *, node: graph_types.HamiltonNode, node_class: str
) -> Tuple[dict, Optional[str], Optional[str]]:
    # give inputs that have a default value a distinct fill color
    if _is_input_node_with_default(node):
        style = ({"fillcolor": "#c1f5cf"}, node_class, "optional")
    else:
        style = ({}, node_class, None)

    return style
```
And then run
```python
dr.display_all_functions(custom_style_function=custom_style_function)
```
t
On the left side, you see that one picture has grouped inputs and the other has ungrouped + styled inputs. The two available options would be styling the entire group or ungrouping nodes and styling individual input nodes
r
Nice! Do you also have the code for that example?
t
I would need to make a PR with some code change, but wanted to check with you which you'd prefer:
1. grouped inputs with style applied to whole group
2. ungrouped inputs with style applied to individual inputs
This feature request hits a few quirks, but we should be able to sort it for now until we do a greater overhaul of the visualization code
r
@Thierry Jean From these options I would prefer (2) because I want to be able to show the difference between optional and non-optional inputs. Would an option 3 be possible where you group the optional and also group the non-optional inputs? Or maybe an option 4 where all inputs are grouped but within the node the font of optional inputs is different. So my preference in order: 4, 3, 2.
There is also another thing that I was already considering which doesn't solve the same problem but still helps with creating clarity in the DAG. I would like a config setting to exclude the optional inputs from the DAG. Is that something useful to support or better to try and implement myself?
t
In the short term, options 1 and 2 are most manageable, but other options aren't impossible. This is because of graphviz limitations. "Grouped inputs" are actually a single node that contains a table in the `label` attribute (see Graphviz docs). These tables use an HTML-like syntax specific to Graphviz. To style individual inputs of a grouped input, we would need to expose the node `label` attribute. Conversely, creating an individual node for each input, and potentially using subgraphs to group them, generally creates messier layouts and a lot more variation between reruns of the same viz since it's non-deterministic.
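For illustration, option 4 (one grouped node, with optional inputs in a different font) could look roughly like the sketch below. It builds a Graphviz HTML-like table label as a plain string and does not touch Hamilton's visualization code; the helper `grouped_inputs_label` and the choice of italics for optional inputs are assumptions.

```python
from html import escape
from typing import Dict


def grouped_inputs_label(inputs: Dict[str, bool]) -> str:
    """Build a Graphviz HTML-like table label; `inputs` maps name -> is_optional."""
    rows = []
    for name, is_optional in inputs.items():
        text = escape(name)
        if is_optional:
            text = f"<I>{text}</I>"  # italics mark an input that has a default
        rows.append(f'<TR><TD ALIGN="LEFT">{text}</TD></TR>')
    # HTML-like labels are delimited by an extra pair of angle brackets
    return '<<TABLE BORDER="0">' + "".join(rows) + "</TABLE>>"


label = grouped_inputs_label({"a": False, "b": True})
dot = f"digraph {{ inputs [shape=box, style=dashed, label={label}]; inputs -> test; }}"
print(dot)
```

The printed DOT source can be rendered with the `dot` command-line tool; the whole group remains a single node, so the layout stays as stable as with grouped inputs.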
r
And would excluding optional inputs from the DAG be an option in the short term?
t
Option #2 requires two changes:
1. support ungrouping input nodes:
   ◦ this isn't too difficult, and it would resolve another issue with `deduplicate_inputs`, which is sometimes unintuitive
2. create new logic to identify default inputs (you could style or hide them):
   ◦ in short, the ad hoc logic we shared earlier in this thread to identify default inputs wouldn't work. I can give you an example workaround, but the API will likely change once we decide how to support default values
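As a rough picture of what those two changes would enable, the sketch below emits one DOT node statement per input and either styles or hides the optional ones. It assumes the optional/required flag has already been computed somehow; `input_node_lines`, its `hide_optional` flag, and the dashed style are hypothetical and are not existing Hamilton options.

```python
from typing import Dict, List


def input_node_lines(inputs: Dict[str, bool], hide_optional: bool = False) -> List[str]:
    """Emit one DOT node statement per input; `inputs` maps name -> is_optional."""
    lines = []
    for name, is_optional in inputs.items():
        if is_optional and hide_optional:
            continue  # drop inputs that have a default from the picture
        if is_optional:
            attrs = 'style="dashed,filled", fillcolor="#c1f5cf"'
        else:
            attrs = 'style="dashed"'
        lines.append(f'"{name}" [{attrs}];')
    return lines


styled = input_node_lines({"a": False, "b": True})
hidden = input_node_lines({"a": False, "b": True}, hide_optional=True)
print("\n".join(styled))
print("\n".join(hidden))
```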
@ThreadScribe can you summarize this discussion into a structured GitHub issue for me?
t
🪄 Summoning the summary elves... one moment!
Hey @Thierry Jean, your summary is ready!

### Clarifying Optional Inputs for Hamilton Driver

In a recent discussion, the team explored ways to visually distinguish optional inputs from required ones in a Hamilton driver. Here are the key points from the conversation:

- The request was to make it clear to the user that certain inputs to the driver are optional. This was in context of differentiating "optional" from "required" inputs in the visualizations.
- A utility function was implemented in the `hamilton.plugins.h_experiments.ExperimentTracker` to get the default parameters of a node. This function inspects the function signature to distinguish default values for inputs.
- The team also discussed the possibility of surfacing all the configuration keys, required inputs, and optional inputs to automatically generate a "configuration space" of all possible DAGs. This could potentially help with ML experiments or enable DAG authors to constrain the input space to valid configurations for downstream users.
- Discussions included moving certain methods to different modules or properties for better accessibility and usage.
- Sample code snippets were shared for identifying and styling input nodes with default values, along with the challenges and options for applying different styles to input nodes.
- Short-term solutions were discussed, including support for ungrouping input nodes and creating new logic to identify default inputs for styling or hiding.
- Graphviz limitations were highlighted, along with the potential challenges in implementing certain display options.
- It was noted that excluding optional inputs from the DAG displayed a complex set of challenges and limitations, particularly in relation to Graphviz and the current API logic.

The team also agreed to summarize the discussion into a structured GitHub issue to further track the progress of the potential changes. Feel free to refer to the complete discussion for additional context.
You can also view your Scribe HERE 👀
t
@Stefan Krawczyk this was pretty fast, only took seconds