This message was deleted Hamilton Open Source #general


# general

Slackbot

12/20/2023, 7:03 PM

This message was deleted.

Roel Bertens

12/21/2023, 8:17 AM

Is there an overview of the current capabilities somewhere? I would be interested to be able to use different types of nodes e.g. by using a decorator to make the diagram easier to read for specific use cases.

Elijah Ben Izzy

12/21/2023, 6:11 PM

@Roel Bertens and I caught up offline. To close the loop: we don’t have all of them documented, and they’re a bit spread out. But there are currently a few different functions that should be well-documented:
• Driver.display_all_functions
• Driver.display_downstream_of
• Driver.display_upstream_of
• Driver.visualize_path_between
• Driver.visualize_materialization
@Roel Bertens is scoping out adding schema information to make viz easier.

Thierry Jean

12/21/2023, 6:45 PM

• would like to have them all named with the same convention, `display_`
• infer file type from name/path
• pipe graphviz to avoid generating a DOT file
• create a structured object `HamiltonDisplayConfig` that's well documented and type-annotated and that encompasses all of the public interface (args, kwargs, render_kwargs, graph_kwargs)
• advanced: provide string alias for long type annotations
• advanced: enable theming (requires structured objects and config)
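A minimal sketch of what such a structured object could look like, combining the "infer file type from name/path" idea with grouped keyword arguments. `DisplayConfig`, its field names, and the `file_format` property are hypothetical and not part of Hamilton's API.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict, Optional


@dataclass
class DisplayConfig:
    """Hypothetical config object for the display functions (not a Hamilton API)."""

    output_file_path: Optional[str] = None
    render_kwargs: Dict[str, Any] = field(default_factory=dict)
    graph_kwargs: Dict[str, Any] = field(default_factory=dict)

    @property
    def file_format(self) -> Optional[str]:
        """Return the output format, inferred from the path suffix if not set explicitly."""
        explicit = self.render_kwargs.get("format")
        if explicit:
            return explicit  # an explicit format wins over the inferred one
        if self.output_file_path is None:
            return None
        # "out/dag.png" -> "png"; a path without a suffix yields None
        return Path(self.output_file_path).suffix.lstrip(".") or None


config = DisplayConfig(output_file_path="out/dag.png", graph_kwargs={"rankdir": "LR"})
print(config.file_format)
```

For the "avoid generating a DOT file" point, the `graphviz` Python package offers `Digraph.pipe(format=...)`, which returns the rendered bytes directly; a config like this could feed its inferred format into that call.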

miek

12/25/2023, 7:29 PM

+1 on avoiding the .dot file generation and render .png directly

Arthur Andres

12/29/2023, 9:22 AM

Is there a way to render inputs as nodes (with a different color), instead of having them floating (and duplicated) next to each node?

Stefan Krawczyk

12/29/2023, 5:08 PM

there’s a flag that might help. deduplicate_inputs=True

🔥 2

👍 1

Stefan Krawczyk

12/29/2023, 5:10 PM

otherwise yeah we’re thinking about how to bring more customization to visualization

Roel Bertens

01/09/2024, 7:10 AM

@Stefan Krawczyk any news on this topic? I want to be able to tag nodes with different types to give them a different color to show some more structure in the DAG. Are there similar ideas or examples of this already? We could e.g. @tag to define a type for the node and use a config to choose the colors corresponding to each type. Or we could tag with colors directly which could be automatically picked up. But then maybe a special decorator for this would be better. Then you could also include a name to display in the legend for that color too. E.g. @custom_node_type(color=.., name=..) What do you think?
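To make the decorator proposal concrete, here is a rough pure-Python sketch. `custom_node_type`, the `NODE_TYPES` registry, and the `node_type` attribute are all hypothetical; nothing here is an existing Hamilton decorator, and the example functions are made up.

```python
from typing import Callable, Dict

# Hypothetical registry: node-type name -> legend entry (label and fill color).
NODE_TYPES: Dict[str, Dict[str, str]] = {}


def custom_node_type(color: str, name: str) -> Callable:
    """Hypothetical decorator (not a Hamilton API): attach a display type to a function."""

    def decorator(fn: Callable) -> Callable:
        NODE_TYPES[name] = {"label": name, "fillcolor": color}  # one legend entry per type
        fn.node_type = name  # metadata a visualizer could read later
        return fn

    return decorator


@custom_node_type(color="lightgray", name="input data")
def raw_orders() -> list:
    return [12.5, 7.0, 30.0]


@custom_node_type(color="aquamarine", name="output data")
def total_revenue(raw_orders: list) -> float:
    return sum(raw_orders)


def node_style(fn: Callable) -> Dict[str, str]:
    """Look up the Graphviz attributes for a decorated function."""
    return {"fillcolor": NODE_TYPES[fn.node_type]["fillcolor"]}


print(node_style(total_revenue))
print(sorted(NODE_TYPES))
```

Because the registry already holds a label and a color per type, the same data could drive both the node styling and the legend.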

Elijah Ben Izzy

01/09/2024, 3:30 PM

@Roel Bertens interesting — to flesh it out more: What types do you want? Toy example maybe? I like the idea of tagging categories/metadata, or passing in a tag to hamilton to categorize/putting it in the legend, just want to make it a bit more concrete. The other possibility is outputting the .dot file and modifying that if you want to work quickly/build something more custom. Should be pretty easy to x-reference that with the node tags
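A sketch of the "output the .dot file and modify it" route, using only the standard library. It relies on the DOT rule that a repeated node statement adds attributes to an existing node. The function name `style_dot_source`, the sample graph, and the tag values are illustrative assumptions, not Hamilton output.

```python
from typing import Dict


def style_dot_source(
    dot_source: str,
    node_tags: Dict[str, Dict[str, str]],
    stylesheet: Dict[str, Dict[str, str]],
    tag_key: str = "level",
) -> str:
    """Append node statements that restyle tagged nodes in an existing DOT graph."""
    statements = []
    for node_name, tags in node_tags.items():
        style = stylesheet.get(tags.get(tag_key, ""))
        if not style:
            continue  # untagged node, or tag value without a style
        attrs = " ".join(f'{key}="{value}"' for key, value in style.items())
        statements.append(f'\t"{node_name}" [{attrs}]')
    if not statements:
        return dot_source
    # Insert the new statements just before the graph's final closing brace.
    head, brace, tail = dot_source.rpartition("}")
    if not brace:
        raise ValueError("DOT source has no closing brace")
    return head + "\n".join(statements) + "\n" + brace + tail


dot_source = (
    "digraph {\n"
    "\tnode [style=filled]\n"
    "\traw_orders -> total_revenue\n"
    "}\n"
)
node_tags = {"raw_orders": {"level": "input"}, "total_revenue": {"level": "final"}}
stylesheet = {"input": {"fillcolor": "lightgray"}, "final": {"fillcolor": "aquamarine"}}

styled = style_dot_source(dot_source, node_tags, stylesheet)
print(styled)
```

The result can be pasted into an online Graphviz editor or rendered with the `dot` command line tool.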

Roel Bertens

01/09/2024, 3:52 PM

My example: I have three different types of nodes: input data, intermediate data and output data. I want to give them a different look such that it is easier to take all the info from the diagram as a user

Roel Bertens

01/09/2024, 3:53 PM

input data is source data. Intermediate is pretty clear. And output data (features) is what is ready for the users to use

Thierry Jean

01/09/2024, 5:01 PM

I think this should get you started! The general idea is to define a stylesheet containing Graphviz attributes and parse the tags from `driver.list_available_variables()` to edit the output of `driver.display_*`, which is a `graphviz.Digraph` object.
full solution: https://gist.github.com/zilto/e034d12cf6d632f0f3ea9c3686830ce6
interactive browser graphviz editor: https://edotor.net/

```python
# main.py
import graphviz
from hamilton import driver

import functions

# customize graphviz render: <https://graphviz.org/docs/nodes/>
# careful with overwriting string attributes; fillcolor should be safe
level_stylesheet = dict(
    intermediate=dict(
        level="intermediate",  # add arbitrary metadata to the DOT file; could collide with graphviz attributes
        fillcolor="royalblue",  # edits the style
    ),
    final=dict(
        level="final",
        fillcolor="aquamarine",
    ),
)


dr = driver.Builder().with_modules(functions).build()
g: graphviz.Digraph = dr.display_all_functions()

# restyle every node whose "level" tag has an entry in the stylesheet
for v in dr.list_available_variables():
    if (level := v.tags.get("level")) in level_stylesheet:
        g.node(v.name, **level_stylesheet[level])
```

Stefan Krawczyk

01/09/2024, 5:43 PM

🤔 setting up a discussion on enabling node level styling — https://github.com/DAGWorks-Inc/hamilton/discussions/624

Roel Bertens

01/09/2024, 5:52 PM

Thanks @Thierry Jean for the example! Am I correct that this doesn’t include the different styling in the legend?

Stefan Krawczyk

01/09/2024, 5:58 PM

the legend is a “subgraph” property on the graphviz object — so I assume it’s accessible to be modified/added to (will look up code in a bit / wait for Thierry).

Thierry Jean

01/09/2024, 6:14 PM

As a temporary solution (hoping to support it better), you can add this section

```python
# continues from main.py above: reuses `dr`, `g`, and `level_stylesheet`
for v in dr.list_available_variables():
    if (level := v.tags.get("level")) in level_stylesheet:
        g.node(v.name, **level_stylesheet[level])

# the style used for Function nodes
default_node_style = dict(
    shape="rectangle",
    margin="0.15",
    style="rounded,filled",
    fillcolor="#b4d8e4",
    fontname="Helvetica",
)

# `cluster__legend` is the name of the legend subgraph
with g.subgraph(name='cluster__legend') as legend_subgraph:
    for level, style in level_stylesheet.items():
        legend_node = dict(**default_node_style)  # set default style
        legend_node.update(**style)  # update default style with stylesheet
        legend_subgraph.node(level, label=level, **legend_node)
```

Roel Bertens

01/09/2024, 6:33 PM

Awesome thanks for the quick replies.

Thierry Jean

01/09/2024, 6:35 PM

No problem! It's good to get a sense of what viz features are useful so we can scope how to move forward

Stefan Krawczyk

01/09/2024, 6:38 PM

@Roel Bertens FYI - tags include the python module a function was in, in case that’s helpful.

Roel Bertens

01/09/2024, 7:00 PM

It is becoming a bit hacky but for the short term I've also added this to remove 'function' from the legend because all my nodes have another style.

```python
# drop the default 'function' entry from the legend by matching its exact DOT line
to_remove = '\t\tfunction [fillcolor="#b4d8e4" fontname=Helvetica margin=0.15 shape=rectangle style="rounded,filled"]\n'
g.body = [line for line in g.body if line != to_remove]
```
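A slightly less brittle variant of the same workaround might match on the node name and indentation instead of the full attribute string, so it survives changes to the default style. This is only a sketch: it assumes legend entries are indented with two tabs, as in the string above, and the sample `body` lines are made up for illustration.

```python
import re
from typing import List

# Matches a node statement named `function` at the legend's indentation (two tabs).
LEGEND_FUNCTION_NODE = re.compile(r"^\t\tfunction \[")


def drop_legend_function_entry(body: List[str]) -> List[str]:
    """Return the DOT body lines without the legend's default `function` entry."""
    return [line for line in body if not LEGEND_FUNCTION_NODE.match(line)]


body = [
    '\tfunction [label="a top-level node that happens to be named function"]\n',
    "\tsubgraph cluster__legend {\n",
    '\t\tfunction [fillcolor="#b4d8e4" fontname=Helvetica margin=0.15 shape=rectangle style="rounded,filled"]\n',
    "\t\tfinal [label=final fillcolor=aquamarine]\n",
    "\t}\n",
]

cleaned = drop_legend_function_entry(body)
print("".join(cleaned))
```

With a real graph this would be applied as `g.body = drop_legend_function_entry(g.body)`. Any other subgraph containing a node named `function` at the same indentation would be affected too.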

👍 1

Thierry Jean

01/09/2024, 7:12 PM

In the future, we could have custom styling generated before the legend so the legend always matches what's displayed!

Roel Bertens

01/09/2024, 7:29 PM

I'm running into another issue. When I use display_upstream_of I can't iterate over the variables using list_available_variables because then I get also the variables that are not shown. Any quick tip to only get the ones in the graph? Besides parsing the g.body

Thierry Jean

01/09/2024, 7:31 PM

You should be able to use a similar workflow with `driver.what_is_upstream_of(VAR_NAME_1, VAR_NAME_2, ...)`, which returns variable objects like `list_available_variables()`. There is also `what_is_downstream_of()` and `what_is_the_path_between()`.
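A sketch of how that workflow could be wrapped in a helper that only touches the variables it is given. `style_nodes` and `RecordingGraph` are hypothetical names; the helper assumes each variable exposes `.name` and `.tags` and that the graph exposes `.node(name, **attrs)`, as `graphviz.Digraph` does. Stand-in objects are used here so the snippet runs without Hamilton installed.

```python
from types import SimpleNamespace
from typing import Any, Dict, Iterable


def style_nodes(
    graph: Any,
    variables: Iterable[Any],
    stylesheet: Dict[str, Dict[str, str]],
    tag_key: str = "level",
) -> int:
    """Restyle only the given variables; return how many nodes were styled."""
    styled = 0
    for variable in variables:
        level = variable.tags.get(tag_key)
        if level in stylesheet:
            graph.node(variable.name, **stylesheet[level])
            styled += 1
    return styled


class RecordingGraph:
    """Stand-in for graphviz.Digraph that records node() calls."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Dict[str, str]] = {}

    def node(self, name: str, **attrs: str) -> None:
        self.nodes[name] = attrs


# Stand-ins for what `dr.what_is_upstream_of("total_revenue")` might return.
upstream = [
    SimpleNamespace(name="raw_orders", tags={"level": "input"}),
    SimpleNamespace(name="cleaned_orders", tags={"module": "functions"}),
]
stylesheet = {"input": {"fillcolor": "lightgray"}, "final": {"fillcolor": "aquamarine"}}

graph = RecordingGraph()
count = style_nodes(graph, upstream, stylesheet)
print(count, graph.nodes)
```

With a real driver, the call would presumably look like `style_nodes(dr.display_upstream_of("total_revenue"), dr.what_is_upstream_of("total_revenue"), level_stylesheet)`, so nodes outside the displayed subgraph are never added.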

👍 1

🙌 1

Stefan Krawczyk

01/15/2024, 11:26 PM

Okay I have a prototype up https://github.com/DAGWorks-Inc/hamilton/pull/642

👍 2

Thierry Jean

01/15/2024, 11:35 PM

https://github.com/DAGWorks-Inc/hamilton/pull/635#issuecomment-1892873768
