This message was deleted Hamilton Open Source #hamilton-help


# hamilton-help

Slackbot

11/17/2022, 7:59 PM

This message was deleted.

Elijah Ben Izzy

11/17/2022, 8:03 PM

Hey! Hamilton uses them in two ways: (1) it validates that the inputs are the right type, and (2) it validates that the types of connected nodes match. That said, @Stefan Krawczyk has thought about how to offer a looser type check as an option: https://github.com/stitchfix/hamilton/issues/181. Our hope is that it's relatively unobtrusive and makes your code more readable, but I'd be curious why you want to turn it off 🙂

Stefan Krawczyk

11/17/2022, 8:09 PM

Clarification on (1): we validate that the inputs to the DAG (not to each function call) match what's expected when you call execute()

🙏 1

Stefan Krawczyk

11/17/2022, 8:10 PM

But otherwise it’s used for DAG construction (point (2)). It’s straightforward to augment — depends on what you want to achieve / model 🙂
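To make point (2) concrete, here is a minimal sketch of how type hints can drive DAG construction. It is not Hamilton's implementation: it only assumes Hamilton's convention that a function's name is the node name and its parameter names refer to upstream nodes. The helper find_type_mismatches and the toy functions are hypothetical, and the check is a strict equality comparison.

```python
import inspect
from typing import Callable, List, get_type_hints


# Toy dataflow: each parameter name refers to the function of the same name.
def raw_values() -> list:
    return [1, 2, 3]


def doubled(raw_values: list) -> list:
    return [v * 2 for v in raw_values]


def total(doubled: list) -> int:
    return sum(doubled)


def label(total: str) -> str:
    # Deliberately wrong: `total` returns int, but this node expects str.
    return "total=" + total


def find_type_mismatches(functions: List[Callable]) -> List[str]:
    """Hypothetical helper: compare each parameter's annotation with the
    return annotation of the upstream function that has the same name."""
    by_name = {fn.__name__: fn for fn in functions}
    problems = []
    for fn in functions:
        hints = get_type_hints(fn)
        for param in inspect.signature(fn).parameters:
            upstream = by_name.get(param)
            if upstream is None:
                continue  # no producing function: treat as an external input
            produced = get_type_hints(upstream).get("return")
            expected = hints.get(param)
            if produced != expected:  # strict check: annotations must be equal
                problems.append(
                    f"{fn.__name__}({param}): expects {expected}, "
                    f"upstream returns {produced}"
                )
    return problems


ok_graph = find_type_mismatches([raw_values, doubled, total])
bad_graph = find_type_mismatches([raw_values, doubled, total, label])
```

Under a strict check like this, switching a node from one return type to another means touching every downstream annotation, which is the refactoring cost discussed in this thread. A looser check (for example, accepting a union that contains the produced type) would be one way to relax it.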

Zouhair Mahboubi

11/17/2022, 11:57 PM

Thanks. I think we were debating whether to have functions/nodes return pd.Series or np.arrays, and realized that we'd have to do a bunch of 'refactoring' to make that work because of the type hints (unless we declare them as unions from the get-go). A couple of follow-ups:
1. If our DAG outputs are now numpy arrays instead of pd.Series, will the driver still produce a pd.DataFrame? If not, will it return a dictionary instead? (I guess we can always turn it into a df with from_dict.)
2. If the outputs are a mix of pd.Series and other types (ints, etc.), will the driver automatically fall back to returning dictionaries? I.e., does it concatenate into a df only if everything is a pd.Series, and otherwise return a dict?

Stefan Krawczyk

11/18/2022, 12:12 AM

realized that we’d have to do a bunch of ‘refactoring’ to make that work

It shouldn't be hard to put in a bunch of unions with search and replace? Otherwise you can define a graph adapter to make them equivalent. Let me know if you want code for that. Regarding (1): if you use the default driver and adapter, then yes, since the following seems to work.

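```python
import numpy as np
import pandas as pd

# A dict of equal-length numpy arrays converts cleanly to a DataFrame.
a = {'a': np.array([1, 2, 3]), 'b': np.array([3, 4, 5])}
print(pd.DataFrame(a))
#    a  b
# 0  1  3
# 1  2  4
# 2  3  5
```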
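For the mixed-output case in question 2, the sketch below shows how the plain pd.DataFrame constructor treats a few combinations. This is pandas behavior only; it assumes nothing about extra handling Hamilton's result builder may add on top. The helper try_build and the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# A Series and an equal-length array combine into one frame.
same_length = pd.DataFrame({"a": pd.Series([1, 2, 3]), "b": np.array([3, 4, 5])})

# A scalar next to a Series is broadcast to every row.
with_scalar = pd.DataFrame({"a": pd.Series([1, 2, 3]), "n": 7})


def try_build(outputs):
    """Hypothetical helper: return the DataFrame, or the error pandas raised."""
    try:
        return pd.DataFrame(outputs)
    except ValueError as err:
        return err


# Arrays of different lengths cannot be stitched together.
mismatched = try_build({"a": np.array([1, 2, 3]), "b": np.array([1, 2])})

# Only scalars and no index: pandas cannot infer the number of rows.
only_scalars = try_build({"x": 1, "y": 2})
```

In the last two cases try_build returns a ValueError instead of a DataFrame, which is the kind of failure to expect when outputs cannot be combined into a single frame.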

If you want a dictionary of outputs, then you need to pass an adapter to the driver that will do that. Sorry, I see we're lacking docs here, but the idea can be seen in https://hamilton-docs.gitbook.io/docs/reference/api-reference/available-drivers and you change step 3 …

Stefan Krawczyk

11/18/2022, 12:15 AM

Regarding (2): with the default driver and adapter, it will try to do https://github.com/stitchfix/hamilton/blob/main/hamilton/base.py#L172, so it relies on Pandas being able to stitch things together appropriately. Otherwise it will fail. For both (1) and (2), if you want a dictionary back, you need to instantiate https://github.com/stitchfix/hamilton/blob/main/hamilton/base.py#L35, pass that to https://github.com/stitchfix/hamilton/blob/main/hamilton/base.py#L333, and in turn pass that to the driver. One of the design goals of Hamilton was that you should hopefully only have to touch/change driver code and leave the logic untouched…

👍 1

❤️ 1
