# hamilton-help
e
Hey! Hamilton uses type hints in two ways: (1) it validates that the inputs are the right type, and (2) it validates that the types of connected nodes match. That said, @Stefan Krawczyk thought about how to make it a looser type-check as an option: https://github.com/stitchfix/hamilton/issues/181. Our hope is that it's relatively unobtrusive and makes your code more readable, but we'd be curious why you want to turn it off 🙂
s
Clarification on (1): we validate that inputs to the DAG (not each function call) match what's expected when you call `execute()`.
But otherwise it’s used for DAG construction (point (2)). It’s straightforward to augment — depends on what you want to achieve / model 🙂
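To make point (2) concrete, here is a minimal sketch of what matching type hints across an edge could look like, including how a union would loosen it. This is not Hamilton's actual implementation: `accepts`, `check_edges`, and the stand-in `Series` / `Array` classes are hypothetical names made up for illustration, and it assumes (as in Hamilton) that a parameter name refers to the function of the same name.

```python
import inspect
from typing import Union, get_args, get_origin, get_type_hints


class Series:  # hypothetical stand-in for pd.Series, keeps the sketch stdlib-only
    pass


class Array:  # hypothetical stand-in for np.ndarray
    pass


def accepts(param_type, produced_type) -> bool:
    """True if a value annotated `produced_type` may feed a parameter annotated `param_type`."""
    if get_origin(param_type) is Union:
        # A union parameter accepts the value if any of its options does.
        return any(accepts(option, produced_type) for option in get_args(param_type))
    return param_type is produced_type


def check_edges(functions) -> list:
    """Return (producer, consumer) name pairs whose type hints do not line up."""
    by_name = {fn.__name__: fn for fn in functions}
    mismatches = []
    for fn in functions:
        hints = get_type_hints(fn)
        for param in inspect.signature(fn).parameters:
            producer = by_name.get(param)
            if producer is None:
                continue  # not produced by another function, so it is an input to the DAG
            produced = get_type_hints(producer)["return"]
            if not accepts(hints[param], produced):
                mismatches.append((param, fn.__name__))
    return mismatches


def spend() -> Array:
    return Array()


def strict_total(spend: Series) -> Series:
    return Series()


def loose_total(spend: Union[Series, Array]) -> Series:
    return Series()


print(check_edges([spend, strict_total]))  # the Array -> Series edge is reported
print(check_edges([spend, loose_total]))   # the union accepts Array, nothing reported
```

The sketch only handles a union on the consuming side; a producer that itself returns a union would need extra handling.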
z
thanks - I think we were debating whether to have functions/nodes return pd.Series vs np.arrays, and realized that we'd have to do a bunch of 'refactoring' to make that work due to the type hints (unless we declare them as unions from the get-go). A couple of follow-ups:
1. If our DAG outputs are now numpy arrays instead of pd.Series, will the driver still produce a pd.DataFrame? If not, will it return a dictionary instead? (I guess we can always turn it into a df with `from_dict`.)
2. If the outputs are a mix of pd.Series and other types (ints, etc.), will the driver automatically fall back to returning dictionaries? I.e., does it concatenate into a df only if everything is a pd.Series, and otherwise return a dict?
s
> realized that we'd have to do a bunch of 'refactoring' to make that work
Using search and replace to put in a bunch of unions shouldn't be hard? Otherwise you can define a graph adapter to make them equivalent. Let me know if you want code for that. Regarding (1): if you're using the default driver and adapter then yes, since the following seems to work.
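```python
import numpy as np
import pandas as pd

# A dict of equal-length numpy arrays converts cleanly into a DataFrame.
a = {'a': np.array([1, 2, 3]), 'b': np.array([3, 4, 5])}
print(pd.DataFrame(a))
#    a  b
# 0  1  3
# 1  2  4
# 2  3  5
```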
If you want a dictionary of outputs then you need to pass an adapter to the driver that will do that. Sorry, I see we're lacking in docs here, but the idea can be seen in https://hamilton-docs.gitbook.io/docs/reference/api-reference/available-drivers and you change step 3 …
Regarding (2): with the default driver and adapter, it will try to do https://github.com/stitchfix/hamilton/blob/main/hamilton/base.py#L172, so it relies on Pandas being able to stitch things together appropriately. Otherwise it will fail.
For both (1) & (2), if you want a dictionary back, you need to instantiate https://github.com/stitchfix/hamilton/blob/main/hamilton/base.py#L35, pass that to https://github.com/stitchfix/hamilton/blob/main/hamilton/base.py#L333, and in turn pass that to the driver. One of the design goals of Hamilton was that you should hopefully only have to touch/change driver code, and leave logic untouched…
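To get a feel for when Pandas can and can't stitch mixed outputs together, here is a rough sketch that only exercises `pd.DataFrame` itself. It assumes the default result builder hands the dict of outputs straight to `pd.DataFrame`, which I haven't re-checked against the linked line; `try_build` is a hypothetical helper, not part of Hamilton.

```python
import numpy as np
import pandas as pd


def try_build(outputs: dict):
    """Return the DataFrame pandas builds from `outputs`, or the ValueError it raises."""
    try:
        return pd.DataFrame(outputs)
    except ValueError as err:
        return err


# Equal-length numpy arrays: pandas builds a frame with a default integer index.
arrays = try_build({"a": np.array([1, 2, 3]), "b": np.array([3, 4, 5])})

# A Series plus a scalar: the scalar is broadcast down the Series' index.
mixed = try_build({"a": pd.Series([1, 2, 3]), "n": 7})

# Only scalars: pandas has no index to build on and raises ValueError.
scalars = try_build({"n": 7, "m": 8})

# Arrays of different lengths: pandas raises ValueError.
ragged = try_build({"a": np.array([1, 2, 3]), "b": np.array([1, 2])})

for name, result in [("arrays", arrays), ("mixed", mixed),
                     ("scalars", scalars), ("ragged", ragged)]:
    print(name, "->", type(result).__name__)
```

So a mix of Series and ints can still come back as a DataFrame, but there is no automatic fallback to a dict in this sketch: the cases Pandas can't handle simply raise.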