Slackbot
04/19/2023, 11:57 AM
Elijah Ben Izzy
04/19/2023, 3:58 PM
config.when: https://hamilton.readthedocs.io/en/latest/reference/api-reference/available-decorators.html#config-when. For instance…
from hamilton.function_modifiers import config

@config.when(region='US')
def function__us(dep_1: ..., dep_2: ...) -> ...:
    """DAG only gets constructed with this version when the region is set to US"""

@config.when_not(region='US')
def function(dep_3: ...) -> ...:
    """DAG gets constructed with this version when the region is not US (default)"""
Then in the driver, you just pass config in as the first argument:
dr = driver.Driver({"region" : "US"})
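To make the selection step concrete, here is a minimal stdlib-only sketch of the idea. It is a toy model, not Hamilton's implementation: `when`, `when_not`, and `resolve_nodes` are hypothetical helpers invented for illustration, and the return strings are placeholders.

```python
from typing import Any, Callable, Dict, List


def when(**conditions: Any) -> Callable:
    """Toy stand-in for config.when: keep the function only if every key matches."""
    def decorator(fn: Callable) -> Callable:
        fn._include = lambda cfg: all(cfg.get(k) == v for k, v in conditions.items())
        return fn
    return decorator


def when_not(**conditions: Any) -> Callable:
    """Toy stand-in for config.when_not: keep the function only if no key matches."""
    def decorator(fn: Callable) -> Callable:
        fn._include = lambda cfg: all(cfg.get(k) != v for k, v in conditions.items())
        return fn
    return decorator


def resolve_nodes(functions: List[Callable], cfg: Dict[str, Any]) -> Dict[str, Callable]:
    """Pick one implementation per node name; the '__suffix' is dropped from the name."""
    nodes: Dict[str, Callable] = {}
    for fn in functions:
        include = getattr(fn, "_include", lambda _cfg: True)
        if include(cfg):
            nodes[fn.__name__.split("__")[0]] = fn
    return nodes


@when(region="US")
def function__us() -> str:
    return "US-specific logic"


@when_not(region="US")
def function() -> str:
    return "default logic"


us_nodes = resolve_nodes([function__us, function], {"region": "US"})
uk_nodes = resolve_nodes([function__us, function], {"region": "UK"})
print(us_nodes["function"]())  # US-specific logic
print(uk_nodes["function"]())  # default logic
```

Either way the graph ends up with a single node called `function`; the config only decides which implementation backs it.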
There are some more advanced features with config if you want: you can actually use it to dynamically resolve any decorator using resolve (https://hamilton.readthedocs.io/en/latest/reference/api-reference/decorators.html#resolve), but I’d stay away from it unless you really need it.
One other thing to note is that config is different from inputs: config is used to build the shape of the DAG, and inputs is used to pass data in (although anything in config is also passed into inputs).
Does this get at what you’re trying to do?
Stefan Krawczyk
04/19/2023, 4:16 PM
Ankush Kundaliya
04/19/2023, 4:39 PM
Elijah Ben Izzy
04/19/2023, 4:49 PM
# common.py
def some_logic(external_input: ...) -> ...:
    ...

# setting_1.py
def external_input() -> ...:
    return SOME_VALUE

# setting_2.py
def external_input() -> ...:
    return SOME_OTHER_VALUE
Then in the driver:
from hamilton import driver
import common, setting_1, setting_2

dr = driver.Driver({}, common, setting_1)
dr_2 = driver.Driver({}, common, setting_2)
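For intuition on why swapping the module swaps the value, here is a rough stdlib-only sketch. It is not Hamilton itself: `collect_nodes` and `compute` are hypothetical helpers, `SimpleNamespace` objects stand in for the three modules, and the values 1 and 10 are made up in place of SOME_VALUE and SOME_OTHER_VALUE.

```python
import inspect
from types import SimpleNamespace
from typing import Any, Callable, Dict


def collect_nodes(*modules: Any) -> Dict[str, Callable]:
    """Gather public callables from each module; the function name is the node name."""
    nodes: Dict[str, Callable] = {}
    for module in modules:
        for name, obj in vars(module).items():
            if callable(obj) and not name.startswith("_"):
                nodes[name] = obj
    return nodes


def compute(nodes: Dict[str, Callable], target: str) -> Any:
    """Resolve a node by recursively computing the nodes named by its parameters."""
    fn = nodes[target]
    kwargs = {p: compute(nodes, p) for p in inspect.signature(fn).parameters}
    return fn(**kwargs)


def some_logic(external_input: int) -> int:
    return external_input * 2


# Stand-ins for common.py, setting_1.py and setting_2.py (values are invented).
common = SimpleNamespace(some_logic=some_logic)
setting_1 = SimpleNamespace(external_input=lambda: 1)
setting_2 = SimpleNamespace(external_input=lambda: 10)

result_1 = compute(collect_nodes(common, setting_1), "some_logic")
result_2 = compute(collect_nodes(common, setting_2), "some_logic")
print(result_1, result_2)  # 2 20
```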
This is pretty simple, but you can imagine having some pretty complex configurations put in.
Ankush Kundaliya
04/19/2023, 5:00 PM
# common.py
@parameterize(
    datasetA={...},
    datasetB={...},
)
def make_dataset(external_input: ...) -> ...:
    ...
Now my DAG has both datasetA and datasetB nodes in all cases, but external_input for datasetA and datasetB will differ from case to case. Can this be configured? I don’t want to split make_dataset into two Hamilton functions, datasetA and datasetB.
Elijah Ben Izzy
04/19/2023, 5:07 PM
@parameterize(
    datasetA={"external_input": source("external_input_dataset_a")},
    datasetB={"external_input": source("external_input_dataset_b")},
)
...
Then, all you need is to pass in external_input_dataset_a and external_input_dataset_b to the driver:
dr = driver.Driver({}, common)  # config is the first argument, modules follow
dr.execute(
    ["datasetA", "datasetB"],
    inputs={"external_input_dataset_a": ..., "external_input_dataset_b": ...},
)
or you can define them in the other modules…
Ankush Kundaliya
04/19/2023, 5:10 PM
Elijah Ben Izzy
04/19/2023, 5:11 PM
Ankush Kundaliya
04/19/2023, 5:13 PM
Elijah Ben Izzy
04/19/2023, 5:15 PM
Ankush Kundaliya
04/19/2023, 5:16 PM
Elijah Ben Izzy
04/19/2023, 5:18 PM
Stefan Krawczyk
04/19/2023, 5:19 PM