# hamilton-help
s
Welcome & thanks for the questions! I’ll give the TL;DR response first, then ask a few clarifying questions.

> Something similar to @config.when() but not based on the static configuration but dynamically evaluated based on another node

Not right now. We could enable something like this, but there are a few edge cases to think through. Otherwise there’s usually a non-dynamic way to structure things.

> Also, is it possible to not execute a node if this node has an input that has itself not been computed?

Hamilton won’t compute a node unless it’s on the computational path to a requested output, and then only once all of its inputs have been computed.

Questions:
• Bigger picture: what are you trying to avoid/optimize for with this dynamism?
• What do you want to encode in the DAG exactly? Would each motor have a corresponding node, or would it be a parameterization of the DAG, i.e. you pass in the motor ID as input?
• How complex is the DAG?
• If it’s a parameterization, would the DAG be run in a loop, or something else?
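
To make the point about the computational path concrete, here is a minimal plain-Python sketch of the idea. It is not Hamilton's implementation: the `execute` helper and the three node functions are hypothetical, and it only mimics the convention that a function's parameter names refer to other nodes or to inputs.

```python
import inspect

# Hypothetical nodes: parameter names refer to inputs or to other nodes.
def motor_off(motor_id: int) -> bool:
    return motor_id == 0  # made-up rule, for illustration only

def active_power(motor_id: int) -> float:
    return 10.0 * motor_id

def load_percentage(active_power: float) -> float:
    return active_power / 2.0

NODES = {fn.__name__: fn for fn in (motor_off, active_power, load_percentage)}

def execute(outputs, inputs):
    """Compute only the requested outputs and whatever they depend on."""
    computed = dict(inputs)
    order = []  # records which nodes actually ran, in order

    def resolve(name):
        if name in computed:
            return computed[name]
        fn = NODES[name]
        # Every input of a node is resolved before the node itself runs.
        kwargs = {p: resolve(p) for p in inspect.signature(fn).parameters}
        computed[name] = fn(**kwargs)
        order.append(name)
        return computed[name]

    return {name: resolve(name) for name in outputs}, order

result, order = execute(["load_percentage"], {"motor_id": 3})
print(result, order)
```

Requesting only `load_percentage` runs `active_power` first and never touches `motor_off`, because `motor_off` is not upstream of the requested output.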
n
Thank you for your quick response. Here are some clarifying elements:
• I have some computations to perform that do not make sense when some preliminary conditions are not met (i.e. computing the active power of a motor or checking its state when the motor is actually turned off).
• I pass a motor ID at the input and all the nodes of the DAG compute elements corresponding to this particular motor.
• Not very complex, but I would prefer not to pass all previous nodes as input to subsequent nodes. i.e. in my example, I should pass the result of the first node ("motor_off") to all subsequent nodes to discard the nodes computation if the motor is off.
• Yes. Right now, the Hamilton DAG is run for every motor individually at the "micro level". We also have a "macro level" DAG using Metaflow that orchestrates the computations for every motor (and at every timestamp coming from our sensors that provide the input data to the "micro" DAG).
👍 1
s
I have more questions!

> I have some computations to perform that do not make sense when some preliminary conditions are not met (i.e. computing the active power of a motor or checking its state when the motor is actually turned off).

Are these ever numeric values, or are they just boolean in nature? Do you want to “short-circuit” the DAG early (i.e. stop computation of a particular path), or go down some alternate path? Is this condition always at the “start” of the DAG, or not? Also, what do you want returned when the motor is off, or when some part is not computable? E.g.
```python
result = dr.execute(["foo", "bar", "baz", ...], inputs={"motor_id": 1})
print(result)  # ???
```

> Not very complex, but I would prefer not to pass all previous nodes as input to subsequent nodes. i.e. in my example, I should pass the result of the first node (“motor_off”) to all subsequent nodes to discard the nodes computation if the motor is off.

To clarify, you mean not doing this:
```python
# Every function takes motor_on and bails out itself.
def transform_foo(motor_on: bool, ...) -> ...:
    if not motor_on:
        return None
    ...

def transform_bar(motor_on: bool, ...) -> ...:
    if not motor_on:
        return None
    ...
```
or this, checking for None values on all inputs (or some other sentinel value):
```python
# Only the first function checks motor_on; the rest propagate None.
def transform_foo(motor_on: bool, ...) -> ...:
    if not motor_on:
        return None
    ...

def transform_theta(transform_foo: float, ...) -> ...:
    if transform_foo is None or ... is None:  # and so on for each input
        return None
    ...
```
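
One way to cut down the repeated `None` checks in the second variant is a small decorator. This is a minimal plain-Python sketch, not a Hamilton feature: `skip_if_none`, `current`, and `offset` are hypothetical names, and whether Hamilton builds the DAG correctly from a wrapped function is an assumption that has not been verified here.

```python
import functools
from typing import Optional

def skip_if_none(fn):
    """Return None without calling fn when any argument is None."""
    @functools.wraps(fn)  # keeps the wrapped function's name and signature metadata
    def wrapper(*args, **kwargs):
        if any(a is None for a in args) or any(v is None for v in kwargs.values()):
            return None
        return fn(*args, **kwargs)
    return wrapper

def transform_foo(motor_on: bool, current: float) -> Optional[float]:
    # The single place where the motor state is checked.
    if not motor_on:
        return None
    return current * 2.0

@skip_if_none
def transform_theta(transform_foo: float, offset: float) -> float:
    # No explicit None check needed; the decorator handles it.
    return transform_foo + offset

theta_on = transform_theta(transform_foo(True, 1.5), offset=1.0)
theta_off = transform_theta(transform_foo(False, 1.5), offset=1.0)
print(theta_on, theta_off)
```

The sentinel still flows through every downstream node, so this only removes boilerplate; it does not stop the nodes from being visited.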
n
You are correct. I would like to short-circuit parts of the DAG based on results from previous nodes. All subsequent "short-circuited" nodes would just not return anything, even if they are in the requested outputs, and as a result not appear in the results dataframe. For now I had the idea of chaining multiple DAGs together and conditionally execute them based on the result of previous DAGs. E.g. DAG1 would check the consistency of the data (among other things check if the motor is turned off) and if the data is consistent, I would compute some new features in DAG2 (power, load percentage, etc.).
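
A minimal sketch of that chaining idea, under stated assumptions: `run_checks` and `run_features` are hypothetical stand-ins for executing DAG1 and DAG2 (in practice each would wrap its own Hamilton driver call), and the formulas inside them are made up for illustration.

```python
def run_checks(motor_id: int) -> dict:
    # Stand-in for DAG1: data consistency checks, including the motor state.
    return {"motor_off": motor_id == 0}  # made-up rule

def run_features(motor_id: int) -> dict:
    # Stand-in for DAG2: features that only make sense for a running motor.
    return {"active_power": 10.0 * motor_id, "load_percentage": 5.0 * motor_id}

def run_motor(motor_id: int) -> dict:
    """Run DAG1, then DAG2 only if the checks allow it."""
    result = run_checks(motor_id)
    if not result["motor_off"]:
        result.update(run_features(motor_id))
    return result

print(run_motor(0))  # only the check results
print(run_motor(2))  # check results plus features
```

When the motor is off, the feature keys are simply absent from the result, which matches the "short-circuited nodes do not appear" behavior described above.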
s
> For now I had the idea of chaining multiple DAGs together and conditionally execute them based on the result of previous DAGs. E.g. DAG1 would check the consistency of the data (among other things check if the motor is turned off) and if the data is consistent, I would compute some new features in DAG2 (power, load percentage, etc.).

That sounds pretty reasonable, and also understandable to someone coming to read the code. In terms of short-circuiting, are you looking to create a dataframe as an output, or a dictionary, or some custom object? If we were to have short-circuiting, would you still need checks similar to the multiple-chaining approach, since you’d have to determine what was or wasn’t computed after the Hamilton run 🤔? (Just thinking through what’s clearer and easier to maintain.)
n
A pandas dataframe is probably the easiest, although a dict could work as well since in my case I only have a single row in my dataframe within the "micro DAG". In my case, the checks could be made on the results output dataframe/dict (the missing columns/keys indicate which nodes have not been run).
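
For the dict case, that check could look like the sketch below. The `not_computed` helper and the example values are hypothetical; with a single-row dataframe the same membership test would run against its column names instead of the dict keys.

```python
def not_computed(requested, result):
    """List the requested outputs that are missing from the result."""
    return [name for name in requested if name not in result]

requested = ["motor_off", "active_power", "load_percentage"]
result = {"motor_off": True}  # example result for a motor that is off

skipped = not_computed(requested, result)
print(skipped)
```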
s
Right. Did you consider vectorization? Or are you doing it per motor to try to parallelize things?
n
Yes, the Hamilton "micro DAG" is per motor, and the parallelization is performed on the "macro DAG" run by Metaflow.
s
Cool. Hamilton does have some parallelization capabilities that could work here 🤔 (applicability dependent on the logic, etc) — here’s a quick sketch.
n
Super, will try this out! Thank you