# hamilton-help
e
Hey, a few approaches. AFK but I can show you code samples later.
1. You can do so in the functions, but you have to maintain the join index with the series. Then you may need to write a custom results builder to do the join if there’s funky logic (e.g. you might want a chained left merge). I’d recommend testing this out.
2. You can always set the values you don’t want to NaN (or some sentinel value) in the functions so the index stays the same, then remove them in the results builder or in a post-processing step.
3. You can write a function that accepts the upstream series and joins them the way you want. Effectively (1), but part of the DAG rather than outside it.
j
Thanks for the approaches :) Code samples would be really helpful :))
e
So yeah! Here’s what it looks like (pseudocode):
1. Results builder: see https://hamilton.dagworks.io/en/latest/reference/api-extensions/custom-result-builders/ (should be easy enough to adapt).
2. NaNs:
res = dr.execute(["ID", "customer_id", "first_name"], ...)
res = res.dropna() # You should figure out the best way to do this -- maybe across just some columns?
3. Joining in functions:
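import pandas as pd

def final_result(ID: pd.Series, customer_id: pd.Series, first_name: pd.Series, filter_name: str) -> pd.DataFrame:
    # axis=1 puts the series side by side as columns, aligned on their index;
    # keys= names the columns in case the series themselves are unnamed
    df = pd.concat([ID, customer_id, first_name], axis=1, keys=["ID", "customer_id", "first_name"])
    return df[df.first_name == filter_name]

res = dr.execute(["final_result"], ...)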
OTOH, the default results builder might just work! Worth a try. All it does is a concat (I think), and pandas is smart about indices:
>>> a = pd.Series(index=[1,2,3], data=['a','b','c'])
>>> b = pd.Series(index=[2,3,4], data=['e', 'f', 'g'])
>>> pd.concat([a,b], axis=1)
     0    1
1    a  NaN
2    b    e
3    c    f
4  NaN    g
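Side note on the concat above: if you only want the labels that show up in both series, `pd.concat` accepts `join="inner"`. A minimal sketch reusing the same two series (the variable name `both` is just for illustration):

```python
import pandas as pd

a = pd.Series(index=[1, 2, 3], data=["a", "b", "c"])
b = pd.Series(index=[2, 3, 4], data=["e", "f", "g"])

# join="inner" keeps the intersection of the two indices instead of the union,
# so labels 1 and 4 (present in only one series) are dropped.
both = pd.concat([a, b], axis=1, join="inner")
print(both)
```

The default is `join="outer"`, which is what produces the NaN-padded frame shown above.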
So, if you manage the index carefully, this will just… work:
>>> def a() -> pd.Series:
...     return pd.Series(index=[1,2,3], data=['a','b','c'])
...
>>> def b() -> pd.Series:
...     return pd.Series(index=[2,3,4], data=['e', 'f', 'g'])
...
>>> from hamilton import driver
>>> from hamilton.ad_hoc_utils import create_temporary_module
>>> dr = driver.Driver({}, create_temporary_module(a, b))
>>> dr.execute(["a","b"])
     a    b
1    a  NaN
2    b    e
3    c    f
4  NaN    g
>>> dr.execute(["a","b"]).dropna()
   a  b
2  b  e
3  c  f
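For approach 2 (NaN as a marker), here is a rough sketch in plain pandas of what the masking function could look like. The function name `first_name_filtered` and the sample data are made up for illustration; the DataFrame is built by hand here as a stand-in for whatever the driver returns.

```python
import pandas as pd

def first_name_filtered(first_name: pd.Series, filter_name: str) -> pd.Series:
    """Keep the index intact and blank out the rows that do not match."""
    # Series.where keeps values where the condition is True and puts NaN elsewhere.
    return first_name.where(first_name == filter_name)

# Made-up sample data sharing one index.
first_name = pd.Series(["ann", "bob", "ann"], index=[10, 11, 12])
customer_id = pd.Series([1, 2, 3], index=[10, 11, 12])

df = pd.DataFrame({
    "customer_id": customer_id,
    "first_name": first_name_filtered(first_name, "ann"),
})

# Drop rows only where the masked column is NaN, not where any column is NaN.
result = df.dropna(subset=["first_name"])
print(result)
```

Using `subset=` keeps the drop targeted, which matters if other columns can legitimately contain NaN.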
That said, managing indices can be a little tricky, so you may want to consider building your own results builder to handle edge cases and make it more explicit what’s happening.
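To make the chained left merge idea concrete, here is a sketch of the join logic on its own, in plain pandas. `chained_left_join` is a hypothetical helper, not something Hamilton ships; a custom results builder could call something like it, but the wiring into Hamilton is left out here (see the docs link above for that).

```python
from functools import reduce

import pandas as pd

def chained_left_join(**outputs: pd.Series) -> pd.DataFrame:
    """Left-join every series onto the first one, in the order given (hypothetical helper)."""
    # Turn each series into a one-column frame named after its keyword.
    frames = [series.rename(name).to_frame() for name, series in outputs.items()]
    # Each step is a left join, so only the first output's index survives.
    return reduce(lambda left, right: left.join(right, how="left"), frames)

a = pd.Series(["a", "b", "c"], index=[1, 2, 3])
b = pd.Series(["e", "f", "g"], index=[2, 3, 4])

joined = chained_left_join(a=a, b=b)
print(joined)
```

Unlike the outer concat, this keeps exactly the rows of `a`: label 4 from `b` is dropped and label 1 gets NaN in column `b`.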