# hamilton-help
s
So you want to create multiple functions that output dataframes, that you would then want to run extract_columns on to expose those columns?
b
Yes, I'm already passing in a list of dicts to
@parameterize
(as an expanded dict comprehension), ideally I could just specify multiple output columns there.
s
🤔 hmm. Would have to think about this one. Challenge would be to ensure things are still evident as to what is going on and thus readable. That said,
extract_columns
is just syntactic sugar for:
import pandas as pd

def column_a(my_df: pd.DataFrame) -> pd.Series:
    return my_df['column_a']

def column_b(my_df: pd.DataFrame) -> pd.Series:
    return my_df['column_b']
Which I think (I would need to write some code to prove this to myself) you could write as a separate parameterize function itself, rather than sticking it all into one parameterize function.
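To make the desugaring concrete, here is a minimal plain-Python sketch. It is not Hamilton's implementation: `make_column_extractors` is a hypothetical helper, and a dict of lists stands in for the dataframe so the snippet runs without pandas or Hamilton. It only shows the idea of generating one named accessor per column from a single definition.

```python
from typing import Callable, Dict, List

# Stand-in for a dataframe: column name -> list of values (assumption, not pandas).
Table = Dict[str, List[float]]


def make_column_extractors(columns: List[str]) -> Dict[str, Callable[[Table], List[float]]]:
    """Build one accessor function per column, named after that column."""
    extractors: Dict[str, Callable[[Table], List[float]]] = {}
    for name in columns:
        # Bind `name` as a default argument so each function keeps its own column.
        def extract(my_df: Table, _name: str = name) -> List[float]:
            return my_df[_name]

        extract.__name__ = name
        extractors[name] = extract
    return extractors


my_df = {"column_a": [1.0, 2.0], "column_b": [3.0, 4.0]}
extractors = make_column_extractors(["column_a", "column_b"])
column_a = extractors["column_a"](my_df)
column_b = extractors["column_b"](my_df)
```

Each generated function behaves like one of the hand-written `column_a` / `column_b` functions above; the real decorators do the equivalent wiring inside Hamilton's function graph.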
e
Building on what @Stefan Krawczyk said, going from memory: doing it that exact way is going to be tricky. Mechanically it should work, but the names will conflict with each other, e.g. we'll extract the same columns on all parameterizations. Does each parameterization produce the same set of columns? Or different ones? There are a few approaches I can think of: 1. Split into two: have a function for each parameterization that's an identity with extract_columns. 2. Fold it into a single parameterization where the function only returns the column (as Stefan is suggesting). Not at my computer now but I'll be mulling this over.
Also highly recommend trying the new
@parameterize
decorator — allows for both values and inputs :)
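The naming conflict mentioned above can be seen in a small plain-Python sketch. Everything here is hypothetical and independent of Hamilton: `disaggregate` stands in for a multi-output function, `expand` stands in for running it once per parameterization, and a dict of lists replaces the dataframe. The point is only that the outputs collide unless each parameterization gets its own output names.

```python
from typing import Dict, List

Table = Dict[str, List[int]]


def disaggregate(total: List[int]) -> Table:
    """Hypothetical multi-output function: splits each total into three columns."""
    return {
        "low": [t // 4 for t in total],
        "mid": [t // 2 for t in total],
        "high": [t - t // 4 - t // 2 for t in total],
    }


def expand(parameterizations: Dict[str, List[int]], rename: bool) -> Table:
    """Run every parameterization and flatten all outputs into one namespace."""
    nodes: Table = {}
    for param_name, total in parameterizations.items():
        for column, values in disaggregate(total).items():
            # Without renaming, every parameterization writes to the same keys.
            key = f"{param_name}_{column}" if rename else column
            nodes[key] = values
    return nodes


params = {"sales": [8, 12], "costs": [4, 20]}
clashing = expand(params, rename=False)  # "costs" overwrites the "sales" outputs
distinct = expand(params, rename=True)   # one output per (parameterization, column)
```

With `rename=False` only three outputs survive; with `rename=True` there are six, one uniquely named output per parameterization and column. Both approaches listed above are ways of getting to the second situation.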
b
Each parameterization produces a different set of three columns. (if that's what you're asking?) I'll try stuff out and see what I can come up with.
(I'm already using
@parameterize
, it's great -- although the docs are a little confusing still, they talk about
source()
and
value()
initially and then about
upstream()
and
literal()
, are they the same things? Or is upstream/literal an old way of writing it?)
e
Yep, what I was asking. And ugh, need to fix it! We settled on source and value, upstream and literal are older. We switched halfway through making it. Will fix the docs, thanks!
s
[edit] what @Elijah Ben Izzy said [/edit]. If you have time please feel free to create an issue for this — else I’ll try to get to this in the afternoon, if not early next week.
or @Elijah Ben Izzy do you have this one?
e
Yep I can handle it soon 👍
OK, tried to make it a little clearer. We still use `literal`/`upstream` internally and in some places to describe it, but the APIs in the documentation are made consistent in this PR: https://github.com/stitchfix/hamilton/pull/192
s
@Elijah Ben Izzy do you need to update gitbook too?
e
Think I got it but will check tomorrow
Gitbook had it right
m
For those searching for combined
@parameterize_sources
and
@extract_columns
functionality (like I was): https://github.com/stitchfix/hamilton/issues/196
e
Hey Michael -- I'll be digging into that soon. In the meantime, feel free to post your thoughts + use-case! The more general we can make it and the more use-cases we think about before building, the happier we'll be.
m
Hey Elijah, I think the use cases described in the issue were pretty much what I was thinking: a multi-input and multi-output function that is called multiple times through use of a parameterization. The approach I would take now is to use
@parameterize_sources
with a function that outputs a DataFrame and then unique functions to extract the columns (there is a good example in the issue that shows this with the my_disaggregator functions with
@extract_columns
).
s
In case anyone is wondering, there's a branch up with functionality for "Using tables/dataframes for parameterization" (issue 196). If you want to play around with it, see this comment. We'd love any feedback.
e
Yeah! So I'd love for y'all to take the API for a spin -- it's not super polished yet, but it would be great to get the community's take on what the API should look like from first principles.