This message was deleted Hamilton Open Source #hamilton-help


# hamilton-help

Slackbot

12/06/2022, 10:29 PM

This message was deleted.

👀 1

Stefan Krawczyk

12/06/2022, 11:16 PM

Thanks for the question! To confirm my understanding, you:

1. have a process that generates dataframes, e.g.

import pandas as pd

def sample_field1(n: int) -> pd.DataFrame:
    return pd.DataFrame({'field1': [38115, 71525, 84920, 25997]})

2. want to cross join N of these dataframes.
3. would like to understand how to model this with Hamilton.

Is that correct?

Stefan Krawczyk

12/06/2022, 11:19 PM

You could do something like:

1. Define all the “sample functions” explicitly:

import pandas as pd

def sample_field1(n: int) -> pd.DataFrame:
    return pd.DataFrame({'field1': [38115, 71525, 84920, 25997]})

def sample_field2(...) -> pd.DataFrame:
    ...

def sample_field3(...) -> pd.DataFrame:
    ...

2. Define the cross join explicitly:

def cross_join_of_fields(sample_field1: pd.DataFrame, sample_field2: pd.DataFrame, sample_field3: pd.DataFrame) -> pd.DataFrame:
    # outputs a new dataframe that is the cross-join (merge with how='cross')
    return sample_field1.merge(sample_field2, how='cross').merge(sample_field3, how='cross')

3. Use it downstream:

def some_other_function(cross_join_of_fields: pd.DataFrame) -> ...:
    ...
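
For reference, the cross join in step 2 can be written in plain pandas so that the join logic stays the same however many sample dataframes are passed in. This is a minimal sketch, not Hamilton API: `cross_join_all` is a hypothetical helper name, the `field2` and `field3` values are made up for illustration, and `how="cross"` assumes pandas 1.2 or later.

```python
from functools import reduce

import pandas as pd


def cross_join_all(*frames: pd.DataFrame) -> pd.DataFrame:
    """Cross join any number of dataframes, left to right (hypothetical helper)."""
    if not frames:
        raise ValueError("need at least one dataframe")
    # Each step pairs every row accumulated so far with every row of the next frame.
    return reduce(lambda left, right: left.merge(right, how="cross"), frames)


sample_field1 = pd.DataFrame({"field1": [38115, 71525, 84920, 25997]})
sample_field2 = pd.DataFrame({"field2": ["a", "b"]})        # made-up values
sample_field3 = pd.DataFrame({"field3": [1.5, 2.5, 3.5]})   # made-up values

joined = cross_join_all(sample_field1, sample_field2, sample_field3)
print(joined.shape)
```

Inside a Hamilton function such as cross_join_of_fields, the inputs would still have to be named explicitly in the signature; only the body could delegate to a helper like this.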

Stefan Krawczyk

12/06/2022, 11:26 PM

There are a few other ways to write the above, especially the sample functions (e.g. using @parameterize). Otherwise, for the cross join function, we require you to be explicit in naming the inputs to the function. If you end up having to change that function often, then please chime in on this issue, which could help in this case.

Baldo Faieta

12/07/2022, 1:49 AM

Thanks for the response. I guess I was hoping there would be a way to avoid specifying the cross join every time. I can see that something like what is discussed in issue #226 would work. I'll try working with what is here, see how much of a pain it is, and report back.

👍 1
