Slackbot
10/31/2022, 5:13 PM
Harry Wrightson
10/31/2022, 5:16 PM
Stefan Krawczyk
10/31/2022, 5:17 PM
def model_function(col1: pd.Series, ..., colN: pd.Series) -> ...:
    # update the function signature for each and every column we want for the model
That makes it very clear, from a change-management perspective, when things are changing, but yes, it requires an update any time something changes, and it may be verbose to write out.
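To illustrate the change-management point: when every column is named in the signature, the model's inputs can be read straight off the function. This is only a minimal sketch using the standard library, with plain lists standing in for pd.Series and hypothetical column names (age, income) in place of the elided col1 ... colN.

```python
import inspect
from typing import List


# Hypothetical stand-in for model_function: every model input is an explicit parameter.
def model_function(age: List[float], income: List[float]) -> List[List[float]]:
    # Assemble one feature row per observation from the named columns.
    return [list(row) for row in zip(age, income)]


# The signature itself documents which columns feed the model.
model_inputs = list(inspect.signature(model_function).parameters)
print(model_inputs)
```

Adding or dropping a column then shows up as a small, reviewable diff to the signature.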
Question, just to understand a bit better before answering more: what’s the pain for you? Development? Or is this going to change frequently, and if so, how frequently?
Harry Wrightson
10/31/2022, 5:26 PM
Stefan Krawczyk
10/31/2022, 5:32 PM
def data_set(col1: pd.Series, ..., colN: pd.Series) -> pd.DataFrame:
    # this function describes the columns that go into the data set.
    # logic to create the dataframe
    return df

@extract_fields('train_set', 'test_set')
def train_test_split(data_set: pd.DataFrame, split_ratio: float, ...) -> Dict[str, pd.DataFrame]:
    # logic to split the data_set
    return {'train_set': train_df, 'test_set': test_df}

def train_model(train_set: pd.DataFrame, ...) -> ...:
    # fit the model...
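To make the wiring in that sketch concrete, here is a rough, runnable toy version. It is not Hamilton's implementation or API: plain lists and dicts stand in for pandas objects, the FIELDS mapping is a hypothetical stand-in for what @extract_fields does, and col2, the ordered split, and the mean "model" are placeholders for the elided logic. It only shows how parameter names can tie data_set, train_test_split, and train_model together.

```python
import inspect
from typing import Any, Dict, List


def data_set(col1: List[float], col2: List[float]) -> List[Dict[str, float]]:
    # Describes which columns go into the data set (one dict per row).
    return [{"col1": a, "col2": b} for a, b in zip(col1, col2)]


def train_test_split(data_set: List[Dict[str, float]], split_ratio: float) -> Dict[str, list]:
    # Split rows in order; the first split_ratio share becomes the training set.
    cut = int(len(data_set) * split_ratio)
    return {"train_set": data_set[:cut], "test_set": data_set[cut:]}


def train_model(train_set: List[Dict[str, float]]) -> float:
    # Placeholder "model": the mean of col1 over the training rows.
    return sum(row["col1"] for row in train_set) / len(train_set)


# Hypothetical stand-in for @extract_fields: maps each field to the function producing it.
FIELDS = {"train_set": "train_test_split", "test_set": "train_test_split"}
FUNCS = {f.__name__: f for f in (data_set, train_test_split, train_model)}


def resolve(name: str, values: Dict[str, Any]) -> Any:
    # Compute `name` by matching parameter names to known values or functions.
    if name in values:
        return values[name]
    if name in FIELDS:
        # Pull a single field out of the dict returned by the producing function.
        values[name] = resolve(FIELDS[name], values)[name]
    else:
        fn = FUNCS[name]
        kwargs = {p: resolve(p, values) for p in inspect.signature(fn).parameters}
        values[name] = fn(**kwargs)
    return values[name]


inputs = {"col1": [1.0, 2.0, 3.0, 4.0], "col2": [10.0, 20.0, 30.0, 40.0], "split_ratio": 0.5}
model = resolve("train_model", dict(inputs))
print(model)
```

Because train_model only names train_set, the columns are listed once, in data_set, and nothing downstream of it needs to change when a column is added.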
You can see some of this structure in the scikit-learn example.
Stefan Krawczyk
10/31/2022, 5:32 PM
Stefan Krawczyk
10/31/2022, 5:35 PM
Harry Wrightson
10/31/2022, 5:40 PM
Stefan Krawczyk
10/31/2022, 5:41 PM
Harry Wrightson
10/31/2022, 5:52 PM
Wit Jakuczun
11/01/2022, 9:45 AM
Stefan Krawczyk
11/01/2022, 5:37 PM