Slackbot
10/17/2023, 10:04 PM
Elijah Ben Izzy
10/17/2023, 10:08 PM
def hourly_results(
    period_start: pd.Series,
    revenue: pd.Series,
    price: pd.Series
) -> pd.DataFrame:
    return pd.merge(...)

def daily_results(hourly_results: pd.DataFrame) -> pd.DataFrame:
    return group_by_day(hourly_results)
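The pd.merge(...) arguments and the group_by_day helper are not shown in the thread. A minimal pandas-only sketch of what they might do, assuming the three series share one index, that period_start holds hourly timestamps, and that a day's revenue is summed while its price is averaged. The pd.concat call, the aggregation choices, and the sample data are all assumptions, not the original code:

```python
import pandas as pd


def hourly_results(
    period_start: pd.Series, revenue: pd.Series, price: pd.Series
) -> pd.DataFrame:
    # Assumption: the inputs share an index, so a column-wise concat
    # stands in for the elided pd.merge(...) call.
    return pd.concat(
        {"period_start": period_start, "revenue": revenue, "price": price},
        axis=1,
    )


def group_by_day(hourly: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical helper: bucket rows by calendar day, then sum revenue
    # and average price within each day.
    day = hourly["period_start"].dt.floor("D").rename("day")
    return hourly.groupby(day).agg(
        revenue=("revenue", "sum"), price=("price", "mean")
    )


def daily_results(hourly_results: pd.DataFrame) -> pd.DataFrame:
    return group_by_day(hourly_results)


# Two days of made-up hourly data.
period_start = pd.Series(pd.date_range("2023-10-17", periods=48, freq="60min"))
revenue = pd.Series([1.0] * 48)
price = pd.Series([2.0] * 48)

daily = daily_results(hourly_results(period_start, revenue, price))
```

Here the functions are called by hand; in the setup discussed in this thread, the framework would wire daily_results to hourly_results through the matching parameter name.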
Then you can call it as you want, or call out hourly_results or daily_results as an output.
Elijah Ben Izzy
10/17/2023, 10:09 PM
Alec Hewitt
10/17/2023, 10:10 PM
Elijah Ben Izzy
10/17/2023, 10:13 PM
def revenue_hourly(revenue: pd.Series) -> pd.Series:
    return ...

def price_hourly(price: pd.Series) -> pd.Series:
    return ...

dr.execute(['revenue_hourly', 'price_hourly'], ...)
You can also mess with the naming. Given that you have only two, it makes sense to do them individually, but parameterization/extraction decorators can always help.
Alec Hewitt
10/17/2023, 10:13 PM
daily_results is a Pandera schema. E.g.:
def daily_results(hourly_results: DataFrame[HourlyPanderaSchema]) -> DataFrame[DailyPanderaSchema]:
    ...
Alec Hewitt
10/17/2023, 10:14 PM
Alec Hewitt
10/17/2023, 10:15 PM
dr.execute(['revenue_hourly', 'price_hourly'], ...)
I still want my final data frame to have just revenue and price columns.
Elijah Ben Izzy
10/17/2023, 10:18 PM
check_output decorator:
@check_output(schema=DailyPanderaSchema())
def daily_results(...):
    ...
Unfortunately we don’t yet have the ability to use output type-hinting, but I think we could figure that out actually.
Re: renaming, you have two options:
1. Change it so revenue is the final name, and the prior one is called revenue_daily.
2. Rename later, after the fact: df.columns = […]. You can also make this part of a dataframe, e.g. query hourly_results that changes the name and joins the series, kind of like what I suggested above.
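Option 2 can be sketched in plain pandas. The frame below is a made-up stand-in for whatever comes back when revenue_hourly and price_hourly are requested as outputs; the values and the explicit rename mapping are assumptions for illustration:

```python
import pandas as pd

# Stand-in for the executed result: columns are named after the requested outputs.
result = pd.DataFrame(
    {"revenue_hourly": [10.0, 12.5], "price_hourly": [3.0, 3.1]}
)

# Rename after the fact so the final frame exposes plain "revenue" and "price".
final = result.rename(
    columns={"revenue_hourly": "revenue", "price_hourly": "price"}
)
```

An explicit mapping is a little safer than assigning to df.columns directly, since it does not depend on the order of the columns.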
You could also do a custom result builder if you want. There are a few ways to decouple the name from the exact column generated, but these two are the simplest.
Alec Hewitt
10/17/2023, 10:25 PM
def revenue_daily(raw_revenue: pd.Series) -> pd.Series:
    # Sum by Day
    ...

def revenue_hourly(raw_revenue: pd.Series) -> pd.Series:
    # Sum by Hour
    ...

@config.when(granularity=HOURLY)
def revenue__hourly_granularity(revenue_hourly: pd.Series) -> pd.Series:
    return revenue_hourly

@config.when(granularity=DAILY)
def revenue__daily_granularity(revenue_daily: pd.Series) -> pd.Series:
    return revenue_daily
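The "Sum by Day" and "Sum by Hour" bodies are left out above. One way they might look, assuming raw_revenue is indexed by timestamp (a DatetimeIndex) at a finer-than-hourly resolution. The resample calls and the sample data are assumptions, not code from the thread:

```python
import pandas as pd


def revenue_daily(raw_revenue: pd.Series) -> pd.Series:
    # Sum by day; requires a DatetimeIndex.
    return raw_revenue.resample("D").sum()


def revenue_hourly(raw_revenue: pd.Series) -> pd.Series:
    # Sum by hour; "60min" is used as the hourly bucket size.
    return raw_revenue.resample("60min").sum()


# Made-up input: one unit of revenue every 30 minutes for two days.
index = pd.date_range("2023-10-17", periods=96, freq="30min")
raw_revenue = pd.Series(1.0, index=index)

hourly = revenue_hourly(raw_revenue)
daily = revenue_daily(raw_revenue)
```

With bodies like these in place, the two @config.when variants above only select which of the two series ends up exposed under the name revenue.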
Elijah Ben Izzy
10/17/2023, 10:28 PM
Alec Hewitt
10/17/2023, 10:29 PM
Elijah Ben Izzy
10/17/2023, 10:30 PM
Alec Hewitt
10/17/2023, 10:32 PM
Elijah Ben Izzy
10/17/2023, 10:32 PM