Slackbot
03/27/2023, 1:49 PM
Elijah Ben Izzy
03/27/2023, 3:13 PM
Michal Siedlaczek
03/27/2023, 3:14 PM
Elijah Ben Izzy
03/27/2023, 5:02 PM
Michal Siedlaczek
03/27/2023, 9:00 PM
Michal Siedlaczek
03/27/2023, 9:01 PM
Elijah Ben Izzy
03/27/2023, 9:04 PM
Michal Siedlaczek
03/27/2023, 9:13 PM
Michal Siedlaczek
03/27/2023, 9:15 PM
"I can put some pseudocode to make this a little clearer later tonight!"
thanks, but don't worry about it for now. I think I have a good-enough idea; I'll just need to maybe try it on a small example to get a closer look. I'll come back with some concrete questions if I hit an obstacle 🙂
Elijah Ben Izzy
03/27/2023, 10:01 PM
Stefan Krawczyk
03/28/2023, 3:59 AM
Michal Siedlaczek
03/29/2023, 10:16 PM
Michal Siedlaczek
03/29/2023, 10:17 PM
label for node resolution
Michal Siedlaczek
03/29/2023, 10:17 PM
Stefan Krawczyk
03/29/2023, 10:37 PM
Michal Siedlaczek
03/29/2023, 10:38 PM
Michal Siedlaczek
03/29/2023, 10:39 PM
Stefan Krawczyk
03/29/2023, 10:39 PM
Michal Siedlaczek
03/29/2023, 10:39 PM
Michal Siedlaczek
03/29/2023, 10:40 PM
"how many places do you want to use the caching?"
didn't count, but might be 10-ish
Michal Siedlaczek
03/29/2023, 10:40 PM
Michal Siedlaczek
03/29/2023, 10:42 PM
Elijah Ben Izzy
03/29/2023, 10:43 PM
Stefan Krawczyk
03/29/2023, 10:44 PM
@config.when(use_cache="False")
def foo__compute(...) -> pd.DataFrame:
    df = ...  # compute it
    cache_df(df, name=...)
    return df

@config.when(use_cache="True")
def foo__cached() -> pd.DataFrame:
    df = load_df(name=...)
    return df
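The two branches above call cache_df and load_df, which the thread never defines. A minimal sketch of what such helpers might look like, assuming one pickle file per name in a local directory. The directory (CACHE_DIR) and the pickle format are assumptions made for illustration, not something the thread specifies:

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical cache location; the thread does not say where files should live.
CACHE_DIR = Path(tempfile.gettempdir()) / "hamilton_cache_sketch"


def cache_df(df, name: str) -> Path:
    """Write `df` (any picklable object) to CACHE_DIR/<name>.pkl and return the path."""
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    path = CACHE_DIR / f"{name}.pkl"
    with path.open("wb") as f:
        pickle.dump(df, f)
    return path


def load_df(name: str):
    """Read back whatever cache_df stored under `name`."""
    with (CACHE_DIR / f"{name}.pkl").open("rb") as f:
        return pickle.load(f)
```

For real DataFrames, a columnar format such as Parquet would probably be a better fit than pickle; pickle is used here only to keep the sketch free of third-party dependencies.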
Stefan Krawczyk
03/29/2023, 10:45 PM
Michal Siedlaczek
03/29/2023, 10:45 PM
Michal Siedlaczek
03/29/2023, 10:46 PM
load_cache one really doesn't require a function body, but it is annoying that I have to define it
Michal Siedlaczek
03/29/2023, 10:47 PM
Michal Siedlaczek
03/29/2023, 10:47 PM
Stefan Krawczyk
03/29/2023, 10:48 PM
foo would also work, but then whether that works well depends on how you want things to be configurable
Stefan Krawczyk
03/29/2023, 10:50 PM
@my_cache_decorator
def foo(...) -> pd.DataFrame:
    df = ...
    return df
and if done correctly Hamilton would still crawl this and create the node foo.
It’d just check whether the cached file exists: load from cache if it does, else compute it.
Stefan Krawczyk
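The decorator idea described in the message above (load from cache if the file exists, else compute) could be sketched roughly as follows. This is only an illustration under stated assumptions: the cache directory, the pickle format, and keying the file on the function name are all choices made here, and my_cache_decorator is not an existing Hamilton API. functools.wraps is used so the wrapper keeps the original name and signature, which is presumably what "done correctly" requires for Hamilton to still see the node foo.

```python
import functools
import pickle
import tempfile
from pathlib import Path

# Hypothetical cache location, chosen only for this sketch.
CACHE_DIR = Path(tempfile.gettempdir()) / "hamilton_cache_sketch"


def my_cache_decorator(fn):
    """Return fn's cached result if a file for it exists, else compute and store it."""
    # The cache key is the function name only; arguments are ignored,
    # so a stale file is returned if the inputs change.
    path = CACHE_DIR / f"{fn.__name__}.pkl"

    @functools.wraps(fn)  # preserves __name__, docstring, and the visible signature
    def wrapper(*args, **kwargs):
        if path.exists():
            with path.open("rb") as f:
                return pickle.load(f)
        result = fn(*args, **kwargs)
        CACHE_DIR.mkdir(parents=True, exist_ok=True)
        with path.open("wb") as f:
            pickle.dump(result, f)
        return result

    return wrapper


calls = []  # records real computations, to show the second call is served from disk


@my_cache_decorator
def foo(n: int) -> list:
    """Stand-in for an expensive computation."""
    calls.append(n)
    return list(range(n))
```

Because the file name comes from the function name alone, this sketch has no invalidation: deleting the file is the only way to force a recompute.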
03/29/2023, 10:51 PM
Michal Siedlaczek
03/29/2023, 10:52 PM
Stefan Krawczyk
03/29/2023, 10:53 PM
config could also work; you can get the function name by just doing fn.__name__, I believe.
Michal Siedlaczek
03/29/2023, 10:54 PM
fn.__name__ when writing/loading the file -- that works
Michal Siedlaczek
03/29/2023, 10:55 PM
Michal Siedlaczek
03/29/2023, 10:55 PM
Michal Siedlaczek
03/29/2023, 10:55 PM
Stefan Krawczyk
03/29/2023, 10:55 PM
Michal Siedlaczek
03/29/2023, 10:56 PM
Michal Siedlaczek
03/29/2023, 10:57 PM
Stefan Krawczyk
03/29/2023, 11:00 PM
Michal Siedlaczek
03/29/2023, 11:08 PM
Elijah Ben Izzy
03/30/2023, 2:14 PM
target to the caching function, then we could load up when we actually run it…
@checkpoint(target="foo")
@extract_columns('bar', 'baz')
def foo() -> pd.DataFrame:
    """Some expensive computation"""
But, currently, this requires some surgery into the way it works (particularly being aware of nodes)
Stefan Krawczyk
03/30/2023, 2:19 PM
Elijah Ben Izzy
03/30/2023, 3:34 PM
overrides. You’d still have to use the function name (so it only works for functions whose name matches the node they produce), but then you’d only need a single decorator/implementation, and you’d get short-circuiting as well.
Michal Siedlaczek
03/30/2023, 4:28 PM
Elijah Ben Izzy
03/30/2023, 5:36 PM
Michal Siedlaczek
03/30/2023, 9:54 PM
@extract_fields for example, but I work around it.
Michal Siedlaczek
03/31/2023, 2:19 PM
Michal Siedlaczek
03/31/2023, 2:21 PM
Elijah Ben Izzy
03/31/2023, 3:08 PM
Elijah Ben Izzy
03/31/2023, 3:09 PM
Michal Siedlaczek
03/31/2023, 6:57 PM
Stefan Krawczyk
03/31/2023, 7:59 PM
Michal Siedlaczek
03/31/2023, 7:59 PM
Elijah Ben Izzy
03/31/2023, 9:16 PM
Michal Siedlaczek
03/31/2023, 9:54 PM
"Working now on a caching mechanism that’s built in; will likely take a different approach, but I want to ensure that it has at least as much functionality as yours"
I'm definitely interested in seeing what comes out of it. I'm sure my approach is limited, including in some ways I'm not even aware of, but in my use case it seems to be working great so far.
Elijah Ben Izzy
03/31/2023, 9:55 PM
Michal Siedlaczek
03/31/2023, 9:55 PM
Michal Siedlaczek
03/31/2023, 9:56 PM
Michal Siedlaczek
03/31/2023, 9:57 PM
Michal Siedlaczek
03/31/2023, 9:58 PM
Elijah Ben Izzy
03/31/2023, 10:01 PM
Elijah Ben Izzy
03/31/2023, 10:01 PM
Michal Siedlaczek
03/31/2023, 10:10 PM
Michal Siedlaczek
03/31/2023, 10:11 PM
Michal Siedlaczek
03/31/2023, 10:13 PM
Michal Siedlaczek
03/31/2023, 10:13 PM
Michal Siedlaczek
03/31/2023, 10:14 PM
Elijah Ben Izzy
03/31/2023, 10:16 PM
Michal Siedlaczek
03/31/2023, 10:28 PM
Michal Siedlaczek
03/31/2023, 10:29 PM
Michal Siedlaczek
03/31/2023, 10:29 PM
Michal Siedlaczek
03/31/2023, 10:30 PM
Michal Siedlaczek
03/31/2023, 10:31 PM
--cache flag or something)
Elijah Ben Izzy
03/31/2023, 10:31 PM
Elijah Ben Izzy
03/31/2023, 10:31 PM
Stefan Krawczyk
03/31/2023, 10:32 PM
Michal Siedlaczek
03/31/2023, 10:32 PM
Michal Siedlaczek
03/31/2023, 10:32 PM
Michal Siedlaczek
03/31/2023, 10:33 PM
Michal Siedlaczek
03/31/2023, 10:35 PM