# hamilton-help
g
I'm trying out the `hamilton.plugins.h_diskcache.DiskCacheAdapter` on an Azure ML Compute (Linux), but I'm getting a 'database is locked' error:
```python
from hamilton.plugins import h_diskcache
from hamilton.driver import Builder

cache_adapter = h_diskcache.DiskCacheAdapter()

builder = (Builder()
           .with_adapters(cache_adapter)
           )
```
It runs for about a minute and then outputs:
```
---------------------------------------------------------------------------
OperationalError                          Traceback (most recent call last)
Cell In[9], line 4
      1 from hamilton.plugins import h_diskcache
      2 from hamilton.driver import Builder
----> 4 cache_adapter = h_diskcache.DiskCacheAdapter()
      6 builder = (Builder()
      7            .with_adapters(cache_adapter)
      8            )

File /anaconda/envs/pdf-env/lib/python3.10/site-packages/hamilton/plugins/h_diskcache.py:84, in DiskCacheAdapter.__init__(self, cache_vars, cache_path, **cache_settings)
     82 self.cache_vars = cache_vars if cache_vars else []
     83 self.cache_path = cache_path
---> 84 self.cache = diskcache.Cache(directory=cache_path, **cache_settings)
     85 self.nodes_history: Dict[str, List[str]] = self.cache.get(
     86     key=DiskCacheAdapter.nodes_history_key, default=dict()
     87 )  # type: ignore
     88 self.used_nodes_hash: Dict[str, str] = dict()

File /anaconda/envs/pdf-env/lib/python3.10/site-packages/diskcache/core.py:478, in Cache.__init__(self, directory, timeout, disk, **settings)
    476 for key, value in sorted(sets.items()):
    477     if key.startswith('sqlite_'):
--> 478         self.reset(key, value, update=False)
    480 sql(
    481     'CREATE TABLE IF NOT EXISTS Settings ('
    482     ' key TEXT NOT NULL UNIQUE,'
    483     ' value)'
    484 )

File /anaconda/envs/pdf-env/lib/python3.10/site-packages/diskcache/core.py:2438, in Cache.reset(self, key, value, update)
   2436     update = True
   2437 if update:
-> 2438     sql('PRAGMA %s = %s' % (pragma, value)).fetchall()
   2439     break
   2440 except sqlite3.OperationalError as exc:

OperationalError: database is locked
```
Any ideas?
s
try moving the "db". If you look under `.` (the directory where the code ran) there should be a file
it could be something to do with the state of the DB
or try providing a new `cache_path` to `DiskCacheAdapter`
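a minimal sketch of pointing the cache at a fresh local directory (the directory name below is just an example; `cache_path` is the adapter's constructor argument per the traceback):

```python
import os
import tempfile

# Pick a fresh directory on the VM's local disk: tempfile.gettempdir()
# is usually /tmp on Linux, which is local rather than an Azure mount.
local_cache_dir = os.path.join(tempfile.gettempdir(), "hamilton_cache")
os.makedirs(local_cache_dir, exist_ok=True)

# Then hand it to the adapter (requires hamilton + diskcache installed):
# from hamilton.plugins import h_diskcache
# cache_adapter = h_diskcache.DiskCacheAdapter(cache_path=local_cache_dir)
```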
g
I see cache.db. When I change the `cache_path`, it does the same (same error)
s
is there a local filesystem mounted? can you verify you can write to that path?
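one way to check, sketched with just the stdlib — diskcache backs its cache with SQLite, and SQLite needs working file locks, which network mounts often don't provide (the probe function and filename below are made up for illustration):

```python
import os
import sqlite3
import tempfile


def check_sqlite_ok(directory: str) -> bool:
    """Try to create and write an SQLite db in `directory`.

    diskcache stores its cache in an SQLite file, so if this probe
    fails with 'database is locked', the adapter will too.
    """
    db_path = os.path.join(directory, "lock_probe.db")
    try:
        conn = sqlite3.connect(db_path, timeout=5)
        conn.execute("CREATE TABLE IF NOT EXISTS probe (k TEXT)")
        conn.execute("INSERT INTO probe VALUES ('x')")
        conn.commit()
        conn.close()
        return True
    except sqlite3.OperationalError:
        return False
    finally:
        if os.path.exists(db_path):
            os.remove(db_path)


# Run it once against local disk, then against the mounted path:
print(check_sqlite_ok(tempfile.gettempdir()))
```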
g
It's interesting... as long as the notebook kernel is running, I can't delete the file. Only when I restart the kernel can I delete cache.db. Maybe something keeps the db "open" at the filesystem level. The filesystem is mounted via Azure ML Compute; I write to that path all the time.
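a stdlib sketch of what's likely happening: an SQLite connection holds the db file open for the life of the kernel, and some network mounts refuse to delete an open file. The traceback shows the adapter stores its `diskcache.Cache` on `self.cache`, and `diskcache.Cache` has a `close()` method, so closing it should release the handle (the `cache_adapter.cache.close()` line is a hypothesis based on that, not tested here):

```python
import os
import sqlite3
import tempfile

# Simulate diskcache: an open SQLite connection keeps the db file open.
db_path = os.path.join(tempfile.gettempdir(), "cache_demo.db")
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE IF NOT EXISTS t (k TEXT)")
conn.commit()

# Closing the connection releases the handle; deletion works again.
conn.close()
os.remove(db_path)

# Analogously (hypothetical, based on the traceback's self.cache):
# cache_adapter.cache.close()  # releases diskcache's SQLite handle
```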
s
ah — interesting — I’ll try to recreate it. You’re running it via VSCode or via Jupyter?
g
VSCode (Notebook) On Linux (Azure ML Compute)
s
what’s the easiest way to replicate this?
you run something, and stop it before it completes and then you try to create the adapter again? or?
or just try to create the adapter twice?
g
Just to make sure we understand each other: my problem is that I can't get it to work at all. Regarding the deletion: I opened a VSCode notebook, ran the code above, and waited about a minute until it gave me the error. From that moment I couldn't delete the cache.db file until I restarted the kernel. Hope this helps!
👍 1