Hi, I am using materialize, and add a `to.pickle` ...
# hamilton-help
r
Hi, I am using materialize and added a `to.pickle` materializer. I found that the materializer is already added to the graph, but it cannot get `node.version`. The materializer's version is None, so it cannot be sorted. Here is a snapshot:
👀 1
e
Ok so this is a bug that needs to be fixed. Will look when I’m at my computer. @Thierry Jean any thoughts on an easy way to get past this?
r
Genius! Is this because I use it in a weird way? Because I ran https://github.com/DAGWorks-Inc/hamilton/blob/main/examples/experiment_management/run.py many days ago and it worked well.
e
(At least I think it’s a bug, don’t want to commit before I look more :) )
r
Thanks a lot!
e
We made changes that might have brought this up — the weird thing is that version would error before, but this might be a case we didn’t account for
t
Here's the gist of the bug:
• the ExperimentTracker used to hash nodes directly, but this was replaced in PR #734
• now, the version hash is provided via `HamiltonNode.version`
• the new versioning API was designed and tested in the context of the CLI, which doesn't have to deal with materializers
The bug raises some design questions (for versioning, experiment tracking, and caching):
• `HamiltonNode.version` contains information about the code version
• a materializer has a code version based on its class (instead of a function)
• does it make sense to version the data of a materializer? After all, if someone calls for a materializer, they probably expect results to be materialized
👍 1
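To illustrate the symptom described above, here is a minimal sketch in plain Python. It assumes the failure comes from sorting nodes whose version is a hash string alongside a node whose version is None. `VersionedNode` and `sort_nodes_by_version` are hypothetical stand-ins for illustration, not Hamilton's `HamiltonNode` or the actual fix.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VersionedNode:
    """Hypothetical stand-in for a graph node; not Hamilton's HamiltonNode."""

    name: str
    version: Optional[str]  # None models a materializer without a code version


def sort_nodes_by_version(nodes: List[VersionedNode]) -> List[VersionedNode]:
    """Sort by version, placing nodes that have no version last."""
    # The first key element separates None from strings, so None is never
    # compared with a str; the name breaks ties deterministically.
    return sorted(nodes, key=lambda n: (n.version is None, n.version or "", n.name))


nodes = [
    VersionedNode("connect", "b2f1"),
    VersionedNode("save.a", None),
    VersionedNode("load", "a9c0"),
]

try:
    sorted(nodes, key=lambda n: n.version)  # naive sort mixes None and str
except TypeError as exc:
    print(f"naive sort fails: {exc}")

print([n.name for n in sort_nodes_by_version(nodes)])
```

Whether a missing version should sort last, be skipped, or be replaced by a class-based hash is exactly the design question raised above; this sketch only shows one way to keep the sort from failing.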
r
Can I use ExperimentTracker with dr.execute? I got this error instead:
Materializer dependency [..., n_chains: int, ...] is not a string, a function, or a driver.Variable.
t
@Roy Kid the ExperimentTracker was designed to be used with `.materialize()`, we're currently implementing a fix! Regarding this other error message, you're getting this when calling `.execute()`?
r
Yes. I read the source code and you are right, I cannot bypass materialize by using `.execute()`.
e
Should we have a clean error for this? E.g. detect this case and call out the execution method?
👍 1
t
we could check in `run_before_graph_execution` that the `HamiltonGraph` contains nodes tagged as materializers. I'm wondering how it handles `additional_vars`
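A rough sketch of the check suggested above, written against stand-in objects so it runs without Hamilton installed. The tag key `MATERIALIZER_TAG`, the `TaggedNode` class, and both helper functions are assumptions for illustration; the real hook would inspect the `HamiltonGraph` passed to `run_before_graph_execution`, and the actual tag Hamilton uses for materializers would need to be confirmed in the source.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Assumption: hypothetical tag key, not necessarily what Hamilton sets.
MATERIALIZER_TAG = "is_materializer"


@dataclass
class TaggedNode:
    """Hypothetical stand-in for a node exposing a name and a tags dict."""

    name: str
    tags: Dict[str, Any] = field(default_factory=dict)


def find_materializer_nodes(nodes: List[TaggedNode]) -> List[str]:
    """Return the names of nodes tagged as materializers."""
    return [n.name for n in nodes if n.tags.get(MATERIALIZER_TAG)]


def check_run_uses_materializers(nodes: List[TaggedNode]) -> None:
    """Raise a readable error when the graph has no materializer nodes."""
    if not find_materializer_nodes(nodes):
        raise ValueError(
            "ExperimentTracker expects Driver.materialize(); no materializer "
            "nodes were found in the graph. Did you call Driver.execute()?"
        )


# A graph built through .execute() would have no materializer nodes.
execute_graph = [TaggedNode("connect")]
# A graph built through .materialize() would include at least one.
materialize_graph = [TaggedNode("connect"), TaggedNode("save.a", {MATERIALIZER_TAG: True})]

check_run_uses_materializers(materialize_graph)  # passes silently
try:
    check_run_uses_materializers(execute_graph)
except ValueError as exc:
    print(exc)
```

This does not answer the open question about `additional_vars`; a run that requests only `additional_vars` through `.materialize()` would need its own handling.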
s
@Roy Kid Will have it out soon. Fix here
@Roy Kid I can’t publish the new package to pypi because pypi is having issues. If you need to get unblocked you can install from github directly.
```
pip install git+ssh://git@github.com/dagworks-inc/hamilton.git@main
```
Okay this is up: `1.55.1`
r
Thanks! I was just on a plane and now it's working. Looks like everything is working fine!
Hmmm, some weird things are happening again: the first time I run materialize it works fine, but after one successful run, it gets stuck somewhere:
```
(wflow) exp/test [ python test.py ] 12:00 AM
^CTraceback (most recent call last):
  File "/proj/snic2021-5-546/users/x_jicli/exp/test/test.py", line 35, in <module>
  File "/home/x_jicli/miniconda3/envs/wflow/lib/python3.10/site-packages/hamilton/driver.py", line 57, in wrapped_fn
    return call_fn(*args, **kwargs)
  File "/home/x_jicli/miniconda3/envs/wflow/lib/python3.10/site-packages/hamilton/driver.py", line 1470, in materialize
    raw_results = self.raw_execute(
  File "/home/x_jicli/miniconda3/envs/wflow/lib/python3.10/site-packages/hamilton/driver.py", line 650, in raw_execute
    results = self.graph_executor.execute(
  File "/home/x_jicli/miniconda3/envs/wflow/lib/python3.10/site-packages/hamilton/driver.py", line 228, in execute
    executors.run_graph_to_completion(execution_state, self.execution_manager)
  File "/home/x_jicli/miniconda3/envs/wflow/lib/python3.10/site-packages/hamilton/execution/executors.py", line 374, in run_graph_to_completion
    while not GraphState.is_terminal(execution_state.get_graph_state()):
  File "/home/x_jicli/miniconda3/envs/wflow/lib/python3.10/site-packages/hamilton/execution/state.py", line 433, in get_graph_state
    def get_graph_state(self) -> GraphState:
KeyboardInterrupt
```
or:
```
(wflow) x_jicli/exp [ python test.py ] 11:56 PM
^CTraceback (most recent call last):
  File "/proj/snic2021-5-546/users/x_jicli/exp/test.py", line 34, in <module>
    dr.materialize(*materilizers)
  File "/home/x_jicli/miniconda3/envs/wflow/lib/python3.10/site-packages/hamilton/driver.py", line 57, in wrapped_fn
    return call_fn(*args, **kwargs)
  File "/home/x_jicli/miniconda3/envs/wflow/lib/python3.10/site-packages/hamilton/driver.py", line 1470, in materialize
    raw_results = self.raw_execute(
  File "/home/x_jicli/miniconda3/envs/wflow/lib/python3.10/site-packages/hamilton/driver.py", line 650, in raw_execute
    results = self.graph_executor.execute(
  File "/home/x_jicli/miniconda3/envs/wflow/lib/python3.10/site-packages/hamilton/driver.py", line 228, in execute
    executors.run_graph_to_completion(execution_state, self.execution_manager)
  File "/home/x_jicli/miniconda3/envs/wflow/lib/python3.10/site-packages/hamilton/execution/executors.py", line 374, in run_graph_to_completion
    while not GraphState.is_terminal(execution_state.get_graph_state()):
KeyboardInterrupt
```
I tried to make a minimal reproducible demo:
```python
# test.py
from hamilton import driver
from hamilton.plugins import h_experiments
from hamilton.io.materialization import to, from_


def connect(a: dict) -> str:
    print('connect')
    print(a)
    return 'connect'


if __name__ == "__main__":
    import test

    tracker_hook = h_experiments.ExperimentTracker(
        experiment_name='exp',
        base_directory='.',
    )

    dr = (
        driver.Builder()
        .with_modules(test)
        .enable_dynamic_execution(allow_experimental_mode=True)
        # .with_execution_manager(execution_manager)
        .with_adapters(tracker_hook)
        .build()
    )
    materilizers = [
        from_.pickle(path='./data.pickle', target='a'),
    ]
    dr.materialize(*materilizers)
```
I only tested it on an HPC cluster, so I don't know whether it is an environment problem or a problem with my code.
👀 1
s
Ok will look at this. Yes this is weird.
@Roy Kid so the issue in the above code is that there is no output requested. Thus yeah you’ve found a 🐛 — but it makes sense. e.g. this shows a blank image:
```python
dr.visualize_materialization(*materilizers, output_file_path='./test.png')
```
But the following executes fine if we request `connect` as an output:
```python
dr.visualize_materialization(*materilizers, additional_vars=["connect"], output_file_path='./test.png')

dr.materialize(*materilizers, additional_vars=["connect"])
```
r
thanks! a little bit difficult to debug this! i will try it again 😋
👍 1