# hamilton-help
r
Another interesting thing is that I cannot use `spy` in `pytest_mock` to check the `call_count` of a function. I think it is because when it wraps the object, it loses the `__module__` information or something else, and `inspect.getmodule` cannot find its module. I ask because I want to know how many times a node is executed when I write unit tests for caching. Maybe there is another way to do that?
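If the diagnosis above is right and the lost `__module__` is the culprit, one possible workaround is a hand-written counting wrapper built with `functools.wraps`, which copies `__module__`, `__name__`, and `__doc__` onto the wrapper. `__module__` is the attribute `inspect.getmodule` consults for functions. This is only a sketch: `count_calls` and `double` are hypothetical names, and whether Hamilton accepts such a wrapped function as a node has not been verified here.

```python
import functools


def count_calls(fn):
    """Hypothetical helper: wrap fn and record how often it is called."""

    @functools.wraps(fn)  # copies __module__, __name__, __doc__, etc. from fn
    def wrapper(*args, **kwargs):
        wrapper.call_count += 1
        return fn(*args, **kwargs)

    wrapper.call_count = 0
    return wrapper


def double(x: int) -> int:
    """Stand-in for a node function."""
    return x * 2


counted_double = count_calls(double)
counted_double(2)
counted_double(3)
# counted_double.call_count now holds the number of calls made so far,
# and counted_double.__module__ matches double.__module__.
```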
t
I also noticed that individual function calls can get a bit lost with Hamilton when I worked with performance profilers. My conclusion is that the "node functions" are generally bundled under `driver.execute()` and you lose granularity in the Python call stack. The best way to know how often a node is executed is to create a lifecycle hook / adapter. Do you want to write tests for your own project, or is it to add tests to the Hamilton library?
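A minimal sketch of the lifecycle hook idea, assuming Hamilton's `hamilton.lifecycle.NodeExecutionHook` base class with its `run_before_node_execution` / `run_after_node_execution` methods. The class name `NodeCallCounter` and the `call_counts` attribute are made up for illustration, and the `ImportError` fallback exists only so the snippet can be run without Hamilton installed.

```python
from collections import Counter

try:
    # Real base class when Hamilton is installed.
    from hamilton.lifecycle import NodeExecutionHook
except ImportError:
    # Fallback so the sketch still runs without Hamilton.
    NodeExecutionHook = object


class NodeCallCounter(NodeExecutionHook):
    """Hypothetical hook that counts how many times each node runs."""

    def __init__(self):
        self.call_counts = Counter()

    def run_before_node_execution(self, *, node_name: str, **future_kwargs):
        # Invoked once per node execution; other keyword arguments are ignored.
        self.call_counts[node_name] += 1

    def run_after_node_execution(self, *, node_name: str, **future_kwargs):
        # Nothing to do afterwards; defined because the base class requires it.
        pass
```

In a test, the counter would be passed to the driver, for example via `driver.Builder().with_modules(...).with_adapters(counter).build()`, and `counter.call_counts["some_node"]` checked after `execute()`. Whether the hook fires on a cache hit depends on how the caching layer is implemented, so that is worth confirming before relying on the counts.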
r
Ah, super. I think creating a new adapter is the best choice. I hadn't thought about it yet because I wanted to figure out whether the cache mechanism is working or not. I think a `ProfilerAdapter` is quite important, and I would like to make a prototype. Once I have a mature design, I will open a draft in Hamilton's repo.
That's why I still use hamilton. Quite easy to make an extension
t
> That's why I still use hamilton. Quite easy to make an extension
Yup, that's the spirit!
Data science is particularly hard to profile because different resources matter: RAM, CPU, GPU, network. From my research, scalene (https://github.com/plasma-umass/scalene) was one of the easiest profilers to use. This was a very insightful talk:

https://www.youtube.com/watch?v=44Jk_ab4XMY

r
Thx for sharing! I will learn and think about it.