OpenHands CLI has become immensely helpful for research stuff too -- even when it comes to cleaning up some vibe-coded function
lolsob
Basically, I have code in a Jupyter notebook that's very messy. I sent an agent to refactor it into a library so it can be reused in many different places. BUT I need to do a quality check to make sure it has the exact same behavior as before.
After manually reviewing the refactored code, I tried to set up a debugging script myself to check the differences, but it is just VERY boring to check JSON fields one by one to make sure they match, and to keep updating the refactored code to debug (e.g., adding a print here and there...)
And then OH CLI just nailed it 🐕 (and found a stupid bug that was introduced by me, the human 😅 )
can you read SOME_NOTEBOOK.ipynb, we have run this script to process a bunch of completions in /mnt/data/research/data/XXX/downloaded_data/**/processed_completions.json. i have a new implementation for processing completions in /mnt/data/data-sdk/XXX/trace_segments.py, function completions_to_trace_segments. Can you help me write a test script that goes through every processed_completions.json and makes sure that after loading completions.json and passing it through completions_to_trace_segments, it returns trace segments that match the existing processed_completions.json?
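For anyone curious what that kind of check looks like, here is a rough sketch of the idea. It is not the script the agent actually wrote. It assumes each processed_completions.json has a completions.json sitting next to it in the same folder, and that the converter returns something JSON-serializable. The names diff_json and check_tree are made up for illustration, and the converter is passed in as an argument so the sketch doesn't guess at the real import path.

```python
import json
from pathlib import Path
from typing import Any, Callable, Iterator


def diff_json(expected: Any, actual: Any, path: str = "$") -> Iterator[str]:
    """Yield one readable line per field that differs between two JSON values."""
    if isinstance(expected, dict) and isinstance(actual, dict):
        for key in sorted(expected.keys() | actual.keys()):
            if key not in actual:
                yield f"{path}.{key}: missing in new output"
            elif key not in expected:
                yield f"{path}.{key}: unexpected in new output"
            else:
                yield from diff_json(expected[key], actual[key], f"{path}.{key}")
    elif isinstance(expected, list) and isinstance(actual, list):
        if len(expected) != len(actual):
            yield f"{path}: length {len(expected)} != {len(actual)}"
        # Still compare the overlapping prefix so the first real mismatch shows up.
        for i, (e, a) in enumerate(zip(expected, actual)):
            yield from diff_json(e, a, f"{path}[{i}]")
    elif type(expected) is not type(actual) or expected != actual:
        # Strict on purpose: 1 vs 1.0 or 1 vs True counts as a difference.
        yield f"{path}: {expected!r} != {actual!r}"


def check_tree(root: Path, convert: Callable[[Any], Any]) -> dict[str, list[str]]:
    """Re-run `convert` for every saved result under `root` and collect mismatches.

    Returns a mapping from file path to its list of differences (empty if all match).
    """
    failures: dict[str, list[str]] = {}
    for expected_file in sorted(root.rglob("processed_completions.json")):
        # Assumption: the raw input lives beside the processed output.
        raw_file = expected_file.with_name("completions.json")
        if not raw_file.exists():
            failures[str(expected_file)] = ["no sibling completions.json"]
            continue
        expected = json.loads(expected_file.read_text(encoding="utf-8"))
        raw = json.loads(raw_file.read_text(encoding="utf-8"))
        # Round-trip through JSON so tuples etc. compare the way they would on disk.
        actual = json.loads(json.dumps(convert(raw)))
        diffs = list(diff_json(expected, actual))
        if diffs:
            failures[str(expected_file)] = diffs
    return failures
```

To use it, you would import completions_to_trace_segments from wherever trace_segments.py lives in your setup, call check_tree with the downloaded_data folder and that function, and print whatever comes back. Getting the exact field path for each mismatch is what saves you from adding prints here and there.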