Say I'd like to modify a notebook node significant...
# general
m
Say I'd like to modify a notebook node significantly enough that I can't just rinse and repeat with `ploomber build` until it works as intended. How should I go about running it in isolation as a proper notebook in VSC or Jupyter, while still having the benefit of the information in the `upstream = ['get', ...]` preamble, and perhaps in `pipeline.yaml` and `env.yaml`? I could add some shim code to inject the desired inputs into `upstream`, only to excise it later when it's no longer needed, but I'm hoping you guys could suggest a tried and proven pattern...
e
`ploomber build` has a `--partially` argument, which builds your pipeline up to the task name that you pass, e.g. `ploomber build --partially train-model`. This executes the same way as `ploomber build`; it just stops once it reaches that task. Does this work?
m
Not exactly. I'd rather execute that one notebook in VSC or Jupyter using its upstream inputs as they are now. If I do it naively, `upstream['get']` obviously breaks, because it's defined in a stub at the top as `upstream = ['get']`, not as a dictionary with path(s) to my inputs (that's a job for papermill at pipeline run, I believe). Once I'm happy with the notebook, I would reintegrate it and run the entire pipeline.
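For illustration, the kind of throwaway shim I mean could look roughly like the sketch below. The helper name `shim_upstream`, the `output` directory and the `.parquet` suffix are all made up for the example; the real product paths are whatever `pipeline.yaml` declares, so treat this as an assumption-laden sketch rather than anything Ploomber provides.

```python
from pathlib import Path

# Stub as it appears at the top of the notebook: a list of upstream task names.
upstream = ['get']


def shim_upstream(names, output_dir="output", suffix=".parquet"):
    """Hypothetical helper: map each upstream task name to an assumed product path.

    The directory and suffix are placeholders, not values read from pipeline.yaml.
    """
    return {
        name: (Path(output_dir) / f"{name}{suffix}").as_posix()
        for name in names
    }


# Only shim when the stub is still a list, i.e. the notebook is run by hand.
# If a dictionary has already been injected, it is left untouched.
if isinstance(upstream, list):
    upstream = shim_upstream(upstream)

print(upstream['get'])
```

The `isinstance` guard is what would make the shim safe to forget about: with a dictionary already in place the branch is simply skipped. It still has to be excised eventually, which is the part I'd like to avoid.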
e
Ah, got it! So we have an integration with Jupyter that automatically injects a cell with the parameters extracted from pipeline.yaml and env.yaml: https://docs.ploomber.io/en/latest/user-guide/jupyter.html If you're using VS Code, you can do the injection manually: https://docs.ploomber.io/en/latest/user-guide/editors.html
m
I knew you'd have a pattern or two for such cases up your sleeve! Let me read. Thanks!