Say I'd like to modify a notebook node significantly enough Ploomber #general

Marcin Gierdalski

06/10/2024, 10:20 PM

Say I'd like to modify a notebook node significantly enough that I can't just rinse and repeat with `ploomber build` until it works as intended. How should I go about running it in isolation as a proper notebook in VS Code or Jupyter, while still benefiting from the information in the `upstream = ['get', ...]` preamble and perhaps in `pipeline.yaml` and `env.yaml`? I could add some shim code to inject the desired inputs into `upstream`, only to excise it later when it's no longer needed, but I'm hoping you guys could suggest a tried and proven pattern...

Eduardo

06/11/2024, 6:16 PM

`ploomber build` has a `--partially` argument, which builds your pipeline up to the task name that you pass, e.g. `ploomber build --partially train-model`. This executes in the same way as `ploomber build`; it just stops once it reaches that task. Does this work?

Marcin Gierdalski

06/11/2024, 6:24 PM

Not exactly. I'd rather execute that one notebook in VS Code or Jupyter using its upstream inputs as they are now. If I do it naively, `upstream['get']` obviously breaks because `upstream` is defined in a stub at the top as `upstream = ['get']`, not as a dictionary with the path(s) to my inputs (that's a job for papermill at pipeline run, I believe). Once I'm happy with the notebook, I would reintegrate it and run the entire pipeline.
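To make the shim idea concrete, here is a minimal sketch. The helper `resolve_upstream` and the path `output/get.parquet` are hypothetical and not part of Ploomber. The helper leaves `upstream` alone when it is already a mapping (as it would be once parameters have been injected) and only fills in fallback paths when it is still the bare list from the stub.

```python
def resolve_upstream(upstream, fallback_paths):
    """Return a name -> path mapping for the upstream tasks.

    Hypothetical helper (not a Ploomber API): if `upstream` is already a
    mapping it is returned unchanged; if it is still the stub list of task
    names, each name is looked up in `fallback_paths`.
    """
    if upstream is None:
        return {}
    if isinstance(upstream, dict):
        return upstream
    missing = [name for name in upstream if name not in fallback_paths]
    if missing:
        raise KeyError(f"no fallback path for upstream task(s): {missing}")
    return {name: fallback_paths[name] for name in upstream}


# The stub as it appears at the top of the notebook.
upstream = ['get']

# Assumed location of the 'get' task's product; adjust to your pipeline.
upstream = resolve_upstream(upstream, {'get': 'output/get.parquet'})

print(upstream['get'])
```

Because the helper is a no-op on a dictionary, it would not need to be excised before the notebook goes back into the pipeline, although keeping hard-coded paths in a notebook is still a trade-off.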

Eduardo

06/11/2024, 6:26 PM

Ah, got it! So we have an integration with Jupyter that will automatically inject a cell with the parameters extracted from pipeline.yaml and env.yaml: https://docs.ploomber.io/en/latest/user-guide/jupyter.html If you're using VS Code, you can do the injection manually: https://docs.ploomber.io/en/latest/user-guide/editors.html
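For the manual route, the editors guide linked above describes a `ploomber nb` command with `--inject` and `--remove` options. Below is a small sketch of wrapping it from Python, assuming those flags behave as documented for your installed version. The helper names `nb_command` and `run_nb` are hypothetical, and `run_nb` defaults to a dry run so nothing is modified by accident.

```python
import subprocess

# Flags as described in the Ploomber editors guide; check them against
# `ploomber nb --help` for your installed version.
_FLAGS = {"inject": "--inject", "remove": "--remove"}


def nb_command(action):
    """Build the argument list for `ploomber nb` (hypothetical helper)."""
    if action not in _FLAGS:
        raise ValueError(f"action must be one of {sorted(_FLAGS)}")
    return ["ploomber", "nb", _FLAGS[action]]


def run_nb(action, project_dir=".", dry_run=True):
    """Print the command, and run it only when dry_run is False."""
    cmd = nb_command(action)
    print(" ".join(cmd))
    if dry_run:
        return None
    # Runs in the directory that contains pipeline.yaml.
    return subprocess.run(cmd, cwd=project_dir, check=True)


# Dry run: only shows what would be executed.
run_nb("inject")
```

A typical session would inject before opening the notebook in VS Code and remove the injected cell again before committing.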

Marcin Gierdalski

06/11/2024, 6:28 PM

I knew you should have had a pattern or two for such cases up your sleeve! Let me read. Thanks!

👍 1
