# ask-anything
m
like I'm struggling to find an example to copy for accessing tags in a .py file
maybe a walkthrough of one of these on the docs page https://github.com/ploomber/projects/tree/master/templates
e
thanks for your feedback. are you looking for an example to edit the cell tags? I'm not sure I'm following
m
```python
# %% tags=["parameters"]
upstream = ["raw"]
product = None

# %%
df = pd.read_csv(upstream['get']['data'])
```
this was confusing me: how does `upstream` reference 'get' if the only upstream is "raw"?
also it seems the advanced ML example totally ditches the .yaml file and builds the pipeline with the Python API. is this because the Python API gives the user lower-level or more flexible control over the pipeline?
e
oh, that example seems wrong. is that somewhere in our docs?
yep, the ml advanced uses the Python API since it gives more flexibility, but it requires you to write more code
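roughly, the Python API version looks something like this sketch (from memory, so names and signatures may be off; check the Python API docs). the extra flexibility is that you build the DAG programmatically, e.g. you could create tasks in a loop:

```python
from ploomber import DAG
from ploomber.tasks import PythonCallable
from ploomber.products import File

def raw(product):
    ...  # dump raw data to the product path

def clean(product, upstream):
    ...  # read the upstream product, write cleaned data

dag = DAG()
t_raw = PythonCallable(raw, File('output/raw.csv'), dag, name='raw')
t_clean = PythonCallable(clean, File('output/clean.csv'), dag, name='clean')
t_raw >> t_clean  # declare the dependency explicitly
dag.build()
```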
👍 1
oh, I spotted the error. I'll fix it, thanks!
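for anyone reading later: the lookup should use the declared upstream name. here's a quick stdlib-only simulation of the dict Ploomber injects at runtime (the file name and the 'data' product key are just for illustration):

```python
import csv
from pathlib import Path

# simulate the mapping Ploomber injects at runtime:
# upstream task name -> its products (names here are illustrative)
Path("raw_data.csv").write_text("a,b\n1,2\n")
upstream = {"raw": {"data": "raw_data.csv"}}

# index by the declared upstream name ("raw"), not an undeclared "get"
with open(upstream["raw"]["data"]) as f:
    rows = list(csv.DictReader(f))

print(rows)  # [{'a': '1', 'b': '2'}]
```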
👍 1
m
thanks!
e
please keep these questions coming. it's pretty hard to write good docs, so all feedback is welcome, especially errors like this
👍 1
m
would it be possible to number the steps on this page https://docs.ploomber.io/en/latest/get-started/basic-concepts.html
the page seems kinda hard to follow for complete noobs (for which the page is intended)
reference the .yaml in this diagram; it just kinda appears later in the page
why is the cell injection explanation in the Tasks: scripts/notebooks section and not in the others?
cell injection is key to all of ploomber no? I'd include it in a specific task section only if it was specific to this task?
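from what I understand, cell injection means Ploomber appends a cell with concrete values right after the parameters cell, roughly like this toy sketch (the tag name and the injected values are my guesses, not Ploomber's actual implementation):

```python
# the script's declared parameters cell...
source_cell = '# %% tags=["parameters"]\nupstream = ["raw"]\nproduct = None'

# ...gets followed at runtime by an injected cell with concrete values
# resolved from pipeline.yaml, so the script can run standalone
injected_cell = (
    '# %% tags=["injected-parameters"]\n'
    'upstream = {"raw": {"data": "output/raw.csv"}}\n'
    "product = 'output/report.ipynb'"
)
injected_script = source_cell + "\n\n" + injected_cell
```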
to make this more clear, maybe show a .yaml of these task functions being used. it's not clear how they get used in the overall orchestration from just this page
further there is a clean.py and a functions.clean
these are separate things but the common name in the docs makes it confusing
right after showing the yaml
```yaml
tasks:
  # this is a sql script task
  - source: raw.sql
    product: [schema, name, table]
    # ...

  # this is a function task
  # "my_functions.clean" is equivalent to: from my_functions import clean
  - source: my_functions.clean
    product: output/clean.csv

  # this is a script task (notebooks work the same)
  - source: plot.py
    product:
      # scripts always generate a notebook (more on this in the next section)
      nb: output/plots.ipynb
```
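side note: I checked and `my_functions.clean` really does just mean `from my_functions import clean`, i.e. something like this toy sketch (the helper name is made up, not Ploomber's actual code):

```python
import importlib

def resolve_dotted_source(dotted):
    """Resolve a dotted source like "my_functions.clean" to the function
    it names: import the module, then grab the attribute."""
    module_name, _, attr = dotted.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr)

# e.g. "os.path.join" resolves to the os.path.join function
join = resolve_dotted_source("os.path.join")
```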
you explain a clean.py file
but it's actually never used in the just-referenced pipeline, kinda confusing?
I get you want to demonstrate each task primitive, but why not demonstrate the .py primitive with the later plot.py file?
I guess the ideal would be to demonstrate a minimal pipeline showcasing all 3 types of tasks and the required info to use each
not very easy, but hopefully the ramblings of a total noob will help
e
this is great. thanks a lot. do you think it'd be better to have the same example implemented with scripts, functions and SQL?
so basically split this into three parts
what's your github handle? I'll tag you in an issue i'm writing now so you can provide more feedback there
m
rcyost
that would be useful, but if it's a better use of your time, a single pipeline with 3 steps showing each of the 3 task types a bit more clearly would be the baseline to get the basic info needed
🙏 1