# ask-anything
i
You need to define this in the code itself; look at how it’s done in clean.py.
So the pipeline.yaml isn’t the right place to do it 🙂
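For reference, a minimal sketch of that pattern in a Ploomber script, assuming a hypothetical upstream task named `get` and dict products with a `data` key (neither name is from this thread):

```python
# %% tags=["parameters"]
# Ploomber reads this cell to build the DAG, so the dependency is
# declared here rather than in pipeline.yaml
upstream = ['get']   # hypothetical upstream task name
product = None       # filled in from pipeline.yaml at runtime

# %%
import pandas as pd

# at runtime Ploomber injects the real values: upstream['get'] holds the
# upstream task's product paths (assumed here to be a dict with a 'data' key)
df = pd.read_csv(upstream['get']['data'])
df.dropna().to_csv(product['data'], index=False)
```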
v
Ok! But the downstream task can depend on two different upstream tasks. I call the downstream task twice in my pipeline.yaml, so it should have a different upstream task for each call. Is this possible?
i
So A -> C & B -> C? You can get that by specifying both in the array: upstream = ['A', 'B']. Ok, I just read the earlier thread with Eduardo where he mentions
# make upstream=['filter_website_content']
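As a quick sketch, that fan-in version (A & B -> C) would look roughly like this in C’s parameters cell, with `A` and `B` as placeholder task names:

```python
# %% tags=["parameters"]
# listing both tasks makes C wait for A and B (A & B -> C)
upstream = ['A', 'B']
product = None
```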
v
No, A -> C || B -> C
Yes, sorry, maybe I should have continued in that thread.
i
No worries. It’s not showing in the plot but the logic does work?
From the example it looks like A & B -> C
v
The logic works yes!
but not the plot
i
Can you please share what it shows you?
v
Yes, A and B are upstream from C, but C can run from A without B having been executed.
i
Ok, so it sounds like this is expected behavior, since you were using the same task for both dependencies. Did you use names on the tasks?
```yaml
# make upstream=['get_website_content']
- source: scripts/website_statistics.py
  name: stats-1

# make upstream=['filter_website_content']
- source: scripts/website_statistics.py
  name: stats-2
```
That is, `name: stats-2` and `stats-1` in the task definitions?
So in the above .yaml you should have get_website_content -> stats-1 and filter_website_content -> stats-2, independently.
v
Yes, this is what I want:
Yes, I used a different `name:` for the two different task definitions of task C in the pipeline.yaml
i
Check out your .py code: in the upstream, is it set to None?
That’s what creates the node connection
v
yes it’s None
i
Without anything in the upstream:
With the dependency set in one task:
So you’d need to configure it in the .py file. Also make sure you save the output to the product; it depends on how you named the product. For instance, if you have an output named model, you can save it like this:
```python
import pickle

with open(product['model'], 'wb') as f:  # product comes from the task's product definition
    pickle.dump(clf, f)
```
Try that and let me know if it works for you 🙂
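For context, saving to `product['model']` like that assumes the task’s product is declared as a dictionary in pipeline.yaml, roughly along these lines (file names here are placeholders, not from this thread):

```yaml
- source: scripts/fit.py
  product:
    nb: output/fit.ipynb        # the executed notebook
    model: output/model.pickle  # what the snippet above writes to
```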
v
Ok, so when I set the upstream dependencies in the stats task .py file I get this: (btw the tasks are renamed here)
But this is what I want:
e
Oh, apologies, my earlier advice was missing something. Since you want the same script to have different upstream dependencies, you need to turn off automated upstream extraction. So in your pipeline.yaml:
```yaml
meta:
  extract_upstream: false

tasks:
  - source: scripts/get.py
    product: out/get.ipynb

  - source: scripts/second.py
    product: out/second.ipynb
    upstream: [get]

  - source: scripts/third.py
    product: out/third.ipynb
    upstream: [get]
```
Then in your .py files, make `upstream = None`.
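To make that concrete, a sketch of what second.py (or third.py) could look like once extract_upstream is off; the body is just a placeholder showing how the injected `upstream` dict is used:

```python
# %% tags=["parameters"]
# with extract_upstream: false, the DAG edges come from pipeline.yaml,
# so this cell only needs placeholders that Ploomber overwrites at runtime
upstream = None
product = None

# %%
# the injected `upstream` maps each task name to its product, e.g.
# upstream['get'] -> 'out/get.ipynb' from the pipeline.yaml above
print(upstream['get'])
```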
v
Perfect, it works, thanks a lot you guys! 🙌🙏
e
sure!