# ask-anything
e
let me see if I understood correctly, are you looking for something like this:
git checkout v1 # go to tag v1
ploomber build # store outputs in products/v1
git checkout v2 # go to tag v2
ploomber build # store outputs in products/v2
may I ask a bit more about the use case? are v1 and v2 code versions that you want to keep active? in other words, having v2 does not replace v1?
a
yes exactly, the pipeline has some stages: get data, long post process, store to s3-parquet/postgres, (pstage) read them and "plot", (pp) produce byproducts. the pstage and pp need to run several times: if someone finds a problem, we need to fix it, rerun, and publish v1.1. but for sure the vX.X versions are independent. they should be able to run in parallel.
can ploomber detect on which tag it is "sitting" and set appropriate variables? 🤔
e
ok, yes it can! there's a feature that detects that but sadly it's undocumented. let me grab that info for you
it's funny because I implemented that a while ago for my projects but I never documented it 😅
here's how you do it, add a prefix to your products:
tasks:
    # tasks.get, features and join are python functions
  - source: tasks.get
    product: '{{products}}/get.parquet'

  - source: tasks.features
    product: '{{products}}/features.parquet'

  - source: tasks.join
    product: '{{products}}/join.parquet'

    # fit.py is a script
  - source: fit.py
    name: fit
    product:
        # that generates an html report as output
        nb: '{{products}}/nb.html'
        model: '{{products}}/model.pickle'

    # only show outputs (not code) in the report
    nbconvert_export_kwargs:
      exclude_input: True
then create an env.yaml with this:
_module: .
products: '{{git}}'
in the same folder as the pipeline.yaml. to test it, run ploomber status, you'll see that the prefix changes depending on the tag/commit
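for intuition only, here's a rough sketch of how a git-derived prefix could be resolved and substituted into a product path. this is not ploomber's actual implementation: git_prefix, render and the fallback to a short commit hash are all assumptions made for illustration.

```python
import subprocess


def _git(args, cwd="."):
    """Run a git command and return its stripped stdout, or None if it fails."""
    try:
        out = subprocess.run(
            ["git", *args], cwd=cwd, capture_output=True, text=True, check=True
        )
    except (subprocess.CalledProcessError, FileNotFoundError):
        return None
    return out.stdout.strip() or None


def git_prefix(cwd=".", run=_git):
    """Hypothetical helper: the tag at HEAD if there is one, else the short hash."""
    # succeeds only when HEAD sits exactly on a tag (e.g. after `git checkout v1`)
    tag = run(["describe", "--tags", "--exact-match"], cwd)
    if tag:
        return tag
    # assumed fallback: identify the checkout by its abbreviated commit hash
    return run(["rev-parse", "--short", "HEAD"], cwd)


def render(template, **values):
    """Replace {{name}} placeholders with the given values (plain substitution)."""
    for key, value in values.items():
        template = template.replace("{{" + key + "}}", value)
    return template
```

with something like this, render('{{products}}/get.parquet', products=git_prefix()) would give a different path per tag, which is the behavior the env.yaml above is meant to produce.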
since it seems that you only want to version a few of the stages, you can apply the {{products}} prefix to only a few tasks, in that case some tasks can go to a common folder (say products) and others to the versioned folder (v1/, v2/)
a
ok need to go deeper next week and prepare simple use case + bundle it into docker+ci
this is great news!! thank you!
e
sure, thanks for bringing this up! can you please open an issue? so I have a reminder to document this, and other people can see how this works in the docs
a
how do I store products on remote storage?
e
for uploading to remote storage you have 2 options. either you use the client directly in your script, or you use the built-in feature, here's an example: https://github.com/ploomber/projects/tree/master/cookbook/file-client
we're still lacking a bit on the remote storage documentation so if you have any questions, please send them over
a
i am using minio, so mc client is my choice
issue is #615
e
alright, thanks for opening the issue! since we don't have support for minio yet, you can use the python client directly. the on_finish hooks are useful here: you can register a function that runs when your task finishes successfully, which is a good place to upload your outputs - docs: https://docs.ploomber.io/en/latest/cookbook/hooks.html
this is an interesting use case, please share your feedback once you start working on the implementation!
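a minimal sketch of the on_finish idea above, assuming the task has a single file product and that the hook can request a product argument. the bucket name, endpoint, environment variable names and the upload_file helper are all hypothetical; fput_object is the MinIO python client's method for uploading a local file.

```python
import os
from pathlib import Path

BUCKET = "my-bucket"  # hypothetical bucket name


def upload_file(client, bucket, path, prefix=""):
    """Upload one local file and return the object name that was used."""
    path = Path(path)
    # keep the versioned folder (e.g. v1) as the object prefix, if there is one
    object_name = f"{prefix.strip('/')}/{path.name}" if prefix else path.name
    client.fput_object(bucket, object_name, str(path))
    return object_name


def on_finish(product):
    """Hook sketch: upload the task's product once the task succeeds."""
    from minio import Minio  # imported lazily; requires `pip install minio`

    client = Minio(
        "localhost:9000",  # hypothetical endpoint
        access_key=os.environ["MINIO_ACCESS_KEY"],  # hypothetical variable names
        secret_key=os.environ["MINIO_SECRET_KEY"],
        secure=False,
    )
    local_path = Path(str(product))
    upload_file(client, BUCKET, local_path, prefix=local_path.parent.name)
```

if this lived in a hooks.py next to the pipeline.yaml, you'd register it on a task with something like on_finish: hooks.on_finish. the client is passed into upload_file so you can try the naming logic with a fake client before pointing it at a real server.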
a
definitely will share. btw, I was looking at the s3 examples: how should I define the bucket in the yaml file?
e