# ask-anything
e
how are you running your pipeline? are you using soopervisor?
j
Hi Eduardo, I'm just using ploomber build. For the specific case of a single task, I'm using ploomber task SOMETASK -some-envs--bla BLA
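(For context, a minimal sketch of the two commands referenced here; the task name is a placeholder, not the actual one from this pipeline:)

```sh
# build the entire pipeline
ploomber build

# run a single task by name (SOMETASK is a placeholder)
ploomber task SOMETASK
```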
e
if by "requirements" you mean "virtual envs", then it's possible if your tasks are scripts or notebooks. You can pass papermill_params to your task and pass a different kernel_name. of course, this involves some setup, since you need to ensure you have multiple kernels registered and that they are discoverable by your current environment. is this what you need? what I'm not following clearly is how you're planning to distribute across different computers
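(A rough sketch of what this could look like in pipeline.yaml, assuming the task is a script/notebook and that a second kernel, here called heavy-env, a made-up name, has already been registered with ipykernel:)

```yaml
tasks:
  # regular tasks run with the default kernel
  - source: clean.py
    product: output/clean.ipynb

  # this task runs under a different, already-registered kernel
  - source: heavy_task.py
    product: output/heavy_task.ipynb
    papermill_params:
      kernel_name: heavy-env
```

(Registering that second kernel from the other environment is typically done with something like python -m ipykernel install --user --name heavy-env, where the name is again a placeholder.)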
j
Thank you for your answer. What I mean is the following: assume a pipeline with certain tasks. Some tasks process (transform) data, specifically data cleansing and geospatial operations. After a task executes, its output is written in a standard format that is parsed by the next task. Now, there is a step that requires intensive computational power and, therefore, I want to run this task on a high-performance computer. As this computer has limited resources, I don't want to install the entire environment that I use for the whole pipeline; I only want to install the packages that are necessary and sufficient for that specific task. I have seen that Ploomber has a mechanism for installing software, through virtual environments, using a requirements.yml file (I suppose), and then doing something like conda env create -f requirements.yml (for example, in the case of conda). My question is: is this feature exclusive to the entire pipeline, or is it possible to make Ploomber compile a requirements.yml file for a specific task? Or, more likely, am I misunderstanding the ploomber install feature... Thank you for your time and your patience.
e
thanks for the explanation! the ploomber install command installs packages for the whole pipeline. to solve your problem, you'd have to manually list the dependencies that the task you want to execute requires, create a requirements.txt from them, and then execute the task on a different machine. In Ploomber Cloud we actually have this functionality: we're able to infer a requirements.txt based on the contents of a Jupyter notebook. most of the logic is open source, so if you wanna take a look, it's here. but for your use case, it's probably simpler to just create the requirements.txt file manually
j
ok! thanks, I will take a look. I thought that functionality might be available because of the general dependency inference that Ploomber does for the whole pipeline, but I see where you're going with Ploomber Cloud. Thanks for the explanation. It is a really neat project! Thank you for your time and work.
e
sure! feel free to ask any other questions!