helpful-crowd-74546
04/12/2022, 11:20 AM
workflow.py needs to access both data and train. Do you still recommend this structure, with everything referenced in workflow.py relative to the project root? I'm probably not explaining the idea that well, but I'm interested in what you think is the best practice with respect to workflows and relative imports/access to other folders containing data/code important to your project.
├── Dockerfile
├── data
│   └── some_folder_with_data
├── train
│   └── train.py
├── docker_build_and_tag.sh
├── flyte
│   ├── __init__.py
│   └── workflows
│       ├── __init__.py
│       └── workflow.py
├── flyte.config
└── requirements.txt
broad-monitor-993
04/12/2022, 1:03 PM
If your workflows need access to code outside of the flyte source, you have a few options:
• Structure the project as you have it, and make sure that your PYTHONPATH env var points to your root project directory in whichever docker image you build, so that it has access to the data and train modules (see the Dockerfile sketch below).
• Combine data and train into a package (e.g. my_package), so you can pip install it locally and on the docker image.
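For the first option, a minimal Dockerfile sketch could look something like this (just an illustration; it assumes the project root gets copied to /root in the image, so adjust paths for your setup):

# Sketch only: copy the project root into the image and put it on PYTHONPATH
# so that `data` and `train` are importable from workflow code.
FROM python:3.9-slim

WORKDIR /root
COPY . /root

RUN pip install -r requirements.txt

# Make the project root importable from anywhere in the container
ENV PYTHONPATH=/root

For the second option, the layout could look something like this: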
├── Dockerfile
├── <my_package>
│   ├── data
│   │   └── some_folder_with_data
│   └── train
│       └── train.py
├── setup.py  # for my_package
├── docker_build_and_tag.sh
├── flyte
│   ├── __init__.py
│   └── workflows
│       ├── __init__.py
│       └── workflow.py
├── flyte.config
└── requirements.txt
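A minimal setup.py sketch for that package might look like this (the name and metadata are placeholders):

# setup.py -- minimal sketch; package name and version are placeholders
from setuptools import setup, find_packages

setup(
    name="my_package",
    version="0.1.0",
    packages=find_packages(include=["my_package", "my_package.*"]),
    # include non-code files (e.g. the data folder) if you need them shipped with the package
    include_package_data=True,
)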
Then you could install and import my_package from your tasks/workflows modules.
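For example, in flyte/workflows/workflow.py the import could look something like this (a sketch only: train_model and the task/workflow names are hypothetical, and it assumes my_package and train have __init__.py files so they're importable):

# flyte/workflows/workflow.py -- sketch with hypothetical names
from flytekit import task, workflow

from my_package.train.train import train_model  # assumes train.py exposes a train_model function


@task
def training_task(data_dir: str) -> str:
    # delegate to the packaged training code
    return train_model(data_dir)


@workflow
def training_workflow(data_dir: str = "my_package/data/some_folder_with_data") -> str:
    return training_task(data_dir=data_dir)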
helpful-crowd-74546
04/12/2022, 1:46 PM
broad-monitor-993
04/12/2022, 5:31 PM