
    hundreds-wire-22547

    09/15/2025, 9:31 PM
    Seeing the following error on a foreach, which I believe occurs when the list to be iterated over is empty:
    Invalid self.next() transition detected on line 52:
        Foreach iterator over table_batches in step start produced zero splits. Check your variable.
    ✅ 1
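A plain-Python sketch of one workaround, assuming the `table_batches` list can legitimately come back empty (e.g. an upstream query returned no rows): pad it with a sentinel so the foreach always receives at least one split, and no-op on the sentinel in the mapped step.

```python
# Plain-Python sketch, outside Metaflow: guarantee the foreach variable is
# never empty by padding it with a sentinel, then skip the sentinel downstream.
table_batches = []                  # e.g. an upstream query returned no rows
splits = table_batches or [None]    # at least one split, even when empty

processed = []
for batch in splits:
    if batch is None:
        continue                    # sentinel split: nothing to process
    processed.append(batch)
```

Alternatively, a conditional/split-switch step (mentioned as GA in later messages here) can branch around the foreach entirely when the list is empty.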

    melodic-farmer-82535

    09/15/2025, 4:43 PM
    Hi, I'm trying to set up Metaflow in my company's internal AWS. I've got an API Gateway authorisation type AWS_IAM that's needed for me to create the API gateway. However, my admin role doesn't have the execute-api:Invoke permission, so it's blocking all requests. If I make a manual request with the authorisation header in Postman it's fine, but through Metaflow I just keep running into a Missing Authentication Token error. I've tried enabling APIBasicAuth to use the API key and bypass it, but same issue. I'm a data scientist and not much of a cloud expert, so I'm running out of ideas for how to fix this. Has anyone encountered something similar? How did you fix it?
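For what it's worth, API Gateway's generic "Missing Authentication Token" often also appears when a request hits a path/method with no configured route, so it isn't always about credentials. On the API-key route, a sketch of how an extra header could be attached, assuming Metaflow's METAFLOW_SERVICE_HEADERS config (a JSON dict of headers sent with metadata-service requests) behaves as expected; the key value is a placeholder:

```python
# Sketch: attach an API Gateway key to Metaflow's metadata-service requests
# via the METAFLOW_SERVICE_HEADERS config (a JSON dict of extra headers).
# "<your-api-key>" is a placeholder, not a real value.
import json
import os

os.environ["METAFLOW_SERVICE_HEADERS"] = json.dumps({"x-api-key": "<your-api-key>"})
headers = json.loads(os.environ["METAFLOW_SERVICE_HEADERS"])
```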

    few-dress-69520

    09/15/2025, 11:11 AM
    Hi all, I've been playing around with delaying environment fetching. In particular I'm trying to use a flow parameter called `env_name` to define the env a step should be executed with:
    @named_env(
        name="@{METAFLOW_INIT_ENV_NAME}",
        fetch_at_exec=True,
    )
    If I understand the docs correctly, this is how it should be used. Indeed, this works fine locally and on AWS Batch, presumably because it executes this code, which assembles the necessary variables from the flow parameters. But when I deploy it as a step function, it instead executes bootstrap_environment (in particular CondaEnvironment.sub_envvars_in_envname without the addl_env argument), which doesn't parse the flow parameters and thus fails with
    metaflow.metaflow_environment.InvalidEnvironmentException: Could not find 'METAFLOW_INIT_ENV_NAME' in the environment -- needed to resolve '@{METAFLOW_INIT_ENV_NAME}'
    Is it possible to also include code like this in the bootstrap_environment function?

    fast-vr-44972

    09/12/2025, 9:09 AM
    Didn't know you guys were in town 😉
    ✨ 2
    😂 1
    metaflow 1

    enough-article-90757

    09/11/2025, 8:23 PM
    Hey, I've been getting this error over the last few days that seems like a Metaflow problem:
    Note that the flow was deployed with a modified name due to Kubernetes naming
    conventions on Argo Workflows. The original flow name is stored in the workflow
    annotations.
        Internal error
    Traceback (most recent call last):
      File "/opt/pyenv/versions/3.9.23/lib/python3.9/site-packages/metaflow/cli.py", line 658, in main
        start(auto_envvar_prefix="METAFLOW", obj=state)
      File "/opt/pyenv/versions/3.9.23/lib/python3.9/site-packages/metaflow/_vendor/click/core.py", line 829, in __call__
        return self.main(*args, **kwargs)
      File "/opt/pyenv/versions/3.9.23/lib/python3.9/site-packages/metaflow/_vendor/click/core.py", line 782, in main
        rv = self.invoke(ctx)
      File "/opt/pyenv/versions/3.9.23/lib/python3.9/site-packages/metaflow/cli_components/utils.py", line 69, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/opt/pyenv/versions/3.9.23/lib/python3.9/site-packages/metaflow/_vendor/click/core.py", line 1259, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/opt/pyenv/versions/3.9.23/lib/python3.9/site-packages/metaflow/_vendor/click/core.py", line 1066, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/opt/pyenv/versions/3.9.23/lib/python3.9/site-packages/metaflow/_vendor/click/core.py", line 610, in invoke
        return callback(*args, **kwargs)
      File "/opt/pyenv/versions/3.9.23/lib/python3.9/site-packages/metaflow/_vendor/click/decorators.py", line 33, in new_func
        return f(get_current_context().obj, *args, **kwargs)
      File "/opt/pyenv/versions/3.9.23/lib/python3.9/site-packages/metaflow/plugins/argo/argo_workflows_cli.py", line 341, in create
        ArgoWorkflows.delete(obj._v1_workflow_name)
      File "/opt/pyenv/versions/3.9.23/lib/python3.9/site-packages/metaflow/plugins/argo/argo_workflows.py", line 263, in delete
        workflow_template["metadata"]["annotations"].get(
    TypeError: 'NoneType' object is not subscriptable
    This is the version I'm running:
    ❯ pip show metaflow
    Name: metaflow
    Version: 2.18.3
    Summary: Metaflow: More AI and ML, Less Engineering
    Home-page:
    Author: Metaflow Developers
    Author-email: <mailto:help@metaflow.org|help@metaflow.org>
    License: Apache Software License
    Location: /opt/pyenv/versions/3.9.23/lib/python3.9/site-packages
    Requires: boto3, requests
    Required-by:
    Is this a Metaflow error?
    ✅ 1
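The last frame suggests the workflow-template lookup returned `None` before `["metadata"]["annotations"]` was subscripted, so this does look like a Metaflow-side bug rather than user error. A sketch of the defensive lookup pattern that would avoid the crash (the annotation key below is hypothetical, for illustration only):

```python
# Defensive-lookup sketch: chained .get() calls with fallbacks avoid the
# TypeError when the template (or its metadata/annotations) is missing.
# The annotation key is hypothetical, for illustration only.
def original_flow_name(workflow_template):
    metadata = (workflow_template or {}).get("metadata") or {}
    annotations = metadata.get("annotations") or {}
    return annotations.get("example/original-flow-name", "<unknown>")
```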

    famous-airline-14628

    09/11/2025, 11:07 AM
    Hiya! I was really excited about the release of conditional/split-switch steps in Metaflow 2.18, as that was one of the main blockers for us migrating our workflows to it. However, I see in the Step Functions docs that this is not yet supported, and that if I have a use case I should reach out on Slack. Well, here I am! 😅 I'm happy to try to contribute to this as well, just not sure where to start.
    ✅ 1

    shy-midnight-40599

    09/10/2025, 4:25 PM
    Hi Team, we are deploying Metaflow using Step Functions/AWS Batch (with a Fargate-based compute environment). We are trying to run multiple executions of the same flow as a load test. What we noticed is that the job running times are abnormal: some jobs took around 2 to 3 minutes to complete, while others show a runtime of 2 hours. When we checked the logs, we could see the step completed in 2 minutes (based on the logging we added to the step); after the step completed, the job kept running for 2 hours for no apparent reason. Has anyone else faced this, or any idea why? Let me know if you need more details.

    happy-journalist-26770

    09/09/2025, 12:58 PM
    Hi, the Metaflow UI isn't displaying logs lately. I'm running on K8s, deployed via Helm. metaflow_version: 2.18.3. Images: • public.ecr.aws/outerbounds/metaflow_ui:1.3.5-146-ge6d68f08-obp • public.ecr.aws/outerbounds/metaflow_metadata_service:2.5.0 Do let me know if I'm missing anything

    fast-vr-44972

    09/09/2025, 12:07 PM
    I don't think you can pass a custom virtual env to `pypi`.

    fast-vr-44972

    09/09/2025, 12:01 PM
    Most probably it's a mismatching virtual env. `pypi` seems to be managing its own virtual env: https://github.com/Netflix/metaflow/blob/master/metaflow/plugins/pypi/pypi_decorator.py#L35

    quick-carpet-67110

    09/09/2025, 11:19 AM
    Question about using a custom `image` in the `@kubernetes` decorator together with the `@pypi` decorator
    Hey everyone! We have a situation where most of our steps share a lot of packages but still require custom installations every now and then. So we have a base Docker image built with all of the common dependencies, but we would like to use the `@pypi` decorator to install the custom deps on the fly. Is this currently possible? I did a quick and dirty example flow with a custom base image and a custom dependency installed in the `@pypi` decorator, and the code inside the step was not able to import PyTorch, even though it is available in the custom image.
    @kubernetes(
        tolerations=[
            {"key": "something", "operator": "Equal", "value": "another_value", "effect": "NoSchedule"}
        ],
        gpu=1,
        image="pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime",
    )
    @pypi(
        python="3.10.0",
        packages={
            "implicit": "0.7.2",
        },
    )
    @step
    def gpu(self):
        ...
    I searched in the docs and was able to find some information, but I am not sure if "system-wide" packages in the snippet refers to the container image's packages or something else. Can anyone shed some light on whether or not the setup I am describing above is achievable with Metaflow? Thank you!

    ancient-fish-13211

    09/03/2025, 1:40 PM
    Hi again everyone, does anyone know of a way to share a local Docker image with the minikube setup that metaflow-dev creates, so that I can test with it in flows? I've tried multiple approaches that have all failed, and without direct access to the minikube commands I can't see a way to do it. Thanks

    hundreds-wire-22547

    09/02/2025, 11:20 PM
    Hi, I upgraded pydantic from `version = "2.10.5"` to `version = "2.11.7"` and am now seeing an error like the one below. Is this a known issue?
    File "/tmp/ray/session_2025-06-16_12-59-52_860923_1/runtime_resources/working_dir_files/_ray_pkg_e01e7abcca487ccc/metaflow/datastore/task_datastore.py", line 369, in load_artifacts
        yield name, pickle.loads(blob)
                    ^^^^^^^^^^^^^^^^^^
    AttributeError: 'FieldInfo' object has no attribute 'evaluated'
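A guess at the mechanism: the artifact was pickled under pydantic 2.10.x and unpickled under 2.11.x, whose `FieldInfo` internals differ, so the old pickle no longer matches the new class. One hedge is to store plain built-in types as artifacts rather than pydantic models, since those round-trip through pickle independently of third-party library versions:

```python
# Sketch: plain built-in types survive pickling across library upgrades
# (e.g. store model.model_dump() rather than the pydantic model itself).
import pickle

artifact = {"lr": 0.01, "epochs": 10}
restored = pickle.loads(pickle.dumps(artifact))
```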

    clever-midnight-3739

    09/02/2025, 4:22 PM
    Hi everyone! I am new to Metaflow and trying to understand how to deploy a flow using remote resources on different compute backends. In this tutorial, there is a flow with steps assigned either locally, to AWS Batch, or to remote k8s. How is Metaflow set up in this case? What does the config.json look like to support both AWS Batch and k8s? Also, in cases where this flow is deployed (for instance, on Argo), on which compute would each of these steps run? Thank you very much for your help!

    adorable-truck-38791

    09/01/2025, 1:55 PM
    hello, I'm trying to use the `metaflow-dev up` command... it seems to be mostly working, but it keeps asking for my password when starting all of the services. The weirder thing is that it keeps saying my password is wrong, so I'm not even sure which password it's asking for (is it something related to the minikube/argo roles? I have no idea). Any thoughts on what I should try to fix this?

    crooked-camera-86023

    08/29/2025, 10:51 PM
    Hello, I tried to apply metaflow.tf to our own internal infra and got the following error; I'd really appreciate any pointers/help.

    ancient-fish-13211

    08/29/2025, 10:04 AM
    Hi everyone, I'm having a bit of a silly problem using Metaflow as a dependency in a PyCharm project. When trying to import things like
    from metaflow import FlowSpec, step, kubernetes, retry
    FlowSpec and step import fine, but I get "Cannot find reference" errors for kubernetes and retry. If I launch a Python console or a notebook, the imports work fine, so it seems like an indexing issue. I've tried the typical "invalidate caches" with no luck. I'd rather not just disable the warnings if possible. Has anyone had similar issues or found a solution? Many thanks
    ✅ 1

    dry-beach-38304

    08/28/2025, 7:39 AM
    For anyone using the Metaflow Netflix Extensions: they've been updated to 1.3.0 (compatible with the new packaging framework introduced in Metaflow 2.16.0). Apologies for taking so long to update; a bug in Mamba 2.3.1 was preventing the tests from running correctly (2.3.2 was released two days ago) and I wanted to make sure things were not too broken. Lots of bug fixes and a few new features. It is not compatible with Metaflow < 2.16.0. More features coming soon as well.
    excited 1
    💯 1

    narrow-forest-28560

    08/27/2025, 11:12 PM
    Hey everyone, There’s a lot of encouraging new developments (company has been mostly using Metaflow<2.12). I wonder if it is now possible with some reusable utility interfaces, to wrap any single function with a decorator to make it into a runnable flow without having to write a whole separate flow script. Particularly, we have been using Runner API in Airflow DAGs to run flows on AWS Batch. Furthermore, it makes sense as part of this to be able to unit test such functions. I’m encouraged by the latest Metaflow 2.18 blog post, but if the entry point is as simple as an RESTful endpoint say via FastAPI, it would be an encouraging push to update to newer Metaflow versions (with internal training).

    square-wire-39606

    08/27/2025, 9:29 PM
    conditionals are now GA
    💯 1
    ❤️ 1
    🙌 1

    square-wire-39606

    08/27/2025, 9:28 PM
    old thread but conditionals are now GA

    square-wire-39606

    08/27/2025, 9:27 PM
    old thread, but finally we got around to implementing it

    calm-rainbow-82717

    08/26/2025, 7:41 PM
    Hey everyone, I have a question regarding CLI commands in Metaflow. Is there a way to add custom CLI commands after the Python file, e.g. `myflow.py`? I want to create something like `python myflow.py data check` and `python myflow.py data plan` next to the existing `python myflow.py run` and `python myflow.py show`. Any idea if it's possible to do this? I see it seems possible to use the metaflow-extension-template, and I also wonder if there's some other way to achieve the goal, like customizing step decorators using a generator function. Thanks in advance!
    ✅ 1
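Independently of Metaflow's extension mechanism, the command shape being described can be sketched with stdlib argparse. This is a standalone illustration of the desired `data check` / `data plan` layout, not how Metaflow actually wires its CLI:

```python
# Standalone argparse sketch of a nested "data check" / "data plan"
# subcommand layout; not Metaflow's actual CLI plumbing.
import argparse

parser = argparse.ArgumentParser(prog="myflow.py")
commands = parser.add_subparsers(dest="command")
data = commands.add_parser("data").add_subparsers(dest="subcommand")
data.add_parser("check")
data.add_parser("plan")

args = parser.parse_args(["data", "check"])
```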

    hundreds-receptionist-20478

    08/26/2025, 7:11 PM
    👋🏻 Hey everyone! I’m an experienced AI Agent Developer open to new projects or full-time roles. I specialize in building autonomous agents using GPT-4, LangChain, AutoGen, CrewAI, and other advanced frameworks. What I Do: • Autonomous research & data-gathering bots • Multi-agent systems for delegation & collaboration • AI assistants with memory, planning & tool use • Trading bots, IVR agents, customer support agents & more Tech Stack: • Python, TypeScript, Go, C++ • LangChain, LangGraph, AutoGen, ReAct, CrewAI • OpenAI, Claude, Hugging Face, Playwright, API integrations I'm especially interested in ambitious startups, Web3 projects, and next-gen AI tools. Feel free to reach out if you’re building something exciting — happy to chat!
    👋 2
    👋🏼 1

    great-egg-84692

    08/26/2025, 5:02 PM
    does metaflow support configuring `concurrencyPolicy` for Argo CronWorkflow? https://argo-workflows.readthedocs.io/en/latest/cron-workflows/
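On the Argo side, the field in question sits at the top level of the CronWorkflow spec (values `Allow`, `Forbid`, `Replace`); whether Metaflow's scheduling decorator surfaces it is exactly the open question. A sketch of the raw manifest, with placeholder workflow content:

```yaml
# Argo CronWorkflow sketch showing where concurrencyPolicy lives; the
# workflow content here is a placeholder.
apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata:
  name: example-cron
spec:
  schedule: "0 * * * *"
  concurrencyPolicy: Forbid   # Allow | Forbid | Replace
  workflowSpec:
    entrypoint: main
```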

    few-dress-69520

    08/26/2025, 11:17 AM
    I'm running into a problem when trying to resolve named environments with packages from a private PyPI repository (AWS CodeArtifact). I want to use pip environment variables to pass the extra index URL, e.g. through PIP_EXTRA_INDEX_URL. My understanding from other posts here is that what I'm trying to do should work. I've tried
    PIP_EXTRA_INDEX_URL=<url_with_temporary_token> metaflow environment resolve -r requirements.txt --alias test_env
    which fails already in the first step of resolving the environment. It just doesn't have access to the private repo and fails to resolve any private packages.
    ERROR: Could not find a version that satisfies the requirement <private-package>==0.1 (from versions: none)
    ERROR: No matching distribution found for <private-package>==0.1
    Strangely, when creating a pip.conf that contains the extra-index-url, it almost works. When running
    PIP_CONFIG_FILE=pip.conf metaflow environment resolve -r requirements.txt --alias test_env
    Metaflow is able to resolve the environment, including the private packages and their dependencies, but in the step where it downloads the packages from the web I get a `401 Client Error: Unauthorized for url:` for the private repo. It looks like, when downloading from the web, it no longer uses the pip.conf but instead tries to directly access the URL prepared earlier in the process (without the token), and hence fails. I see that there is some auth handling here, but it doesn't seem to do what's necessary for my use case. I'm using metaflow==2.15.21 and metaflow-netflixext==1.2.3.

    acoustic-river-26222

    08/23/2025, 6:03 PM
    Hi everyone!! I am running `netflixoss/metaflow_metadata_service:v2.4.12` for the UI service. On startup I get `"/opt/latest/bin/python3 -m services.ui_backend_service.ui_server": stat /opt/latest/bin/python3 -m services.ui_backend_service.ui_server: no such file or directory`. Do you know if the path of the container init script changed? 😁
    ✅ 1

    bland-garden-80695

    08/23/2025, 12:04 AM
    Hey all, while I work on testing the decorators with an agentic workflow, I had some general questions for the team. 1. Who are the primary users? Does Metaflow intend to keep catering to them, or expand to other professions/fields in the future? 2. In which direction is Metaflow moving at this stage? What is the vision for the product?
    ✅ 1

    adorable-truck-38791

    08/22/2025, 2:30 PM
    hey Metaflow team, I wanted to see whether it's feasible to version and track the packages built for each flow (not the runs, but the code backing the different runs). Basically, I'm in an org where people write code around statistics and other math functions but are not engineering-savvy. With this in mind, I wanted to see whether I could make something like the following: 1. I define a somewhat abstract flow that has static inputs and outputs 2. Someone else gives me a function that adheres to the static interface (a simple statistical function that takes some well-defined data inputs and produces some well-defined outputs) 3. I'm able to run the same higher-level flow, which handles the data inputs and the function outputs in a consistent way 4. I build some index of the packages built and run for the different custom functions, so that I can: a. Re-run a specific package with different data inputs b. Produce some index of the packages and runs The main goal here is to close the gap between the more engineer-y interface of Metaflow and the technical capabilities of people who are more math/statistics focused... but I think there are a few things that are hard: 1. Dynamically creating a `FlowSpec` subclass such that one of the steps calls whatever random function gets thrown into the mix 2. Managing the underlying packages backing the different runs I might be way over-complicating this, so I would appreciate any thoughts or pointers! I'm happy to dig into the code and work through some of the internal APIs for this. I do realize there are security concerns with executing arbitrary functions in this manner, but I think that is manageable in the environment we work in

    brash-gold-6157

    08/21/2025, 2:49 PM
    Hi all, I'm new to Metaflow and trying to help set up a PoC environment. We already have Argo Workflows deployed, and multiple Kubernetes clusters in Azure and on-prem. I don't really want to just run the pre-built Terraform config, as we already have many of the resources we need, including storage accounts etc. Can anyone point me to documentation on this? Should I be starting with the Helm charts here? https://github.com/outerbounds/metaflow-tools/tree/master/charts/metaflow Many thanks!
    ✅ 1