# flyte-support
  • r

    rapid-artist-48509

    10/28/2025, 1:48 AM
    dumb q: if I want to backup / restore the flyte postgres db, do I just treat it like any old postgres DB that's used by a webapp? like I can just pg_dump / pg_restore? (ref https://flyte-org.slack.com/archives/C06H1SFA19R/p1761615884646699?thread_ts=1761615882.693109&cid=C06H1SFA19R )
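A minimal sketch of the plain pg_dump / pg_restore route the question describes, assuming the Flyte database really can be treated like any other Postgres database (the linked thread may say otherwise). The database name `flyteadmin`, the host, the user, and the dump file name are placeholders, not values confirmed in this channel. The helpers only build the command lines; hand them to `subprocess.run` to execute them.

```python
import shlex


def pg_dump_cmd(dbname, outfile, host="localhost", user="postgres"):
    """Build a pg_dump command that writes a custom-format archive."""
    return [
        "pg_dump",
        "--host", host,
        "--username", user,
        "--format", "custom",  # custom-format archives are read by pg_restore
        "--file", outfile,
        dbname,
    ]


def pg_restore_cmd(dbname, infile, host="localhost", user="postgres"):
    """Build a pg_restore command that replaces existing objects."""
    return [
        "pg_restore",
        "--host", host,
        "--username", user,
        "--dbname", dbname,
        "--clean",      # drop objects before recreating them
        "--if-exists",  # do not error when an object is not there yet
        infile,
    ]


if __name__ == "__main__":
    # Placeholder names (assumptions); print the commands instead of running them.
    print(shlex.join(pg_dump_cmd("flyteadmin", "flyteadmin.dump")))
    print(shlex.join(pg_restore_cmd("flyteadmin", "flyteadmin.dump")))
```

Stopping the Flyte services that write to the database before restoring is probably wise, so the restore does not race with live writes.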
  • a

    average-secretary-61436

    10/28/2025, 2:25 PM
    sometimes we pass around FlyteDirectories and FlyteFiles that need to be fully downloaded to specific locations for cli applications to use. Are there builtin functions to do this sort of thing?
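One possible pattern, sketched without importing flytekit so it stays self-contained: flytekit's FlyteFile and FlyteDirectory each expose a `download()` method that returns the local path of the fetched data. The copy step and the helper name `materialize` are assumptions for illustration, not a built-in Flyte function.

```python
import shutil
from pathlib import Path


def materialize(remote_obj, dest):
    """Download a FlyteFile/FlyteDirectory-like object and copy it to dest.

    remote_obj only needs a download() method returning a local path,
    which is how this sketch assumes FlyteFile and FlyteDirectory behave.
    """
    local = Path(remote_obj.download())
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    if local.is_dir():
        # dirs_exist_ok lets the target directory already exist
        shutil.copytree(local, dest, dirs_exist_ok=True)
    else:
        shutil.copy2(local, dest)
    return dest
```

Inside a task this could be called as `materialize(my_dir, "/opt/tool/input")` (a hypothetical path) just before invoking the CLI application.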
  • c

    cool-nest-98527

    10/28/2025, 4:39 PM
    ❔ Does anyone know how to delete local credentials created with the CLI PKCE auth, to force another log in while testing? TIA 🙏
  • l

    little-cricket-84530

    10/28/2025, 11:10 PM
    Hey folks, does Flyte have the ability to ensure that a given workflow can have only 1 instance running, irrespective of inputs (i.e. I can’t rely on caching)?
  • f

    few-angle-62167

    10/29/2025, 10:48 AM
    Hi! I am currently trying to use the flyteconnector for BigQuery, but I get an error when opening the resulting StructuredDataset with the following code:
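    import pandas as pd
    from flytekit import task, workflow
    from flytekit.types.structured import StructuredDataset
    from flytekitplugins.bigquery import BigQueryConfig, BigQueryTask

    # BigQuery task executed by the connector; returns a StructuredDataset.
    bq_template = BigQueryTask(
        name="<name>",
        inputs={},
        query_template="SELECT * FROM <project_id>.<dataset_id>.<table>",
        output_structured_dataset_type=StructuredDataset,
        task_config=BigQueryConfig(ProjectID="<project_id>"),
    )

    # Regular Python task that loads the query result into pandas.
    # image_name is defined elsewhere.
    @task(
        container_image=image_name,
    )
    def convert_bq_table_to_pandas_dataframe(ds: StructuredDataset) -> pd.DataFrame:
        return ds.open(pd.DataFrame).all()

    @workflow
    def full_bigquery_wf() -> pd.DataFrame:
        ds = bq_template()
        return convert_bq_table_to_pandas_dataframe(ds=ds)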
    What happens is that the BigQuery task queries data from BigQuery using the flyteconnector service account, but when the Python task then tries to load the result as a pandas DataFrame, it fails with:
    google.api_core.exceptions.PermissionDenied: 403 Access Denied: Dataset <project_id>:<job_id>: User does not have permission to access results of another user's job.
    I have already deployed flyteconnector and enabled the plugin as the documentation describes. Any help would be greatly appreciated :).
  • m

    mysterious-painter-66441

    10/30/2025, 3:37 PM
    Hi, could you please give an example of fetching a workflow from the cluster and then creating and registering a launch plan for that fetched workflow? Would that be possible?
  • s

    strong-soccer-41351

    10/31/2025, 11:27 AM
    Hi team, is there any documentation on configuring a flyte-core helm deployment using a custom MinIO S3 bucket, without IAM configuration? There are helm chart parameters to pass in accessKey and secretKey, but we want to avoid baking long-term credentials into our source code. I checked all the pages under https://docs-legacy.flyte.org/en/latest/deployment/deployment/index.html and https://www.union.ai/docs/v1/flyte/deployment/flyte-deployment/. I also checked the example flyte core chart and read its README.md, but I haven't seen whether there are alternatives to, or other usages for, the accessKey and secretKey fields.
  • g

    gentle-tomato-480

    11/03/2025, 11:25 AM
    Hey everyone, I'm looking into making my workflows more reliable by limiting how many executions can run concurrently. I've looked in the docs but haven't found a configuration that manages this. I'm ok with any parallelism within workflow executions. My main goal is to handle peaks of incoming data (which trigger new workflow executions) better, and to prevent rising costs and jobs failing due to timeouts, limited resources, or the cluster failing to scale. So far I've been looking at https://www.union.ai/docs/v1/flyte/deployment/flyte-configuration/performance/#1-workers-the-workqueue-and-the-evaluation-loop and https://www.union.ai/docs/v1/flyte/deployment/configuration-reference/scheduler-config/#queue-configcompositequeueconfig
  • g

    gentle-tomato-480

    11/03/2025, 12:03 PM
    Btw, the links in the https://www.union.ai/docs/v1/flyte/deployment/configuration-reference/ subpages (scheduler, datacatalog, flyteadmin, propeller) don't point to the page sections but instead refer back to https://www.union.ai/docs/v1/flyte/deployment/configuration-reference/
  • a

    abundant-laptop-47033

    11/04/2025, 9:33 PM
    Hello! Is there a plan to release a 1.16 patch with this fix? We would love to try it out when it's available!
  • g

    gentle-tomato-480

    11/05/2025, 2:23 PM
    Did flytectl v0.9.0 get removed/deprecated for the flytectl-setup-action? I was using that in my CI/CD and it was still working last week. Today I'm getting:
    Error: Unable to find flytectl version "v0.9.0" for platform "Linux" and architecture "x86_64".
    in my GHA workflow when running this action.
  • h

    high-autumn-89220

    11/10/2025, 5:08 PM
    hey all, I'm trying to get flyte working with okta for user + machine-to-machine auth. has anyone been able to make okta work with the Client Credential (ClientSecret) auth type? does anyone know if it will work without custom auth servers on our plan? been struggling with this for a few weeks to no avail
  • w

    wonderful-continent-24967

    11/12/2025, 12:01 AM
    What could be potential reasons for a cache write error in a Flyte task? I am seeing this error in the Flyte console: "Failed to write output for this execution to cache." I looked into the datacatalog logs for the corresponding Flyte task and found nothing unusual. Datacatalog created, updated, and deleted reservations for that task just as it did for other tasks. We are using Flyte 1.15.3.
  • f

    fancy-hamburger-89099

    11/12/2025, 10:10 AM
    Hi, I am facing a very strange issue, and I am out of ideas. We have 4 instances of Flyte, all of which are configured the same and run the latest version. Each of them is running on a different cluster, and we route traffic using Ingress Nginx Controller, which is configured in exactly the same way on all clusters. All instances use Azure AD SSO, and all use the same App Registration/credentials. However, for some reason, one of these 4 instances does not work. When I access the URL, I get to the login page and successfully log in using Azure AD SSO, but after that every request fails with a 400 error:
    400 Bad Request
    Request Header Or Cookie Too Large
    nginx
    I tried different browsers, incognito mode, wiping cookies, everything. This only happens on that one instance, and it works without any issues on the other 3. Any ideas?
  • a

    abundant-judge-84756

    11/12/2025, 11:34 AM
    Hi! 👋 We're running into an issue where executions are stuck in an ABORTING state and can't be fully terminated. The executions include a dynamic workflow step, and these dynamic workflows show 2 x tasks as RUNNING - the task descriptions specify they are initializing. I think these initializing dynamic tasks are somehow blocking the workflows from resolving the abort request. Any suggestions for ways we can trigger these workflows to transition to ABORTED? We're currently running flyte 1.15.3.
  • c

    cool-waitress-85601

    11/12/2025, 3:40 PM
    Hi, is there a way to use podman instead of docker to build images when running pyflyte run --remote?
  • f

    fierce-monitor-77717

    11/13/2025, 12:20 PM
    Hi, is there any plan to support Python 3.13/3.14 in flytekit anytime soon?
  • c

    cool-waitress-85601

    11/17/2025, 5:41 PM
    Hi everyone, I'm desperately trying to set up flyte-core with an s3 bucket and provide my access key and secret key via a secret. I can't find how to do that: the documentation isn't clear on what form that secret should take, and the AI bot isn't helping, as it gives contradictory and false information. Can someone please provide an example? Thanks a lot
  • m

    mysterious-painter-66441

    11/17/2025, 9:57 PM
    Hi Flyte Team, I noticed that in Flyte UI, workflow inputs defined as structured types (e.g., dataclass) are displayed as a single opaque field rather than expanding into individual attributes. This makes it unclear to users what values are expected for each field. Could you advise if there’s a recommended approach to make structured inputs more user-friendly in the UI? For example, is there a way to automatically expand fields or provide schema hints for structured types? Thanks for your help!
  • b

    brash-ram-89454

    11/18/2025, 1:23 PM
    Just a heads up that the Flyte v1 docs are down at the moment: https://www.union.ai/docs/v1/flyte/user-guide/
  • c

    cool-waitress-85601

    11/18/2025, 8:49 PM
    Hi! I'm trying to figure out if/how it's possible to set up flyte for multi-tenancy, i.e. isolate tenants' workloads in separate namespaces without sharing/mounting any global secret, thus relying only on tenant-scoped secrets. Ideally, tenant workloads would run under the tenant's namespace. While there seems to be a way to have one propeller per tenant, thus enabling true parallelism, IIUC there doesn't seem to be any way to isolate metadata per tenant, since there's a single s3 configuration shared by admin and all propellers/task executions. That means sharing the bucket secret with all tenants, which wouldn't fit our requirements. Does anybody have any experience/recommendations to share? Thanks a lot
  • g

    gray-ocean-43286

    11/19/2025, 4:33 PM
    Hello Gents, I am currently working on a Flyte to AWS SageMaker integration and facing problems with the idempotence_token in the create_sagemaker_deployment method in the flytekitplugin-awssagemaker_inference plugin version 1.16.1. I am currently testing model, endpoint config and endpoint deployment using the Flyte SageMaker plugin and passing idempotence_token=False to the create_sagemaker_deployment method. But the endpoint_config deployment task still keeps expecting the idempotence_token field in its input (which is the model_creation task's output). Copilot keeps saying this is a known bug and that I need to set it to True in order to resolve it. But when I set it to True, the model_creation task itself fails in Flyte and gives me an error like so - failed to do boto task with error: Could not find the key model_name}-{idempotence_token in {'model_path': 's3://s3-bucket/models/xgboost-model.tar.gz', 'execution_role_arn': 'arn:aws:iam::account-id:role/app-flyte-sagemaker-executor-role', 'model_name': 'xgboost-diabetes-endpoint-model'}. Having a tough time figuring this one out. I have tried multiple approaches but all in vain. Does anyone know what this is all about?
  • c

    cool-waitress-85601

    11/19/2025, 4:45 PM
    Hello folks, do you know what metadata goes into the project/domain-specific bucket vs the global bucket when you use raw_output_data_config? For instance, will the user's local code be uploaded to the global or the project-scoped bucket when using fast registration? More generally, what data would go in the global bucket vs the project-scoped bucket? Thanks
  • e

    early-addition-41415

    11/20/2025, 10:21 PM
    in flyte-binary, if you are not on aws, is there a way to provide access keys using secrets in helm values, so that aws can be accessed from somewhere else?
  • e

    early-addition-41415

    11/20/2025, 10:22 PM
    specifically here https://github.com/flyteorg/flyte/blob/master/charts/flyte-binary/values.yaml#L85-L87
  • e

    early-addition-41415

    11/20/2025, 10:23 PM
    we need to use authtype as accesskey
  • f

    fancy-twilight-30247

    11/21/2025, 10:12 AM
    Hey everyone- I have a question about running multi-node pytorch workflows and error/exception handling. We're currently defining our training task as something like this:
    @task(
        task_config=task_config,
        cache=False,
        container_image=container_image,
        pod_template=pod_template,
        timeout=timeout,
        retries=max_retries,
    )
    def flyte_training_main_task():
      ...
    with the task_config being (note that we don't really need the elastic part of things - we just need to launch a multi-node pytorch task):
    task_config = Elastic(
        nnodes=num_nodes,
        nproc_per_node=8,
    )
    Now imagine that a rank in the distributed training has an error of some sort - is there a way for us to configure our task so that the whole task/workflow is terminated (including all the pods corresponding to it) as soon as a single rank errors? Currently it seems like all the ranks have to exit/error before the task/workflow is terminated, which we often don't want (because other ranks might be stuck until the NCCL timeout or might be stuck for other reasons). I've tried raising special exception types like SignalException or ChildFailedError, but it seems like it always waits until all the ranks exit. One hacky workaround I could think of is to manually terminate the workflow, but that also does not seem ideal. Thanks!!
  • n

    numerous-hamburger-7178

    11/25/2025, 11:45 PM
    Do newer versions of flyte have pydantic inputs to workflows/tasks show up as something other than structs? I've been using dataclassjsonmixin to get well-formatted input in the UI and want to try switching over to pydantic, but on a flyte 1.16.2 deployment an example wf still shows up as a struct.
  • c

    cool-waitress-85601

    11/26/2025, 1:16 PM
    Hi folks, has anybody tried using Dex as the external authorization server? I'd be interested to hear about it. Thanks
  • a

    aloof-magazine-44547

    12/01/2025, 10:39 AM
    Hi, can I get some help to merge https://github.com/flyteorg/flytekit/pull/3339? It's about serialising and deserialising models with FlyteFile/FlyteDirectory in them, which causes an attribute error. cc @swift-oil-78197