gentle-tomato-480
11/03/2025, 12:03 PM
abundant-laptop-47033
11/04/2025, 9:33 PM
gentle-tomato-480
11/05/2025, 2:23 PM
v0.9.0 got removed/deprecated for the flytectl-setup-action? I was using that in my CI/CD and it was still working last week.
Today I'm getting:
Error: Unable to find flytectl version "v0.9.0" for platform "Linux" and architecture "x86_64".
in my GHA workflow when running this action.
high-autumn-89220
11/10/2025, 5:08 PM
ClientSecret) auth type? Does anyone know if it will work without custom auth servers on our plan? I've been struggling with this for a few weeks to no avail.
wonderful-continent-24967
11/12/2025, 12:01 AM
Failed to write output for this execution to cache. I looked into the datacatalog logs for the corresponding flyte task, nothing unusual there. Datacatalog created, updated & deleted reservations for that task as it did for other tasks. We are using flyte 1.15.3.
fancy-hamburger-89099
11/12/2025, 10:10 AM
400 Bad Request
Request Header Or Cookie Too Large
nginx
I tried different browsers, incognito mode, wiping cookies, everything.
This only happens on that one instance, and it works without any issues on the other 3.
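A quick way to test whether oversized cookies are behind this response is to measure the Cookie header the browser sends to that instance. The sketch below assumes nginx is running with its default `large_client_header_buffers 4 8k`, under which a single header field larger than one 8 KiB buffer is rejected with 400; the actual limit on the affected instance may differ. The helper names are hypothetical, and the cookie values would be copied from the browser's dev tools.

```python
# Assumption: nginx default large_client_header_buffers is "4 8k", so one
# header field (name + value) must fit in a single 8 KiB buffer.
NGINX_DEFAULT_HEADER_BUFFER_BYTES = 8 * 1024


def cookie_header_size(cookies: dict) -> int:
    """Return the byte size of the Cookie header line built from `cookies`."""
    header = "Cookie: " + "; ".join(f"{name}={value}" for name, value in cookies.items())
    return len(header.encode("utf-8"))


def exceeds_limit(cookies: dict, limit: int = NGINX_DEFAULT_HEADER_BUFFER_BYTES) -> bool:
    """True if the Cookie header alone would not fit in one header buffer."""
    return cookie_header_size(cookies) > limit


def largest_cookies(cookies: dict, top: int = 3) -> list:
    """Return the `top` cookie names with the largest values, biggest first."""
    return sorted(cookies, key=lambda name: len(cookies[name]), reverse=True)[:top]
```

If the header is over the limit only on the failing instance, the usual suspects are large auth/session cookies scoped to that domain, or a lower buffer setting on that instance's ingress.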
Any ideas?
abundant-judge-84756
11/12/2025, 11:34 AM
ABORTING state and can't be fully terminated. The executions include a dynamic workflow step, and these dynamic workflows show 2 x tasks as RUNNING - the task descriptions specify they are initializing. I think these initializing dynamic tasks are somehow blocking the workflows from resolving the abort request. Any suggestions for ways we can trigger these workflows to transition to ABORTED? We're currently running flyte 1.15.3.
cool-waitress-85601
11/12/2025, 3:40 PM
pyflyte run --remote?
fierce-monitor-77717
11/13/2025, 12:20 PM
cool-waitress-85601
11/17/2025, 5:41 PM
mysterious-painter-66441
11/17/2025, 9:57 PM
dataclass) are displayed as a single opaque field rather than expanding into individual attributes. This makes it unclear to users what values are expected for each field.
Could you advise if there’s a recommended approach to make structured inputs more user-friendly in the UI? For example, is there a way to automatically expand fields or provide schema hints for structured types?
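As a stopgap while the UI shows the input as one opaque field, the expected attributes can be generated from the dataclass itself and pasted into the workflow's docstring or description. This is a minimal sketch using only the standard library; `TrainingConfig` and `describe_fields` are hypothetical names, and whether the generated text appears next to the input in the UI is an assumption rather than a documented Flyte feature.

```python
from dataclasses import MISSING, dataclass, fields


@dataclass
class TrainingConfig:
    """Hypothetical structured input used only for illustration."""

    learning_rate: float = 0.001
    epochs: int = 10
    dataset_uri: str = "s3://example-bucket/data"


def describe_fields(cls) -> str:
    """Return one 'name: type (default: ...)' line per dataclass field."""
    lines = []
    for f in fields(cls):
        # f.type is a class for plain annotations, or a string for postponed ones.
        type_name = getattr(f.type, "__name__", str(f.type))
        if f.default is not MISSING:
            default = f"default: {f.default!r}"
        elif f.default_factory is not MISSING:
            default = "default: built by factory"
        else:
            default = "required"
        lines.append(f"{f.name}: {type_name} ({default})")
    return "\n".join(lines)


if __name__ == "__main__":
    print(describe_fields(TrainingConfig))
```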
Thanks for your help!
brash-ram-89454
11/18/2025, 1:23 PM
cool-waitress-85601
11/18/2025, 8:49 PM
gray-ocean-43286
11/19/2025, 4:33 PM
cool-waitress-85601
11/19/2025, 4:45 PM
raw_output_data_config? For instance, will the user's local code be uploaded to the global or the project-scoped bucket when using fast registration? More generally, what data would go in the global bucket vs. the project-scoped bucket?
Thanks
early-addition-41415
11/20/2025, 10:21 PM
early-addition-41415
11/20/2025, 10:22 PM
early-addition-41415
11/20/2025, 10:23 PM
fancy-twilight-30247
11/21/2025, 10:12 AM
@task(
    task_config=task_config,
    cache=False,
    container_image=container_image,
    pod_template=pod_template,
    timeout=timeout,
    retries=max_retries,
)
def flyte_training_main_task():
    ...
with the task_config being (note that we don't really need the elastic part of things - we just need to launch a multi-node pytorch task):
task_config = Elastic(
    nnodes=num_nodes,
    nproc_per_node=8,
)
Now imagine that a rank in the distributed training has an error of some sort. Is there a way for us to configure our task so that the whole task/workflow is terminated (including all the pods corresponding to it) as soon as a single rank errors? Currently it seems that all the ranks have to exit/error before the task/workflow is terminated, which we often don't want (because other ranks might be stuck until the NCCL timeout or for other reasons). I've tried raising special exception types like SignalException or ChildFailedError, but it seems like it always waits until all the ranks exit. One hacky workaround I could think of is to manually terminate the workflow, but that also does not seem ideal.
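To make the desired behaviour concrete, here is the fail-fast pattern sketched with plain subprocesses: a supervisor polls its workers and tears all of them down as soon as one exits non-zero. This only illustrates the pattern and is not a flytekit or Elastic option; `run_fail_fast` is a hypothetical helper, and each command stands in for one rank.

```python
import subprocess
import time


def run_fail_fast(commands, poll_interval=0.1):
    """Run all commands in parallel; stop everything on the first failure.

    Returns 0 if every process exits cleanly, otherwise the exit code of the
    first failed process found (in the order the commands were given).
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]
    try:
        while True:
            codes = [p.poll() for p in procs]
            failed = [c for c in codes if c not in (None, 0)]
            if failed:
                return failed[0]
            if all(c == 0 for c in codes):
                return 0
            time.sleep(poll_interval)
    finally:
        # Terminate whatever is still running, then reap every process.
        for p in procs:
            if p.poll() is None:
                p.terminate()
        for p in procs:
            p.wait()
```

With one command that exits immediately with an error and another that sleeps, the call returns the error code right away instead of waiting for the sleeper, which is the behaviour being asked for at the task level.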
Thanks!!
numerous-hamburger-7178
11/25/2025, 11:45 PM
cool-waitress-85601
11/26/2025, 1:16 PM
aloof-magazine-44547
12/01/2025, 10:39 AM
thankful-lighter-72752
12/01/2025, 11:01 PM
proud-napkin-10936
12/03/2025, 11:16 AM
wooden-scooter-1097
12/03/2025, 9:50 PM
flytectl demo start, and the flyte-sandbox-xxx and flyteconnector-xxx services never get out of Pending state. The Docker (Rancher) logs show a few x509 CA cert issues as well as some "back-off" entries. Not sure what's going on. I am on the company VPN, which I'm not allowed to disable, so if it's a cert issue, I'm not sure how to get around it. --admin.insecure and --admin.insecureSkipVerify don't help. Ideas?
Sample log entries...
2025-12-03T21:43:57.491204261Z E1203 21:43:57.491163 68 pod_workers.go:1298] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"local-path-provisioner\" with ErrImagePull: \"failed to pull and unpack image \\\"docker.io/rancher/local-path-provisioner:v0.0.24\\\": failed to copy: httpReadSeeker: failed open: failed to do request: Get \\\"https://production.cloudflare.docker.com/registry-v2/docker/registry/v2/blobs/sha256/10/10ada9a7f8ab578464314da2df287d1d384c6ef9f474d00dc73bf232599df55f/data?expires=1764801238&signature=KC81Pwa1VNzUPyOJ089%2BQZbYlH4%3D&version=2\\\": tls: failed to verify certificate: x509: certificate signed by unknown authority\"" pod="kube-system/local-path-provisioner-84db5d44d9-q2chh" podUID="fad13c92-96bd-4cec-b19f-0e9ade5ffb19"
...
2025-12-03T21:44:05.221227848Z E1203 21:44:05.220969 68 pod_workers.go:1298] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"coredns\" with ImagePullBackOff: \"Back-off pulling image \\\"rancher/mirrored-coredns-coredns:1.10.1\\\"\"" pod="kube-system/coredns-6799fbcd5-27h25" podUID="1fa7b663-8c6b-492e-a816-d35a29e56e30"
fierce-oil-47448
12/03/2025, 11:54 PM
flytectl install instructions mention:
• curl -sL https://ctl.flyte.org/install | bash
This errors out on Ubuntu:
flyteorg/flyte info checking GitHub for latest tag
flyteorg/flyte crit unable to find '' - use 'latest' or see https://github.com/flyteorg/flyte/releases for details
handsome-lock-30336
12/04/2025, 5:00 PM
nice-hairdresser-45030
12/05/2025, 2:31 PM
inject-finalizer to at least get eventual consistency.
But my question is whether propeller not evaluating pods for hours in such a scenario is expected or unexpected. I know that I can shard propeller, but this would only help me if I break this down into multiple workflows? Are there any other parameters I can tune to have propeller evaluate the pods earlier?
Thank you!
melodic-mechanic-59879
12/06/2025, 10:45 PM
fierce-farmer-40956
12/08/2025, 12:29 PM
duplicate key value violates unique constraint "tasks_pkey" (SQLSTATE 23505)
and our executions are all in UNKNOWN state. Would you have a pointer?