fierce-policeman-48118
07/30/2025, 9:13 AMrapid-artist-48509
08/01/2025, 5:13 PMimage_spec.exist()
will be None
and should_build
goes to `False`: https://github.com/flyteorg/flytekit/blob/eb5a67f76aaef96a44bde04afb72b87592cc8b7a/flytekit/image_spec/image_spec.py#L479
• except the noop
builder relies on should_build
being True
in order to effectively overwrite the image name: https://github.com/flyteorg/flytekit/blob/eb5a67f76aaef96a44bde04afb72b87592cc8b7a/flytekit/image_spec/noop_builder.py#L9
Thus noop
builder will never work as intended if the launching shell does not have docker
available. I think that's not intended, because the noop
builder is meant to essentially skip docker (so ideally it would still work when the user environment does not have docker available, right?)freezing-tailor-85994
08/01/2025, 5:53 PMbfrench@LM-BFRENCH:~/Documents/Code/monorepo$ pyflyte register example/flyteify.py -p 'ml-example' -d 'staging' -v 'bf-2025-08-01'
Running pyflyte register from /Users/bfrench/Documents/Code/monorepo with images ImageConfig(default_image=Image(name='default', fqn='cr.flyte.org/flyteorg/flytekit', tag='py3.10-1.16.3', digest=None), images=[Image(name='default', fqn='cr.flyte.org/flyteorg/flytekit', tag='py3.10-1.16.3', digest=None)]) and image destination folder /root on 1 package(s) ('/Users/bfrench/Documents/Code/monorepo/example/flyteify.py',)
Registering against flyte.COMPANY.net
Detected Root /Users/bfrench/Documents/Code/monorepo/example, using this to create deployable package...
Loading packages ['flyteify'] under source root /Users/bfrench/Documents/Code/monorepo/example
No output path provided, using a temporary directory at /var/folders/8n/83r7h0kx0xbgnddnmbggtz_m0000gp/T/tmp7_mk_77l instead
AttributeError: 'str' object has no attribute 'labels'
This is code that normally runs fine, but Flyte is failing with an error that has almost no detail. Has anyone seen this before?acceptable-knife-37130
08/03/2025, 4:54 AMasyncio
was chosen over anyio
acceptable-knife-37130
08/03/2025, 4:56 AManyio
will provide better functionality compared to asyncio
.acceptable-knife-37130
08/03/2025, 4:57 AMacceptable-knife-37130
08/03/2025, 2:52 PMadmin:
endpoint: dns:///localhost:30080
insecure: true
image:
builder: local
task:
domain: development
project: flytesnacks
acceptable-knife-37130
08/03/2025, 2:53 PM# hello.py
# /// script
# requires-python = "==3.13"
# dependencies = [
# "flyte==0.2.0b23",
# ]
# ///
import flyte
# A TaskEnvironment provides a way of grouping the configuration used by tasks.
env = flyte.TaskEnvironment(name="hello_world", resources=flyte.Resources(memory="250Mi"))
# Use a TaskEnvironment to define tasks, which are regular Python functions.
@env.task
def fn(x: int) -> int: # Type annotations are recommended.
slope, intercept = 2, 5
return slope * x + intercept
# Tasks can call other tasks.
# All tasks defined with a given TaskEnvironment will run in their own separate containers,
# but those containers will all be configured identically.
@env.task
def main(x_list: list[int]) -> float:
x_len = len(x_list)
if x_len < 10:
raise ValueError(f"x_list doesn't have a larger enough sample size, found: {x_len}")
# flyte.map is like Python map, but runs in parallel.
y_list = list(flyte.map(fn, x_list))
y_mean = sum(y_list) / len(y_list)
return y_mean
# Running this script locally will perform a flyte.run, sending your task code to your remote Union/Flyte instance.
if __name__ == "__main__":
# Establish a remote connection from within your script.
flyte.init_from_config("config.yaml")
#flyte.init(project="flytesnacks", domain="development", endpoint="localhost:30080",insecure=True,insecure_skip_verify=False)
# Run your tasks remotely inline and pass parameter data.
run = flyte.run(main, x_list=list(range(10)))
# x= flyte.with_runcontext(mode="local").run(main, x_list=list(range(10)))
# print(x)
# Print various attributes of the run.
print(run.name)
print(run.url)
# Stream the logs from the remote run to the terminal.
run.wait(run)
acceptable-knife-37130
08/04/2025, 1:16 PMacceptable-knife-37130
08/04/2025, 1:16 PMcuddly-napkin-839
08/04/2025, 2:01 PMFile "/usr/local/lib/python3.12/site-packages/flytekit/core/data_persistence.py", line 614, in async_get_data
raise FlyteDownloadDataException(
flytekit.exceptions.system.FlyteDownloadDataException: SYSTEM:DownloadDataError: error=Failed to get data from s3://*****/test-project-01/development/YR7RMXMIOOCGYZIJKBIO2N4KUI======/fast6c01ca0737d31ff994073617f3ac5dec.tar.gz to /root/ (recursive=False).
Original exception: Unable to locate credentials
.
If I understand correctly, the difference from the workflow registration process is that the user context has changed, right?
But now I've been stuck for two days because I can't figure out the right place and best practice for providing the credentials to the workflow. Do I have to create a k8s ServiceAccount and connect it to the launch plan? And what is the right way to attach the credentials to the SA? The documentation is very focused on hyperscaler usage and less on on-prem setups.
I'm thankful for any kind of help.
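One hedged on-prem pattern (an illustration, not a confirmed best practice; the secret name, keys, and namespace are assumptions inferred from the error above): keep the object-store credentials in a Kubernetes Secret in the project-domain namespace and inject them into the task pods as environment variables, for example through a default PodTemplate or the task's pod spec.
# Illustrative only: secret name, namespace and keys are assumptions.
apiVersion: v1
kind: Secret
metadata:
  name: s3-credentials
  namespace: test-project-01-development   # Flyte's <project>-<domain> namespace
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: "<access-key>"
  AWS_SECRET_ACCESS_KEY: "<secret-key>"
With the Secret in place, the task pods still need to reference it (e.g. via envFrom/secretRef) so that flytekit's data persistence layer can reach the object store.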
gentle-night-59824
08/05/2025, 3:42 PMPodTemplate
resource in our cluster, and added an item to initContainers
within the spec
• I use pod_template_name
in the task decorator to route it to the above resource
• I additionally specify another template via pod_template
in the task decorator, just to add some extra tolerations
• and when I launch the task, I do see its pod get most of the fields from the PodTemplate
resource like the volumes and mounts, but it just doesn't seem to merge in the initContainers
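For reference, a minimal sketch of the combination described above (the template name, toleration values, and task body are illustrative assumptions):
from flytekit import task, PodTemplate
from kubernetes.client import V1PodSpec, V1Toleration

# Inline template that only adds tolerations; everything else is expected to come
# from the cluster-level PodTemplate referenced by name below.
extra_tolerations = PodTemplate(
    pod_spec=V1PodSpec(
        containers=[],
        tolerations=[
            V1Toleration(key="dedicated", operator="Equal", value="special", effect="NoSchedule"),
        ],
    )
)

@task(pod_template_name="my-cluster-pod-template", pod_template=extra_tolerations)
def my_task() -> None:
    ...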
happy-parrot-43932
08/06/2025, 2:15 AMbland-dress-83134
08/06/2025, 2:20 PMPodTemplate
for this task had /bin/bash -c
as the command
(where arguments
was the pyflyte-fast-execute ...
etc; see the sketch at the end of this message)
[...]
/usr/local/lib/python3.10/dist-packages/flytekit/bin/entrypoint.py:754 in │
│ fast_execute_task_cmd │
│ │
│ ❱ 754 │ p = subprocess.Popen(cmd, env=env) │
│ │
│ /usr/lib/python3.10/subprocess.py:971 in __init__ │
│ │
│ ❱ 971 │ │ │ self._execute_child(args, executable, preexec_fn, close_f │
│ │
│ /usr/lib/python3.10/subprocess.py:1733 in _execute_child │
│ │
│ ❱ 1733 │ │ │ │ executable = args[0] │
╰──────────────────────────────────────────────────────────────────────────────╯
IndexError: list index out of range
• I was happy to find and correct the cause, but I was also confused because another domain for this same project was using the same PodTemplate
and didn't have the `command`: I confirmed that the configured PodTemplate
in my cluster_resource_manager
config did not have this command
either. I can't figure out when I added or removed this command
but I'm assuming I removed it at some point: I'm guessing the cluster resource manager / syncresources can't detect removed keys from a PodTemplate? Is that a known limitation?
◦ I resolved this by doing a manual kubectl edit podtemplate...
and removing the command
entirely, but curious whether it's worth reporting this
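For illustration, a hedged reconstruction (not the actual resource) of the kind of PodTemplate fragment described in the first bullet:
apiVersion: v1
kind: PodTemplate
metadata:
  name: my-project-template          # illustrative name
template:
  spec:
    containers:
      - name: default
        image: placeholder           # illustrative; the task's image is what actually runs
        command: ["/bin/bash", "-c"] # with this set, the generated pyflyte-fast-execute command
                                     # lands in args, which is what led to the IndexError above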
acceptable-knife-37130
08/06/2025, 4:05 PM# Current Implementation
run = flyte.run(main, x_list=list(range(10)))
# Suggestion
run = flyte.run(main, x_list=list(range(10)), logger=logger, log_level=log_level)
Does it sound fair if we could pass a logger to the function, so we can get a detailed trace and place the log file in a location of our choice?
We would also need a log_level [INFO, DEBUG, ERROR] to control the verbosity.
The Flyte CLI (flyte --help) has flyte -vvv get logs <run-name>, but I don't see anything like that for Python code.salmon-truck-59834
08/06/2025, 4:43 PMcontainer_image
parameter like regular Flyte tasks. Is there a recommended approach for running Flyte containerized tasks on Slurm?acceptable-noon-24676
08/07/2025, 5:38 AMwhite-island-91320
08/07/2025, 8:02 AMcrooked-holiday-38139
08/07/2025, 11:07 AMpodRFC3339StartTime
and podRFC3339FinishTime
but the problem we have is that Grafana expects RFC3339 datetimes with millisecond precision, so our links don't currently work, as podRFC3339StartTime reads pod.CreationTimestamp and the creation timestamp only has second precision.
The way we've "solved" this is to instead use the unix timestamp and pad it with zeroes:
&from={{ .podUnixStartTime }}000&to={{ .podUnixFinishTime }}000
Hopefully, that helps folks who are using Grafana for their logging.
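For context, a sketch of where a snippet like that might live (assuming the usual propeller log-link template config; the URL and display name are illustrative):
plugins:
  logs:
    templates:
      - displayName: "Grafana"
        templateUris:
          - "https://grafana.example.com/explore?orgId=1&from={{ .podUnixStartTime }}000&to={{ .podUnixFinishTime }}000"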
acceptable-knife-37130
08/07/2025, 4:39 PMshy-morning-17240
08/07/2025, 6:47 PM@task(..., enable_deck=True)
and map_task(my_function, concurrency=some_number, enable_deck=True)
, my tasks generated using map_task still don't show a Deck button to show rendered outputs for each deck.
I reworked my code to remove other complexities like @dynamic workflows and the use of functools to define constant parameters, but even when defining a @workflow that calls map_task(task_function), the Decks don't show up in the Flyte UI.
Do I have to return a Deck object from the mapped task to get it to show up in the Flyte dashboard? Is there another way to access the Decks from each of these mapped tasks that's not through the executed workflow (e.g. some Kubernetes hack, another place in the Flyte UI)? Is there something I need to consider/add in the Flyte Helm configuration file?
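For reference, a stripped-down sketch of the setup described above (names and HTML content are illustrative; creating the Deck inside the task body, rather than returning it, is the usual way decks are produced):
from flytekit import task, workflow, map_task, Deck

@task(enable_deck=True)
def render_item(x: int) -> int:
    # The deck is created during execution; it is not a return value.
    Deck("preview", f"<h3>item {x}</h3>")
    return x * 2

@workflow
def wf(xs: list[int]) -> list[int]:
    # concurrency value is illustrative
    return map_task(render_item, concurrency=4)(x=xs)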
acceptable-knife-37130
08/08/2025, 4:33 AMmammoth-quill-44336
08/08/2025, 10:12 PMflat-monkey-49105
08/09/2025, 7:11 AMfrom dataclasses import dataclass
from typing import Annotated
from flytekit import Cache, HashMethod, StructuredDataset, task, workflow
import pandas as pd
import logging
@dataclass
class Data:
metadata: str
df: StructuredDataset
def hash_pandas_dataframe(df: pd.DataFrame) -> str:
return str(pd.util.hash_pandas_object(df))
def hash_data(data: Data) -> str:
# I cannot access the pd.Dataframe in the hash function?
return str(pd.util.hash_pandas_object(data.df.open(pd.DataFrame).all()))
@task
def generate_data_a() -> Annotated[Data, HashMethod(hash_data)]:
data = Data(
metadata="hello",
df=StructuredDataset(pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})),
)
return data
@task(cache=Cache(version="1.3"))
def process_data_a(data: Data) -> bool:
logging.error(f"process_data_a: {data.df.open(pd.DataFrame).all()}")
return True
@task
def generate_data_b() -> Annotated[pd.DataFrame, HashMethod(hash_pandas_dataframe)]:
return pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
@task(cache=Cache(version="1.3"))
def process_data_b(data: pd.DataFrame) -> bool:
logging.error(f"process_data_b: {data}")
return True
@workflow
def cache_workflow() -> None:
# With the custom hashMethod for the `Data` object it crashes but perhaps that is also not the correct way to do it
data_a = generate_data_a()
process_data_a(data_a)
# This caches correctly
data_b = generate_data_b()
process_data_b(data_b)
return
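A hedged sketch of one possible workaround (an unverified assumption about flytekit internals: at hash time the StructuredDataset inside the dataclass still holds the in-memory dataframe, so hashing that directly avoids the .open() call on a dataset that may not be uploaded yet):
def hash_data_local(data: Data) -> str:
    # Hash the raw pandas DataFrame that was passed to StructuredDataset(...),
    # instead of opening a dataset that may not have been materialized remotely.
    return str(pd.util.hash_pandas_object(data.df.dataframe).sum())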
bored-fountain-1423
08/11/2025, 6:47 AMfreezing-tailor-85994
08/11/2025, 4:01 PM
pipelines
  src
    tasks
      mytasks.py
    workflows
      myworkflow.py
    utils
      pipeline_utils.py
  shared
    src
      some_utils.py
mytasks.py
has the import from shared.src.some_utils import util
which runs fine on local but when I package up using pyflyte register pipelines/src/workflows/
from the root of the monorepo, the packed tarball only includes pipeline/src/tasks
, pipeline/src/utils/
and pipeline/src/workflows
but not shared
despite the references, so I get immediate crashes. Has anyone else seen/solved this problem?
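One hedged thing to try (assuming pyflyte register accepts multiple package paths, as its PACKAGE_OR_MODULE... argument suggests): register both trees in one call so the computed source root covers shared as well, e.g.
pyflyte register pipelines/src/workflows/ shared/
(adjust the second path to wherever shared/ actually lives relative to the monorepo root).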
hallowed-barista-69501
08/11/2025, 4:50 PMflytekitplugins-omegaconf
plugin. I’m unsure about the best pattern for loading configs into workflows. It seems like I either need:
1. A Python entry point that builds the DictConfig
before calling the workflow
2. A Flyte task inside the workflow that builds/loads the config
Is there some established best practice pattern here? With the Python entry point approach it looks like you can’t run remotely with pyflyte run ...
and instead have to python run.py
cc: @acoustic-oyster-33294
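For what it's worth, a hedged sketch of option 2 (a config-loading task inside the workflow), assuming flytekitplugins-omegaconf registers a type transformer so DictConfig can be passed between tasks; the paths and fields are illustrative:
from flytekit import task, workflow
from omegaconf import DictConfig, OmegaConf

@task
def load_config(path: str) -> DictConfig:
    # e.g. a config file shipped inside the task image; the path is illustrative
    return OmegaConf.load(path)

@task
def train(cfg: DictConfig) -> float:
    return float(cfg.lr)

@workflow
def pipeline(config_path: str = "configs/train.yaml") -> float:
    cfg = load_config(path=config_path)
    return train(cfg=cfg)
This keeps the workflow runnable with pyflyte run, since the config is materialized by a task rather than by a Python entry point.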
acceptable-noon-24676
08/12/2025, 8:53 AMearly-napkin-90297
08/13/2025, 11:57 AMpyflyte package
in our deployments.
The setup:
• using flytekit==1.16.1
• tasks are using ImageSpec
containers without specifying any of the source_root
, copy
or source_copy_mode
args (i.e. using the defaults)
When changing code or dependencies in one of the tasks, pyflyte package
generates new image tags and rebuilds every ImageSpec
container, even the ones not affected by the code/deps changes.
What's the recommended strategy to ensure that the ImageSpec
tag only depends on the relevant task code and dependencies?
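A hedged sketch of one direction (parameter semantics assumed from the ImageSpec docs, not verified against 1.16.1 behaviour): give each task's ImageSpec only the dependencies and source files it actually needs, so its computed tag stops changing when unrelated code changes.
from flytekit.image_spec import ImageSpec

task_a_image = ImageSpec(
    name="task-a",
    registry="ghcr.io/my-org",                 # illustrative registry
    packages=["pandas==2.2.2"],                # only this task's dependencies
    copy=["pipelines/src/tasks/task_a.py"],    # only this task's source files
)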
gray-machine-6182
08/14/2025, 7:40 PM