billions-hairdresser-78656
04/24/2025, 9:52 PM
<jemalloc>: MADV_DONTNEED does not work (memset will be used instead)
<jemalloc>: (This is the expected behavior if you are running under QEMU)
Running Execution on Remote.
Request rejected by the API, due to Invalid input.
RPC Failed, with Status: StatusCode.INVALID_ARGUMENT
Details: Requested MEMORY default [2Gi] is greater than current limit set in the platform configuration [1Gi]. Please contact Flyte Admins to change these limits or consult the configuration
Researching, I see that there are several ConfigMaps that can be modified:
• flyte-admin-base-config
• flyte-admin-clusters-config
• flyte-clusterresourcesync-config
Could you point me to docs on how to correctly configure these parameters?
curved-whale-1505
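For reference, the "Requested MEMORY default [2Gi] is greater than current limit [1Gi]" error comes from flyteadmin's platform-wide task resource settings. A hedged sketch of the relevant configuration fragment follows; the key names are from flyteadmin's config schema, but the exact placement (flyte-admin-base-config vs. Helm values for flyte-core or flyte-binary) varies by deployment, so treat the surrounding structure as an assumption to verify:

```yaml
# Hypothetical flyteadmin config fragment: raises the per-task resource
# defaults and limits that flyteadmin enforces when launching executions.
task_resources:
  defaults:
    cpu: 500m
    memory: 1Gi
  limits:
    cpu: "2"
    memory: 4Gi   # must be >= any task's requested memory (e.g. 2Gi above)
```

After changing the ConfigMap, flyteadmin typically needs a restart to pick up the new limits.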
04/25/2025, 1:21 AM
clean-glass-36808
04/25/2025, 3:07 AM
cuddly-engine-34540
04/25/2025, 12:31 PM
@workflow
def data_processing_workflow(
    trigger_file_s3_uri: str,
    triggering_timestamp: str,
    hub_name: str,
    environment: str,
    poll_interval: int = 60,
    poll_timeout: int = 7200,
    sleep_time: int = 0
) -> None:
    wait_until_timestamp_task(
        triggering_timestamp=triggering_timestamp,
        sleep_time=sleep_time
    )
    should_launch, processing_params = validate_and_get_params_task(
        trigger_file_s3_uri=trigger_file_s3_uri,
        hub_name=hub_name,
        environment=environment
    )
    ...
Notice that validate_and_get_params_task does not receive any inputs from wait_until_timestamp_task, which outputs None.
def wait_until_timestamp_task(
    triggering_timestamp: str,
    sleep_time: int = 120
) -> None:
    ...
Currently validate_and_get_params_task and wait_until_timestamp_task run in parallel inside data_processing_workflow.
How can I make sure validate_and_get_params_task runs only after wait_until_timestamp_task, even though the former does not receive inputs from the latter?
It worked to alter wait_until_timestamp_task to output a dummy string and then have validate_and_get_params_task receive that dummy string as input, but it seems hacky. Is there another way?
freezing-tailor-85994
04/25/2025, 6:03 PM
echoing-park-83350
04/25/2025, 9:23 PM
I can run the workflow with the following pyflyte command, relative to the project root, in my terminal:
pyflyte run --remote --project some-project --domain development tests/regression/workflows/test_deidentify_workflow.py test_deidentify_clinical_note_file --storage_account 'abc12345' --base_path 'container_name/validation/clinical_note/'
However, when trying to run this same workflow against the remote cluster from a local Jupyter notebook using flytekit.FlyteRemote (the notebook file is located in the root/tests/ directory of my project) with the following code:
from flytekit import Config, FlyteRemote
from tests.regression.workflows.test_deidentify_workflow import test_deidentify_clinical_note_file
remote = FlyteRemote(
    config=Config.auto(),
    default_project="some-project",
    default_domain="development",
    interactive_mode_enabled=True,
)
remote.fast_register_workflow(entity=test_deidentify_clinical_note_file)
execution = remote.execute(test_deidentify_clinical_note_file, inputs={"storage_account": "abc12345","base_path": "container-name/validation/clinical_note/"}, wait=True)
print(execution.outputs)
I can see the attempted workflow execution in the UI, but it results in the following error:
FlyteAssertion: USER:AssertionError: error=Outputs could not be found because the execution ended in failure. Error
message: Trace:
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/flytekit/bin/entrypoint.py", line 163, in _dispatch_execute
task_def = load_task()
^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/flytekit/bin/entrypoint.py", line 578, in load_task
return resolver_obj.load_task(loader_args=resolver_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/flytekit/core/utils.py", line 312, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/flytekit/core/python_auto_container.py", line 332, in load_task
loaded_data = cloudpickle.load(f)
^^^^^^^^^^^^^^^^^^^
ModuleNotFoundError: No module named 'tests'
Message:
ModuleNotFoundError: No module named 'tests'
The tests module it is referring to is a local directory immediately under the project root that houses test code and is where the workflow I am trying to run is located (albeit a few directories down).
It seems like the project code isn’t being packaged and registered correctly before attempting to execute the workflow.
I have tried manually setting the sys path to the project root path in the notebook before registering and executing the workflow but that seems to make no difference.
I suspect I am misconfiguring FlyteRemote
or need to further configure Jupyter for Flyte usage in some way.
Does anyone have any insight or could help me solve this problem?
curved-whale-1505
04/26/2025, 5:27 PM
curved-whale-1505
04/27/2025, 8:13 PM
Is it possible to use pyflyte or flytectl with the HTTP REST API instead of the gRPC API? I haven't been able to figure out how to set admin.endpoint properly to use HTTP REST successfully. gRPC works fine when I use kubectl to forward port 8089 and set the endpoint to dns:///localhost:8089.
wonderful-continent-24967
04/29/2025, 10:32 PM
One of our tasks is stuck with status UNKNOWN, and the sub-workflow that contains that task has status RUNNING. It seems like a similar issue was reported here: https://github.com/flyteorg/flyte/issues/3536
Any pointers on what could be wrong here?
curved-whale-1505
04/30/2025, 7:19 AM
Cannot check peer: missing selected ALPN property.
bland-dress-83134
04/30/2025, 9:08 AM
clean-glass-36808
04/30/2025, 4:47 PM
Last known status message: AlreadyExists: Event Already Exists, caused by [event has already been sent]
I can't tell if this indicates a larger issue or if Flyte Propeller should just be updated to handle AlreadyExists more gracefully. Going to dig deeper into this to understand whether the DB state was updated in Flyte Admin but the gRPC call failed the first time the event was sent.
helpful-jelly-64228
05/01/2025, 11:11 AM
{
"cpus": 16,
"mesh_result": {
"instance_info": {
"instance_id": "...",
"public_ip": "..."
},
"mesh_log": {
"path": "<s3://flytecfd-bucket/task-data/mesh-dir-6d95f0d05e21457aa117451cbf4ffdfe/mesh/mesh.log>"
},
"mesh_dir": {
"path": "<s3://flytecfd-bucket/task-data/mesh-dir-6d95f0d05e21457aa117451cbf4ffdfe/mesh/constant/polyMesh/>"
}
},
"cases": {
"case_aoa_01.00": {
"type": "multi-part blob",
"uri": "<s3://flytecfd-bucket/task-data/cases-dir-29b204be507a48e79a657657beb1e1f3/case_aoa_01.00/>"
}
}
}
Here, mesh_log and mesh_dir are part of a dataclass and are working as expected; FlyteDirectory and FlyteFile are initialized the same way.
@dataclasses.dataclass
class MeshResult:
    instance_info: InstanceInfo
    mesh_log: FlyteFile
    mesh_dir: FlyteDirectory

def start_solvers_wf(
    cases: dict[str, FlyteDirectory], mesh_result: MeshResult, cpus: int
)
Any idea what could be causing this?
brief-egg-3683
05/01/2025, 1:52 PM
clean-glass-36808
05/01/2025, 11:25 PM
echoing-account-76888
05/02/2025, 12:41 AM
bland-dress-83134
05/02/2025, 9:44 AM
There's a timeout arg for @task decorators, but I'm just making sure there isn't a system-wide default that can be configured?
abundant-judge-84756
05/02/2025, 10:26 AM
Is there more detailed documentation for the webApi settings listed in the connector.Config on this docs page? There's a small amount of info on the page, but not a lot.
We're still trying to understand why we're unable to use connectors/agents at scale: as soon as we try to send 1000+ tasks to our connectors, flytepropeller starts to slow down significantly - we see the unprocessed queue depth grow, flytepropeller CPU usage spikes, and task throughput becomes very slow.
It's not clear whether this is an issue with the connector setup (e.g. the number of gRPC worker threads?), something to do with the propeller web API, or something else. We're trying to identify which specific settings we need to modify to improve propeller 🤝 connector throughput - any advice would be greatly appreciated 🙏
busy-lawyer-8908
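For what it's worth, the webApi block in propeller's agent/connector plugin configuration exposes the rate-limiter and cache-worker knobs that usually gate throughput. A hedged sketch follows; the field names match flyteplugins' webapi plugin config, but the defaults shown and the exact placement under your Helm chart are assumptions to verify against your deployment:

```yaml
# Hypothetical flytepropeller plugin config fragment for the agent service.
plugins:
  agent-service:
    webApi:
      readRateLimiter:
        qps: 100        # status-poll calls per second to the connector
        burst: 200
      writeRateLimiter:
        qps: 100        # task-create calls per second
        burst: 200
      caching:
        size: 100000          # max in-flight task statuses cached
        resyncInterval: 30s   # how often cached items are re-polled
        workers: 100          # goroutines servicing the poll queue
```

If the unprocessed queue depth grows while CPU spikes, the read limiter and caching workers are the first values worth raising.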
05/02/2025, 6:02 PM
Are flytekit.Artifact entities visible/browsable in the OSS Flyte UI anywhere?
bored-laptop-29637
05/02/2025, 8:17 PM
task_node = wf.add_task(
    my_task,
    ...
).with_overrides(name="staging_model_calculation")
But when I go to the actual Flyte execution, every task is named just my_task. Should I be applying this override in a different spot?
curved-whale-1505
05/03/2025, 3:00 PM
sparse-carpenter-66912
05/05/2025, 7:31 AM
ImageSpec in a --remote execution? Seems like a standard thing, but I can't get it to work. I described it here in more detail.
rapid-artist-48509
05/05/2025, 4:18 PM
brave-nail-30599
05/05/2025, 7:09 PM
nice-kangaroo-62690
05/06/2025, 9:43 AM
ContainerTask, but it looks like that would require a lot of contortions and stack juggling.
gentle-night-59824
05/07/2025, 12:40 AM
Datacatalog is reporting err missing entity of type Tag with identifier, so I queried our DB for the tag name and dataset fields from the logs, and I do see a row in the tags table, so I'm unsure why datacatalog reports this - has anyone seen this or have ideas? I also looked at our DB metrics and there don't seem to be any latency spikes. It's consistent for particular tasks, whereas other tasks are able to query the cache fine. I wasn't able to identify anything unique about the problematic tasks either, and they all use cache_serializable.
helpful-church-28990
05/07/2025, 11:36 AM
nutritious-cat-43409
05/07/2025, 12:18 PM
crooked-holiday-38139
05/07/2025, 1:53 PM
cat: /var/inputs/input_file: No such file or directory
... which makes sense - the file isn't getting into the Docker container. I looked through the container_task.py source and can see that we bind-mount a directory for the outputs, but I can't see how the inputs get into the container; I had assumed we'd mount two directories, one for inputs and one for outputs.
How do inputs get into the ContainerTask? Can a FlyteFile be given as an input to a container?
brainy-carpenter-31280
05/07/2025, 4:26 PM