elegant-energy-77053
08/06/2024, 3:45 PM
elegant-energy-77053
08/06/2024, 3:46 PM
powerful-horse-58724
08/06/2024, 5:22 PM
elegant-energy-77053
08/06/2024, 5:23 PM
elegant-energy-77053
08/18/2024, 6:50 AM
elegant-energy-77053
08/18/2024, 6:50 AM
elegant-energy-77053
08/18/2024, 6:50 AM
elegant-energy-77053
08/18/2024, 6:51 AM
elegant-energy-77053
08/18/2024, 6:51 AM
elegant-energy-77053
08/18/2024, 6:52 AM
elegant-energy-77053
08/18/2024, 6:53 AM
elegant-energy-77053
08/18/2024, 6:53 AM
elegant-energy-77053
08/18/2024, 6:53 AM
elegant-energy-77053
08/18/2024, 7:18 AM
powerful-horse-58724
09/17/2024, 9:28 PM
acoustic-nest-94594
10/09/2024, 9:26 PM
dazzling-spring-85404
10/14/2024, 8:33 AM
ripe-nest-20732
10/16/2024, 3:47 PM
brash-ice-98462
12/14/2024, 3:47 PM
silly-book-73230
01/13/2025, 12:04 PM
@task(
    requests=Resources(cpu="8", mem="54Gi", gpu="2"),
    limits=Resources(cpu="100", mem="1Ti"),
    pod_template=PodTemplate(
        pod_spec=V1PodSpec(
            containers=[
                V1Container(
                    name="primary",
                ),
            ],
            node_selector={
                "cloud.google.com/gke-accelerator": "nvidia-l4",
                "cloud.google.com/gke-accelerator-count": "2",
            },
        )
    ),
)
I see that Flyte also has a feature for selecting GPUs: https://docs.flyte.org/en/latest/api/flytekit/extras.accelerators.html
However, if I remove the pod_template and just add the accelerator kwarg, FlytePropeller gives the following error:
E0113 12:02:55.686281 1 workers.go:103] error syncing '-': failed at Node[-]. RuntimeExecutionError: failed during plugin execution, caused by: failed to execute handle for plugin [container]: [GKE Warden constraints violations[] failed to create resource, caused by: admission webhook "warden-validating.common-webhooks.networking.gke.io" denied the request: GKE Warden rejected the request because it violates one or more constraints.
Violations details: {"[denied by autogke-gpu-limitation]":["When requesting 'nvidia.com/gpu' resources, you must specify either node selector 'cloud.google.com/gke-accelerator' with accelerator type or node selector 'cloud.google.com/compute-class' with existing custom compute class which has at least one GPU priority rule."]}
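The rule quoted in the error can be approximated in a few lines, which may help when checking a pod spec before submitting it. This is only a sketch: the label keys and the resource name come from the task definition and the error text above, while the function names and the plain-dict pod layout are hypothetical and are not part of flytekit or GKE. It checks only that one of the two selectors is present, not whether a compute class actually has a GPU priority rule.

```python
# Names taken from the error message and task definition above.
GPU_RESOURCE = "nvidia.com/gpu"
ACCELERATOR_LABEL = "cloud.google.com/gke-accelerator"
COMPUTE_CLASS_LABEL = "cloud.google.com/compute-class"


def requests_gpu(pod_spec: dict) -> bool:
    """Return True if any container requests or limits nvidia.com/gpu."""
    for container in pod_spec.get("containers", []):
        resources = container.get("resources", {})
        for section in ("requests", "limits"):
            if GPU_RESOURCE in resources.get(section, {}):
                return True
    return False


def violates_gpu_constraint(pod_spec: dict) -> bool:
    """Hypothetical approximation of the quoted rule.

    A pod that asks for GPUs must carry at least one of the two node
    selectors. Pods without GPU requests are never flagged.
    """
    if not requests_gpu(pod_spec):
        return False
    selector = pod_spec.get("nodeSelector", {})
    return ACCELERATOR_LABEL not in selector and COMPUTE_CLASS_LABEL not in selector


# A GPU pod with no node selector, similar to what the error describes.
pod_without_selector = {
    "containers": [
        {"name": "primary", "resources": {"limits": {GPU_RESOURCE: "2"}}},
    ],
}

# The same pod with the accelerator selector from the pod_template above.
pod_with_selector = {
    **pod_without_selector,
    "nodeSelector": {ACCELERATOR_LABEL: "nvidia-l4"},
}

print(violates_gpu_constraint(pod_without_selector))
print(violates_gpu_constraint(pod_with_selector))
```

Under these assumptions, the first pod is flagged and the second is not, which matches the observation that the pod_template version is admitted.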
This suggests that providing the accelerator kwarg alone does not set the GKE config that is required. Is this supposed to happen? If not, what is the point of the accelerator kwarg?
future-boots-58005
01/14/2025, 1:03 AM
silly-book-73230
01/14/2025, 8:52 AM
k8s:
  plugins:
    k8s:
      interruptible-node-selector:
        cloud.google.com/gke-spot: "true"
However, this conflicts with the default setting
default-annotations:
  cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
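To make the desired outcome concrete, here is a minimal sketch of the per-task logic being asked for. It does not describe how Flyte applies default-annotations; the function name and the idea of computing annotations per task are assumptions for illustration only. The annotation key and its value are taken from the config above.

```python
SAFE_TO_EVICT = "cluster-autoscaler.kubernetes.io/safe-to-evict"

# Mirrors the default-annotations entry shown above.
DEFAULT_ANNOTATIONS = {SAFE_TO_EVICT: "false"}


def annotations_for(interruptible: bool, defaults: dict = DEFAULT_ANNOTATIONS) -> dict:
    """Hypothetical helper: return the pod annotations a task should get.

    Non-interruptible tasks keep the defaults, including safe-to-evict.
    Interruptible tasks get a copy of the defaults without that key.
    """
    annotations = dict(defaults)  # copy so the shared defaults are not mutated
    if interruptible:
        annotations.pop(SAFE_TO_EVICT, None)
    return annotations


print(annotations_for(interruptible=False))
print(annotations_for(interruptible=True))
```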
While I could override this default, is there a way to set the safe-to-evict annotation only on non-interruptible tasks?
worried-airplane-87065
03/11/2025, 6:44 PM
melodic-mechanic-59879
03/13/2025, 6:59 AM
acoustic-city-8573
03/16/2025, 1:48 PM
great-businessperson-79530
03/19/2025, 7:37 AM
worried-airplane-87065
03/21/2025, 7:06 AM
pb2_my_proto from functions. Happy to take a crack at a PR.
fancy-car-5539
04/04/2025, 10:32 PM
melodic-mechanic-59879
05/01/2025, 2:42 PM
quaint-midnight-92440
05/24/2025, 6:26 AM