# ask-for-help
j
I was able to push the bento to Yatai, but can't find a way to create the deployment.
s
Hi @Jakub Chledowski, you can do so with the Yatai OpenAPIs. Try appending `/swagger` to your Yatai deployment URL; you will see a list of the available APIs to manage and deploy bentos.
โค๏ธ 1
The Yatai Python client is currently limited to pushing and pulling bentos. We will add deployment functionality to the client in the future. https://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/yatai_client/__init__.py#L153
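Until the client grows that functionality, the endpoints listed under `/swagger` can be called directly. Below is a hedged sketch of doing that from Python with only the standard library; the endpoint path (`/api/v1/clusters/default/deployments`) and the `X-YATAI-API-TOKEN` header are assumptions based on a typical Yatai Swagger listing, so verify both against your own deployment's `/swagger` page.

```python
# Hedged sketch: create a Yatai deployment via the REST API that the
# Python client does not yet expose. The endpoint path and auth header
# are assumptions -- confirm them in your own /swagger listing.
import json
import urllib.request


def build_deployment_payload(name, bento_repository, bento,
                             kube_namespace="yatai"):
    """Build a minimal create-deployment body (a subset of the Swagger schema)."""
    return {
        "name": name,
        "kube_namespace": kube_namespace,
        "do_not_deploy": False,
        "targets": [{
            "type": "stable",
            "bento_repository": bento_repository,
            "bento": bento,
            "config": {
                "hpa_conf": {"min_replicas": 1, "max_replicas": 2},
                "resources": {
                    "requests": {"cpu": "500m", "memory": "512Mi"},
                    "limits": {"cpu": "1000m", "memory": "1Gi"},
                },
            },
        }],
    }


def create_deployment(yatai_url, api_token, payload):
    """POST the payload to Yatai (hypothetical endpoint and header names)."""
    req = urllib.request.Request(
        f"{yatai_url}/api/v1/clusters/default/deployments",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "X-YATAI-API-TOKEN": api_token},
        method="POST",
    )
    return urllib.request.urlopen(req)  # raises on non-2xx responses
```

`build_deployment_payload` only includes the fields actually needed to get a deployment running; extend it with whatever additional fields your Swagger schema shows.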
j
I see, thanks for the info!
x
You can also use kubectl to create a BentoDeployment from the command line.
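For reference, a BentoDeployment is a custom resource, so the kubectl route is just applying a manifest. The sketch below is an assumption of the manifest shape (the `apiVersion` and field names should be checked against the CRD actually installed in your cluster, e.g. with `kubectl explain bentodeployment`):

```yaml
# bentodeployment.yaml -- hedged sketch of a BentoDeployment manifest;
# verify apiVersion and field names against your installed CRD.
apiVersion: serving.yatai.ai/v1alpha3
kind: BentoDeployment
metadata:
  name: test-post-request
  namespace: yatai
spec:
  bento_tag: embedding_extractor:iqujfmtx4k5fmt7y
  autoscaling:
    min_replicas: 1
    max_replicas: 2
  resources:
    requests:
      cpu: 500m
      memory: 512Mi
    limits:
      cpu: "1"
      memory: 1Gi
```

Apply it with `kubectl apply -f bentodeployment.yaml` and watch the rollout with `kubectl -n yatai get bentodeployment`.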
j
Thanks, it seems I will be able to make it work with Swagger plus the create_deployment POST request. Do you think using kubectl would be easier with GitHub Actions?
x
@Jakub Chledowski There is a handy GitHub Action that can deploy a BentoDeployment to a k8s cluster: https://github.com/marketplace/actions/deploy-to-kubernetes-cluster
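For context, any workflow that can reach the cluster with kubectl will do; the linked action wraps this pattern, but its exact inputs should be taken from its own README. A generic hedged sketch (the secret name and manifest path are placeholders for illustration):

```yaml
# .github/workflows/deploy.yaml -- generic kubectl-based sketch, not the
# exact inputs of the linked marketplace action.
name: deploy-bento
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Configure kubeconfig
        run: |
          mkdir -p ~/.kube
          echo "${{ secrets.KUBECONFIG_B64 }}" | base64 -d > ~/.kube/config
      - name: Apply BentoDeployment
        run: kubectl apply -f k8s/bentodeployment.yaml
```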
👀 1
j
Thanks, I tried it through the POST request. I think I am almost there, but I am still getting one error:
```
[2022-12-10 14:12:54] [Pod] [test-post-request-v25-c94566fc6-qqxc5] [Failed] Error: Error response from daemon: Minimum memory limit allowed is 6MB
[2022-12-10 14:12:58] [HorizontalPodAutoscaler] [test-post-request-v25] [FailedGetResourceMetric] failed to get memory utilization: unable to get metrics for resource memory: no metrics returned from resource metrics API
[2022-12-10 14:12:58] [HorizontalPodAutoscaler] [test-post-request-v25] [FailedComputeMetricsReplicas] invalid metrics (1 invalid out of 1), first error is: failed to get memory utilization: unable to get metrics for resource memory: no metrics returned from resource metrics API
```
I am not sure what I am doing wrong, as this is my input:

```json
{
  "description": "testing post request",
  "do_not_deploy": false,
  "kube_namespace": "yatai",
  "labels": [
    {
      "key": "string",
      "value": "string"
    }
  ],
  "name": "test-post-request-v25",
  "targets": [
    {
      "bento": "iqujfmtx4k5fmt7y",
      "bento_repository": "embedding_extractor",
      "canary_rules": [
        {
          "cookie": "string",
          "header": "string",
          "header_value": "string",
          "type": "weight",
          "weight": 0
        }
      ],
      "config": {
        "enable_debug_mode": false,
        "enable_debug_pod_receive_production_traffic": false,
        "enable_ingress": true,
        "enable_stealing_traffic_debug_mode": false,
        "hpa_conf": {
          "cpu": 0,
          "gpu": 0,
          "max_replicas": 2,
          "memory": "100",
          "min_replicas": 1,
          "qps": 0
        },
        "kubeResourceUid": "string",
        "kubeResourceVersion": "string",
        "resources": {
          "limits": {
            "cpu": "98m",
            "gpu": "0",
            "memory": "100"
          },
          "requests": {
            "cpu": "98m",
            "gpu": "0",
            "memory": "100"
          }
        },
        "runners": {
          "best_ipp_49_embedding_extractor": {
            "enable_debug_mode": false,
            "enable_debug_pod_receive_production_traffic": false,
            "enable_stealing_traffic_debug_mode": false,
            "hpa_conf": {
              "cpu": 1,
              "gpu": 0,
              "max_replicas": 2,
              "memory": "100",
              "min_replicas": 1,
              "qps": 0
            },
            "resources": {
              "limits": {
                "cpu": "98m",
                "gpu": "0",
                "memory": "100"
              },
              "requests": {
                "cpu": "98m",
                "gpu": "0",
                "memory": "100"
              }
            }
          }
        }
      },
      "type": "stable"
    }
  ]
}
```

There is a "memory" field, but it seems it is not getting propagated to the deployment. This is what I get when I click the "update" button to check what was passed (img1 is the main configuration and img2 is the runners configuration) - see the images below. Are you by any chance aware of a way to pass the memory limits through the POST request? :)
x
Memory resources are missing units
j
I tried using `m` or `Mi`, but it didn't work 😞
x
What is the request body, the returned response, and the error message?
j
So this is the body after changing the memory values to 100m:

```json
{
  "description": "testing post request",
  "do_not_deploy": false,
  "kube_namespace": "yatai",
  "labels": [
    {
      "key": "string",
      "value": "string"
    }
  ],
  "name": "test-post-request-v26",
  "targets": [
    {
      "bento": "iqujfmtx4k5fmt7y",
      "bento_repository": "embedding_extractor",
      "canary_rules": [
        {
          "cookie": "string",
          "header": "string",
          "header_value": "string",
          "type": "weight",
          "weight": 0
        }
      ],
      "config": {
        "enable_debug_mode": false,
        "enable_debug_pod_receive_production_traffic": false,
        "enable_ingress": true,
        "enable_stealing_traffic_debug_mode": false,
        "hpa_conf": {
          "cpu": 0,
          "gpu": 0,
          "max_replicas": 2,
          "memory": "100m",
          "min_replicas": 1,
          "qps": 0
        },
        "kubeResourceUid": "string",
        "kubeResourceVersion": "string",
        "resources": {
          "limits": {
            "cpu": "98m",
            "gpu": "0",
            "memory": "100m"
          },
          "requests": {
            "cpu": "98m",
            "gpu": "0",
            "memory": "100m"
          }
        },
        "runners": {
          "best_ipp_49_embedding_extractor": {
            "enable_debug_mode": false,
            "enable_debug_pod_receive_production_traffic": false,
            "enable_stealing_traffic_debug_mode": false,
            "hpa_conf": {
              "cpu": 1,
              "gpu": 0,
              "max_replicas": 2,
              "memory": "100m",
              "min_replicas": 1,
              "qps": 0
            },
            "resources": {
              "limits": {
                "cpu": "98m",
                "gpu": "0",
                "memory": "100m"
              },
              "requests": {
                "cpu": "98m",
                "gpu": "0",
                "memory": "100m"
              }
            }
          }
        }
      },
      "type": "stable"
    }
  ]
}
```

The response is in the JSON below; the error is in the events in Yatai:
I'm guessing this is because the "memory" field is not propagated: the default is used for the main config (500Mi - 1024Mi), but no default is set for the runners, so it stays empty.
x
The memory you set is too small. Please set the right size or use the right unit: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory
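The root cause is how Kubernetes parses resource quantities: a bare number like "100" is plain bytes, binary suffixes like "Mi" multiply by powers of 1024, and "m" means milli (one thousandth), so "100m" of memory is 0.1 byte - far below Docker's 6MB minimum from the earlier error. A minimal sketch of the conversion (only a subset of the full Kubernetes quantity grammar, which also has Ti, Pi, exponent forms, etc.):

```python
# Sketch: convert a (subset of) Kubernetes resource quantities to bytes,
# to show why "100" and "100m" both fall under Docker's 6MB minimum.
import re

SUFFIXES = {
    "": 1, "m": 1e-3,                       # bare bytes, milli
    "k": 10**3, "M": 10**6, "G": 10**9,     # decimal suffixes
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30,  # binary suffixes
}


def quantity_to_bytes(q):
    """Parse e.g. '100Mi' into a byte count (simplified grammar)."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([A-Za-z]*)", q)
    if not match or match.group(2) not in SUFFIXES:
        raise ValueError(f"unrecognized quantity: {q!r}")
    return float(match.group(1)) * SUFFIXES[match.group(2)]


MIN_DOCKER_MEMORY = 6 * 10**6  # "Minimum memory limit allowed is 6MB"

for q in ("100", "100m", "100Mi"):
    ok = quantity_to_bytes(q) >= MIN_DOCKER_MEMORY
    print(f"{q:>6} = {quantity_to_bytes(q)} bytes, above 6MB minimum: {ok}")
```

Only "100Mi" (104857600 bytes) clears the 6MB floor, which is why switching the unit fixes the pod error.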
j
Thanks a lot! I was certain I tried Mi, but maybe I made a typo somewhere else. Now my events no longer show the memory error:
```
[2022-12-10 14:34:29] [HorizontalPodAutoscaler] [test-post-request-v28] [FailedGetResourceMetric] failed to get memory utilization: unable to get metrics for resource memory: no metrics returned from resource metrics API
[2022-12-10 14:34:29] [HorizontalPodAutoscaler] [test-post-request-v28] [FailedComputeMetricsReplicas] invalid metrics (1 invalid out of 1), first error is: failed to get memory utilization: unable to get metrics for resource memory: no metrics returned from resource metrics API
[2022-12-10 14:34:34] [Pod] [test-post-request-v28-runner-0-59c6646c66-ds985] [Unhealthy] Readiness probe failed: Get "http://172.31.12.220:3000/readyz": dial tcp 172.31.12.220:3000: connect: connection refused
[2022-12-10 14:34:34] [Pod] [test-post-request-v28-runner-0-59c6646c66-ds985] [Unhealthy] Liveness probe failed: Get "http://172.31.12.220:3000/livez": dial tcp 172.31.12.220:3000: connect: connection refused
```
Thanks a lot! I will work on the next error now 😉
@Xipeng Guan I managed to get it working by adjusting the limits to fit the available cluster. Big thank you for your help!
๐Ÿ‘ 1