Hi guys, i'm confused with DataType.image() I am t...
# general
u
Hi guys, I'm confused by DataType.image(). I am trying to use a YOLO model to detect objects in images, and then crop each object for the next embedding step.
from typing import List

import cv2
import numpy as np
from daft import DataType, Series, udf


@udf(
    return_dtype=DataType.list(
        DataType.struct(
            {
                "class": DataType.string(),
                "score": DataType.float64(),
                "cropped_img": DataType.image(),
                "bbox": DataType.list(DataType.int64()),
            }
        )
    ),
    num_gpus=1,
    batch_size=16,
)
class YOLOWorldOnnxObjDetect:
    def __init__(
        self,
        model_path: str,
        device: str = "cuda:0",
        confidence: float = 0.25,
    ):
        # init model
        pass

    def __call__(self, images_2d_col: Series) -> List[List[dict]]:
        images: List[np.ndarray] = images_2d_col.to_pylist()
        results = self.yolo.predict(source=images, conf=self.confidence)
        # Collect one list of detections per input image.
        objs: List[List[dict]] = []
        for r in results:
            img_result = []
            orig_img = r.orig_img
            for box in r.boxes:
                x1, y1, x2, y2 = box.xyxy[0].cpu().numpy().astype(int)
                x1, y1 = max(0, x1), max(0, y1)
                x2, y2 = min(orig_img.shape[1], x2), min(orig_img.shape[0], y2)
                x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2)
                cls = int(box.cls[0])
                img_result.append(
                    {
                        "class": self.yolo.names[cls],
                        "score": float(box.conf[0]),
                        "cropped_img": {
                            "cropimg": cv2.cvtColor(
                                orig_img[y1:y2, x1:x2], cv2.COLOR_BGR2RGB
                            ),
                        },
                        "bbox": [x1, y1, x2, y2],
                    }
                )
            objs.append(img_result)
        return objs
The cropped_img has to be returned as a dict. If I return the np.ndarray directly, it raises the error `Could not convert array(..., dtype=uint8) with type numpy.ndarray: was expecting tuple of (key, value) pair`. Why?
e
Hey Iyu, good question. Upon inspection, a couple of things stand out. First, for returning numpy arrays from UDFs, the return_dtype that works best for me is:
daft.DataType.tensor(dtype=daft.DataType.float32()) # Adjust dtype as needed
You can then convert the tensor back to an image type, but I can see that most of the results you care about are either numpy arrays or PyTorch tensors. The `daft.DataType.from_numpy_dtype()` helper can also be used. Also, I've found the `col("image").image.crop()` expression to be pretty convenient if you'd prefer to post-process your image segmentation with Daft instead of inside the UDF.
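A minimal sketch of the glue this suggestion would need, under one assumption worth checking against your Daft version: that `image.crop()` takes its bounding box as (x, y, width, height), whereas the UDF above produces corner coordinates [x1, y1, x2, y2]. The helper name `xyxy_to_xywh` and its parameters are hypothetical and not part of Daft.

```python
from typing import List, Sequence


def xyxy_to_xywh(box: Sequence[int], img_width: int, img_height: int) -> List[int]:
    """Clamp an [x1, y1, x2, y2] box to the image and return [x, y, width, height]."""
    x1, y1, x2, y2 = (int(v) for v in box)
    # Clamp the corners to the image bounds, as the UDF above does.
    x1, y1 = max(0, x1), max(0, y1)
    x2, y2 = min(img_width, x2), min(img_height, y2)
    # Guard against inverted or empty boxes so width and height never go negative.
    return [x1, y1, max(0, x2 - x1), max(0, y2 - y1)]


# A box that spills over the left, right, and bottom edges of a 40x200 image.
print(xyxy_to_xywh([-5, 10, 50, 300], img_width=40, img_height=200))
# prints [0, 10, 40, 190]
```

With boxes in that layout, the UDF would only need to return the class, score, and bbox, and the crop could be expressed as `col("image").image.crop(col("bbox"))`, again assuming a per-row list of four integers is accepted there.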
r
Here's a Daft example which does object detection with YOLO. Everett has a good suggestion to do post-processing with .crop. In this application I do post-processing which draws the bounding box, but same difference.
• https://github.com/rchowell/Derezz/blob/main/derezz/index.py
• https://github.com/rchowell/Derezz/blob/main/derezz/util.py#L79-L107
• https://github.com/rchowell/Derezz/blob/main/derezz/util.py#L67-L76
u
Thanks everyone. After asking, I found the `image.crop()` expression and used it instead of returning the cropped image from the UDF. It works well for me. R. Conner Howell's example was very useful. So it seems the best way to return an image is to return a tensor type first, and then use an expression to transform it?
r
Correct, it’s best now to use np.ndarray to represent image values.
u
Should this be described in the documentation? When I first tried `DataType.image()`, it confused me for a long time, and it took me many attempts.
e
I’d echo that. The only thing that tipped me off that you could represent the image type as a numpy array was the querying images example. Did you primarily reference the API section?
r
We have this document right now which is a work in progress. https://docs.daft.ai/en/stable/api/datatypes/#daft-to-python
Please let us know if there’s any additional information that would be helpful, and we will certainly document more.
u
These were my steps:
• I wanted to create a UDF to detect objects. I had some old Python class code (it uses queue mode and returns `{class, score, crop_img, bbox}`).
• I found the doc https://docs.daft.ai/en/stable/api/udf/ and followed the return_dtype link.
• I saw the image type, so I wrote the code shown in my first message.
I think https://docs.daft.ai/en/stable/api/datatypes/#daft.datatype.DataType.image could add a note such as "if you want to return an image, please use the tensor type", or give an example?