YOLO classifier

Summary

The YOLO classification demo shows how to identify and locate objects that belong to one of the 20 categories recognized by a Tiny YOLO classifier.

Detailed description

Let us walk through the schema below from top to bottom.

Processing graph

  • The images originate from the virtual camera configured in Image Source.

  • Scale Image scales the image to the size required by the YOLO classifier model. In this case, the size is 416 × 416 pixels.

    The Tiny YOLO model requires a fixed image size of 416 × 416 pixels.
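
A minimal sketch of this scaling step, using the browser canvas API in place of the Scale Image tool. The 416 × 416 target size comes from the model; the function name and canvas plumbing are illustrative.

```js
// Resample an image to the fixed 416 × 416 input size expected by
// Tiny YOLO (aspect ratio is not preserved in this simple sketch).
function scaleTo416(image) {
  const canvas = document.createElement('canvas');
  canvas.width = 416;
  canvas.height = 416;
  const ctx = canvas.getContext('2d');
  ctx.drawImage(image, 0, 0, 416, 416);
  return ctx.getImageData(0, 0, 416, 416); // RGBA pixel data
}
```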

  • Image to Tensor converts a color image into a tensor of shape 1 × 3 × height × width, where 1 is the number of images per tensor (N, the batch size) and 3 is the number of color channels (C) in an RGB image. Height (H) and width (W) are both 416 pixels.

    The input to the model needs to be an image tensor in NCHW layout.
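
Continuing the sketch above, the conversion could be written as follows. It assumes RGBA pixel data (as returned by getImageData) and produces a flat Float32Array in NCHW order; whether the model wants raw 0–255 values or normalized ones depends on the model and is not handled here.

```js
// Convert RGBA pixel data into a planar NCHW float tensor of logical
// shape 1 x 3 x 416 x 416 (the batch dimension is implicit in the
// flat array).
function imageToTensor(imageData) {
  const { data, width, height } = imageData; // RGBA, 4 bytes per pixel
  const plane = width * height;
  const tensor = new Float32Array(3 * plane);
  for (let p = 0; p < plane; p++) {
    tensor[p] = data[4 * p];                 // R plane (C = 0)
    tensor[plane + p] = data[4 * p + 1];     // G plane (C = 1)
    tensor[2 * plane + p] = data[4 * p + 2]; // B plane (C = 2)
  }
  return tensor;
}
```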

  • Run Onnx Model runs the ONNX model. The model file is selected with a file dialog that opens when clicking the Model File input parameter. Connectable input and output sockets appear after the model file has been loaded. The output of this model is a tensor of shape 1 × 125 × 13 × 13. It contains encoded information about the identified objects and their locations.

    The ONNX model is stored as a resource file in the app.
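
Outside the visual tool, the same step could be reproduced with the onnxruntime-web package, roughly as sketched below. The input and output names ('image' and 'grid') match the Tiny YOLO V2 model published in the ONNX model zoo, but they are assumptions that should be checked against the actual model file.

```js
import * as ort from 'onnxruntime-web';

// Load the Tiny YOLO V2 model and run one 416 × 416 image through it.
async function runModel(tensorData) {
  const session = await ort.InferenceSession.create('tinyyolov2.onnx');
  const input = new ort.Tensor('float32', tensorData, [1, 3, 416, 416]);
  const results = await session.run({ image: input });
  return results.grid; // tensor of shape 1 x 125 x 13 x 13
}
```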

  • Process Yolo Result takes the output tensor of the ONNX tool and decodes the class and location data. A detection is accepted only if its probability of correct detection is higher than Confidence Threshold. The same object is typically detected several times with slightly different bounding boxes. These duplicate detections can be suppressed by limiting how much the detected bounding boxes may overlap; the maximum amount of overlap is set by Overlap Ratio Threshold. The number of object classes in Tiny YOLO V2 is 20. The locations of the identified objects are given as Frame and Size matrices. The corresponding class indices and confidence levels are also given as matrices.

    The output tensor of a YOLO model needs to be decoded.
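
The decoding itself is hidden inside the tool, but a sketch of what it computes is given below. It assumes the standard Tiny YOLO V2 layout: 125 channels per 13 × 13 grid cell, holding 5 anchor boxes of 25 values each (4 box coordinates, 1 objectness score, 20 class scores); the anchor sizes are the published Tiny YOLO V2 anchors.

```js
const ANCHORS = [1.08, 1.19, 3.42, 4.41, 6.63, 11.38, 9.42, 5.11, 16.62, 10.52];
const GRID = 13, CELL = 32, BOXES = 5, CLASSES = 20;

const sigmoid = (v) => 1 / (1 + Math.exp(-v));

function softmax(values) {
  const max = Math.max(...values);
  const exps = values.map((v) => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Decode the flat 125 x 13 x 13 output into candidate detections,
// keeping only those above the confidence threshold.
function decode(output, confidenceThreshold) {
  const at = (c, cy, cx) => output[(c * GRID + cy) * GRID + cx];
  const detections = [];
  for (let cy = 0; cy < GRID; cy++) {
    for (let cx = 0; cx < GRID; cx++) {
      for (let b = 0; b < BOXES; b++) {
        const ch = b * (5 + CLASSES); // first channel of this anchor box
        const scores = softmax(
          Array.from({ length: CLASSES }, (_, k) => at(ch + 5 + k, cy, cx)));
        const classIndex = scores.indexOf(Math.max(...scores));
        const confidence = sigmoid(at(ch + 4, cy, cx)) * scores[classIndex];
        if (confidence < confidenceThreshold) continue;
        const w = Math.exp(at(ch + 2, cy, cx)) * ANCHORS[2 * b] * CELL;
        const h = Math.exp(at(ch + 3, cy, cx)) * ANCHORS[2 * b + 1] * CELL;
        const x = (cx + sigmoid(at(ch, cy, cx))) * CELL - w / 2;
        const y = (cy + sigmoid(at(ch + 1, cy, cx))) * CELL - h / 2;
        detections.push({ x, y, w, h, classIndex, confidence });
      }
    }
  }
  return detections;
}

// Greedy overlap suppression: drop any box whose intersection over
// union with an already-kept, higher-confidence box exceeds the
// overlap threshold.
function suppressOverlaps(detections, overlapThreshold) {
  const iou = (a, b) => {
    const ix = Math.max(0, Math.min(a.x + a.w, b.x + b.w) - Math.max(a.x, b.x));
    const iy = Math.max(0, Math.min(a.y + a.h, b.y + b.h) - Math.max(a.y, b.y));
    const inter = ix * iy;
    return inter / (a.w * a.h + b.w * b.h - inter);
  };
  const kept = [];
  for (const d of [...detections].sort((a, b) => b.confidence - a.confidence)) {
    if (kept.every((k) => iou(k, d) <= overlapThreshold)) kept.push(d);
  }
  return kept;
}
```

Here confidenceThreshold and overlapThreshold play the roles of the tool's Confidence Threshold and Overlap Ratio Threshold parameters.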

  • Finally, the JavaScript tool takes the selected class indices and converts them to class names.

    The script tool takes in class indices and outputs class names.

The script code is given in the images below.

Creating a table with JavaScript is straightforward.
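
Since the script itself appears only in the screenshots, here is a sketch of what it could look like. The 20 class names are the Pascal VOC categories that Tiny YOLO V2 is trained on; the exact input and output plumbing of the script tool is simplified.

```js
// Pascal VOC class names, indexed by the class indices produced by
// Process Yolo Result.
const CLASS_NAMES = [
  'aeroplane', 'bicycle', 'bird', 'boat', 'bottle',
  'bus', 'car', 'cat', 'chair', 'cow',
  'diningtable', 'dog', 'horse', 'motorbike', 'person',
  'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor',
];

// Build a one-column table with one class name per detection.
function classNamesTable(classIndices) {
  return classIndices.map((index) => [CLASS_NAMES[index]]);
}
```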

In this example, the Frame and Size outputs of the postprocessing tool are connected to the Blur Regions tool, which blurs the bounding boxes of the detected objects.

The blur regions tool applies a smoothing filter on image regions.
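
A rough equivalent using the canvas 2D API; the blur radius and the clip-based approach are illustrative, not how the Blur Regions tool is actually implemented.

```js
// Blur the detected bounding boxes in place by repainting the canvas
// onto itself through a blur filter, clipped to each region.
function blurRegions(canvas, detections, radius = 8) {
  const ctx = canvas.getContext('2d');
  for (const { x, y, w, h } of detections) {
    ctx.save();
    ctx.beginPath();
    ctx.rect(x, y, w, h);
    ctx.clip();
    ctx.filter = `blur(${radius}px)`;
    ctx.drawImage(canvas, 0, 0);
    ctx.restore();
  }
}
```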

Here is an example of an input image with Frame and Size widget overlays, together with the blurred result.

The result of smoothing the two detected objects.