K230 YOLO Battle#

YOLOv5 Fruit Classification#

YOLOv5 Source Code and Training Environment Setup#

To set up the YOLOv5 training environment, refer to the ultralytics/yolov5 repository on GitHub:

git clone https://github.com/ultralytics/yolov5.git
cd yolov5
pip install -r requirements.txt

If you have already set up the environment, please ignore this step.

Training Data Preparation#

Download the provided sample dataset, which contains classification, detection, and segmentation datasets for a scenario with three types of fruit (apple, banana, orange). Extract it into the yolov5 directory and use fruits_cls as the dataset for the fruit classification task. The sample dataset also contains yolo_pen_obb, a rotated object detection dataset for a desktop pen scenario, and car_plate, a license plate keypoint dataset; the YOLOv5 module on the K230 does not support these two tasks.

If you want to train on your own dataset, you can use X-AnyLabeling for annotation. Classification data does not need an annotation tool; just organize the images into directories in the expected layout. Convert the annotated data into the training data format officially supported by YOLOv5 before training.

cd yolov5
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip

If you have already downloaded the data, please ignore this step.
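For a custom classification dataset, the directory organization mentioned above can be sketched as follows. This is a minimal sketch, assuming the layout read by YOLOv5's classification trainer (a train folder plus a val or test folder, each with one subfolder per class). The function name split_classification_dataset, the 80/20 split, and the source layout (one folder of images per class) are illustrative assumptions, not part of the provided tools.

```python
import random
import shutil
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def split_classification_dataset(src_dir, dst_dir, val_ratio=0.2, seed=0):
    """Copy images from src_dir/<class>/ into dst_dir/{train,val}/<class>/.

    Returns a dict mapping class name -> (train_count, val_count).
    """
    src, dst = Path(src_dir), Path(dst_dir)
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    counts = {}
    for class_dir in sorted(p for p in src.iterdir() if p.is_dir()):
        images = sorted(f for f in class_dir.iterdir()
                        if f.suffix.lower() in IMAGE_SUFFIXES)
        rng.shuffle(images)
        n_val = int(len(images) * val_ratio)
        splits = {"val": images[:n_val], "train": images[n_val:]}
        for split_name, files in splits.items():
            out_dir = dst / split_name / class_dir.name
            out_dir.mkdir(parents=True, exist_ok=True)
            for f in files:
                shutil.copy2(f, out_dir / f.name)
        counts[class_dir.name] = (len(splits["train"]), len(splits["val"]))
    return counts
```

The resulting folder could then be passed to --data in place of datasets/fruits_cls.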

Using YOLOv5 to Train the Fruit Classification Model#

Execute the following command in the yolov5 directory to train the three-class fruit classification model using yolov5:

python classify/train.py --model yolov5n-cls.pt --data datasets/fruits_cls --epochs 100 --batch-size 8 --imgsz 224 --device '0'

Converting Fruit Classification kmodel#

Model conversion requires installing the following libraries in the training environment:

# linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires installing dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0

# windows platform: please install dotnet-7 by yourself and add environment variables. Online installation of nncase via pip is supported, but the nncase-kpu library needs to be installed offline. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and use pip to install in the download directory of nncase_kpu-2.*-py2.py3-none-win_amd64.whl
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl

# In addition to nncase and nncase-kpu, other libraries used by the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36

Download the model conversion scripts (test_yolov5.zip) and extract them into the yolov5 directory:

wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolov5.zip
unzip test_yolov5.zip

Use the following commands to first export the pt model under runs/train-cls/exp/weights to an ONNX model, and then convert it to a kmodel:

# Export to ONNX; adjust the pt model path to match your training run
python export.py --weights runs/train-cls/exp/weights/best.pt --imgsz 224 --batch-size 1 --include onnx
cd test_yolov5/classify
# Convert to kmodel; adjust the ONNX model path as needed. The generated kmodel is written to the same directory as the ONNX model
python to_kmodel.py --target k230 --model ../../runs/train-cls/exp/weights/best.onnx --dataset ../calibration_data --input_width 224 --input_height 224 --ptq_option 0
cd ../../

💡 Model Conversion Script (to_kmodel.py) Parameter Description:

| Parameter Name | Description | Explanation | Type |
|---|---|---|---|
| target | Target Platform | Options are k230/CPU, corresponding to the k230 chip | str |
| model | Model Path | Path of the ONNX model to be converted | str |
| dataset | Calibration Image Set | Image data used during model conversion, used in the quantization stage | str |
| input_width | Input Width | Width of the model input | int |
| input_height | Input Height | Height of the model input | int |
| ptq_option | Quantization Method | The quantization strategy is Kld or NoClip, combined with the quantization precision of data and weights. 0 is NoClip+[uint8,uint8], 1 is NoClip+[uint8,int16], 2 is NoClip+[int16,uint8], 3 is Kld+[uint8,uint8], 4 is Kld+[uint8,int16], 5 is Kld+[int16,uint8] | 0/1/2/3/4/5 |
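The ptq_option values can be read as a small lookup table. The sketch below only restates the mapping listed above; the names PTQ_OPTIONS and describe_ptq_option are hypothetical, and reading each pair as [data, weights] is an assumption based on the phrase "precision of data and weights". It is not code taken from to_kmodel.py.

```python
# Mapping restated from the parameter description: option -> (strategy, data type, weight type).
# The [data, weights] ordering of each pair is an assumption.
PTQ_OPTIONS = {
    0: ("NoClip", "uint8", "uint8"),
    1: ("NoClip", "uint8", "int16"),
    2: ("NoClip", "int16", "uint8"),
    3: ("Kld", "uint8", "uint8"),
    4: ("Kld", "uint8", "int16"),
    5: ("Kld", "int16", "uint8"),
}


def describe_ptq_option(option):
    """Return the calibration strategy and precisions selected by a ptq_option value."""
    if option not in PTQ_OPTIONS:
        raise ValueError(f"ptq_option must be one of {sorted(PTQ_OPTIONS)}, got {option!r}")
    strategy, data_type, weight_type = PTQ_OPTIONS[option]
    return {"strategy": strategy, "data": data_type, "weights": weight_type}
```

For example, describe_ptq_option(0) corresponds to the setting used in the conversion command above: NoClip with uint8 for both data and weights.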

Deploying the Model on k230 Using MicroPython#

Flashing the Image and Installing CanMV IDE#

💡 Firmware Introduction: Download the latest Daily Build firmware for your development board type to make sure the latest features are supported, or compile the firmware yourself from the latest code. For the tutorial, see: Firmware Compilation.

Download and install CanMV IDE (download link: CanMV IDE download), then write and run code in the IDE.

Model File Copying#

Connect the IDE, and copy the converted model and test images to the path CanMV/data. This path can be customized; you only need to update the corresponding path in your code.

YOLOv5 Module#

The YOLOv5 class integrates three tasks of YOLOv5, including classification (classify), detection (detect), and segmentation (segment); it supports two inference modes, including image and video stream (video); this class encapsulates the kmodel inference process of YOLOv5.

  • Import Method

from libs.YOLO import YOLOv5

  • Parameter Description

| Parameter Name | Description | Explanation | Type |
|---|---|---|---|
| task_type | Task Type | Supports three types of tasks, options are 'classify'/'detect'/'segment' | str |
| mode | Inference Mode | Supports two inference modes, options are 'image'/'video'; 'image' means inferring images, 'video' means inferring the real-time video stream captured by the camera | str |
| kmodel_path | kmodel Path | Path of the kmodel copied to the development board | str |
| labels | Category Label List | Label names for different categories | list[str] |
| rgb888p_size | Inference Frame Resolution | Resolution of the current inference frame, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | Input resolution when training the YOLOv5 model, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video', supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | Category confidence threshold for classification tasks, object confidence threshold for detection and segmentation tasks, such as 0.5 | float [0~1] |
| nms_thresh | NMS Threshold | Non-maximum suppression threshold, required for detection and segmentation tasks | float [0~1] |
| mask_thresh | Mask Threshold | Binarization threshold for segmenting the object in the detection box in the segmentation task | float [0~1] |
| max_boxes_num | Maximum Number of Detection Boxes | Maximum number of detection boxes allowed to be returned in one frame of image | int |
| debug_mode | Debug Mode | Whether the timing function is enabled, options 0/1; 0 means no timing, 1 means timing | int [0/1] |
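Before constructing a YOLOv5 instance, the constraints in the parameter description can be checked with a few lines of plain Python. This is a minimal sketch: check_yolo_params is a hypothetical helper, not part of libs.YOLO, and the default thresholds are simply the values used in the examples in this document.

```python
VALID_TASKS = ("classify", "detect", "segment")
VALID_MODES = ("image", "video")


def check_yolo_params(task_type, mode, conf_thresh=0.5, nms_thresh=0.45,
                      mask_thresh=0.5, display_size=None):
    """Return a list of problems; an empty list means the values look consistent."""
    problems = []
    if task_type not in VALID_TASKS:
        problems.append(f"task_type must be one of {VALID_TASKS}")
    if mode not in VALID_MODES:
        problems.append(f"mode must be one of {VALID_MODES}")
    # All three thresholds are documented as floats in the range 0~1.
    for name, value in (("conf_thresh", conf_thresh),
                        ("nms_thresh", nms_thresh),
                        ("mask_thresh", mask_thresh)):
        if not 0.0 <= value <= 1.0:
            problems.append(f"{name} must be between 0 and 1")
    # display_size is documented as being set when the inference mode is 'video'.
    if mode == "video" and display_size is None:
        problems.append("display_size should be set when mode is 'video'")
    return problems
```

A call such as check_yolo_params("classify", "image") returns an empty list, while an unknown task type or an out-of-range threshold is reported as a readable message.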

Deploying the Model for Image Inference#

For image inference, refer to the following code and modify the variables defined in __main__ to match your setup:

from libs.YOLO import YOLOv5
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is just an example. For custom scenarios, please modify it to your own test image, model path, label name, and model input size
    img_path="/data/test.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[224,224]

    confidence_threshold = 0.5
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize the YOLOv5 instance
    yolo=YOLOv5(task_type="classify",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
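To illustrate how conf_thresh acts on a classification result, the sketch below turns raw class scores into probabilities and keeps the top class only if it passes the threshold. It assumes the model outputs one raw score per label and that a softmax is applied; the YOLOv5 class in libs.YOLO may post-process differently, and classify_from_scores is a hypothetical name.

```python
import math


def classify_from_scores(scores, labels, conf_thresh=0.5):
    """Return (label, confidence) for the top class, or None if it is below conf_thresh."""
    # Softmax with the maximum subtracted for numerical stability.
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < conf_thresh:
        return None
    return labels[best], probs[best]
```

With labels ["apple", "banana", "orange"], scores that clearly favor the first entry yield "apple", while near-equal scores give each class roughly one third and are rejected at a threshold of 0.5.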

Deploying the Model for Video Inference#

For video inference, refer to the following code and modify the variables defined in __main__ to match your setup:

from libs.PipeLine import PipeLine
from libs.YOLO import YOLOv5
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is just an example. For custom scenarios, please modify it to your own model path, label name, and model input size
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[224,224]

    # Add display mode, default is hdmi, optional hdmi/lcd/lt9611/st7701/hx8399/nt35516/nt35532/gc9503/aml020t/jd9852/ili9806/virt; among which hdmi defaults to lt9611, and lcd defaults to st7701
    display_mode="lcd"
    # Display resolution, None means using the default resolution of the current display; when using virt, you can set it manually here, for example [800, 480]
    display_size=None
    rgb888p_size=[640,360]
    confidence_threshold = 0.5
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode, display_size=display_size)
    # Create PipeLine, you can pass in sensor_id to select the camera as needed, for example pl.create(sensor_id=2)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize the YOLOv5 instance
    yolo=YOLOv5(task_type="classify",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,debug_mode=0)
    yolo.config_preprocess()
    try:
        while True:
            with ScopedTiming("total",1):
                img=pl.get_frame()
                res=yolo.run(img)
                yolo.draw_result(res,pl.osd_img)
                pl.show_image()
                gc.collect()
    finally:
        # Release the model and the pipeline even when the loop exits through an exception
        yolo.deinit()
        pl.destroy()

YOLOv5 Fruit Detection#

YOLOv5 Source Code and Training Environment Setup#

To set up the YOLOv5 training environment, refer to the ultralytics/yolov5 repository on GitHub:

git clone https://github.com/ultralytics/yolov5.git
cd yolov5
pip install -r requirements.txt

If you have already set up the environment, please ignore this step.

Training Data Preparation#

Download the provided sample dataset, which uses three categories of fruit (apple, banana, orange) as its scenario and provides classification, detection, and segmentation datasets. Extract it into the yolov5 directory and use fruits_yolo as the dataset for the fruit detection task. The sample dataset also includes yolo_pen_obb, a rotated object detection dataset for desktop pens, and car_plate, a license plate keypoint dataset; the YOLOv5 module on the K230 does not support these two tasks.

If you want to train on your own dataset, you can use X-AnyLabeling for annotation. Convert the annotated data into the training data format officially supported by YOLOv5 before training.

cd yolov5
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip

If you have already downloaded the data, please ignore this step.
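When converting your own detection annotations, each object becomes one line of the form "class x_center y_center width height" in the image's label file, with all four values normalized to the image size. The sketch below shows that arithmetic for a pixel box; to_yolo_label is a hypothetical helper, and the input format (x_min, y_min, x_max, y_max) is an assumption about what your annotation tool exports.

```python
def to_yolo_label(class_id, box, image_width, image_height):
    """Convert a pixel box (x_min, y_min, x_max, y_max) into a YOLO detection label line."""
    x_min, y_min, x_max, y_max = box
    # Center and size, each normalized to the 0..1 range.
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"
```

The class index is the position of the label in the names list of the dataset yaml, so with apple, banana, orange in that order a banana box uses class 1.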

Using YOLOv5 to Train the Fruit Detection Model#

Run the following command in the yolov5 directory to train the three-category fruit detection model with YOLOv5:

python train.py --weights yolov5n.pt --cfg models/yolov5n.yaml --data datasets/fruits_yolo.yaml --epochs 300 --batch-size 8 --imgsz 320 --device '0'

Converting Fruit Detection kmodel#

Model conversion requires installing the following libraries in the training environment:

# linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires installing dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0

# windows platform: please install dotnet-7 by yourself and add environment variables. nncase can be installed online using pip, but the nncase-kpu library needs to be installed offline. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl at https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and use pip to install in the download directory of nncase_kpu-2.*-py2.py3-none-win_amd64.whl
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl

# Besides nncase and nncase-kpu, other libraries used by the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36

Download the model conversion scripts (test_yolov5.zip) and extract them into the yolov5 directory:

wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolov5.zip
unzip test_yolov5.zip

Use the following commands to first export the pt model under runs/train/exp/weights to an ONNX model, and then convert it to a kmodel:

# Export to ONNX; adjust the pt model path to match your training run
python export.py --weights runs/train/exp/weights/best.pt --imgsz 320 --batch-size 1 --include onnx
cd test_yolov5/detect
# Convert to kmodel; adjust the ONNX model path as needed. The generated kmodel is written to the same directory as the ONNX model
python to_kmodel.py --target k230 --model ../../runs/train/exp/weights/best.onnx --dataset ../calibration_data --input_width 320 --input_height 320 --ptq_option 0
cd ../../

💡 Model Conversion Script (to_kmodel.py) Parameter Description:

| Parameter Name | Description | Explanation | Type |
|---|---|---|---|
| target | Target Platform | Options are k230/CPU, corresponding to the k230 chip | str |
| model | Model Path | The path of the ONNX model to be converted | str |
| dataset | Calibration Image Set | Image data used during model conversion, used in the quantization stage | str |
| input_width | Input Width | The width of the model input | int |
| input_height | Input Height | The height of the model input | int |
| ptq_option | Quantization Method | The quantization strategies are Kld and NoClip, combined with the quantization precision of data and weights. 0 is NoClip+[uint8,uint8], 1 is NoClip+[uint8,int16], 2 is NoClip+[int16,uint8], 3 is Kld+[uint8,uint8], 4 is Kld+[uint8,int16], 5 is Kld+[int16,uint8] | 0/1/2/3/4/5 |

Deploying the Model on k230 Using MicroPython#

Flashing the Image and Installing CanMV IDE#

💡 Firmware Introduction: Download the latest Daily Build firmware for your development board type to make sure the latest features are supported, or compile the firmware yourself from the latest code. For the tutorial, see: Firmware Compilation.

Download and install CanMV IDE (download link: CanMV IDE download), then write and run code in the IDE.

Copying Model Files#

Connect the IDE, and copy the converted model and test images to the path CanMV/data. This path can be customized; you only need to modify the corresponding path when writing the code.

YOLOv5 Module#

The YOLOv5 class integrates three tasks of YOLOv5, including classify, detect, and segment; it supports two inference modes, including image and video; this class encapsulates the kmodel inference process of YOLOv5.

  • Import Method

from libs.YOLO import YOLOv5

  • Parameter Description

| Parameter Name | Description | Explanation | Type |
|---|---|---|---|
| task_type | Task Type | Supports three types of tasks, options are 'classify'/'detect'/'segment' | str |
| mode | Inference Mode | Supports two inference modes, options are 'image'/'video'; 'image' means inferring an image, 'video' means inferring the real-time video stream captured by the camera | str |
| kmodel_path | kmodel Path | The kmodel path copied to the development board | str |
| labels | Category Label List | The label names of different categories | list[str] |
| rgb888p_size | Inference Frame Resolution | The resolution of the current inference frame, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | The input resolution during YOLOv5 model training, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video', supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | The category confidence threshold for classification tasks, and the object confidence threshold for detection and segmentation tasks, such as 0.5 | float [0~1] |
| nms_thresh | NMS Threshold | Non-maximum suppression threshold, required for detection and segmentation tasks | float [0~1] |
| mask_thresh | Mask Threshold | The binarization threshold for segmenting the object in the detection box in the segmentation task | float [0~1] |
| max_boxes_num | Maximum Detection Boxes | The maximum number of detection boxes allowed to be returned in one frame of image | int |
| debug_mode | Debug Mode | Whether the timing function takes effect, options 0/1; 0 means no timing, 1 means timing | int [0/1] |

Deploying the Model for Image Inference#

For image inference, refer to the code below and modify the variables defined in __main__ to match your setup:

from libs.YOLO import YOLOv5
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example, please modify it to your own test image, model path, label name, model input size for custom scenarios
    img_path="/data/test.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]

    confidence_threshold = 0.5
    nms_threshold=0.45
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize the YOLOv5 instance
    yolo=YOLOv5(task_type="detect",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
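The roles of conf_thresh, nms_thresh, and max_boxes_num in the detection example can be illustrated with a plain-Python filter. This is a sketch of greedy, per-class non-maximum suppression under stated assumptions (boxes as (x1, y1, x2, y2), detections as (box, score, class_id) tuples); it is not the implementation inside libs.YOLO, and the function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_detections(detections, conf_thresh=0.5, nms_thresh=0.45, max_boxes_num=50):
    """Keep confident, non-overlapping detections; each item is (box, score, class_id)."""
    # 1. Drop low-confidence candidates and visit the rest from highest score down.
    candidates = sorted((d for d in detections if d[1] >= conf_thresh),
                        key=lambda d: d[1], reverse=True)
    kept = []
    for det in candidates:
        # 2. Stop once the per-frame limit is reached.
        if len(kept) >= max_boxes_num:
            break
        # 3. Suppress a box that overlaps an already kept box of the same class too much.
        if all(det[2] != k[2] or iou(det[0], k[0]) <= nms_thresh for k in kept):
            kept.append(det)
    return kept
```

Raising conf_thresh removes weak candidates before suppression, while lowering nms_thresh makes the filter more aggressive about merging nearby boxes of the same class.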

Deploying the Model for Video Inference#

For video inference, refer to the code below and modify the variables defined in __main__ to match your setup:

from libs.PipeLine import PipeLine
from libs.YOLO import YOLOv5
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example, please modify it to your own model path, label name, model input size for custom scenarios
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]

    # Add display mode, default is hdmi, optional hdmi/lcd/lt9611/st7701/hx8399/nt35516/nt35532/gc9503/aml020t/jd9852/ili9806/virt; among them, hdmi corresponds to lt9611 by default, and lcd corresponds to st7701 by default
    display_mode="lcd"
    # Display resolution, None means using the current default resolution of the display; when using virt, it can be manually set here, such as [800, 480]
    display_size=None
    rgb888p_size=[640,360]
    confidence_threshold = 0.8
    nms_threshold=0.45
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode, display_size=display_size)
    # Create PipeLine, you can pass in sensor_id to select the camera as needed, for example, pl.create(sensor_id=2)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize the YOLOv5 instance
    yolo=YOLOv5(task_type="detect",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    try:
        while True:
            with ScopedTiming("total",1):
                img=pl.get_frame()
                res=yolo.run(img)
                yolo.draw_result(res,pl.osd_img)
                pl.show_image()
                gc.collect()
    finally:
        # Release the model and the pipeline even when the loop exits through an exception
        yolo.deinit()
        pl.destroy()
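In video mode the inference frame (rgb888p_size) and the display (display_size) usually have different resolutions, which is why both are passed to the YOLOv5 class. The sketch below shows one way a box could be mapped from frame pixels to display pixels. It assumes boxes are expressed in inference-frame coordinates and that each axis is scaled independently; draw_result may handle this differently, and scale_box_to_display is a hypothetical name.

```python
def scale_box_to_display(box, rgb888p_size, display_size):
    """Map (x1, y1, x2, y2) from inference-frame pixels to display pixels."""
    frame_w, frame_h = rgb888p_size
    disp_w, disp_h = display_size
    x1, y1, x2, y2 = box
    # Integer arithmetic avoids rounding surprises from floating-point scale factors.
    return (x1 * disp_w // frame_w, y1 * disp_h // frame_h,
            x2 * disp_w // frame_w, y2 * disp_h // frame_h)
```

With rgb888p_size=[640,360] and an 800x480 LCD, as in the example above, a box spanning the whole frame maps to the whole display.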

YOLOv5 Fruit Segmentation#

YOLOv5 Source Code and Training Environment Setup#

To set up the YOLOv5 training environment, refer to the ultralytics/yolov5 repository on GitHub:

git clone https://github.com/ultralytics/yolov5.git
cd yolov5
pip install -r requirements.txt

If you have already set up the environment, please ignore this step.

Training Data Preparation#

Download the provided sample dataset, which contains classification, detection, and segmentation datasets for a scenario with three types of fruit (apple, banana, orange). Extract it into the yolov5 directory and use fruits_seg as the dataset for the fruit segmentation task. The sample dataset also includes yolo_pen_obb, a rotated object detection dataset for a desktop pen scenario, and car_plate, a license plate keypoint dataset; the YOLOv5 module on the K230 does not support these two tasks.

If you want to train on your own dataset, you can use X-AnyLabeling for annotation. Convert the annotated data into the training data format officially supported by YOLOv5 before training.

cd yolov5
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip

If you have already downloaded the data, please ignore this step.
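For custom segmentation data, each object is stored as one label line consisting of the class index followed by the polygon's vertices, with x and y normalized to the image size. The sketch below performs that conversion for a list of pixel coordinates; polygon_to_yolo_seg is a hypothetical helper, and the input format is an assumption about what your annotation tool exports.

```python
def polygon_to_yolo_seg(class_id, points, image_width, image_height):
    """Convert pixel polygon vertices [(x, y), ...] into a YOLO segmentation label line."""
    if len(points) < 3:
        raise ValueError("a polygon needs at least three vertices")
    fields = [str(class_id)]
    for x, y in points:
        # Each vertex is written as normalized x followed by normalized y.
        fields.append(f"{x / image_width:.6f}")
        fields.append(f"{y / image_height:.6f}")
    return " ".join(fields)
```

One such line is written per object into the text file that shares the image's base name.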

Using YOLOv5 to Train the Fruit Segmentation Model#

Run the following command in the yolov5 directory to train the three-class fruit segmentation model with YOLOv5:

python segment/train.py --weights yolov5n-seg.pt --cfg models/segment/yolov5n-seg.yaml --data datasets/fruits_seg.yaml --epochs 100 --batch-size 8 --imgsz 320 --device '0'

Converting the Fruit Segmentation kmodel#

Model conversion requires installing the following libraries in the training environment:

# linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires installing dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0

# windows platform: please install dotnet-7 yourself and add environment variables. Installing nncase via pip online is supported, but the nncase-kpu library needs to be installed offline. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and install using pip in the directory where nncase_kpu-2.*-py2.py3-none-win_amd64.whl is downloaded
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl

# In addition to nncase and nncase-kpu, other libraries used by the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36

Download the model conversion scripts (test_yolov5.zip) and extract them into the yolov5 directory:

wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolov5.zip
unzip test_yolov5.zip

Use the following commands to first export the pt model under runs/train-seg/exp/weights to an ONNX model, and then convert it to a kmodel:

# Export to ONNX; adjust the pt model path to match your training run
python export.py --weights runs/train-seg/exp/weights/best.pt --imgsz 320 --batch-size 1 --include onnx
cd test_yolov5/segment
# Convert to kmodel; adjust the ONNX model path as needed. The generated kmodel is written to the same directory as the ONNX model
python to_kmodel.py --target k230 --model ../../runs/train-seg/exp/weights/best.onnx --dataset ../calibration_data --input_width 320 --input_height 320 --ptq_option 0
cd ../../

💡 Model Conversion Script (to_kmodel.py) Parameter Description:

| Parameter Name | Description | Explanation | Type |
|---|---|---|---|
| target | Target Platform | Options are k230/CPU, corresponding to the k230 chip | str |
| model | Model Path | Path of the ONNX model to be converted | str |
| dataset | Calibration Image Set | Image data used during model conversion, used in the quantization stage | str |
| input_width | Input Width | The width of the model input | int |
| input_height | Input Height | The height of the model input | int |
| ptq_option | Quantization Method | Quantization strategies are Kld and NoClip, combined with the quantization precision of data and weights. 0 is NoClip+[uint8,uint8], 1 is NoClip+[uint8,int16], 2 is NoClip+[int16,uint8], 3 is Kld+[uint8,uint8], 4 is Kld+[uint8,int16], 5 is Kld+[int16,uint8] | 0/1/2/3/4/5 |

Deploying the Model on k230 Using MicroPython#

Flashing the Image and Installing CanMV IDE#

💡 Firmware Introduction: Please download the latest Daily Build firmware according to your development board type to ensure that the latest features are supported! Or use the latest code to compile the firmware yourself. For the tutorial, see: Firmware Compilation.

Download and install CanMV IDE (download link: CanMV IDE download), write and run code in the IDE.

Model File Copy#

Connect to the IDE and copy the converted model and test images to the path CanMV/data. This path can be customized; you just need to modify the corresponding path when writing the code.

YOLOv5 Module#

The YOLOv5 class integrates three tasks of YOLOv5, including classification (classify), detection (detect), and segmentation (segment); it supports two inference modes, including image and video stream (video); this class encapsulates the kmodel inference process of YOLOv5.

  • Import Method

from libs.YOLO import YOLOv5

  • Parameter Description

| Parameter Name | Description | Explanation | Type |
|---|---|---|---|
| task_type | Task Type | Supports three tasks, options are 'classify'/'detect'/'segment' | str |
| mode | Inference Mode | Supports two inference modes, options are 'image'/'video'. 'image' means inferring an image, 'video' means inferring the real-time video stream captured by the camera | str |
| kmodel_path | kmodel Path | Path of the kmodel copied to the development board | str |
| labels | Class Label List | Label names for different classes | list[str] |
| rgb888p_size | Inference Frame Resolution | Resolution of the current inference frame, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | Input resolution when the YOLOv5 model was trained, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video', supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | Class confidence threshold for the classification task, object confidence threshold for detection and segmentation tasks, such as 0.5 | float [0~1] |
| nms_thresh | NMS Threshold | Non-maximum suppression threshold, required for detection and segmentation tasks | float [0~1] |
| mask_thresh | Mask Threshold | The binarization threshold for segmenting the object in the detection box during the segmentation task | float [0~1] |
| max_boxes_num | Maximum Number of Detection Boxes | The maximum number of detection boxes allowed to be returned in one frame of image | int |
| debug_mode | Debug Mode | Whether the timing function is enabled, options are 0/1; 0 means no timing, 1 means timing | int [0/1] |

Deploying the Model for Image Inference#

For image inference, refer to the following code and modify the variables defined in __main__ to match your setup:

from libs.YOLO import YOLOv5
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify it to your own test image, model path, label names, and model input size.
    img_path="/data/test.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]

    confidence_threshold = 0.5
    nms_threshold=0.45
    mask_threshold=0.5
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize the YOLOv5 instance
    yolo=YOLOv5(task_type="segment",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,mask_thresh=mask_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
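The effect of mask_thresh can be pictured as binarizing a per-pixel mask probability inside the detection box. The sketch below assumes the mask is available as raw values that are passed through a sigmoid and then compared with the threshold; this follows the usual YOLOv5 segmentation post-processing only in outline, is not the code in libs.YOLO, and uses hypothetical names.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def binarize_mask(mask_values, box, mask_thresh=0.5):
    """Return a 0/1 mask; pixels outside box (x1, y1, x2, y2; x2 and y2 exclusive) are 0."""
    x1, y1, x2, y2 = box
    result = []
    for row_idx, row in enumerate(mask_values):
        out_row = []
        for col_idx, value in enumerate(row):
            inside = x1 <= col_idx < x2 and y1 <= row_idx < y2
            out_row.append(1 if inside and sigmoid(value) > mask_thresh else 0)
        result.append(out_row)
    return result
```

A higher mask_thresh keeps only pixels the model is more certain about, which shrinks the mask toward the object's core; a lower value grows it toward the box edges.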

Deploying the Model for Video Inference#

For video inference, refer to the following code and modify the variables defined in __main__ to match your setup:

from libs.PipeLine import PipeLine
from libs.YOLO import YOLOv5
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify it to your own model path, label names, and model input size.
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]

    # Add display mode, default is hdmi, optional hdmi/lcd/lt9611/st7701/hx8399. hdmi defaults to lt9611, resolution 1920*1080; lcd defaults to st7701, resolution 800*480
    display_mode="lcd"
    rgb888p_size=[320,320]
    confidence_threshold = 0.5
    nms_threshold=0.45
    mask_threshold=0.5
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize the YOLOv5 instance
    yolo=YOLOv5(task_type="segment",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,mask_thresh=mask_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    try:
        while True:
            with ScopedTiming("total",1):
                img=pl.get_frame()
                res=yolo.run(img)
                yolo.draw_result(res,pl.osd_img)
                pl.show_image()
                gc.collect()
    finally:
        # Release the model and the pipeline even when the loop exits through an exception
        yolo.deinit()
        pl.destroy()

YOLOv8 Fruit Classification#

YOLOv8 Source Code and Training Environment Setup#

To set up the YOLOv8 training environment, refer to the ultralytics/ultralytics repository on GitHub:

# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics

If you have already set up the environment, please ignore this step.

Training Data Preparation#

First create a new folder named yolov8. Download the provided sample dataset, which contains classification, detection, and segmentation datasets for a scenario with three types of fruit (apple, banana, orange). Extract it into the yolov8 directory and use fruits_cls as the dataset for the fruit classification task. The sample dataset also contains yolo_pen_obb, a rotated object detection dataset for a desktop pen scenario, and car_plate, a license plate keypoint dataset.

If you want to train on your own dataset, you can use X-AnyLabeling for annotation. Classification data does not need an annotation tool; just organize the images into directories in the expected layout. Convert the annotated data into the training data format officially supported by YOLOv8 before training.

cd yolov8
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip

If you have already downloaded the data, please ignore this step.

Using YOLOv8 to Train the Fruit Classification Model#

Run the following command in the yolov8 directory to train the three-class fruit classification model with YOLOv8:

yolo classify train data=datasets/fruits_cls model=yolov8n-cls.pt epochs=100 imgsz=224
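The same training run can also be started from Python through the ultralytics package instead of the yolo command line. The sketch below mirrors the arguments of the command above; train_fruit_classifier and TRAIN_ARGS are hypothetical names, and the import is placed inside the function so that the file can be loaded on a machine where ultralytics is not installed.

```python
# Arguments mirrored from the CLI command: data, epochs, and image size.
TRAIN_ARGS = {"data": "datasets/fruits_cls", "epochs": 100, "imgsz": 224}


def train_fruit_classifier(weights="yolov8n-cls.pt", **overrides):
    """Train the fruit classifier; keyword overrides replace entries of TRAIN_ARGS."""
    from ultralytics import YOLO  # imported lazily; requires `pip install ultralytics`

    args = dict(TRAIN_ARGS, **overrides)
    model = YOLO(weights)  # load the pretrained classification weights
    return model.train(**args)
```

Calling train_fruit_classifier() from the yolov8 directory should behave like the CLI command; passing, for example, epochs=10 gives a shorter trial run.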

Converting the Fruit Classification kmodel#

Model conversion requires installing the following libraries in the training environment:

# linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires installing dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0

# windows platform: please install dotnet-7 yourself and add environment variables. Online installation of nncase via pip is supported, but the nncase-kpu library requires offline installation. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and use pip to install in the download directory of nncase_kpu-2.*-py2.py3-none-win_amd64.whl
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl

# In addition to nncase and nncase-kpu, the other libraries used by the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36

Download the model conversion script package test_yolov8.zip and extract it into the yolov8 directory:

wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolov8.zip
unzip test_yolov8.zip

Use the commands below to export the .pt model under runs/classify/train/weights to ONNX, then convert the ONNX model to a kmodel:

# Export to ONNX; adjust the .pt model path to match your training run
yolo export model=runs/classify/train/weights/best.pt format=onnx imgsz=224
cd test_yolov8/classify
# Convert to kmodel; adjust the ONNX model path as needed. The generated kmodel is written to the same directory as the ONNX model
python to_kmodel.py --target k230 --model ../../runs/classify/train/weights/best.onnx --dataset ../calibration_data --input_width 224 --input_height 224 --ptq_option 0
cd ../../
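
The export and conversion steps differ between tasks only in the task folder, the input size, and the calibration directory. As a convenience, the sketch below builds both command lines from those values. The helper name build_conversion_commands is hypothetical, and the paths simply mirror the example commands above, so adjust them if your training run was written to a different folder.

```python
def build_conversion_commands(task, imgsz, calib_dir="../calibration_data",
                              run_dir="train"):
    """Return (export_cmd, convert_cmd) as argument lists.

    export_cmd is meant to run in the yolov8 directory and convert_cmd in
    test_yolov8/<task>, matching the relative paths used above.
    """
    weights = f"runs/{task}/{run_dir}/weights"
    export_cmd = ["yolo", "export", f"model={weights}/best.pt",
                  "format=onnx", f"imgsz={imgsz}"]
    convert_cmd = ["python", "to_kmodel.py", "--target", "k230",
                   "--model", f"../../{weights}/best.onnx",
                   "--dataset", calib_dir,
                   "--input_width", str(imgsz),
                   "--input_height", str(imgsz),
                   "--ptq_option", "0"]
    return export_cmd, convert_cmd

export_cmd, convert_cmd = build_conversion_commands("classify", 224)
print(" ".join(export_cmd))
print(" ".join(convert_cmd))
```

Each list can be passed to subprocess.run(cmd, check=True, cwd=...) if you prefer to drive the conversion from a script.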

💡 Model Conversion Script (to_kmodel.py) Parameter Description:

| Parameter Name | Description | Explanation | Type |
| --- | --- | --- | --- |
| target | Target Platform | Options are k230/CPU, corresponding to the k230 chip | str |
| model | Model Path | Path of the ONNX model to be converted | str |
| dataset | Calibration Image Set | Image data used during model conversion, in the quantization stage | str |
| input_width | Input Width | Width of the model input | int |
| input_height | Input Height | Height of the model input | int |
| ptq_option | Quantization Method | The quantization strategies are Kld and NoClip, combined with the quantization precision of data and weights. 0 is NoClip+[uint8,uint8], 1 is NoClip+[uint8,int16], 2 is NoClip+[int16,uint8], 3 is Kld+[uint8,uint8], 4 is Kld+[uint8,int16], 5 is Kld+[int16,uint8] | 0/1/2/3/4/5 |
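
The six ptq_option values are easier to read as a lookup table. The sketch below copies the mapping from the description above; the helper name describe_ptq_option is hypothetical, and reading each pair as [data precision, weights precision] is an assumption based on the order in which the description names them.

```python
# Mapping copied from the ptq_option description above.
# Assumption: in each pair the first type is the data (activation)
# precision and the second is the weights precision.
PTQ_OPTIONS = {
    0: ("NoClip", "uint8", "uint8"),
    1: ("NoClip", "uint8", "int16"),
    2: ("NoClip", "int16", "uint8"),
    3: ("Kld", "uint8", "uint8"),
    4: ("Kld", "uint8", "int16"),
    5: ("Kld", "int16", "uint8"),
}

def describe_ptq_option(option):
    """Return a readable description of a ptq_option value."""
    if option not in PTQ_OPTIONS:
        raise ValueError(f"ptq_option must be one of {sorted(PTQ_OPTIONS)}")
    method, data_type, weight_type = PTQ_OPTIONS[option]
    return f"{method} calibration, data={data_type}, weights={weight_type}"

print(describe_ptq_option(0))  # NoClip calibration, data=uint8, weights=uint8
```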

Deploying the Model on k230 Using MicroPython#

Flashing the Image and Installing CanMV IDE#

💡 Firmware Introduction: Download the latest Daily Build firmware for your development board type so that the latest features are supported, or compile the firmware yourself from the latest code. For the tutorial, see: Firmware Compilation.

Download and install CanMV IDE (download link: CanMV IDE download), then write and run code in the IDE.

Model File Copy#

Connect the IDE and copy the converted model and test images to the CanMV/data directory. This path can be customized; you only need to update the corresponding paths in your code.

YOLOv8 Module#

The YOLOv8 class integrates five YOLOv8 tasks: classification (classify), detection (detect), segmentation (segment), rotated object detection (obb), and keypoint detection (pose). It supports two inference modes, image and video stream (video), and encapsulates the YOLOv8 kmodel inference process.

  • Import Method

from libs.YOLO import YOLOv8
  • Parameter Description

| Parameter Name | Description | Explanation | Type |
| --- | --- | --- | --- |
| task_type | Task Type | Supports five task types; options are 'classify'/'detect'/'segment'/'obb'/'pose' | str |
| mode | Inference Mode | Supports two inference modes; options are 'image'/'video'. 'image' runs inference on a single image, 'video' runs inference on the real-time video stream captured by the camera | str |
| kmodel_path | kmodel Path | Path of the kmodel copied to the development board | str |
| labels | Class Label List | Label names of the categories | list[str] |
| rgb888p_size | Inference Frame Resolution | Resolution of the current inference frame, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | Input resolution used when training the YOLOv8 model, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video'; supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | Class confidence threshold for classification tasks, and object confidence threshold for detection and segmentation tasks, such as 0.5 | float [0~1] |
| nms_thresh | NMS Threshold | Non-maximum suppression threshold, required for detection and segmentation tasks | float [0~1] |
| mask_thresh | Mask Threshold | Binarization threshold for segmenting the object inside the detection box in segmentation tasks | float [0~1] |
| kp_num | Number of Keypoints | Number of keypoints in the keypoint detection task | int |
| kp_dim | Keypoint Dimension | Dimension of keypoints in the keypoint detection task; only 2 and 3 are supported, determined by the trained model | int [2/3] |
| max_boxes_num | Maximum Number of Detection Boxes | Maximum number of detection boxes returned for one frame | int |
| debug_mode | Debug Mode | Whether timing is enabled; options are 0/1, 0 disables timing, 1 enables timing | int [0/1] |
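
Because a wrong value is cheaper to catch on the PC than on the board, it can help to check settings against the table before copying a script over. The sketch below is a plain-Python check of the documented options and ranges. It is not part of libs.YOLO; the function name check_yolov8_args and its default values are assumptions made for illustration.

```python
TASK_TYPES = ("classify", "detect", "segment", "obb", "pose")
MODES = ("image", "video")

def check_yolov8_args(task_type, mode, conf_thresh=0.5, nms_thresh=0.45,
                      mask_thresh=0.5, kp_dim=2, display_size=None):
    """Return a list of problems; an empty list means the values fit the table."""
    problems = []
    if task_type not in TASK_TYPES:
        problems.append(f"task_type must be one of {TASK_TYPES}")
    if mode not in MODES:
        problems.append(f"mode must be one of {MODES}")
    for name, value in (("conf_thresh", conf_thresh),
                        ("nms_thresh", nms_thresh),
                        ("mask_thresh", mask_thresh)):
        if not 0.0 <= value <= 1.0:
            problems.append(f"{name} must be in [0, 1]")
    if kp_dim not in (2, 3):
        problems.append("kp_dim must be 2 or 3")
    # The table says display_size is set when the mode is 'video'
    if mode == "video" and display_size is None:
        problems.append("display_size should be set in video mode")
    return problems

print(check_yolov8_args("classify", "image"))  # []
```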

Deploying the Model to Implement Image Inference#

For image inference, refer to the following code and modify the variables defined in __main__ to match your setup:

from libs.YOLO import YOLOv8
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example. Please modify it to your own test image, model path, label name, and model input size for custom scenarios
    img_path="/data/test_apple.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[224,224]

    confidence_threshold = 0.5
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize YOLOv8 instance
    yolo=YOLOv8(task_type="classify",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
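
The line rgb888p_size=[img.shape[2],img.shape[1]] suggests that the array returned by read_image is laid out as channels, height, width (CHW), while rgb888p_size is given as [width, height]. That layout is inferred from the indexing rather than stated in this guide. A minimal illustration, with a plain tuple standing in for img.shape and a made-up 640x480 image:

```python
def size_from_chw(shape):
    """Convert a (channels, height, width) shape to [width, height]."""
    channels, height, width = shape
    return [width, height]

print(size_from_chw((3, 480, 640)))  # [640, 480]
```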

Deploying the Model to Implement Video Inference#

For video inference, refer to the following code and modify the variables defined in __main__ to match your setup:

from libs.PipeLine import PipeLine
from libs.YOLO import YOLOv8
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example. Please modify it to your own model path, label name, and model input size for custom scenarios
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[224,224]

    # Add display mode, default is hdmi, optional hdmi/lcd/lt9611/st7701/hx8399, where hdmi defaults to lt9611 with resolution 1920*1080; lcd defaults to st7701 with resolution 800*480
    display_mode="lcd"
    rgb888p_size=[640,360]
    confidence_threshold = 0.8
    # Initialize PipeLine
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize YOLOv8 instance
    yolo=YOLOv8(task_type="classify",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,debug_mode=0)
    yolo.config_preprocess()
    try:
        while True:
            with ScopedTiming("total",1):
                # Frame-by-frame inference
                img=pl.get_frame()
                res=yolo.run(img)
                yolo.draw_result(res,pl.osd_img)
                pl.show_image()
                gc.collect()
    finally:
        # The loop never ends on its own, so release resources when it is interrupted
        yolo.deinit()
        pl.destroy()
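
conf_thresh is described above as the class confidence threshold for classification. The sketch below illustrates one common way such a threshold is applied: convert raw scores to probabilities with softmax, take the top class, and report it only if its probability reaches the threshold. This is an illustration in plain Python, not the internals of the YOLOv8 class; the function name top_class and the example scores are assumptions.

```python
import math

def top_class(scores, labels, conf_thresh):
    """Return (label, probability) for the best class, or None if below threshold."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < conf_thresh:
        return None
    return labels[best], probs[best]

labels = ["apple", "banana", "orange"]
print(top_class([4.0, 1.0, 0.5], labels, 0.8))  # one dominant score: the apple label is returned
print(top_class([1.0, 0.9, 0.8], labels, 0.8))  # near-equal scores: None
```

A higher threshold, such as the 0.8 used in the video example, trades missed results for fewer wrong labels on ambiguous frames.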

YOLOv8 Fruit Detection#

YOLOv8 Source Code and Training Environment Setup#

For setting up the YOLOv8 training environment, please refer to ultralytics/ultralytics: Ultralytics YOLO 🚀 (github.com)

# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics

If you have already set up the environment, please ignore this step.

Training Data Preparation#

First create a new folder named yolov8, then download the provided sample dataset. It contains classification, detection, and segmentation datasets for a three-fruit scenario (apple, banana, orange). Extract the dataset into the yolov8 directory and use fruits_yolo as the dataset for the fruit detection task. The sample dataset also contains yolo_pen_obb, a rotated object detection dataset for a desktop pen scenario, and car_plate, a license plate keypoint dataset.

If you want to use your own dataset for training, you can download X-AnyLabeling to complete annotation. Convert the annotated data into the training data format officially supported by yolov8 for subsequent training.

cd yolov8
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip

If you have already downloaded the data, please ignore this step.
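
When converting your own annotations, the YOLO detection label format stores one object per line as a class index followed by the box center and size, all normalized to the image dimensions. The sketch below converts a pixel-coordinate box to such a line; the function name to_yolo_line and the sample box are illustrative, and your annotation tool may already offer an equivalent export.

```python
def to_yolo_line(class_id, box, img_w, img_h):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line."""
    x_min, y_min, x_max, y_max = box
    cx = (x_min + x_max) / 2 / img_w   # normalized box center x
    cy = (y_min + y_max) / 2 / img_h   # normalized box center y
    w = (x_max - x_min) / img_w        # normalized box width
    h = (y_max - y_min) / img_h        # normalized box height
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"

# An apple (class 0) occupying pixels (160, 120)-(480, 360) in a 640x480 image
print(to_yolo_line(0, (160, 120, 480, 360), 640, 480))
# 0 0.500000 0.500000 0.500000 0.500000
```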

Using YOLOv8 to Train the Fruit Detection Model#

Execute the following command in the yolov8 directory to train the three-class fruit detection model with YOLOv8:

yolo detect train data=datasets/fruits_yolo.yaml model=yolov8n.pt epochs=300 imgsz=320

Converting the Fruit Detection kmodel#

Model conversion requires installing the following libraries in the training environment:

# linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires installing dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0

# windows platform: please install dotnet-7 by yourself and add environment variables. Online installation of nncase via pip is supported, but the nncase-kpu library needs to be installed offline. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and use pip to install in the download directory of nncase_kpu-2.*-py2.py3-none-win_amd64.whl
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl

# In addition to nncase and nncase-kpu, other libraries used by the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36

Download the model conversion script package test_yolov8.zip and extract it into the yolov8 directory:

wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolov8.zip
unzip test_yolov8.zip

Use the commands below to export the .pt model under runs/detect/train/weights to ONNX, then convert the ONNX model to a kmodel:

# Export to ONNX; adjust the .pt model path to match your training run
yolo export model=runs/detect/train/weights/best.pt format=onnx imgsz=320
cd test_yolov8/detect
# Convert to kmodel; adjust the ONNX model path as needed. The generated kmodel is written to the same directory as the ONNX model
python to_kmodel.py --target k230 --model ../../runs/detect/train/weights/best.onnx --dataset ../calibration_data --input_width 320 --input_height 320 --ptq_option 0
cd ../../

💡 Model Conversion Script (to_kmodel.py) Parameter Description:

| Parameter Name | Description | Explanation | Type |
| --- | --- | --- | --- |
| target | Target Platform | Options are k230/CPU, corresponding to the k230 chip | str |
| model | Model Path | Path of the ONNX model to be converted | str |
| dataset | Calibration Image Set | Image data used during model conversion, in the quantization stage | str |
| input_width | Input Width | Width of the model input | int |
| input_height | Input Height | Height of the model input | int |
| ptq_option | Quantization Method | The quantization strategies are Kld and NoClip, combined with the quantization precision of data and weights. 0 is NoClip+[uint8,uint8], 1 is NoClip+[uint8,int16], 2 is NoClip+[int16,uint8], 3 is Kld+[uint8,uint8], 4 is Kld+[uint8,int16], 5 is Kld+[int16,uint8] | 0/1/2/3/4/5 |

Deploying the Model on k230 Using MicroPython#

Flashing the Image and Installing CanMV IDE#

💡 Firmware Introduction: Download the latest Daily Build firmware for your development board type so that the latest features are supported, or compile the firmware yourself from the latest code. For the tutorial, see: Firmware Compilation.

Download and install CanMV IDE (download link: CanMV IDE download), then write and run code in the IDE.

Copying Model Files#

Connect the IDE and copy the converted model and test images to the path CanMV/data. This path can be customized; you only need to modify the corresponding path when writing the code.

YOLOv8 Module#

The YOLOv8 class integrates five YOLOv8 tasks: classification (classify), detection (detect), segmentation (segment), rotated object detection (obb), and keypoint detection (pose). It supports two inference modes, image and video stream (video), and encapsulates the YOLOv8 kmodel inference process.

  • Import Method

from libs.YOLO import YOLOv8
  • Parameter Description

| Parameter Name | Description | Explanation | Type |
| --- | --- | --- | --- |
| task_type | Task Type | Supports five task types; options are 'classify'/'detect'/'segment'/'obb'/'pose' | str |
| mode | Inference Mode | Supports two inference modes; options are 'image'/'video'. 'image' runs inference on a single image, 'video' runs inference on the real-time video stream captured by the camera | str |
| kmodel_path | kmodel Path | Path of the kmodel copied to the development board | str |
| labels | Class Label List | Label names of the categories | list[str] |
| rgb888p_size | Inference Frame Resolution | Resolution of the current inference frame, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | Input resolution used when training the YOLOv8 model, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video'; supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | Class confidence threshold for classification tasks, and object confidence threshold for detection and segmentation tasks, such as 0.5 | float [0~1] |
| nms_thresh | NMS Threshold | Non-maximum suppression threshold, required for detection and segmentation tasks | float [0~1] |
| mask_thresh | Mask Threshold | Binarization threshold for segmenting the object inside the detection box in segmentation tasks | float [0~1] |
| kp_num | Number of Keypoints | Number of keypoints in the keypoint detection task | int |
| kp_dim | Keypoint Dimension | Dimension of keypoints in the keypoint detection task; only 2 and 3 are supported, determined by the trained model | int [2/3] |
| max_boxes_num | Maximum Number of Detection Boxes | Maximum number of detection boxes returned for one frame | int |
| debug_mode | Debug Mode | Whether timing is enabled; options are 0/1, 0 disables timing, 1 enables timing | int [0/1] |

Deploying the Model to Implement Image Inference#

For image inference, refer to the following code and modify the variables defined in __main__ to match your setup:

from libs.YOLO import YOLOv8
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify it to your own test image, model path, label name, and model input size
    img_path="/data/test.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]

    confidence_threshold = 0.5
    nms_threshold=0.45
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize the YOLOv8 instance
    yolo=YOLOv8(task_type="detect",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
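
To make the roles of conf_thresh, nms_thresh, and max_boxes_num concrete, the sketch below implements a basic class-agnostic non-maximum suppression in plain Python. It illustrates the general technique only and is not the post-processing code inside the YOLOv8 class; the function names and sample boxes are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def nms(dets, conf_thresh=0.5, nms_thresh=0.45, max_boxes_num=50):
    """dets: list of (box, score). Returns the kept detections, best first."""
    # 1. Drop low-confidence boxes and sort the rest by score
    candidates = sorted((d for d in dets if d[1] >= conf_thresh),
                        key=lambda d: d[1], reverse=True)
    kept = []
    for box, score in candidates:
        # 2. Keep a box only if it does not overlap a better box too much
        if all(iou(box, k[0]) <= nms_thresh for k in kept):
            kept.append((box, score))
        # 3. Stop once the maximum number of boxes is reached
        if len(kept) == max_boxes_num:
            break
    return kept

dets = [((10, 10, 110, 110), 0.9),    # best box
        ((20, 20, 120, 120), 0.8),    # overlaps the best box (IoU about 0.68)
        ((200, 200, 260, 260), 0.7),  # separate object
        ((0, 0, 50, 50), 0.3)]        # below conf_thresh
print(len(nms(dets)))  # 2
```

Lowering nms_thresh removes more overlapping boxes, while raising conf_thresh discards weak detections before suppression runs.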

Deploying the Model to Implement Video Inference#

For video inference, refer to the following code and modify the variables defined in __main__ to match your setup:

from libs.PipeLine import PipeLine
from libs.YOLO import YOLOv8
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify it to your own model path, label name, and model input size
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]

    # Add display mode, default is hdmi, options are hdmi/lcd/lt9611/st7701/hx8399. Among them, hdmi defaults to lt9611, resolution 1920*1080; lcd defaults to st7701, resolution 800*480
    display_mode="lcd"
    rgb888p_size=[640,360]
    confidence_threshold = 0.5
    nms_threshold=0.45
    # Initialize PipeLine
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize the YOLOv8 instance
    yolo=YOLOv8(task_type="detect",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    try:
        while True:
            with ScopedTiming("total",1):
                # Infer frame by frame
                img=pl.get_frame()
                res=yolo.run(img)
                yolo.draw_result(res,pl.osd_img)
                pl.show_image()
                gc.collect()
    finally:
        # The loop never ends on its own, so release resources when it is interrupted
        yolo.deinit()
        pl.destroy()
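
In video mode the model runs on frames of rgb888p_size while results are drawn on an overlay of display_size, so box coordinates presumably need rescaling between the two. The sketch below shows that arithmetic for the values used above ([640,360] frames on the [800,480] lcd). It assumes boxes are (x1, y1, x2, y2) in inference-frame pixels; how draw_result handles this internally is not shown in this guide, and scale_box is a hypothetical helper.

```python
def scale_box(box, rgb888p_size, display_size):
    """Map an (x1, y1, x2, y2) box from inference-frame pixels to display pixels."""
    src_w, src_h = rgb888p_size
    dst_w, dst_h = display_size
    x1, y1, x2, y2 = box
    # Integer arithmetic avoids floating-point rounding surprises
    return (x1 * dst_w // src_w, y1 * dst_h // src_h,
            x2 * dst_w // src_w, y2 * dst_h // src_h)

print(scale_box((64, 36, 320, 180), [640, 360], [800, 480]))  # (80, 48, 400, 240)
```

Note that 640x360 and 800x480 have different aspect ratios, so the horizontal and vertical scale factors differ.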

YOLOv8 Fruit Segmentation#

YOLOv8 Source Code and Training Environment Setup#

For setting up the YOLOv8 training environment, please refer to ultralytics/ultralytics: Ultralytics YOLO 🚀 (github.com)

# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics

If you have already set up the environment, please ignore this step.

Training Data Preparation#

First create a new folder named yolov8, then download the provided sample dataset. It contains classification, detection, and segmentation datasets for a three-fruit scenario (apple, banana, orange). Extract the dataset into the yolov8 directory and use fruits_seg as the dataset for the fruit segmentation task. The sample dataset also contains yolo_pen_obb, a rotated object detection dataset for a desktop pen scenario, and car_plate, a license plate keypoint dataset.

If you want to use your own dataset for training, you can download X-AnyLabeling to complete the annotation. Manually convert the annotated data into the training data format officially supported by yolov8 for subsequent training.

cd yolov8
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip

If you have already downloaded the data, please ignore this step.
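
For custom segmentation data, the YOLO segmentation label format stores one object per line as a class index followed by the polygon vertices, with x normalized by image width and y by image height. The sketch below converts a pixel-coordinate polygon to such a line; the function name to_yolo_seg_line and the sample polygon are illustrative.

```python
def to_yolo_seg_line(class_id, polygon, img_w, img_h):
    """Convert a pixel polygon [(x, y), ...] to a YOLO segmentation label line."""
    if len(polygon) < 3:
        raise ValueError("a polygon needs at least three points")
    parts = [str(class_id)]
    for x, y in polygon:
        parts.append(f"{x / img_w:.6f}")
        parts.append(f"{y / img_h:.6f}")
    return " ".join(parts)

# A triangular banana outline (class 1) in a 640x480 image
print(to_yolo_seg_line(1, [(64, 48), (320, 48), (320, 240)], 640, 480))
# 1 0.100000 0.100000 0.500000 0.100000 0.500000 0.500000
```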

Using YOLOv8 to Train the Fruit Segmentation Model#

Execute the following command in the yolov8 directory to use yolov8 to train a three-class fruit segmentation model:

yolo segment train data=datasets/fruits_seg.yaml model=yolov8n-seg.pt epochs=100 imgsz=320

Converting the Fruit Segmentation kmodel#

The model conversion requires installing the following libraries in the training environment:

# linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires installing dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0

# windows platform: please install dotnet-7 by yourself and add environment variables. Online pip installation of nncase is supported, but the nncase-kpu library requires offline installation. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and use pip to install in the download directory of nncase_kpu-2.*-py2.py3-none-win_amd64.whl
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl

# In addition to nncase and nncase-kpu, other libraries used by the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36

Download the model conversion script package test_yolov8.zip and extract it into the yolov8 directory:

wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolov8.zip
unzip test_yolov8.zip

Use the commands below to export the .pt model under runs/segment/train/weights to ONNX, then convert the ONNX model to a kmodel:

# Export to ONNX; adjust the .pt model path to match your training run
yolo export model=runs/segment/train/weights/best.pt format=onnx imgsz=320
cd test_yolov8/segment
# Convert to kmodel; adjust the ONNX model path as needed. The generated kmodel is written to the same directory as the ONNX model
python to_kmodel.py --target k230 --model ../../runs/segment/train/weights/best.onnx --dataset ../calibration_data --input_width 320 --input_height 320 --ptq_option 0
cd ../../

💡 Model Conversion Script (to_kmodel.py) Parameter Description:

| Parameter Name | Description | Explanation | Type |
| --- | --- | --- | --- |
| target | Target Platform | Options are k230/CPU, corresponding to the k230 chip | str |
| model | Model Path | Path of the ONNX model to be converted | str |
| dataset | Calibration Image Set | Image data used during model conversion, in the quantization stage | str |
| input_width | Input Width | Width of the model input | int |
| input_height | Input Height | Height of the model input | int |
| ptq_option | Quantization Method | The quantization strategies are Kld and NoClip, combined with the quantization precision of data and weights. 0 is NoClip+[uint8,uint8], 1 is NoClip+[uint8,int16], 2 is NoClip+[int16,uint8], 3 is Kld+[uint8,uint8], 4 is Kld+[uint8,int16], 5 is Kld+[int16,uint8] | 0/1/2/3/4/5 |

Deploying the Model on k230 Using MicroPython#

Flashing the Image and Installing CanMV IDE#

💡 Firmware Introduction: Download the latest Daily Build firmware for your development board type so that the latest features are supported, or compile the firmware yourself from the latest code. For the tutorial, see: Firmware Compilation.

Download and install CanMV IDE (download link: CanMV IDE download), then write and run code in the IDE.

Model File Copy#

Connect the IDE and copy the converted model and test images to the CanMV/data directory. This path can be customized; you only need to update the corresponding paths in your code.

YOLOv8 Module#

The YOLOv8 class integrates five YOLOv8 tasks: classification (classify), detection (detect), segmentation (segment), rotated object detection (obb), and keypoint detection (pose). It supports two inference modes, image and video stream (video), and encapsulates the YOLOv8 kmodel inference process.

  • Import Method

from libs.YOLO import YOLOv8
  • Parameter Description

| Parameter Name | Description | Explanation | Type |
| --- | --- | --- | --- |
| task_type | Task Type | Supports five task types; options are 'classify'/'detect'/'segment'/'obb'/'pose' | str |
| mode | Inference Mode | Supports two inference modes; options are 'image'/'video'. 'image' runs inference on a single image, 'video' runs inference on the real-time video stream captured by the camera | str |
| kmodel_path | kmodel Path | Path of the kmodel copied to the development board | str |
| labels | Class Label List | Label names of the categories | list[str] |
| rgb888p_size | Inference Frame Resolution | Resolution of the current inference frame, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | Input resolution used when training the YOLOv8 model, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video'; supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | Class confidence threshold for classification tasks, and object confidence threshold for detection and segmentation tasks, such as 0.5 | float [0~1] |
| nms_thresh | NMS Threshold | Non-maximum suppression threshold, required for detection and segmentation tasks | float [0~1] |
| mask_thresh | Mask Threshold | Binarization threshold for segmenting the object inside the detection box in segmentation tasks | float [0~1] |
| kp_num | Number of Keypoints | Number of keypoints in the keypoint detection task | int |
| kp_dim | Keypoint Dimension | Dimension of keypoints in the keypoint detection task; only 2 and 3 are supported, determined by the trained model | int [2/3] |
| max_boxes_num | Maximum Number of Detection Boxes | Maximum number of detection boxes returned for one frame | int |
| debug_mode | Debug Mode | Whether timing is enabled; options are 0/1, 0 disables timing, 1 enables timing | int [0/1] |

Deploying the Model to Implement Image Inference#

For image inference, refer to the following code and modify the variables defined in __main__ to match your setup:

from libs.YOLO import YOLOv8
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example, please modify the test image, model path, label name, and model input size for your custom scenario
    img_path="/data/test.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]

    confidence_threshold = 0.5
    nms_threshold=0.45
    mask_threshold=0.5
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize the YOLOv8 instance
    yolo=YOLOv8(task_type="segment",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,mask_thresh=mask_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
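
mask_thresh is described above as the binarization threshold for the object inside a detection box. The sketch below shows what binarizing a soft mask means, using nested lists in place of an array. The strict greater-than comparison and the function name binarize_mask are assumptions; the YOLOv8 class may treat values equal to the threshold differently.

```python
def binarize_mask(mask, mask_thresh=0.5):
    """mask: 2-D list of values in [0, 1]. Returns a 2-D list of 0/1 values."""
    return [[1 if value > mask_thresh else 0 for value in row] for row in mask]

soft_mask = [[0.1, 0.6, 0.9],
             [0.4, 0.5, 0.8]]
print(binarize_mask(soft_mask))  # [[0, 1, 1], [0, 0, 1]]
```

A higher mask_thresh yields tighter masks that keep only the pixels the model is most sure about.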

Deploying the Model to Implement Video Inference#

For video inference, refer to the following code and modify the variables defined in __main__ to match your setup:

from libs.PipeLine import PipeLine
from libs.YOLO import YOLOv8
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example, please modify the model path, label name, and model input size for your custom scenario
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]

    # Add display mode, default is hdmi, options are hdmi/lcd/lt9611/st7701/hx8399, where hdmi defaults to lt9611, resolution 1920*1080; lcd defaults to st7701, resolution 800*480
    display_mode="lcd"
    rgb888p_size=[320,320]
    confidence_threshold = 0.5
    nms_threshold=0.45
    mask_threshold=0.5
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize the YOLOv8 instance
    yolo=YOLOv8(task_type="segment",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,mask_thresh=mask_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
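    try:
        while True:
            with ScopedTiming("total",1):
                # Infer frame by frame
                img=pl.get_frame()
                res=yolo.run(img)
                yolo.draw_result(res,pl.osd_img)
                pl.show_image()
                gc.collect()
    finally:
        # The loop never ends on its own, so release resources when it is interrupted
        yolo.deinit()
        pl.destroy()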

YOLOv8 Rotated Object Detection#

YOLOv8 Source Code and Training Environment Setup#

For setting up the YOLOv8 training environment, please refer to ultralytics/ultralytics: Ultralytics YOLO 🚀 (github.com)

# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics

If you have already set up the environment, please ignore this step.

Training Data Preparation#

First create a new folder named yolov8, then download the provided sample dataset, which includes a rotated object detection dataset for a single-category (pen) scenario. Extract the dataset into the yolov8 directory and use yolo_pen_obb as the dataset for the rotated object detection task.

If you want to use your own dataset for training, you can download X-AnyLabeling to complete the annotation. Convert the annotated data into the training data format officially supported by yolov8 for subsequent training.

cd yolov8
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip

If you have already downloaded the data, please ignore this step.
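
For custom rotated-box data, the YOLO OBB label format stores one object per line as a class index followed by the four corner points, normalized to the image size. The sketch below derives the corners from a center, size, and angle, then writes the label line. The function names, the corner order, and the sample pen are illustrative assumptions; check the order your annotation tool exports.

```python
import math

def obb_corners(cx, cy, w, h, angle_deg):
    """Corners of a w x h rectangle rotated by angle_deg about its center (cx, cy)."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    corners = []
    for dx, dy in ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)):
        # Standard 2-D rotation of the offset (dx, dy), then shift to the center
        corners.append((cx + dx * cos_a - dy * sin_a,
                        cy + dx * sin_a + dy * cos_a))
    return corners

def to_yolo_obb_line(class_id, corners, img_w, img_h):
    """Format four (x, y) pixel corners as a normalized OBB label line."""
    parts = [str(class_id)]
    for x, y in corners:
        parts += [f"{x / img_w:.6f}", f"{y / img_h:.6f}"]
    return " ".join(parts)

# A horizontal pen (class 0), 200x20 pixels, centered in a 320x320 image
corners = obb_corners(160, 160, 200, 20, 0)
print(to_yolo_obb_line(0, corners, 320, 320))
# 0 0.187500 0.468750 0.812500 0.468750 0.812500 0.531250 0.187500 0.531250
```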

Using YOLOv8 to Train the Rotated Object Detection Model#

Execute the following command in the yolov8 directory to train a single-class rotated object detection model with YOLOv8:

yolo obb train data=datasets/pen_obb.yaml model=yolov8n-obb.pt epochs=100 imgsz=320

Converting the Rotated Object Detection kmodel#

Model conversion requires installing the following libraries in the training environment:

# linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0

# windows platform: please install dotnet-7 yourself and add environment variables. nncase can be installed online via pip, but the nncase-kpu library needs to be installed offline. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and use pip to install in the directory where nncase_kpu-2.*-py2.py3-none-win_amd64.whl is downloaded
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl

# In addition to nncase and nncase-kpu, other libraries used in the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36

Download the model conversion script package test_yolov8.zip and extract it into the yolov8 directory:

wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolov8.zip
unzip test_yolov8.zip

Use the commands below to export the .pt model under runs/obb/train/weights to ONNX, then convert the ONNX model to a kmodel:

# Export to ONNX; adjust the .pt model path to match your training run
yolo export model=runs/obb/train/weights/best.pt format=onnx imgsz=320
cd test_yolov8/obb
# Convert to kmodel; adjust the ONNX model path as needed. The generated kmodel is written to the same directory as the ONNX model
python to_kmodel.py --target k230 --model ../../runs/obb/train/weights/best.onnx --dataset ../calibration_obb --input_width 320 --input_height 320 --ptq_option 0
cd ../../

💡 Model Conversion Script (to_kmodel.py) Parameter Description:

| Parameter Name | Description | Explanation | Type |
| --- | --- | --- | --- |
| target | Target Platform | Options are k230/CPU, corresponding to the k230 chip | str |
| model | Model Path | The path of the ONNX model to be converted | str |
| dataset | Calibration Image Set | The image data used in model conversion, used in the quantization stage | str |
| input_width | Input Width | The width of the model input | int |
| input_height | Input Height | The height of the model input | int |
| ptq_option | Quantization Method | Quantization strategies are Kld and NoClip, combined with the data and weights quantization precision: 0 is NoClip+[uint8,uint8], 1 is NoClip+[uint8,int16], 2 is NoClip+[int16,uint8], 3 is Kld+[uint8,uint8], 4 is Kld+[uint8,int16], 5 is Kld+[int16,uint8] | 0/1/2/3/4/5 |
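The six ptq_option values can be read as a lookup table. The sketch below transcribes the options listed above; it assumes the bracketed pair is ordered [data, weights], following the order in which the description names them, and the helper describe_ptq_option is hypothetical.

```python
# Option table transcribed from the ptq_option description above.
# Assumption: each bracketed pair is ordered [data type, weights type].
PTQ_OPTIONS = {
    0: ("NoClip", "uint8", "uint8"),
    1: ("NoClip", "uint8", "int16"),
    2: ("NoClip", "int16", "uint8"),
    3: ("Kld", "uint8", "uint8"),
    4: ("Kld", "uint8", "int16"),
    5: ("Kld", "int16", "uint8"),
}

def describe_ptq_option(option):
    """Return a readable description of one ptq_option value."""
    method, data_type, weight_type = PTQ_OPTIONS[option]
    return f"{method}: data={data_type}, weights={weight_type}"

for option in sorted(PTQ_OPTIONS):
    print(option, describe_ptq_option(option))
```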

Deploying the Model on k230 Using MicroPython#

Flashing the Image and Installing CanMV IDE#

💡 Firmware Introduction: Please download the latest Daily Build firmware for your development board type to ensure that the latest features are supported. Alternatively, compile the firmware yourself from the latest code; see the tutorial: Firmware Compilation.

Download and install CanMV IDE (download link: CanMV IDE download), then write and run code in the IDE.

Model File Copy#

Connect the IDE, and copy the converted model and test images to the CanMV/data directory. This path can be customized; just modify the corresponding path when writing code.

YOLOv8 Module#

The YOLOv8 class integrates five YOLOv8 tasks: classification (classify), detection (detect), segmentation (segment), rotated object detection (obb), and keypoint detection (pose). It supports two inference modes, image and video stream, and encapsulates the YOLOv8 kmodel inference process.

  • Import Method

from libs.YOLO import YOLOv8
  • Parameter Description

| Parameter Name | Description | Explanation | Type |
| --- | --- | --- | --- |
| task_type | Task Type | Supports five types of tasks, options are 'classify'/'detect'/'segment'/'obb'/'pose' | str |
| mode | Inference Mode | Supports two inference modes, options are 'image'/'video'; 'image' means inferring images, 'video' means inferring the real-time video stream captured by the camera | str |
| kmodel_path | kmodel Path | The kmodel path copied to the development board | str |
| labels | Category Label List | Label names for different categories | list[str] |
| rgb888p_size | Inference Frame Resolution | The resolution of the current inference frame, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | The input resolution during YOLOv8 model training, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video', supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | The category confidence threshold for classification tasks, the target confidence threshold for detection and segmentation tasks, such as 0.5 | float, 0~1 |
| nms_thresh | NMS Threshold | Non-maximum suppression threshold, required for detection and segmentation tasks | float, 0~1 |
| mask_thresh | Mask Threshold | The binarization threshold for segmenting objects in the detection box in segmentation tasks | float, 0~1 |
| kp_num | Number of Keypoints | The number of keypoints in the keypoint detection task | int |
| kp_dim | Keypoint Dimension | The dimension of keypoints in the keypoint detection task, only supports 2 and 3, determined by the trained model | int, 2 or 3 |
| max_boxes_num | Maximum Number of Detection Boxes | The maximum number of detection boxes allowed to return in one frame of image | int |
| debug_mode | Debug Mode | Whether the timing function takes effect, options are 0/1; 0 means no timing, 1 means timing | int, 0 or 1 |
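Because a mistyped option only shows up once the script runs on the board, it can help to sanity-check the values on the PC first. The following sketch encodes the constraints from the table above; check_yolo_args is a hypothetical helper and is not part of libs.YOLO.

```python
# Allowed values taken from the parameter table above.
TASK_TYPES = ("classify", "detect", "segment", "obb", "pose")
MODES = ("image", "video")

def check_yolo_args(task_type, mode, conf_thresh=0.5, nms_thresh=0.45, kp_dim=2):
    """Return a list of problems; an empty list means the values look plausible."""
    problems = []
    if task_type not in TASK_TYPES:
        problems.append("task_type must be one of " + "/".join(TASK_TYPES))
    if mode not in MODES:
        problems.append("mode must be one of " + "/".join(MODES))
    for name, value in (("conf_thresh", conf_thresh), ("nms_thresh", nms_thresh)):
        if not 0.0 <= value <= 1.0:
            problems.append(name + " must be in the range 0~1")
    # kp_dim only matters for the keypoint detection task.
    if task_type == "pose" and kp_dim not in (2, 3):
        problems.append("kp_dim must be 2 or 3")
    return problems

print(check_yolo_args("obb", "image", conf_thresh=0.1, nms_thresh=0.6))
print(check_yolo_args("rotate", "stream", conf_thresh=1.5, nms_thresh=0.6))
```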

Deploying the Model for Image Inference#

For image inference, please refer to the following code and modify the parameter variables defined in __main__ according to your actual situation:

from libs.YOLO import YOLOv8
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is just an example, please modify the path of your own test image, model, label name, and model input size for custom scenarios
    img_path="/data/test_obb.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ['pen']
    model_input_size=[320,320]

    confidence_threshold = 0.1
    nms_threshold=0.6
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize YOLOv8 instance
    yolo=YOLOv8(task_type="obb",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,max_boxes_num=100,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
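The line rgb888p_size=[img.shape[2],img.shape[1]] takes width and height from the last two axes of the array, which suggests that read_image returns a channel-first (channels, height, width) array. That layout is inferred from the indexing in this example rather than stated by the library, so treat the sketch below as an illustration; size_from_chw_shape is a hypothetical name.

```python
def size_from_chw_shape(shape):
    """Return [width, height] for a shape laid out as (channels, height, width)."""
    channels, height, width = shape
    return [width, height]

# A 1280x720 RGB image stored channel-first has shape (3, 720, 1280).
print(size_from_chw_shape((3, 720, 1280)))
```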

Deploying the Model for Video Inference#

For video inference, please refer to the following code and modify the variables defined in __main__ according to your actual situation:

from libs.PipeLine import PipeLine
from libs.YOLO import YOLOv8
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is just an example, please modify the path of your own model, label name, and model input size for custom scenarios
    kmodel_path="/data/best_yolov8n.kmodel"
    labels = ['pen']
    model_input_size=[320,320]

    # Add display mode, default is hdmi, options are hdmi/lcd/lt9611/st7701/hx8399, where hdmi defaults to lt9611, resolution 1920*1080; lcd defaults to st7701, resolution 800*480
    display_mode="lcd"
    rgb888p_size=[640,360]
    confidence_threshold = 0.1
    nms_threshold=0.6
    # Initialize PipeLine
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize YOLOv8 instance
    yolo=YOLOv8(task_type="obb",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    try:
        while True:
            with ScopedTiming("total",1):
                # Frame-by-frame inference
                img=pl.get_frame()
                res=yolo.run(img)
                yolo.draw_result(res,pl.osd_img)
                pl.show_image()
                gc.collect()
    finally:
        # Release resources when the loop exits or is interrupted
        yolo.deinit()
        pl.destroy()

YOLOv8 License Plate Corner Point Detection#

YOLOv8 Source Code and Training Environment Setup#

For setting up the YOLOv8 training environment, please refer to ultralytics/ultralytics: Ultralytics YOLO 🚀 (github.com)

# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics

If you have already set up the environment, please ignore this step.

Training Data Preparation#

You can first create a new folder yolov8. Please download the provided sample dataset. The sample dataset includes a dataset for the scenario of detecting the four corners of a license plate. Extract the dataset to the yolov8 directory and use car_plate as the dataset for the keypoint detection task.

If you want to use your own dataset for training, you can download X-AnyLabeling to complete annotation. Convert the annotated data into the training data format officially supported by yolov8 for subsequent training.

cd yolov8
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip

If you have already downloaded the data, please ignore this step.

Using YOLOv8 to Train a Keypoint Detection Model#

Execute the following command in the yolov8 directory to train a single-class keypoint detection model using yolov8:

yolo pose train data=datasets/car_plate.yaml model=yolov8n-pose.pt epochs=100 imgsz=320
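The same training run can also be started from Python through the ultralytics package. The sketch below mirrors the CLI arguments above; TRAIN_ARGS and train_pose_model are hypothetical names, and the import is deferred so that the settings can be inspected on a machine without ultralytics installed.

```python
# Settings copied from the CLI command above.
TRAIN_ARGS = {"data": "datasets/car_plate.yaml", "epochs": 100, "imgsz": 320}

def train_pose_model(weights="yolov8n-pose.pt", **overrides):
    """Train a keypoint detection model with the ultralytics Python API."""
    # Imported lazily; requires `pip install ultralytics`.
    from ultralytics import YOLO

    args = dict(TRAIN_ARGS, **overrides)
    model = YOLO(weights)
    model.train(**args)
    return model
```

Calling train_pose_model() starts the run; train_pose_model(epochs=10) is a quick way to check that the dataset loads before committing to a full run.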

Converting Keypoint Detection kmodel#

Model conversion requires installing the following libraries in the training environment:

# linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires dotnet-7 to be installed
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0

# windows platform: please install dotnet-7 by yourself and add environment variables. Online installation of nncase via pip is supported, but the nncase-kpu library needs to be installed offline. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and use pip to install in the directory where nncase_kpu-2.*-py2.py3-none-win_amd64.whl is downloaded
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl

# In addition to nncase and nncase-kpu, other libraries used by the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36

Download the model conversion script package test_yolov8.zip and extract it into the yolov8 directory:

wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolov8.zip
unzip test_yolov8.zip

Follow the commands below to first export the pt model under runs/pose/train/weights to an ONNX model, and then convert it to a kmodel:

# Export onnx, please choose the pt model path by yourself
yolo export model=runs/pose/train/weights/best.pt format=onnx imgsz=320
cd test_yolov8/pose
# Convert kmodel, please choose the onnx model path by yourself. The generated kmodel will be in the same directory as the onnx model
python to_kmodel.py --target k230 --model ../../runs/pose/train/weights/best.onnx --dataset ../calibration_pose --input_width 320 --input_height 320 --ptq_option 0
cd ../../

💡 Model Conversion Script (to_kmodel.py) Parameter Description:

| Parameter Name | Description | Explanation | Type |
| --- | --- | --- | --- |
| target | Target Platform | Optional values are k230/CPU, corresponding to the k230 chip | str |
| model | Model Path | The path of the ONNX model to be converted | str |
| dataset | Calibration Image Set | Image data used during model conversion, used in the quantization stage | str |
| input_width | Input Width | The width of the model input | int |
| input_height | Input Height | The height of the model input | int |
| ptq_option | Quantization Method | The quantization strategies are Kld and NoClip, combined with the quantization precision of data and weights: 0 is NoClip+[uint8,uint8], 1 is NoClip+[uint8,int16], 2 is NoClip+[int16,uint8], 3 is Kld+[uint8,uint8], 4 is Kld+[uint8,int16], 5 is Kld+[int16,uint8] | 0/1/2/3/4/5 |
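To get a feel for why the 16-bit options can preserve more accuracy than the 8-bit ones, the sketch below quantizes a few floats over their full min/max range (no clipping) and compares the round-trip error at 8 and 16 bits. It is a simplified illustration using unsigned levels only, not nncase's implementation, and the function names are hypothetical.

```python
def quantize_roundtrip(values, bits):
    """Quantize floats to `bits`-bit unsigned levels over their min/max range, then dequantize."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    restored = []
    for v in values:
        q = round((v - lo) / scale)      # integer code in [0, levels]
        restored.append(lo + q * scale)  # value that the integer code maps back to
    return restored

def max_abs_error(values, bits):
    """Largest absolute difference between the original and round-tripped values."""
    restored = quantize_roundtrip(values, bits)
    return max(abs(a - b) for a, b in zip(values, restored))

samples = [-1.0, -0.37, 0.0, 0.123, 0.5, 0.98, 2.0]
print("8-bit max error :", max_abs_error(samples, 8))
print("16-bit max error:", max_abs_error(samples, 16))
```

The higher-precision options cost more memory and compute on the device, so it is usually worth starting from option 0 and only moving up if the converted model loses too much accuracy.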

Deploying the Model on k230 Using MicroPython#

Flashing the Image and Installing CanMV IDE#

💡 Firmware Introduction: Please download the latest Daily Build firmware according to your development board type to ensure that the latest features are supported! Or use the latest code to compile the firmware yourself. For the tutorial, see: Firmware Compilation.

Download and install CanMV IDE (Download link: CanMV IDE download), and write and run code in the IDE.

Model File Copy#

Connect the IDE, and copy the converted model and test images to the path CanMV/data. This path can be customized; you only need to modify the corresponding path when writing code.

YOLOv8 Module#

The YOLOv8 class integrates five tasks of YOLOv8, including classification (classify), detection (detect), segmentation (segment), rotated object detection (obb), and keypoint detection (pose); it supports two inference modes, including image and video stream; this class encapsulates the kmodel inference process of YOLOv8.

  • Import Method

from libs.YOLO import YOLOv8
  • Parameter Description

| Parameter Name | Description | Explanation | Type |
| --- | --- | --- | --- |
| task_type | Task Type | Supports five tasks, optional values are 'classify'/'detect'/'segment'/'obb'/'pose' | str |
| mode | Inference Mode | Supports two inference modes, optional values are 'image'/'video'; 'image' means inference on an image, 'video' means inference on a real-time video stream captured by the camera | str |
| kmodel_path | kmodel Path | The path of the kmodel copied to the development board | str |
| labels | Class Label List | Label names of different classes | list[str] |
| rgb888p_size | Inference Frame Resolution | The resolution of the current frame being inferred, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | The input resolution when the YOLOv8 model was trained, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video', supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | The class confidence threshold for classification tasks, and the object confidence threshold for detection and segmentation tasks, such as 0.5 | float, 0~1 |
| nms_thresh | NMS Threshold | Non-maximum suppression threshold, required for detection and segmentation tasks | float, 0~1 |
| mask_thresh | Mask Threshold | The binarization threshold for segmenting the object in the detection box in the segmentation task | float, 0~1 |
| kp_num | Number of Keypoints | The number of keypoints in the keypoint detection task | int |
| kp_dim | Keypoint Dimension | The dimension of keypoints in the keypoint detection task, only 2 and 3 are supported, determined by the trained model | int, 2 or 3 |
| max_boxes_num | Maximum Number of Detection Boxes | The maximum number of detection boxes allowed to be returned in one frame of image | int |
| debug_mode | Debug Mode | Whether the timing function is enabled, optional values are 0/1; 0 means no timing, 1 means timing | int, 0 or 1 |
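rgb888p_size and model_input_size usually differ, so each frame has to be resized before inference. A common approach for detection-style models is letterboxing: scale the frame to fit inside the model input without distortion and pad the remainder. Whether config_preprocess does exactly this is not stated here, so the following is only a sketch of the arithmetic under that assumption; letterbox_params is a hypothetical name.

```python
def letterbox_params(src_size, dst_size):
    """Compute the scale, resized size, and total padding that fit src (w, h) inside dst (w, h)."""
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    # Use the smaller ratio so that both sides fit inside the target.
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w, new_h = int(src_w * scale), int(src_h * scale)
    pad_w, pad_h = dst_w - new_w, dst_h - new_h
    return scale, (new_w, new_h), (pad_w, pad_h)

# A 640x360 frame fed to a 320x320 model input.
print(letterbox_params([640, 360], [320, 320]))
```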

Deploying the Model for Image Inference#

For image inference, please refer to the code below and modify the parameter variables defined in __main__ according to your actual situation:

from libs.YOLO import YOLOv8
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify it to your own test image, model path, label name, model input size, number of keypoints, and keypoint dimension
    img_path="/data/test.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ['plate']
    model_input_size=[320,320]
    kp_num=4
    kp_dim=2

    confidence_threshold = 0.5
    nms_threshold=0.45
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize YOLOv8 model
    yolo=YOLOv8(task_type="pose",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,kp_num=kp_num,kp_dim=kp_dim,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,max_boxes_num=100,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
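kp_num and kp_dim together determine how many numbers describe the keypoints of one detection: kp_num * kp_dim. The sketch below groups such a flat list into one tuple per keypoint. The flat layout is an assumption made for illustration (the structure of res returned by yolo.run is not documented here), and group_keypoints is a hypothetical helper. In Ultralytics pose datasets, a third value per keypoint is a visibility flag.

```python
def group_keypoints(flat, kp_num, kp_dim):
    """Split a flat list of kp_num * kp_dim numbers into one tuple per keypoint."""
    if kp_dim not in (2, 3):
        raise ValueError("kp_dim must be 2 or 3")
    if len(flat) != kp_num * kp_dim:
        raise ValueError("expected %d values, got %d" % (kp_num * kp_dim, len(flat)))
    return [tuple(flat[i * kp_dim:(i + 1) * kp_dim]) for i in range(kp_num)]

# Four license plate corners with (x, y) coordinates, matching kp_num=4 and kp_dim=2.
corners = group_keypoints([10, 20, 110, 22, 108, 60, 12, 58], kp_num=4, kp_dim=2)
print(corners)
```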

Deploying the Model for Video Inference#

For video inference, please refer to the code below and modify the variables defined in __main__ according to your actual situation:

from libs.PipeLine import PipeLine
from libs.YOLO import YOLOv8
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify it to your own model path, label name, model input size, number of keypoints, and keypoint dimension
    kmodel_path="/data/best.kmodel"
    labels = ["plate"]
    model_input_size=[320,320]
    kp_num=4
    kp_dim=2

    # Add display mode, default is hdmi, optional hdmi/lcd/lt9611/st7701/hx8399, where hdmi is defaulted to lt9611 with a resolution of 1920*1080; lcd is defaulted to st7701 with a resolution of 800*480
    display_mode="lcd"
    rgb888p_size=[320,320]
    confidence_threshold = 0.5
    nms_threshold=0.45
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize YOLOv8 model
    yolo=YOLOv8(task_type="pose",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,kp_num=kp_num,kp_dim=kp_dim,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    try:
        while True:
            with ScopedTiming("total",1):
                img=pl.get_frame()
                res=yolo.run(img)
                yolo.draw_result(res,pl.osd_img)
                pl.show_image()
                gc.collect()
    finally:
        # Release resources when the loop exits or is interrupted
        yolo.deinit()
        pl.destroy()
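In video mode, inference runs on frames of rgb888p_size while results are drawn on a layer of display_size, so coordinates have to be mapped from one to the other. draw_result presumably takes care of this internally; the sketch below only illustrates the proportional mapping, and scale_point is a hypothetical name.

```python
def scale_point(x, y, frame_size, display_size):
    """Map a point from inference-frame pixels to display pixels."""
    frame_w, frame_h = frame_size
    display_w, display_h = display_size
    return int(x * display_w / frame_w), int(y * display_h / frame_h)

# The centre of a 320x320 inference frame mapped onto an 800x480 LCD.
print(scale_point(160, 160, [320, 320], [800, 480]))
```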

YOLO11 Fruit Classification#

YOLO11 Source Code and Training Environment Setup#

For setting up the YOLO11 training environment, please refer to ultralytics/ultralytics: Ultralytics YOLO 🚀 (github.com)

# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics

If you have already set up the environment, please ignore this step.

Training Data Preparation#

You can first create a new folder yolo11. Please download the provided sample dataset, which contains classification, detection, and segmentation datasets for a scenario with three types of fruits (apple, banana, orange). Extract the dataset to the yolo11 directory, and use fruits_cls as the dataset for the fruit classification task. The sample dataset also includes a rotated object detection dataset yolo_pen_obb for a desktop pen scenario, and a license plate keypoint detection dataset car_plate.

If you want to use your own dataset for training, you can download X-AnyLabeling to complete annotation. Classification task data does not need to be annotated using tools; you only need to organize the directory structure according to the format. Convert the annotated data to the training data format officially supported by yolo11 for subsequent training.

cd yolo11
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip

If you have already downloaded the data, please ignore this step.
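For classification, the directory structure itself is the annotation: Ultralytics expects one sub-folder per class under each split folder (for example train/apple, train/banana), and class indices follow the sorted folder names. The sketch below lists the classes found under a split; list_classes is a hypothetical helper, and the presence of a train folder inside fruits_cls is an assumption about the sample dataset.

```python
from pathlib import Path

def list_classes(dataset_root, split="train"):
    """Return the sorted class-folder names found under <dataset_root>/<split>."""
    split_dir = Path(dataset_root) / split
    return sorted(p.name for p in split_dir.iterdir() if p.is_dir())

root = Path("datasets/fruits_cls")
# Only print when the dataset has actually been extracted here.
if (root / "train").is_dir():
    print(list_classes(root))
```

The printed order is a convenient reference for the labels list used later in the deployment scripts.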

Using YOLO11 to Train the Fruit Classification Model#

Execute the following command in the yolo11 directory to train the three-class fruit classification model using yolo11:

yolo classify train data=datasets/fruits_cls model=yolo11n-cls.pt epochs=100 imgsz=224

Converting the Fruit Classification kmodel#

Model conversion requires installing the following libraries in the training environment:

# Linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires installing dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0

# Windows platform: Please install dotnet-7 by yourself and add environment variables. Online pip installation of nncase is supported, but the nncase-kpu library needs to be installed offline. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding Python environment, and use pip to install in the download directory of nncase_kpu-2.*-py2.py3-none-win_amd64.whl
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl

# In addition to nncase and nncase-kpu, other libraries used by the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36

Download the model conversion script package test_yolo11.zip and extract it into the yolo11 directory:

wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolo11.zip
unzip test_yolo11.zip

Use the following commands to first export the pt model under runs/classify/train/weights to an ONNX model, and then convert it to a kmodel:

# Export onnx, please select the pt model path yourself
yolo export model=runs/classify/train/weights/best.pt format=onnx imgsz=224
cd test_yolo11/classify
# Convert kmodel, please select the onnx model path yourself, the generated kmodel is in the same level directory as the onnx model
python to_kmodel.py --target k230 --model ../../runs/classify/train/weights/best.onnx --dataset ../calibration_data --input_width 224 --input_height 224 --ptq_option 0
cd ../../

💡 Model Conversion Script (to_kmodel.py) Parameter Description:

| Parameter Name | Description | Instructions | Type |
| --- | --- | --- | --- |
| target | Target Platform | Options are k230/CPU, corresponding to k230 chip | str |
| model | Model Path | Path of the ONNX model to be converted | str |
| dataset | Calibration Image Set | Image data used during model conversion, used in the quantization stage | str |
| input_width | Input Width | Width of the model input | int |
| input_height | Input Height | Height of the model input | int |
| ptq_option | Quantization Method | Quantization strategies are Kld and NoClip, combined with the data and weights quantization precision: 0 is NoClip+[uint8,uint8], 1 is NoClip+[uint8,int16], 2 is NoClip+[int16,uint8], 3 is Kld+[uint8,uint8], 4 is Kld+[uint8,int16], 5 is Kld+[int16,uint8] | 0/1/2/3/4/5 |

Deploying the Model on k230 Using MicroPython#

Flashing the Image and Installing CanMV IDE#

💡 Firmware Introduction: Please download the latest Daily Build firmware according to your development board type to ensure that the latest features are supported! Or use the latest code to compile the firmware yourself. For the tutorial, see: Firmware Compilation.

Download and install CanMV IDE (download link: CanMV IDE download), then write and run the code in the IDE.

Model File Copy#

Connect the IDE, and copy the converted model and test images to the CanMV/data directory. This path can be customized; you only need to modify the corresponding path when writing code.

YOLO11 Module#

The YOLO11 class integrates five YOLO11 tasks: classification (classify), detection (detect), segmentation (segment), rotated object detection (obb), and keypoint detection (pose). It supports two inference modes, image and video stream (video), and encapsulates the YOLO11 kmodel inference process.

  • Import Method

from libs.YOLO import YOLO11
  • Parameter Description

| Parameter Name | Description | Instructions | Type |
| --- | --- | --- | --- |
| task_type | Task Type | Supports five tasks, options are 'classify'/'detect'/'segment'/'obb'/'pose' | str |
| mode | Inference Mode | Supports two inference modes, options are 'image'/'video'; 'image' means inference on an image, 'video' means inference on the real-time video stream captured by the camera | str |
| kmodel_path | kmodel Path | Path of the kmodel copied to the development board | str |
| labels | Class Label List | Label names of different classes | list[str] |
| rgb888p_size | Inference Frame Resolution | Current frame resolution for inference, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | Input resolution when training the YOLO11 model, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video', supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | Class confidence threshold for classification tasks, object confidence threshold for detection and segmentation tasks, such as 0.5 | float, 0~1 |
| nms_thresh | NMS Threshold | Non-maximum suppression threshold, required for detection and segmentation tasks | float, 0~1 |
| mask_thresh | Mask Threshold | Binarization threshold for segmenting the object in the detection box in the segmentation task | float, 0~1 |
| kp_num | Number of Keypoints | Number of keypoints in the keypoint detection task | int |
| kp_dim | Keypoint Dimension | Dimension of keypoints in the keypoint detection task, only 2 and 3 are supported, determined by the trained model | int, 2 or 3 |
| max_boxes_num | Maximum Number of Detection Boxes | Maximum number of detection boxes allowed to be returned in one frame of image | int |
| debug_mode | Debug Mode | Whether the timing function takes effect, options are 0/1; 0 means no timing, 1 means timing | int, 0 or 1 |

Deploying the Model for Image Inference#

For image inference, please refer to the following code and modify the parameter variables defined in __main__ according to your actual situation:

from libs.YOLO import YOLO11
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is just an example, please modify it to your own test image, model path, label name, and model input size for custom scenarios
    img_path="/data/test_apple.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[224,224]

    confidence_threshold = 0.5
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize YOLO11 instance
    yolo=YOLO11(task_type="classify",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
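For the classification task, conf_thresh decides whether the top class is reported at all. The sketch below shows the idea on the PC: turn raw scores into probabilities with softmax, pick the best class, and discard it if its probability is below the threshold. It assumes the model emits raw scores (an exported model may already output probabilities), and classify_scores is a hypothetical helper rather than the code inside libs.YOLO.

```python
import math

def classify_scores(scores, labels, conf_thresh):
    """Return (label, probability) for the top class, or None if it is below conf_thresh."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < conf_thresh:
        return None
    return labels[best], probs[best]

fruit_labels = ["apple", "banana", "orange"]
confident = classify_scores([2.0, 0.1, -1.0], fruit_labels, 0.5)   # one class clearly dominates
uncertain = classify_scores([0.2, 0.1, 0.0], fruit_labels, 0.5)    # scores are nearly equal
print(confident, uncertain)
```

Raising the threshold, as the video example does with 0.8, trades missed classifications for fewer wrong labels on frames that contain none of the trained fruits.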

Deploying the Model for Video Inference#

For video inference, please refer to the following code and modify the variables defined in __main__ according to your actual situation:

from libs.PipeLine import PipeLine
from libs.YOLO import YOLO11
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is just an example, please modify it to your own model path, label name, and model input size for custom scenarios
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[224,224]

    # Add display mode, default is hdmi, optional hdmi/lcd/lt9611/st7701/hx8399, where hdmi defaults to lt9611, resolution 1920*1080; lcd defaults to st7701, resolution 800*480
    display_mode="lcd"
    rgb888p_size=[640,360]
    confidence_threshold = 0.8
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize YOLO11 instance
    yolo=YOLO11(task_type="classify",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,debug_mode=0)
    yolo.config_preprocess()
    try:
        while True:
            with ScopedTiming("total",1):
                img=pl.get_frame()
                res=yolo.run(img)
                yolo.draw_result(res,pl.osd_img)
                pl.show_image()
                gc.collect()
    finally:
        # Release resources when the loop exits or is interrupted
        yolo.deinit()
        pl.destroy()

YOLO11 Fruit Detection#

YOLO11 Source Code and Training Environment Setup#

For setting up the YOLO11 training environment, please refer to ultralytics/ultralytics: Ultralytics YOLO 🚀 (github.com)

# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics

If you have already set up the environment, please ignore this step.

Training Data Preparation#

You can first create a new folder yolo11. Please download the provided sample dataset, which contains classification, detection, and segmentation datasets for a scenario with three types of fruits (apple, banana, orange). Extract the dataset into the yolo11 directory, and use fruits_yolo as the dataset for the fruit detection task. The sample dataset also includes a rotated object detection dataset yolo_pen_obb for a desktop pen scenario, and a license plate keypoint detection dataset car_plate.

If you want to use your own dataset for training, you can download X-AnyLabeling to complete the annotation. Convert the annotated data into the training data format officially supported by yolo11 for subsequent training.

cd yolo11
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip

If you have already downloaded the data, please ignore this step.

Using YOLO11 to Train the Fruit Detection Model#

Execute the following command in the yolo11 directory to train the three-class fruit detection model using yolo11:

yolo detect train data=datasets/fruits_yolo.yaml model=yolo11n.pt epochs=300 imgsz=320

Convert Fruit Detection kmodel#

Model conversion requires installing the following libraries in the training environment:

# linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0

# windows platform: please install dotnet-7 by yourself and add environment variables. Online installation of nncase via pip is supported, but the nncase-kpu library requires offline installation. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and use pip to install in the nncase_kpu-2.*-py2.py3-none-win_amd64.whl download directory
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl

# In addition to nncase and nncase-kpu, the script also uses the following libraries:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36

Download the model conversion script package test_yolo11.zip and extract it into the yolo11 directory:

wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolo11.zip
unzip test_yolo11.zip

Use the following commands to first export the pt model under runs/detect/train/weights to an ONNX model, and then convert it to a kmodel:

# Export onnx, please choose the pt model path by yourself
yolo export model=runs/detect/train/weights/best.pt format=onnx imgsz=320
cd test_yolo11/detect
# Convert kmodel, please choose the onnx model path by yourself. The generated kmodel is in the same directory level as the onnx model
python to_kmodel.py --target k230 --model ../../runs/detect/train/weights/best.onnx --dataset ../calibration_data --input_width 320 --input_height 320 --ptq_option 0
cd ../../

💡 Model Conversion Script (to_kmodel.py) Parameter Description:

| Parameter Name | Description | Notes | Type |
| --- | --- | --- | --- |
| target | Target Platform | Options are k230/CPU, corresponding to k230 chip | str |
| model | Model Path | Path of the ONNX model to be converted | str |
| dataset | Calibration Image Set | Image data used during model conversion, used in the quantization phase | str |
| input_width | Input Width | Width of the model input | int |
| input_height | Input Height | Height of the model input | int |
| ptq_option | Quantization Method | Quantization strategies are Kld and NoClip, combined with the quantization precision of data and weights: 0 is NoClip+[uint8,uint8], 1 is NoClip+[uint8,int16], 2 is NoClip+[int16,uint8], 3 is Kld+[uint8,uint8], 4 is Kld+[uint8,int16], 5 is Kld+[int16,uint8] | 0/1/2/3/4/5 |

Deploying the Model on k230 Using MicroPython#

Flash Image and Install CanMV IDE#

💡 Firmware Introduction: Please download the latest Daily Build firmware according to your development board type to ensure that the latest features are supported! Or use the latest code to compile the firmware by yourself. For the tutorial, see: Firmware Compilation.

Download and install CanMV IDE (download link: CanMV IDE download), then write and run code in the IDE.

Model File Copy#

Connect the IDE, and copy the converted model and test images to the path CanMV/data. This path can be customized, just modify the corresponding path when writing code.

YOLO11 Module#

The YOLO11 class integrates five YOLO11 tasks, including classification (classify), detection (detect), segmentation (segment), rotated object detection (obb), and keypoint detection (pose); it supports two inference modes, including image and video stream (video); this class encapsulates the YOLO11 kmodel inference process.

  • Import Method

from libs.YOLO import YOLO11
  • Parameter Description

| Parameter Name | Description | Notes | Type |
| --- | --- | --- | --- |
| task_type | Task Type | Supports five types of tasks. Options are 'classify'/'detect'/'segment'/'obb'/'pose' | str |
| mode | Inference Mode | Supports two inference modes. Options are 'image'/'video'. 'image' runs inference on a single image; 'video' runs inference on the real-time video stream captured by the camera | str |
| kmodel_path | kmodel Path | Path of the kmodel copied to the development board | str |
| labels | Class Label List | Label names for the different classes | list[str] |
| rgb888p_size | Inference Frame Resolution | Resolution of the current inference frame, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | Input resolution used when training the YOLO11 model, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video'. Supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | Class confidence threshold for classification tasks; object confidence threshold for detection and segmentation tasks, e.g. 0.5 | float [0~1] |
| nms_thresh | NMS Threshold | Non-maximum suppression threshold, required for detection and segmentation tasks | float [0~1] |
| mask_thresh | Mask Threshold | Binarization threshold for segmenting objects within detection boxes in segmentation tasks | float [0~1] |
| kp_num | Number of Keypoints | Number of keypoints in the keypoint detection task | int |
| kp_dim | Keypoint Dimension | Dimension of each keypoint in the keypoint detection task. Only 2 and 3 are supported, determined by the trained model | int [2/3] |
| max_boxes_num | Maximum Number of Detection Boxes | Maximum number of detection boxes allowed to be returned in one frame | int |
| debug_mode | Debug Mode | Whether timing is enabled. Options are 0/1: 0 disables timing, 1 enables it | int [0/1] |
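
The constraints in the parameter table can be checked before constructing the class. The sketch below is plain Python and only encodes the ranges listed above; the function name check_yolo_args and its default values are hypothetical, and the YOLO11 class itself may or may not perform similar checks.

```python
VALID_TASKS = ("classify", "detect", "segment", "obb", "pose")
VALID_MODES = ("image", "video")

def check_yolo_args(task_type, mode, conf_thresh=0.5, nms_thresh=0.45,
                    mask_thresh=0.5, kp_dim=2, display_size=None):
    """Return a list of problems found in the arguments (empty if none)."""
    problems = []
    if task_type not in VALID_TASKS:
        problems.append("task_type must be one of %s" % (VALID_TASKS,))
    if mode not in VALID_MODES:
        problems.append("mode must be 'image' or 'video'")
    # All three thresholds are documented as floats within 0~1.
    for name, value in (("conf_thresh", conf_thresh),
                        ("nms_thresh", nms_thresh),
                        ("mask_thresh", mask_thresh)):
        if not 0.0 <= value <= 1.0:
            problems.append("%s must be within 0~1" % name)
    # kp_dim only matters for the keypoint detection task.
    if task_type == "pose" and kp_dim not in (2, 3):
        problems.append("kp_dim must be 2 or 3")
    # display_size is documented as being set in video mode.
    if mode == "video" and display_size is None:
        problems.append("display_size should be set in video mode")
    return problems

print(check_yolo_args("detect", "image"))
```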

Deploy the Model to Implement Image Inference#

For image inference, refer to the following code and modify the variables defined under __main__ to match your setup:

from libs.YOLO import YOLO11
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify to your own test image, model path, label names, and model input size
    img_path="/data/test.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]

    confidence_threshold = 0.5
    nms_threshold=0.45
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize the YOLO11 instance
    yolo=YOLO11(task_type="detect",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
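
The example above builds rgb888p_size from img.shape[2] and img.shape[1]. That indexing is consistent with the image array being laid out as (channels, height, width), although this page does not state the layout returned by read_image, so treat it as an assumption. A minimal sketch with a plain tuple standing in for the array shape (the function name frame_size_from_chw is hypothetical):

```python
def frame_size_from_chw(shape):
    """Convert a (channels, height, width) shape into [width, height]."""
    channels, height, width = shape
    return [width, height]

# A 1280x720 RGB image stored as channels-first has shape (3, 720, 1280).
print(frame_size_from_chw((3, 720, 1280)))  # [1280, 720]
```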

Deploy the Model to Implement Video Inference#

For video inference, refer to the following code and modify the variables defined under __main__ to match your setup:

from libs.PipeLine import PipeLine
from libs.YOLO import YOLO11
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify to your own model path, label names, and model input size
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]

    # Add display mode, default is hdmi, options are hdmi/lcd/lt9611/st7701/hx8399. hdmi defaults to lt9611, resolution 1920*1080; lcd defaults to st7701, resolution 800*480
    display_mode="lcd"
    rgb888p_size=[640,360]
    confidence_threshold = 0.5
    nms_threshold=0.45
    # Initialize PipeLine
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize the YOLO11 instance
    yolo=YOLO11(task_type="detect",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
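    try:
        while True:
            with ScopedTiming("total",1):
                # Inference frame by frame
                img=pl.get_frame()
                res=yolo.run(img)
                yolo.draw_result(res,pl.osd_img)
                pl.show_image()
                gc.collect()
    finally:
        # Release resources when the loop is interrupted (for example, when the script is stopped)
        yolo.deinit()
        pl.destroy()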
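
To illustrate what conf_thresh, nms_thresh, and max_boxes_num control, here is a generic non-maximum suppression sketch in plain Python. It is not the implementation used inside the YOLO11 class (which is not shown on this page); the box format (x1, y1, x2, y2) and the function names are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, conf_thresh=0.5, nms_thresh=0.45, max_boxes_num=50):
    """Return indices of the kept boxes, highest score first."""
    # Drop low-confidence boxes, then visit the rest from best to worst.
    order = sorted((i for i, s in enumerate(scores) if s >= conf_thresh),
                   key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(boxes[i], boxes[j]) <= nms_thresh for j in keep):
            keep.append(i)
        if len(keep) == max_boxes_num:
            break
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
print(nms(boxes, [0.9, 0.8, 0.7]))  # [0, 2]
```

Raising nms_thresh keeps more overlapping boxes; raising conf_thresh discards more low-scoring ones.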

YOLO11 Fruit Segmentation#

YOLO11 Source Code and Training Environment Setup#

For YOLO11 training environment setup, please refer to ultralytics/ultralytics: Ultralytics YOLO 🚀 (github.com)

# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics

If you have already set up the environment, please ignore this step.

Training Data Preparation#

First create a new folder named yolo11, then download the provided sample dataset. It contains classification, detection, and segmentation datasets for a scenario with three types of fruit (apple, banana, orange). Unzip the dataset into the yolo11 directory and use fruits_seg as the dataset for the fruit segmentation task. The sample dataset also includes a rotated object detection dataset, yolo_pen_obb, for a desktop pen scenario, and a license plate keypoint detection dataset, car_plate.

If you want to use your own dataset for training, you can download X-AnyLabeling to complete the annotation. Then convert the annotated data into the training data format officially supported by yolo11 for subsequent training.

cd yolo11
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip

If you have already downloaded the data, please ignore this step.

Using YOLO11 to Train the Fruit Segmentation Model#

Execute the following command in the yolo11 directory to train the three-class fruit segmentation model with yolo11:

yolo segment train data=datasets/fruits_seg.yaml model=yolo11n-seg.pt epochs=100 imgsz=320

Converting Fruit Segmentation kmodel#

Model conversion requires installing the following libraries in the training environment:

# linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0

# windows platform: please install dotnet-7 yourself and add environment variables. pip can be used to install nncase online, but the nncase-kpu library needs to be installed offline. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and use pip to install in the download directory of nncase_kpu-2.*-py2.py3-none-win_amd64.whl
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl

# In addition to nncase and nncase-kpu, other libraries used by the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36

Download the model conversion script package test_yolo11.zip and unzip it into the yolo11 directory:

wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolo11.zip
unzip test_yolo11.zip

Use the following commands to first export the pt model under runs/segment/train/weights to an onnx model, and then convert it to a kmodel:

# Export onnx, please choose the pt model path yourself
yolo export model=runs/segment/train/weights/best.pt format=onnx imgsz=320
cd test_yolo11/segment
# Convert kmodel, please choose the onnx model path yourself, the generated kmodel is in the same level directory as the onnx model
python to_kmodel.py --target k230 --model ../../runs/segment/train/weights/best.onnx --dataset ../calibration_data --input_width 320 --input_height 320 --ptq_option 0
cd ../../
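
If you convert several models, the conversion command can be assembled programmatically. The sketch below only builds the argument list from the flags shown above; the helper name build_convert_cmd is hypothetical, and nothing is executed unless you pass the list to subprocess.run yourself.

```python
def build_convert_cmd(onnx_path, dataset_dir, width=320, height=320,
                      ptq_option=0, target="k230"):
    """Return the to_kmodel.py command as an argument list for subprocess."""
    if ptq_option not in range(6):
        raise ValueError("ptq_option must be 0..5")
    return ["python", "to_kmodel.py",
            "--target", target,
            "--model", onnx_path,
            "--dataset", dataset_dir,
            "--input_width", str(width),
            "--input_height", str(height),
            "--ptq_option", str(ptq_option)]

cmd = build_convert_cmd("../../runs/segment/train/weights/best.onnx",
                        "../calibration_data")
print(" ".join(cmd))
# To run it from test_yolo11/segment: subprocess.run(cmd, check=True)
```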

💡 Model conversion script (to_kmodel.py) parameter description:

| Parameter Name | Description | Notes | Type |
| --- | --- | --- | --- |
| target | Target Platform | Options are k230/CPU, corresponding to the k230 chip | str |
| model | Model Path | Path to the ONNX model to be converted | str |
| dataset | Calibration Image Set | Image data used during model conversion, in the quantization stage | str |
| input_width | Input Width | Width of the model input | int |
| input_height | Input Height | Height of the model input | int |
| ptq_option | Quantization Method | The quantization strategies are Kld and NoClip, combined with the quantization precision of data and weights. 0 is NoClip+[uint8,uint8], 1 is NoClip+[uint8,int16], 2 is NoClip+[int16,uint8], 3 is Kld+[uint8,uint8], 4 is Kld+[uint8,int16], 5 is Kld+[int16,uint8] | 0/1/2/3/4/5 |
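
The six ptq_option values can be kept in a small lookup table when scripting conversions. This sketch simply restates the mapping listed above; the names PTQ_OPTIONS and describe_ptq_option are hypothetical, and the two types are stored in the order in which they are written in the description.

```python
# ptq_option -> (calibration method, first listed type, second listed type)
PTQ_OPTIONS = {
    0: ("NoClip", "uint8", "uint8"),
    1: ("NoClip", "uint8", "int16"),
    2: ("NoClip", "int16", "uint8"),
    3: ("Kld", "uint8", "uint8"),
    4: ("Kld", "uint8", "int16"),
    5: ("Kld", "int16", "uint8"),
}

def describe_ptq_option(option):
    """Return the strategy string for a ptq_option value, e.g. 'Kld+[uint8,int16]'."""
    method, first, second = PTQ_OPTIONS[option]
    return "%s+[%s,%s]" % (method, first, second)

print(describe_ptq_option(0))  # NoClip+[uint8,uint8]
```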

Deploying Model on k230 Using MicroPython#

Flashing Image and Installing CanMV IDE#

💡 Firmware Introduction: Please download the latest Daily Build firmware according to your development board type to ensure that the latest features are supported! Or use the latest code to compile the firmware yourself. For the tutorial, see: Firmware Compilation.

Download and install CanMV IDE (download link: CanMV IDE download), write and run code in the IDE.

Copying Model Files#

Connect the IDE, and copy the converted model and test images to the path CanMV/data. This path can be customized; you only need to modify the corresponding path when writing code.

YOLO11 Module#

The YOLO11 class integrates five tasks of YOLO11, including classification (classify), detection (detect), segmentation (segment), rotated object detection (obb), and keypoint detection (pose); it supports two inference modes, including image and video stream; this class encapsulates the kmodel inference process of YOLO11.

  • Import Method

from libs.YOLO import YOLO11
  • Parameter Description

| Parameter Name | Description | Notes | Type |
| --- | --- | --- | --- |
| task_type | Task Type | Supports five types of tasks. Options are 'classify'/'detect'/'segment'/'obb'/'pose' | str |
| mode | Inference Mode | Supports two inference modes. Options are 'image'/'video'. 'image' runs inference on a single image; 'video' runs inference on the real-time video stream captured by the camera | str |
| kmodel_path | kmodel Path | Path to the kmodel copied to the development board | str |
| labels | Class Label List | Label names for the different categories | list[str] |
| rgb888p_size | Inference Frame Resolution | Resolution of the current inference frame, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | Input resolution used when training the YOLO11 model, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video'. Supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | Category confidence threshold for classification tasks; object confidence threshold for detection and segmentation tasks, such as 0.5 | float [0~1] |
| nms_thresh | NMS Threshold | Non-maximum suppression threshold, required for detection and segmentation tasks | float [0~1] |
| mask_thresh | Mask Threshold | Binarization threshold for segmenting the object inside each detection box in the segmentation task | float [0~1] |
| kp_num | Number of Keypoints | Number of keypoints in the keypoint detection task | int |
| kp_dim | Keypoint Dimension | Dimension of each keypoint in the keypoint detection task. Only 2 and 3 are supported, determined by the trained model | int [2/3] |
| max_boxes_num | Maximum Number of Detection Boxes | Maximum number of detection boxes allowed to be returned in one frame | int |
| debug_mode | Debug Mode | Whether timing is enabled. Options are 0/1: 0 disables timing, 1 enables it | int [0/1] |

Deploying Model for Image Inference#

For image inference, refer to the following code and modify the variables defined under __main__ to match your setup:

from libs.YOLO import YOLO11
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example, please modify it to your own test image, model path, label name, model input size for custom scenarios
    img_path="/data/test.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]

    confidence_threshold = 0.5
    nms_threshold=0.45
    mask_threshold=0.5
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize YOLO11 instance
    yolo=YOLO11(task_type="segment",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,mask_thresh=mask_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
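
The mask_threshold value above is described as a binarization threshold. As a rough illustration of what binarizing means here, the sketch below converts a grid of scores in 0~1 into a 0/1 mask. The function name binarize_mask is hypothetical, and whether the library compares with > or >= is not stated on this page.

```python
def binarize_mask(mask, mask_thresh=0.5):
    """Turn a 2-D grid of scores in 0~1 into a 0/1 mask."""
    return [[1 if value > mask_thresh else 0 for value in row] for row in mask]

scores = [[0.1, 0.6],
          [0.8, 0.4]]
print(binarize_mask(scores))  # [[0, 1], [1, 0]]
```

A lower threshold produces larger masks; a higher one keeps only the pixels the model is most sure about.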

Deploying Model for Video Inference#

For video inference, refer to the following code and modify the variables defined under __main__ to match your setup:

from libs.PipeLine import PipeLine
from libs.YOLO import YOLO11
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example, please modify it to your own model path, label name, model input size for custom scenarios
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]

    # Add display mode, default is hdmi, optional hdmi/lcd/lt9611/st7701/hx8399, where hdmi defaults to lt9611, resolution 1920*1080; lcd defaults to st7701, resolution 800*480
    display_mode="lcd"
    rgb888p_size=[320,320]
    confidence_threshold = 0.5
    nms_threshold=0.45
    mask_threshold=0.5
    # Initialize PipeLine
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize YOLO11 instance
    yolo=YOLO11(task_type="segment",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,mask_thresh=mask_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
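    try:
        while True:
            with ScopedTiming("total",1):
                # Frame-by-frame inference
                img=pl.get_frame()
                res=yolo.run(img)
                yolo.draw_result(res,pl.osd_img)
                pl.show_image()
                gc.collect()
    finally:
        # Release resources when the loop is interrupted (for example, when the script is stopped)
        yolo.deinit()
        pl.destroy()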
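
The comment on display_mode in the code above lists which resolution each mode defaults to. A small lookup, restating only the values given there, might look like the sketch below. The names DISPLAY_SIZES and expected_display_size are hypothetical; in the real script the size comes from pl.get_display_size(). No resolution is given for hx8399 on this page, so it is deliberately left out.

```python
# display_mode -> [width, height], as listed in the example's comment
DISPLAY_SIZES = {
    "hdmi": [1920, 1080],    # hdmi defaults to lt9611
    "lt9611": [1920, 1080],
    "lcd": [800, 480],       # lcd defaults to st7701
    "st7701": [800, 480],
}

def expected_display_size(display_mode):
    """Return [width, height] for a display mode, or None if it is not listed here."""
    return DISPLAY_SIZES.get(display_mode)

print(expected_display_size("lcd"))  # [800, 480]
```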

YOLO11 Rotated Object Detection#

YOLO11 Source Code and Training Environment Setup#

For setting up the YOLO11 training environment, please refer to ultralytics/ultralytics: Ultralytics YOLO 🚀 (github.com)

# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics

If you have already set up the environment, please ignore this step.

Training Data Preparation#

First create a new folder named yolo11, then download the provided sample dataset, which contains a rotated object detection dataset for a single-class (pen) scenario. Extract the dataset into the yolo11 directory and use yolo_pen_obb as the dataset for the rotated object detection task.

If you want to use your own dataset for training, you can download X-AnyLabeling to complete annotation. Convert the annotated data into the training data format officially supported by yolo11 for subsequent training.

cd yolo11
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip

If you have already downloaded the data, please ignore this step.

Using YOLO11 to Train the Rotated Object Detection Model#

Execute the following command in the yolo11 directory to train a single-class rotated object detection model with yolo11:

yolo obb train data=datasets/pen_obb.yaml model=yolo11n-obb.pt epochs=100 imgsz=320

Converting Rotated Object Detection kmodel#

Model conversion requires installing the following libraries in the training environment:

# linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires installing dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0

# windows platform: please install dotnet-7 by yourself and add environment variables. Online pip installation of nncase is supported, but the nncase-kpu library needs to be installed offline. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment and use pip to install in the download directory of nncase_kpu-2.*-py2.py3-none-win_amd64.whl
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl

# Besides nncase and nncase-kpu, the script also uses the following libraries:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36

Download the model conversion script package test_yolo11.zip and extract it into the yolo11 directory:

wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolo11.zip
unzip test_yolo11.zip

Use the following commands to first export the pt model under runs/obb/train/weights to an onnx model, and then convert it to a kmodel:

# Export onnx, please select the pt model path by yourself
yolo export model=runs/obb/train/weights/best.pt format=onnx imgsz=320
cd test_yolo11/obb
# Convert kmodel, please select the onnx model path by yourself, the generated kmodel is in the same level directory as the onnx model
python to_kmodel.py --target k230 --model ../../runs/obb/train/weights/best.onnx --dataset ../calibration_obb --input_width 320 --input_height 320 --ptq_option 0
cd ../../

💡 Model Conversion Script (to_kmodel.py) Parameter Description:

| Parameter Name | Description | Notes | Type |
| --- | --- | --- | --- |
| target | Target platform | Options are k230/CPU, corresponding to the k230 chip | str |
| model | Model path | Path of the ONNX model to be converted | str |
| dataset | Calibration image set | Image data used during model conversion, in the quantization phase | str |
| input_width | Input width | Width of the model input | int |
| input_height | Input height | Height of the model input | int |
| ptq_option | Quantization method | The quantization strategies are Kld and NoClip, combined with the quantization precision of data and weights. 0 is NoClip+[uint8,uint8], 1 is NoClip+[uint8,int16], 2 is NoClip+[int16,uint8], 3 is Kld+[uint8,uint8], 4 is Kld+[uint8,int16], 5 is Kld+[int16,uint8] | 0/1/2/3/4/5 |

Deploying Models on k230 Using MicroPython#

Flashing Image and Installing CanMV IDE#

💡 Firmware Introduction: Please download the latest Daily Build firmware according to your development board type to ensure that the latest features are supported! Alternatively, use the latest code to compile the firmware yourself. For the tutorial, see: Firmware Compilation.

Download and install CanMV IDE (download link: CanMV IDE download), write code and run it in the IDE.

Model File Copying#

Connect the IDE, and copy the converted model and test images to the path CanMV/data. This path can be customized, just modify the corresponding path when writing the code.

YOLO11 Module#

The YOLO11 class integrates five YOLO11 tasks, including classification (classify), detection (detect), segmentation (segment), rotated object detection (obb), and keypoint detection (pose); it supports two inference modes, image (image) and video stream (video); this class encapsulates the kmodel inference process of YOLO11.

  • Import Method

from libs.YOLO import YOLO11
  • Parameter Description

| Parameter Name | Description | Notes | Type |
| --- | --- | --- | --- |
| task_type | Task type | Supports five types of tasks. Options are 'classify'/'detect'/'segment'/'obb'/'pose' | str |
| mode | Inference mode | Supports two inference modes. Options are 'image'/'video'. 'image' runs inference on a single image; 'video' runs inference on the real-time video stream captured by the camera | str |
| kmodel_path | kmodel path | Path of the kmodel copied to the development board | str |
| labels | Class label list | Label names for the different classes | list[str] |
| rgb888p_size | Inference frame resolution | Resolution of the current inference frame, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model input resolution | Input resolution used when training the YOLO11 model, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display resolution | Set when the inference mode is 'video'. Supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence threshold | Class confidence threshold for classification tasks; object confidence threshold for detection and segmentation tasks, such as 0.5 | float [0~1] |
| nms_thresh | NMS threshold | Non-maximum suppression threshold, required for detection and segmentation tasks | float [0~1] |
| mask_thresh | Mask threshold | Binarization threshold for segmenting objects inside each detection box in the segmentation task | float [0~1] |
| kp_num | Number of keypoints | Number of keypoints in the keypoint detection task | int |
| kp_dim | Keypoint dimension | Dimension of each keypoint in the keypoint detection task. Only 2 and 3 are supported, determined by the trained model | int [2/3] |
| max_boxes_num | Maximum number of detection boxes | Maximum number of detection boxes allowed to be returned in one frame | int |
| debug_mode | Debug mode | Whether timing is enabled. Options are 0/1: 0 disables timing, 1 enables it | int [0/1] |

Deploying Model for Image Inference#

For image inference, refer to the following code and modify the variables defined under __main__ to match your setup:

from libs.YOLO import YOLO11
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example, please modify to your own test image, model path, label name, model input size for custom scenarios
    img_path="/data/test_obb.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ['pen']
    model_input_size=[320,320]

    confidence_threshold = 0.1
    nms_threshold=0.6
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize YOLO11 instance
    yolo=YOLO11(task_type="obb",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,max_boxes_num=100,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
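
A rotated box is commonly described by a center, a size, and a rotation angle. The result format returned by the YOLO11 class for the obb task is not documented on this page, so the sketch below assumes a hypothetical (cx, cy, w, h, angle) representation with the angle in radians, and shows how the four corner points could be derived from it.

```python
import math

def obb_corners(cx, cy, w, h, angle):
    """Return the four corners of a box rotated by `angle` radians about its center."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    corners = []
    # Offsets of the corners from the center before rotation.
    for dx, dy in ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)):
        # Standard 2-D rotation of the offset, then translation to the center.
        corners.append((cx + dx * cos_a - dy * sin_a,
                        cy + dx * sin_a + dy * cos_a))
    return corners

print(obb_corners(10, 10, 4, 2, 0.0))
```

With an angle of 0 the result is the ordinary axis-aligned box; other angles rotate the same rectangle about its center.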

Deploying Model for Video Inference#

For video inference, refer to the following code and modify the variables defined under __main__ to match your setup:

from libs.PipeLine import PipeLine
from libs.Utils import *
from libs.YOLO import YOLO11
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example, please modify to your own model path, label name, model input size for custom scenarios
    kmodel_path="/data/best.kmodel"
    labels = ['pen']
    model_input_size=[320,320]

    # Add display mode, default is hdmi, optional hdmi/lcd/lt9611/st7701/hx8399, where hdmi is set to lt9611 by default, resolution 1920*1080; lcd is set to st7701 by default, resolution 800*480
    display_mode="lcd"
    rgb888p_size=[640,360]
    confidence_threshold = 0.1
    nms_threshold=0.6
    # Initialize PipeLine
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize YOLO11 instance
    yolo=YOLO11(task_type="obb",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
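    try:
        while True:
            with ScopedTiming("total",1):
                # Frame-by-frame inference
                img=pl.get_frame()
                res=yolo.run(img)
                yolo.draw_result(res,pl.osd_img)
                pl.show_image()
                gc.collect()
    finally:
        # Release resources when the loop is interrupted (for example, when the script is stopped)
        yolo.deinit()
        pl.destroy()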
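
In the example above the inference frame is 640x360 while the model input is 320x320, so the frame has to be fitted to the model input and results mapped back. How YOLO11.config_preprocess does this is not described on this page. One common approach is letterboxing (aspect-preserving resize plus centered padding), sketched below with hypothetical helper names purely as an illustration of the arithmetic.

```python
def letterbox_params(src_size, dst_size):
    """Return (scale, pad_x, pad_y) for an aspect-preserving fit with centered padding."""
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w, new_h = int(src_w * scale), int(src_h * scale)
    return scale, (dst_w - new_w) // 2, (dst_h - new_h) // 2

def to_frame_coords(x, y, scale, pad_x, pad_y):
    """Map a point from model-input coordinates back to frame coordinates."""
    return (x - pad_x) / scale, (y - pad_y) / scale

scale, pad_x, pad_y = letterbox_params([640, 360], [320, 320])
print(scale, pad_x, pad_y)  # 0.5 0 70
print(to_frame_coords(160, 160, scale, pad_x, pad_y))  # (320.0, 180.0)
```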

YOLO11 License Plate Corner Detection#

YOLO11 Source Code and Training Environment Setup#

For YOLO11 training environment setup, please refer to ultralytics/ultralytics: Ultralytics YOLO 🚀 (github.com)

# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics

If you have already set up the environment, please ignore this step.

Training Data Preparation#

First create a new folder named yolo11, then download the provided sample dataset, which includes a license plate dataset annotated with four corner keypoints. Extract the dataset to the yolo11 directory and use car_plate as the dataset for the keypoint detection task.

If you want to use your own dataset for training, you can download X-AnyLabeling to complete the annotation. Convert the annotated data into the training data format officially supported by yolo11 for subsequent training.

cd yolo11
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip

If you have already downloaded the data, please ignore this step.

Using YOLO11 to Train a Keypoint Detection Model#

Execute the following command in the yolo11 directory to train a single-class keypoint detection model with yolo11:

yolo pose train data=datasets/car_plate.yaml model=yolo11n-pose.pt epochs=100 imgsz=320

Converting Keypoint Detection kmodel#

The model conversion requires the following libraries to be installed in the training environment:

# linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires installing dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0

# windows platform: please install dotnet-7 by yourself and add environment variables. nncase can be installed online via pip, but the nncase-kpu library needs to be installed offline. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and use pip to install in the directory where nncase_kpu-2.*-py2.py3-none-win_amd64.whl is downloaded
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl

# In addition to nncase and nncase-kpu, other libraries used by the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36

Download the model conversion script package test_yolo11.zip and extract it to the yolo11 directory:

wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolo11.zip
unzip test_yolo11.zip

Use the following commands to first export the pt model under runs/pose/train/weights to an onnx model, and then convert it to a kmodel:

# Export onnx, please choose the pt model path by yourself
yolo export model=runs/pose/train/weights/best.pt format=onnx imgsz=320
cd test_yolo11/pose
# Convert kmodel, please choose the onnx model path by yourself. The generated kmodel is in the same directory level as the onnx model
python to_kmodel.py --target k230 --model ../../runs/pose/train/weights/best.onnx --dataset ../calibration_pose --input_width 320 --input_height 320 --ptq_option 0
cd ../../

💡 Model conversion script (to_kmodel.py) parameter description:

| Parameter Name | Description | Notes | Type |
| --- | --- | --- | --- |
| target | Target Platform | Options are k230/CPU, corresponding to the k230 chip | str |
| model | Model Path | Path of the ONNX model to be converted | str |
| dataset | Calibration Image Set | Image data used during model conversion, in the quantization stage | str |
| input_width | Input Width | Width of the model input | int |
| input_height | Input Height | Height of the model input | int |
| ptq_option | Quantization Method | The quantization strategies are Kld and NoClip, combined with the quantization precision of data and weights. 0 is NoClip+[uint8,uint8], 1 is NoClip+[uint8,int16], 2 is NoClip+[int16,uint8], 3 is Kld+[uint8,uint8], 4 is Kld+[uint8,int16], 5 is Kld+[int16,uint8] | 0/1/2/3/4/5 |

Deploying the Model on k230 Using MicroPython#

Flashing Image and Installing CanMV IDE#

💡 Firmware Introduction: Please download the latest Daily Build firmware according to your development board type to ensure that the latest features are supported! Alternatively, use the latest code to compile the firmware yourself. For the tutorial, see: Firmware Compilation.

Download and install CanMV IDE (Download link: CanMV IDE download), write and run code in the IDE.

Model File Copy#

Connect the IDE, and copy the converted model and test images to the path CanMV/data. This path can be customized, you only need to modify the corresponding path when writing code.

YOLO11 Module#

The YOLO11 class integrates the five tasks of YOLO11, including classify, detect, segment, obb, and pose; it supports two inference modes, including image and video; this class encapsulates the YOLO11 kmodel inference process.

  • Import Method

from libs.YOLO import YOLO11
  • Parameter Description

| Parameter Name | Description | Notes | Type |
| --- | --- | --- | --- |
| task_type | Task Type | Supports five types of tasks. Options are 'classify'/'detect'/'segment'/'obb'/'pose' | str |
| mode | Inference Mode | Supports two inference modes. Options are 'image'/'video'. 'image' runs inference on a single image; 'video' runs inference on the real-time video stream captured by the camera | str |
| kmodel_path | kmodel Path | Path of the kmodel copied to the development board | str |
| labels | Category Label List | Label names of the different categories | list[str] |
| rgb888p_size | Inference Frame Resolution | Resolution of the current inference frame, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | Input resolution used when training the YOLO11 model, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video'. Supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | Category confidence threshold for classification tasks; object confidence threshold for detection and segmentation tasks, such as 0.5 | float [0~1] |
| nms_thresh | NMS Threshold | Non-maximum suppression threshold, required for detection and segmentation tasks | float [0~1] |
| mask_thresh | Mask Threshold | Binarization threshold for segmenting objects inside each detection box in the segmentation task | float [0~1] |
| kp_num | Number of Keypoints | Number of keypoints in the keypoint detection task | int |
| kp_dim | Keypoint Dimension | Dimension of each keypoint in the keypoint detection task. Only 2 and 3 are supported, determined by the trained model | int [2/3] |
| max_boxes_num | Maximum Number of Detection Boxes | Maximum number of detection boxes allowed to be returned in one frame | int |
| debug_mode | Debug Mode | Whether timing is enabled. Options are 0/1: 0 disables timing, 1 enables it | int [0/1] |

Deploying the Model to Implement Image Inference#

For image inference, refer to the following code and modify the variables defined under __main__ to match your setup:

from libs.YOLO import YOLO11
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify to your own test image, model path, label name, model input size, number of keypoints, and keypoint dimension
    img_path="/data/test.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ['plate']
    model_input_size=[320,320]
    kp_num=4
    kp_dim=2

    confidence_threshold = 0.5
    nms_threshold=0.45
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize YOLO11 model
    yolo=YOLO11(task_type="pose",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,kp_num=kp_num,kp_dim=kp_dim,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,max_boxes_num=100,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
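
The kp_num and kp_dim values above describe how many keypoints each detection carries and how many values each keypoint has. As an illustration, the sketch below splits a flat list of kp_num*kp_dim numbers into per-point tuples. The flat layout and the function name group_keypoints are assumptions; the actual result structure returned by the YOLO11 class is not documented on this page.

```python
def group_keypoints(flat, kp_num=4, kp_dim=2):
    """Split a flat list of kp_num*kp_dim values into kp_num tuples."""
    if kp_dim not in (2, 3):
        raise ValueError("kp_dim must be 2 or 3")
    if len(flat) != kp_num * kp_dim:
        raise ValueError("expected %d values" % (kp_num * kp_dim))
    return [tuple(flat[i * kp_dim:(i + 1) * kp_dim]) for i in range(kp_num)]

# Four plate corners with (x, y) each, matching kp_num=4 and kp_dim=2.
print(group_keypoints([10, 20, 110, 22, 108, 60, 12, 58]))
# [(10, 20), (110, 22), (108, 60), (12, 58)]
```

With kp_dim=3, each tuple carries a third value per point; in Ultralytics pose datasets this is typically a visibility flag.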

Deploying the Model to Implement Video Inference#

For video inference, refer to the following code and modify the variables defined under __main__ to match your setup:

from libs.PipeLine import PipeLine
from libs.YOLO import YOLO11
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify to your own model path, label name, model input size, number of keypoints, and keypoint dimension
    kmodel_path="/data/best.kmodel"
    labels = ["plate"]
    model_input_size=[320,320]
    kp_num=4
    kp_dim=2

    # Add display mode, default is hdmi, optional hdmi/lcd/lt9611/st7701/hx8399. Among them, hdmi defaults to lt9611 with a resolution of 1920*1080; lcd defaults to st7701 with a resolution of 800*480
    display_mode="lcd"
    rgb888p_size=[320,320]
    confidence_threshold = 0.5
    nms_threshold=0.45
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize YOLO11 model
    yolo=YOLO11(task_type="pose",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,kp_num=kp_num,kp_dim=kp_dim,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
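    try:
        while True:
            with ScopedTiming("total",1):
                img=pl.get_frame()
                res=yolo.run(img)
                yolo.draw_result(res,pl.osd_img)
                pl.show_image()
                gc.collect()
    finally:
        # Release resources when the loop is interrupted (for example, when the script is stopped)
        yolo.deinit()
        pl.destroy()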

YOLO26 Fruit Classification#

YOLO26 Source Code and Training Environment Setup#

YOLO26 is the latest generation of real-time object detection models released by Ultralytics, featuring a series of fundamental architectural innovations that enable end-to-end NMS-free inference, aiming to provide more powerful and easily deployable solutions for edge computing and low-power devices.

For the YOLO26 training environment setup, please refer to ultralytics/ultralytics: Ultralytics YOLO 🚀 (github.com)

# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics

If you have already set up the environment, please ignore this step.

Training Data Preparation#

You can first create a new folder yolo26. Please download the provided sample dataset, which contains classification, detection, and segmentation datasets for three types of fruit (apple, banana, orange) scenarios. Extract the dataset to the yolo26 directory, and please use fruits_cls as the dataset for the fruit classification task. The sample dataset also includes a rotated object detection desktop pen scenario dataset yolo_pen_obb and a license plate keypoint detection dataset car_plate.

If you want to use your own dataset for training, you can download X-AnyLabeling to complete the annotation. Classification task data does not need to be annotated with tools; just organize the images into directories according to the required format. Convert the annotated data into the training data format officially supported by yolo26 for subsequent training.
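
For classification, Ultralytics expects a folder-per-class layout: one directory per split (such as train and val), each containing one subdirectory per class. The sketch below counts images per class so you can check a dataset before training. `summarize_cls_dataset` is a hypothetical helper, and the exact contents of fruits_cls are not assumed here.

```python
import os

IMAGE_EXTS = (".jpg", ".jpeg", ".png", ".bmp")

def summarize_cls_dataset(root):
    """Return {split: {class_name: image_count}} for a folder-per-class dataset."""
    summary = {}
    for split in sorted(os.listdir(root)):
        split_dir = os.path.join(root, split)
        if not os.path.isdir(split_dir):
            continue
        counts = {}
        for cls in sorted(os.listdir(split_dir)):
            cls_dir = os.path.join(split_dir, cls)
            if os.path.isdir(cls_dir):
                # Count only files whose extension looks like an image.
                counts[cls] = sum(
                    1 for f in os.listdir(cls_dir) if f.lower().endswith(IMAGE_EXTS)
                )
        summary[split] = counts
    return summary

if __name__ == "__main__":
    root = "datasets/fruits_cls"
    if os.path.isdir(root):
        for split, counts in summarize_cls_dataset(root).items():
            print(split, counts)
```

A class with a count of 0, or a class that appears in one split but not another, is usually a sign that the directories need to be reorganized.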

cd yolo26
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip

If you have already downloaded the data, please ignore this step.

Using YOLO26 to Train the Fruit Classification Model#

Execute the following command in the yolo26 directory to use yolo26 to train the three-class fruit classification model:

yolo classify train data=datasets/fruits_cls model=yolo26n-cls.pt epochs=100 imgsz=224

Converting Fruit Classification to kmodel#

The model conversion requires the following libraries to be installed in the training environment:

# Linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0

# Windows platform: Please install dotnet-7 by yourself and add environment variables. Online installation of nncase using pip is supported, but the nncase-kpu library needs to be installed offline. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and use pip to install in the download directory of nncase_kpu-2.*-py2.py3-none-win_amd64.whl
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl

# In addition to nncase and nncase-kpu, other libraries used in the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36

Download the model conversion script package test_yolo26.zip and extract it in the yolo26 directory:

wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolo26.zip
unzip test_yolo26.zip

Use the following commands to first export the pt model under runs/classify/train/weights to an ONNX model, and then convert it to a kmodel:

# Export onnx, please select the pt model path yourself
yolo export model=runs/classify/train/weights/best.pt format=onnx imgsz=224
cd test_yolo26/classify
# Convert kmodel, please select the onnx model path yourself, the generated kmodel is in the same level directory as the onnx model
python to_kmodel.py --target k230 --model ../../runs/classify/train/weights/best.onnx --dataset ../calibration_data --input_width 224 --input_height 224 --ptq_option 0
cd ../../
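
The export size (imgsz), the --input_width/--input_height flags, and the model_input_size used later on the board all need to agree. As a minimal sketch, the conversion command can be assembled from a single size value so the numbers cannot drift apart. `build_to_kmodel_cmd` is a hypothetical helper; it uses only the flags shown in the command above.

```python
def build_to_kmodel_cmd(onnx_path, dataset_dir, input_size, ptq_option=0, target="k230"):
    """Assemble the to_kmodel.py argument list from one [width, height] pair."""
    width, height = input_size
    if ptq_option not in range(6):
        raise ValueError("ptq_option must be in 0..5")
    return [
        "python", "to_kmodel.py",
        "--target", target,
        "--model", onnx_path,
        "--dataset", dataset_dir,
        "--input_width", str(width),
        "--input_height", str(height),
        "--ptq_option", str(ptq_option),
    ]

cmd = build_to_kmodel_cmd(
    "../../runs/classify/train/weights/best.onnx", "../calibration_data", (224, 224)
)
print(" ".join(cmd))
# To run it from test_yolo26/classify: subprocess.run(cmd, check=True)
```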

💡 Description of model conversion script (to_kmodel.py) parameters:

| Parameter Name | Description | Note | Type |
| --- | --- | --- | --- |
| target | Target Platform | Optional values are k230/CPU, corresponding to k230 chip | str |
| model | Model Path | The path of the ONNX model to be converted | str |
| dataset | Calibration Image Set | The image data used during model conversion, used in the quantization phase | str |
| input_width | Input Width | The width of the model input | int |
| input_height | Input Height | The height of the model input | int |
| ptq_option | Quantization Method | The quantization strategies are Kld and NoClip, combining the quantization precision of data and weights. 0 is NoClip+[uint8,uint8], 1 is NoClip+[uint8,int16], 2 is NoClip+[int16,uint8], 3 is Kld+[uint8,uint8], 4 is Kld+[uint8,int16], 5 is Kld+[int16,uint8] | 0/1/2/3/4/5 |
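
The ptq_option values are easier to read as a lookup table. The sketch below restates the six options listed above. The names `PTQ_OPTIONS` and `describe_ptq_option` are hypothetical, and reading each pair as [data type, weights type] is an assumption based on the phrase "data and weights".

```python
# option -> (calibration method, data type, weights type); pair order is assumed.
PTQ_OPTIONS = {
    0: ("NoClip", "uint8", "uint8"),
    1: ("NoClip", "uint8", "int16"),
    2: ("NoClip", "int16", "uint8"),
    3: ("Kld", "uint8", "uint8"),
    4: ("Kld", "uint8", "int16"),
    5: ("Kld", "int16", "uint8"),
}

def describe_ptq_option(option):
    """Return a readable description of one ptq_option value."""
    method, data_type, weights_type = PTQ_OPTIONS[option]
    return "%s, data=%s, weights=%s" % (method, data_type, weights_type)

for option in sorted(PTQ_OPTIONS):
    print(option, describe_ptq_option(option))
```

Option 0 is the one used in the commands in this document. If the converted model loses too much accuracy, trying one of the int16 or Kld combinations is a reasonable next step.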

Deploying the Model on k230 Using MicroPython#

Flashing the Image and Installing CanMV IDE#

💡 Firmware Introduction: Please download the latest Daily Build firmware according to your development board type to ensure that the latest features are supported! Alternatively, compile the firmware yourself from the latest code. For the tutorial, see: Firmware Compilation.

Download and install CanMV IDE (Download link: CanMV IDE download), write and run the code in the IDE.

Model File Copy#

Connect the IDE, and copy the converted model and test images to the path CanMV/data. This path can be customized, just modify the corresponding path when writing the code.

YOLO26 Module#

The YOLO26 class integrates the five tasks of YOLO26, including classification (classify), detection (detect), segmentation (segment), rotated object detection (obb), and keypoint detection (pose); it supports two inference modes, including image and video stream (video); this class encapsulates the kmodel inference process of YOLO26.

  • Import Method

from libs.YOLO import YOLO26
  • Parameter Description

| Parameter Name | Description | Note | Type |
| --- | --- | --- | --- |
| task_type | Task Type | Supports five types of tasks; optional values are 'classify'/'detect'/'segment'/'obb'/'pose' | str |
| mode | Inference Mode | Supports two inference modes; optional values are 'image'/'video'. 'image' means inference on an image, 'video' means inference on the real-time video stream collected by the camera | str |
| kmodel_path | kmodel Path | The kmodel path copied to the development board | str |
| labels | Class Label List | The label names of different categories | list[str] |
| rgb888p_size | Inference Frame Resolution | The resolution of the current inference frame, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | The input resolution when training the YOLO26 model, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video'; supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | The class confidence threshold for classification tasks, and the object confidence threshold for detection and segmentation tasks, such as 0.5 | float【0~1】 |
| mask_thresh | Mask Threshold | The binarization threshold for segmenting the object in the detection box in the segmentation task | float【0~1】 |
| kp_num | Number of Keypoints | The number of keypoints in the keypoint detection task | int |
| kp_dim | Keypoint Dimension | The dimension of keypoints in the keypoint detection task; only 2 and 3 are supported, determined by the trained model | int【2/3】 |
| max_boxes_num | Maximum Number of Detection Boxes | The maximum number of detection boxes allowed to be returned in one frame of image | int |
| debug_mode | Debug Mode | Whether the timing function takes effect; optional values are 0/1, 0 means no timing, 1 means timing | int【0/1】 |

Deploying the Model to Implement Image Inference#

For image inference, refer to the following code and modify the parameters defined in __main__ to match your setup:

from libs.YOLO import YOLO26
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify it to your own test image, model path, label name, and model input size
    img_path="/data/test.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[224,224]

    confidence_threshold = 0.5
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize YOLO26 model
    yolo=YOLO26(task_type="classify",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()

Deploying the Model to Implement Video Inference#

For video inference, refer to the following code and modify the variables defined in __main__ to match your setup:

from libs.PipeLine import PipeLine
from libs.YOLO import YOLO26
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify it to your own model path, label name, and model input size
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[224,224]

    # Add display mode, default is hdmi, optional hdmi/lcd/lt9611/st7701/hx8399. The default hdmi is set to lt9611 with a resolution of 1920*1080; the default lcd is set to st7701 with a resolution of 800*480
    display_mode="lcd"
    rgb888p_size=[640,360]
    confidence_threshold = 0.8
    # Initialize PipeLine
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize YOLO26 model
    yolo=YOLO26(task_type="classify",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,debug_mode=0)
    yolo.config_preprocess()
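    try:
        while True:
            with ScopedTiming("total",1):
                # Inference frame by frame
                img=pl.get_frame()
                res=yolo.run(img)
                yolo.draw_result(res,pl.osd_img)
                pl.show_image()
                gc.collect()
    finally:
        # Release resources when the loop is interrupted (for example, when the script is stopped)
        yolo.deinit()
        pl.destroy()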
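
ScopedTiming is imported from libs.Utils, and its implementation is not shown in this document. To illustrate the general idea of a timing context manager with an on/off switch, here is a desktop-Python sketch. `scoped_timing` is a hypothetical stand-in, not the library code, and it relies on contextlib and time.perf_counter, which may not be available in the same form on MicroPython.

```python
import time
from contextlib import contextmanager

@contextmanager
def scoped_timing(label, enabled):
    """Print how long the enclosed block took when enabled is truthy."""
    start = time.perf_counter()
    try:
        yield
    finally:
        if enabled:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            print("%s took %.2f ms" % (label, elapsed_ms))

# The second argument plays the same role as the 0/1 switch used in the loop above.
with scoped_timing("total", 1):
    sum(range(100000))
```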

YOLO26 Fruit Detection#

YOLO26 Source Code and Training Environment Setup#

YOLO26 is the latest generation real-time object detection model released by Ultralytics. It incorporates a series of fundamental architectural innovations, achieving end-to-end NMS-free inference, and is designed to provide more powerful and easier-to-deploy solutions for edge computing and low-power devices.

For setting up the YOLO26 training environment, please refer to ultralytics/ultralytics: Ultralytics YOLO 🚀 (github.com)

# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics

If you have already set up the environment, please ignore this step.

Training Data Preparation#

You can first create a new folder yolo26. Please download the provided sample dataset. The sample dataset includes classification, detection, and segmentation datasets for three categories of fruits (apple, banana, orange) as scenarios. Extract the dataset to the yolo26 directory. Please use fruits_yolo as the dataset for the fruit detection task. The sample dataset also includes a rotated object detection dataset for the desktop pen scenario yolo_pen_obb, and a license plate keypoint detection dataset car_plate.

If you want to use your own dataset for training, you can download X-AnyLabeling to complete annotation. Manually convert the annotated data into the training data format officially supported by yolo26 for subsequent training.
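
For detection, the Ultralytics label format is one text line per object: the class index followed by the box center x, center y, width, and height, all normalized to the image size. The sketch below converts a pixel-coordinate box to such a line. `to_yolo_detect_line` is a hypothetical helper, and the class index is assumed to be the position of the class in the names list of your dataset YAML.

```python
def to_yolo_detect_line(class_id, box, img_w, img_h):
    """Convert a pixel box (xmin, ymin, xmax, ymax) to 'class cx cy w h', normalized to 0..1."""
    xmin, ymin, xmax, ymax = box
    cx = (xmin + xmax) / 2.0 / img_w
    cy = (ymin + ymax) / 2.0 / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return "%d %.6f %.6f %.6f %.6f" % (class_id, cx, cy, w, h)

# A 160x120 box centered in a 320x240 image, labeled with class index 1.
print(to_yolo_detect_line(1, (80, 60, 240, 180), 320, 240))
```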

cd yolo26
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip

If you have already downloaded the data, please ignore this step.

Using YOLO26 to Train the Fruit Detection Model#

Execute the command in the yolo26 directory to train the three-category fruit detection model using yolo26:

yolo detect train data=datasets/fruits_yolo.yaml model=yolo26n.pt epochs=300 imgsz=320

Converting Fruit Detection kmodel#

Model conversion requires installing the following libraries in the training environment:

# Linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0

# Windows platform: Please install dotnet-7 by yourself and add environment variables. Online pip installation of nncase is supported, but the nncase-kpu library needs to be installed offline. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and use pip to install in the download directory of nncase_kpu-2.*-py2.py3-none-win_amd64.whl
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl

# In addition to nncase and nncase-kpu, other libraries used by the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36

Download the model conversion script package test_yolo26.zip and extract it in the yolo26 directory:

wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolo26.zip
unzip test_yolo26.zip

Use the following commands to first export the pt model under runs/detect/train/weights to an ONNX model, and then convert it to a kmodel:

# Export onnx, please choose the pt model path by yourself
yolo export model=runs/detect/train/weights/best.pt format=onnx imgsz=320
cd test_yolo26/detect
# Convert kmodel, please choose the onnx model path by yourself. The generated kmodel is in the same directory as the onnx model
python to_kmodel.py --target k230 --model ../../runs/detect/train/weights/best.onnx --dataset ../calibration_data --input_width 320 --input_height 320 --ptq_option 0
cd ../../

💡 Model Conversion Script (to_kmodel.py) Parameter Description:

| Parameter Name | Description | Explanation | Type |
| --- | --- | --- | --- |
| target | Target Platform | Options are k230/CPU, corresponding to the k230 chip | str |
| model | Model Path | Path to the ONNX model to be converted | str |
| dataset | Calibration Image Set | Image data used during model conversion, used in the quantization stage | str |
| input_width | Input Width | Width of model input | int |
| input_height | Input Height | Height of model input | int |
| ptq_option | Quantization Method | Quantization strategies are Kld and NoClip, combined with the quantization precision of data and weights. 0 is NoClip+[uint8,uint8], 1 is NoClip+[uint8,int16], 2 is NoClip+[int16,uint8], 3 is Kld+[uint8,uint8], 4 is Kld+[uint8,int16], 5 is Kld+[int16,uint8] | 0/1/2/3/4/5 |

Deploying the Model on k230 Using MicroPython#

Flashing the Image and Installing CanMV IDE#

💡 Firmware Introduction: Please download the latest Daily Build firmware according to your development board type to ensure that the latest features are supported! Alternatively, compile the firmware yourself from the latest code. See the tutorial: Firmware Compilation.

Download and install CanMV IDE (download link: CanMV IDE download), write code in the IDE and run it.

Model File Copy#

Connect the IDE, and copy the converted model and test images to the CanMV/data directory. This path can be customized; you only need to modify the corresponding path when writing code.

YOLO26 Module#

The YOLO26 class integrates five tasks of YOLO26, including classification (classify), detection (detect), segmentation (segment), rotated object detection (obb), and keypoint detection (pose); it supports two inference modes, including image and video stream (video); this class encapsulates the YOLO26 kmodel inference process.

  • Import Method

from libs.YOLO import YOLO26
  • Parameter Description

| Parameter Name | Description | Explanation | Type |
| --- | --- | --- | --- |
| task_type | Task Type | Supports five types of tasks; options are 'classify'/'detect'/'segment'/'obb'/'pose' | str |
| mode | Inference Mode | Supports two inference modes; options are 'image'/'video'. 'image' means inference on an image, 'video' means inference on real-time video stream captured by the camera | str |
| kmodel_path | kmodel Path | Path to the kmodel copied to the development board | str |
| labels | Class Label List | Label names for different classes | list[str] |
| rgb888p_size | Inference Frame Resolution | Resolution of the current inference frame, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | Input resolution when the YOLO26 model was trained, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when inference mode is 'video'; supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | Class confidence threshold for classification tasks, object confidence threshold for detection and segmentation tasks, such as 0.5 | float【0~1】 |
| mask_thresh | Mask Threshold | Binarization threshold for segmenting objects in the detection box in segmentation tasks | float【0~1】 |
| kp_num | Number of Keypoints | Number of keypoints in the keypoint detection task | int |
| kp_dim | Keypoint Dimension | Dimension of keypoints in the keypoint detection task; only supports 2 and 3, determined by the trained model | int【2/3】 |
| max_boxes_num | Maximum Number of Detection Boxes | Maximum number of detection boxes allowed to be returned in one frame of image | int |
| debug_mode | Debug Mode | Whether the timing function takes effect; options are 0/1, 0 means no timing, 1 means timing | int【0/1】 |

Deploying the Model for Image Inference#

For image inference, refer to the following code and modify the parameters defined in __main__ to match your setup:

from libs.YOLO import YOLO26
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify it to your own test image, model path, label name, and model input size
    img_path="/data/test.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]

    confidence_threshold = 0.5
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize the YOLO26 model
    yolo=YOLO26(task_type="detect",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    print(res)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
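
The conf_thresh and max_boxes_num parameters together limit what a detection task returns. The sketch below illustrates their meaning in plain Python. It is not the YOLO26 class's internal code: `filter_detections` and the dictionary layout are hypothetical, and the structure of res returned by yolo.run is not assumed here.

```python
def filter_detections(dets, conf_thresh, max_boxes_num):
    """Keep detections scoring at least conf_thresh, best first, at most max_boxes_num."""
    kept = [d for d in dets if d["score"] >= conf_thresh]
    kept.sort(key=lambda d: d["score"], reverse=True)
    return kept[:max_boxes_num]

# Made-up detections for illustration only.
dets = [
    {"label": "apple", "score": 0.91, "box": [12, 30, 96, 120]},
    {"label": "banana", "score": 0.42, "box": [140, 44, 260, 110]},
    {"label": "orange", "score": 0.77, "box": [200, 150, 280, 230]},
]
for d in filter_detections(dets, conf_thresh=0.5, max_boxes_num=50):
    print(d["label"], d["score"])
```

Raising conf_thresh removes weak detections, while lowering max_boxes_num caps the amount of post-processing and drawing work per frame.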

Deploying the Model for Video Inference#

For video inference, refer to the following code and modify the variables defined in __main__ to match your setup:

from libs.PipeLine import PipeLine
from libs.YOLO import YOLO26
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify it to your own model path, label name, and model input size
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]

    # Add display mode, default is hdmi, optional hdmi/lcd/lt9611/st7701/hx8399. hdmi defaults to lt9611 with resolution 1920*1080; lcd defaults to st7701 with resolution 800*480
    display_mode="lcd"
    rgb888p_size=[640,360]
    confidence_threshold = 0.5
    # Initialize PipeLine
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize the YOLO26 model
    yolo=YOLO26(task_type="detect",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
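    try:
        while True:
            with ScopedTiming("total",1):
                # Frame-by-frame inference
                img=pl.get_frame()
                res=yolo.run(img)
                yolo.draw_result(res,pl.osd_img)
                pl.show_image()
                gc.collect()
    finally:
        # Release resources when the loop is interrupted (for example, when the script is stopped)
        yolo.deinit()
        pl.destroy()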
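
In video mode the inference frame (rgb888p_size) and the display (display_size) usually have different resolutions, so results found on the frame have to be rescaled before they are drawn. draw_result takes care of this internally. The sketch below only illustrates the arithmetic, assuming independent scaling on each axis; `scale_box` is a hypothetical helper.

```python
def scale_box(box, src_size, dst_size):
    """Scale (x, y, w, h) from src_size [width, height] to dst_size [width, height]."""
    x, y, w, h = box
    # Integer arithmetic keeps the result exact and avoids float rounding.
    return (
        x * dst_size[0] // src_size[0],
        y * dst_size[1] // src_size[1],
        w * dst_size[0] // src_size[0],
        h * dst_size[1] // src_size[1],
    )

# A box on the 640x360 inference frame mapped to the 800x480 lcd display.
print(scale_box((64, 36, 128, 72), [640, 360], [800, 480]))
```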

YOLO26 Fruit Segmentation#

YOLO26 Source Code and Training Environment Setup#

YOLO26 is the latest generation real-time object detection model released by Ultralytics, featuring a series of fundamental architectural innovations that enable end-to-end NMS-free inference, aiming to provide more powerful and easier-to-deploy solutions for edge computing and low-power devices.

For setting up the YOLO26 training environment, please refer to ultralytics/ultralytics: Ultralytics YOLO 🚀 (github.com)

# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics

If you have already set up the environment, please ignore this step.

Training Data Preparation#

You can first create a new folder yolo26. Please download the provided sample dataset, which includes three categories of fruits (apple, banana, orange) as scenes, and provides classification, detection, and segmentation datasets respectively. Extract the dataset to the yolo26 directory, and please use fruits_seg as the dataset for the fruit segmentation task. The sample dataset also includes a rotated object detection desktop pen scene dataset yolo_pen_obb, and a license plate keypoint detection dataset car_plate.

If you want to use your own dataset for training, you can download X-AnyLabeling to complete the annotation. Convert the annotated data into the training data format officially supported by yolo26 for subsequent training.
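
For segmentation, the Ultralytics label format is one text line per object: the class index followed by the polygon vertices as x y pairs, each normalized to the image size. The sketch below converts a pixel-coordinate polygon to such a line. `to_yolo_segment_line` is a hypothetical helper.

```python
def to_yolo_segment_line(class_id, polygon, img_w, img_h):
    """Convert a pixel polygon [(x, y), ...] to 'class x1 y1 x2 y2 ...', normalized to 0..1."""
    if len(polygon) < 3:
        raise ValueError("a polygon needs at least three points")
    coords = []
    for x, y in polygon:
        coords.append("%.6f" % (x / img_w))
        coords.append("%.6f" % (y / img_h))
    return " ".join([str(class_id)] + coords)

# A triangle in a 320x320 image, labeled with class index 0.
print(to_yolo_segment_line(0, [(32, 32), (160, 16), (288, 160)], 320, 320))
```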

cd yolo26
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip

If you have already downloaded the data, please ignore this step.

Training Fruit Segmentation Model with YOLO26#

Execute the following command in the yolo26 directory to train a three-class fruit segmentation model using yolo26:

yolo segment train data=datasets/fruits_seg.yaml model=yolo26n-seg.pt epochs=100 imgsz=320

Converting Fruit Segmentation kmodel#

Model conversion requires installing the following libraries in the training environment:

# Linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires installing dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0

# Windows platform: Please install dotnet-7 by yourself and add environment variables. nncase supports online installation via pip, but nncase-kpu library needs offline installation. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl at https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, install using pip in the directory where nncase_kpu-2.*-py2.py3-none-win_amd64.whl is downloaded
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl

# In addition to nncase and nncase-kpu, other libraries used by the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36

Download the model conversion script package test_yolo26.zip and extract it in the yolo26 directory:

wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolo26.zip
unzip test_yolo26.zip

Use the following commands to first export the pt model under runs/segment/train/weights to an ONNX model, and then convert it to a kmodel:

# Export onnx, please select the pt model path yourself
yolo export model=runs/segment/train/weights/best.pt format=onnx imgsz=320
cd test_yolo26/segment
# Convert kmodel, please select the onnx model path yourself, the generated kmodel is in the same directory as the onnx model
python to_kmodel.py --target k230 --model ../../runs/segment/train/weights/best.onnx --dataset ../calibration_data --input_width 320 --input_height 320 --ptq_option 0
cd ../../

💡 Model Conversion Script (to_kmodel.py) Parameter Description:

| Parameter Name | Description | Note | Type |
| --- | --- | --- | --- |
| target | Target Platform | Options are k230/CPU, corresponding to k230 chip | str |
| model | Model Path | The path of the ONNX model to be converted | str |
| dataset | Calibration Image Set | Image data used during model conversion, used in the quantization phase | str |
| input_width | Input Width | The width of the model input | int |
| input_height | Input Height | The height of the model input | int |
| ptq_option | Quantization Method | Quantization strategies are Kld and NoClip, combining data and weights quantization precision. 0 is NoClip+[uint8,uint8], 1 is NoClip+[uint8,int16], 2 is NoClip+[int16,uint8], 3 is Kld+[uint8,uint8], 4 is Kld+[uint8,int16], 5 is Kld+[int16,uint8] | 0/1/2/3/4/5 |

Deploying Model on k230 Using MicroPython#

Flash Image and Install CanMV IDE#

💡 Firmware Introduction: Please download the latest Daily Build firmware according to your development board type to ensure the latest features are supported! Alternatively, compile the firmware yourself from the latest code. For the tutorial, see: Firmware Compilation.

Download and install CanMV IDE (download link: CanMV IDE download), write and run code in the IDE.

Model File Copy#

Connect the IDE, and copy the converted model and test images to the CanMV/data directory. This path can be customized; you only need to modify the corresponding path when writing code.

YOLO26 Module#

The YOLO26 class integrates the five tasks of YOLO26, including classification (classify), detection (detect), segmentation (segment), rotated object detection (obb), and keypoint detection (pose); supports two inference modes, including image and video stream (video); this class encapsulates the YOLO26 kmodel inference process.

  • Import Method

from libs.YOLO import YOLO26
  • Parameter Description

| Parameter Name | Description | Note | Type |
| --- | --- | --- | --- |
| task_type | Task Type | Supports five types of tasks; options are 'classify'/'detect'/'segment'/'obb'/'pose' | str |
| mode | Inference Mode | Supports two inference modes; options are 'image'/'video'. 'image' means inferring an image, 'video' means inferring the real-time video stream collected by the camera | str |
| kmodel_path | kmodel Path | Path of kmodel copied to the development board | str |
| labels | Class Label List | Label names for different classes | list[str] |
| rgb888p_size | Inference Frame Resolution | Current frame inference resolution, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | Input resolution when training the YOLO26 model, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video'; supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | Class confidence threshold for classification tasks, object confidence threshold for detection and segmentation tasks, such as 0.5 | float【0~1】 |
| mask_thresh | Mask Threshold | Binarization threshold for segmenting objects in the detection box in the segmentation task | float【0~1】 |
| kp_num | Keypoint Number | Number of keypoints in the keypoint detection task | int |
| kp_dim | Keypoint Dimension | Dimension of keypoints in the keypoint detection task; only supports 2 and 3, determined by the trained model | int【2/3】 |
| max_boxes_num | Maximum Detection Boxes | Maximum number of detection boxes allowed to be returned in one frame of image | int |
| debug_mode | Debug Mode | Whether the timing function is enabled; options are 0/1, 0 means no timing, 1 means timing | int【0/1】 |

Deploying Model for Image Inference#

For image inference, refer to the following code and modify the parameters defined in __main__ to match your setup:

from libs.YOLO import YOLO26
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify it to your own test image, model path, label name, and model input size
    img_path="/data/test.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]

    confidence_threshold = 0.5
    mask_threshold=0.5
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize YOLO26 model
    yolo=YOLO26(task_type="segment",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,mask_thresh=mask_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
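
The mask_thresh parameter is the cut-off used to turn per-pixel mask scores into a binary mask. The sketch below illustrates that step on a tiny made-up grid. `binarize_mask` is a hypothetical helper, and treating values strictly greater than the threshold as foreground is an assumption, not a statement about the YOLO26 class.

```python
def binarize_mask(mask, mask_thresh):
    """Turn a 2D grid of scores in 0..1 into 0/1 values using mask_thresh."""
    return [[1 if p > mask_thresh else 0 for p in row] for row in mask]

# Made-up mask scores for a 2x3 region inside a detection box.
mask = [
    [0.10, 0.62, 0.80],
    [0.05, 0.55, 0.49],
]
print(binarize_mask(mask, 0.5))
```

A higher threshold shrinks the mask toward the most confident pixels, and a lower one grows it.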

Deploying Model for Video Inference#

For video inference, refer to the following code and modify the variables defined in __main__ to match your setup:

from libs.PipeLine import PipeLine
from libs.YOLO import YOLO26
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify it to your own model path, label name, and model input size
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]

    # Add display mode, default hdmi, optional hdmi/lcd/lt9611/st7701/hx8399, where hdmi defaults to lt9611, resolution 1920*1080; lcd defaults to st7701, resolution 800*480
    display_mode="lcd"
    rgb888p_size=[320,320]
    confidence_threshold = 0.5
    mask_threshold=0.5
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize YOLO26 model
    yolo=YOLO26(task_type="segment",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,mask_thresh=mask_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
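    try:
        while True:
            with ScopedTiming("total",1):
                img=pl.get_frame()
                res=yolo.run(img)
                yolo.draw_result(res,pl.osd_img)
                pl.show_image()
                gc.collect()
    finally:
        # Release resources when the loop is interrupted (for example, when the script is stopped)
        yolo.deinit()
        pl.destroy()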

YOLO26 Rotated Object Detection#

YOLO26 Source Code and Training Environment Setup#

YOLO26 is the latest generation real-time object detection model released by Ultralytics, which has undergone a series of fundamental architectural innovations, achieving end-to-end NMS-free inference, aiming to provide more powerful and easier-to-deploy solutions for edge computing and low-power devices.

For setting up the YOLO26 training environment, please refer to ultralytics/ultralytics: Ultralytics YOLO 🚀 (github.com)

# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics

If you have already set up the environment, please skip this step.

Training Data Preparation#

First create a new folder named yolo26, then download the provided sample dataset. It contains a rotated object detection dataset for a single-class rotated pen detection scenario (pen). Extract the dataset to the yolo26 directory and use yolo_pen_obb as the dataset for the rotated object detection task.

If you want to use your own dataset for training, you can download X-AnyLabeling to complete the annotation. Convert the annotated data into the training data format officially supported by yolo26 for subsequent training.

cd yolo26
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip

If you have already downloaded the data, please skip this step.

Using YOLO26 to Train the Rotated Object Detection Model#

Execute the following command in the yolo26 directory to train a single-class rotated object detection model with yolo26:

yolo obb train data=datasets/pen_obb.yaml model=yolo26n-obb.pt epochs=100 imgsz=320

Converting Rotated Object Detection kmodel#

Model conversion requires installing the following libraries in the training environment:

# Linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires installing dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0

# Windows platform: Please install dotnet-7 by yourself and add environment variables. pip online installation of nncase is supported, but the nncase-kpu library needs to be installed offline. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and use pip to install in the download directory of nncase_kpu-2.*-py2.py3-none-win_amd64.whl
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl

# In addition to nncase and nncase-kpu, other libraries used by the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36

Download the model conversion script tool test_yolo26.zip and extract it to the yolo26 directory:

wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolo26.zip
unzip test_yolo26.zip

Use the following commands to first export the pt model under runs/obb/train/weights to an onnx model, and then convert it to a kmodel:

# Export onnx, please choose the pt model path by yourself
yolo export model=runs/obb/train/weights/best.pt format=onnx imgsz=320
cd test_yolo26/obb
# Convert kmodel, please choose the onnx model path by yourself. The generated kmodel is in the same directory as the onnx model
python to_kmodel.py --target k230 --model ../../runs/obb/train/weights/best.onnx --dataset ../calibration_obb --input_width 320 --input_height 320 --ptq_option 0
cd ../../
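
The two steps above can also be assembled programmatically, which helps when the weights path or image size changes between runs. The following is a minimal dry-run sketch: it only builds and prints the command lines and does not execute anything. The function name build_commands and its defaults are assumptions for illustration; the flags themselves are the ones shown in the commands above.

```python
import shlex


def build_commands(weights="runs/obb/train/weights/best.pt", task="obb",
                   imgsz=320, ptq_option=0):
    """Return the export and conversion commands as argv lists (dry run only)."""
    onnx_path = weights[:-3] + ".onnx"  # best.pt -> best.onnx, same directory
    export_cmd = ["yolo", "export", "model=" + weights,
                  "format=onnx", "imgsz=%d" % imgsz]
    # The conversion command is meant to be run from test_yolo26/<task>,
    # hence the ../../ prefix on the model path.
    convert_cmd = [
        "python", "to_kmodel.py",
        "--target", "k230",
        "--model", "../../" + onnx_path,
        "--dataset", "../calibration_" + task,
        "--input_width", str(imgsz),
        "--input_height", str(imgsz),
        "--ptq_option", str(ptq_option),
    ]
    return export_cmd, convert_cmd


export_cmd, convert_cmd = build_commands()
print(shlex.join(export_cmd))
print("cd test_yolo26/obb")
print(shlex.join(convert_cmd))
```

To actually run the commands, the argv lists could be passed to subprocess.run with the appropriate working directory.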

💡 Model Conversion Script (to_kmodel.py) Parameter Description:

| Parameter Name | Description | Explanation | Type |
| --- | --- | --- | --- |
| target | Target Platform | Options are k230/CPU, corresponding to the k230 chip | str |
| model | Model Path | The path of the ONNX model to be converted | str |
| dataset | Calibration Image Set | Image data used during model conversion, used in the quantization phase | str |
| input_width | Input Width | Width of the model input | int |
| input_height | Input Height | Height of the model input | int |
| ptq_option | Quantization Method | Quantization strategies are Kld and NoClip, combined with the quantization precision of data and weights: 0 is NoClip+[uint8,uint8], 1 is NoClip+[uint8,int16], 2 is NoClip+[int16,uint8], 3 is Kld+[uint8,uint8], 4 is Kld+[uint8,int16], 5 is Kld+[int16,uint8] | 0/1/2/3/4/5 |

Deploying Models on k230 Using MicroPython#

Flashing Image and Installing CanMV IDE#

💡 Firmware Introduction: Please download the latest Daily Build Firmware according to your development board type to ensure that the latest features are supported! Or use the latest code to compile the firmware yourself, see the tutorial: Firmware Compilation.

Download and install CanMV IDE (Download link: CanMV IDE download), write and run code in the IDE.

Model File Copy#

Connect the IDE and copy the converted model and test images to the CanMV/data directory. This path can be customized; you only need to modify the corresponding paths in your code.
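
Before loading the model it can be useful to confirm that the files were copied where the script expects them. A minimal sketch, assuming the /data paths used in the examples of this section; the helper name file_exists is hypothetical:

```python
import os


def file_exists(path):
    """Return True if path exists; os.stat raises OSError when it does not."""
    try:
        os.stat(path)
        return True
    except OSError:
        return False


for path in ("/data/best.kmodel", "/data/test.jpg"):
    print(path, "found" if file_exists(path) else "MISSING")
```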

YOLO26 Module#

The YOLO26 class integrates five YOLO26 tasks: classification (classify), detection (detect), segmentation (segment), rotated object detection (obb), and keypoint detection (pose). It supports two inference modes, image (image) and video stream (video), and encapsulates the kmodel inference process of YOLO26.

  • Import Method

from libs.YOLO import YOLO26
  • Parameter Description

| Parameter Name | Description | Explanation | Type |
| --- | --- | --- | --- |
| task_type | Task Type | Supports five types of tasks, options are 'classify'/'detect'/'segment'/'obb'/'pose' | str |
| mode | Inference Mode | Supports two inference modes, options are 'image'/'video'; 'image' means inference on images, 'video' means inference on real-time video streams captured by the camera | str |
| kmodel_path | kmodel Path | Path of the kmodel copied to the development board | str |
| labels | Class Label List | Label names for the different categories | list[str] |
| rgb888p_size | Inference Frame Resolution | Resolution of the frame used for inference, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | Input resolution used when the YOLO26 model was trained, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video'; supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | Class confidence threshold for classification tasks, object confidence threshold for detection and segmentation tasks, such as 0.5 | float [0~1] |
| mask_thresh | Mask Threshold | Binarization threshold for segmenting objects inside the detection box in the segmentation task | float [0~1] |
| kp_num | Number of Keypoints | Number of keypoints in the keypoint detection task | int |
| kp_dim | Keypoint Dimension | Dimension of keypoints in the keypoint detection task; only 2 and 3 are supported, determined by the trained model | int [2/3] |
| max_boxes_num | Maximum Number of Detection Boxes | Maximum number of detection boxes allowed to be returned for one frame | int |
| debug_mode | Debug Mode | Whether the timing function is enabled, options 0/1; 0 means no timing, 1 means timing | int [0/1] |
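
The constraints in the parameter table can be checked on the host before a script is copied to the board. The sketch below is a hypothetical helper, not part of libs.YOLO; it only encodes the value ranges listed above, and treating display_size as mandatory in video mode is an assumption based on the table's wording.

```python
VALID_TASKS = ("classify", "detect", "segment", "obb", "pose")
VALID_MODES = ("image", "video")


def check_yolo_args(task_type, mode, conf_thresh, mask_thresh=0.5,
                    kp_dim=2, display_size=None, debug_mode=0):
    """Return a list of problems found in the arguments (empty list means OK)."""
    problems = []
    if task_type not in VALID_TASKS:
        problems.append("task_type must be one of %s" % (VALID_TASKS,))
    if mode not in VALID_MODES:
        problems.append("mode must be 'image' or 'video'")
    if not 0.0 <= conf_thresh <= 1.0:
        problems.append("conf_thresh must be in [0, 1]")
    if task_type == "segment" and not 0.0 <= mask_thresh <= 1.0:
        problems.append("mask_thresh must be in [0, 1]")
    if task_type == "pose" and kp_dim not in (2, 3):
        problems.append("kp_dim must be 2 or 3")
    if mode == "video" and display_size is None:
        problems.append("display_size should be set in video mode")
    if debug_mode not in (0, 1):
        problems.append("debug_mode must be 0 or 1")
    return problems


print(check_yolo_args("obb", "image", 0.1))             # []
print(check_yolo_args("pose", "video", 1.5, kp_dim=4))  # three problems
```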

Deploying Model to Implement Image Inference#

For image inference, refer to the following code and modify the parameters defined in __main__ to match your setup:

from libs.YOLO import YOLO26
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is just an example. For custom scenarios, please modify it to your own test image, model path, label name, and model input size
    img_path="/data/test.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ['pen']
    model_input_size=[320,320]

    confidence_threshold = 0.1
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize YOLO26 model
    yolo=YOLO26(task_type="obb",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,max_boxes_num=100,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
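
For the obb task each detection is a rotated box rather than an axis-aligned one. As a rough illustration of the geometry, the sketch below converts a box given as centre, size, and rotation angle into its four corner points. The (cx, cy, w, h, angle in radians) layout and the function name obb_to_corners are assumptions for illustration; the structure of res returned by yolo.run is not described here and may differ.

```python
import math


def obb_to_corners(cx, cy, w, h, angle):
    """Corners of a w x h box centred at (cx, cy), rotated by `angle` radians."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each corner offset about the centre, then translate.
    return [(cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a)
            for dx, dy in half]


corners = obb_to_corners(100.0, 50.0, 40.0, 10.0, math.pi / 2)
print([(round(x, 3), round(y, 3)) for x, y in corners])
```

With an angle of zero the result is the ordinary axis-aligned box; a quarter turn swaps the roles of width and height.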

Deploying Model to Implement Video Inference#

For video inference, refer to the following code and modify the variables defined in __main__ to match your setup:

from libs.PipeLine import PipeLine
from libs.YOLO import YOLO26
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is just an example. For custom scenarios, please modify it to your own model path, label name, and model input size
    kmodel_path="/data/best.kmodel"
    labels = ['pen']
    model_input_size=[320,320]

    # Add display mode, default is hdmi, optional hdmi/lcd/lt9611/st7701/hx8399, where hdmi defaults to lt9611, resolution 1920*1080; lcd defaults to st7701, resolution 800*480
    display_mode="lcd"
    rgb888p_size=[640,360]
    confidence_threshold = 0.1
    # Initialize PipeLine
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize YOLO26 model
    yolo=YOLO26(task_type="obb",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    try:
        while True:
            with ScopedTiming("total",1):
                img=pl.get_frame()
                res=yolo.run(img)
                yolo.draw_result(res,pl.osd_img)
                pl.show_image()
                gc.collect()
    finally:
        # Release resources when the loop is interrupted (for example, stopped from the IDE)
        yolo.deinit()
        pl.destroy()

YOLO26 License Plate Corner Detection#

YOLO26 Source Code and Training Environment Setup#

YOLO26 is the latest generation real-time object detection model released by Ultralytics. It has undergone a series of fundamental architectural innovations, achieving end-to-end NMS-free inference, and is designed to provide more powerful and easier-to-deploy solutions for edge computing and low-power devices.

For YOLO26 training environment setup, please refer to ultralytics/ultralytics: Ultralytics YOLO 🚀 (github.com)

# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics

If you have already set up the environment, please ignore this step.

Training Data Preparation#

First create a new folder named yolo26, then download the provided sample dataset. It contains a dataset for the license plate four-corner keypoint detection scenario. Extract the dataset to the yolo26 directory and use car_plate as the dataset for the keypoint detection task.

If you want to use your own dataset for training, you can download X-AnyLabeling to complete the annotation. Convert the annotated data into the training data format officially supported by yolo26 for subsequent training.

cd yolo26
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip

If you have already downloaded the data, please ignore this step.

Using YOLO26 to Train the Keypoint Detection Model#

Execute the following command in the yolo26 directory to train a keypoint detection model with yolo26:

yolo pose train data=datasets/car_plate.yaml model=yolo26n-pose.pt epochs=100 imgsz=320

Convert Keypoint Detection kmodel#

Model conversion requires the following libraries to be installed in the training environment:

# linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0

# windows platform: please install dotnet-7 by yourself and add environment variables. nncase can be installed online using pip, but the nncase-kpu library needs to be installed offline. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl at https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and use pip to install in the nncase_kpu-2.*-py2.py3-none-win_amd64.whl download directory
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl

# In addition to nncase and nncase-kpu, other libraries used by the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36

Download the script tool and extract the model conversion script tool test_yolo26.zip to the yolo26 directory;

wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolo26.zip
unzip test_yolo26.zip

Use the following commands to first export the pt model under runs/pose/train/weights to an onnx model, and then convert it to a kmodel:

# Export onnx, please choose the pt model path by yourself
yolo export model=runs/pose/train/weights/best.pt format=onnx imgsz=320
cd test_yolo26/pose
# Convert kmodel, please choose the onnx model path by yourself, the generated kmodel is in the same level directory as the onnx model
python to_kmodel.py --target k230 --model ../../runs/pose/train/weights/best.onnx --dataset ../calibration_pose --input_width 320 --input_height 320 --ptq_option 0
cd ../../

💡 Model conversion script (to_kmodel.py) parameter description:

| Parameter Name | Description | Explanation | Type |
| --- | --- | --- | --- |
| target | Target Platform | Options are k230/CPU, corresponding to the k230 chip | str |
| model | Model Path | The path of the ONNX model to be converted | str |
| dataset | Calibration Image Set | Image data used during model conversion, used in the quantization stage | str |
| input_width | Input Width | Width of the model input | int |
| input_height | Input Height | Height of the model input | int |
| ptq_option | Quantization Method | Quantization strategies are Kld and NoClip, combined with the quantization precision of data and weights: 0 is NoClip+[uint8,uint8], 1 is NoClip+[uint8,int16], 2 is NoClip+[int16,uint8], 3 is Kld+[uint8,uint8], 4 is Kld+[uint8,int16], 5 is Kld+[int16,uint8] | 0/1/2/3/4/5 |

Deploying Models on k230 Using MicroPython#

Flashing Firmware and Installing CanMV IDE#

💡 Firmware Introduction: Please download the latest Daily Build firmware according to your development board type to ensure that the latest features are supported! Or use the latest code to compile the firmware yourself. For the tutorial, see: Firmware Compilation.

Download and install CanMV IDE (Download link: CanMV IDE download), write code in the IDE and run it.

Model File Copy#

Connect the IDE and copy the converted model and test images to the CanMV/data directory. This path can be customized; you only need to modify the corresponding paths in your code.

YOLO26 Module#

The YOLO26 class integrates the five tasks of YOLO26, including classification (classify), detection (detect), segmentation (segment), rotated object detection (obb), keypoint detection (pose); supports two inference modes, including image (image) and video stream (video); this class encapsulates the kmodel inference process of YOLO26.

  • Import Method

from libs.YOLO import YOLO26
  • Parameter Description

| Parameter Name | Description | Explanation | Type |
| --- | --- | --- | --- |
| task_type | Task Type | Supports five types of tasks, options are 'classify'/'detect'/'segment'/'obb'/'pose' | str |
| mode | Inference Mode | Supports two inference modes, options are 'image'/'video'; 'image' means inference on images, 'video' means inference on real-time video streams captured by the camera | str |
| kmodel_path | kmodel Path | Path of the kmodel copied to the development board | str |
| labels | Class Label List | Label names of the different categories | list[str] |
| rgb888p_size | Inference Frame Resolution | Resolution of the frame used for inference, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | Input resolution used when the YOLO26 model was trained, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video'; supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | Class confidence threshold for classification tasks, object confidence threshold for detection and segmentation tasks, such as 0.5 | float [0~1] |
| mask_thresh | Mask Threshold | Binarization threshold for segmenting the object inside the detection box in the segmentation task | float [0~1] |
| kp_num | Number of Keypoints | Number of keypoints in the keypoint detection task | int |
| kp_dim | Keypoint Dimension | Dimension of keypoints in the keypoint detection task; only 2 and 3 are supported, determined by the trained model | int [2/3] |
| max_boxes_num | Maximum Number of Detection Boxes | Maximum number of detection boxes allowed to be returned for one frame | int |
| debug_mode | Debug Mode | Whether the timing function is enabled, options 0/1; 0 means no timing, 1 means timing | int [0/1] |

Deploying Model to Implement Image Inference#

For image inference, refer to the following code and modify the parameters defined in __main__ to match your setup:

from libs.YOLO import YOLO26
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify it to your own test image, model path, label name, model input size, number of keypoints, and keypoint dimension
    img_path="/data/test.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ['plate']
    model_input_size=[320,320]
    kp_num=4
    kp_dim=2

    confidence_threshold = 0.5
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize YOLO26 model
    yolo=YOLO26(task_type="pose",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,kp_num=kp_num,kp_dim=kp_dim,conf_thresh=confidence_threshold,max_boxes_num=100,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    print(res)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
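
The kp_num and kp_dim parameters describe how many values belong to each detected object: kp_num keypoints with kp_dim values each. The sketch below groups a flat list of keypoint values into per-keypoint tuples. The flat layout and the helper name split_keypoints are assumptions for illustration; the actual structure of res for the pose task is not described here.

```python
def split_keypoints(flat, kp_num, kp_dim):
    """Group a flat list of kp_num * kp_dim values into per-keypoint tuples."""
    if kp_dim not in (2, 3):
        raise ValueError("kp_dim must be 2 or 3")
    if len(flat) != kp_num * kp_dim:
        raise ValueError("expected %d values, got %d" % (kp_num * kp_dim, len(flat)))
    return [tuple(flat[i * kp_dim:(i + 1) * kp_dim]) for i in range(kp_num)]


# Four plate corners with (x, y) only, i.e. kp_num=4 and kp_dim=2.
flat = [10, 20, 110, 22, 108, 60, 12, 58]
print(split_keypoints(flat, kp_num=4, kp_dim=2))
# [(10, 20), (110, 22), (108, 60), (12, 58)]
```

With kp_dim=3 each tuple would carry a third value per keypoint, typically a visibility or confidence score in Ultralytics-style pose models.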

Deploying Model to Implement Video Inference#

For video inference, refer to the following code and modify the variables defined in __main__ to match your setup:

from libs.PipeLine import PipeLine
from libs.YOLO import YOLO26
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify it to your own model path, label name, model input size, number of keypoints, and keypoint dimension
    kmodel_path="/data/best.kmodel"
    labels = ["plate"]
    model_input_size=[320,320]
    kp_num=4
    kp_dim=2

    # Add display mode, default is hdmi, optional hdmi/lcd/lt9611/st7701/hx8399, where hdmi is defaulted to lt9611, resolution 1920*1080; lcd is defaulted to st7701, resolution 800*480
    display_mode="lcd"
    rgb888p_size=[320,320]
    confidence_threshold = 0.5
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize YOLO26 model
    yolo=YOLO26(task_type="pose",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,kp_num=kp_num,kp_dim=kp_dim,conf_thresh=confidence_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    try:
        while True:
            with ScopedTiming("total",1):
                img=pl.get_frame()
                res=yolo.run(img)
                yolo.draw_result(res,pl.osd_img)
                pl.show_image()
                gc.collect()
    finally:
        # Release resources when the loop is interrupted (for example, stopped from the IDE)
        yolo.deinit()
        pl.destroy()

kmodel Conversion Verification#

The model conversion script toolkits (test_yolov5/test_yolov8/test_yolo11/test_yolo26) downloaded for different models contain kmodel verification scripts.

YOLO26 performs the NMS step inside the model, so its outputs are final results with concrete meaning rather than raw feature maps, and cosine similarity is not a suitable measure for them. The output shape is also small, so tiny numerical differences are amplified in the cosine value. For YOLO26, use the test_***_onnx.py and test_***_kmodel.py scripts to compare the actual inference results instead.

Note: Executing the verification script requires adding environment variables

linux:

# The paths in the commands below are the paths of the Python environment where nncase is installed. Please adapt and modify according to your environment
export NNCASE_PLUGIN_PATH=$NNCASE_PLUGIN_PATH:/usr/local/lib/python3.9/site-packages/
export PATH=$PATH:/usr/local/lib/python3.9/site-packages/
source /etc/profile

windows:

Add the Lib/site-packages path under the Python environment where nncase is installed to the system variable Path of the environment variables.
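
The path to add depends on where nncase was installed. A minimal sketch for locating it from the Python interpreter used for conversion is shown below; the function name nncase_site_packages is hypothetical, and the fallback to the interpreter's default site-packages directory is an assumption for the case where nncase cannot be found.

```python
import importlib.util
import os
import sysconfig


def nncase_site_packages():
    """Best-effort guess of the directory that contains the nncase package."""
    spec = importlib.util.find_spec("nncase")
    if spec is not None and spec.submodule_search_locations:
        package_dir = list(spec.submodule_search_locations)[0]
        return os.path.dirname(package_dir)
    if spec is not None and spec.origin:
        return os.path.dirname(spec.origin)
    # nncase is not importable from this interpreter: use the default site-packages.
    return sysconfig.get_paths()["purelib"]


path = nncase_site_packages()
print("export NNCASE_PLUGIN_PATH=$NNCASE_PLUGIN_PATH:" + path)
print("export PATH=$PATH:" + path)
```

On Windows, the printed directory is the one to append to the Path system variable.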

Compare onnx Output and kmodel Output#

Generate Input bin File#

Navigate to the classify/detect/segment/obb/pose directory and execute the following command:

python save_bin.py --image ../test_images/test.jpg --input_width 224 --input_height 224

Executing the script will generate the bin files onnx_input_float32.bin and kmodel_input_uint8.bin in the current directory, which serve as input files for the onnx model and kmodel model.

Compare Output#

Copy the converted models best.onnx and best.kmodel to the classify/detect/segment directory, then execute the verification script with the following command:

python simulate.py --model best.onnx --model_input onnx_input_float32.bin --kmodel best.kmodel --kmodel_input kmodel_input_uint8.bin --input_width 224 --input_height 224

The following output will be obtained:

output 0 cosine similarity : 0.9985673427581787

The script will sequentially compare the cosine similarity of the outputs. If the similarity is above 0.99, the model is generally considered usable; otherwise, actual inference testing is required, or the quantization parameters need to be changed to re-export the kmodel. If the model has multiple outputs, there will be multiple lines of similarity comparison information. For example, for a segmentation task, there are two outputs, and the similarity comparison information is as follows:

output 0 cosine similarity : 0.9999530911445618
output 1 cosine similarity : 0.9983288645744324
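
For reference, cosine similarity between two flattened outputs can be computed as below. This is a minimal pure-Python sketch of the metric, not the implementation used by simulate.py; the sample values are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # undefined for zero vectors; report 0 by convention here
    return dot / (norm_a * norm_b)


onnx_out = [0.10, 0.70, 0.20]     # illustrative values only
kmodel_out = [0.12, 0.66, 0.22]   # illustrative values only
sim = cosine_similarity(onnx_out, kmodel_out)
print("output 0 cosine similarity :", sim)
print("usable" if sim > 0.99 else "needs further testing")
```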

ONNX Model Inference on Images#

Navigate to the classify/detect/segment/obb/pose directory. Taking the classification task as an example, open test_cls_onnx.py, modify the parameters in main() to fit your model, and then execute the command:

python test_cls_onnx.py

After the command executes successfully, the results will be saved to onnx_cls_results.jpg .

The detection task, segmentation task, oriented object detection task, and keypoint detection task are similar. Execute test_det_onnx.py, test_seg_onnx.py, test_obb_onnx.py, test_pose_onnx.py respectively.

Kmodel Model Inference on Images#

Navigate to the classify/detect/segment/obb/pose directory. Taking the classification task as an example, open test_cls_kmodel.py, modify the parameters in main() to fit your model, and then execute the command:

python test_cls_kmodel.py

After the command executes successfully, the results will be saved to kmodel_cls_results.jpg .

The detection task, segmentation task, oriented object detection task, and keypoint detection task are similar. Execute test_det_kmodel.py, test_seg_kmodel.py, test_obb_kmodel.py, test_pose_kmodel.py respectively.

Tuning Guide#

When the model does not perform well on the K230, consider tuning from aspects such as threshold settings, model size, input resolution, quantization method, and training data quality.

Adjust Thresholds#

Adjust the confidence threshold, nms threshold, and mask threshold to tune the deployment performance without changing the model. In detection tasks, raising the confidence threshold and lowering the nms threshold will reduce the number of detection boxes; conversely, lowering the confidence threshold and raising the nms threshold will increase the number of detection boxes. In segmentation tasks, the mask threshold affects the division of segmentation regions. You can adjust according to the actual scenario to find the threshold for better performance.
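
The effect of the two detection thresholds can be seen in a small host-side experiment. The sketch below applies a confidence filter followed by greedy NMS to three made-up axis-aligned boxes. It is a generic illustration for models whose post-processing uses NMS, not the post-processing code of libs.YOLO, and the function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2, score) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_boxes(boxes, conf_thresh, nms_thresh):
    """Drop low-confidence boxes, then suppress boxes overlapping a better one."""
    candidates = sorted((b for b in boxes if b[4] >= conf_thresh),
                        key=lambda b: b[4], reverse=True)
    kept = []
    for box in candidates:
        if all(iou(box, k) <= nms_thresh for k in kept):
            kept.append(box)
    return kept


boxes = [
    (10, 10, 110, 110, 0.90),
    (20, 20, 120, 120, 0.60),    # overlaps the first box (IoU about 0.68)
    (200, 200, 260, 260, 0.30),
]
print(len(filter_boxes(boxes, conf_thresh=0.25, nms_thresh=0.45)))  # 2
print(len(filter_boxes(boxes, conf_thresh=0.25, nms_thresh=0.70)))  # 3
print(len(filter_boxes(boxes, conf_thresh=0.50, nms_thresh=0.70)))  # 2
```

Raising the NMS threshold from 0.45 to 0.70 lets the overlapping box survive, while raising the confidence threshold to 0.50 removes the weakest box.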

Change Model#

Choose models of different sizes to balance speed, memory usage, and accuracy. You can select n/s/m/l models for training and conversion according to your actual needs.

Change Input Resolution#

Change the input resolution of the model to fit your scenario. A larger resolution may improve the deployment performance but will consume more inference time.
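
As a rough guide to the cost of a larger input, the number of input values grows with width times height. The sketch below compares a few square resolutions against 320x320. Actual inference time on the K230 depends on the model and does not have to scale exactly with this ratio, so treat the numbers as an estimate only; the function name is hypothetical.

```python
def input_elements(width, height, channels=3):
    """Number of values in one input tensor of the given size."""
    return width * height * channels


base = input_elements(320, 320)
for size in (224, 320, 640):
    n = input_elements(size, size)
    print("%dx%d: %d input values, %.2fx the 320x320 input" % (size, size, n, n / base))
```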

Modify Quantization Method#

The model conversion script provides six quantization options, which apply uint8 or int16 quantization to data and weights and use either the NoClip or Kld calibration method.

In the kmodel conversion script, different quantization methods are specified by choosing different ptq_option values.

| ptq_option | data | weights | calibrate_method |
| --- | --- | --- | --- |
| 0 | uint8 | uint8 | NoClip |
| 1 | uint8 | int16 | NoClip |
| 2 | int16 | uint8 | NoClip |
| 3 | uint8 | uint8 | Kld |
| 4 | uint8 | int16 | Kld |
| 5 | int16 | uint8 | Kld |
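
When comparing several quantized models it helps to keep the meaning of each ptq_option next to the results. The lookup below simply restates the table as a dictionary; the names PTQ_OPTIONS and describe_ptq are hypothetical and not part of to_kmodel.py.

```python
# ptq_option -> (calibrate_method, data type, weights type), as listed in the table
PTQ_OPTIONS = {
    0: ("NoClip", "uint8", "uint8"),
    1: ("NoClip", "uint8", "int16"),
    2: ("NoClip", "int16", "uint8"),
    3: ("Kld", "uint8", "uint8"),
    4: ("Kld", "uint8", "int16"),
    5: ("Kld", "int16", "uint8"),
}


def describe_ptq(option):
    """Human-readable description of one ptq_option value."""
    method, data_type, weight_type = PTQ_OPTIONS[option]
    return "ptq_option %d: calibrate_method=%s, data=%s, weights=%s" % (
        option, method, data_type, weight_type)


for option in sorted(PTQ_OPTIONS):
    print(describe_ptq(option))
```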

Improve Data Quality#

If the training results are poor, please improve the dataset quality, optimizing from aspects such as data volume, reasonable data distribution, annotation quality, and training parameter settings.

Tuning Tips#

  • The impact of quantization parameters on performance is greater on YOLOv8 and YOLO11 than on YOLOv5; compare different quantized models to see the effect;

  • Input resolution has a greater impact on inference speed than model size;

  • Differences in the data distribution between training data and K230 camera data may affect the deployment performance. You can use K230 to collect some data and annotate it yourself for training;
