K230 YOLO Battle#
YOLOv5 Fruit Classification#
YOLOv5 Source Code and Training Environment Setup#
To set up the YOLOv5 training environment, refer to the ultralytics/yolov5 repository on GitHub (YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite):
git clone https://github.com/ultralytics/yolov5.git
cd yolov5
pip install -r requirements.txt
If you have already set up the environment, please ignore this step.
Training Data Preparation#
Please download the provided sample dataset, which includes classification, detection, and segmentation datasets for a scenario with three types of fruits (apple, banana, orange). Extract the dataset to the yolov5 directory, and use fruits_cls as the dataset for the fruit classification task. The sample dataset also contains a rotated object detection dataset yolo_pen_obb for a desktop pen scenario, and a license plate keypoint dataset car_plate. These two tasks are not supported in the YOLOv5 module of the k230.
If you want to use your own dataset for training, you can download X-AnyLabeling to complete annotation. Classification task data does not need to be annotated with tools, just organize the directories according to the format. Convert the annotated data into the training data format officially supported by yolov5 for subsequent training.
cd yolov5
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip
If you have already downloaded the data, please ignore this step.
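For a custom classification dataset, the directory organization mentioned above can be scripted. The sketch below assumes the layout read by the YOLOv5 classification trainer, a `train/` and a `val/` folder with one subfolder per class; the layout inside the provided `fruits_cls` archive is not described here, so check it before relying on this. The function name and the 80/20 split are illustrative choices.

```python
import random
import shutil
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def split_classification_dataset(src_dir, dst_dir, val_ratio=0.2, seed=0):
    """Copy <src>/<class>/* into <dst>/train/<class>/ and <dst>/val/<class>/.

    Hypothetical helper: src_dir holds one folder of images per class.
    Returns the number of images placed in each split, per class.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    counts = {}
    for class_dir in sorted(p for p in src.iterdir() if p.is_dir()):
        images = sorted(
            p for p in class_dir.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
        )
        rng.shuffle(images)
        n_val = int(len(images) * val_ratio)
        splits = {"val": images[:n_val], "train": images[n_val:]}
        for split, files in splits.items():
            out = dst / split / class_dir.name
            out.mkdir(parents=True, exist_ok=True)
            for f in files:
                shutil.copy2(f, out / f.name)
        counts[class_dir.name] = {k: len(v) for k, v in splits.items()}
    return counts
```

The resulting folder can then be passed to the training command through `--data`.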
Using YOLOv5 to Train the Fruit Classification Model#
Execute the following command in the yolov5 directory to train the three-class fruit classification model using yolov5:
python classify/train.py --model yolov5n-cls.pt --data datasets/fruits_cls --epochs 100 --batch-size 8 --imgsz 224 --device '0'
Converting Fruit Classification kmodel#
Model conversion requires installing the following libraries in the training environment:
# linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires installing dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0
# windows platform: please install dotnet-7 by yourself and add environment variables. Online installation of nncase via pip is supported, but the nncase-kpu library needs to be installed offline. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and use pip to install in the download directory of nncase_kpu-2.*-py2.py3-none-win_amd64.whl
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl
# In addition to nncase and nncase-kpu, other libraries used by the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36
Download the model conversion script package test_yolov5.zip and extract it into the yolov5 directory:
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolov5.zip
unzip test_yolov5.zip
According to the following commands, first export the pt model under runs/train-cls/exp/weights to an onnx model, and then convert it to a kmodel model:
# Export onnx, please choose the pt model path by yourself
python export.py --weight runs/train-cls/exp/weights/best.pt --imgsz 224 --batch 1 --include onnx
cd test_yolov5/classify
# Convert kmodel, please choose the onnx model path by yourself. The generated kmodel is in the same directory as the onnx model
python to_kmodel.py --target k230 --model ../../runs/train-cls/exp/weights/best.onnx --dataset ../calibration_data --input_width 224 --input_height 224 --ptq_option 0
cd ../../
💡 Model Conversion Script (to_kmodel.py) Parameter Description:
| Parameter Name | Description | Explanation | Type |
|---|---|---|---|
| target | Target Platform | Options are k230/CPU, corresponding to the k230 chip | str |
| model | Model Path | Path of the ONNX model to be converted | str |
| dataset | Calibration Image Set | Image data used during model conversion, used in the quantization stage | str |
| input_width | Input Width | Width of the model input | int |
| input_height | Input Height | Height of the model input | int |
| ptq_option | Quantization Method | Selects the quantization strategy | 0/1/2/3/4/5 |
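When the conversion is run for several models, the flags in the table can be assembled programmatically. The following is a minimal sketch that only builds the argument list from the documented flags and checks that `ptq_option` is in the listed range; `build_to_kmodel_cmd` is a hypothetical helper, not part of the script package.

```python
import shlex


def build_to_kmodel_cmd(onnx_path, calib_dir, input_width, input_height,
                        ptq_option=0, target="k230"):
    """Return the to_kmodel.py invocation as an argument list (hypothetical helper)."""
    if ptq_option not in range(6):
        raise ValueError("ptq_option must be one of 0, 1, 2, 3, 4, 5")
    return [
        "python", "to_kmodel.py",
        "--target", target,
        "--model", str(onnx_path),
        "--dataset", str(calib_dir),
        "--input_width", str(input_width),
        "--input_height", str(input_height),
        "--ptq_option", str(ptq_option),
    ]


cmd = build_to_kmodel_cmd(
    "../../runs/train-cls/exp/weights/best.onnx", "../calibration_data", 224, 224
)
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` from inside the task directory of the script package.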
Deploying the Model on k230 Using MicroPython#
Flashing the Image and Installing CanMV IDE#
💡 Firmware Introduction: Please download the latest Daily Build firmware for your development board type to ensure that the latest features are supported, or compile the firmware yourself from the latest code. For the tutorial, see: Firmware Compilation.
Download and install CanMV IDE (Download link: CanMV IDE download), write and run code in the IDE.
Model File Copying#
Connect the IDE, and copy the converted model and test images to the path CanMV/data. This path can be customized, you just need to modify the corresponding path when writing the code.
YOLOv5 Module#
The YOLOv5 class integrates three tasks of YOLOv5, including classification (classify), detection (detect), and segmentation (segment); it supports two inference modes, including image and video stream (video); this class encapsulates the kmodel inference process of YOLOv5.
Import Method
from libs.YOLO import YOLOv5
Parameter Description
| Parameter Name | Description | Explanation | Type |
|---|---|---|---|
| task_type | Task Type | Supports three types of tasks; options are 'classify'/'detect'/'segment' | str |
| mode | Inference Mode | Supports two inference modes; options are 'image'/'video'. 'image' runs inference on an image, 'video' runs inference on the real-time video stream captured by the camera | str |
| kmodel_path | kmodel Path | Path of the kmodel copied to the development board | str |
| labels | Category Label List | Label names for the different categories | list[str] |
| rgb888p_size | Inference Frame Resolution | Resolution of the current inference frame, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | Input resolution used when training the YOLOv5 model, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video'; supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | Category confidence threshold for classification tasks, object confidence threshold for detection and segmentation tasks, such as 0.5 | float [0~1] |
| nms_thresh | NMS Threshold | Non-maximum suppression threshold, required for detection and segmentation tasks | float [0~1] |
| mask_thresh | Mask Threshold | Binarization threshold for segmenting the object in the detection box in the segmentation task | float [0~1] |
| max_boxes_num | Maximum Number of Detection Boxes | Maximum number of detection boxes allowed to be returned for one frame | int |
| debug_mode | Debug Mode | Whether the timing function is enabled; 0 means no timing, 1 means timing | int [0/1] |
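To make the role of `conf_thresh` in the classification task concrete, the sketch below turns raw class scores into probabilities with a softmax and only reports a label when the best probability reaches the threshold. It is an illustration in plain Python, not the code inside `libs.YOLO`; whether the kmodel outputs logits or probabilities is an assumption here, and `classify_with_threshold` is a hypothetical name.

```python
import math


def classify_with_threshold(logits, labels, conf_thresh=0.5):
    """Return (label, probability) for the best class, or None if below conf_thresh."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < conf_thresh:
        return None  # no class is confident enough
    return labels[best], probs[best]


labels = ["apple", "banana", "orange"]
print(classify_with_threshold([2.0, 0.1, 0.1], labels))
print(classify_with_threshold([0.1, 0.1, 0.1], labels))
```

With three equally scored classes each probability is one third, so a threshold of 0.5 rejects the frame.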
Deploying the Model for Image Inference#
For image inference, refer to the following code and adjust the variables defined in __main__ to match your setup:
from libs.YOLO import YOLOv5
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is just an example. For custom scenarios, please modify it to your own test image, model path, label name, and model input size
    img_path="/data/test.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[224,224]
    confidence_threshold = 0.5
    img,img_ori=read_image(img_path)
    # Inference frame size as [width, height], taken from the loaded image
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize the YOLOv5 instance
    yolo=YOLOv5(task_type="classify",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
Deploying the Model for Video Inference#
For video inference, refer to the following code and adjust the variables defined in __main__ to match your setup:
from libs.PipeLine import PipeLine
from libs.YOLO import YOLOv5
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is just an example. For custom scenarios, please modify it to your own model path, label name, and model input size
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[224,224]
    # Add display mode, default is hdmi, optional hdmi/lcd/lt9611/st7701/hx8399/nt35516/nt35532/gc9503/aml020t/jd9852/ili9806/virt; among which hdmi defaults to lt9611, and lcd defaults to st7701
    display_mode="lcd"
    # Display resolution, None means using the default resolution of the current display; when using virt, you can set it manually here, for example [800, 480]
    display_size=None
    rgb888p_size=[640,360]
    confidence_threshold = 0.5
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode, display_size=display_size)
    # Create PipeLine, you can pass in sensor_id to select the camera as needed, for example pl.create(sensor_id=2)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize the YOLOv5 instance
    yolo=YOLOv5(task_type="classify",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,debug_mode=0)
    yolo.config_preprocess()
    while True:
        with ScopedTiming("total",1):
            img=pl.get_frame()
            res=yolo.run(img)
            yolo.draw_result(res,pl.osd_img)
            pl.show_image()
            gc.collect()
    yolo.deinit()
    pl.destroy()
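The loop above wraps each iteration in `ScopedTiming("total",1)` to report the per-frame time. As a rough illustration of how such a timing scope can work, here is a host-side stand-in written for CPython; it is not the implementation in `libs.Utils`, and the class name `SimpleTiming` and its `enabled` flag are assumptions made for this sketch.

```python
import time


class SimpleTiming:
    """Context manager that measures the wall-clock time of its body (illustrative)."""

    def __init__(self, name, enabled=True):
        self.name = name
        self.enabled = enabled
        self.elapsed_ms = None

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.elapsed_ms = (time.perf_counter() - self._start) * 1000.0
        if self.enabled:
            print("%s took %.2f ms" % (self.name, self.elapsed_ms))
        return False  # never swallow exceptions raised in the body


with SimpleTiming("total") as t:
    sum(range(100000))  # stands in for get_frame / run / draw_result
```

Dividing 1000 by the reported milliseconds gives an approximate frame rate for the whole pipeline.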
YOLOv5 Fruit Detection#
YOLOv5 Source Code and Training Environment Setup#
To set up the YOLOv5 training environment, refer to the ultralytics/yolov5 repository on GitHub (YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite):
git clone https://github.com/ultralytics/yolov5.git
cd yolov5
pip install -r requirements.txt
If you have already set up the environment, please ignore this step.
Training Data Preparation#
Please download the provided sample dataset. The sample dataset includes three categories of fruits (apple, banana, orange) as scenarios, and provides classification, detection, and segmentation datasets respectively. Extract the dataset to the yolov5 directory, and please use fruits_yolo as the dataset for the fruit detection task. The sample dataset also includes a rotated object detection dataset yolo_pen_obb for desktop pens, and a license plate keypoint dataset car_plate. These two tasks are not supported in the YOLOv5 module of k230.
If you want to use your own dataset for training, you can download X-AnyLabeling to complete the annotation. Convert the annotated data into the training data format officially supported by yolov5 for subsequent training.
cd yolov5
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip
If you have already downloaded the data, please ignore this step.
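When converting your own annotations for the detection task, each object becomes one line of a YOLO label file: the class index followed by the box centre, width and height, all normalized by the image size. A minimal sketch of that conversion, assuming the annotation tool exports pixel boxes as (x1, y1, x2, y2); the function name is hypothetical.

```python
def to_yolo_label(class_id, box_xyxy, img_w, img_h):
    """Convert a pixel box (x1, y1, x2, y2) to a 'class cx cy w h' YOLO label line."""
    x1, y1, x2, y2 = box_xyxy
    cx = (x1 + x2) / 2.0 / img_w  # normalized box centre
    cy = (y1 + y2) / 2.0 / img_h
    w = (x2 - x1) / img_w         # normalized box size
    h = (y2 - y1) / img_h
    return "%d %.6f %.6f %.6f %.6f" % (class_id, cx, cy, w, h)


# A banana (class 1) occupying pixels 100..300 x 50..250 in a 640x480 image
print(to_yolo_label(1, (100, 50, 300, 250), 640, 480))
# prints "1 0.312500 0.312500 0.312500 0.416667"
```

One such line is written per object into a `.txt` file that shares its base name with the image.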
Using YOLOv5 to Train the Fruit Detection Model#
Execute the command in the yolov5 directory to use yolov5 to train the three-category fruit detection model:
python train.py --weight yolov5n.pt --cfg models/yolov5n.yaml --data datasets/fruits_yolo.yaml --epochs 300 --batch-size 8 --imgsz 320 --device '0'
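The `--data` argument points to a dataset description file. For a custom dataset, a file in the YOLOv5 dataset format (`path`, `train`, `val`, `names`) can be generated as sketched below. The directory names passed in the example are hypothetical and do not necessarily match the contents of the provided `fruits_yolo.yaml`.

```python
def make_dataset_yaml(root, train_dir, val_dir, names):
    """Build the text of a YOLOv5 dataset YAML file (illustrative helper)."""
    lines = [
        "path: %s" % root,        # dataset root directory
        "train: %s" % train_dir,  # training images, relative to path
        "val: %s" % val_dir,      # validation images, relative to path
        "names:",
    ]
    lines += ["  %d: %s" % (i, name) for i, name in enumerate(names)]
    return "\n".join(lines) + "\n"


# Hypothetical layout for a custom three-class fruit dataset
text = make_dataset_yaml(
    "datasets/my_fruits", "images/train", "images/val",
    ["apple", "banana", "orange"],
)
print(text)
```

The order of `names` defines the class indices, so it should match the `labels` list used later on the board.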
Converting Fruit Detection kmodel#
Model conversion requires installing the following libraries in the training environment:
# linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires installing dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0
# windows platform: please install dotnet-7 by yourself and add environment variables. nncase can be installed online using pip, but the nncase-kpu library needs to be installed offline. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl at https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and use pip to install in the download directory of nncase_kpu-2.*-py2.py3-none-win_amd64.whl
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl
# Besides nncase and nncase-kpu, other libraries used by the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36
Download the model conversion script package test_yolov5.zip and extract it into the yolov5 directory:
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolov5.zip
unzip test_yolov5.zip
Follow the commands below to first export the pt model under runs/train/exp/weights to an onnx model, and then convert it to a kmodel model:
# Export onnx, please choose the pt model path yourself
python export.py --weight runs/train/exp/weights/best.pt --imgsz 320 --batch 1 --include onnx
cd test_yolov5/detect
# Convert kmodel, please customize the onnx model path. The generated kmodel is in the same directory level as the onnx model
python to_kmodel.py --target k230 --model ../../runs/train/exp/weights/best.onnx --dataset ../calibration_data --input_width 320 --input_height 320 --ptq_option 0
cd ../../
💡 Model Conversion Script (to_kmodel.py) Parameter Description:
| Parameter Name | Description | Explanation | Type |
|---|---|---|---|
| target | Target Platform | Options are k230/CPU, corresponding to the k230 chip | str |
| model | Model Path | Path of the ONNX model to be converted | str |
| dataset | Calibration Image Set | Image data used during model conversion, used in the quantization stage | str |
| input_width | Input Width | Width of the model input | int |
| input_height | Input Height | Height of the model input | int |
| ptq_option | Quantization Method | Selects the quantization strategy | 0/1/2/3/4/5 |
Deploying the Model on k230 Using MicroPython#
Flashing the Image and Installing CanMV IDE#
💡 Firmware Introduction: Please download the latest Daily Build firmware for your development board type to ensure that the latest features are supported, or compile the firmware yourself from the latest code. For the tutorial, see: Firmware Compilation.
Download and install CanMV IDE (download link: CanMV IDE download), write and run code in the IDE.
Copying Model Files#
Connect the IDE, and copy the converted model and test images to the path CanMV/data. This path can be customized; you only need to modify the corresponding path when writing the code.
YOLOv5 Module#
The YOLOv5 class integrates three tasks of YOLOv5, including classify, detect, and segment; it supports two inference modes, including image and video; this class encapsulates the kmodel inference process of YOLOv5.
Import Method
from libs.YOLO import YOLOv5
Parameter Description
| Parameter Name | Description | Explanation | Type |
|---|---|---|---|
| task_type | Task Type | Supports three types of tasks; options are 'classify'/'detect'/'segment' | str |
| mode | Inference Mode | Supports two inference modes; options are 'image'/'video'. 'image' runs inference on an image, 'video' runs inference on the real-time video stream captured by the camera | str |
| kmodel_path | kmodel Path | Path of the kmodel copied to the development board | str |
| labels | Category Label List | Label names for the different categories | list[str] |
| rgb888p_size | Inference Frame Resolution | Resolution of the current inference frame, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | Input resolution used when training the YOLOv5 model, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video'; supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | Category confidence threshold for classification tasks, object confidence threshold for detection and segmentation tasks, such as 0.5 | float [0~1] |
| nms_thresh | NMS Threshold | Non-maximum suppression threshold, required for detection and segmentation tasks | float [0~1] |
| mask_thresh | Mask Threshold | Binarization threshold for segmenting the object in the detection box in the segmentation task | float [0~1] |
| max_boxes_num | Maximum Number of Detection Boxes | Maximum number of detection boxes allowed to be returned for one frame | int |
| debug_mode | Debug Mode | Whether the timing function is enabled; 0 means no timing, 1 means timing | int [0/1] |
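Because `rgb888p_size` and `model_input_size` usually differ, the inference frame has to be resized before it reaches the model. One common approach in YOLO pipelines is letterboxing: scale the frame while preserving its aspect ratio, then pad the remainder. The sketch below only computes the scale and padding; whether `libs.YOLO` pads symmetrically, pads on one side, or resizes directly is not stated here, so treat the centred padding as an assumption. `letterbox_params` is a hypothetical name.

```python
def letterbox_params(src_size, dst_size):
    """Compute scale, resized size and (left, top, right, bottom) padding.

    src_size and dst_size are [width, height], like rgb888p_size and
    model_input_size. Centred padding is an assumption of this sketch.
    """
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    scale = min(dst_w / src_w, dst_h / src_h)  # keep the aspect ratio
    new_w = int(round(src_w * scale))
    new_h = int(round(src_h * scale))
    pad_x, pad_y = dst_w - new_w, dst_h - new_h
    left, top = pad_x // 2, pad_y // 2
    return scale, (new_w, new_h), (left, top, pad_x - left, pad_y - top)


print(letterbox_params([640, 360], [320, 320]))
# prints (0.5, (320, 180), (0, 70, 0, 70))
```

Detected boxes are mapped back to the frame by subtracting the left and top padding and dividing by the scale.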
Deploying the Model for Image Inference#
For image inference, refer to the following code and adjust the variables defined in __main__ to match your setup:
from libs.YOLO import YOLOv5
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify it to your own test image, model path, label names, and model input size
    img_path="/data/test.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]
    confidence_threshold = 0.5
    nms_threshold=0.45
    img,img_ori=read_image(img_path)
    # Inference frame size as [width, height], taken from the loaded image
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize the YOLOv5 instance
    yolo=YOLOv5(task_type="detect",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
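The example passes `conf_thresh`, `nms_thresh` and `max_boxes_num` to the detector. Their combined effect can be illustrated with a plain-Python greedy non-maximum suppression: drop low-confidence boxes, keep the highest-scoring box, discard boxes that overlap a kept one by more than `nms_thresh`, and stop once `max_boxes_num` boxes are kept. This is a class-agnostic sketch for illustration, not the post-processing code inside `libs.YOLO`.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, conf_thresh=0.5, nms_thresh=0.45, max_boxes_num=50):
    """Return indices of the boxes kept by greedy NMS, best score first."""
    order = sorted(
        (i for i, s in enumerate(scores) if s >= conf_thresh),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep the box only if it does not overlap an already kept box too much
        if all(iou(boxes[i], boxes[j]) <= nms_thresh for j in keep):
            keep.append(i)
            if len(keep) == max_boxes_num:
                break
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.7, 0.3]
print(nms(boxes, scores))  # prints [0, 2]
```

Raising `conf_thresh` removes weak detections before suppression, while lowering `nms_thresh` merges overlapping boxes more aggressively.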
Deploying the Model for Video Inference#
For video inference, refer to the following code and adjust the variables defined in __main__ to match your setup:
from libs.PipeLine import PipeLine
from libs.YOLO import YOLOv5
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify it to your own model path, label names, and model input size
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]
    # Add display mode, default is hdmi, optional hdmi/lcd/lt9611/st7701/hx8399/nt35516/nt35532/gc9503/aml020t/jd9852/ili9806/virt; among them, hdmi corresponds to lt9611 by default, and lcd corresponds to st7701 by default
    display_mode="lcd"
    # Display resolution, None means using the current default resolution of the display; when using virt, it can be manually set here, such as [800, 480]
    display_size=None
    rgb888p_size=[640,360]
    confidence_threshold = 0.8
    nms_threshold=0.45
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode, display_size=display_size)
    # Create PipeLine, you can pass in sensor_id to select the camera as needed, for example, pl.create(sensor_id=2)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize the YOLOv5 instance
    yolo=YOLOv5(task_type="detect",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    while True:
        with ScopedTiming("total",1):
            img=pl.get_frame()
            res=yolo.run(img)
            yolo.draw_result(res,pl.osd_img)
            pl.show_image()
            gc.collect()
    yolo.deinit()
    pl.destroy()
YOLOv5 Fruit Segmentation#
YOLOv5 Source Code and Training Environment Setup#
To set up the YOLOv5 training environment, refer to the ultralytics/yolov5 repository on GitHub (YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite):
git clone https://github.com/ultralytics/yolov5.git
cd yolov5
pip install -r requirements.txt
If you have already set up the environment, please ignore this step.
Training Data Preparation#
Please download the provided sample dataset. The sample dataset contains classification, detection, and segmentation datasets for three types of fruits (apple, banana, orange) as scenes. Extract the dataset into the yolov5 directory and use fruits_seg as the dataset for the fruit segmentation task. The sample dataset also includes a rotated object detection desktop pen scene dataset yolo_pen_obb and a license plate keypoint dataset car_plate. These two tasks are not supported in the YOLOv5 module of the k230.
If you want to use your own dataset for training, you can download X-AnyLabeling to complete the annotation. Convert the annotated data to the training data format officially supported by yolov5 for subsequent training.
cd yolov5
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip
If you have already downloaded the data, please ignore this step.
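For a custom segmentation dataset, each annotated polygon becomes one label line: the class index followed by the polygon vertices as normalized x y pairs. A minimal sketch of that conversion, assuming the annotation tool exports vertices in pixel coordinates; the function name is hypothetical.

```python
def to_yolo_seg_label(class_id, polygon, img_w, img_h):
    """Convert pixel polygon vertices [(x, y), ...] to a YOLO segmentation label line."""
    coords = []
    for x, y in polygon:
        # Normalize to [0, 1] and clamp vertices that fall slightly outside the image
        coords.append("%.6f" % min(max(x / img_w, 0.0), 1.0))
        coords.append("%.6f" % min(max(y / img_h, 0.0), 1.0))
    return " ".join([str(class_id)] + coords)


# A triangular orange (class 2) outline in a 640x480 image
print(to_yolo_seg_label(2, [(0, 0), (320, 0), (320, 240)], 640, 480))
# prints "2 0.000000 0.000000 0.500000 0.000000 0.500000 0.500000"
```

As with detection labels, one line is written per object into a `.txt` file named after the image.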
Using YOLOv5 to Train the Fruit Segmentation Model#
Execute the command in the yolov5 directory to train the three-class fruit segmentation model using yolov5:
python segment/train.py --weight yolov5n-seg.pt --cfg models/segment/yolov5n-seg.yaml --data datasets/fruits_seg.yaml --epochs 100 --batch-size 8 --imgsz 320 --device '0'
Converting the Fruit Segmentation kmodel#
Model conversion requires installing the following libraries in the training environment:
# linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires installing dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0
# windows platform: please install dotnet-7 yourself and add environment variables. Installing nncase via pip online is supported, but the nncase-kpu library needs to be installed offline. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and install using pip in the directory where nncase_kpu-2.*-py2.py3-none-win_amd64.whl is downloaded
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl
# In addition to nncase and nncase-kpu, other libraries used by the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36
Download the model conversion script package test_yolov5.zip and extract it into the yolov5 directory:
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolov5.zip
unzip test_yolov5.zip
According to the following commands, first export the model under runs/train-seg/exp/weights to an onnx model, and then convert it to a kmodel model:
python export.py --weight runs/train-seg/exp/weights/best.pt --imgsz 320 --batch 1 --include onnx
cd test_yolov5/segment
# Convert to kmodel. The onnx model path is user-defined, and the generated kmodel is in the same directory as the onnx model.
python to_kmodel.py --target k230 --model ../../runs/train-seg/exp/weights/best.onnx --dataset ../calibration_data --input_width 320 --input_height 320 --ptq_option 0
cd ../../
💡 Model Conversion Script (to_kmodel.py) Parameter Description:
| Parameter Name | Description | Explanation | Type |
|---|---|---|---|
| target | Target Platform | Options are k230/CPU, corresponding to the k230 chip | str |
| model | Model Path | Path of the ONNX model to be converted | str |
| dataset | Calibration Image Set | Image data used during model conversion, used in the quantization stage | str |
| input_width | Input Width | Width of the model input | int |
| input_height | Input Height | Height of the model input | int |
| ptq_option | Quantization Method | Selects the quantization strategy | 0/1/2/3/4/5 |
Deploying the Model on k230 Using MicroPython#
Flashing the Image and Installing CanMV IDE#
💡 Firmware Introduction: Please download the latest Daily Build firmware according to your development board type to ensure that the latest features are supported! Or use the latest code to compile the firmware yourself. For the tutorial, see: Firmware Compilation.
Download and install CanMV IDE (download link: CanMV IDE download), write and run code in the IDE.
Model File Copy#
Connect to the IDE and copy the converted model and test images to the path CanMV/data. This path can be customized; you just need to modify the corresponding path when writing the code.
YOLOv5 Module#
The YOLOv5 class integrates three tasks of YOLOv5, including classification (classify), detection (detect), and segmentation (segment); it supports two inference modes, including image and video stream (video); this class encapsulates the kmodel inference process of YOLOv5.
Import Method
from libs.YOLO import YOLOv5
Parameter Description
| Parameter Name | Description | Explanation | Type |
|---|---|---|---|
| task_type | Task Type | Supports three types of tasks; options are 'classify'/'detect'/'segment' | str |
| mode | Inference Mode | Supports two inference modes; options are 'image'/'video'. 'image' runs inference on an image, 'video' runs inference on the real-time video stream captured by the camera | str |
| kmodel_path | kmodel Path | Path of the kmodel copied to the development board | str |
| labels | Class Label List | Label names for the different classes | list[str] |
| rgb888p_size | Inference Frame Resolution | Resolution of the current inference frame, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | Input resolution used when training the YOLOv5 model, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video'; supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | Class confidence threshold for the classification task, object confidence threshold for detection and segmentation tasks, such as 0.5 | float [0~1] |
| nms_thresh | NMS Threshold | Non-maximum suppression threshold, required for detection and segmentation tasks | float [0~1] |
| mask_thresh | Mask Threshold | Binarization threshold for segmenting the object in the detection box during the segmentation task | float [0~1] |
| max_boxes_num | Maximum Number of Detection Boxes | Maximum number of detection boxes allowed to be returned for one frame | int |
| debug_mode | Debug Mode | Whether the timing function is enabled; 0 means no timing, 1 means timing | int [0/1] |
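The `mask_thresh` parameter decides which pixels inside a detection box count as part of the object. The sketch below shows the idea on a small grid of mask probabilities: pixels outside the box are cleared and pixels inside it are set to 1 only when their value exceeds the threshold. It is a plain-Python illustration, not the code inside `libs.YOLO`, and `binarize_mask` is a hypothetical name.

```python
def binarize_mask(mask, box, mask_thresh=0.5):
    """Binarize a 2D list of mask probabilities inside box = (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return [
        [
            1 if (x1 <= x < x2 and y1 <= y < y2 and value > mask_thresh) else 0
            for x, value in enumerate(row)
        ]
        for y, row in enumerate(mask)
    ]


mask = [
    [0.90, 0.20, 0.80],
    [0.60, 0.70, 0.10],
    [0.95, 0.40, 0.90],
]
# Only the top-left 2x2 region lies inside the detection box
print(binarize_mask(mask, (0, 0, 2, 2)))
# prints [[1, 0, 0], [1, 1, 0], [0, 0, 0]]
```

A higher `mask_thresh` yields tighter masks, while a lower one includes more uncertain pixels along the object boundary.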
Deploying the Model for Image Inference#
For image inference, refer to the following code and adjust the variables defined in __main__ to match your setup:
from libs.YOLO import YOLOv5
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify it to your own test image, model path, label names, and model input size.
    img_path="/data/test.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]
    confidence_threshold = 0.5
    nms_threshold=0.45
    mask_threshold=0.5
    img,img_ori=read_image(img_path)
    # Inference frame size as [width, height], taken from the loaded image
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize the YOLOv5 instance
    yolo=YOLOv5(task_type="segment",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,mask_thresh=mask_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
Deploying the Model for Video Inference#
For video inference, refer to the following code and adjust the variables defined in __main__ to match your setup:
from libs.PipeLine import PipeLine
from libs.YOLO import YOLOv5
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify it to your own model path, label names, and model input size.
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]
    # Add display mode, default is hdmi, optional hdmi/lcd/lt9611/st7701/hx8399. hdmi defaults to lt9611, resolution 1920*1080; lcd defaults to st7701, resolution 800*480
    display_mode="lcd"
    rgb888p_size=[320,320]
    confidence_threshold = 0.5
    nms_threshold=0.45
    mask_threshold=0.5
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize the YOLOv5 instance
    yolo=YOLOv5(task_type="segment",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,mask_thresh=mask_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    while True:
        with ScopedTiming("total",1):
            img=pl.get_frame()
            res=yolo.run(img)
            yolo.draw_result(res,pl.osd_img)
            pl.show_image()
            gc.collect()
    yolo.deinit()
    pl.destroy()
YOLOv8 Fruit Classification#
YOLOv8 Source Code and Training Environment Setup#
To set up the YOLOv8 training environment, refer to the ultralytics/ultralytics repository on GitHub (Ultralytics YOLO 🚀):
# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics
If you have already set up the environment, please ignore this step.
Training Data Preparation#
You can first create a new folder yolov8. Please download the provided sample dataset. The sample dataset contains classification, detection, and segmentation datasets for three types of fruits (apple, banana, orange) as scenarios. Extract the dataset to the yolov8 directory. Please use fruits_cls as the dataset for the fruit classification task. The sample dataset also contains a rotated object detection desktop pen scene dataset yolo_pen_obb and a license plate keypoint dataset car_plate.
If you want to use your own dataset for training, you can download X-AnyLabeling to complete annotation. Classification task data does not need to be annotated with tools, just organize the directories according to the format. Convert the annotated data into the training data format officially supported by yolov8 for subsequent training.
cd yolov8
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip
If you have already downloaded the data, please ignore this step.
Using YOLOv8 to Train the Fruit Classification Model#
Execute the command in the yolov8 directory to train the three-class fruit classification model using yolov8:
yolo classify train data=datasets/fruits_cls model=yolov8n-cls.pt epochs=100 imgsz=224
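The same training run can also be started from Python through the `ultralytics` package instead of the `yolo` command line. The sketch below mirrors the arguments of the command above; the wrapper function `train_fruit_classifier` is a hypothetical convenience, and the import is done lazily so the module can be loaded on machines where `ultralytics` is not installed.

```python
def train_fruit_classifier(data="datasets/fruits_cls", weights="yolov8n-cls.pt",
                           epochs=100, imgsz=224):
    """Train a YOLOv8 classification model with the ultralytics Python API."""
    # Imported here so that merely defining the function does not require ultralytics
    from ultralytics import YOLO

    model = YOLO(weights)  # load the pretrained classification weights
    return model.train(data=data, epochs=epochs, imgsz=imgsz)


if __name__ == "__main__":
    train_fruit_classifier()
```

Run it from the yolov8 directory so that the relative dataset path resolves the same way as for the command-line call.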
Converting the Fruit Classification kmodel#
Model conversion requires installing the following libraries in the training environment:
# linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires installing dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0
# windows platform: please install dotnet-7 yourself and add environment variables. Online installation of nncase via pip is supported, but the nncase-kpu library requires offline installation. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and use pip to install in the download directory of nncase_kpu-2.*-py2.py3-none-win_amd64.whl
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl
# In addition to nncase and nncase-kpu, the other libraries used by the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36
Download the model conversion script package test_yolov8.zip and extract it into the yolov8 directory:
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolov8.zip
unzip test_yolov8.zip
Follow the commands below to first export the pt model under runs/classify/train/weights as an onnx model, and then convert it to a kmodel model:
# Export onnx, please choose the pt model path yourself
yolo export model=runs/classify/train/weights/best.pt format=onnx imgsz=224
cd test_yolov8/classify
# Convert kmodel, please choose the onnx model path yourself, the generated kmodel is in the same directory as the onnx model
python to_kmodel.py --target k230 --model ../../runs/classify/train/weights/best.onnx --dataset ../calibration_data --input_width 224 --input_height 224 --ptq_option 0
cd ../../
💡 Model Conversion Script (to_kmodel.py) Parameter Description:
| Parameter Name | Description | Explanation | Type |
|---|---|---|---|
| target | Target Platform | Options are k230/CPU, corresponding to the k230 chip; | str |
| model | Model Path | The path of the ONNX model to be converted; | str |
| dataset | Calibration Image Set | Image data used during model conversion, in the quantization stage; | str |
| input_width | Input Width | The width of the model input; | int |
| input_height | Input Height | The height of the model input; | int |
| ptq_option | Quantization Method | Selects the quantization strategy; | 0/1/2/3/4/5 |
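When the conversion is driven from a Python script instead of typed by hand, the documented flags can be assembled into an argument list. The sketch below only builds and prints the command; the helper name `build_to_kmodel_cmd` and its defaults are hypothetical and are not part of test_yolov8.

```python
import shlex


def build_to_kmodel_cmd(model, dataset, input_width, input_height,
                        target="k230", ptq_option=0):
    """Assemble the argument list for to_kmodel.py from the documented flags."""
    if ptq_option not in range(6):
        raise ValueError("ptq_option must be one of 0/1/2/3/4/5")
    return [
        "python", "to_kmodel.py",
        "--target", target,
        "--model", str(model),
        "--dataset", str(dataset),
        "--input_width", str(input_width),
        "--input_height", str(input_height),
        "--ptq_option", str(ptq_option),
    ]


cmd = build_to_kmodel_cmd(
    model="../../runs/classify/train/weights/best.onnx",
    dataset="../calibration_data",
    input_width=224,
    input_height=224,
)
# Print the command exactly as it would be typed in the shell.
print(shlex.join(cmd))
```

The same list can be passed to `subprocess.run(cmd, check=True)` from inside test_yolov8/classify to run the conversion.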
Deploying the Model on k230 Using MicroPython#
Flashing the Image and Installing CanMV IDE#
💡 Firmware Introduction: Download the latest Daily Build firmware for your development board type to ensure that the latest features are supported, or compile the firmware yourself from the latest code. For the tutorial, see: Firmware Compilation.
Download and install CanMV IDE (download link: CanMV IDE download), then write and run code in the IDE.
Model File Copy#
Connect the IDE and copy the converted model and test images to the CanMV/data directory. This path can be customized; just update the corresponding paths in your code.
YOLOv8 Module#
The YOLOv8 class integrates five tasks of YOLOv8, including classification (classify), detection (detect), segmentation (segment), rotated object detection (obb), and keypoint detection (pose); it supports two inference modes, including image and video stream; this class encapsulates the kmodel inference process of YOLOv8.
Import Method
from libs.YOLO import YOLOv8
Parameter Description
| Parameter Name | Description | Explanation | Type |
|---|---|---|---|
| task_type | Task Type | Supports five types of tasks; the options are 'classify'/'detect'/'segment'/'obb'/'pose'; | str |
| mode | Inference Mode | Supports two inference modes; the options are 'image'/'video'. 'image' means inferring an image, 'video' means inferring the real-time video stream captured by the camera; | str |
| kmodel_path | kmodel Path | The kmodel path copied to the development board; | str |
| labels | Class Label List | The label names of different categories; | list[str] |
| rgb888p_size | Inference Frame Resolution | The resolution of the current inference frame, such as [1920,1080], [1280,720], [640,640]; | list[int] |
| model_input_size | Model Input Resolution | The input resolution when training the YOLOv8 model, such as [224,224], [320,320], [640,640]; | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video'; supports hdmi ([1920,1080]) and lcd ([800,480]); | list[int] |
| conf_thresh | Confidence Threshold | The class confidence threshold for classification tasks, and the object confidence threshold for detection and segmentation tasks, such as 0.5; | float [0~1] |
| nms_thresh | NMS Threshold | Non-maximum suppression threshold, required for detection and segmentation tasks; | float [0~1] |
| mask_thresh | Mask Threshold | The binarization threshold for segmenting the object in the detection box in the segmentation task; | float [0~1] |
| kp_num | Number of Keypoints | The number of keypoints in the keypoint detection task; | int |
| kp_dim | Keypoint Dimension | The dimension of keypoints in the keypoint detection task; only 2 and 3 are supported, determined by the trained model; | int [2/3] |
| max_boxes_num | Maximum Number of Detection Boxes | The maximum number of detection boxes allowed to be returned in one frame of image; | int |
| debug_mode | Debug Mode | Whether the timing function takes effect; the options are 0/1, where 0 means no timing and 1 means timing; | int [0/1] |
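Because a mistyped task name or an out-of-range threshold only shows up once the script runs on the board, it can be useful to sanity-check the values on the PC first. The sketch below encodes the constraints from the table above. The helper `check_yolov8_args` and its default values are hypothetical; it is not part of libs.YOLO and does not describe how the class validates its arguments.

```python
VALID_TASKS = {"classify", "detect", "segment", "obb", "pose"}
VALID_MODES = {"image", "video"}


def check_yolov8_args(task_type, mode, conf_thresh=0.5, nms_thresh=0.45,
                      mask_thresh=0.5, kp_dim=2, display_size=None):
    """Return a list of problems; an empty list means the values look consistent."""
    problems = []
    if task_type not in VALID_TASKS:
        problems.append(f"unknown task_type: {task_type!r}")
    if mode not in VALID_MODES:
        problems.append(f"unknown mode: {mode!r}")
    # All three thresholds are documented as floats in the range 0~1.
    for name, value in (("conf_thresh", conf_thresh),
                        ("nms_thresh", nms_thresh),
                        ("mask_thresh", mask_thresh)):
        if not 0.0 <= value <= 1.0:
            problems.append(f"{name} must be within 0~1, got {value}")
    # kp_dim only matters for the keypoint task.
    if task_type == "pose" and kp_dim not in (2, 3):
        problems.append("kp_dim must be 2 or 3")
    # display_size is documented as being set in video mode.
    if mode == "video" and display_size is None:
        problems.append("display_size should be set in video mode")
    return problems


print(check_yolov8_args("classify", "image"))
print(check_yolov8_args("pose", "video", kp_dim=4))
```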
Deploying the Model to Implement Image Inference#
For image inference, refer to the following code and modify the parameter variables defined in __main__ to match your setup:
from libs.YOLO import YOLOv8
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example. Please modify it to your own test image, model path, label name, and model input size for custom scenarios
    img_path="/data/test_apple.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[224,224]

    confidence_threshold = 0.5
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize YOLOv8 instance
    yolo=YOLOv8(task_type="classify",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
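To make the role of conf_thresh in a classification task concrete, the following plain-Python sketch picks the top class and reports it only when its score clears the threshold. It assumes the model outputs raw scores that still need a softmax; if the exported model already outputs probabilities, that step is unnecessary. The function `classify_with_threshold` is a hypothetical illustration, not the post-processing used inside the YOLOv8 class.

```python
import math


def classify_with_threshold(scores, labels, conf_thresh):
    """Softmax the raw scores, then keep the top class only if it clears conf_thresh."""
    peak = max(scores)
    # Subtract the maximum before exponentiating for numerical stability.
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < conf_thresh:
        return None, probs[best]
    return labels[best], probs[best]


labels = ["apple", "banana", "orange"]
print(classify_with_threshold([2.0, 0.0, 0.0], labels, 0.5))
print(classify_with_threshold([2.0, 0.0, 0.0], labels, 0.8))
```

With the same scores, raising the threshold from 0.5 to 0.8 turns a confident "apple" into no result, which is the trade-off to keep in mind when tuning confidence_threshold.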
Deploying the Model to Implement Video Inference#
For video inference, refer to the following code and modify the variables defined in __main__ to match your setup:
from libs.PipeLine import PipeLine
from libs.YOLO import YOLOv8
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example. Please modify it to your own model path, label name, and model input size for custom scenarios
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[224,224]

    # Add display mode, default is hdmi, optional hdmi/lcd/lt9611/st7701/hx8399, where hdmi defaults to lt9611 with resolution 1920*1080; lcd defaults to st7701 with resolution 800*480
    display_mode="lcd"
    rgb888p_size=[640,360]
    confidence_threshold = 0.8
    # Initialize PipeLine
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize YOLOv8 instance
    yolo=YOLOv8(task_type="classify",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,debug_mode=0)
    yolo.config_preprocess()
    while True:
        with ScopedTiming("total",1):
            # Frame-by-frame inference
            img=pl.get_frame()
            res=yolo.run(img)
            yolo.draw_result(res,pl.osd_img)
            pl.show_image()
            gc.collect()
    yolo.deinit()
    pl.destroy()
YOLOv8 Fruit Detection#
YOLOv8 Source Code and Training Environment Setup#
For setting up the YOLOv8 training environment, please refer to ultralytics/ultralytics: Ultralytics YOLO 🚀 (github.com)
# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics
If you have already set up the environment, please ignore this step.
Training Data Preparation#
First create a new folder named yolov8, then download the provided example dataset. It includes classification, detection, and segmentation datasets for a three-fruit scenario (apple, banana, orange). Extract the dataset to the yolov8 directory and use fruits_yolo as the dataset for the fruit detection task. The example dataset also includes a rotated object detection dataset, yolo_pen_obb, for a desktop pen scenario, and a license plate keypoint dataset, car_plate.
If you want to use your own dataset for training, you can download X-AnyLabeling to complete annotation. Convert the annotated data into the training data format officially supported by yolov8 for subsequent training.
cd yolov8
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip
If you have already downloaded the data, please ignore this step.
Using YOLOv8 to Train the Fruit Detection Model#
Execute the command in the yolov8 directory to use yolov8 to train the three-class fruit detection model:
yolo detect train data=datasets/fruits_yolo.yaml model=yolov8n.pt epochs=300 imgsz=320
Converting the Fruit Detection kmodel#
Model conversion requires installing the following libraries in the training environment:
# linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires installing dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0
# windows platform: please install dotnet-7 by yourself and add environment variables. Online installation of nncase via pip is supported, but the nncase-kpu library needs to be installed offline. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and use pip to install in the download directory of nncase_kpu-2.*-py2.py3-none-win_amd64.whl
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl
# In addition to nncase and nncase-kpu, other libraries used by the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36
Download the model conversion script package test_yolov8.zip and extract it in the yolov8 directory:
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolov8.zip
unzip test_yolov8.zip
Use the commands below to export the pt model under runs/detect/train/weights to ONNX, then convert the ONNX model to a kmodel:
# Export onnx, please select the pt model path by yourself
yolo export model=runs/detect/train/weights/best.pt format=onnx imgsz=320
cd test_yolov8/detect
# Convert kmodel, please select the onnx model path by yourself. The generated kmodel is in the same directory as the onnx model
python to_kmodel.py --target k230 --model ../../runs/detect/train/weights/best.onnx --dataset ../calibration_data --input_width 320 --input_height 320 --ptq_option 0
cd ../../
💡 Model Conversion Script (to_kmodel.py) Parameter Description:
| Parameter Name | Description | Instructions | Type |
|---|---|---|---|
| target | Target Platform | Options are k230/CPU, corresponding to the k230 chip; | str |
| model | Model Path | The path of the ONNX model to be converted; | str |
| dataset | Calibration Image Set | The image data used during model conversion, in the quantization stage; | str |
| input_width | Input Width | The width of the model input; | int |
| input_height | Input Height | The height of the model input; | int |
| ptq_option | Quantization Method | Selects the quantization strategy; | 0/1/2/3/4/5 |
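The dataset argument points at the images used during quantization. When converting a model trained on your own data, one option is to fill a calibration folder with a small random sample of training images. The sketch below does that under two assumptions that the text above does not confirm: that to_kmodel.py reads image files directly from the given directory, and that a few dozen images are enough. The helper name `sample_calibration_images` and the default `count=20` are illustrative only.

```python
import random
import shutil
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def sample_calibration_images(src_dir, dst_dir, count=20, seed=0):
    """Copy up to `count` randomly chosen images from src_dir (recursively) into dst_dir."""
    images = sorted(
        p for p in Path(src_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    # A fixed seed keeps the selection reproducible between runs.
    rng = random.Random(seed)
    chosen = rng.sample(images, min(count, len(images)))
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in chosen:
        # Files with the same name in different sub-folders overwrite each other here.
        target = dst / src.name
        shutil.copy(src, target)
        copied.append(target)
    return copied
```

Calling `sample_calibration_images("datasets/my_data/images/train", "test_yolov8/my_calibration")` (both paths hypothetical) would produce a folder that can then be passed as `--dataset`.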
Deploying the Model on k230 Using MicroPython#
Flashing the Image and Installing CanMV IDE#
💡 Firmware Introduction: Download the latest Daily Build firmware for your development board type to ensure that the latest features are supported, or compile the firmware yourself from the latest code. For the tutorial, see: Firmware Compilation.
Download and install CanMV IDE (download link: CanMV IDE download), then write and run the code in the IDE.
Copying Model Files#
Connect the IDE and copy the converted model and test images to the CanMV/data directory. This path can be customized; you only need to update the corresponding paths in your code.
YOLOv8 Module#
The YOLOv8 class integrates five tasks of YOLOv8, including classification (classify), detection (detect), segmentation (segment), rotated object detection (obb), and keypoint detection (pose); supports two inference modes, including image and video stream (video); this class encapsulates the kmodel inference process of YOLOv8.
Import Method
from libs.YOLO import YOLOv8
Parameter Description
| Parameter Name | Description | Instructions | Type |
|---|---|---|---|
| task_type | Task Type | Supports five types of tasks; options are 'classify'/'detect'/'segment'/'obb'/'pose'; | str |
| mode | Inference Mode | Supports two inference modes; options are 'image'/'video'. 'image' means inferring an image, 'video' means inferring the real-time video stream captured by the camera; | str |
| kmodel_path | kmodel Path | The kmodel path copied to the development board; | str |
| labels | Class Label List | Label names for different categories; | list[str] |
| rgb888p_size | Inference Frame Resolution | The resolution of the current frame for inference, such as [1920,1080], [1280,720], [640,640]; | list[int] |
| model_input_size | Model Input Resolution | The input resolution when training the YOLOv8 model, such as [224,224], [320,320], [640,640]; | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video'; supports hdmi ([1920,1080]) and lcd ([800,480]); | list[int] |
| conf_thresh | Confidence Threshold | The class confidence threshold for the classification task, and the object confidence threshold for the detection and segmentation tasks, such as 0.5; | float [0~1] |
| nms_thresh | NMS Threshold | Non-maximum suppression threshold, required for detection and segmentation tasks; | float [0~1] |
| mask_thresh | Mask Threshold | The binarization threshold for segmenting the object in the detection box in the segmentation task; | float [0~1] |
| kp_num | Number of Keypoints | The number of keypoints in the keypoint detection task; | int |
| kp_dim | Keypoint Dimension | The dimension of keypoints in the keypoint detection task. Only 2 and 3 are supported, determined by the trained model; | int [2/3] |
| max_boxes_num | Maximum Number of Detection Boxes | The maximum number of detection boxes allowed to be returned in one frame of image; | int |
| debug_mode | Debug Mode | Whether the timing function takes effect; options are 0/1, where 0 means no timing and 1 means timing; | int [0/1] |
Deploying the Model to Implement Image Inference#
For image inference, refer to the following code and modify the parameter variables defined in __main__ to match your setup:
from libs.YOLO import YOLOv8
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify it to your own test image, model path, label name, and model input size
    img_path="/data/test.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]

    confidence_threshold = 0.5
    nms_threshold=0.45
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize the YOLOv8 instance
    yolo=YOLOv8(task_type="detect",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
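The three detection parameters above (conf_thresh, nms_thresh, max_boxes_num) interact in a fixed order: low-score boxes are dropped, overlapping boxes are suppressed, and the survivors are capped. The plain-Python sketch below illustrates that order with a greedy, class-agnostic non-maximum suppression. It is a simplified illustration under the assumption that boxes are (x1, y1, x2, y2) tuples; the function names are hypothetical and it is not the post-processing code used by the YOLOv8 class.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(boxes, scores, conf_thresh, nms_thresh, max_boxes_num):
    """Return indices of kept boxes: score filter, greedy NMS, then a cap on the count."""
    # Step 1: keep boxes whose score clears conf_thresh, highest score first.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= conf_thresh),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Step 3: stop once max_boxes_num boxes have been kept.
        if len(keep) >= max_boxes_num:
            break
        # Step 2: drop a box that overlaps an already kept box too much.
        if all(iou(boxes[i], boxes[j]) <= nms_thresh for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.7, 0.3]
print(filter_detections(boxes, scores, conf_thresh=0.5, nms_thresh=0.45, max_boxes_num=50))
```

In this example the second box overlaps the first with an IoU of about 0.68, so it is suppressed, and the fourth box never reaches NMS because its score is below conf_thresh.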
Deploying the Model to Implement Video Inference#
For video inference, refer to the following code and modify the variables defined in __main__ to match your setup:
from libs.PipeLine import PipeLine
from libs.YOLO import YOLOv8
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify it to your own model path, label name, and model input size
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]

    # Add display mode, default is hdmi, options are hdmi/lcd/lt9611/st7701/hx8399. Among them, hdmi defaults to lt9611, resolution 1920*1080; lcd defaults to st7701, resolution 800*480
    display_mode="lcd"
    rgb888p_size=[640,360]
    confidence_threshold = 0.5
    nms_threshold=0.45
    # Initialize PipeLine
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize the YOLOv8 instance
    yolo=YOLOv8(task_type="detect",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    while True:
        with ScopedTiming("total",1):
            # Infer frame by frame
            img=pl.get_frame()
            res=yolo.run(img)
            yolo.draw_result(res,pl.osd_img)
            pl.show_image()
            gc.collect()
    yolo.deinit()
    pl.destroy()
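In video mode the model sees frames at rgb888p_size while results are drawn at display_size, so box coordinates have to be rescaled between the two. The sketch below shows that arithmetic for the values used above ([640,360] frames on an [800,480] lcd). It is only an illustration of why both sizes are passed to the class; the helper `scale_box_to_display` is hypothetical, and draw_result may handle the mapping differently.

```python
def scale_box_to_display(box, rgb888p_size, display_size):
    """Map (x1, y1, x2, y2) from inference-frame pixels to display pixels."""
    frame_w, frame_h = rgb888p_size
    disp_w, disp_h = display_size
    x1, y1, x2, y2 = box
    # Width and height are scaled independently because the two
    # resolutions need not share the same aspect ratio.
    return (
        round(x1 * disp_w / frame_w),
        round(y1 * disp_h / frame_h),
        round(x2 * disp_w / frame_w),
        round(y2 * disp_h / frame_h),
    )


print(scale_box_to_display((64, 36, 320, 180), [640, 360], [800, 480]))
```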
YOLOv8 Fruit Segmentation#
YOLOv8 Source Code and Training Environment Setup#
For setting up the YOLOv8 training environment, please refer to ultralytics/ultralytics: Ultralytics YOLO 🚀 (github.com)
# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics
If you have already set up the environment, please ignore this step.
Training Data Preparation#
First create a new folder named yolov8, then download the provided example dataset. It contains classification, detection, and segmentation datasets for a three-fruit scenario (apple, banana, orange). Extract the dataset to the yolov8 directory and use fruits_seg as the dataset for the fruit segmentation task. The example dataset also contains a rotated object detection dataset, yolo_pen_obb, for a desktop pen scenario, and a license plate keypoint dataset, car_plate.
If you want to use your own dataset for training, you can download X-AnyLabeling to complete the annotation. Manually convert the annotated data into the training data format officially supported by yolov8 for subsequent training.
cd yolov8
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip
If you have already downloaded the data, please ignore this step.
Using YOLOv8 to Train the Fruit Segmentation Model#
Execute the following command in the yolov8 directory to use yolov8 to train a three-class fruit segmentation model:
yolo segment train data=datasets/fruits_seg.yaml model=yolov8n-seg.pt epochs=100 imgsz=320
Converting Fruit Segmentation kmodel#
The model conversion requires installing the following libraries in the training environment:
# linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires installing dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0
# windows platform: please install dotnet-7 by yourself and add environment variables. Online pip installation of nncase is supported, but the nncase-kpu library requires offline installation. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and use pip to install in the download directory of nncase_kpu-2.*-py2.py3-none-win_amd64.whl
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl
# In addition to nncase and nncase-kpu, other libraries used by the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36
Download the model conversion script package test_yolov8.zip and extract it in the yolov8 directory:
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolov8.zip
unzip test_yolov8.zip
Use the commands below to export the pt model under runs/segment/train/weights to ONNX, then convert the ONNX model to a kmodel:
# Export onnx, please choose the pt model path yourself
yolo export model=runs/segment/train/weights/best.pt format=onnx imgsz=320
cd test_yolov8/segment
# Convert kmodel, please choose the onnx model path yourself, the generated kmodel is in the same directory level as the onnx model
python to_kmodel.py --target k230 --model ../../runs/segment/train/weights/best.onnx --dataset ../calibration_data --input_width 320 --input_height 320 --ptq_option 0
cd ../../
💡 Model Conversion Script (to_kmodel.py) Parameter Description:
| Parameter Name | Description | Instructions | Type |
|---|---|---|---|
| target | Target Platform | Options are k230/CPU, corresponding to the k230 chip; | str |
| model | Model Path | Path of the ONNX model to be converted; | str |
| dataset | Calibration Image Set | Image data used in model conversion, during the quantization phase; | str |
| input_width | Input Width | Width of the model input; | int |
| input_height | Input Height | Height of the model input; | int |
| ptq_option | Quantization Method | Selects the quantization strategy; | 0/1/2/3/4/5 |
Deploying the Model on k230 Using MicroPython#
Flashing the Image and Installing CanMV IDE#
💡 Firmware Introduction: Download the latest Daily Build firmware for your development board type to ensure that the latest features are supported, or compile the firmware yourself from the latest code. For the tutorial, see: Firmware Compilation.
Download and install CanMV IDE (download link: CanMV IDE download), then write and run code in the IDE.
Model File Copy#
Connect the IDE and copy the converted model and test images to the CanMV/data directory. This path can be customized; you only need to update the corresponding paths in your code.
YOLOv8 Module#
The YOLOv8 class integrates five tasks of YOLOv8, including classification (classify), detection (detect), segmentation (segment), rotated object detection (obb), and keypoint detection (pose); it supports two inference modes, including image (image) and video stream (video); this class encapsulates the kmodel inference process of YOLOv8.
Import Method
from libs.YOLO import YOLOv8
Parameter Description
| Parameter Name | Description | Instructions | Type |
|---|---|---|---|
| task_type | Task Type | Supports five types of tasks; options are 'classify'/'detect'/'segment'/'obb'/'pose'; | str |
| mode | Inference Mode | Supports two inference modes; options are 'image'/'video'. 'image' means inferring an image, 'video' means inferring the real-time video stream captured by the camera; | str |
| kmodel_path | kmodel Path | Path of the kmodel copied to the development board; | str |
| labels | Class Label List | Label names of different categories; | list[str] |
| rgb888p_size | Inference Frame Resolution | Current frame resolution for inference, such as [1920,1080], [1280,720], [640,640]; | list[int] |
| model_input_size | Model Input Resolution | Input resolution when training the YOLOv8 model, such as [224,224], [320,320], [640,640]; | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video'; supports hdmi ([1920,1080]) and lcd ([800,480]); | list[int] |
| conf_thresh | Confidence Threshold | Class confidence threshold for classification tasks, and object confidence threshold for detection and segmentation tasks, such as 0.5; | float [0~1] |
| nms_thresh | NMS Threshold | Non-maximum suppression threshold, required for detection and segmentation tasks; | float [0~1] |
| mask_thresh | Mask Threshold | Binarization threshold for segmenting objects in the detection box in the segmentation task; | float [0~1] |
| kp_num | Number of Keypoints | Number of keypoints in the keypoint detection task; | int |
| kp_dim | Keypoint Dimension | Dimension of keypoints in the keypoint detection task; only 2 and 3 are supported, determined by the trained model; | int [2/3] |
| max_boxes_num | Maximum Number of Detection Boxes | Maximum number of detection boxes allowed to be returned in one frame of image; | int |
| debug_mode | Debug Mode | Whether the timing function takes effect; options are 0/1, where 0 means no timing and 1 means timing; | int [0/1] |
Deploying the Model to Implement Image Inference#
For image inference, refer to the following code and modify the parameter variables defined in __main__ to match your setup:
from libs.YOLO import YOLOv8
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example, please modify the test image, model path, label name, and model input size for your custom scenario
    img_path="/data/test.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]

    confidence_threshold = 0.5
    nms_threshold=0.45
    mask_threshold=0.5
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize the YOLOv8 instance
    yolo=YOLOv8(task_type="segment",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,mask_thresh=mask_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
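mask_thresh is described as the binarization threshold for the object inside a detection box. The plain-Python sketch below shows what that means on a tiny grid of mask probabilities: pixels outside the box are cleared, and pixels inside it become 1 only when they exceed the threshold. The helper `binarize_mask_in_box`, the nested-list mask, and the half-open box convention are assumptions made for illustration, not the data layout used by the YOLOv8 class.

```python
def binarize_mask_in_box(mask, box, mask_thresh):
    """Return a 0/1 mask: 1 where the pixel is inside box and above mask_thresh."""
    x1, y1, x2, y2 = box  # box covers x1 <= x < x2 and y1 <= y < y2
    result = []
    for y, row in enumerate(mask):
        result.append([
            1 if (x1 <= x < x2 and y1 <= y < y2 and p > mask_thresh) else 0
            for x, p in enumerate(row)
        ])
    return result


mask = [
    [0.9, 0.9, 0.1],
    [0.6, 0.4, 0.8],
    [0.2, 0.7, 0.9],
]
for row in binarize_mask_in_box(mask, box=(0, 0, 2, 2), mask_thresh=0.5):
    print(row)
```

A lower mask_threshold grows the segmented region and a higher one shrinks it, which is the effect to look for when tuning the value on real images.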
Deploying the Model to Implement Video Inference#
For video inference, refer to the following code and modify the variables defined in __main__ to match your setup:
from libs.PipeLine import PipeLine
from libs.YOLO import YOLOv8
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image

if __name__=="__main__":
    # This is only an example, please modify the model path, label name, and model input size for your custom scenario
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]

    # Add display mode, default is hdmi, options are hdmi/lcd/lt9611/st7701/hx8399, where hdmi defaults to lt9611, resolution 1920*1080; lcd defaults to st7701, resolution 800*480
    display_mode="lcd"
    rgb888p_size=[320,320]
    confidence_threshold = 0.5
    nms_threshold=0.45
    mask_threshold=0.5
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize the YOLOv8 instance
    yolo=YOLOv8(task_type="segment",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,mask_thresh=mask_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    while True:
        with ScopedTiming("total",1):
            img=pl.get_frame()
            res=yolo.run(img)
            yolo.draw_result(res,pl.osd_img)
            pl.show_image()
            gc.collect()
    yolo.deinit()
    pl.destroy()
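The `with ScopedTiming("total",1):` line in the loop above times everything inside its block, and its second argument switches the timing on or off. For readers unfamiliar with the pattern, the sketch below is a desktop-Python stand-in showing how such a scoped timer can be written as a context manager. The class `ScopedTimer` is a hypothetical illustration and is not the ScopedTiming implementation from libs.Utils.

```python
import time


class ScopedTimer:
    """Context manager that measures its block and prints the result when enabled."""

    def __init__(self, name, enabled=1):
        self.name = name
        self.enabled = enabled
        self.elapsed_ms = None

    def __enter__(self):
        self.start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.elapsed_ms = (time.perf_counter() - self.start) * 1000
        if self.enabled:
            print(f"{self.name} took {self.elapsed_ms:.2f} ms")
        return False  # never swallow exceptions raised inside the block


with ScopedTimer("total", 1) as timer:
    checksum = sum(range(100000))  # placeholder for get_frame / run / draw_result
```

Because the measurement ends when the block exits, everything indented under the `with` statement, including `gc.collect()`, counts toward the reported time.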
YOLOv8 Rotated Object Detection#
YOLOv8 Source Code and Training Environment Setup#
For setting up the YOLOv8 training environment, please refer to ultralytics/ultralytics: Ultralytics YOLO 🚀 (github.com)
# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics
If you have already set up the environment, please ignore this step.
Training Data Preparation#
First create a new folder named yolov8, then download the provided sample dataset, which includes a rotated object detection dataset for a single-category (pen) scenario. Extract the dataset to the yolov8 directory and use yolo_pen_obb as the dataset for the rotated object detection task.
If you want to use your own dataset for training, you can download X-AnyLabeling to complete the annotation. Convert the annotated data into the training data format officially supported by yolov8 for subsequent training.
cd yolov8
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip
If you have already downloaded the data, please ignore this step.
Using YOLOv8 to Train the Rotated Object Detection Model#
Execute the following command in the yolov8 directory to train a single-class rotated object detection model with yolov8:
yolo obb train data=datasets/pen_obb.yaml model=yolov8n-obb.pt epochs=100 imgsz=320
Converting Rotated Object Detection kmodel#
Model conversion requires installing the following libraries in the training environment:
# linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0
# windows platform: please install dotnet-7 yourself and add environment variables. nncase can be installed online via pip, but the nncase-kpu library needs to be installed offline. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and use pip to install in the directory where nncase_kpu-2.*-py2.py3-none-win_amd64.whl is downloaded
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl
# In addition to nncase and nncase-kpu, other libraries used in the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36
Download the model conversion script package test_yolov8.zip and extract it in the yolov8 directory:
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolov8.zip
unzip test_yolov8.zip
Use the commands below to export the pt model under runs/obb/train/weights to ONNX, then convert the ONNX model to a kmodel:
# Export onnx, please select the pt model path yourself
yolo export model=runs/obb/train/weights/best.pt format=onnx imgsz=320
cd test_yolov8/obb
# Convert kmodel, please select the onnx model path yourself, the generated kmodel is in the same directory as the onnx model
python to_kmodel.py --target k230 --model ../../runs/obb/train/weights/best.onnx --dataset ../calibration_obb --input_width 320 --input_height 320 --ptq_option 0
cd ../../
💡 Model Conversion Script (to_kmodel.py) Parameter Description:
| Parameter Name | Description | Explanation | Type |
|---|---|---|---|
| target | Target Platform | Options are k230/CPU, corresponding to the k230 chip; | str |
| model | Model Path | The path of the ONNX model to be converted; | str |
| dataset | Calibration Image Set | The image data used in model conversion, in the quantization stage; | str |
| input_width | Input Width | The width of the model input; | int |
| input_height | Input Height | The height of the model input; | int |
| ptq_option | Quantization Method | Selects the quantization strategy; | 0/1/2/3/4/5 |
Deploying the Model on k230 Using MicroPython#
Flashing the Image and Installing CanMV IDE#
💡 Firmware Introduction: Download the latest Daily Build firmware for your development board type to ensure that the latest features are supported, or compile the firmware yourself from the latest code. For the tutorial, see: Firmware Compilation.
Download and install CanMV IDE (download link: CanMV IDE download), then write and run code in the IDE.
Model File Copy#
Connect the IDE and copy the converted model and test images to the CanMV/data directory. This path can be customized; just update the corresponding paths in your code.
YOLOv8 Module#
The YOLOv8 class integrates five tasks of YOLOv8, including classification (classify), detection (detect), segmentation (segment), rotated object detection (obb), and keypoint detection (pose); it supports two inference modes, including image and video stream; this class encapsulates the YOLOv8 kmodel inference process.
Import Method
from libs.YOLO import YOLOv8
Parameter Description
| Parameter Name | Description | Explanation | Type |
|---|---|---|---|
| task_type | Task Type | Supports five task types; options are 'classify'/'detect'/'segment'/'obb'/'pose' | str |
| mode | Inference Mode | Supports two inference modes; options are 'image'/'video'. 'image' runs inference on an image, 'video' runs inference on the real-time video stream captured by the camera | str |
| kmodel_path | kmodel Path | Path of the kmodel copied to the development board | str |
| labels | Category Label List | Label names for the different categories | list[str] |
| rgb888p_size | Inference Frame Resolution | Resolution of the current inference frame, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | Input resolution used when the YOLOv8 model was trained, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video'; supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | Category confidence threshold for classification tasks, object confidence threshold for detection and segmentation tasks, such as 0.5 | float [0~1] |
| nms_thresh | NMS Threshold | Non-maximum suppression threshold, required for detection and segmentation tasks | float [0~1] |
| mask_thresh | Mask Threshold | Binarization threshold for segmenting the object inside a detection box in segmentation tasks | float [0~1] |
| kp_num | Number of Keypoints | Number of keypoints in the keypoint detection task | int |
| kp_dim | Keypoint Dimension | Dimension of each keypoint in the keypoint detection task; only 2 and 3 are supported, determined by the trained model | int [2/3] |
| max_boxes_num | Maximum Number of Detection Boxes | Maximum number of detection boxes returned for one frame | int |
| debug_mode | Debug Mode | Whether timing is enabled; options are 0/1, 0 means no timing, 1 means timing | int [0/1] |
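The value ranges in the parameter table can be checked on the PC side before a script is copied to the board. The following is a minimal sketch under stated assumptions: check_yolo_args is a hypothetical helper written for this illustration and is not part of libs.YOLO; it only encodes the options and ranges listed in the table.

```python
# Hypothetical helper, not part of libs.YOLO: it only mirrors the options
# and ranges listed in the parameter table.
TASK_TYPES = ("classify", "detect", "segment", "obb", "pose")
MODES = ("image", "video")

def check_yolo_args(task_type, mode, conf_thresh=0.5, nms_thresh=0.45, kp_dim=2):
    """Return a list of problems; an empty list means the values look valid."""
    problems = []
    if task_type not in TASK_TYPES:
        problems.append("task_type must be one of " + "/".join(TASK_TYPES))
    if mode not in MODES:
        problems.append("mode must be one of " + "/".join(MODES))
    for name, value in (("conf_thresh", conf_thresh), ("nms_thresh", nms_thresh)):
        if not 0.0 <= value <= 1.0:
            problems.append(name + " must be within 0~1")
    if task_type == "pose" and kp_dim not in (2, 3):
        problems.append("kp_dim must be 2 or 3")
    return problems

print(check_yolo_args("obb", "image", conf_thresh=0.1, nms_thresh=0.6))
print(check_yolo_args("rotate", "image"))
```

The first call prints an empty list; the second reports the unknown task type.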
Deploying the Model for Image Inference#
For image inference, refer to the following code and modify the parameter variables defined in __main__ to match your setup:
from libs.YOLO import YOLOv8
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is just an example. For custom scenarios, change the test image path, model path, label names, and model input size
    img_path="/data/test_obb.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ['pen']
    model_input_size=[320,320]
    confidence_threshold = 0.1
    nms_threshold=0.6
    img,img_ori=read_image(img_path)
    # [width, height] of the test image
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize YOLOv8 instance
    yolo=YOLOv8(task_type="obb",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,max_boxes_num=100,debug_mode=0)
    yolo.config_preprocess()
    # Run inference and draw the results on the original image
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
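The listing above builds rgb888p_size from img.shape[2] and img.shape[1]. The index order suggests that the array returned by read_image is laid out channel-first (channels, height, width); that layout is an assumption here, and size_from_chw_shape is a hypothetical name used only to spell the mapping out:

```python
def size_from_chw_shape(shape):
    """Map a (channels, height, width) shape to [width, height]."""
    channels, height, width = shape
    return [width, height]

# A 1280x720 RGB image stored channel-first has shape (3, 720, 1280).
print(size_from_chw_shape((3, 720, 1280)))
```

Under that assumption the call prints [1280, 720], which is the [width, height] order used for rgb888p_size.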
Deploying the Model for Video Inference#
For video inference, refer to the following code and modify the variables defined in __main__ to match your setup:
from libs.PipeLine import PipeLine
from libs.YOLO import YOLOv8
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is just an example. For custom scenarios, change the model path, label names, and model input size
    kmodel_path="/data/best_yolov8n.kmodel"
    labels = ['pen']
    model_input_size=[320,320]
    # Display mode, default is hdmi; options are hdmi/lcd/lt9611/st7701/hx8399. hdmi defaults to lt9611 at 1920*1080; lcd defaults to st7701 at 800*480
    display_mode="lcd"
    rgb888p_size=[640,360]
    confidence_threshold = 0.1
    nms_threshold=0.6
    # Initialize PipeLine
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize YOLOv8 instance
    yolo=YOLOv8(task_type="obb",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    while True:
        with ScopedTiming("total",1):
            # Frame-by-frame inference
            img=pl.get_frame()
            res=yolo.run(img)
            yolo.draw_result(res,pl.osd_img)
            pl.show_image()
            gc.collect()
    yolo.deinit()
    pl.destroy()
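In video mode, inference runs on frames of rgb888p_size while the results are drawn on an overlay of display_size, so coordinates have to be mapped between the two resolutions. The sketch below shows one straightforward mapping with independent horizontal and vertical scale factors. Whether draw_result does exactly this is not shown in this document; frame_to_display is a hypothetical helper for illustration only.

```python
def frame_to_display(point, rgb888p_size, display_size):
    """Scale an (x, y) point from inference-frame pixels to display pixels."""
    x, y = point
    # Integer arithmetic keeps the result exact and avoids float rounding.
    disp_x = x * display_size[0] // rgb888p_size[0]
    disp_y = y * display_size[1] // rgb888p_size[1]
    return (disp_x, disp_y)

# Centre of a 640x360 frame mapped onto the 800x480 lcd.
print(frame_to_display((320, 180), [640, 360], [800, 480]))
```

With the values from the listing above, the frame centre (320, 180) maps to (400, 240), the centre of the lcd.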
YOLOv8 License Plate Corner Point Detection#
YOLOv8 Source Code and Training Environment Setup#
For setting up the YOLOv8 training environment, please refer to ultralytics/ultralytics: Ultralytics YOLO 🚀 (github.com)
# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics
If you have already set up the environment, please ignore this step.
Training Data Preparation#
You can first create a new folder named yolov8. Download the provided sample dataset, which includes a dataset for detecting the four corner points of a license plate. Extract it to the yolov8 directory and use car_plate as the dataset for the keypoint detection task.
If you want to use your own dataset for training, you can download X-AnyLabeling to complete annotation. Convert the annotated data into the training data format officially supported by yolov8 for subsequent training.
cd yolov8
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip
If you have already downloaded the data, please ignore this step.
Using YOLOv8 to Train a Keypoint Detection Model#
Execute the following command in the yolov8 directory to train a single-class keypoint detection model using yolov8:
yolo pose train data=datasets/car_plate.yaml model=yolov8n-pose.pt epochs=100 imgsz=320
Converting Keypoint Detection kmodel#
Model conversion requires installing the following libraries in the training environment:
# linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires dotnet-7 to be installed
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0
# windows platform: please install dotnet-7 by yourself and add environment variables. Online installation of nncase via pip is supported, but the nncase-kpu library needs to be installed offline. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and use pip to install in the directory where nncase_kpu-2.*-py2.py3-none-win_amd64.whl is downloaded
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl
# In addition to nncase and nncase-kpu, other libraries used by the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36
Download the model conversion script package test_yolov8.zip and extract it to the yolov8 directory:
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolov8.zip
unzip test_yolov8.zip
Follow the commands below to first export the pt model under runs/pose/train/weights to an onnx model, and then convert it to a kmodel model:
# Export onnx, please choose the pt model path by yourself
yolo export model=runs/pose/train/weights/best.pt format=onnx imgsz=320
cd test_yolov8/pose
# Convert kmodel, please choose the onnx model path by yourself. The generated kmodel will be in the same directory as the onnx model
python to_kmodel.py --target k230 --model ../../runs/pose/train/weights/best.onnx --dataset ../calibration_pose --input_width 320 --input_height 320 --ptq_option 0
cd ../../
💡 Model Conversion Script (to_kmodel.py) Parameter Description:
| Parameter Name | Description | Explanation | Type |
|---|---|---|---|
| target | Target Platform | Options are k230/CPU; k230 corresponds to the k230 chip | str |
| model | Model Path | Path of the ONNX model to be converted | str |
| dataset | Calibration Image Set | Image data used during model conversion, in the quantization stage | str |
| input_width | Input Width | Width of the model input | int |
| input_height | Input Height | Height of the model input | int |
| ptq_option | Quantization Method | Quantization strategy to use | 0/1/2/3/4/5 |
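When the conversion is repeated for several models, it can help to assemble the to_kmodel.py command line from Python instead of retyping it. A minimal sketch follows, using only the flags listed in the table above; build_to_kmodel_cmd is a hypothetical helper name, not something shipped in test_yolov8.zip.

```python
def build_to_kmodel_cmd(model, dataset, input_width, input_height,
                        target="k230", ptq_option=0):
    """Assemble the to_kmodel.py command line as an argument list."""
    return [
        "python", "to_kmodel.py",
        "--target", target,
        "--model", model,
        "--dataset", dataset,
        "--input_width", str(input_width),
        "--input_height", str(input_height),
        "--ptq_option", str(ptq_option),
    ]

cmd = build_to_kmodel_cmd("../../runs/pose/train/weights/best.onnx",
                          "../calibration_pose", 320, 320)
print(" ".join(cmd))
```

The printed line matches the conversion command shown earlier. The list can be passed to subprocess.run(cmd, check=True) from inside the test_yolov8/pose directory.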
Deploying the Model on k230 Using MicroPython#
Flashing the Image and Installing CanMV IDE#
💡 Firmware Introduction: Please download the latest Daily Build firmware according to your development board type to ensure that the latest features are supported! Or use the latest code to compile the firmware yourself. For the tutorial, see: Firmware Compilation.
Download and install CanMV IDE (Download link: CanMV IDE download), and write and run code in the IDE.
Model File Copy#
Connect the IDE, and copy the converted model and test images to the path CanMV/data. This path can be customized; you only need to modify the corresponding path when writing code.
YOLOv8 Module#
The YOLOv8 class integrates five tasks of YOLOv8, including classification (classify), detection (detect), segmentation (segment), rotated object detection (obb), and keypoint detection (pose); it supports two inference modes, including image and video stream; this class encapsulates the kmodel inference process of YOLOv8.
Import Method
from libs.YOLO import YOLOv8
Parameter Description
| Parameter Name | Description | Explanation | Type |
|---|---|---|---|
| task_type | Task Type | Supports five tasks; options are 'classify'/'detect'/'segment'/'obb'/'pose' | str |
| mode | Inference Mode | Supports two inference modes; options are 'image'/'video'. 'image' means inference on an image, 'video' means inference on a real-time video stream captured by the camera | str |
| kmodel_path | kmodel Path | Path of the kmodel copied to the development board | str |
| labels | Class Label List | Label names of the different classes | list[str] |
| rgb888p_size | Inference Frame Resolution | Resolution of the current frame being inferred, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | Input resolution used when the YOLOv8 model was trained, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video'; supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | Class confidence threshold for classification tasks, object confidence threshold for detection and segmentation tasks, such as 0.5 | float [0~1] |
| nms_thresh | NMS Threshold | Non-maximum suppression threshold, required for detection and segmentation tasks | float [0~1] |
| mask_thresh | Mask Threshold | Binarization threshold for segmenting the object inside a detection box in the segmentation task | float [0~1] |
| kp_num | Number of Keypoints | Number of keypoints in the keypoint detection task | int |
| kp_dim | Keypoint Dimension | Dimension of each keypoint in the keypoint detection task; only 2 and 3 are supported, determined by the trained model | int [2/3] |
| max_boxes_num | Maximum Number of Detection Boxes | Maximum number of detection boxes returned for one frame | int |
| debug_mode | Debug Mode | Whether timing is enabled; options are 0/1, 0 means no timing, 1 means timing | int [0/1] |
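kp_num and kp_dim together describe how many values belong to one detected object: kp_num keypoints with kp_dim values each. The sketch below groups a flat list of values into per-keypoint tuples. The flat layout is an assumption made for illustration, since the module's internal output format is not described here, and group_keypoints is a hypothetical helper.

```python
def group_keypoints(flat, kp_num, kp_dim):
    """Split a flat list of kp_num*kp_dim values into per-keypoint tuples."""
    if kp_dim not in (2, 3):
        raise ValueError("kp_dim must be 2 or 3")
    if len(flat) != kp_num * kp_dim:
        raise ValueError("expected %d values, got %d" % (kp_num * kp_dim, len(flat)))
    return [tuple(flat[i * kp_dim:(i + 1) * kp_dim]) for i in range(kp_num)]

# Four plate corners with (x, y) only, i.e. kp_num=4 and kp_dim=2.
corners = group_keypoints([10, 20, 110, 20, 110, 60, 10, 60], kp_num=4, kp_dim=2)
print(corners)
```

For the license plate model, kp_num=4 and kp_dim=2 give four (x, y) corner points. With kp_dim=3, each keypoint carries a third value in addition to x and y.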
Deploying the Model for Image Inference#
For image inference, refer to the code below and modify the parameter variables defined in __main__ to match your setup:
from libs.YOLO import YOLOv8
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is only an example. For custom scenarios, change the test image path, model path, label names, model input size, number of keypoints, and keypoint dimension
    img_path="/data/test.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ['plate']
    model_input_size=[320,320]
    kp_num=4
    kp_dim=2
    confidence_threshold = 0.5
    nms_threshold=0.45
    img,img_ori=read_image(img_path)
    # [width, height] of the test image
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize YOLOv8 instance
    yolo=YOLOv8(task_type="pose",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,kp_num=kp_num,kp_dim=kp_dim,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,max_boxes_num=100,debug_mode=0)
    yolo.config_preprocess()
    # Run inference and draw the results on the original image
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
Deploying the Model for Video Inference#
For video inference, refer to the code below and modify the variables defined in __main__ to match your setup:
from libs.PipeLine import PipeLine
from libs.YOLO import YOLOv8
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is only an example. For custom scenarios, change the model path, label names, model input size, number of keypoints, and keypoint dimension
    kmodel_path="/data/best.kmodel"
    labels = ["plate"]
    model_input_size=[320,320]
    kp_num=4
    kp_dim=2
    # Display mode, default is hdmi; options are hdmi/lcd/lt9611/st7701/hx8399. hdmi defaults to lt9611 at 1920*1080; lcd defaults to st7701 at 800*480
    display_mode="lcd"
    rgb888p_size=[320,320]
    confidence_threshold = 0.5
    nms_threshold=0.45
    # Initialize PipeLine
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize YOLOv8 instance
    yolo=YOLOv8(task_type="pose",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,kp_num=kp_num,kp_dim=kp_dim,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    while True:
        with ScopedTiming("total",1):
            # Frame-by-frame inference
            img=pl.get_frame()
            res=yolo.run(img)
            yolo.draw_result(res,pl.osd_img)
            pl.show_image()
            gc.collect()
    yolo.deinit()
    pl.destroy()
YOLO11 Fruit Classification#
YOLO11 Source Code and Training Environment Setup#
For setting up the YOLO11 training environment, please refer to ultralytics/ultralytics: Ultralytics YOLO 🚀 (github.com)
# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics
If you have already set up the environment, please ignore this step.
Training Data Preparation#
You can first create a new folder named yolo11. Download the provided sample dataset, which contains classification, detection, and segmentation datasets for a scenario with three fruit classes (apple, banana, orange). Extract the dataset to the yolo11 directory and use fruits_cls as the dataset for the fruit classification task. The sample dataset also includes a rotated object detection dataset yolo_pen_obb for a desktop pen scenario, and a license plate keypoint detection dataset car_plate.
If you want to use your own dataset for training, you can download X-AnyLabeling to complete annotation. Classification task data does not need to be annotated using tools; you only need to organize the directory structure according to the format. Convert the annotated data to the training data format officially supported by yolo11 for subsequent training.
cd yolo11
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip
If you have already downloaded the data, please ignore this step.
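For classification, organizing the data by directory is enough. The exact layout of fruits_cls is not shown here, so the sketch below assumes the folder-per-class layout used by Ultralytics classification training: a train split and a val split, each with one subfolder per class. list_classes is a hypothetical helper that lists the class folders of one split; the demo tree is created in a temporary directory only to make the example runnable.

```python
import os
import tempfile

def list_classes(dataset_root, split="train"):
    """Return the sorted class folder names found under dataset_root/split."""
    split_dir = os.path.join(dataset_root, split)
    return sorted(
        name for name in os.listdir(split_dir)
        if os.path.isdir(os.path.join(split_dir, name))
    )

# Build a throwaway tree just to demonstrate the assumed layout.
demo_root = tempfile.mkdtemp()
for split in ("train", "val"):
    for cls in ("apple", "banana", "orange"):
        os.makedirs(os.path.join(demo_root, split, cls))

print(list_classes(demo_root))
```

Class indices are typically assigned from the sorted folder names, so the labels list used at deployment time should follow the same order.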
Using YOLO11 to Train the Fruit Classification Model#
Execute the following command in the yolo11 directory to train the three-class fruit classification model using yolo11:
yolo classify train data=datasets/fruits_cls model=yolo11n-cls.pt epochs=100 imgsz=224
Converting the Fruit Classification kmodel#
Model conversion requires installing the following libraries in the training environment:
# Linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires installing dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0
# Windows platform: Please install dotnet-7 by yourself and add environment variables. Online pip installation of nncase is supported, but the nncase-kpu library needs to be installed offline. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding Python environment, and use pip to install in the download directory of nncase_kpu-2.*-py2.py3-none-win_amd64.whl
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl
# In addition to nncase and nncase-kpu, other libraries used by the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36
Download the model conversion script package test_yolo11.zip and extract it to the yolo11 directory:
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolo11.zip
unzip test_yolo11.zip
Use the following commands to first export the pt model under runs/classify/train/weights to an onnx model, and then convert it to a kmodel:
# Export onnx, please select the pt model path yourself
yolo export model=runs/classify/train/weights/best.pt format=onnx imgsz=224
cd test_yolo11/classify
# Convert kmodel, please select the onnx model path yourself. The generated kmodel is in the same directory as the onnx model
python to_kmodel.py --target k230 --model ../../runs/classify/train/weights/best.onnx --dataset ../calibration_data --input_width 224 --input_height 224 --ptq_option 0
cd ../../
💡 Model Conversion Script (to_kmodel.py) Parameter Description:
| Parameter Name | Description | Instructions | Type |
|---|---|---|---|
| target | Target Platform | Options are k230/CPU; k230 corresponds to the k230 chip | str |
| model | Model Path | Path of the ONNX model to be converted | str |
| dataset | Calibration Image Set | Image data used during model conversion, in the quantization stage | str |
| input_width | Input Width | Width of the model input | int |
| input_height | Input Height | Height of the model input | int |
| ptq_option | Quantization Method | Quantization strategy to use | 0/1/2/3/4/5 |
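Because the generated kmodel is placed next to the ONNX model, its location can be derived from the ONNX path before copying it to the board. The sketch below assumes the kmodel keeps the ONNX file's stem and only changes the extension, which matches the best.kmodel name used in the deployment code but is not stated explicitly; expected_kmodel_path is a hypothetical helper.

```python
import posixpath

def expected_kmodel_path(onnx_path):
    """Same directory and file stem as the ONNX model, with a .kmodel extension (assumed)."""
    stem, _ext = posixpath.splitext(onnx_path)
    return stem + ".kmodel"

print(expected_kmodel_path("runs/classify/train/weights/best.onnx"))
```

If the conversion script in your copy of test_yolo11.zip names its output differently, adjust the helper accordingly.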
Deploying the Model on k230 Using MicroPython#
Flashing the Image and Installing CanMV IDE#
💡 Firmware Introduction: Please download the latest Daily Build firmware according to your development board type to ensure that the latest features are supported! Or use the latest code to compile the firmware yourself. For the tutorial, see: Firmware Compilation.
Download and install CanMV IDE (download link: CanMV IDE download), then write and run the code in the IDE.
Model File Copy#
Connect the IDE, and copy the converted model and test images to the CanMV/data directory. This path can be customized; you only need to modify the corresponding path when writing code.
YOLO11 Module#
The YOLO11 class integrates five YOLO11 tasks: classification (classify), detection (detect), segmentation (segment), rotated object detection (obb), and keypoint detection (pose). It supports two inference modes, image and video stream (video), and encapsulates the YOLO11 kmodel inference process.
Import Method
from libs.YOLO import YOLO11
Parameter Description
| Parameter Name | Description | Instructions | Type |
|---|---|---|---|
| task_type | Task Type | Supports five tasks; options are 'classify'/'detect'/'segment'/'obb'/'pose' | str |
| mode | Inference Mode | Supports two inference modes; options are 'image'/'video'. 'image' means inference on an image, 'video' means inference on the real-time video stream captured by the camera | str |
| kmodel_path | kmodel Path | Path of the kmodel copied to the development board | str |
| labels | Class Label List | Label names of the different classes | list[str] |
| rgb888p_size | Inference Frame Resolution | Resolution of the current inference frame, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | Input resolution used when the YOLO11 model was trained, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video'; supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | Class confidence threshold for classification tasks, object confidence threshold for detection and segmentation tasks, such as 0.5 | float [0~1] |
| nms_thresh | NMS Threshold | Non-maximum suppression threshold, required for detection and segmentation tasks | float [0~1] |
| mask_thresh | Mask Threshold | Binarization threshold for segmenting the object inside a detection box in the segmentation task | float [0~1] |
| kp_num | Number of Keypoints | Number of keypoints in the keypoint detection task | int |
| kp_dim | Keypoint Dimension | Dimension of each keypoint in the keypoint detection task; only 2 and 3 are supported, determined by the trained model | int [2/3] |
| max_boxes_num | Maximum Number of Detection Boxes | Maximum number of detection boxes returned for one frame | int |
| debug_mode | Debug Mode | Whether timing is enabled; options are 0/1, 0 means no timing, 1 means timing | int [0/1] |
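For the classification task, conf_thresh decides whether the best-scoring class is reported at all. A minimal sketch of that idea follows, assuming the model output has already been turned into one score per class in the range 0~1; pick_label is a hypothetical helper and not the module's actual post-processing.

```python
def pick_label(scores, labels, conf_thresh):
    """Return (label, score) for the best class, or None if it is below conf_thresh."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    if scores[best] < conf_thresh:
        return None
    return labels[best], scores[best]

labels = ["apple", "banana", "orange"]
print(pick_label([0.07, 0.90, 0.03], labels, conf_thresh=0.5))
print(pick_label([0.40, 0.35, 0.25], labels, conf_thresh=0.5))
```

The first call reports banana; the second returns None because no class reaches the threshold. Raising conf_thresh, as the video example does with 0.8, trades missed frames for fewer uncertain labels.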
Deploying the Model for Image Inference#
For image inference, refer to the following code and modify the parameter variables defined in __main__ to match your setup:
from libs.YOLO import YOLO11
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is just an example. For custom scenarios, change the test image path, model path, label names, and model input size
    img_path="/data/test_apple.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[224,224]
    confidence_threshold = 0.5
    img,img_ori=read_image(img_path)
    # [width, height] of the test image
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize YOLO11 instance
    yolo=YOLO11(task_type="classify",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,debug_mode=0)
    yolo.config_preprocess()
    # Run inference and draw the results on the original image
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
Deploying the Model for Video Inference#
For video inference, refer to the following code and modify the variables defined in __main__ to match your setup:
from libs.PipeLine import PipeLine
from libs.YOLO import YOLO11
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is just an example. For custom scenarios, change the model path, label names, and model input size
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[224,224]
    # Display mode, default is hdmi; options are hdmi/lcd/lt9611/st7701/hx8399. hdmi defaults to lt9611 at 1920*1080; lcd defaults to st7701 at 800*480
    display_mode="lcd"
    rgb888p_size=[640,360]
    confidence_threshold = 0.8
    # Initialize PipeLine
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize YOLO11 instance
    yolo=YOLO11(task_type="classify",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,debug_mode=0)
    yolo.config_preprocess()
    while True:
        with ScopedTiming("total",1):
            # Frame-by-frame inference
            img=pl.get_frame()
            res=yolo.run(img)
            yolo.draw_result(res,pl.osd_img)
            pl.show_image()
            gc.collect()
    yolo.deinit()
    pl.destroy()
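The video loop wraps each frame in ScopedTiming("total",1), which reports how long the block took when its second argument is 1. To show how such a scoped timer works, here is a desktop Python stand-in. It is a sketch written for this explanation, not the implementation in libs.Utils, and scoped_timing is a hypothetical name.

```python
import time
from contextlib import contextmanager

@contextmanager
def scoped_timing(name, enabled):
    """Print how long the with-block took when enabled is truthy."""
    start = time.perf_counter()
    try:
        yield
    finally:
        if enabled:
            elapsed_ms = (time.perf_counter() - start) * 1000
            print("%s took %.2f ms" % (name, elapsed_ms))

# Time a small piece of work; pass 0 instead of 1 to silence the report.
with scoped_timing("total", 1):
    sum(range(100000))
```

Since the measurement lives in the with-block, timing the whole frame or a single step is just a matter of where the block starts and ends.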
YOLO11 Fruit Detection#
YOLO11 Source Code and Training Environment Setup#
For setting up the YOLO11 training environment, please refer to ultralytics/ultralytics: Ultralytics YOLO 🚀 (github.com)
# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics
If you have already set up the environment, please ignore this step.
Training Data Preparation#
You can first create a new folder named yolo11. Download the provided sample dataset, which contains classification, detection, and segmentation datasets for a scenario with three types of fruits (apple, banana, orange). Extract the dataset into the yolo11 directory and use fruits_yolo as the dataset for the fruit detection task. The sample dataset also includes a rotated object detection dataset yolo_pen_obb for a desktop pen scenario, and a license plate keypoint detection dataset car_plate.
If you want to use your own dataset for training, you can download X-AnyLabeling to complete the annotation. Convert the annotated data into the training data format officially supported by yolo11 for subsequent training.
cd yolo11
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip
If you have already downloaded the data, please ignore this step.
Using YOLO11 to Train the Fruit Detection Model#
Execute the following command in the yolo11 directory to train the three-class fruit detection model using yolo11:
yolo detect train data=datasets/fruits_yolo.yaml model=yolo11n.pt epochs=300 imgsz=320
Convert Fruit Detection kmodel#
Model conversion requires installing the following libraries in the training environment:
# linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0
# windows platform: please install dotnet-7 by yourself and add environment variables. Online installation of nncase via pip is supported, but the nncase-kpu library requires offline installation. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and use pip to install in the nncase_kpu-2.*-py2.py3-none-win_amd64.whl download directory
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl
# In addition to nncase and nncase-kpu, the script also uses the following libraries:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36
Download the model conversion script package test_yolo11.zip and extract it into the yolo11 directory:
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolo11.zip
unzip test_yolo11.zip
Use the following commands to first export the pt model under runs/detect/train/weights to an onnx model, and then convert it to a kmodel:
# Export onnx, please choose the pt model path by yourself
yolo export model=runs/detect/train/weights/best.pt format=onnx imgsz=320
cd test_yolo11/detect
# Convert kmodel, please choose the onnx model path by yourself. The generated kmodel is in the same directory level as the onnx model
python to_kmodel.py --target k230 --model ../../runs/detect/train/weights/best.onnx --dataset ../calibration_data --input_width 320 --input_height 320 --ptq_option 0
cd ../../
💡 Model Conversion Script (to_kmodel.py) Parameter Description:
| Parameter Name | Description | Notes | Type |
|---|---|---|---|
| target | Target Platform | Options are k230/CPU; k230 corresponds to the k230 chip | str |
| model | Model Path | Path of the ONNX model to be converted | str |
| dataset | Calibration Image Set | Image data used during model conversion, in the quantization phase | str |
| input_width | Input Width | Width of the model input | int |
| input_height | Input Height | Height of the model input | int |
| ptq_option | Quantization Method | Quantization strategy to use | 0/1/2/3/4/5 |
Deploying the Model on k230 Using MicroPython#
Flash Image and Install CanMV IDE#
💡 Firmware Introduction: Please download the latest Daily Build firmware according to your development board type to ensure that the latest features are supported! Or use the latest code to compile the firmware by yourself. For the tutorial, see: Firmware Compilation.
Download and install CanMV IDE (download link: CanMV IDE download), then write and run code in the IDE.
Model File Copy#
Connect the IDE, and copy the converted model and test images to the path CanMV/data. This path can be customized, just modify the corresponding path when writing code.
YOLO11 Module#
The YOLO11 class integrates five YOLO11 tasks, including classification (classify), detection (detect), segmentation (segment), rotated object detection (obb), and keypoint detection (pose); it supports two inference modes, including image and video stream (video); this class encapsulates the YOLO11 kmodel inference process.
Import Method
from libs.YOLO import YOLO11
Parameter Description
| Parameter Name | Description | Notes | Type |
|---|---|---|---|
| task_type | Task Type | Supports five types of tasks. Options are 'classify'/'detect'/'segment'/'obb'/'pose' | str |
| mode | Inference Mode | Supports two inference modes. Options are 'image'/'video'. 'image' means inference on an image, 'video' means inference on the real-time video stream captured by the camera | str |
| kmodel_path | kmodel Path | Path of the kmodel copied to the development board | str |
| labels | Class Label List | Label names for the different classes | list[str] |
| rgb888p_size | Inference Frame Resolution | Resolution of the current inference frame, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | Input resolution used when the YOLO11 model was trained, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video'. Supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | Class confidence threshold for classification tasks, object confidence threshold for detection and segmentation tasks, e.g., 0.5 | float [0~1] |
| nms_thresh | NMS Threshold | Non-maximum suppression threshold, required for detection and segmentation tasks | float [0~1] |
| mask_thresh | Mask Threshold | Binarization threshold for segmenting objects within detection boxes in segmentation tasks | float [0~1] |
| kp_num | Number of Keypoints | Number of keypoints in the keypoint detection task | int |
| kp_dim | Keypoint Dimension | Dimension of each keypoint in the keypoint detection task. Only 2 and 3 are supported, determined by the trained model | int [2/3] |
| max_boxes_num | Maximum Number of Detection Boxes | Maximum number of detection boxes returned for one frame | int |
| debug_mode | Debug Mode | Whether timing is enabled. Options are 0/1. 0 means no timing, 1 means timing | int [0/1] |
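For detection, conf_thresh, nms_thresh, and max_boxes_num work together: low-confidence boxes are dropped, overlapping boxes are suppressed, and the number of returned boxes is capped. The sketch below shows a plain greedy, class-agnostic NMS to illustrate how the three values interact. It is an illustration under those assumptions, not the post-processing implemented in libs.YOLO; iou and filter_boxes are hypothetical helpers.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def filter_boxes(dets, conf_thresh, nms_thresh, max_boxes_num):
    """dets is a list of (box, score). Confidence filter, then greedy NMS."""
    candidates = sorted((d for d in dets if d[1] >= conf_thresh),
                        key=lambda d: d[1], reverse=True)
    kept = []
    for box, score in candidates:
        if len(kept) >= max_boxes_num:
            break
        # Keep a box only if it does not overlap a kept box too much.
        if all(iou(box, k[0]) <= nms_thresh for k in kept):
            kept.append((box, score))
    return kept

dets = [((10, 10, 110, 110), 0.90),
        ((12, 12, 112, 112), 0.80),    # overlaps the first box heavily
        ((200, 200, 260, 260), 0.70),
        ((0, 0, 20, 20), 0.30)]        # below conf_thresh
print(filter_boxes(dets, conf_thresh=0.5, nms_thresh=0.45, max_boxes_num=50))
```

With these values, the second box is suppressed by the first and the last one is removed by the confidence filter, so two boxes remain. A lower nms_thresh suppresses more aggressively; a higher one keeps more overlapping boxes.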
Deploy the Model to Implement Image Inference#
For image inference, refer to the following code and modify the parameter variables defined in __main__ to match your setup:
from libs.YOLO import YOLO11
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is only an example. For custom scenarios, change the test image path, model path, label names, and model input size
    img_path="/data/test.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]
    confidence_threshold = 0.5
    nms_threshold=0.45
    img,img_ori=read_image(img_path)
    # [width, height] of the test image
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize the YOLO11 instance
    yolo=YOLO11(task_type="detect",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    # Run inference and draw the results on the original image
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
Deploy the Model to Implement Video Inference#
For video inference, refer to the following code and modify the variables defined in __main__ to match your setup:
from libs.PipeLine import PipeLine
from libs.YOLO import YOLO11
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is only an example. For custom scenarios, change the model path, label names, and model input size
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]
    # Display mode, default is hdmi; options are hdmi/lcd/lt9611/st7701/hx8399. hdmi defaults to lt9611 at 1920*1080; lcd defaults to st7701 at 800*480
    display_mode="lcd"
    rgb888p_size=[640,360]
    confidence_threshold = 0.5
    nms_threshold=0.45
    # Initialize PipeLine
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize the YOLO11 instance
    yolo=YOLO11(task_type="detect",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    while True:
        with ScopedTiming("total",1):
            # Inference frame by frame
            img=pl.get_frame()
            res=yolo.run(img)
            yolo.draw_result(res,pl.osd_img)
            pl.show_image()
            gc.collect()
    yolo.deinit()
    pl.destroy()
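In the listing above, the deinit and destroy calls come after an endless loop, so whether they run depends on how the script is stopped. One common way to make sure cleanup always happens is a try/finally block around the loop. The following is a minimal sketch of that pattern with stand-in objects: FakeResource and run_loop are hypothetical and only record calls, they are not part of libs, and the simulated interrupt stands in for the user stopping the script.

```python
class FakeResource:
    """Stand-in for objects such as the model or the pipeline; records cleanup."""
    def __init__(self, name):
        self.name = name
        self.closed = False

    def close(self):
        self.closed = True

def run_loop(resources, max_frames):
    """Process frames until interrupted, then always release the resources."""
    frames = 0
    try:
        while True:
            frames += 1
            if frames >= max_frames:
                # Simulates the user stopping the script.
                raise KeyboardInterrupt
    except KeyboardInterrupt:
        pass
    finally:
        for res in resources:
            res.close()
    return frames

model, pipeline = FakeResource("yolo"), FakeResource("pl")
print(run_loop([model, pipeline], max_frames=3), model.closed, pipeline.closed)
```

Applied to the video script, the while loop would sit inside try and the deinit and destroy calls inside finally.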
YOLO11 Fruit Segmentation#
YOLO11 Source Code and Training Environment Setup#
For YOLO11 training environment setup, please refer to ultralytics/ultralytics: Ultralytics YOLO 🚀 (github.com)
# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics
If you have already set up the environment, please ignore this step.
Training Data Preparation#
You can first create a new folder named yolo11. Download the provided sample dataset, which contains classification, detection, and segmentation datasets for a scenario with three types of fruit (apple, banana, orange). Unzip the dataset into the yolo11 directory and use fruits_seg as the dataset for the fruit segmentation task. The sample dataset also includes a rotated object detection dataset, yolo_pen_obb, for a desktop pen scenario, and a license plate keypoint detection dataset, car_plate.
If you want to use your own dataset for training, you can download X-AnyLabeling to complete the annotation. Then convert the annotated data into the training data format officially supported by yolo11 for subsequent training.
cd yolo11
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip
If you have already downloaded the data, please ignore this step.
Using YOLO11 to Train the Fruit Segmentation Model#
Execute the command in the yolo11 directory to train the three-category fruit segmentation model using yolo11:
yolo segment train data=datasets/fruits_seg.yaml model=yolo11n-seg.pt epochs=100 imgsz=320
Converting Fruit Segmentation kmodel#
Model conversion requires installing the following libraries in the training environment:
# linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0
# windows platform: please install dotnet-7 yourself and add environment variables. pip can be used to install nncase online, but the nncase-kpu library needs to be installed offline. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and use pip to install in the download directory of nncase_kpu-2.*-py2.py3-none-win_amd64.whl
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl
# In addition to nncase and nncase-kpu, other libraries used by the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36
Download the model conversion script package test_yolo11.zip and unzip it into the yolo11 directory:
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolo11.zip
unzip test_yolo11.zip
Using the following commands, first export the pt model under runs/segment/train/weights to an ONNX model, then convert it to a kmodel:
# Export onnx, please choose the pt model path yourself
yolo export model=runs/segment/train/weights/best.pt format=onnx imgsz=320
cd test_yolo11/segment
# Convert kmodel, please choose the onnx model path yourself, the generated kmodel is in the same level directory as the onnx model
python to_kmodel.py --target k230 --model ../../runs/segment/train/weights/best.onnx --dataset ../calibration_data --input_width 320 --input_height 320 --ptq_option 0
cd ../../
💡 Model conversion script (to_kmodel.py) parameter description:
| Parameter Name | Description | Notes | Type |
|---|---|---|---|
| target | Target Platform | Optional values are k230/CPU; k230 corresponds to the k230 chip | str |
| model | Model Path | Path to the ONNX model to be converted | str |
| dataset | Calibration Image Set | Image data used during model conversion, in the quantization stage | str |
| input_width | Input Width | Width of the model input | int |
| input_height | Input Height | Height of the model input | int |
| ptq_option | Quantization Method | Selects the quantization strategy | 0/1/2/3/4/5 |
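If the conversion is run repeatedly, for example for several input sizes, the command line can be assembled from the documented flags in a small script. The following is a minimal sketch: build_to_kmodel_cmd is a hypothetical helper name, and only the flags listed above are used.

```python
def build_to_kmodel_cmd(onnx_path, dataset_dir, width, height,
                        target="k230", ptq_option=0):
    """Return the to_kmodel.py argument list for the flags documented above."""
    return [
        "python", "to_kmodel.py",
        "--target", target,
        "--model", str(onnx_path),
        "--dataset", str(dataset_dir),
        "--input_width", str(width),
        "--input_height", str(height),
        "--ptq_option", str(ptq_option),
    ]


cmd = build_to_kmodel_cmd(
    "../../runs/segment/train/weights/best.onnx",
    "../calibration_data",
    320,
    320,
)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) from inside the test_yolo11/segment directory; the width and height should match the imgsz used for training and export.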
Deploying Model on k230 Using MicroPython#
Flashing Image and Installing CanMV IDE#
💡 Firmware Introduction: Please download the latest Daily Build firmware according to your development board type to ensure that the latest features are supported! Or use the latest code to compile the firmware yourself. For the tutorial, see: Firmware Compilation.
Download and install CanMV IDE (download link: CanMV IDE download), write and run code in the IDE.
Copying Model Files#
Connect the IDE, and copy the converted model and test images to the path CanMV/data. This path can be customized; you only need to modify the corresponding path when writing code.
YOLO11 Module#
The YOLO11 class integrates five tasks of YOLO11, including classification (classify), detection (detect), segmentation (segment), rotated object detection (obb), and keypoint detection (pose); it supports two inference modes, including image and video stream; this class encapsulates the kmodel inference process of YOLO11.
Import Method
from libs.YOLO import YOLO11
Parameter Description
| Parameter Name | Description | Notes | Type |
|---|---|---|---|
| task_type | Task Type | Supports five types of tasks, optional values are 'classify'/'detect'/'segment'/'obb'/'pose' | str |
| mode | Inference Mode | Supports two inference modes, optional values are 'image'/'video'; 'image' means inference on an image, 'video' means inference on a real-time video stream captured by the camera | str |
| kmodel_path | kmodel Path | Path to the kmodel copied to the development board | str |
| labels | Class Label List | Label names for different categories | list[str] |
| rgb888p_size | Inference Frame Resolution | Resolution of the current inference frame, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | Input resolution when training the YOLO11 model, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video', supports hdmi([1920,1080]) and lcd([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | Category confidence threshold for classification tasks, target confidence threshold for detection and segmentation tasks, such as 0.5 | float [0~1] |
| nms_thresh | NMS Threshold | Non-maximum suppression threshold, required for detection and segmentation tasks | float [0~1] |
| mask_thresh | Mask Threshold | Binarization threshold for segmenting the object in the detection box in the segmentation task | float [0~1] |
| kp_num | Number of Keypoints | Number of keypoints in the keypoint detection task | int |
| kp_dim | Keypoint Dimension | Dimension of the keypoints in the keypoint detection task, only 2 and 3 are supported, determined by the training model | int [2/3] |
| max_boxes_num | Maximum Number of Detection Boxes | Maximum number of detection boxes allowed to return in one frame of image | int |
| debug_mode | Debug Mode | Whether the timing function is enabled, optional values are 0/1, 0 means no timing, 1 means timing | int [0/1] |
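To make conf_thresh, nms_thresh, and max_boxes_num more concrete, the following is a minimal sketch of confidence filtering followed by greedy non-maximum suppression in plain Python. It illustrates the general technique only: the box format (x1, y1, x2, y2, score) and the function names are assumptions, and the post-processing inside the YOLO11 class may differ.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2, ...)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(dets, conf_thresh=0.5, nms_thresh=0.45, max_boxes_num=50):
    """dets: list of (x1, y1, x2, y2, score). Returns the kept detections."""
    # 1. Drop low-confidence candidates.
    cands = [d for d in dets if d[4] >= conf_thresh]
    # 2. Visit candidates from highest to lowest score.
    cands.sort(key=lambda d: d[4], reverse=True)
    kept = []
    for d in cands:
        # 3. Keep a box only if it does not overlap an already kept box too much.
        if all(iou(d, k) <= nms_thresh for k in kept):
            kept.append(d)
        # 4. Stop once the per-frame limit is reached.
        if len(kept) == max_boxes_num:
            break
    return kept


dets = [
    (10, 10, 110, 110, 0.90),    # strong detection
    (12, 12, 112, 112, 0.80),    # near-duplicate of the first box
    (200, 200, 260, 260, 0.70),  # separate object
    (300, 300, 340, 340, 0.30),  # below conf_thresh
]
kept = nms(dets)
print(kept)
```

Lowering nms_thresh suppresses more overlapping boxes; raising conf_thresh discards more weak candidates before suppression starts.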
Deploying Model for Image Inference#
For image inference, refer to the following code and modify the variables defined under __main__ to match your setup:
from libs.YOLO import YOLO11
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is only an example, please modify it to your own test image, model path, label name, model input size for custom scenarios
    img_path="/data/test.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]
    confidence_threshold = 0.5
    nms_threshold=0.45
    mask_threshold=0.5
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize YOLO11 instance
    yolo=YOLO11(task_type="segment",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,mask_thresh=mask_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
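The assignment rgb888p_size=[img.shape[2],img.shape[1]] in the code above suggests that the array returned by read_image is stored channel-first as (C, H, W), since index 2 is used as the width and index 1 as the height. This is inferred from the indexing, not from the library source. A small sketch with a hypothetical helper, size_from_chw, shows the mapping:

```python
def size_from_chw(shape):
    """Return [width, height] for an array shape given as (channels, height, width)."""
    if len(shape) != 3:
        raise ValueError("expected a 3-D (C, H, W) shape")
    _, height, width = shape
    return [width, height]


# A 640x480 RGB image stored channel-first has shape (3, 480, 640).
rgb888p_size = size_from_chw((3, 480, 640))
print(rgb888p_size)
```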
Deploying Model for Video Inference#
For video inference, refer to the following code and modify the variables defined under __main__ to match your setup:
from libs.PipeLine import PipeLine
from libs.YOLO import YOLO11
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is only an example, please modify it to your own model path, label name, model input size for custom scenarios
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]
    # Add display mode, default is hdmi, optional hdmi/lcd/lt9611/st7701/hx8399, where hdmi defaults to lt9611, resolution 1920*1080; lcd defaults to st7701, resolution 800*480
    display_mode="lcd"
    rgb888p_size=[320,320]
    confidence_threshold = 0.5
    nms_threshold=0.45
    mask_threshold=0.5
    # Initialize PipeLine
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize YOLO11 instance
    yolo=YOLO11(task_type="segment",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,mask_thresh=mask_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    while True:
        with ScopedTiming("total",1):
            # Frame-by-frame inference
            img=pl.get_frame()
            res=yolo.run(img)
            yolo.draw_result(res,pl.osd_img)
            pl.show_image()
            gc.collect()
    yolo.deinit()
    pl.destroy()
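The mask_thresh value passed in the segmentation examples is described as a binarization threshold. As a minimal sketch of what that means, assume the per-pixel mask values are probabilities in the range 0 to 1 (an assumption, not something stated about the library); pixels above the threshold become foreground. binarize_mask is a hypothetical helper, and the strict "greater than" comparison is an arbitrary choice.

```python
def binarize_mask(mask, mask_thresh=0.5):
    """Turn a 2-D list of probabilities into a 2-D list of 0/1 values."""
    return [[1 if p > mask_thresh else 0 for p in row] for row in mask]


soft_mask = [
    [0.05, 0.40, 0.70],
    [0.20, 0.85, 0.95],
    [0.10, 0.55, 0.60],
]
hard_mask = binarize_mask(soft_mask, mask_thresh=0.5)
for row in hard_mask:
    print(row)
```

A higher mask_thresh shrinks the segmented region to the most confident pixels, while a lower one grows it.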
YOLO11 Rotated Object Detection#
YOLO11 Source Code and Training Environment Setup#
For setting up the YOLO11 training environment, please refer to ultralytics/ultralytics: Ultralytics YOLO 🚀 (github.com)
# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics
If you have already set up the environment, please ignore this step.
Training Data Preparation#
You can first create a new folder yolo11. Please download the provided sample dataset, which contains a rotated object detection dataset for a single-class rotated pen detection (pen) scenario. Extract the dataset into the yolo11 directory and use yolo_pen_obb as the dataset for the rotated object detection task.
If you want to use your own dataset for training, you can download X-AnyLabeling to complete annotation. Convert the annotated data into the training data format officially supported by yolo11 for subsequent training.
cd yolo11
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip
If you have already downloaded the data, please ignore this step.
Using YOLO11 to Train the Rotated Object Detection Model#
Execute the command in the yolo11 directory to use yolo11 to train a single-class rotated object detection model:
yolo obb train data=datasets/pen_obb.yaml model=yolo11n-obb.pt epochs=100 imgsz=320
Converting Rotated Object Detection kmodel#
Model conversion requires installing the following libraries in the training environment:
# linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires installing dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0
# windows platform: please install dotnet-7 by yourself and add environment variables. Online pip installation of nncase is supported, but the nncase-kpu library needs to be installed offline. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment and use pip to install in the download directory of nncase_kpu-2.*-py2.py3-none-win_amd64.whl
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl
# Besides nncase and nncase-kpu, the script also uses the following libraries:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36
Download the model conversion script package test_yolo11.zip and extract it into the yolo11 directory:
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolo11.zip
unzip test_yolo11.zip
Using the following commands, first export the pt model under runs/obb/train/weights to an ONNX model, then convert it to a kmodel:
# Export onnx, please select the pt model path by yourself
yolo export model=runs/obb/train/weights/best.pt format=onnx imgsz=320
cd test_yolo11/obb
# Convert kmodel, please select the onnx model path by yourself, the generated kmodel is in the same level directory as the onnx model
python to_kmodel.py --target k230 --model ../../runs/obb/train/weights/best.onnx --dataset ../calibration_obb --input_width 320 --input_height 320 --ptq_option 0
cd ../../
💡 Model Conversion Script (to_kmodel.py) Parameter Description:
| Parameter Name | Description | Notes | Type |
|---|---|---|---|
| target | Target platform | Options are k230/CPU; k230 corresponds to the k230 chip | str |
| model | Model path | Path of the ONNX model to be converted | str |
| dataset | Calibration image set | Image data used during model conversion, in the quantization phase | str |
| input_width | Input width | Width of the model input | int |
| input_height | Input height | Height of the model input | int |
| ptq_option | Quantization method | Selects the quantization strategy | 0/1/2/3/4/5 |
Deploying Models on k230 Using MicroPython#
Flashing Image and Installing CanMV IDE#
💡 Firmware Introduction: Please download the latest Daily Build firmware according to your development board type to ensure that the latest features are supported! Or use the latest code to compile the firmware yourself, see the tutorial: Firmware Compilation.
Download and install CanMV IDE (download link: CanMV IDE download), write code and run it in the IDE.
Model File Copying#
Connect the IDE, and copy the converted model and test images to the path CanMV/data. This path can be customized, just modify the corresponding path when writing the code.
YOLO11 Module#
The YOLO11 class integrates five tasks of YOLO11, including classification (classify), detection (detect), segmentation (segment), rotated object detection (obb), and keypoint detection (pose); it supports two inference modes, image (image) and video stream (video); this class encapsulates the kmodel inference process of YOLO11.
Import Method
from libs.YOLO import YOLO11
Parameter Description
| Parameter Name | Description | Notes | Type |
|---|---|---|---|
| task_type | Task type | Supports five types of tasks, options are 'classify'/'detect'/'segment'/'obb'/'pose' | str |
| mode | Inference mode | Supports two inference modes, options are 'image'/'video'. 'image' means inferring an image, 'video' means inferring the real-time video stream captured by the camera | str |
| kmodel_path | kmodel path | The kmodel path copied to the development board | str |
| labels | Class label list | Label names for different classes | list[str] |
| rgb888p_size | Inference frame resolution | The resolution of the current frame being inferred, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model input resolution | The input resolution during YOLO11 model training, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display resolution | Set when the inference mode is 'video', supports hdmi([1920,1080]) and lcd([800,480]) | list[int] |
| conf_thresh | Confidence threshold | Class confidence threshold for classification tasks, target confidence threshold for detection and segmentation tasks, such as 0.5 | float [0~1] |
| nms_thresh | NMS threshold | Non-maximum suppression threshold, required for detection and segmentation tasks | float [0~1] |
| mask_thresh | Mask threshold | The binarization threshold for segmenting objects in the detection box in the segmentation task | float [0~1] |
| kp_num | Number of keypoints | The number of keypoints in the keypoint detection task | int |
| kp_dim | Keypoint dimension | The dimension of keypoints in the keypoint detection task, only 2 and 3 are supported, determined by the training model | int [2/3] |
| max_boxes_num | Maximum number of detection boxes | The maximum number of detection boxes allowed to be returned in one frame | int |
| debug_mode | Debug mode | Whether the timing function takes effect, options 0/1, 0 means no timing, 1 means timing | int [0/1] |
Deploying Model for Image Inference#
For image inference, please refer to the following code, modify the defined parameter variables in __main__ according to the actual situation;
from libs.YOLO import YOLO11
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is only an example, please modify to your own test image, model path, label name, model input size for custom scenarios
    img_path="/data/test_obb.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ['pen']
    model_input_size=[320,320]
    confidence_threshold = 0.1
    nms_threshold=0.6
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize YOLO11 instance
    yolo=YOLO11(task_type="obb",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,max_boxes_num=100,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
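A rotated box is commonly described by its center, size, and rotation angle. The sketch below converts such a description into four corner points using plain geometry. It is an illustration under assumptions: the parameterization (cx, cy, w, h, angle in radians) and the helper name obb_corners are not taken from the library, and the format of the results returned by yolo.run for the obb task is not documented here and may differ.

```python
import math


def obb_corners(cx, cy, w, h, angle):
    """Corners of a w x h box centered at (cx, cy), rotated by angle (radians)."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    corners = []
    # Offsets of the four corners before rotation, in order around the box.
    for dx, dy in ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)):
        x = cx + dx * cos_a - dy * sin_a
        y = cy + dx * sin_a + dy * cos_a
        corners.append((x, y))
    return corners


# A 100 x 20 "pen" centered at (160, 160), rotated by 90 degrees.
pen = obb_corners(160, 160, 100, 20, math.pi / 2)
for x, y in pen:
    print(round(x, 1), round(y, 1))
```

After the 90 degree rotation the box spans 20 pixels horizontally and 100 pixels vertically. Whether a positive angle appears clockwise or counter-clockwise on screen depends on the direction of the image y axis.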
Deploying Model for Video Inference#
For video inference, please refer to the following code, modify the defined variables in __main__ according to the actual situation;
from libs.PipeLine import PipeLine
from libs.Utils import *
from libs.YOLO import YOLO11
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is only an example, please modify to your own model path, label name, model input size for custom scenarios
    kmodel_path="/data/best.kmodel"
    labels = ['pen']
    model_input_size=[320,320]
    # Add display mode, default is hdmi, optional hdmi/lcd/lt9611/st7701/hx8399, where hdmi is set to lt9611 by default, resolution 1920*1080; lcd is set to st7701 by default, resolution 800*480
    display_mode="lcd"
    rgb888p_size=[640,360]
    confidence_threshold = 0.1
    nms_threshold=0.6
    # Initialize PipeLine
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize YOLO11 instance
    yolo=YOLO11(task_type="obb",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    while True:
        with ScopedTiming("total",1):
            # Frame-by-frame inference
            img=pl.get_frame()
            res=yolo.run(img)
            yolo.draw_result(res,pl.osd_img)
            pl.show_image()
            gc.collect()
    yolo.deinit()
    pl.destroy()
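The example above feeds 640x360 frames (rgb888p_size) to a model trained at 320x320 (model_input_size), so the two aspect ratios differ. A common way to reconcile them is a letterbox resize: scale the frame uniformly, then pad the remainder. The sketch below assumes that approach; how config_preprocess actually handles the resize is not stated here, and both function names are hypothetical.

```python
def letterbox_params(src_size, dst_size):
    """Compute scale and padding to fit src [w, h] into dst [w, h] without distortion."""
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_w, pad_h = dst_w - new_w, dst_h - new_h
    # Split the padding evenly; any odd pixel goes to the bottom/right.
    left, top = pad_w // 2, pad_h // 2
    return scale, (top, pad_h - top, left, pad_w - left)


def to_source_coords(x, y, scale, pads):
    """Map a point from model-input coordinates back to the source frame."""
    top, _, left, _ = pads
    return (x - left) / scale, (y - top) / scale


scale, pads = letterbox_params([640, 360], [320, 320])
print(scale, pads)
print(to_source_coords(160, 160, scale, pads))
```

Under this assumption the frame is halved to 320x180 and padded with 70 rows above and below, and the center of the model input maps back to the center of the source frame.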
YOLO11 License Plate Corner Detection#
YOLO11 Source Code and Training Environment Setup#
For YOLO11 training environment setup, please refer to ultralytics/ultralytics: Ultralytics YOLO 🚀 (github.com)
# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics
If you have already set up the environment, please ignore this step.
Training Data Preparation#
You can first create a new folder named yolo11. Download the provided sample dataset, which includes a keypoint dataset for detecting the four corners of license plates. Extract the dataset to the yolo11 directory and use car_plate as the dataset for the keypoint detection task.
If you want to use your own dataset for training, you can download X-AnyLabeling to complete the annotation. Convert the annotated data into the training data format officially supported by yolo11 for subsequent training.
cd yolo11
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip
If you have already downloaded the data, please ignore this step.
Using YOLO11 to Train a Keypoint Detection Model#
Execute the following command in the yolo11 directory to train a single-class keypoint detection model using yolo11:
yolo pose train data=datasets/car_plate.yaml model=yolo11n-pose.pt epochs=100 imgsz=320
Converting Keypoint Detection kmodel#
The model conversion requires the following libraries to be installed in the training environment:
# linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires installing dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0
# windows platform: please install dotnet-7 by yourself and add environment variables. nncase can be installed online via pip, but the nncase-kpu library needs to be installed offline. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and use pip to install in the directory where nncase_kpu-2.*-py2.py3-none-win_amd64.whl is downloaded
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl
# In addition to nncase and nncase-kpu, other libraries used by the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36
Download the model conversion script package test_yolo11.zip and extract it to the yolo11 directory:
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolo11.zip
unzip test_yolo11.zip
Using the following commands, first export the pt model under runs/pose/train/weights to an ONNX model, then convert it to a kmodel:
# Export onnx, please choose the pt model path by yourself
yolo export model=runs/pose/train/weights/best.pt format=onnx imgsz=320
cd test_yolo11/pose
# Convert kmodel, please choose the onnx model path by yourself. The generated kmodel is in the same directory level as the onnx model
python to_kmodel.py --target k230 --model ../../runs/pose/train/weights/best.onnx --dataset ../calibration_pose --input_width 320 --input_height 320 --ptq_option 0
cd ../../
💡 Model conversion script (to_kmodel.py) parameter description:
| Parameter Name | Description | Explanation | Type |
|---|---|---|---|
| target | Target Platform | Optional values are k230/CPU; k230 corresponds to the k230 chip | str |
| model | Model Path | Path of the ONNX model to be converted | str |
| dataset | Calibration Image Set | Image data used during model conversion, in the quantization stage | str |
| input_width | Input Width | Width of the model input | int |
| input_height | Input Height | Height of the model input | int |
| ptq_option | Quantization Method | Selects the quantization strategy | 0/1/2/3/4/5 |
Deploying the Model on k230 Using MicroPython#
Flashing Image and Installing CanMV IDE#
💡 Firmware Introduction: Please download the latest Daily Build firmware according to your development board type to ensure that the latest features are supported! Alternatively, use the latest code to compile the firmware yourself. For the tutorial, see: Firmware Compilation.
Download and install CanMV IDE (Download link: CanMV IDE download), write and run code in the IDE.
Model File Copy#
Connect the IDE, and copy the converted model and test images to the path CanMV/data. This path can be customized, you only need to modify the corresponding path when writing code.
YOLO11 Module#
The YOLO11 class integrates the five tasks of YOLO11, including classify, detect, segment, obb, and pose; it supports two inference modes, including image and video; this class encapsulates the YOLO11 kmodel inference process.
Import Method
from libs.YOLO import YOLO11
Parameter Description
| Parameter Name | Description | Explanation | Type |
|---|---|---|---|
| task_type | Task Type | Supports five types of tasks, optional values are 'classify'/'detect'/'segment'/'obb'/'pose' | str |
| mode | Inference Mode | Supports two inference modes, optional values are 'image'/'video'; 'image' means inference on an image, 'video' means inference on a real-time video stream captured by the camera | str |
| kmodel_path | kmodel Path | Path of the kmodel copied to the development board | str |
| labels | Category Label List | Label names of different categories | list[str] |
| rgb888p_size | Inference Frame Resolution | Resolution of the current inference frame, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | Input resolution when training the YOLO11 model, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video', supports hdmi([1920,1080]) and lcd([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | Category confidence threshold for classification tasks, object confidence threshold for detection and segmentation tasks, such as 0.5 | float [0~1] |
| nms_thresh | NMS Threshold | Non-maximum suppression threshold, required for detection and segmentation tasks | float [0~1] |
| mask_thresh | Mask Threshold | Binarization threshold for segmenting objects in the detection box in the segmentation task | float [0~1] |
| kp_num | Number of Keypoints | Number of keypoints in the keypoint detection task | int |
| kp_dim | Keypoint Dimension | Dimension of keypoints in the keypoint detection task, only supports 2 and 3, determined by the trained model | int [2/3] |
| max_boxes_num | Maximum Number of Detection Boxes | Maximum number of detection boxes allowed to be returned in one frame of image | int |
| debug_mode | Debug Mode | Whether the timing function is effective, optional values are 0/1, 0 means no timing, 1 means timing | int [0/1] |
Deploying the Model to Implement Image Inference#
For image inference, refer to the following code and modify the parameters defined under __main__ to match your setup:
from libs.YOLO import YOLO11
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify to your own test image, model path, label name, model input size, number of keypoints, and keypoint dimension
    img_path="/data/test.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ['plate']
    model_input_size=[320,320]
    kp_num=4
    kp_dim=2
    confidence_threshold = 0.5
    nms_threshold=0.45
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize YOLO11 model
    yolo=YOLO11(task_type="pose",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,kp_num=kp_num,kp_dim=kp_dim,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,max_boxes_num=100,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
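The kp_num and kp_dim settings used above describe how keypoint values are grouped: kp_num points, each with kp_dim numbers. As a minimal sketch, assume the keypoints of one detection arrive as a flat list of kp_num * kp_dim values, with kp_dim=2 meaning (x, y) and kp_dim=3 adding a third per-point value such as visibility. This layout and the helper name split_keypoints are assumptions for illustration; the result format returned by yolo.run is not documented here.

```python
def split_keypoints(flat, kp_num, kp_dim):
    """Group a flat list of numbers into kp_num tuples of length kp_dim."""
    if kp_dim not in (2, 3):
        raise ValueError("kp_dim must be 2 or 3")
    if len(flat) != kp_num * kp_dim:
        raise ValueError("expected %d values, got %d" % (kp_num * kp_dim, len(flat)))
    return [tuple(flat[i * kp_dim:(i + 1) * kp_dim]) for i in range(kp_num)]


# Four license plate corners with kp_dim=2: x0, y0, x1, y1, ...
flat = [40, 100, 200, 96, 204, 150, 44, 154]
corners = split_keypoints(flat, kp_num=4, kp_dim=2)
print(corners)
```

If kp_num or kp_dim does not match the trained model, the length check fails immediately, which is easier to diagnose than misaligned points.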
Deploying the Model to Implement Video Inference#
For video inference, refer to the following code and modify the variables defined under __main__ to match your setup:
from libs.PipeLine import PipeLine
from libs.YOLO import YOLO11
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify to your own model path, label name, model input size, number of keypoints, and keypoint dimension
    kmodel_path="/data/best.kmodel"
    labels = ["plate"]
    model_input_size=[320,320]
    kp_num=4
    kp_dim=2
    # Add display mode, default is hdmi, optional hdmi/lcd/lt9611/st7701/hx8399. Among them, hdmi defaults to lt9611 with a resolution of 1920*1080; lcd defaults to st7701 with a resolution of 800*480
    display_mode="lcd"
    rgb888p_size=[320,320]
    confidence_threshold = 0.5
    nms_threshold=0.45
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize YOLO11 model
    yolo=YOLO11(task_type="pose",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,kp_num=kp_num,kp_dim=kp_dim,conf_thresh=confidence_threshold,nms_thresh=nms_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    while True:
        with ScopedTiming("total",1):
            img=pl.get_frame()
            res=yolo.run(img)
            yolo.draw_result(res,pl.osd_img)
            pl.show_image()
            gc.collect()
    yolo.deinit()
    pl.destroy()
YOLO26 Fruit Classification#
YOLO26 Source Code and Training Environment Setup#
YOLO26 is the latest generation of real-time object detection models released by Ultralytics, featuring a series of fundamental architectural innovations that enable end-to-end NMS-free inference, aiming to provide more powerful and easily deployable solutions for edge computing and low-power devices.
For the YOLO26 training environment setup, please refer to ultralytics/ultralytics: Ultralytics YOLO 🚀 (github.com)
# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics
If you have already set up the environment, please ignore this step.
Training Data Preparation#
You can first create a new folder named yolo26. Download the provided sample dataset, which contains classification, detection, and segmentation datasets for a scenario with three types of fruit (apple, banana, orange). Extract the dataset to the yolo26 directory and use fruits_cls as the dataset for the fruit classification task. The sample dataset also includes a rotated object detection dataset, yolo_pen_obb, for a desktop pen scenario, and a license plate keypoint detection dataset, car_plate.
If you want to use your own dataset for training, you can download X-AnyLabeling to complete the annotation. Classification data does not need to be annotated with a tool; just organize the images into directories according to the expected format. Convert the annotated data into the training data format officially supported by yolo26 for subsequent training.
cd yolo26
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip
If you have already downloaded the data, please ignore this step.
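For a custom classification dataset, the directory organization mentioned above is the folder-per-class layout used by Ultralytics: one sub-folder per class under each split folder such as train/ and val/. The sketch below counts images per class so that a layout can be checked before training. summarize_cls_dataset is a hypothetical helper, and the demo builds a throwaway directory tree rather than assuming anything about the contents of datasets.zip.

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def summarize_cls_dataset(root):
    """Return {split: {class_name: image_count}} for a folder-per-class dataset."""
    summary = {}
    for split_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        counts = {}
        for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
            counts[class_dir.name] = sum(
                1 for f in class_dir.iterdir() if f.suffix.lower() in IMAGE_SUFFIXES
            )
        summary[split_dir.name] = counts
    return summary


# Build a tiny throwaway dataset to demonstrate the expected layout.
with tempfile.TemporaryDirectory() as tmp:
    for split, n in (("train", 2), ("val", 1)):
        for name in ("apple", "banana", "orange"):
            class_dir = Path(tmp) / split / name
            class_dir.mkdir(parents=True)
            for i in range(n):
                (class_dir / f"{name}_{i}.jpg").touch()
    summary = summarize_cls_dataset(tmp)

print(summary)
```

A class folder that is missing from one split, or that reports zero images, is worth fixing before starting a long training run.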
Using YOLO26 to Train the Fruit Classification Model#
Execute the following command in the yolo26 directory to use yolo26 to train the three-class fruit classification model:
yolo classify train data=datasets/fruits_cls model=yolo26n-cls.pt epochs=100 imgsz=224
Converting Fruit Classification to kmodel#
The model conversion requires the following libraries to be installed in the training environment:
# Linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0
# Windows platform: Please install dotnet-7 by yourself and add environment variables. Online installation of nncase using pip is supported, but the nncase-kpu library needs to be installed offline. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and use pip to install in the download directory of nncase_kpu-2.*-py2.py3-none-win_amd64.whl
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl
# In addition to nncase and nncase-kpu, other libraries used in the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36
Download the model conversion script package test_yolo26.zip and extract it to the yolo26 directory:
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolo26.zip
unzip test_yolo26.zip
Using the following commands, first export the pt model under runs/classify/train/weights to an ONNX model, then convert it to a kmodel:
# Export onnx, please select the pt model path yourself
yolo export model=runs/classify/train/weights/best.pt format=onnx imgsz=224
cd test_yolo26/classify
# Convert kmodel, please select the onnx model path yourself, the generated kmodel is in the same level directory as the onnx model
python to_kmodel.py --target k230 --model ../../runs/classify/train/weights/best.onnx --dataset ../calibration_data --input_width 224 --input_height 224 --ptq_option 0
cd ../../
💡 Description of model conversion script (to_kmodel.py) parameters:
| Parameter Name | Description | Note | Type |
|---|---|---|---|
| target | Target Platform | Optional values are k230/CPU, corresponding to the k230 chip | str |
| model | Model Path | Path of the ONNX model to be converted | str |
| dataset | Calibration Image Set | Image data used during model conversion, in the quantization phase | str |
| input_width | Input Width | Width of the model input | int |
| input_height | Input Height | Height of the model input | int |
| ptq_option | Quantization Method | Selects the quantization strategy | 0/1/2/3/4/5 |
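The flags in the table above can also be assembled programmatically, which helps when converting several models in a row. The following is a minimal sketch run on the training machine; the helper name build_to_kmodel_cmd is hypothetical, and only the flags already shown in this section are used.

```python
import shlex


def build_to_kmodel_cmd(model, dataset, width, height, target="k230", ptq_option=0):
    """Assemble the to_kmodel.py argument list from the documented flags.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    if ptq_option not in range(6):
        raise ValueError("ptq_option must be one of 0/1/2/3/4/5")
    return [
        "python", "to_kmodel.py",
        "--target", target,
        "--model", model,
        "--dataset", dataset,
        "--input_width", str(width),
        "--input_height", str(height),
        "--ptq_option", str(ptq_option),
    ]


cmd = build_to_kmodel_cmd(
    "../../runs/classify/train/weights/best.onnx", "../calibration_data", 224, 224
)
# Print a shell-ready command line; pass `cmd` to subprocess.run to execute it.
print(shlex.join(cmd))
```

Keeping the arguments in a list avoids quoting problems when a model path contains spaces.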
Deploying the Model on k230 Using MicroPython#
Flashing the Image and Installing CanMV IDE#
💡 Firmware Introduction: Please download the latest Daily Build firmware for your development board type to ensure that the latest features are supported, or compile the firmware yourself from the latest code. For the tutorial, see: Firmware Compilation.
Download and install CanMV IDE (Download link: CanMV IDE download), write and run the code in the IDE.
Model File Copy#
Connect the IDE, and copy the converted model and test images to the CanMV/data directory. This path can be customized; you only need to modify the corresponding path in your code.
YOLO26 Module#
The YOLO26 class integrates the five tasks of YOLO26, including classification (classify), detection (detect), segmentation (segment), rotated object detection (obb), and keypoint detection (pose); it supports two inference modes, including image and video stream (video); this class encapsulates the kmodel inference process of YOLO26.
Import Method
from libs.YOLO import YOLO26
Parameter Description
| Parameter Name | Description | Note | Type |
|---|---|---|---|
| task_type | Task Type | Supports five types of tasks; optional values are 'classify'/'detect'/'segment'/'obb'/'pose' | str |
| mode | Inference Mode | Supports two inference modes; optional values are 'image'/'video'. 'image' runs inference on an image, 'video' runs inference on the real-time video stream captured by the camera | str |
| kmodel_path | kmodel Path | Path of the kmodel copied to the development board | str |
| labels | Class Label List | Label names of the different categories | list[str] |
| rgb888p_size | Inference Frame Resolution | Resolution of the current inference frame, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | Input resolution used when training the YOLO26 model, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video'; supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | Class confidence threshold for classification tasks, and object confidence threshold for detection and segmentation tasks, such as 0.5 | float [0~1] |
| mask_thresh | Mask Threshold | Binarization threshold for segmenting the object inside the detection box in segmentation tasks | float [0~1] |
| kp_num | Number of Keypoints | Number of keypoints in the keypoint detection task | int |
| kp_dim | Keypoint Dimension | Dimension of keypoints in the keypoint detection task; only 2 and 3 are supported, determined by the trained model | int [2/3] |
| max_boxes_num | Maximum Number of Detection Boxes | Maximum number of detection boxes returned for one frame | int |
| debug_mode | Debug Mode | Whether timing is enabled; optional values are 0/1, where 0 disables timing and 1 enables it | int [0/1] |
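Before constructing the class on the board, it can help to check that the arguments agree with the table above. The following is a minimal sketch of such a check; check_yolo26_args is a hypothetical helper that is not part of libs.YOLO, and treating display_size as mandatory in video mode is an assumption drawn from the table.

```python
VALID_TASKS = ("classify", "detect", "segment", "obb", "pose")
VALID_MODES = ("image", "video")


def check_yolo26_args(task_type, mode, conf_thresh, display_size=None, kp_dim=None):
    """Return a list of problems; an empty list means the arguments look consistent."""
    problems = []
    if task_type not in VALID_TASKS:
        problems.append("unknown task_type: %r" % (task_type,))
    if mode not in VALID_MODES:
        problems.append("unknown mode: %r" % (mode,))
    if not 0.0 <= conf_thresh <= 1.0:
        problems.append("conf_thresh must be in [0, 1]")
    # Assumption: display_size is needed whenever results are drawn on a screen.
    if mode == "video" and display_size is None:
        problems.append("display_size is required in video mode")
    if task_type == "pose" and kp_dim not in (2, 3):
        problems.append("kp_dim must be 2 or 3 for pose")
    return problems


print(check_yolo26_args("classify", "video", 0.5, display_size=[800, 480]))
```

The check uses only built-in types, so it should also run unchanged under MicroPython.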
Deploying the Model to Implement Image Inference#
For image inference, refer to the following code and modify the parameters defined in __main__ to match your setup:
from libs.YOLO import YOLO26
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify it to your own test image, model path, label name, and model input size
    img_path="/data/test.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[224,224]
    confidence_threshold = 0.5
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize YOLO26 model
    yolo=YOLO26(task_type="classify",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
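To illustrate the role of conf_thresh in a classification task, the sketch below turns raw class scores into probabilities and rejects the result when the best class falls below the threshold. This is an illustration in plain Python, not the code inside the YOLO26 class; the helper name top1_label and the use of softmax are assumptions.

```python
import math


def top1_label(scores, labels, conf_thresh):
    """Softmax the raw scores and return (label, probability), or None below the threshold."""
    peak = max(scores)
    # Subtract the maximum before exponentiating for numerical stability.
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < conf_thresh:
        return None
    return labels[best], probs[best]


labels = ["apple", "banana", "orange"]
print(top1_label([5.0, 0.0, 0.0], labels, 0.5))
print(top1_label([0.0, 0.0, 0.0], labels, 0.5))
```

A higher threshold suppresses uncertain predictions at the cost of returning no label more often.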
Deploying the Model to Implement Video Inference#
For video inference, refer to the following code and modify the variables defined in __main__ to match your setup:
from libs.PipeLine import PipeLine
from libs.YOLO import YOLO26
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify it to your own model path, label name, and model input size
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[224,224]
    # Display mode, default is hdmi, optional hdmi/lcd/lt9611/st7701/hx8399. hdmi defaults to lt9611 with a resolution of 1920*1080; lcd defaults to st7701 with a resolution of 800*480
    display_mode="lcd"
    rgb888p_size=[640,360]
    confidence_threshold = 0.8
    # Initialize PipeLine
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize YOLO26 model
    yolo=YOLO26(task_type="classify",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,debug_mode=0)
    yolo.config_preprocess()
    while True:
        with ScopedTiming("total",1):
            # Inference frame by frame
            img=pl.get_frame()
            res=yolo.run(img)
            yolo.draw_result(res,pl.osd_img)
            pl.show_image()
            gc.collect()
    yolo.deinit()
    pl.destroy()
YOLO26 Fruit Detection#
YOLO26 Source Code and Training Environment Setup#
YOLO26 is the latest generation real-time object detection model released by Ultralytics. It incorporates a series of fundamental architectural innovations, achieving end-to-end NMS-free inference, and is designed to provide more powerful and easier-to-deploy solutions for edge computing and low-power devices.
For setting up the YOLO26 training environment, please refer to ultralytics/ultralytics: Ultralytics YOLO 🚀 (github.com)
# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics
If you have already set up the environment, please ignore this step.
Training Data Preparation#
You can first create a new folder named yolo26. Download the provided sample dataset, which contains classification, detection, and segmentation datasets for a scenario with three types of fruit (apple, banana, orange). Extract the dataset into the yolo26 directory and use fruits_yolo as the dataset for the fruit detection task. The sample dataset also includes a rotated object detection dataset for a desktop pen scenario, yolo_pen_obb, and a license plate keypoint detection dataset, car_plate.
If you want to use your own dataset for training, you can download X-AnyLabeling to complete annotation. Manually convert the annotated data into the training data format officially supported by yolo26 for subsequent training.
cd yolo26
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip
If you have already downloaded the data, please ignore this step.
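When converting your own annotations, each object in a YOLO detection label file is one line of the form class, x-center, y-center, width, height, all normalized by the image size. The sketch below converts a pixel-coordinate box into such a line; the helper name to_yolo_line is hypothetical, and the corner-based input layout is an assumption about what your annotation export provides.

```python
def to_yolo_line(class_id, box, img_w, img_h):
    """Convert box = (x_min, y_min, x_max, y_max) in pixels to a normalized YOLO label line."""
    x_min, y_min, x_max, y_max = box
    cx = (x_min + x_max) / 2.0 / img_w
    cy = (y_min + y_max) / 2.0 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return "%d %.6f %.6f %.6f %.6f" % (class_id, cx, cy, w, h)


# One banana (class 1) in a 640x480 image.
print(to_yolo_line(1, (100, 50, 300, 250), 640, 480))
```

Write one such line per object into a .txt file that has the same base name as the image.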
Using YOLO26 to Train the Fruit Detection Model#
Execute the command in the yolo26 directory to train the three-category fruit detection model using yolo26:
yolo detect train data=datasets/fruits_yolo.yaml model=yolo26n.pt epochs=300 imgsz=320
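The data argument points to a dataset description file. For a custom dataset you need a similar YAML file; the sketch below generates one using the commonly used Ultralytics keys path, train, val, and names. The helper name make_dataset_yaml and the directory layout images/train and images/val are assumptions, and the provided fruits_yolo.yaml may be organized differently.

```python
def make_dataset_yaml(root, names, train="images/train", val="images/val"):
    """Return the text of a dataset YAML file for a detection dataset."""
    lines = [
        "path: %s" % root,
        "train: %s" % train,
        "val: %s" % val,
        "names:",
    ]
    for idx, name in enumerate(names):
        lines.append("  %d: %s" % (idx, name))
    return "\n".join(lines) + "\n"


# datasets/my_fruits is a placeholder for your own dataset directory.
print(make_dataset_yaml("datasets/my_fruits", ["apple", "banana", "orange"]))
```

Save the returned text to a .yaml file and pass its path as data= in the training command.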
Converting Fruit Detection kmodel#
Model conversion requires installing the following libraries in the training environment:
# Linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0
# Windows platform: Please install dotnet-7 by yourself and add environment variables. Online pip installation of nncase is supported, but the nncase-kpu library needs to be installed offline. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and use pip to install in the download directory of nncase_kpu-2.*-py2.py3-none-win_amd64.whl
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl
# In addition to nncase and nncase-kpu, other libraries used by the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36
Download the model conversion script package test_yolo26.zip and extract it into the yolo26 directory:
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolo26.zip
unzip test_yolo26.zip
Use the following commands to first export the pt model under runs/detect/train/weights to an ONNX model, and then convert it to a kmodel:
# Export onnx, please choose the pt model path by yourself
yolo export model=runs/detect/train/weights/best.pt format=onnx imgsz=320
cd test_yolo26/detect
# Convert kmodel, please choose the onnx model path by yourself. The generated kmodel is in the same directory as the onnx model
python to_kmodel.py --target k230 --model ../../runs/detect/train/weights/best.onnx --dataset ../calibration_data --input_width 320 --input_height 320 --ptq_option 0
cd ../../
💡 Model Conversion Script (to_kmodel.py) Parameter Description:
| Parameter Name | Description | Explanation | Type |
|---|---|---|---|
| target | Target Platform | Options are k230/CPU, corresponding to the k230 chip | str |
| model | Model Path | Path to the ONNX model to be converted | str |
| dataset | Calibration Image Set | Image data used during model conversion, in the quantization stage | str |
| input_width | Input Width | Width of the model input | int |
| input_height | Input Height | Height of the model input | int |
| ptq_option | Quantization Method | Selects the quantization strategy | 0/1/2/3/4/5 |
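The dataset parameter points to the images used for calibration during quantization. For a custom model, a small random sample of the training images is a reasonable starting point. The sketch below copies such a sample into one folder; the helper name sample_calibration_images, the sample size of 20, and the assumption that the conversion script reads a flat directory of images are all illustrative.

```python
import random
import shutil
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def sample_calibration_images(src_dir, dst_dir, count=20, seed=0):
    """Copy up to `count` randomly chosen images from src_dir (searched recursively) into dst_dir."""
    images = sorted(
        p for p in Path(src_dir).rglob("*") if p.suffix.lower() in IMAGE_SUFFIXES
    )
    # A fixed seed keeps the selection reproducible between runs.
    random.Random(seed).shuffle(images)
    chosen = images[:count]
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for index, src in enumerate(chosen):
        # Rename while copying so files with the same name in different folders do not collide.
        shutil.copy(src, dst / ("%04d%s" % (index, src.suffix.lower())))
    return len(chosen)
```

For example, sample_calibration_images("datasets/my_fruits/images/train", "my_calibration_data") fills a folder that can then be passed to --dataset; both paths are placeholders.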
Deploying the Model on k230 Using MicroPython#
Flashing the Image and Installing CanMV IDE#
💡 Firmware Introduction: Please download the latest Daily Build firmware for your development board type to ensure that the latest features are supported, or compile the firmware yourself from the latest code. See the tutorial: Firmware Compilation.
Download and install CanMV IDE (download link: CanMV IDE download), write code in the IDE and run it.
Model File Copy#
Connect the IDE, and copy the converted model and test images to the CanMV/data directory. This path can be customized; you only need to modify the corresponding path in your code.
YOLO26 Module#
The YOLO26 class integrates five tasks of YOLO26, including classification (classify), detection (detect), segmentation (segment), rotated object detection (obb), and keypoint detection (pose); it supports two inference modes, including image and video stream (video); this class encapsulates the YOLO26 kmodel inference process.
Import Method
from libs.YOLO import YOLO26
Parameter Description
| Parameter Name | Description | Explanation | Type |
|---|---|---|---|
| task_type | Task Type | Supports five types of tasks; options are 'classify'/'detect'/'segment'/'obb'/'pose' | str |
| mode | Inference Mode | Supports two inference modes; options are 'image'/'video'. 'image' runs inference on an image, 'video' runs inference on the real-time video stream captured by the camera | str |
| kmodel_path | kmodel Path | Path to the kmodel copied to the development board | str |
| labels | Class Label List | Label names for the different classes | list[str] |
| rgb888p_size | Inference Frame Resolution | Resolution of the current inference frame, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | Input resolution used when the YOLO26 model was trained, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video'; supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | Class confidence threshold for classification tasks, and object confidence threshold for detection and segmentation tasks, such as 0.5 | float [0~1] |
| mask_thresh | Mask Threshold | Binarization threshold for segmenting objects inside the detection box in segmentation tasks | float [0~1] |
| kp_num | Number of Keypoints | Number of keypoints in the keypoint detection task | int |
| kp_dim | Keypoint Dimension | Dimension of keypoints in the keypoint detection task; only 2 and 3 are supported, determined by the trained model | int [2/3] |
| max_boxes_num | Maximum Number of Detection Boxes | Maximum number of detection boxes returned for one frame | int |
| debug_mode | Debug Mode | Whether timing is enabled; options are 0/1, where 0 disables timing and 1 enables it | int [0/1] |
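The relationship between rgb888p_size and model_input_size can be illustrated with the usual letterbox computation: the frame is scaled to fit the model input while keeping its aspect ratio, and the remainder is padded. The sketch below is an illustration only. It assumes centered padding, which may differ from what config_preprocess actually does, and the helper names are hypothetical.

```python
def letterbox_params(frame_size, model_input_size):
    """Return (scale, resized_w, resized_h, pad_left, pad_top) for an aspect-preserving resize."""
    src_w, src_h = frame_size
    dst_w, dst_h = model_input_size
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w = int(src_w * scale)
    new_h = int(src_h * scale)
    # Assumption: the leftover space is split evenly between both sides.
    pad_left = (dst_w - new_w) // 2
    pad_top = (dst_h - new_h) // 2
    return scale, new_w, new_h, pad_left, pad_top


def to_frame_coords(x, y, params):
    """Map a point from model-input coordinates back to the original frame."""
    scale, _, _, pad_left, pad_top = params
    return (x - pad_left) / scale, (y - pad_top) / scale


params = letterbox_params([640, 360], [320, 320])
print(params)
print(to_frame_coords(160, 160, params))
```

This is why detection boxes computed on a 320x320 input can still be drawn correctly on a 640x360 frame.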
Deploying the Model for Image Inference#
For image inference, refer to the following code and modify the parameters defined in __main__ to match your setup:
from libs.YOLO import YOLO26
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify it to your own test image, model path, label name, and model input size
    img_path="/data/test.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]
    confidence_threshold = 0.5
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize the YOLO26 model
    yolo=YOLO26(task_type="detect",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    print(res)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
Deploying the Model for Video Inference#
For video inference, refer to the following code and modify the variables defined in __main__ to match your setup:
from libs.PipeLine import PipeLine
from libs.YOLO import YOLO26
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify it to your own model path, label name, and model input size
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]
    # Display mode, default is hdmi, optional hdmi/lcd/lt9611/st7701/hx8399. hdmi defaults to lt9611 with resolution 1920*1080; lcd defaults to st7701 with resolution 800*480
    display_mode="lcd"
    rgb888p_size=[640,360]
    confidence_threshold = 0.5
    # Initialize PipeLine
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize the YOLO26 model
    yolo=YOLO26(task_type="detect",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    while True:
        with ScopedTiming("total",1):
            # Frame-by-frame inference
            img=pl.get_frame()
            res=yolo.run(img)
            yolo.draw_result(res,pl.osd_img)
            pl.show_image()
            gc.collect()
    yolo.deinit()
    pl.destroy()
YOLO26 Fruit Segmentation#
YOLO26 Source Code and Training Environment Setup#
YOLO26 is the latest generation real-time object detection model released by Ultralytics, featuring a series of fundamental architectural innovations that enable end-to-end NMS-free inference, aiming to provide more powerful and easier-to-deploy solutions for edge computing and low-power devices.
For setting up the YOLO26 training environment, please refer to ultralytics/ultralytics: Ultralytics YOLO 🚀 (github.com)
# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics
If you have already set up the environment, please ignore this step.
Training Data Preparation#
You can first create a new folder named yolo26. Download the provided sample dataset, which contains classification, detection, and segmentation datasets for a scenario with three types of fruit (apple, banana, orange). Extract the dataset into the yolo26 directory and use fruits_seg as the dataset for the fruit segmentation task. The sample dataset also includes a rotated object detection dataset for a desktop pen scenario, yolo_pen_obb, and a license plate keypoint detection dataset, car_plate.
If you want to use your own dataset for training, you can download X-AnyLabeling to complete the annotation. Convert the annotated data into the training data format officially supported by yolo26 for subsequent training.
cd yolo26
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip
If you have already downloaded the data, please ignore this step.
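For a custom segmentation dataset, each object in a YOLO label file is one line containing the class index followed by the polygon vertices, normalized by the image size. The sketch below converts a pixel-coordinate polygon into such a line; the helper name to_yolo_seg_line is hypothetical, and the list-of-points input is an assumption about your annotation export.

```python
def to_yolo_seg_line(class_id, polygon, img_w, img_h):
    """Convert polygon = [(x, y), ...] in pixels to 'cls x1 y1 x2 y2 ...' normalized to [0, 1]."""
    if len(polygon) < 3:
        raise ValueError("a polygon needs at least three points")
    parts = [str(class_id)]
    for x, y in polygon:
        # Clamp so that points drawn slightly outside the image stay valid.
        parts.append("%.6f" % min(max(x / img_w, 0.0), 1.0))
        parts.append("%.6f" % min(max(y / img_h, 0.0), 1.0))
    return " ".join(parts)


# A triangle annotated as class 0 in a 640x480 image.
print(to_yolo_seg_line(0, [(0, 0), (320, 0), (320, 240)], 640, 480))
```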
Training Fruit Segmentation Model with YOLO26#
Execute the following command in the yolo26 directory to train a three-class fruit segmentation model using yolo26:
yolo segment train data=datasets/fruits_seg.yaml model=yolo26n-seg.pt epochs=100 imgsz=320
Converting Fruit Segmentation kmodel#
Model conversion requires installing the following libraries in the training environment:
# Linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires installing dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0
# Windows platform: Please install dotnet-7 by yourself and add environment variables. nncase supports online installation via pip, but nncase-kpu library needs offline installation. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl at https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, install using pip in the directory where nncase_kpu-2.*-py2.py3-none-win_amd64.whl is downloaded
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl
# In addition to nncase and nncase-kpu, other libraries used by the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36
Download the model conversion script package test_yolo26.zip and extract it into the yolo26 directory:
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolo26.zip
unzip test_yolo26.zip
Use the following commands to first export the pt model under runs/segment/train/weights to an ONNX model, and then convert it to a kmodel:
# Export onnx, please select the pt model path yourself
yolo export model=runs/segment/train/weights/best.pt format=onnx imgsz=320
cd test_yolo26/segment
# Convert kmodel, please select the onnx model path yourself, the generated kmodel is in the same directory as the onnx model
python to_kmodel.py --target k230 --model ../../runs/segment/train/weights/best.onnx --dataset ../calibration_data --input_width 320 --input_height 320 --ptq_option 0
cd ../../
💡 Model Conversion Script (to_kmodel.py) Parameter Description:
| Parameter Name | Description | Note | Type |
|---|---|---|---|
| target | Target Platform | Options are k230/CPU, corresponding to the k230 chip | str |
| model | Model Path | Path of the ONNX model to be converted | str |
| dataset | Calibration Image Set | Image data used during model conversion, in the quantization phase | str |
| input_width | Input Width | Width of the model input | int |
| input_height | Input Height | Height of the model input | int |
| ptq_option | Quantization Method | Selects the quantization strategy | 0/1/2/3/4/5 |
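Since the export and conversion steps write their outputs next to the input model, the expected file locations can be derived from the pt path. The sketch below shows this derivation; the helper name expected_artifacts is hypothetical, and the output names best.onnx and best.kmodel are assumptions based on the paths used in this section.

```python
from pathlib import PurePosixPath


def expected_artifacts(pt_path):
    """Return the paths where the ONNX model and kmodel are expected after export and conversion."""
    pt = PurePosixPath(pt_path)
    return {
        "onnx": str(pt.with_suffix(".onnx")),
        "kmodel": str(pt.with_suffix(".kmodel")),
    }


print(expected_artifacts("runs/segment/train/weights/best.pt"))
```

The kmodel path is the file to copy to the development board in the deployment steps.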
Deploying Model on k230 Using MicroPython#
Flash Image and Install CanMV IDE#
💡 Firmware Introduction: Please download the latest Daily Build firmware for your development board type to ensure that the latest features are supported, or compile the firmware yourself from the latest code. See the tutorial: Firmware Compilation.
Download and install CanMV IDE (download link: CanMV IDE download), write and run code in the IDE.
Model File Copy#
Connect the IDE, and copy the converted model and test images to the CanMV/data directory. This path can be customized; you only need to modify the corresponding path in your code.
YOLO26 Module#
The YOLO26 class integrates the five tasks of YOLO26, including classification (classify), detection (detect), segmentation (segment), rotated object detection (obb), and keypoint detection (pose); supports two inference modes, including image and video stream (video); this class encapsulates the YOLO26 kmodel inference process.
Import Method
from libs.YOLO import YOLO26
Parameter Description
| Parameter Name | Description | Note | Type |
|---|---|---|---|
| task_type | Task Type | Supports five types of tasks; options are 'classify'/'detect'/'segment'/'obb'/'pose' | str |
| mode | Inference Mode | Supports two inference modes; options are 'image'/'video'. 'image' runs inference on an image, 'video' runs inference on the real-time video stream captured by the camera | str |
| kmodel_path | kmodel Path | Path of the kmodel copied to the development board | str |
| labels | Class Label List | Label names for the different classes | list[str] |
| rgb888p_size | Inference Frame Resolution | Resolution of the current inference frame, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | Input resolution used when training the YOLO26 model, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video'; supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | Class confidence threshold for classification tasks, and object confidence threshold for detection and segmentation tasks, such as 0.5 | float [0~1] |
| mask_thresh | Mask Threshold | Binarization threshold for segmenting objects inside the detection box in the segmentation task | float [0~1] |
| kp_num | Number of Keypoints | Number of keypoints in the keypoint detection task | int |
| kp_dim | Keypoint Dimension | Dimension of keypoints in the keypoint detection task; only 2 and 3 are supported, determined by the trained model | int [2/3] |
| max_boxes_num | Maximum Number of Detection Boxes | Maximum number of detection boxes returned for one frame | int |
| debug_mode | Debug Mode | Whether timing is enabled; options are 0/1, where 0 disables timing and 1 enables it | int [0/1] |
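The effect of mask_thresh can be illustrated with a small binarization example: every pixel of the predicted mask whose probability exceeds the threshold is kept as foreground. The sketch below uses plain Python lists and is not the code inside the YOLO26 class; the helper names and the strict greater-than comparison are assumptions.

```python
def binarize_mask(prob_mask, mask_thresh=0.5):
    """Turn a 2D list of probabilities in [0, 1] into a 2D list of 0/1 values."""
    return [[1 if p > mask_thresh else 0 for p in row] for row in prob_mask]


def mask_area(binary_mask):
    """Count the foreground pixels of a binarized mask."""
    return sum(sum(row) for row in binary_mask)


probs = [[0.1, 0.6], [0.5, 0.9]]
mask = binarize_mask(probs, 0.5)
print(mask, mask_area(mask))
```

Raising the threshold shrinks the segmented region, and lowering it grows the region.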
Deploying Model for Image Inference#
For image inference, refer to the following code and modify the parameters defined in __main__ to match your setup:
from libs.YOLO import YOLO26
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify it to your own test image, model path, label name, and model input size
    img_path="/data/test.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]
    confidence_threshold = 0.5
    mask_threshold=0.5
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize YOLO26 model
    yolo=YOLO26(task_type="segment",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,mask_thresh=mask_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
Deploying Model for Video Inference#
For video inference, refer to the following code and modify the variables defined in __main__ to match your setup:
from libs.PipeLine import PipeLine
from libs.YOLO import YOLO26
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify it to your own model path, label name, and model input size
    kmodel_path="/data/best.kmodel"
    labels = ["apple","banana","orange"]
    model_input_size=[320,320]
    # Display mode, default is hdmi, optional hdmi/lcd/lt9611/st7701/hx8399. hdmi defaults to lt9611 with resolution 1920*1080; lcd defaults to st7701 with resolution 800*480
    display_mode="lcd"
    rgb888p_size=[320,320]
    confidence_threshold = 0.5
    mask_threshold=0.5
    # Initialize PipeLine
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize YOLO26 model
    yolo=YOLO26(task_type="segment",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,mask_thresh=mask_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    while True:
        with ScopedTiming("total",1):
            # Frame-by-frame inference
            img=pl.get_frame()
            res=yolo.run(img)
            yolo.draw_result(res,pl.osd_img)
            pl.show_image()
            gc.collect()
    yolo.deinit()
    pl.destroy()
YOLO26 Rotated Object Detection#
YOLO26 Source Code and Training Environment Setup#
YOLO26 is the latest generation real-time object detection model released by Ultralytics, which has undergone a series of fundamental architectural innovations, achieving end-to-end NMS-free inference, aiming to provide more powerful and easier-to-deploy solutions for edge computing and low-power devices.
For setting up the YOLO26 training environment, please refer to ultralytics/ultralytics: Ultralytics YOLO 🚀 (github.com)
# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics
If you have already set up the environment, please skip this step.
Training Data Preparation#
You can first create a new folder named yolo26. Download the provided sample dataset, which contains a rotated object detection dataset for a single-class rotated pen detection scenario (pen). Extract the dataset into the yolo26 directory and use yolo_pen_obb as the dataset for the rotated object detection task.
If you want to use your own dataset for training, you can download X-AnyLabeling to complete the annotation. Convert the annotated data into the training data format officially supported by yolo26 for subsequent training.
cd yolo26
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip
If you have already downloaded the data, please skip this step.
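For a custom rotated-box dataset, each object in a YOLO OBB label file is one line containing the class index followed by the four corner points, normalized by the image size. If your annotations are stored as center, size, and angle, the corners can be computed as in the sketch below. The helper names are hypothetical, and the angle convention (radians, rotation about the center in image coordinates) is an assumption that should be checked against your annotation tool.

```python
import math


def obb_corners(cx, cy, w, h, angle_rad):
    """Return the four corners of a box of size (w, h) rotated by angle_rad about (cx, cy)."""
    cos_a, sin_a = math.cos(angle_rad), math.sin(angle_rad)
    corners = []
    for dx, dy in ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)):
        # Standard 2D rotation of the corner offset, then shift to the box center.
        corners.append((cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a))
    return corners


def to_yolo_obb_line(class_id, corners, img_w, img_h):
    """Format corners as 'cls x1 y1 x2 y2 x3 y3 x4 y4' with normalized coordinates."""
    parts = [str(class_id)]
    for x, y in corners:
        parts.append("%.6f" % (x / img_w))
        parts.append("%.6f" % (y / img_h))
    return " ".join(parts)


# A 40x20 pen box centered at (100, 100) in a 200x200 image, not rotated.
print(to_yolo_obb_line(0, obb_corners(100, 100, 40, 20, 0.0), 200, 200))
```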
Using YOLO26 to Train the Rotated Object Detection Model#
Execute the command in the yolo26 directory, and use yolo26 to train a single-class rotated object detection model:
yolo obb train data=datasets/pen_obb.yaml model=yolo26n-obb.pt epochs=100 imgsz=320
Converting Rotated Object Detection kmodel#
Model conversion requires installing the following libraries in the training environment:
# Linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires installing dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0
# Windows platform: Please install dotnet-7 by yourself and add environment variables. pip online installation of nncase is supported, but the nncase-kpu library needs to be installed offline. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl from https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and use pip to install in the download directory of nncase_kpu-2.*-py2.py3-none-win_amd64.whl
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl
# In addition to nncase and nncase-kpu, other libraries used by the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36
Download the model conversion script package test_yolo26.zip and extract it into the yolo26 directory:
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolo26.zip
unzip test_yolo26.zip
Use the following commands to first export the pt model under runs/obb/train/weights to an ONNX model, and then convert it to a kmodel:
# Export onnx, please choose the pt model path by yourself
yolo export model=runs/obb/train/weights/best.pt format=onnx imgsz=320
cd test_yolo26/obb
# Convert kmodel, please choose the onnx model path by yourself. The generated kmodel is in the same directory as the onnx model
python to_kmodel.py --target k230 --model ../../runs/obb/train/weights/best.onnx --dataset ../calibration_obb --input_width 320 --input_height 320 --ptq_option 0
cd ../../
💡 Model Conversion Script (to_kmodel.py) Parameter Description:
| Parameter Name | Description | Explanation | Type |
|---|---|---|---|
| target | Target Platform | Options are k230/CPU, corresponding to the k230 chip | str |
| model | Model Path | Path of the ONNX model to be converted | str |
| dataset | Calibration Image Set | Image data used during model conversion, in the quantization phase | str |
| input_width | Input Width | Width of the model input | int |
| input_height | Input Height | Height of the model input | int |
| ptq_option | Quantization Method | Selects the quantization strategy | 0/1/2/3/4/5 |
Deploying Models on k230 Using MicroPython#
Flashing Image and Installing CanMV IDE#
💡 Firmware Introduction: Please download the latest Daily Build Firmware according to your development board type to ensure that the latest features are supported! Or use the latest code to compile the firmware yourself, see the tutorial: Firmware Compilation.
Download and install CanMV IDE (Download link: CanMV IDE download), write and run code in the IDE.
Model File Copy#
Connect the IDE, and copy the converted model and test images to the CanMV/data directory. This path can be customized; you only need to modify the corresponding path in your code.
YOLO26 Module#
The YOLO26 class integrates five tasks of YOLO26, including classification (classify), detection (detect), segmentation (segment), rotated object detection (obb), and keypoint detection (pose); it supports two inference modes, including image (image) and video stream (video); this class encapsulates the kmodel inference process of YOLO26.
Import Method
from libs.YOLO import YOLO26
Parameter Description
| Parameter Name | Description | Explanation | Type |
|---|---|---|---|
| task_type | Task Type | Supports five types of tasks; options are 'classify'/'detect'/'segment'/'obb'/'pose' | str |
| mode | Inference Mode | Supports two inference modes; options are 'image'/'video'. 'image' runs inference on an image, 'video' runs inference on the real-time video stream captured by the camera | str |
| kmodel_path | kmodel Path | Path of the kmodel copied to the development board | str |
| labels | Class Label List | Label names for the different categories | list[str] |
| rgb888p_size | Inference Frame Resolution | Resolution of the current inference frame, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | Input resolution used when the YOLO26 model was trained, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video'; supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | Class confidence threshold for classification tasks, and object confidence threshold for detection and segmentation tasks, such as 0.5 | float [0~1] |
| mask_thresh | Mask Threshold | Binarization threshold for segmenting objects inside the detection box in the segmentation task | float [0~1] |
| kp_num | Number of Keypoints | Number of keypoints in the keypoint detection task | int |
| kp_dim | Keypoint Dimension | Dimension of keypoints in the keypoint detection task; only 2 and 3 are supported, determined by the trained model | int [2/3] |
| max_boxes_num | Maximum Number of Detection Boxes | Maximum number of detection boxes returned for one frame | int |
| debug_mode | Debug Mode | Whether timing is enabled; options are 0/1, where 0 disables timing and 1 enables it | int [0/1] |
Deploying Model to Implement Image Inference#
For image inference, refer to the following code and modify the parameters defined in __main__ to match your setup:
from libs.YOLO import YOLO26
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is just an example. For custom scenarios, please modify it to your own test image, model path, label name, and model input size
    img_path="/data/test.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ['pen']
    model_input_size=[320,320]
    confidence_threshold = 0.1
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize YOLO26 model
    yolo=YOLO26(task_type="obb",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,conf_thresh=confidence_threshold,max_boxes_num=100,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
Deploying Model to Implement Video Inference#
For video inference, refer to the following code and modify the variables defined in __main__ to match your setup:
from libs.PipeLine import PipeLine
from libs.YOLO import YOLO26
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is just an example. For custom scenarios, please modify it to your own model path, label name, and model input size
    kmodel_path="/data/best.kmodel"
    labels = ['pen']
    model_input_size=[320,320]
    # Add display mode, default is hdmi, optional hdmi/lcd/lt9611/st7701/hx8399, where hdmi defaults to lt9611, resolution 1920*1080; lcd defaults to st7701, resolution 800*480
    display_mode="lcd"
    rgb888p_size=[640,360]
    confidence_threshold = 0.1
    # Initialize PipeLine
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize YOLO26 model
    yolo=YOLO26(task_type="obb",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,conf_thresh=confidence_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    while True:
        with ScopedTiming("total",1):
            img=pl.get_frame()
            res=yolo.run(img)
            yolo.draw_result(res,pl.osd_img)
            pl.show_image()
            gc.collect()
    yolo.deinit()
    pl.destroy()
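The video loop wraps each frame in ScopedTiming("total",1). To illustrate what a scoped timer of this kind does, here is a minimal desktop-Python sketch. The name scoped_timing and its sink parameter are hypothetical; this is a stand-in, not the implementation shipped in libs.Utils, and it assumes only that the second argument switches timing on or off.

```python
import time
from contextlib import contextmanager


@contextmanager
def scoped_timing(name, enabled, sink=print):
    """Hypothetical stand-in for ScopedTiming: report elapsed time for a block."""
    if not enabled:
        # Timing disabled: run the block without measuring anything.
        yield
        return
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        sink("%s took %.2f ms" % (name, elapsed_ms))


messages = []
with scoped_timing("total", 1, sink=messages.append):
    sum(range(1000))          # placeholder for get_frame / run / draw_result
with scoped_timing("skipped", 0, sink=messages.append):
    sum(range(1000))          # not timed because enabled is 0
print(messages)
```

Collecting the messages in a list instead of printing them directly makes it easy to average the per-frame time over many iterations.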
YOLO26 License Plate Corner Detection#
YOLO26 Source Code and Training Environment Setup#
YOLO26 is the latest generation real-time object detection model released by Ultralytics. It has undergone a series of fundamental architectural innovations, achieving end-to-end NMS-free inference, and is designed to provide more powerful and easier-to-deploy solutions for edge computing and low-power devices.
For YOLO26 training environment setup, please refer to ultralytics/ultralytics: Ultralytics YOLO 🚀 (github.com)
# Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics
If you have already set up the environment, please ignore this step.
Training Data Preparation#
First create a new folder named yolo26, then download the provided sample dataset, which includes data for the license plate four-corner keypoint scenario. Extract the dataset to the yolo26 directory and use car_plate as the dataset for the keypoint detection task.
If you want to use your own dataset for training, you can download X-AnyLabeling to complete the annotation. Convert the annotated data into the training data format officially supported by yolo26 for subsequent training.
cd yolo26
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/datasets.zip
unzip datasets.zip
If you have already downloaded the data, please ignore this step.
Using YOLO26 to Train the Keypoint Detection Model#
Execute the following command in the yolo26 directory to train a keypoint detection model with yolo26:
yolo pose train data=datasets/car_plate.yaml model=yolo26n-pose.pt epochs=100 imgsz=320
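The same training run can also be started from Python through the Ultralytics package. The sketch below mirrors the arguments of the command above; the helper names train_args and train_pose_model are hypothetical, and it assumes that ultralytics is installed and that the script is run from the yolo26 directory.

```python
def train_args():
    """Arguments mirroring the CLI command in this section."""
    return {"data": "datasets/car_plate.yaml", "epochs": 100, "imgsz": 320}


def train_pose_model(weights="yolo26n-pose.pt"):
    """Train the keypoint model; call this from the yolo26 directory."""
    # Imported lazily so the file can be loaded without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.train(**train_args())


print(train_args())
```

Calling train_pose_model() starts training; as with the CLI, the weights are expected under runs/pose/train/weights.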
Convert Keypoint Detection kmodel#
Model conversion requires the following libraries to be installed in the training environment:
# linux platform: nncase and nncase-kpu can be installed online, nncase-2.x requires dotnet-7
sudo apt-get install -y dotnet-sdk-7.0
pip install --upgrade pip
pip install nncase==2.11.0
pip install nncase-kpu==2.11.0
# windows platform: please install dotnet-7 by yourself and add environment variables. nncase can be installed online using pip, but the nncase-kpu library needs to be installed offline. Download nncase_kpu-2.*-py2.py3-none-win_amd64.whl at https://github.com/kendryte/nncase/releases
# Enter the corresponding python environment, and use pip to install in the nncase_kpu-2.*-py2.py3-none-win_amd64.whl download directory
pip install nncase_kpu-2.*-py2.py3-none-win_amd64.whl
# In addition to nncase and nncase-kpu, other libraries used by the script include:
pip install onnx==1.15.0
pip install onnxruntime==1.19.0
pip install onnxsim==0.4.36
Download the model conversion script toolkit test_yolo26.zip and extract it to the yolo26 directory:
wget https://kendryte-download.canaan-creative.com/developer/k230/yolo_files/test_yolo26.zip
unzip test_yolo26.zip
Following the commands below, first export the pt model under runs/pose/train/weights to an onnx model, then convert it to a kmodel:
# Export onnx, please choose the pt model path by yourself
yolo export model=runs/pose/train/weights/best.pt format=onnx imgsz=320
cd test_yolo26/pose
# Convert kmodel, please choose the onnx model path by yourself, the generated kmodel is in the same level directory as the onnx model
python to_kmodel.py --target k230 --model ../../runs/pose/train/weights/best.onnx --dataset ../calibration_pose --input_width 320 --input_height 320 --ptq_option 0
cd ../../
💡 Model conversion script (to_kmodel.py) parameter description:
| Parameter Name | Description | Explanation | Type |
|---|---|---|---|
| target | Target Platform | Options are k230/CPU; k230 targets the k230 chip | str |
| model | Model Path | Path of the ONNX model to be converted | str |
| dataset | Calibration Image Set | Image data used during model conversion, in the quantization stage | str |
| input_width | Input Width | Width of the model input | int |
| input_height | Input Height | Height of the model input | int |
| ptq_option | Quantization Method | Quantization strategy, selected by number; the combinations are listed under Modify Quantization Method | 0/1/2/3/4/5 |
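When several conversions need to be run, it can help to assemble the to_kmodel.py command line in a script. The following is a minimal sketch; build_to_kmodel_cmd is a hypothetical helper, and it uses only the flags documented in the table above.

```python
import sys


def build_to_kmodel_cmd(model, dataset, width, height, ptq_option=0, target="k230"):
    """Build the argument list for to_kmodel.py (hypothetical helper)."""
    if ptq_option not in range(6):
        raise ValueError("ptq_option must be between 0 and 5")
    return [
        sys.executable, "to_kmodel.py",
        "--target", target,
        "--model", model,
        "--dataset", dataset,
        "--input_width", str(width),
        "--input_height", str(height),
        "--ptq_option", str(ptq_option),
    ]


cmd = build_to_kmodel_cmd(
    "../../runs/pose/train/weights/best.onnx", "../calibration_pose", 320, 320
)
print(" ".join(cmd[1:]))
# To run it from test_yolo26/pose: subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems if a model path contains spaces.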
Deploying Models on k230 Using MicroPython#
Flashing Firmware and Installing CanMV IDE#
💡 Firmware Introduction: Please download the latest Daily Build firmware for your development board type to ensure that the latest features are supported, or compile the firmware yourself from the latest code. For the tutorial, see: Firmware Compilation.
Download and install CanMV IDE (Download link: CanMV IDE download), write code in the IDE and run it.
Model File Copy#
Connect the IDE and copy the converted model and test images to the CanMV/data directory. This path can be customized; you only need to update the corresponding path in your code.
YOLO26 Module#
The YOLO26 class integrates the five tasks of YOLO26: classification (classify), detection (detect), segmentation (segment), rotated object detection (obb), and keypoint detection (pose). It supports two inference modes, image (image) and video stream (video), and encapsulates the kmodel inference process of YOLO26.
Import Method
from libs.YOLO import YOLO26
Parameter Description
| Parameter Name | Description | Explanation | Type |
|---|---|---|---|
| task_type | Task Type | Supports five task types; options are 'classify'/'detect'/'segment'/'obb'/'pose' | str |
| mode | Inference Mode | Supports two inference modes; options are 'image'/'video'. 'image' runs inference on an image, 'video' runs inference on the real-time video stream captured by the camera | str |
| kmodel_path | kmodel Path | Path of the kmodel copied to the development board | str |
| labels | Class Label List | Label names for the different categories | list[str] |
| rgb888p_size | Inference Frame Resolution | Resolution of the current frame used for inference, such as [1920,1080], [1280,720], [640,640] | list[int] |
| model_input_size | Model Input Resolution | Input resolution used when the YOLO26 model was trained, such as [224,224], [320,320], [640,640] | list[int] |
| display_size | Display Resolution | Set when the inference mode is 'video'; supports hdmi ([1920,1080]) and lcd ([800,480]) | list[int] |
| conf_thresh | Confidence Threshold | Class confidence threshold for classification tasks, object confidence threshold for detection and segmentation tasks, such as 0.5 | float [0~1] |
| mask_thresh | Mask Threshold | Binarization threshold for segmenting the object inside the detection box in the segmentation task | float [0~1] |
| kp_num | Number of Keypoints | Number of keypoints in the keypoint detection task | int |
| kp_dim | Keypoint Dimension | Dimension of the keypoints in the keypoint detection task; only 2 and 3 are supported, determined by the trained model | int [2/3] |
| max_boxes_num | Maximum Number of Detection Boxes | Maximum number of detection boxes returned for one frame | int |
| debug_mode | Debug Mode | Whether timing is enabled; 0 means no timing, 1 means timing | int [0/1] |
Deploying Model to Implement Image Inference#
For image inference, refer to the following code and modify the parameters defined in __main__ to match your setup:
from libs.YOLO import YOLO26
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify it to your own test image, model path, label name, model input size, number of keypoints, and keypoint dimension
    img_path="/data/test.jpg"
    kmodel_path="/data/best.kmodel"
    labels = ['plate']
    model_input_size=[320,320]
    kp_num=4
    kp_dim=2
    confidence_threshold = 0.5
    nms_threshold=0.45
    img,img_ori=read_image(img_path)
    rgb888p_size=[img.shape[2],img.shape[1]]
    # Initialize YOLO26 model
    yolo=YOLO26(task_type="pose",mode="image",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,kp_num=kp_num,kp_dim=kp_dim,conf_thresh=confidence_threshold,max_boxes_num=100,debug_mode=0)
    yolo.config_preprocess()
    res=yolo.run(img)
    print(res)
    yolo.draw_result(res,img_ori)
    yolo.deinit()
    gc.collect()
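The kp_num and kp_dim parameters together describe how many values belong to each detected object's keypoints: kp_num * kp_dim in total. The sketch below groups a flat list of such values into per-point tuples. The flat layout and the helper name group_keypoints are assumptions made for illustration; the structure actually returned by yolo.run() is not specified in this section, so inspect print(res) on your board before relying on it. With kp_dim=3, the third value is typically a visibility or confidence score.

```python
def group_keypoints(flat, kp_num, kp_dim):
    """Group a flat list of keypoint values into kp_num tuples of kp_dim values."""
    if kp_dim not in (2, 3):
        raise ValueError("kp_dim must be 2 or 3")
    if len(flat) != kp_num * kp_dim:
        raise ValueError("expected %d values, got %d" % (kp_num * kp_dim, len(flat)))
    return [tuple(flat[i * kp_dim:(i + 1) * kp_dim]) for i in range(kp_num)]


# Made-up corner coordinates for one license plate (kp_num=4, kp_dim=2).
corners = group_keypoints([10, 20, 110, 22, 108, 60, 12, 58], kp_num=4, kp_dim=2)
print(corners)
```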
Deploying Model to Implement Video Inference#
For video inference, refer to the following code and modify the variables defined in __main__ to match your setup:
from libs.PipeLine import PipeLine
from libs.YOLO import YOLO26
from libs.Utils import *
import os,sys,gc
import ulab.numpy as np
import image
if __name__=="__main__":
    # This is only an example. For custom scenarios, please modify it to your own model path, label name, model input size, number of keypoints, and keypoint dimension
    kmodel_path="/data/best.kmodel"
    labels = ["plate"]
    model_input_size=[320,320]
    kp_num=4
    kp_dim=2
    # Add display mode, default is hdmi, optional hdmi/lcd/lt9611/st7701/hx8399, where hdmi defaults to lt9611, resolution 1920*1080; lcd defaults to st7701, resolution 800*480
    display_mode="lcd"
    rgb888p_size=[320,320]
    confidence_threshold = 0.5
    pl=PipeLine(rgb888p_size=rgb888p_size,display_mode=display_mode)
    pl.create()
    display_size=pl.get_display_size()
    # Initialize YOLO26 model
    yolo=YOLO26(task_type="pose",mode="video",kmodel_path=kmodel_path,labels=labels,rgb888p_size=rgb888p_size,model_input_size=model_input_size,display_size=display_size,kp_num=kp_num,kp_dim=kp_dim,conf_thresh=confidence_threshold,max_boxes_num=50,debug_mode=0)
    yolo.config_preprocess()
    while True:
        with ScopedTiming("total",1):
            img=pl.get_frame()
            res=yolo.run(img)
            yolo.draw_result(res,pl.osd_img)
            pl.show_image()
            gc.collect()
    yolo.deinit()
    pl.destroy()
kmodel Conversion Verification#
The model conversion script toolkits (test_yolov5/test_yolov8/test_yolo11/test_yolo26) downloaded for different models contain kmodel verification scripts.
YOLO26 folds the nms process into the model itself, so the output values are final results with concrete meaning, and cosine similarity is not a suitable way to measure them. The output shape is also small, so tiny numerical differences are amplified in the cosine computation. Use the test_***_onnx.py and test_***_kmodel.py scripts to check the actual inference results instead.
Note: Executing the verification script requires adding environment variables
linux:
# The paths in the commands below are those of the Python environment where nncase is installed. Please adapt them to your environment
export NNCASE_PLUGIN_PATH=$NNCASE_PLUGIN_PATH:/usr/local/lib/python3.9/site-packages/
export PATH=$PATH:/usr/local/lib/python3.9/site-packages/
source /etc/profile
windows:
Add the Lib/site-packages path under the Python environment where nncase is installed to the system variable Path of the environment variables.
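Because the site-packages path differs between environments, it can be convenient to let Python report it. The sketch below prints the two Linux export lines for the interpreter that runs it. The helper name nncase_env_lines is hypothetical, and it assumes nncase was installed with pip into that interpreter's default site-packages directory.

```python
import sysconfig


def nncase_env_lines(site_packages=None):
    """Return the Linux export lines for a given site-packages directory."""
    if site_packages is None:
        # platlib is where pip places packages with compiled components.
        site_packages = sysconfig.get_paths()["platlib"]
    return [
        "export NNCASE_PLUGIN_PATH=$NNCASE_PLUGIN_PATH:" + site_packages,
        "export PATH=$PATH:" + site_packages,
    ]


for line in nncase_env_lines():
    print(line)
```

Run it with the same Python that has nncase installed, then paste the printed lines into your shell or profile.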
Compare onnx Output and kmodel Output#
Generate Input bin File#
Navigate to the classify/detect/segment/obb/pose directory and execute the following command:
python save_bin.py --image ../test_images/test.jpg --input_width 224 --input_height 224
Executing the script will generate the bin files onnx_input_float32.bin and kmodel_input_uint8.bin in the current directory, which serve as input files for the onnx model and kmodel model.
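A quick sanity check on the generated files is to compare their sizes with what the input resolution implies. The sketch below assumes each file holds one raw 3-channel image with no header, stored as float32 (4 bytes per value) and uint8 (1 byte per value) respectively; that layout is an assumption, not something confirmed by save_bin.py, and both helper names are hypothetical.

```python
import os


def expected_bin_sizes(width, height, channels=3, batch=1):
    """Expected byte sizes, assuming raw headerless tensors (an assumption)."""
    elements = batch * channels * height * width
    return {
        "onnx_input_float32.bin": elements * 4,  # float32: 4 bytes per value
        "kmodel_input_uint8.bin": elements,      # uint8: 1 byte per value
    }


def check_bins(directory, width, height):
    """Report, per file, whether it exists and has the expected size."""
    report = {}
    for name, size in expected_bin_sizes(width, height).items():
        path = os.path.join(directory, name)
        report[name] = os.path.exists(path) and os.path.getsize(path) == size
    return report


print(expected_bin_sizes(224, 224))
```

A size mismatch usually points to input_width or input_height values that differ from the ones used to export the model.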
Compare Output#
Copy the converted models best.onnx and best.kmodel to the classify/detect/segment directory, then execute the verification script with the following command:
python simulate.py --model best.onnx --model_input onnx_input_float32.bin --kmodel best.kmodel --kmodel_input kmodel_input_uint8.bin --input_width 224 --input_height 224
The following output will be obtained:
output 0 cosine similarity : 0.9985673427581787
The script will sequentially compare the cosine similarity of the outputs. If the similarity is above 0.99, the model is generally considered usable; otherwise, actual inference testing is required, or the quantization parameters need to be changed to re-export the kmodel. If the model has multiple outputs, there will be multiple lines of similarity comparison information. For example, for a segmentation task, there are two outputs, and the similarity comparison information is as follows:
output 0 cosine similarity : 0.9999530911445618
output 1 cosine similarity : 0.9983288645744324
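For reference, cosine similarity between two flattened output tensors can be computed as below. This is a generic pure-Python sketch of the metric, not the code used inside simulate.py, and the sample values are made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equally sized vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # undefined for zero vectors; treated as no similarity here
    return dot / (norm_a * norm_b)


# Made-up outputs: float model versus quantized model.
onnx_out = [0.10, 0.70, 0.20]
kmodel_out = [0.11, 0.68, 0.21]
similarity = cosine_similarity(onnx_out, kmodel_out)
print("output 0 cosine similarity :", similarity)
```

The metric compares direction only, so two outputs that differ by a constant scale factor still score 1.0.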
ONNX Model Inference on Images#
Navigate to the classify/detect/segment/obb/pose directory. Taking the classification task as an example, open test_cls_onnx.py, modify the parameters in main() to fit your model, and then execute the command:
python test_cls_onnx.py
After the command executes successfully, the results will be saved to onnx_cls_results.jpg .
The detection task, segmentation task, oriented object detection task, and keypoint detection task are similar. Execute test_det_onnx.py, test_seg_onnx.py, test_obb_onnx.py, and test_pose_onnx.py respectively.
Kmodel Model Inference on Images#
Navigate to the classify/detect/segment/obb/pose directory. Taking the classification task as an example, open test_cls_kmodel.py, modify the parameters in main() to fit your model, and then execute the command:
python test_cls_kmodel.py
After the command executes successfully, the results will be saved to kmodel_cls_results.jpg .
The detection task, segmentation task, oriented object detection task, and keypoint detection task are similar. Execute test_det_kmodel.py, test_seg_kmodel.py, test_obb_kmodel.py, and test_pose_kmodel.py respectively.
Tuning Guide#
When the model does not perform well on the K230, consider tuning from aspects such as threshold settings, model size, input resolution, quantization method, and training data quality.
Adjust Thresholds#
Adjust the confidence threshold, nms threshold, and mask threshold to tune the deployment performance without changing the model. In detection tasks, raising the confidence threshold and lowering the nms threshold will reduce the number of detection boxes; conversely, lowering the confidence threshold and raising the nms threshold will increase the number of detection boxes. In segmentation tasks, the mask threshold affects the division of segmentation regions. You can adjust according to the actual scenario to find the threshold for better performance.
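The effect of the two detection thresholds can be seen in a small pure-Python sketch of confidence filtering followed by greedy NMS. It applies to models whose post-processing uses NMS; the boxes are made-up values in (x1, y1, x2, y2, score) form, and filter_boxes is a hypothetical helper, not the routine used in libs.YOLO.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2, ...) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_boxes(dets, conf_thresh, nms_thresh):
    """Drop low-score boxes, then greedily suppress overlapping ones."""
    candidates = sorted(
        (d for d in dets if d[4] >= conf_thresh), key=lambda d: d[4], reverse=True
    )
    kept = []
    for det in candidates:
        if all(iou(det, k) <= nms_thresh for k in kept):
            kept.append(det)
    return kept


dets = [
    (10, 10, 110, 110, 0.90),
    (20, 20, 120, 120, 0.80),    # overlaps the first box (IoU about 0.68)
    (200, 200, 260, 260, 0.30),  # separate, low-score box
]
strict = filter_boxes(dets, conf_thresh=0.5, nms_thresh=0.45)
loose = filter_boxes(dets, conf_thresh=0.2, nms_thresh=0.7)
print(len(strict), len(loose))
```

With the stricter pair of thresholds only the top-scoring box survives, while the looser pair keeps all three.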
Change Model#
Choose models of different sizes to balance speed, memory usage, and accuracy. You can select n/s/m/l models for training and conversion according to your actual needs.
Change Input Resolution#
Change the input resolution of the model to fit your scenario. A larger resolution may improve the deployment performance but will increase inference time.
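As a rough guide to the cost of a resolution change, the number of input pixels grows with width times height. The sketch below compares a few of the resolutions mentioned in this document against 320x320. It assumes that compute scales roughly with pixel count, which is only a rule of thumb for convolutional models; actual latency on the K230 should be measured. The helper name relative_pixel_cost is hypothetical.

```python
def relative_pixel_cost(size, baseline):
    """Ratio of input pixels between two [width, height] resolutions."""
    return (size[0] * size[1]) / (baseline[0] * baseline[1])


baseline = [320, 320]
for size in ([224, 224], [320, 320], [640, 640]):
    print(size, relative_pixel_cost(size, baseline))
```

Doubling both sides of the input quadruples the pixel count, which is why resolution changes tend to be felt strongly in inference time.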
Modify Quantization Method#
The model conversion script provides six quantization options (ptq_option 0 to 5): three data/weights precision combinations using uint8 or int16 quantization, each available with the NoClip or Kld calibration method.
In the kmodel conversion script, different quantization methods are specified by choosing different ptq_option values.
| ptq_option | data | weights | calibrate_method |
|---|---|---|---|
| 0 | uint8 | uint8 | NoClip |
| 1 | uint8 | int16 | NoClip |
| 2 | int16 | uint8 | NoClip |
| 3 | uint8 | uint8 | Kld |
| 4 | uint8 | int16 | Kld |
| 5 | int16 | uint8 | Kld |
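When comparing several quantized models, it can help to keep this table in a script so each kmodel is labeled with the settings that produced it. The sketch below encodes the table as a dictionary; the names PTQ_OPTIONS, describe_ptq, and options_with are hypothetical and are not part of to_kmodel.py.

```python
# ptq_option -> (data, weights, calibrate_method), copied from the table above.
PTQ_OPTIONS = {
    0: ("uint8", "uint8", "NoClip"),
    1: ("uint8", "int16", "NoClip"),
    2: ("int16", "uint8", "NoClip"),
    3: ("uint8", "uint8", "Kld"),
    4: ("uint8", "int16", "Kld"),
    5: ("int16", "uint8", "Kld"),
}


def describe_ptq(option):
    """Human-readable summary of one ptq_option value."""
    data, weights, method = PTQ_OPTIONS[option]
    return "ptq_option %d: data=%s, weights=%s, calibrate_method=%s" % (
        option, data, weights, method
    )


def options_with(data=None, weights=None, calibrate_method=None):
    """List the ptq_option values matching the given settings."""
    wanted = (data, weights, calibrate_method)
    return [
        option
        for option, settings in sorted(PTQ_OPTIONS.items())
        if all(w is None or w == s for w, s in zip(wanted, settings))
    ]


print(describe_ptq(0))
print(options_with(calibrate_method="Kld"))
```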
Improve Data Quality#
If the training results are poor, improve the dataset quality first: check the data volume, the balance of the data distribution, and the annotation quality, and review the training parameter settings.
Tuning Tips#
The impact of quantization parameters on performance is greater on YOLOv8 and YOLO11 than on YOLOv5; compare different quantized models to see the effect;
Input resolution has a greater impact on inference speed than model size;
Differences in the data distribution between training data and K230 camera data may affect the deployment performance. You can use K230 to collect some data and annotate it yourself for training;
