aibase_module_api_manual#

Overview#

This manual guides developers through building a complete AI inference workflow when developing AI Demos in MicroPython: loading the model, preprocessing, inference, obtaining the model output, and post-processing. The AIBase module encapsulates the inference process of a single model and wraps preprocessing, inference, and output acquisition inside the framework. When developing an AI application, users only need to configure preprocessing and implement post-processing.

API Introduction#

init#

Description

AIBase constructor. Initializes the AI program with the kmodel path, the model input resolution, the resolution of the image passed to the AI program, and the debug mode.

Syntax

from libs.AIBase import AIBase

aibase=AIBase(kmodel_path="**.kmodel", model_input_size=[224,224], rgb888p_size=[1280,720], debug_mode=0)

Parameters

Parameter Name	Description	Input / Output	Notes
kmodel_path	kmodel path	Input	Required
model_input_size	Model input resolution, list type, including width and height, e.g., [512,512]	Input	Required
rgb888p_size	Input image resolution of the AI program, list type, including width and height, e.g., [1280,720]	Input	Required
debug_mode	Debug timing mode, int type; 0 disables timing, a value greater than 0 (e.g., 1) enables timing	Input	Default is 0

Return Value

Return Value	Description	Notes
AIBase	AIBase instance	This class is generally used as a parent class; AI Demo classes for different scenarios are written as subclasses of it
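
In the example program later in this manual, `debug_mode` is passed on as the enable flag of `ScopedTiming` (`self.debug_mode > 0`). The sketch below is a hypothetical stand-in for such a scoped timer, written for standard Python so that the on/off behavior can be tried off-device. `ScopedTimingSketch` and its `elapsed_ms` attribute are illustrative names and are not the `libs.PipeLine` implementation.

```python
import time


class ScopedTimingSketch:
    """Hypothetical stand-in for a scoped timer; not the libs.PipeLine class."""

    def __init__(self, info="", enable=True):
        self.info = info
        self.enable = enable
        self.elapsed_ms = None  # stays None when timing is disabled

    def __enter__(self):
        if self.enable:
            self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if self.enable:
            self.elapsed_ms = (time.perf_counter() - self._start) * 1000.0
            print("%s took %.2f ms" % (self.info, self.elapsed_ms))
        return False  # never swallow exceptions raised inside the block


debug_mode = 1  # a value greater than 0 turns timing on
with ScopedTimingSketch("preprocess", debug_mode > 0) as timer:
    total = sum(range(1000))  # placeholder workload
```

With `debug_mode = 0` the same `with` block runs the workload but records and prints nothing.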

get_kmodel_inputs_num#

Description

Gets the number of inputs of the kmodel.

Syntax

aibase.get_kmodel_inputs_num()

Return Value

Return Value	Description
inputs_num	Number of kmodel inputs

get_kmodel_outputs_num#

Description

Gets the number of outputs of the kmodel.

Syntax

aibase.get_kmodel_outputs_num()

Return Value

Return Value	Description
outputs_num	Number of kmodel outputs

preprocess#

Description

Calls the ai2d preprocessing configured in the subclass. If the preprocessing cannot be implemented with ai2d, or no preprocessing is required, override this method in the subclass.

Syntax

aibase.preprocess(input_np)

Parameters

Parameter Name	Description	Input / Output	Notes
input_np	Input data in `ulab.numpy.ndarray` format, shape needs to be consistent with the configuration in `Ai2d.build`	Input	Required

Return Value

Return Value	Description
input_tensors	List of input tensors obtained after ai2d preprocessing

inference#

Description

Runs inference with the kmodel and returns the model outputs.

Syntax

results=aibase.inference()

Return Value

Return Value	Description
results	List of kmodel inference outputs, each output is in `ulab.numpy.ndarray` format

postprocess#

Description

Post-processing interface definition. Since different AI applications require different post-processing steps, this method needs to be overridden in the subclass.

Syntax

aibase.postprocess(results)

Return Value

Return Value	Description
None

run#

Description

Runs the complete single-model inference process: preprocessing, inference, obtaining the output, and post-processing. It returns the post-processed output, which is used for drawing results on the Display.

Syntax

aibase.run(input_np)

Parameters

Parameter Name	Description	Input / Output	Notes
input_np	Input data in `ulab.numpy.ndarray` format, shape needs to be consistent with the configuration in `Ai2d.build`	Input	Required

Return Value

Return Value	Description
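result	Post-processed output (the return value of `postprocess`), used for drawing results on the Display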
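
The order of the stages described for `run` can be sketched with a stand-in class in plain Python. This is an illustration under stated assumptions, not the `libs.AIBase` source: `AIBaseSketch` and `MaxValueApp` are hypothetical names, the "model" is a placeholder that doubles each value, and the way data is handed from one stage to the next is assumed.

```python
class AIBaseSketch:
    """Hypothetical stand-in that mirrors the documented method names only."""

    def preprocess(self, input_np):
        # The real class would run ai2d here; this stand-in just wraps the input.
        return [input_np]

    def inference(self, tensors):
        # Placeholder "model": returns one output per input tensor.
        return [[value * 2 for value in tensor] for tensor in tensors]

    def postprocess(self, results):
        # Meant to be overridden by the subclass.
        return None

    def run(self, input_np):
        # Chain the three stages and return the post-processed output.
        tensors = self.preprocess(input_np)
        results = self.inference(tensors)
        return self.postprocess(results)


class MaxValueApp(AIBaseSketch):
    def postprocess(self, results):
        # Task-specific step: reduce the first output to its maximum value.
        return max(results[0])


app = MaxValueApp()
result = app.run([1, 5, 3])
```

Only `postprocess` is overridden in the subclass here, which matches the division of work described in the Overview: the framework drives the stages, and the application supplies the task-specific parts.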

deinit#

Description

AIBase deinitialization method, destroys the kpu instance and releases memory.

Syntax

aibase.deinit()

Return Value

Return Value	Description
None

Example Program#

Attention

The AIBase class is rarely used on its own. It serves as the parent class for AI Demo application development and provides the basic interfaces. Subclasses inherit from AIBase and override certain methods according to the task type to implement a specific scenario. During development, the subclass also needs to define a draw_result method that draws results according to the task.

The following is a face detection example program:

from libs.PipeLine import PipeLine, ScopedTiming
from libs.AIBase import AIBase
from libs.AI2D import Ai2d
import os,sys,gc,time,random,utime
import ujson
from media.media import *
from time import *
import nncase_runtime as nn
import ulab.numpy as np
import image
import aidemo

# Custom face detection class, inheriting from the AIBase base class
class FaceDetectionApp(AIBase):
    def __init__(self, kmodel_path, model_input_size, anchors, confidence_threshold=0.5, nms_threshold=0.2, rgb888p_size=[224,224], display_size=[1920,1080], debug_mode=0):
        super().__init__(kmodel_path, model_input_size, rgb888p_size, debug_mode)  # Call the base class constructor
        self.kmodel_path = kmodel_path  # Path to the model file
        self.model_input_size = model_input_size  # Model input resolution
        self.confidence_threshold = confidence_threshold  # Confidence threshold
        self.nms_threshold = nms_threshold  # NMS (non-maximum suppression) threshold
        self.anchors = anchors  # Anchor data used for object detection
        self.rgb888p_size = [ALIGN_UP(rgb888p_size[0], 16), rgb888p_size[1]]  # Resolution of the image the sensor passes to the AI, with the width aligned up to a multiple of 16
        self.display_size = [ALIGN_UP(display_size[0], 16), display_size[1]]  # Display resolution, with the width aligned up to a multiple of 16
        self.debug_mode = debug_mode  # Debug mode flag
        self.ai2d = Ai2d(debug_mode)  # Instantiate Ai2d, used for model preprocessing
        self.ai2d.set_ai2d_dtype(nn.ai2d_format.NCHW_FMT, nn.ai2d_format.NCHW_FMT, np.uint8, np.uint8)  # Set the Ai2d input/output formats and data types

    # Configure preprocessing. pad and resize are used here; Ai2d supports crop/shift/pad/resize/affine. See /sdcard/app/libs/AI2D.py for the code
    def config_preprocess(self, input_image_size=None):
        with ScopedTiming("set preprocess config", self.debug_mode > 0):  # Timer, enabled when debug_mode is greater than 0
            ai2d_input_size = input_image_size if input_image_size else self.rgb888p_size  # Ai2d input size: defaults to the size the sensor passes to the AI; set input_image_size to override it
            top, bottom, left, right = self.get_padding_param()  # Get the padding parameters
            self.ai2d.pad([0, 0, 0, 0, top, bottom, left, right], 0, [104, 117, 123])  # Pad the edges
            self.ai2d.resize(nn.interp_method.tf_bilinear, nn.interp_mode.half_pixel)  # Resize the image
            self.ai2d.build([1,3,ai2d_input_size[1],ai2d_input_size[0]],[1,3,self.model_input_size[1],self.model_input_size[0]])  # Build the preprocessing pipeline

    # Custom post-processing for this task. results is the list of model output arrays; the face_det_post_process interface of the aidemo library is used here
    def postprocess(self, results):
        with ScopedTiming("postprocess", self.debug_mode > 0):
            post_ret = aidemo.face_det_post_process(self.confidence_threshold, self.nms_threshold, self.model_input_size[1], self.anchors, self.rgb888p_size, results)
            if len(post_ret) == 0:
                return post_ret
            else:
                return post_ret[0]

    # Draw the detection results on the screen
    def draw_result(self, pl, dets):
        with ScopedTiming("display_draw", self.debug_mode > 0):
            if dets:
                pl.osd_img.clear()  # Clear the OSD image
                for det in dets:
                    # Convert the detection box coordinates to the display resolution
                    x, y, w, h = map(lambda x: int(round(x, 0)), det[:4])
                    x = x * self.display_size[0] // self.rgb888p_size[0]
                    y = y * self.display_size[1] // self.rgb888p_size[1]
                    w = w * self.display_size[0] // self.rgb888p_size[0]
                    h = h * self.display_size[1] // self.rgb888p_size[1]
                    pl.osd_img.draw_rectangle(x, y, w, h, color=(255, 255, 0, 255), thickness=2)  # Draw the rectangle
            else:
                pl.osd_img.clear()

    # Compute the padding parameters
    def get_padding_param(self):
        dst_w = self.model_input_size[0]  # Model input width
        dst_h = self.model_input_size[1]  # Model input height
        ratio_w = dst_w / self.rgb888p_size[0]  # Width scaling ratio
        ratio_h = dst_h / self.rgb888p_size[1]  # Height scaling ratio
        ratio = min(ratio_w, ratio_h)  # Use the smaller scaling ratio
        new_w = int(ratio * self.rgb888p_size[0])  # Scaled width
        new_h = int(ratio * self.rgb888p_size[1])  # Scaled height
        dw = (dst_w - new_w) / 2  # Half of the width difference
        dh = (dst_h - new_h) / 2  # Half of the height difference
        top = int(round(0))
        bottom = int(round(dh * 2 + 0.1))
        left = int(round(0))
        right = int(round(dw * 2 - 0.1))
        return top, bottom, left, right

if __name__ == "__main__":
    # Display mode, "hdmi" by default; "hdmi" and "lcd" are available
    display_mode="hdmi"
    # Keep unchanged for k230; for k230d this can be changed to [640,360]
    rgb888p_size = [1920, 1080]

    if display_mode=="hdmi":
        display_size=[1920,1080]
    else:
        display_size=[800,480]
    # Set the model path and other parameters
    kmodel_path = "/sdcard/examples/kmodel/face_detection_320.kmodel"
    # Other parameters
    confidence_threshold = 0.5
    nms_threshold = 0.2
    anchor_len = 4200
    det_dim = 4
    anchors_path = "/sdcard/examples/utils/prior_data_320.bin"
    anchors = np.fromfile(anchors_path, dtype=np.float)
    anchors = anchors.reshape((anchor_len, det_dim))

    # Initialize the PipeLine, used for the image processing flow
    pl = PipeLine(rgb888p_size=rgb888p_size, display_size=display_size, display_mode=display_mode)
    pl.create()  # Create the PipeLine instance
    # Initialize the custom face detection instance
    face_det = FaceDetectionApp(kmodel_path, model_input_size=[320, 320], anchors=anchors, confidence_threshold=confidence_threshold, nms_threshold=nms_threshold, rgb888p_size=rgb888p_size, display_size=display_size, debug_mode=0)
    face_det.config_preprocess()  # Configure preprocessing

    try:
        while True:
            os.exitpoint()                      # Check for an exit signal
            with ScopedTiming("total",1):
                img = pl.get_frame()            # Get the current frame
                res = face_det.run(img)         # Run inference on the current frame
                face_det.draw_result(pl, res)   # Draw the results
                pl.show_image()                 # Show the results
                gc.collect()                    # Garbage collection
    except Exception as e:
        sys.print_exception(e)                  # Print the exception information
    finally:
        face_det.deinit()                       # Deinitialize
        pl.destroy()                            # Destroy the PipeLine instance
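
The arithmetic in `get_padding_param` and `draw_result` does not depend on the hardware, so it can be checked in standard Python. The sketch below restates both calculations as standalone functions; `scale_box` is a hypothetical helper name, and the sizes and the box are illustrative values rather than output from the example program.

```python
def get_padding_param(model_input_size, rgb888p_size):
    """Return (top, bottom, left, right), using the same arithmetic as the example."""
    dst_w, dst_h = model_input_size
    src_w, src_h = rgb888p_size
    ratio = min(dst_w / src_w, dst_h / src_h)  # smaller ratio keeps the aspect ratio
    new_w = int(ratio * src_w)
    new_h = int(ratio * src_h)
    dw = (dst_w - new_w) / 2
    dh = (dst_h - new_h) / 2
    # As in the example, all padding goes to the bottom and right edges.
    return 0, int(round(dh * 2 + 0.1)), 0, int(round(dw * 2 - 0.1))


def scale_box(det, rgb888p_size, display_size):
    """Map an (x, y, w, h) box from AI-image pixels to display pixels."""
    x, y, w, h = [int(round(v, 0)) for v in det[:4]]
    x = x * display_size[0] // rgb888p_size[0]
    y = y * display_size[1] // rgb888p_size[1]
    w = w * display_size[0] // rgb888p_size[0]
    h = h * display_size[1] // rgb888p_size[1]
    return x, y, w, h


# A 1280x720 frame fitted into a 320x320 model input scales to 320x180,
# which leaves 140 rows to pad at the bottom.
padding = get_padding_param([320, 320], [1280, 720])

# A box found on a 1280x720 AI image, redrawn on a 1920x1080 display.
box = scale_box([100.0, 50.0, 200.0, 80.0], [1280, 720], [1920, 1080])
```

When `rgb888p_size` and `display_size` are equal, as in the example program's HDMI configuration, `scale_box` leaves the coordinates unchanged.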

Development Tips#

For common data type conversions during development, examples are provided here for reference.

Tips:

Image object to ulab.numpy.ndarray:

import image
img.to_rgb888().to_numpy_ref() # The returned array is in HWC layout
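
The array returned here is in HWC layout, while the `Ai2d.build` call in the example program uses `[1,3,H,W]` shapes, i.e. a channel-first layout. The sketch below shows where the same element sits in a flat buffer under each layout. It uses plain Python lists instead of `ulab.numpy` so that it runs off-device, and the helper names are hypothetical.

```python
def hwc_offset(y, x, c, width, channels):
    """Flat index of element (y, x, c) when channels are interleaved per pixel."""
    return (y * width + x) * channels + c


def chw_offset(y, x, c, height, width):
    """Flat index of element (y, x, c) when each channel is a separate plane."""
    return (c * height + y) * width + x


def hwc_to_chw(flat_hwc, height, width, channels):
    """Reorder a flat HWC buffer into a flat CHW buffer."""
    flat_chw = [0] * (height * width * channels)
    for y in range(height):
        for x in range(width):
            for c in range(channels):
                src = hwc_offset(y, x, c, width, channels)
                dst = chw_offset(y, x, c, height, width)
                flat_chw[dst] = flat_hwc[src]
    return flat_chw


# 2x2 image with 3 channels; pixel i holds the values (i, i + 10, i + 20).
flat_hwc = [0, 10, 20, 1, 11, 21, 2, 12, 22, 3, 13, 23]
flat_chw = hwc_to_chw(flat_hwc, 2, 2, 3)
```

After the reorder, the four values of the first channel come first, followed by the second and third channel planes.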

ulab.numpy.ndarray to Image object:

import ulab.numpy as np
import image
img_np = np.zeros((height,width,4),dtype=np.uint8)
img = image.Image(width, height, image.ARGB8888, alloc=image.ALLOC_REF, data=img_np)

ulab.numpy.ndarray to tensor type:

import ulab.numpy as np
import nncase_runtime as nn
img_np = np.zeros((height,width,4),dtype=np.uint8)
tensor = nn.from_numpy(img_np)

tensor type to ulab.numpy.ndarray:

import ulab.numpy as np
import nncase_runtime as nn
img_np=tensor.to_numpy()
