Multi-Object Tracking (MOT) Application Development Guide

Attention

This sample uses the single-camera dual-channel development pattern. For ByteTrack and OCSort, refer to single_model_example.md. For DeepSORT and BoTSORT, refer to double_model_example.md.

Overview

Multi-object tracking (MOT) aims to detect multiple targets in a video sequence and keep a stable identity (ID) for each target across adjacent frames. A typical MOT pipeline includes:

  1. Object Detection: detect targets such as pedestrians or vehicles in each frame

  2. State Prediction: predict target motion across adjacent frames, usually with a Kalman filter

  3. Data Association: match current detections with existing tracks using motion, appearance, or both

  4. Track Management: create new tracks, update existing tracks, and remove lost tracks
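The state prediction step can be illustrated with a deliberately small example. The sketch below is a constant-velocity Kalman filter for a single coordinate, written in plain Python. It is not the filter used by the sample: real trackers usually keep a multi-dimensional box state, and the function names `kf_predict` and `kf_update` and the noise values `q` and `r` are assumptions chosen for illustration.

```python
def kf_predict(x, v, P, dt=1.0, q=1e-2):
    """Predict step for state (position x, velocity v) with 2x2 covariance P.

    Uses F = [[1, dt], [0, 1]] and process noise Q = q * I (assumed values).
    """
    x_pred = x + v * dt
    pxx, pxv, pvx, pvv = P[0][0], P[0][1], P[1][0], P[1][1]
    # P' = F P F^T + Q, expanded by hand for the 2x2 case
    n_xx = pxx + dt * (pxv + pvx) + dt * dt * pvv + q
    n_xv = pxv + dt * pvv
    n_vx = pvx + dt * pvv
    n_vv = pvv + q
    return x_pred, v, [[n_xx, n_xv], [n_vx, n_vv]]


def kf_update(x, v, P, z, r=1e-1):
    """Update step with a position-only measurement z (H = [1, 0], noise r)."""
    s = P[0][0] + r                      # innovation variance
    k0, k1 = P[0][0] / s, P[1][0] / s    # Kalman gain for position and velocity
    y = z - x                            # innovation (measurement residual)
    x_new = x + k0 * y
    v_new = v + k1 * y
    # P' = (I - K H) P
    P_new = [
        [(1 - k0) * P[0][0], (1 - k0) * P[0][1]],
        [P[1][0] - k1 * P[0][0], P[1][1] - k1 * P[0][1]],
    ]
    return x_new, v_new, P_new


# Usage: a target that moves one unit per frame, observed for 20 frames
x, v, P = 0.0, 0.0, [[1.0, 0.0], [0.0, 1.0]]
for z in range(1, 21):
    x, v, P = kf_predict(x, v, P)
    x, v, P = kf_update(x, v, P, float(z))
```

After the loop, the velocity estimate `v` has moved from its initial value of zero to roughly one unit per frame, which is what lets the tracker predict where the target will be before the next detection arrives.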

This sample supports DeepSORT, ByteTrack, OCSort, and BoTSORT. They make different tradeoffs among accuracy, robustness, compute cost, and dependence on appearance features.

Algorithm Introduction

DeepSORT

DeepSORT extends the classic SORT tracker. SORT relies only on motion information, while DeepSORT adds deep appearance (ReID) features, which improves identity consistency when targets are occluded and later reappear.

Core components:

  • Motion model

    • constant-velocity Kalman filter

    • state typically includes position, scale, aspect ratio, and their velocities

  • Appearance model

    • use a deep CNN to extract feature embeddings for each detected target

    • features are usually L2-normalized vectors

  • Data association

    • first gate candidate matches by their Mahalanobis distance to the Kalman prediction, discarding pairs that are too far apart

    • then run the Hungarian algorithm on a combined cost of motion distance and appearance distance

  • Track lifecycle

    • includes Tentative, Confirmed, and Deleted states

    • multiple successful matches are needed before a track becomes confirmed
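The gating and combined-cost steps listed above can be sketched in a few lines of Python. This is an illustration under several assumptions, not the sample's implementation: the covariance is treated as diagonal, the dictionary keys (`mean`, `var`, `box`, `feat`) and the weight `lam` are hypothetical, and a brute-force search over permutations stands in for the Hungarian algorithm (it finds the same optimum but is only practical for a handful of targets). The gate value is the 0.95 chi-square quantile for 4 degrees of freedom.

```python
import math
from itertools import permutations

GATE = 9.4877   # chi-square 0.95 quantile, 4 degrees of freedom
INF = 1e5       # cost assigned to gated-out pairs


def cosine_distance(a, b):
    """1 - cosine similarity; inputs are normalized here for safety."""
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(x * x for x in b)) or 1.0
    return 1.0 - sum((x / na) * (y / nb) for x, y in zip(a, b))


def mahalanobis_sq(z, mean, var):
    """Squared Mahalanobis distance, assuming a diagonal covariance."""
    return sum((zi - mi) ** 2 / vi for zi, mi, vi in zip(z, mean, var))


def build_cost(tracks, dets, lam=0.5):
    """Cost matrix: gated pairs get INF, others a weighted motion + appearance cost."""
    cost = []
    for t in tracks:
        row = []
        for d in dets:
            m = mahalanobis_sq(d["box"], t["mean"], t["var"])
            if m > GATE:
                row.append(INF)
            else:
                app = cosine_distance(t["feat"], d["feat"])
                row.append(lam * m / GATE + (1.0 - lam) * app)
        cost.append(row)
    return cost


def assign(cost):
    """Minimum-cost assignment by brute force (stand-in for Hungarian)."""
    n_t = len(cost)
    n_d = len(cost[0]) if cost else 0
    if n_t == 0 or n_d == 0:
        return []
    if n_t <= n_d:
        candidates = (list(enumerate(p)) for p in permutations(range(n_d), n_t))
    else:
        candidates = ([(t, d) for d, t in enumerate(p)]
                      for p in permutations(range(n_t), n_d))
    best_total, best_pairs = None, []
    for pairs in candidates:
        total = sum(cost[t][d] for t, d in pairs)
        if best_total is None or total < best_total:
            best_total, best_pairs = total, pairs
    # Drop pairs that were only chosen because nothing better was available
    return [(t, d) for t, d in best_pairs if cost[t][d] < INF]
```

Each returned pair is `(track_index, detection_index)`. Tracks and detections that do not appear in any pair are left for the lifecycle logic to handle, for example by starting a Tentative track or counting a miss.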

ByteTrack

ByteTrack is a modern MOT algorithm designed to maximize tracking performance without using appearance features. Its key idea is that low-confidence detections still contain useful motion information.

ByteTrack splits detections into:

  • high-score detections for reliable matching

  • low-score detections for recovering possibly lost tracks

Matching process:

  1. match tracks with high-score detections through IoU and Hungarian matching

  2. match unmatched tracks with low-score detections

  3. initialize new tracks only from high-score detections
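The three steps above can be sketched as follows. To keep the example short, a greedy best-IoU-first matcher is used in place of Hungarian matching, and track boxes are passed in directly rather than coming from a Kalman prediction. The thresholds `high`, `low`, and `min_iou` and all function names are assumptions for illustration, not values taken from the sample.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(track_boxes, det_boxes, min_iou):
    """Greedy IoU matching (simplified stand-in for Hungarian matching).

    Returns (pairs, unmatched_track_indices, unmatched_det_indices).
    """
    candidates = sorted(
        ((iou(t, d), ti, di)
         for ti, t in enumerate(track_boxes)
         for di, d in enumerate(det_boxes)),
        reverse=True,
    )
    used_t, used_d, pairs = set(), set(), []
    for score, ti, di in candidates:
        if score < min_iou:
            break  # remaining candidates overlap even less
        if ti in used_t or di in used_d:
            continue
        pairs.append((ti, di))
        used_t.add(ti)
        used_d.add(di)
    un_t = [i for i in range(len(track_boxes)) if i not in used_t]
    un_d = [i for i in range(len(det_boxes)) if i not in used_d]
    return pairs, un_t, un_d


def byte_associate(track_boxes, dets, high=0.5, low=0.1, min_iou=0.3):
    """dets is a list of (box, score). Returns (matches, lost_tracks, new_dets)."""
    hi = [i for i, (_, s) in enumerate(dets) if s >= high]
    lo = [i for i, (_, s) in enumerate(dets) if low <= s < high]

    # Step 1: all tracks against high-score detections
    p1, un_t, un_hi = greedy_match(track_boxes, [dets[i][0] for i in hi], min_iou)
    matches = [(t, hi[d]) for t, d in p1]

    # Step 2: still-unmatched tracks against low-score detections
    remaining = [track_boxes[t] for t in un_t]
    p2, un_t2, _ = greedy_match(remaining, [dets[i][0] for i in lo], min_iou)
    matches += [(un_t[t], lo[d]) for t, d in p2]

    # Step 3: only unmatched high-score detections may start new tracks
    lost_tracks = [un_t[t] for t in un_t2]
    new_dets = [hi[d] for d in un_hi]
    return matches, lost_tracks, new_dets
```

Unmatched low-score detections are simply dropped, which is how the second stage recovers weakly detected targets without letting background noise create new tracks.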

Main characteristics:

  • pure motion modeling

  • no ReID model required

  • low compute cost and simple deployment, since only a detection model is needed

OCSort

OCSort improves the SORT/ByteTrack-style tracker by addressing cases where Kalman prediction becomes inaccurate under sudden motion or camera movement.

It introduces observation-centric motion modeling, which relies more on recent observations than on long-term velocity estimates.

Key points:

  • estimate velocity from recent observations

  • improve robustness under sudden acceleration

  • improve robustness when the camera shakes or pans quickly

  • focus more on geometric consistency than on appearance cues
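The idea of estimating velocity from recent observations can be sketched as below. This is only an illustration of the observation-centric idea, not the complete OCSort algorithm and not the sample's code: the history format, the frame gap `delta`, and the function names are assumptions. The velocity is taken from two actual matched detections a few frames apart, and a candidate detection is then scored by how much its direction deviates from that velocity.

```python
import math


def center(box):
    """Center of a box given as (x1, y1, x2, y2)."""
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)


def observed_velocity(history, delta=3):
    """Velocity in pixels per frame from two real observations.

    history is a list of (frame_index, box) for matched detections, oldest
    first, with strictly increasing frame indices.
    """
    if len(history) < 2:
        return (0.0, 0.0)
    f_new, b_new = history[-1]
    # Prefer the most recent observation at least `delta` frames old;
    # fall back to the oldest one available.
    f_old, b_old = history[0]
    for f, b in reversed(history[:-1]):
        if f_new - f >= delta:
            f_old, b_old = f, b
            break
    dt = f_new - f_old
    (cx1, cy1), (cx0, cy0) = center(b_new), center(b_old)
    return ((cx1 - cx0) / dt, (cy1 - cy0) / dt)


def direction_cost(history, det_box, delta=3):
    """Angle in radians (0..pi) between the observed velocity and the
    direction from the last observation to the candidate detection."""
    vx, vy = observed_velocity(history, delta)
    cx, cy = center(history[-1][1])
    dx, dy = center(det_box)
    ux, uy = dx - cx, dy - cy
    if (vx == 0 and vy == 0) or (ux == 0 and uy == 0):
        return 0.0  # no direction information, so no penalty
    cos = (vx * ux + vy * uy) / (math.hypot(vx, vy) * math.hypot(ux, uy))
    return math.acos(max(-1.0, min(1.0, cos)))
```

A small angle means the candidate continues along the recently observed motion. A cost like this can be added to the IoU cost so that geometrically inconsistent matches are penalized even when the boxes overlap.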

BoTSORT

BoTSORT combines ideas from ByteTrack and DeepSORT. It aims to keep high speed while achieving stronger identity consistency.

It integrates:

  • ByteTrack-style high-score and low-score detection association

  • optional ReID appearance features

  • improved motion modeling compared with classic SORT

Its matching strategy usually includes:

  • a main association stage on high-confidence detections

  • a secondary association stage on low-confidence detections

  • IoU distance and optional appearance distance fusion
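One way to fuse the IoU distance with an optional appearance distance is sketched below. The rule shown (trust the appearance distance only when both distances are below their thresholds, then take the minimum) follows a commonly described BoT-SORT-style fusion; the threshold values, the parameter names, and the function name `fuse_cost` are assumptions, and the sample may use a different rule.

```python
def fuse_cost(iou_dist, cos_dist=None, theta_iou=0.5, theta_emb=0.25):
    """Fuse IoU distance (1 - IoU) with an optional cosine appearance distance.

    Both distances are expected in [0, 1]. When no ReID feature is available
    (cos_dist is None), the IoU distance is used on its own.
    """
    if cos_dist is None:
        return iou_dist
    if cos_dist < theta_emb and iou_dist < theta_iou:
        # Appearance is trusted only for similar-looking, nearby candidates
        appearance = 0.5 * cos_dist
    else:
        appearance = 1.0
    return min(iou_dist, appearance)


# Usage: the same geometric overlap with and without a good appearance match
with_reid = fuse_cost(0.3, 0.1)      # appearance agrees, so the cost drops
without_reid = fuse_cost(0.3)        # ReID disabled, IoU distance only
```

The fused value can fill the cost matrix for the main association stage, while the secondary stage on low-confidence detections can fall back to the IoU distance alone.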

Comparison and Application Scenarios

| Algorithm | ReID | Motion Emphasis | Complexity | Core Characteristics |
|---|---|---|---|---|
| DeepSORT | Yes | Kalman filter + appearance | High | strong ID stability under occlusion and re-identification |
| ByteTrack | No | Kalman filter + IoU | Low | fast, simple, and effective without appearance features |
| OCSort | No | observation-centric motion | Medium | more robust under detection jitter and unstable motion |
| BoTSORT | Yes | Kalman + IoU + ReID | High | stronger performance in complex scenes through multi-stage matching |

On K230, all four algorithms share the same application structure, so you can switch the detection model and tracking parameters without rebuilding the underlying media stack.

Build Code

From the RTOS SDK root:

make list-def
make ***_defconfig
make -j

After the firmware build completes, the image is generated in the output directory.

Build Method 1

After the code changes are ready, enter one of the algorithm directories under:

src/rtsmart/examples/ai/multi_object_tracking

and run:

./build_app.sh

Intermediate build files are generated in the build directory, and the files needed for deployment are collected in k230_bin.

Build Method 2

From the RTOS SDK root, run make menuconfig and enable:

RT-Smart UserSpace Examples Configuration
-> Enable build ai examples
-> Enable Build MOT(Multi-Object Tracking) Programs

Select the target algorithm, save the configuration, then run:

make -j

With this method, the deployment files are placed directly in the corresponding application directory under /sdcard/app/examples/ai/multi_object_tracking.

You can also enter the target directory directly and run:

make -j

This also supports incremental builds and places the outputs in k230_bin.

Board Deployment

Flash the firmware first. See:

how_to_flash

After boot, a virtual disk named CanMV is visible. Copy the generated elf, kmodel, and any required files from k230_bin to CanMV/sdcard.

Then connect to the board over serial and run:

run.sh

After startup, you should see the video output on the screen.

Reference deployment result: [figure: multi_object_tracking]

If you want to use an HDMI display, modify:

~/canmv_k230/src/rtsmart/examples/ai/multi_object_tracking/botsort_track_app/src/setting.h

Change:

#define DISPLAY_TYPE "st7701"

to:

#define DISPLAY_TYPE "lt9611"

Then rebuild the application.
