RVV Applications#
Overview#
RVV (RISC-V Vector) is the vector extension of the RISC-V ISA. K230 supports RVV and can use vector instructions for parallel computation, significantly improving data-processing performance.
Functional Description#
RVV Features#
RVV provides strong vector-compute capability:
SIMD-style computation
variable vector length
rich vector instructions for arithmetic, logic, load, and store
flexible data types, including integer and floating-point types
K230 RVV Support#
K230 provides:
vector length: 256-bit or 512-bit, depending on the specific implementation
vector registers: 32 vector registers (v0 to v31)
data width: 8, 16, 32, and 64 bits
scalar types: integer and floating-point
Main Advantages#
Using RVV provides:
higher performance through parallel computation
more concise code for data-parallel workloads
better energy efficiency compared with pure scalar execution
Application Scenarios#
RVV is suitable for:
image processing
audio processing
matrix operations
data copy
cryptography
DSP applications
Build Notes#
Enable RVV Support#
Add the RVV-related compiler flags to your build. In a Makefile:
CFLAGS += -march=rv64gcv -mabi=lp64d
In CMake (set CMAKE_C_FLAGS the same way for C sources):
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -march=rv64gcv -mabi=lp64d")
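One way to confirm that the flags took effect is to test the __riscv_vector macro, which GCC and Clang define when the V extension is enabled for the target. The sketch below is a minimal illustration; rvv_enabled and print_rvv_build_info are hypothetical helper names, not part of any K230 SDK.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 when this file was compiled with the
 * V extension enabled (for example -march=rv64gcv), 0 otherwise. */
int rvv_enabled(void) {
#if defined(__riscv_vector)
    return 1;
#else
    return 0;
#endif
}

/* Prints a one-line build report; intended to be called once at startup. */
void print_rvv_build_info(void) {
    printf("RVV compiled in: %s\n", rvv_enabled() ? "yes" : "no");
}
```

If the report says "no" on the target, the -march string is probably not reaching the compiler for that source file.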
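Intrinsic Functions#
RVV intrinsics are declared in riscv_vector.h. The examples in this document use the un-prefixed intrinsic names; toolchains that implement version 0.11 or later of the RVV intrinsics API spell the same functions with a __riscv_ prefix (for example __riscv_vsetvl_e32m4). Include the header in your source code: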
#include <riscv_vector.h>
Usage#
Basic RVV Usage#
Vector configuration and load#
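#include <riscv_vector.h>
void vector_add_example(float* a, float* b, float* c, int n) {
    // Process the arrays in strips; each pass handles vl elements.
    while (n > 0) {
        size_t vl = vsetvl_e32m4((size_t)n);          // elements granted for this pass
        vfloat32m4_t va = vle32_v_f32m4(a, vl);
        vfloat32m4_t vb = vle32_v_f32m4(b, vl);
        vfloat32m4_t vc = vfadd_vv_f32m4(va, vb, vl);
        vse32_v_f32m4(c, vc, vl);
        a += vl; b += vl; c += vl;                    // advance to the next strip
        n -= (int)vl;
    }
}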
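A single vsetvl call grants at most VLMAX elements, so arrays longer than one register group are normally processed in strips. The plain-C model below needs no RVV hardware and only illustrates how the strips cover n elements. It assumes vsetvl returns min(remaining, VLMAX), which is one behavior the specification allows; real hardware may balance the last two strips differently. model_vsetvl and strip_mined_add are hypothetical names.

```c
#include <stddef.h>

/* Simplified model of vsetvl: grant at most vlmax elements per pass.
 * Assumption: real hardware may split the final two strips differently. */
static size_t model_vsetvl(size_t avl, size_t vlmax) {
    return avl < vlmax ? avl : vlmax;
}

/* Scalar stand-in for a strip-mined vector add.
 * Returns the number of passes (strips) needed to cover all n elements. */
size_t strip_mined_add(const float *a, const float *b, float *c,
                       size_t n, size_t vlmax) {
    size_t passes = 0;
    if (vlmax == 0)
        return 0;                          /* nothing can be processed */
    while (n > 0) {
        size_t vl = model_vsetvl(n, vlmax);
        for (size_t i = 0; i < vl; ++i)    /* stands in for one vector add */
            c[i] = a[i] + b[i];
        a += vl; b += vl; c += vl;         /* advance to the next strip */
        n -= vl;
        ++passes;
    }
    return passes;
}
```

With n = 10 and vlmax = 4 the model runs three passes of 4, 4, and 2 elements, so the tail is handled by the same loop without a separate scalar cleanup.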
Vector-width setting#
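LMUL is the register grouping factor: it sets how many vector registers a single operand spans. The m1, m2, m4, or m8 suffix of the type selects it:
vfloat32m1_t v1 = ...;  // LMUL = 1: one vector register
vfloat32m2_t v2 = ...;  // LMUL = 2: group of two registers
vfloat32m4_t v4 = ...;  // LMUL = 4: group of four registers
vfloat32m8_t v8 = ...;  // LMUL = 8: group of eight registers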
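The number of elements one operand can hold follows the relation VLMAX = LMUL × VLEN / SEW from the RISC-V vector specification, where VLEN is the register width in bits and SEW the element width in bits. A small sketch of that arithmetic, using a hypothetical helper name (rvv_vlmax) and integer LMUL values only:

```c
#include <stddef.h>

/* VLMAX = LMUL * VLEN / SEW for integer LMUL (1, 2, 4, 8).
 * vlen_bits: bits per vector register; sew_bits: element width in bits. */
size_t rvv_vlmax(size_t vlen_bits, size_t sew_bits, size_t lmul) {
    if (sew_bits == 0)
        return 0;                 /* avoid division by zero on bad input */
    return lmul * vlen_bits / sew_bits;
}
```

For a 256-bit vector length, rvv_vlmax(256, 32, 1) gives 8 floats for vfloat32m1_t and rvv_vlmax(256, 32, 4) gives 32 floats for vfloat32m4_t.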
Conditional handling#
Use vector masks for conditional processing:
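// c[i] = (a[i] > b[i]) ? a[i] : b[i]
void vector_conditional_example(float* a, float* b, float* c, int n) {
    while (n > 0) {
        size_t vl = vsetvl_e32m4((size_t)n);
        vfloat32m4_t va = vle32_v_f32m4(a, vl);
        vfloat32m4_t vb = vle32_v_f32m4(b, vl);
        // The mask type is vbool(SEW/LMUL)_t, so 32/4 = 8 for f32m4.
        vbool8_t mask = vmfgt_vv_f32m4_b8(va, vb, vl);
        // Take va where the mask is set, vb elsewhere. The un-prefixed API puts the
        // mask first; __riscv_-prefixed versions take it after the two operands.
        vfloat32m4_t vc = vmerge_vvm_f32m4(mask, vb, va, vl);
        vse32_v_f32m4(c, vc, vl);
        a += vl; b += vl; c += vl; n -= (int)vl;
    }
}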
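A scalar reference is a convenient way to check a masked kernel like the one above. The sketch below mirrors the same two steps, a greater-than compare that produces one mask flag per element and a merge that selects between the inputs. It assumes the intended result is a[i] where a[i] > b[i] and b[i] otherwise; ref_conditional is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>

/* Scalar reference for compare-then-merge: c[i] = a[i] > b[i] ? a[i] : b[i]. */
void ref_conditional(const float *a, const float *b, float *c, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        bool mask = a[i] > b[i];      /* compare step: one flag per element */
        c[i] = mask ? a[i] : b[i];    /* merge step: a where the flag is set */
    }
}
```

Running both versions on the same input and comparing the outputs element by element catches swapped merge operands, which are an easy mistake to make.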
Reduction operations#
Use reduction operations for accumulation:
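float vector_sum_example(float* a, int n) {
    // Element 0 of acc carries the running sum from one pass to the next.
    vfloat32m1_t acc = vfmv_v_f_f32m1(0.0f, 1);
    while (n > 0) {
        size_t vl = vsetvl_e32m4((size_t)n);
        vfloat32m4_t va = vle32_v_f32m4(a, vl);
        // Result is an m1 vector; the un-prefixed API takes a destination operand first.
        acc = vfredosum_vs_f32m4_f32m1(acc, va, acc, vl);
        a += vl; n -= (int)vl;
    }
    return vfmv_f_s_f32m1_f32(acc);  // read element 0 back as a scalar
}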
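vfredosum is the ordered sum reduction: it adds the elements in index order, so it matches a plain scalar loop. That matters because floating-point addition is not associative, and an unordered reduction (vfredusum in RVV 1.0) is free to regroup the additions. The plain-C sketch below shows how the grouping alone can change the result; both function names are hypothetical and the even/odd split is just one example of a regrouping.

```c
#include <stddef.h>

/* Sums in index order, like an ordered reduction. The volatile accumulator
 * forces a round to float after every add, independent of compiler settings. */
float ordered_sum(const float *a, size_t n) {
    volatile float acc = 0.0f;
    for (size_t i = 0; i < n; ++i)
        acc = acc + a[i];
    return acc;
}

/* Sums even-indexed and odd-indexed elements separately, then combines them:
 * one of many regroupings an unordered reduction might use. */
float split_sum(const float *a, size_t n) {
    volatile float even = 0.0f, odd = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        if (i % 2 == 0)
            even = even + a[i];
        else
            odd = odd + a[i];
    }
    return even + odd;
}
```

For the input {1e8f, 1.0f, -1e8f}, the ordered sum loses the 1.0f when it is added to 1e8f, while the split sum cancels the two large values first and keeps it. Use the ordered form when results must match a scalar reference bit for bit.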
Performance Optimization Suggestions#
use the largest practical register grouping (LMUL) for the workload; larger groups process more elements per instruction but leave fewer register groups available
keep memory aligned to improve load/store efficiency
combine RVV with loop unrolling where appropriate
reduce scalar fallback code whenever possible
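For the alignment suggestion, buffers can be allocated with the C11 aligned_alloc function. The sketch below is one possible approach; the 64-byte alignment is an assumption chosen for illustration, not a documented K230 requirement, and alloc_aligned_floats and is_aligned are hypothetical helper names.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define BUF_ALIGN 64u   /* assumed alignment in bytes; adjust for the target */

/* Allocates room for at least count floats on a BUF_ALIGN boundary.
 * aligned_alloc requires the size to be a multiple of the alignment,
 * so the byte count is rounded up. Release the buffer with free(). */
float *alloc_aligned_floats(size_t count) {
    size_t bytes = count * sizeof(float);
    size_t rounded = (bytes + BUF_ALIGN - 1) / BUF_ALIGN * BUF_ALIGN;
    if (rounded == 0)
        rounded = BUF_ALIGN;          /* avoid a zero-size request */
    return (float *)aligned_alloc(BUF_ALIGN, rounded);
}

/* Returns 1 when p sits on an align-byte boundary, 0 otherwise. */
int is_aligned(const void *p, size_t align) {
    return ((uintptr_t)p % align) == 0;
}
```

Checking is_aligned on the pointers passed to a vector kernel during development is a cheap way to notice buffers that came from an unaligned source.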
Tip
RVV programming has a learning curve. Start with simple examples and work up to the wider set of RVV instructions and usage patterns. For details, refer to the RISC-V vector extension specification.
Tip
On K230, RVV can significantly improve performance in scenarios such as image processing and audio processing. For best results, optimize it together with K230 hardware features such as DMA and cache behavior.
