Audio Module API Manual#
Overview#
This manual describes the CanMV audio module and shows developers how to implement audio capture and playback through its Python API.
API Introduction#
wave#
The wave module provides a simple way to read and process WAV files.
The wave.open function opens a WAV file and returns the corresponding class object.
The wave.Wave_read class provides methods to obtain metadata of a WAV file (such as sample rate, sample points, number of channels, and sample precision) and to read WAV audio data from the file.
The wave.Wave_write class provides methods to set metadata of a WAV file (such as sample rate, sample points, number of channels, and sample precision) and to save PCM audio data to a WAV file.
When used in combination with the pyaudio module, this module can easily implement playback and capture of WAV file audio, as well as saving WAV audio files.
open#
Description
Opens a WAVE file for reading or writing audio data.
Syntax
open(f, mode=None)
Parameters
| Parameter Name | Description | Input / Output |
|---|---|---|
| f | File name | Input |
| mode | Open mode ('r', 'rb', 'w', 'wb') | Input |
Return Value
| Return Value | Description |
|---|---|
| Wave_read or Wave_write class object | Success |
| Other | Failure, raises an exception |
wave.Wave_read#
The Wave_read class provides methods to obtain metadata of a WAV file (such as sample rate, sample points, number of channels, and sample precision) and to read WAV audio data from the file.
get_channels#
Description
Gets the number of channels.
Syntax
get_channels()
Parameters
None
Return Value
| Return Value | Description |
|---|---|
| >0 | Success |
| 0 | Failure |
get_sampwidth#
Description
Gets the sample width in bytes.
Syntax
get_sampwidth()
Parameters
None
Return Value
| Return Value | Description |
|---|---|
| >0 (valid values 1, 2, 3, 4, corresponding to a sample precision of 8, 16, 24, 32 bits) | Success |
| 0 | Failure |
get_framerate#
Description
Gets the sample rate.
Syntax
get_framerate()
Parameters
None
Return Value
| Return Value | Description |
|---|---|
| >0 (valid range 8000 to 192000 Hz) | Success |
| 0 | Failure |
read_frames#
Description
Reads frame data.
Syntax
read_frames(nframes)
Parameters
| Parameter Name | Description | Input / Output |
|---|---|---|
| nframes | Number of frames to read; each frame occupies (number of channels × sample precision in bits / 8) bytes | Input |
Return Value
| Return Value | Description |
|---|---|
| bytes | Byte sequence containing the frame data that was read |
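The relationship between a frame count and the size of the returned byte sequence can be sketched in plain Python. This is a minimal sketch that assumes the usual PCM convention of one sample per channel in each frame; the helper names `frame_bytes` and `buffer_bytes` are illustrative and are not part of the CanMV API.

```python
def frame_bytes(channels, sampwidth):
    """Bytes occupied by one frame: one sample of `sampwidth` bytes per channel."""
    return channels * sampwidth


def buffer_bytes(nframes, channels, sampwidth):
    """Expected length in bytes of a buffer holding `nframes` complete frames."""
    return nframes * frame_bytes(channels, sampwidth)


# Stereo audio with 16-bit samples (sampwidth = 2 bytes).
print(frame_bytes(channels=2, sampwidth=2))
# One chunk of 1764 frames with the same layout.
print(buffer_bytes(nframes=1764, channels=2, sampwidth=2))
```

A helper like this is handy for sanity-checking the length of the data returned by a read against the number of frames that was requested.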
wave.Wave_write#
The Wave_write class provides methods to set metadata for WAV files (such as sample rate, sample points, number of channels, and sample precision) and to save PCM audio data to WAV files.
set_channels#
Description
Set the number of channels.
Syntax
set_channels(nchannels)
Parameters
| Parameter Name | Description | Input / Output |
|---|---|---|
| nchannels | Number of channels | Input |
Return Value
None
set_sampwidth#
Description
Set the sample byte width.
Syntax
set_sampwidth(sampwidth)
Parameters
| Parameter Name | Description | Input / Output |
|---|---|---|
| sampwidth | Sample byte width; valid values 1, 2, 3, 4, corresponding to a sample precision of 8, 16, 24, 32 bits | Input |
Return Value
None
set_framerate#
Description
Set the sample rate.
Syntax
set_framerate(framerate)
Parameters
| Parameter Name | Description | Input / Output |
|---|---|---|
| framerate | Sample rate, valid range 8000 to 192000 Hz | Input |
Return Value
None
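Because the setters return nothing, it can help to check the metadata before passing it on. The following is a minimal sketch of such a check, using only the value ranges stated in this manual. The function `check_wav_params` and the constant names are hypothetical, and whether the CanMV setters perform similar validation themselves is not stated here.

```python
# Sample byte width -> sample precision in bits, as listed for set_sampwidth.
SAMPWIDTH_TO_BITS = {1: 8, 2: 16, 3: 24, 4: 32}
MIN_FRAMERATE = 8000
MAX_FRAMERATE = 192000


def check_wav_params(nchannels, sampwidth, framerate):
    """Validate WAV metadata and return the sample precision in bits."""
    if nchannels <= 0:
        raise ValueError("nchannels must be positive")
    if sampwidth not in SAMPWIDTH_TO_BITS:
        raise ValueError("sampwidth must be 1, 2, 3, or 4 bytes")
    if not MIN_FRAMERATE <= framerate <= MAX_FRAMERATE:
        raise ValueError("framerate must be within 8000..192000")
    return SAMPWIDTH_TO_BITS[sampwidth]


# Stereo, 2-byte samples, 44.1 kHz: prints the precision in bits.
print(check_wav_params(nchannels=2, sampwidth=2, framerate=44100))
```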
write_frames#
Description
Write audio data.
Syntax
write_frames(data)
Parameters
| Parameter Name | Description | Input / Output |
|---|---|---|
| data | Audio data (bytes byte sequence) | Input |
Return Value
None
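The order of operations (set the metadata, write the PCM data, close the file) can be tried out on a desktop machine before moving to the board. The sketch below uses the `wave` module from the CPython standard library as a stand-in, not `media.wave`. The standard-library method names (`setnchannels`, `setsampwidth`, `setframerate`, `writeframes`) differ from the CanMV names documented above, and the helpers `write_wav` and `read_wav_params` are illustrative only.

```python
import io
import wave  # CPython standard library, used here as a desktop stand-in


def write_wav(pcm, channels, sampwidth, framerate):
    """Write PCM bytes to an in-memory WAV file and return the file contents."""
    buf = io.BytesIO()
    w = wave.open(buf, "wb")
    w.setnchannels(channels)    # counterpart of set_channels
    w.setsampwidth(sampwidth)   # counterpart of set_sampwidth
    w.setframerate(framerate)   # counterpart of set_framerate
    w.writeframes(pcm)          # counterpart of write_frames
    w.close()                   # finalizes the header; buf stays open
    return buf.getvalue()


def read_wav_params(wav_bytes):
    """Return (channels, sampwidth, framerate, nframes) of a WAV file image."""
    r = wave.open(io.BytesIO(wav_bytes), "rb")
    params = (r.getnchannels(), r.getsampwidth(), r.getframerate(), r.getnframes())
    r.close()
    return params


silence = bytes(4 * 100)  # 100 frames of stereo 16-bit silence
wav_bytes = write_wav(silence, channels=2, sampwidth=2, framerate=44100)
print(read_wav_params(wav_bytes))
```

Reading the metadata back is a quick way to confirm that the header matches the PCM data that was written.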
pyaudio#
The pyaudio module captures and plays binary PCM audio data. To play WAV files or save captured data as WAV files, use it in combination with the wave module. See the Example Programs section for details.
pyaudio.PyAudio#
The PyAudio class manages multiple audio input and output channels. Each channel is represented as a Stream object.
open#
Description
Open a Stream.
Syntax
open(*args, **kwargs)
Parameters
Variable parameters, refer to [Stream.__init__].
Return Value
| Return Value | Description |
|---|---|
| Stream object | Success |
| Other | Failure, raises an exception |
close#
Description
Close a Stream.
Syntax
close(stream)
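Parameters
| Parameter Name | Description | Input / Output |
|---|---|---|
| stream | Stream object to close | Input |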
Return Value
None
Note
This function calls the close method of the Stream object and removes the Stream object from the PyAudio object. Calling it is therefore optional: you can call the Stream.close method directly instead.
terminate#
Description
Release audio resources. When PyAudio is no longer used, this function must be called to release audio resources. If a vb block is applied for in the default constructor, the vb block should be released in this function.
Syntax
terminate()
Parameters
None
Return Value
None
pyaudio.Stream#
The Stream class object is used to manage an audio input or output path.
__init__#
Description
Constructor.
Syntax
__init__(
PA_manager,
rate,
channels,
format,
input=False,
output=False,
input_device_index=None,
output_device_index=None,
enable_codec=True,
frames_per_buffer=1024,
start=True,
stream_callback=None)
Parameters
| Parameter Name | Description | Input / Output |
|---|---|---|
| PA_manager | PyAudio class object | Input |
| rate | Sampling rate | Input |
| channels | Number of channels | Input |
| format | Sample format, which determines the sample size (the examples use paInt16; paInt24 and paInt32 are also listed as supported) | Input |
| input | Whether it is an audio input, default value is False | Input |
| output | Whether it is an audio output, default value is False | Input |
| input_device_index | Input path index [0,1], default value is None (use the default path 0). 0: I2S path (the specific link is determined by enable_codec: when enabled, it is the analog path of the built-in audio codec; when disabled, it is the I2S digital path); 1: PDM digital path | Input |
| output_device_index | Output path index [0,1], default value is None (use the default path 0). 0: I2S path (the specific link is determined by enable_codec: when enabled, it is the analog path of the built-in audio codec; when disabled, it is the I2S digital path); 1: fixed as I2S digital path | Input |
| enable_codec | Whether to enable the built-in audio codec, default value is True | Input |
| frames_per_buffer | Number of frames per buffer, default value is 1024 | Input |
| start | Whether to start immediately, default value is True | Input |
| stream_callback | Input / output callback function, default value is None | Input |
Return Value
None
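The way input_device_index, output_device_index, and enable_codec select a hardware path can be summarized as a small lookup. The sketch below only restates the parameter descriptions above in plain Python. The function `describe_path` is hypothetical and does not exist in the pyaudio module.

```python
def describe_path(direction, device_index=None, enable_codec=True):
    """Name the audio path selected by a device index and enable_codec.

    direction is "input" or "output"; device_index None means the default path 0.
    """
    if direction not in ("input", "output"):
        raise ValueError("direction must be 'input' or 'output'")
    index = 0 if device_index is None else device_index
    if index == 0:
        # Path 0 is the I2S path; enable_codec picks the concrete link.
        if enable_codec:
            return "analog path of the built-in audio codec"
        return "I2S digital path"
    if index == 1:
        # Path 1 is PDM for input and fixed I2S digital for output.
        return "PDM digital path" if direction == "input" else "I2S digital path"
    raise ValueError("device index must be 0 or 1")


print(describe_path("input"))
print(describe_path("input", device_index=1))
print(describe_path("output", device_index=0, enable_codec=False))
```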
start_stream#
Description
Start the stream.
Syntax
start_stream()
Parameters
None
Return Value
None
stop_stream#
Description
Stop the stream.
Syntax
stop_stream()
Parameters
None
Return Value
None
read#
Description
Read audio data.
Syntax
read(frames)
Parameters
| Parameter Name | Description | Input / Output |
|---|---|---|
| frames | Number of frames | Input |
Return Value
| Return Value | Description |
|---|---|
| bytes | The read audio data |
write#
Description
Write audio data.
Syntax
write(data)
Parameters
| Parameter Name | Description | Input / Output |
|---|---|---|
| data | Audio data (bytes byte sequence) | Input |
Return Value
None
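To experiment with playback without a WAV file, a PCM buffer can be generated in code. The sketch below builds a sine tone as interleaved, little-endian, signed 16-bit samples, which is the layout used by 16-bit WAV PCM data. Whether an output stream expects exactly this layout is an assumption, and the helper `sine_pcm16` is illustrative rather than part of the pyaudio module.

```python
import math
import struct


def sine_pcm16(freq_hz, rate, nframes, channels=2, amplitude=0.5):
    """Return `nframes` frames of a sine tone as 16-bit PCM bytes.

    The same sample is repeated for every channel of a frame (interleaved).
    """
    peak = int(amplitude * 32767)
    out = bytearray()
    for n in range(nframes):
        sample = int(peak * math.sin(2 * math.pi * freq_hz * n / rate))
        out += struct.pack("<h", sample) * channels  # little-endian int16
    return bytes(out)


# One chunk of 1764 stereo frames of a 440 Hz tone at 44.1 kHz.
chunk = sine_pcm16(freq_hz=440, rate=44100, nframes=1764)
print(len(chunk))
```

On the device, a buffer built this way would be the kind of bytes object passed as the data argument.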
volume#
Description
Get or set the volume.
Syntax
volume(vol = None, channel = LEFT_RIGHT)
Parameters
| Parameter Name | Description | Input / Output |
|---|---|---|
| vol | Volume value to set; when omitted (None), the current volume is returned instead | Input |
| channel | Channel selection: LEFT (left channel), RIGHT (right channel), LEFT_RIGHT (left and right channels) | Input |
Return Value
| Return Value | Description |
|---|---|
| None | When setting the volume |
| tuple | When getting the volume |
enable_audio3a#
Description
Enable audio 3a.
Syntax
enable_audio3a(audio3a_value)
Parameters
| Parameter Name | Description | Input / Output |
|---|---|---|
| audio3a_value | Audio 3a enable items: AUDIO_3A_ENABLE_ANS (audio noise suppression), AUDIO_3A_ENABLE_AGC (automatic gain control), AUDIO_3A_ENABLE_AEC (echo cancellation) | Input |
Return Value
None
audio3a_send_far_echo_frame#
Description
Send the far-end reference audio (i.e., the audio played by the near-end speaker), only used in the echo cancellation (AEC) scenario in audio 3a.
Syntax
audio3a_send_far_echo_frame(frame_data,data_len)
Parameters
| Parameter Name | Description | Input / Output |
|---|---|---|
| frame_data | Far-end reference audio data (bytes byte sequence) | Input |
| data_len | Data length | Input |
Return Value
None
Example Programs#
Audio Capture and Save as WAV File Example#
```python
import os
from media.media import *    # media module, used to initialize the vb buffer
from media.pyaudio import *  # pyaudio module, used to capture and play audio
import media.wave as wave    # wave module, used to save and load WAV audio files

def exit_check():
    try:
        os.exitpoint()
    except KeyboardInterrupt as e:
        print("user stop: ", e)
        return True
    return False

def record_audio(filename, duration):
    CHUNK = 44100//25  # audio chunk size in frames
    FORMAT = paInt16   # sample precision: 16 bit (paInt16) / 24 bit (paInt24) / 32 bit (paInt32)
    CHANNELS = 2       # number of channels: mono (1) / stereo (2)
    RATE = 44100       # sample rate

    try:
        p = PyAudio()
        MediaManager.init()  # initialize the vb buffer

        # Create the audio input stream
        stream = p.open(format=FORMAT,
                        channels=CHANNELS,
                        rate=RATE,
                        input=True,
                        frames_per_buffer=CHUNK)

        stream.volume(vol=70, channel=LEFT)
        stream.volume(vol=85, channel=RIGHT)
        print("volume :", stream.volume())

        # Enable audio 3A: automatic noise suppression (ANS)
        stream.enable_audio3a(AUDIO_3A_ENABLE_ANS)

        frames = []
        # Capture audio data and collect it in a list
        for i in range(0, int(RATE / CHUNK * duration)):
            data = stream.read()
            frames.append(data)
            if exit_check():
                break

        # Save the collected data to a WAV file
        wf = wave.open(filename, 'wb')               # create the WAV file
        wf.set_channels(CHANNELS)                    # set the number of channels
        wf.set_sampwidth(p.get_sample_size(FORMAT))  # set the sample width
        wf.set_framerate(RATE)                       # set the sample rate
        wf.write_frames(b''.join(frames))            # write the audio data
        wf.close()                                   # close the WAV file
    except BaseException as e:
        print(f"Exception {e}")
    finally:
        stream.stop_stream()   # stop capturing audio data
        stream.close()         # close the audio input stream
        p.terminate()          # release the audio object
        MediaManager.deinit()  # release the vb buffer

if __name__ == "__main__":
    os.exitpoint(os.EXITPOINT_ENABLE)
    print("audio example start")
    record_audio('/sdcard/examples/test.wav', 5)  # record a WAV file
```
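The chunk arithmetic in the capture example (a chunk of `44100//25` frames and `int(RATE / CHUNK * duration)` reads) can be checked on its own. The sketch below is plain Python with hypothetical helper names. It also assumes that each read returns one chunk of frames, which the example implies but does not state.

```python
def chunk_frames(rate, chunks_per_second=25):
    """Frames per chunk when one second of audio is split into equal chunks."""
    return rate // chunks_per_second


def reads_needed(rate, chunk, duration_s):
    """Number of chunk reads used by the capture loop for `duration_s` seconds."""
    return int(rate / chunk * duration_s)


rate = 44100
chunk = chunk_frames(rate)                 # frames per read
chunk_ms = 1000 * chunk / rate             # duration of one chunk in milliseconds
total_reads = reads_needed(rate, chunk, 5) # reads for a 5 second recording
print(chunk, chunk_ms, total_reads)
```

With 25 chunks per second, each chunk covers 40 ms of audio, so a 5 second recording takes 125 reads at 44.1 kHz.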
Play WAV File Example#
```python
import os
from media.media import *    # media module, used to initialize the vb buffer
from media.pyaudio import *  # pyaudio module, used to capture and play audio
import media.wave as wave    # wave module, used to save and load WAV audio files

def exit_check():
    try:
        os.exitpoint()
    except KeyboardInterrupt as e:
        print("user stop: ", e)
        return True
    return False

def play_audio(filename):
    try:
        wf = wave.open(filename, 'rb')        # open the WAV file
        CHUNK = int(wf.get_framerate()/25)    # audio chunk size in frames

        p = PyAudio()
        MediaManager.init()  # initialize the vb buffer

        # Create the audio output stream; all audio parameters come from the WAV file
        stream = p.open(format=p.get_format_from_width(wf.get_sampwidth()),
                        channels=wf.get_channels(),
                        rate=wf.get_framerate(),
                        output=True, frames_per_buffer=CHUNK)

        # Set the volume of the audio output stream
        stream.volume(vol=85)

        data = wf.read_frames(CHUNK)  # read one chunk of frame data from the WAV file

        while data:
            stream.write(data)            # write the frame data to the audio output stream
            data = wf.read_frames(CHUNK)  # read the next chunk of frame data
            if exit_check():
                break
    except BaseException as e:
        print(f"Exception {e}")
    finally:
        stream.stop_stream()   # stop the audio output stream
        stream.close()         # close the audio output stream
        p.terminate()          # release the audio object
        wf.close()             # close the WAV file
        MediaManager.deinit()  # release the vb buffer

if __name__ == "__main__":
    os.exitpoint(os.EXITPOINT_ENABLE)
    print("audio example start")
    play_audio('/sdcard/examples/test.wav')  # play the WAV file
```
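The playback loop ends when a read returns an empty byte sequence. That termination behaviour can be rehearsed on a desktop machine with the `wave` module from the CPython standard library, used here as a stand-in for `media.wave` (its `readframes` plays the role of `read_frames`). The helpers `make_silence_wav` and `count_chunks` are illustrative, and the output stream is replaced by a simple counter.

```python
import io
import wave  # CPython standard library, used here as a desktop stand-in


def make_silence_wav(framerate, seconds, channels=1, sampwidth=2):
    """Build an in-memory WAV file that contains only silence."""
    buf = io.BytesIO()
    w = wave.open(buf, "wb")
    w.setnchannels(channels)
    w.setsampwidth(sampwidth)
    w.setframerate(framerate)
    w.writeframes(bytes(framerate * seconds * channels * sampwidth))
    w.close()
    buf.seek(0)  # rewind so the file can be read from the start
    return buf


def count_chunks(wav_file, chunks_per_second=25):
    """Read the file chunk by chunk, as the playback loop does, and count chunks."""
    r = wave.open(wav_file, "rb")
    chunk = r.getframerate() // chunks_per_second
    played = 0
    data = r.readframes(chunk)
    while data:  # an empty bytes object marks the end of the audio data
        played += 1  # a real player would write `data` to the output stream here
        data = r.readframes(chunk)
    r.close()
    return played


print(count_chunks(make_silence_wav(framerate=8000, seconds=1)))
```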
Summary#
With the wave and pyaudio modules described in this manual, developers can capture audio, play it back, and read or write WAV files on CanMV. The wave module handles WAV file metadata and data, while the pyaudio module handles the PCM input and output streams, so the two are typically used together as shown in the example programs.
