A PortAudio based audio_common with text to speech for ROS 2

mgonzs13/audio_common

audio_common

This repository provides a set of ROS 2 packages for audio. It includes a C++ implementation to capture and play audio data using PortAudio.

License: MIT

ROS 2 Distro  Branch  Build status   Docker Image  Documentation
Humble        main    Humble Build   Docker Image  Doxygen Deployment
Iron          main    Iron Build     Docker Image  Doxygen Deployment
Jazzy         main    Jazzy Build    Docker Image  Doxygen Deployment
Rolling       main    Rolling Build  Docker Image  Doxygen Deployment

Table of Contents

  1. Installation
  2. Docker
  3. Nodes
  4. Demos

Installation

cd ~/ros2_ws/src
git clone https://github.com/mgonzs13/audio_common.git
cd ~/ros2_ws
rosdep install --from-paths src --ignore-src -r -y
pip3 install -r src/audio_common/requirements.txt
colcon build

Docker

You can create a Docker image to test audio_common. Run the following command inside the audio_common directory.

docker build -t audio_common .

After the image is created, run a docker container with the following command.

docker run -it --rm --device /dev/snd audio_common

Nodes

audio_capturer_node

Node to obtain audio data from a microphone and publish it into the audio topic.

Parameters

  • format: Specifies the audio format to be used for capturing. Common values are paInt16 (16-bit format) or other formats supported by PortAudio. Default: paInt16

  • channels: The number of audio channels to capture. Typically, 1 for mono and 2 for stereo. Default: 1

  • rate: The sample rate, that is, how many samples per second are captured. Default: 16000

  • chunk: The size of each audio chunk. Default: 4096

  • device: The ID of the audio input device. A value of -1 indicates that the default audio input device should be used. Default: -1

  • frame_id: An identifier for the audio frame. This can be useful for synchronizing audio data with other data streams. Default: ""
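The relation between rate, chunk, channels and format can be made concrete with a little arithmetic. The sketch below is illustrative only and is not part of audio_common: the helper name chunk_stats is hypothetical, and it assumes that chunk counts frames per channel (as PortAudio's frames-per-buffer does) and that paInt16 samples take 2 bytes each.

```python
def chunk_stats(rate=16000, chunk=4096, channels=1, sample_bytes=2):
    """Return (duration in seconds, size in bytes) of one audio chunk.

    Assumption: `chunk` is a number of frames, where one frame holds
    one sample per channel. `sample_bytes=2` corresponds to paInt16.
    """
    duration_s = chunk / rate
    size_bytes = chunk * channels * sample_bytes
    return duration_s, size_bytes


# Defaults of audio_capturer_node as listed above
duration_s, size_bytes = chunk_stats()
print(f"{duration_s * 1000:.0f} ms per chunk, {size_bytes} bytes per chunk")
```

Under these assumptions, the default settings give one message roughly every quarter of a second; a smaller chunk lowers latency at the cost of more messages per second.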

ROS 2 Interfaces

  • audio: Topic to publish the audio data captured from the microphone. Type: audio_common_msgs/msg/AudioStamped

audio_player_node

Node to play the audio data obtained from the audio topic.

Parameters

  • channels: The number of audio channels to play. Typically, 1 for mono and 2 for stereo. Default: 2

  • device: The ID of the audio output device. A value of -1 indicates that the default audio output device should be used. Default: -1

ROS 2 Interfaces

  • audio: Topic to subscribe to for the audio data to be played. Type: audio_common_msgs/msg/AudioStamped

music_node

Node to play music from an audio file in WAV format.

Parameters

  • chunk_time: Duration of each audio chunk, in milliseconds. Default: 50

  • frame_id: An identifier for the audio frame. This can be useful for synchronizing audio data with other data streams. Default: ""
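To illustrate what a time-based chunk size means for a WAV file, here is a minimal sketch using only Python's standard wave module. It is not the music_node implementation (which is C++): the helper iter_chunks is hypothetical, and the one-second silent 16 kHz mono file built in memory is made-up test data.

```python
import io
import wave


def iter_chunks(wav_file, chunk_time_ms=50):
    """Yield raw PCM chunks that each last `chunk_time_ms` milliseconds."""
    with wave.open(wav_file, "rb") as wf:
        # Number of frames that fit in one chunk at the file's sample rate
        frames_per_chunk = int(wf.getframerate() * chunk_time_ms / 1000)
        while True:
            data = wf.readframes(frames_per_chunk)
            if not data:
                break
            yield data


# Build a one-second silent mono 16-bit WAV in memory (illustrative data)
buf = io.BytesIO()
with wave.open(buf, "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)
    wf.setframerate(16000)
    wf.writeframes(b"\x00\x00" * 16000)
buf.seek(0)

chunks = list(iter_chunks(buf))
print(len(chunks), "chunks of", len(chunks[0]), "bytes")
```

With the default chunk_time of 50 ms, one second of 16 kHz audio splits into 20 chunks of 800 frames each.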

ROS 2 Interfaces

  • audio: Topic to publish the audio data from the files. Type: audio_common_msgs/msg/AudioStamped

tts_node

Node to generate audio from text (TTS).

Parameters

  • chunk: The size of each audio chunk. Default: 4096

  • frame_id: An identifier for the audio frame. This can be useful for synchronizing audio data with other data streams. Default: ""

ROS 2 Interfaces

  • audio: Topic publisher to send the audio data generated by the TTS. Type: audio_common_msgs/msg/AudioStamped

  • say: Action to generate audio data from text. Type: audio_common_msgs/action/TTS

Demos

Audio Capturer/Player

ros2 run audio_common audio_capturer_node
ros2 run audio_common audio_player_node
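The parameters listed in the Nodes section can be overridden when starting a node, using the standard `--ros-args -p name:=value` syntax of ros2 run. The sketch below only builds and prints such a command line; the helper ros2_run_command is hypothetical and not part of audio_common, and the parameter values are just the documented defaults.

```python
import shlex


def ros2_run_command(package, executable, params=None):
    """Build a `ros2 run` command line with optional parameter overrides."""
    cmd = ["ros2", "run", package, executable]
    if params:
        cmd.append("--ros-args")
        for name, value in params.items():
            # Each override is passed as: -p name:=value
            cmd += ["-p", f"{name}:={value}"]
    return cmd


cmd = ros2_run_command(
    "audio_common",
    "audio_capturer_node",
    {"rate": 16000, "channels": 1, "chunk": 4096},
)
print(shlex.join(cmd))
```

The printed line can be pasted into a terminal, or the list can be passed to subprocess.run on a machine where ROS 2 and audio_common are installed.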

TTS

ros2 run audio_common tts_node
ros2 run audio_common audio_player_node
ros2 action send_goal /say audio_common_msgs/action/TTS "{'text': 'Hello World'}"

Music Player

ros2 run audio_common music_node
ros2 run audio_common audio_player_node
ros2 service call /music_play audio_common_msgs/srv/MusicPlay "{audio: 'elevator'}"