This project is a RESTful API server that creates an audio from a text based on Piper. The APIs are compatible with OpenAI APIs of create speech.
Note
The project is still under active development. The existing features still need to be improved and more features will be added in the future.
Warning
tts-api-serve ONLY supports Linux platform for now! The support for other platforms will be added in the future.
-
Install WasmEdge v0.14.1
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install_v2.sh | bash -s -- -v 0.14.1
-
Deply
wasmedge-piper
pluginFor the purpose of demonstration, we will use the piper plugin for Ubuntu-20.04. You can find the plugin for other platforms Releases/0.14.1
# Download piper plugin for Mac Apple Silicon curl -LO https://github.com/WasmEdge/WasmEdge/releases/download/0.14.1/WasmEdge-plugin-wasi_nn-piper-0.14.1-ubuntu20.04_x86_64.tar.gz # Unzip the plugin to $HOME/.wasmedge/plugin tar -xzf WasmEdge-plugin-wasi_nn-piper-0.14.1-ubuntu20.04_x86_64.tar.gz -C $HOME/.wasmedge/plugin rm $HOME/.wasmedge/plugin/libwasmedgePluginWasiNN.dylib
-
Download piper model and voice config file
# Download piper model curl -LO https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/medium/en_US-lessac-medium.onnx # Download voice config file curl -LO https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/medium/en_US-lessac-medium.onnx.json
For more voice models and config files, visit rhasspy/piper-voices.
-
Download text-to-speech synthesizer
# Download espeak-ng data directory curl -LO https://github.com/rhasspy/piper/releases/download/2023.11.14-2/piper_linux_x86_64.tar.gz tar -xzf piper_linux_x86_64.tar.gz piper/espeak-ng-data --strip-components=1
-
Download tts-api-server.wasm
curl -LO https://github.com/LlamaEdge/tts-api-server/releases/latest/download/tts-api-server.wasm
-
Start server
wasmedge --dir .:. tts-api-server.wasm \ --model-name piper \ --model en_US-lessac-medium.onnx \ --config en_US-lessac-medium.onnx.json \ --espeak-ng-dir ./espeak-ng-data
[!TIP]
tts-api-server
will use8080
port by default. You can change the port by adding--port <port>
.
-
Send a request for creating an audio from a text
curl --location 'http://localhost:8080/v1/audio/speech' \ --header 'Content-Type: application/json' \ --data '{ "model": "piper", "input": "This is an audio speech test", "response_format": "wav", "speed": 1.0 }' --output test.wav
If the request is successful, the generated audio file will be saved as
test.wav
.
-
For Linux users
cargo build --release
-
For macOS users
-
Download the
wasi-sdk
from the official website and unzip it to the directory you want. -
Build the project
export WASI_SDK_PATH=/path/to/wasi-sdk export CC="${WASI_SDK_PATH}/bin/clang --sysroot=${WASI_SDK_PATH}/share/wasi-sysroot" cargo clean cargo update cargo build --release
-
If the build process is successful, tts-api-server.wasm
will be generated in target/wasm32-wasip1/release/
.
$ wasmedge tts-api-server.wasm -h
Whisper API Server
Usage: tts-api-server.wasm [OPTIONS] --model-name <MODEL_NAME> --model <MODEL> --config <CONFIG> --espeak-ng-dir <ESPEAK_NG_DIR>
Options:
-m, --model-name <MODEL_NAME> Model name
--model <MODEL> Path to the whisper model file
--config <CONFIG> Path to the voice config file
--espeak-ng-dir <ESPEAK_NG_DIR> Path to the espeak-ng data directory
--socket-addr <SOCKET_ADDR> Socket address of LlamaEdge API Server instance. For example, `0.0.0.0:8080`
--port <PORT> Port number [default: 8080]
-h, --help Print help
-V, --version Print version