The easiest & fastest way to run customized and fine-tuned LLMs locally or on the edge

LlamaEdge/tts-api-server


TTS-API-Server

This project is a RESTful API server that generates audio from text using Piper. Its API is compatible with the OpenAI create speech API.

Note

The project is under active development. Existing features are still being improved, and more features will be added in the future.

Warning

tts-api-server currently supports ONLY Linux. Support for other platforms will be added in the future.

Quick Start

Setup

  • Install WasmEdge v0.14.1

    curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install_v2.sh | bash -s -- -v 0.14.1
  • Deploy wasmedge-piper plugin

    For demonstration purposes, we use the piper plugin for Ubuntu 20.04. Plugins for other platforms are available on the Releases/0.14.1 page.

    # Download piper plugin for Ubuntu 20.04 (x86_64)
    curl -LO https://github.com/WasmEdge/WasmEdge/releases/download/0.14.1/WasmEdge-plugin-wasi_nn-piper-0.14.1-ubuntu20.04_x86_64.tar.gz
    
    # Extract the plugin to $HOME/.wasmedge/plugin
    tar -xzf WasmEdge-plugin-wasi_nn-piper-0.14.1-ubuntu20.04_x86_64.tar.gz -C "$HOME/.wasmedge/plugin"
    
    # Remove a stray macOS (.dylib) WASI-NN plugin, if present
    rm -f "$HOME/.wasmedge/plugin/libwasmedgePluginWasiNN.dylib"
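
Before moving on, it can help to confirm that the plugin directory actually contains a shared library. The helper below is a minimal sketch, not part of WasmEdge or tts-api-server: `list_plugins` is a hypothetical name, and it assumes the Linux plugin ships as a `.so` file in the directory used above.

```shell
# Hypothetical helper, not part of WasmEdge or tts-api-server.
# Prints every *.so file in DIR (default: the plugin directory used above)
# and returns 0 if at least one was found, 1 otherwise.
list_plugins() {
  local dir="${1:-$HOME/.wasmedge/plugin}"
  local found=1
  local f
  for f in "$dir"/*.so; do
    # When nothing matches, the glob stays literal and -e fails.
    [ -e "$f" ] || continue
    printf '%s\n' "$f"
    found=0
  done
  return "$found"
}
```

Running `list_plugins || echo "no plugin found"` after the extraction step gives a quick sanity check.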

Run tts-api-server

  • Download piper model and voice config file

    # Download piper model
    curl -LO https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/medium/en_US-lessac-medium.onnx
    
    # Download voice config file
    curl -LO https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/medium/en_US-lessac-medium.onnx.json

    For more voice models and config files, visit rhasspy/piper-voices.

  • Download text-to-speech synthesizer

    # Download the piper release and extract only its espeak-ng data directory
    curl -LO https://github.com/rhasspy/piper/releases/download/2023.11.14-2/piper_linux_x86_64.tar.gz
    tar -xzf piper_linux_x86_64.tar.gz piper/espeak-ng-data --strip-components=1
  • Download tts-api-server.wasm

    curl -LO https://github.com/LlamaEdge/tts-api-server/releases/latest/download/tts-api-server.wasm
  • Start server

    wasmedge --dir .:. tts-api-server.wasm \
      --model-name piper \
      --model en_US-lessac-medium.onnx \
      --config en_US-lessac-medium.onnx.json \
      --espeak-ng-dir ./espeak-ng-data

    Tip: tts-api-server listens on port 8080 by default. To use a different port, add --port <port>.
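
To try a different voice, both the model and its config file need to be downloaded. The sketch below builds the two URLs from a locale, voice name, and quality. It assumes that other voices in rhasspy/piper-voices follow the same path layout as the `en_US-lessac-medium` URLs above; `voice_url` is a hypothetical helper name.

```shell
# Base URL taken from the download commands above.
BASE_URL="https://huggingface.co/rhasspy/piper-voices/resolve/main"

# voice_url LOCALE VOICE QUALITY [SUFFIX]
# Assumption: every voice follows the layout of en_US-lessac-medium:
#   <lang>/<locale>/<voice>/<quality>/<locale>-<voice>-<quality><suffix>
voice_url() {
  local locale="$1" voice="$2" quality="$3" suffix="${4:-.onnx}"
  local lang="${locale%%_*}"   # "en_US" -> "en"
  printf '%s/%s/%s/%s/%s/%s-%s-%s%s\n' \
    "$BASE_URL" "$lang" "$locale" "$voice" "$quality" \
    "$locale" "$voice" "$quality" "$suffix"
}

# Print the model URL and the config URL for the voice used in this guide.
voice_url en_US lessac medium
voice_url en_US lessac medium .onnx.json
```

Each URL can then be passed to `curl -LO "$(voice_url ...)"`, and the resulting file names given to `--model` and `--config` when starting the server.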

Usage

  • Send a request to generate audio from text

    curl --location 'http://localhost:8080/v1/audio/speech' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "piper",
        "input": "This is an audio speech test",
        "response_format": "wav",
        "speed": 1.0
      }' \
      --output test.wav

    If the request is successful, the generated audio file will be saved as test.wav.
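
When the input text comes from a script rather than being typed by hand, quoting it inside the JSON body becomes the fragile part. The following is a minimal sketch of a wrapper around the request above. `speech_payload`, `synthesize`, and the `TTS_URL` variable are hypothetical names, and the escaping covers only backslashes and double quotes, not newlines or control characters.

```shell
# speech_payload TEXT [SPEED]
# Builds the JSON request body, using the same fields as the curl example above.
speech_payload() {
  local text speed="${2:-1.0}"
  # Escape backslashes and double quotes so TEXT is a valid JSON string.
  text=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"model":"piper","input":"%s","response_format":"wav","speed":%s}' \
    "$text" "$speed"
}

# synthesize TEXT OUTPUT_FILE
# Sends the request; --fail makes curl return non-zero on an HTTP error
# instead of saving the error body as if it were audio.
synthesize() {
  curl --fail --silent --show-error \
    --location "${TTS_URL:-http://localhost:8080}/v1/audio/speech" \
    --header 'Content-Type: application/json' \
    --data "$(speech_payload "$1")" \
    --output "$2"
}
```

With the server running, `synthesize "This is an audio speech test" test.wav` should produce the same file as the request above.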

Build

  • For Linux users

    cargo build --release
  • For macOS users

    • Download wasi-sdk from the official website and extract it to a directory of your choice.

    • Build the project

      export WASI_SDK_PATH=/path/to/wasi-sdk
      export CC="${WASI_SDK_PATH}/bin/clang --sysroot=${WASI_SDK_PATH}/share/wasi-sysroot"
      cargo clean
      cargo update
      cargo build --release

If the build process is successful, tts-api-server.wasm will be generated in target/wasm32-wasip1/release/.

CLI Options

$ wasmedge tts-api-server.wasm -h
Whisper API Server

Usage: tts-api-server.wasm [OPTIONS] --model-name <MODEL_NAME> --model <MODEL> --config <CONFIG> --espeak-ng-dir <ESPEAK_NG_DIR>

Options:
  -m, --model-name <MODEL_NAME>        Model name
      --model <MODEL>                  Path to the whisper model file
      --config <CONFIG>                Path to the voice config file
      --espeak-ng-dir <ESPEAK_NG_DIR>  Path to the espeak-ng data directory
      --socket-addr <SOCKET_ADDR>      Socket address of LlamaEdge API Server instance. For example, `0.0.0.0:8080`
      --port <PORT>                    Port number [default: 8080]
  -h, --help                           Print help
  -V, --version                        Print version
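
As an illustration of how these options combine, the sketch below wraps the Quick Start command in a function that takes the port as an argument. The flags come from the help output above and the file names from Quick Start; `start_tts_server` is a hypothetical name.

```shell
# start_tts_server [PORT]
# Starts the server with the Quick Start model files on the given port
# (8080, the documented default, if no port is given).
start_tts_server() {
  local port="${1:-8080}"
  wasmedge --dir .:. tts-api-server.wasm \
    --model-name piper \
    --model en_US-lessac-medium.onnx \
    --config en_US-lessac-medium.onnx.json \
    --espeak-ng-dir ./espeak-ng-data \
    --port "$port"
}
```

For example, `start_tts_server 9090` would serve requests at `http://localhost:9090/v1/audio/speech`.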
