Vosk Voice To Text (VTT) client

Vosk Server is an open source Voice-To-Text server based on Vosk-API, and provides real-time voice transcription over WebSocket (and other protocols).

This Python script is based on their test_microphone.py example and acts as a client for a Vosk server. Currently, this version only adds OSC output of the transcription, but the plan is to expand it much further.
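For context, here is a minimal sketch of how a client might pick the transcription out of the JSON messages the server sends back over the WebSocket. It assumes partial results arrive under a "partial" key and final results under a "text" key, which is how Vosk Server usually replies; the function name `extract_transcript` is hypothetical and is not taken from vtt_client.py.

```python
import json
from typing import Tuple


def extract_transcript(message: str) -> Tuple[str, str]:
    """Return (kind, text) for one JSON message received from the server.

    kind is "final" when the message has a "text" key, "partial" when it
    has a "partial" key, and "unknown" otherwise (assumed message layout).
    """
    data = json.loads(message)
    if "text" in data:
        return "final", data["text"]
    if "partial" in data:
        return "partial", data["partial"]
    return "unknown", ""


print(extract_transcript('{"partial": "remind me at"}'))
print(extract_transcript('{"text": "remind me at five pm"}'))
```

Only the final text is normally worth forwarding over OSC; partial results change as the speaker continues.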

Please see the Vosk GitHub repo for details on the server and instructions on how to host your own: https://github.com/alphacep/vosk-server

Project files

vtt_client.py

An initial example script to connect to the Vosk server via websockets and output transcription results as dictionaries via OSC.

vtt_reminder_example.maxpat

A Max patch to demonstrate extracting a reminder time (hour + am/pm) from a transcription received via OSC from the Python script. The approach uses native Max objects to do this, and results in a very convoluted patch.

Note: I in no way endorse the use of Max (especially with only their native patch object library) to perform text analysis. It's simply not designed for this. If Max is required, a far better approach would be to either perform the extraction prior to posting the data to Max, or to use Max's JavaScript [js] object.
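As a rough illustration of doing the extraction before the data reaches Max, the sketch below pulls an hour and am/pm out of a transcription string in Python. It assumes the transcription spells hours either as words ("five") or digits ("5") and writes the meridiem as "pm" or "p m"; the names `HOUR_WORDS` and `extract_reminder_time` are hypothetical and not part of this project.

```python
import re
from typing import Optional, Tuple

# Spoken hour words mapped to numbers (assumes numbers may be spelled out).
HOUR_WORDS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5, "six": 6,
    "seven": 7, "eight": 8, "nine": 9, "ten": 10, "eleven": 11, "twelve": 12,
}

# Matches e.g. "five pm", "5 pm" or "eleven a m".
_TIME_RE = re.compile(
    r"\b(" + "|".join(HOUR_WORDS) + r"|1[0-2]|[1-9])\s+([ap])\s?m\b"
)


def extract_reminder_time(text: str) -> Optional[Tuple[int, str]]:
    """Return (hour, "am" or "pm") for the first time found, else None."""
    match = _TIME_RE.search(text.lower())
    if match is None:
        return None
    hour_token, meridiem = match.groups()
    # Word tokens come from the lookup table, digit tokens are parsed directly.
    hour = HOUR_WORDS.get(hour_token) or int(hour_token)
    return hour, meridiem + "m"


print(extract_reminder_time("remind me to call mum at five pm"))
```

The resulting hour and am/pm values could then be sent to Max as two simple OSC arguments, so the patch has no text processing left to do.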

Feature ideas

Dependencies

Vosk Server

If you're using Docker, it's as easy as:

docker run -d -p 2700:2700 alphacep/kaldi-en:latest

See the Vosk Server GitHub page for more info.
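Once the container is up, a quick way to confirm something is listening on the port is a plain TCP connection test. This is only a sketch: it checks that the port accepts connections, not that Vosk Server is actually what is running there. The function name `server_is_reachable` is hypothetical.

```python
import socket


def server_is_reachable(host: str = "localhost", port: int = 2700,
                        timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False
```

Calling `server_is_reachable()` with no arguments tests the default localhost:2700 mapping from the Docker command above.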

Client modules

Python3
pyaudio
websockets
python-osc
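If you want to verify that the modules listed above are installed before running the client, something like the following sketch can help. Note that the import names differ slightly from the pip package names (python-osc is imported as `pythonosc`); the helper `missing_modules` is hypothetical.

```python
import importlib.util

# Import names for the client modules; python-osc is imported as "pythonosc".
REQUIRED_MODULES = ["pyaudio", "websockets", "pythonosc"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of names that cannot be found by the import system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All client modules found.")
```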

Client setup guide

Linux

This assumes you have Python3 and pip installed.

1. Install the Python modules

I had a fatal install error using the standard pip install pyaudio on Ubuntu 20.04. Installing PyAudio through apt worked perfectly instead:

sudo apt install portaudio19-dev python3-pyaudio
pip install websockets python-osc

2. Clone the project

git clone https://github.com/MaxVRAM/vosk_vtt_client.git

Windows

1. Install `Python 3`

This will work with other versions of Python, but I've only tested it with Python 3.10.0, so that's what I'll be using as an example.

Head over to the Python Releases for Windows page and download the Python 3.10.0 (64-bit) installer.
Once it has downloaded, open the installer and make sure you check the Add Python 3.10 to PATH option at the bottom of the window, which makes the python command accessible from any folder on your system. Then hit Install Now and wait for it to finish.
Open the Windows command prompt by pressing [win] + r, entering cmd in the box and hitting enter.
Check that Python is installed by entering python -V (with a capital V). It should print Python 3.10.0, or whatever version you installed.

2. Install the Python modules

PyAudio does not install cleanly with pip on Windows, so it needs to be downloaded manually and installed from a wheel (.whl) file.

Download the PyAudio file that matches your Python version and OS - link.

For example, Python 3.10.0 on Windows 10 (64-bit) would require:
- PyAudio-0.2.11-cp310-cp310-win_amd64.whl
- Here cp310 means Python 3.10, and win_amd64 means 64-bit Windows.

Move the file to your user's Documents folder.
Back in the Windows command prompt, navigate to the Documents folder: use cd Documents if you're already in your user folder, otherwise cd C:\Users\<your_user_name>\Documents.
Now install the module:

pip install PyAudio-0.2.11-cp310-cp310-win_amd64.whl

And finally install websockets and python-osc:

pip install websockets python-osc
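If you're unsure which PyAudio wheel matches your setup, the tags in the filename can be worked out from the Python interpreter itself. The sketch below is an assumption-laden helper (the name `expected_wheel_tags` is hypothetical) that reports the cpXY tag and the Windows platform tag. It looks at the bitness of the installed Python, which is what matters for the wheel, rather than the bitness of Windows.

```python
import struct
import sys


def expected_wheel_tags(version_info=sys.version_info, pointer_bits=None):
    """Return (python_tag, platform_tag) for a Windows CPython wheel."""
    if pointer_bits is None:
        # Size of a pointer in bits: 64 for 64-bit Python, 32 for 32-bit.
        pointer_bits = struct.calcsize("P") * 8
    python_tag = f"cp{version_info[0]}{version_info[1]}"
    platform_tag = "win_amd64" if pointer_bits == 64 else "win32"
    return python_tag, platform_tag


print(expected_wheel_tags())
```

On 64-bit Python 3.10 this gives cp310 and win_amd64, matching the example filename in the PyAudio step.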

3. Clone the project

git clone https://github.com/MaxVRAM/vosk_vtt_client.git

Or download vtt_client.py directly.

Usage

If your Vosk Server is running locally and listening on the default port 2700, you can simply run the script:

python3 vtt_client.py

Arguments

Vosk Server connection

-server <server_url>:<port>
Defaults to localhost:2700

A remote Vosk Server connection might look like this:

python3 vtt_client.py -server example.com:8089

OSC destination

-ip <osc_ip> -port <osc_port>
Defaults to localhost and 9600

Sending the OSC elsewhere might look like this:

python3 vtt_client.py -ip 192.168.40.22 -port 5110
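To give a feel for what arrives at the OSC destination, here is a standard-library sketch of an OSC message carrying a single string, sent over UDP. This is not how vtt_client.py does it (the script relies on python-osc); the address "/vtt" and the function names are hypothetical, and the sketch assumes ASCII-only text.

```python
import socket


def _osc_string(value: str) -> bytes:
    """Encode an OSC string: ASCII bytes, null-terminated, padded to a multiple of 4."""
    raw = value.encode("ascii") + b"\x00"
    return raw + b"\x00" * (-len(raw) % 4)


def build_osc_message(address: str, text: str) -> bytes:
    """Build an OSC message with one string argument (type tag ",s")."""
    return _osc_string(address) + _osc_string(",s") + _osc_string(text)


def send_osc(packet: bytes, ip: str = "localhost", port: int = 9600) -> None:
    """Send a pre-built OSC packet to ip:port over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(packet, (ip, port))


packet = build_osc_message("/vtt", "remind me at five pm")
print(len(packet), "bytes")
```

A receiver such as Max's [udpreceive] object listening on the chosen port would see this as the address followed by the transcription string.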

Putting it together

A full example might look like this:

python3 vtt_client.py -server example.com:8089 -ip 192.168.40.22 -port 5110
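For readers curious how arguments like these could be handled, the sketch below parses the same three options with argparse and applies the documented defaults. It is an illustration only and may not match the parsing code in vtt_client.py; `build_parser` and `split_server` are hypothetical names.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Create a parser for the three documented options and their defaults."""
    parser = argparse.ArgumentParser(description="Vosk VTT client (argument sketch)")
    parser.add_argument("-server", default="localhost:2700",
                        help="Vosk Server as <server_url>:<port>")
    parser.add_argument("-ip", default="localhost", help="OSC destination IP")
    parser.add_argument("-port", type=int, default=9600, help="OSC destination port")
    return parser


def split_server(server: str):
    """Split "<server_url>:<port>" into (host, port); assumes a port is present."""
    host, _, port = server.rpartition(":")
    return host, int(port)


args = build_parser().parse_args(
    ["-server", "example.com:8089", "-ip", "192.168.40.22", "-port", "5110"]
)
print(split_server(args.server), args.ip, args.port)
```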
