- Endpoint: WebSocket streaming speech API endpoint
  wss://bodhi.navana.ai
- Sample Scripts:
  - streaming_client.js (for static audio files)
  - streaming_client_with_conversion.js (for static audio files)
Store the authentication credentials in environment variables to access the streaming speech API endpoint:
$ export API_KEY=YOUR_API_KEY
$ export CUSTOMER_ID=YOUR_CUSTOMER_ID
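A client would typically read these variables at startup and fail early if either is missing. The following is a minimal sketch: the `loadCredentials` helper and the header names `x-api-key` and `x-customer-id` are assumptions for illustration, so check the sample scripts for the exact header names the API expects.

```javascript
// Hypothetical helper: reads the credentials exported above and
// builds a headers object for the WebSocket handshake.
function loadCredentials(env = process.env) {
  const apiKey = env.API_KEY;
  const customerId = env.CUSTOMER_ID;
  if (!apiKey || !customerId) {
    throw new Error("Set API_KEY and CUSTOMER_ID before running the client");
  }
  // Header names are assumed here, not confirmed by this document.
  return { "x-api-key": apiKey, "x-customer-id": customerId };
}
```

Throwing before the connection is opened makes a missing variable easier to spot than a rejected handshake.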
Each response received from the API is a JSON object:
{
"call_id": "CALL_ID",
"segment_id": "SEGMENT_ID",
"eos": false,
"type": "partial",
"text": "TRANSCRIPT"
}
- call_id:
  - Unique identifier associated with every streaming connection
- segment_id:
  - Unique identifier associated with every speech segment during the entire active socket connection
- text:
  - If type = "partial"
    - Partial transcript corresponding to each streamed audio chunk (for example, one per 100 ms chunk if the streaming audio packet size is 100 ms)
  - If type = "complete"
    - Complete/final transcript generated for each speech segment
    - Generated once per segment_id, i.e., when the end of the speech segment is reached
- eos:
  - If eos is true, marks the end of the streaming connection
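The fields above suggest a simple way to consume the stream: show partial text as it arrives, keep one final transcript per segment_id, and stop when eos is true. The following is a minimal sketch; `createTranscriptCollector` is a hypothetical helper and not part of the API or the sample scripts.

```javascript
// Collects transcripts from the JSON messages described above.
function createTranscriptCollector() {
  const finalBySegment = new Map(); // segment_id -> complete transcript
  let latestPartial = "";
  let streamEnded = false;

  return {
    // `raw` is one message as received from the socket (string or Buffer).
    // Returns true once a message with eos === true has been seen.
    handle(raw) {
      const msg = JSON.parse(String(raw));
      if (msg.type === "complete") {
        finalBySegment.set(msg.segment_id, msg.text);
        latestPartial = "";
      } else if (msg.type === "partial") {
        latestPartial = msg.text;
      }
      if (msg.eos === true) streamEnded = true;
      return streamEnded;
    },
    // Final transcripts joined in the order their segments first completed.
    transcript() {
      return Array.from(finalBySegment.values()).join(" ");
    },
    partial() {
      return latestPartial;
    },
    isEnded() {
      return streamEnded;
    },
  };
}
```

A client would call `handle` from its socket message callback and close the connection once it returns true.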
$ npm install
$ node streaming_client.js -f loan.wav
Options:
  -f: File name of the audio file to be streamed.
$ node streaming_client_with_conversion.js
To ensure compatibility and performance with our audio processing system, audio streams must meet the following requirements:
- Encoding/Bit Depth: 16-bit PCM (2 bytes per sample).
- Minimum Sample Rate: The audio must have a sample rate of at least 8000 Hz.
- Fixed Streaming Rate: Audio packets should be streamed at a fixed chunk duration (chunk_duration_ms) between 50 and 500 ms to ensure consistent data flow. We recommend 100 ms, as shown in the example script.
- Channels: Audio must be single-channel (mono).
- Speakers: Currently, a single speaker per channel is supported. Support for multiple speakers on a single channel is under development and will be announced soon.
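The requirements above determine the size of each audio packet: for 16-bit mono PCM, one chunk holds sample_rate × chunk_duration_ms / 1000 samples at 2 bytes each. The following is a minimal sketch of that arithmetic and of splitting raw PCM bytes into fixed-size chunks. The helper names `chunkSizeBytes` and `splitIntoChunks` are hypothetical, and the input is assumed to be raw PCM with any WAV header already removed.

```javascript
// Bytes in one chunk of 16-bit, single-channel PCM audio.
function chunkSizeBytes(sampleRateHz, chunkDurationMs) {
  if (sampleRateHz < 8000) {
    throw new RangeError("Sample rate must be at least 8000 Hz");
  }
  if (chunkDurationMs < 50 || chunkDurationMs > 500) {
    throw new RangeError("Chunk duration must be between 50 and 500 ms");
  }
  const BYTES_PER_SAMPLE = 2; // 16-bit PCM
  const samplesPerChunk = Math.floor((sampleRateHz * chunkDurationMs) / 1000);
  return samplesPerChunk * BYTES_PER_SAMPLE;
}

// Splits raw PCM bytes (Buffer or Uint8Array) into fixed-size chunks.
// The last chunk may be shorter than the others.
function splitIntoChunks(pcm, sampleRateHz, chunkDurationMs = 100) {
  const size = chunkSizeBytes(sampleRateHz, chunkDurationMs);
  const chunks = [];
  for (let offset = 0; offset < pcm.length; offset += size) {
    chunks.push(pcm.subarray(offset, offset + size));
  }
  return chunks;
}
```

At 8000 Hz, a 100 ms chunk is 800 samples, or 1600 bytes. To stream at a fixed rate, a client would send one chunk every chunk_duration_ms, for example from a timer.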
- Hindi: hi-general-v2-8khz
- Kannada: kn-general-v2-8khz
To test the code, modify the .js file with the model name you want to use.
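Instead of editing the model name by hand each time, a script could keep the listed models in one lookup. The following is a minimal sketch; `MODELS` and `modelFor` are hypothetical names, and how the chosen model name is passed to the API is left to the sample scripts.

```javascript
// Model names as listed above, keyed by language.
const MODELS = new Map([
  ["hindi", "hi-general-v2-8khz"],
  ["kannada", "kn-general-v2-8khz"],
]);

// Returns the model name for a language, or throws if none is listed.
function modelFor(language) {
  const model = MODELS.get(String(language).toLowerCase());
  if (!model) {
    throw new Error(`No model listed for language: ${language}`);
  }
  return model;
}
```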