[Feedback] Voice Calls (alpha) #175
Comments
Suggestion: Suno Bark, an open-source alternative to the ElevenLabs API for speech synthesis.
Info about the free and open-source speech synthesis model Bark:
That's my feedback; hope it's useful. Keep up the good work with big-agi. I use it every day!
See also #175. This accomplishes a similar function in an elegant way.
None of the voice features work in Brave.
Yes, sadly Brave does not support the Web Speech API for voice input.
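A capability check along the following lines could let the app warn users up front when voice input is unavailable. This is a minimal sketch, not big-AGI's actual code: the `SpeechGlobals` type and the `hasSpeechRecognitionConstructor` helper are hypothetical names, and the only assumption about the browser is that the Web Speech API constructor is exposed as `SpeechRecognition` or the prefixed `webkitSpeechRecognition`.

```typescript
// Hypothetical minimal shape of the global object being inspected.
// In a browser, the caller would pass `window` (cast as needed).
type SpeechGlobals = {
  SpeechRecognition?: unknown;
  webkitSpeechRecognition?: unknown;
};

// Returns true when a speech recognition constructor is exposed,
// either unprefixed or under the `webkit` prefix.
function hasSpeechRecognitionConstructor(globals: SpeechGlobals): boolean {
  const ctor = globals.SpeechRecognition ?? globals.webkitSpeechRecognition;
  return typeof ctor === "function";
}
```

A present constructor does not guarantee that recognition works: a browser may expose it and still fail at runtime, so handling the recognizer's `error` event is still needed.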
Having issues in Firefox (on Mac). Although I enabled speech recognition in the browser settings, it does not seem to work: I talk but I get no reaction from the AI.
I think voice is a great feature, and I've really been looking for something like this, but it would be best if it really worked like a phone call. Right now it keeps chiming when "listening", which is annoying and disruptive to the conversation, especially in hands-free mode (as opposed to push-to-talk). I've seen this done in other browser-based chats, where it's more of a continuous stream listening to the microphone.
To get rid of the sound loop where the AI hears itself speaking and responds to itself, I've seen implementations that shut off the microphone while the computer is speaking, until the sound has stopped playing. On voxta.ai, for example, the microphone icon goes red with a slash to show that it's not listening while the AI is speaking. This stops the loop, so it even works without headphones. Their implementation works smoothly: you can go back and forth like with Pi AI or the ChatGPT conversation mode. It's really cool. They even have a setting, meant to be used with headphones, that keeps listening all the time so you can interrupt the AI while it's speaking.
If you could get the conversation mode to work more like either of those, this would be the killer app: you get to pick the LLM you want, you get to customize things, and you can have seamless two-way conversation with just about any LLM there is, especially with all the choices on something like openrouter.ai. It would be very, very cool to have smooth conversations with just about any LLM out there using your software. It'd really get to be like the movie Her. Great job on this software!
One other thing: as it's implemented now, when in a "call" it didn't consistently play the speech responses. It was hit or miss: sometimes it would speak what the AI was saying and other times it wouldn't. It always displayed the response, but every other time it didn't speak it.
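The mute-while-speaking behavior described in this comment can be sketched as a small gate that tracks whether speech playback is active. This is an illustrative sketch only, not how big-AGI or voxta.ai implement it: `MicGate`, `GateOptions`, `allowInterrupt`, and the method names are all hypothetical, and wiring the callbacks to real audio playback and recognition is left out.

```typescript
// Hypothetical options: `allowInterrupt` models the headphones mode,
// where the microphone stays open even while the AI is speaking.
type GateOptions = { allowInterrupt: boolean };

class MicGate {
  private speaking = false;
  private readonly options: GateOptions;

  constructor(options: GateOptions = { allowInterrupt: false }) {
    this.options = options;
  }

  // Call when synthesized speech starts playing.
  onPlaybackStart(): void {
    this.speaking = true;
  }

  // Call when synthesized speech has finished playing.
  onPlaybackEnd(): void {
    this.speaking = false;
  }

  // The microphone is open unless the AI is speaking,
  // except in interrupt mode, where it is always open.
  get micOpen(): boolean {
    return !this.speaking || this.options.allowInterrupt;
  }
}
```

With the default options the gate closes during playback, which avoids the AI responding to its own voice over speakers; with `allowInterrupt: true` it stays open, which only makes sense when headphones keep the playback out of the microphone.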
This is very nifty, and almost anyone can set it up (as long as they use Google Chrome on desktop). But... any niceness gets erased when you have a great or funny conversation that's almost impossible to repeat, one you want to screenshot or record, and then you resize the window only to run into this:
Yes, resizing the window too. Really?!
Come on!!! What the hell?
Instructions and feedback thread for Voice Calls in big-AGI.
1. Start a Voice call
There are two ways of initiating a Voice Call from an existing chat:
2. System Check
Make sure all the checks are green, or try to resolve the issues before proceeding. This wizard will only be shown the first time, unless the issues persist.
![image](https://github.com/enricoros/big-agi/assets/32999/b18abf60-5cdd-4ae9-bfe1-93aa2eb4da2a)
3. Call Options
During a call, you can switch "Push To Talk" on/off. If active (default), the microphone button must be pressed before speaking. This is best to avoid echoes and other ambient noise.
![image](https://github.com/enricoros/big-agi/assets/32999/91124fe7-cf88-4b78-9dcb-35afe6d5243e)
Note - you can also say the following commands during a call. These single words will be interpreted as system commands:
Known limitations:
Looking forward to your feedback to prioritize the right integration and development!
🙌