This project builds a multiplatform iOS/macOS application that can:
- Read a linked webpage
- Upload a document (txt, pdf, epub)
- Read the uploaded documents aloud using text-to-speech (TTS). The supported TTS synthesizers are: (a) local Apple TTS, (b) Amazon Polly cloud TTS, (c) Google Cloud TTS, (d) Microsoft Azure TTS
For the cloud TTS options, you need your own AWS, Google Cloud, or Azure account. The setup below walks you through which resources to deploy. All of these cloud services are pay-as-you-go, so you pay only for what you actually read instead of paying a subscription to some app developer. See the links below for the pricing of each cloud service. You might also qualify for free-tier options, new-account credits, and free monthly quotas.
- Amazon Polly - https://aws.amazon.com/polly/#/
- Google Cloud - https://cloud.google.com/text-to-speech/#
- Microsoft Azure - https://azure.microsoft.com/en-us/#/details/cognitive-services/speech-services/
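To get a feel for what pay-as-you-go means in practice, here is a minimal sketch of a cost estimate. It assumes billing is per character of synthesized text; the function name, the `free_chars` parameter, and the price in the usage note are hypothetical placeholders, not real rates. Take the actual rate and free quota from the pricing pages above.

```python
def estimate_cost(text: str, price_per_million_chars: float, free_chars: int = 0) -> float:
    """Rough cost estimate, in the same currency as the price.

    price_per_million_chars and free_chars are placeholders: look up the
    real values for your provider and voice type.
    """
    # Characters covered by a free quota are not billed.
    billable_chars = max(len(text) - free_chars, 0)
    return billable_chars / 1_000_000 * price_per_million_chars
```

For example, `estimate_cost(book_text, 4.0)` estimates the cost of reading `book_text` at a made-up rate of 4.00 per million characters. Providers count characters slightly differently, so treat the result as an approximation.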
Just build the project in Xcode, sign it for local development, and install it on your devices.
- Create an AWS account or use an existing one. https://aws.amazon.com/free
- Open the AWS Console and select a region in the region selector at the top right. Some voices are not available in certain regions; I recommend us-east-1 for access to all voices.
- Search for CloudFormation in the Console search -> Create Stack -> Upload a template -> use the template at AetherVoice/Dist/AmazonPollyCFN.yaml
- Wait for the stack to finish creating, then note down the value of identityPoolId in the Outputs tab of the stack.
- Provide the identityPoolId in the AWS configuration in the Settings of the app. (Note: the identityPoolId acts like a password to access your AWS account's Polly resources. Don't share it with anyone. The value will be securely stored in your keychain upon entering it.)
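Before pasting the identityPoolId into the app, you can sanity-check that you copied it completely. The sketch below assumes the usual Cognito identity pool ID shape, `<region>:<GUID>` (for example `us-east-1:` followed by a GUID); the function name is hypothetical and the pattern is only a plausibility check, not an official validator.

```python
import re
from typing import Optional, Tuple

# Assumed shape: an AWS region, a colon, then a lowercase GUID.
_POOL_ID = re.compile(
    r"^([a-z]{2}(?:-[a-z]+)+-\d):"
    r"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})$"
)


def split_identity_pool_id(value: str) -> Optional[Tuple[str, str]]:
    """Return (region, guid) if the value looks like a pool ID, else None."""
    match = _POOL_ID.match(value.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)
```

The region part should match the region where you created the stack, which is a quick way to spot an ID copied from the wrong stack.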
- Sign up for GCP or use an existing account: https://cloud.google.com
- Sign in to your Google Cloud Console.
- Create a new project "AetherVoice" from the top bar or use an existing project.
- Open the API Library and find the Text-to-Speech API.
- Select your project and click the "Enable" button.
- Visit the Credentials page.
- Click on "Create Credentials" and choose "API key". Your new API key will appear; click "Close" to save it.
- Click on the name of the new API key to open its settings page.
- Under "Application restrictions", select "iOS apps" and add the bundle identifier 'com.ract.AetherVoice' (You can change the bundle id in Xcode if you want)
- Under "API restrictions", select "Restrict key" and choose "Google Cloud Text-to-Speech API" from the dropdown list.
- Click "Save" to apply the restrictions.
Sources in GCP Docs: Create API Keys; Restrict API key usage in iOS apps
Provide the generated API key in the Settings -> GCP Configuration of the AetherVoice app.
(Note: the API key acts like a password to access your GCP account's Text-to-Speech API. Don't share it with anyone. The value will be securely stored in your keychain upon entering it)
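If you want to check the key from a terminal before entering it in the app, the following is a minimal sketch against the public Text-to-Speech REST endpoint. It is not the app's own code. The function names are hypothetical, and it assumes that a key restricted to iOS apps is accepted when the request carries the bundle identifier in the `X-Ios-Bundle-Identifier` header; if you changed the bundle ID in Xcode, pass that value instead.

```python
import base64
import json
import urllib.request

GCP_TTS_URL = "https://texttospeech.googleapis.com/v1/text:synthesize"


def build_gcp_tts_request(api_key, text, bundle_id="com.ract.AetherVoice",
                          language_code="en-US"):
    """Build (but do not send) a synthesize request for the given text."""
    body = {
        "input": {"text": text},
        "voice": {"languageCode": language_code},
        "audioConfig": {"audioEncoding": "MP3"},
    }
    headers = {
        "Content-Type": "application/json; charset=utf-8",
        "X-Goog-Api-Key": api_key,
        # Assumption: needed because the key is restricted to iOS apps.
        "X-Ios-Bundle-Identifier": bundle_id,
    }
    return urllib.request.Request(
        GCP_TTS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def synthesize_to_file(request, path):
    """Send the request and write the returned MP3 audio to path."""
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # The audio comes back base64-encoded in the "audioContent" field.
    with open(path, "wb") as out:
        out.write(base64.b64decode(payload["audioContent"]))
```

Calling `synthesize_to_file(build_gcp_tts_request(key, "Hello"), "hello.mp3")` sends one short billable request. An HTTP 403 error usually points to a key restriction or an API that is not enabled for the project.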
- Sign up for Azure or use an existing account: https://azure.microsoft.com/en-us/free/open-source
- Sign in to your Azure Portal and go to "Deploy a custom template".
- Click "Build your own template in the editor" and then load the AetherVoice/Dist/AetherVoiceFree_Azure_template.json file. Feel free to change the region if you want; the available voices vary by region. The template uses the free tier, but feel free to change to a standard or pay-as-you-go tier if you want.
- Save -> Create new or use existing resource group. -> Review + Create -> Create
- Wait for deployment to finish and then "Go to resource"
- Under "Keys and endpoint" section of the resource, note down "Key 1" or "Key 2" and the region.
Sources in Azure Docs: Deploy templates; TTS #
Provide the generated Resource key and Azure Region in the Settings -> Azure Configuration of the AetherVoice app.
(Note: the resource key acts like a password to access your Azure account's Text-to-Speech API. Don't share it with anyone. The value will be securely stored in your keychain upon entering it)
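To confirm that the resource key and region work together before entering them in the app, here is a minimal sketch against the Speech service REST endpoint. It is not the app's own code. The function names and the `User-Agent` value are hypothetical, and the voice `en-US-JennyNeural` is an assumption that may not be available in every region.

```python
import urllib.request
from xml.sax.saxutils import escape


def build_azure_tts_request(resource_key, region, text, voice="en-US-JennyNeural"):
    """Build (but do not send) a text-to-speech request for the given region."""
    url = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1"
    # The service expects SSML; escape the text so &, < and > stay valid XML.
    ssml = (
        "<speak version='1.0' xml:lang='en-US'>"
        f"<voice name='{voice}'>{escape(text)}</voice>"
        "</speak>"
    )
    headers = {
        "Ocp-Apim-Subscription-Key": resource_key,
        "Content-Type": "application/ssml+xml",
        "X-Microsoft-OutputFormat": "audio-16khz-128kbitrate-mono-mp3",
        "User-Agent": "AetherVoiceKeyCheck",  # hypothetical client name
    }
    return urllib.request.Request(
        url, data=ssml.encode("utf-8"), headers=headers, method="POST"
    )


def save_speech(request, path):
    """Send the request and write the returned audio bytes to path."""
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())
```

Calling `save_speech(build_azure_tts_request(key, "eastus", "Hello"), "hello.mp3")` sends one short request that counts against your quota. An HTTP 401 error usually means the key and the region do not belong to the same resource.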