Text-to-Speech Service Configuration
Rodel Agent supports multiple text-to-speech services. See the section below for the service you want to configure.
Open AI
Documentation | https://platform.openai.com/docs/api-reference/chat/create |
API Token | https://platform.openai.com/account/api-keys |
Configuring Open AI is straightforward: enter your API key in the Access Key setting.
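For reference, the key entered in Access Key is sent as a bearer token on each request. The snippet below is a minimal sketch of such a request, not Rodel Agent's actual implementation. It assumes the public `https://api.openai.com/v1/audio/speech` endpoint, the `tts-1` model, and the `alloy` voice; the helper name `build_openai_tts_request` is hypothetical.

```python
import json
import urllib.request

# Assumption: the public OpenAI speech endpoint.
OPENAI_SPEECH_URL = "https://api.openai.com/v1/audio/speech"


def build_openai_tts_request(
    api_key: str,
    text: str,
    model: str = "tts-1",   # assumed default model
    voice: str = "alloy",   # assumed default voice
) -> urllib.request.Request:
    """Build (but do not send) a text-to-speech request for Open AI."""
    payload = json.dumps({"model": model, "input": text, "voice": voice})
    return urllib.request.Request(
        OPENAI_SPEECH_URL,
        data=payload.encode("utf-8"),
        headers={
            # The API key from the Access Key setting goes here.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Sending the request returns the audio bytes:
#   req = build_openai_tts_request("<your API key>", "Hello")
#   with urllib.request.urlopen(req) as resp:
#       audio = resp.read()
```

If the key is wrong or missing, the service is expected to reject the request with an authentication error, which is a quick way to confirm the setting took effect.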
Azure Open AI
Documentation | https://learn.microsoft.com/azure/ai-services/openai/overview |
Azure Service | https://azure.microsoft.com/en-us/products/ai-services/openai-service |
Microsoft has a close partnership with Open AI and offers Open AI's models through its own cloud. You can therefore create an Open AI resource on the Azure cloud service platform and deploy Open AI models such as tts and tts-hd as needed.
Although the models are the same, the network request formats differ, as do the required configuration items.
Once you have deployed an Open AI resource, you can find the required key (either of the two keys will work) and the endpoint on the resource page under Resource Management -> Keys and Endpoints.
Adding Custom Models
The biggest difference between Azure Open AI and Open AI is that you need to manually deploy the required models.
In the application, Azure Open AI does not come with pre-configured models. Filling in the key and endpoint alone is not enough for the configuration to take effect; you also need to create custom models.
In Azure OpenAI Studio, you can deploy specific models from the model library.
Each model has its own ID; for example, the ID for DALL·E 3 is dall-e-3.
When deploying a model, Azure asks you to provide a Deployment Name, which is crucial: the API identifies the model by its deployment name, not by its model ID.
To reduce ambiguity, I recommend using the model ID as the deployment name.
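To illustrate why the deployment name matters, here is a rough sketch of how an Azure Open AI speech request is addressed: the deployment name is part of the URL path, and the key travels in an `api-key` header rather than as a bearer token. This is an illustration under assumptions, not the application's own code. The helper name `build_azure_openai_tts_request`, the `alloy` voice, and the `2024-02-15-preview` API version are assumptions; check the Azure documentation for the version that applies to your resource.

```python
import json
import urllib.request
from urllib.parse import quote


def build_azure_openai_tts_request(
    endpoint: str,
    api_key: str,
    deployment_name: str,
    text: str,
    model_id: str = "tts",                    # assumed model ID
    voice: str = "alloy",                     # assumed voice
    api_version: str = "2024-02-15-preview",  # assumed API version
) -> urllib.request.Request:
    """Build (but do not send) a speech request for Azure Open AI."""
    base = endpoint.rstrip("/")
    # The deployment name, not the model ID, selects the model in the URL.
    url = (
        f"{base}/openai/deployments/{quote(deployment_name, safe='')}"
        f"/audio/speech?api-version={api_version}"
    )
    payload = json.dumps({"model": model_id, "input": text, "voice": voice})
    return urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

With a deployment named `tts` on a resource at `https://example.openai.azure.com` (a placeholder), the request goes to `.../openai/deployments/tts/audio/speech`. A deployment named differently from its model would appear in the path under its own name, which is why matching the two avoids confusion.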
Azure TTS
Documentation | https://learn.microsoft.com/azure/ai-services/speech-service/text-to-speech |
Azure Service | https://azure.microsoft.com/zh-cn/products/ai-services/text-to-speech |
Azure's Speech service includes text-to-speech capabilities. You can create your own Speech resource by following the documentation.
Then enter the Access Key and Region of that resource in the application settings.
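The two settings map onto the Speech service REST API roughly as follows: the region forms the host name, and the access key is sent in the `Ocp-Apim-Subscription-Key` header. The sketch below is illustrative only and does not describe how the application calls the service internally. The helper name `build_azure_tts_request`, the `en-US-JennyNeural` voice, the MP3 output format, and the `User-Agent` value are assumptions.

```python
import urllib.request
from xml.sax.saxutils import escape


def build_azure_tts_request(
    access_key: str,
    region: str,
    text: str,
    voice: str = "en-US-JennyNeural",  # assumed voice
    output_format: str = "audio-16khz-128kbitrate-mono-mp3",  # assumed format
) -> urllib.request.Request:
    """Build (but do not send) an Azure Speech text-to-speech request."""
    # The Region setting selects the regional host.
    url = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1"
    # The request body is SSML; escape the text so it stays valid XML.
    ssml = (
        "<speak version='1.0' xml:lang='en-US'>"
        f"<voice name='{voice}'>{escape(text)}</voice>"
        "</speak>"
    )
    return urllib.request.Request(
        url,
        data=ssml.encode("utf-8"),
        headers={
            # The Access Key setting goes here.
            "Ocp-Apim-Subscription-Key": access_key,
            "Content-Type": "application/ssml+xml",
            "X-Microsoft-OutputFormat": output_format,
            "User-Agent": "tts-config-sketch",  # hypothetical client name
        },
        method="POST",
    )
```

A key from one region generally does not work against another region's host, so the Region value should match the region shown on the resource page.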