The English version of the documentation is under development.
In the meantime, you can read the Russian version.
SmartSpeech is a service for speech recognition and synthesis for Salute virtual assistants. Using this service, you can:
- Convert speech to text and text to speech in real time.
- Apply deep learning across the speech database.
- Use all functions of the service for phone communications.
- Speech synthesis for chats, guides, and product descriptions. Users of an app or website can both view and listen to the content.
- Voice input for texts. Customers can use voice input when typing is difficult. Integrate speech recognition into chats, search, or navigation.
- Interactive menu and auto attendant. You can use speech synthesis and recognition to create an interactive voice response (IVR) system or an auto attendant, which streamlines call center operations.
- Telemarketing. SmartSpeech technologies can take over calls from human operators, making your telemarketing more efficient.
When using SmartSpeech, do not exceed 10 simultaneous streams at any given time.
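One way to stay within the 10-stream limit on the client side is to gate requests with a semaphore. The following is a minimal sketch, not part of the SmartSpeech API: `recognize_stub` is a hypothetical placeholder for a real recognition or synthesis request, and `run_limited` and `MAX_STREAMS` are illustrative names.

```python
import asyncio

MAX_STREAMS = 10  # simultaneous-stream limit stated above


async def recognize_stub(audio_id: int) -> str:
    """Hypothetical placeholder for a real SmartSpeech request."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"transcript-{audio_id}"


async def run_limited(audio_ids, limit=MAX_STREAMS):
    """Process all items while keeping at most `limit` requests in flight."""
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0  # highest number of concurrent requests observed

    async def worker(audio_id):
        nonlocal in_flight, peak
        async with semaphore:  # waits here once `limit` requests are active
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await recognize_stub(audio_id)
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(worker(i) for i in audio_ids))
    return results, peak


if __name__ == "__main__":
    results, peak = asyncio.run(run_limited(range(25)))
    print(len(results), "results, peak concurrency:", peak)
```

In a real integration, the body of `recognize_stub` would be replaced with the actual request. The semaphore only limits streams opened by this one process, so several clients sharing the same account would need to coordinate the limit between them.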
To use the service, you need to be registered as either a sole trader or a legal entity and pass moderation.
Speech synthesis includes:
- API for speech synthesis
- How to create audio from text
- Speech synthesis tagging
Speech recognition includes:
- API for speech recognition:
  - Synchronous recognition (HTTP)
  - Stream recognition (gRPC)
  - Asynchronous recognition (HTTP and gRPC)
- How to get text from audio
- How to improve recognition outcomes
SmartSpeech team contacts
Please send your questions to: SmartSpeech@sberbank.ru.