SmartSpeech

The English version of the documentation is under development.

SmartSpeech is a service for speech recognition and synthesis for Salute virtual assistants. Using this service, you can:

  • Transform your speech into text and back, all in real time.
  • Apply deep learning across the speech database.
  • Apply all functions of the service to phone communications.

Use cases:

  • Speech synthesis for chats, guides, and product descriptions. Users of an app or website can both view and listen to the content.
  • Voice input for text. Customers can use voice input when typing is difficult. Integrate speech recognition into chats, search, or navigation.
  • Interactive menu and auto attendant. Use speech synthesis and recognition to build an interactive voice response (IVR) system or auto attendant and streamline call center operations.
  • Telemarketing. SmartSpeech technologies let you run calls without human operators, which makes telemarketing more efficient.

When using SmartSpeech, do not exceed 10 simultaneous streams at any given time.
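
One way to stay within the 10-stream limit on the client side is to guard every stream with a semaphore. The sketch below is a minimal illustration in Python, not part of the SmartSpeech API: `process_stream` is a hypothetical placeholder for whatever recognition or synthesis call your application makes, and the sleep stands in for real network I/O.

```python
import asyncio

MAX_STREAMS = 10  # limit stated in this documentation


async def process_stream(stream_id: int) -> str:
    """Hypothetical placeholder for one recognition or synthesis stream."""
    await asyncio.sleep(0.01)  # stands in for real network I/O
    return f"stream {stream_id} done"


async def run_all(stream_ids, limit: int = MAX_STREAMS):
    """Run all streams, never allowing more than `limit` at the same time."""
    semaphore = asyncio.Semaphore(limit)
    active = 0  # streams currently inside the guarded section
    peak = 0    # highest concurrency observed, for verification only

    async def guarded(stream_id: int) -> str:
        nonlocal active, peak
        async with semaphore:  # waits here while `limit` streams are busy
            active += 1
            peak = max(peak, active)
            try:
                return await process_stream(stream_id)
            finally:
                active -= 1

    results = await asyncio.gather(*(guarded(i) for i in stream_ids))
    return results, peak


if __name__ == "__main__":
    results, peak = asyncio.run(run_all(range(25)))
    print(len(results), "streams finished; peak concurrency:", peak)
```

The `peak` counter exists only to confirm that the cap holds. In a real client you would keep the semaphore and drop the bookkeeping.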

Our GitHub account contains code samples of client apps and protobuf files.

To use the service, you must be registered as either a sole trader or a legal entity and pass moderation.

To start using the service, read the Terms of Use section.

Synthesis

Speech synthesis includes:

  • API for speech synthesis
  • How to create audio from text
  • Speech synthesis tagging

Recognition

Speech recognition includes:

  • API for speech recognition:

    • Synchronous recognition (HTTP)
    • Stream recognition (gRPC)
    • Asynchronous recognition (HTTP and gRPC)
  • How to get text from audio
  • How to improve recognition outcomes

SmartSpeech team contacts

Please send your questions to: SmartSpeech@sberbank.ru.