Dialogue Cloud

Speech Engines for Anywhere365

Note: Only the default Text to Speech Provider ‘Microsoft Server Speech Synthesizer’ can be used for all IVR Questions as configured in SharePoint. The advanced Speech Engines (both ‘Microsoft Cognitive Services Speech’ and ‘Google Cloud Text to Speech’) can only be used for Queue Messages, Callback Messages, and in combination with plugins, for example Dialogue Studio.

Interactive Voice Response, or IVR, is a telephone application that takes orders via telephone keypad or voice through a computer. By choosing menu options the caller receives information without the intervention of a human operator, or is forwarded to the appropriate Agent. The Callback feature enables customers to leave their number to be called back by an available agent during business hours.

Introduction

Anywhere365 supports different Text to Speech Providers, but only one Provider can be configured for a UCC. By default, Anywhere365 uses ‘Microsoft Server Speech Synthesizer’, which is installed as part of Anywhere365. If an advanced provider is configured, ‘Microsoft Server Speech Synthesizer’ will be used as a back-up or in specific cases. For more details, continue reading.

A Unified Contact Center, or UCC, is a queue of interactions (voice, email, IM, etc.) that are handled by Agents. Each UCC has its own settings, IVR menus and Agents. Agents can belong to one or several UCCs and can have multiple skills (competencies). A UCC can be visualized as a contact center “micro service”. Customers can utilize one UCC (e.g. a global helpdesk), a few UCCs (e.g. for each department or regional office) or hundreds of UCCs (e.g. for each bed at a hospital). They are interconnected and can all be managed from one central location.

 

MicrosoftSpeechSynthesizer

This is the default Text to Speech provider, installed as part of Anywhere365. By default, it is used for all text to speech operations.

| Setting | Value | Remark |
| --- | --- | --- |
| SpeechProvider | MicrosoftSpeechSynthesizer | Optional, as this is the default provider |
| SpeechPreferredVoiceName | Example: Microsoft Server Speech Text to Speech Voice (en-GB, Hazel) | Optional; if not specified, the Culture Info will be used to determine an appropriate voice |

 

MicrosoftCognitiveServices​

This speech provider uses the Azure cloud service* for text to speech operations. This provider offers better quality compared to the default provider. This advanced text to speech provider can only be used for specific operations:

  • Queue messages configured in the IVR Questions list on SharePoint

  • Callback messages configured in the IVR Questions list on SharePoint

  • Say-nodes in Dialogue Studio
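The restriction above can be pictured as a simple routing rule. The following is a minimal sketch only: the operation names and the function `provider_for` are hypothetical illustrations and not Anywhere365 identifiers. It assumes that any operation outside the three listed ones is handled by the default provider.

```python
# Illustrative sketch: operation names below are assumptions made for this
# example, not identifiers used by Anywhere365.
DEFAULT_PROVIDER = "MicrosoftSpeechSynthesizer"

# The three operations for which an advanced provider can be used.
ADVANCED_OPERATIONS = {
    "queue_message",
    "callback_message",
    "dialogue_studio_say_node",
}


def provider_for(operation: str, configured_provider: str = DEFAULT_PROVIDER) -> str:
    """Return the provider that would handle the given text to speech operation."""
    if operation in ADVANCED_OPERATIONS:
        return configured_provider
    # Everything else (for example regular IVR questions) stays on the default.
    return DEFAULT_PROVIDER


print(provider_for("queue_message", "MicrosoftCognitiveServices"))
print(provider_for("ivr_question", "MicrosoftCognitiveServices"))
```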

A Cognitive Services key can be obtained through the Azure Portal. After you have added a "Cognitive Services" resource to your Azure Subscription, copy key 1 from the "Keys and Endpoint" section. Make sure to use the correct endpoint addresses in the settings below if you host the service in a region other than West Europe.

| Setting | Description | Value | Remark |
| --- | --- | --- | --- |
| SpeechMicrosoftCognitiveApiKey | The API key for Microsoft Cognitive Services. | | Mandatory |
| SpeechMicrosoftCognitiveApiEndpoint | The API endpoint for Microsoft Cognitive Services. | Example: https://westeurope.tts.speech.microsoft.com/cognitiveservices/v1 | Mandatory |
| SpeechMicrosoftCognitiveApiAuthorizationEndpoint | The authorization endpoint for Microsoft Cognitive Services. | Example: https://westeurope.api.cognitive.microsoft.com/sts/v1.0/issueToken | Mandatory |
| SpeechProvider | | MicrosoftCognitiveServices | Mandatory |
| SpeechPreferredVoiceName | | Example: Microsoft Server Speech Text to Speech Voice (en-US, JessaNeural) | Optional; if not specified, the Culture Info will be used to determine an appropriate voice |

Note: Text to Speech is subject to additional costs, which are billed to the Azure subscription used. For rates, see https://azure.microsoft.com/en-us/pricing/details/cognitive-services/speech-services/.
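Both example endpoints for this provider start with the Azure region name (westeurope). As a minimal sketch, assuming the same URL pattern holds for the region where your resource is hosted, the two endpoint settings could be derived as follows. The helper `cognitive_endpoints` is hypothetical and not part of Anywhere365; always verify the result against the "Keys and Endpoint" section in the Azure Portal.

```python
import re


def cognitive_endpoints(region: str) -> dict:
    """Build the two endpoint setting values for an Azure region name.

    Assumption: the region-prefixed pattern of the West Europe examples
    applies to other regions as well.
    """
    # Region identifiers such as "westeurope" are lowercase letters and digits.
    if not re.fullmatch(r"[a-z0-9]+", region):
        raise ValueError(f"unexpected region name: {region!r}")
    return {
        "SpeechMicrosoftCognitiveApiEndpoint":
            f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1",
        "SpeechMicrosoftCognitiveApiAuthorizationEndpoint":
            f"https://{region}.api.cognitive.microsoft.com/sts/v1.0/issueToken",
    }


for setting, value in cognitive_endpoints("northeurope").items():
    print(setting, "=", value)
```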

 

GoogleCloudTextToSpeech​

The Google Cloud Text to Speech service* provides the most comprehensive set of voices. For a complete overview of all voices, see https://cloud.google.com/text-to-speech/docs/voices. This advanced text to speech provider can only be used for specific operations:

  • Queue messages configured in the IVR Questions list on SharePoint

  • Callback messages configured in the IVR Questions list on SharePoint

  • Say-nodes in Dialogue Studio

To enable this text to speech provider, you have to configure two lists: GlobalSettings and PluginSettings.

The PluginSettings list contains the Credentials JSON of the Google service.

 

GlobalSettings

| Setting | Value | Remark |
| --- | --- | --- |
| SpeechProvider | GoogleCloudTextToSpeechV1 | Mandatory |
| SpeechPreferredVoiceName | Example: nl-NL-Wavenet-C | Optional; if not specified, the Culture Info will be used to determine an appropriate voice |

PluginSettings

| Setting | Scope | Value | Remark |
| --- | --- | --- | --- |
| GoogleAppCredentialsJson | TextToSpeech | The credentials JSON generated in Google Cloud. | Mandatory |

Note: For more information about Google's offering, see https://cloud.google.com/speech-to-text
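Before pasting the Credentials JSON into the PluginSettings list, it can help to check that it is well-formed. The sketch below assumes the credentials are a Google Cloud service account key file, which normally contains the fields "type", "project_id", "private_key" and "client_email". The function `check_credentials_json` is a hypothetical helper, not part of Anywhere365 or of any Google library.

```python
import json

# Assumption: a service account key file; these fields are normally present.
REQUIRED_KEYS = {"type", "project_id", "private_key", "client_email"}


def check_credentials_json(text: str) -> list:
    """Return a list of problems found; an empty list means the basic checks passed."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be a JSON object"]
    problems = [f"missing field: {key}" for key in sorted(REQUIRED_KEYS - data.keys())]
    # A missing "type" is already reported above; only flag a wrong value here.
    if data.get("type") not in (None, "service_account"):
        problems.append('field "type" is not "service_account"')
    return problems


# Placeholder values only; a real key file is downloaded from Google Cloud.
sample = json.dumps({
    "type": "service_account",
    "project_id": "example-project",
    "private_key": "placeholder",
    "client_email": "placeholder@example.invalid",
})
print(check_credentials_json(sample))
```

This only checks the shape of the JSON. It does not verify that the key is valid or that the Text to Speech API is enabled for the project.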

 

UCC Voice Selection

By default, the UCC selects the voice whose name matches the value of the "SpeechPreferredVoiceName" setting. This setting is part of the generic Settings list of the UCC, and its value should match the voice name used by the corresponding text to speech provider. If there is no match, the UCC selects a voice based on the value of the "CultureInfo" setting.

If there are multiple matching voices, the first voice that matches the criteria will be selected.

When a speech provider can't initialize or is misconfigured, the MicrosoftSpeechSynthesizer will be used as fallback.
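The selection rules above can be sketched as follows. This is an illustration of the described order (preferred name first, then culture, first match wins), not the actual Anywhere365 implementation. The `Voice` type, the `select_voice` function and the case-insensitive culture comparison are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Voice:
    name: str     # voice name as listed by the text to speech provider
    culture: str  # culture of the voice, for example "nl-NL"


def select_voice(
    voices: Sequence[Voice],
    preferred_name: Optional[str],
    culture_info: str,
) -> Optional[Voice]:
    """Pick a voice by preferred name first, then by culture; first match wins."""
    if preferred_name:
        for voice in voices:
            if voice.name == preferred_name:
                return voice
    # No name match: fall back to the first voice with a matching culture.
    for voice in voices:
        if voice.culture.lower() == culture_info.lower():
            return voice
    return None  # nothing matched; what happens next is not covered by this sketch


available = [
    Voice("nl-NL-Wavenet-A", "nl-NL"),
    Voice("nl-NL-Wavenet-C", "nl-NL"),
    Voice("en-GB-Wavenet-A", "en-GB"),
]
print(select_voice(available, "nl-NL-Wavenet-C", "en-GB"))
print(select_voice(available, None, "nl-NL"))
```

In the first call the preferred name wins even though the culture differs. In the second call no name is given, so the first voice with culture nl-NL is returned.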

 

Voice Selection Dialogue Studio

Via Dialogue Studio it is possible to configure the voice in three ways:

  1. Default (UCC Configured); see "UCC Voice Selection" above.

  2. Custom Voice; configure the Culture and Gender, and the UCC will select a voice of the configured text to speech provider matching the criteria.

  3. SSML; this makes it possible to select multiple voices for each individual node in your flow (useful for multi-lingual messages).
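For the SSML option, a multi-lingual message typically wraps each part in its own voice element. The sketch below builds such a string in Python. The helper `build_ssml` is hypothetical, the voice name en-US-Wavenet-D is an assumed example, and whether a given provider accepts the voice element in this form should be checked in that provider's SSML documentation.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(parts, default_lang="en-US"):
    """Build an SSML document in which each (voice_name, text) pair gets its own voice."""
    voices = "".join(
        # quoteattr adds the surrounding quotes; escape protects &, < and > in the text.
        f"<voice name={quoteattr(name)}>{escape(text)}</voice>"
        for name, text in parts
    )
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(default_lang)}>{voices}</speak>"
    )


ssml = build_ssml([
    ("nl-NL-Wavenet-C", "Goedemorgen, u spreekt met de helpdesk."),
    ("en-US-Wavenet-D", "Good morning, you have reached the helpdesk."),  # assumed voice name
])
print(ssml)
```

Voice names must come from the voice list of the text to speech provider configured for the UCC.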