How to configure transcription in the UCC Plugin Framework
The transcription feature transcribes spoken customer voice dialogues in our Plugin Framework. Anywhere365 Dialogue Studio, in combination with transcription, can be used to write the transcribed text to a CRM (Customer Relationship Management) system or a database.
Prerequisites
- A Google Cloud project with the "Cloud Speech-to-Text API" enabled and JSON credentials
or
- An Azure subscription with a Speech resource
Note: Introduced in DC2024.01
Transcription in Plugin Framework
There are a few settings required to allow plugins to use transcription functionality. First, two optional settings can be set in the general Settings list of the SharePoint UCC (Unified Contact Center) to allow transcribed text to be written to the general UCC log file for developer debugging. The other settings are plugin specific and go into the PluginSettings list. See the tables below.
Settings
Setting | Description | Example value
---|---|---
EnableTranscriptLogging | Enables visibility of the transcribed text in the logs. This setting can impact privacy and should only be set to True for debugging purposes. | True (default: False)
EnableTranscriptIntermediateLogging | Enables logging of intermediate results. Some transcriptors can provide partial results while still listening. When set to True, all intermediate results are written to the log; when set to False, only final messages are logged. This setting only has effect when EnableTranscriptLogging is set to True. | True (default: False)
PluginSettings
Setting | Description | Example value | Scope
---|---|---|---
PluginPath | Full path to the dll of the transcriptor plugin, e.g. Z:\UccService\Plugins\Wsp.Anywhere365.SpeechToTextEngine.Impl.GoogleSpeech\1.0.x.x\Wsp.Anywhere365.SpeechToTextEngine.Impl.GoogleSpeech.dll or Z:\UccService\Plugins\Wsp.Anywhere365.SpeechToTextEngine.Impl.MsCognitiveServices\1.0.x.x\Wsp.Anywhere365.SpeechToTextEngine.Impl.MsCognitiveServices.dll | Z:\UccService\Plugins\Wsp.Anywhere365.SpeechToTextEngine.Impl.GoogleSpeech\1.0.x.x\Wsp.Anywhere365.SpeechToTextEngine.Impl.GoogleSpeech.dll | Transcriptor

Note: The correct version number (1.0.x.x) will be automatically populated when requested on Dialogue Cloud.
Plugin specific settings
Google Speech
To set up Google Cloud, follow this guide: https://cloud.google.com/speech-to-text/docs/quickstart-client-libraries.
Make sure the "Cloud Speech-to-Text API" is enabled.
The following settings need to be added in the "PluginSettings" list.
Setting | Value | Scope
---|---|---
GoogleCredentialJson | The plain-text JSON of the credential. Do not enter the path to the file, but the content of the file. | Transcriptor
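Since the setting expects the file's content rather than its path, it can help to verify the credential before pasting it into GoogleCredentialJson. A minimal sketch, assuming a service-account file downloaded from Google Cloud (the file name is a placeholder):

```python
import json

def load_credential_json(path):
    """Read a Google service-account file and return its raw content,
    after checking that it parses as JSON and looks like a credential."""
    with open(path, encoding="utf-8") as f:
        content = f.read()
    data = json.loads(content)  # raises ValueError if the file is not valid JSON
    # Service-account credential files normally carry these keys.
    for key in ("type", "project_id", "private_key", "client_email"):
        if key not in data:
            raise KeyError(f"credential file is missing '{key}'")
    return content  # paste this full text into the GoogleCredentialJson setting
```

The returned string is the exact text to put in the setting; the check only guards against accidentally pasting a file path or a truncated copy.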
Proxy server with Google Speech
To use a proxy server with Google Speech, you will need to add an Environment Variable at the system level (not for the user only). Add the variable http_proxy with the value http://proxyserver.local:8080.
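The proxy value must be a plain URL with scheme, host, and port. A small sketch that validates the http_proxy variable before starting the service; the hostname is taken from the example above and is illustrative:

```python
import os
from urllib.parse import urlsplit

def check_http_proxy(value=None):
    """Validate an http_proxy value; returns (host, port) or raises ValueError.

    With no argument, reads the variable from the current environment."""
    value = value if value is not None else os.environ.get("http_proxy", "")
    parts = urlsplit(value)
    if parts.scheme != "http" or not parts.hostname or not parts.port:
        raise ValueError("http_proxy must look like http://host:port")
    return parts.hostname, parts.port

# check_http_proxy("http://proxyserver.local:8080") -> ("proxyserver.local", 8080)
```

Note that a variable set only in the current process is not enough for the UCC service; it must be set system-wide as described above.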
Microsoft Speech
To set up Microsoft Speech transcription, follow this guide: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/get-started-speech-to-text?tabs=windows%2Cterminal&pivots=programming-language-csharp#prerequisites.
Prerequisites
- An Azure subscription
- A Speech resource
- The Speech resource key and region
The following settings need to be added in the "PluginSettings" list.
Setting | Value | Scope
---|---|---
CredentialsJson | | Transcriptor
Google Transcription Model
To use a specific transcription model from Google's list of models, an optional Plugin setting can be added. For the current list of models, see: https://cloud.google.com/speech-to-text/docs/transcription-model
If no setting is added to PluginSettings, phone_call is the default model, provided the detected language supports it. For languages for which the phone_call model does not exist, the transcriptor falls back to Google's default model.
Setting | Value | Scope
---|---|---
GoogleRecognitionModel | The plain-text name of the model. | Transcription
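The fallback behaviour described above can be sketched as follows. The language set here is illustrative only, not Google's actual support matrix:

```python
# Languages for which the phone_call model is available (illustrative subset,
# not Google's real support matrix).
PHONE_CALL_LANGUAGES = {"en-US", "en-GB", "fr-FR"}

def pick_recognition_model(language, configured_model=None):
    """Mirror the documented selection: an explicit GoogleRecognitionModel
    setting wins; otherwise phone_call when the detected language supports
    it; otherwise Google's default model."""
    if configured_model:
        return configured_model
    if language in PHONE_CALL_LANGUAGES:
        return "phone_call"
    return "default"
```

For example, with no setting configured, an en-US dialogue would use phone_call while a language without that model would use default.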
Google Phrase List - Model Adaptation
The recognition of specific words or phrases can be improved using model adaptation. This is useful when certain words or phrases are ambiguous with others: for example, if "Claim" and "Plane" are easily confused, this feature can boost "Claim" so that it is recognized more often.
The phrase set can be created in the Google Cloud Console, where it is given a reference name in the format projects/{project_id}/locations/{location}/phraseSets/{phrase_set_id}.
If a reference is set in this setting, the adaptation model with that specific phrase set will be used for the transcriptions of the node.
Note: The name has to be exactly as stated in the Google Cloud Console. The UCC logs indicate whether the phrase set was picked up successfully. If something went wrong, the error is logged and the transcription falls back to using no phrase list.
Setting | Value | Scope
---|---|---
AdaptationModelPhraseSet | Optional phrase set with commonly used phrases or single words to prioritize during transcription. | Transcription
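Because the reference must match the Google Cloud Console name exactly, it can help to build or validate it programmatically before putting it in the setting. A sketch, where the project, location, and phrase set IDs are placeholders:

```python
import re

# Resource-name format documented above:
# projects/{project_id}/locations/{location}/phraseSets/{phrase_set_id}
PHRASE_SET_PATTERN = re.compile(
    r"^projects/(?P<project_id>[^/]+)/locations/(?P<location>[^/]+)"
    r"/phraseSets/(?P<phrase_set_id>[^/]+)$"
)

def phrase_set_name(project_id, location, phrase_set_id):
    """Build the resource name in the exact format the setting expects."""
    return f"projects/{project_id}/locations/{location}/phraseSets/{phrase_set_id}"

def is_valid_phrase_set_name(name):
    """Check a pasted value against the documented format."""
    return PHRASE_SET_PATTERN.match(name) is not None
```

This only checks the shape of the reference; whether the phrase set actually exists is confirmed by the UCC logs, as noted above.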
Transcription Recording
Introduction
The transcriptions can also be recorded in a text file.
Settings
Setting | Value | Description
---|---|---
UseTranscriptRecording | true | If true, the transcript will be recorded
Location
The recording will be stored at the same location as the audio recordings.
Data Format
In the example transcription below, each transcription has an "IsFinal" flag that indicates whether the transcription is complete. Intermediate results are also logged. To determine the name or URI of the participant whose audio was transcribed, join TranscriptHistoryMessages.ParticipantId with HistoryParticipants.Id.
{
"Version":"1.0",
"DialogueId":"f748e530-6955-47a6-b7da-0de1fe0d8ea3",
"HistoryParticipants":[
{
"Id":0,
"ParticipantUri":"sip:ucctestuser1406@anywhere365.net",
"ParticipantDisplayName":"ucctestuser1406@anywhere365.net",
"ParticipantType":"Customer"
}
],
"ChatHistoryMessages":[
],
"TranscriptHistoryMessages":[
{
"Language":"nl",
"Transcript":"Hello",
"IsFinal":false,
"ParticipantId":0,
"Timestamp":"2020-01-06T11:38:34.8081659+01:00",
"Index":0
},
{
"Language":"nl",
"Transcript":"Hello world",
"IsFinal":true,
"ParticipantId":0,
"Timestamp":"2020-01-06T11:38:34.8236803+01:00",
"Index":0
}
],
"TranslationHistoryMessages":[
]
}
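The join described above can be performed like this; the sketch assumes the recording JSON has already been loaded into a dict, and keeps only final results:

```python
def final_transcripts(recording):
    """Return (participant display name, transcript) pairs for all final
    messages, joining TranscriptHistoryMessages.ParticipantId to
    HistoryParticipants.Id."""
    participants = {p["Id"]: p for p in recording["HistoryParticipants"]}
    result = []
    for msg in recording["TranscriptHistoryMessages"]:
        if msg["IsFinal"]:  # skip intermediate (partial) results
            who = participants[msg["ParticipantId"]]["ParticipantDisplayName"]
            result.append((who, msg["Transcript"]))
    return result
```

Applied to the example above, this yields [("ucctestuser1406@anywhere365.net", "Hello world")]: the intermediate "Hello" is dropped because its IsFinal flag is false.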