Install SpeechToText plugin

Speech API Limitations

  • Do not use your computer’s internal microphone for speaking. Although it is supported, it can compromise the overall stability of the Speech API.
  • The Speech API can only record short chunks of up to 15 seconds. To work around this, a new recording starts as soon as the previous one ends. However, a word can be cut in two at the boundary, in which case the API will not understand it.
  • Speech recognition is heavily influenced by the speaker: it expects words to be pronounced in a certain way in order to recognize them. As a result, it can be quite unreliable for non-native speakers of the language.
  • Recognition of the less widely spoken supported languages is still quite poor.
  • It has trouble understanding words that are not regular words in the language, such as names.
  • The API attempts to translate words from another language that are also used in the currently selected language.
  • It does not distinguish between different voices.
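The chunking limitation can be illustrated with a small sketch. This is not the plugin's code: the function names and the word timings are hypothetical, and only the 15-second chunk length is taken from the list above. It shows how back-to-back chunks are laid out and which words would straddle a chunk boundary.

```python
def chunk_boundaries(total_seconds, chunk_seconds=15.0):
    """Return (start, end) pairs covering a recording in back-to-back chunks."""
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    bounds = []
    start = 0.0
    while start < total_seconds:
        end = min(start + chunk_seconds, total_seconds)
        bounds.append((start, end))
        start = end
    return bounds


def words_cut_by_chunks(words, bounds):
    """Return the words whose time span crosses a chunk boundary.

    `words` is a list of (text, start, end) tuples, times in seconds.
    """
    # Every chunk end except the last one is a point where a word can be split.
    cut_points = [end for _, end in bounds[:-1]]
    return [
        text
        for text, w_start, w_end in words
        if any(w_start < cut < w_end for cut in cut_points)
    ]


# Hypothetical 40-second call with three spoken words.
bounds = chunk_boundaries(40.0)
words = [("hello", 1.0, 1.4), ("anywhere", 14.8, 15.3), ("goodbye", 31.0, 31.5)]
print(bounds)
print(words_cut_by_chunks(words, bounds))
```

In this example only "anywhere" is spoken across the 15-second mark, so it is the only word at risk of being misunderstood.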

 

Obtaining a Speech API Key

An Azure subscription is a prerequisite, since the keys are generated via Azure. Anyone who has access to the Azure portal can use the free version, but this version expires after 30 days. The number of transactions available per month depends on the version you choose. The free version allows 5000 transactions per month, at a maximum of 20 transactions per minute.
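To get a feel for what these limits mean in practice, the following sketch does the arithmetic. It rests on two assumptions that the text does not confirm: that every recorded chunk counts as one transaction, and that the caller and receiver sides are sent as separate streams. The function names are hypothetical.

```python
FREE_TRANSACTIONS_PER_MONTH = 5000  # free version, from the text above
FREE_TRANSACTIONS_PER_MINUTE = 20   # free version, from the text above
CHUNK_SECONDS = 15                  # maximum chunk length of the Speech API


def call_hours_per_month(transactions, chunk_seconds, streams=1):
    """Hours of call time covered, ASSUMING one transaction per chunk per stream."""
    return transactions * chunk_seconds / streams / 3600


def transactions_per_minute(chunk_seconds, streams=1):
    """Transactions used per minute of call time under the same assumption."""
    return 60 / chunk_seconds * streams


# One stream versus two streams (caller and receiver, an assumption).
print(round(call_hours_per_month(FREE_TRANSACTIONS_PER_MONTH, CHUNK_SECONDS), 1))
print(round(call_hours_per_month(FREE_TRANSACTIONS_PER_MONTH, CHUNK_SECONDS, streams=2), 1))
print(transactions_per_minute(CHUNK_SECONDS, streams=2) <= FREE_TRANSACTIONS_PER_MINUTE)
```

Under these assumptions a single call stays well below the per-minute limit, and the monthly limit is the one to watch.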

The link for the cognitive services keys is https://www.microsoft.com/cognitive-services/en-us/subscriptions

On this page, select the Speech tab and then Bing Speech API, Get API Key. Log in with the Microsoft account that is linked to your Azure subscription. Then choose which type of subscription you wish to use.

 

Installation & folders

By default the installation folder is set to “C:\Program Files (x86)\Anywhere365 Attendant\”, but this can be changed during installation. All plugins are installed in the folder called plugins. If this folder is not present when you install a plugin, it will be created.

The default installation location should be: “C:\Program Files (x86)\Anywhere365 Attendant\plugins\Wsp.Anywhere365.OutboundDialerCallQueuePlugin”. After installation this folder will be present.

 

Settings in the attendant

After opening the attendant, click the circle at the bottom right to open the settings tab, and then click plugins in the bar on the right. Select the Speech To Text Plugin to open the page where you can configure its settings.

There are six configurable settings in the attendant. The Language setting and the SpeechApi On setting can also be changed on the plugin tab, but changes made there are not remembered when you close the attendant. This is handy if you only wish to record certain calls, or occasionally need to switch language, without altering the defaults stored in the settings.

  • SpeechApi Key: The generated key required to activate the Microsoft Speech API.
  • Key Words: A comma-separated list of words that you wish to highlight. For instance, if the setting contains anywhere, the word anywhere is highlighted in orange every time it is found in the text.
  • Language: The language the plugin expects to hear.
  • Speech Recording Time: The interval after which a new recording starts. The Speech API is not meant for long recordings, so recording is done in chunks instead. The downside is that a word spoken across a chunk boundary can lose its meaning, because one part is recorded in one chunk and the rest in the next.
  • SpeechApi On: Whether the plugin should record calls when the attendant is started. If you only wish to record certain calls, you can switch recording on in the plugin tab.
  • Play Sound File: You can put a sound file in %appdata%/Workstreampeople/Attendant/SoundFiles. This folder is created when the plugin is installed. The default sound file is installed in the plugin folder itself and can be moved to the appdata location. The sound file is then played before the recording of a call starts, but it will not be heard and its contents will not be shown on the plugin page. The reason is that the API aborts the recording if nothing is said during the first 10 seconds of a call, so the sound file acts as a failsafe for such cases.
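The behaviour of the Key Words setting can be sketched as follows. This is an illustration only, not the plugin's implementation: the function names are hypothetical, square brackets stand in for the orange highlight, and case-insensitive whole-word matching is an assumption, since the text does not say how matching is done.

```python
import re


def parse_keywords(setting):
    """Split the comma-separated Key Words setting into a clean list."""
    return [word.strip() for word in setting.split(",") if word.strip()]


def mark_keywords(text, keywords, left="[", right="]"):
    """Wrap every keyword occurrence in markers (stand-in for highlighting).

    ASSUMPTION: matching is case-insensitive and on whole words only.
    """
    if not keywords:
        return text
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: left + m.group(0) + right, text)


keywords = parse_keywords("anywhere, attendant ,")
print(mark_keywords("Welcome to Anywhere, the attendant is ready.", keywords))
```

Stray spaces and empty entries in the setting are ignored, so "anywhere, attendant ," yields the two keywords anywhere and attendant.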

 

When you are done configuring the settings, press save to keep your changes or cancel to discard them. If you attempt to leave the settings tab without doing so, you will be prompted to save or cancel.

 

 

Speech To Text Plugin

This section describes the appearance and features of the plugin in detail. On the right side of the attendant you will find the following icon for the plugin.

Once recording is set to active with a valid Speech API key, the plugin records every accepted call until the call is disconnected or you turn off the recording. The receiver side contains the words you speak, and the caller is the person on the other side of the call.

As described for the Speech Recording Time setting, each recorded chunk produces a new sentence. The text remains visible until the next accepted call. You are free to change the language on the fly: when the next chunk starts, it will be transcribed in the language selected at that moment. Use the save button to save the transcript. If an error occurs during the recording, it is shown as well. If the connection to the API is lost, the plugin actively tries to restore it, but this depends on the responsiveness of the Bing Speech API.

There is currently no way to distinguish multiple people in a group conversation, so all of them are treated as the same person: the caller.
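The way a transcript is built up can be pictured with a minimal data-structure sketch. It is not the plugin's code: the class names, the rendering format and the language codes are all hypothetical. It only models what is described above, namely that each chunk adds one line, that a language change applies from the next chunk onward, and that there are just two sides, receiver and caller.

```python
from dataclasses import dataclass, field


@dataclass
class TranscriptLine:
    side: str      # "receiver" (you) or "caller" (everyone on the other side)
    language: str  # language that was selected when the chunk started
    text: str


@dataclass
class Transcript:
    language: str  # currently selected language, may change during a call
    lines: list = field(default_factory=list)

    def add_chunk(self, side, text):
        """Append the text of one recorded chunk as a new line."""
        if side not in ("receiver", "caller"):
            # Voices are not distinguished, so only two sides exist.
            raise ValueError("side must be 'receiver' or 'caller'")
        self.lines.append(TranscriptLine(side, self.language, text))

    def render(self):
        return "\n".join(
            f"{line.side} [{line.language}]: {line.text}" for line in self.lines
        )


transcript = Transcript(language="en-US")  # hypothetical language code
transcript.add_chunk("caller", "Good morning")
transcript.language = "nl-NL"              # switched on the fly
transcript.add_chunk("receiver", "Goedemorgen")
print(transcript.render())
```

The first line keeps the language it was recorded in, while the second line picks up the newly selected language.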