














VOICE RECOGNITION ON ASTERISK
Voice recognition and synthesis on Asterisk: IVR with voice recognition.
Task:
Create an IVR that recognizes and processes voice commands. In addition, the IVR should be able to read text aloud.
What we have:
1) Ubuntu is installed
2) Asterisk is installed
How the task was solved:
We will use Google services for voice recognition and speech synthesis. To access these services we will run two Perl scripts.
Solution description:
1) Install the packages:
a. Perl: The Perl Programming Language
#apt-get install perl
b. perl-libwww: The World-Wide Web library for Perl
#apt-get install libwww-perl
d. perl-libjson: Module for manipulating JSON-formatted data
#apt-get install libjson-perl
e. flac: Free Lossless Audio Codec
#apt-get install flac
f. IO-Socket-SSL: package used to connect to google.com over HTTPS
#apt-get install libio-socket-ssl-perl
g. sox: Sound eXchange, a package for processing audio
#apt-get install sox
h. mpg123: MPEG audio player
#apt-get install mpg123
i. format_sln: raw signed linear format module for Asterisk (usually installed together with Asterisk)
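After installing the packages it can be useful to confirm that the tools and Perl modules are actually available. The following is only a minimal sketch and not part of the original setup: the helper names missing_tools and missing_perl_modules are made up for illustration, and the module names are assumed to be the ones provided by libwww-perl, libjson-perl and libio-socket-ssl-perl.

```shell
#!/bin/sh
# Print the names of the given commands that are not found on PATH.
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Drop the leading space before printing.
    echo "${missing# }"
}

# Print the names of the given Perl modules that fail to load.
missing_perl_modules() {
    missing=""
    for module in "$@"; do
        if ! perl -M"$module" -e 1 >/dev/null 2>&1; then
            missing="$missing $module"
        fi
    done
    echo "${missing# }"
}

tools=$(missing_tools perl flac sox mpg123)
modules=$(missing_perl_modules LWP::UserAgent JSON IO::Socket::SSL)

if [ -n "$tools" ] || [ -n "$modules" ]; then
    echo "Missing tools: ${tools:-none}"
    echo "Missing Perl modules: ${modules:-none}"
else
    echo "All required tools and Perl modules are installed."
fi
```

If something is reported as missing, repeat the corresponding apt-get command from the list above.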
2) Create an API key. Our script will use this key to authorize voice recognition requests to Google. The key usually allows about 50 to 500 speech recognition requests; if you need more, use a commercial solution from Google. To create the key, do the following:
a. Subscribing to the chromium-dev group gives you the ability to generate the key that will be used in the speech recognition script. To subscribe, go to:
https://groups.google.com/a/chromium.org/forum/#!forum/chromium-dev
b. Open the Google Developers Console (https://console.developers.google.com/) and sign in with your login and password.
c. Create a new project (https://console.developers.google.com/project)
d. Open the newly created project and add “Speech API”
e. Enable the API (Enable API)
f. Create a key.
g. Copy the key. We will use it in step 3.b).
3) Create two Perl scripts in the directory /var/lib/asterisk/agi-bin:
a. The first one is called googletts.agi. It is used to synthesize the text that is typed in the extensions.conf file.
b. The second one is called speech-recog.agi. It is used for speech recognition. In this script the variable "key" must contain the key that was generated in step 2.g).
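Asterisk can only run AGI scripts that exist in the agi-bin directory and are executable. The check below is a minimal sketch under that assumption; check_agi_scripts is a hypothetical helper name, and the file names are the ones used in this step.

```shell
#!/bin/sh
# Report every script in the given directory that is missing or not executable.
# Returns 0 when all scripts are fine, 1 otherwise.
check_agi_scripts() {
    dir=$1
    shift
    rc=0
    for script in "$@"; do
        if [ ! -x "$dir/$script" ]; then
            echo "missing or not executable: $dir/$script"
            rc=1
        fi
    done
    return "$rc"
}

check_agi_scripts /var/lib/asterisk/agi-bin googletts.agi speech-recog.agi \
    || echo "Copy the scripts into place and run chmod +x on them."
```

The scripts must also be readable and executable by the user that Asterisk runs as, which this sketch does not verify.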
4) Change the dialplan in extensions.conf as follows to test how both scripts work:
Voice dialing example
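; Answer the call and prompt the caller.
exten => 1236,1,Answer()
exten => 1236,n,agi(googletts.agi,"Please say the number you want to dial.",en)
; Record the caller and run speech recognition; the result is read from ${utterance} and ${confidence}.
exten => 1236,n(record),agi(speech-recog.agi,en-US)
; Go to the recognized extension only if the confidence is above 0.8.
exten => 1236,n,GotoIf($["${confidence}" > "0.8"]?success:retry)
exten => 1236,n(success),goto(${utterance},1)
; Otherwise ask the caller to repeat and record again.
exten => 1236,n(retry),agi(googletts.agi,"Can you please repeat?",en)
exten => 1236,n,goto(record)
; Any four-digit extension: ring the SIP peer for 30 seconds, then fall back to voicemail.
exten => _XXXX,1,Progress()
exten => _XXXX,n,Dial(SIP/${EXTEN},30)
exten => _XXXX,n,Set(CHANNEL(language)=ru)
exten => _XXXX,n,VoiceMail(${EXTEN}@default,u)
exten => _XXXX,n,Hangup()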
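The branch taken after speech-recog.agi returns can be tried out outside Asterisk. The following is a minimal sketch, not the dialplan itself: route_call is a hypothetical helper, and the check that the utterance is exactly four digits is an extra safeguard (matching the _XXXX pattern) that the dialplan above does not perform.

```shell
#!/bin/sh
# Decide what to do with a recognition result.
# $1 = recognized utterance, $2 = confidence reported for it.
route_call() {
    utterance=$1
    confidence=$2
    # awk does the floating-point comparison; "c + 0" forces a numeric value,
    # so an empty or non-numeric confidence counts as 0.
    if ! awk -v c="$confidence" 'BEGIN { exit !((c + 0) > 0.8) }'; then
        echo "retry"
        return
    fi
    # Only four-digit utterances can match the _XXXX extension pattern.
    case "$utterance" in
        [0-9][0-9][0-9][0-9]) echo "dial $utterance" ;;
        *) echo "retry" ;;
    esac
}

route_call 1001 0.92    # prints "dial 1001"
route_call 1001 0.41    # prints "retry"
route_call hello 0.95   # prints "retry"
```

As in the dialplan, a confidence of exactly 0.8 is not enough, because the comparison is strictly greater than.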
The dialplan above asks the caller which extension to call. If the speech is recognized with a confidence above 0.8 (80%), the call is sent to that extension. This example recognizes English, but the script also supports other languages:
[['Afrikaans', ['af-ZA']],
['Bahasa Indonesia',['id-ID']],
['Bahasa Melayu', ['ms-MY']],
['Català', ['ca-ES']],
['Čeština', ['cs-CZ']],
['Deutsch', ['de-DE']],
['English', ['en-AU', 'Australia'],
['en-CA', 'Canada'],
['en-IN', 'India'],
['en-NZ', 'New Zealand'],
['en-ZA', 'South Africa'],
['en-GB', 'United Kingdom'],
['en-US', 'United States']],
['Español', ['es-AR', 'Argentina'],
['es-BO', 'Bolivia'],
['es-CL', 'Chile'],
['es-CO', 'Colombia'],
['es-CR', 'Costa Rica'],
['es-EC', 'Ecuador'],
['es-SV', 'El Salvador'],
['es-ES', 'España'],
['es-US', 'Estados Unidos'],
['es-GT', 'Guatemala'],
['es-HN', 'Honduras'],
['es-MX', 'México'],
['es-NI', 'Nicaragua'],
['es-PA', 'Panamá'],
['es-PY', 'Paraguay'],
['es-PE', 'Perú'],
['es-PR', 'Puerto Rico'],
['es-DO', 'República Dominicana'],
['es-UY', 'Uruguay'],
['es-VE', 'Venezuela']],
['Euskara', ['eu-ES']],
['Français', ['fr-FR']],
['Galego', ['gl-ES']],
['Hrvatski', ['hr_HR']],
['IsiZulu', ['zu-ZA']],
['Íslenska', ['is-IS']],
['Italiano', ['it-IT', 'Italia'],
['it-CH', 'Svizzera']],
['Magyar', ['hu-HU']],
['Nederlands', ['nl-NL']],
['Norsk bokmål', ['nb-NO']],
['Polski', ['pl-PL']],
['Português', ['pt-BR', 'Brasil'],
['pt-PT', 'Portugal']],
['Română', ['ro-RO']],
['Slovenčina', ['sk-SK']],
['Suomi', ['fi-FI']],
['Svenska', ['sv-SE']],
['Türkçe', ['tr-TR']],
['български', ['bg-BG']],
['Pусский', ['ru-RU']],
['Српски', ['sr-RS']],
['한국어', ['ko-KR']],
['中文', ['cmn-Hans-CN', '普通话 (中国大陆)'],
['cmn-Hans-HK', '普通话 (香港)'],
['cmn-Hant-TW', '中文 (台灣)'],
['yue-Hant-HK', '粵語 (香港)']],
['日本語', ['ja-JP']],
['Lingua latīna', ['la']]];
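Before passing a language code to speech-recog.agi in the dialplan, it can be checked against the list above. This is a minimal sketch: is_supported_lang and SUPPORTED_LANGS are hypothetical names, and the codes are copied from the list as written (including hr_HR with an underscore).

```shell
#!/bin/sh
# Language codes copied from the list above, separated by single spaces.
SUPPORTED_LANGS="af-ZA id-ID ms-MY ca-ES cs-CZ de-DE \
en-AU en-CA en-IN en-NZ en-ZA en-GB en-US \
es-AR es-BO es-CL es-CO es-CR es-EC es-SV es-ES es-US es-GT \
es-HN es-MX es-NI es-PA es-PY es-PE es-PR es-DO es-UY es-VE \
eu-ES fr-FR gl-ES hr_HR zu-ZA is-IS it-IT it-CH hu-HU nl-NL \
nb-NO pl-PL pt-BR pt-PT ro-RO sk-SK fi-FI sv-SE tr-TR bg-BG \
ru-RU sr-RS ko-KR cmn-Hans-CN cmn-Hans-HK cmn-Hant-TW yue-Hant-HK \
ja-JP la"

# Return 0 if the given code appears in SUPPORTED_LANGS, 1 otherwise.
is_supported_lang() {
    # Reject empty input and input containing spaces.
    case "$1" in
        ""|*" "*) return 1 ;;
    esac
    case " $SUPPORTED_LANGS " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

lang=ru-RU
if is_supported_lang "$lang"; then
    echo "$lang is in the list"
else
    echo "$lang is not in the list"
fi
```

The comparison is case-sensitive, so the code has to be written exactly as it appears in the list.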