NCSA Speech Translator
 

David Bock
NCSA Visualization and Virtual Environments
January, 1998



 

Overview
NCSA's Speech Translator is an interface tool for Windows 95/98/NT that provides voice recognition using IBM's ViaVoice speech processing technology.  Translator also lets a user connect to a waiting remote process and send it recognized speech commands through a standard TCP/IP socket connection across the Internet.
 
 

Description
NCSA's Speech Translator was designed with the primary goal of providing an easy-to-use interface to two basic operations.  First, the tool provides an interface to a typical voice recognition session, including such tasks as enabling and disabling the speech engine, reading a grammar file, activating and deactivating the microphone, and processing the speech.  Second, the tool allows a user to easily open a socket connection and send the recognized speech commands to a waiting process.  Translator's voice recognition relies on the speech processing technology provided by the IBM ViaVoice software product.  Users must have this software installed on their system to access Translator's speech processing functionality.

Translator was developed by NCSA's Visualization and Virtual Environments group at the University of Illinois at Urbana-Champaign.  Translator is written in C++ with MFC, using Microsoft's Visual C++ and IBM's ViaVoice SDK, for Windows 95/98/NT platforms.
 

 
Operation
The Translator tool interface is shown below followed by a detailed description of its operation.


 
 
Preparing the grammar file
Speech command grammar is written in a readable text format called Backus-Naur Form (BNF).  This grammar file can be created in any standard text editor.  The ViaVoice speech engine, however, processes grammar in a binary representation called a finite state grammar (FSG) file.  To prepare your grammar file for the speech engine, you must convert the readable BNF file into the corresponding FSG file.  This can be accomplished with the grammar compiler supplied with the ViaVoice SDK.  Once the grammar file is in FSG format, it is ready to be read by Translator.
 

Enabling the speech engine
Before speech can be recognized, the ViaVoice speech engine must be enabled.  To start the ViaVoice speech engine, either select "Connect" under the "Controls" menu item or press the button with the keys icon (shown above).  This starts the speech engine, and a message reporting the result of the operation appears in the "Output Information" window.
 

Reading the grammar file
Once the speech engine has been started, Translator is ready to accept an FSG grammar file.  Use the corresponding icon button or select "Open" under the "File" menu item to read an FSG grammar file.  A message reporting the result of the operation appears in the "Output Information" window.
 

Connecting to a remote host
To send your processed speech commands over a socket connection to a second process listening on a remote host, use the corresponding icon button or select "Remote server..." under the "Controls" menu item to connect and establish communication.  This action presents a dialog box for entering (1) the IP address or host name of the remote machine and (2) the socket port number on which to establish communication.  A message reporting the result of the operation appears in the "Output Information" window.
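
The remote side of this connection can be any process that listens on the chosen port.  The following is a minimal sketch of such a listener in Python, not part of Translator itself.  This page does not document the format in which Translator sends commands, so the sketch assumes each recognized command arrives as a line of text terminated by a newline; the function names and the port number 5000 are hypothetical and chosen only for illustration.

```python
import socket


def receive_commands(conn, on_command, bufsize=1024):
    """Read text commands from a connected socket until the peer closes.

    ASSUMPTION: commands are newline-terminated text. The real wire
    format used by Translator is not documented on this page.
    """
    pending = b""
    while True:
        chunk = conn.recv(bufsize)
        if not chunk:  # peer closed the connection
            break
        pending += chunk
        # A command may be split across several recv() calls, so only
        # complete lines are handed to the callback.
        while b"\n" in pending:
            line, pending = pending.split(b"\n", 1)
            text = line.decode("ascii", errors="replace").strip()
            if text:
                on_command(text)
    # Deliver a final command that was not newline-terminated.
    text = pending.decode("ascii", errors="replace").strip()
    if text:
        on_command(text)


def serve_once(port, on_command, host=""):
    """Wait for one incoming connection on `port` and process its commands."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        server.bind((host, port))
        server.listen(1)
        conn, _addr = server.accept()
        with conn:
            receive_commands(conn, on_command)


# Example (hypothetical port): print every command received.
#   serve_once(5000, print)
```

With such a listener running on the remote machine, you would enter that machine's host name and the same port number in Translator's "Remote server..." dialog box.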
 

Activating the microphone
Select the corresponding icon button or "Start recognition" under the "Controls" menu item to activate the microphone, then begin speaking.  A message reporting the result of the operation appears in the "Output Information" window.
 

Deactivating the microphone
Select the corresponding icon button or "Stop recognition" under the "Controls" menu item to deactivate the microphone.  A message reporting the result of the operation appears in the "Output Information" window.
 

Disabling the speech engine
Upon completion of a session, disable the speech engine by using the corresponding icon button or by selecting "Disconnect" under the "Controls" menu item.  A message reporting the result of the operation appears in the "Output Information" window.
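
The order of the steps above can be summarized with a small toy model in Python.  This is only an illustration of the sequence described in this section, not Translator's actual code: the class and method names are hypothetical, and the rule that recognition requires a loaded grammar file is an assumption drawn from the order in which the steps are presented.

```python
class TranslatorSession:
    """Toy model of a Translator session; names are hypothetical."""

    def __init__(self):
        self.engine_on = False
        self.grammar_loaded = False
        self.mic_on = False

    def connect_engine(self):
        # Controls > Connect
        self.engine_on = True
        return "speech engine started"

    def open_grammar(self, fsg_path):
        # File > Open: the engine must already be running.
        if not self.engine_on:
            raise RuntimeError("start the speech engine before reading a grammar file")
        self.grammar_loaded = True
        return "grammar file read: " + fsg_path

    def start_recognition(self):
        # Controls > Start recognition
        if not self.engine_on:
            raise RuntimeError("start the speech engine before activating the microphone")
        if not self.grammar_loaded:  # ASSUMPTION: a grammar is required first
            raise RuntimeError("read a grammar file before activating the microphone")
        self.mic_on = True
        return "microphone activated"

    def stop_recognition(self):
        # Controls > Stop recognition
        self.mic_on = False
        return "microphone deactivated"

    def disconnect_engine(self):
        # Controls > Disconnect: ends the session.
        self.mic_on = False
        self.grammar_loaded = False
        self.engine_on = False
        return "speech engine stopped"
```

Connecting to a remote host is left out of the model because this page does not state where in the sequence it must occur.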

 

Download
Download the Windows 95/98/NT Translator executable only.
Unzip this file to get the binary executable.
 

Download the source code for the Windows 95/98/NT Translator tool.
Unzipping this archive will create a folder "NCSASpeech" containing the Microsoft Visual C++ project complete with all source code.
 
 

Installation
Installation instructions for both the executable and the source code are given below.

IBM ViaVoice software
NCSA's Speech Translator tool requires IBM's ViaVoice software product.  This software must be installed on your system for proper operation of the Translator tool.
 

Setting your system path
The Translator executable must have access to the ViaVoice libraries installed with the software product.  To accomplish this, add the ViaVoice "lib" directory to your system path.  For example, if you've installed ViaVoice on your C: drive, simply add "C:\ViaVoice\lib" to your system path.
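
One way to confirm that the path has been set is a small check like the following Python sketch.  It is not part of the Translator distribution; the function name is hypothetical, and "C:\ViaVoice\lib" is simply the example location used above.  The comparison ignores letter case and trailing slashes, as Windows does for directory names.

```python
import os


def dir_on_path(directory, path_value=None, sep=os.pathsep):
    """Return True if `directory` is one of the entries in a PATH string.

    If `path_value` is not given, the PATH of the current process is used.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")

    def norm(entry):
        # Drop surrounding spaces and quotes and any trailing slash,
        # and compare without regard to case.
        return entry.strip().strip('"').rstrip("\\/").lower()

    target = norm(directory)
    return any(norm(entry) == target
               for entry in path_value.split(sep) if entry.strip())


# Example (run on the Windows machine where ViaVoice is installed):
#   print(dir_on_path(r"C:\ViaVoice\lib"))
```

If the check reports False after you have edited the path, open a new command window or restart so that the updated system path takes effect.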

 
Compiling the source code
To compile the source code for NCSA's Speech Translator tool, you'll need to download and install IBM's ViaVoice SDK.
 
 

Copyright
 
 
 

