A developer working in the speech technology arena has a formidable body of knowledge to master. We'll divide it into two important categories: the client side and the server side.

The client side is the one most visible in magazine ads and consumer software stores. It consists of what is known as continuous speech dictation software. The server side is what has historically been called telephony.

The Client Side

While the dictation aspect currently attracts the most attention, the modern developer should be aware that the "command and control" aspect of these packages is where the startling new user productivity benefits are found. "Command and control" is the ability to move around the Windows environment using only voice commands, with no mouse clicks or keystrokes. It is implemented to some degree in all of the major dictation products.

The startling benefits come from using the "macro" facility, also incorporated in these products, to extend both the dictation and the command/control aspects in significant ways. For the developer, the macro facility is the primary avenue for delivering new utility to the user.

The macro facility allows the developer to encapsulate an arbitrary collection of mouse clicks and keystrokes in one audio phrase. The phrase can have any number of syllables, although for human factors reasons it should be kept short.

Take e-mail preparation as an example. Suppose the developer observes that a user frequently sends e-mail to a certain group of people. The developer could attach to the audio phrase "e-mail the closing report to the East Coast reps" all of the following: the Windows menu bar operations that retrieve the report, the mouse clicks that start the Internet connection, and the arrow key movements and menu closing keystrokes needed to compose and send the message. Once the developer has attached these sequences to the macro, saying the phrase in the specified application context gets the whole operation done.
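To make the idea concrete, here is a minimal sketch in Python of how a macro might be represented: a spoken phrase, the application context in which it applies, and the recorded sequence of input events to replay. It is not the macro language of any actual dictation product. Every name in it (InputEvent, Macro, MacroTable, the "mail client" context, the individual event descriptions) is hypothetical, and the send callback merely stands in for whatever mechanism a real package uses to inject mouse clicks and keystrokes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class InputEvent:
    """One recorded user-interface action (hypothetical representation)."""
    kind: str    # "menu", "click" or "key"
    detail: str  # free-text description of the action


@dataclass
class Macro:
    """A spoken phrase bound to a sequence of events in one application context."""
    phrase: str
    context: str
    events: List[InputEvent]


def normalize(phrase: str) -> str:
    """Lower-case the phrase and collapse whitespace so lookups are forgiving."""
    return " ".join(phrase.lower().split())


class MacroTable:
    """Maps (context, phrase) pairs to macros and replays them on request."""

    def __init__(self) -> None:
        self._macros: Dict[Tuple[str, str], Macro] = {}

    def attach(self, macro: Macro) -> None:
        self._macros[(macro.context, normalize(macro.phrase))] = macro

    def say(self, context: str, phrase: str,
            send: Callable[[InputEvent], None]) -> bool:
        """Replay the macro for this phrase in this context, if one exists."""
        macro = self._macros.get((context, normalize(phrase)))
        if macro is None:
            return False  # phrase is not defined in this application context
        for event in macro.events:
            send(event)
        return True


# The e-mail example from the text, with made-up event details.
table = MacroTable()
table.attach(Macro(
    phrase="e-mail the closing report to the East Coast reps",
    context="mail client",
    events=[
        InputEvent("menu", "File > Open > closing report"),
        InputEvent("click", "Connect"),
        InputEvent("key", "Down"),
        InputEvent("key", "Enter"),
    ],
))

sent: List[InputEvent] = []  # stand-in for the real input-injection layer
handled = table.say("mail client",
                    "E-mail the closing report to the East Coast reps",
                    sent.append)
```

Keying the table on both the context and the phrase reflects the point made above: the same words do nothing outside the application context for which the macro was defined.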

The macro facility can capture the sequences of graphical user interface state changes which we have all become used to, but which are nevertheless complicated, onerous and time consuming.

