Developer View (continued)

Developer SDKs

The market for speech SDKs has become quite rich. IBM has a developer toolkit that is quite elaborate, allows much more precise control of the voice environment than is obvious in their dictation product, and includes Linux offerings. Chant Inc. makes a Smalltalk toolkit that fits around the C-oriented IBM API, giving the Smalltalk community access to speech technology. Philips also offers SDKs, and ScanSoft offers recognition and text-to-speech SDKs for handheld devices.

All of these developer toolkits require serious design work on the details of the interaction among the user, the application and the speech engine. Allow time for experimentation with these kits before committing to your first project estimate. Interpreting speech and responding correctly to the various utterance sequences that ordinary people use can be daunting.
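To give a feel for the kind of design decision involved, here is a minimal Python sketch of one dialogue turn. It is not taken from any of the toolkits named above: the RecognitionResult type, the two thresholds and the action names are all hypothetical, and it assumes only that the engine reports the recognized text together with a confidence score between 0.0 and 1.0.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecognitionResult:
    """Hypothetical engine output: recognized text plus a 0.0-1.0 confidence."""
    text: Optional[str]          # None models silence (no input)
    confidence: float = 0.0


# Illustrative thresholds only; usable values come from experimentation.
ACCEPT_AT = 0.80
CONFIRM_AT = 0.45


def next_action(result: RecognitionResult, retries: int, max_retries: int = 2) -> str:
    """Decide what the application should do after one user utterance."""
    # Silence or a very weak match: ask again, but give up after a few tries.
    if result.text is None or result.confidence < CONFIRM_AT:
        if retries >= max_retries:
            return "transfer_to_operator"
        return "reprompt"
    # A plausible but uncertain match: read it back and ask for yes/no.
    if result.confidence < ACCEPT_AT:
        return "confirm"
    # A strong match: act on it directly.
    return "accept"
```

Even this small sketch leaves open questions that take experimentation to settle, such as where the thresholds belong and how many reprompts a caller will tolerate.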

Server-side applications make use of SDKs as well. As an example, Nuance offers Foundation SpeechObjects(TM), a core set of free, open-source, standards-based components for building speech applications: components that handle off-hook, on-hook, DTMF recognition, message playing, etc. The Nuance SDK partially overlaps the UNISYS facilities for dialogue control and for extracting quantities such as dollar amounts and dates from the conversation. It does not provide quite as elaborate an interpretation of prior context as the full UNISYS product does.

The UNISYS product, with its elaborate semantic dictionary, can equate "I don't want to pay anything" with "Zero dollars" without the extra programming that would otherwise be needed when using the Nuance SDK alone. Nuance provides pre-written grammars for brokerage, airline, dollar amounts and dates.
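The extra programming mentioned above might look roughly like the following Python sketch, which maps already-recognized text to a dollar amount. It uses neither the UNISYS nor the Nuance API; the phrase list, the function name dollar_amount and the digits-only number handling are assumptions made purely for illustration.

```python
import re
from decimal import Decimal
from typing import Optional

# Hypothetical phrase list; a real semantic dictionary would cover far more wordings.
ZERO_PHRASES = (
    "zero dollars",
    "i don't want to pay anything",
    "not paying anything",
)

# Matches amounts given in digits, such as "25" or "12.50".
# Spelled-out numbers ("twenty dollars") are deliberately not handled here.
AMOUNT_RE = re.compile(r"\d+(?:\.\d{1,2})?")


def dollar_amount(utterance: str) -> Optional[Decimal]:
    """Return the dollar amount the caller meant, or None if none was found."""
    text = utterance.strip().lower()
    # Hand-written equivalences: several wordings all mean an amount of zero.
    if any(phrase in text for phrase in ZERO_PHRASES):
        return Decimal("0")
    match = AMOUNT_RE.search(text)
    if match:
        return Decimal(match.group(0))
    return None
```

Every new way of saying "nothing" has to be added to the list by hand, which is the effort a semantic dictionary is meant to save.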


Page Last Updated: 01/02/02