The Client Side contd...
Three different muscle groups (eyes, forearm, fingers) would
normally be brought into play to perform the earlier e-mail
example. So what, you say, people move muscles all day and nobody
writes Congress about that. However, each change of muscle group
takes time, approximately one tenth of a second, and requires a shift
of attention. The more a voice macro reduces the number of muscle
group changes, the better concentration stays focused, the higher
productivity goes, and the more natural and consistent the work
context seems.
This e-mail task is not a change of state in the larger context
of the user's intent to get work done. But the user has to lead
the desktop through several of the computer's own state changes
before the user can get the job done. We have all blithely
accepted these roamings around the Windows pane as the price of
being computer literate. Mouse clicks are good for you, say the
GUI hypesters. The new macro facilities of speech recognition change
all of this.
In working with these packages, it is clear that regardless of
options and features, the developer still needs to think carefully
about the user interface, sometimes called the LUI or language
user interface. The major vendors' products differ considerably in
ease of use for the same task that might be "macro-ized",
notwithstanding the improvement given by voice.
One way to measure this difference in ease is to count
the muscle group events each package needs for a common set
of operations or macros. An operation with one eye movement, a
forearm movement, a finger movement and two voice syllables would
count as five events. What we found was that although these
packages all have the ease intrinsic to voice control, some have
more ease than others. The lesson for the developer is to always
watch for opportunities to refine the muscle event counts (or
some similar kind of measurement of your choosing) for the application
in question, and to carefully compare each product to find the
most productive one for that specific application. All of this
is different from testing the "accuracy" of the product's
recognition capability.
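
The counting strategy described above can be sketched in a few lines of Python. This is only an illustration: the event names, the package names, and the per-package tallies are hypothetical placeholders rather than measurements of any real product, and the only figure taken from the text is the five-event worked example.

```python
# Event kinds named in the text. Voice is counted per syllable.
EVENT_KINDS = ("eye", "forearm", "finger", "voice_syllable")


def count_events(operation):
    """Return the total muscle group events for one operation.

    `operation` is a list of (kind, count) pairs, for example
    [("eye", 1), ("voice_syllable", 2)].
    """
    total = 0
    for kind, count in operation:
        if kind not in EVENT_KINDS:
            raise ValueError(f"unknown event kind: {kind}")
        total += count
    return total


def rank_packages(packages):
    """Sort packages by event count, fewest events first."""
    scored = [(name, count_events(op)) for name, op in packages.items()]
    return sorted(scored, key=lambda pair: pair[1])


# The worked example from the text: one eye movement, one forearm
# movement, one finger movement and two voice syllables.
example = [("eye", 1), ("forearm", 1), ("finger", 1), ("voice_syllable", 2)]

# Hypothetical tallies for the same macro in two made-up packages.
packages = {
    "package_a": [("voice_syllable", 3)],
    "package_b": [("eye", 1), ("finger", 1), ("voice_syllable", 2)],
}

print(count_events(example))    # 5
print(rank_packages(packages))  # package_a (3 events) ranks ahead of package_b (4)
```

The count says nothing about recognition accuracy. It only gives a rough, repeatable number for comparing how much work each package asks of the user for the same task.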