The skill of dictation
The following tips will make voice recognition software easier and more effective to use:
- Enunciate carefully and speak slowly enough that each word gets its due (although you don't have to go too slowly). Remember, you are controlling a machine, not talking to a person.
- While speaking, envision the text you are seeking to produce. This will help you give equal heed to each word (so the computer can too), keep a steady rhythm and suppress "dysfluencies" like, ah, y'know.
- Watch the results on the screen as you go along. This may slow you down but will enhance your accuracy. To paraphrase Wyatt Earp: It's good to be fast, but it's better to be accurate.
- Even a momentary loss of focus can lead to misrecognition, especially of one-syllable words. But if you can maintain focus, the results can be far more accurate than typing.
- A big issue for novices is that they have learned to "think with their fingers," so suddenly removing the keyboard is a major impediment to composition. I have found it best to just speak the text as it comes to you, without stopping for mistakes. You can edit it later.
- Finally, there is the environment. Background silence is best, but droning ventilators hurt recognition more than office chatter. As for privacy, if you don't mind being overheard on the phone, then you won't mind being overheard while dictating -- otherwise, find an office. You can use about the same volume for the phone and for speech recognition.
But with version 12, these factors have faded into the background (although they haven't entirely disappeared). For example, you can dictate effectively at about half the speed of an auctioneer -- should you prove able to do so. Assuming that you stay focused while dictating, the error rate is now trivial (see sidebar).
An important part of that new reliability is the noise-canceling headset microphone supplied with the software, which does not react to background noise. It made things a lot easier for me -- with my previous microphones, I had to turn them off every time I stopped speaking to keep them from picking up other sounds. The Home and Premium versions come with a two-speaker analog headset, while the Professional and Legal versions come with a one-speaker USB headset.
Version 12 is outwardly not very different from previous versions, with the same interface and basic command scheme. The vendor claims that accuracy out-of-the-box is 20% better than that of version 11, and in my testing, that did seem to be the case. New features include an interactive tutorial, Bluetooth support, and enhanced support for Gmail and Hotmail.
Dragon installs from a CD; during the installation, it asks a number of questions about your age, gender and accent. (It also tests the microphone, and in my case was not happy until I had tried several ports.) It then listens to your voice during a short training session, which takes about five minutes. (With early versions, the training easily took 45 minutes.) You have the option to let it examine your document folders and outgoing email folders to look for commonly used words.
When invoked, Dragon puts a thin control bar across the top of the screen. You click an icon in this control bar to turn on the microphone. When you start to talk, text appears at the cursor. If you talk quickly, the text may fall as much as a sentence behind, but I found it invariably caught up fairly quickly. Punctuation marks must be pronounced.
If word X is misrecognized, you can adjust the software by saying "Correct X." Word X will then be selected and Dragon will present a list of possible corrections. If none of them matches, you can spell the desired word. Thereafter, Dragon is more likely to recognize the word correctly. (With version 12, I found that one correction was always enough.)
On the other hand, if you simply decide you want to change word X, you say "Select X." Dragon assumes you are making an editorial change (rather than fixing a mistake) and will not alter its later recognition based on your change. You can also select arbitrary phrases, whole sentences or paragraphs in order to delete, move, or reformat them by saying things like "Select next three words," "Select previous paragraph," or "Select current line."
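For readers who think in code, the difference between the two commands can be sketched as a toy model. This is only an illustration of the behavior described above, not Dragon's actual internals or API: the class `ToyDictation`, its methods and the `learned` table are all hypothetical names invented for this example.

```python
from collections import defaultdict


class ToyDictation:
    """Toy model only; it does not reflect how Dragon works internally."""

    def __init__(self, words):
        self.words = list(words)
        # Hypothetical adaptation table: misheard word -> counts of intended words.
        self.learned = defaultdict(lambda: defaultdict(int))

    def _replace(self, old, new):
        index = self.words.index(old)  # raises ValueError if the word is absent
        self.words[index] = new

    def correct(self, heard, intended):
        """'Correct X': fix the text and remember the fix for later recognition."""
        self._replace(heard, intended)
        self.learned[heard][intended] += 1

    def select_and_replace(self, old, new):
        """'Select X': an editorial change to the text; nothing is learned."""
        self._replace(old, new)

    def recognize(self, heard):
        """Return the most frequently learned replacement, or the word unchanged."""
        candidates = self.learned.get(heard)
        if not candidates:
            return heard
        return max(candidates, key=candidates.get)


doc = ToyDictation("I red the book on a grey day".split())
doc.correct("red", "read")              # a recognition mistake: the model adapts
doc.select_and_replace("grey", "gray")  # an editorial choice: the model does not
```

In this sketch both commands change the text on the screen, but only the correction affects what the model does the next time it hears the same word.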