Two great uses for SR

I have found two great uses of speech recognition for telecommuters. But, before I tell you what they are, I’d like to give you some background on my experience with speech recognition software.

My SR Learning Experience

My typing speed varies between 80 and 150 wpm, depending on how much sleep I’ve had the previous night and my caffeine intake at that point in the day. The more caffeine, the slower my net speed, thanks to more frequent use of the backspace key, the most used key on my keyboard. As you can imagine, this speed was taking a toll on my wrists. Since my job required me to write a great deal, I found I was starting to experience some repetitive motion pain. Before the pain turned to injury, I decided to give speech recognition (SR) software a try. I had already tried an ergo keyboard, which helped, but being the geek I am, I wanted to try out something I felt could really fix my problem. I figured, why type at all when the computer (theoretically) could do it for me?

Well, I was not very successful with the early versions of SR software I tried. The error rate was just too high, at about one error every five words. I reduced this a bit with extensive training of both me and the software (slow down, pronounce words more clearly), but the time required to go back and correct the errors still made for a net loss in productivity.
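To put rough numbers on that net loss, here is a small sketch in Python. The one-error-in-five-words rate comes from my experience above; the 120 wpm dictation speed and the 10 seconds per correction are assumptions picked purely for illustration, and net_wpm is a hypothetical helper, not part of any SR product.

```python
def net_wpm(raw_wpm, words_per_error, seconds_per_fix):
    """Effective words per minute once correction time is counted.

    raw_wpm         -- dictation speed before corrections (assumed value)
    words_per_error -- average number of words between recognition errors
    seconds_per_fix -- average time to spot and correct one error (assumed value)
    """
    errors_per_minute = raw_wpm / words_per_error
    fix_minutes = errors_per_minute * seconds_per_fix / 60.0
    # One minute of dictation produces raw_wpm words, but costs
    # 1 + fix_minutes of real time once the corrections are made.
    return raw_wpm / (1.0 + fix_minutes)


if __name__ == "__main__":
    # One error every five words, as with the early software.
    print(net_wpm(120, 5, 10))
    # One error every 100 words, for comparison.
    print(net_wpm(120, 100, 10))
```

Under those assumed numbers, one error every five words drags a 120 wpm dictation rate down to about 24 wpm, while one error every 100 words leaves it at roughly 100 wpm. Your own figures will differ, but the shape of the result is why accuracy matters so much more than raw speaking speed.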

As time went on, the quality of the software improved, and I was eventually able to use a version of NaturallySpeaking with good results. At this point I was getting about one error per sentence. My biggest problem was disciplining myself to proofread what I ‘wrote’, as I had a tendency to ramble and then immediately hit ‘send’ without reviewing my work. If you have used speech recognition software before, then you know what I mean when I say the results were often hilarious and sometimes embarrassing. SR software at that time was very good at understanding phonetics, just not the context. The words it selected were correct phonetically; they just didn’t belong in the sentence. Even the best software available today still makes this mistake. However, the tools get better with each generation, and the latest from Nuance (NaturallySpeaking), for example, is scary-good. NS will analyze the context of your sentences as they are formed and, as you talk, correct badly guessed words spoken earlier in the sentence. This is a most amazing thing to watch.

The software is one important component of a successful SR experience, but you cannot neglect the hardware end of things. After a lot of experimenting, I discovered the biggest improvement to my recognition accuracy came when I purchased a USB (digital) headset with a noise-canceling microphone. Accuracy improved to about one error in an entire paragraph! So, if you want to try out SR technology, get the best software package you can afford, and purchase a good digital headset (I use a Plantronics USB 500).


Best uses of SR for telecommuters

From the very early days of experimenting with SR I had visions of being able to use it for everything I wrote. Well, I have to tell you, I’m disappointed. Although I thought I’d be able to learn to compose email and write reports with it, I have found that my single-tasking brain does not allow this to work very well. Basically, when I talk to myself, I disturb my own thoughts. I just can’t have someone (even me) talking to me while I try to compose my thoughts. Had I learned to utilize a secretary who took dictation, as was common in the 50s, SR would likely have been an easy transition for me. But, alas, most of my career has involved writing my own memos on a keyboard where I can think while typing. 

Nevertheless, I have found that SR works best for me for two classes of tasks: transcribing notes and instant messaging. I often take handwritten notes in meetings, as I find this faster and more intuitive when ideas are flowing quickly. Using SR, I am able to convert these handwritten notes to text at about 200 words per minute with very few errors. This alone is worth the admission fee. But the really amazing use of SR is with instant messaging. I can sit and chat for long stretches on IM without touching the keyboard. I just say ‘new line’ at the end of dictating a response to send it. Chatting over IM can be a very slow process, really, when you compare it with the phone or a face-to-face conversation. I can get impatient waiting for a short reply from a slow typist. Adding SR to instant messaging greatly increases the speed of the conversation, and the messages can be less abbreviated and thus easier to read to boot. You don’t even have to worry as much about spelling, as mistakes are often forgiven in IM chats anyhow. SR and IM…very cool!

As a final note, there is one technical issue that can make SR use frustrating. It takes too long to load the code and voice templates into memory, both when starting up the first time and when resuming after a period of non-use. This time penalty tends to keep me from using SR for ad hoc tasks, as I find it hard to wait each time I want to use it. This problem can be somewhat mitigated by loading up your PC with huge amounts of RAM (4GB plus) so that the program components are not swapped to disk as often when you load up Word or other memory-hungry applications.