My students and DictationBridge

I’m Erin, a member of the DictationBridge team working on documentation and social media for the project. When I first heard about DictationBridge, I was eager to get involved for many reasons, but some of the most compelling were the stories of my students during the years I worked as a full time assistive technology trainer.

If I had a dollar for every time I heard, “But Erin, can’t I just talk to the computer?” I could make a significant contribution to the DictationBridge funding campaign. Sometimes this question came during the frustration of learning to touch type, but often it came from a place of much more profound need. Elders in their eighties and nineties had stories they needed to pass down to their grandchildren and pushed themselves to master typing skills despite failing health, diabetic neuropathy, and hand tremors. Young people with dyslexia and other learning disabilities, some of whom had passed through the K-12 education system without gaining basic literacy skills, excelled at performing most tasks on the computer, but slumped in frustration when it came time to write documents and emails. Individuals who had experienced strokes and other brain injuries faced the complex combination of cognitive and motor skills required to both remember commands and execute them on the keyboard. Busy and fatigued graduate students looked for an opportunity to give their hands a rest while composing long papers.

Assistive technology funding varies vastly throughout the world, and I can’t possibly speak to what support individuals receive in other regions. Only some of the individuals I described above received state-funded technology support, and none of them received funding for dictation solutions. Often they used PCs handed down from friends or family members, and the NVDA screen reader allowed access to their computers without financial burden. When it came to dictation software, however, I didn’t have a recommendation that was financially attainable. I know there are many blind computer users besides my former students without the right tools to tell their own stories, and I hope you will join me in supporting DictationBridge. The next time someone asks me if they can talk to their computer, I want to be able to say, “Yes!”

Why I Am Building DictationBridge

My name is Pranav Lal. I live in India, and this is why I’m involved in the DictationBridge project, one I hope you will contribute to by opening this link.

I have been using speech recognition for over a decade. I began using it seriously when I had a bout of repetitive stress injury (RSI). I remember the days when I had to speak slowly and very carefully to maximize accuracy. Most of my experience has been with Dragon Individual, earlier known as Dragon NaturallySpeaking, largely because of my accent. Over the years, I have tried a variety of add-ons to enhance my speech recognition experience. Once I had more or less recovered from my RSI, I began to appreciate its productivity benefits.

I recently changed to a situation where NVDA became my primary screen reader. Before I switched, I had been using it off and on, waiting for the features I needed to arrive. When the ability to read Excel charts and to handle track changes and other annotations came, I knew it was time to switch. The ability to hear the text processed by Dragon was the only feature still missing. DictationBridge’s primary function is to echo back the recognized text as a user dictates. This gives immediate feedback and helps me catch errors and confirm that what I have dictated is correct. In the NVDA spirit, DictationBridge is free and open source and, therefore, will be extensible by individuals with programming skills and will come at no cost to end users.

You can do what you want with it, so there are no barriers to the creation of novel applications. The other thing is that DictationBridge will bring dictation to people who traditionally would not have had access to this feature. Unfortunately, a large chunk of the world’s population falls into this category. Think of a blind quadriplegic or a blind child in rural India. NVDA is being distributed widely thanks to several non-governmental organizations and suitable government schemes. It is also a great tool to promote literacy. In India, according to some statistics, there are 15 million blind people, of whom 2 million are children. One out of three blind people in the world lives in India. Think of them using speech recognition with echo back. The ability to read and write will no longer be a barrier, and we can begin to bring this population into the mainstream.

Please contribute to DictationBridge.

Inside DictationBridge

For many years, I’ve been the lead programmer for Serotek, makers of the System Access screen reader among other products. Now I’m the lead programmer for the DictationBridge project. So far, I’ve kept a low public profile. But there has been some confusion about Serotek’s role in this project, the licensing status of the code, and the limitations of the software, particularly when used in conjunction with JAWS. So I’d like to clear these things up.

The roots of this project go back nearly 9 years. In the summer of 2007, we at Serotek wanted to enhance our System Access screen reader to take full advantage of the features of the then-new Windows Vista operating system. We were particularly excited about the speech recognition capabilities that were built into Windows for the first time with the release of Vista. We realized that speech recognition would be particularly useful for people with repetitive strain injury (RSI). Here was an opportunity to provide an alternative to expensive dictation products, using a feature that was built into the operating system itself. How could we resist that?

The trouble is that if speech recognition, particularly dictation, is going to be useful to a blind user, the dictated text needs to be echoed back, so the user can verify that the computer interpreted the input correctly. But the programmatic accessibility interfaces for Windows, such as Microsoft Active Accessibility (MSAA) and UI Automation (UIA), don’t provide a standard way for assistive technologies to detect when text is entered through an alternative input method such as speech recognition; thus it’s invisible to a screen reader via these standard techniques.

So, once the project got the green light, I rolled up my sleeves and looked for a way to detect the text coming from the Windows speech recognition subsystem. I found a solution using a technique called API hooking. API hooking basically means that the assistive technology product intercepts any attempt to invoke certain functions in the operating system, and does something with the information that it can gain from those intercepted calls. In this case, I figured out which OS functions were being used by the speech recognition subsystem to insert dictated text into an application, and wrote code to intercept calls to those functions, so System Access could get the text and read it back.
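The interception idea can be sketched in a few lines. This is only an illustration of the technique, not Serotek’s code: the real implementation patches native Windows functions at a low level, whereas here a plain Python function stands in for the OS entry point, and the names (`insert_text`, `echoed`) are hypothetical.

```python
# Minimal sketch of function interception for echo-back.
# A stand-in "OS function" that the speech subsystem calls to
# insert dictated text into an application:
def insert_text(text):
    print(f"[app] received: {text}")

# The "screen reader" keeps a reference to the original function,
# then replaces the entry point with a hook of its own.
_original_insert_text = insert_text
echoed = []  # text captured for echo-back

def hooked_insert_text(text):
    """Capture the dictated text, then forward the call so the
    application still receives its input unchanged."""
    echoed.append(text)
    _original_insert_text(text)

insert_text = hooked_insert_text  # install the hook

# The speech subsystem "dictates" two fragments:
insert_text("Hello, ")
insert_text("world.")
print("screen reader echoes:", "".join(echoed))
```

The real work, of course, is in finding which OS functions to patch and intercepting them safely in native code; the shape of the solution, capture then forward, is the same.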

API hooking is a tricky and error-prone technique, and very few software developers have experience with it, but it’s indispensable for assistive technology on Windows. To the best of my knowledge, there is no other way that we could have implemented robust support for Windows’s built-in speech recognition, even today. API hooking is also essential in implementing support for the Dragon line of products from Nuance.

Because I had over two years of experience with API hooking when I started working on speech recognition support in 2007, I was able to get the code written in a reasonable amount of time. That code has been working reliably ever since, on every version of Windows from Vista through Windows 10. By paying Serotek for this code, the DictationBridge project will be able to leverage a battle-tested solution to a difficult problem. Without paying Serotek, the DictationBridge project would need to find someone who could re-implement something equivalent from scratch, a cost far greater than the fee we’re paying for the license. As I said, very few developers have experience using API hooking in the context of assistive technology, and I’m one of them. But because of my non-compete agreement with Serotek, I would not be able to work on this project, legally or ethically, unless Serotek is paid for the work I’ve already done. Serotek paid me for the time I took to write and maintain the original code; it’s only fair that they are compensated for allowing their software to be released for free and as open source.

It’s also important to emphasize that most of the funds we’re raising for DictationBridge will go toward new work, because there’s still a lot of work that needs to be done to make DictationBridge a success. Specifically:

  • The code that we’re licensing from Serotek currently depends on a proprietary, third-party component that Serotek purchased. We need to replace that component with an open-source equivalent.
  • While support for Windows Speech Recognition is mature and robust, support for the Dragon family of products from Nuance is currently at the prototype stage. A significant portion of my time on this project will be spent fleshing out and debugging that code, including more low-level API hooking.
  • We need to write scripts for NVDA, JAWS, Window-Eyes, ZoomText Fusion and Dragon Professional to make the Dragon and Windows Speech Recognition user interfaces fully usable with each of these screen readers. We will be paying other developers to help write these scripts.
  • We need to write end-user documentation, including instructions specific to each supported screen reader. We’ll also need to produce an audio demo of the finished product and we need to pay the people working on these tasks.
  • To ensure that users can get a useful level of technical support when they need it, we’ll need to hire and train the support staff, a process already underway.
  • Finally, there are incidental expenses, such as paying for Dragon licenses and administrative work.

So there’s a lot more to this project than the existing code I wrote for Serotek.

Even though this code is currently proprietary to Serotek, it will ultimately be included in the DictationBridge open-source release. Specifically, all DictationBridge code will be released under the Mozilla Public License, version 2. The MPL is compatible with the GNU General Public License (GPL), the license used by NVDA. But in contrast with the GPL, code released under the MPL can be mixed with proprietary code in the same program. This makes the MPL a better fit for an add-on package like DictationBridge, which will support commercial assistive technologies such as JAWS, Window-Eyes, and ZoomText Fusion, as well as the free NVDA.

Finally, some have asked about the features and limitations of DictationBridge when used in conjunction with JAWS. The main difficulty in working with JAWS is that, due to limitations of the JAWS scripting facilities, it is not possible to add new system-wide scripts without modifying the default scripts that ship with JAWS. If we modify those default scripts, then we have to re-do those modifications for every new version of JAWS. Unless the DictationBridge project receives ongoing funding to support JAWS, this kind of ongoing work is not feasible.

Therefore, DictationBridge will work with JAWS to the extent that we can make it work without modifying the default scripts. Specifically, dictated text will be echoed back, for both Windows Speech Recognition and Dragon NaturallySpeaking. The Dragon user interface for making corrections to the dictated text will be accessible. But access to the equivalent UI for Windows Speech Recognition, if we can make it work at all, will be more limited with JAWS than with the other screen readers. Of course, since all of DictationBridge will be open source, other developers are always free to implement more extensive integration with JAWS if they wish. We would be especially happy if Freedom Scientific itself chose to integrate DictationBridge into JAWS; there would be nothing to prevent that from happening if Freedom Scientific were so inclined.

In closing, I believe that a crowdfunded project like DictationBridge is the best way to develop assistive technology in the open and make it freely available to everyone who can benefit from it. But to make it work, we need the support of the whole community. If everyone contributes, it only takes a little from each person. So let’s all pitch in and make this dream a reality.

You can make a contribution on our Indie Go-Go crowdfunding page and I urge you to do so.

In Their Own Words: Getting more done using speech-recognition

The following post is by Sue Martin, one of the contributors to the DictationBridge project.

Sue W. Martin has been around the world of assistive technology for over twenty years. She started as an assistive technology instructor in Maine in the mid 90s. From there she moved to a position as subject matter expert with the United States Department of Veterans Affairs, VA. She currently works for the VA Office of Information & Technology as a management analyst.
Martin lives with her husband and assorted cats and dogs in the foothills of the Cumberland Plateau.

“When’s Easter this year?” I picked up my phone and asked the question. “Easter is Sunday, March 27.”
Thank you very much.
I’ve been using integrated speech input/speech output for over fifteen years. And it’s come a long way, baby. My “day job” is with the United States Department of Veterans Affairs, VA. A month after I started, my chief of service came into my office. “How long will it take you to be able to teach Dragon?”
I was the subject matter expert for the computer access training section at a VA blind rehabilitation center. So I got started.
Today we think nothing of talking to our devices and having them talk back to us. But in the early part of this century, such a feat was pretty extraordinary. Because VA standardized on the Windows operating system, the software allowing someone to use a computer without looking at it or touching it had to be JAWS for Windows (Freedom Scientific, Inc.) and Dragon NaturallySpeaking (Nuance, Inc.).
Out of the box these two applications don’t communicate very well with each other. Hence a third piece of software was required. This third piece of software “bridged” the two programs allowing full hands free control of the PC.
Two years after I began teaching veterans to use a computer hands free I was asked to become a private beta tester for one of the bridging technologies. What fun I had! Before you get the wrong idea beta testing is fun, yes, but it’s also hard work. And it can be hell on hardware. I’ve lost count of the reimages I’ve had because of crashes when testing!
But testing software pushes the horizons. I tried a lot of crazy things . . . because I could. My PC is in our sunroom. The sunroom is separated from the kitchen by a breakfast bar. On the recommendation of a long-time speech input user I purchased a wireless headset.
One afternoon, after my official tour of duty had ended, I was reading an interesting article on the Internet.  “Be Quiet. Say Time?” I suddenly realized my husband would be home in half an hour and I hadn’t done anything about dinner.
“What the heck,” thought I. “Read Document.” And I went in the kitchen to prepare dinner and kept right on reading.
A year after we bought our house my husband gave me a greenhouse for Christmas. The greenhouse is close enough to the house that the wireless headset works just fine when I’m out there. I’ve spent many an hour working with my plants while operating my computer that sits snugly in the sunroom.
Hands-free computing also has its humorous moments. There I was, dictating away, when my husband came in from outdoors. “Scratch That.” So he came over and scratched my back. The man can follow directions!

I’ve been out of the hands-free computing world for a while and was thrilled to learn about DictationBridge. DictationBridge is opening up the world of hands-free computing to more people than ever before.

Free Software Expands Options

Access to productivity tools often comes at a high cost for people with disabilities. The price of obtaining screen readers, braille displays, scanners and other devices can easily set an individual or their employer back several thousand dollars. It is no wonder, then, that off-the-shelf solutions with accessibility features have become increasingly popular.

For example, the blind and visually impaired community has embraced iOS devices both for what they do, and for the fact that an extra software purchase is not required to make them accessible.
Currently, out-of-the-box dictation software is only partly accessible to blind and visually impaired PC users. Sighted PC users enjoy two dictation options. Windows Speech Recognition can be turned on with a few clicks on any Windows machine. The average user who needs dictation for convenience, fatigue, injury, or simple curiosity can simply turn it on, complete some short training exercises, and start dictating. The user who decides they require a more robust dictation solution can purchase Dragon (formerly known as Dragon NaturallySpeaking) from a major retailer and gain full control of their PC with their voice.

For blind users, neither of these solutions is optimized for an out-of-the-box experience. Without echo back, a user has to use a large number of keystrokes to proofread their work. DictationBridge will create a free software solution to address many of these gaps. With the addition of echo back for dictation, a user can choose to employ the free speech recognition features built into Windows, or to purchase any current version of Dragon.
DictationBridge will increase the speech recognition options available to blind users, and ensure that blind users don’t have to pay for features that their sighted counterparts can access for free.

DictationBridge: Coming Soon To A Screen Reader Near You

DictationBridge is coming for free to users of NVDA, JAWS and Window-Eyes and it’s coming soon. For a quick look at DictationBridge, listen to this audio demo.

When DictationBridge 1.0 is released it will:

  • Be compatible with NVDA, JAWS and Window-Eyes.
  • Provide screen reader users with access to the built-in MS Windows dictation facility as well as a number of different versions of the Dragon software packages from Nuance.
  • Be distributed for free with 100% of its source code included.
  • Be fully documented, complete with all of the documentation any user may want.
  • Have professional technical support available to users who choose to purchase it.

The DictationBridge Team

DictationBridge is being managed, written, documented, tested and will be distributed by a coalition made up of 3 Mouse Technology (3MT), Serotek and a number of independent technologists working to bring a free dictation solution to all blind users of Windows based computers employing the three most popular screen readers. A large portion of the DictationBridge code already exists as part of the System Access screen reader, and Matt Campbell, author of System Access and all of the other software from Serotek, is the lead developer on DictationBridge. If you are interested in getting a feel for how DictationBridge will work with the Microsoft Windows dictation features, you might give System Access To Go a try, as DB will be using most of the code that Matt wrote for that screen reader.

In addition to Matt Campbell, the DictationBridge team includes a lot of names one may know from around the world of accessibility and hands free computing. We’re proud to have Lucy Greco, Erin Lauridsen, Pranav Lal, Sue Martin, Debra Kaplan, Bryan Smart, Amanda Rush and Jeff Bishop joining the 3MT and Serotek teams to deliver this exciting and very important bit of technology to the users who need it.

What Will DictationBridge Do?

DictationBridge will contain all of the features identified as actually meaningful to blind dictation users, a feature set based on extensive research into the actual usage habits of such individuals. It will support both MS Windows dictation and a number of different versions of Nuance’s Dragon products. It is a major goal of the DictationBridge team to deliver a solution that is as cost effective for end users as possible; if used with the built-in Windows dictation system and the NVDA screen reader, it will come at no cost to end users at all. Different versions of Dragon provide users with different features, and DictationBridge will support as many as is technically practical. Some advanced features may require users to employ Dragon Professional, but the DB team is committed to minimizing the number of features that will require users to purchase this costly software, and we all believe that most users will be happy with either the no-cost Windows system or one of the versions of Dragon that are relatively inexpensive.

To achieve its goals, the DictationBridge team will soon be launching a crowdfunding campaign to pay the one time development costs required to bring this software to the general public.

3 Mouse Technology will be selling DictationBridge email technical support for $55 per year. Users will have a free dictation solution fully supported by professionals and will also have a community driven mailing list where users can help support each other at no cost at all.

The DictationBridge team believes that blind and otherwise disabled people should not need to pay a penny more than anyone else to use the same technology products. There are approximately 65 million blind people on the planet and DictationBridge will be available as free software to all of those who use a Windows based PC. As it supports the built in Windows dictation facility, users will not need to purchase an expensive Dragon Professional license. Users who want or need the power of Dragon Pro, however, will have a terrific solution available with all three popular Windows screen readers.

Stay Tuned

In the coming weeks, the DictationBridge team will be making a number of announcements about the project and will soon be launching the crowdfunding campaign. If you want to make sure you don’t miss any announcements, please subscribe to the DictationBridge mailing list by opening this link to the mailing list site, or by sending a blank email by using this link to open your mail program and hitting “send.”

Announcing DictationBridge: A Free Dictation Solution For Screen Reader Users

Welcome to the DictationBridge website.

DictationBridge will be a screen reader plug-in that will allow blind users to better enjoy speech recognition.

DictationBridge will enhance the interaction model for blind users of Dragon and Microsoft dictation utilities.
It may seem that speech synthesis and speech recognition would conflict with each other. Indeed, if you place a microphone near speakers, it does pick up sound and can cause squealing feedback. However, with DictationBridge, screen reader users will not experience this issue, for reasons too complex to describe in this brief announcement.

In most cases, users interact with screen readers via a collection of keyboard commands. However, many screen reader users have additional impairments, such as learning disabilities or repetitive strain injuries, that make speech recognition necessary to get things done. With DictationBridge, they can enjoy their screen reader in a more efficient manner, and one that is compatible with other disabilities. Repetitive stress injuries in particular are, based on observational and anecdotal evidence, common among blind people, due largely to using a keyboard far more heavily than those who use vision for their computational experiences.

Command and control of one’s system can also be made more efficient for screen reader users through dictation, and DictationBridge provides access to this as well. Speech recognition is often represented in popular culture with scenarios like Star Trek, where the computer would do whatever the crew of the Enterprise asked of it. Speech recognition is not quite at that level; however, it is possible to command your computer using speech recognition today and have it carry out pre-programmed tasks.

DictationBridge enhances the users’ experience with speech recognition and screen readers. It fills many gaps that occur due to the interaction of these systems and accessibility problems in the software from Nuance and Microsoft.

DictationBridge is designed to be as screen reader agnostic as possible. The program, however, will first be coded for and tested against the NVDA screen reader for Microsoft Windows.

DictationBridge is free, libre, open source software (FLOSS), and it will be available, including 100% of its source code, upon release. The DictationBridge team is dedicated to protecting your freedoms.

If you would like to get more information about this project, join the mailing list by activating this link to join the DictationBridge mailing list, or by using this link, which will launch your email program; you can simply send the message without adding anything else. Whether you use the form or the email, you will receive a confirmation message with instructions on how to complete your registration.

Those of you who are interested in DictationBridge but don’t want to join a discussion list and want to follow the project’s progress should come back to this page often as we hope to be updating it frequently.

Thanks for reading, and please do join our effort to bring DictationBridge to the world at no cost to anyone but those who make a voluntary contribution to our crowdfunding effort.