This session will review speech recognition innovation applied to assistive technologies and inclusive digital interfaces. Industry leaders and advocates will share their perspectives on the impact of these technologies as well as potential future applications, including media accessibility, simultaneous translation and captioning, and digital interactions in home, education, and business environments.

Session Chair: AnnMarie Killian, CEO, TDIforAccess, Inc.


  • Josh Miller, Co-Founder and Co-CEO, 3Play Media
  • Sara A. Smolley, Co-Founder and VP, Strategic Partnerships, Voiceitt
  • Shadi Abou-Zahra, Principal Accessibility Standards and Policy Manager, Amazon Devices and Services
  • Stephen Framil, Corporate Global Head of Accessibility, Merck & Co. Inc.

This video is lacking captions. We expect captions by February 14, 2024.


Good afternoon again, and welcome back. This is the panel discussion on New Frontiers for Inclusive Speech Recognition. The moderator is AnnMarie Killian, CEO of TDIforAccess. The floor is yours. Thank you.

ANNMARIE KILLIAN: Hello, everyone. Good afternoon! Thank you for coming. Thank you for coming to this speech recognition plenary. This panel has been preparing for our discussion today, and there are different avenues that we'll talk about, but before we start, it is important to introduce our names and our image descriptions.
So I am AnnMarie Killian. I'm CEO of TDIforAccess. I'm 5'9''. I'm wearing a blue jacket; I have long brown hair, glasses, and hoop earrings. I would like to ask each person to introduce themselves and give their image description, and then we'll get started.

JOSH MILLER: I'm Josh Miller, co-founder and co-CEO of 3Play Media. I'm a white male in my 40s now. I have dark brown hair with gray coming in. I'm wearing a blue blazer and jeans and some brown shoes.

SARA SMOLLEY: I'm Sara Smolley, co-founder at Voiceitt. I have a red dress, I have long straight hair, and I'm wearing heels on this occasion; in general I would probably be barefoot in front of a screen. Really good to be here.

STEPHEN FRAMIL: Hello, Stephen Framil, from Merck, corporate global head of accessibility. I'm wearing a turquoise coat and black shirt, standing 6 feet tall, probably shrinking by now because I'm in my mid-50s, with mostly black hair, and I'm of white Caucasian and Pacific Islander descent.

SHADI ABOU-ZAHRA: I'm Shadi Abou-Zahra, Principal Accessibility Standards and Policy Manager at Amazon. I have a really cool red wheelchair that makes me feel as young and sporty as I would like to be.

ANNMARIE KILLIAN: Okay. So now I'll explain the process of the agenda. We open with five questions that we're each going to talk about for 5 minutes each; that's about 45 minutes, and then we're going to allow 15 to 30 minutes of questioning. If you don't want to use, you know, the Slack, raise your hand and we'll have a mic brought to you. We do ask that you respect the flow and wait until the closing of the session; we'll open it for questions at that point.
So, are we ready to get started? Okay. The first area to approach for speech recognition is innovation and inclusion. Sara, I want to ask you a question that has three parts: how have recent advancements in speech recognition technology contributed to improving accessibility and inclusivity in digital interfaces and assistive technology? Can you provide some examples? And how are industry leaders and researchers collaborating to ensure that speech recognition systems are trained on a wide range of speech variations to cater to different users effectively?

SARA SMOLLEY: Thank you. Thank you for the question. Again, it is really good to be in this discussion, on this panel with you, and with all of you today. So again, I'm Sara, one of the co-founders of Voiceitt, an assistive technology startup based in Israel. We have done one thing, hard enough: speech recognition for nonstandard speech. Now, for a question like that, we have sort of a tradition in our company to respond literally in the voices of our users. So I want to start with just a few profiles of individuals that we collaborate with in building our speech recognition technology. Dr. Clare Monroe, Ph.D., is a particle physicist in the U.K.; she's a TED speaker, science fiction writer, and scientist, and she also happens to have cerebral palsy and finds it difficult for people to understand what she says. Michael Cash, my colleague, is a product specialist based in Tel Aviv. He often says, "People find my speech difficult to understand because of my cerebral palsy, my British accent, and my sense of humor." He uses the now-accessible speech recognition to be more involved in work meetings, and, achieving his dream to be in sales, he's now leading customer and partner meetings. One more example: a deaf individual, a heart researcher based in Boston, who again wants to share his thoughts and research over video meetings. So, yes, back to your question: voice technology can open the world for people who find it difficult to be understood and for people with more general disabilities. That's not actually why we started our company, though. My grandmother was diagnosed with early-onset Parkinson's disease when she was 40 years old; by the time I was born, she had lost most of her motor capabilities, but more than anything, it was her speech that was impacted. Even as a young child, I saw that it was difficult for her to express herself and to be understood.
Even as a young child I felt her frustration, because it was difficult for her to communicate basic human needs. That was our inspiration to develop a technology that can learn her way of speaking and then essentially translate for her, helping her to communicate by voice, whether it is saying "I love you" or communicating with a medical professional.
And so that's essentially what we did. We collected voice samples from people around the world with nonstandard speech, and then developed a mobile application to help them communicate in person with others. What we learned then from our community was: hey, in-person communication can be really helpful, but now, in a world increasingly activated by voice, we could do a lot more. We could help people interact with their machines. So this comes back to your question: how can voice technology now be used to increase the accessibility and inclusivity of digital interfaces and really of mainstream technologies? That's a point I often like to emphasize: on the one hand, we technology providers, yes, we can help to increase access, independence, and quality of life in many instances, but equally, we can simply help people access the joy of mainstream platforms, the utility and convenience of mainstream technologies. That's a lot of what our goal is as well.
So we went from the verbal, in-person dialogue that I just described to voice control, enabling a person with a speech and motor disability to turn a light on and off, adjust a thermostat, or play music by voice, independently, for the first time, when very often they may rely on a human caregiver to perform these things on their behalf.
So this was one of the first experiments moving from just voice communication to interfaces: a partnership and integration that we did with Amazon Alexa during COVID. It was really the beginning, when we started to learn more and more from the community: okay, how can we build not just a technology that's cool and interesting and fun, but one that really addresses the needs and preferences of those individuals? From there, it is continuous learning, which we're still doing, and I think part of the goal here today is really to discuss collaboration, not just amongst startups, but among and between other stakeholders: corporations, other technology providers, innovators, researchers, and, of course, the end user.

ANNMARIE KILLIAN: Thank you, Sara. I'm touched by your story about your grandmother, obviously, and how she really inspired the work you did. Thank you for that. So next, Shadi Abou-Zahra, are you ready for your question? Okay. Speech recognition has the potential to greatly enhance media accessibility through features like real-time captioning and simultaneous translation. So can you tell us how you envision these applications evolving in the near future, and what challenges you foresee in terms of accuracy and language diversity?

SHADI ABOU-ZAHRA: Yes. Thank you very much. Where do I start? Well, let me back up. Before answering the media question, let me speak a bit personally. For me, speech technology has been truly a life changer. One of my many talents is getting into bed and forgetting the lights on. I would have to crawl back into my wheelchair to go switch off the lights; I have a remote control, but that's always somewhere else. What I'm trying to say is, just being able to control the environment by speech, simply using my voice for commands, switching channels on the television, these kinds of things.
Such an incredible helper. And it is these things that really make such a big difference in day-to-day life. So, you know, speech technology has just so many uses, and I think Sara talked about individuals, specific people, and how they're benefiting from that technology in different ways. But there are many other uses. At this conference, automatic captioning has already been mentioned often. People saw automatic captioning a few years ago, and they have seen it more recently, and the improvements happening are just incredible, nearly exponential. So it is just a matter of time; we're looking at so much more, and I think this is just the beginning of how many more uses there are. One of the features that Amazon has, for example, is called Dialogue Boost. What it actually does, using speech recognition, is identify the speech in a movie, say, and then suppress the background noise to try to make the audio clearer, and you can set how far you want to take that. A small plug: we have a stand where you can actually come look at that live and see how it's being used. Again, this is a completely different use, a very different way of using speech recognition, or speech technology more generally, to provide an accessibility improvement or feature. What I'm trying to say is, I think there are so many uses for so many different people, and we're only seeing the beginning of the improvements that can potentially happen.
Today's keynote mentioned the Speech Accessibility Project, and Amazon participates in that as well. I think it is a very exciting project, but we're still very early on; we're still collecting data and trying to improve. I think Christopher from Google was talking about that: trying to collect data to collectively improve speech recognition for more people around the world. This was also mentioned in the keynote, and as somebody who originally comes from Egypt, grew up in Austria, and is now in the U.S., with different languages and different cultures, there is a lot ahead of us in terms of collecting speech from people with diverse accents, let alone different languages and different regions. So I think the positive side is how far we have come in the last year; it is just incredible, mind-blowing, but there is still a long, long way ahead of us to achieve the far greater improvements that are needed.

ANNMARIE KILLIAN: Thank you. The next topic area is for Steve, talking about education and business environments. Okay. So the exact question: speech recognition can play a pivotal role in education and business environments by facilitating communication and interaction for individuals with diverse needs. Could you share some insights into the practical benefits and challenges of implementing speech recognition solutions in these contexts?

STEPHEN FRAMIL: Certainly. Thank you for having the one non-speech-recognition provider here on the stage. It is a delight to be here. I think about some of those challenges. At Merck, we're a global company; my meetings start as early as 6:00 a.m., and I'm having conversations with multinational accessibility stewards all around the globe, even into the late evening. There are a lot of different mother-tongue backgrounds other than English; of course, the company runs and operates in English. It is one of those things where you get accustomed to being able to understand different accents. One of the things I found, a story from a few years ago, is with PowerPoint. PowerPoint is, like, the way we communicate at Merck; that's the question: "Do you have a PowerPoint for it?" There is a feature where you can have verbal translations while you're presenting. You can do it in English, and you can start flipping languages around. So I remember one meeting I had with the team in Spain. I thought I was so clever: I'll be speaking in English, translating it back to them in Spanish. I'm really clever.
After I got about a minute or two in, they stopped me. They said, could you please turn that off? It is making our heads explode, because what they were having to do was process three things: translating into English, what they were hearing me say, and what they were seeing on the prompt. I think the technology is fantastic, but as a consumer, as a global company, we have to be sensitive to how we're using these different tools given the audience. Another story is that I have accessibility stewards, 70-plus globally, in every market where we have an office, and I recall just last week, I believe, meeting with the team in Vietnam, and I found that for some of the individuals understanding them was very straightforward, and for some of them it wasn't. This is where having some sort of accent-aware speech recognition, as you were saying, to be able to decipher that would be extremely helpful. With a large global company, there is a lot of moving around. I think when it comes to understanding people, that can be learned. Every one of us can learn to understand individuals. I recall in graduate school, part of my housing arrangement was that a group of us would take care of an individual with cerebral palsy, a fellow student, pretty much a use case where he had to have everything done for him. Understanding him at first took a little while, took some time. But through the course of time I could easily take dictation for the papers that he was writing, understand him, and be able to communicate and translate, and that was something that was learned.
So what I'm saying is, accents can be learned, they can be acquired, but that's when you have a long-term sort of relationship with an individual. In a large global company, there is a lot of movement, and sometimes you're across the world from one another; you don't have the luxury of building that long-term relationship to better understand how people are saying things.
That's something that came to mind. One other thing, just because as you were speaking I thought of this: my father, and we're accepting the fact that he's 86 years old, has been developing dementia, and my mother has to write things out to communicate with him. He refuses to wear his hearing aid. So sometimes he doesn't understand exactly, if the writing is too complex. So maybe, with the AI conversations that we have been having today, there is another use: being able to distill the salient points of what is being said for someone who has dementia. Just another thought to throw in there for the developers out there. There are a lot of use cases on the user side of a large company, a lot of factors, and it is very exciting. I think, ultimately, it is going to help everyone understand each other better as we go forward.
Thank you.

ANNMARIE KILLIAN: Thank you, Steve. Thank you. So the next area is speech recognition's technical challenges and limitations, so we'll ask Josh. Josh, speech recognition, wow, there are a lot of factors here that can impact its success: background noise, accents, speaking speed, all of this. So what are the technical challenges facing the technology, such as background noise, and how do these challenges impact the quality of speech recognition and the overall user experience?

JOSH MILLER: Right. Thank you. I'll do my best not to be a party pooper about speech recognition. To give you background on 3Play Media, we're a media accessibility platform focused on delivering high-quality captioning, audio description, and other accessibility services at scale. So we have been combining technology with humans in the loop to really deliver very, very high quality with the use of speech technology. We were using speech recognition 15 years ago to make this process scale better, go faster, and deliver more accessible content. From the very beginning we thought about the idea that accessibility is not just about compliance, and not just for people who are deaf and hard of hearing or blind and low vision, but about enabling people to consume content along the lines of their preferences, really enabling people to consume it however they want.
So we think a lot about that when, today, we see conference platforms being able to deliver speech recognition. Some people really prefer the speech recognition in the platform over a higher-quality live human captioner, because the speech recognition is faster. That can make a big difference in the way that they consume that content. Ultimately it depends on their own capabilities in terms of how they're listening or how they're lip reading, whatever it may be, in following that content. I think that's one of the biggest things we have to think about when it comes to the challenges: the user preferences. And then, the content itself.
It was kind of alluded to in a few different ways here, and it was talked about in the previous panel about the CVTA as well, that we really need to understand what the content itself is and what context the users are consuming that content in.
So in our world, we use speech recognition, then correct it with a human, then do QC on it before we put it out.
That's because speech recognition will assess what is being said, but it is not going to do as well when it comes to speaker changes, overlapping speakers, or background noise, all of the things that AnnMarie alluded to. There are other things, too: for certain types of content, non-spoken sounds that are actually relevant to the plot are really important and really hard to capture properly. That doesn't apply to all types of content.
We think about how speech recognition can perform in different scenarios, and we put out a report every year, the State of ASR, where not only are we benchmarking all of the different engines out there, from all of the big-name tech providers, but also looking at it by industry. We're seeing that there are legitimately leaps-and-bounds improvements in speech recognition today, and we're seeing the word error rate get down, on average across these different types of content, to about 7%, meaning it is 93% accurate. That's pretty amazing when you consider the range. We also look at formatted error rate. That's the idea of getting speaker changes, non-spoken elements, punctuation, and capitalization right. Howard on the panel noted that commas save lives: when you put that comma in the wrong place, you're saying "let's eat grandma" instead of "let's eat, grandma." Right. So those are real things. And that gets into this idea that not all errors are equal, so optimizing for the content is really important.
When you start to go into the industry-by-industry or content-type analysis, those ranges start to open up quite a bit. You see certain types of content can get as low as a 4% word error rate, meaning it is 96% accurate on average. That's incredible. But you have to account for why: that usually means a single speaker using common language, and really optimizing for the setting.
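The word error rate figures cited here are the standard ASR metric: word-level edit distance between the reference transcript and the engine's hypothesis, divided by the number of reference words. As a minimal sketch (not 3Play's actual scoring pipeline, which also handles normalization and formatted errors):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming table for Levenshtein distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # cost of deleting all remaining reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # cost of inserting all hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)

# One substituted word out of three: WER of 1/3
print(wer("let's eat grandma", "let's eat banana"))  # → 0.333...
```

Under this definition, the 7% average WER mentioned above corresponds to 93% word accuracy, and 4% to 96%; note that WER alone says nothing about punctuation or speaker labels, which is why a separate formatted error rate is tracked.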
So on the technology side, what can we do? We have to look at how to isolate speaker channels, and how to build unique vocabulary lists or dictionaries based on the type of content. We can start to do that now. One of the biggest advancements in the technology is the flexibility of being able to tune an engine in almost real time: you can actually start to improve the speech technology ten minutes before a live event and get better speech recognition if you have the right vocabulary.
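One cheap way to see why the right vocabulary helps: domain terms can rescue near-miss recognitions even as a post-processing step. This is only a toy illustration; `apply_vocabulary`, the cutoff value, and the sample vocabulary are invented here, and production engines instead bias the decoder itself with the supplied word list:

```python
import difflib

def apply_vocabulary(words, vocabulary, cutoff=0.75):
    """Snap near-miss recognized words to a custom vocabulary of domain terms.

    A toy stand-in for engine-side vocabulary biasing: any word that is a
    close string match to a vocabulary entry is replaced by that entry.
    """
    corrected = []
    for word in words:
        # Best fuzzy match above the similarity cutoff, if any.
        match = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
        corrected.append(match[0] if match else word)
    return corrected

# Hypothetical event vocabulary loaded shortly before a live session.
vocab = ["Voiceitt", "captioning", "Abou-Zahra"]
print(apply_vocabulary(["voiceit", "and", "captioning"], vocab))
# → ['Voiceitt', 'and', 'captioning']
```

The design point is the same one made above: names and jargon that a general-purpose engine has never seen are exactly the errors a per-event vocabulary can fix, whether applied in the decoder or afterward.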
So there is a lot more we can do. And when we think about conference platforms, one of the beauties of a conference platform is that the technology can identify who is speaking. So you can put in speaker IDs, and hopefully soon platforms can also isolate those channels and get better speech recognition even when people are speaking on top of each other. There are a lot of things we can do, but a lot depends on context.

ANNMARIE KILLIAN: Thank you, Josh. Thank you so much. So another question, for all of you, and for this part we could have a discussion as well. Looking ahead, what are some emerging technologies and exciting applications of speech recognition that could further promote inclusivity? And if you wouldn't mind telling us, can you predict some of the emerging, exciting applications that could reshape the ways we interact with digital interfaces, with information, and with each other in our lives, and how they will be applicable to us in the digital realm as well?
Did I get all that? It is an easy question, go for it!

SARA SMOLLEY: Yeah. Maybe I'll start, just to come back to what I described in the beginning. When we started building our technology, we had no idea where we were actually going and how far we would come. What was science fiction even for us a few years ago is what's called continuous speech recognition, which means enabling the person to speak fluently, so that in the case of someone with a speech disability, the technology understands what they're saying very similarly to the way it would understand someone with standard speech. This was sort of science fiction for us just a few years ago. And now that the technology really is working, even for people with pretty moderate speech disabilities, it really opens up new use cases that we just couldn't have imagined.
So from our perspective, we're really depending on people in the community to continue to tell us: okay, where does the value of this technology really lie? Where can we as technology providers help? That's what we are: a group of technologists and speech pathologists building a technology and relying on the community of all stakeholders to tell us which direction to go. Just some examples: one of the first applications of the accessible speech recognition that we built, when we moved from in-person communication to interaction with machines, one of the first integrations we did was with Amazon Alexa, again, to say "turn on the lights" using the Voiceitt application. From there, our users said, okay, while we can do voice commands, what about when we want to do more than just environmental control and interact with the newest AI productivity tools? What's the weather, and the many things that you can do now with ChatGPT and other tools. So one of the newer integrations that we have done is with WebEx, to make video meetings accessible, opening up workplace meetings for people who are challenged to be understood in a hybrid workplace. And, of course, many other applications.
So again, an application like that, a video meeting, interactions by voice with the newer AI tools, is really not something that we could have imagined. I mean, the sky is the limit, but we always come back to: let's not just build a cool technology, right? Use our vision to build something really great, but always come back to earth, where it will be really making a difference. What do our users and customers say will make a difference in their lives right now?

ANNMARIE KILLIAN: Go ahead. You want to answer? Josh, jump in.

JOSH MILLER: I think, right on what Sara is saying, the idea of everyday experiences, which you were alluding to as well, is where this gets really interesting. I agree: 20 years ago, speech recognition was a cool technology looking for an application. There have been just so many advancements, both in the quality of speech recognition and in the flexibility of applying it, that the conversation changes dramatically. I mean, the Dialogue Boost demo that we saw this last week with a clip from Jack Ryan is really cool, and it comes back to this idea: there are always articles now about how everyone is using subtitles and captions at home, and part of it is user preference, but part of it is also that theatrical releases were not made for a living room; they're made for a massive theater where you have the crazy huge speakers and surround sound systems that you just don't have at home. In comes Dialogue Boost to help you follow along better. I had no idea speech recognition was behind that; incredible. There are interesting ways to really enable basic experiences.

ANNMARIE KILLIAN: Shadi Abou-Zahra, go ahead.

SHADI ABOU-ZAHRA: Yeah. So again, I agree with both. I think it is these small improvements, not necessarily new technology as such, but making the existing technology more robust: better understanding the context, better understanding what's meant. Like if I just say the name of somebody while I'm speaking, it should be clear that it is part of the speech, not a command, not another word. Things like that.
And more generally, expanding our language coverage, expanding how we recognize people with different speech patterns, be it due to disability or due to different accents. I think all of these things will make the technology even more robust and even more useful day to day, so that it will become common and ubiquitous. It is like every technology when it starts: think about cellphones when they came out; they were not as reliable or as easy to use, and now using them is a normal thing to do.


STEPHEN FRAMIL: I'll add, as a customer, I appreciate wanting to look to the customer and what's driving the data. I'm thinking about pharma. Of course, this is an area where I personally would like to expand our accessibility work at Merck: what's happening in clinical trials? You have situations where, for any number of reasons, people who are getting involved in clinical trials have comorbidities, other types of illnesses and diseases, that are disabling in terms of their speech. This could be something that's really very useful. It also speaks directly to the whole notion of social determinants of health, in terms of providing equal access to clinical trials across the entire spectrum. So there is a lot of opportunity there, and it is something we can kind of solve together: meet us where we are right now and figure this out together.

ANNMARIE KILLIAN: Thank you. Thank you for that. Now the last question, and, in all transparency, probably my favorite question as a deaf woman to ask this panel, regarding empowering individuals. How can individuals with disabilities, advocates, and communities contribute to the ongoing development and improvement of speech recognition technology, and are there any opportunities for collaboration to ensure that the technology truly empowers and benefits everyone?

SHADI ABOU-ZAHRA: I'll take a shot. I think there are multiple levels and multiple ways. Again, today in the keynote panel, the Speech Accessibility Project was mentioned; this is one example, but it's only one example, one opportunity. There is also a lot of opportunity to, for example, build Alexa routines and skills, and we love to cooperate with organizations; I think Sara mentioned some of that work. Working with different organizations to innovate: for example, I recall that in Germany there is an Alexa skill that was developed with the German association for blind and low-vision people, and this skill allows you to browse the media libraries of some German broadcaster networks. You can browse through by genre or date, or simply search for something and have it played through an Echo device. So here is an opportunity where, together, something was developed using the speech interface and speech recognition to create a custom, specific solution. There are organizations and assistive technology makers, like Voiceitt and others, that are really focusing on specific cohorts, while Amazon is focusing on mainstream features and on making sure we provide accessibility features for the broad majority, and I think there is a close relationship with specialized assistive technologies that focus on specific features and functionalities that go further. This is maybe another opportunity, another way of working together and furthering the reach of the technology.

ANNMARIE KILLIAN: Thank you.

JOSH MILLER: In our world we think about feedback loops and how to improve the technology, and I think it goes beyond that, too. Whether it is a basic communication device or a tool to improve communication versus mainstream captioning, I don't think we're doing enough to close that loop and get enough feedback as to when it is working well, when it is not working well, how it could be better, and what the actual problems are. We are essentially selling to content producers and publishers, whether that's in the media space or the enterprise; we're selling to someone making the decision on behalf of the user. How do we enable that better, to be able to identify where things are working really well and where there still needs to be improvement?


SARA SMOLLEY: I'll just add, you said this is a question close to your heart, and I think it is something we all really share. I would like to say that it is so important for people to really be engaged, in a way that mainstream consumers simply don't have to be. I don't know the specifics, but I'm assuming that when most consumers don't like something, when it is not working for them, they throw it out and walk away. I really believe, and in my experience in this field, in building assistive technology, it is incredibly helpful for people to really be involved and engaged to share their experiences, because it is not always obvious what has been challenging in what they have experienced. People are not used to being listened to and heard; it takes additional effort to do additional interviews to get the information needed to improve the technologies and the accessibility of the products. I think there are opportunities, with a few startups and with the larger corporations, for people to be involved in beta testing, and for other stakeholders, maybe caregivers and others, to be involved in that realm. I would urge that; it is incredibly helpful. One other point I would like to make here, turning the question on its head a bit to express a challenge that a lot of people have experienced: I was speaking the other day with someone from Johnson, handling the Speech Accessibility Project together with the five major tech companies.
And we were discussing the challenges of data collection in this space. When we talk about people with nonstandard speech, people with speech disabilities, in mainstream dialogue this sounds like a very niche thing; actually it is extremely diverse, and there is not always a central location to go to collect the samples and engage people in the area. So I'll turn the question around: what are other people's perspectives on engaging people, the community, as we build products and technologies to ultimately benefit everybody?
STEPHEN FRAMIL: Briefly, I would just say at Merck, being not only the accessibility lead but also co-chair of our Disability Inclusion Strategy Council, one of the projects we're embarking on is inclusive packaging design. When it comes to our pharmaceutical packaging, it is not equal across the globe. In Europe, there are laws requiring braille on the actual packaging of medicines. So this is an area, an opportunity, where we and providers of assistive technologies can collaborate on what that could be. We haven't defined it; we're at the beginning of the journey, and it will take 18 months at least to gather the ideas. We also have software and medical devices used in research labs and clinical trials, so there is a lot of opportunity, I think, when large corporations and providers collaborate together. Yeah, it is very exciting.

ANNMARIE KILLIAN: I want to add something; we have a little bit of time before we open questions, 28 seconds. I want to emphasize inclusive disability research: oftentimes solutions are being designed without the people who are impacted. For example, with the impact of AI on sign language interpreting, can you only ask interpreters themselves to discuss that impact? No! We need to be bringing in the people affected, the consumers, bringing them to the table for the discussion of AI and coming up with solutions together; we already have in-house experts who can solve this. So maybe I'm curious: consumers, have you been brought to the table for this discussion? It is important for you to believe in the person who is innovating the solutions. Are you involved in the different communities of those with disabilities? One size does not fit all, it just doesn't. We are all different. There are consumers, observers, and resources available, and sometimes people say, oh, I can't find them. Well, if you are in the field of innovating solutions, that means you have to know the community; you have to go out there, you have to be inviting them, and not just People with Disabilities but people of different backgrounds, people of color; involve them. We have to include them, discuss with them, and let all of the feedback influence us. Okay.
So now I'll open up the floor for questions.
You want to come up here.
You're fine, I will watch this.

My name is Christian, I'm with Gallaudet University. My question is in regards to a personal experience with deaf people who have accents. I know there is quite a variation: some people's voices are easy to understand, but for some people it can be quite the challenge. So I am thinking about where we can go to make sure that the technology does work, thinks outside of the box, and accommodates everybody. For example, from my experience, there are a lot of meetings that have both hearing and deaf people involved. I do speak for myself oftentimes, and that's okay, especially when it's all hearing people in the room, but if there are deaf people involved as well, that does bring up some problems; it can be hard for people to recognize my voice and understand me. It is an impossible situation for me. So how do you accommodate both hearing and deaf people at the same exact time? I feel like speech recognition should be improved to make sure that it is working for everybody in the room. It does present these challenges, and obviously there are new companies, new technologies, and with that come problems as well. For example, like you mentioned, the partnership with WebEx: this is wonderful, but many deaf people use Zoom, not WebEx. So the point is, how can we get from here to develop the technology, find the solutions that are needed, and get to a point where it will accommodate everybody.
How can we get there?

ANNMARIE KILLIAN: Sara, you want to answer that one?

SARA SMOLLEY: Yeah. Thank you so much for that question. I really appreciate it, because I'll tell you, really honestly: not long ago, we weren't sure whether the automatic speech recognition that we're building, accessible speech recognition, would be interesting at all for people in the deaf community. We thought, well, if people are signing, do they really want our technology? This is an example of the interest and the need really coming from individuals to us, and when we launched what I mentioned before, our continuous speech recognition enabling more fluent conversation, it seemed to be the right time to explore this. So we teamed up with one of the leading providers of solutions for deaf individuals, strictly for this purpose: we hadn't really validated the technology before with deaf voices, or what you call deaf accents. Would it work well, and if so, what would be the use cases in which it would be most valuable? Would it be interacting with GPT, the voice assistant, or in-person communication, or through meetings? So we started to explore that in both directions: first, does the technology work? Second, what is the user experience that would be most valuable?
So with that: the technology works great. Interestingly, while a lot of training may be necessary for people whose speech is correlated with conditions like cerebral palsy, deaf voices may be more consistent and need less training, almost no training, on the technology.
So then it became about the use cases. We talked about the WebEx integration; our product actually has a Zoom plugin as well, so we're working through that. I think it also still comes down to the partnerships that we're building. It is not something that a startup can do alone, not something that a corporation can do alone, not something that a university can do alone; it takes working together to find that solution: again, the technology, the user experience, and then the actual real-world application in a business model, which by the way is a totally different discussion, but a business model that actually can get the products into the hands of the people who can benefit from them.


SHADI ABOU-ZAHRA: This is not an answer to the question, just some thoughts on it: people have very different types of speech patterns, not only in the deaf community but also with many other disabilities such as ALS, Down syndrome, and cerebral palsy, and that is what I mean about the work ahead of us; there is a lot of it, to make the improvement. Some conditions are progressive, and with things like MS and other types of disabilities, you can have a certain speech pattern one day and another the next. We need more robust systems with robust recognition, is basically what I'm trying to say.

ANNMARIE KILLIAN: Thank you so much. Any other questions? In the back.

I'm Lindsay. I'm an occupational therapist and research scientist from the University of Pittsburgh. I'm glad that Sara gave the example of the provider population, and Merck, you did say providers as well: the provider populations that work with individuals with a diverse range of abilities. Can you speak a little bit about how you have involved them in your development and your innovation, and the impact that has made on your product and your outcomes? And I guess the second part would be how that could tie back to the other point you just mentioned, Sara: I get that it has to fit in a business model, but the collaboration among academia, the population, the providers, and the industry that makes the products, not only to develop inclusive products but also to get them into the end user's hands and ensure that they have the knowledge and skills.


SARA SMOLLEY: So clinicians, providers, caregivers, family members: they're an extremely important part of the development process and have been from the very beginning. In our approach, we have a team of speech language pathologists in Israel and the U.S.A. who work with other speech language pathologists, OTs, and other clinicians, initially to identify individuals who can benefit from the products or be involved in research, and also to bring feedback back to the development team, in the sense of the feedback loop that Josh described. It is challenging; I think we can all recognize that, because of the costs involved and the time constraints of many therapists. I'm sure you can relate if you're a practicing speech pathologist. So from a startup's point of view, we have taken different creative approaches to involve therapists and other clinicians in different stages of the process while being mindful of their time constraints, and we do this through grants, through different training methodologies that can actually be incorporated into therapy, and then through things like surveys and other ways of collecting feedback that are very simple and efficient. But I appreciate the question; I would love to talk to you and other researchers about ways to collaborate that would work from your perspective as well. I think it is really, really important.

STEPHEN FRAMIL: I would also add to that, you know, in pharma we're in the business of improving medicines to improve lives. To do that we need to do clinical trials, and to effectively run a trial so that a medicine can go to market, you need a good cross section of participants, which means breaking down the barriers that existing processes might pose for People with Disabilities. This is where I think there is an opportunity to partner with assistive technology providers to break down those barriers so that a pharma company can successfully get a medicine to market and improve lives. That's what we're doing, and that's where a partnership comes in: how can we better take a look at that concept of social determinants of health? That's what pharma is interested in, and then how can assistive technology help with breaking down those barriers to get there.
You know, one thing AnnMarie had mentioned about disability: at Merck, the digital accessibility policy owner is our Chief Diversity and Inclusion Officer. It is not an IT function. The thought leadership comes from DE&I, it is enabled by IT, and of course the business is responsible for implementing. So I think there is a real opportunity here for collaboration; we have different goals and different business models, so how do we break down the barriers for People with Disabilities, who cut across all marginalized groups?
You think about the whole DE&I concept and its various marginalized groups: the thing is, People with Disabilities cut across all of them; it is universal. And as it is often said, if you live long enough, you will be one of them, or it may reach you sooner than you expected.

ANNMARIE KILLIAN: I want to add a last comment to "if you live long enough": recently I was reading that the World Health Organization projects that by 2050, 2050, a large share of people still living will experience hearing loss. Okay. So the impact is spreading. Unfortunately we have one minute left, so we want to wrap up. I want to thank each panelist for their time and involvement, and if you have any additional questions, feel free to come up and talk to each of the panelists as we close, okay. Thank you again for your support.
Thank you.

FRANCESCA CESA BIANCHI: Thank you again to the panelists. I would like to just announce the networking break. Half an hour, from 3:45 to 4:15. So please, if you are interested in the next topic, in this room, we will talk about new enablers for inclusive workplaces.
Thank you.

This text, document, or file is based on live transcription. Communication Access Realtime Translation (CART), captioning, and/or live transcription are provided in order to facilitate communication accessibility and may not be a totally verbatim record of the proceedings. This text, document, or file is not to be distributed or used in any way that may violate copyright law.
