We've got to wait for the queue here. You let us know. I'm Jon Stine. This is my friend, Oita Coleman. And we're both with the Open Voice Network. I'm former Intel and Cisco. Oita, former SAS. Former SAS, yeah, in analytics and AI, responsible for software quality. So that's who we are. And we're going to talk about why this should be important in your conversations. We are an open-source community within the Linux Foundation. We are focused on conversational AI from a standards, openness, interoperability, and privacy perspective. We are not going to be, let me make it clear, we're not going to be a platform. But we are looking toward the future of conversational AI, especially from a user's perspective, especially from an enterprise perspective, an American Express, say, and those enterprises, NGOs, and organizations that are part of what you're working on from an IoTG perspective. If you've got the vision part, this is the voice part. AI, ML, et cetera. Our sponsors are these: Microsoft, Deutsche Telekom, Target Corporation, Schwarz Group, Veritone. The founding principles of what we do are about data protection, openness, and interoperability. Those are the three major reasons why we were formed two years ago. In fact, we just celebrated our second birthday. So we're young, we're growing, we're making progress. Very exciting. Our theme: building user trust. And user trust from several perspectives. When you say user trust, most people say, oh, it's about privacy, right? It's about data protection. But this is also about the trust of enterprises, such as yours. How is voice used to create value? Where is it used to create value? How will it be used to create value in the future? What can we trust about this technology from a user's perspective? And again, for it to create value and sustain value. Next slide, click it if you would, please. This is our understanding, and here's a key point about what we're up to.
This is our understanding of the market. Right, we're at Intel, so we're going to think about TAM, right? Most individuals would say the voice market is that of the general-purpose platforms: Siri, Alexa, Google. And yes, they've done a great job. And generally, even with Google getting out of the third-party app market, Google Home has roughly around 300,000 apps, give or take, worldwide. The US is roughly 35 to 40% of that, as we know. However, and again, we all know this, and yet all too often we don't ask: how big could this get? How big will it get? Especially as we look, over the next several years, at the Michelangelo moment. Think of the Sistine Chapel. You've got the one finger coming this way. General-purpose platforms need third-party enterprise users, enterprise value. They need to connect to an American Express. They need to connect to Target. They need to connect to whomever, Bank of America. On the other side, Target, Bank of America, American Express need and want to connect to those constituents of the general-purpose platforms. That's going to come together. People in the voice industry are saying, oh my goodness, Google got out of the third-party app business. What's going to happen to third-party apps? The third parties exist. They're running IVRs. They're running voice connections to their constituents, but they're unwilling to give away their data. Key point: they're unwilling to give away their data. They're unwilling to put themselves at the mercy of those who are unwilling to respect the data. The total available market here is in the billions. I've reached out to Tom and to others at Intel. This is a billion-destinations market. There's no reason why Nvidia should be leading in this and Intel not. In the car, in the smart home, enterprise apps are anticipated to grow 20 to 30% this year, based upon a number of third-party analyst firms.
The fastest-growing element within voice is enterprise voice inside the firewall. Some call them IVRs, some call them customer assistants, some call them whatever. I'd love to learn what you're doing, but this is the fastest-growing area in voice right now. It's not part of this ecosystem. We lose sight of it because it's not in this whole general arena of publicity and comment. Fastest growing. Every website will be voice enabled. We will reach and will operate within a worldwide voice web: 1.9 billion websites online, IoT endpoints, IoTG, everywhere, billions of destinations, as you know. We will reach them via voice. We will interact with them via voice. Every AI will be conversational, say the futurists. And then of course the metaverse, God knows how we define that. Come on up front, we're just talking here, we're friends. However we define the metaverse, the immersive world will be largely voice driven. Certainly immersive gaming; in other places, of course we'll touch, tap, swipe, of course we'll do this, but voice will increasingly be the interaction for the metaverse. This is the world that's in front of us with voice. Now some people would say, well, this is not growing fast; voice is at the low end of the hype curve; there's no reason to invest. You buy on the dip. This is growing and, starting with this, is poised to grow rapidly. And again, we're growing toward that Michelangelo moment. Next slide, if you would, please. This is why our work is important to you. We have six major work groups, more than 220 volunteers across 13 nations. It's not just theory. We are actively building and doing right now, and the core of that is a vision for interoperability. How does an application and agent on a Google platform talk to a Microsoft platform, talk to a Deutsche Telekom platform, talk to a Baidu platform, talk to an Azure AI platform, talk to Alexa, talk to Siri? We have the outline of how that will work. There will be multiple layers.
These are the key four, but there are going to be different layers, for which we will have protocols and specifications to allow the sharing of data: messages, acoustic, context, controls, and so on, depending upon the development environment, depending upon what you need to achieve, depending upon the relationship between your platform and someone else's. And it will be in a host-delegate model. We'll initiate; let's say we're in our automobile. We have an agent on platform A. Let's go ahead and click. We'll pass control to our favorite grocer, because we have a loyalty program with them, because they are focused and want to provide the best customer experience. They want me to operate within their brand experience, not to be disintermediated by someone else, not to have someone else control that experience, because they will want to control and engage with me in a private, data-protective way. That'll be passed interoperably. And then we may go here and here and here and here, on and on and on. This is what we are building right now. We anticipate, for perhaps these first three, keep an eye on GitHub, Open Voice Network, we will have first draft protocols and specifications to be tested. And we have a number of companies lined up to test this and demo this in four to six to eight weeks. By September 1, we anticipate there will be individuals, companies, firms, platforms testing this approach. Oita and I are reviewing the core white paper on that this weekend. Our team was going through some final development on that this past Tuesday. We're moving ahead, and very rapidly. The future of conversational AI: interoperable, data protected, secure. Here are the things we're also working on. Interoperability, of course, at the core. And we're also working on: how do you find and discover? Findability and discoverability. You're here; I want to go there. Anyone know Wegmans? Anyone know Wegmans food markets in the Northeast?
Anyone shop at Target? Ever shop at Target? Send me to Target. How do I find the Target in the United States, in my neighborhood, and not Target Australia? How do I find Delta Air Lines and not Delta faucets? How do I find, and how do I discover? Is there a need for some kind of a destination registry or plan? Is there a need for some means of findability and discoverability? We believe yes, in a worldwide voice web. How do you find Target? How do you find PayPal? And know and trust, in an authenticated way, that you have reached them. You have to know that is the place you want to go. We're working on that. Oita leads our privacy, security, and ethical use portfolio, and she's going to talk about that in a moment. But at every step along the way, one will pass privacy controls, privacy understandings. One will pass data security protocols. And then certainly, for those who are participating, companies will adopt the ethical use guidelines that we published here recently, version 1.0. This is where we're building to. And again, if you would, the prior slide, Oita: to this future, trillions of destinations. Trillions of dollars, euros, whatever currency you wish, in terms of total value in a worldwide voice web. I'm going to ask Oita to jump ahead to the next slide and turn it over to her about some of our work also. If you would, Oita, click that. The points at the top there, I think, are really important. Build once, use many. That whole Michelangelo moment that you were talking about: we are at the inflection point, similar to where we were with the browser wars, with the independent platforms. Now, as an enterprise, how do you write once in this interoperability world and not have to write specifically for the Alexa platform or for Siri? So this is the inflection point we're at when we say that Michelangelo moment: build once, use in many places. And that's really the message behind interoperability that we're working towards.
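To make the host-delegate idea above a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: the field names, the `invite` event, and the `agent@platform` addressing scheme are ours, not the Open Voice Network's draft specifications, which were still being written at the time of this talk.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class VoiceMessage:
    """One conversational turn passed between interoperable agents."""
    text: str                              # recognized utterance (text layer)
    context: dict = field(default_factory=dict)  # shared context layer
    acoustic_ref: Optional[str] = None     # opaque handle to audio, if shared at all

@dataclass
class Agent:
    name: str
    platform: str

def delegate(host: Agent, delegate_to: Agent, msg: VoiceMessage) -> dict:
    """Host passes control of the conversation to a delegate agent,
    forwarding only the layers the user has agreed to share."""
    return {
        "event": "invite",  # hypothetical event name for a handoff
        "from": f"{host.name}@{host.platform}",
        "to": f"{delegate_to.name}@{delegate_to.platform}",
        "payload": msg,
    }

# The in-car scenario from the talk: platform A hands off to the grocer's agent.
car = Agent("car-assistant", "platformA")
grocer = Agent("grocery-agent", "grocerPlatform")
turn = VoiceMessage(text="add pinot noir to my shopping list",
                    context={"loyalty_id": "demo-123"})
handoff = delegate(car, grocer, turn)
print(handoff["event"], "->", handoff["to"])
```

The point of the sketch is the shape of the exchange, not the names: control moves from host to delegate, and the payload carries only the data layers (text, context, optionally acoustic) that the relationship permits.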
Part of the origin story of this happened on a December day in a Minneapolis meeting room; it was about two degrees Fahrenheit outside. I had been asked to make a 15-minute presentation to senior management of a major Fortune 500 firm in Minneapolis. They said, prepare one slide. You know how difficult it is to prepare one slide, right? It's impossible. Prepare one slide. And the CIO, whom I had the privilege of knowing, was there, and I began speaking about this voice thing. And he reached into his pocket, pulled out his smartphone, bink, bink, bink, bink, bink. And in five minutes, the CEO came in. Surprise. I had a flight in 90 minutes. I thought, well, that's okay, I can go home tomorrow. The CEO came in and we began to discuss where this thing was going. Again, this is a Fortune 500 firm, roughly 100 billion in revenue, Minneapolis, Minnesota. They said the following things. The CIO said, to Oita's point, I need to build once and use everywhere. I'm not interested in investing in this platform, this platform, this platform, this platform. I want to build once. Maybe you feel the same way. The CEO said, I need to reach all my constituents. I don't care what platform they're on. I don't care what browser they use. I need to reach all of my constituents, all of my customers, all of my guests. And then finally, the CEO turned to the CIO and they both turned to me. And the CEO said, well, wait a minute, let's pause for a moment. Who owns the data? Because the data is where the money is. And the CIO said, well, currently, if you're working with one of the major platforms, those major platforms will claim your data. And the CEO said, and forgive my language, I'll be damned if anyone is going to take the data, which is the lifeblood of our business. Reach all the constituents. Build once, use many. And your data is yours, not theirs.
And thank you, Oita, for reminding me to emphasize that: three founding principles of the Open Voice Network, of which this vision of interoperability, and the work we're doing to make it happen, is one. Now, go ahead, if you would, Oita, thank you. I'm going to turn it to Oita. This is another reason why we're doing what we're doing, and critical for all of us to be thinking about, certainly those who are developing in voice, and from the United Nations perspective. Oita, please. So before I talk about this slide, I'm going to go back to this slide and echo the point that your data is yours, not theirs. In the scenario that you see here, it's about groceries and payment. But if you want to imagine, suppose it's about your health, your identity, things that are captured by your voice. And if you think about it, voice is the richest source of data outside of the human genome. So you're capturing text data, you're capturing acoustic, you're capturing context, and you're capturing all of those things around your voice data. And as the data is being analyzed, there's so much that can be learned about you. So going back to the question of who owns the data, who controls the data, and then the story of interoperability where data is shared from one agent to the next: who gets to see and know all of this information about you? Who gets to know your identity? Things like your name and related identity data. Your intent: the questions that you ask of the voice agent, and behind that, analysis of the intent of your request. Physical characteristics: we all know that based on my voice, you can probably tell I'm a short person. My age, my weight, upper body strength, all those things can be learned. Ethnicity, our language, dialect. So just think, as you go through each of these things, more and more layers are known about you. Personality type, whether you're an extrovert or introvert. Demographic information: where do you live?
What's your educational level, and what's your social status, or even socioeconomic class, based on your language, based on how you speak? Sentiment: what's your emotional state? Are you angry? Are you depressed? Are you sad? Maybe the recommendation is, if you're depressed, we should go give you some ice cream. All of these suggestions that get made, would you like to seek help, all of these things are the intent and the information that's gathered about you. It is so personal. Trustworthiness, and your physical health: are you intoxicated? Think about the scenario in that chart, with so much voice activation in automobiles now. Imagine you have gone to a bar; the system knows your GPS information, and it can analyze your voice and, based on that, determine whether you're intoxicated. All that information, in the wrong hands, can really be detrimental to you, to your personal safety. Just who knows and has access to that information? And then mental health. All of those things are key to our work in helping to put in place guidelines related to privacy, related to security, related to ethical use. From a privacy standpoint: who has access? How do I give consent from agent to agent? How does that consent get passed from one agent to the next, so that I am authorizing this information to be passed? How do I control how much of that information is accessible to the initial agent versus the next? How do I get to delete the information? How do I understand the decisions that are made about the data in the background? All of those things are important to the work that we do from a privacy standpoint. And, in helping you as a consumer understand, from the ethical side, there are the issues related to transparency: how do I know? How do I trust what's being done? What's the transparency and privacy?
All of those things. And is it secure? Are the controls there? How do I trust the authentication from one agent to the next, if they're using my voice to authenticate as an identifier, or if we start doing things like wallets, where my voice is my passport? How do I trust that, and how is that authenticated? So all of those are things that we're working on in developing the privacy guidelines, security guidelines, and ethical guidelines that we've published and are working on. We're in a wild, wild West era of voice. We truly are. And the work of the Open Voice Network is maybe not to be the sheriff who walks down the dusty center road, but certainly to give some guidance to those who may be sheriffs, and to propose ideas, concepts, standards, protocols, and the like to the great standards organizations worldwide. But as you look at this, just understand for a moment, and many of you who work in voice would know this: right now there are firms that are evaluating in real time the sentiment of shoppers and then, within what they promise are milliseconds, assigning different messages and different speakers, different call center agents, to those shoppers. There are firms that are evaluating in real time Wall Street CFO quarterly statements: is it believable or is it not? And sending that information in real time to Wall Street analyst firms; stocks go up, stocks go down, based upon trustworthiness. And then the use of biomarkers in physical health. A great story: a firm outside Provo, Utah, Canary Speech, has a project going. You're aware of what some may call gypsy populations in the UK; I think they call themselves Travellers. Well, they're not terribly interested in visiting doctors. They're not terribly interested in being part of a standard health community. But research has indicated that nearly all Traveller families are equipped with smartphones.
Canary Speech is working on a program with Northern Ireland Health in which analysis of some 15 to 20, even 30, seconds of that speech can give a sense of both the physical and mental health of those communities. And especially at a time of COVID: are we hearing the raspiness in the voice? Are we hearing the coughing? Can you speak to me? It's a marvelous approach, being tested right now, to how we use voice acoustic analysis for some tremendous value creation within underserved populations. All kinds of things are possible here. We can focus on the scary. We can focus on the good. It's all in front of us, and the Open Voice Network is working to bring this to the benefit of all. I think this is pretty much what we wanted to share with you. And I think Oita and I are going to go silent and just listen to your questions and your comments and give you back some time. But please go ahead, Oita. Do we want to touch on, I think one thing that might be interesting to touch on is synthetic voice and deepfakes, that additional work, because that is something that is also part of our work. You've all heard of synthetic voice. You're all familiar with what that is, a cloned voice, does that make sense? For those who don't: I could take, what, 30 to 45 minutes of your voice, chop it up, reassemble it, and then have you speak a different language in your voice. One of the original and very prominent value propositions for this: Hollywood actors whose films go worldwide. Take, for example, Samuel L. Jackson. That's a voice that many of us would recognize immediately. Well, I was in Europe a couple weeks ago, and here was a Samuel L. Jackson film, and the voiceover in German had a very high voice, and I thought, that's not Samuel L. Jackson. That made no sense. Well, now we can take Samuel L. Jackson's voice, cut it up, and have him speak Deutsch or speak Mandarin as only Samuel L. Jackson could do.
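Because cloned voices raise the abuse questions that come next, it may help to see what an inaudible audio watermark could look like in principle. This is a toy spread-spectrum sketch of our own, not any standard under development: a low-amplitude pseudorandom sequence, derived from a secret key, is added to the audio and later detected by correlating against the same sequence.

```python
import numpy as np

def embed_watermark(audio, key, strength=0.01):
    # Add a low-amplitude pseudorandom +/-1 sequence derived from `key`.
    # At this amplitude the mark is far below typical speech levels.
    rng = np.random.default_rng(key)
    mark = rng.choice([-1.0, 1.0], size=audio.shape)
    return audio + strength * mark

def detect_watermark(audio, key, threshold=0.005):
    # Correlate against the keyed sequence: a marked signal scores near
    # `strength`, while unmarked audio scores near zero.
    rng = np.random.default_rng(key)
    mark = rng.choice([-1.0, 1.0], size=audio.shape)
    return float(np.mean(audio * mark)) > threshold

# Stand-in for ten seconds of speech at 16 kHz: a 220 Hz tone.
t = np.linspace(0.0, 10.0, 160_000)
speech = 0.5 * np.sin(2 * np.pi * 220 * t)
marked = embed_watermark(speech, key=42)

print(detect_watermark(marked, key=42))   # marked audio, right key: detected
print(detect_watermark(speech, key=42))   # unmarked audio: not detected
```

Real proposals have to survive compression, resampling, and deliberate removal, which this toy does not attempt; it only illustrates the "no audible beep, but machine-detectable" idea discussed below.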
And it begins to change all kinds of things in entertainment. But if you can do that, you could also take Joe Biden's voice or Donald Trump's voice and have that individual speak all kinds of interesting things. Propaganda. There's so much. It happens, and we've seen that with Barack Obama and deepfakes and all kinds of things. One of the approaches to resolving that, potentially, is establishing some standards for audio watermarking: a means, without hearing a beep, to identify when a synthetic voice is present, and then potentially a standard for consumer or enterprise detection of synthetic voice. We have a study group that has been meeting now for a year, which has published a white paper and has a webinar on YouTube: what is synthetic voice, where is it going, how might we address it? Again, the kind of work of the Open Voice Network. Here's the takeaway: gosh, join our work. We are unique in that we start our meetings on time, and by golly we always end on time. And if you say, I have a hard stop at 10, you'll get a hard stop at 10. Our work groups: interoperability, privacy, data security, findability and discoverability, ethical use, authentication, and I should have added synthetic voice. Those are the teams that we have working right now, meeting either weekly, bi-weekly, or monthly, and all of our documentation is in a Google workspace and, increasingly, on GitHub. You can, just as an individual, without committing your company, sign up as a friend of the Open Voice Network. You know: what you guys are doing is interesting, I'm interested in this, I support it, it's a good idea. Go on our website and click, I want to be a friend of the Open Voice Network. And if you're a friend, you're going to be involved and be invited to specific discovery meetings, specific overarching here's-what's-going-on sessions: here's the inside look, here's the access, this is where we're looking on this and this and this.
So friends are those who are going to know and see, and be at the leading edge of, what the Open Voice Network is doing. And then finally, as I mentioned, we have enterprise sponsors. We have companies on our board who are contributing monies; we're hiring third-party talent, people much smarter than me, equally as smart as Oita, to drive us forward: write the code, develop the protocols, establish the standards, develop the demos so we can test things and drive it forward. And there are multiple levels of sponsorship, and gives and gets and the like. It's on our website. That's the Open Voice Network, that's what we're up to. And I think it's wonderful that we be quiet and let you tell your stories and ask your questions, or ask for clarification, please. Oita, anything else you want to add? We'd love to see you. And as Jon said, we are a judgment-free organization. Come as often as you like, or even just sign up, for example, for our next review of our ethical guidelines, if you would simply like to serve as a reviewer and give us your feedback: does this resonate within your enterprises? From the security guidelines perspective, is this something you think your enterprise will adopt? And if not, what else needs to happen, what needs to be added? So however you want to participate, attend meetings or just be a friend, a friend who can help advance the work that we're doing. We're open, we're inclusive, and we're guilt-free. If you want to help in whatever way, you're absolutely welcome. So, yeah, questions, thoughts, yeah. Users should be able to own their data, but how does that work with the people here, like how does that particular part work? The whole point is to be able to analyze all that data into a particular AI model. Where is that data stored, the testing and all the testing data?
The question was, I don't know if you heard it: okay, so the user owns their own data; well then, where is that stored? How is that then used for training? What happens to the platform? Because the platform wants it for training and development, we get that, and there's a value to that data. It's easy, and a cliche, to say all to be determined, but I think one of the first things, and this is what Oita and team have led, is open, informed consent. Do you understand who you're sharing your data with? Do you understand what's being shared? Do we even grasp that? That is a first step. So informed consent, and a UX that makes that reasonably easy, as opposed to something kludgy that takes you 14 years. That's a first step. A second step would be: is it possible that I would have my own voice passport? Because I'm going to choose, generally I choose, to share a lot of data, because it's to my benefit to share. So I will choose to share with Amex; I've got a card, I want them to know. I choose to share with my retailers. I want them to know, because I get things in return. So we're not suggesting that everyone lock their data away. It's much more: users, be aware; enterprises, be aware. It's frightening, the number of C-level people I meet with, and Oita meets with, who don't know what's happening to their data in voice. What do you mean? Well, yes ma'am, yes sir, this is what's happening. Kind of scary. We're also looking at, so the first paper that we wrote was from the consumer's, the user's, perspective of privacy. And commercial enterprises care about privacy as well. So we are now creating a shorter paper looking at commercial privacy, because right now there are very few standard guidelines for anything related to commercial privacy. So your question is very spot on. That is the reason, whatever you mentioned, we have fenced everything. That is the reason we don't send data out. On a cloud, right.
Because we don't want to send data out; there's a proprietary issue. We don't trust our private integrators. We had our integrations with Siri, Alexa, Google, you name it, we had it. But that did not scale, because at the end of the day you have to send that data. That's not why we do it. So I understand all the concerns, but from the paper or the protocol perspective, how interconnected are these things? The interoperability with privacy and data security: is there a mesh of connections across the multiple things you're trying to solve? Is it layered, or is it just at the protocol level, with a product to follow? What is the current state right now? Great question. What's the current state right now? We're at the early protocol level. So we're early in the process. And we realize, back to the multiple levels that you're describing, we've got to answer that and address that, especially when privacy is going to require informed consent, authentication, an agreement as to the data when we transfer, and all of that happening in milliseconds. To be determined. And I hate to say, well, we don't know yet, but we're working toward that right now. We're at that stage. But your point is a great one. And there was a part of your question; I believe you asked if we would be creating a product for that. What's that? Yeah. So in due course you would be. Yes. I'm forced to say it, right? Like HTTP or HTTPS is kind of a protocol, but then it gets ratified as a standard and then integrated with products. So that's what you're thinking. What is your vision towards it? Would you start with just one horizontal, maybe interoperability, and then evaluate it across products, because there are tons of ways to do that right now? That's the way we're thinking right now. That's the way we're thinking right now. Yeah. Yeah.
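Since consent at the moment of transfer keeps coming up in this exchange, here is one way such a check might look in code. This is purely a sketch under our own assumptions; the actual consent protocol, field names, and scope vocabulary are still to be determined by the work groups.

```python
from dataclasses import dataclass
import time

@dataclass(frozen=True)
class ConsentGrant:
    """A user's recorded permission for one delegate agent (hypothetical shape)."""
    user_id: str
    granted_to: str        # the delegate agent allowed to receive data
    scopes: frozenset      # which layers may be shared: "text", "acoustic", "context"
    expires_at: float      # epoch seconds; grants should not live forever

def may_share(grant: ConsentGrant, recipient: str, layer: str, now=None) -> bool:
    """Check a single data layer against the user's consent before any transfer."""
    now = time.time() if now is None else now
    return (grant.granted_to == recipient
            and layer in grant.scopes
            and now < grant.expires_at)

# The user consented to share text and context, but not raw acoustic data.
grant = ConsentGrant(user_id="u1", granted_to="grocery-agent",
                     scopes=frozenset({"text", "context"}),
                     expires_at=time.time() + 3600)

print(may_share(grant, "grocery-agent", "text"))      # True: text was consented
print(may_share(grant, "grocery-agent", "acoustic"))  # False: acoustic was not
```

The design point is that the grant travels with, or is checked before, every agent-to-agent handoff, so a delegate never receives a data layer the user did not authorize.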
We have a list of the first, what, seven or eight different platforms who have said, yeah, we want to work with you. And so we'll start working down that list. NVIDIA? Riva kind of takes care of all your needs, be it at the conversational level or the common ways, like KCD, something like that, right? Did you experiment with that? We are in conversation with some great friends at NVIDIA right now. Do you know Shyamala Prayaga at NVIDIA? We know of it, Surab. In terms of having developed with it, not yet, but we're aware of it. And thank you for mentioning that, because I'm registering it in my head for further follow-up. No, thank you. You have a, yes? Yeah, on data ownership versus data storage, because I think we were trying to define this. That? This one? Yeah, like who's the owner, who's the data storage custodian? If you have any thoughts on that. We are working to define that right now. This is the zero-dot-one, and it may end up at a zero-dot-two or something else. You use very precise terms, and I love those. This is the custodian and owner, who will make an informed decision to share some level of this. If we have N layers, it could be just text, or it could be text and acoustic, and then bing, bing, bing, bing, bing, to share here. And then this side accepts and says, I am now the custodian. Or, I'm a delegate of that: the custodianship and ownership stay here, and I'm simply sharing a bit of this to obtain information, add pinot noir to my shopping list, bing, okay, back. So there will be some contextual difference, but that's the zero-dot, it's probably zero-dot-zero-zero-one right now in thinking, if that makes sense. That helps. Yeah, and also, with the question that you asked, the terms you were using have counterparts within the GDPR and EDPB guidance, so as a global organization we need to be very aware of those roles and how they are defined.
And especially, there's a Data Act underway that even further defines those roles and what's required as data is passed to delegates; it does spell that out. So we are very aware of those things and are making sure that whatever guidelines we come up with fall in line from a global perspective. Yeah. Very good question. Yeah. There are two questions that came in virtually, so I'm going to read them to you to answer. The first is: one of the scary possibilities that could come from this sort of data getting into the wrong hands, and I'm thinking of Facebook, non-user profiles, the GDPR situation, is how, through data banks, someone could easily come up with, using image clouds, fake 3D versions of everyone, voice and all. We all speak into our phones or around our phones, and that could easily turn against us. What are your thoughts on this? The answer is, we agree with the concern. We have not yet addressed that specific issue and question, but it falls broadly into the reasons why we have been pursuing, and Oita is finishing the final draft of, our paper on data security specific to voice. What are the new issues that voice brings? How do we even think about them? What are the next steps that a worldwide voice web will bring to us? In a world of interoperability, security becomes, well, now we have new threat surfaces, et cetera. But again, to the statement: thank you. It's one that we'll want to capture and add to the list. Thank you. Oita, anything else? No, there was a second question. Yes, and then someone else asked, they said: great topic. Are there groups that work on voice-to-text apps? Any prerequisites to join in for development? Are they happening in real time, and are there speech-therapy-related applications? Question mark. I should have mentioned we have several industry-focused groups.
And please, whoever asked the question, know that we have a health, wellness, and life sciences group that is right now working on the role of voice in remote patient monitoring and management, of which speech therapy and the use of voice would be a part. We've had some conversations with a wonderful center in Birmingham, Alabama, that works with multiple sclerosis and cystic fibrosis patients, on the use of voice technologies in therapy, the creation of synthetic voices to give those patients a voice, and ongoing therapies. I think I said that. And we would love to learn more of your interests, the individual who asked the question, and potentially bring you into some of our conversations on the remote patient monitoring and management topic. Thank you. Thanks for your time. Thanks especially here late in the day, thank you. From the ethical side or from the... Help us understand more. At this point, but... Not specifically. I'd like to learn more. No, I'd love to learn more. Yes, definitely. Thank you. Thank you.