Hello everybody, my name is Mike Hanley, I'm the head of digital communications here at the World Economic Forum, and I am extremely proud and delighted to be introducing this conversation between three of the world's most renowned professors in artificial intelligence and the creation of social and emotional intelligence in robots and machines. At the end here we have Maja Pantic, who is a professor of affective and behavioural computing at Imperial College London. We have Vanessa Evers, who is professor of human-media interaction at the University of Twente in the Netherlands, and Justine Cassell, who is the associate dean of technology strategy and impact at the School of Computer Science at Carnegie Mellon University in America.

In the very brief time I had to get familiar with what our guests do, this is the understanding I came to. To make a machine understand what a human is thinking, or to make it interact smoothly with a human in a social situation, you have to do three things. You have to sense what the human is doing and interpret those signals. You then have to build a machine that is capable of understanding the analysis you have derived from that data. And then you have to imbue the machine with some kind of interaction capability with the outside world. I think that's where we're at, left to right. Maja, your job in this whole process is to interpret data that comes through from sensors. Let me ask you this: we have a camera or some other kind of sensor, we point it at a person, and we get data from that sensing. What happens then?

You got it right, so we need to have some kind of sensor. Which sensor doesn't matter so much, but cameras are definitely important because from cameras we get videos of faces, for example, or of bodies. Then we can analyse, for example, facial expressions, which is the main topic of my research.
What you do with that is you track certain characteristic points in the face, such as the corners of the eyes, the corners of the eyebrows, the corners of the mouth and so on. Based on the movements of those points you can reason about the motion of the face and about the muscle actions that underlie the facial expression. You can see that if the corners of the mouth go upwards, well, you smile; if the corners of the eyebrows go up, that's an eyebrow raise. That's a first level of interpretation. The second level of interpretation is the meaning of those actions the face produced, and the meaning can differ: the facial muscle actions themselves are agnostic, totally independent of the meaning and of the higher-level interpretation you give. You can then interpret those actions further into emotions, for example joy, surprise or fear, whatever kind of emotion you are interested in. You can also just say whether this is a positive or a negative emotion, which we call valence, meaning how positive or negative something is, or you can go into other kinds of interpretation of mental states such as pain, depression, attention and so on. So based on the signals from that facial analysis you come to these higher-level interpretations, which can then be used in further analysis and in building machines that actually take into account the emotional state of a person.

With emotions, I'm happy because of the context in which that happiness arose. How do you teach a machine to take into account the context of an emotion?

Well, context as a term is a very complex thing. Context is, first, who the person is, and that's very important, because how I express joy, for example, may be different from how another person expresses it. And joy is something we consider to be universally expressed.
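The two-level interpretation described above (landmark movements, then muscle actions, then emotion and valence) can be sketched very roughly in code. This is a minimal illustration only: the landmark names, coordinates and threshold are made-up assumptions, not the actual system discussed on stage.

```python
# Hedged sketch: interpret a facial expression from tracked landmark points.
# Landmark names, coordinates and the threshold are illustrative assumptions.

def interpret_expression(neutral, current, threshold=2.0):
    """Compare current landmark positions against a neutral frame.

    `neutral` and `current` map landmark names to (x, y) pixel coordinates;
    y grows downward, so upward movement is a negative dy.
    """
    # Vertical displacement of each tracked point relative to the neutral face.
    dy = {name: current[name][1] - neutral[name][1] for name in neutral}

    actions = []  # first-level interpretation: facial muscle actions
    # Both mouth corners moving up -> smile-like action.
    if dy["mouth_corner_left"] < -threshold and dy["mouth_corner_right"] < -threshold:
        actions.append("lip_corner_puller")
    # Both eyebrow corners moving up -> brow raise.
    if dy["brow_left"] < -threshold and dy["brow_right"] < -threshold:
        actions.append("brow_raiser")

    # Second-level interpretation: meaning assigned to the detected actions.
    if "lip_corner_puller" in actions:
        emotion, valence = "joy", "positive"
    elif "brow_raiser" in actions:
        emotion, valence = "surprise", "neutral"
    else:
        emotion, valence = "unknown", "neutral"
    return actions, emotion, valence

neutral = {"mouth_corner_left": (40, 80), "mouth_corner_right": (60, 80),
           "brow_left": (38, 40), "brow_right": (62, 40)}
smiling = {"mouth_corner_left": (39, 74), "mouth_corner_right": (61, 74),
           "brow_left": (38, 40), "brow_right": (62, 40)}

print(interpret_expression(neutral, smiling))
# prints (['lip_corner_puller'], 'joy', 'positive')
```

Note how the muscle-action step is deliberately separate from the emotion step, mirroring the point that the actions themselves are agnostic to their higher-level meaning.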
Everybody in the room probably smiles when they are joyful, but in principle there are different things, like pain: how I express pain may be very different from how other people express it. So who the person is is the first contextual question. Then where that person is: for example, if I'm here and I raise my eyebrows, it's definitely not that I'm surprised; I probably want to emphasize what I'm saying. Then what I'm doing, which is again linked to whether I speak or I listen. If I listen and I frown, it might be that I doubt what you said or that I don't understand; but if I talk and I frown, it might be that I'm not sure about what I'm saying. So again the interpretation is different. The next one is when I said that. And why, whether it's emotion or something else underlying it, is actually what we want to find out, along with how I express these things. So that's context, and context is very difficult. There are currently no machine learning methods that truly take context into account, because context also means taking time into account, which we currently use only over short terms: we can analyse something like a second of video and reason about that, but we cannot analyse days. And if you talk about moods, for example, then days are needed, or about strain due to stress: first you have to analyse and see the stress, and then know that the strain is due to the stress. So context is very important, but how we will handle it in machine learning is still not known. That's one of the challenges we have in the field.

So we have data relating to how the human is feeling. How do you start the process of building a functional machine that uses that data?

So the work that Maja is doing, we use it especially for machines that have to interact with people. Not machines that are in a factory with a cage around them and arrows on the floor so you know where to walk and to stay away from those machines.
But robots, for instance, and autonomous cars and intelligent homes do interact with people. And when you interact with people you have to take into account the social environment. You have to understand what people's situation is, what the social situation is that they're in, and their emotions. So the algorithms that Maja develops are really important for understanding the social scene. One of the robots we built is running around in Schiphol Airport; it's supposed to take passengers from one gate to another, and it goes to that gate autonomously. It drives around there, but when it sees people in front of it, it shouldn't treat them as objects that you simply need to go around. The robot must understand what a group of humans is, and it would be nice if it could see, well, this is a little family, or this is a group of people who seem to belong together, and then go around those people. For instance, if someone is taking a photo of two people, you go around that; you don't drive straight through it. This gives these machines social skills and social intelligence, which is essential if you want this technology to integrate into our environment. When they're exploring Mars or when they're in the factory you don't need it so much. But when they come into our homes and give us services, or enter the hospital and give us services, it's very important that they fit naturally into our environment and that they behave roughly in the way we expect. It can't be exactly like people, because robots often don't look or sound like people at all, but something that seems intuitive to us, so that behavior needs to be designed very carefully.

So you said you have some of these carts in Schiphol Airport already using this kind of technology. Some other examples, please?

So actually a robot that Maja and I worked on together is called FROG.
What was really interesting about that robot is that it can offer services in outdoor environments, and we proved that by having the robot run on its own, autonomously, in an outdoor cultural heritage site, approaching people. It knows when they are interested: thanks to Maja's algorithms it can detect people's interest, and it adapts the content it provides to that interest. And it navigates the environment autonomously; it builds a map on its own. What's really interesting is that it works outdoors, because outdoors, and Maja could tell you more about that, it's very difficult to use the data that comes from the sensors, because of the weather and the light conditions and everything that can happen when you're not in a nice, perfectly lit room. So that's another example we work on. Also, for instance, robots for a hospital setting: feeding robots that listen to the conversation before they offer food, so that you partake more naturally in the dining rituals; rescue robots that are able to understand whether someone is in pain and try to respond appropriately; cars that know whether you are distracted; homes that know whether you belong there. And I can keep going.

Very good. So, Justine, my understanding of your work is that if Maja works with the individual and the recognition of emotions in the individual, you are a pioneer in making machines work together with others.

Right, that's correct. So we've talked so far about the detection of cues and how we infer underlying psychological states from those cues. Once we understand those underlying psychological states, how do we use that understanding to build robots, or systems of any kind whatsoever (it could be a car, as Vanessa said), that can respond appropriately? The kind of response that Vanessa has been working on is detecting the social situation and the emotions and using that to respond appropriately, not to interrupt, for example, a family unit.
In my own work, I understand what we do as being towards the end of allowing robots of all kinds to collaborate better with people, and to collaborate with people over a lifetime, not just over a minute. Facial expression, for example, is very difficult to track over long periods of time. But I've built a mobile assistant, a kind of cartoon virtual human, that lives on your cell phone; when you put your cell phone down it can slide onto your computer screen, when you go into the kitchen it can slide onto the front of your refrigerator, and when you go into the living room it can slide onto the wall of the living room. It's there with you all the time. It would be crazy if that system said to you every morning, "Hello, I'm your cell phone," and you would say, "I know you're my cell phone, we've been together for around five years now." So to remove that sense of amnesia, we need to understand how people work in groups and in dyads, how interpersonal relationships work and how they facilitate collaboration and all kinds of other work. In my own work I've looked at several use cases. One is your mobile assistant that can help you figure out what's most important on your calendar. Another is to learn what kind of news you like and to present that news, but to present it in an increasingly amusing or edgy or entertaining way: "I know you like sports, so I've been looking for something for you. Unfortunately you like the Steelers and not the Broncos, and the Steelers aren't doing so well, but don't worry, next year everything will be fine." For the moment our cell phones don't do that for us; they don't change over time. Another extremely important use case is education.
We know about young people, and particularly young people in underserved communities and under-resourced schools, that one of the reasons they have trouble learning is that they don't feel there's somebody in the classroom who respects them, who cares about them and who is like them. So one of the systems we've built, which we've actually been demoing over in the loft (you're welcome to come see it today and tomorrow until six in the evening), is a virtual friend, a virtual peer for children in a classroom. This virtual peer gets to know you, it builds interpersonal closeness between itself as a virtual peer and the real child, and it uses what we might call the social infrastructure, or the social scaffolding, to improve learning. In a recent experiment we discovered that it does just that. Children who come to school speaking a dialect different from the one the teacher speaks, so that there is no teacher in that classroom who is like them, can feel marginalized. Those children learn more science, and learn science better, when they work with a virtual peer who speaks the way they do, but who also models for them how to switch into the standard English of the teacher as soon as the teacher comes into the room. So they're being taught how to maintain a connection to their home culture, which is extremely important for all of us, and also how to have access to school success by speaking the way successful people speak in that particular culture. Those are just two use cases. We have to remember why we're doing this and why we're here talking about AI so much this year at the World Economic Forum. There's so much fear about killer robots, and factory robots that are going to overtake the CEO and take her or his position, and that's just not something that's going to happen. We build robots and virtual people in our own image, in the image of what we care about.
It's our decision whether to build them as aggressive war robots or to build them as robots who care about emotions, who care about people and who work on building interpersonal bonds. The three of us care about preserving those aspects of humanity that make us most human, and that is emotion and social interaction.

So very much all your work is moving in that direction, towards providing the context we talked about earlier.

And interacting as a function of a changing context, because context changes over time.

So humans, as we know, all have different levels of emotional and social intelligence. At one end of the scale you've got Bill Clinton and the charismatic leaders, and at the other end perhaps some people with more autistic ways of interacting. What are the conceptual challenges to getting a machine up that spectrum, as it were? And where do we stand?

So actually Vanessa and I currently have a project working with autistic kids. The whole idea there is to help them understand expressivity, especially facial expressivity, in themselves and in others. The issue is that autistic kids cannot easily interpret people's facial expressions because they miss the gestalt, which means taking the whole face into account. They actually see it as a totally divided set of parts: the mouth is the mouth and moves separately, the eyebrow is the eyebrow. And the issue comes when they try to learn how to express themselves, because they also cannot react. First they get totally shocked when people start moving their faces, because for them that's a very, very complex set of things going on all at once. That's the first trouble. The second is that they have to react to that, and they have no clue how. So we can actually teach them what certain expressions mean and how they should express themselves. And it seems that robots are much better teachers in the case of autistic kids.
The main reason is that robots are always consistent, so they will show the expression in exactly the same way each time. If you ask a human to do that, there is no chance they will, because they want to emphasize: "well, you have to smile," and when you do that, it's a completely different expression than a regular smile, right? So it is really important for these kids that the expression is shown consistently, in the same way. That's what the project is about. That's the autistic part, but if you want to ask how far we are in understanding automatically whether somebody is on, say, the more extrovert or more introvert or more dominant side of a personality scale, well, this is something we currently do not have the technology for, not because of the algorithms but because of the data. We simply do not have enough data of dominant people labeled as dominant and doing certain things. Because what does it mean to be dominant? What does it mean to be introvert or extrovert, right? We can explain to you what it means, but we don't have enough data to teach my models, well, this is how extroversion is expressed, right?

So yeah, maybe I can add to that.

So we've got about 10 minutes left. I'm looking forward to hearing from you, and then I'd very much like it if we could get some questions.

I'm going to add something too.

Oh, okay, great. Short additions from both of us and then we'll open it up.
So you asked about a conceptual challenge, and I think everybody here in the room is sitting a certain distance apart from each other because we have these rules about how we behave. The technology up to now, we keep it in our pocket, we put it somewhere, but now this technology is acting on its own: it's going to drive around, it's going to do stuff, it's going to grab things and give them to us, and that means it needs to comply with the rules we have in our society. So we have to figure out, I guess conceptually, what these rules are for robots, because they are different for animals and different for humans. Only then can we have people and robots work together effectively, because I think the challenge is not superhuman intelligence, that they can surpass us and, I don't know, be better wives and child raisers and computer scientists, but that we make these computers work together with humans in the most effective way possible.

And purposeful. So you must have a purpose; the purpose is usually to help the person, to help with a job, to help with some state, a medical state, whatever, but purposeful. These robots going to war, that's not purposeful, there is no real goal there, right? The goal is just to destroy.
So I would say as well that the challenge of working with individuals with autism, particularly high-functioning autism or Asperger's, is a very interesting one for researchers like the three of us who are interested in social and emotional intelligence. In my own work with children with high-functioning autism or Asperger's, I was able to demonstrate that a virtual human could teach social skills in such a way that children were better able to learn those skills from the virtual child than from a teacher using state-of-the-art methods, and that once a child had learned those social skills, he or she was able to transfer that learning to interactions with other real children. That is kind of the holy grail of what we do: we don't want children spending the rest of their lives interacting with robots or virtual beings, we want them to apply what they've learned to their interactions with real children. I would also say that when you talk about a spectrum of ways of being, that brings up a phenomenon called entrainment. Entrainment means coming to act like somebody you're interacting with, and it's something all of us do. In a conversation, I can demonstrate to you that after a mere two minutes you will start to use similar words to the other person; your intonation, the curve of your voice, and how loudly you speak will come to resemble the other person's. This issue of entrainment is something my students and I are working on: how can we avoid the necessity of categorizing people as one way or another, and rather simply (it's not that simple, actually) teach our systems to entrain to how people are? Because when people entrain to us, our affect for them goes up: we like people who speak the way we do, who act the way we do. If we can build systems that know how to entrain, then we can deal with anybody on that spectrum.

Yes, please.
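The lexical side of the entrainment described above (two speakers coming to use similar words within minutes) can be approximated very crudely in code. This is a toy sketch under stated assumptions: the utterances are invented and the simple Jaccard word-overlap measure is an illustration, not the panelists' actual models, which also track intonation, loudness and syntax.

```python
# Hedged sketch: measure lexical entrainment as word-set overlap per turn.
# Utterances and the Jaccard measure are illustrative assumptions only.

def lexical_overlap(utterance_a, utterance_b):
    """Jaccard overlap between the word sets of two utterances."""
    a = set(utterance_a.lower().split())
    b = set(utterance_b.lower().split())
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def entrainment_trend(turns_a, turns_b):
    """Per-turn lexical overlap; a rising sequence suggests entrainment."""
    return [round(lexical_overlap(a, b), 2) for a, b in zip(turns_a, turns_b)]

# Toy conversation: speaker B gradually adopts speaker A's wording.
speaker_a = ["we should tune the model today",
             "the model needs better tuning",
             "tuning the model is the priority"]
speaker_b = ["maybe adjust the system later",
             "the model could use some adjusting",
             "yes tuning the model is the priority"]

print(entrainment_trend(speaker_a, speaker_b))
# prints [0.1, 0.22, 0.83]
```

A system that detects this trend rising could, in the spirit described above, adapt its own word choices toward the user's rather than classifying the user into a fixed category.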
I think it's a very interesting issue that you're raising, emotional intelligence and artificial intelligence. How could this help us as CEOs to understand more of our teams? Because everybody knows that we are not talking to machines, but sometimes the CEO thinks they are talking to machines: they are like, do this, this and this. And if you are able to win people over not only with, let's say, arguments but also with emotional intelligence, at the end of the day you're going to gain their brain, their intellect, but also their heart. So this is something that concerns me every day: how to interpret it, and how we are going to talk about purpose, because millennials, and employees more and more, want to have a purpose. It's not only a question of money and total compensation. I'd like to hear what you have to say about that: how to connect the two things, and then to learn more, because at the end of the day we need to have just one team, one team with different views, but after discussing we have one view and, okay, let's go in this direction.

So one of the systems we built that was very effective in teaching social and emotional skills asked people to control the virtual human. We said: this virtual human is in the next room interacting with somebody, and you're going to be able to look through this one-way mirror, see how the interaction proceeds, and control what the virtual human does. We did this both with individuals with Asperger's or high-functioning autism and with neurotypical individuals.
We said, just control the virtual human, and then at the end of the interaction, if you don't like how it went, you can change something about how you control it and we'll try again with another person. What we found was that the act of controlling a virtual human in a room with other real people allowed the people doing the controlling to reflect on their own behavior in a way that is rarely possible. In some sense, to talk about it scientifically, they're hypothesis testing: they're thinking about, okay, what's the right way for this to work? A nice example from a teenager with autism was that he said to me, you know, that didn't work at all, because I think that guy expected me to ask him how he was doing, so next time I'm going to leave time for him to talk. Now, this is an issue for individuals with autism: they don't know the concept of contingency, that when someone else says something, you say something that has to do with what they said. But this child learned it in the process of controlling the virtual human. It's the same with adults: you can reflect on whether your management style is successful by watching its effects. There are many books for CEOs that say, get to the balcony, reflect on what you're doing. That's nice, but it's a little abstract. If you're in another room watching an avatar of yourself, in some sense a representation of yourself, that can be an easier way of doing this kind of reflection.

Just to add, we actually have a technology that watches people while they communicate with other people over the internet, and it raises flags such as: there is a conflict arising, you may want to use more positive or agreeable words in order to make this work. So it's actually helping with management styles.

I think there are many applications possible. If you had a detector of dominance, it could give you a little buzz every time you're being dominant.
You would notice how often in the day you are dominant, and you could adapt your behavior to that, for instance. And there are many possibilities.

And all of these things are possible because of the three-step process we talked about. First, recognizing observable cues, cues that are observable but not necessarily consciously interpretable by all of us. Second, inferring what underlying psychological state they indicate, whether someone is angry or depressed or whatever. Third, figuring out the social context you're acting in and having the intelligence to interact with a human in order to improve the situation. That can be interacting as a virtual human, or it can be, in some sense, whispering in the ear of the real human: you might not want to do that again; you know, did you notice a lot of people have been quitting? It could be because you're acting in the following way.

Should we go down the back here?

Sure, thank you. Well, that last comment made me think of what you think of the Hello Barbie project, but my real question was: I was wondering if you could talk about some of the design decisions that were relevant to building trust and familiarity in the physical aspect of these interactions.

I think Vanessa and I probably do some things that are similar and some things that are different. In my case, I build virtual humans, and when people work with me and build robots, they build robots that have the form of a human. But I never build anything that looks too realistic, and that's an important design decision for me, because I'm not trying to fool anybody into thinking this is a real person. On the contrary, it's very important for my work, and I think for much of our work, that people know the limits of ability of these systems. And I convey that by the form.
So the virtual child, for example, looks like a cartoon of a child: not too cartoonish, realistic enough to evoke in the other person, unconsciously and automatically, the skills of social interaction, but still not so realistic that it makes you think it's real. So I guess, oh, sorry. Yeah, do you want to add? There's still time.

I think there are basically two design approaches to developing robots for human interaction. One is psychology-informed: you look at how people do it and then you try to replicate that. But say you've got a robot with no arms, so you need to figure out another way for it to point, and you deal with it that way. Or you take more of a product-design approach, where you say, okay, what is the function, what is the outcome that I want, and then design from scratch. And I think very often it's a combination of those two, to leverage the familiarity that we have with humans.

And we might take one last question, because we're kind of out of time.

Oh, I guess I'll just talk. I love the work that you're doing. Thank you. I actually have dyslexia, and I would have loved a virtual peer growing up to help me learn how to read. It was hard learning on my own, but somehow I managed. My question is more about the Ex Machina craziness that you probably have to defend against every day, and I'd love to hear it directly from the source. Emotional intelligence is what leads to empathy, and I think that as we move forward in building these machines, it's very important that they have empathy. But humans have a spectrum of emotions and intelligence when it comes to being able to express themselves, and some people don't have empathy. How do we make sure that these robots are going to have it? And what if there's some kind of malfunction? There are quantum computers now, and they're going to be so much more intelligent than we are. How do we make sure that they stay empathetic?
Well, what you raise here is a very, very important issue, and that's consciousness. You can be truly empathic or not. I can build a robot now that will appear to be empathic. It has no idea what you are actually showing, but it has learned that there is a rule: you're sad or you're afraid, and it will say to you, well, you know, life is nice, don't be sad, seize the day, whatever, right? But it will mean nothing. What you are actually asking is what happens once it starts meaning something, once the robot truly has empathy or not, and that is evoking consciousness. That's what the whole Ex Machina film is about, which is a very bad movie, by the way; I'm sorry, that's my personal opinion. The whole issue is that we do not know what the real model is, because that's the whole point. Do you need a model for consciousness and thinking? Because what is consciousness? I know that you know that I know: that's a kind of consciousness, right? This is epistemic logic. Do you need a model of that, or is it enough to embody in the robot, or in any artificial intelligence entity, the instruction that you can find the model on the internet, load it, fill it in with whatever you want, and get consciousness, right? We are actually really far from that. And again, it comes back to what I said before: purposeful. What is the purpose of embedding this code? Yes, maybe there is a model of consciousness you can download from the internet, and you can follow it and find whatever you want further on the internet, because you're just googling. That's knowledge, right? Just googling, nothing else. But the issue is that first line of code saying there is a model somewhere on the internet. The question is, which purpose would that serve? And given that I don't have an answer to that, and I think it would be really ridiculous to have that integrated into a robot, I believe we will not come to that.
But that's my belief, and I'm really speaking only for myself, because I do know a lot of scientists who actually believe this may be of importance. So that will probably be one of the debates in the field.

And perhaps it's not a coincidence, and I think, Vanessa, you probably think similarly, that you're not imbuing your robots with consciousness. One of the things that characterizes those of us who work on social and emotional intelligence, or I'll speak for myself here, is our purposes, our goals. My goal is to build a system, a robot or a virtual human, that can evoke empathy in humans, not one that can demonstrate empathy. What I care about is real people; that's why I do the work I do. I care about teaching them empathy, maintaining the sense of empathy that I believe we're all born with, and preserving empathy, because it is one of the things that makes us human and that we care most about. And I use systems that understand, model, scaffold and teach interactional skills such as empathy, or rapport, or interpersonal closeness, by building something that gives the automatic cues of being like a real human in such a way that the real human acquires or demonstrates or uses those skills. But it's not a goal of mine to give the underlying psychological state to that system.

We might just have a quick closing mission statement, I guess, from Maja and Vanessa.

Okay, well, I think we really want to build these robots to be effective and efficient, to work with people optimally. And that's actually a very simple goal, in a way. In order to do that, you need to understand the social situation in which we work and in which we live; otherwise these products will not be accepted, they will not be effective. So it's essential that this technology takes the emotional and social state of people into account. I believe that the technology is very powerful, and I believe that we can use it for many good applications.
And this whole hype that you hear about losing jobs because of AI and robots, I believe it's just ridiculous, because it contradicts the other thing we constantly hear: we need people educated in technology, and a lot of jobs are lost because people are not educated in technology. At the same time, we are talking about losing jobs because technology will be developed. So it's a contradictory thing, right, that we are constantly hearing, and everybody falls into that trap. I know this audience is not big, but just don't believe these things; this is just crap. Another interesting thing is that as technology develops, new jobs open up. Did we know about all the internet services that would be here? No, we didn't, and they're here, and most of us actually work in these kinds of jobs, right? So I just think that people should not be scared of technology and technological advancements. And I believe that we all really just want to help each other. That's the whole purpose of this.

And with those optimistic words and outlook, thank you very much.

Thank you, thank you. It was a great pleasure. Thank you.