And thanks for coming. I know many of you are from Marquette, and actually, I just initiated a collaboration with Dr. Amy Van Hecke at Marquette, so you may also see me on your campus. This is the topic of today's presentation. When people think about biomedical engineering, they think about applying technology to physical health. My research is of another, newer type: how to apply technology to help people maintain their mental health. When we talk about mental health, we are talking about psychological disorders. There are many types of psychological disorders; you have probably heard of major depression, anxiety, schizophrenia, dementia. I am dealing with autism. When a patient has a psychological disorder, there are two ways to treat the problem. One way is psychiatric medication: basically, they can take pills to suppress the symptoms. Or they can do cognitive behavioral therapy, in which therapists try to understand the patient's cognitive patterns and try to change their behaviors accordingly. Usually the most effective approach is a combination of the two; the result is better than doing either one alone. I design intelligent systems to help children with autism address their cognitive and behavioral problems. This is a general type of research in human-machine interaction. Regarding the machine, we are talking about computers and robots, and some other people also use mobile devices as the machine. The systems are designed for children with autism spectrum disorder, or ASD. Why are we studying this disorder? Because many children suffer from it: one in 68 children in the United States. This disorder has two important features: one is social communication impairment, and the other is repetitive and atypical patterns of behavior. The lifetime cost of care is over $3 million per person, so it is very, very expensive. My research mainly addresses the social communication impairments. This is interdisciplinary research between technology, I mean engineering, and psychology. So an important question is, where am I? Am I more inclined to technology, or am I doing more psychological work? I will say I am here: mainly I do technical work, but at the same time I have to understand developmental psychology in order to do my research. This slide shows the basic stages in my research, or in this type of research. First, we think about what we are trying to solve and what type of system we have to develop; that is the design part. After that, we develop intervention systems, which includes programming and hardware design. Then we use the designed system to do user studies, or human subject experimental studies. In these studies, we recruit real people with autism, we instruct them to interact with the system we developed, and we see how it goes. After the user study, of course, we do data analysis: we want to look at the procedure and the results in fine resolution. So you can see that this type of research takes a long time. Usually one research cycle takes two years, which is a bit longer than other types of biomedical engineering studies, such as image processing or signal processing. But the good thing is that you get a chance to interact with real people and see the change in their behaviors, so that is a bonus. So why can we use machines to help children with autism? That is a fundamental question.
If we want to design something for them, then we have to answer this question: why can we do it? There are two important reasons. One is that children with ASD have difficulties discriminating and screening out unnecessary information from the main communication stream. For example, if a kid with ASD were sitting here trying to understand my speech, instead of paying attention to my speech, maybe my facial expression would be very distracting to this kid. Maybe what I am wearing, the color of my sweater, would attract their attention more than what I am saying. Using machine-assisted technology, we can present information, here I mean communication information, in a way that only the primary information is conveyed and all the unnecessary information is eliminated. So instead of me giving the speech, if I program a robot to give my speech, then I can program the robot so that there is no facial expression, and I can paint the robot in a non-distracting color. Then they will have less difficulty understanding the conversation. Another reason is that children with ASD prefer non-biological motions over biological motions. This is a very interesting phenomenon. And of course, if you are using a machine, then you are providing non-biological motions, right? So those are the main reasons why we can use hardware, why we can develop intelligent systems to help these children. This research has been done for decades, and here are some examples of previous work. People have developed robots. This is Keepon, and the robot can tilt its head and make different sounds. And this is Kaspar, a very famous robot developed in the UK. The robot can talk to children, and it also has sensors embedded, so children can squeeze its head, touch its face, and so on. There are also technologies such as virtual reality. This is a simulated social conversation environment. People have also used virtual reality to investigate how adolescents with ASD drive. We cannot ask them to drive real cars because of the potential risks, but we can develop a virtual driving environment for them so they can engage in driving. Now the question is, what is the difference between this and this? And why do many people use robots? Can you guess why? OK, the answer is that with robots, you can provide physical embodiment, and that physical embodiment naturally creates more engagement. A famous professor, Brian Scassellati at Yale, did a very interesting study. He recruited students, not people with ASD, and they did the experiment in his own office. They used a robot and gave different instructions to the students, and one of the instructions was to throw a $500 textbook into the trash can. When they did the experiment using the robot, the students did not hesitate; they went ahead and threw the expensive textbook into the trash can. However, when they used a recorded video of the robot displayed on a screen, many students hesitated. They were like, am I going to do this? This is so expensive. Am I supposed to throw this textbook into the trash can? So you can see the importance of physical embodiment. That is one of the reasons why we use robots. And in the traditional work, usually the interaction is just free play. The researchers bring the robot into the room and just encourage the children to interact with the robot. They may wave to the robot.
They may touch the hand of the robot, and so on. The responses from the participants need to be observed by human observers, and if the robot wants to give feedback based on the behavior of the children, then the robot needs to be controlled by a human operator. So this means people are just using the physical capability of the robot, but the recognition, the brain of the robot, is actually that of a human, the human operator hiding behind the scenes. This type of research is called Wizard-of-Oz research. I did something different. First, instead of doing free play: psychologists have already found that if an intervention is oriented to a particular core deficit or impairment of the disorder, then the intervention itself is more effective. For example, if a child lacks joint attention skills, then it is better to create an interaction scenario that focuses on joint attention instead of other things. Second, I developed autonomous detection algorithms, which means my system can track and analyze the behaviors, the interaction cues, of the subjects automatically. So you do not have to hire an operator to observe the participant; the system can do it by itself. And the system can provide adaptive feedback to the subjects: once the system understands what is going on, it can provide feedback to the participant according to the participant's behavior. That is the difference between my research and the traditional research. There are three important requirements in designing intelligent intervention systems for these children. One is that the system must be created in a way that it can be accepted, it can be tolerated, by these children. It sounds very easy, but actually it is not, since children with ASD have an abnormal sensory profile. A lot of things that are not offensive to us may seem very overwhelming to these children. The second requirement is that we have to create a system that can attract attention from the participant. In order to teach a participant anything, to change their behavior, at least you have to grab their attention first. And then we want the system to be able to elicit desired skills. If a child with ASD cannot do joint attention tasks very well, then we want to develop a system that can help them do joint attention. So how should a robot interact with the participant? As I mentioned, we should conduct a meaningful intervention, that is, we have to address the core deficits of ASD, and we have to have effective interaction logic. We need to maintain their engagement; this is related to attention. And we have to improve their performance; this is related to whether we can elicit desired behaviors. And how do we track participants' behaviors? We need to design interaction cue sensing. This is the major part that facilitates the interaction logic, and actually, interaction cue sensing is the most difficult task in this type of research. After we develop the interaction logic, build a robot, and design the interaction cue sensing components, the next question is how to integrate and coordinate the different components. We need an effective system architecture. All of this may feel very abstract, so I will give you a case study to explain it further. The case study is on joint attention intervention. What is joint attention? Joint attention means different individuals share their attention on the same object. Currently, you and I share our attention on my slides.
This is joint attention. But many children with ASD cannot share their attention; they cannot coordinate their attention with other people. So we try to develop technologies to help them overcome this deficit. This study was done, I mean, I did it, over about five or six years. In the first stage, I think it was around 2011 or 2012, our research group wanted to investigate whether children with ASD could respond to instructions given by a robot. What they did was recruit a few children with ASD and program the robot so it could point at different target monitors mounted in the room. After the child-robot interaction, they also hired a human therapist to do the same thing, so we could compare the effect of a human and a robot. What we found was, first, the participants paid more attention to the robot than to the human therapist. And secondly, the robot could elicit joint attention performance comparable to the human therapist. So this pilot study was successful, and it encouraged us to do more work. But there was a big issue in this study. This is the hat used to indicate the direction of the children's attention. How this worked was, you see a line of infrared LEDs, and we also mounted an infrared camera on the ceiling. If the kid turned the hat towards the robot or a target, then we could track the vector of the LEDs using the infrared camera, right? That is how we indicated the direction of their attention. But as I mentioned, children with ASD have an abnormal sensory profile, and over 30% of the participants could not tolerate wearing the hat, so they just dropped out of the study. Basically this means the conclusions we got here were based on data from the children who could tolerate the hat. That was the problem. In addition, the experiment was a one-visit experiment: the children came to the lab, did the study, and then left, only one visit. So we did not know the longitudinal impact of the robot. In the next stage, we wanted to study whether the results would hold using non-invasive gaze tracking, where gaze indicates attention, and what the impact of the robot was over time. In stage B, we designed another system. Here is the robot, and the participant was seated in a chair in front of the robot, with two targets, one on the left side and another on the right side. Instead of wearing the hat, we hired a therapist sitting behind a partition, and she clicked a button if the participant looked at the target monitor following the instruction of the robot. So basically the robot said, look over there, and if the child responded, the therapist clicked the button. So again, this is a Wizard-of-Oz system. But at least all our participants finished the study, no dropouts, so we could get a more concrete conclusion about the impact of the robot. Again, we recruited a few children with ASD, and we did four sessions on different days to assess the longitudinal impact to some extent. The acceptance rate was 100%, as I mentioned. We found that the preferential attention towards the robot was high in the first session, and this interest lasted across the whole procedure: the attention paid to the robot did not decrease significantly from session one to session four. And their joint attention performance improved significantly. So this encouraged us to do stage C, the next stage. In the next stage, we wanted to create an autonomous system instead of using the Wizard-of-Oz mechanism.
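As a side note on the stage-A attention tracking described above, a rough sketch of how a strip of infrared LEDs on a hat could be turned into a heading direction from an overhead infrared camera frame might look like the code below. This is purely illustrative: the exact implementation is not given in the talk, and the function name and threshold are assumptions.

```python
import cv2
import numpy as np

def led_heading_deg(ir_frame):
    """Estimate the wearer's heading (degrees) from one overhead IR camera frame.

    Assumes the bright spots in the frame are the hat's LEDs and that they form
    a roughly straight line along the direction the head is facing.
    """
    gray = cv2.cvtColor(ir_frame, cv2.COLOR_BGR2GRAY) if ir_frame.ndim == 3 else ir_frame
    # LEDs saturate the IR image, so a high fixed threshold isolates them.
    _, mask = cv2.threshold(gray, 220, 255, cv2.THRESH_BINARY)
    ys, xs = np.nonzero(mask)
    if len(xs) < 2:
        return None  # LEDs not visible in this frame
    pts = np.column_stack([xs, ys]).astype(np.float32)
    # Fit a line through the LED pixels; (vx, vy) is the unit direction vector.
    vx, vy, _, _ = cv2.fitLine(pts, cv2.DIST_L2, 0, 0.01, 0.01).ravel()
    # Note: a symmetric line fit leaves a 180-degree ambiguity; in practice the
    # front end of the LED strip would need to be distinguished somehow.
    return float(np.degrees(np.arctan2(vy, vx)))
```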
So what we did was develop a fully autonomous system with non-invasive gaze tracking. You can see the main innovative, novel part was the non-invasive gaze tracking. This is a diagram of the architecture of the two systems. The semi-autonomous system is the Wizard-of-Oz system. Why do I call it semi-autonomous? Because the therapist needed to monitor the behavior of the participant; the robot could do some things autonomously, but the whole system was not fully autonomous. We replaced the human therapist's monitoring with autonomous gaze tracking to make the whole system completely autonomous. This is how the system looks: the robot, two target monitors, and the participant sitting here. You can also see a few cameras; those cameras were used for gaze tracking. As I mentioned, gaze tracking was a major part and the most difficult part. Why is it difficult? Traditionally, if people need to do gaze tracking, they use eye trackers. Have you seen eye trackers somewhere? Then you probably know that in order to use a commercial eye tracker, you need to do calibration. During the calibration, the participant's head is held still, the screen displays a few calibration points, and the participant looks at the calibration points one by one. After this calibration, the participant again needs to hold their head pose, they cannot move, and can only move their eye gaze in order to do precise eye tracking. Once they turn their head around, that messes up the calibration, and the eye tracker is no longer effective. However, if you look at our interaction environment, you will see that we need the participant to look around, so we do not want them to hold their heads still. That means we cannot use an eye tracker. Instead, we did some computer vision programming and developed our own algorithms to use a person's frontal head orientation to approximate that person's gaze direction. I say approximation instead of directly using the head pose, and I will tell you why. The first step is head orientation estimation. I will not go through every detail because this is probably out of the scope of this presentation, but basically it is a minimization problem. We can detect landmarks on the face automatically using computer vision techniques, and when the orientation of the participant with respect to the camera changes, the distribution of the landmarks also changes according to the change of head pose, right? By analyzing the distribution of those landmarks, we can compute the corresponding head orientation. That is the first step. The second step is to solve another problem: in order to track a person's head orientation, we have to see a near-frontal image of the participant. For example, if this is the camera and I am looking at the camera, yes, the camera can catch my frontal face and do the estimation. But if I turn away from the camera, the camera cannot see my frontal face and thus cannot track my head orientation. How do we solve this? The answer is to add another camera here, right? If this camera cannot catch my frontal face, this one can. So I used four cameras to make sure that no matter where the participant was looking within the interaction environment, at least one camera could catch the frontal face and do the head orientation estimation accordingly.
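To give a flavor of what landmark-based head orientation estimation can look like in code, here is a minimal sketch using OpenCV's perspective-n-point solver with a generic 3D face model. It is only an illustration under assumed camera intrinsics and an assumed landmark detector; it is not the algorithm actually used in the study.

```python
import cv2
import numpy as np

# A very coarse generic 3D face model (mm): nose tip, chin, eye and mouth corners.
MODEL_3D = np.array([
    [0.0, 0.0, 0.0],        # nose tip
    [0.0, -63.6, -12.5],    # chin
    [-43.3, 32.7, -26.0],   # left eye outer corner
    [43.3, 32.7, -26.0],    # right eye outer corner
    [-28.9, -28.9, -24.1],  # left mouth corner
    [28.9, -28.9, -24.1],   # right mouth corner
], dtype=np.float64)

def head_orientation(landmarks_2d, frame_size):
    """Return (yaw, pitch, roll) in degrees from six detected 2D landmarks.

    `landmarks_2d` must be a (6, 2) array ordered like MODEL_3D; the landmark
    detector that produces it is assumed, not shown here.
    """
    h, w = frame_size
    focal = w  # rough guess; a calibrated camera matrix would be better
    cam = np.array([[focal, 0, w / 2],
                    [0, focal, h / 2],
                    [0, 0, 1]], dtype=np.float64)
    ok, rvec, _ = cv2.solvePnP(MODEL_3D, np.asarray(landmarks_2d, np.float64),
                               cam, None, flags=cv2.SOLVEPNP_ITERATIVE)
    if not ok:
        return None
    rot, _ = cv2.Rodrigues(rvec)
    # Tait-Bryan angle extraction; which angle maps to yaw/pitch/roll of the
    # face depends on the axis convention of the 3D model used.
    pitch = np.degrees(np.arctan2(rot[2, 1], rot[2, 2]))
    yaw = np.degrees(np.arctan2(-rot[2, 0], np.hypot(rot[2, 1], rot[2, 2])))
    roll = np.degrees(np.arctan2(rot[1, 0], rot[0, 0]))
    return yaw, pitch, roll
```

In a multi-camera setup like the one described, one would run this per camera and keep the estimate from whichever camera sees the most frontal face (the smallest absolute yaw), which is one simple way to realize the "at least one camera sees the frontal face" idea.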
So we designed a number of algorithms, as listed here. We first validated how accurate the head orientation estimation was. The red line is the ground truth, and the blue line is the recorded head orientation estimate. You can see that in yaw, pitch, and roll, the estimation results are pretty precise. It is a little more off here, but the scale, the range of roll, is also smaller. So in general the accuracy looks like this; it is not too bad. Basically, how long is the time frame the camera can capture, is it a second? Oh, the camera system refreshes at, I think, at least 15 frames per second. So that is fast enough to capture any rotation? Yes, any movement. Then we found that when people are looking in the near-frontal direction, yes, you can use the head orientation to indicate their gaze direction. However, if they shift their gaze away from the frontal direction to the side by a large amount, then the head orientation does not really align with the gaze direction. For example, I am looking at you: this is my gaze direction, actually, and this is my head orientation. You can see there is a big difference between the two. How do we solve this problem? Again, we recruited a group of participants, and we displayed a moving marker that moves from this side to that side. We then used the camera system to capture their head orientation. The position of the moving marker was treated as the ground truth of their gaze direction, because we asked them to stare at the moving marker as much as possible. We then did regression and fitted a curve between the frontal head orientation and the gaze direction. You can see the red line is the curve applied in the system, and the shape itself is quite nonlinear. So: first detect the head orientation, then extend the detection range, and then convert the head orientation into the gaze direction. Now the system is ready to go. The interaction logic, the prompting hierarchy, looks like this. Initially, the robot asks the child, "Look." If there is no response, the robot adds more hints, like "Look over there," more words, right? The target also displays different things: initially a static picture, not very interesting, and if there is no response, the target may display audio or video. So from level one to level six, the stimuli get stronger and stronger. This is called a least-to-most prompting hierarchy, which has been applied a lot in psychology and education. This table is very intuitive, but it is in human language. The question is, how can we interpret this logic and implement it in the robot? We need to translate the table into something a machine can understand. This is a piece of pseudocode to implement that hierarchy. Different lines show the inputs and outputs of different system components. Basically, we need a behavioral function, that is, a function that describes the behavior of the participant; then we have an interaction cue detection function; and then we have a prompting function that works based on the output of the interaction cue detection. Using this interaction protocol, the least-to-most (LTM) interaction protocol, we have two goals. The first goal is to make sure that adding prompt levels increases the probability of the expected response. There are six levels of prompts.
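Stepping back for a moment to the head-orientation-to-gaze conversion, the kind of curve fit described above, mapping measured frontal head orientation to gaze direction using the moving-marker data, could be sketched roughly as below. The polynomial form, the sample numbers, and the variable names are assumptions for illustration; the talk does not specify the exact regression model used.

```python
import numpy as np

# Hypothetical calibration data collected with the moving marker:
# head_yaw_deg   = frontal head orientation measured by the camera system
# marker_yaw_deg = angular position of the marker, treated as gaze ground truth
head_yaw_deg = np.array([-60, -45, -30, -15, 0, 15, 30, 45, 60], float)
marker_yaw_deg = np.array([-85, -62, -38, -17, 0, 17, 38, 62, 85], float)

# Fit a low-order polynomial; the relation was described as quite nonlinear,
# with the head turning less than the gaze at large angles.
coeffs = np.polyfit(head_yaw_deg, marker_yaw_deg, deg=3)
head_to_gaze = np.poly1d(coeffs)

def gaze_from_head(yaw_deg):
    """Approximate gaze yaw (degrees) from measured head yaw (degrees)."""
    return float(head_to_gaze(yaw_deg))

print(gaze_from_head(40.0))  # maps ~40 deg of head turn to a larger gaze angle
```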
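And since the talk describes translating the prompting table into pseudocode built from a behavioral function, an interaction cue detection function, and a prompting function, here is a hedged sketch of what that structure might look like. The prompt wording, the six-level content, the response window, and the `robot`, `target`, and `gaze_tracker` interfaces are all invented for illustration; this is not the study's actual code.

```python
import time

PROMPTS = [
    ("Look.",               "static_picture"),   # level 1
    ("Look over there.",    "static_picture"),   # level 2
    ("Look at the screen.", "static_picture"),   # level 3
    ("Look over there.",    "audio"),            # level 4
    ("Look at the screen.", "audio"),            # level 5
    ("Look at the video.",  "video"),            # level 6
]

RESPONSE_WINDOW_S = 5.0  # assumed time allowed at each prompt level

def detect_interaction_cue(gaze_yaw_deg, target_yaw_deg, tol_deg=10.0):
    """Interaction cue detection: did the child's gaze land on the target?"""
    return gaze_yaw_deg is not None and abs(gaze_yaw_deg - target_yaw_deg) < tol_deg

def run_trial(robot, target, gaze_tracker, target_yaw_deg):
    """One least-to-most trial: escalate prompts until the target is hit."""
    for level, (speech, stimulus) in enumerate(PROMPTS, start=1):
        robot.say_and_point(speech, target_yaw_deg)   # prompting function
        target.show(stimulus)
        deadline = time.time() + RESPONSE_WINDOW_S
        while time.time() < deadline:
            gaze = gaze_tracker.current_gaze_yaw()    # behavioral input
            if detect_interaction_cue(gaze, target_yaw_deg):
                return level                          # target hit at this level
        # No response: fall through to the next, stronger prompt level.
    return None  # no target hit even at the highest prompt level
```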
If the participants' performance does not improve after prompt level three, then it is meaningless to add prompt levels four, five, and six, right? So we want the sequence designed so that each level helps improve their performance. And at the highest prompt level, the system should elicit the expected response with high probability; this is the second goal. We want to design the interaction protocol so that eventually the system can get these children to hit the target. Hitting the target means turning and looking at the monitor, OK? As I mentioned, beyond interaction cue sensing and the interaction protocol, integrating everything together is a big task; here I call it system integration. The information flow of the system looks like this. You have a target module to display static pictures, audio, and videos. Then you have a robot module to control the motion and speech of the robot. The outputs of the two modules are observed by the participant, and the participant's behavior is sensed by the gaze tracking module. The output of the gaze tracking module is fed into a supervisory controller. The supervisory controller actually runs the interaction protocol and sends commands to the target module and the robot module. So you can see we form a closed-loop interaction between the machine and the human; a minimal sketch of such a loop appears after this paragraph. This information flow seems easy, but if you want to implement the real system, more work is involved. This is a parallel state chart model that describes the scheduling and parallel processes of the integration. I will not go into the details of this diagram; I just want to show you the potential complexity of integrating everything together. And this is just the figure for a single trial. There were eight trials in one session, and there were four sessions, so you can imagine the whole picture can be very, very complex. After the development of the system, we did another experimental study with 14 children with ASD, and each of them, again, had four repeated sessions on different dates. OK. Basically, we got the same results using this system as with the Wizard-of-Oz system. What does this mean? It means that with the non-contact, non-invasive gaze tracking technology, we can achieve the same goal, we can get the same good results, as when we hired a human observer or controller. So we do not need them in the future; we just use the gaze tracking technology instead. The preferential attention did not change significantly, but their performance did change significantly, and the lower the prompt level needed, the better the performance. This graph shows that we achieved the two goals of the interaction protocol. You can see that when the prompt level increased, the target-hit probability also increased, and eventually, at the highest prompt level, the target-hit probability almost reaches one. One is the maximum; it is a 100% target-hit rate. So the two goals were achieved. You have now seen a case study on joint attention. That was not the only system I developed. Following the same methodology of designing interaction cue sensing, designing proper interaction protocols, and doing good system integration, you can design and develop many different types of intervention systems. This is an imitation learning system for children with ASD. And I also developed virtual reality systems to study how people with autism recognize other people's facial expressions.
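Returning to the information flow mentioned above, a skeletal sketch of a supervisory controller that coordinates the target, robot, and gaze-tracking modules over one session might look like this. The module interfaces, trial count handling, and logging are assumptions; the per-trial logic simply reuses a `run_trial`-style function like the one sketched earlier, passed in as a parameter.

```python
import random

class SupervisoryController:
    """Coordinates robot, target, and gaze-tracking modules for one session.

    `robot`, `targets`, and `gaze_tracker` are assumed module interfaces (not
    from the talk); `run_trial_fn` is a least-to-most trial function like the
    one sketched earlier, returning the prompt level at which the target was hit.
    """

    def __init__(self, robot, targets, gaze_tracker, run_trial_fn,
                 trials_per_session=8):
        self.robot = robot
        self.targets = targets            # e.g. {"left": (module, yaw_deg), "right": ...}
        self.gaze_tracker = gaze_tracker
        self.run_trial_fn = run_trial_fn
        self.trials_per_session = trials_per_session

    def run_session(self):
        log = []
        for trial in range(self.trials_per_session):
            side = random.choice(list(self.targets))     # pick left or right target
            target_module, target_yaw = self.targets[side]
            hit_level = self.run_trial_fn(self.robot, target_module,
                                          self.gaze_tracker, target_yaw)
            log.append({"trial": trial + 1, "side": side, "hit_level": hit_level})
            self.robot.return_to_neutral()               # assumed reset behavior
        return log
```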
And this is another figure showing a social orienting intervention system that I developed. Basically, the system can present social communication stimuli in different temporal and spatial distributions, and we can assess children's behavior within this system. So you can see this type of research makes two kinds of contributions. First, technical contributions: we designed the systems, we did a lot of programming and data analysis. And it also contributes to the science of ASD intervention. First, we understand how children with autism interact with machines, with something that is not human. And secondly, we observed interesting cognitive and behavioral patterns in these children when they were interacting with the machines. There are a few future directions. This graph shows, can you guess what this is? What's this? Yes, facial expression recognition. It would be great if we could also recognize children's facial expressions while we are tracking their gaze direction, right? Because our cognitive status, our behavioral status, can mostly be reflected in facial expression. So we may want to introduce more advanced interaction cue detection methods into the systems. Then we want to model the interaction dynamics between the machine and the subjects. This is very important. Why is it important? First, as I demonstrated with the LTM interaction protocol, many things already exist in psychology, but to implement them in hardware we need some translation, and that is mathematical modeling. And again, people's behavior in short-term human-machine interaction is different from that in longitudinal human-machine interaction. Imagine that you buy a new iPhone: on the first day, you play with it for a long time, right? And then gradually, after two years, you get used to it and you can put it anywhere, right? So it is different. We also want to do individualized interaction. This is important because autism spectrum disorder is a spectrum, which means one participant can be quite different from another. We cannot design one thing and expect it to be suitable for every individual. So it would be good if we can create a model, create a system, that can treat different patients differently. And again, we want to investigate skill generalization. What does this mean? Eventually, we want these children to be able to interact with human beings well, not just to interact with machines well, right? The human-machine interaction is just a procedure, a tool, to help them gain more skills and to apply those skills in real life when they interact with their parents, with their classmates, with other real people. That is called skill generalization. With this slide, I want to give you some ideas about the implications of my research for mental health care in general. The first thing I want to mention is that we need to understand people's expectations. A lot of people ask me, do you think in the future a robot can replace a human therapist? And my answer is no, no, no, no, no. That is not possible. This is because, yes, technology is powerful when we use it to investigate a very specific point, but human behaviors are so complex and comprehensive, and I have not seen any technology that can handle such comprehensive knowledge very well. I can give you an example. Yes, we can use gaze direction to indicate people's attention. Now I'm turning my back; I'm looking at this whiteboard.
But my attention is still on you, right? Because I'm doing a presentation, right? So in order to fully understand people's behavior, people's intention, you have to consider many, many environmental factors. And this is interdisciplinary research, a collaboration between engineers and psychologists. Engineering and psychology are two completely different subjects, and the same word can mean different things to people in the two areas. I can give you an example. When I did my PhD study, a psychologist asked me to design a wave motion on a robot, a wave. Then I was thinking about how I can define a wave on a robot. What should the angle be, right? What should the height of the hand be? All these things automatically turn into numbers in my mind. So this means the conversation between the two sides is very important. Specific engineering design and development also matter. I already mentioned that we need to make sure the system is engaging. If we presented a robot like this one to children with autism, then I guess the kid would run away immediately, right? So actually, we use a robot that looks like this. And how many minutes do I have? OK. OK. So let me quickly talk about something else. It is also important to consider our population and what type of disorder we are dealing with. If we are dealing with ASD, we want to solve social communication problems. If we are designing a system to help children with ADHD, then probably we are dealing with attention problems, right? If we are designing a system to help people with social phobia, then we want to create simulated social communication scenarios. And we also need to pay attention to the age and gender of our participants. For example, here the black lines show the different emotions of adults, this is one of my recent works, and the red lines show the faces of children. They look quite similar, right? The expressions of children and adults look quite similar; there are differences, but not too many. However, if you train machine learning classifiers to recognize those different emotions, then there is a problem. I used the same algorithm, a support vector machine. I trained classifiers using adult data and tested them using adult data; the result is not bad. I trained the classifier using children's data and tested it using children's data; yes, it is good again. However, many existing systems use programs trained with adult data to recognize children's faces. I simulated the same scenario, and you see the accuracy drops. In this case, the disgusted expression of children is even recognized as angry by the adult classifiers, and this is wrong. Then I did the reverse study: what if we use classifiers trained with children's data to recognize adult facial expressions? It is completely messed up; it does not work at all. So it is very important for us to understand the population. And as I mentioned, systematic integration and fusion is very important. In my small-scale studies, the figures already look pretty complex, and if I involve more factors, like more interaction cue sensing and more complex interaction protocols, then the network will probably look very complex, and it does take expertise to handle such complexity. So in the next five to ten years, we would like to do three things. First, we want to extend the user studies.
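A minimal sketch of the kind of cross-population test described here, training a support vector machine on one group's facial-expression features and testing it on the other group, might look like the following. The feature arrays and labels are random placeholders, since the actual features and data from the study are not shown in the talk; only the evaluation procedure is illustrated.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

def fit_svm(X, y):
    """Train an SVM expression classifier on one population's features."""
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
    clf.fit(X, y)
    return clf

# Placeholder features (e.g. facial landmark distances) with six emotion labels.
rng = np.random.default_rng(0)
adult_X, adult_y = rng.normal(size=(240, 20)), rng.integers(0, 6, 240)
child_X, child_y = rng.normal(size=(240, 20)), rng.integers(0, 6, 240)

# Within-group test: hold out part of the same population for evaluation.
aX_tr, aX_te, ay_tr, ay_te = train_test_split(adult_X, adult_y, random_state=0)
adult_clf = fit_svm(aX_tr, ay_tr)
print("adult -> adult:", accuracy_score(ay_te, adult_clf.predict(aX_te)))

# Cross-group test: the adult-trained classifier applied to children's faces,
# which is the mismatch the talk warns about.
print("adult -> child:", accuracy_score(child_y, adult_clf.predict(child_X)))
```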
In almost all research articles, in the conclusion and discussion section, you will see that a limitation of the study is the small sample size. This exists in almost all research articles. There are a few reasons for it, but if we can encourage community participation more, then we can recruit more children and get more data, and thus we can develop better technology. And again, we want to encourage industrial collaboration in this research. For example, the robot that I used cost $9,000. It has 25 degrees of freedom and different sensors: bumpers, sonar, and cameras. However, in my study, I only used the motors on the arms, and I did not use any of the sensors in the robot, which means most of those dollars were wasted, right? Probably I just needed $1,000 of the $9,000, right? So if we can develop other robots that are just suitable for our intervention, without extra unused hardware, then we can achieve lower cost, right? And of course, such research benefits the community. In the future, when the technology is more mature, we may apply it at home and in the clinic, right? And we also want to get more feedback from real-life applications instead of the lab setting. And last, I want to acknowledge my sponsors: NIH, NSF, Vanderbilt, and Michigan Tech. Thank you.

Thank you, Dr. Zheng. I can pass the microphone if anybody has questions. It's a long pass. Yes, thank you for this presentation. Very interesting research. Was there a selection criterion for the autistic kids, for the severity of the autism? For my experimental studies? No, there were no restrictions. I did about 300 sessions across the whole spectrum, low functioning and high functioning. And yes, we did observe differences in different experimental setups. Sometimes we found that a technology could help low functioning children better, and sometimes we found that another piece of technology could help high functioning children better. So it all depends. But what about things that might be disruptive to the experiment, like violent behavior, or severely autistic children who wear diapers? It might be difficult. Yes, it could be difficult. And actually, well, you know, my participants were two to five years old, so it was a little bit easier for me to handle their behaviors. But still, I sometimes got scratches and bruises during the experiments. Those are kind of expected, and we want to develop robust technology; we want to build a system that can tolerate such behaviors eventually. Thank you. Thanks. Hi, yes, two questions. One was kind of going off that point: did you do all the testing at the same time of day? Had they just eaten before, or something? Did these kinds of things matter at all for the results? Did you see any change from doing a test in the morning versus the afternoon, like if that kid had interaction earlier during the day or not, like if they were exhausted? It's a very good question. Actually, in my experimental studies, I did not separate participants that way, and I did not set up a particular time for the participants to come into the lab, because it is difficult: sometimes the family drives two hours to the lab, so I cannot set up very strict constraints. But we did observe that if, say, a two-year-old wanted to have a nap and then we asked them to do an interaction session, the result was horrible. That happened. And did you do any exit or entry questionnaire about their state that day or that week, like if it was a good day or a bad day? Because sometimes it can vary?
We did not do a survey for that particular day, but we did standard psychological assessments of their mental status and behavioral status in general. That includes their history and their recent behaviors, but not necessarily on the same day. And then I was wondering, did it appear viable to replace your camera system, in a sense, with sensors on the robot, especially if you are using something that expensive? So then it is just the robot by itself: if someone wants to use this at home with their kid, then wherever they play with the robot, it is set up, and they do not have to set up a big camera rig or something. Yeah, it is a good idea, and I think it is possible. Because I purchased a commercial version, I cannot break the robot open and insert my own hardware inside. But I think eventually, with help from industry, we could implement a new robot with better cameras, or even with stereo cameras, that can be used to sense the behaviors of children better. Well, I have another question. Did you try this on normal children? I mean, were the same results achieved? Yes. We call them typically developing children, the TD group, and that was the control group when we did our experimental studies. And in almost all the experiments, we observed different results between the TD group and the children with ASD. I have a short question. I am thinking about the four cameras that you mentioned for gaze, for the facial direction. How did you choose four? And I am thinking, since it is a discrete number of cameras, would it be helpful if you just used an array of detectors as opposed to this discrete number of cameras? How did you choose, and would it matter in the end? It is a very good question. Allow me to draw a graph here. If here is a camera, the camera has a view angle. So how many cameras can we put? It depends on the overlap of the view angles. We want the participant's head to be covered in this range. For each camera, the head orientation estimation is usually more accurate when the head orientation is near frontal, and as the head turns, the accuracy decreases. So we also want to make sure that we do not use the edge of the detection range. This does not mean the more cameras the better, because in order to drive each camera there is a computation cost, and the more cameras we have, the slower the system will be. So we want to balance the number and the performance. That is the time for the speaker. Thank you.