Thank you, everyone, for joining this session. Today we'll be talking about the impact of modern directional microphones on spatial listening and speech understanding. This is a big topic of great interest to me. I've worked for many years in this field of technology, specifically looking at directional technology, and I have a personal interest in understanding how directional microphones impact the everyday living conditions of people with hearing impairment. I don't have a lot of time in this talk to go deeply into the technology as such, but I will try to define the problem statement as to why we're looking at directional technology, then describe my research methodology, and then provide you with a summary of the results we have achieved so far. So let's start by looking at what the problem is with directional systems. What we have here is a typical or conventional laboratory setup, where we have a listener sitting right in the middle of an array and a number of sound sources positioned around the listener. As we can see, we have a target location for a sound source, represented by a loudspeaker, and then we have noise sources around it. This is typically what we do in conventional laboratory experiments. We fit hearing aids to the participants on the left and the right, and then we go about evaluating what they do. As an example, I will be talking about the difference between omnidirectional systems and more directional systems like cardioids and beamformers. So let's try to understand what an omnidirectional microphone does in this environment. What you see on the screen, if I can bring that up, is what we call a polar response. The red is for the right-hand side of the head and the blue is for the left-hand side of the head, essentially where the microphones are placed.
This figure describes the sensitivity of the microphone in relation to the direction of the sound source. For sound sources located in the target direction, the microphone has roughly equal sensitivity to just about anywhere on the right-hand side, and then the sensitivity decreases towards the opposite side. The reason is that sounds generated on the far side of the head need to travel over the head before they arrive at the microphone location, and as a result they are attenuated. So what we see here is the acoustic shadow effect of the head on this microphone, and of course the opposite is true when you look at the microphone on the left-hand side of the head. With this configuration, the listener will hear almost everything in the same way, because once we combine the two ears there is pretty much the same overall intensity of sound from the microphones' point of view. So what happens with something like the cardioid, the figure in the middle? The cardioid is designed to suppress sounds from directions other than, more or less, the frontal direction. The problem is that the head sits between the microphones, especially for sources positioned to the left and to the right, and the head casts a shadow on the characteristic response of the microphone. As a consequence, the directionality ends up tilted to one side. This is quite commonly what you observe in real aided listening conditions: the sensitivity of a directional microphone positioned on one side of the head appears not to point directly to the front but is tilted towards that side, and the mirror image is true for the other side of the head. Now, when we talk about more directional systems, here is an example of what such a system looks like in terms of its polar characteristics and directional sensitivity.
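The polar patterns just described can be sketched numerically. The following is a minimal illustration using idealized, free-field textbook patterns (it deliberately ignores the head-shadow tilt discussed above, which is exactly what makes real on-head responses asymmetric):

```python
import math

# Idealized free-field sensitivity patterns (no head-shadow effect).
# theta is the arrival angle in radians, 0 = the frontal target direction.
# These are textbook first-order patterns, used only to illustrate the
# polar plots discussed above; real on-head responses are tilted and
# asymmetric because of the acoustic head shadow.

def omni(theta):
    return 1.0                           # equally sensitive in every direction

def cardioid(theta):
    return 0.5 * (1.0 + math.cos(theta))  # full at 0 degrees, null at 180

def rejection_db(pattern, theta, floor=1e-12):
    """Level in dB of a source at `theta` relative to the frontal target."""
    return 20.0 * math.log10(max(pattern(theta), floor) / pattern(0.0))

back = math.pi        # a noise source directly behind the listener
side = math.pi / 2    # a noise source at 90 degrees
```

With these definitions, the omni pattern rejects nothing (0 dB in every direction), while the cardioid attenuates a source at the side by about 6 dB and nulls a source from behind, which is why the figure in the middle suppresses rear noise so much better.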
What we can see is a far more narrow directional response, focused very sharply in a given spatial direction, and typically that direction is where the target is. By looking at the difference between the three configurations, you can see why this super-directional system would be superior in attenuating the noise around the listener compared to the other two. In terms of increasing directionality, we go from omni to cardioid with the more conventional system designs, and then from cardioid to beamformers. In terms of speech understanding in noise, the research shows that these kinds of arrangements do help you understand the target speech in this sort of noise configuration. Cardioids do slightly better than what you observe with omnidirectionals, and when you move to beamformers you get a significant improvement in speech understanding in noise, because the beamformer tends to focus specifically on the target and attenuate pretty much everything else. So the listener has less trouble hearing and recalling the sounds presented from the target location. Now let's consider something different: what happens when the target moves to the side. Here we have moved the target to the side in the same sort of layout. What we see is that our omnidirectional system still captures that particular sound, even though it has moved to the right; it has no problem capturing that sound. The cardioid is also able to capture that sound location, so the listener can hear the sound fairly effectively and has no trouble doing so. There are some differences between the two, but for the most part the effect is not that different between the omnidirectional and the directional system for a target that has moved to the right-hand side.
The problem comes with these highly directional systems, because of the way they are designed: when the target sound moves to the side, off the main axis of directionality, it is no longer in the path of the narrow beam. As a consequence, the person is not going to be able to hear that particular sound effectively, and in fact the system will start to emphasize something else, like the noise. So in this particular condition, this type of technology will not be a positive thing for the listener and will produce a negative effect. But of course, directional technology has moved a long way, and we now have technology designed to deal with more complex listening situations, as represented by this configuration here. This is just an arbitrary configuration where you see a distribution of noise sources all over the place, in addition to something more complex: you may have more than one target location, and the target locations may be just about anywhere around you. What beamformers can now do is broaden their directional characteristics in such a way that you may be able to capture the information coming from those locations. Unfortunately, at the same time, you also capture some of the noise that is present in those frontal locations. You can also develop technology that keeps a very narrow directionality, but instead of broadening, it tries to move the beam around, searching for that particular target sound source; and when it locks onto the specific target, it emphasizes that particular target location. So it can scout around, searching for the main target speech, and then just lock onto it. Of course, there are some practical difficulties in doing something like that in practice.
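The "scout and lock" idea above can be sketched as a steered delay-and-sum beamformer that scans candidate directions and locks onto the one with maximal output power. The 4-microphone linear array, spacing, and frequency below are illustrative assumptions, not the configuration of any real hearing-aid product:

```python
import cmath
import math

# A minimal "search-and-lock" sketch: a narrowband delay-and-sum
# beamformer steered over candidate directions; the direction with the
# highest output power is taken as the target location to lock onto.

M, d, c, f = 4, 0.02, 343.0, 2000.0   # mics, spacing (m), sound speed, Hz

def steering(theta_deg):
    """Per-microphone phase of a plane wave arriving from theta (degrees)."""
    tau = d * math.sin(math.radians(theta_deg)) / c
    return [cmath.exp(-2j * math.pi * f * m * tau) for m in range(M)]

def scan(snapshot, candidates):
    """Return the candidate angle giving maximal steered output power."""
    def power(theta):
        w = steering(theta)
        y = sum(x * w_m.conjugate() for x, w_m in zip(snapshot, w))
        return abs(y)
    return max(candidates, key=power)

true_doa = 30.0
snapshot = steering(true_doa)          # noise-free narrowband snapshot
candidates = range(-90, 91, 5)
estimate = scan(snapshot, candidates)  # locks onto the true direction
```

When steered exactly at the source, the microphone signals add in phase and the output power peaks, which is the practical basis for the searching behaviour described in the talk; real systems must additionally cope with noise, reverberation, and moving talkers.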
Then we have something a little more advanced and complicated, where what we are actually doing is using a statistical basis to characterize this acoustic space, which is very complex. We combine information about where we believe a particular target of interest is located with this concept of broadening. So it is a combination of the two, in some ways, using statistics to produce something slightly different. What we end up producing is a directional pattern that looks a little unusual, but is more effective at capturing the targets of interest when there are multiple targets, specifically in relation to the noise sources; we get good suppression of the noise sources around them. This looks like a very good solution, but it is fairly new technology, so the question is how effective these kinds of solutions are in real-world listening conditions. And that is why we go about exploring these ideas in the laboratory. So let's now move to our research methods. Before I go into our specific research, I just want to recall something I talked about in past sessions of this seminar series, where I discussed realistic listening conditions. I always argue that a realistic listening condition is one where the sound sources are at different distances from the listener, and where there is a little bit of reverberation, simulating the sort of thing you experience in real-life listening conditions. For example, a cafeteria with multiple people sitting at tables having conversations, where you could be positioned anywhere in that cafeteria, listening to them, and they happen to be at multiple distances. And of course the sounds they generate tend to reflect off walls and surfaces, and they all arrive at you at the same time. So this is a very complex type of background noise.
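A classic textbook example of combining noise statistics with an assumed target direction, in the spirit described above, is the MVDR (minimum-variance distortionless-response) beamformer. This is only a generic two-microphone sketch of that idea, not the proprietary algorithm in any hearing aid:

```python
# MVDR sketch: the weights w = R^-1 d / (d^H R^-1 d) combine an estimate
# of the noise statistics (covariance matrix R) with knowledge of the
# presumed target direction (steering vector dvec).  The result passes
# the target with unit gain while minimizing noise power.

def inv2(a, b, c, d):
    """Inverse of the 2x2 complex matrix [[a, b], [c, d]], row-major."""
    det = a * d - b * c
    return (d / det, -b / det, -c / det, a / det)

def mvdr(R, dvec):
    """MVDR weights for a 2-microphone array (R row-major, Hermitian)."""
    ia, ib, ic, id_ = inv2(*R)
    Rd = (ia * dvec[0] + ib * dvec[1], ic * dvec[0] + id_ * dvec[1])
    denom = dvec[0].conjugate() * Rd[0] + dvec[1].conjugate() * Rd[1]
    return (Rd[0] / denom, Rd[1] / denom)

# Illustrative numbers: a partially correlated noise field and a frontal
# (broadside) steering vector -- both made up for this example.
R = (1.0 + 0j, 0.3 + 0.1j, 0.3 - 0.1j, 1.0 + 0j)
dvec = (1.0 + 0j, 1.0 + 0j)
w = mvdr(R, dvec)

# The distortionless constraint: the target direction passes with unit gain.
gain = w[0].conjugate() * dvec[0] + w[1].conjugate() * dvec[1]
```

The key property is the "distortionless" constraint: whatever shape the resulting polar pattern takes (and it can look unusual, as the talk notes), the assumed target direction is always passed with unit gain while everything else is statistically suppressed.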
In addition to that, when you go into a restaurant, you don't just go in trying to hear sounds and recall them. That's not what we do, although it is what we tend to do in the laboratory. In real-life listening situations we engage in rather more message-level processing: we are engaging in conversations, we are very interested in understanding the context of the conversation, and we also use a lot of memory and many other factors that contribute to our ability to understand. So we developed material to assess this sort of functionality, which we call the dynamic conversational test, or DCT. If you want to know more about this, please go to YouTube, where you can play back some of the recordings we've done in previous sessions about realistic listening environments. This is the sort of environment we are going to use to assess the technology, because it is more realistic. But we need to go a little further than that here, because we're also talking about something else: the sort of thing happening in this pictorial representation of a restaurant, where three people are having a conversation. What's happening here is a more dynamic type of environment, where a listener might be engaging not just with one person but with multiple people. And this is the sort of situation we need to capture in the laboratory to be able to understand the actual effect we will find with this new type of technology. Okay, with that out of the way, let's look more specifically at what we actually did. Just one second, please. On the left-hand side here, what we have is our recruitment process. We identified normal-hearing participants and hearing-impaired listeners, and we fitted them with actual hearing aids.
To enable them to hear the sounds, we fitted them to the NAL-NL2 amplification targets. This is more or less the distribution of hearing losses, shown as four-frequency averages. Okay, I will start by describing the technology we used for this particular study. We used commercial technology that was provided to us by Oticon. That particular hearing-aid technology had two features we wanted to evaluate. One feature is called Pinna Omni, which functions essentially as an omnidirectional microphone; it was our reference condition. We also used the beamformer technology referred to as the OpenSound Navigator, which combines super-directionality with noise-reduction features and uses an additional multi-talker searching algorithm that enables you to engage in communication with multiple talkers. Just to put this in the context of what we described before: we are essentially comparing this type of thing against this type of thing. So that gives us some interesting aspects of the technology to explore. Let's now look at the tasks we asked people to do. One of the first things we asked people to do is, of course, what we always ask people to do: recall sentences in noise. We call that the speech-in-noise assessment. In this particular case, we set out to measure this at approximately 95% intelligibility, looking at what happens in the intelligibility space when you have dynamic targets, as described here, where we positioned targets randomly within plus or minus 45 degrees. The reason we selected this particular criterion was to use it as a reference for the following test, the dynamic conversational test. As you can see, we took the SNR we identified here and added another 3 dB of SNR improvement.
The whole rationale behind that is that while people were doing this particular test, they had full intelligibility of the target sounds. That was the main rationale, and it will become very important as we try to understand what this research is telling us. In addition to that, we selected conversational material from two targets, positioned at plus 22 and minus 67 degrees, or at minus 22 and plus 67 degrees. So they were off to the right and to the left of the listener, and the locations were selected randomly as well. We also asked our participants to tell us the degree of listening effort they experienced while performing each run of the test, specifically of the DCT. Another measure, which will become important although I will not focus on it in this talk, is a measure of cognitive dimensions: we used reaction time as a proxy for the cognitive processing ability of the person to perform a task, and we measured that in quiet. So let's look at some of the results. What we have here is the speech-intelligibility-in-noise result. The Y axis shows the percent-correct speech-in-noise intelligibility scores, and on the X axis we have the two groups: the normal-hearing group on the left and the hearing-impaired group on the right. What you can see is that for the normal-hearing group we were reaching a degree of saturation in the responses, around 98% to 100% performance, so there are ceiling effects. But for the hearing-impaired group we see a larger difference, specifically between the Pinna Omni and the beamformer, the OpenSound Navigator technology. Doing a statistical assessment, we observed a significant difference for the hearing-impaired group, but not so much for the normal-hearing group.
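The way the DCT presentation level was derived, as described above, can be written out explicitly: find the SNR giving about 95% intelligibility on the speech-in-noise test, then add 3 dB so participants have effectively full intelligibility during the conversation task. The logistic midpoint and slope below are made-up illustration values, not the study's data:

```python
import math

# Sketch of the test-level logic: a logistic psychometric function links
# SNR to intelligibility; we invert it at p = 0.95 and add 3 dB.
# srt50 and slope are hypothetical parameters for illustration only.

def intelligibility(snr_db, srt50=-4.0, slope=1.5):
    """Logistic psychometric function: proportion correct vs. SNR (dB)."""
    return 1.0 / (1.0 + math.exp(-(snr_db - srt50) / slope))

def snr_for(p, srt50=-4.0, slope=1.5):
    """Invert the logistic to find the SNR giving proportion correct p."""
    return srt50 + slope * math.log(p / (1.0 - p))

srt95 = snr_for(0.95)      # SNR at ~95% intelligibility
dct_snr = srt95 + 3.0      # the DCT is run 3 dB above that point
```

Running the DCT on the shallow, near-ceiling part of this function is what makes the later result interesting: any benefit found there cannot be a simple intelligibility effect.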
Let's now move to what happened with the DCT scores. Again, the Y axis gives the DCT score in percent, and the X axis separates the two groups, the normal-hearing group and the hearing-impaired group, as before. For the normal-hearing group we did not see much difference; essentially they were very close to ceiling in this comprehension assessment. But we do see differences for the hearing-impaired group, and in fact those differences were statistically significant. What that means is that we see a significant improvement with the beamformer technology over the Pinna Omni for the hearing-impaired group while performing the DCT task. Then we come to the self-rated effort. By the way, the Y axis here is just self-rated effort on a 10-point scale, where a lower number means less effort, so the lower the value, the better for the participant. What we see here is what we would expect: people tend to report less effort when they are listening with the beamformer. That was about the same for both groups, and in fact the significance was quite robust in this particular assessment. So this is very interesting; let's try to understand it in a different way. To understand the relationship between our intelligibility assessment and what the DCT is telling us, I put them together into a mixed-effects model. The model essentially relates intelligibility performance to a number of variables: in this case, the type of microphone, which would be Pinna Omni versus the beamformer, as well as the person's hearing-loss profile, and in addition the reaction time.
I used subject as a random effect to make the model more robust. This is what that particular model outputs, and essentially it tells us that the microphone has a significant effect on the scores. I then did the same, of course, for the DCT: I used the same model and the same parameters as input, and this time I got a slightly different model. The main point to see here is that the two models are quite different. In one you only have an effect of the microphone, whereas in the other you have effects of a number of things, including the reaction time, which is very interesting. I will not be talking about that today, but I just want to point out the difference between the two. My interest is more in the effect of the microphone. What we see is an effect of about 2.3% improvement, on average, as we go from Pinna Omni to the more directional setting in the intelligibility test, and about 12% improvement as we go from Pinna Omni to the beamformer, the more advanced processing strategy, in the DCT. Let's try to understand what that means, and to do that I have created this illustration. The x-axis gives the test SNR for a particular listener, and the y-axis tells us about that person's intelligibility in noise. We decided to run the test at about 95%, which was about the average performance we observed across participants. What we saw was that, at an average of 95%, we got about a 2.3% improvement, perhaps because we are getting into the saturation region of the psychometric function relating intelligibility to SNR. But what's also interesting is that, even though we were very close to ceiling, we saw an additional improvement of about 12% when we tested the DCT, and that's the improvement between Pinna Omni and the OSN beamformer from Oticon.
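For a balanced within-subject factor like the microphone condition, the fixed effect estimated by a subjects-as-random-intercepts model corresponds to the mean per-subject paired difference, which can be sketched directly. The scores below are synthetic; only the roughly 12-point effect mirrors the size reported in the talk:

```python
import random

# Sketch of the microphone fixed effect in a model with subject as a
# random intercept: with a balanced within-subject design, that effect
# equals the mean per-subject paired difference computed here.
# All scores are simulated; nothing here is the study's real data.

random.seed(1)
n_subjects = 40
true_effect = 12.0                   # beamformer minus omni, in DCT points

pina, beam = [], []
for _ in range(n_subjects):
    subject_level = random.gauss(75.0, 8.0)   # random intercept per subject
    pina.append(subject_level + random.gauss(0.0, 2.0))
    beam.append(subject_level + true_effect + random.gauss(0.0, 2.0))

paired_diffs = [b - p for b, p in zip(beam, pina)]
mic_effect = sum(paired_diffs) / n_subjects   # estimate of the fixed effect
```

Note how subtracting within each subject removes the large between-subject variation (the random intercept), which is exactly why the mixed model can detect a microphone effect despite very different baseline abilities across listeners.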
That's very interesting, because it tells us that something different is happening in the DCT compared with the intelligibility test: we are observing an improvement even when people have near-perfect intelligibility. So this story by itself is very interesting, but we can dig a little further into what it means for a person out there in the real world. To do that, I will use this particular diagram. What it shows on the x-axis are scores for a number of participants who performed the reading span test. As you know, the reading span test assesses multiple dimensions relating to cognition, processing speed, and language processing. Because there are multiple dimensions, we collapsed them using a principal component analysis, took the main principal component, and plotted that here. A higher value means better, and a lower value means lower processing capability in terms of cognitive functionality. On the y-axis we have the scores for the DCT, in percent correct; again, higher scores are better than lower scores. What's interesting about this graph is that there is in fact a very nice relationship between the two. Here, what I've done is plot an arbitrary point on the regression line and project it onto the x-axis. Why have I done that? Because we've shown that this technology provides a significant improvement in DCT performance when comparing Pinna Omni against the beamformer. So what we're saying is that we start from a reference at that point, shown by the arrow, and we end up with something a little higher, as shown by that straight line.
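The projection argument above can be made concrete: if you regress the DCT score on the reading-span principal-component score, then an improvement of delta points on the DCT corresponds to an equivalent horizontal shift of delta divided by the slope along the cognition axis. The data points below are synthetic illustration values, not the study's regression:

```python
# Simple linear regression of DCT score on reading-span PC1, then
# conversion of a DCT improvement into an equivalent cognition shift
# (delta_pc = delta_dct / slope).  All numbers are made up to illustrate
# the geometry of the projection, not to reproduce the study's data.

pc1   = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]       # cognition
score = [62.0, 65.5, 69.0, 72.5, 76.0, 79.5, 83.0, 86.5, 90.0]  # DCT %

n = len(pc1)
mx = sum(pc1) / n
my = sum(score) / n
slope = sum((x - mx) * (y - my) for x, y in zip(pc1, score)) \
      / sum((x - mx) ** 2 for x in pc1)
intercept = my - slope * mx

delta_dct = 12.0               # beamformer benefit on the DCT, in points
delta_pc = delta_dct / slope   # equivalent gain along the cognition axis
```

In words: moving a listener up the regression line by the observed DCT benefit is geometrically the same as moving them to the right by `delta_pc`, i.e., their aided performance matches that of someone with a higher cognitive score using the reference microphone.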
What's interesting is when we start thinking about what that means in this space: essentially, we need to move that line a little farther to the right. We end up with a different picture altogether, and what we can see in that scenario is that the performance of a person given this new beamformer technology was comparable to that of someone with higher cognitive function who was using the less directional system, the Pinna Omni technology. That must have an impact on people out there in the real world. There is more research to be done in this space, but this is one of the most significant effects I have seen relating cognitive functionality to how technology interacts with people in this respect. So let's try to summarize. In summary, what we are talking about is this specific environment, with a lot of complexity, in which we are engaged in a very complex task, trying to communicate with multiple people at the same time, and we are trying to evaluate whether a technology such as the beamformer, especially this new type of adaptive beamformer, is able to help the listener function in this environment. What we observed so far was that the word-recall measures did actually improve: even though we were measuring at very high intelligibility levels, near saturation at 95%, we still saw a significant improvement in people's performance when we switched between the two technologies. In addition, we saw a significant improvement in comprehension scores for the hearing-impaired group, of about 12% in performance. And we also saw self-rated effort decrease when we used the more advanced microphone feature, and that was for both groups, the hearing-impaired listeners and the normal-hearing group.
In general, our observation was that the effect of the advanced microphone setting was equivalent to having less hearing loss, as we saw with the regression model, the mixed-effects model. But also, as we can see by the analogy in the last slide, it can be related to improved cognitive functionality, at least in the conversational kinds of tasks the participants were doing, and that must have a significant impact on what people experience out there in the real world. Before I close this talk, I'd like to say thank you to the sponsors for providing the financial support for this research, as well as the technology, and for advising me on how to best utilize that technology. Special thanks go to Elaine, with whom I had a number of conversations trying to understand the significance of the research we have done so far.