Thank you for staying around for what I'm sure is going to be a really interesting panel. It's my pleasure to introduce Professor Ed Delp, who is the Charles William Harrison Distinguished Professor of Electrical and Computer Engineering, Professor of Biomedical Engineering, and Professor of Psychological Sciences. Ed, similar to Al, has had a long and distinguished career, with a number of pioneering innovations in the image and video processing area. In case you haven't noticed, he's also quite outspoken, which at times can be quite intimidating. But after 18 years of having the pleasure of being his colleague, I know what a fantastic...

Only if you're an administrator.

...which I'm not. So I've had the pleasure of, I guess, being on his good side and enjoying many conversations with him. So I'm going to turn it over to Ed to introduce the other panelists and take it away from here.

Okay, thanks. Thank you very much for the kind introduction. So the panelists today: of course we have Dr. Bovik. We also have Qi Guo, who is an assistant professor of ECE; there he is, right next to me. We have Maggie Zhu, who is an associate professor of ECE. We have Greg Buzzard, who is a professor of mathematics. And we have Melba Crawford, who is a distinguished professor in the civil engineering department. I should point out that two of the people on this panel, myself and Melba, have known Al for a long time, and if you were here when he began his talk, maybe I can tell you later about the drug dealer comment he made, about when we were in the jungle in Belize. It really was a drug dealer we ran into.

Okay, so the panelists are not going to make any opening statements; the main goal of this panel is really to engage Professor Bovik and ask some questions. So maybe to start off: Melba, I'll let you ask the first question. Okay.
I should have told you that.

I know, but why don't you start off?

Okay. So I'm not going to ask a civil engineering question. Is this microphone on?

It's on, it's on.

Okay. So we've known each other a very long time. The first time that Dr. Bovik came into my office at the University of Texas, he bent over to come in, and that was my introduction. We've also engaged professionally through the IEEE, and many of you are probably in the IEEE. I was working in the imaging area, more in applications, and certainly not video, but we were both going to the same big conference, the big ICASSP conference, and it was such a drag. There are so many people there; you've got 10 or 12 parallel sessions. What are we going to do in order to actually have a community that can engage in a meaningful way? So he said: we'll start a conference, and we'll have about 200 people there.

Yep.

And so we got our friends together and we had this conference in Austin, with a couple of parallel sessions, and this conference has continued for how many years now?

Well, that was '92, I guess, so going on 30 years.

It was '94.

'94, I guess. 29 years.

29 years, yep. And we said we're going to make it international: the International Conference on Image Processing. It's still 200 people, right?

Oh, that's where I was getting to. It was so popular that now it's got I don't know how many thousand people typically attending, and it's like going to CVPR, where you have people standing around the outside of the room. So how do we professionally engage? Because it's really important; almost everybody here, I think, is a student. How do we engage students at a conference? That's the whole value of a conference: the engagement. So what are your thoughts on that as we move forward?

Sure. Well, you know, COVID was very revealing in this regard, right?
Because we all stopped going to conferences in person, and then we did this sort of online thing where, to be honest, you probably attended very little of the conference. Maybe you saw a talk or something which was pre-recorded.

Which is typically pre-recorded.

Yeah, right. And you certainly didn't interact with anybody, which means you didn't bounce ideas around, you didn't have a personal connection, you didn't form new research relationships or anything like that. So to me, first of all, that in-person aspect is absolutely essential, unassailable. During COVID, multiple of my graduate students, normal kids who would otherwise have been happy, had to go through therapy because of their feelings of isolation. I couldn't see them, they couldn't see each other, they couldn't be in the lab or anything else, and that's just going in the opposite direction of what matters: we are social creatures. We need to come together to learn, to co-educate, to advance knowledge as one.

Now, the size of it just reflects the importance of pictures and videos, because they're everywhere. They're in your pocket; everything is pictures and videos today, and we can't cancel that out. I think some sort of combination of things is starting to happen, because there has also been a proliferation of conferences. There are the gigantic conferences, but then there are suddenly 500 other image processing conferences, too. I think they're going to start to dwindle because of Zoom and so on; maybe they'll go online and that sort of thing.
I think we'll still have ICASSP, ICIP, CVPR, and a few others as well. They'll be big, and we'll just have to learn how to accommodate that, because we need to send our students to those more than anything. I mean, I like to go too, especially if it's a nice place. I want to find Ed there and shoot the bull and talk about society matters and research and everything else. But students: I remember going to conferences as a student, and instead of being like now, old and jaded, walking around like I've seen it all, I was interested in every one of those posters and talks. I sat through the talks, I asked questions, I was really engaged. And that, I think, was so important for my own personal development. Let's not give that up, no matter what. That's what I think. And don't be intimidated when you go to a conference; don't hesitate to go talk to whoever is giving the talk. I mean, I just had about five or six students come talk to me, and for me that was the best part of the talk.

So talking to me afterward was only the second-best part? I was interested in the first part.

Second best. Second best, because we are educators first and foremost. That was the best, because somebody listened, somebody understood something, maybe not the whole thing, and somebody was interested and asked an intelligent question. For me it was thrilling to answer those questions.

So, do the students have any questions they'd like to ask Professor Bovik? We have some other ones we can continue to ask, but does anybody have one? Don't hesitate; just stand up and ask the question. Anybody want to ask a question? You might want to ask a question about his career path, and why he ended up at the University of Texas.
If you're interested in that... does anybody... okay, go ahead. Way back here, go ahead. Okay, and then we'll get to another question over here.

I don't know if you covered this in the lecture, but why did you receive two Emmys?

Oh, well, to be honest, I was working on foveated vision. That's where we account for the fact that your eye has many more receptors in the middle of the retina, so when I'm looking at you, you're in focus, but everybody else is blurry. But in trying to do foveated compression, we didn't have a quality model, so I was searching for one. So myself and Zhou Wang, one of my greatest students, got together and started talking about it. He was saying this mean squared error isn't working, and I said to him: why don't you just make it local? Measure it locally and normalize it. That became the structure term. He went home and came back the next day with the other two terms, and so we had our model, and it worked. We were puttering along, and suddenly the television people started picking up on it, because it actually correlated much more highly with human judgments than anything ever before, and it kind of ran with itself. But of course we became opportunistic as well and tried to engage companies. A company called Video Clarity started marketing it to the world; we gave them all our code, helped them, sent a student there to help them, and so on. But then it kind of took care of itself, and at one point SSIM, as I think a lot of people call it, was controlling more than half of all the moving bits.

So, going back to Amy's question: why did you choose to go to the University of Texas?

Well, I went to school in Urbana-Champaign, and, not to belittle your weather here... no, just kidding.
I did kind of get sick as a kid of the snow and slush, and I liked hot weather. But more importantly, Austin at the time was perceived as the hot spot of the future. I didn't want to go to California, too crowded, and the East Coast, too crowded. I wanted to go someplace that was hot, sunny, fun, and a dynamic environment. They had these initiatives called MCC and... what was the other one? I forget, but huge amounts of money were pouring in.

It's another one of these semiconductor things.

Yeah. SEMATECH.

SEMATECH, thank you, that's the one. Thank you, David. So I went down there, and I loved the town and so on. It was my number one target all along, and I went there, and it's been great. Stevie Ray Vaughan, an outdoorsy place with a lot of outdoor activities, and warm. I liked hot. Now I'm not so hot on the hot, because it was 105 all last summer. I'm getting sick of that, I've got to tell you.

So do any of the other students have any other questions? Otherwise we'll ask one of the panelists to ask a question. Now, I should mention that Thomas Huang said Al has a great future; I listened to him a lot. Okay, you have a question, David. Go ahead. I think it's on.

Okay. So when I was in graduate school, you hated the mean squared error, and I still think that you don't like it. Do you dislike it as much as then, or more, or less?

I love the mean squared error.

You've mellowed with age.

No, no. It's just that I didn't like it being used for picture quality prediction, because it's such a horrible predictor. I don't even make comparisons with it anymore, because it's not worth it. Literally, its correlations are below the worst picture quality prediction algorithms by a big margin. It just doesn't have any perceptual aspect.
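The complaint that MSE has no perceptual aspect is easy to demonstrate numerically: two distortions engineered to have identical MSE can look wildly different, while a local, normalized comparison of the kind described above tells them apart. The sketch below is only an illustration of the idea, not the actual SSIM implementation (real SSIM uses overlapping Gaussian-weighted windows and a separate structure term); the test image and block size are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# A smooth synthetic "image": a horizontal luminance ramp, 64x64.
ref = np.tile(np.linspace(50.0, 200.0, 64), (64, 1))

# Two distortions engineered to have *identical* mean squared error:
shifted = ref + 10.0                                      # benign global brightness shift
noisy = ref + 10.0 * rng.choice([-1.0, 1.0], ref.shape)   # ugly +/-10 speckle noise

mse_shift = float(np.mean((ref - shifted) ** 2))  # 100.0
mse_noise = float(np.mean((ref - noisy) ** 2))    # 100.0 -- MSE cannot tell them apart

def ssim_like(x, y, win=8, c1=6.5025, c2=58.5225):
    """Crude SSIM-flavored score: compare local means and local
    (co)variances over non-overlapping win x win blocks, then average."""
    scores = []
    for i in range(0, x.shape[0], win):
        for j in range(0, x.shape[1], win):
            a = x[i:i + win, j:j + win]
            b = y[i:i + win, j:j + win]
            ma, mb = a.mean(), b.mean()
            va, vb = a.var(), b.var()
            cov = ((a - ma) * (b - mb)).mean()
            lum = (2 * ma * mb + c1) / (ma * ma + mb * mb + c1)   # luminance term
            cs = (2 * cov + c2) / (va + vb + c2)                  # contrast/structure term
            scores.append(lum * cs)
    return float(np.mean(scores))

print(mse_shift, mse_noise)       # equal MSE
print(ssim_like(ref, shifted))    # near 1: a brightness shift barely matters
print(ssim_like(ref, noisy))      # much lower: speckle destroys local structure
```

The local normalization is what gives the score its perceptual flavor: in a low-variance neighborhood the same absolute error costs much more, which is exactly the masking behavior a global pixel-wise MSE cannot express.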
It's just a pixel-to-pixel comparison, which doesn't account for perceptual principles like masking effects and so on.

So, did you have David in class?

I don't think... yeah, you took image processing.

Yes. Yeah.

I used to see how he was doing. He was good.

Yeah, he said he liked the math. I remember he said, "I like the math."

Oh, okay. Turns out he was right.

He was my PhD advisor for, like, a semester. I was sort of standing in; he was the bridge there.

Okay, any other questions from the audience? Okay, otherwise, you ask.

Sure. So my research is about jointly designing hardware and software for future cameras, or visual sensors. So I want to ask you, from the perspective of a hardware maker, that is, from the perspective of video quality preservation: do you have any feedback or wish list for hardware makers for their next-generation cameras? For example, are there aspects you wish cameras would improve on, like higher resolution, higher dynamic range, or flexible frame rate, as you mentioned in the talk, or things like that?

Well, two things about camera technology. One, it's amazing, all the different camera technologies and their capabilities along every dimension, including the ones you mentioned: spatial resolution, temporal resolution, bit depth, everything. That has advanced tremendously. The other thing is that camera manufacturers are incredibly secretive, so it's very hard to know what's going on computationally inside a camera, because it's such a competitive field. In addition, they're having to compete with the iPhone, so it's even more competitive. You certainly don't find out much like that unless you're like Ed; he does legal cases and sees the secrets.

I do see the secrets. I've seen his, too.
So I know; I can't tell him either.

But don't you think some of this computation they're doing in the camera in some ways degrades the quality?

Well, it can. I mean, what I can tell you is that the iPhone does use quality models like this. That much I know; I'm not going to reveal how I know, but it wasn't through a patent case. So they're injecting perceptual optimization, and I think it's great that they're doing that, because it's something that can be done fast and efficiently and be hard-wired, just straight hard-wired code and that sort of thing, as opposed to relying on deep learning. I think the two can coexist, but it's a difficult problem: how do you put in that neural processor, which they do, along with the standard types of processors? How do they not only coexist, but also co-process the visual information? That, I think, is a challenge going forward, just as it is conceptually: how do we combine machine learning, which is this big black box that can optimize anything given the right data, with science and truth? Today I talked about science and truth, which ought to be able to make deep models better. I'd like to see how those things can coexist. As far as improvement along all these various dimensions goes...
...boy, they're on top of that, which makes my job interesting, because then I can develop a model for, well, high frame rate, or better colors, or whatever new dimension comes along.

By the way, are there any questions? Because otherwise I'm just going to follow up on that. So there's this, I don't know, philosophical argument or whatever: we take the picture and we look at the picture, but that's really not telling us what's effectively out there. Do you understand what I'm saying? Number one, the sensors we have only sense certain wavelengths, and we know other animals may have other sensors. So maybe we're not perceiving the real scene; we're perceiving some version of the scene. And then on top of that, we've got all this processing going on inside the camera. Is that a good thing?

Well, what you just said is not only true of cameras; it's true of our own eyes. Right? Because we have a limited range of wavelengths we see, and the brain is immediately compressing, like I talked about, throwing away huge amounts of information. Even so, we still see well. But people think that when you look at the world, or take a picture with a camera, well, you got everything. Yet there are an enormous number of visual illusions that show you how little you're actually seeing when you look at the world. I don't mean other bands like IR or X-rays or that sort of thing; they are important, absolutely, especially in remote sensing. And by the way, these models work for those.

Well, that's my next question.

Okay, well, hold off on that. In fact, the brain is doing, just like deep learning, an enormous amount of, in the broadest sense, extrapolation and interpolation to understand. That's why visual illusions fool people: the brain is trying to figure out, what is it?
It's trying to figure out: how does this fit into the real world? And with these special designs, that sometimes produces wrong answers.

All right, any other questions from the audience? From the students, any type of question? Okay. All right, if not, Professor Zhu.

Hi. So I want to ask a question kind of related to quality and compression, because I do some research in that area, particularly in compression. In your talk you also talked about your collaboration with these companies. They're always trying to push the boundaries of how much we can compress the data, right, while still having the best quality. So there's always this tradeoff between what we call rate and distortion, or rate and quality. You have covered a lot from the human perception point of view, and how we judge whether it's giving us the good quality we like to consume. But I think there's also a big emerging trend where this massive amount of visual data we have acquired is processed by machines, which do not mimic our human vision. So my question is: do you think quality, in that sense, for data that are processed by machines, still plays a role? And what might that role be?

Oh, sure.
I mean, there are definitely studies, and we conducted one not long ago, showing that machine vision tasks, number one, are affected by picture quality. It doesn't matter how big a deep network you have; if the pictures coming in are of lower quality, then it's going to have a harder time doing recognition and that sort of thing. Another study we did was just on face detection. Suppose you take a powerful face detector; it doesn't have to be a big, huge deep thing, something like what's in your camera, a Viola-Jones type face detector. If you also supply that algorithm with some quality features, like the ones I just talked about today, then it can do a better job at detecting faces, because it can learn to account simultaneously for detecting faces and, somehow, for the quality of the picture and how that might affect the detectability of faces. So very much so; you can include that. Now, with that said, if you have gigantic databases: ImageNet is full of distorted pictures, so to some degree, when you're training on ImageNet to do image recognition, it is learning not only to do image recognition, or classification
I should say, but to do it on distorted pictures. So it has to learn structural things, like the models I talked about. The early layers of deep models look very much like the same band-pass filters I've been talking about; they learn the same things your visual brain does. But also further in, and I don't have proof of this because it's further into the network, where nobody understands anything (explainability!), certainly if it's learning distortion, at least to the extent of being able to classify in the presence of distortion, it's learning that these natural statistics are modified in some way. How it's learning that may be not explicit but very implicit, somewhere in the embedding, which is my favorite word in machine learning. What's at the seventh layer? Well, it's the embedding. Okay, what does that mean? I thought it was the features. Well, that's the embedding now; it's a much cooler word they've come up with. A feature map is an embedding.

So I think the deep models can learn it along with their task, but if you can inject this information, this basic truth, in there, then they can learn it faster, converge better, use smaller models, that sort of thing, very definitely. You know, Jitendra Malik was kidding with me one time. He said: well, all this stuff you do is cool, all this stuff I do in computer vision is cool, we're all doing cool stuff right now, but at some point a big box is going to replace all of it. Is that going to be true? I don't know. I hope not. I'll be retired by then and won't care, in the same way.

Following up on that: how about if you take the attitude, the hell with it, I don't need the camera.
I'll just generate. And I could even send you the parameters over the internet so you could generate the movie. Then I'm not going to have any quality issues.

Yeah, well, I'm going to say two things, forgetting about the quality. Number one: my kids went through their teen years during the Marvel era, at its peak, and I am just so sick of that. I took them to every single Marvel movie; I've seen 23 or more of them, whatever it is now at this point. There's so much generative stuff there, and it's such a relief to go see real people acting and talking, with an emphasis on the quality of the acting. Second thing: you can't watch news that's been generated, news videos.

That's coming.

News videos can't be generated.

That's coming. I mean, they might be generated with evil, my dear friend.

With malintent, perhaps. Yes, generated with malintent is definitely coming. But the real news, the honest news, can't be generated. So these are just two examples. Certainly cinema you can generate; maybe the actors will all disappear. But do we believe that a machine learning system anytime in the near future can convey the sensitivity of a Meryl Streep, in representing some sort of personal expression, attributes, nervousness, that sort of thing? I think we're far away from it.

Well, we are, and I agree with you more than I don't. But I think the problem is there are a lot of people taking the attitude that we're going to have all this generated content, and I maintain that all this generated content is not good for our society either, beyond the image processing implications. I think it's not good for our society.

Yeah. I mean, you could probably have a generative model that could learn Jack Nicholson, with his peccadillos and strange behavior, just the way he is, a wonderful actor.
I mean just the way he is wonderful actor But it can't create the next jack Nicholson with that Actors incredible, you know peccadillos and strange attributes and mannerisms and things and so on that would make That actor great and unique and so on, you know, they can't create marlon brand They'll can't create meryl streep can't create, you know betty, you know, whatever. Let's see what happens You know, I'm not impressed so far neither am I but I'm just gonna say let's see what happens Is there any questions? Invasion with the audience Go ahead, please. Hi I'm sorry. It might have been mentioned in the first part of the talk but Uh As a visual scientist, what would you say the top Features of the human perception that is currently missing in the machine perception And what are the possible directions that you know that research should go to incorporate that? Well, my viewpoint is You know, I'm constantly reading the visual science literature and also doing research in just visual neuroscience at times I mean I spent a good part of my career just publishing in vision science journals and that sort of thing Trying to see What new discoveries are forthcoming that I can bring Into video engineering whenever I find one that I might think might be useful like you know, major things have been like foveation Natural scene statistic models, you know things like that adaptive gain control all these kind of things when I find them I try to bring them into some kind of algorithm. Okay, I mean Why do why is you know the word have this why let me give you this example And this is going the other direction your question. I think it's just too interesting. So I never answered the question to the audience of Why does the visual system of the human brain extract do the same processing that extracts that Gaussian noise? Why? 
The answer, in my mind, is so that it can perceive distortion. So for example, the crystalline lens of your eye adapts to bring things into focus, and it does it using processing in visual cortex, back here. When you change the degree of focus, it changes the statistics of what is being perceived back here, and it adapts until you get just the right focus at that point in your depth of field. I think it's using this Gaussian noise to control it, basically. I know that was in kind of the opposite direction.

What about conserving energy, too?

Yeah. Go ahead, I'm sorry.

I mean conserving energy in the sense that the band-pass processing and everything becomes efficient, that sort of thing.

Let me put it this way. If we didn't have foveation, then we'd have to carry our brains around in a wheelbarrow. That is, if our whole field of view were at the same resolution. And on top of that, if we didn't have this up-front reduction of excess information by efficient encoding, then we'd need a very, very large wheelbarrow to cart our brains around. Processing all that information is, of course, expensive; it consumes energy, generates heat, all that kind of stuff.

Remind people what you mean by foveation.

Well, foveation, again: when you're reading, I can see the word "compression" here, but I can't read any of the words around it when one of my eyes is pointed there. That's because if you take your retina and lay it flat, then right in the middle there's a very high density of the photoreceptors, the cone cells primarily, and as you move away from the center, the fovea, the density falls off very fast, so it's much sparser sampling. So what you're seeing out here, in what's called the visual periphery, is subsampled. And it's really amazing, because what do the eyes do?
I mean, everything's blurry except right where I'm pointed. How do I see Amy? I move my eyes. So we have this fabulous feedback control system of visual attention, moving around quickly, sometimes pointedly, to what I need to see, allocating all my visual resources toward what I want to look at, whether it's Charlie or Amy or someplace else, or when I'm driving. You're constantly saccading your point of gaze around the visual field. It's super efficient to do that, but your perception is that you've got a very high-quality, high-resolution scene in front of you, even though really it's only what you're effectively looking at. What I would say is, you have high resolution in an area about the width of your thumb at arm's length: one degree of visual angle, about the same size as the moon. But you have the rest at declining resolution, and all that context is important. You're driving, and if you're turning right soon, you see a fuzzy building in your periphery, but you know there's a street sign, there's a road there, and then you point your gaze toward it and know exactly how to turn. So it's all important: you need the peripheral information as well as that little area of high resolution. It's absolutely amazing the way we developed that.

Your microphone.

It's coming? I know it's not on. Okay. Oh yeah, so I have a thought. We had a distinguished lecturer, I think last year; it was Bill Freeman. You know Bill Freeman?

Yeah.

He gives a talk where he talks about wings versus feathers. Maybe you've seen this?

No.

Okay. It's an interesting concept. The idea is that a wing is intrinsic to flight, but feathers are an adaptation that birds happen to have, and that works.

Okay, that's true.

And that would be an example.
So in other words, some things are useful but not necessary, whereas other things must be there for vision. The question he's raising is: which things do we think are wings, and which are feathers? So my question is: is foveation wings or feathers? Do you think it's something really impressive that humans developed because of their limitations? Or, since we're not really using foveation that much in real vision systems (maybe I'm wrong), do you just collect all the data and not worry about it, and just have a computer the size of a building?

Well, keep in mind there's a computational advantage of electronics over our brain. What we have is massive parallelism, but we're a very slow processor: neurons fire on the millisecond scale, nothing like what's available in computers. So there have to be other efficiencies, and foveation is one of them, and I would call it absolutely a wing. In fact, every creature that has a complex eye has some sort of fovea to allocate visual resources in that way. Horses have horizontal foveas, and the reason why is that they have these eyes that look outward across the plane. They want to see whatever predators are coming their way, and they don't care what's in the sky; eagles don't attack horses. Snakes they just stomp on. But whatever is in that one horizontal band of the field of view, all the way across, that's their fovea, their region of interest. And there are other shapes of foveas, too. Amazing.

Yeah. There's a question. Oh, I'm sorry, right here. All right, why don't you go first, and then we'll have the gentleman in the back.

Hi, Dr. Bovik.
Hello.

So, since the Gaussian distribution came up, and in your lecture you mentioned that if the image is poor quality it's far from the Gaussian point, and if it's good it's near the Gaussian point: does that mean that the underlying distribution of images or videos, the spatiotemporal distribution, is Gaussian? And if not, then why do we relate it to the Gaussian point, the Gaussian distribution?

Yeah. So again, if you look at an image and calculate its distribution, it's very non-Gaussian, generally, right? You can get any shape of histogram, basically. It's only after you do this reduction by a band-pass process, removing the extraneous information (separating it; it may not be useless, it's used for something else), that you end up with this Gaussian residual. Why is that true of pictures and videos of the world? That is still a mystery. It is not because of something like the central limit theorem; it's not an additive effect that creates that underlying Gaussianity of the world. So it's a bit of a mystery, another one of the magical things about the Gaussianity of the world. What I can say is: we don't know why it's true, but it's wonderful that it's true.

But again, consider the vision system, or whatever the distribution of the world might be, a neural network. What does a neural network do? Statisticians will tell you: basically, it learns distributions, especially a generative model, which is just learning to create images having the distribution of whatever signal it's trained on. So it's learning a distribution.
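The "band-pass then divisively normalize" operation referred to here can be written down in a few lines. Below is a rough sketch of MSCN-style (mean-subtracted, contrast-normalized) coefficients, the kind of residual used in natural-scene-statistics quality models; the box window, the constant in the denominator, and the synthetic test image are all arbitrary choices of mine for illustration. Real models use Gaussian-weighted windows and real photographs, for which the resulting coefficient histogram is famously close to Gaussian.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(1)

# Stand-in for a photograph: white noise blurred into a correlated field,
# scaled to a 0-255-ish luminance range.
raw = rng.standard_normal((128, 128))
pad = np.pad(raw, 2, mode="reflect")
img = 128.0 + 50.0 * sliding_window_view(pad, (5, 5)).mean(axis=(-1, -2))

def mscn(image, half=3, c=1.0):
    """Mean-subtracted, contrast-normalized coefficients over a
    (2*half+1)^2 box window (real models use a Gaussian window)."""
    p = np.pad(image, half, mode="reflect")
    win = sliding_window_view(p, (2 * half + 1, 2 * half + 1))
    mu = win.mean(axis=(-1, -2))         # local mean (the low-pass part)
    sigma = win.std(axis=(-1, -2))       # local contrast
    return (image - mu) / (sigma + c)    # band-pass residual, divisively normalized

coeffs = mscn(img)
print(coeffs.shape)                  # same shape as the image
print(coeffs.mean(), coeffs.std())   # roughly zero-mean, with bounded spread
```

The subtraction removes the local mean (the "extraneous" structure), and the division by local contrast is the adaptive gain control step; on natural images this is what leaves behind the near-Gaussian residual discussed above.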
So this neural network, the brain, does the same thing: it learns the distribution of the natural world, which has this underlying Gaussianity. The neurons in the various brain centers that I pointed out learn to extract that Gaussianity and separate it from the signal structure, which can be used for recognition, classification, navigation, and so on. So the brain separates those while also achieving high efficiency through encoding, and it does it all with the same processing, which is kind of miraculous in itself: by doing efficient encoding, you separate out this noise signal. Well, nobody knew what to do with the noise signal. My thought was, for algorithms, let's do quality prediction, and I think the brain does that too. So my speculation, and then we'll come over here, about why we do this: I think it goes back to efficiency and survivability. Survivability, in other words: you need that level of vision that can give you that resolution when you need it, and you can focus it so you won't get eaten by something. Absolutely true. Just like our color responses: we see most strongly in the green-yellow range because we have a yellow sun and a lot of things you can eat are green. And red is important: bleeding, and red berries are edible, that sort of thing. And our blue sensitivity is much reduced, because it's not as important for survivability, just the occasional blueberry, or lying back in the grass looking at the blue sky and relaxing. So the neural system adapts for survivability. Of course. Yeah, definitely. Okay, we have a student back there. Go ahead. Thank you for the great talk. So we have been talking about video quality, and especially with video streaming, quality is very important, but there is the latency aspect as well, right?
We might want to optimize for quality, but the latency might affect human perception. So what are your thoughts about that? So when you say latency, are you saying, like, if you're watching a show on television and somehow it's delayed? Not so much television as internet streaming: say the chunks don't arrive in time and there's rebuffering. Oh, yeah. So, video quality prediction can be viewed as part of a broader area of research called quality of experience. Quality of experience can include a lot of things, such as audio, even haptics, but also temporal effects that are not distortions in the same sense. Such as: you're watching Netflix, and suddenly, because the bandwidth gets so low, the buffer in your television empties out, and it freezes, and you get the darn spinny in the middle. And you're like, oh, and then you're going to change the channel. So we've done a lot of human studies on that. Modeling it is totally different. We don't have a model for what the brain does when a video freezes, because in nature you never encountered this. Out in the jungle, it doesn't freeze; you don't see spinning wheels all the time. So there's no brain model; instead, it becomes a behavioral model. How annoyed are you by the duration of a freeze, or how often freezes occur? That's what our human studies have focused on, and we've created predictive models.
They're very accurate. It's a bit harder because these events happen over long movies as opposed to short clips, so you have to run the human studies over long time spans. So: no brain models, and the natural scene statistical models I just talked about don't really apply the same way, if at all. It's more of a behavioral thing, and so you do need machine learning for it in the end, because we don't have a good model. We can model a freeze: okay, it's frozen. That doesn't tell you much. The brain response is, well, it's frozen. Okay, thank you. Any other questions? Okay, David, go ahead. All right. So there's a lot of interest in the tactile internet, conveying tactile information, which could be converted to an image in some sense. Do you have any feeling about whether some of these tools could be used for assessing the quality of a tactile signal? So tell me a little bit more. Okay, I'm a user of the tactile internet. What am I experiencing? So you put on gloves and it's the sensation of pressure, for example. Yeah, so it's for an immersive environment. It's haptics. Yeah, it's haptics. Okay. So, I mean, golly, after all, these are neurophysiological responses that we're experiencing, and you'd like those to be faithfully conveyed. So first of all, in the context of what we're talking about today, certainly there is a quality to the accuracy of that. Is my tactile experience accurate? Does it feel like I'm actually picking up a vase, or something like that, when I'm engaging in this immersive environment? So realism: I think it's more of a realism thing. But it could also be distortion, because you're trying to transmit it over long distances. Maybe that as well.
So I think there will be similar, not identical, but similar kinds of issues. That said, I talked about how there's this band-pass processing and divisive normalization, where you basically divide by the variance of the signal at neighboring neurons. This isn't just a vision thing. It happens in audition, in hearing; it happens in touch; it happens throughout your neurosensory system. Neurons generally are normalized, for that same reason of compactifying and also efficiency. So we'll use some sort of analogous models in that context as well, I think, for similar issues. Sure. Okay, maybe we'll go back to the panel and get the view of a mathematician. Great. So earlier Ed trashed neural networks in just a couple of sentences, but I wondered if you might offer a more nuanced view. What things have gone wrong? What kinds of engineering or mathematical or software principles might benefit the whole neural network field and its application to video and images and so forth? What was the first part of the question? So, how might mathematics benefit machine learning, mathematics and engineering principles? I mean, it seems like there's kind of a gold rush, and people just keep throwing stuff against the wall and seeing what happens. So what kinds of principles, more engineering principles, or mathematical or software design principles? So, a friend of mine is David Donoho, the great statistician. The last time I saw him, we were talking about machine learning, and he goes, you know, Al, signal processing is dead. And I said, really? I mean, I think that people do creative things. And he goes, nah. And you know what, Al, and he's a statistician.
He said, you know what, statistics is dead. And of course he's joking, right? Because what he's saying is that so many of the tasks that we've been doing with clever hand-crafting and thinking and design and engineering work and so on are just being replaced by these big optimization boxes: statistics, signal processing, so many other places. But once again, how much creativity has been there? Well, there's been a lot, I would say, and they certainly get a lot of citations, but today it's still three basic architectures, most of it invented back in the 1980s. So I think we should be injecting principles of science into these things. If you're operating in medicine, yeah, these things can find lesions in mammograms probably better than your average radiologist at this point. But if you could uncover the underlying physiological mechanisms of tumor formation, how tumors spread and form stellate lesions and so on, then you could probably augment the learning system with that. You can't expect a machine learning system, even with a billion parameters, to learn all truth about any problem. So if you know truth, give it the truth. That said, there are going to be challenges. I think if you're a PyTorch programmer, you've got a job issue coming up. It's good to learn to be a data scientist, one who can collect data, who understands the science of whatever the problem is, so you can learn how to collect data that you can then shovel into the deep model. That sort of thing, I think, is an exquisite science. We find that to be so in doing psychophysical studies.
Okay, they're very carefully designed experimental protocols, so that at the outcome, because we don't want to waste dozens, hundreds, or even thousands of people's time, we get a successful result that will actually affect future algorithms. So, mathematics: obviously there are a lot of branches of mathematics that have nothing to do with deep learning, and they're all going to stay active. But in the areas that are affected by it, like signal processing, a very mathematical kind of thing, one wonders to what degree a mathematician in that area is going to survive. Well, you see people trying. Donoho recently had a result on a structure that always seems to appear in the final layer of a deep net; it has a certain lattice structure or something like that. I think slowly but surely these kinds of thinkers will begin to dissect deep models of every variety, and we'll begin to understand, over time, perhaps with the assistance of our artificial friends, what's actually happening in there. Explainability. Just like the brain: we understand brain centers, and we understand coarsely what goes on where, but only in certain places do we understand enough about the brain to use it in mathematical models at this point. Maybe we'll learn more about that. Any other questions? Yes, somebody's got their hand up. There we go. In terms of the future of video quality, do you think about, like, creating more iPad kids, getting children addicted, and the implications of improving this technology? Well, yeah, okay. If you're worried about kids, what I worry most about is augmented and extended reality, VR, that kind of thing. iPads are pretty innocuous. But still, I think kids who sit in front of a flat screen for many, many hours a day...
I think they will tend to have less proprioceptive capability to interact with the real world. Maybe they'll be less athletic, less agile, less coordinated, less whatever, than kids who are outdoors playing, where their brains are adapting to flying footballs or hanging on branches, that sort of thing. So I worry about extended reality, and I tell this to people at Meta and at YouTube, the places that are creating these kinds of devices, and they've started to listen; they put warnings on and so on. But I still worry about the six-year-old kid who puts on a VR helmet for six hours a day, where the content has not been perceptually corrected, geometrically or statistically, to agree with our real world. Anime, that's not real. So their brains are adapting to that content. And what does it do to them? I don't know, but it can't be positive. They might enjoy it, but it can't be positive. And I wonder whether we're going to see stranger behaviors, and deficits, and fewer capabilities, less critical thinking. Well, that's sort of a meta thing, but less critical thinking, depending on the nature of the content. I won't mention politics. From a guy from Texas. Awesome. Yeah, okay. Okay, I got it. Okay, any other... yes. Thanks, Professor Bovik, for your talk.
So, in visual quality, we want to correlate with humans, right, with the human visual system. So if we see some metric whose correlation is close to, let's say, 0.9, which is really good: do we think the remaining 0.1 is due to the subjectivity of humans, because some human observers will be subjective, or is there still something missing that can be fixed? Well, one thing that's nice about my field, and that helps us create our models successfully, is that when it comes to picture distortions and quality, humans are generally in high agreement. So if we create a large data set, showing hundreds of people thousands of distorted pictures, and we divide the people into two groups and then correlate their opinion scores, we get very high correlations on normal pictures and videos, maybe 0.95 or 0.96, something like that. You can view that as kind of an upper bound on the performance of any predictive model. Any model that goes beyond that is overfitting, because you cannot do better than the actual humans predicting themselves. So it doesn't make sense. Now, depending on the modality, those correlations can fall. When you start getting into synthesized content and so on, or immersive content, where there are many more variables, so many different places you can be looking and that sort of thing, these correlations can fall. But you're right: the level of correlation depends on the situation, the content, and so on, and it's sort of an upper bound. Did I get your question? I'm wondering if I got the tail end; the other part of the question was, is there something still missing there?
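The split-group consistency check described above, correlating the mean opinion scores of two halves of the subject pool to get an upper bound on model performance, can be sketched as follows. This is a hypothetical synthetic study; the subject count, image count, and noise level are made up for illustration:

```python
import numpy as np

def split_half_correlation(scores, rng):
    """scores: (n_subjects, n_images) array of opinion scores.
    Randomly split subjects into two groups, average each group's
    scores per image (mean opinion score), and correlate the two
    MOS vectors. High agreement between halves is what makes the
    result usable as an upper bound for objective models."""
    n = scores.shape[0]
    perm = rng.permutation(n)
    group_a, group_b = perm[: n // 2], perm[n // 2 :]
    mos_a = scores[group_a].mean(axis=0)
    mos_b = scores[group_b].mean(axis=0)
    return float(np.corrcoef(mos_a, mos_b)[0, 1])

# Synthetic study: 100 subjects rate 200 images; each rating is the
# image's "true" quality plus per-rating subject noise.
rng = np.random.default_rng(1)
true_quality = rng.uniform(1, 5, 200)
ratings = true_quality + rng.normal(0, 0.7, (100, 200))
r = split_half_correlation(ratings, rng)
print(r)  # high, near 1: the halves agree with each other
```

A predictive model whose correlation with MOS exceeds this inter-group figure is, as noted above, almost certainly overfitting the particular panel of raters.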
Oh, sure. Part of what we don't include in the model I described is content; we don't really talk about it other than at a very low level. And there's also the separate question of aesthetics. If I show you two pictures with the same amount of distortion, one a beautiful car, a little bit distorted, the other a garbage pile, most people will say the beautiful car picture is higher quality. It's just natural, and that's what we call aesthetics. So that's a content effect, and we need to understand more about it, and that will help our models. We're doing that a little bit. We have algorithms; there's one called RAPIQUE. It's for the blind (no-reference) situation, and it has two channels. One uses a lot of these statistical features and is purely distortion-sensitive. The other is a semantic channel, basically a pre-trained network for whatever task, which could just be ImageNet-trained. Then we feed them both to a shallow learner, and performance leaps up, because we learn about content too. Thanks. Still wide open, though. Okay, I think we're going to begin to wrap this up, so maybe one more question. Does anybody have one more question they'd like to ask Professor Bovik? Charlie, you have a question? If not, we'll make up something. Well... microphone... here. Great, here. Greg, you're straining. Yeah, what do you think the future is, particularly for the young people in the room? Where are all of these fields going to go? Where do you think the most important directions are? And it could even be, well, maybe you should at least be in science, engineering, math; it doesn't have to be restricted to your particular area of research. Oh, well, you know, it's always trying to attack areas where we don't understand things. I'm going to go back to the math question.
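As an aside, the two-channel fusion just described, distortion-sensitive statistical features concatenated with semantic features from a pre-trained network and fed to a shallow learner, can be sketched in miniature. Everything below is synthetic stand-in data, not the actual RAPIQUE pipeline; the feature dimensions and the ridge regressor are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(2)
n = 300
# Hypothetical stand-ins for the two channels: in a RAPIQUE-style
# model one channel would yield natural-scene-statistics summaries,
# the other embeddings from a pre-trained semantic network.
stat_features = rng.normal(size=(n, 16))
semantic_features = rng.normal(size=(n, 32))
# Toy ground-truth quality that depends on both channels.
mos = 2.0 * stat_features[:, 0] + semantic_features[:, 0]
mos = mos + rng.normal(0, 0.1, n)

# Fuse the channels and fit a shallow learner on a training split.
fused = np.hstack([stat_features, semantic_features])
model = Ridge(alpha=1.0).fit(fused[:200], mos[:200])
r2 = model.score(fused[200:], mos[200:])
print(r2)  # held-out R^2; high here because both channels carry signal
```

The design point is the one made above: neither channel alone explains the target, so fusing distortion and content information is what lets the shallow learner's performance jump.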
I'm going to go back to the math question Okay, how can we ever understand deep networks? Okay. I believe that we have to invent A new kind of mathematics Really, I mean it's easy to say. Oh, it's just interpolating we can use interpolation mathematics Or we can talk about it statistical inferencing And all that kind of thing but really as these models become more and more abstract Okay, and start to represent, you know Concepts ideas metaphysics. Who knows what okay, then we need some sort of symbolic mathematics that we don't currently have And so somebody who is and I know this is about as vague an answer is like even That's what I was looking for vague, you know You know, then I think I think that's a fascinating direction to go for mathematics to try to Dispense with its current tools and try to invent new tools for this new Thing which is called these this massive network, you know, maybe we'll help us understand the brain better as well So that's an off-the-wall kind of answer. Otherwise, I would say, you know, don't be a pie torch programmer Okay, all right, because there's going to be a way over abundance. So half of them are going to be fired soon You know try to be have a science Application in mind and try to become a person who can collect data. I know I said that before But you know my students right now are all going to be in hot demand because they're all conducting human studies They're they become world leaders in that experience In being a data collection they become a unique Scientist at that point, you know, so you can do human studies and computer vision too We're doing a study now of you know, how do you convert from a portrait mode to a landscape mode? Okay, well, let's have humans pick the best portrait from a landscape picture If you have a lot of data like that and we will then the iphone will do it just right like a human would like right instead of some You know algorithm Okay, well, thank you. 
I want to thank the audience, I want to thank the panel members, and most importantly I want to thank Professor Bovik for coming here to visit us. I hope everybody enjoyed it. Thank you very much, Al. It's always good to see you. All right, fun. All right, thank you for all the great questions, fantastic. Now I can see why Purdue is one of the greatest schools ever. Thank you, great people.