Thank you, Judith, for inviting me and for that introduction. What I'm going to do is take, I'm told, 15 or 20 minutes and give you a summary of this project. I don't really have specific questions that I want to put to you, but what's happened to this project that these slides don't represent are some, I think, interesting follow-on projects that we're now exploring, which will in various ways scale up and translate a lot of the technologies into different applications. And I think a lot of interesting legal and ethical questions are going to come up, so hopefully we will be able to explore some of those. These are the members of my group, and the ones shown in high contrast have all at some point contributed to the project; in particular, Philip DeCamp and Brandon Roy are the two primary people. So the point of this project is to record and analyze and ultimately to model aspects of how children learn from speech in the home. If you watch my little animation here, that's where the term Speechome comes from: it's a made-up word, speech in the context of the home. And the Human Speechome Project, of course, purposely plays off of the Human Genome Project. We talk about genotypes and phenotypes, so what is the phenome as opposed to the genome? Well, one aspect of the phenome is the behavioral manifestation of a phenotype, and the Speechome you can think of as a data set which captures one facet of behavioral phenotypes. All of the implications of capturing that, analyzing it, understanding developmental trajectories, developmental disorders, all of that can come out of the data analysis that I'll talk about.
So our initial goal, and really the primary thrust of this project, a decade in the making, was to advance our understanding of how children acquire language in natural contexts. So more than particular characteristics of speech: understanding language and the construction of meaning. And the approach that's key here is to collect data that has three characteristics. One, it's longitudinal, which means we're collecting over the course of many months. Two, it's ultra-dense in its sampling rate, up to 10 or 12 hours a day of data. And three, it's in vivo. So rather than observing kids in observation labs, bringing them into some kind of infant observatory, we want to go into the home, which is where you've got natural social interaction. And then we want to couple that with new tools to actually deal with the data, because when you put longitudinal and ultra-dense together, what you get is a huge amount of data. Some of the differentiators: comprehensive observation. If you are in the field of child development or child language acquisition, what's typical is you send your graduate student into someone's home, or you bring a kid and a mom into the lab, and you get a couple of hours of audio. If you're going to do a longitudinal study, you might collect one or two hours of audio recordings once or twice a month for a few months. That's the typical data set, which leads to a very weak foundation for any kind of theory, because it's very sparse, incomplete data. And a lot happens. Anyone who has children or has observed children grow knows things happen in the course of days, so if you're sampling every month, it's really not good enough for a lot of purposes. The other differentiators are minimizing the observer effects of having someone with a tape recorder or a camera in the room, and developing new tools.
So this is a one-slide summary of the entire project, although the initial thrust remains to understand child development. One of the things we're now looking at is applying this technology to detect, characterize, and treat certain kinds of developmental disorders. That's a direction we're heading, and I'll come back to it at the end of the talk. Then there are various things like video scrapbooking and parenting aids; as parents, we are recording our own home life, and there are a lot of interesting possible applications in those directions. Retail behavior analysis: if you own a retail storefront of any sort and you've got cameras in there, you can do what you want with that data. Or that's what I and some of my sponsors think, and I'm curious what all of you think about that; that's another thing I hope we'll talk about at the end. So there are a lot of possible directions of impact of the core technology, beyond the language acquisition. Just for a bit of one-slide context on where the support came from: almost 10 years ago, as part of my own PhD thesis, I did some work where we had moms and infants come into a lab setting and play with toys. We recorded what moms said to infants, so there's audio data; took the toys and let this little robot, a little camera on a stick, look at the same objects; and then built a little system that would listen to whatever the child was hearing, see what the child saw, and from that acquire a lexicon that linked speech to visual experiences. That was really the start of many projects, including the Speechome project. And the idea was to build a machine that in some limited capacity hears and sees what the child hears and sees, steps into the shoes of the infant, and learns something about the language.
There's a lot more detail about what made that interesting, but essentially this learning machine acts as a lens into the learning environment of the child and lets us test hypotheses about ways a child may be learning to segment speech and to create semantic categories that link what you see with what you hear. So ball goes with things that are round, orange goes with certain colors, and so forth. This system did in fact learn from child data, and that got us thinking: well, this is interesting, but when you bring a child and a mom into a lab setting, they don't act naturally. And it's unclear what percentage of a child's waking hours are spent playing with toys versus all the other things one does. So that gave rise to looking for a new data set. Fast forward ten years. This is a picture of my house in Arlington, and if you were to come into my home, that's the living room, and if you were to look up at the ceiling, there's a little camera poking out. It's a very high fidelity, high resolution camera, with a little privacy shutter that can open and close, and a microphone. So microphone, camera, privacy shutter, there they are up in the living room. If you were to look down through the camera, and this is going back about two years now, there's my son and my wife in the living room, which opens into the dining room. And above every light switch there's this little device, which is the interface to the house. There are four buttons on this touch panel: microphone, camera, what we call the oops button, and the diary note button. If you press, for example, the camera button, that's how you turn video recordings on and off. Similarly, if you press the microphone icon, the house stops listening.
If you press the oops button, that's like the anti-TiVo button: a little dialogue manager comes up and you can say how many minutes back in time you would like to erase, immediately and permanently, from the recordings. Originally we only had two buttons, and after several oops moments we realized we would have to add that one. And then the fourth and last button got added a few months into the project, which is the diary note. If you press that, a little flag is set in a backend database saying something interesting happened that someone in the house decided to annotate. We have a simple practice, which is we just say "diary note" and describe what happened; since the house is always recording, that gets connected to the flag. So this is a view of nine of the 11 cameras throughout the house: the kitchen, dining room, living room, the baby room with the crib, the downstairs, the lower level, the entranceway. And this is a little day-in-the-life time lapse video of life at home. You can see the shutters coming in and out, blocking the video, and we transition from day to night and finally lights out. I'll play you a little example of actual video so you can get a sense of the quality of one channel. This is my son at about 15 months and I, in a little interaction. "What's that over there? Over there. That ball. Oh." And I'm going to come back to that little clip in a while. So what we have done over the past, it's actually now closer to 30 months, so these numbers need updating, is capture about 80,000 hours of video and 120,000 hours of audio on a 200,000 gigabyte disk array, a piece of which is pictured right there. And so in some sense, you are looking at the world's first Speechome: roughly 70 or 80% of my son's waking hours at home from birth till the age of about two and a half.
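To make the oops button concrete: the talk doesn't describe the back-end, but one plausible sketch is to store recordings as timestamped segments and, on an oops request, drop or truncate anything inside the erase window. All names here (`Segment`, `Archive`, `oops`) are hypothetical illustrations, not the project's actual code.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    start: float  # seconds since some epoch
    end: float
    path: str     # file holding this stretch of audio/video

class Archive:
    """Toy model of an always-on recording archive with erase-back."""

    def __init__(self):
        self.segments = []

    def record(self, start, end, path):
        self.segments.append(Segment(start, end, path))

    def oops(self, now, minutes_back):
        """Permanently drop everything captured in the last N minutes."""
        cutoff = now - minutes_back * 60
        kept = []
        for seg in self.segments:
            if seg.end <= cutoff:
                kept.append(seg)      # entirely before the window: keep
            elif seg.start < cutoff:
                seg.end = cutoff      # straddles the cutoff: truncate
                kept.append(seg)
            # else: entirely inside the window: discard
        self.segments = kept
```

The point of the sketch is that erasure is a bounded, irreversible operation on the tail of the archive, which is why it could be safely exposed as a single wall button.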
Okay, so this is a raw, unanalyzed set of data, which raises all sorts of questions about what we can actually mine from it, and I'll try to give you a sense of some of those tools. The first question is: how do you, as a human analyst or anyone in the project, even get a sense of what's in the data? We've got 200,000 hours of multi-track raw recordings. Actually, I was going to make a comment at the opening about the primaries and so forth. I worked on a project 10 years ago to put the House and Senate online, so we had all of the audio recordings from the House and Senate, which we aligned with transcripts and put online. And it reminds me: if you think about the total amount of video and audio of anyone who's of political interest, it's probably not 200,000 hours, but if you shave this down to single-track audio and video, it might be comparable. And in a lot of the issues of how you mine that, find the bit you want when you want it, and in various other ways do higher-level analysis of essentially a person's identity or their history or their personality or their development, regardless of context, there are I think a lot of points of connection. I should have said that earlier. Okay, so for those of you who are not familiar, this is a spectrogram. It's a standard way of visualizing the content of audio. All these little bursts with these lines, anyone who has spent a little time looking at spectrograms immediately recognizes as the fingerprint of speech. What's interesting is that was a couple of minutes of speech you just saw go by, and without listening to it, you know something about it. It doesn't take long to learn, for example, where there is speech versus no speech. Is someone screaming or singing or just talking? Is there water running, because there's broadband high-frequency energy?
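Since a spectrogram is just the magnitude of short-time Fourier transforms laid out over time, a minimal sketch takes only a few lines of NumPy. The window and hop sizes below are arbitrary illustrative choices, not the project's actual analysis parameters.

```python
import numpy as np

def spectrogram(signal, win=1024, hop=512):
    """Magnitude STFT: rows are time frames, columns frequency bins."""
    window = np.hanning(win)
    frames = [signal[i:i + win] * window
              for i in range(0, len(signal) - win, hop)]
    return np.abs(np.fft.rfft(frames, axis=1))

# A pure 440 Hz tone should concentrate energy near one frequency bin,
# just as running water would smear energy across many high bins.
rate = 16000
t = np.arange(rate) / rate
spec = spectrogram(np.sin(2 * np.pi * 440 * t))
peak_bin = spec.mean(axis=0).argmax()
peak_hz = peak_bin * rate / 1024
```

Reading speech versus water versus silence off a spectrogram amounts to eyeballing how energy like this is distributed across the frequency bins over time.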
All of that can just be read off of a spectrogram, and that's why they're widely used by people who do sound analysis. So we started thinking: how about an equivalent for video? Can you see something about the content of video without watching it? This is my wife and I in the kitchen preparing a meal, and one of a handful of techniques we've been playing with in our lab to start visualizing the content of video recordings. What we're doing here is analyzing where there's movement and leaving a trace as time scrolls by, just as it did in the audio. So you can see here I am holding a couple of plates and moving around and leaving this trace, and there's my wife. And of course the two of us are in a sort of dance, interacting as we go through time. Each of these space-time worms is a capture of one person's movement. Again, what's interesting is that this is over a minute of video, and you can at a glance read off certain things, like that there are two people and they seem to have certain kinds of coordinated activity. If you want to know exactly what they're doing, you'd have to go in and watch, but it gives you a certain level of insight. If you put those two things together, here's some audio; there are two people in one room versus another in a third room. In fact, you can read off of this that there was a person in the living room, and they moved into the kitchen, where they joined another person, and there's speech, and there's some water running in the faucet. So that's a couple of minutes across several channels where there's a fair amount of information you can read off. And here is now a day throughout the house, about 24 channels, and a period where all the sensors go off. This is a piece of software called Total Recall; we have fun with our naming. You load up a day of data into Total Recall and you can very quickly zoom in.
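The movement traces behind those space-time worms can be approximated with simple frame differencing. This is a hedged sketch of the general idea, not the lab's actual pipeline: accumulate, per pixel, how often the brightness changed between consecutive frames.

```python
import numpy as np

def motion_trace(frames, threshold=25):
    """Accumulate per-pixel evidence of movement over a clip.

    frames: list of grayscale frames (2-D uint8 arrays). For each
    consecutive pair, pixels whose brightness changed by more than
    the threshold count as "moving"; summing those masks over time
    yields one image whose bright regions trace where people moved,
    a crude, flattened version of a space-time worm.
    """
    trace = np.zeros(frames[0].shape, dtype=np.int32)
    for prev, cur in zip(frames, frames[1:]):
        diff = np.abs(cur.astype(np.int16) - prev.astype(np.int16))
        trace += (diff > threshold).astype(np.int32)
    return trace
```

Keeping time as a third axis instead of summing it away is what turns this flat trace into the 3-D worms shown in the talk.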
So here are these little space-time worms, so you can see how many people there are, which room they're in, when they were talking, and so forth. And here's three months, where at this resolution you can just see where there is and isn't data captured. The bathroom and master bedroom are generally off; a couple of days we weren't home, so recordings were off. So you can zoom in and out and then dive right in and view a specific video. Here's that clip again: "What's that over there? Over there, that ball." Given the tools for visualizing, what we're now doing is developing tools for a first cut at analyzing fine-grained details of the data. Take that little interaction between my son and I, and replay the key moments, imagining where I'm looking as a sort of spotlight coming out of my head. I'm only attending to certain things and not others, and likewise for my son. As those spotlights of attention shift in tight synchrony, there's speech. "What's over there?" I say, as I look out towards the green ball, but my son is looking at me. And then we have this moment of joint attention, where he follows my gaze. So he's now looking there, and he says "green ball." That's a magic moment for language acquisition: joint attention, where we're both attending to physical common ground, linking up the symbolic bits of speech to what we're both seeing. And then he actually closes the loop, and I give some reinforcement. So there's all of this very intricate timing and interaction. Which, by the way, a child with autism may not exhibit at all; that timing and those patterns of joint attention fall apart at a very early age. If you can detect them, that's interesting, and if you can characterize how they're shifting over time, that's of great clinical interest. You can see where this could be useful. So how do we analyze that kind of gaze and speech pattern out of this raw data?
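Once gaze targets have been estimated per frame (by whatever means), finding moments of joint attention reduces to finding stretches where two people's targets coincide. This toy sketch assumes that per-frame target labels already exist; the function name and representation are illustrative, not the project's.

```python
def joint_attention_windows(gaze_a, gaze_b, min_frames=3):
    """Given two per-frame gaze-target streams (e.g. 'ball', 'dad',
    or None when no target is estimated), return (start, end) frame
    ranges where both people attend to the same target for at least
    min_frames consecutive frames."""
    windows, start = [], None
    for i, (a, b) in enumerate(zip(gaze_a, gaze_b)):
        shared = a is not None and a == b
        if shared and start is None:
            start = i                      # a shared-attention run begins
        elif not shared and start is not None:
            if i - start >= min_frames:    # run just ended; keep if long enough
                windows.append((start, i))
            start = None
    if start is not None and len(gaze_a) - start >= min_frames:
        windows.append((start, len(gaze_a)))
    return windows
```

The timing statistics of such windows, how often they occur and how quickly one partner follows the other's gaze, are the kind of signal the talk suggests could shift in autism.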
I'm not going to bore you with details, but just to give you the bottom line: we've developed something that we playfully call Blitzcribing, because it's just so much faster than transcribing. If any of you have transcribed speech, you would be surprised, or hopefully impressed, by this number. We can take an hour of what we call house time, which is 14 channels of audio across an hour of data, and in about an hour or 40 minutes transcribe it totally. Which means for about 120K, we're going to transcribe everything my son heard and everything he said from the age of nine months till the age of 24 months. And the result, the estimate, will be about 16 million words of transcribed data. Just to put that into context, there's something called the CHILDES corpus, which is an international collection of child language development data sets that have been accrued over the last 50 years. There are several hundred different research teams' data sets, and when you add them all together, they constitute a 15 million word corpus. So it just happens that this one data set is larger. The point, again, is that if you want to look at fine-grained development of speech, you need that data; you need to see what's happening moment by moment. This slide just shows you many of the different kinds of social context: joint attention, one person looking at the other but not vice versa, face to face, reading together, approaching each other, grandpa hiding in the tent. And what we've developed is a method using computer vision that takes a computer-generated head, locks it onto the head in the video, and constantly re-estimates position and orientation, so that the outcome is a relatively precise estimate in three dimensions of where a person is looking. Again, to fast forward a bit: imagine you own Sears, or you own a bank, and you've got a security camera.
And you're curious whether your customers are paying attention to certain products or certain ads or whatever. Or whether there's certain unusual behavior that could be predictive of theft, or some kind of behavior that you don't want to see happening in your retail space. You can imagine why this kind of technology could be of interest. So now let me take maybe five more minutes and play you a couple of bits of data, just so you get a sense; we'll take a little tour through the Speechome, if you will. Since we're now transcribing, we can do things like say: let's pull up every time my son produced the word "ball," and let's look at the social and physical context within which he said it, to get a sense of what he thinks the word means. Is he using it in an overly general or overly narrow way, or has he really nailed what the term refers to? To do that, you need to find all the data where he produces the term. So here is a little walk through about nine months of him saying "ball" in different contexts. He's now speaking in full sentences, by the way; he's turned out to be an early talker. If you are a speech language pathologist and you're interested in the development of speech, one kind of time lapse that would be really interesting is to hear the development of the pronunciation of a target word. What I'm about to play you is an audio-only time lapse, which I believe provides a new lens into speech development. You're going to hear my son when he originally started saying "water"; his approximation to it was "gaga," which is not unusual, since very early words tend to be very simple syllabic approximations. And over the course of nearly a year, he develops the proper pronunciation, the adult form of "water." What you'll hear is that it's not a totally linear path. He'll be stuck on "gaga" for a while, and then there's this really interesting transition period where he's two steps forward, one step back, and eventually he gets it.
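"Pull up every time he produced the word ball" is, mechanically, a query over a time-aligned transcript. A minimal sketch, assuming a Blitzcribe-style transcript stored as (time, speaker, utterance) rows; this row format and the function name are assumptions for illustration, not the project's actual schema:

```python
def productions(transcript, speaker, word):
    """transcript: iterable of (time_sec, speaker, utterance) rows.

    Returns (time, utterance) pairs in which the given speaker produced
    the target word; each timestamp can then be used to pull up the
    surrounding audio and video for its social and physical context."""
    word = word.lower()
    return [(t, utt) for t, spk, utt in transcript
            if spk == speaker and word in utt.lower().split()]
```

The same query run over months of data, sorted by time, is exactly what produces the "ball" walk-through and the gaga-to-water time lapse described here.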
So let me play that. And I usually can't resist, and I won't today: I'll play you one more bit of evolution of speech, which is my personal favorite. In this one, again, we're fast forwarding through about six months, and we have the video to go along with it. Here you start to see the very personal nature of this project, too. I have no idea how many times he has said "daddy" now, but I'm sure it's a good chunk of the 16 million words. And there's a little video clip I'll show you; this is just the power of recording all the time and being able to go back and find things, all of the things you might expect, the typical dramatic things for first-time parents. Things like first steps are often captured. So here are the first steps that he took, embedded in the archive. If you listen carefully, you'll hear him whisper the word "wow." That was him, realizing he's done something. So to summarize, what we're doing now is indeed transcribing all speech heard and produced by him, and annotating selected video. What we want to do is trace the birth of specific words or specific phrases, and go back and pull up the full context in which the words were not just produced but, long before that, heard in different contexts. And in various ways, starting with joint attention, but then also looking at the objects in the environment, annotate the video. And then analyze the predictive role of social and physical factors in explaining which words he learned first. That's the very specific thrust that is driving things. So just to conclude, one question on the direct extension of the Speechome is: all of this, from a scientific perspective, is limited because we have n equals one, we've got one subject. So it raises a natural question: how would you do this for larger numbers of people? How about 10, how about 100, how about 1,000 or more?
And of course, you're probably wondering why you would want to do this. Over the last six months, we've been working with the director of research at the Groden Center for Autism, because the specific kinds of cues that are most relevant for understanding early language acquisition are also some of the telltale cues for early detection of autism. As young as 24 months, there are now reliable cues for detecting autism. The question is, can you detect it earlier? And once you detect it, can you characterize all of the many different developmental trajectories that children are on? And as you introduce treatments, can you characterize and quantify, with large sets of data and the appropriate tools, the effects of treatments? It turns out all of these are open questions, and so there's great interest in that community in exploring these sorts of tools. The problem is, there's one little detail about the project I didn't mention: there's about 3,000 feet of concealed wiring embedded in my house. Pulling off this contraption, with cameras and mics all embedded and nothing exposed, is very expensive and very difficult. No one else is ever going to do it, I would guess. So we've started thinking about a portable device. There's our microphone and our camera on this arc lamp, with all the computation, storage, and networking built into the base. There it is looking from above, so it gives you the same sort of fisheye view of what's happening in a single room. So in some sense we're scaling back, not going for nearly 24/7 throughout-the-home recordings, but looking at key areas where a lot of interaction happens, capturing data in those contexts, and streaming it out of the home into an analysis lab, where, working with clinicians, we can dig into the data in various ways. A lot more could be said about that.
In fact, beyond a concept, there is a prototype of the lamp that you could come and see now in the Media Lab. The head and neck are constructed, and we are planning later this year to deploy a pilot batch of these in homes in the Providence area. And all sorts of questions get raised, and that's my last slide, not just about my own home data; I'm happy to get into how we're dealing with the various obvious privacy issues that have come up. But this turns up the sensitivities, right? Now we have other people's data coming into our lab. And often these are families where, unfortunately, the parents are headed for divorce; statistically speaking, if your child is diagnosed with autism, you're far more likely to see divorce in that family, which affects the likelihood of everyone being happy about that data being captured, and all of the other kinds of information that might be captured along with it. So these are things we're starting to realize; I won't say we're far along in knowing what to do about it. And yet, at the same time, there's huge motivation to get the data, if it could help clinicians get a better fix on what's going on. So there's a whole set of issues there. And then in the retail space, we are indeed talking to, well, let me just give you one possibility. The whole premise of the analysis of this data is the idea of cross-modal analysis, of semantic grounding. You've got the context within which language is being used, which is captured in the video, and then you've got the actual linguistic signal, in this case speech. And we want to build machines that can understand the connections between the two. So someone is saying certain things in certain contexts, and if the machine can see what's going on and hear what's going on and link them, all sorts of insights can come out of that. So let's say you own a bank, and now you've got these cameras that can similarly see the behavior of people.
You don't have speech, which it turns out would be illegal, because, as I'm sure many people here know, you can't record audio in a bank. But you do have electronic transaction records. So as you're interacting with a bank teller, everything that the bank teller is doing is being captured. That's a carrier of semantic data for the task-oriented activity in your video. It's not quite plug-and-play, but similar ideas can be applied. And it's not just our project; I think there's an emerging set of technologies that will make it easier and easier to connect human intent and communicative signals to context, at various scales. This is relatively fine-grained, two-person interaction. A lot of our corporate sponsors see that and understand that, and so that's another direction this work may well head. And I think with that, I'll stop talking. Thank you. How heavyweight is the software that you've designed for the analysis of the data that you've captured? How heavyweight is it? Bottom line, not a big deal. Some of it is very heavyweight, as in very inefficient and very expensive in computation. But there are two things happening. One is, of course, that the cost of computation is dropping and the density is increasing, so the same volume in our machine room can do more work. And our algorithms are getting better. So they're both converging towards that not being an issue. And just to give you, again, a sense of the technological trajectory: Seagate is one of our sponsors, and we're working closely with them; in fact, the whole disk array is filled with Seagate drives. They are interested in understanding the algorithms we're developing, taking a subset, and making them part of the firmware that's in the disk drive.
So when you think about computation, there's the computer that you have to buy that then processes the data, which is one layer of the processing, and then there's what Seagate, whether it's our algorithms or someone else's, is going to put in the drive. The drive will know it's video, and it will extract certain kinds of features, so that the work is actually done literally as you're streaming the data onto your drive. So a lot of that becomes transparent and lightweight. I'm curious what you thought about comparative smaller projects. Comparative meaning: your family versus a culturally different family, a family that perhaps is in a different part of the world, right? Or a child who is an orphan, which from a child development point of view is a different kind of language acquisition setting. Or even a single parent household. Those would be four models where the language acquisition could be radically different. And just watching the snapshot, you're an engaged, loving, caring parent, and let's assume your wife is too, but that's not a model that's universal. And then you could layer on, well, the mom didn't have much food before she delivered, and all of those kinds of things that affect the organism that you start with, the baby. So I'm thinking about how, because I think you could take interesting lessons from one to the other about how to encourage language development and modeling, that whole bundle of questions. Right. I think the... As opposed to deep, I mean, 16 million words and a zillion hours, as contrasted to saying, well, let's take smaller windows of time in different settings, because the conclusions might ultimately be more useful from a, quote, therapeutic point of view or theoretical point of view. Yep. I have a request: could we turn the projectors off?
I think the room's heating up and it's a little loud on this side. So, first of all, definitely that's necessary in the long run. The way I characterize this project is, in many ways, as trying to push the envelope in terms of methodology and technology, to show a proof point with one data set, and then to engage the much larger community of people who are interested in an incredible number of different factors which are all believed to play a role, and put this technology and this methodology into a lot of people's hands. I don't think we're going to single-handedly do that. Sure, we could do another family with one of those variables. But in fact, if you just watch... I have a daughter now who's eight months old, and if you watch her trajectory, it's just totally different. Anyone who's got two kids knows that happens, and over time maybe they're driving each other's differences, but early on that's unlikely. So is it all genetic, or are there behavioral biases? Well, of course, there was only one kid to attend to and now there are two, so there are all sorts of differences. Teasing all that apart is a gargantuan task, and it's going to require engagement of the community. There was a lot of skepticism about the technology from the child development community when we started this project, and a lot of skepticism about being able to handle privacy and those sorts of issues. So one sign of success, one metric of success for the project, from my point of view, would be to overcome that skepticism for at least a subset, and get more people to be doing some of the things you're suggesting. For the clinical, for the autism... Isn't it just that... I mean, if you did more bite-sized things, that might help, because you can anonymize the data or you can anonymize the person. So that's from a privacy point of view. And if you took out the "we're trying to sell the data to McDonald's to say when they first start saying Big Mac" part...
If you take out those, and you're just interested in the language acquisition component, and you do small bits... Right. So what you have described is more or less the state of the art in child language acquisition research. And it's a totally viable strategy for understanding certain aspects, and totally unviable for understanding other aspects. The point of this project is to push ahead in that other direction and say: look, with new tools and methodology, we can also look at... I think there's great value in doing longitudinal work, even though it comes at a cost; you can't diversify and do snapshots of lots of families. But there's a lot we just don't understand that can only come out of doing these detailed studies. There's a history of diary studies in the field, right? And they're in general seen as of great value, but with all sorts of theoretical biases: the diarist only notes what seems at the time somehow interesting, and that is, of course, totally driven by their theoretical dispositions. We're not susceptible to any of that. I think over there, and then Judith, and then... so maybe one, two, and then Judith is three. So let me first say that this totally rocks. Let me then steer away from some of the child development stuff and maybe into some of the privacy stuff, which I suspect has some of the privacy-minded folks here a bit worried. I look at this, and my first conclusion is that this would be fascinating to DHS, and the reason is the path tracking, the head tracking within large sets of video and audio. And I wonder whether you've thought through any of the implications of people looking through video in a very different context: in public places that are already widely monitored, essentially trying to do predictive monitoring there, to figure out, if you have a person in a space who intends ill to a large group of people, whether you're able to predict that behavior pattern based on someone moving into that space.
So the stereotypical example is someone walks into a public space and starts immediately looking for security cameras, which is actually pretty atypical behavior. If you're doing this sort of real-time video tracking, and particularly sort of focus tracking, that seems like a real possibility to come out of it. Let me just sort of add one final observation, which is that, increasingly so in the US, and certainly in the UK for a long period of time, it's become quite common to have extremely pervasive surveillance, but that's mostly a panopticon effect, because there's a general sense that no one is doing real-time monitoring of this video. There's simply too much of it. It's simply there so that you know that you are being watched, that someone could theoretically come back to what was recorded. You seem to be suggesting, in some ways, if the stuff that you've got works as quickly as sort of Sebastian was pushing on, that you might actually be able to have a categorically different form of surveillance: a form of surveillance that is basically figuring out who might be a person of interest within a scene, either to, say, a retailer, but certainly to a security professional. I'm sort of wondering what you've thought through as far as those implications, of how the nature of surveillance changes when you're able to do this level of analysis on top of it. So I guess, first of all, just a point of clarification, which I think in the interest of time I moved through so quickly, and now realize since you'd mentioned real time: what we're doing is not real-time analysis. And that's not just a matter of the computer not being fast enough. In fact, the whole method of data annotation, both speech transcription... If you had a chance to read the title, it said BlitzScribe, colon, semi-automatic speech transcription. So it's not automatic. It's not real time because there's a human in the loop. Same goes for the video.
In fact, state-of-the-art video analytics technology would not allow you to track head orientation the way I showed in that clip, unaided by a human. So the piece that I didn't show there, just in the interest of time, is a second layer of software through which a human operator does the analysis. So that's one point of clarification. So that means the analysis happens offline. Is DHS interested in this sort of thing? Of course; they have a huge program. It's the AQUAINT program, I think. VACE, sorry: Video Analysis and Content Extraction. And there are many research teams around the country focused on exactly the kind of questions you're asking. So yeah, there's an obvious technological overlap. But that happens to not be where any of our funding comes from, and not what's driving the questions. Did I answer your question? You started off by saying, am I concerned about... And then you didn't end with a clear question of what the concern would be. Sure. You're taking money from retailers. This is likely at some point to become an extremely useful tool for marketing behavior. There's one set of questions and concerns there. If I'm walking into Sears, have I implicitly given permission, not just for my purchase behavior to become demographic data, but also for my entire in-store behavior to become data to build a profile? If and when this tool becomes useful for commercial applications, it's going to get deployed in other contexts. One of those contexts is going to be a security context. How does that change the answer to how you feel about it? How do I personally feel about the security context? The scenario you just laid out, I feel really good about. Because you laid out that someone's about to do something bad, and can we predict that? So our technology doesn't speak to that. We're not even doing real-time analysis. But I suppose that could be a first step.
And then the second question you would ask is, well, what if they weren't about to do something bad, and your technology screwed up and had a false alarm? So I think that's where things get more interesting. All these technologies have type one and type two errors. And then you look at the larger context of what people are doing with the tools. One basic limitation here is: you can watch behavior all you want; you're not looking at cognition. You can watch my son eat, but you can't tell if he's hungry, or if he's just doing it so that he can go play after. You can't see hunger. You can just see the movements of eating. So similarly, any time you're trying to do intention inference, it is inference. And so there are going to be errors. So I don't think it's possible to give you a simple answer to how I feel about it, because you'd have to lay out what the kinds of errors are and what policies are tied to how you use that technology given that error table. Well, let me just follow on that: how efficient is your technology in the video where your son is saying water? I mean, has this already been clarified so that all of those actually are water? Or are some of those not actually him saying water, and we just sort of think he is? And has somebody already filtered that out, or have we skipped over that? So that's not so much a question of efficiency; the accuracy, I think, is what you're asking about. So for accuracy, this is a well-known problem with early, early language annotation, which is very idiosyncratic speech patterns. So we had this... I mean, another odd detail of the project I left out is that one of our first speech transcribers was part-time nanny and part-time speech transcriber. And her workstation was in our house. So she spent 20 hours a week doing one, 20 hours a week doing the other. Generating data and transcribing data. That's right.
So that's how, for example, I was able to show you the progression of Gaga. If you went back and listened to those early recordings, you wouldn't know what Gaga meant unless you spent a lot of time watching the video, right? And then you could sort of work it out, because we do have the context, but that would be extremely expensive to do at scale. But a kind of bottom line on efficiency is for the mature speech forms. What we really focused on in those early months, when my son's speech was very difficult to understand, was to just transcribe his speech, so we have a lot of that transcribed. And so we're focusing more on the adult speech, sort of the input. A lot of the focus is more on what did he hear, in what context, and how is that predictive of the order in which he produced things later. And that adult speech is, of course, easier in terms of that question. But the cost is a real issue. So typically, it would cost you a couple of million dollars to transcribe this data. And we realized, well, we can't afford that, and there's a technological solution to bridge that and push it down. So $100K is a whole different story. So I think you're waiting. I'd just like to hear you speculate some more on your view of a road map and the bottlenecks. Like, it seems like right now, the human element is already the bottleneck. In terms of your blitz transcribing, it's $120,000 for 16 million words. I assume that's because of man-hours of transcribing. That's right. So are there any technologies you see as being able to circumvent the human bottleneck? Just go nuts. I want to hear your... What are some estimates of when different bottlenecks can potentially be overcome? Or are they here to stay, because there are some capital-H hard problems that would prevent massive adoption? So, massive adoption of what?
Well, for instance, the cost of implementing this: if it's all algorithmic and you can take out the human middleman, those costs are all going to go down very quickly. But if there's a human element that's irreducibly in there for the long haul, that's going to determine the price and the degree of use of systems like this. Like, could it be real-time? Or does the human element keep it from being real-time? Well, again, I guess to clarify this, the question of real-time really depends on what purposes you have in mind. If you're trying to do any kind of, say, developmental analysis, or you're interested in certain people in politics and you want to go back and pull out things they said, et cetera, those are all non-real-time. Those are offline, after-the-fact kinds of analysis. You're mining, and you're pulling out patterns or analyses. So that's not a real-time issue. Real-time is typically when you want an interactive system that will actually intervene on the fly, which is interesting. And all of our robotics work involves real-time systems, where that's the case. But everything I talked about today is this offline, non-real-time kind of thing. In terms of human bottlenecks, in that particular case: in technology terms, that's called automatic speech recognition. A lot of people think automatic speech recognition is solved. They're wrong. If you take the recordings of, say, adult-adult communication from my house, and I were to play them back to you and to 10 other people in this room, there would be very high inter-transcriber agreement on what was said. So it's a very doable task by humans, a high-90s percent agreement rate on what was said. If you plug that same data into the world's best speech recognition system, which is actually paid for and tuned by the intelligence community, and we ran such a test, you will get single-digit accuracy, under 10%. So there's a huge gap in the technology, which is why we developed these semi-automatic techniques.
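The accuracy gap described here is conventionally measured as word error rate: the word-level edit distance (substitutions, insertions, deletions) between a reference transcript and a hypothesis, divided by the reference length. A minimal sketch; the example transcripts below are invented for illustration, not taken from the Speechome corpus:

```python
def word_error_rate(reference, hypothesis):
    """Word error rate: word-level edit distance (substitutions +
    insertions + deletions) divided by the reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # Classic dynamic-programming edit distance over word sequences.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution / match
    return d[len(ref)][len(hyp)] / len(ref)

# Hypothetical transcripts of the same utterance:
ref = "do you want some more water"
human = "do you want some more water"        # near-perfect agreement
machine = "to you one some or water bottle"  # heavy recognition error
print(word_error_rate(ref, human))    # 0.0
print(word_error_rate(ref, machine))  # 4/6, about 0.67
```

A human transcriber with high-90s agreement sits near zero on this scale, while a recognizer with single-digit accuracy on conversational home audio would sit near or above 1.0, since errors can exceed the reference length.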
That's not to say progress isn't being made, but I think in that particular technology we have hit a wall, and there's new research required to get us unstuck. But that's maybe drilling deeper than you intended. The point is, the humans are using context. They're using the meaning of what's being said; they're using the deep knowledge about language we already have to do the task. The recognition systems, however good they are, can't do that. So that does put an upper bound: your blitz transcribing can only go so fast and so cheaply, it seems. BlitzScribe, yeah. Yes, not only are people using deep context, but I suspect even the very kind of, what you might call, shallow or front-end acoustic processing is different. I think there are also different things going on in how we're listening to the speech. So when you have a stenographer or a real-time closed captioner for real-time television events, they transcribe a lot faster than if you make a recording of that event and ship that tape to someone to transcribe. And the problem in that case is a technological bottleneck, and that's a bottleneck we've really focused on. So people are pretty fast. Machines could be faster if they had all these things you're talking about, but that's a whole other research program, not one that we set out to try to bite into. Yes, I'm not sure who was next, but... So what have you learned about human speech acquisition that you didn't know? From this project? Yeah. Or from the one that led to it? From this project: nothing. Do you have a hypothesis about something you might learn from this project? Sure, multiple. And the reason I say nothing is that you might think of the project as having at least three phases. Phase one, data capture; phase two, tool construction; and phase three, analysis. We've more or less completed phase one, which took a huge amount of engineering, sort of bit plumbing.
And we are very much in the midst of phase two, which is creating tools, which is what I've shown you. And we are turning the corner on having enough speech data transcribed to start doing some forms of analysis just within the speech. That's why the answer is nothing in terms of actual results. It's a long-term project. So when you look through the literature on theories of early language acquisition, there's a wealth of theories. It's just incredible how many theories there are and how little data. And for any given theory, there tends to be a small piece of data that supports that theory, unsurprisingly. And many times, they make contradicting predictions. So a very specific set of hypotheses that we want to do a bake-off on, if you will, is to look at some specific cues that a child may be using to bias what they attend to and what word-to-meaning mappings they are hypothesizing. So for example: when I hear a certain speech label, I am looking at something and you're looking at it as well, versus not. How important is that? When I hear some novel word usage, I'm engaged in a particular activity with you, versus not. So there's some activity context. There's some joint attention context. There are multiple competing objects or events in the scene, versus there are not. How important is each of those in giving me a leg up on learning? Well, in isolation, each of them seems to be important. But we have no idea how they interact and how important each of them is in any sort of natural context where they're all at play. And then there are various other simple things. Like, I actually think spatial co-location is really important. How close am I to my son? Are we on opposite sides of the room? Are we moving around? Is there a third person in the room? Is it morning? Has he just had a meal? Do any of these matter? So we can systematically look at any combination of factors.
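A bake-off over cue combinations like the ones just listed can be set up as a simple scan over subsets of binary cues, comparing the age of first semantically appropriate production for words whose exposures carried those cues against the rest. Everything below, the cue names, the per-word records, and the ages, is invented purely for illustration:

```python
import itertools

# Hypothetical per-word records: binary cue contexts aggregated over the
# exposures to each word, plus the age (in days) at which the word was
# first produced appropriately. All values are made up.
words = [
    # (word, joint_attention, activity_context, spatial_colocation, age_of_production)
    ("water", 1, 1, 1, 400),
    ("ball",  1, 0, 1, 420),
    ("shoe",  0, 1, 1, 470),
    ("moon",  0, 0, 0, 540),
    ("cup",   1, 1, 0, 430),
    ("door",  0, 0, 1, 510),
]

cue_names = ["joint_attention", "activity_context", "spatial_colocation"]

def mean_age(records):
    """Mean age of first production over a list of word records."""
    ages = [r[-1] for r in records]
    return sum(ages) / len(ages) if ages else float("nan")

# Bake-off: for every subset of cues, compare mean age of first
# production for words whose exposures carried all cues in the subset
# against the remaining words.
for k in range(1, len(cue_names) + 1):
    for subset in itertools.combinations(range(len(cue_names)), k):
        with_cues = [w for w in words if all(w[1 + i] for i in subset)]
        without = [w for w in words if w not in with_cues]
        label = "+".join(cue_names[i] for i in subset)
        print(f"{label}: {mean_age(with_cues):.0f} vs {mean_age(without):.0f} days")
```

A real analysis would of course need per-exposure data, many more words, and a proper statistical model rather than a comparison of means, but the combinatorial structure of the question, which subsets of cues are predictive, looks like this.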
And what we're looking at, again, is which subsets, if any, are predictive of which particular words he learns to produce in a semantically appropriate way first. So, the predictive value for later productions. Go ahead. One thing I just wanted to do is shift this a little to some questions for the group as a whole. Yeah, sure. Because, especially as you're talking about starting to set this up in other homes, I think there's just a tremendous number of very intriguing questions about privacy, about data ownership, et cetera, that a project like this raises. So, right now we all expect a new house to come with indoor plumbing and lighting and things like that. Imagine you go buy a new condominium and it comes beautifully wired like this. They say the ultimate therapeutic tool that you could have for your growing family is to be able to record everything like this. So what are the steps we would need before this would be, from a data-safety standpoint, a useful thing? What are the things that people would need to know? What are the laws that might need to be changed about your liability for that data? What would you need in terms of getting releases from people who walk in your front door? What does it mean when someone came in your front door and it was fine with them, and then they had 10 drinks and they don't want this data recorded anymore? Who has the rights over that data? Or the couple who came here for dinner was perfectly fine, but they're divorcing a year later and they want your data? Or even kid consent. Yeah, at what point does your child become a teenage rebel and sue for all your data?
I think there are ways in which this can become... While this is an unprecedented project in this notion of it being in a home, there are, in various places, other projects with people recording everything about their lives, and I think it's not far-fetched to think that this might be something that, not that far in the future, might be a feature in a high-end home, and so to understand what are these steps, I mean... And actually, just to reinforce that a bit: we are in fact working with one of the largest toy manufacturers in the world, who are exploring this concept as a piece of consumer electronics, and early focus groups show really positive responses, maybe because some of these implications are just not being presented and not even being thought about. I have a bit of experience with this particular issue, and I've written a bit about it in the hospital clinical context for medical records. And the insight there, and it's not commonly looked at this way, is that it has to do with the networking of information that the doctor-patient relationship includes, which is supposedly private. And the thing we discovered is that the merchant, Sears, or the hospital does not naturally understand the difference: when the patient comes into that hospital, they have an expectation that their information will be correlated across visits, which could be done biometrically; but the publication of that information so it can be aggregated across different hospitals, or across different merchants, is something that needs to be voluntary from a privacy perspective. So I'm just trying to get across that thought about the difference between interacting with one merchant who's providing surveillance, and how that is mapped across merchants, and the patient coming to understand that what's being done to them by credit bureaus is not necessarily what they wanted. There might not be a societal interest.
There may be a societal interest in the case of credit bureaus doing this aggregation voluntarily, but there isn't one in the case of medical information. At least there's no well-understood societal interest around that. And that's as far as we've gotten. But then we need to go back to the original question that Ian asked: if the companies have a right to collect your purchasing data, where does it stop for collecting any other type of data? Where do you draw the line? Where do you even consider starting to draw lines? There certainly don't seem to be any limits currently. Maybe even just keeping it to the domestic environment, because that seems to be a place where it's very fuzzy. What are your rights in terms of collecting data on people who walk into your house? You can imagine the Google system, where you end up signing the NDA before you make it past the lobby. I'm sort of imagining the system where, in the entry hallway, there's the biometric "I agree" that I'm gonna be recorded in the context of the house, and that I may be used in aggregate research data, or perhaps I'll be anonymized, or something along those lines. Have there been cases of people doing surreptitious recording, or just ongoing audio recording, in their houses? Well, there was the entire X10 camera business, based around the notion that you should place little wireless cameras around your house for surveillance, for security purposes, which of course also had nefarious uses. Most people know it because it was the most notorious of the pop-up ads, but I'm sure the folks in the room who actually know something about law probably have some history of covert surveillance within your own home and whether that's actionable in some fashion. What do you do with your own guests? Is there a difference when people... What's your expectation in a non-technically-sophisticated environment, where you're not recording everything without someone's knowledge?
You're not recording, but is there an expectation of privacy: that if you come into my home and we have a conversation, and it may be something you'd just as soon I not repeat, somehow you have an expectation that I'm not doing it? I mean, do you have to sign something when you come in, or do I have to sign something that whatever you tell me in a social setting is private? It's a similar kind of... Right, but that's the question here: is it the same when there's actual recording? Yeah, I think it's... I mean, yes, there are certainly all kinds of social mores around privacy, but I think there may be different ones that come into play when you are actually recording and archiving data. Yeah, I think it's similar, but it's so far advanced when you have comprehensive verbatim recordings and the ability to go back and extract selective components, that it's almost, I would say, qualitatively different from the social, you know, he-says-she-says kind of... And then there also is a design component, in that when you are speaking with someone and you know something about them, you negotiate something about that relationship. So I think one of the things that I'm sure you're working on is the question of what is covert and non-covert. If someone walks up to you with a camera, you know what they're doing. With something like this, is it different if there's an ongoing real-time display showing what's being recorded and what's not, so that people have a very visceral sense of what's happening, versus... You're saying that informed consent is impossible for recording? But let me just take the most extreme position for the lawyers in the room. I'm not saying it's impossible. What I'm saying is that, particularly as technologists, a lot of our approach is to be on the design side. So what is it to be informed?
Because one of the problems with a lot of informed consent is you get a little note saying there's recording happening here, which is different from something where, if you're recording, there's a constant, say, display of what's being recorded, where you have a continual, instinctive reminder of what's happening. So what does it mean to be informed? If you can delete it, that's one thing. Well, we are recording right now. Do we need to remind everyone assembled that they're being recorded? We do; we give it at the start of every talk. No, but I've been wondering for a while: are we still recording? Yes. Kind of, but we don't know. So it kind of speaks to... Just to answer, in case you're curious, how do we deal with this in our house? In fact, when we went through IRB approval, which was an interesting process at MIT, they came back and said, we don't know whether to say yes or no, because nothing looks wrong, but this just looks so odd. So we ended up getting the director of that project at CMU that I mentioned, the big database of child data, to come in as an outside advisor and look over the protocol. But one of the things the IRB asked us to do, which is kind of comical, was to have a little placard as you come into our house. So when you first enter, you'd see a sign that said: everything is being audio- and videotaped, and may be posted on the internet, just to make sure that there's no question of where this data can go. And if you don't want it on, just let us know. What we ended up doing, just FYI: we had that up for a few months, and then realized that it didn't really work. So we took it down and adopted a convention within our house, which is that whenever anyone comes to the door, since there is a controller right at the door, we turn the recordings off.
And only friends and family who know us well and who understand the project would typically say, you can turn it on; but whenever you come into our house, by default, we have the recordings off. So that's where it evolved to. On that point, one thing I'm curious about: does your son have any inkling of the cameras or the microphones, and if not, how do you intend for him to learn about it? Yeah, he loves watching some of the clips like the ones I showed you. He went through a phase where he loved grabbing the iPAQs, the little controllers off the wall, and just turning the cameras on and off, and then he just got bored of them. So that's the... But is he aware? No, I wouldn't say overall he's aware; I think he's still a little too young. And we're not recording very much anymore, so we're kind of... I was always asked, and always thought about, when would be the day we stop recording? Well, it turns out there's not a day; just sort of the density of recordings has dropped off. And now, honestly, it's as much of an interest for my wife as for myself; my wife is a faculty member at Northeastern and interested in the speech and hearing sciences. So there are kind of overlapping interests. But we both feel sort of like, well, now our daughter is babbling beautifully, and we have to get a little bit of that, if nothing else. If not for the high-density, scientific reasons, just to get that home video collection to be complete. Problem is, of course, my son's always got that two-year lead, so... Is there a fear that your second child is gonna discover that there's 200 terabytes of data on the first child and a mere 12 megs of her, and sort of be scarred for life in terms of the... The thought has occurred, yeah. Second child always... Yeah, they always get short-changed, yeah. There's a sense in which the name of your project, of course, draws on the Human Genome Project.
And one of the questions that often comes up with the Human Genome Project is: whose genome are we sequencing? There's an odd sense in which, if you've just created, and it sounds like you have, the largest and most comprehensive set of data on early language acquisition, is there any worry that any idiosyncrasies in your son's learning patterns, any idiosyncrasies in your son's speech, essentially affect the development of this field for some period going forward? I mean, I assume this is now why you're trying to sort of replicate this. So that we're not all Craig Venter. Craig Venter, yeah. Did himself? No, no, no, I know. As well as the woman and man from Chicago who were in the original... So I think that ambiguity about whose genome comes partly from just the way the term was used in the popular media early on, and partly from estimates of how much difference there is between your genome and mine, if that's defined in an individualistic way. When it comes to phenotypes, and in particular behavioral phenotypes, there's little question of the huge amount of differentiation of the behavioral phenotype as a function of environmental input, which is not an issue with the genotype. So I don't think the same kinds of ambiguities are gonna come up. And so, I mean, this question earlier about going to different families and so forth is, of course, motivated by that. It's kind of an obvious understanding that each speech-home is gonna be very different from the next. But I think still, if you look at the influence of certain diary studies, and I'm not claiming that our work will ever attain that level of influence, there aren't many detailed diary studies that have ever been done. Piaget did one and Tomasello did one. Roger Brown did a couple. So there's a handful of diary studies which were very influential, mainly in raising new questions, right?
And then to answer them in a sort of scientifically acceptable way, you have to have larger sets of subjects and control for all sorts of things. So it's kind of a hypothesis-generating type of project. So yeah, there will be certain biases. Again, sorry, I'm not paying attention to ordering, but go ahead. I was just wondering: because of the way the research funding operates, you seem to be looking at this mainly as a business application? As a monitoring application, and drawing inferences from that? So in what ways do you foresee this being used as a consumer application? For example, suppose I have a kid next year, and maybe I want to follow this development pattern. So as a consumer application, do you think it is feasible, or, because of the way funding has been given, are you solely focused on the business side? I think you might have filled in some details about how the funding drove this that are not quite accurate. The Media Lab is an extraordinary place where, although we have a huge number of industrial sources of funding, it's primarily undirected; that is what paid not only for this project, but for the stuff that I talked about, you know, almost a decade ago now. There's been no commercial application to date for any of it. And if I had tried to sell it directly as that, they would have said, okay, here's some commercial data, please develop your algorithms on that data. That's the typical kind of directed research. So in fact, I would say, actually with total confidence, that the Human Speechome Project would not have happened if I had to make a case for commercial viability in any clear sense, or tie it to any commercial interest. Or, for that matter, you know, DHS or any of the others. So part of this project is... No, I'm not making a value judgment. I'm just saying that this clearly has positive implications on both sides. By the way, I'm all for commercially-driven research. So it wasn't a value judgment.
I'm in B-school, so I'm not against that at all. Yeah, no, I'm just clarifying. My point is: do you foresee this being applied in a consumer setting? Absolutely. First? First, because the privacy concerns would be bypassed? I don't know if they will. So if you walk into a retail store, your expectation is that that's a surveillance camera to keep, I don't know, you secure and the products safe from theft, which is why those cameras are there today. So for people who do reflect at all on why those cameras are there: they're security cameras. And now that data is being repurposed for consumer buying habits; certain stores, grocery chains, want to figure out what to do with 24 feet of yogurt. I think a good example would be Vegas, which uses that data incredibly differently than any retailer would. Yeah. But what are the expectations? Everyone knows that about Vegas, right? Right, everyone knows about Vegas, but with retailers we have the expectation that it's only a security cam. In Vegas, they know that people are doing profiling. They know that people are watching the pits. They know that people are keeping profiles on certain people, using facial recognition. You're saying that what happens in Vegas doesn't stay in Vegas? No, it doesn't. Actually, it does. It's not like it leaves Vegas. But that was adjusted over time, and over time maybe we'll get used to the security cameras. I mean, what do you think? This is a question for the many experts here. What do you think will happen with the retailers if they start repurposing this information? Because, I mean, there's probably no law that says anything clearly one way or the other. It's theirs. I think they'll do it, but there'll be almost no reaction. I think no one will notice. I think it's... Clicks. Why is it different than clicks? You go online and you're clicking on a site and you click on a particular item. So let me give you a...
Here's a difference, okay? So remember those space-time worms that I showed you? An implication of those is that we're tracking a person. As long as there's one worm, you could say, show me all the video of that worm. And now imagine every time you go to Walmart: although you are anonymous, you're just a video blob floating around in the store, you give away your anonymity when you make a purchase with your credit card, right? So now I'm Walmart, and I've got a billion hours of data, and I'm going to pull every time your credit card was used and all of the space-time worms that are connected to those credit card touch points. And now think about the implications of that, right? Now I can make a video collage of what you looked like, who you were with, et cetera, et cetera, in Walmarts over the last number of years. That might bug you. That might be upsetting to you, right? They can do it today. They can't do it today, because, A, they don't have the resolution for it to be especially upsetting, and B, they don't store the data. The same effect. I'm just saying the technological infrastructure, if you look at the cameras, if you look at the storage arrays, et cetera, none of it's there. But in principle, you're right, they could do it today. But Vegas is doing it today. Who is? Las Vegas. Yes, and I think that's a special context with certain expectations. I guess another question here is the crossover thing that Ethan mentioned. Because, so nobody will notice, nobody will care, but we found that the government happily goes to AT&T, AT&T hands over their records, then Congress retroactively excuses them from all liability. So... But even in a slightly less sinister sense than that. And this was sort of where I was trying to go before. We've all gotten used to a certain amount of surveillance. Public doesn't mean what it used to mean, right? Public used to mean I might be seen by another person.
Now there's a very good chance that you're seen by a video camera. If you were a person of interest, being seen by the video camera suddenly becomes very important. We've all sort of gotten used to that, but it has this sort of background, tuned-out quality, where we know that the vast majority of the time we're being watched. It suggests to me that with this sort of system you're going further. What is at the moment post-processing, I think you would argue that five, ten years from now you might be able to do as active processing: finding people's field of view, calculating paths, isolating a person through a crowd. Suddenly a really different level of surveillance comes into play. It probably comes in very gradually. It probably introduces itself in retail spaces. Very, very few people refuse to use their grocery store coupon cards, which are essentially just a way to track you, with a few coupons fed to you to let them do it. It will probably come into play that way. The question that I have is, is your reaction to public space going to change over time? Are you going to handle being in public differently, knowing that ten years from now, by moving in a public space, you're generating a stream of data that's very, very useful to commercial entities and very, very useful to a DHS-type entity? I think there's no clear demarcation. Take Facebook: is Facebook a public space or a private space? All the stuff we're seeing now, as Facebook tries to monetize its practices as a merchant, is where this is being played out. So again, my point is that informed consent is impossible. There's a very active rebellion against Facebook, which actually began in this room. A moment ago you said that people wouldn't notice and wouldn't care. You're the same person who five minutes ago said they wouldn't notice and wouldn't care, and now you're saying there's an active rebellion. Which one of those is it?
There was certainly an active rebellion against Facebook. I think the question is, to what extent does it end up being intrusive? Facebook didn't think there was going to be an active rebellion. I think it's a very sensitive edge, as far as how intrusive it ends up being and what the implications are coming out of it. Oh, I'd just like to go a step further. There are only a couple of people doing this now, because the ergonomics are all wrong, but some people are recording every moment of their lives. They walk around with a little camera. Only a couple of people are doing it, and it seems kind of nuts because the equipment is kind of bulky. But the equipment will shrink, and it's not just going to be centralized merchants or things like that. Anyone could be recording you; at any time people are going to be TiVo-ing their lives. So what are the limitations on, like, a person? Before you go into someone else's house, do you have to turn off your camera? It's not just going to be centralized. Everyone's going to be like this. Or so I've heard. I wouldn't want to have, like, you know. It's useful as an aid to memory. If you have your day recorded and you want to recall the conversation you had with your boss, you could rewind and say, oh, he said exactly X, Y, and Z. And that special moment, too, you can go back and replay with the full fidelity of a digital recording. So it's just a matter of the ergonomics getting just right, and then there's a little issue that a lot of people might have no idea how that would work. Isn't that the ancient question of where my rights end and yours begin? What's that? It's reminiscent of it. I mean, a lot of this seems familiar because, as Judith was just saying, it was also brought to you by the Media Lab; so many of those projects originated at the Media Lab. And it seems that it all depends. The real question is not whether you capture the data but what you do with it.
And my expectation when this data is being captured is an expectation tied to the intent of the person capturing it. So I would suspect that if you did something that somehow caused someone great embarrassment or some great loss, you've now crossed the line from securing goods in your retail space to doing something damaging to that person, and they're going to come after you. So it's all about intent and purpose, and expectations are tied to that. I mean, that's my naive, non-legal take on it. But there are also a lot of examples of this. You know, in France, when they were giving speeding tickets, nobody cared about the videotaping until the videotape cost somebody their job. Right, sure, yeah. It was the same with Google's Street View, because suddenly they had pictures with identifiable people walking into an adult bookstore, or license plates that could be read, parked, you know, where the prostitutes are, or whatever, stuff that had nothing to do with Google's intent. But certainly, I think that upset some people, who had the right to feel that they were not private but anonymous in that situation. And what happened with that? I mean, nothing, right? That's, I don't know. The images are still up. That's the problem, there's nothing: no relationship between those people and Google, like there is between Facebook and its users. Sorry? Those people don't have any relationship with Google the way Facebook's users do with Facebook. I mean, Facebook users mean something to Facebook, whereas random guys on the street mean nothing to Google. Right. So maybe we'll all just walk around in burqas. Right, the burqa is the fashion of the future. Oh, good. That sounds like tort law, I guess: if you get damaged by someone, then you could sue them in that case.
But it seems like you'd need a labyrinthine system of, I don't know what. Let's say you're a video blogger and someone just showed up in your footage without consent. Do you have to get the consent of every single person you publish? What do you do in that case? Wouldn't that cause a huge, wouldn't that just choke the legal system, if anyone could sue? Well, again, I guess it depends on the space, because when a newspaper publishes a photo of a street scene with identifiable detail, they don't have to get permission, right? Because it's in public. But if it were inside someone's home, they would. Video blogging seems similar, right? It must be in the same category. Audio recording seems to be treated totally differently, at least in the state of Massachusetts. There are very few places where you can legally be recording audio; even though you can have cameras all over the place, you can't have microphones, which I now understand makes total sense given where the real privacy issues are in a home. It's not in the video. It's all about the audio. I just brought an interesting quote that was on my calendar yesterday by chance. Yesterday I thought it was interesting, and today I think it's even more interesting. It's a quote in the New York Times, written by Charles Ferris, who was the chairman of the Federal Communications Commission, which says, and I quote: when my children's homes are wired, a computer will have a record of what they buy and how much they spend. It will know whether they pay bills quickly, slowly, or not at all, and it will know where all their money comes from. It will know whether they watched the debates or a football game or a controversial movie. In other words, it will know more about them than anyone should.
We can and should move at the outset of this information era to address the potential privacy problem so that it, in fact, does not become an actual one. What year was it? 1980. Ferris is actually still around. He was a staffer for Senator Mansfield; Mansfield was majority leader of the Senate in the 70s. And he's now a lawyer, a named partner at a major Boston law firm. It would be interesting to get him to come in here and discuss that quote, because he's a very plugged-in Democratic political person. It would be interesting to hear his perspective on that quote 25, almost 30 years later. One of the interesting things about that quote is how it places agency, because there's this ambiguity, or maybe not, in that case, about agency. The home is somehow the agent that is in control. The data. What's that? The data knows. Well, that's a very odd way to talk about data, right? As if data is self-aware. It's meaningless. It's, again, a question of who's doing what with it. Well, that's another talk; you could talk all about machine meaning. But this came up with our project as well. Early on, it was called a Truman Show, because it wasn't clear who was going to see the data, or Big Brother. A lot of the questions were who owns the data, who's going to see the data, but I don't know what it means to say it's the house. But the context of that quote was more likely, I mean, his context was more likely J. Edgar Hoover. Yeah, actually, when Hoover was the head of the FBI, surveilling people privately and then doing something with it, the data. Yeah. I think the agency is also still in the machine. More so. Sure. Okay, well, thank you very much. It's a pleasure. It's a pleasure. I have a question. I can't handle that question.