Welcome, everybody. I'm delighted you're here with us today. I'm Cliff Lynch, the director of the Coalition for Networked Information, and you've reached the concluding plenary session for the CNI fall 2020 virtual member meeting, which has been running for about the past month or so. This session concludes the two closing days of plenaries, and we conclude on a very happy and wonderful occasion. The Coalition for Networked Information awards a prize in memory of Paul Evan Peters. He was the founding executive director of the Coalition. He was enormously well-known, well-loved, a good friend, and he died suddenly in 1996. After his death, CNI and its then parent organizations, the Association of Research Libraries, EDUCOM, and CAUSE, the latter two more recently merging into EDUCAUSE, established an award in his memory. The key criterion for the award is really lasting, major impact on scholarship, on the world of networked information, and on the broader world. We give this award every year or two, and the awardee is selected by an award committee. Oh, yes, just to flesh this out, you can see how awful I am with slides, just to give you a sense of the kinds of folks who have won the Paul Evan Peters Award: you can see here that our awardee today is in good company indeed. This year's award committee consisted of Christine Borgman from UCLA, herself a previous Paul Evan Peters Award winner; Herbert Van de Sompel from DANS, who you heard from earlier in this meeting, another previous Paul Evan Peters Award winner; John Wilkin, the university librarian at the University of Illinois at Urbana-Champaign; and Joan Lippincott, who served on the committee before she became Associate Executive Director Emerita at CNI. I'm very thankful to all of the folks on the selection committee, both for their hard work and for making such a wonderful, wonderful choice.
We have selected Francine Berman, and Chris will tell you a little bit more about her. We are very lucky to have with us today Christine Borgman, who, as I said, served on that selection committee and is a prior Paul Evan Peters Award winner. And I will just say that I couldn't have been more pleased when the committee came forward with Fran as the selection. I speak for both John O'Brien, the CEO of EDUCAUSE, and Mary Lee Kennedy, the executive director of ARL, in saying all three of us were just delighted and terribly enthusiastic. Fran is an old friend and colleague; we've worked together on more things than I can easily enumerate. She's done a tremendous amount for our community and for the broader cause of open science and open data and many other things. Before we hear from Fran, who will give the customary Paul Evan Peters Award lecture, I've asked Chris if she would be kind enough to say a little bit about Fran, from the perspective of not just a friend and colleague but also a member of the selection committee. Chris?

Thank you, Cliff. It is indeed a great honor and pleasure to introduce Fran, who is indeed a longtime friend and colleague. I've also worked closely with her, as has Cliff, probably first when we were together on the Board on Research Data and Information and in the CODATA community. Fran really exemplifies all of the criteria for the Paul Evan Peters Award. She's made major contributions to scholarly communication through her leadership as the founder of the Research Data Alliance; through her work chairing, with Cliff as he mentioned, the Blue Ribbon Task Force on data stewardship; and through her service on the board of the Sloan Foundation, on the National Endowment for the Humanities' National Council, and so on. She's addressing the fundamental problems that we face in scholarly communication and technology, and she's greatly increased awareness of these problems in the broader field.
I think what's most important for this award is that she's taken her deep knowledge of computer science, having directed the San Diego Supercomputer Center and then been vice president for research at Rensselaer, and earned all these awards in computer science, becoming a fellow of the Association for Computing Machinery, the IEEE, the American Association for the Advancement of Science, and so on and so forth; you can read all the rest of her wonderful vita. But I think what's really exemplary is how she's taken that deep technological knowledge and used it more broadly. She finally got her first sabbatical last year, because she just wouldn't let up on all this other work, and went to Harvard as a Radcliffe fellow to develop work on the Internet of Things, so we're looking forward to the book and the work that comes out of that. But meanwhile, let us hear from Fran, the exemplary new recipient of this award. She's given us so much, and she'll give us more. Thank you, Fran.

Welcome, Fran. Congratulations. I wish you could hear all of the virtual applause that's taking place. Over to you.

Thank you so, so much. I'm still blown away by all of this, I have to tell you. I'm so grateful to CNI and ARL and EDUCAUSE for this extraordinarily meaningful award, and to Chris and Cliff, who have been colleagues, fellow travelers, and inspirations for so many years. It's so meaningful to have you both speak and introduce me. You know, any kind of impact takes a village, and there are so many of you, so many people in the community, who have been fellow travelers, and so many of you who are doing so much now. I'll mention some of you in the talk, but I am so grateful for what everyone is doing and the importance of it all. When Cliff called me and told me about this award, I was so surprised, and I started thinking about what I might want to say. As we all know, we live in pretty extraordinary times, and the pandemic has really exacerbated many of the things that we think about.
Our community, the community who understands and cares about data and digital technologies, is more important than ever in this time, and we're important if we want to make society thrive, get the best of digital technologies, and minimize their risks. So I thought I might talk about the first 30 years of this century: the decades that have gone before, and what's ahead of us. And I thought I would start with now. So here we are. It's about ten minutes after three, Eastern time, somewhere in the coronaverse. I'm wearing a dress for the special occasion, but you wouldn't know it, and I don't know if you're wearing shoes. We're meeting in cyberspace via Zoom. It's a really good time to start thinking about: how much data is there? What are we doing with it? How did we get from Y2K at the beginning of this century to Cambridge Analytica and beyond? How did information technology become critical infrastructure? And what happens, as we're seeing increasingly now and will certainly be seeing over the next decade, when data is collected everywhere and algorithms are in charge? All really important questions that I think many of us think about all the time. So let's start with the first decade; we'll do this decade by decade. You can think of the first decade of this century as the almost-famous data decade. What I mean by that is that data was driving everything, but it didn't have that kind of recognition and respect; it wasn't a first-class object. It drove presidential campaigns, especially Obama's campaign in 2008. That wasn't the first time data was used; back in 1960, Kennedy used behavioral science and data to craft a message on civil rights. But now it's used in every campaign. We all worried about Y2K. We all saw Facebook for the first time in that decade; we all saw iPhones for the first time in that decade.
Silicon Valley enjoyed incredible growth during that time, and it was driven by data, data which gave all of these companies a competitive advantage and became de rigueur for anyone who wanted to do anything. In medicine and health we saw the Human Genome Project, again all run on data and powered by data. In the supercomputing world, which I was very much a part of, there was the race to a petaflop, and within that decade we finally achieved petaflop computers. And where was I? I was at the San Diego Supercomputer Center. I came over from my job as a professor at UCSD to lead one of the two NSF national supercomputer centers in 2001, and a little bit later on, when the program changed, we became a founding TeraGrid node. Our mission was really service to the National Science Foundation community and beyond, so we had thousands of users, and it was really important for us to provide the services and things they needed. When the supercomputer center first started in the late '80s, it was modeled after the Department of Energy national laboratories and their supercomputing facilities. But as Dan Atkins and others had said so compellingly at the end of the '90s and the beginning of the 2000s, cyberinfrastructure was really the important thing for the research community as it moved on, and it wasn't just the computers: it was the software, it was the data, it was the portals. So we set about re-envisioning what SDSC was all about, and we decided that, in addition to providing high-end supercomputers, what we wanted to do was provide a holistic, data-oriented environment for everyone. So our vision evolved toward data-focused supercomputing. And we had some wonderful people at the center, Chaitan Baru and Reagan Moore and Phil Bourne and all kinds of people, dealing with important data activities.
But we really wanted to go beyond supercomputing and talk about cyberinfrastructure, and we decided that we would remake SDSC to provide a much bigger and more capable environment for doing data-focused work than one would have in one's local environment, one's institution, one's university or research lab, etc. We decided to stretch that idea out in all kinds of different ways. Maybe you could store a terabyte-sized collection yourself, but we wanted to provide help in storing a terabyte-sized collection. Maybe you could keep several collections, but we wanted to create a large, terabyte-scale archive so that any number of collections could be stored in a stable and reasonable way. Maybe you could store your collection for the life of your grant, but then you ran out of funding, so we wanted to provide some way to support a longer time frame. We wanted to provide computing capability, and oftentimes when we chose machines, and this was not all that popular, by the way, in a world that cared about whether you were at the top of the Top500 list, we traded flops for bytes. We had data-oriented simulation, analysis, and modeling, so we tried to architect and create machines that had more cache, more memory: you know, a better environment for data. And the most important thing turned out, as it always turns out, to be the people and the tools that help. So we really gathered people with expertise in data services, data software, curation, etc. This was the driving vision for SDSC and its amazing staff. What that meant is that we started creating and attracting all kinds of projects and new collaborators focused on data cyberinfrastructure: data storage and data services and data visualization and data management and data preservation.
You know, SDSC had, at any given time, about 100 projects or more, a budget of tens of millions of dollars (typically between 50 and 80), and hundreds of people, and all of those people were very involved in data-oriented activities. One of the most exciting things for me and for us at that point was a partnership that actually started in the office of the university librarian at UCSD, Brian Schottlaender. Brian is one of the people who really awoke in me such a passion for data, and for the interesting problems in the whole world of data stewardship and preservation. This is Brian's beautiful Geisel Library on the bottom, and our old, wonderful building, although it did have a view of the ocean, I'd just like to point that out. Brian and I worked together, and one of our first conversations was about the Storage Resource Broker and Reagan Moore and the ways in which the library and SDSC were working together. But soon those conversations really expanded; they certainly expanded my own knowledge, but they also expanded our partnership. One of the many things that Brian and I did together, something that I'm really proud of, was a project called Chronopolis. Chronopolis was a joint project between the UCSD Libraries and SDSC, and we had wonderful partners in NCAR and the University of Maryland originally as well. The idea was to decouple access and preservation. So we built a preservation data grid, which means that for each of the collections we had, different nodes would play different roles. Perhaps a collection would be made available to users at SDSC, but NCAR or the University of Maryland would serve as a dark archive. Similarly, the University of Maryland might be a bright archive for something, but NCAR would provide the dark archive.
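The bright-archive/dark-archive idea above can be sketched as a simple role map. What follows is purely an illustrative sketch in Python, not the actual Chronopolis software; the node and collection names are hypothetical.

```python
# Illustrative sketch only (assumption: NOT the actual Chronopolis software).
# It models the idea from the talk: decouple access from preservation by
# letting each node play a different role per collection. A "bright" archive
# serves a collection to users; a "dark" archive holds replicas purely for
# preservation. Node and collection names are hypothetical.

def replica_map(assignments):
    """Return the collections that are improperly replicated, i.e. that do
    not have exactly one bright archive and at least one dark archive."""
    problems = {}
    for collection, nodes in assignments.items():
        bright = [n for n, role in nodes.items() if role == "bright"]
        dark = [n for n, role in nodes.items() if role == "dark"]
        if len(bright) != 1 or not dark:
            problems[collection] = nodes
    return problems

# One collection served at SDSC with two dark replicas; another served at
# Maryland with NCAR as its dark archive, mirroring the examples in the talk.
assignments = {
    "ocean-observations": {"SDSC": "bright", "NCAR": "dark", "UMD": "dark"},
    "survey-archive": {"UMD": "bright", "NCAR": "dark"},
}

print(replica_map(assignments))  # {} -> every collection is properly replicated
```

In the real project, of course, these role assignments were backed by service level agreements between the institutions, not a dictionary in memory; that is exactly the social infrastructure point made below.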
One of the things we learned in creating Chronopolis, and I am so grateful to the Library of Congress and Laura Campbell and Martha Anderson for taking a chance on this and providing such great support, one of the things that was really interesting to me, is that it really brought to the fore the importance of the social infrastructure you provide when you have these kinds of relationships. The University of Maryland and NCAR and SDSC and the UCSD Libraries had relationships with each other, but we wanted to formalize them in some way; we wanted to formalize the trust, and the backup and replication we would have, through service level agreements. It really started me thinking about the importance of the social infrastructure that has to pair with the technical infrastructure when you think about stewardship and preservation of data, and it was very important for that project in particular. It's a conversation that I carried on with Chris Greer. Chris was at the NSF at the time, and he and I were talking a lot about how you think about preservation and access and stewardship in a way that creates some kind of economic stability, because in a sense you're really thinking about data as a public good, and we all know that it's tremendously difficult, oftentimes, to sustain public goods. So we put together, Chris from his side at NSF and myself as part of the community, with the amazing Brian Lavoie, who was co-chair with me, the Blue Ribbon Task Force on Sustainable Digital Preservation and Access. The charge to that group was to build a comprehensive analysis of arguably the hardest part, the Achilles' heel, of sustainable digital preservation, which is economics. What are the economics of the data? How can we sustain it? What are best practices? How should we think about it? What should we recommend for action?
We had a tremendously successful task force, which I'll talk about in a little bit, and we had an amazing event, which this picture is from. You'll see many old favorites, including Cliff, in a mode that we often see him in: thinking very, very deeply and then coming out with an amazing insight. And you haven't changed today, Cliff, so this is pretty amazing. The Blue Ribbon Task Force had a number of people on it, and we asked many people to testify and even more people for advice. In my opinion, it was a bunch of superstars: many names that are really recognizable, and many names of people who have thought really seriously about the economics of data and data as a public good, including Lee Dirks, whom we lost many years ago, and who was such an important champion and support to our community. Chris Greer started us off, and Lucy Nowell and Sylvia Spengler and Phil Ogden kept us going. Don Waters helped, and the Library of Congress and NARA provided in-kind support for us, and it was an amazing few years. We were very excited to look at all things economics, and of course one of the things we realized as we went through it is that there are really different environments in which data economics has to happen. One way to explain what's going on is that there are really many stakeholders in the data environment: those who benefit from the asset, those who select what to preserve, those who own it or have rights to it, those who actually do the preservation, and those who pay. Now, if you're Google, you benefit from collecting everybody's clicks; you decide which clicks you're going to collect; you own them; and so you're happy to pay the data bill for your own collection, which is your competitive advantage. There is great alignment between all of your stakeholder groups, and with that alignment, the economics somehow isn't a bad problem.
But if you're in the research data world, as many of us are, the community often benefits from the data that we generate or provide; the PI decides what to preserve; perhaps our universities or others actually own the asset; we or others may preserve the asset, but that may not go on much longer than the grant itself; and then the federal government pays. And that lack of alignment makes it often very difficult to preserve collections of importance to the community. So our crack team on the task force, including a number of economists, looked at this through four different scenarios: research data, as I've said, and commercially owned cultural content, but they also looked at it from the point of view of publication and scholarly discourse, and collectively produced web content. We had a couple of amazing years, with discussions that were extraordinary, and the group came out with a couple of reports. The amazing Amy Friedlander was editor for the interim report, and the amazing Abby Smith Rumsey was editor for the final report; both of them are incredible documents. I think now they're both on my website, and I believe on Brian Lavoie's website as well, but at one point they were on the SDSC website, and I have to say that SDSC told me these reports had been downloaded more than 120,000 times. So we really felt like we had made an incredible impact with the Blue Ribbon Task Force, and I'm so grateful to Chris and all of the other people who were involved in it in any way. And of course, that's what we were doing; what was happening in the rest of the world? In the late 2000s, the trickle of data that had started at the beginning of the decade became waves and really a tsunami. Everybody was talking about big data. NSF had started asking people: so what are you going to do with your data? The community was scrambling to create data management plans. Data was on the cover of a tremendous number of magazines.
The Large Hadron Collider was spitting out petabytes of data, many petabytes of data, every year. So data had really become a tsunami, and it brought us into the next decade, which is everybody learning to surf the data tsunami. That picture in the middle is a scene we see every day: we're all together, but we're all in our own digital worlds on our cell phones. This was the decade we saw Cambridge Analytica; people were using data both for good and for ill. It was a tremendous time for data. Of course, there was also the recognition at that time that with more data comes the need for more and better infrastructure. Many things brought us to that: first of all, the rise of small-scale devices, so it wasn't just the big computers, it was computers of every shape and size, including ones that you put in your pocket. It was the sophistication of cloud infrastructure, which had really started being a real thing. The government started recognizing that stewardship and preservation were tremendously important: we saw the Holdren memo in the early 2010s, which talked about research data, and we saw the government putting out data with data.gov. We saw an interesting set of studies in Nature which talked about all of the data that was missing because of insufficient infrastructure and stewardship and preservation. So, because data had become a first-class object, infrastructure was on its way to becoming a first-class object. And in that environment, there we go, we started thinking about, okay, what infrastructure? This just gives you a sense of why it matters from the point of view of the research community. No matter what kind of problem you want to solve, maybe you want to solve a public health problem: who's at risk for asthma all over the world? Where are you safest: LA or Mexico City or Arkansas or Malaysia?
You want to be putting together data from various places; you want to worry about interoperability; you want to worry about workflows. If you want to worry about how we increase agricultural productivity, you want to look at data on various crops and terrestrial data and weather data. If you worry about how accurate the Standard Model of physics is, you want to look at data from the Large Hadron Collider. Or, for what will happen in an earthquake, you want to look at seismic data and other kinds of things, or data on building structures and how they withstand it. So to solve those problems, which are really the focus of what you're interested in, you need a whole bunch of data building blocks. You need common metadata: if I talk about length and Diane talks about length, and Diane's talking about centimeters and I'm talking about inches, we know we have a problem. We need domain and institutional repositories for that data. We need to understand what's legal and what's not, and that gets into some of the privacy issues we start seeing as we go through this decade. What about data workflows? And again, sustainable economics, some of the most important social infrastructure data has. So there are a number of different data building blocks one needs, at least in the research environment, and they're often a little ad hoc, often maybe a little one-off, because our market isn't quite big enough. That's a discussion that many of us had been having for a long time, and it's a discussion that Alan Blatecky and Chris Greer were having, in their roles working for federal R&D agencies, with their colleagues around the world.
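The common-metadata point above (centimeters versus inches) is easy to make concrete. Here's a minimal sketch, with made-up records and a small conversion table as assumptions, of why unit normalization is the first data building block you reach for:

```python
# A minimal sketch of the common-metadata problem from the talk: two datasets
# both record "length", but one in inches and one in centimeters. Combining
# them blindly produces garbage; normalizing to a shared unit does not.
# The records and the conversion table are illustrative assumptions.

TO_CM = {"cm": 1.0, "in": 2.54, "m": 100.0}

def normalize_length(value, unit):
    """Convert a length measurement to centimeters."""
    return value * TO_CM[unit]

fran = {"length": 10, "unit": "in"}     # recorded in inches
diane = {"length": 25.4, "unit": "cm"}  # recorded in centimeters

# The raw values (10 vs. 25.4) look wildly different; the normalized
# values show the two measurements are actually identical.
print(normalize_length(fran["length"], fran["unit"]))    # 25.4
print(normalize_length(diane["length"], diane["unit"]))  # 25.4
```

The real interoperability problem is, of course, much harder than unit conversion: it's agreeing on what the fields mean in the first place, which is why shared metadata standards matter.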
By that time Chris was at NIST, and Alan Blatecky, who had worked with me at SDSC, was at the National Science Foundation, and they had been talking with colleagues in Europe and Australia and Canada and all over the world about data infrastructure and how they could empower the community: not just the research community, but the community developing the building blocks that the researchers needed, the community of maintainers, the community of data infrastructure developers, etc. One of the things they came up with in those discussions was something called the Data Web Forum, and they wrote a concept paper. Now, by then I had repotted myself from the San Diego Supercomputer Center to RPI, where I was vice president for research, and because we were all friends, they kept sending me various versions of the Data Web Forum concept paper, and I kept saying, what about this, and what about this? By the way, CNI, I think, is the only place that has a copy of this concept paper, so thank you, Cliff; if you're interested in it, you might want to take a look. By the time we were through with our conversations, I was very engaged with the concept myself and very interested in the whole activity. And I missed my community, because when you're vice president for research, you have to love all of your domains equally, and I missed being with the data community at the time when data was finally a first-class object. So I stepped down from being VPR back into my regular professorship, and I decided I would help out by helping to co-found the Research Data Alliance, which is what the Data Web Forum became; we renamed it, having decided this was a better name. And so, in August 2012,
I was on the phone with seven other colleagues. Beth Plale was a colleague from the United States, and we had colleagues from Finland and Germany and the UK and Australia. By the spring of 2013, we had the first RDA plenary: 250 people showed up, from about 40 countries, I think, and it has grown to a community of over 11,000 today. I have to tell you, it was thrilling getting to do, essentially, an international nonprofit startup, and to do it when you really want to focus on impact and outcomes. RDA grew from those days into a community-driven organization dedicated to the development and use of infrastructure for data sharing and data-driven exploration. And we got to create our own culture, so we tried to create a culture that would really enrich and elevate the data community. Our organization was very pragmatic: the idea was to solve targeted problems and make tangible progress. We worked on problems that somebody had, but everybody didn't have to have the same problem. Over time, our members took on one of three roles: a member of an interest group, which was interested in framing the kind of infrastructure that was needed; a member of a working group, which was interested in building, in roughly a year to a year and a half, the kind of infrastructure that would be used by someone; or an adopter, someone who actually used it. All infrastructure that's developed in the RDA needs to be used by someone, needs to be adopted, and needs to make someone's life better. There was always a focus on impact and outcomes: no build-it-and-they-will-come infrastructure was allowed; it had to solve problems. And it was very important for us to amplify the usefulness of that infrastructure through further adoption; there are lots of programs in the RDA today that do that.
Maybe most important, I think, is the role that RDA has played in helping build a healthy and thriving data community. One thing that I'm tremendously excited about and proud of is that diversity has always been a priority in the RDA. It's perhaps the only organization I've ever been in, as a woman in tech, which has half women in leadership at all levels. And it's not just gender diversity; it's diversity of professional age: we have a lot of early-career people in leadership, a lot of people from different countries, a lot of people from different professional places. So RDA has really been a place where people mix, and they mix in a really useful way. RDA has really elevated the recognition of infrastructure and of the maintainers of infrastructure, who are really important people in our community and often don't get nearly enough credit. Another thing I really loved about RDA is that there was no focus on world domination. The idea was to partner with other kinds of organizations, to focus on enabling outcomes no matter where they came from, and to really improve the community. It was really thrilling to be with a group of very dedicated people who made RDA an amazing organization. I borrowed this from the RDA website; they always have all kinds of handy-dandy statistics. This is pretty wonderful: it shows you the growth over time from several hundred to over 11,000 people in over 140 countries. A little more than two thirds of them are academics and researchers, but there's also a fair number in public administration, industry, journalism, etc. And we typically have roughly 100 working groups and interest groups working on stuff.
You'll find the outputs there: outputs that are useful to librarians, outputs that are useful to researchers, outputs that are useful to publishers, and all that kind of stuff. It's just a great organization, and because I'm a builder and I love building things, it was a wonderful and exciting time. We had a succession plan, and I handed it over; today Rebecca Koskela is leading RDA US, a really wonderful set of people are leading RDA internationally, and the organization keeps going, so I'm just so proud and supportive of what that whole community has done. While RDA was worrying about infrastructure, everybody else was too, and elsewhere in the 2010s you saw a global recognition of the value of data for just about everything, and of the importance of infrastructure. Of course, it's hard to find a company these days that doesn't try to use data as a competitive advantage. In academia, there was the rise of all kinds of really interesting programs in data science, which raised some interesting questions: where do they live? Statistics, computer science, somewhere multidisciplinary? What do we teach in them? Okay, we might teach machine learning, but what about ethics and other kinds of things? So we are still in a really interesting time of experimentation, and of course the Sloan Foundation and the Moore Foundation have done a lot in terms of promoting and furthering the pedagogy of data science. In government, we saw data.gov, especially during the Obama administration, and one expects that during the Biden administration we'll see a lot more focus on digital technologies, and I think that's going to be really important. But none of those hold a candle to what we're seeing in our real lives, if you think about community and communications: social media, virtual reality meetings, etc. We are awash in it, especially during the pandemic.
So that brings us to the next decade, which is kind of the Internet of Everything. We have smart cities; if you had an avocado for lunch last week, it may have been grown on a smart farm, where each of the plants has sensors that tell the farmer precisely how much water or nutrients or fertilizer it needs. It may not look like that, but it's probably connected to the Internet in one way or another. So I thought I would talk about this decade starting with where we are today, in the coronavirus era. This is my cyber world, circa 2020. Like the rest of you, I am staying put, wearing my mask when I go out, and trying not to get coronavirus. I just finished teaching last week. This is my class; I've tried to hide their names because, well, privacy is not a bad thing. But this is how we taught all semester: we talked via Zoom, they turned in digital assignments, and I gave them digital corrections on them. Every time I get out in my car and go hiking, I pass through an E-ZPass sensor, so my car is tracked by the Internet. I buy things via Amazon Prime. I got my groceries yesterday via Instacart. I entertain myself via Netflix. I'm on my smartphone a million times a day. And those who know me well know that I'm a serious ballet student, which is not to say a good one. This is how I take my several-times-a-week ballet lessons. This is my teacher; my leg does not go this high, I just want to point that out. But that's the screen that I see, and the screen that my teacher sees is the screen behind her. I am one of those boxes, I don't know if you can see me, and she corrects me over the Internet from 3,000 miles away. And my pirouettes have gotten a little better over time. And I'm doing this in my living room. Now, that's my cyber world, but your cyber world might look different. Maybe you're driving a much fancier car than I am, one which can drive itself sometimes and is certainly connected to the Internet.
Maybe you have a much smarter home than I do. Maybe you use an Alexa, or you have a connected Roomba which vacuums your house. Maybe you have a smart refrigerator or a smart toaster or a smart coffee maker or something like that. Maybe you monitor your health via connected devices; maybe you have a Fitbit for when you run, or, if you have a small one at home, a baby monitor that's connected to the Internet so you can do all kinds of analysis of how they're sleeping. Maybe you have an implantable pacemaker or an implantable connected insulin pump. And of course, everywhere you go you see surveillance cameras. So this is the cyber world we live in, and all of us, whether you take ballet lessons online or go through E-ZPass or drive a cool car, all of us are part of the Internet of Everything. And the interesting part of the Internet of Everything is that, with the pandemic, all of these digital devices, which not that long ago were optional, have become, if they weren't before, critical infrastructure. I work via the Internet. My students go to school via the Internet. The Internet has become a critical part of just about everything we're doing these days. And it's not going away. If you look over the next decade, connected technologies are going to become more and more ubiquitous and unavoidable. Cisco estimates that there are about 50 billion devices connected to the Internet these days. That's more than six devices for every human on the planet. It's going to be pretty extraordinary. The video surveillance market is in the billions. The economic impact of connected devices is in the trillions. We expect all cars to be self-driving by 2050. We're just at the tip of the iceberg. It's an extraordinary time we live in, full of incredible opportunities and incredible risks. What's it going to mean for the data community?
And it turns out that it's also going to be the generator of incredible amounts of data. To navigate this brave new world, we'll need information to assess the objects and systems and devices that we have. How do I know if I'm safe or secure? How do I know about the sustainability of these devices and systems? We'll need data on their operation. We'll need to know where that data should reside and who can access it. We'll need data that helps us determine accountability and liability and responsibility, and to determine ethical outcomes; that data will come from the devices, but it will come from a lot of other places as well. We'll need data to allow us to understand what's happening with these systems, for there to be transparency. So what kind of information do we need? Take your average self-driving car. They're not completely self-driving yet, but many of them can drive, in many instances, all by themselves. Typically, today's prototype self-driving cars are generating four to six terabytes daily. That's an immense amount of data. And as we try to understand them, as we have them in fleets, as we have them connected to one another, they're going to be generating even more data. Now, for us in the research world, this creates all kinds of new and interesting research problems. How do we do open science, or make data FAIR, or do reproducible research in these dynamic environments? They're decentralized, they're heterogeneous, the data is of different types and owned by different people; how do we do any kind of work there? For the many, many systems that are now more and more autonomous in decision making, how do we create representative training sets that don't get things wrong and put us in the wrong space? How do we make sure that our decisions trend toward unbiased and ethical outcomes rather than biased, unethical outcomes?
How do we architect innovative technologies to promote the public interest? Facial recognition can be wonderful to keep us safe, and it can also be highly intrusive. And those are social decisions, right? Facial recognition is just math; biometrics is just math. So how do we decide when it's appropriate to use them and when it's not? And of course, when all of this becomes critical infrastructure, a must-have in order for us to be citizens of our society, or students in our schools, or workers in our companies, what extra rules are important? None of this is easy-button stuff, none of it. What we're finding is that social constructs, social infrastructure, are absolutely needed to promote the public interest. Without social infrastructure, things can go a little crazy. We've seen people hacking baby monitors and screaming at babies; we've seen crashes and catastrophic failures of self-driving cars, and Alexas with bugs that shared information they shouldn't, and cyber vulnerabilities in pacemakers, and a whole national discussion on facial recognition. What that means is that we really need to couple the technical with the social. We have this wonderful capacity for innovation. We have to make sure that the innovation is good for us. And we do that by creating the correct kinds of social infrastructure, infrastructure that promotes the public interest, not primarily private interests. Oops, I'm going to go back now. Thank you. And so one of the things for us to recognize, I think especially in our community, is that social and policy controls have technical implications. When we look at the public interest challenges: which protections should we have? Privacy, safety, security? What does that mean? When should public interest prevail, and when should private interest prevail? Who should own or have rights to data?
If you're the subject of the data, can you control what happens to it? Is that okay? Who creates the standards and the policy? And all of that, at the end of the day, really needs to be translated into technical infrastructure: access control policies, the way we collect metadata, the way we create the architectures of the services and devices we have. You know, when you go to Disneyland, if you want to ride the roller coaster and you're not tall enough, we already know it's not going to be safe for you; we know that maybe you'll get thrown out past the seat belt by the force of going around the corner. There's a sign, and the sign says: you must be this tall to ride the roller coaster. And really, what we need now is the equivalent of "you must be this tall to ride the Internet of Everything." We need to know when it's possibly going to hurt us, when it's possibly going to help us, and what we can do to make it safe and secure for us. So in the end, it's all of our responsibility, and this is a decade that has just begun. It's a decade in which we, this community, and all of the people we work with, can really make a difference. The government needs to take the lead in creating policy and legislation and enforcement mechanisms for personal protections for the public. That's what it means to support the public interest. But business can then take those protections and design them in, in really innovative ways, in terms of products and services. They can make those products and services more transparent. They can support safe practice. Those of us in academia: it's really important for us to be training the next generation of leaders and the current generation of citizens. We need to be talking about the social implications of technology. Everybody needs to know about these things if we're going to live in our time.
What does it mean? Is 5G a good thing or a bad thing? Is net neutrality a good thing or a bad thing? Is data privacy a good thing or a bad thing? What about cybersecurity? These are things that affect the world we live in, and it's really important for all of us, not necessarily to know the guts of them, but to know what they are and why they're important. And of course, as citizens and members of the public, we need to take control of what we're consuming. We need to ask before we buy, and sometimes make decisions not to buy things that aren't safe for us or that aren't private for us. We need to protect our data. And, most importantly, we need to speak up: talk to the people who can make policy and legislation, and provide feedback and votes. You know, if we looked at the primaries for this unusual election we were just in, we had many, many candidates, and their attitudes about technology are important, because the people who run our government can make rules about technology that make it easier or harder on the rest of us. So it's everybody's responsibility. And with that, I wanted to say a little bit more about all of the amazing people I am so grateful to; forgive me if I have missed anyone, because I do have pandemic brain these days. But I have to say that so many of you have been so important to me: conversations, support, partnership. The great month that I spent with Chris at Harvard a couple of years ago. The great partnership I've had with Cliff and so many others. It's really been so important to me, and so I thank you all, and I cannot tell you how blown away I am at receiving this award. Thank you so much. You've got to unmute if you're going to clap. Thank you, Fran, that was just tremendous. It really was. It reminds us of so much that's happened over these first two decades of the century. The look ahead is fascinating and scary, for sure. I'm really just kind of floored by the whole thing.
One issue that I'm kind of curious about, and that I worry a lot about lately, is resilience, as we become more and more dependent on all of this technical infrastructure as critical infrastructure. There's a tendency to want to optimize for cost, rather than necessarily optimizing for resilience. And I wonder if you have any thoughts on that, particularly in the context of the Internet of Everything. That's a really great question. And I think it's a hard nut to crack, because the solution is shared by all of us, but the problem is also shared by all of us. It isn't just the private sector being mean; we don't want to spend more for goods and services, and we want them to come to market as soon as possible. And so, in some sense, there's no incentive for the private sector to spend the extra time making sure of their security, or not to be taking data as a competitive advantage because other people are, or to provide something that we are okay with. And that's why, in my own mind, I think it's important that government take the lead, because government can say you have to maintain certain standards. If you think about it, although nothing works perfectly, for sure, if you think about food, about drugs, about our work environments, the government has set standards about what's acceptable and what's not. We have the FDA, we have OSHA; we have a number of different government agencies whose job, ostensibly, is to keep us safe. If I have a building company, it costs me more money to get asbestos remediation gear for my people, so why would I want to do that? The government requires me to do it, because working with asbestos is really unsafe. The FDA has rules about food; the Department of Agriculture has rules about food.
And so why wouldn't we apply the same kinds of rules to cyberspace, where things could be just as dangerous to us, in a different way? I do think we need to come to a time where we do. They won't work perfectly, and things will still be hackable, but we will be a lot safer if we architect things so that they give us basic protections in cyberspace, and I think that's tremendously important. I mean, there's the sort of discussion about, well, should we really have network-connected devices with no provision for updating the software on them? They're right out there; we've seen that story. And another piece of that, which I didn't talk about in this talk because it was long enough, is the notion of sustainability. If you think about it, if everything has a computer in it, and those computers are using rare earths or lithium batteries or all kinds of things, and then we throw them out, we have to worry about the kinds of materials that are being used and their depletion. We have to worry about e-waste, which is mounting and not even counted by every country in the world. And so our success in cyberspace may hasten the unsustainability of the physical world, and not just our social world. So you really have to think about it in a very holistic way. And I think that's where you need adult supervision somewhere in the system, and, oftentimes though not always, governments have played that role. Sure. I have a couple of questions coming in, but before I get to those two, I want to give Chris an opportunity to reflect for a moment on Fran's wonderful talk. Sure. Thanks, indeed, that was wonderful, Fran. Let's see, am I on yet? Yeah, you're good. Excellent. I think, you know, seeing that arc and seeing how much has evolved over this period of time.
One of the things I was going to ask you to reflect a bit more on, something we're both concerned about, is how the whole notion of open data has changed over the course of this time. At the beginning of this period, open data was the next best thing that was going to revolutionize the world. And now we've seen all kinds of risks and challenges and problems and unexpected commercializations in other sectors and so on. I want you to reflect a bit on how the notion of open data has evolved, in terms of what should be open to whom and when and why, and with what restrictions, over these several decades. That would be a good conversation to move along to. Yeah, I'm happy to respond to that, but I want to point out that, as one of the world experts in things like this, I'd be interested in hearing your thoughts as well, Chris. The interesting thing to me about open data, especially for us in the research world, who have really leveraged open data in a really important way to make important breakthroughs, is that today, with data kind of awash everywhere and the private sector being a huge driver, it's hard to know how the work we do in the research environment and the proprietary data that's generated by the private sector can mesh in a reasonable way. So think about the jobs that both you and I have had, which is to train students who will go out in the world and do something important, either in the private sector, government, or academia. When we think about the kinds of things we teach and the kinds of research they do, we want them to have environments that are similar to, and representative of, the environments they'll deal with in the world. And open data has been one way that we've been able to do that. But I think as we start looking at the problems that we're going to increasingly see when the private sector has so much data...
And right now so much of it is proprietary. I think we're going to have problems even understanding the kinds of problems, certainly at scale, that a Google or an Amazon or an Apple or a Microsoft or a Pfizer, any of these companies, is going to have. And so I kind of worry about open data in terms of what's going to be open, under what circumstances it's going to be open, how it impinges on all the discussions we're having around privacy, and all of those kinds of things. Yeah, that was such an interesting question, Chris, and you know, you talked about some of the well-known issues around algorithmic bias, which actually is a term I hate, because what it really is, is you train something on data that reflected bias, and the algorithm did exactly what it was supposed to do. I wonder, in some ways, whether we're not going to see open data become a great source of bias, in the sense that, well, if that's the training set you can get, that's the training set everybody will use. If you look, for example, at the things that have gone wrong in facial recognition, part of the issue there has just been that there have been these big training sets lying around that people use over and over again. I think this is a pretty profound set of questions you're raising there. Before we run out of time completely, I want to get to two comments and questions that we got in here. The first is from Roger Schonfeld. Fran, thanks so much for this wonderful talk. Congratulations. The world today doesn't just represent the shift of data and technology to being ubiquitous and critical, but also commercialized. Even if we think not about our consumer world but about scholarship, in key fields, for instance in the social sciences, it's clear there's extraordinary value in data held by commercial organizations.
You talked about the importance of social infrastructure in balancing public and private interests as data becomes more valuable. Could you say a little bit more about the impact of the commercialization of the data environment? Particularly in academia, perhaps: how is data commercialization affecting scholarship? And if we need to do something to course-correct here, do you have thoughts on what we should be doing? That's a fascinating question, and in a way it brings up some of the questions, which Roger knows well, from the beginning of all this, right? In the beginning, when we started worrying about, well, should we publish the data with our publication? Remember those arguments from so many years ago, because how else would you try to reproduce things or understand what's going on? And today it's taken on a more mature but really difficult set of questions, which I think always gets to the heart of: what's the economic model that's going to work? If you commercialize things, will that be a better economic model or not? And what is the role of publications these days? Of course, you and Chris and Roger probably have a much more sophisticated answer than I do, but I do think it's really curious that we've given time and effort to thinking about these issues for many years, and in some ways we still don't have a really good answer to it all. And if you look hard, you can often find alternative ways to do an end run around publications, just like we used to do an end run around music companies using Napster. Right? But we still, I think, don't have a good answer; perhaps Roger knows a good solution to this, but I don't know where commercialization should go with this. What do you think, Chris?
Yes, I think it's a matter of governance, which is what a lot of data questions come back to, and certainly an issue that BRDI and CODATA and others have dealt with. Whenever you've got something that looks like a common good, it is subject to free-rider problems, which this certainly is, and you've got to have some way of building a governance model. That's where we've not gotten far enough in the economics, in finding a good governance model. And I think these data run across the line of what's public and what's private, and that's one of the big barriers to researchers sharing their data: they're concerned about who's going to use those data, who's going to exploit them for purposes they did not foresee, and who's going to take the benefit from that down the line. So the governance piece is part of what I was leading to with the open data question earlier. One of the areas I worry about, as commercialization of data creeps into the scholarly publishing sphere, for example, is behavioral data and what uses it can be put to. In other words, it's not so much the scholarly record itself, but the information about the interaction of perhaps specific individuals or groups of individuals with that scholarly record, and how that can be monetized. The economics of this are really odd, in a way, because things that are ostensibly free really aren't, and things that ostensibly cost don't always cost. And so I think we have a very non-transparent and very confusing environment that's sometimes dangerous. What's that old warning? If the product's free, you're the product. Right. Let me move on to another comment and question here. This is from Don Waters. Fran, congratulations on receiving the award, and thanks for such a thoughtful talk. In light of the comments about public policy and economic goods, this question is about the size and concentration of the big commercial data gatherers.
Google, Facebook, Amazon and friends, and the need for regulation of their activities, including the current antitrust activity. Do you have a view on what kind of regulation is most urgently needed? That's a great question, because we're living that now as it plays out, aren't we? When I look at that, I do think that the way the large-scale companies kind of snuff out smaller folks is not a good thing. I understand the whole focus on monopolization, and it will be interesting to see, in the next administration, what the FTC and the FCC and all of the various players think about all of this. But I also think it's really important for us to understand that that doesn't necessarily mean companies will be more protective. If you break up Facebook, if you have 100 companies instead of one company, it doesn't mean you're going to have more privacy. And so I do believe that we need to think about privacy and security and safety and other kinds of digital protections, not just as part of an antitrust activity. I think we need to set some bars, so that we can expect that products and services will have a certain level of cybersecurity. I think we can make rules about who can control data, and when; we should know when it's shared or exchanged with other people, and we should know what's being collected. I don't think GDPR straight out is going to be something that will work in this country, because we have a very different culture, but I do think we should be thinking seriously about our own version of what a GDPR would look like, because it's really important for us to think about basic protections. It's not just a monopoly issue. It really is a protections issue; it is a public good issue. It's an issue about how we can all prevail.
You know, many of us are yearning to go to a foreign country at this point, because we'd like to travel anywhere; you never thought you'd hear yourself say that. And when you come back through customs, they ask you whether you've been on a farm, right, to see if you're bringing various diseases into the country. Now, it may be that privately I don't really want to share where I've been; I don't want to say whether I've been on a farm or not. But I tell them whether I've been on a farm; it's the only way I can get into the country, and I want to serve the public good, because I don't want to be, you know, someone who brings in things that are bad for crops or whatever. Those are the kinds of standards that we need to set, standards where we promote the public good as a first-class object. Yes, I think the monopolies are really good things to be looking at, because I do want a thriving environment where people can be innovative with different ways of creating competitive advantage and different business models. Perhaps I'm willing to pay for more privacy rather than take my ostensibly free products, and those innovators shouldn't be crushed by the people giving me the ostensibly free products. But I don't think that's going to get us all the way there. Right. Well, Don thanks you for that very helpful response. I think it's about time for me to once again congratulate Fran, and thank her for a superbly thought-provoking Paul Evan Peters Memorial Lecture. And thank Chris also; it's not often that you come to one Paul Evan Peters award and get two Paul Evan Peters award winners at the same time. It's so good to see both of you. I'd like to take a moment to thank all of the members and guests who've been with us through this conference. I hope that you found the sessions useful.
I hope that you find opportunities to enjoy some of the pre-recorded sessions, or recordings of some of the sessions you weren't able to get to. I thank the team at CNI for all their help making this happen so smoothly. It feels a little bit weird to wish everybody happy holidays in the COVID universe, as Fran puts it, but I wish everybody safe times and better times ahead, and I look forward to seeing you all in the near future. Thank you so much, Fran. Thanks for joining us today. And with that, I declare our fall 2020 virtual meeting to be closed.