All right. It looks like we've got pretty much everybody here. For those of you I haven't had an opportunity to meet yet, I'm Cliff Lynch, the director of CNI. Let me start by welcoming you to the fall 2016 membership meeting. As you may have noticed, we have record or near-record attendance here, and I'm just thrilled to see all the folks here. I'd like to extend a special welcome to our international visitors and members and to those who maybe had a little bit of weather excitement on the way here. I hope there weren't too many, and it wasn't too exciting. But I am very glad you've made it through. I need to say a few things before I get started on my main remarks. First off, I'd like to note that we have a number of new members who are joining us here. I'm going to read them off, and I'd invite you to join me in welcoming them. We have York University; LYRASIS, which some of you may know is, among other things, the home of the CollectionSpace and ArchivesSpace open source software programs; Colgate University; the University of California at Merced; Vassar College; the University of Sheffield; San Diego State University; the University of New Hampshire; Crossref; Rensselaer Polytechnic Institute Libraries; the University of Mississippi Libraries; and Texas State University. Welcome all. A couple of program notes. If there are changes to sessions, we will be posting them on the message board in front of the registration desk. You will note some changes in the overall layout of this meeting. In response both to a huge number of session proposals that we wanted to accommodate and to feedback from you over the last couple of meetings suggesting that there were a number of presentations that would fit very nicely in half-hour slots, we have introduced sessions of varying lengths: half an hour, 45 minutes, and 60 minutes.
We have had to tighten up a couple of the breaks, which is why it's important that we be done at 2:15, because the first round of breakout sessions starts at 2:30. And you will find those session slots are of different lengths. In a few cases, we have taken two half-hour presentations, generally on the same broad topic, and put them together into a one-hour slot. For speakers in those, I would especially ask you to stick to your half hour so that the second session can get its allocated time. Ben Shneiderman, who will be giving our closing plenary, has indicated that he's willing to be at the registration desk tomorrow morning during the 10:30 to 11:00 coffee break if you would like to get him to sign copies of his wonderful book, The New ABCs of Research, which will form the basis for many of his remarks at the closing plenary tomorrow. Now, I also want to clarify two things about the closing plenary. First off, if you were reading the overall schedule, you may have had a twitch of deja vu. We managed to put in Julie Brill's title from last December where Ben's title, which is The New ABCs of Research, should go. The abstract on the abstract page for Ben's presentation is, in fact, correct. Also, just to be clear, because the schedule is kind of ambiguous, we will start the closing plenary with a half-hour special briefing by Robert Kahn. And then we will go to Ben's closing plenary talk, which will run about an hour. So that's the agenda there. I think that is all that I want to say about logistics, other than to ask you to join me in congratulating two wonderful new Paul Evan Peters fellows that we recently announced. These are Meg Young, a PhD student at the University of Washington, who is working on data privacy and municipal government, and Kristen Matteucci, a master's student at Rutgers, whose interests are centered on outreach and engagement in communities for library and information services.
And you can find more information about both of these folks on our website. I just had to grin when I read the descriptions of their interests in research. They were great choices from among a very strong field. These are things that would have so resonated with Paul Evan Peters. So my congratulations to them. OK, so let me get started. I basically want to do two things. I want to talk about what has been a very busy year for CNI and some of the things that we've got done and will be carrying forward that perhaps some of you may have missed or want to talk about a little more. And then I will conclude with comments on six areas, and I'm going to have to go fast here, where I am spending some time and we're doing some work, and which I think represent important new areas. I have very little to say about overall developments. I think all of you know this has been quite a year of tumult, surprises, and so forth. I will note only that I made some remarks in the December 2015 plenary about data dumps, provenance, integrity, disinformation, and similar sorts of things that are feeling almost creepily prescient at this point. And I continue to watch the news with great interest. These are obviously going to be issues of ongoing importance. But let me turn away from those things. I will simply note that both in the United States and among our colleagues in the UK and the European community, there is great uncertainty at this point. Basically, nobody knows how a lot of things are going to play out. There just isn't any information right now. And we are clearly going to be operating in a very adaptive mode. You may have noticed a couple of days ago they decided they'd keep the government open until the end of April. So that's the kind of world we're in right now. OK, but let's turn from that to what I think has been a really good year for CNI.
We have been doing a tremendous amount of work on digital scholarship centers, which Joan Lippincott has been leading, and connecting that up to areas like humanities at scale and strategies for supporting them. We did a wonderful planning workshop along with our colleagues at ARL late this spring, and we are looking at doing another one in 2017. There is a workshop report. There are some executive roundtable materials out on humanities at scale. And we are continuing to work in this area, which seems to be gathering significant momentum among the members we talk to. I want to note specifically in the area of executive roundtables that basically, due to my own personal failings, we had not been doing a very good job at staying up to date on reporting on those. Over the summer, we caught up on all of these, and we are current now. And we have made a pretty strong commitment to try and get the roundtable reports out within a month or two of our meetings to keep them timely. So you can look forward to that; I really think we will do better on that in the future. Speaking of roundtables, our roundtable for the spring is going to pick up on a theme that I've been doing a lot of thinking about, and did a long interview on earlier this fall: rethinking institutional repositories. And I think it's really time to do this in light of a number of things. One is the way in which open access or public access funder mandates are unfolding, especially in the United States, which don't really seem to be placing much emphasis on institutional repositories. Another is the growth of disciplinary pre-print servers and repositories, which are really starting to get a second lease on life. We have, of course, the tremendous model of arXiv, which started at Los Alamos and then moved to Cornell, and which continues to move from strength to strength.
But just, I believe it was Friday or Thursday, we had the formal launch of the social science pre-print server, which is running on a new platform that the Center for Open Science has made available. We are starting, finally, to see some modest but genuine uptake of pre-print servers in the life sciences. So I think it is fairly well understood at this point that pre-print servers that are disciplinary in nature seem to be more effective as centers of community than highly distributed pre-print servers, where they exist. We're also seeing some of the major humanities organizations step up in this area, or at least begin to. Finally, of course, we have the whole growth of research data management and the very mixed job that we are doing in establishing sustainable disciplinary or funder data repositories, as distinct from article or pre-print or post-print kinds of repositories. So I think there's a lot to be thinking about there, especially as new generations of software start coming online and we start struggling with how these integrate with other parts of the ecosystem. Rethinking institutional repositories will, by the way, be the topic for our spring executive roundtable. So look for the call for participation for that if you'd like to talk about it. One of the big themes that we have been spending time on for the last year, and that I think is going to inform our work for at least the next two years to come, is getting a better handle on the new portfolio of roles for libraries when viewed inside the university context. And I really mean looking at this contextually. There are a lot of new services that the university leadership, the faculty, the staff, and the students are all looking to the libraries to provide or contribute to, ranging from new kinds of teaching and learning to research data management to various kinds of publishing and data dissemination.
We are seeing a fascinating new interest starting to emerge at some institutions in looking more holistically at institutional assets, like museums, herbaria, libraries, and archives, and in trying to bring all of this into the main line of teaching and learning and research at those institutions, to make discovering material across these easier, and to bring greater rationality to the way they're supported and built up and sustained. We had an amazing roundtable this morning that looked specifically at collaborations among IT, libraries, museums, and archives. And I think you're going to find the report from that to be very provocative reading. I certainly came away from it with a lot of insights. But one of the ones that really struck me strongly was that this is another challenge for libraries to move beyond classic bibliographic description and into a much broader set of responsibilities and roles about structuring and maintaining knowledge broadly across a wide range of contexts and fields. And that's a theme I think that we're going to see in a number of places that I discuss this afternoon. But it is definitely an area that is moving some people and some organizations well outside their traditional comfort zones. Very, very interesting development. Many of you know that CNI was a co-sponsor, along with a number of other organizations, of an invitational meeting of university libraries and art museums that was hosted at the University of Miami in January. The report of that is available. You can find it on our website, and it also is very, very provocative reading. And we will be continuing to build on these discussions going forward. Another role which we have explored a bit, for example, in an executive roundtable a couple of years ago, is the new role of library as publisher. And this spring, we took another look at this in partnership with the AAUP and ARL at a meeting that was hosted by the University of Pennsylvania. And here, I'm sorry, by Temple University.
What am I saying? Here, we brought together, I believe it was 22 university libraries and the heads of their university presses, and the distinguishing characteristic here is that the press now reported to the library. So it had been mainstreamed right into the center of the campus activities. And we talked a bit about how that was going and what it meant. And it was really clear that there's still a wide variety of levels of integration: everything from the press continuing to be an utterly autonomous thing that just happens to report to the library, all the way through efforts to much more genuinely begin to integrate and find synergies and collaborations between the two institutions. And I think there's going to be a lot more to look at there. Certainly, there has also, particularly in the monograph area, been a tremendous amount of good work, much of which has been reported here, on trying to understand cost models for the production of university press monographs, which is a very important basis for making strategic progress here. I will just highlight one thing that wasn't much discussed at the meeting in Philadelphia, and that I came away really struck by. If you look at scientific journals and scientific journal articles, and you actually talk to the scientists, the scholars, who genuinely ultimately own the system and are served by it, there is a broad consensus, I would say at this point, that the ideal end state is some form of open access there, as long as it's not too inconvenient for the scientists. But there is at least that agreement that there are many good benefits and that that's a desirable outcome. If you ask the same question about monographs, and you can try and do this within disciplines or across disciplines, but broadly speaking, this is mostly humanistic and some social science disciplines as a first horribly crude approximation, there is no consensus I can detect on the desired end state. Should they be open access immediately?
After a few years? When they go out of print? Well, but digital books never go out of print anymore, kind of. They may go out from under contract, but they tend to sort of stay available forever unless someone does something deliberate. For old print books that have gone out of print, there seems to be a reasonable consensus that it is a good thing to bring those back as digital things and make them available. And a number of university presses have programs where they work with willing authors to do that now. But I do think that it's going to be important for us to collaborate and help the scholarly communities that rely most heavily on the monograph to begin to achieve some rough consensus on this, or at least get a better understanding of the alternatives and the pros and cons. Because without that consensus, I think that our efforts to rethink practices in this area are considerably handicapped. In the coming year, we will be starting to have some focused conversations among library leadership and CIOs about what are the most important current and emerging areas for collaboration. What are the most promising areas? What are the most neglected areas? What are the key areas? I think this is a timely thing to do because, frankly, the world looks very different today, both from the point of view of the CIO and the library, than it did 20, 25 years ago when CNI first forcefully argued for the necessity of IT-library collaboration as a way forward into the emerging world of the internet. CIOs have a very different collection of responsibilities today. Often, not always, and it's terribly dependent on the individuals and the institution, but often I find that there is a research or academic computing unit that is much more engaged in many of the issues we're interested in than the overall IT organization, which is spending much of its money on compliance, fundamental infrastructure that's used by everybody, and business systems.
Libraries have built up, in many cases, formidable internal sources of technological expertise, of IT expertise, to the point where, while they may use some common infrastructure, they are largely independent of central IT on some campuses, and in fact are moving into areas that were historically the province of academic computing, either on a centralized or decentralized basis. So I think it's time to do some additional probing at that, and we have another opportunity coming up. In July, unfortunately very much on the heels of Brexit, we had a wonderful opportunity to revitalize our relationship with the Joint Information Systems Committee, the JISC, in the UK with our biennial joint summer meeting in Oxford, and to meet the new leadership there (Paul Feldman is their new CEO) and look carefully at some of what's going on in the UK higher education world, which is very interesting. It's quite a different landscape than ours in many areas, especially around analytics and around levels of collaboration in certain areas. In subsequent talks with JISC, we've decided that our 2018 meeting, and we're just setting the dates and venue for that now, is going to focus on an international look at these new roles of libraries within the university in the 21st century as a theme, because this is something that they are terrifically mindful of as well as they try and work with their members. So those are some of the exciting things that CNI has made a lot of progress on and is moving ahead on over the next few months. Let me talk a little more broadly about some of the issues that I'm thinking about, working on, and engaging with, and I'd love to have your thoughts about some of these areas at the reception, between sessions, separately from the meeting, anytime we can catch up. There are six areas, and I'm just gonna touch on them at a very high level. So I've been looking a lot at what's going on in the research data management area.
And as you know, CNI was very early in drawing attention to this area, and there are many, many people who are taking the baton from us on operational implementation, although we remain interested in it and in helping various parts of their institutions work through this. But I feel like we've kind of missed a bet here. There's an easy problem, and easy and hard are very relative terms here, which is about how we deal with data that, at least once it's published, is public: climate data, high energy physics data. There are issues about wanting to keep it confidential, except for perhaps use within the refereeing process, until publication, and different fields have different norms about that. But when you get all through, the reuse and access of it is mostly about documenting and preserving. You can just kind of make it available, and the more people who can find it, the better. Unfortunately, this leaves out massive areas of scholarly work, including very, very high payoff areas around medical things, and genomic things in some cases. It omits much of the social sciences, which rely on data about people. And particularly as we look to a world where we need to be able to reuse data, we need to keep more personalized data for longer, or more re-identifiable data for longer. Somehow, I believe it's essential that we focus much more serious efforts on the hard problem here of how we share, to the extent that we ethically can, and how we reuse data that can't simply be made public. And that's gonna require a lot of thinking about IRB practices and responsible research practices, as well as the actual apparatus by which we maintain and share data. Right now you hear, for example, crazy stories about data that's essentially orphaned because an IRB authorized its use in one project and other people can't figure out who to even go to for permission. It's not their IRB; they might be at another institution.
This is all part of sorting out how we think about institutional responsibility for data and data stewardship. And these, as I say, make many of the problems about public data look easy, but I think we omit them at our peril, and I'm gonna be continuing to search for ways to make some progress on that. Second area. Some of this builds out of some conferences I've been at over the past year. Some of it was touched on in Commissioner Julie Brill's remarks at our meeting last December. We are increasingly in an age of algorithms and algorithmic prediction and filtering. Much of this involves algorithms that have two properties. One, we don't understand them. I mean, literally we don't understand them. Many of them are driven off of machine learning, and you've got these very complicated neural nets that are essentially black boxes that are formed by training data and then let loose on real data and then retrained occasionally when you get enough mistakes and outliers. Two, for many of the most important algorithms, we don't have stable algorithms. We have algorithms that are perpetually tweaked. I invite any of you to tell me what the search algorithm is as of this moment for Google search. Well, it probably changed since yesterday. There are some gross elements of it that are, roughly speaking, conserved, but they have armies of people who are constantly improving that algorithm. And by the way, depending on which one of the many, many A/B test groups that they're running every minute of every day you happen to get tossed into, you will see subtly different behavior, not just from day to day, but at the same moment from person to person. And that's not just Google; that is every really large scale system we've got. Now there is a growing interest in transparency and accountability of algorithms. And this becomes increasingly vital as these things shape our lives.
While I am very, very sympathetic to and support these discussions of transparency and accountability, I'm starting to scratch my head about something that's a little bit farther down the street, more elementary. How do we document these algorithms? How do we archive them? How do we record them? This is gonna be one of our next big frontiers in archiving our cultural record. And we can talk all we want about transparency and accountability, but if we can't document it, it's really hard to have accountability about it, I think. So that's one that I'm really scratching my head about a lot, and that's moving rapidly up my list of important problems. There are some very interesting research papers that have come out recently that try and present frameworks for at least beginning to understand how some of these machine learning neural networks are functioning, but I'm not sure there's a magic technology solution around the corner. Next area: stewardship and large scale digital preservation. This continues to be an obsession of mine, of course, and in particular, the question of really understanding how what we're doing does and doesn't match up with a constantly evolving cultural record. I have been privileged in the last year or so to participate in the work of the Keepers project, and I was very pleased to see this morning an announcement that both the, I believe, UK and European library associations, as well as ARL, have come together to endorse the Keepers declaration, which we, of course, also strongly supported and endorsed. Peter Burnhill is here, and I think will be talking more about this and some related topics during the meeting. I should note, by the way, just broadly, that many, many of the areas that I have touched upon are represented in breakout sessions at this meeting, and the ones that aren't, I think you can look forward to hearing about, such as the work with the publishers, at the spring meeting. But let me go back to my list of things I'm worried about.
So I think we need Keepers or Keepers-like approaches to illuminate how well we're doing in other sectors of the cultural record. I'm also really worried that there are all kinds of emerging genres, and I talked a little bit about the algorithms, that we just don't have on our radar screen sufficiently. I mean, social media is an obvious one that we talk a lot about and seem to get almost no traction on, but there are others, and we really need to get to work there. I am just finishing up, literally, a rather lengthy exploration of reading analytics and reader privacy and how these interact with authentication. We did a survey this summer about privacy and authentication practices, which I believe comes out today, finally, as an EDUCAUSE piece, but we've had the report available since September on our website. And that was quite interesting in helping to understand perceptions of privacy and privacy risk by libraries. It also began to foreground a couple of other issues that I spend a lot of time on in this forthcoming essay. One is that I think that libraries, particularly in an age where they're being asked more and more to document impact and contribution and value, are starting to get much more interested in a nuanced understanding of how the resources that they license and make available are being used. And that doesn't mean simply the kind of numeric things that you get, like how many downloads or how many hits. It involves coming up with some reasonable compromises between privacy and demographics of the user base, and somehow negotiating data flows with the publishers or the platform operators to try and facet usage information along those lines. At the same time, there has been enormous progress in re-identification of users that goes far, far beyond the old "oh, they set a cookie, and if I go through a proxy, I can avoid that" sorts of problems that people worried about a decade ago.
There's one other factor too that's worth noting here, and that's a very interesting one to me. There's a new player that showed up in this sort of privacy versus reader analytics and understanding usage equation, and that's authors. Authors are getting very pressured to document and understand impact as well. And all of a sudden we're in an age now where it's quite common, if you publish in a journal as an author, that you get a reader dashboard, I'm sorry, you get an author dashboard, which shows how often your paper has been downloaded this month. Sometimes it'll tell you something about the geographic dispersion of the downloads. And of course, there are at least some authors who would really love to have the names of everybody who read it and to really be able to jump across that all-important chasm between download counts and actual reading. I invite you to inspect your hard disk for how many PDFs you have locally that you downloaded with every good intention of reading. This is sort of the digital analog of those mounds of Xeroxes of articles, piled around your office, that you somehow absorbed by osmosis. In some new reading environments you can cross that chasm. Ebooks are already well across that chasm if you look at the consumer marketplace. And what you do with that data is a really interesting question, and who you give it to and how much you keep. Two final areas I'm thinking about. There is a huge and long overdue shift in the way we think about the management of names and factual biography: moving away from the old siloed authority control files, the ones that libraries did that were so heavily focused on monographic publication, and the ones that, for example, people in the art history and art world did to deal with artists' names and provenance and things of that nature, the kind of phenomenal resources that places like the Getty have built up.
We have the work that is underway, with support from the Mellon Foundation and others, that Daniel Pitti is leading in collaboration with both NARA, the National Archives and Records Administration, and the University of Virginia, IATH, and that now has a pilot project going with about, I believe it's 18 universities and other institutions, that's building out a sort of authority-system-like thing for names that appear in archives: not just authors of material in archives but subjects of material in archives as well, and documenting the relationships between them. And our challenge increasingly is gonna be to connect these things up and to begin to integrate factual biography of the sort that might be found in dictionaries of national biography or Wikipedia or places like that with these resources, and to integrate CVs and faculty achievement reporting systems and things like that into this sort of world. There's a tremendous amount we don't understand how to do, ranging from formats and representations through interchange structures, all the way to what the right balances of privacy, opt-in, opt-out, and these sorts of things might be in this area. But this connects up intimately with our ability to structure the world of both scholarship and evidence, to capture and reuse evidence in the humanities-oriented fields as well as the kind of evidence that's produced in science, and to rethink areas like biography and documentary editing. And I think there are enormous opportunities there. I'm pleased to report that just last week I had an opportunity to attend the latest meeting of the SNAC pilot project that NARA hosted, and they are making really good progress. It's a very, very exciting project. Finally, and this is one I've been thinking about for some years, but it seems to me to be reaching a tipping point, and I've had two important things happen this year.
The basic question here is what it means when you can do digital documentation of objects, texts, images, and paintings that is as good as or better than the original in most dimensions, and that then can be reproduced in copies that are essentially identical to the original. What does that mean for stewardship, for scholarship? What does it mean for curation? What does it mean for collecting and the whole idea of markets and rarity and how these materials are used and reused? And I'll just note a couple of things here. 3D printing and 3D imaging are both things we've talked about a little bit here in recent years. They're starting to take off at considerable scale. There are also these sort of two-and-a-half-D things, numismatic collections for example, which are particularly amenable to easy progress and are important scholarly resources. I was fortunate enough to be at the Getty Center in Los Angeles in August. I had an opportunity to speak at the Pacific Neighborhood Consortium meeting, and the Getty is just an astounding venue for a meeting like that. The Pacific Neighborhood Consortium, for those who don't know it, brings together primarily digital humanists from the Asian nations and North America and a few other folks, but it has very strong ties to China and other places in Asia. And there was a very special event going on at the Getty at the time, which was one of the reasons why the Getty hosted this. You've heard in the past about the efforts that the Mellon Foundation, for example, has been deeply involved in to document the Dunhuang caves, these incredible Buddhist shrines. They actually built at the Getty both a number of sort of virtual reality tours through some of these caves and also some actual-scale replicas, which were astounding. All of a sudden, these sites, which are increasingly becoming closed because they are so fragile, and there are many others, can be replicated if we want to, if the custodians of the sites are willing.
There's an article which I strongly commend to you, which came out in the November 28th issue of The New Yorker just a week or two ago with this sort of annoying title, Factory of Fakes, but what it's really about is the replication not just of art, like paintings or sculptures, but of entire heritage sites, like tombs within the Egyptian pyramids, that are under threat right now. With so many of the historic sites of the Middle East under severe threat or already destroyed, this kind of ability to document and recreate becomes extremely critical and indeed extremely urgent. This article traces the development of a variety of technologies and projects aimed at replicating everything from paintings to, as I say, literally Egyptian tombs. It's a great source of food for thought, and it helped me to understand one other thing too. If you think about physical objects, be they heritage sites or statues or paintings or any number of other things that we're trying to take care of across time, these are actually objects that are on a journey across time that is going to end poorly. It may be a long journey or it may be a rapid decline. And indeed, particularly for some of these heritage sites that used to be very hard to reach and are now major tourist attractions, it's an increasingly rapid decline. It's startling the way this article documents the extent to which some of the tomb galleries have deteriorated since they were opened up within the last 100 years. You actually have the ability, when you do this kind of digital imaging, not only to reproduce an object if you want another copy for study, or if you wanna replace one that's destroyed or damaged, but to freeze it in time. And in fact, as some good examples show, you can also back it off.
You can restore it or conserve it to an earlier state, something that conservators only do with great care and trepidation, because it's so easy to make it worse rather than better, even with the best intentions in the world and the best current state of knowledge. But here you can do speculative conservation: very, very provocative kinds of ideas that greatly complicate questions about curation and stewardship. And I think this is an area that we really need to be seriously engaging. So those are some of the things that are very much on my mind and my agenda as I look forward into the coming program year. There is certainly no dearth of things to do, and as you know, at CNI two of the things we try and do are to take on hard problems, because somebody needs to, and to take on problems that maybe are a little bit over the horizon but that we know are going to be very real challenges, opportunities, or surprises, except they're really not surprises if we get out in front of them. And I look forward to exploring this terrain and more with you over the coming year. I have actually done what I said I was gonna do, and I even have, by my reckoning, almost 10 minutes for questions. Thank you, and I would welcome questions. Please enjoy the session. Joyce Ogburn, Appalachian State University. Hi. I think you touched on a lot of issues of authenticity and identity, and in a number of different ways: what's real, what isn't, what's a reproduction, how do we know, how do we document that, and how that's shifting so much. And then you layer on that things like algorithms that you can't identify at all, and what they're doing. So I think there are some threads there of not only the curation but the transparency, and also information literacy and other things. How do we even know these days what anything is, or how real it is, or where it comes from?
I think you're absolutely right there, and while this is probably not the place where CNI is necessarily best suited to make a central contribution, I think that many of the things that I've talked about and many of the events of the past year or two have really underscored how badly we have underestimated, and how ineffectively we've responded to, the enormous challenges of information literacy in the broadest sense that you speak of it, as genuinely understanding things like authenticity and transparency and bias and those sorts of things. I think that there is an enormous and important challenge there, and I think that, if nothing else, our community can bring tremendous insight into the nature of those challenges. Thank you again for joining us, and I really would love to hear your thinking on these things over the course of the next few days and beyond. So thank you.