So my name is Mike Furlough. I'm the executive director of HathiTrust, and welcome to this session on the HathiTrust Research Center. The subtitle, "it takes a village," clearly indicates that this presentation was proposed before November 8th. You may recall a book from the '90s; the joke falls flat if librarians don't remember books. So anyway, my name is Mike Furlough, again, and what we're going to do is a tag-team presentation this morning, and then some moderated discussion. I'm going to start us off, and then we will follow up. You'll hear from Beth Plale, who's immediately to my left, from Indiana University Bloomington; she is co-director of the HathiTrust Research Center. Then on her left, Stephen Downie from the University of Illinois at Urbana-Champaign; Stephen is also co-director of the Research Center. To his left is John Unsworth, now dean of libraries at the University of Virginia and part of the executive management group for the Research Center. And on the far end we have Robert McDonald from Indiana and Beth Sandore Namachchivaya from the University of Illinois at Urbana-Champaign; they're going to moderate the discussion at the very end of this session. We decided we would split the speaker fee multiple ways here, so that we could each give each other a little something for Christmas. So I will start off with a little bit of background about HathiTrust, leading into the Research Center. The first thing I always want to do is remind people that as much as you think of HathiTrust as a digital library, as a collection of a large number of digitized books online, it is first and foremost an organization, one that is co-owned and co-operated by its members. There are 120 members in HathiTrust, nearly all of them in North America and most of those in the United States, but not exclusively. And our role is really a library mission, right?
It's the mission to collect and preserve cultural and scholarly heritage for future generations. So at the core of what we do is digital preservation; that's absolutely the basis of it. But based on that mission, on that core activity, and on the aggregation of content that we've been able to accumulate over the last eight to ten years, we are able to launch and move forward with a range of other cooperative programs. The HathiTrust Research Center is just one of them. A couple I will mention but not go into in detail. Our distributed copyright reviews: among our membership there are about twenty institutions where staff are looking at the copyright renewal and registration information for works in our collection, to determine if they can be opened. There is a program underway to establish a shared print network among HathiTrust members, to ensure that for all of the digitized collection there are print copies retained, and disclosed as being retained. And we have collection development programs, especially around federal documents right now. So the real power of what we do is not necessarily in providing access, which is valuable and critical, but in being able to harness the membership for these other programs that really can transform how libraries operate. And because we are so large, because we have so many members, one real value that I see for the organization is to draw on the distributed strength and expertise of the membership, and not to assume that something could or should only be located at one place, but rather to have a set of programs that can be operated at multiple locations. So while I am an employee of the University of Michigan, and the University of Michigan provides the administrative and infrastructure hosting for the preservation and access repository, there is a mirror site for the preservation repository at Indiana University; there is a metadata management system developed and operated by the California Digital Library on behalf of HathiTrust; and the Research Center itself is co-located at the University of Illinois and at Indiana University. As I said, we'll be talking primarily about the Research Center in today's session. Now, on this point of the collection: I talked a little bit about how it's not enough just to focus on the collection, but because what we're going to be doing today is talking about making computational access available to it, I wanted to talk for a minute about its characteristics. The collection today is just under 15 million volumes. I was hoping to get to 15 million by this meeting, but I blame someone who shall remain nameless for not getting their content in in time for us to celebrate that. Still, let's just call it 15 million for giggles. That equates to about 7.4 million unique book titles, where that uniqueness is based on MARC cataloging.
So your mileage may vary on how many titles we actually have. Within that, the collection accounts for about 5.2 billion digitized pages. Now, about 40 percent of the collection (it usually ranges from the high thirties to 40 percent) is open and available for reading. That means it's either in the public domain or has been licensed for open access. The collection is primarily books, and the focus of the collection will remain, for the foreseeable future, primarily digitized or digital books and serials: published works. But it is not the case that the collection excludes rare books or special collections; it's actually a pretty wide range of material that's in HathiTrust. However, not surprisingly, the majority of the collection comes from the twentieth century, and the majority of the collection, as you just saw, is in copyright. This bar graph is trying to give you that picture of publication date distribution. And then there are the orange tips on the gray cigarettes here.
It's not that the cigarette is on fire at the end; those orange tips indicate how much of each decade's output is open in HathiTrust. Okay, so no surprise here for a collection that's based on digitized works from research library collections in North America: the large majority of this comes from the second half of the twentieth century. Now, a quick comment on the kinds of access that we provide. First, let's call it human access, or reading access. It's always confusing to people, and there's often misinformation about what exactly we provide access to in the collection, so I want to quickly clear that up. One thing we do is provide full-text search, as well as bibliographic search, of the collection to anybody, anywhere in the world. So the entire collection is available for search. The entire collection is also available for data mining and computational analysis, and we'll talk more about that as the morning goes on. And the entire collection that is open (that is, public domain, out of copyright, or Creative Commons licensed) is available for reading to any individual, without regard to whether they're at a member institution or not. Okay, there are some caveats on that: if a work is not out of copyright in your country, you won't be able to read it in your country. So if a work is public domain in the United States but not elsewhere in the world, then you should only be able to read it in the United States; and if it's public domain in the rest of the world but not in the US, we can't see it here in the US. There are several thousand titles like that. Then, members do have some additional access privileges, so they can download those open works in full. But the service I think I am most proud of, of all the things that we do, is making available the entire collection, regardless of copyright status, to users who are blind or print-disabled at member institutions,
right? So if anyone on your campus has a need for a library book that's in HathiTrust, and they are eligible according to US law (or the law of your country, if you're a member there), then we can work it out so that they can have access to materials in the collection. We have also announced this year a partnership with the National Federation of the Blind to expand that access in the US, so more on that will be forthcoming. And then the last service around reading access is replacement access: preservation replacement copies can be made available for works that you hold that have been lost, gone missing, been damaged, or are otherwise unusable, and are no longer available on the market in a new condition, to quote the US Copyright Act. But what we're going to talk about today is this lovely, previously foreign phrase to me: non-consumptive research. It's a phrase that I'm not sure really existed, or at least was certainly not in popular parlance in the United States, until the Google Books settlement in the last decade. You might recall the Google Books settlement: Google said "we're going to scan everything," everybody freaked out, certain of those everybodies sued Google, and the authors and the publishers got together and worked with Google to arrange a proposed settlement for their lawsuits. That went through a couple of rounds of hearings and freaking out, and ultimately the settlement was rejected. But within that proposed settlement there was the concept of enabling non-consumptive research on the corpus of materials that Google was digitizing, and that proposal is, in part, the genesis of the HathiTrust Research Center. The idea was that there would be multiple, at least two, research centers, one of them at an academic institution. And this phrase non-consumptive research, which essentially means computation over the texts without humans having reading access, comes from that Google Books settlement. What's interesting here is that the phrasing comes from
that time. But as the folks up here will be able to tell you, in order to actually run the Research Center, offer services, and develop the protocols and the processes, it's really necessary to think through this phrase: what it means, and what it means to enable non-consumptive research on material that should not be redistributed because it is in copyright. So we'll talk a lot about that challenge this morning. In order to fulfill that non-consumptive access role, one thing that we do at HathiTrust is distribute datasets. If a researcher needs access to data because they want to run the computation in their own environment, because they need that kind of control, that's feasible, but we only distribute data that is out of copyright, public domain, or Creative Commons licensed. There are two different datasets we class in there: one is non-Google-digitized, about half a million, maybe almost 600,000 volumes; and then there's a set that's been digitized by Google. That one I call out separately because, in order to gain access to it, Google does ask institutions to sign an agreement, basically acknowledging that they're receiving material that is in copyright and that it's for use for non-commercial purposes. So, in planning our Research Center: as I said a few minutes ago, this partly came out of the Google Books settlement. We first started talking about it way back in 2008, which was when HathiTrust was founded; if anything, we were talking about it before HathiTrust was officially launched. In 2009 a group of staff from HathiTrust and other partner institutions put together a proposal for a research center hosted by HathiTrust; it was submitted to the HathiTrust equivalent of a board of directors at that time, and they approved it. Based on that, there was an RFP for hosting of a research center among the HathiTrust membership; that was in 2010, and two proposals, from Indiana and Illinois, were received, proposing to
co-host the Research Center, and based on those proposals the award was made to Illinois and Indiana. So I'm going to stop here. I wanted to lead us through this history of what we do at HathiTrust, some basics, and then into how we got to the launch of HTRC. Now I'm going to turn it over to Beth Plale, who's going to provide some more background on the Research Center itself and its services.

Good morning, everyone. It's nice to be here this close to Christmas; Merry Christmas, everyone. So, the HathiTrust Research Center contributes to the overall mission of HathiTrust in several ways. I think the most well-known contribution that we bring is grappling with the problem of how one enables text and data mining over a set of content that's very large (5.2 billion pages is large) as well as restricted: about 60 percent of the content is in copyright. So how do you enable that kind of interaction by researchers in a way that doesn't impede their research processes, yet protects the data as it needs to be protected? I think that's been a sizable contribution of ours. We do this in as user-driven a way as we can, because part of it is that we're pushing the community to think about analysis, digital humanities being an example of such a community. We're pushing a community with capability they've not really had before, so we're trying to stimulate their understanding of what these questions are while getting feedback from them, so that the development we do can be user-driven development. We do this through reviews and user studies, but we also do it through things like the vignettes that are sitting up here, which some of you may have already seen, that give examples of people who have already defined research questions over large sets of content. And then we build tools, and this again is a collaboration across Indiana University, Illinois, and HathiTrust. The last couple of years have been good years for us. We've seen substantial growth in terms of people who are engaging with the products of the HathiTrust Research Center: 923 new users, for a total of over a thousand registered users; a smaller number of data capsule users (I'll show you what the data capsule is in just a moment); and a total of 257 institutions represented among the user community. The way I like to characterize this is through this cartoon diagram. When a researcher comes to the HathiTrust Research Center, the question they ask is: which of the modes of interaction with HTRC is best for my needs? We characterize that as three types of offerings, and I would point out that the fact that we're on a downward slope from left to right is very intentional; I'll come back to that. So, in the upper left-hand corner, we've got extracted features. Extracted features are pulled out of the content of the books and packaged up in a way that they can be downloaded and analyzed at a researcher's institution.
I didn't see this Okay, that think that think the analysis can can be done Okay, the analysis can be done at the user's Insta the researchers institution the kinds of content in the extractive feature are parts of speech word counts and and whatnot the middle piece is The portal, you know our web tools the user logs into a web interface and here the access is to things like a canned analysis tools And then finally on the right-hand side is Some you know so where the middle is canned analysis tools the right-hand side is my analysis tools So, you know, I don't want to necessarily use a canned algorithms I want to use my own algorithms and I want to apply those directly against the data and the data capsule Provides a protected environment that allows that so the reason for the slope is You know as you go from the upper left to the lower right you get closer to the data and Because of that more restrictions kick in so the data capsule is first of all, it's closer to them You know it's closer to a virtual machine That one has to be more technically astute to use and it also It's also a little harder to use because here again the closer you are to the data the more protection Mechanisms have to kick in so that is so this is how we're characterizing it and there's one notion I want to bring out here also and that is when you go from your own desktop Which is the extracted features on the left-hand side to the other modes Which is the middle mode of the web tools and the right mode of data capsules. 
There is something we call a workset that represents what you're doing. A workset captures the content over which your analysis is going to be done, and it's really the whole research lifecycle, all the way to the published product, that the workset represents. The workset isn't needed for extracted features, because you're doing everything at your desktop; but when you start to work in an environment that's not your environment, the workset has to kick in, because the workset represents the researcher in that remote setting, the Research Center. So, our primary focus of work over the upcoming year: in extracted features, we had a recent release of extracted features over 13.6 million volumes, close to the total number of volumes in HathiTrust. For the web interface, we've got Bookworm being added as a user interface, a visual analysis tool for accessing the content. Here I mentioned the workset: there's work going on on an improved workset, which again is the research context. And then there are improvements to the data capsule so that it can shift from accessing what was the public domain content to the in-copyright content, which is something we got the go-ahead for about nine months ago now. Those can be summarized as follows. Through the web interface, one accesses the canned algorithms, which are inspired by MONK; access to the data capsule, Bookworm, and the extracted features are all done through that interface as well, and again, we have to know who you are for auditing purposes. In the data capsule, one runs one's own algorithms, accessing the data directly, but in a controlled and secured environment to protect the copyrighted content. And the extracted features set (13.7 million volumes, my apologies) was released in November. So, with respect to all that: that was more on the technical side of what
we're doing technically. As for what we're doing as a center as a whole, the 2017 effort is about growing demand. We've gotten what we think is considerable interest and uptake in the digital humanities, less so in other domains. We think there is a compelling interest in the social sciences, and stimulating that need is something we'll have a question for you on. It's something I think we have to do in HathiTrust as a whole: things like better characterizing and describing our federal documents. These kinds of steps have to be done before social scientists, given the way they look at content to analyze, are comfortable knowing that what they're dealing with is what they want to research. So we have some work to do, both in terms of stimulating the computational need, and also in putting the conceptual organization in place so that social scientists know what's there and can grapple with it in the way they think about it. And then, lowering barriers to use: someone who comes in and does research with us should find an efficient process from the time they submit. Most of our interest comes in through our ACS program, which Stephen will talk about; from the moment someone gets an approved Advanced Collaborative Support award until they get their results, that whole process needs to be made more efficient than it is right now. There are stalls in that process that we need to work on. So when I say lowering our barriers to use,
it's not just the human-computer interface to our tools; it's making sure that our processes as a center are as efficient as possible, so that we're not impeding a research process, and there's work that needs to be done there. And this needs to be done not only for what we canonically characterize as the researcher with small needs, which is a thousand volumes or less, but also for the researcher with large needs, which is around a million volumes; that's how we're characterizing those two canonical groups. And this is for all three use modes in the pictorial diagram that I gave you: the web tools, the extracted features, and the data capsule. Then finally, we're engaging in partnering opportunities. I won't say much about that, because John will talk about it, but the last item there is developing a cost model for in-kind contributions. When I say contributions, it's not that anybody is throwing resources at HathiTrust as a whole, but rather: how can HathiTrust and the Research Center take advantage of resources as they exist at other institutions? I'm thinking of compute resources as the most obvious example, so that we become more of a community-sustained organization over time, and that's an important aspect of sustainability. And finally, some of this work is being done through a generous grant from the Mellon Foundation, where we are enhancing the workset builder; it's now a linked-data representation, with a search interface that is much better than what we had before, using Solr on the back end. And we're working on deeper integration of the workset in and out of the data capsule, particularly for large-scale data, with a focus on both digital humanities and computational linguistics, and working on the in-copyright content.
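As a sketch of the workset idea described above: minimally, a workset can be thought of as a named, citable list of volume identifiers plus some research context, which the Research Center can then carry through the analysis lifecycle. The actual HTRC workset model is a richer linked-data representation; the structure and field names below are assumptions for illustration only.

```python
import json

# Minimal, hypothetical sketch of a "workset": a named collection of
# HathiTrust volume IDs plus research context. The real HTRC workset
# model is a linked-data representation; this only illustrates the idea.
def make_workset(name, creator, volume_ids, description=""):
    return {
        "name": name,
        "creator": creator,
        "description": description,
        "volumes": sorted(set(volume_ids)),  # de-duplicate, stable order
    }

ws = make_workset(
    "steam-power-study",
    "example-researcher",
    ["mdp.39015012345678", "uc1.b000987654", "mdp.39015012345678"],
    "Volumes mentioning steam power, for a technology-diffusion study.",
)
print(json.dumps(ws, indent=2))
print(len(ws["volumes"]))  # 2 (duplicate removed)
```

The point of serializing the research context this way is that the same workset can drive a canned web-tool algorithm one day and be mounted inside a data capsule the next, which is exactly the "deeper integration" goal described above.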
Those, I think, are critical steps for both the workset and the data capsule. So I'll leave it at that and turn this over to Stephen Downie.

Good morning, everyone, and thank you very much for coming today. What I want to speak to next is some of the realization of what is happening. We have the marvelous data, we have the hardware, we have the networks of systems and the staff in place; so what are the outcomes of some of this? One program that we have, an ongoing, perpetual program that's part of the Research Center, is called Advanced Collaborative Support. This is us reaching out to the community and providing resources (person power, compute power, the data) to help stimulate and prototype classic digital humanities text and data mining projects. We've been running it as a short series of peer-reviewed submissions: an RFP goes out, small proposals of about seven or eight pages with a research question are submitted to us, we review them, and then we assign staff programmers and staff librarians, whatever the proper set of facilities and personnel is, to help the researchers prototype their project. So in round one, in 2015, we had these various projects. We had detecting literary plagiarism, the case of Oliver Goldsmith: analyzing the texts to see whether text and data mining algorithms could actually detect where our hero Oliver Goldsmith lifted his texts. He was famous for it, and actually famous for doing it across languages, which is a really interesting plagiarism problem. Then we have literary geography at scale, with Matthew Wilkens at Notre Dame. One thing
I'm really proud of is that that little project was flipped into an NEH grant, which is now a long-term research project of Professor Wilkens. Then we had, at Indiana University, a marvelous study on how to look at the topics inside books, not on the outside but on the inside of the literature, so we get finer-grained topic modeling of the literature; I found that really exciting. A very wide group of folks, mostly from Canada, led by our Canadian friends, did "The Trace of Theory," looking at the evolution of the notion of theory in the literature. And then there was tracking technology diffusion over time, with Michelle Alexopoulos, who's actually an economist, looking at the notion of steam power and how it got talked about from its inception to now; it's a really fascinating, very worthwhile study using the resources that we have in this fantastic collection. So, if you're interested in these projects, we do have these vignettes up here describing them, and we invite you to take them with you. For the ACS projects in round two, we're now looking at some other research. We have a scholar right now looking at the notion of yellow fever in the Caribbean, finding all the texts that discuss the notion of sanitation and its confluence with yellow fever. Then we have someone tracing the history of creativity, at Brown. Then we have a PhD student, as part of their thesis work, trying to figure out what the influence of the Chicago School of
Architecture was, finding examples of it in the literature. And then finally, I think one of the more fun ones is the "Signal to Noise" study, comparing romantic texts, Walter Scott versus Jane Austen, and looking for stylistic differences and similarities between the two. Those are ongoing right now, and we'll be having another RFP in the new year to take advantage of more of our in-copyright data, so we will spam the usual lists and reach out to your organizations, because we are interested in helping the community as best we can. So, part of this is outreach, which is really important. We've talked about the notion of technologies, data capsules, web portals, big disk and big data; but one thing I'm really proud of, coming from the former Graduate School of Library and Information Science, now the School of Information Sciences, at Illinois, is a proud library tradition, and we really and truly believe that one of our more important interfaces is the library writ large. It goes beyond just the technology; it's actually the outreach represented by active librarians in our various institutions. So we work closely with scholarly commons and digital humanities centers to create these outreach programs, and we do a lot of user needs assessment, gathering information from folks (some of you in this room we've actually talked to) to find out what your users and your clients are actually needing, and where the gaps are that we can fill. We're looking at the social sciences.
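The Scott-versus-Austen comparison mentioned a moment ago is, at heart, a stylometric exercise. A common, simple approach (not necessarily the project's actual method) is to compare relative frequencies of common function words, which are strong authorship signals. The text snippets below are placeholders for illustration, not the project's corpus.

```python
from collections import Counter

# Function words are largely topic-independent, so differences in their
# relative frequencies tend to reflect authorial style.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "it"]

def function_word_profile(text):
    """Relative frequency of each function word in a text."""
    words = text.lower().split()
    counts = Counter(w for w in words if w in FUNCTION_WORDS)
    total = len(words)
    return {w: counts[w] / total for w in FUNCTION_WORDS}

def profile_distance(p1, p2):
    """Sum of absolute differences between two profiles (0 = identical)."""
    return sum(abs(p1[w] - p2[w]) for w in FUNCTION_WORDS)

# Placeholder snippets standing in for full novels:
austen = function_word_profile(
    "it is a truth universally acknowledged that a single man in "
    "possession of a good fortune must be in want of a wife"
)
scott = function_word_profile(
    "the moon was high in the heaven and the castle lay deep in shadow below"
)
print(profile_distance(austen, scott) > 0)  # True: the profiles differ
```

At HTRC scale the same profile could be computed from extracted-features word counts alone, with no reading access to in-copyright volumes.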
We have a stronger background in digital humanities, and now we're reaching out to our social science communities. We are also looking to train librarians and staff in our various tools, whether it's the portal, the thing we call the workset builder, Bookworm (a very fun tool to use), or the data capsule, and to reach those communities at the right level: whether it's an undergraduate beginner or an advanced PhD, we're targeting tailor-made outreach programs through the scholarly commons. One of our great outreach endeavors right now, funded by the IMLS and led by Harriett Green at Illinois, and the thing I really like about it is that it puts the rubber on the road, so to speak, is "Digging Deeper, Reaching Further: Libraries Empowering Users to Mine the HathiTrust Digital Library." It's a nice mix of participating institutions: large-scale institutions like Indiana, Illinois, North Carolina, and Northwestern, down to smaller institutions, for example Lafayette College. It's a train-the-trainer program, and we will probably be reaching out to some of you in the very near future to start placing some of our train-the-trainer programs in your institutions; they do a fantastic job. And we have two more years (is that correct, Robert? Beth? Yes, two more years), so we're now ramping up, and we'll be reaching out to you and your various communities to get your staff more familiar with being the interface for the HathiTrust Research Center. And at that point, I'm going to turn the podium over to John Unsworth.

Thanks, Stephen.
Good morning. So, my unofficial title at the HathiTrust Research Center is chief schmoozing officer, CSO, and I'm in charge of finding us partnerships and building those into something that makes the HathiTrust Research Center sustainable. This has been something that I've been thinking about since I started working on this project back in 2008, because it's obvious that running this kind of infrastructure has some significant fixed costs, and that while we could recover some of those through partnership with HathiTrust, deriving some income from membership fees, and could recover some from partnerships with researchers, getting some of our costs written into grant budgets, beyond that we still needed something more that would be a predictable source of support for the enterprise. So that's one dimension of partnerships. Another dimension is that if we're successful in what we're doing in the HTRC, I think we will develop an environment in which researchers partner with each other and collaborate, and part of what we're thinking about is how to build support for data communities, for the work that they do and the tools that they use. The data capsule already envisions bringing tools from outside communities into our computational environment, and trying to do that in a secure way. And I don't think that the only people who might be interested in putting tools in this environment are other universities: clearly, we're already seeing publishers developing these tools, and I think shifting their notion of their business, in some quarters of scholarly publishing at least, away from the production of content and towards the production of research services and platforms. So I actually see the HTRC as a potential inflection point for this change that's coming, and maybe the best case for the scholarly community to change the balance of power here a little bit, and I'm going to explain a little bit more
about that. If you're doing text mining, it's much better to have everything you're interested in mining in one place. It's difficult to do distributed text mining, getting results from here and here and here, using data that might be prepared a little differently and tools that might operate a little differently in different environments, and then aggregate those results and be confident that what you're getting is what you thought you got. So co-located content certainly makes it easier for researchers, and arguably produces better results. There is a study, the first part of some consulting that I'm commissioning on this subject, that just came in on December 7th, from Tony Tracy. Tony worked for Portico for a long time and is a publishing consultant now, and she's working with us on a sort of survey of the text and data mining landscape out there: what's out there now for services, who's offering them, what do they charge, what kind of shape do they take. Then she'll be working further with us to do more in-depth conversations with libraries and with publishers about how they think services in this area should develop. The business case for publishers for co-locating content with the HTRC is not, I think, particularly difficult to make. A lot of publishers now, as Tony's study shows, are preparing datasets for researchers and in some cases charging, but they don't post prices, so you can't tell exactly what they're charging. But they do sometimes charge for assembling those datasets, and even that probably doesn't cover their costs, because they're taking people off of other jobs to pull these datasets by hand. And then they ship the datasets to the researchers, I'm sure with an agreement that says the researcher will destroy the data once the research is done, and there's no enforcement of that. So they have no real surety that their data, once shipped, will be used as it's supposed to be, and whatever is
done with it is invisible to them, unless results are published in some journal at the end of the process.

So I think we can offer a better business proposition in the HTRC: a secure environment where what researchers do with the content is audited and visible to the people who provide that content, and where an 18th-century researcher, let's say, would have access not only to the 18th-century data sets that the publisher provides, but to literally millions of pages of 18th-century literature, in many cases not the 18th-century primary literature but 18th-century scholarship, which is in copyright and unavailable to those publishers through licensing schemes. So I think we have a case we can make to publishers that it would be in their interest to participate.

The business case for libraries is an interesting one. Brandon Butler, who works for me at the University of Virginia library, makes a strong case that when libraries license content, they license the right to text mine that content, and I believe that he's correct. He argues against specifically negotiating that right, because that gives it the status of something that might be in question. But I think the case for libraries begins with what happens when you get the data set and you hand it to the researcher. If the researcher is capable of taking it from there, then maybe that's fine. But as we see more of this work being done in newly data-centric disciplines in the humanities and the social sciences, I think that's not going to be the case, and I think the business case for libraries is about support. If you hand a data set to a researcher and they don't know what to do with it, they will be back on your doorstep shortly.

So we're looking at this set of issues in the context of a pilot project, which we've been discussing since the last CNI, where the discussion began with Portico and JSTOR. The idea is to get some of the publishers they work with to agree to co-locate some content in the secure environment that we run, and to have us look at what kinds of issues are raised by trying to normalize data across these different streams and by trying to bring tools to bear that may have been developed outside our environment. That work is underway now. The great advantage of working with JSTOR and Portico on a pilot such as this is that they already aggregate publisher relationships: they have agreements with lots of publishers, they know those people, they have a trust relationship. It's much easier for us to work with them as partners than to try to establish relationships with each of these publishers, and I think it gives us a way to talk to publishers about sustainability strategies in the larger context of their other business relationships.

Finally, I think what we're looking at in this pilot is this: if we do think that in the future this will be an environment where people outside Illinois, Indiana, and Michigan are bringing tools to bear, what do we do, in terms of APIs and in terms of some of the fundamental affordances like the work set builder, to make this a level playing field, so that the best tools can rise to the top? Which, again, I think would be in the interest of scholars.

So those are some future challenges that we're looking at, and we have some other known future challenges. Competing and collaborating in a mixed for-profit and non-profit environment is definitely right up there, and we'll be cautiously looking at the dimensions of that in this pilot. Building robust data communities: I mentioned that earlier as a goal. Discovery services for work sets and results: part of collaborating in this environment will be understanding what other people have already done there, and whether it would be of use to you, or whether you need to start over. Contributing improved machine-readable text back to the Hathi Trust itself: as a lot of work gets done on the text in this environment, in many cases that text will be improved. This has been a known issue since we started talking about this, and there really isn't at this point a way to feed that back upstream to the Hathi Trust. It's not a technical problem; it's a sequencing or editorial problem. And finally, the vetting of research results. Right now, research results derived from the copyrighted data are human-reviewed to make sure that we're not releasing text extensive enough to constitute a conflict with the copyright situation. I'm sure there are ways to do some, if not all, of this review computationally; I think that's basically a problem in cryptography. The question is: can this particular data set be reverse engineered into the text that it came from? And I'm betting that there are some computational solutions to that problem.

We invite you to get in touch; there's an email address up at the top of the box there. We invite you to get involved. Keep an eye out for the next advanced collaborative support call and share that with your researchers; help us identify researchers, especially social science researchers at this point, with an interest in the Hathi Trust; and encourage your users to join the monthly user group meeting. And last but not least, thanks to our sponsors. I think at this point I'm turning it over to the mic at the end for discussion.