I think it's about time to get started. Welcome, everybody. I'm Cliff Lynch, the director of CNI, and you have found your way to one of the project briefing sessions in the spring 2020 CNI virtual member meeting. This meeting will run through the end of May, so we're a little more than halfway through, and today we have, I think, a particularly timely session. Certainly the events of the last couple of months have really emphasized the need to think hard about how we're managing our print collections, collectively and institutionally, and I think they have also underscored the importance of redoubling our efforts to get many of those collections into digital form. Finella France from the Library of Congress is going to tell us about a Mellon-funded project which I think will help us to make some of these choices in a wiser and more data-driven way, and she will share the specifics of that with you. Finella will give a presentation, and then Diane Goldenberg-Hart from CNI will pop up to moderate questions and discussion at the end. There is a Q&A button at the bottom of your screen, and I'd invite you to use that to enter questions at any point during the presentation so that we have those ready when it comes time to do Q&A at the end. So with that I'll just welcome everybody and thank you for joining us, and I want to say a special thanks to Finella for offering this virtual project briefing to help inform our members. Thank you again, and over to you, Finella.
Thank you, Cliff, and I just really want to take this time to say a huge thank you to Cliff and Diane and Beth, who have just been phenomenal with organizing this and made it so easy. I'm going to blank out my video so that you can just focus on this screen, and hopefully that will reduce any technical problems, but I know that if there are any issues they will pop up and let me know. So I really wanted to sort of dive into this.
This goes back a few years and really focuses on the challenges we have with how we make informed decisions and how we think about our print collections: how can we get the information that helps you make a valid decision, rather than just hoping you've made the right one? How can we effectively address the physicality issues of our print collections? What sort of data do we need to make that decision, and then to really advance our presence in the digital realm? What can be digitized, and, depending on the condition, how effectively are we making the decision to prioritize digitizing the at-risk collection first? Until we know about the physicality of those materials at risk, it's challenging for people to make that decision. So I'm really delighted to be able to share this research project. We were funded by the Andrew W. Mellon Foundation at the beginning of last January, for just over three years, to take the same 500 volumes, using a stratified random representative sample (and I can go into that later if people are interested), from six different research libraries (one of those, the Library of Congress, has agreed to include theirs in this program) and from the time period 1840 to 1940. So we're really looking at that time period of mass production, where we're starting to see some of those challenges in terms of the pulp and more acidic papers, but also trying to understand more about what that physicality is, rather than just visually looking at it. My assessment of a volume's condition could be very different from yours, and that could have a profound impact on which volumes we're actually collecting and preserving effectively.
So we have five research partners: Cornell, University of Miami, University of Washington, Arizona State, and University of Colorado at Boulder. A huge hats off to them; they have just been phenomenal to work with, really supportive from the early days when we were desperately looking for funding through to now, when we're telling them to hold off sending shipments because we're not allowed in the building and can't take any shipments in or out. I wanted to give you the link here to our public-facing website, nationalbookcollection.org. You can access this later, and I'm happy to send more information, but it gives a bit more in-depth detail on what I'll be going through in this presentation. So in terms of the data analytics and the possible trends, what we're really trying to find out is which factors contribute most to, or have more of an influence on, the condition of the text block, of the page, of the book itself. I'll just note that while we are looking at the condition of the binding, that is not the current focus or intent of this research project. So we want to know more about the inherent properties of the paper and the impact of environment and usage, and partners are providing the environmental and circulation data they have. Unfortunately this often goes back only a few decades, and as many of you know, when collections get moved from one location to another it's really hard to keep up with it. The interesting thing so far, and as you'll see as we go through, is that we seem to be finding that the inherent properties of the paper are having a much greater influence than we had expected. I'd also point to the East Boston Consortium: they have some lovely data on their website, and they've talked a lot to the students, and their data seems to suggest a 5% loss in condition from collection use. This also ties in with another collaboration, Collections Demography, with University College London.
So what have we done? We have taken all of the catalogue information from OCLC, and we've been looking to see how accurate that is and isn't. That's been particularly interesting in terms of thinking about whether we're looking at the same volume: very often it's listed in the catalogue with the same date, but it's not actually quite the same date. So we've been capturing that data as well and sharing it back with our partners, and using it to see how that relates to differences in the volumes we're looking at, and whether in fact what people think is available is actually available on their shelves. From there, what we've been looking at first is what most people have available: your visual assessment of your collection. Usually people have a fairly standard assessment form, and we reached out to partners and other colleagues to see how we could capture the most usable information from the visual assessment and then link it with the physical analyses that we are doing. So what we're doing with this is taking a small sample from each page of a general collection book and using that strip, about three-eighths of an inch, to correlate the objective scientific analyses with the visual assessment. I will note the third bullet point, which is: do we know what the same volume is, and what does identical mean? This has been a very fascinating area for us as we've started working through this data.
So we have a very interesting platform where we're pulling in all the cataloging information. Every book has its own unique ID, and we're capturing all of the catalog details about it: has it been rebound, what other information can we find about it, even looking at the height, width, and thickness. Initially, with our first shipment, comparing LC volumes with Arizona's, we found that three volumes that were supposed to be identical were completely different inside: two were the same, but one was twice as thick, the edges hadn't been cut, and it was totally different paper, and yet the published date and location were exactly the same. These are some of the interesting and, I will say, delightful rabbit holes we've been going down, and trying not to go down too far. As we start to look into the description, we have been describing the binding, and these were things that the partners thought would be really useful. Can we start to see differentiation in terms of the color? Is there obvious rebinding? There was a lot of interest in understanding whether this was the original binding and whether that could be useful as we describe the volume. Then the binding notes: what else are we finding that gives us other information that can be shared back with partners and can help us think about what might be useful parameters as we capture catalog information? And in terms of the text block, keeping this fairly high level, what are we seeing? Is it black ink, is it color ink, are there different types of paper in it, are there multiple papers, you know, tip-ins, other things like that? Then we've been moving on to the condition assessment and, balancing everything we're seeing (and this is just a capture from one specific volume), asking: is this something that the library would lend? So we're trying to capture the number of tick boxes and how that correlates back to the condition of the book, just from the visual assessment. And this has been interesting as we've started
to look at this, because we've found that we then wanted to separate this out, dividing the condition of the text block into four fairly specific areas that seem to correlate a little more effectively with what we're seeing in the objective scientific analysis. The physical condition, in the first column on the left: is it brittle or crumbling, is there damage and loss, are there loose pages? That's a very visceral, initial response, and these are things that we thought would correlate fairly closely with some of our objective measures. The structural we separated out from the physical, as things that might be happening but may not necessarily be inherently connected to the physical quality of the paper, the text block itself. The visual was the area that, when we did a round robin of how many people were assigning the same values, was most prone to being subjective, in terms of color and how people assessed the amount of foxing and the amount of staining, so we wanted to group this together because that was a bit of a challenge. What I will also note, and I didn't mention earlier, is that we have two researchers who are doing all of the same tests, so we've reduced the number of inter-researcher challenges or potential errors. Each month we give them a volume that has been looked at previously, to see how closely they are matching and to make sure that they haven't got a bias shift in their analysis. And then the event traces are things that have happened externally but are not inherent to the paper itself: is there water damage, are there insertions, other things like that. I just wanted to show you an example. We've been taking a lot of documentation, and this is an example where in fact there's a separation at the spine but the paper is in very good condition, so while it's
a physical damage, it's not actually damaging the content of this book itself. As we go through the rest of the data that we've been collecting on these volumes, we are doing photo documentation of the front cover, the spine, the title page, and the title page verso, to do a cross-correlation again between what information is in the physical volume and how that relates to the catalog. I don't have time in this presentation, but with the website (we've almost got it out there on the public site) you can essentially pull up the same five volumes and look to see how similar or different they are visually and what you can see about condition there. That is a beautiful segue into what identical looks like, and I was really surprised to find that we had such a difference in the volumes themselves, in terms of their bindings and in terms of the papers. So I'm going to slowly skim through the next five slides, showing the same books from the five different research partners, going left to right. You'll see that not all partners have exactly the same volumes, even though they thought they did, and you'll see very quickly that we have quite a difference in the bindings and the colors of the surfaces. This became a really interesting quick visual to see how similar or different these books were as we went through the different volumes. That brings up a really interesting question. We're not going to answer this question today, but it is one of those things that I think we all want to know: what is the best volume to keep? We can't keep every one. And the annotations: are there other features we should be considering as we go through that? Just one wonderful, interesting thing: many of you will know about the Book Traces project. We had two interns last summer, and we asked them to take the period they were there to go
through volumes and start to list how many pages were missing and also how many were annotated. This was a very interesting example of a very personal dialogue between the person who donated these books and what he thought of the author. If I zoom into that, you know, it occurs to me that Logan is too hasty, he tells us that, you know, that we've only finished half of the volunteers engage. So it's a really interesting interaction between the reader, or in this case the donor, and the author. Surprisingly, we also found a number of annotations that were from uncle so-and-so or aunt so-and-so to someone and dated the 25th of December, so it seems to me that books at certain time periods (and most of what you're seeing in this presentation is 1840 to 1890) were often last-minute gifts to nieces and nephews, which is an interesting little anecdote. What we didn't expect to find was multiple different types of paper in the same volume. If I zoom in on that, this is one volume with five different types of paper. Why is that important? Well, if one of these is highly acidic, then it is going to have an impact on the rest of the book, even if other parts are in good condition. And I think what's fascinating (you know, we're trying to be very rigorous about how we're assessing trends) is that we have started to see a potential correlation between this and popular fiction books. Stowe, Mark Twain: we've seen that some of those books often and consistently have multiple types of paper in them. So, moving on to the objective tests that we have been doing: in the physical tests, we are doing our beloved double fold, and I'll come back to that in a minute. We've been trying to look at spot tests for sizing and things like that: if we're seeing a significant correlation between alum sizing and damage, is that something that people could use as a quick tool? Then the volatile profile, what we affectionately call book sniffing: we
actually can take the profile of the different types of degradation products we find, whether it's acetic acid, furfural, formic acid, things like that. We've also correlated that to the Barrow collection, which is a thousand books from 1500 to 1900; we have essentially done a profile at every decade, which we can use to correlate with what we're seeing coming in from partners. We're not sure whether that's going to really tell us a lot, but we're doing everything we can while we have these books, to see if a simple sniffer in the collection would be useful. So what you can see here now, as we go into the data and the infrastructure of the database, is that we are pulling those general condition points from the visual assessment and starting to see how they might correlate with the other assessments that we're doing. As you can see here, we've got some of the spot tests, which are often very challenging to read: is it light pink, is it bright pink, particularly when we have different coloured papers? This is something we're working on developing a more effective way to do: can we use some sort of stain that, regardless of the paper substrate, white or yellow background, will continue to give you the same response, so that you can use it more effectively? In terms of the miniaturized tests, the analytical tests, as I said, we're taking a tiny strip of paper. We're doing a mini tensile test to look at the strength, a mini pH to look at the acidity, and size exclusion chromatography, which is actually looking at the chain lengths of the cellulose in the paper. That, with just a very tiny sample, can help us get a much more accurate correlation to how strong the paper is: when you turn the page, are you more likely to break it or not? Here's a very quick visual entree into our labs. With the pH, we are capturing that, as I said, from that tiny sample, and the tensile as
well. With the tensile test, we're taking a 10 millimeter strip from the edge of the paper, and we're doing it in two different orientations, and I'll come back to that in a wee bit. You can see here how much force it takes to break that paper, and how that potentially correlates to how strong the paper is when someone is using the physical volume. With size exclusion chromatography, as you can see there, the question is how strong the actual fibers are: as something ages, that long polymer gets shorter, and that contributes to the strength. That tiny sample, that little yellow-white dot, is about one millimeter, so we take three of those dots and we can tell you how strong the paper is. Those were destructive tests. For the non-invasive test methods, we are using infrared spectroscopy, so we can look at the different chemical bonds in the structure and the sizing and different things, and start to correlate those with the destructive test methods. That's something we're just delving into, to see whether we can actually use it as a non-physical and non-destructive means. And ultraviolet or visible spectroscopy: just shining some light to get the optical measure of what that looks like. So here's the infrared spectrum. We can see how similar or different it is from the edge to the inside of the book, and whether we're seeing different peaks that might relate, for example, to lignin or other parts of the paper that might be indicative of damage. With the reflectance spectroscopy, the database is pulling in all of the raw data, so we can look at the raw data, take different derivatives, and see how that's correlating and which of those peaks matter. I know you're looking at this going, oh, what is she showing me, but it may be that we find one of those peaks is really important in correlating with the damage. So if we can do that non-invasively
with just shining some light at the edge of a paper, can we start to potentially turn that into an app or some way of pulling out the information? Because we're getting color data from this, we can then also look at the shift in color objectively, rather than subjectively saying, I think it's a bit yellower, I'm not sure. You'll often notice that around the edge of the book you are probably seeing some color shift, some color change. What is that shift, and is that something that we could use as a simple measure as well? And how do we know if the papers are the same or not, as we mentioned earlier? What we have is a large collection of reference papers that are completely characterized, and we have put all of those features into this active database. We can then start to see what that means in terms of the infrared: we can use the knowns and then start to put our unknowns into this statistical mapping and see what that means. So for example, this top left is all of our rag papers, very nice, good quality rag papers. Down the bottom here relates to newsprint, so generally really poor quality paper. In the middle we have mixtures of different pulp papers. So we can start to use this idea to understand more about what we're seeing in our collection. One example here, when we did a quick check with this (and this is flipped from the previous one): a lot of the collection items we saw correlated very closely with rag paper, and we thought, that's interesting. Then we looked at their catalog records, and they were all pre-1845. So this does seem to be an interesting way in, and as we get further into our testing (we've only done 455 books so far) we'll get a bit more of a feel for what that actually means. What we really want to do is this: we know you can't all come into the lab with your books, so how
can we use this huge collection of data to create really useful evaluations or stack tests? So we've been starting to look at a simple bend test. Can we get a more effective pH pen that, regardless of the color of the paper, will absolutely tell you it's pH 5 or less, so you might want to start to think about this collection being slightly at risk and put that forward in your digitization? What we have also done is create a quick query tool, so that rather than having to map everything at once we can very quickly ask: is there a correlation between the strength and the pH, and if we overlay the aluminum, what does that tell us? So, a few graphs here just as I wrap up. Right now we've only done all of the books from 1840 to 1894 from our five partners, and we're seeing a large number clustered in the 4.5 to 5 range, so a relatively acidic range given that 7 is neutral. When we look at the length of the molecule, we're seeing, again, a shift and a clumping, showing that there's some correlation there with the strength, though it's not as strong as we might have hoped. This is a really interesting one I thought many of you might be intrigued by. When we do our tensile testing, we are testing both lengthwise in the direction of the paper and across it, five of each, and we have found that the books are not always bound to make best use of the paper's machine direction. So we're seeing clustering here: those two clusters you're seeing are the vertical and the horizontal, and that can have a big impact on the strength of the book as people are turning the page. Can we see a correlation between the yellowness and the pH, or the acidity? We seem to be starting to see a correlation there as well. Sadly, I hate to tell you, but we've been trying very
hard to correlate double fold to any of our parameters, and as you can see, we are not seeing clumping; it's just all over the place. If you go to this website, we have a list of the 500 books. I've had a number of people ask whether we could share that list because they'd be interested, and we haven't put it up yet, but if people are interested, we would be happy to share the assessment criteria that we are using, and if people want to actually look at some of their own volumes to see how they relate, we'd be happy to add that data into what we've got as well. So, just to conclude: we're hoping this data is really helping us to identify materials at risk and help you make more accurate decisions, and we're trying to use this knowledge to create more useful stack tools, which doesn't mean you need to take books to a lab, but tools you can use intuitively that are quick, accurate, and easy to use. And with that I will acknowledge all of my wonderful colleagues who've been with me, and I will open it up for questions.
Thank you, Finella, that was fascinating. Thank you so much for that wonderful talk, and thank you so much for coming to CNI to share it. Thank you to all of our attendees for joining us today for this webinar, which is part of CNI's spring 2020 virtual meeting. We're really delighted that you could join us today, and I want to go ahead and invite you to type your questions into the Q&A box, and Finella will answer those live. While we're waiting for folks to type their questions in, again, I just want to welcome you to CNI's 2020 virtual membership meeting, which is going on through the end of May, so we will have lots of webinars to come. I hope you'll take the opportunity to check out the schedule, which I've just shared with you through the chat box, and join us for more offerings.
I also shared with you earlier the link to the project website, nationalbookcollection.org; the link is there in the chat box. I was just taking a look at the site while Finella was giving her presentation, and there's a tremendous amount of really interesting information, with lots of details about the methodology and their processes, so I invite you to check that out as well. I was curious to know, Finella, what are the next steps here? Where are you at in this process, and what is the plan for the future?
Thank you, Diane. So we were going along quite nicely until we got banned from the building. What was helpful (everyone's still working hard) is that it's allowed us to really dig deep into the data and into what would be really useful, what people would like to see and engage with. We're trying to be really careful not to make predictions that might be relevant to one decade and not to everything, so that's one thing. Essentially, next steps are to try and keep this information coming out to you as we find new information. What we would be really interested in, as we start to develop some of these stack tests and simple tools, is whether people would be interested in us sending those out, and we could send out reference papers for them to try them on, to see if they're simple or not as intuitive as we thought. That would be incredibly helpful, because we want to use this data in an effective and useful way. What we hope also is that, as we start to potentially identify that maybe a certain decade is more at risk than another, a follow-on could be to delve into that and help with that information. But I think there's a question there.
Yes, definitely, that's interesting. I see we actually have a couple of questions coming in. Thank you for responding to my question, and let me move on to another question, which comes from Rebecca, who asks: what inspired this project? Were there concerns about the preservation of
books in the library, and what criteria did they use to select the books included in this project?
Great question, thank you, Rebecca. So I cut it a little short because you're only here for half an hour. Essentially we started in about 2015, and there were two groups involved in it. It was really where people were starting to get pushed for resources: how do they know how many volumes are out there, can they trust the catalog information, how many should they retain or withdraw because they're pretty sure that someone else has it? People were saying they were looking on shared catalogs and finding, oh, so-and-so has got six of these; then when they got rid of theirs and went to borrow one, there'd be 10 pages missing. So there was a big concern about how to make those decisions. Following on from that (you know, Cliff has been involved, and others, in some of those interesting projects about the print repositories) we sort of separated into two groups: one, with Ian Bogus and others (he's now at ReCAP), looking at what policies and procedures were out there. But it kept coming back to: we really need to have a good objective assessment; how do we assess the collection? So from that we did some pilots in-house with interns looking at different test methods, and then with in-house staff we looked at how we could miniaturize tests to take a minimal sample but still get those objective analyses. How did we select the 500? We looked at what was common between the five partners. We started off with, I think, about three and a half thousand books. We initially looked at a completely random sample; that concerned me because we weren't getting a good representation of each decade, so we had to do a stratified random sample. So we look at the number of volumes published per decade and then correlate the number of samples from that decade with that, to get what we are calling, again, the statistically representative sample.
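The selection step described in that answer, a stratified random sample with each decade represented in proportion to its size, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual procedure: the function name `stratified_sample`, the `(volume_id, year)` input format, and the use of the pool's own per-decade counts as the weights are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(volumes, total, seed=0):
    """Pick `total` volumes, with each decade's share proportional to its size.

    `volumes` is a list of (volume_id, publication_year) pairs (hypothetical format).
    Returns {decade: sorted list of chosen volume_ids}.
    """
    if total > len(volumes):
        raise ValueError("cannot sample more volumes than are available")

    rng = random.Random(seed)  # fixed seed so the draw is repeatable

    # Group the candidate pool into decade strata, e.g. 1847 -> 1840.
    by_decade = defaultdict(list)
    for vol_id, year in volumes:
        by_decade[(year // 10) * 10].append(vol_id)

    # Proportional allocation: ideal (fractional) quota per decade.
    n = len(volumes)
    exact = {d: total * len(ids) / n for d, ids in by_decade.items()}

    # Largest-remainder rounding so the quotas add up to exactly `total`.
    quota = {d: int(x) for d, x in exact.items()}
    leftover = total - sum(quota.values())
    by_remainder = sorted(exact, key=lambda d: exact[d] - quota[d], reverse=True)
    for d in by_remainder[:leftover]:
        quota[d] += 1

    # Simple random sample, without replacement, inside each decade.
    return {d: sorted(rng.sample(by_decade[d], quota[d])) for d in sorted(by_decade)}
```

Called on a pool of roughly 3,500 shared titles with `total=500`, a function like this would give each decade a quota matching its share of the pool, which is the property a completely random draw did not guarantee.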
Very interesting, that was great. Okay, thank you, Rebecca, and thanks, Finella, for that response. We now have a question from Cliff Lynch, so I'm going to hand it over to Cliff.
Okay, thanks. This is not a well-formulated question, but I was really fascinated that, with a rather small sample in terms of number of volumes, you managed to show illustrations of all kinds of interesting stuff happening: for example, this phenomenon of it turning out to be an entirely different volume among different partners, or some of these practices about multiple types of paper in a given volume. Do you have a sense for how late in publishing history those kinds of problems are likely to show up? I mean, I think most of the examples that you had looked to me like they were fairly early, but I don't really know that.
No, thank you, Cliff, great question. So right now, because we're still only just barely at the beginning, in the first year, we just arbitrarily started at the beginning, from 1840, so we've only done 1840 to 1890. It'll be really interesting to see how much we see those multiple paper types going forward, and whether there's more standardization. We're also curious whether in fact we're going to see things from a certain publisher or from a certain decade, and whether those trends are specific to that decade. The interesting thing is that because we've got this active collection of data, we can keep building on our knowledge as we bring in new decades and expand that. So we're still only at the early period, because, of course, publishing increased exponentially as we went through the decades. But yeah, I have to say these have been really fascinating and interesting rabbit holes that we weren't expecting, and I just keep wanting to read more about, and know more about, the history of publishing and publishers, and it's been
truly fascinating.
Oh yeah, it sounds like you've uncovered some really interesting mysteries that will be fun to dig into later. Thanks for that question, Cliff, and thank you, Finella, for that interesting response. Seeing that we are past the end of our time here, I just want to let everyone know that I will be going ahead and shutting down the public portion of this session, so I'll be stopping the recording, but I invite you all to stick around after we turn off the recording. Finella will still be here, we'll leave the webinar environment going, and you can raise your hand if you would like to approach the podium and have a brief exchange with Finella, make a comment, or ask a question. We welcome your interest in doing so. So thank you again to Finella for this wonderful talk, and thank you again to all of our attendees.
Many thanks for that, Finella, and I really hope you'll keep us posted as you get back and build out the database.