Alright, I'm going to go ahead and get started. It's 10 o'clock and I've got a lot to go through here. My name is Chad Hutchins. I'm the head of digital collections at the University of Wyoming. And I'm really excited to talk about this project at UW where we've been working with our geological museum to digitize paleontological collections. I've got a two-part presentation today. I'm going to talk about digitization of the paleo collections, what's in the collections, some of the equipment that we're using as well as workflows. And then, in the second half of my presentation, I'm going to talk about a bigger biodiversity digitization project that we're working on with Indiana University Libraries to build an open source repository system for biodiversity collections as a whole. Thankfully, a person from IU Libraries is here, so if you've got lots of questions, we're going to have to bug John. So, at the University of Wyoming, I want to give you some background on the collection itself. This is your typical natural history collection or dinosaur collection if you want to call it that. They have large specimens like the Brontosaurus over here on the left. But we have about 40,000 specimens, and a lot of them are not featured in the museum itself. They're not available for public display. Users don't know what they are. Visitors can't see them. So, we have smaller things like skulls. Some of these things we've printed out here, and you can come take a look at these prints afterwards if you want. Some of them are about a foot in length. We digitized a large bison skull a couple of weeks ago that was about this big, or about two feet across. Ancient bison were a lot bigger than modern bison are. And then they have lots of really small things, these teeth that are about a millimeter or two in diameter. And this actually comprises about three-quarters of the whole collection.
So, 30,000 of about 40,000 total specimens are teeth or mandibles or things like that. And you can get an idea of what these look like here. On the bottom left, these are really, really tiny, in some cases smaller than a pinhead. In some cases, they comprise a mandible and a whole lower jaw. And then on the right, you see one of the drawers in the geological repository where they're in these little vials mounted on cork, and that's how these are stored. The collection at UW, and I should step back. I'm not a paleontologist. I don't know that much about paleontology. It's a very interesting subject, but we're working really closely with the museum curator on this. She's the real expert on this, not me, but we're working together on this project. One of the really big strengths of our collection at UW is that it has representation of mammals on both sides of the K-Pg event. So, the K-Pg event, for those of you who don't know what that is, was the meteorite that struck down near the Yucatan Peninsula and wiped out the dinosaurs, made the dinosaurs go extinct. So, we have representations of mammals when dinosaurs dominated biodiversity and ecosystems and afterwards. So, you can see the diversification of mammals in our collection, and a lot of it is through these teeth, which don't really seem that interesting, but you can get a lot of really interesting information out of them if you digitize them and use microscopes and things of that sort. I'll get to that in a minute. Research uses. Again, not a paleontologist, but obviously you can do lots of species morphology and evolutionary studies with this. When paleontologists look at these very small teeth, and I'll show you an example of it in a little while, they can get information about what these animals ate and how old they were, and that gives them information about the kind of plants that grew and the kind of climate that it was.
So, there's all sorts of really interesting research uses that you can get out of this if you're a paleontologist and you study dentitions and things of that sort. Extinction studies and climate are other things that you can look at with these. The K-Pg event is actually one of the largest documented mass extinctions on record, and like I said earlier, this is all basically housed in our collection, so we can study it at Wyoming. So, Paleo Digitization Goals. We wanted to accomplish a lot of things with this project, and we've been working on this for about a year or so. We wanted to learn some new imaging technology so that we could work with other collections on campus like this. Laura Viedi, our Geological Museum Curator, has really good abilities to make connections throughout the state of Wyoming. She's a Wyoming native from Thermopolis. She really wants to do a lot of outreach to K-12 to make curriculum materials, open educational resources, and other pedagogical materials to generate interest in paleontology, as well as engage students, young children really, with 3D modeling and printing applications, and obviously there are lots of carryovers in this area with virtual reality and gaming and programming and all sorts of things like that. We wanted to enable digital preservation of the fossil specimens so that they can be downloaded and printed like the things I have here. But also, a lot of times with this research collection, researchers from other places in the country will request something like this, and it will go out in the mail, and in the mail it can get destroyed, it can get damaged, it can get stolen, it can just get lost. So this is a way for us to preserve not only digital content, but also the physical specimens that are not replaceable.
Remote access to hidden collections is an obvious motive here, and Laura is really interested with this project in digital veracity, so making sure that the models that we make are actually true representations of the fossils themselves, not just some glob that kind of looks like the fossil. And again, we have some other ulterior motives with this. There's a sub-collection of the paleo collection that they use for teaching purposes. These are things that were generally excavated without any context information, so they don't know where they were found, so they use these in classes and they move them all around campus, but people drop them and things like that, so we want to make it so that they can print these things out and look at them instead of schlepping them all over campus. Digitization equipment. So I wanted to talk about both of these pieces of equipment in this presentation. I don't think I'm going to have time, so I'm going to focus mostly on the David SLS-2 scanner. This is the scanner that we use for medium-sized fossils. To give you an idea, this is a small skull, this is one-to-one scale, this is about an inch or two across, so we've done things about this small, and then like I said, we've done bison skulls that are two feet across, three feet across. We use the Keyence digital microscope for these small teeth, and I'll give you a couple of examples of those, but I don't think I'm going to really get into any detail. So the David SLS-2, how many of you here are familiar with scanners? Okay, quite a few of you. Most scanners that people are familiar with, at least in the consumer industry, are laser scanners. This is something called a structured light scanner. This operates a little bit differently. It's a moderately priced unit. Between the laptop and the scanner itself, we have about $7,000 into this. We wanted to make sure that this setup was mobile so that we could move it to the geological museum and back to the libraries.
So that's why we have a laptop for this one. The scanner itself is about $4,500. So how this works is there's basically a projector that projects structured light onto an object, as well as a camera. And then I have a little video clip of this that I'll play in just a second here. Basically what it does is it casts structured light across an object, and it uses that structured light and it measures the displacement of the light over the object to generate the three-dimensional model in a very small nutshell. The two models that you see on the bottom, I guess the one on the lower right is pretty dark. These are the first two ones that we ever did. This is a fossil clam and an ancient camel skull at any rate. So I'm going to go ahead and play this clip, if I can here. This is the scanner running in our office and we're scanning here. This is actually a replica of a sabertooth tiger's lower mandible. And you'll see it casting the light over this and it'll run here in just a second. You see the turntable that we've mounted on it. So for any one specimen that we've got, we are basically taking 24 scans on one axis so it'll automatically rotate it around after it's done taking the first scan. So in a minute here you'll see it flash red, green, and blue to take a white light balance and then it just starts the whole process over again. So now you can see it rotating and then that's basically it running to give you an idea of what this is. And let's see, we'll go to the next slide. So these are our scanning steps. Anybody that's interested in this, I'm not going to go through these one by one. Anybody that's interested in our documentation, we have about 14 pages of it on how we're doing all of this. It's not as cut and dried as we had hoped it would be and we pretty much had to figure this out on our own when we first started this. 
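As a small worked example of the turntable arithmetic mentioned above: 24 scans on one axis works out to one scan every 15 degrees. The sketch below is only illustrative. The function name is made up for this note, and it does not talk to the scanner or its software in any way.

```python
def turntable_angles(scans_per_rotation=24):
    """Return the turntable angle (in degrees) for each scan in one full rotation.

    Assumption for illustration: scans are evenly spaced around a single axis.
    """
    step = 360.0 / scans_per_rotation
    return [i * step for i in range(scans_per_rotation)]


angles = turntable_angles(24)
print(len(angles), "scans, one every", angles[1] - angles[0], "degrees")
```

Fewer scans per rotation would be faster but leave larger gaps between views for the software to align.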
The documentation for the David SLS-2 is decent for really basic things, but when you get into more complicated tasks like merging two sides of an object, there isn't a lot of information out there about that. But we were really lucky in that we had a paleontology grad student that had 3D modeling experience. Go figure. And then we got lucky again and we hired a junior computer science student who had virtual reality and augmented reality experience, and all of this is done by students. In this case, these two students. I want to call particular attention though to items six and seven. So within the 3D scanning and 3D modeling community, there's not a lot of agreement about preservation, or access for that matter. There's no real agreed upon file format. So we're kind of hedging our bets and we're outputting all these models in three different file formats. So there's a higher resolution set of files for masters or preservation level copies, and for access and derivative level copies we are just lowering the resolution and making smaller files available. And the prints that I brought are based on the lower resolution files. There aren't a lot of systems, at least among digital asset management systems in libraries, that will render these in a viewer online. We use Islandora at UW. So we're also having our students output animated GIFs and thumbnails just to give users an idea that you can actually do something with these files. Just yesterday I learned from the OU folks that we need to look at DAE file formats. This is something that we were sort of familiar with, but we need to look into that more. Yay for us, we learned something new yesterday. So these are some of the results that we'll get out of these things. All of these are of various sizes. The one on the top left here is an extinct antelope in Wyoming. This is fitting: we have about three times as many antelopes as we do people. In the middle is a mastodon molar.
This is a very large thing. We had to rotate it manually. It was too heavy for our turntable. And then on the right, this is a North American fox. I'm not sure why we chose that one in particular, but it's a pretty small skull. It's about three inches across. So like I said before, Islandora is not really set up to handle three-dimensional files. So we have kind of a temporary solution here before we get onto Imago, hopefully. So you see here in the screenshot, and all these are available online at that handle there. We're basically taking all of those files, so the preservation copies as well as the access versions, and we're putting them into a zip container that people can download. So that's the... We have a UW specimen number. That's the accession number that the Geological Museum gives them. The species identification. And then this obviously isn't rotating on this screen, but we're also making that animated GIF there so that people can get an idea that you can do something with these. I'm probably not going to show you this right now, but basically you can download these and open them in any kind of 3D rendering application. This is in MeshLab. Again, this is... I'll play a little bit of this. This is that same skull. This is the low resolution STL file. No, this is the PLY file. You can do a lot of things with this. You can print it. You can manipulate it. You can add texture information to it. You can do volumetric calculations, measurements and things like that. And those are just things you can't do online with Islandora right now. So we're letting people take these and do whatever they want with them. The Keyence Digital Microscope. This was a high dollar item that we bought with one-time monies from the libraries. Hopefully our... I don't know if our Dean knew that, but he's in the audience today so... This was before he started. So we use this for the small specimens.
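To make the "volumetric calculations" mentioned above a bit more concrete, here is a minimal sketch of one standard way to get the volume of a downloaded mesh: sum the signed volumes of the tetrahedra formed by each triangle and the origin. It assumes the mesh is closed and its triangles are consistently oriented, and it works on plain Python lists rather than on an STL or PLY file directly. The function name and the tiny test shape are made up for illustration; this is not how MeshLab or any particular tool implements it.

```python
def mesh_volume(vertices, faces):
    """Volume enclosed by a closed, consistently oriented triangle mesh.

    vertices: list of (x, y, z) tuples
    faces:    list of (i, j, k) vertex-index triples
    """
    total = 0.0
    for i, j, k in faces:
        ax, ay, az = vertices[i]
        bx, by, bz = vertices[j]
        cx, cy, cz = vertices[k]
        # Scalar triple product a . (b x c) is six times the signed volume
        # of the tetrahedron (origin, a, b, c).
        total += (ax * (by * cz - bz * cy)
                  - ay * (bx * cz - bz * cx)
                  + az * (bx * cy - by * cx))
    return abs(total) / 6.0


# Hypothetical test shape: a unit right tetrahedron.
tetra_vertices = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
tetra_faces = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]
print(mesh_volume(tetra_vertices, tetra_faces))
```

If a scan has holes, the sum no longer corresponds to an enclosed volume, so a mesh would need to be closed before a number like this means anything.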
I'll go through this pretty quickly because I want to get onto Imago and leave some time for questions. The Keyence lets us do some really interesting post-processing things that typically take a long time, and two of those things are image tiling and focal stacking. So if we've got something like this mandible on the top left here, the focal area that the Digital Microscope lens has won't cover the entire object. So you can... I don't know if you can see it very well on the slide or on the picture there, but you can see some kind of steps on the bottom left. So basically what it does is it'll take an image starting up here and then it'll just move back and forth across this specimen, and it tiles them and puts them all into a composite image automatically. You basically give it a range and say I want everything from here to here, and it just goes back and forth and does it. The other thing it does is focal stacking. So if you've got an object like this very small tooth for example, the focal depth on something like this is so shallow that it can only bring something at the top into focus at one focus setting. So you can see something like this on this cow that I have over here. Basically it'll take a stack of images at different focus depths and it stacks all of those into one focal stacked image that's all in focus. So it does that automatically as well. And what you get in the end is something like this. On the left, this is a very small molar that's been tiled and focal stacked. And then on the right, it lets us do elevation maps and things like that, so researchers can get some really valuable information out of this. It also does elevation profiles and it generates 3D models. But these are some of the really interesting things that you get out of these teeth that don't seem very interesting on the surface, or at least to the naked eye.
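The focal stacking idea described above can be sketched in a few lines: for every pixel, keep the value from whichever image in the stack is sharpest at that spot. The version below is a deliberately simplified illustration on small grayscale grids held as nested lists, using the absolute Laplacian as an assumed sharpness measure. It is not the microscope's own algorithm, and real tools also align the images and smooth the selection; the function names are made up for this note.

```python
def sharpness(img, r, c):
    """Absolute 4-neighbour Laplacian at (r, c); neighbours clamp at the border."""
    h, w = len(img), len(img[0])
    up = img[max(r - 1, 0)][c]
    down = img[min(r + 1, h - 1)][c]
    left = img[r][max(c - 1, 0)]
    right = img[r][min(c + 1, w - 1)]
    return abs(4 * img[r][c] - up - down - left - right)


def focal_stack(images):
    """Combine same-sized grayscale images taken at different focus depths.

    Each output pixel comes from the image with the highest local sharpness
    at that pixel (the first image wins ties).
    """
    h, w = len(images[0]), len(images[0][0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            best = max(images, key=lambda im: sharpness(im, r, c))
            out[r][c] = best[r][c]
    return out


# Two made-up one-row "images": each has detail on one side and blur on the other.
near_focus = [[0, 9, 0, 5, 5, 5]]
far_focus = [[5, 5, 5, 0, 9, 0]]
print(focal_stack([near_focus, far_focus]))
```

Keeping the individual frames, not just the stacked result, is what would let someone rerun a step like this later with a different method.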
On the left here, this is a really small molar, and that image on the left is a total width of a thousand microns across, and that's one millimeter. So you can see the wear patterns on this really small tooth, and this is the kind of stuff that paleontologists study to generate research questions about climate and flora and the age of the animal and things like that. This is, again, that same tooth. This is the 3D model that the Keyence generates. You can see the glue on the bottom where it's mounted. It has some problems with this. We're still trying to make this process a little bit better. So, like I said before, in this community or in this area there really aren't any good preservation standards, and this is where we're really hoping that Imago can step in and give some benefit to the biodiversity community. What kinds of file formats should we store? What resolutions should we store them at for preservation purposes? Should we be storing a watertight model or a pre-fused model or not? What's the importance of the mesh versus the color information on the model? Some people want to see the color information. For us, we want to see the color information on this because the UW specimen number is on that, and that's how we look up the metadata. But the paleontologists don't seem to care that much. They really want to see the model itself. With focal stacked images, should we store the individual images that it takes, or should we store just the stack? So there's all sorts of aspects of this that may or may not be most important to paleontologists, and thankfully we're working with paleontologists on this to answer some of those questions. I'm going to go ahead and shift gears here and talk about Imago, because the paleontological digitization really dovetails nicely with this project. We're also working on our campus with our herbarium, which is one of the biggest in the country.
We have almost a million specimens in that collection. But Imago is really built to work with current biodiversity systems, and I'll give you some context here in just a second. This system is being built on Sufia, Hydra, and Fedora 4. We haven't gotten access to this at University of Wyoming yet, but it's available now at imago.indiana.edu, where I think they've been mostly just working with herbarium specimens up to this point. We have a lot of use cases for this in Wyoming because we have such rich biodiversity collections in the state and at UW. We have an NSF grant out for this and we're hoping to hear back about this actually sometime this month. We just presented about this at ACRL a couple weeks ago, and I thought this would be particularly relevant given the nature of this talk. At UW, obviously we have the Geological Museum with the fossil specimens, about 40,000 of them. We have a vertebrate collection with about 9,000 specimens. These are more contemporary animals obviously that are taxidermied, things like that. Then we have the Rocky Mountain Herbarium. This is, I believe, the fourth biggest herbarium in the country. We have 850,000 specimens online now. These are mostly two-dimensional images for the herbarium and for the vertebrate museum, and then just last week we had a meeting with our invertebrate collections manager, and they have a million insect specimens. We have lots of stuff but we have nowhere to really put a lot of it. Just to give people an idea of what these things look like behind the scenes. On the top right, these are two pictures of the geological collections or fossil collections. On the bottom left is one of our librarians and the herbarium curation manager and a plant specimen, and then on the lower right is the museum of vertebrates, and in this case those are bird specimens, but these are really kind of like special collections in their nature. They've got lots of moving cabinets and things like that.
So what do they do with these things? Where do they put them? In the biodiversity community there are lots of different platforms for digital asset management, just like there are in libraries. But these have some shortcomings, and I'll get to that in a minute, and you'll get an idea of where Imago fits into this ecosystem. Arctos, Specify, and Symbiota are really what you could think of as parallels or counterparts to DSpace, Hydra, Fedora, Islandora, things like that. So it's a splinter community. They all do a lot of the same things but they all do them in different ways. They all have pros and cons, etc. And all these are typically hosted at individual institutions, although in some cases they're hosted at Kansas and Florida for example. University of Florida and University of Kansas in particular. The problem with these systems, and where Imago comes into it, again gets back to preservation standards. So none of these content management systems have any kind of preservation means whatsoever. So just a quick story about this. When we first started talking with our Geological Museum about a year and a half ago, they were using Specify. And that's the one that I'm most familiar with, but we also have Arctos, Specify, and Symbiota running on our campus, and IU also has I think both Symbiota and Specify, but maybe not Arctos. When we started looking at the documentation for Specify, one of the things that we were trying to understand is where do they store their master images, either in two-dimensional or three-dimensional format? What kind of technical metadata are they capturing? Are they applying something like FITS to pull out technical file information? Are they applying fixity checks or running checksums on any of these to preserve these objects over the long term? And the long and short answer to that is no. Just no. These systems are really set up for access only. So they don't generate any derivatives based off a master file. You have to do that externally, basically.
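For readers less familiar with the fixity checks mentioned above, here is a minimal sketch of the idea using only Python's standard library: record a SHA-256 checksum for every file in a folder, then recompute later and report anything that changed or went missing. The function names and the folder layout are assumptions made up for this note; this is not something Specify, Symbiota, Arctos, or Imago is known to do in this form.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """SHA-256 hex digest of a file, read in chunks so large scans fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory):
    """Map each file's path (relative to directory) to its checksum."""
    root = Path(directory)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(directory, manifest):
    """Return the relative paths that are missing or whose checksum changed."""
    root = Path(directory)
    failed = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            failed.append(rel)
    return failed
```

In use, the manifest from `build_manifest` would be stored alongside the master files, and `verify_manifest` would be run on a schedule; an empty list back means every file still matches.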
You can't store any of those in these systems, and to make matters worse, most of these biodiversity digitization initiatives are funded through a group called iDigBio and the NSF, in a lot of cases through these grants called ADBCs. So that's Advancing Digitization of Biodiversity Collections grants. But the NSF doesn't fund the preservation portion of it. And this community just isn't really talking about this in the ways that libraries are. So I think libraries really have an opportunity and a lot to offer this community right now. And that's why I think this Imago project is so interesting and will hopefully help this community out. But I've actually read NSF grants from collaborating institutions within the Rocky Mountain Region that wanted to work with Wyoming on herbaria digitization projects. And in the data management plan portion of their NSF grant, basically it said we're going to put this on an external hard drive and we're going to go put it on a shelf. And it got funded. So that's where this community is at. We're just a lot further along than the biodiversity community is with regard to preservation. They're talking about it, but their systems just don't handle it. So Imago is, like I said, being developed at IU. I'm not going to be able to get into a lot of details with this, but what we're hoping is that Imago will work with systems like Symbiota and Specify. We don't know where necessarily content is going to originate, whether it's going to go into Imago first or into Specify or Symbiota. But, by and large, this is not meant to replace those systems. It's meant to complement them. And just for another parallel for you to think about in library land or in libraries, we aggregate content all over the place. We push things to share. We push things to Europeana, etc.
In the biodiversity community, they're doing the exact same thing, but they use a couple of systems called GBIF, which is the Global Biodiversity Information Facility, and iDigBio, which is not only a system, but it's a community. And those are aggregation systems for the biodiversity groups, and that's where these scientists find their information, so they're trying to do the same thing. I don't know how many of you are familiar with Jetstream, but basically part of how UW is hoping to be involved with this is we can run Imago on a Jetstream allocation or an XSEDE allocation, and this is really meant for institutions that don't have as much technical capacity and support as a place like IU does. Wyoming is pretty small. We do have an HPC group or a high-performance computing group that works with XSEDE, but we don't have a lot of development capabilities, so we're hoping to get this running on Jetstream and kick the tires with it, and hopefully we'll be able to offer this to the broader biodiversity community and other collections and institutions that have these same problems. So I have a couple screenshots here. I'm not going to get into a lot of detail because I'm about to run out of time, but at the end of the day, Imago will hopefully work with two-dimensional and three-dimensional images as well as preservation level metadata. You can look at it live at imago.indiana.edu. On the right here is just a screenshot. Here's another one, and this is your typical herbarium specimen. On the right, the biodiversity community uses something called Darwin Core. This is a derivative of Dublin Core, but it's meant more for taxonomies and observation data and occurrence data. I think Robert McDonald built this slide, so I guess at IU, they have about 70,000 specimens right now in Imago. I don't know how many they have total in their collections at IU, but we have a lot too.
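As a rough illustration of the Darwin Core metadata mentioned above, here is a minimal sketch that writes specimen records to CSV using a handful of Darwin Core term names as column headers (`occurrenceID`, `institutionCode`, `catalogNumber`, `basisOfRecord`, `scientificName`). The function name and every value in the sample record are made up for this note; they are not taken from the UW or IU collections, and a real export would carry many more terms.

```python
import csv
import io

# A small subset of Darwin Core terms, chosen for illustration only.
DWC_FIELDS = [
    "occurrenceID",
    "institutionCode",
    "catalogNumber",
    "basisOfRecord",
    "scientificName",
]


def to_dwc_csv(records):
    """Serialize a list of dicts keyed by Darwin Core terms into CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=DWC_FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


# Hypothetical record: identifiers and values are placeholders, not real data.
example = {
    "occurrenceID": "urn:example:0001",
    "institutionCode": "UW",
    "catalogNumber": "0001",
    "basisOfRecord": "FossilSpecimen",
    "scientificName": "Bison antiquus",
}
print(to_dwc_csv([example]))
```

Flat files like this, keyed by shared term names, are what make it possible for aggregators to pull records from many different collection systems.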
There's a lot of collections like this out there that have just hundreds of thousands of herbarium specimens. Just like most library systems, Imago will offer faceted browsing, and this is something that is also kind of a shortcoming of the Specify, Symbiota, and Arctos systems. You really have to be a scientist to use those. They're really hard to use. I'm kind of hoping that Imago will be a little bit of a user experience upgrade over Specify and Symbiota. Just some time for some acknowledgments. At IU working on this, we have a meeting set up next week to talk about how Wyoming is going to get going on this. At the University of Wyoming, we obviously have quite a few people involved with this as well. Myself, a couple of librarians and our Islandora developer. If you have specific questions about Imago and Jetstream, I've been told to have you contact Robert McDonald. You can contact me at the University of Wyoming. With that, I'll take some questions.