All right, hello everybody. Thanks for joining me for "More Than Manuscripts: Transforming Special Collections Materials into Ornithological Data." I also wanted to give a quick thanks to the csv,conf organizers for putting this conference together and transitioning it so well to a virtual conference. So my name is Wesley Teal. I'm a metadata librarian at Iowa State University. My pronouns are he/him, and I'm presenting from the ancestral lands of the Báxoje, or Iowa Nation, ceded to the United States by the Meskwaki and Sauk Nations in the Treaty of 1842. So today I'll be talking about the Avian Archives of Iowa Online, or Avian for short, and how, as part of digitizing and describing items from several physical collections, library staff built an ornithological dataset from historical manuscripts. So what is Avian? It's a digital collection of approximately 10,000 ornithological primary records selected from eight physical collections. The items in the collection cover a wide variety of birding-related materials, dating from as far back as 1819 up to 2011. These collections include a large amount of sighting documentation as well as photographs, lantern slides, correspondence, administrative records for various Iowa birding organizations, and more. The picture here on the slide shows the landing page for avian.lib.iastate.edu. It's a custom web portal designed by library staff and Iowa State's web development team and implemented by the web development team. So I'm going to give a little bit of a project overview. Avian was made possible by a Digitizing Hidden Collections grant from the Council on Library and Information Resources, awarded in 2016. CLIR provided $188,825 for a 24-month project, with the library providing a matching amount. And so between June of 2017 and June of 2019, 26 university employees worked on the project to varying degrees, including two staff members hired specifically for the project.
And additionally, five student employees took part. So here's the full list of everybody who worked on the project. As you can see, our work depended on collaboration between four library departments and one university IT unit. Among all of us, I don't think we had any background in bird watching or ornithology at all. So the project proved to be quite a learning experience. And "Is this data?" was kind of the big question we had coming into it, because one of the unique aspects of this project compared to earlier ones is that we planned to convert some of the collections material into scientific data. One of the physical collections we digitized, the Iowa Ornithologists' Union papers, included roughly 2,500 well-documented rare bird occurrences. And so I've got a picture of the meme of a person pointing at a thing, in this case a rare bird occurrence documentation form, and asking, "Is this data?" So pictured on this slide is a fairly typical rare bird documentation form detailing the occurrence of a Roseate Spoonbill at the Waubonsie wildlife area in Fremont County, Iowa. The form includes species, specimen count, location and habitat data, as well as the name of the person recording the occurrence. Additionally, this form and many others included information on bird behavior, markings, weather conditions, and the references used to identify a species. We had what looked like data detailing more than 2,000 rare bird occurrences, including 194 different species at 670 unique locations across Iowa and in bordering states, mostly Nebraska or Illinois along the rivers there. And these occurrences took place between 1975 and 1999. So now the question we had became: how best to convert the information in these documents into data that was useful for ornithological research? After evaluating the documents and researching scientific data standards, the project team decided to use Darwin Core.
And most of this work was done by previous librarians and staff, particularly Paloma Graciani Picardo, who was working on the project before I joined it. I kind of joined partway through. Darwin Core, if you're not familiar with it, is a standard glossary of terms intended to facilitate the sharing of information about biological diversity. And it seems to be pretty widely used, from what I can understand. So pictured here is part of a spreadsheet created by Paloma, whom I mentioned, while working out what Darwin Core fields we could populate from the collection's rare bird documents. So after identifying how to convert the information in the rare bird documents into Darwin Core, the next step was entering that information into the digital collections database. This work was handled by our metadata team, with Peter Sutton, who was hired for the project, predominantly handling the rare bird documents. During the data entry stage, we encountered various complications. Some of the handwritten documents proved harder to decipher than others. Even more cryptic was some of the shorthand birders used for the various field guides that they consulted when trying to confirm a bird's identification. It took us a while to learn that citations like "Audubon" meant The Audubon Society Field Guide to North American Birds: Eastern Region, and not the Western Region, nor The Audubon Society Master Guide to Birding, which was also used but was usually shortened to "Audubon Master." In addition to including data from the documents, the Avian website included metadata about the digital objects, including title, description, digitization method, and so on, that are not part of the object's scientific data. And pictured here, you can see on one side the original documentation form for the Roseate Spoonbill sighting, and on the other part of the metadata and how that was displayed on the Avian website.
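A minimal sketch of the kind of mapping described above, turning the labeled boxes on a documentation form into Darwin Core terms. The form labels and the mapping table are assumptions made for illustration, not the project's actual spreadsheet; the Darwin Core term names (`vernacularName`, `individualCount`, `locality`, `county`, `stateProvince`, `habitat`, `recordedBy`, `basisOfRecord`) are real terms from the standard. Only values stated in the talk are filled in; everything else is left empty.

```python
# Hypothetical mapping from form labels to Darwin Core term names.
# The left-hand labels are assumptions; the right-hand terms are real Darwin Core terms.
FORM_TO_DWC = {
    "species": "vernacularName",
    "number": "individualCount",
    "location": "locality",
    "county": "county",
    "state": "stateProvince",
    "habitat": "habitat",
    "observer": "recordedBy",
}


def form_to_darwin_core(form):
    """Build a Darwin Core style record from a dict of transcribed form values.

    Empty or missing values are skipped, and labels with no mapping are ignored.
    """
    # A sighting reported by a person is a human observation in Darwin Core.
    record = {"basisOfRecord": "HumanObservation"}
    for label, value in form.items():
        term = FORM_TO_DWC.get(label)
        if term is not None and value not in (None, ""):
            record[term] = value
    return record


# Only details mentioned in the talk are filled in here.
form = {
    "species": "Roseate Spoonbill",
    "location": "Waubonsie wildlife area",
    "county": "Fremont",
    "state": "Iowa",
    "observer": None,  # not stated in the talk, so left empty
}

record = form_to_darwin_core(form)
```

Keeping the mapping in a plain dictionary mirrors the spreadsheet approach: the crosswalk can be reviewed and corrected without touching the conversion logic.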
So once we had the data in our databases, we still had to export it and separate the scientific data out from the digital collection metadata. So pictured here is a portion of a spreadsheet I made, working off Paloma's earlier work, detailing what fields in our database map to what fields in Darwin Core, along with various notes for the export. Erin Anderson, the project's manager and co-principal investigator, and I worked with the web development team to export the data as Darwin Core. And after nearly two years of work, we finally had our scientific dataset. Well, almost. We still needed to publish it. Our plan was and is to add our data to GBIF, the Global Biodiversity Information Facility, by publishing it through VertNet. While we were working with VertNet, it became clear that our data needed some further cleanup. Some of this was due to our relative inexperience with Darwin Core, leading us to populate various fields with invalid values or to omit some values required by VertNet or GBIF. Some issues were artifacts of the export process: we had a lot of stray semicolons littering different fields. And there were some additional issues caused by messy data, particularly geographic data. There were a few Darwin Core fields that had two locations, usually because the birder listed a sighting as being between two locations, and that really wrecked the export process. So pictured here is part of a spreadsheet that shows kind of two versions of the fields: on the left, the original export that just came out of the database, and on the right, two columns showing how we've changed it in collaboration with VertNet. So you can see that we dropped the prefix from all the field names and altered a lot of the values, correcting them, inserting them, and even adding new fields, such as the catalog number and establishment means fields. And this is just part of it. There are changes throughout. We have 60- to 70-something fields that we've populated by the end of the process.
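Two of the cleanup steps described above, dropping the prefix from field names and removing stray semicolons, can be sketched roughly as follows. The talk does not say what the prefix was, so `dwc:` is an assumption here, and the functions are illustrative rather than the project's actual export code.

```python
PREFIX = "dwc:"  # assumed prefix; the talk does not name the one that was dropped


def clean_field_name(name):
    """Drop the assumed prefix from a field name, if present."""
    return name[len(PREFIX):] if name.startswith(PREFIX) else name


def clean_value(value):
    """Remove stray semicolons while keeping genuine multi-value fields.

    The value is split on semicolons, empty pieces are discarded,
    and the remaining pieces are rejoined with "; ".
    """
    parts = [part.strip() for part in value.split(";")]
    return "; ".join(part for part in parts if part)


def clean_record(record):
    """Apply both cleanup steps to every field in an exported record."""
    return {clean_field_name(key): clean_value(val) for key, val in record.items()}


exported = {"dwc:county": "Fremont;", "dwc:stateProvince": "; Iowa ;;"}
cleaned = clean_record(exported)
```

Splitting and rejoining, rather than deleting every semicolon, is a deliberate choice: a field that legitimately holds two values keeps both, which makes those records easy to find and review by hand later.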
So there's still some remaining work to do before we can finally publish our data. The data is most of the way to being ready, but there are still about 20 records, as shown in the spreadsheet image here, that need to be cleaned up. Each of these had two locations listed in the database, which, like I said, wreaked havoc. And so I'm only part of the way there; I still need to make the time to get through and clean those up. And once that's done, I'll be able to send the data back to VertNet. They can take a look at it, see if there's anything else that needs to be cleaned up, and then hopefully we'll get it published. But it is actually sort of available right now. It's just in kind of a more raw form. So you can download it from the Avian website's data page at avian.lib.iastate.edu/data. And that's pictured here. The project metadata contains not only the rare bird occurrence data, but also the metadata for the entire collection, for every object we've got in the collection. The website also has a read-only API, which allows for exploration of the collection's data and metadata in a more web-scrapy way, if that's how you'd rather look at it than in a spreadsheet. And there's also some data in the collection that's yet to be uncovered. So for any intrepid data explorers, we've got a lot of stuff that's still locked away in the digitized documents. Among the low-hanging fruit is a set of species occurrence tables, like the one pictured here for an Eastern Bluebird, that were compiled by the Iowa Ornithologists' Union and that just have all the sightings, usually from about 1970-something to sometime in the '80s. There are also other rare bird occurrences, dating back as early as 1819, that generally lack as much detail. So they might not have a full location or something; they're just not detailed enough for us to create a full Darwin Core record from, so we didn't do that, but they might still have information of value to researchers.
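For anyone working with a downloaded copy of the data, records that list two locations could be flagged for manual review with something like the sketch below. The column names `catalogNumber` and `locality` are Darwin Core terms, but whether the download uses them, and whether multiple locations are separated by a semicolon, are assumptions. The sample rows are placeholders, not real records.

```python
import csv
import io


def rows_with_multiple_localities(csv_text, field="locality", delimiter=";"):
    """Return the rows whose `field` holds more than one delimited value."""
    flagged = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        raw = row.get(field) or ""
        values = [v.strip() for v in raw.split(delimiter) if v.strip()]
        if len(values) > 1:
            flagged.append(row)
    return flagged


# Placeholder data for illustration only.
SAMPLE = (
    "catalogNumber,locality\n"
    "1,Location A\n"
    "2,Location A; Location B\n"
)

flagged = rows_with_multiple_localities(SAMPLE)
for row in flagged:
    print(row["catalogNumber"], row["locality"])
```

The function only reports candidates. Deciding which single locality (or which description of the area between the two) belongs in the record still takes a person reading the original form.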
This is your five-minute warning.
Thank you. For those willing to dig deeper, the collection could potentially be combed for a variety of historical data, including weather and habitat data, migration patterns, population sizes, bird behavior observations, and so on. The collection's also got a copious amount of letters and field reports. It's even got a few different journals about birding, just personal journals, and researchers could use those to study the social networks of birders in Iowa and the ties they have to the larger birding community as well. And finally, for those of you who are wondering what a Roseate Spoonbill looks like, here's a picture. It's an interesting-looking bird. And now I'm happy to take any questions.
Thank you very much for that, Wesley.
Absolutely.
So we do definitely have time for two or perhaps three questions, if anybody has anything they would like to ask Wesley. And if not, as with everything else, please feel free to have a think about it and ask some questions later, and we'll share them in Slack to make sure Wesley has the opportunity to come back and answer them. Wesley, we can either end here, or if you've got some last comments or observations you would like to make, there's a few minutes left.
Yeah, I think ending here is fine.
Oh, sorry, Marianna says she has a question, and I apologize for missing that.
Oh, yes, we have approached local birding groups about their interest in the data. Well, not so much me: Erin Anderson, whom I mentioned earlier, worked pretty closely with the Iowa Ornithologists' Union, with their current team, on a lot of this. We had sort of a public launch and gala, back when you could have public events, and had invited a lot of folks from different birding organizations. So, yeah, we've reached out to our local groups to really keep them involved and make this a resource for them too.