Okay, good morning, everyone, and welcome to this session, which I think of as a presentation on the robotics project. My colleagues think it's about collecting complex multimodal materials, so you can [unclear]. I'm Keith Webster, dean of university libraries at Carnegie Mellon. You will hear shortly from my colleagues Brian Matthews and Kate Barbera. Before I begin, I'd like to thank the Alfred P. Sloan Foundation for supporting this work, along with contributions from corporate and private donors. My colleagues knew I couldn't resist talking about the World Cup at some point, so these are some soccer robots. I'm just glad we're not up against the Argentina-Croatia game, because I would not have been here. These robots, quite seriously, were developed by faculty in our Robotics Institute with the aim of figuring out whether robots could ever beat humans at soccer. They thought they were on to a good thing because they built a training data set looking at the moves of the Scottish national team, and then they tried them against real players and it all went badly from there. Enough of the soccer stuff. Carnegie Mellon is an institution with an early and rich history of artificial intelligence and robotics, and many of the technologies common today were first developed on our campus: things like self-driving cars, speech recognition, the internet of things, facial recognition software, artificial intelligence and art, machine learning and library special collections. Our Robotics Institute was founded in 1979 with the dream of ushering in a new age of thinking robots. Since then, the university has experienced many successes in intelligent manufacturing, space-related robots, and sending robots into the sites of nuclear accidents in this country and in the former Soviet Union. We also have a separate National Robotics Engineering Centre, which develops and matures the research from the Robotics Institute into a more conceptual and commercialisable approach.
The next in that trinity of large-scale robotics facilities will be our new Robotics Innovation Centre, which will begin construction on a brownfield site next year with a focus on translational research. So where does the library fit into this sort of thing? Many of you have heard me talk before about the inside-out, outside-in approach to libraries that is shaping our work. The way I would characterise that briefly is that the outside-in library is the one that was established to bring scholarly content from outside onto our campus in the form of books and journals. As our contemporary collections become almost universally digital on arrival and fairly homogeneous across a network of libraries, that has really become a process, and do not tell my colleagues this, that operates on autopilot. That shift in effort has allowed us to allocate resources more fully to becoming an inside-out library, one in which our efforts are devoted to capturing, curating and sharing the products of CMU's research with the outside world. That is in line with the university's strategic plan, which calls on us to curate the evolving scholarly record and use it in ways that have an impact on the world. Like many, we have enjoyed the opportunities and technical challenges associated with preserving data, code and publications to support that ambition, to drive the impact of our scholarship and to promote scientific reproducibility. As a side note, I was thinking when I was in Sayeed's presentation yesterday about the machine learning reproducibility crisis that we are not talking much about. It is a crisis, to an extent, in AI, where we see less than a third of research being verifiable and only one in 20 researchers sharing their code. That is a looming topic that we really need to unpick against the backdrop of reproducibility work. But what about the tangible products of research beyond publications, data and code?
Our colleagues in the fine arts, where they are concerned with paintings and sculptures and costumes and compositions, have done tremendous work, and the GLAM community has really advanced our thinking and our practice around curation of tangible artistic outputs. We really have not begun to scratch the surface of how we think about the tangible outputs of artificial intelligence work. Sayeed is leading our open source programmes office and is focused on how we capture and make the most of code. But what about things like a robot, which is inherently a tangible object powered by AI? Robotics really serves for us as a very fruitful study in how we think about curation of the fourth industrial revolution, because what we don't want is for our robots to rust in peace. Thank you. Sometimes I worry that the accent is going to mangle those things. We have been very fortunate to partner with our School of Computer Science, where our Robotics Institute is located, to raise awareness of the issues around curating robots and associated artefacts amongst our research community, both on campus and in the growing Pittsburgh robotics community. We have become a university fundraising priority, and we are very pleased with the work that has been completed in the first phase of our research. With that, I am going to hand over to Brian, who is going to give you the philosophical bit of the lecture, and then Kate will tell you what is really happening. Brian, over to you.
All right, so good morning, everyone. Keith talked a lot about CMU, but I wanted to take a step back for a minute and talk about the cultural ethos of robots in Pittsburgh. I took two pictures in our airport on my way to CNI, and this is one of a bridge that transforms into a robot. It's an interesting 20-foot sculpture in the Southwest Airlines arrival area, but this one was even more interesting to me. This is a subtle one in Concourse A, if you ever find your way there.
There's this art installation. Have you seen this, Kate? Yeah, it's this retrofuturist thing, as they kind of call it, a robot repair shop where you bring your robot in and get it... anyway. Robots are really just everywhere in Pittsburgh. When we started this, we asked a question, and I don't know that we've ever actually answered it. It's still an ongoing thing, but it's this philosophical question: what is a robot? If we're thinking about archiving or building collections, or just curating them in general, we need to have a common understanding amongst ourselves. It's interesting when we talk to the practitioners of robotics, because we saw very early on that some lean really heavily into the engineering side and others lean more into the computational side, the programming side, and there are nuances and differences between them, and different priorities or emphases. So that's a question we're still working on, but we took a step back from that and really tried to look at robotics as a scientific endeavor, as a scientific enterprise, and we see it as widespread, significant, historic, and intertwined with the human experience. You can see it in manufacturing and transportation, medicine, healthcare, agriculture and mining, exploration, environmental cleanups, and in other ways as well.
I think what we feel is a responsibility and a drive to document the technical, the social and the cultural history of robotics, given this prevalence, particularly at our university. When we think about it as a scientific enterprise, a scientific endeavor, a mode of inquiry, we can map that to our current models in terms of workflows and processes, and there are tons of models. We can understand data life cycles and grant life cycles and research life cycles, all of that, and think about stewardship practices with that. But here's the thing: as we dug into it more, what was maybe the most important thing was really the social infrastructure, the human interaction, the engagement between all these individuals. I know it's kind of hard to read, but you have these different sorts of people who are part of labs and part of projects, and understanding the roles between them is really what we needed to unlock to understand robotics. I think what we felt is that, to really put all the pieces of this together, the human element is important. Again, this might be a little bit hard to read, and it is not exhaustive, but it's pretty close: these are the types of items we encountered in talking with people, the types of information artifacts and objects. You have physical things like machines and tools, and casings to take soccer robots to tournaments, because you've got to transport them. There's the research, there's the code, there's the data, there are more hours of video than you would imagine, there are photos, there are all kinds of websites and dynamic content, and financial documents too. Shout out to Meredith on that one.
So what we arrived at is many questions, but I think these are the two that really resonated for me and kept coming back: what's important, and how is it all connected? We found ultimately that there's this vast amount of diverse, dynamic and fragile information, and we can't keep all of it, we can't take all of it, so what do we take? And again, you have a robot, you have the code, you have all this stuff. What's the connective tissue that brings it all together? That's the thinking that we had, and we started using this term to give us a sense of it, which is multimodal collections. It's this interconnected ecosystem, this network of all those tangible and intangible types of information, the objects, the artifacts, and the narratives as well, that really comprise that scientific process. What I wrote down in my notes is that a big key thing was understanding that, once we really started to appreciate robotics as an interdisciplinary practice, we needed to bring an interdisciplinary approach to it. It stretched beyond what our university archives could do, and for that matter beyond what a scholcomm group or a data group or any of these individual groups could do. It's really about bringing it together, and so we had to form our own ecosystem for this project. It's helpful being able to bring in a lot of the talent, the skills and the tool sets that we had within our library, for conversations or to work with us on different parts and pieces of it. Thankfully, Sloan was also able to invest their interest, and their money, in helping us round out some things that we were missing. But really, being able to have that ecosystem to understand an ecosystem, I guess, is kind of
where we're at. Just to give you a sense of some of the things we would talk about, and I get into high concepts a lot of the time, stratigraphy was an interesting one that really stuck with us, because I think we connected it with version control a little bit. When you think about a robot, you have version one, version five, version 50, all these different things. Each of those versions is developed by slightly different teams or totally different teams, with different components and different computer code, and there is an assortment of methods and processes. But while they're unique, there's a through line, there's a lineage, a trajectory. These are the layers stacking up year after year, grant after grant, team after team, that you can document and understand, and that was just an interesting concept from a geological perspective. This next one is the operating chain. I'll let Kate do the French pronunciation for that, and I am cheering for France. I've been cheering for France since the beginning, much to Keith's displeasure, but here we go. I would like a Morocco-Croatia final, though; that would be interesting. It's a very international campus, so football comes up a lot. Anyway, the operating chain really stuck with me, and I think it stuck with Kate too. This is an interesting concept that comes from anthropology and archaeology, and it gave us a framework for conversations around looking at technological development or processes combined with social acts. I'll just read this off, because I think it's really good: the operating chain enables one to better understand not only the society in which a technique originated but also the social context, the actions, the cognition
that accompanied the production of an object. In this one they're looking at wheat production, but we're applying that concept. Let me go back here. Again, you have all these different parts and pieces happening, and a lot of it is non-linear. There's not just one grant and it's done. There are 20 grants happening at once, and postdocs doing different things, and dissertations being written. It's a messy enterprise. So being able to think about how things are being developed over time and how they are all connected was really interesting to us. Moving on, I think we feel a sense of urgency, because a lot of the foundational leaders and practitioners in robotics at our university are retiring or have retired, and there's an urgency to really get those stories and get those materials. We have an exhibit that talks about some of the work this project has been doing, and this is the one slide I grabbed from it. Obsolescence is nothing new to this group, but it's something that we feel is, I don't want to say exaggerated, but heightened even for early career researchers today because of the type of environment and the type of projects. Everything's cloud-based, and there's a lot of reuse or deletion. We can't wait 20 or 30 or 40 years for someone to donate their materials to us, because they're going to be gone. We keep getting at this idea of how we can get access quicker, and how we can get snapshots, more iteratively, of projects that are happening in robotics, and really beyond robotics, to do this type of work. So I'll end my piece here before handing over to Kate. This was kind of an unofficial motto, maybe it was a battle cry, it was a grand challenge that we
faced: how do we preserve things that were not intended to be preserved? It's been a fun place to operate in. So, all right, Kate.
Hopefully I can get both computers up here without knocking the entire podium down. Let's see. Well, I'm Kate Barbera. I'm an archivist and oral historian at Carnegie Mellon University, and I am also lead archivist for the robotics project. I also want to add my thanks to the Sloan Foundation for funding this research and letting us explore some of these really fascinating challenges. Robotics is a wonderful case study for archives. You've heard about the background of this project and some of our thinking about our approach, so now I'm going to give you the archivist's perspective. I'm going to talk about our activities and our research over the past couple of years, since 2020, and I have also included on the slide a summary of our vision. Our vision is really to document the development of robotics and to educate and inspire future generations. What that means is we're not just interested in products, we're interested in process. We're also focused on long-term preservation, so thinking about robotics on longer time scales. In 10 years, 20 years, 30 years and beyond, are we going to have access to materials from this crucial moment in history, from this field that is really transforming our society and our world? As Brian mentioned, the field produces a large volume of complex material: artifacts, archival material, documentation, software, code, video, photographs and many other types of material. This is made even more complicated by the vast teams that undertake robotics research. It's a highly collaborative field, and what this means from a practical point of view, for myself as an archivist, is that these materials are often geographically distributed, not just within our region but across the country and across the globe. That in and of itself is a difficult challenge we're facing. So
the image on the slide is a common visual understanding of robotics. You imagine these discrete items that were designed to accomplish specific tasks. But what we found is that robotics actually looks something more like this, and you can imagine the look on our processing archivist's face when she saw one of the lab storerooms for the first time. It's very rarely discrete items. They're interconnected, they're complex, they're messy, and they are created by many different people who are part of these transient communities. You have students, postdocs and researchers coming in and out of these labs, creating a very dynamic and, as you can see, messy environment. So this is really the problem that we're facing, and we could have attacked it from several different perspectives. We could have looked at it from an archivist's perspective, a data management perspective, a digital preservation perspective, software preservation, pick your discipline. But what we chose to do is take a step back and think about the fundamental questions that we need to answer in order to even figure out how to get started. Brian already touched on some of these. The first one is how we define robotics, and if you ask a roboticist, as you mentioned before, even they don't really know. Ask the question "what is a robot?" and you will get a different answer from every person you talk to. So defining our scope is one of our challenges. The other question is something that archivists think about all the time: what is most valuable? What has long-term value, what is important to preserve over the long term, and who is making that decision? Who are we preserving it for? Is it for archivists, is it for historians, or is it for the community of researchers that we're working with? Unpacking these questions really became the focus of the first phase of
this project, and we decided to focus our efforts essentially on data collection. We don't have all the answers, so let's go out and find them. One of the strategies we used was community engagement, and we built this in from day one of the project. We really tried to center community engagement to build meaningful connections with roboticists and with the robotics community. This means treating it as a true partnership. It means working with the Robotics Institute, with SCS and with the robotics community to do everything from marketing and designing the project, to doing site visits, to thinking about how we talk about the work that we're doing. We also wanted to ground it in the cultural context of our organization, thinking about what makes robotics at Carnegie Mellon unique, but also how it applies to the broader robotics environment. What concepts and ideas can we observe in our own context and then apply out to the broader robotics field? We also looked at pre-custodial fieldwork, thinking about how we collect as much context and data as possible about the material before it comes into our care. For us this meant designing a pre-custodial fieldwork approach that would allow us to gather this context. It involves site visits, it involves observation, and it involves a lot of ethnographic methods that are sometimes, but not always, used by archivists and information professionals. In this case we found it absolutely crucial to begin thinking about how we gather as much context as possible. This data informs our collection development strategies: what we are collecting, how we are collecting it, and why. It also informs appraisal, meaning what value we are placing on these materials, and preservation, meaning digital preservation strategies as well as physical conservation. These are often physical objects we're dealing with, and gathering as much information as possible about the context of creation can sometimes inform the
care and maintenance over time, as well as our public programming, so what our messaging looks like about these materials. It also informs metadata and discovery. So this data collection approach really became the heart of what was going to allow us to tackle these really complex issues. We also looked at prototyping: how do we apply what we're learning from the pre-custodial fieldwork to small, somewhat manageable collections? And we began to think about future access and discovery modalities: how are we going to be serving these materials up, to whom, and who will find use in them? If you talk to a roboticist about what value they see in historical materials, often you'll get the answer that they don't see any value in it. But what underlies that answer is just a difference in how we assign value and how we talk about that value. It's not that they don't find it useful to have this historical documentation and these artifacts available; it's that the way we even talk about it is different. So understanding, through these prototypes, how we might be able to collect and preserve this material in a way that makes sense for the community became really important. As I mentioned, we also began to look at different access and discovery strategies. What we're interested in doing long term, and Brian talked about this a little bit, is presenting the archival collections, the material and the artifacts in a way that exposes this really fascinating interconnected ecosystem that makes up robotics. One answer for us was to do this using digital collections, serving up the archival material and really letting it speak for itself. Next year we're going to be premiering a new Islandora digital repository, which is modeled after our digital collections site, and actually one of our colleagues is right next door talking about that right now. That will premiere to the public next year, and what this system really
allowed us to do was to take advantage of the linked data capabilities that the new version of Islandora has, and begin to build out these relationships between projects, between teams, across this ecosystem, and try to expose that through the digital collections. We're also exploring network visualizations. On the slide you can see an example of a project that one of our graduate students did this past year, looking at some of the data in the Robotics Institute annual research reviews and starting to play with how we can visualize the interconnected community behind robotics research. You can see that there are a few folks who are really central nodes to just about everything that happened in the Robotics Institute during the decade that she looked at. Takeo Kanade, one of the founders of computer vision, is one of them, a central node in many, many different projects. So we have a couple of key takeaways from this process, and I share them here because they speak to some of the ideas, concepts and challenges that I've been hearing in a lot of the presentations during this conference over the past day or so. For us, one of the early challenges was overcoming these persistent communication islands. What I mean by communication islands is that we tend to live in silos when it comes to how we talk about and think about the value of long-term preservation, and it became key to figure out how to overcome that. We began to think about, even fundamentally, how we are talking about archives and documentation. Documentation became an interesting term that we had to unpack with the community as we were working with them. For archivists and librarians, documentation means one thing; to the robotics community, it means something very different. So being mindful of how we're using this terminology, and coming up with a shared language, became kind of a side quest of this project. It takes time to build trust and
understanding, and this is a skill set that I work on as an oral historian, but I've found applications for it in this project as well. I would stress: don't rush that process. Community engagement has been fundamental to this project from the beginning, and part of the reason is that we need to overcome this issue of communication islands. It's about how we are talking about and thinking about these materials, coming up with a shared language, a shared concept and a shared goal, and pursuing that in partnership with the community. Another takeaway that I think might be interesting for this group is that, for robotics, a holistic collecting strategy is preferred, given the prevalence of hybrid and dependent artifacts and documentation. By that I mean robotics material is often best viewed and understood either in situ or in context. For example, there's a robot in our collection called the Trojan Cockroach. I don't have an image of it here, but it's a fascinating machine and I recommend you look it up. It was a massive six-legged robot designed in about 1983 by Ivan Sutherland, who is best known as the founder of computer graphics, but apparently in 1983 he was playing with robots at Carnegie Mellon. Who knew? He created the Trojan Cockroach, and it was the first robot capable of carrying a human. What we have left of this machine today is some dispersed documentation, some plans that he created, a few parts and pieces, and an occasional video or photograph, and that is it. But you take any one of those items and look at it in isolation, and it doesn't give you an understanding of what that robot actually did. If you look at the parts and pieces, the pistons and other components that we have left, it doesn't give you a sense of how it moved or how it functioned. If you look at the video, it doesn't give you a sense of the size and the scale. So in talking with the robotics community, one of the key takeaways for us is that we need to
rethink how we're collecting this material. In archives, we often focus on interstitial material: the documentation, the items, the photographs, videos and other material outside of the physical artifacts. But with this field we can't necessarily take that approach. So part of our collecting strategy needs to be breaking down these silos between the published content that usually ends up in libraries, the artifacts that usually end up in museums, and the catch-all of everything else that tends to end up in archives, and beginning to think about how we can approach it more holistically. The final takeaway for this group is that archival and long-term preservation work are hindered by the lab environment. This means these transient communities, these students and postdocs who are constantly coming in and out and taking information and knowledge with them. But there's also a lack of incentives for lab members to begin to think about personal responsibility when it comes to building their own archives and maintaining them over time. These sustainability factors hinder and sometimes block long-term preservation efforts. So how do we think about this issue not just from an archives perspective, but incorporating a lot of the skill sets that are in this room, data management and software preservation, and how do we combine those into a model that we can begin to apply to this complex field? It's not just about the issues that this lab environment causes today in terms of reproducibility of data. It's also causing issues for collections that we'll need to access in 10, 20, 30, 40, 50 years. The question becomes, in this really vital moment in robotics history, which we're living in right now: what material do we want to be able to access in the future? What do we want to be able to pass on to the next generation, and what will students and researchers want to be able to access and understand
about some of the technology that is being developed right now? Early next year we're going to be premiering what we're calling Multimodal Archives, a toolkit for collecting robotics and other material in a research ecosystem. It's basically a summary of the lessons learned and the strategies that we tried, and a set of recommendations for archivists and information professionals who are interested in this work. Our hope is to begin to develop a community of practice for archivists and others concerned with this idea of long-term preservation, not just for robotics but for fields similarly defined by these multimodal materials, and also by these complex collaborative processes that add an extra layer of challenge. This could be artificial intelligence, computer science, design or architecture. Like I said, our goal is really to cultivate this community of practice so that collectively we can begin to think about how we want to approach these issues. So we are right on time. We have about 10 minutes for questions. Thank you so much, everyone. Our contact information is on the slide, and we really appreciate your presence here today. Thank you so much.
Hello. Firstly, may I just congratulate you on a fantastic presentation. I'm really looking forward to the report as well, and also fully rooting for a Morocco-Croatia final, so sorry on that, Keith. This is not a question but a connecting aspect. There are a couple of projects in the UK which are looking not at robotics but at the process of design and how to capture that from an archival perspective, and I thought I'd connect that just in case there are links that might be helpful. One of them is called PR Voices, which is Practice Research Voices, and it looks at how to incorporate the context, the repository infrastructures and everything else that's needed. Like you were talking about Islandora, they are looking at a particular repository infrastructure called Haplo to see whether
context can be incorporated, and whether practice-based or practice-led research can be incorporated in repository infrastructures. The other one is called SPARKLE. I came up with that acronym. It is a University of Leeds project, and people hate me for that acronym, but I'm not going to go into that. SPARKLE stands for, sorry, Sustaining Practice Assets for Research, Knowledge, Learning and Engagement, and that's very much looking at the context, the social context and the practice context, and how to capture that effectively while a design process is going on. We were particularly focusing on the School of Fine Art and the School of Design, but I can almost see very similar things in what you're talking about in your robotics aspects, which link very strongly with that. So this is just a comment that there are lots of similarities with what happens in design, even in nursing, in any kind of medical profession where practice is involved. So I'm really looking forward to that, and thank you.
Thanks, Massoud. Joan?
I think my question, or comment, is about audience, in a couple of ways. I believe it was Keith who early on said that one of the goals of this project would be to inspire students and others coming to the university to see the unique program and the history of the program, and that really got my attention and interest. So I kept thinking, as you went through the different presentations, why wasn't an exhibits or museum kind of framework being talked about a little bit more? There are two things that I think of. One is that, for students, seeing it is going to be a lot more important than just knowing you've got an archive. And I don't mean just some small exhibit of one screen or something in one exhibit case, but something on a really fairly large scale that would be inspirational, I think. The second thing in terms of audience was that you talked a lot about consulting with the people building the robots, all the members of the team and
things, and I was struck by at least one of them saying, yeah, we don't really care about the history. But the people who would care about the history are historians of technology and historians of science, and I'm assuming you interviewed some of those people as well in terms of what they wanted to see in the record. So again, these are more comments, but I would welcome your views on those.
I'll say real quick, we have grand visions, okay? We were talking more about the data modeling and the concepts there, but Kate will look over there.
Is that working? You hear me? Yeah. So the idea of a museum comes up quite a lot, and while we were really interested in this idea, and it's likely something that we'll continue to pursue and investigate, one of the challenges of approaching it solely from this kind of museum exhibition perspective is that restoration is really difficult and really expensive, and the organisations that do it well have to pick and choose which objects they're going to restore. So when the robotics institute at Carnegie Mellon has 40-plus years of history and has been as prolific as it has been, how do you decide which robots you're going to restore? But then also, how do you ensure that you have the expertise, the skill set and the attention of the community that is able to maintain that item in order to keep that exhibition going? So that's a question that we're grappling with: is restoration really one of our priorities, or is there another way we can provide the same level of inspiration and engagement through other means, whether it's through a digital collection experience or through VR or augmented reality experiences? So there's a whole constellation of options that could be in the future for this project. But yeah, you're not the only one to bring up this idea of a museum, and it is something that we're continuing to investigate.
So we did have a small but very interesting
exhibition in our main university library last year, and there are artifacts from that on the website, including various animations. But Kate hit on many of the points. I mentioned in my remarks that the university is building this robotics innovation centre, and there is an expectation that we will be creating exhibitions there. One of the challenges that Kate alluded to is that many of the robots, things that were designed to go to Mars or the moon or whatever, are big, and storing them anywhere is something that we're wrestling with. Another thing that surfaced through Kate's remarks was that robotics research often begins by building a robot and then dismantling it to create the parts for the next robot, and so on. So we really do have this challenge of restoration: how do you resurrect a dead cockroach? It's a great question. To your second point, Joan, in our special collections team we have a very strong history of science and technology perspective and a rich community there who have been intimately involved with this project, so they have had a voice. We have three minutes if there is a final question, or, what it sounds like, a sweepstakes on whether it's going to be Croatia, Argentina, France or Morocco.
I can say just a couple more words about Keith's point concerning cannibalisation, which is that it is difficult to identify concrete strategies for how to holistically display a robot that no longer exists, and it's a constant challenge that we have. As Brian had that question on the slide, how do you preserve something that isn't meant to be preserved? It's the question that we come back to again and again and again. Using the Trojan cockroach as an example, if we had waited to take on or preserve that material, likely it would have been lost or further cannibalised. So as time goes on, it becomes more difficult to do this work, so there's also a sense of urgency around this. So it's not just the question of restoration versus cannibalisation,
and it's also at what moment do you start your preservation work.
No, please go for it.
Just a quick one in terms of intellectual property and patents: did you run into this, and how did you handle it?
Yes. So intellectual property is definitely one of the conversations that is ongoing with this project. You can imagine the difficult barriers concerning intellectual property rights and preserving this material. One of the things I often say to the team working on this project is, we don't have all the answers, but we know where to look. Someone was talking earlier about how we can pull inspiration from different fields in order to design models for robotics, and we're doing that with intellectual property as well, looking for inspiration in other fields that perhaps are a little bit further along in terms of long-term preservation and identifying different strategies.
Yeah, so we are right on time. Thank you for being here, thank you for your questions, and enjoy the rest of the conference.