Welcome to Wikimania's session on GLAMs and Open Access. This is a case study from the Smithsonian Institution. Hi, I'm Kelly Doyle. I'm the Open Knowledge Coordinator for the Smithsonian's American Women's History Initiative. And I'm Andrew Lih. I am the Wikimedian at Large at the Smithsonian Institution. And to set the stage here a little bit, I just wanted to introduce what the Smithsonian Institution is. It is the largest museum complex in the United States and in the world. We have 19 museums and galleries, as well as the National Zoo. It was established in 1846, and it holds 138 million artifacts and specimens. And our goal for the next year or so is to really engage globally with our patrons and our audience so that we can have a wider impact for those specimens and artifacts. And to do that, we have a goal to reach one billion people every year with a digital-first strategy across all of our different museums and Smithsonian units. And the best way to do that is through Wikimedia engagement. We know that our finding aids and our websites alone will not bring in a billion individual people viewing our content per year, but Wikimedia can help us get there. So one standout example of how Smithsonian assets could have an impact in Wikipedia and Wikimedia's resources comes from an editathon we had in 2019. We created an article about Vera Simons, who was an inventor and balloonist and artist who had never had a Wikipedia article and had no images readily available. And after we created a Wikipedia article and uploaded the image to Wikimedia Commons, we decided to check Google to see when it actually registered there. And we found that within 15 minutes, Vera Simons was appearing as Google search result number three. So an article that had never existed before was, within 15 minutes, near the top of the Google search results. And that same day, you didn't even have to go to Wikipedia itself.
Wikipedia's content is used by so many different sites, including Apple's Siri, which is on every single iPhone and iPad in the world, that if you just typed in Vera Simons on your iPhone, you would actually get what you see on the screen here. You'd actually see her image and the text of the Wikipedia article. So that really was quite a powerful illustration of why it's important to get your content into Wikipedia. So the work that we're doing now has been over a decade of collaboration. But finally, in February of 2020, the Smithsonian had its official open access launch. We had Secretary Lonnie Bunch and Effie Kapsalis, who's been spearheading all this in the Smithsonian. It's been the work of many folks over the years. Some folks in the Wiki movement might recognize some of the names here, including some folks from the Smithsonian and Wikimedia DC past and present. But if we're talking about how the Smithsonian will engage and has engaged with Wikipedia over the years, you can maybe look at the information pyramid, which is just a very basic way of thinking about the different layers of knowledge and wisdom that we have. And if you start from the bottom with the data layer, we'll often see this as the raw material that you can contribute to any kind of effort. So in this case, it's open access materials, Wikimedia Commons uploads, metadata from collections. But then in the next layer above this, in terms of information, we're looking at things like lists of notable folks or different ways of organizing that data in different kinds of compendia. And then on top of that, folks like the Smithsonian or a GLAM entity can bring expert scholarship and practices to the table, yet another layer on top of the information and data that we see here. And then finally, wisdom: these are insights, new ways of exploring the content that we've never seen before. So these are very abstract, but what are some very firm examples of these?
And Kelly's going to talk about some of these. But if we want to take a look at the breadth of what we've seen so far with the Smithsonian, as we said, the very base layer here is the open access policy of February 2020, allowing us to upload millions of records and images into Wikidata and Wikimedia Commons. Kelly will be talking about the American Women's History Initiative and the different events that we've had over the last few years. And then we've also had engagement with scholars and curators and experts at the Smithsonian to donate their direction and their insights into how we craft Wikipedia articles and Wikidata ontologies. And then we're also building new tools on top of what we see in Wikidata and Wikipedia with that expert knowledge as well. So that's just a basic framework for understanding how the Smithsonian sees Wikimedia engagement in these different layers. And Kelly's now going to talk about some very specific, exciting examples of this.

Thanks for that framework, Andrew. Yeah, I wanted to talk a little bit more about just a sliver of Smithsonian engagement around Wiki, and that's through the American Women's History Initiative, which is a relatively new Smithsonian initiative, about three to four years old, but it's unique in that it's a completely digital initiative. So again, we're trying to bring in those one billion individuals and views per year. The goal for us is to get our collections about women more knowable and shared digitally, and a big piece of that is through our Wikipedia programming. So we know that Wikipedia, especially English Wikipedia, but all the other languages as well, has gender gaps. So what we've done is worked with our AWHI, American Women's History Initiative, curators to host editathons. These have been really successful. We've only been doing them for about two years. Now we've already hosted 20, and as you can see, we've added lots of words about American women, specifically from our collections.
And what makes this so special and unique is that our curators within the Smithsonian are developing worklists about unknown or not well-known American women so that we can add them to Wikipedia with Smithsonian resources, potentially archival collections that we can add to the pages as well. Another thing we've done with Andrew on board is plan for a dialogue-type model for hosting our virtual editathons. As we know, with the COVID pandemic, we've had to switch our events from in-person to virtual, which is a really hard switch for editathons and Wiki engagement. And so in doing so, Andrew and I have been able to talk out complex Wiki problems and topics together virtually and also bring in more experts who might not have been able to attend our editathons before, because they're now virtual. So we've been able to turn the switch to virtual-only programs into a way to deepen our engagement across the US and worldwide. And we've also been uploading our Smithsonian assets about women to Wikimedia Commons. This is just a small overview of what we've added. To date, it's been about 700 images, so not a lot, but it is a good test case to show us a couple of things. One, about 600 images have garnered over 10 million views on Wikipedia. This is a great proof of concept that we can engage at a high level and potentially bring in those billion views, or close to it, if we're adding more of our assets about women into Wikimedia Commons, Wikidata and Wikipedia. Another thing that we found in adding these assets to Wikimedia Commons is something that we all kind of know in the Wiki world, but that we've been able to see on the Smithsonian side in what we're uploading: of all the images we're uploading, images of women of color are performing the best. They're getting the most views out of any of the assets that we're releasing through the AWHI.
And that gives us a little bit of room internally to argue for more images of women of color going onto Commons and for prioritizing those images the most, because we know that the public is looking at them and that Wikipedia editors are going through Commons and adding them to the Wikipedia pages for us, which is great. And this is just an example of one of the highest performing images that we've added. This is Sojourner Truth. This image alone is generating, you know, probably a fourth, if not more, of the image views I just mentioned, those almost 10 million views. This really was the image that started it all for us in trying to upload more images and being more intentional about what we upload. And this is something that Andrew pulled of the top Smithsonian archives images in Wikipedia. So we can see that there are women in the top seven, but we definitely have more room to grow in how many images of women are among the top performing images from various Smithsonian museums and units across Wikipedia. And so some of the other engagement that we're doing besides editathons is work with the Smithsonian Affiliates. The Smithsonian Affiliates are a group of over 200 GLAMs across the United States that are affiliated with the Smithsonian. And so what we've done is worked to provide Wikipedia training to affiliate staff, to GLAM staff, through the Wiki Education Foundation's Wiki Scholars course model. And so we've made this completely about American women and adding information to Wikipedia about American women. We encourage affiliate staff to edit Wikipedia about women from their archive or women from their geographic location that aren't represented on Wikipedia now. To date, only 60 staff have gone through the training, but they've added almost 50,000 words to Wikipedia in a relatively short amount of time. This program did not start until March of 2021. And as you can see, they've added almost 700 references about American women to Wikipedia.
So this is a really great program and we're excited to continue it. Another thing I wanted to mention is that there's a list-building component at the end of the course that piggybacks on some of the other work that we're doing with the Smithsonian to go through our archives and find women that should be on Wikidata or Wikipedia and aren't. So the affiliates are providing us lists of women that we should know about, that we can use for worklists at editathons after we've batch uploaded them onto Wikidata. And I also just wanted to briefly mention that we do have an AWHI intern cohort every summer. It's only a six-week internship, but as you can see from the dashboard below, from my two interns that just completed their internship today, they've had a lot of output in a relatively short amount of time. We know that there's a high demand for Wiki and digital internships, and we hope to offer more of those. But they also help us explore new audiences for events and what new generations of Wiki contributors might be looking for in terms of public programming. And these interns are leveraging that open access release to find images and content to add to Wikipedia and enrich Wikipedia. Andrew and I have also done Wikimedia training for all interns. Somehow, in an hour, an hour and a half, we went through all of the Wiki knowledge, and we hope to do that again in the future as well. And this is that list-building project that I was talking about. It's named after Dr. Vicki Funk, who was a scientist at the Smithsonian, and the Funk List specifically (you can type that link in and read more about it) is just a list of Smithsonian women who are notable, who may or may not have Wikipedia articles but should, who are notable enough for them, and who definitely deserve at least Wikidata properties. So we're kind of going through by hand and scraping the archives to figure out which women can be added to Wiki from our holdings to help close that gender gap on Wikipedia.
And so looking forward into the future, we're thinking about how we can increase our Wikimedia public programming and what we can do besides editathons. Maybe we do more dependable editathon programming, like editathons for every heritage month that our patrons know to expect and that are virtual and accessible. We want to continue our list-building project with Wikidata, whether it be with affiliates or interns or just other curators across the Smithsonian, to add more women from Smithsonian collections to Wiki. Adding more images is a huge goal of mine: really working with curators across the Smithsonian to find those untold stories of women and adding their images as well, because we know that Wikipedia pages get more views when there are images on the page. And then hopefully having more interns in summer of 2022 working on Wiki and digital projects, and Andrew and I hopefully expanding the Wikipedia training that we beta tested with AWHI interns to a larger cohort, so that all Smithsonian interns have a base-level understanding, at least an hour or two hours of Wiki training, to take with them after they leave the Smithsonian. All right, thank you, and please feel free to reach out to Andrew and me with any questions you might have about this programming or if you want to get involved. Great, thanks Kelly. And again, thank you for your time, and we're happy to answer any questions that you might have, and feel free to contact us. Thanks again.

It's under a CC0 license. So this is certainly very interesting to the Wikimedia community. And we've been working with Jennie Choi, General Manager of Collection Information at the Met, on a lot of these projects we'll be describing. As you can see, the Wikimedia Foundation and the Wikimedia community were part of the initial launch of the Met's open access collection.
And the way we like to think of this is as a new type of engagement, whereas many GLAM engagements that we've had with the community have emphasized either Wikipedia article writing or donation of images. We believe that this is the most comprehensive engagement with Wikidata, at least in the United States. And the reason why this is interesting is because it is using Wikidata as a structured database for a comprehensive upload of all the notable works and figures from the Met Museum. And this is interesting because of the language-independent nature of the rich metadata that's in Wikidata and the ability to do more precise and interesting searching and sorting and interactive engagement with the content. So we thought we'd show you some of the stats that we've been seeing so far. So yeah, one of the most important goals of the project is to make sure that people see and appreciate the images and the artwork that are in the Met. Not just in the Met Museum, not just on the Met's website, but in the universe at large. So there have been more than 700 million views to Wikipedia articles with Met images over the last three years. And we saw over time that average traffic to Met images on Wikimedia projects went up five or six times. You've seen a steady growth over time as more and more articles, both about art and also about history, have included these images and as people in the Wikimedia community become more aware that this trove of free and open images exists. This is one of our most interesting items. The Princesse de Broglie was an article that got about 220 million views when it was on the front page of the English Wikipedia. It was started by an Irish Wikipedia editor, User:Ceoil, based on working with our project, an editor who's quite interested in the Met, has remotely visited a couple of times, and was particularly interested in the Cloisters and also in this artwork.
And was able to start this article on a major artwork, which did not exist before, and build it up to a high level based on the enthusiasm of getting a free image of it, a free high-quality image of it. Great. So I thought we'd talk a little bit about some of the things that we've been doing with Wikidata in this traditional, at least in data science, workflow of extracting, transforming and loading the content into Wikidata. We actually have a number of processes and bots that take a look at the data from the Met, compare that against a SPARQL query against Wikidata, try to find the differences, and load the differences into Wikidata as appropriate. And sometimes if there are discrepancies, we'll have a report that's sent back to the Met so we can actually fix some data or ingest the data. And a lot of interesting things can happen once this occurs on a regular basis. We can depend on Wikidata having accurate and up-to-date information from the Met. One of the things that we do on a regular basis is produce a dashboard of stats. Not only the ones that Richard talked about with traditional GLAM tools, but we also track what are the most popular artworks that are written about in Wikipedia. So tools like InteGraality or Listeria on Wikidata are useful for tracking these things on a regular basis. We've also been able to do some interesting experiments, especially with AI. So this is an experiment that we did a number of years ago with the Met, Microsoft and MIT, where we took the keywords that the Met spent a lot of time labeling all their artworks with and fed them into a machine learning system. And after we trained the AI on what was actually depicted in different artworks at the Met, we could feed new artworks into the prediction system and it would come up with recommendations on what it thought was in paintings, in sculpture. And our crew of Wikimedians can help sift good and bad predictions in this manner.
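As a rough illustration of the compare-and-load step described in this passage, here is a minimal sketch in Python. It is not the actual bot code. The SPARQL text assumes that P3634 is the Wikidata property for the Met object ID, and `diff_records`, the field names, and the sample records are hypothetical names made up for the example. Both sides are assumed to have already been fetched into dictionaries keyed by Met object ID.

```python
from typing import Dict, List, Tuple

# Illustrative query for the Wikidata side of the comparison.
# ASSUMPTION: P3634 is the "Met object ID" property; verify before use.
WIKIDATA_QUERY = """
SELECT ?item ?metId WHERE {
  ?item wdt:P3634 ?metId .
}
"""

Records = Dict[str, Dict[str, str]]  # object ID -> {field name: value}


def diff_records(met: Records, wikidata: Records) -> Tuple[List[tuple], List[tuple]]:
    """Compare Met records with what Wikidata already holds.

    Returns (to_load, conflicts):
      to_load   - (object_id, field, value) absent from Wikidata, safe to add
      conflicts - (object_id, field, met_value, wikidata_value) that disagree
                  and belong in a report for human review
    """
    to_load, conflicts = [], []
    for obj_id in sorted(met):
        wd_fields = wikidata.get(obj_id, {})
        for field, met_value in sorted(met[obj_id].items()):
            wd_value = wd_fields.get(field)
            if wd_value is None:
                to_load.append((obj_id, field, met_value))
            elif wd_value != met_value:
                conflicts.append((obj_id, field, met_value, wd_value))
    return to_load, conflicts


if __name__ == "__main__":
    # Hypothetical sample data, not real Met records.
    met = {"1001": {"title": "Landscape with Trees", "date": "1850"},
           "1002": {"title": "Harbor Scene"}}
    wikidata = {"1001": {"title": "Landscape with Trees", "date": "1851"}}
    to_load, conflicts = diff_records(met, wikidata)
    print("load:", to_load)
    print("report:", conflicts)
```

Keeping conflicts separate from additions mirrors the workflow in the talk: differences that are simply missing can be loaded, while disagreements go into a report rather than being overwritten automatically.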
So you can see here, we've actually created an interface using the Wikidata Game to do exactly this. So we made this a very simple way of taking these predictions from the AI, where the AI said, we think there's a tree in this picture, and you can help out as a human being by saying it does depict the tree, it does not depict the tree, or you can skip it. And this is quite a fun little game that allowed a lot of folks to very quickly contribute to Wikidata by checking the work of the artificial intelligence system. So just in a very short amount of time, we saw that there were more than 7,000 judgments via the Wikidata Game, resulting in about 5,000 edits to Wikidata, adding new depiction statements in Wikidata about artworks, mostly around landscape painting features like trees, boats, flowers, horses. These performed really well in the AI that we had, which is very interesting and allows us to experiment with these things going forward. So we talked about some new horizons going even beyond the AI experiments that we've had. One is the use of Structured Data on Commons. So some of you might be attending sessions at Wikimania talking about this: bringing Wikibase or Wikidata capabilities to Wikimedia Commons to allow for more precise searching of the metadata of artworks in Wikidata. So for example, instead of having to lexically search for Met Museum and maybe getting false positives, you can precisely search for statements like "this image depicts the Met Museum," precisely and exactly. And the great thing about this is you get a multilingual search right off the bat because it's using Wikidata-like statements.
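To make the precise statement search concrete, here is a small hedged sketch that builds a Wikimedia Commons search URL using the `haswbstatement` search keyword with P180 (depicts). The helper name `depicts_search_url` is invented for this example, and Q160236 is assumed to be the Wikidata item for the Metropolitan Museum of Art; check both the item ID and the API parameters before relying on them. The sketch only constructs the URL and makes no network request.

```python
from urllib.parse import urlencode

COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def depicts_search_url(qid: str, limit: int = 10) -> str:
    """Build a Commons API search URL for files whose structured data
    says they depict the given Wikidata item (P180 = depicts)."""
    if not (qid.startswith("Q") and qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item ID: {qid!r}")
    params = {
        "action": "query",
        "list": "search",
        # Statement search instead of a free-text search for "Met Museum".
        "srsearch": f"haswbstatement:P180={qid}",
        "srnamespace": 6,  # the File: namespace on Commons
        "srlimit": limit,
        "format": "json",
    }
    return f"{COMMONS_API}?{urlencode(params)}"


if __name__ == "__main__":
    # ASSUMPTION: Q160236 is the item for the Metropolitan Museum of Art.
    print(depicts_search_url("Q160236"))
```

Because the query targets an item ID rather than a text string, the same search works regardless of the language the file descriptions were written in, which is the multilingual benefit mentioned in the talk.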
Yeah, one of the other new horizons is really an old horizon, community campaigns, because Wikimedia and Wikipedia don't work unless the community is involved and unless there's human engagement: not just ingesting data, not just dealing with data, but dealing with human identification and human interaction with the data and the information. One of the tools that we started was Mbabel, the museum of all possible artworks, which has the ability to create draft articles on Wikipedia based on Wikidata. We started the project for art on English Wikipedia, but it's been adapted by some Brazilian Wikimedians on Portuguese Wikipedia, has been enabled and grown beyond that through Wikidata-enabled art infoboxes, and has been used in other areas as well. So the nice thing about it is that it takes some basic information on Wikidata, such as might be provided by the Sum of All Paintings project, and it automatically generates a draft Wikipedia article. Now, of course, a machine cannot write a Wikipedia article. This is a basic violation of the sacred rules, but it can help provide a draft, and it can create a nice infobox that explains the year it was created, the artist, the genre, if that's recorded in Wikidata, and have some sections. And it's a nice starter for someone who has some art knowledge but doesn't have as much Wikipedia framing knowledge in terms of how to do the niceties of putting the different sections together. Yeah, and as Richard said, this is a great example of how we're doing more and more of our editathons and meetups around Wikidata. So it's no longer just Wikipedia as the starting point, but we're actually doing a lot more with Wikidata as the main focus of these meetups, which I think has been a nice on-ramp for a lot of folks because Wikidata is more structured and more modular. Another thing that we're also looking at is data round-tripping.
This is the idea that it's not just the flow of data from the Met to Wikidata, but it's actually going from Wikidata back to the Met, and this is a virtuous cycle. So for example, if you now look at the Met API, which was launched back in 2017 when we first started working with the Met, they are now returning Wikidata Q numbers associated with their data, which is just great to see. So for example, when you go to the API of the Met and you query an object, you will also see that they're pointing to the Wikidata Q ID if it exists in Wikidata. They're also pointing to the artist or constituent ID in Wikidata, and they're also folding in depiction information. So if they think that an image depicts a bird or a tree or a mountain, they're also including the Q number of that in their API, and this is just a great thing to see, and it's quite rare among GLAM institutions to have this level of detail with linkages to Wikidata. So if you want to try it, you can actually go to the API and see some of this in action there. And as we start doing more with Structured Data on Commons, you'll see even more linkages there to the images that we have. Then finally, in the area of knowledge exploration, it's one thing to contribute the OA content from the Met to Wikipedia and Wikidata, and it's another thing to actually collaborate around doing things like AI, but to explore and create new connections and new stories is really kind of the dream of this elusive stage of co-creation. And what we're seeing is that we now have a lot of interesting tools that we can make with the ability to use Wikidata and Structured Data on Commons. So for example, we've developed this tool called the Wikidata Graph Browser, where you can actually navigate through all the different artworks and artists and depiction information at the Met simply by clicking and pointing at things, rather than having to learn things like the SPARQL query language or needing to understand a lot about Wikidata.
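To make the round-tripping mentioned above concrete, here is a minimal sketch of pulling the Wikidata links out of a single object record from the Met API. The field names `objectWikidata_URL`, `artistWikidata_URL`, and `tags[].Wikidata_URL` reflect my understanding of the public API response and should be checked against the Met's own documentation. The functions `qid_from_url` and `wikidata_links` are hypothetical helpers, and the sample record uses placeholder values rather than real data.

```python
from typing import Dict, Optional


def qid_from_url(url: Optional[str]) -> Optional[str]:
    """Return the trailing Q-number of a Wikidata URL, or None if absent."""
    if not url:
        return None
    tail = url.rstrip("/").rsplit("/", 1)[-1]
    return tail if tail.startswith("Q") and tail[1:].isdigit() else None


def wikidata_links(obj: Dict) -> Dict:
    """Collect the object, artist, and depiction Q-IDs from one record."""
    tags = obj.get("tags") or []  # the field may be missing or null
    return {
        "object": qid_from_url(obj.get("objectWikidata_URL")),
        "artist": qid_from_url(obj.get("artistWikidata_URL")),
        "depicts": {
            t["term"]: qid_from_url(t.get("Wikidata_URL"))
            for t in tags
            if "term" in t
        },
    }


if __name__ == "__main__":
    # Placeholder record shaped like an API response; the IDs are made up.
    sample = {
        "objectID": 1,
        "objectWikidata_URL": "https://www.wikidata.org/wiki/Q1234567",
        "artistWikidata_URL": "https://www.wikidata.org/wiki/Q7654321",
        "tags": [
            {"term": "Trees", "Wikidata_URL": "https://www.wikidata.org/wiki/Q10884"},
            {"term": "Boats", "Wikidata_URL": ""},
        ],
    }
    print(wikidata_links(sample))
```

Records with no Wikidata match simply yield `None` for that slot, which makes it easy to list the objects that still need linking.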
So this is kind of the dream: once we have all this data loaded in structured statements, we can now provide a much easier-to-use interface to navigate content. So this was created very simply using the PAWS system, which is actually a developer system that the Wikimedia Cloud supports. So anyone with an account on the Wikimedia system can actually get access to developer resources like this. So if anyone wants to try it, you can actually go into PAWS and try this yourself or click on the link in the presentation. And we'll also put that into the Etherpad so that you can try it out yourself. And this is the fruit of lots and lots of labor by folks in the backend developer community in the Wikimedia movement who have helped support things like this. And it's these kinds of exciting interactive tools that really justify all the effort that we put in. And it really does finally show folks who have been involved with the OA release of content at these GLAM institutions the promise of linking your collections to Wikidata and Wikimedia Commons. So hopefully that gives you a taste of some of the things that we're doing at the Met. And we hope to show a lot more in the coming years. So feel free to contact myself or Richard at the Met. We're easy to find, and we hope you'll come up with more imaginative ways of using the open access content from the Met, whether you find it on Commons or Wikidata or Wikipedia. Thanks so much. Thank you.