So let me begin with the land acknowledgement. We here are sitting at the Archaeological Research Facility, which is located in Huichin, the ancestral and unceded territory of Chochenyo-speaking Ohlone people, successors of the historic and sovereign Verona Band of Alameda County. We acknowledge that this land remains of great importance to the living and vibrant community of Ohlone people here, and that the ARF community inherits a history of archeological scholarship that has disturbed ancestors and was complicit in attempts to erase living Ohlone people from the present and future of this land. It's therefore on us, our collective responsibility, to critically transform our archeological inheritance and practice in support of Ohlone sovereignty, and to hold the University of California accountable to the needs of all Native and Indigenous peoples with not just our words, but by our actions.
So today, Dr. Kansa, whose title is up there, is going to talk about his work as the Program Director for Open Context. He also has a PhD in anthropology and archeological field experience in the Near East, Egypt, Italy, and North America. His research interests explore data informatics, research data policy, ethics, and the professional context of the digital humanities. He runs research and development for Open Context and manages the technical aspects of data publishing and archiving, including systems interoperability, data integration, and indexing. Thinking about all the stuff we do, all the data we produce, right? The next step is everything that Eric and Sarah have put together as kind of a roadmap for us. So this is such a critical talk. I'm really grateful for your being here. Please help me in welcoming Dr. Kansa.
Thank you all so much. I'm really grateful to be here. This is one of the first times I've actually given a talk since the pandemic. And it's really just, you know, being here is actually really special.
So I really, really appreciate the chance to be here with you all in person. And everybody in Zoom world, hello too. We've just heard a really eloquent land acknowledgement, and I want to highlight how it's important framing context for the way we manage archeological information. The same issues that we just raised when we discussed the history of UC Berkeley, archeology in California, Indigenous peoples in California, and those intersections also have to inform how we deal with and communicate the information that we manage as archeologists, and that we manage on behalf of, and hopefully in partnership with, Indigenous communities. The same sorts of issues are also going to be important when we think about diaspora communities. There are a lot of people in the Bay Area who are descendants of, or are themselves refugees from, places around the world where archeologists also work, and their own interests and their heritage are also something that we have to acknowledge and support. So with these kinds of framing issues, we're not going to just mention them at the beginning of the talk. Hopefully we're going to touch on them repeatedly, in different contexts, when we talk about some of the information concerns that we'll be exploring today.
So, some of the things that we want to talk about, just at a very high level. Contextual integrity of information is something that we have a lot of concern with. Our name is Open Context, so context is something that we obviously care about. But context itself is something that archeologists routinely discuss as one of the cores of our discipline: how we try to understand the past is through our understanding of context.
And one of the really interesting questions that I want to raise and explore today is: how do we represent context in archeological data, in archeological information? Do we do a good job at representing context, at making context actionable so that it can inform our interpretations? That's something that we'll explore. I'm also going to talk about some of the specific approaches that we have with our specific platform and service, Open Context, and how we manage information. And today we'll close out with a few notions of general good practice, things that we could all apply whether you're using online data or offline data or just managing your own projects, just some good practical tips. So those are the three main things that I hope to get through today.
Just a real brief background and introduction about Open Context. We're a data publishing service, an open access data publishing service, and we publish open data that's openly licensed. Now, the fact that we focus on open data does not mean that we think that all data should be open. That's not the case. There's a lot of sensitive information that we encounter as archeologists that should not be on the public web. Those sensitive kinds of data need to be curated properly, but it's beyond the capability of a small four-person organization like ours to do that adequately. We're focused on that subset of archeological information that can and should be open, and those are the services that we try to provide. Again, we can't provide services for everything, but we try to focus on what we can offer in ways that are appropriate. And hopefully, as we talk today, we can discuss how we figure out what is appropriate for open access, open data dissemination, and what kinds of processes and governance need to be in place for more sensitive kinds of information.
We work in collaboration with lots of other information systems out there too. One of the critical services that we use is from the University of California, the California Digital Library. They provide archiving and preservation services for the data that we publish. They also provide persistent identifier services, and those identifiers, which I'll be talking about a lot, allow the information that we publish to be citable and linkable with other information sources that are out there. Another thing is that we really rely on the work of other information systems that are curated by other organizations and other professional communities, where they're putting a lot of thought and effort into providing information that helps add a lot of context to the data that we publish. We cross-reference and link to that information because that helps provide a lot more meaning. For a lot of this, and for our general approach, we have worked and continue to work closely with the digital library community, and we've gotten recognition from that community for working on data curation issues. Among professional societies, the Archaeological Institute of America, the AIA, gave us an award in 2016. And we also received recognition from the Obama administration, from the White House, in 2013 for promoting data sharing and openness in the sciences.
Really briefly, this is our team. Everybody on this team is a very dedicated scholar doing an incredible amount of work, and I can't summarize everything. That would take multiple talks. But we do have people who specialize in different aspects of this overall landscape. So, Megan and Paulina are focusing most of their work on public education and instructional programs using archeological data.
Leigh Lieberman has been working on professional development programs so that archeologists can engage with good data practices, and she's been working closely with a lot of professional societies to develop programs in this area. And Sarah is our executive director. And as we heard in the introduction, I'm primarily focused on some of the technological sides of things.
Just a few questions while we get going here. How many people in this room have ever shared data in any form? Have you ever published anything? Right. Also, yay. And then how do you share data? Has anybody shared data in, say, a table in a paper? Right. Okay. Anybody shared data in a repository like tDAR, ADS, Open Context, anything like that? Okay. Yes. Awesome. Okay. So why not? I mean, there's a smaller number of people who put data into a repository than into a paper. So why is that? We have these services. tDAR has been around for a very long time now, more than 10 years. Open Context since 2006. So why is it that fewer people are putting digital datasets in a repository than in a paper? It's an interesting question, right? So, all right. Let's talk about that a little bit more.
How many of you have ever reused any data that anybody else has provided? So you do your own work and you want to compare it with somebody else's? How many of you have ever looked at anybody else's data? Okay. Anything useful in anybody else's data? No. How hard was it to use? Was it hard? Easy? Hard. Hard. It is hard, okay. What was the source of the information? Anybody use data from a repository someplace online? No? One, two, right? Okay, so that's somebody. That's something. And when you're looking at somebody's publications or data, do you feel like you have an adequate understanding of context? Okay, hardly ever, right? Hardly. Yeah, so context is one of these really hard things.
And we seem to do a bad job in our discipline of communicating context in any sort of publishing venue that we have, whether it's digital data or conventional publishing. Whoops, okay, good. So that informs some of the challenges that we face.
In dealing with information in archeology, especially digital data, here are some of the big core challenges. So, ethical, right? We do not want to reinforce or recapitulate or rebuild colonialism online. That would be bad. At the same time, we want to use the affordances, the opportunities that we have with digital information, to try to do things better, engage more, be more collaborative, and bring in people to shape our agendas in ways that are better aligned to our ethical priorities. So there are opportunities too. It's not just dangers in digital spaces; there are also important opportunities.
One of the big challenges in our information is just semantics. Everybody describes things in very different ways. Archeology itself is an inherently multidisciplinary kind of endeavor, so we have inputs from all sorts of different fields, right? We have zoology, we have botany, we have soil science and geology, geomorphology, we have art history, we have anthropology. We have all sorts of things feeding into us that make juggling the information kind of hard. And we also have regional specializations, different traditions of describing chronology and describing typology, all sorts of things. That makes the semantics, the meaning of the information that we have, really hard to juggle.
And we have to do all of this with, like, no budget and very little technical support, right? So that's the other really hard thing. Just the technology of dealing with all of that diversity and all that complexity is a really big challenge, because we don't have a lot of resources to draw upon to engage with all of that.
So capacity is a big deal. And of course, we have to do it all, right? You have to divide your attention amongst all of these different kinds of things, and our time and attention and our money are scarce. So it's great to promote data professionalism and reproducible research, but you have 80 other priorities you have to deal with, like emails from the dean or whatever. So there's a lot of work to do and limited time and limited capacity.
So, two big framing things about data. These are the guiding principles that are being promoted right now about how to deal with data and how to deal with this sort of information appropriately. There are the FAIR principles: findable, accessible, interoperable, and reusable. And the CARE principles: collective benefit, authority to control, responsibility, and ethics. FAIR is about trying to make data free-flowing and fungible and reusable, so that it can be compared and aggregated at really big scales. The CARE principles are more concerned with the social and cultural embeddedness of information, so the community associations with it. Contextual richness is something that is really important with the CARE principles. And you might think that there's some tension there, and there are real tensions between those two sets of principles, and those are things that we have to navigate. This is what makes information in archeology so really awesome to deal with. If you want some really hard problems, this is a great place to play, really. And not just playing. We really need people to engage with these topics. It's really core. So these are the things that we have to consider, and we'll be talking about CARE and FAIR types of things as we continue on here.
So just to give a little bit of background, we have tried to grapple with these two things, CARE and FAIR kinds of concerns. CARE didn't really come out formally, I don't think, until like 2018 or something like that. But before we even put anything online, we wrote and put a lot of thought into intellectual property, traditional knowledge, and Indigenous sovereignty issues in the management of cultural heritage information. So there's a journal article from 2005 that I did with two colleagues: Jason Schultz, who is an intellectual property attorney, and Ahrash Bissell, an ecologist. That touched off a lot of follow-up work, including work with a really interesting and important project called IPinCH, Intellectual Property Issues in Cultural Heritage. It was headed by George Nicholas of Simon Fraser University and Julie Hollowell. IPinCH gave us a lot of really good guidance that we tried to put into practice with our own policy. So our terms of service for Open Context and our intellectual property policies really derive from a lot of the insights gained from the IPinCH project. We're really grateful for that work and for our participation with them.
And this is all continuing too. We're continuing to develop new projects, lots of grants being written, that type of thing, and collaborations, also with other repositories, digital libraries, and organizations that represent various Indigenous cultural heritage interests. And stay tuned: some of those will hopefully connect with some funding and we can talk more about that later.
So those are some of the background issues about why this is interesting, why it's hard, and why it's an ongoing area where good practices are evolving. The way that we consider this for our own approach is that we're a publisher of data.
And the idea that we have is that archiving, just preservation of information, is really necessary, but it's not sufficient. We also need the intellectual investment in making that data better and more useful: trying to align it with research and ethical priorities and community needs, thinking about interoperability, how this information is going to be reused, and how we communicate meaning that can be understood by wider communities now and into the future. So basically it's the same thing that publishers do. We work with data authors and data contributors to edit, review, and promote quality in order to make the communication more effective. Very few people write a paper and then just post it online. Typically it goes through a peer review process, it goes through editing, you get comments, you try to make improvements, there are copy edits. All of that work goes together as a co-production of scholarship, right? You plus the reviewers plus the editors plus the copy editors: that is the collaborative effort that we need to publish a manuscript like an article or a book. We think those kinds of services and collaboration are similarly important with data. So that's why we call ourselves a publisher of data, because we provide those services.
Our two big areas of focus are contextual integrity and data quality, and we also try to deal with issues of complexity. Data quality is hard to define, and a lot of the definitions focus on suitability for reuse. Can you reuse a dataset with some confidence, and does it support your analytic goals? That's one of the definitions of quality. So whether a dataset is of sufficient quality really depends on what you're trying to do with it.
And what's interesting, when you think about long-term reuse and you want to preserve information for future generations, is that reuses may evolve over time. New applications are going to come up, and the data might be understood in the context of other datasets that are yet to be available. So this is a moving target. One of the things that we try to do is to think about quality as something that also unfolds over time and can improve over time, as additional information and additional context are made available to enhance the power of a given dataset. A dataset in isolation might not be all that interesting, but if that dataset is part of a larger whole, then it may become more interesting over time.
So, a few dimensions of quality to think about. Documentation: can you understand it? Is there enough background information to give a sense of how it was collected, the methods, what sorts of problems there are, what's missing, the gaps, that type of thing? Specificity is also important, because the more specific the information is, the more likely you are to be able to manipulate it in different ways. If it's very general and very aggregate, well, something that's been lumped together you can't necessarily split all that easily afterwards. That's why specificity is an important element of data quality. We try to publish very specific results with Open Context. So when somebody wants to hand us a summary data table that says at this site we have 10% sheep and 20% goats, that's great, but it would be really nice to have the specific bone elements, because then somebody can see what the measurements of the goats and the sheep are and compare them, or look at taphonomy or whatever. There are a lot more opportunities for interpretation and analysis if you have much more specific information. And then completeness.
Completeness is something I want to talk about because it touches on context. Lots of people here probably work in collaborative projects, right? How many people work in collaborative projects? Okay, who here considers themselves a specialist? Okay, and do you ever get sufficient information from the rest of the team? Sometimes? Okay, good, yay, good for you. But sometimes you're working on different timelines, right? You might be looking at a collection years after it's been excavated, and then you have to rely on documentation that has been collected by other people. Sometimes that documentation is all that you have left, and it might be incomplete. So there are some issues associated with that. The different timelines of the different people working on an archeological project create the frictions that I want to talk about.
People might also have different timelines for publication and dissemination. You might want to really, really, really get out your seed or your bone record fast because you've got some sort of deadline, like tenure or whatever. And the excavation director is dragging their feet because they really can't figure out if this locus is stratum A or stratum B or whatever, so they don't want to commit. These different timelines also impact the context that we have available: you might have incomplete context information because the timeline of dissemination for one aspect of the project is lagging compared to another aspect of the project. And that's really difficult.
And I'll talk about identifiers. How many of you have ever seen tags like this? Okay. So this is actually kind of common. If you're a specialist, you're looking at bags and bags of stuff, or trays or whatever.
And they've got all sorts of identifiers on them, meaning the names of archeological contexts, or a small find number, or something like that. And you're trying to record all of that. Sometimes that might be written with a Roman numeral and sometimes an Arabic number. Sometimes the dates are going to be Euro style, sometimes American style. All of these variations really start muddling things when you're trying to record all this. So you might be building your own spreadsheet and recording all of this information, and somebody else might be doing the same. You're doing seeds, somebody else does the bones, somebody else does lithics. You record it all, and hey, nothing actually fits together, because somebody wrote Roman numerals on one thing and somebody wrote Arabic numbers on another. These kinds of practices, where you're working in isolation without a lot of coordination with the people looking at other aspects of the material, create a lot of problems when you start trying to bring together context.
So this is just a little illustration of that. You know, a pretend Roman site. Here's the context information, the locus. Here's the bone person recording things with Roman numerals sometimes, and then the coins person recording it in other ways. None of these different tables would actually join together, right? You could sort of do it by hand. You can intuit how they might relate. But for a computer, these things are all really different. They're not going to relate together, so you can't bring that information together. And this is actually breaking context, right? You can't associate the coins with the stratigraphic unit information or the bone information. This is a common kind of problem that we encounter when we deal with archeological datasets.
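To make the join problem just described concrete, here is a minimal sketch of normalizing locus labels so that separately recorded tables can be brought together. This is an illustration only, not Open Context's actual cleaning workflow. The names `normalize_locus` and `join_on_locus`, the `locus-N` key format, and the sample records are all hypothetical, and the sketch assumes labels are either Arabic digits or simple Roman numerals with an optional "Locus" or "Loc." prefix.

```python
import re

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

# Optional "Locus" / "Loc." prefix, then either Arabic digits or a Roman numeral.
LOCUS_PATTERN = re.compile(r"^(?:locus|loc\.?)?\s*(\d+|[ivxlcdm]+)$", re.IGNORECASE)


def roman_to_int(text):
    """Convert a Roman numeral such as 'XIV' to an integer."""
    total = 0
    largest_seen = 0
    # Read right to left: a smaller value before a larger one is subtracted.
    for char in reversed(text.upper()):
        value = ROMAN_VALUES[char]
        if value < largest_seen:
            total -= value
        else:
            total += value
            largest_seen = value
    return total


def normalize_locus(raw):
    """Map variants like 'Locus IV', 'loc. 4' and '4' to one key, 'locus-4'."""
    match = LOCUS_PATTERN.match(raw.strip())
    if match is None:
        # Labels outside the assumed pattern need a human decision, not a guess.
        raise ValueError(f"Unrecognized locus label: {raw!r}")
    value = match.group(1)
    number = int(value) if value.isdigit() else roman_to_int(value)
    return f"locus-{number}"


def join_on_locus(contexts, finds):
    """Attach context attributes to each find that shares a normalized locus."""
    by_locus = {normalize_locus(c["locus"]): c for c in contexts}
    joined = []
    for find in finds:
        key = normalize_locus(find["locus"])
        context = by_locus.get(key)
        if context is not None:
            joined.append({**context, **find, "locus": key})
    return joined


# Hypothetical records from three people working in isolation.
contexts = [{"locus": "Locus 4", "stratum": "B"}]
bones = [{"locus": "IV", "taxon": "sheep/goat"}]
coins = [{"locus": "loc. 4", "coin": "example coin"}]

for row in join_on_locus(contexts, bones + coins):
    print(row)
```

Raising an error on unrecognized labels is a deliberate choice in this sketch: a label such as "4a" may be a real sub-context, and silently coercing it would break context in a different way.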
And so each one of these things, being isolated, is less complete. You can't really see how it relates to the bigger picture. So with Open Context, we spend a lot of our time going through identifiers and making corrections, working with data authors to identify where there might be problems and how to fix them, because then we can start bringing that information together.
All right, so the other aspect that we try to deal with is complexity. Sorry to inflict this on you. This is an entity relation diagram, which is database speak for the organization and layout of a database, a relational database in this case. This happens to be for one cemetery, an Anglo-Saxon cemetery, and the dataset is archived in the Archaeology Data Service. There's a DOI to it, which means you can download it. And they've provided this really awesome, great documentation. That's great. It's documented, you can figure it out. But the problem is you have to do this over and over again. You have to look at these kinds of relationships and scratch your head trying to figure out what the relationships are, over and over again. So if you want to do some sort of synthetic study that aggregates information from multiple projects like this, you're in for a giant headache, because every project might organize its data very differently, with a similar level of complexity. Taming that complexity is a big priority in managing archeological information.
So really quickly, a better approach. I won't get into too many of the technical details, but imagine lots of those entity relation diagrams, very complicated. Every project has its own way of organizing things.
We have something called an extract, transform, load process, ETL, which basically takes that project-specific way of organizing data and maps or relates it to the general way of organizing information that we have in Open Context. Internally, Open Context has a graph data store, which allows us to represent that variety of ways of organizing information. But instead of having a million different relational databases, we have one database with common querying and interface services. The project information is described there as it was originally described, using the vocabularies and attributes as originally documented in the original datasets. It's just that it's represented in a big graph database structure, which allows us to provide a user interface and querying interfaces around it. We couldn't do that if we had to do it with every random relational database that we got.
Now, that's only one step. The other step is that we try to annotate, to add additional information saying that there are common metadata, common ways of describing things. Even if we keep a bunch of randomly described things in one place, it's still hard to use. Having additional cues and clues about meaning that are common to multiple datasets helps usability. So these are the two ways that we try to manage the complexity: modeling things in a common graph database, and then adding those annotations that allow things to be cross-searchable.
So what does that look like? This example on the left is cattle. Oops, sorry, I'm getting really big on the screen there. Sorry, Zoom people. I think that's cattle metapodials from about 30 different projects in Southwest Asia and in Europe. And those can be compared. They might have common measurements.
They're described according to a common biological taxonomic schema from GBIF, the Global Biodiversity Information Facility, and something called Uberon, which is there for describing anatomy. Those common ways of describing things and common measurements allow this dataset to be explored as a whole. We also provide chronological information and geospatial information in common standards, which allows one to do maps and things like that. And then you can start exploring questions about size changes, maybe associated with domestication, or whether Roman cattle husbandry differed, because maybe they were foddering animals differently from the Iron Age and medieval periods. So these are really interesting kinds of things that we can explore by relating a bunch of diverse information to a common set of attributes.
We're increasingly doing that also with the Getty Art and Architecture Thesaurus for material culture. It's easier to do with zooarchaeology, as there's more of a tradition of reuse and regional comparison in zooarchaeology. Material culture is harder, but we're starting to do more of that with the Getty Art and Architecture Thesaurus and a few other standards that people publish. We formerly used the British Museum thesaurus, and I'll get back to that in a minute.
So some of those annotations are for interoperability, to be able to aggregate data in interesting ways. And some of the annotations and linkages that we provide are for providing additional context, in this case context with literature. We have a project called the Digital Index of North American Archaeology, which is essentially a big gazetteer of sites drawn from public records, mainly from state SHPOs. We've looked at the identifiers for those sites and we've related them to publications in JSTOR.
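The two steps described above, loading each project's own way of describing things into one graph and then annotating local terms with shared concepts, can be sketched roughly as follows. This is a toy in-memory triple store under stated assumptions, not Open Context's actual schema or ETL code. The `Graph` class, the `load_project` function, the project names, and every `example.org` URI are hypothetical placeholders. Real annotations would point at entries in shared vocabularies such as the GBIF and Uberon ones mentioned in the talk.

```python
# Hypothetical placeholder URIs, not real GBIF or Uberon identifiers.
CATTLE = "https://example.org/taxon/cattle"
METAPODIAL = "https://example.org/anatomy/metapodial"
HAS_TAXON = "https://example.org/predicate/has-taxon"
HAS_ANATOMY = "https://example.org/predicate/has-anatomy"


class Graph:
    """A tiny in-memory triple store of (subject, predicate, object)."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def subjects(self, predicate, obj):
        """Return all subjects that have the given predicate and object."""
        return {s for s, p, o in self.triples if p == predicate and o == obj}


def load_project(graph, project, rows, id_field, annotations):
    """Load one project's table, keeping its own terms and adding shared ones.

    `annotations` maps (local_field, local_value) to (shared_predicate, shared_uri).
    """
    for row in rows:
        subject = f"{project}/{row[id_field]}"
        for field, value in row.items():
            if field == id_field:
                continue
            # Keep the project's original field name and value untouched.
            graph.add(subject, f"{project}#{field}", value)
            # Add a shared annotation where this local term has been mapped.
            if (field, value) in annotations:
                graph.add(subject, *annotations[(field, value)])


graph = Graph()

# Two hypothetical projects that describe the same kinds of things differently.
load_project(
    graph,
    "project-a",
    [
        {"bone_id": "b1", "species": "cow", "element": "metacarpal"},
        {"bone_id": "b2", "species": "sheep", "element": "metacarpal"},
    ],
    id_field="bone_id",
    annotations={
        ("species", "cow"): (HAS_TAXON, CATTLE),
        ("element", "metacarpal"): (HAS_ANATOMY, METAPODIAL),
    },
)
load_project(
    graph,
    "project-b",
    [{"id": "7", "Taxon": "Bos taurus", "Elem": "MT"}],
    id_field="id",
    annotations={
        ("Taxon", "Bos taurus"): (HAS_TAXON, CATTLE),
        ("Elem", "MT"): (HAS_ANATOMY, METAPODIAL),
    },
)

# One query now spans both projects, despite their different column names.
cattle_metapodials = graph.subjects(HAS_TAXON, CATTLE) & graph.subjects(
    HAS_ANATOMY, METAPODIAL
)
print(sorted(cattle_metapodials))
```

The point of the design is that nothing is overwritten: the original "cow" and "Bos taurus" values stay in the graph as each project recorded them, and the shared annotation sits alongside as an extra statement that makes the records cross-searchable.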
We've also related them to the Federal Register, to US government regulatory determinations about decisions that impact different archeological sites. And you can actually make a map of that literature and the coverage of that literature by identifying the site information in the literature. We know where the sites are because the SHPOs have given us that information. But we only have a low degree of spatial precision, so we're not leading looters to them. So that's the only one. This information is aggregated, binned at about a 20 by 20 kilometer grid. So that's not a threat to site integrity.
Another key aspect of linking gets back to the framing point where we talked about colonialism. It's not just about interoperability. Context is something that we can and should treat more formally with archeological information, and that includes descendant communities. One of the projects we published deals with a site that was excavated through a collaborative excavation with the Lower Elwha Klallam Tribe. It's on the Olympic Peninsula, in Washington State. The recognition of that sovereign nation's interest in this dataset needs to be expressed in ways that can be widely understood. One could just write it down, and we have written it down here as a text note. But as just a text note, that doesn't necessarily make it interoperable. If this information flows into another repository or another digital library, if it gets indexed by something, it's not necessarily going to be understood in that way. So one of the really critical things that we need is additional infrastructure to express those associations, those interests between a descendant community and some information, in a way that is going to be widely understood across multiple platforms and systems.
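The gap between a free-text note and a machine-readable association can be illustrated with a small sketch. Everything here is hypothetical: the `CommunityNotice` and `DatasetRecord` classes, the `harvest` function, and the `example.org` URIs are invented for illustration and are not the data model of Open Context or of any real metadata standard. The sketch assumes a downstream index that copies only the structured fields it recognizes and drops free text.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CommunityNotice:
    """A structured statement of a community's interest in a dataset."""

    community_uri: str  # hypothetical identifier for the community
    notice_uri: str  # hypothetical identifier for the kind of notice
    label: str  # human-readable wording of the notice


@dataclass
class DatasetRecord:
    title: str
    description: str  # free text; may or may not mention a community
    notices: list = field(default_factory=list)


def harvest(record):
    """Simulate a downstream index that keeps only fields it understands."""
    return {
        "title": record.title,
        "notices": [(n.community_uri, n.notice_uri) for n in record.notices],
    }


# The association exists only as prose, so it is lost when the record is harvested.
text_only = DatasetRecord(
    title="Example collaborative excavation dataset",
    description="Excavated in collaboration with a sovereign tribal nation.",
)

# The same association expressed with identifiers survives the harvest.
structured = DatasetRecord(
    title="Example collaborative excavation dataset",
    description="Excavated in collaboration with a sovereign tribal nation.",
    notices=[
        CommunityNotice(
            community_uri="https://example.org/community/example-nation",
            notice_uri="https://example.org/notice/community-interest",
            label="A descendant community has an interest in this dataset.",
        )
    ],
)

print(harvest(text_only)["notices"])
print(harvest(structured)["notices"])
```

The design choice being illustrated is that identifiers, unlike prose, can be recognized by a system that was never told about this particular dataset, which is what makes the association portable across platforms.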
And so there's a really interesting project, Local Contexts, which is trying to provide that kind of metadata framework, so that information can be understood in ways that can be recognized by different library systems and different publishing systems. In the same way, if you're a researcher, you might have an identifier like an ORCID that says, yes, you are the author of an article. That kind of information is based on a set of standards, and it's understood by libraries and search engines and whatnot. The information that shows the contextual associations, the social need to connect living people with information documented about their ancestors, can also be expressed with similar kinds of standards. So this is an interesting example where the CARE and FAIR principles complement one another: interoperability can also be used to serve the agenda of meeting the needs of the social context of information.
All right, this is just a few examples of reuse. All of that information that I showed you gets reused in interesting ways. People have done augmented reality, people have done work on data visualization, and depending on how you count it, something like 60 to 100 different publications in the past year and a half or so have cited Open Context, according to Google Scholar. And I'm going to skip ahead to get to some things that I want to touch on and then close, because we're getting short on time now.
Open Context has been around since 2006. These are some screenshots of the latest iteration of our project. And to highlight why this matters for interpretation, teaching, and things like that, these are spatial distributions of textile implements at an Etruscan site near Siena, in Italy. That's where we hang out every summer. It's nice.
And this represents some 54 years of excavation at the site. And what's cool about this is that these are rocchetti, which are spools, and over here on the right you see spindle whorls: two different textile implements, and they have a very different spatial organization on this site, which suggests different kinds of spatial patterning and different activities related to textile production. And that is something that comes out new. It's a new thing because we mobilized the data. And as we've worked on this and provided user interfaces on it, these patterns can start emerging, and this is actually used a lot now in teaching, and people are writing papers about that kind of thing. So mobilizing the data, making it dynamically accessible like this in a common interface, really has value because it allows this information to actually be used in conversations, right? And that's great. Problem is, it's expensive to build and hard to maintain. So again, this is what it feels like to be the technical director of a project: pushing a boulder uphill all the time. And we've rebuilt this about five times. What's interesting is that the kinds of things that we do with visualization and search interfaces and querying and all that, other people do too, right? In archeology, outside of archeology, and really, ideally, we don't want to reinvent any of those wheels. It's a lot better to try to reuse the work that other people have done and have contributed in an open source way. And because it's open source, we can reuse it. So as far as sustainability is concerned, a lot of our own sustainability is easier to accomplish because other people are actually engaged in this field and making important contributions. And then we can leverage those contributions for our own work. One hard thing about this, and this goes into the future. And I think I'll probably close here, just because of the time, so that we have time for some questions. 
Governance and sustainability are not guaranteed. So one important case study is actually the British Museum. They published a really important thesaurus back in 2011 or so. And lots of people reused it. And that's the thesaurus, remember, linking together all those animal bones, those kinds of common frameworks for organizing this kind of information. They published that, other projects started using it, and then the British Museum just dropped it without telling anybody. And a lot of us were left with broken links. And that's a bad thing because it just broke all of the contextual associations, right? Because those links bring those contexts together. So it's really important to have that context, but somebody has to actually maintain it. And the British Museum decided that they didn't wanna do that anymore. They wanted to play with Google and Samsung and other kinds of projects. So that's a big problem. And this is an interesting thing when thinking about prioritizing this: the British Museum directors didn't make a priority of this. Our community needs to make these kinds of issues, sustainability and sustained engagement with these kinds of issues, much more of a priority. Otherwise, key contributions like this, which actually make a big difference for organizing information and making it much more useful, can be dropped very quickly, and everything falls to pieces again. And that is something that I could probably close with, because we're at time. So. Thank you. Thank you. So, I would love to ask you right away about the difference, because I think the CARE and FAIR thing is a really good way of talking about it. Thinking about when context is appropriate, and you said you don't want to create a looter map and things like that, right? Right. 
So for example, I just worked through this this week with a community partner, where they're looking for land repatriation from a landowner whose family's investment in that is like, where exactly are their people in the ground here? Yeah. And so the tribal leadership offered data sets that included ground-penetrating radar maps, you know, but left off all the actual spatial coordinates as to where people's resting places were. But as a way of showing, hey, we got the UC Berkeley people to do the radar work, and you can see that they found quite a few of our ancestors on this property, so it would be meaningful for us to be able to care for them properly and repatriate the land. Right. So obviously, this is kind of like what you're talking about with the looters, right? You don't want to tell people where people are. You don't want to tell the broader world, rather, where people are. But you want to be able to share the data enough that the owners of the land are motivated to repatriate it, right? So if you're being asked to help create this other interoperable research project. So let's say the tribal leadership decides to pull me from a project and they want to work with a different partner on doing the work, right? They want to move the data, right, that we produced, but they also need to be able to share it in ways that sometimes protect the actual geospatial information. Right. So those kinds of things. How would you recommend, just in that kind of case study, that those types of pieces of a project be laid out? Because I think stripping context is important. Yeah. Yeah, dealing with sensitive data is hard and expensive. It just is. And the people that are best equipped to understand who should have access and who shouldn't are the people that have the local contextual knowledge, right? 
They know the actors and the players and they have their own interests at heart. So really the best kind of solution is that this nation has its own digital infrastructure that it manages and controls. The problem is capacity, right? It's costly. And, you know, there are ways to reduce the cost. There are good open source solutions for various kinds of information. So one thing I didn't get into is there's a system that's open source that's been financed by the Getty Conservation Institute called Arches, which is a really powerful geospatial database tool, and with it they'd have total control over managing access and permissions issues. And if they can run that, that would be really useful. But running that itself requires some expertise and money and a server and all that kind of thing that not everybody has, right? Right. And so, you know, advocacy is really the main kind of tool that we can have. If a sovereign tribal nation in the US has the resources to be able to do that, then they can manage their heritage in a way that really suits their interests, and that would be the ideal solution. But that requires access to the money and the resources, and it's money and also technical expertise. There are some nations that really have that. So I think the Seminole Nation in Florida has got exemplary kinds of systems and processes and people in place. They're doing a great job with that. But again, they have a capacity that is not necessarily the same as what another group would have. And they're federally recognized too, in Florida. Exactly. Yeah, yeah. So, you know, it's like anything in this space, right? It's really hard to do good work without the financing. I mean, even things like open access, right? Like, that's great, open access is great. 
And then, you know, some open access measure might come out that requires really expensive APCs. And that cuts out junior scholars and unaffiliated researchers because they don't have anybody to pay for that. And that's a bad thing in that way. So, you know, having the resources available to do the good work is what's really important. I think we need the advocacy there because it's not just going to be a technical solution; you really need the organizations to have the resources to be able to manage this in the way that they see fit. Yeah. Did you see Nico's question? Oh, no. He says, can you tell us more about the changes in the new version of Open Context being rolled out? Yeah. You might have to repeat the question. Yeah. Right, for the online audience? Yes, so what are the new changes? A lot. Some of it is just better user interface stuff. So these are the Poggio Civitate spools, and, you know, changing the colors, being able to zoom in on things in more interesting ways. One thing would be, if I just click on an object here, that's a spool. A little banner image shows some visual context of where this record fits in with the rest of the project. There are improvements in speed and stability and scaling, which are all important, and also things like standards alignment. Here's another example. Let me shift back to this. This is a survey data set. Now I'm jumping all the way to the Peloponnese in Greece, okay? This is a survey data set created by David Pettegrew and colleagues, and one of the things we've been building is a lot more visualization, to be able to explore a data set and understand what's in it in a more informed manner. And so this represents a quantitative view of the time periods associated with the different objects that were collected in the survey. 
So some objects have a lot of chronological uncertainty in what they date to, some are much more specific. And so the wide bumps are very uncertain kinds of objects, and the narrower ones are more chronologically sensitive kinds of material culture that have narrow time frames associated with them. And then you can pile them on top of each other and color code them based on quantity, and the height of the line is based on quantity. And that gives you a visual impression of time in an interesting way. So these are some enhancements that we're doing. Basically, a lot of the enhancements are really for usability, user experience. Most of the standards in terms of interoperability have been in place for a while now, but a lot of what we're trying to do is just to make this more usable, so you don't have to be a completely maladjusted nerd to be able to use the site. So that's one of the examples. This is a drawing of the Sphinx statue from work that Mark Lehner did way back in the 70s and 80s. It's a very high resolution scan of a plan, and it's color coded with different episodes of restoration work that happened on the Sphinx. Some of that restoration work actually dates back to the New Kingdom, so around 1500 BCE or so, through to the present day. And it's also useful for tracking more recent erosion and changes on the Sphinx statue with climate change and wind and dust blowing around. Who else? Yes. I have another question online, from the chat. Okay. I can't see it, sorry. Jordan is asking: thanks for the talk, could you talk a little more about any favorite studies that have come out of drawing data from across projects? Well, that would be yours. Yeah. 
Yeah, so I think the biggest aggregative data reuse that has come out, reusing the information that we published, has been around zooarchaeology, because there are already these traditions of reuse of information in zooarchaeology. And a lot of the information is really comparable across different sites and time periods and regions. So in zooarchaeology, there's a study led by Ben Arbuckle looking at the dispersion of animal husbandry from the Near East towards Europe that came out in PLOS ONE, I think 2016 or so. And other really interesting work has been done by Katherine Cook and Kevin Garstki around actually using this in teaching, which is also a really important role that we need to recognize. And they've published some interesting work about what the challenges are and how to make some improvements in making this more usable for instructional purposes also. So that's something that we're really grateful for, the work that they've been doing too. Yeah. It's a really simple question, but if this is open source, that's the goal, and individuals or projects put their material up, is it like a burial plot? It's there forever, in theory? Or, what happens when those people have pretty much passed away? Yeah. I mean, like the thesaurus, I guess I'd like to use that as an example. My understanding of a thesaurus was term definitions, like, you know, let's say a Triticum monococcum. Yeah. It will be a Triticum, spelled the correct way, by the way, in every single site that you link together, like your bones, right? Yeah. And so I thought that that's what the British Museum's thesaurus was: once it's a Triticum monococcum, it doesn't matter where it is, it's still a Triticum monococcum, and you just push a button and see where those exist. But it sounds like it was doing much more than standardizing definitions. It was doing some linking things that you've lost. 
And you're just saying that, and how that's different from your... No, that's a good question. So the British Museum's thesaurus was that: it was an authoritative definition for different kinds of terms, concepts, right? Different classifications that people use. And it was made computer actionable, meaning that the relationships between those concepts were described using standards that computers can understand. So one of the biggest things is hierarchy, right? Here's a parent concept, a parent like mammal, and then mammals have primates and carnivores, et cetera, et cetera, underneath them. And having those formal relationships between concepts is really useful because then you can start making inferences: I want to search for mammals. Oh, that means you also want dogs, right? That kind of thing. Right, so the British Museum's thesaurus was published in a way that's a definition. It was made computer actionable. And they had web identifiers as basically the ID for each concept. That's great, provided that somebody's actually going to sustain those web identifiers, those links, right? But I don't understand. A mammal, once you define the mammal, why do you need a link? It's always going to be a mammal. Right, but how do you identify it, where do you look up mammal? What source? You don't look it up in a source. You won't. The British Museum was the source. It was like an authority, right? Yeah, yeah. I mean, they defined mammal as this, and they defined dog and goat. Right, yes, it's just, how do you know that it's the British Museum's mammal versus somebody else's mammal? That's the thing. Those links provide an identity to the concept. Another good example is places. Alexandria, right? This is a place, right? Well, it's actually multiple places. Right, there's an Alexandria in Virginia. There's a very famous one in Egypt, right? So which Alexandria are you talking about? 
So a gazetteer is really useful for disambiguating which one of those you're talking about. And there are web gazetteers that do that, that have web identifiers for different places, and we use those and we link to those. And we have to trust that they're gonna be around. And what I think is the critical thing is, it's not that gazetteers made with computers or put up on the web are bad. What's broken is our governance of that, and our institutions not prioritizing and maintaining these things. That, I think, is the real critical lesson here. So suddenly all those definitions just went down. Exactly. Yeah, that's fine. It's not just about pulling the definitions down. This is what you're using, putting in your databases, that you have a link to that thing. For others and for computers. You're just gonna grab that handle. Yeah, and then the link is broken and your thing is suddenly not working. You can't go get that handle. Right. Yeah, there's a really interesting project called Pelagios, which has done really interesting work in aggregating geospatial references, common references to places, with gazetteers. They've mostly focused on the Mediterranean world. So Herodotus wrote about and mentioned a lot of places in his text. They've linked those places mentioned in the text to the gazetteers. And you can map out Herodotus's book as he wanders around talking about different places. And then you can see material culture that's associated with the same place, that's maybe stored in a museum. And those kinds of aggregative ways of doing things, being able to pull together related information that's really context rich, are driven by the fact that there are these web identifiers that make a very unambiguous assignment: we're talking about that Alexandria, the one in Egypt, not the one in Virginia. So that's the kind of thing that's really important. 
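To make the Alexandria point concrete, here is a minimal sketch of why linking on an identifier behaves differently from matching on a name. It is an illustration only: the `example.org` addresses, the record contents, and the `group_records` helper are all assumptions invented for this example, not real gazetteer entries and not Open Context's or Pelagios's actual data model.

```python
from collections import defaultdict

# Placeholder identifiers for illustration only; these example.org
# URIs are assumptions, not entries in any real gazetteer.
ALEXANDRIA_EGYPT = "https://example.org/places/alexandria-egypt"
ALEXANDRIA_VIRGINIA = "https://example.org/places/alexandria-virginia"

# Made-up records from three independent sources. Each carries a plain
# name (ambiguous) and a place identifier (unambiguous).
RECORDS = [
    {"source": "ancient text passage", "name": "Alexandria", "place": ALEXANDRIA_EGYPT},
    {"source": "museum object", "name": "Alexandria", "place": ALEXANDRIA_EGYPT},
    {"source": "site report", "name": "Alexandria", "place": ALEXANDRIA_VIRGINIA},
]


def group_records(records, key):
    """Group record sources by the value stored under `key`."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["source"])
    return dict(groups)


# Grouping on the name lumps every "Alexandria" together.
by_name = group_records(RECORDS, "name")

# Grouping on the identifier keeps Egypt and Virginia apart, so the
# text passage and the museum object join up and the site report does not.
by_place = group_records(RECORDS, "place")

print(by_name)
print(by_place)
```

The join only works while the identifiers keep resolving to the same thing, which is the governance problem described above: if whoever maintains the identifiers drops them, every record that points at them loses that context.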
It's this notion, and I didn't get to it in my talk because we ran out of time, sadly, but the difference between a literal and an identifier is really a critical thing to think about in your own database. So a literal is just some description of something, right? It could be a text note, it could be the length of something. But an identifier is something to think about, because an identifier is something that could be described elsewhere. And you wanna think about that when you're thinking about your own data: is this concept that I have in my own spreadsheet something that could potentially be described elsewhere? Like a context record, right? And if it's a context record, then you'd better make sure that it's identified in a way that's unambiguous and that somebody can actually look up, because then all of a sudden you've added context to your own information. And the same thing with something like biological taxa, sheep/goat: is this defined someplace? And can I point to it in a way that other people could also point to it? And then all of a sudden, if we have a lot of people pointing to these similar concepts, then our data join together in an interesting kind of way. They're related by those shared concepts. Yeah. One of the challenges to that would be the different definitions. So I'm thinking of, say, stone tool artifacts, right? So what's a backed blade to some people is something else to somebody else. Yeah. So do people then link to specific typologies, or? Yeah, you can link to specific typologies. I mean, what we do with Open Context, we do both. You define it the way you want it. It's your backed blade. And we'll just represent your backed blade. But we'll give your backed blades an address, so somebody else could cite that. So one of the examples would be, let's say here's an object type, you know, rocchetto. So this has a landing page. 
You could cite that typological concept of a rocchetto, and here are a bunch of examples of it. You can make a definition, a description, of that. And you cite that in your own work. So at least somebody might agree with you, or they could say, I'm using so-and-so's backed blades. If there's a wider standard, like some committee of lithicists gets together and defines a whole bunch of object classes, then you could reference that and say that my backed blade is a subtype of this general type. And the nice thing about doing all that is you start making these things explicit, and it can be computationally actionable. And when that starts to happen, first of all, people do wishy-washy things all the time in the literature, and sometimes you can never really understand what they're talking about because it's all wishy-washy. But doing it this way makes it explicit. So you can disagree or not. Just because it's on a computer doesn't make it objective, but it makes it explicit. So at least it's something where you could agree or disagree. It's not an objective fact, but at least it's made explicit what this relationship is. A way of reducing the inherent ambiguity. Yeah. I've talked a lot before. They are descriptions and everything. Yeah. They're more ambiguous than they are. Yeah. I mean, typology is hard, right? People have functional types, people use shape and form as types. Sometimes it's by manufacturing processes. So yeah, it's a hard problem. We're not going to solve it necessarily ourselves, but at least the idea is here. So let's start being a little bit more explicit about what we're talking about. That's mainly the incremental step that I think is more to you. But yes. Well, I was going to say, you can't make work in a user. Yeah.
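The "subtype of this general type" idea, and the earlier "search for mammals, get dogs" inference, can be sketched in a few lines. This is a hedged illustration, not how the British Museum thesaurus or Open Context is implemented: the `BROADER` table, the concept names, the record IDs, and the `is_a` and `search` helpers are all made up for the example.

```python
# Hypothetical "is a subtype of" links: child concept -> parent concept.
# In a published vocabulary each concept would be a web identifier;
# short labels are used here only to keep the sketch readable.
BROADER = {
    "dog": "carnivore",
    "carnivore": "mammal",
    "primate": "mammal",
    "my backed blade": "backed blade (shared standard)",
}


def is_a(concept, ancestor, broader=BROADER):
    """Return True if `concept` is `ancestor` or sits anywhere beneath it."""
    seen = set()  # guards against accidental cycles in the table
    while concept is not None and concept not in seen:
        if concept == ancestor:
            return True
        seen.add(concept)
        concept = broader.get(concept)
    return False


def search(records, wanted, broader=BROADER):
    """Return IDs of records whose type falls under the `wanted` concept."""
    return [r["id"] for r in records if is_a(r["type"], wanted, broader)]


# Made-up records, each typed with one concept from the table above.
RECORDS = [
    {"id": "bone-1", "type": "dog"},
    {"id": "bone-2", "type": "primate"},
    {"id": "lithic-1", "type": "my backed blade"},
]

# Searching for the parent concept also finds the narrower ones.
mammals = search(RECORDS, "mammal")
blades = search(RECORDS, "backed blade (shared standard)")

print(mammals)
print(blades)
```

Nothing here makes the typology objective. Someone can still disagree that "my backed blade" belongs under the shared type, but the claim is now written down as a relationship that both people and software can inspect.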