Hello, everyone. Thank you for joining my talk today on designing metrics programs to respect contributor expectations and promote safety. My name is Sophia Vargas. For those of you I haven't met, I am a researcher and program manager in Google's open source programs office. I'm also an active member of the CHAOSS community, a project that focuses on discussing, defining, and building software to implement metrics around open source projects, specifically designed to measure health and sustainability.

A little bit more about me and why I'm interested in this topic: I have a background in market research, consulting, data analytics, and program management. And when we talk about something like data collection, data practice, and data policy, I want to expressly point out that I'm not a lawyer. So please do not take anything in this presentation as legal advice. Instead, consider it as awareness: what I would like to have known earlier, and an acknowledgment of when you should be consulting legal expertise in your own practice.

So, to bring us back to the question at hand: can we measure the health of our projects? Well, we can. There are ways to collect data around our communities. And if we set goals around whatever health means to us, growing contributions, converting more intermittent contributors into regular contributors and maintainers, whatever it means to build health and sustainability in your community, we can measure that over time, track it, and see how we're doing. But in order to measure that, we have to collect data about our community, about people. And often these are people who can choose to be anonymous, people who can choose how they identify in a community call with their camera off, or whether they supply their work and location information in their GitHub profile. So we have a whole range of information about the people we're working with in these communities, but it's not necessarily consistent.

So when we think about collecting data in order to measure something like project health, we have to think about the individuals and the data we're collecting about them. How do we ensure that, when we approach these types of programs, we're meeting people at their own privacy expectations in order to ensure their safety in the community? Because, if you think about it, there are some identifiers that are now under attack in some communities, and we would never want to put our community members in a position where they could be subject to harm.

So the first question we have is: who is actually accountable for the data itself? Data can be collected by companies and by projects, and it is influenced by regulators and regionality. Depending on where you are and where you're collecting the data, you might be subject to different rules and regulations. I like this example: in the CHAOSS community, we've been working on the DEI event badging program, where we're encouraging events to have more diverse representation in their speaker lineups. In order to do that, we've asked event organizers to share demographic information with us so we can review it as part of the badging program.
But the more we did this and the more we collected this data, we recognized that we should actually be transparent with the community about how we're using this information, how we're ensuring that it is used expressly for this purpose and nothing else, and how we're protecting it and ensuring that it stays only with us, as trusted members of this community. So we thought this was a necessary addition to our public-facing documentation, to ensure that anyone who shared their information with us knew what we were going to do with it and how we were going to protect it. So if you are thinking about collecting data about your community, or a community you're a member of, the general recommendation is that you, the collector, are accountable. You might be working at a company, you might be working for a nonprofit, and yes, some of those nuances around what's being regulated and how might change depending on where you work. But if you assume responsibility, then you're the one navigating this on behalf of your community.

When I think about data collection, I think about it in three main buckets. We can ask people directly who they are, what they do, what they're interested in, and what they're working on, in the form of surveys, interviews, discussions, and forums. We can go out and scrape data that's already there, by crawling websites or pulling from platform APIs like GitHub and Stack Overflow. And then we can fill in the holes from what we know or assume. You can hear that I'm sounding a little more skeptical here, because guessing is never something I'm going to recommend, especially in the context of community health and metrics. I will say there are use cases where it's a little more appropriate, say in an academic setting where we're looking at thousands of people and trying to estimate the distribution of various identifiers in a population. In that case, maybe it's okay, because the labels never come back to the individual. They're applied at an aggregate, systematic level; yes, identities might be assumed, but no one is being labeled individually.

In our communities, though, this is where things can start to get sticky. Guessing might encourage you to write down information that people don't necessarily want shared with the broader public, information they shared with you as a trusted member of their community, in a place where they haven't agreed to have it recorded. So I'm never going to recommend guessing in the context of gathering data about your community and displaying it back. The other area where guessing becomes highly problematic is consent. If you are guessing and not interacting with the person directly, you're not actually getting consent from them, versus going up to them and saying, hey, are you okay with this information being used? Then you've expressly asked for their permission to use it in this context. So again, I'm never going to recommend guessing here, because it can encourage inaccuracy and potentially stereotyping.

So if we are asking people, in the form of surveys and interviews, we can explicitly ask for consent.
We can also allow individuals to provide conditional consent. Say I only want my information used if you have at least 30 people in my category, because then I know I have some anonymity in the aggregate. Or we can offer partial participation, allowing respondents to opt out of questions they may not feel comfortable answering. One challenge here: my general stance is that if you're running a survey, it should be anonymous. That can reduce the fear of harm or retaliation and allow people to be more open and honest, because they know it's not coming back to them. However, this presents a challenge for withdrawing or altering a submission, because if it's truly anonymous, you can't actually retract it; there's no way for me to find which response is yours. In theory you could add an identifier to each survey response, but under some regulatory approaches that is no longer anonymous, because people can be re-identified through those values.

When we're scraping data that's already there, the general recommendation is that as long as you are following all of the existing platform policies around data use, you are essentially riding on the implicit consent that's already been granted to the platform. If you sign up for GitHub, you say, yes, I agree to your terms of service. So if I'm pulling data from GitHub, I want to ensure that I'm actually following those terms of service; say, I'm not using your email to send out marketing content, because that is explicitly prohibited by GitHub's terms of service and data policies. As long as you know what those policies are and comply with them, you can essentially borrow the consent that's already been given to the platform, which has also allowed individuals to share information conditionally; some information and interaction details are only visible to org members, for example. And some information you don't have to volunteer at all: you don't have to put your name, your geography, or your company on GitHub. So there is an element of letting individuals share as much information as they want on that platform. I'll show a small sketch in a moment of what a policy-respecting pull from a platform API can look like.

Where this breaks down and gets more ambiguous is when we start to aggregate data across platforms, because there is no mechanism for consenting to aggregation into a new dataset that contains net new information. I had an example in here earlier that I cut for time, but if you look me up on GitHub, you can see where I work and roughly where I live, and generally that's enough information to find me on LinkedIn and see where I worked before and where I went to college. At a certain point you're aggregating a lot more information about me in one place than I have expressly given permission for. So if you are planning to do this, it's a good idea to bring it to your project leadership and your board and discuss your approach first, what you want to do with it and why you want this information, and give the community an opportunity to provide feedback and opt in or out.

So, a little bit more about survey sampling, and here's where I'm putting my market research hat on.
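Before we get into sampling, here's that sketch of a policy-respecting platform pull: a minimal, hypothetical example against the public GitHub REST API that keeps only the fields needed for an activity question and deliberately ignores anything like emails. The repository name is made up, and this is an illustration of the principle, not a compliance checklist.

```python
# Minimal sketch: pull only the public fields you need from the GitHub REST API.
# Hypothetical repo; unauthenticated clients are limited to 60 requests/hour,
# so real collection should authenticate and respect GitHub's terms of service.
import time
import requests

REPO = "example-org/example-project"  # hypothetical
URL = f"https://api.github.com/repos/{REPO}/contributors"

def fetch_contribution_counts(url: str) -> list[dict]:
    """Collect login + contribution count only; drop everything else."""
    results, page = [], 1
    while True:
        resp = requests.get(url, params={"per_page": 100, "page": page},
                            headers={"Accept": "application/vnd.github+json"})
        resp.raise_for_status()
        batch = resp.json()
        if not batch:
            break
        # Keep only what the health question actually needs.
        results.extend({"login": c["login"], "contributions": c["contributions"]}
                       for c in batch)
        page += 1
        time.sleep(1)  # be polite; stay well under rate limits
    return results

if __name__ == "__main__":
    contributors = fetch_contribution_counts(URL)
    print(f"{len(contributors)} contributors with public activity")
```

The point of the sketch is the field selection and the pacing, not the endpoint itself: you only carry forward what your question needs, so the raw dataset you become accountable for stays as small as possible.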
If you've ever run a large, comprehensive study, the thing that really qualifies what you can do with the data is the sample size: how many people you got, and what portion of the population you want to know about is represented in that sample. Typically this is what lets you understand how far you can extrapolate the data. Take the US Census: not everyone living in the United States supplies that information, but the coverage is such a massive portion that they can extrapolate what's happening in the broader population, because they know they have a representative sample. In our communities, though, we often don't know how many people are there or present at any given moment. So it gets fuzzier when we think about open source communities, because we might not really know. We might think we have a majority, but we might not. So think about what that actually means for how you present the data and what you can conjecture from it, and recognize that it's always going to carry some inherent bias. It is incredibly difficult to remove bias from a survey, because at a certain point I'm only recruiting, say, participants from the Kubernetes community. That's just one community; that's not everyone. We're inherently narrowing down the populations we want to talk to. And if we only recruit on a platform like Slack, then we're also limiting who we talk to. Recognize that where you recruit affects who is exposed to the study and who can take it, and acknowledge who you're actually able to include in this kind of exercise. And generally respect the norms: there are some channels that are not meant to be recruited from. If a channel is dedicated to something like issue triage, people probably won't appreciate a survey being dropped in there that isn't about issue triage.

Market researchers generally have lower limits below which they aren't comfortable representing data in percentage form. A conservative limit is a sample size of 50 people; a less conservative one might go down to 30. Below that, the numbers start to break down: representing results as percentages becomes misleading, and the smaller the sample gets, the easier it becomes to figure out who people are based on how they responded. The challenge is that you might not have 30 people in your community; that's actually quite a large number for some open source projects. So it's just an acknowledgment that you might not be able to present the data in the same way, and it's going to take a lot more care and consideration in how you present your findings so that you're not inadvertently exposing someone who thought they were anonymous. (I'll show a tiny example of this kind of reporting guard in a moment.)

Building on some of the other best practices for survey questions: please only ask what you really need to know. It's really easy as a researcher to keep shoving in more questions, because wouldn't it be nice to know that? But it just makes things longer, and the longer a survey gets, the more likely people are to bail. Generally, don't exceed 10 minutes. I know I wrote 25 and 15 here, because that's usually what market researchers say, but I think that's still too long. The shorter the survey, the more likely people are to finish it.
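Here's that reporting guard: a minimal sketch of the idea that percentages only get shown when the base is large enough, and small bases fall back to count-free, qualitative reporting. The thresholds of 50 and 30 are the conservative and less conservative limits mentioned above; the function name and output format are just illustrative.

```python
from collections import Counter

CONSERVATIVE_MIN_BASE = 50   # comfortable reporting percentages
RELAXED_MIN_BASE = 30        # lower bound some researchers accept

def report_breakdown(responses: list[str], min_base: int = CONSERVATIVE_MIN_BASE) -> str:
    """Render a category breakdown only when the sample is large enough.

    Below the threshold, avoid percentages entirely so a small sample
    isn't made to look more representative (or identifiable) than it is.
    """
    n = len(responses)
    if n < min_base:
        return f"n={n}: base too small to report percentages; describe findings qualitatively."
    counts = Counter(responses)
    lines = [f"n={n}"]
    for category, count in counts.most_common():
        lines.append(f"  {category}: {count} ({count / n:.0%})")
    return "\n".join(lines)

# Example: 12 responses, so no percentages are produced.
print(report_breakdown(["maintainer"] * 7 + ["contributor"] * 5))
```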
In addition to that, it's really easy to ask for more information than you actually need. One example that came up recently in my world was asking where people live, which is a sensitive and protected category. When I dug into it with the project leaders, the things we really wanted to know were: what languages are people working in, so we can prioritize translation? Okay, that's a different question. Or, what time zones should we prioritize for meetings? So asking what time zone you prefer to work in, which may or may not correlate with your geography, and what language you feel comfortable working in, those are much more pointed and actionable questions than asking for geography, which is far more sensitive and might not even get you the things you need.

Limit open-ended questions: it's really easy for a few extraneous details to slip in that can expose who someone is. It helps to remind participants to speak in generalities, and again, you as the data owner are accountable for redacting anything that might help re-identify them. Brackets and buckets can help make things more general. For a category like age, instead of asking for someone's exact age, ask for their age bracket. This way you reduce specificity while still giving analysts something to run distributions and analytics on. In very small populations, you might want to group more things together, especially for underrepresented groups. Yes, that might limit your analysis potential, but if only two or three people identify with a given underrepresented group, that's not a large enough sample to really learn from. You can still ask about these things, but if you only get a handful of responses, group them together before displaying the data; that way people's identities are a little better protected. (There's a small sketch of this bucketing-and-grouping step below.) Allow people to opt out of anything sensitive, potentially all demographic questions.

And wherever possible, borrow from existing lists. I've written, I don't know, hundreds of surveys, and I still struggle to comprehensively list all the different titles someone can have in a technical profession. The space is massive, and if you want to keep your list to, say, 13 or 14 options, which is roughly the standard size for a survey question, that can be really hard to do. So look at another survey and see how someone else has done it. It's much easier to react to an existing list and adapt it to your case than to write one from scratch. This gets even more important when we think about sensitive demographic questions. Here I recommend looking at the Open Demographics project, which has been crowdsourcing comprehensive sets of categories for demographic questions and identifiers. Existing studies, and projects focused specifically on this, are really helpful places to start.

Interviews, most of the time, are definitely not anonymous; they are people talking to people. So again, it's on the data collector to redact and abstract away the details. Typically, you can ask participants how much about themselves they're willing to share, but you can also just fully abstract them in your findings.
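Going back to brackets and buckets for a second, here's a minimal sketch of that step, assuming a plain list of survey records. The bracket boundaries and the minimum group size of three are arbitrary illustrations, not recommendations.

```python
from collections import Counter

AGE_BRACKETS = [(0, 24, "under 25"), (25, 34, "25-34"),
                (35, 44, "35-44"), (45, 200, "45+")]
MIN_GROUP_SIZE = 3  # groups smaller than this get merged before display

def to_bracket(age: int) -> str:
    """Replace an exact age with a coarse bracket."""
    for low, high, label in AGE_BRACKETS:
        if low <= age <= high:
            return label
    return "unspecified"

def group_small_categories(values: list[str], min_size: int = MIN_GROUP_SIZE) -> Counter:
    """Merge rare categories into 'other / grouped' so tiny groups aren't exposed."""
    counts = Counter(values)
    merged = Counter()
    for category, count in counts.items():
        key = category if count >= min_size else "other / grouped"
        merged[key] += count
    return merged

# Example: exact ages never leave this step; only brackets and merged groups are kept.
ages = [22, 29, 31, 38, 41, 44, 52, 23, 27, 33]
brackets = [to_bracket(a) for a in ages]
print(group_small_categories(brackets))
```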
So in the write-up, you might simply say: Participant X said that. And if you want to include more information about who you spoke with, then maybe you have a separate table that says: I talked to 10 people, five of them came from large tech companies, and five came from these particular regions. Because that table is separated from the quotes attributed to participants X, Y, and Z, there's no connection back to individual identities. Typically, in larger studies, we'll see an abstraction like job title, role, and industry. I might be a program manager at a large tech company, and I'm okay with that identifier because there are thousands of us at Google, so I'm not really exposing myself by describing myself in that category. However, if I said program manager at Google, and you knew the interviews came from, say, CHAOSS participants, you'd know exactly who I am. So acknowledging who is present in the study, who is present in the space, and what lists are public can help you figure out how best to abstract this. And if you're concerned at all, just revert to full abstraction. (There's a small sketch of this kind of participant coding below.)

A little bit more about passive data collection, pulling from existing information sources. You have to acknowledge that there isn't always a place to opt out. If your code base lives on GitHub, then your participants have to opt into GitHub in order to participate. And there might be areas where you start aggregating information that may or may not be identifiable, because you know your community members. So again, this is an opportunity to bring it back to the community, to explain what you're doing and what you're trying to do with it, and to allow feedback on how this information is portrayed. A lot of things on GitHub are outdated; people don't go in and change things immediately. So if you're looking at prior or historical logs, they may or may not show accurate information. And there's always going to be information that's missing, because you didn't go out and ask people directly. One of my favorite examples: there's a lot of effort to infer what company someone works for based on the domain of their email; the DevStats project does this, for example, and it is helpful to know where people are coming from. But we don't know whether people who appear to work at Google are actually doing this work on Google's time. A lot of people are volunteering in these spaces; they might identify as working at a company, but that doesn't mean they're doing this work on behalf of that company. There's a lot of nuance missing from these platforms, and that's exactly the kind of thing you have to go ask community members about directly if you need to know it. And as I already mentioned, be considerate with cross-platform data aggregation.

Now I want to spend a little time talking about another program in the CHAOSS project, which I've mentioned a few times; it stands for Community Health Analytics in Open Source Software. Again, we think about metrics, we design metrics, and we build software to collect those metrics. I already talked about the DEI event badging program; last year we also launched the DEI Project Badging pilot, which is a way for projects to highlight the efforts they're taking to promote diversity, equity, and inclusion in their communities.
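Here's that participant-coding sketch: quotes are keyed to opaque labels, and the demographic summary is kept as unlinked aggregate counts, so nothing in the shared findings ties a quote back to a person. The data structures, names, and labels are purely illustrative.

```python
from collections import Counter
from itertools import count

# Raw interview notes stay private to the collector; only the coded view is shared.
_counter = count(1)
_code_for_person: dict[str, str] = {}

def participant_code(name: str) -> str:
    """Assign a stable opaque label (Participant 1, 2, ...) per interviewee."""
    if name not in _code_for_person:
        _code_for_person[name] = f"Participant {next(_counter)}"
    return _code_for_person[name]

interviews = [
    {"name": "Alex",  "org_type": "large tech company", "quote": "Docs are our biggest gap."},
    {"name": "Priya", "org_type": "volunteer",           "quote": "Meetings are too late for my time zone."},
]

# What gets shared: coded quotes, plus an aggregate table with no link to the quotes.
coded_quotes = [(participant_code(i["name"]), i["quote"]) for i in interviews]
org_summary = Counter(i["org_type"] for i in interviews)

print(coded_quotes)
print(org_summary)
```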
Coming back to the project badging pilot: the reason I wanted to bring it up in this context is that when we think about measuring how we're doing on DEI, we often revert to representation and identifiers in our community. Everything we've talked about today acknowledges that these are highly sensitive, protected pieces of information, and in fact pieces of information that could cause harm depending on where you live in this country. So it might not be prudent for you to go out and collect them, especially if your community is small. What I like about this program is that instead of presenting metrics that measure representation as it actually stands, it encourages projects to write down their intentions: how they plan to work on things like project accessibility or communication transparency, what they're doing to welcome newcomers and provide an accessible newcomer experience, and how they're thinking about increasing diversity in their leadership, which is the most visible part of the community. These are things projects can report on and create metrics to measure against, and they have nothing to do with collecting protected categories of information from the community. So I really like this as an approach, particularly for small communities or ones that are sensitive about collecting this information. And if your community is two people, you're probably not diverse; you can't really be diverse with two people. That's okay, but you still have an opportunity to say, hey, I want to make sure my project is friendly and welcoming to people who may not look like me or identify with the things I represent.

With all of this in mind, I think we mostly want to take a step back and choose a methodology that not only meets our goals but fits the shape of our community. Some general principles. We've mentioned this a few times, but wherever at all possible, preserve anonymity. This really helps encourage honesty and reduces the fear of harm or retaliation. We mentioned accountability already, but really try to limit access to the raw data, because the raw data is typically the identifiable data. Even in an anonymous survey, if only six people fill out your form, you might be able to figure out who those six people are from how they respond. So you, as the data owner, collector, and analyst, are responsible for ensuring that the way you portray this information respects the privacy and anonymity promised at the front of the program. You can use a third party to help remove or reduce risk here, but it's generally not something I'm going to recommend: it costs money, and those people don't have context about your community and may not know how to interpret the results in a way that's helpful and relevant. You can also limit your retention period. When you finish your analysis, delete the data. That can feel a little scary, what if I want to go back and look at something? Well, okay, maybe after a year you throw it out, because you're going to run the study again and have fresh data. Save your aggregate results, but limit how long you keep the raw data. (There's a small sketch of that kind of retention cleanup below.)
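A minimal sketch of that retention cleanup, assuming raw survey exports sit in one directory and only an aggregates file is meant to survive; the paths and the 365-day window are hypothetical choices, not recommendations.

```python
# Hypothetical retention cleanup: raw exports older than the retention window are
# deleted; only the aggregate summary is kept indefinitely.
import time
from pathlib import Path

RAW_DIR = Path("data/raw_survey_exports")   # hypothetical location of raw responses
RETENTION_DAYS = 365                        # agreed retention period
KEEP = {"aggregate_results.csv"}            # aggregates survive the purge

def purge_expired_raw_data(raw_dir: Path, retention_days: int) -> list[str]:
    """Delete raw files older than the retention window; return what was removed."""
    removed = []
    if not raw_dir.is_dir():
        return removed
    cutoff = time.time() - retention_days * 24 * 3600
    for path in raw_dir.glob("*"):
        if path.name in KEEP or not path.is_file():
            continue
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return removed

if __name__ == "__main__":
    print("Removed:", purge_expired_raw_data(RAW_DIR, RETENTION_DAYS))
```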
And again, it might be uncomfortable, but the longer this raw data sticks around, the more likely it is to be shared with more and more people who may or may not respect the intentions originally outlined for the project. Review and comply with all applicable policies; there are many to consider here, not only at the foundation and community level but also at the platform level, and, if you're a company, at the company level, plus any regional regulation you're subject to. It's always best practice to have somebody else look at what you're doing. Auditing, reviewing, and testing your methodology can come from your project leadership and community members; if you're doing this on behalf of a research institution, you can lean on your institutional review board; and again, consult legal and privacy experts when applicable. And always communicate what you're trying to do, how you intend to use this information, what you expect to do with it, and what measures you're going to take to ensure the privacy and protection of everyone who participates.

I want to close with some general considerations. Typically, 100% participation is never going to happen, and that's okay. In most surveys, if I get 10% of my population or outreach list responding, that's actually pretty good. So expect less, and when you get data, be thankful for it. But also recognize what and who is potentially missing from the sample. You're never really going to collect everything, so what does that mean for what you can infer? Think again about that initial recommendation: understand the sample you collected and how representative it is, or isn't, of the community. And remember that no data is data. If you don't collect any information about some folks, that could mean they're not present, or it could mean they didn't feel comfortable sharing. That might be worth exploring in itself: understanding why this information is or isn't showing up in your collection practice. Always recognize that externalities might impact your results. You probably don't want to run a survey during a release. It seems obvious, but I've seen it happen; don't do that. People are busy already, and your people are your most valuable resource. Be respectful of their time, and figure out what cadence makes the most sense for your community. And please don't over-survey. I personally had four surveys in my inbox this month. That is too many, so I only pick the ones I either have to take or want to take. Again, be respectful of your community members' time, and maybe only do this once a year unless there's a really good reason.

For additional thoughts and best practices around data collection and research in open source communities, I'll point to a piece of research published last year by fellow researchers and colleagues on best practices for open source ecosystem researchers. I want to close with the fact that data is a tool. It is not innately good or bad, and it can be used for both positive and negative things. So you have to assume the responsibility to use data wisely. The general principle I like is to be incredibly deliberate. Data is a tool that can be used to elicit a change in behavior; but is it the behavioral change you wanted to encourage?
My favorite example here is leaderboards. I've heard a lot about DevStats this past week, and I love the project; I love what it allows the broader community to track and make visible. But it can also encourage certain behaviors, for example, if we say this is how we create our leadership nomination list, based on how contributions appear in this particular aggregation tool. You're essentially encouraging people to make sure they have the highest stats there, and those may or may not be the things the community really needs. So when we display a leaderboard, is that the behavior we actually want to drive? Maybe we don't always want to show all things to all people. Sometimes your community leadership needs to know something, but it isn't helpful to show it to the broader community. So be really deliberate about what you show to whom, so you elicit the change and influence the behavior you want in the community without encouraging unwanted behavior or changing behavior in ways you didn't intend.

It's also really easy to cherry-pick data, pull it out of context, and use it however you want; I've seen this happen numerous times. Try to fight against that by always including your bases, your sources, your methodology, how the sample was collected, and when it was collected, and by keeping a referenceable information sheet alongside your data. Please don't ever release data without this. It lets people understand where the data came from and how to interpret it. As a researcher, I always look for this before I interpret any data: I want to know how many people were involved, how they participated, how they were allowed to participate, and when it was collected. Those are the things that help me understand what I'm looking at and how to interpret the results. Not everyone will do this, so please include it all in one place to leave less room for misinterpretation.

Thank you, that's it. I actually cut this down a lot, maybe too much, but I honestly don't mind having a short talk. You can follow my work on LinkedIn; I've left the other platform we don't speak of anymore. In addition, we just recently launched a publication site as a team. I work on a research team of two, and we've been slowly but steadily publishing research related to the open source ecosystem and related best practices, and we're finally putting it all in one place. So if you're interested in the papers I've written, as well as those by my colleague Amanda Casari, they're now available on our website. And I'm personally always open to feedback, so I've put up this QR code; it'll take you to the app that lets you rate the session and give me feedback. With that said, thank you for your time, and I'll be around for the remaining seven minutes and 30 seconds for questions. Anyone? Yeah, please use the mic so the recording can hear you.

I've got a question. I think this is around the same topic, but for those of us who are maintainers or dealing with open source projects, I'm currently a product manager, getting signal and feedback from just the usage of one of our projects has been tough. And there seem to be two camps. Some people say, hey, telemetry is evil to embed in a project.
But it's super helpful, especially if you think about enterprise apps, not having to reach out to people and just having that data come back to you easily. I was just curious what your perspective or outlook is on that.

Yes, I have many thoughts, and actually an active project looking at this right now, so I'm glad you asked. We're talking about usage and how it can serve as a feedback mechanism for project and product leaders about who is using what and how. I think, yes, there is generally negativity associated with adding any kind of telemetry to your project. I'd say there's a bit of a double standard here, because if a project is released with telemetry from the start, people don't care as much; but if you add it later, that's where it starts to feel like we've overstepped the trust of the community. So right now I'm looking at a project to assess the usefulness of approximate indicators of usage. One of the most popular ones is the number of people viewing your documentation site, because the assumption is that if people are regularly reviewing your documentation, they're probably using your tool. So that's one where you can track activity around documentation. You can look at how many people are posting questions on things like Stack Overflow, how many people are opening issues in your GitHub repository, how mailing list membership is growing or changing, and how activity in your Slack channel is growing or changing. People like downloads, but they're really flaky depending on where you're pulling them from. Sometimes downloads can be indicative; sometimes, if you're installing 500 things on machines inside your company, that's not really a good signal. Actually, one of my favorite download debacles was an individual from the Bazel organization who pulled download information across their versions because they wanted to know which versions were still in use. And the data was a sine wave. It wasn't going up or down; it was oscillating. And we realized, oh, these are being pulled into CI/CD systems; this is automation, not people. So there can be a lot of funkiness and noise inherent in any of these signals.

So the general hypothesis we have now is that if you look at the signals available around approximate usage indicators, they can at least give you a sense of trajectory. It might not be an actual value, but you can generally see whether the community is growing or shrinking, even if you can't learn an absolute number from it. The other element you can bring in, which adds some relativity, is looking at similar metrics for similar projects, say other projects in the CNCF community. That can give you a better signal of whether you're growing with them or apart from them. It's relative, but it provides more context: okay, we have a number, we don't know how real it is, but is it more or less than another CNCF project that's similar to what we do? That gives you a little more context for how to talk about the information you have. (There's a rough sketch below of pulling one of these proxy signals and looking at its trajectory.) So nothing is perfect; nothing is exact without adding some sort of phone-home telemetry directly into your project, and again, that's generally not well received.
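As that rough sketch of a proxy signal, here is what pulling monthly new-issue counts from the GitHub search API and comparing them to a peer project might look like. The repository names are hypothetical, and this is one illustrative proxy among many, not a validated usage metric.

```python
# Sketch: monthly new-issue counts as an approximate usage/interest signal,
# compared against a similar project. Repos are hypothetical. The GitHub search
# API is rate limited (roughly 10 requests/minute unauthenticated), so real
# use should authenticate and pace requests.
import requests

SEARCH_URL = "https://api.github.com/search/issues"

def issues_opened(repo: str, start: str, end: str) -> int:
    """Count issues opened in the date range [start, end] (YYYY-MM-DD) for one repo."""
    query = f"repo:{repo} is:issue created:{start}..{end}"
    resp = requests.get(SEARCH_URL, params={"q": query},
                        headers={"Accept": "application/vnd.github+json"})
    resp.raise_for_status()
    return resp.json()["total_count"]

months = [("2024-01-01", "2024-01-31"), ("2024-02-01", "2024-02-29"),
          ("2024-03-01", "2024-03-31")]

for repo in ("example-org/our-project", "example-org/peer-project"):  # hypothetical
    counts = [issues_opened(repo, start, end) for start, end in months]
    trend = "growing" if counts[-1] > counts[0] else "flat or shrinking"
    print(repo, counts, "->", trend)
```

The absolute numbers are noisy for all the reasons above; the comparison across months and against a similar project is the part that carries information.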
So I would suggest looking at those kinds of signals, and stay tuned, because the research community is actively thinking about this; I would love to be able to demonstrate which metrics are better than others for understanding usage trajectory.

Okay, we can talk offline, because we have battles over which features we think are important. There are a lot of assumptions made from that ambiguity, or some big client comes and says, oh, I need this, and then you find out they're the only one using that feature.

It can happen. Absolutely, let's chat after.

Pretend for a moment that you're joining a small company building an open source product that is firmly in that guess category. What are your first couple of steps? Around who's in your broader community, collecting this kind of information, putting together surveys?

Yeah, I would always start with the existing information before going out and asking people. Something I didn't say in terms of order of operations: before you run a survey, you're thinking about how many people you're trying to reach, and in this case we don't yet have any information about who they might be. So start with what you can already measure: what's happening on GitHub, who's interacting on GitHub, on Slack, on Stack Overflow, on Discord, wherever else your community members are active. That gives you a general sense of, okay, is this three people? Ten people? More than twenty? You get a better sense of your population size. Then, in smaller spaces, I really like the discussion forum format. Instead of going out and asking people directly, which can be kind of intense, especially if you don't have a long history with them, a forum allows for a general conversation where people can share as much information as they want, but also hear from others, which encourages more people to share. They're like, oh yeah, I do that too, or, I wouldn't have thought of that, but now that you're saying you use it in this case, I'm interested in that. And you, as the person listening, realize, oh, I'm hearing a use case, and I'm hearing that it resonates with other people. So if you can bring your community together in a forum environment, which can be virtual or physical depending on how your community gathers, I like that as the initial starting point. And for those initial discussions, keep things really general, maybe about use, interests, and direction, before getting into any kind of demographic information, because I think that only comes into play when things get larger and you're thinking more about your community population. That's roughly where I would start. And again, if you're setting up a forum like this, it's great to have your intentions and your collection practices listed up front. I typically never record these sorts of things, but I will take a lot of notes.
I think we ran a session like this for another community, and all of my identifiers were codes: I took each person's first initial and added a one, two, three, and that's how I recorded names; there was no list of who people actually were. If you were there, you could potentially remember who said what, but the way the information was recorded didn't reveal it to anyone who wasn't. So it preserved some recognizability for the people who were present without creating more detail than people were willing to have shared beyond that room. There are also a lot of popular forums that operate under the Chatham House Rule. Are people familiar with that concept? For those who are not, it basically says there will be no attribution in the conversation: general notes may be taken, but nothing you say will be attributed back to you. That allows people to be freer in their comments, because no one will come back and say, oh, someone from Google said that; that won't happen when you all agree to this. So setting a code of conduct or practice specific to the forum can also help participants know how much they can share and how much of what they say will be attributed back to them.

What was the term for that? Chatham House Rule. Chatham, C-H-A-T-H-A-M, I think; I may have missed an H when I said it. All right, I think we're at time. I'll be around for the next two days, but thank you. Thank you.