Welcome, everybody, to Community and Outreach session number one. We're scheduled to have four talks today; you'll see those titles on the screen. I think I might need to announce the code of conduct, unless someone else is popping in to do that. Make sure to visit the code of conduct on the useR! website if you'd like more details, but please treat everybody with respect, and visit the details of the code of conduct if you're not clear on what that means. That also applies to all of the online areas, the chat and so on, so please be aware of that.
For our first talk, the speaker is actually not here today, so we are going to use this time to have a discussion about community topics. We'll do that for about 15 minutes from now. A lot of the people on this call are obviously involved in communities, so we're going to introduce ourselves and have an open discussion about community efforts in the R community. We would highly encourage audience members to ask questions in the Q&A and the chat. It's a little easier if you do it in the Q&A, because we'll see it more easily, and that way we can engage with you during this time when we don't have a talk.
Since I'm already talking, I'll introduce myself. My name is Erin LeDell. I'm involved in the R-Ladies community, and also the rOpenSci community, though not as much anymore; Stephanie will say a lot more about that. I'm one of the co-founders of R-Ladies Global. At this point we're a very large organization that facilitates running meetups all over the world. There are slightly over 200 chapters now and we're still growing; even through the pandemic, we've shifted to online meetups.
I will pause there because we'll talk more about it, but I just wanted to introduce myself. Next up, we'll ask Stephanie to introduce herself.
I'd be happy to, thank you. My name is Stephanie Butland. I'm the community manager for rOpenSci, and I won't say too much because I will be giving what was scheduled to be the second talk in this session. I will say that I am in the Slack right now, and I know Saranjeet is there. Others will be joining, so anyone here is quite happy to engage with you on the Slack. There's a channel, I don't have it open now, but it's specifically called community and outreach for session 1A. We can answer questions there.
All right, Lluís, do you want to go next?
Yeah, I'm Lluís Revilla, and I will present later on about my work and analysis of package submissions and reviews. If you have any questions about submitting a package to CRAN and about the process, ask them on Slack or reach me on Twitter or any other way, and I will try to help you.
All right, next up, Saranjeet.
Hi, my name is Saranjeet Kaur. I'm representing the R Developers Guide in this session. I'm also involved with the R-Ladies community, the global team as well as the local chapter in Pune, India. If you have any questions for me, I'm on Slack and we can chat over there.
All right, I don't know if Gwen wants to also introduce herself. She's kind of behind the scenes, but I'll let her jump in if she wants.
Can you see me? Hi, I'm Gwen. I wasn't really listening to the other introductions, but I'm involved in R-Ladies DC, and I also just founded an R user group at Harvard, where I'm a postdoc in the business school. I'm behind the scenes, so excited to be here, and can't wait to hear what everyone else has to say.
Okay, great.
So yeah, we didn't have a lot of advance warning that there was going to be this discussion, so we're just kind of going by the seat of our pants, I think is the expression. We're pretty open to talking about anything. There are different types of R communities. There are communities revolving around networking and people and identities, and we've got a number of groups that fit into that category: R-Ladies; MiR, which stands for Minorities in R; and you could probably consider the LatinR community in that area as well. Then we have a number of other types of communities that are more centered around different projects in R or different open source tools in R.
One thing that I feel, and that I think a lot of people in the R community feel, is that community is actually one of the core foundations of R. When comparing programming languages to one another, we don't like to say anything's better or worse than anything else, but I think one of the things that R is especially known for is being very inclusive and having a very strong community vibe. I think it's great that we have that, and I would love to hear what other people think about the different communities that we have. You could mention a specific community, maybe your experiences with it, maybe your experiences organizing within the community, any unique things in our community or issues to mention. Anything is on the table. So if any of our panelists want to jump in, or if we have any people on the chat, go ahead. I think there's at least one person monitoring the Slack; I will try to jump into that as well.
All right, we've got one question in the Q&A. Thank you very much, Mike Mahoney.
"My question is: what makes a community useful? What are the shared traits or goals that, in your view, are helpful for a community to pursue?" I'll put that out to the panelists and see if anybody has thoughts on that.
Actually, Erin, this is Steph. I'm thinking, since you don't get to give a talk and you have experience, maybe you can tell us.
You're right, I should do a little bit more of the heavy lifting here, since you're all on deck soon. So what do I think makes a community useful? For me, it's a place where I always feel at home, where I can feel comfortable. I think that's what makes a community the most useful to me: a place where I always know that I can seek comfort, let's say, if there's any kind of challenge, whether it's a technical or social or emotional challenge that we deal with working in R and working with other people or groups in R. For me, at least, that's what R-Ladies is. It's a place where I can always get support if I'm struggling with any of those topics. We have quite a range, and pretty much anything is on the table at R-Ladies. We have a very strong R help community, so I know that if I have a question, and I've looked on Stack Overflow, I've looked at all the docs, or wherever else people seek help for R, I can almost always get help there. That's a really nice feeling, and that's what makes it useful for me. In terms of shared traits or goals, I think it depends on the type of community, for example if it's an identity-based community like R-Ladies.
I think our shared goal is to make a more comfortable space for women and gender minorities in the R community, and also to help people's careers, help them become R package developers, help them eventually join R core, so that we have more parity there. There are a lot of different goals that we have. Representation is the main one, and that's something that we share together. When it comes to open source, people usually feel quite passionate about whatever package they're building, or maybe it's a whole network of packages like rOpenSci. If you have a shared goal, a lot of the little stuff that you deal with day to day is just not as important, so I think people can come together more when there is a shared goal. It's good to form a community around that. For example, in R-Ladies we have a whole mission statement about what we're trying to do. And probably in every community, even if it's just a community around an open source package, people feel passionately about it and can share that together. So hopefully I answered that question. We've got a number of other ones popping up there; thank you, community, for helping us out. Does anybody else from the panelists want to speak to this question, what makes a community useful? Okay, so we'll go to the next one.
I have one word to share, and that is transparency. Transparency moving toward trust. It's not an easy thing to get, but I find that in one-on-one relationships, and then in the community in general, if we can all be open about what it is we want to get out of it, you can find like-minded people that way.
I would also add to this that for transparency we need a lot of communication. It's important to have open communication and to be open about what you expect and what you want to build or share with people, because otherwise people have different interests. It can start well, but then the community can start diverging a lot and we don't reach the same goal for everyone, and that way it's less useful for all.
I'd also add a sentence. If you are part of a community, it's very easy to contribute to open source projects, because programming communities usually have many open source projects going on. As you are part of these communities, you can always ask whether you can contribute or how you can help these projects.
That's a great point. Yeah, even if it's something like R-Ladies, it's a great place to find help for actually getting involved in other communities as well. We have about five minutes left. I'm going to ask another question that I think is a good one, by Michelle in the Q&A: "Is there a list of the different communities that exist in R, and how does one get involved with a community?" I'll try to start answering that and other people can chime in. I don't know if there's a list. I think that there should be a list on the R Foundation website; I'm not sure if there actually is. I do think that on the conference website there might be a list of the communities that are represented here. I know the cute little marmot. Angie, do you want to say something about that? Is there a list?
So, yes, if you attended the welcome session, there was a page which mentioned all the communities that are part of the useR! 2021 conference: MiR, R-Ladies, AfricaR, LatinR and many more. So yes, you can get some contacts from there.
Yeah, so those are what I would call identity-based communities, where you're forming a group around a geographic location or some other identity. Then we have a number of open source communities, and rOpenSci is a big one. There were a couple of related ones. One that I used to be involved in is called rOpenHealth, which is like a little tiny fork off of rOpenSci. I think there might be a couple of other ones like that. There's R spatial, a whole spatial community; I think the keynote before this was on that topic. So there are other communities revolving around different subfields. Someone else has a question: "Do we have an AgriR, so agriculture R users? Shall we have one if it doesn't exist?" That's a great question. Maybe there should be an AgriR. There's no reason why you can't organize one. If you are somebody who has an idea, it doesn't need to be a gigantic community; maybe it's just your little field. What I would recommend is to reach out to some of the other community leaders, like myself or Stephanie or Saranjeet. We're all involved in those types of communities and we could give you some tips from what we have with R-Ladies, so that you could start forming meetups around that. For smaller communities starting out, I think it would be good to start virtually, in a Slack or something like that, and then maybe eventually have online meetups. Slack is a pretty good way to start out, or some other type of chat-based service. Does anyone else want to add to that? We have two minutes before Stephanie's talk.
I want to add that there is also Bioconductor. It's a large organization around the specific topic of bioinformatics. I sometimes feel that it doesn't get mentioned a lot at general conferences, but it has a lot of users and a lot of developers who develop a lot of packages, and it's widely used in academia. So if you are looking for bioinformatics or some kind of data around biological structures, you might find your community or your package there.
That's a great point. Bioconductor has been around for a very long time and it's actually quite a large community. They have their own, I don't know if it's a board or advisors or a committee of some sort. Another thing is that you could get involved in, or even encourage, an existing community to do more organizing. If you want to do things for an existing community and you don't see them happening there, and I'm not saying that's the case with Bioconductor, but if you want to see meetups or things like that, you could just reach out to the organizers of that community as well.
Okay, Stephanie is up next, so I'll hand it over to you, Stephanie. Let me introduce her talk: she's going to talk about rOpenSci's model for managing a federated open source software community. The work is by Stephanie Butland, and she has two co-authors, Lou Woodley and Karthik Ram, who are not on here today but are part of the talk. All right, go for it.
Thank you. Thank you very much. The work I'm going to be talking about today is a collaboration between Karthik Ram and me at rOpenSci, and Lou Woodley at the Center for Scientific Collaboration and Community Engagement. In preparing my talk I thought a lot about what my intentions were. I experience great joy in helping people recognize themselves, and so I'm hoping to connect with two audiences today.
As we were just discussing, maybe you're someone who helps organize an R-Ladies meetup, or you advocate for research software engineering as a profession. You're building up resources for an underserved community, or maybe you're trying to start a code review club in your department. I'm hoping that sometime during my talk you might go, "Hmm, I'm helping build community. Wow, we're putting out blog posts and hosting meetups, but I never realized there was a framework for looking at how these activities align with our goals." And audience number two: if you're someone who's thought you might want to participate in rOpenSci, by the end of my talk I hope you go, "Aha, there's a guide to help me find all the places that I fit."
Ten years ago, when rOpenSci was founded, the R landscape was a very different place. The infrastructure for developing packages was pretty sparse. RStudio only had about four employees. And anytime people needed to access some data, they needed to develop their own bespoke software solution. No one else could replicate that without writing a lot of their own code. So rOpenSci was created in part to fill that gap. Initially we provided four or five tools, and then over time we worked with community members to develop new ones as needs arose. In that way rOpenSci became a kind of clearinghouse for packages for a lot of people. Over time the ecosystem got richer, it became more accessible, and it was a lot easier for people to develop their own packages. So our focus shifted from providing users with a different tool for every different data source to helping people develop their own, by developing and sharing standards for doing that. These days, rOpenSci is best known for our rigorous yet friendly open software peer review system. This is supported by our dev guide to creating and maintaining good research software.
We continually iterate on these in response to questions, open issues and pull requests from people in the R community. More recently we've launched three new projects that my colleagues are going to be presenting later this week. Mark Padgham will talk about our system for statistical testing and software peer review. Maëlle Salmon will give a tutorial on HTTP testing in R, which is the subject of a new online book. And Jeroen Ooms is going to give a keynote on Friday on a whole new infrastructure platform called the R-universe, which allows you to easily maintain and distribute your own collections of reproducible tools, tutorials or research compendia. From our first blog post in 2012, our first unconf in 2014 and the first of many community calls in 2015, rOpenSci has been a really community-oriented and responsive organization. I found my dream job five years ago when I joined as their community manager to help build a community of practice around our processes and our platforms. All of the work is done by this lovely group of people, who together make up the equivalent of about five full-time staff.
So who is the rOpenSci community? Anyone who interacts with rOpenSci can consider themselves a member. This includes people who share a use case for a package, attend a community call, report a bug, or submit a package for peer review. The humans of rOpenSci and the trusting relationships among them are our secret sauce. Part of my job as rOpenSci's community manager is figuring out how people interact with each other and with our infrastructure, and figuring out what scaffolding we need to create in order to support these interactions. A good starting point for looking at this is the community participation model developed by our collaborators at the Center for Scientific Collaboration and Community Engagement. It's a mouthful. It's easiest to remember them by the acronym, and it sounds a bit like a cheer: CSCCE.
They are my second favorite community, so it's deserving. They've written up this model in a guidebook that's accessible at the URL on this slide. The framework describes four modes of engagement that can happen between staff and community members, or among community members themselves. Going from left to right, they are convey/consume, contribute, collaborate, and co-create. On the left you can see this little network diagram, where the center node would be the community manager or staff people and the outer nodes would be community members. As you move towards the right, there's less and less involvement and drive by the community manager or staff, and more and more interaction between the community members. Mapping all of our activities onto this framework gives us a way to evaluate how the things we're doing meet our community engagement goals and how they align with our mission. Currently, rOpenSci engagement happens primarily in the first three modes.
In convey/consume, typical activities are things like read, watch, listen. In rOpenSci this translates to using a package or reading our newsletter or blog. A slogan for this mode is "Here's something interesting."
In contribute mode, typical activities are things like comment, vote, like, or tag. In rOpenSci this includes things like sharing a use case for a package or writing a blog post. A slogan for this mode is "Give us some feedback."
In collaborate mode, we start to create the scaffolding to help bring people together to do things. Typical activities are discussion, knowledge exchange and production. In rOpenSci this would be things like reviewing a package, improving your package in response to reviews, speaking in a community call, or asking and answering questions in one of our fora. A slogan for this mode is "How can we work together?"
In co-create mode, activities are things like integration and synthesis, multi-directional learning and co-production. In rOpenSci, our past unconferences are one example. We brought people together; we never told people what to work on, and we didn't tell them who to work with. Working together, they created things like the skimr package that live on today. A slogan for this mode is "What shall we do next?"
Community calls provide an opportunity for multiple modes of interaction. For example, people attending a community call are consuming information. We have an open public GitHub repo where people can suggest topics for community calls and comment on other topics. During a community call, people can add their questions and answer each other's questions in the shared notes doc, and some people are motivated to write a summary blog post about a community call. And in collaborate mode, this is one of my favorite examples of collaboration in rOpenSci: a week before a community call I bring all the speakers together, and they think out loud about what their content is going to be. They ask each other questions, and this heavily influences what you actually see presented in the community call itself.
The community champions mode, across the bottom, cuts across all these other modes, and it refers to community members taking on emergent leadership roles. This can happen inside or outside the community. Some examples in rOpenSci would be a manuscript reviewer using our code standards for evaluating a package and giving recommendations for improvement inside the manuscript review, but completely independently of rOpenSci. Or a faculty member using our peer review guidelines in the curriculum for a master of data science program. Or it could be someone giving a talk at a conference or meetup about their experience in the package review process. So let's zoom back out to look at the model as a whole again.
One of the things that it helps us grapple with is that communities are not static. There is no one of these modes that marks a healthy community. There's no one correct place for any person to be participating. This is going to vary depending on a person's motivations and what their limitations are at any given time, and we know these things change. Clearly, rOpenSci is putting a ton of resources into creating and managing a lot of programs, but are they the right programs for us, and are they the right programs for community members, to address what people need now?
Well, the ecosystem is changing again. Outside of rOpenSci, in recent years more and more organizations and initiatives are arising that promote open and reproducible research, code and publications, and advocate for careers in these areas. Even until a few months ago I was feeling pretty stuck in deciding what our community strategy should be moving forward. I knew we needed something more nuanced than a survey. And so I had a conversation with Lou Woodley of the CSCCE, and I was reminded in this conversation how much I am energized by having conversations with people. So together with Karthik we designed an interview-based study to get at people's perceptions of what they can get from rOpenSci that they cannot get anywhere else. We designed this interviews project to interview a diverse, representative range of community members. So far we've completed 10 of 20 interviews, and they're turning out to provide a neat way of seeing the different modes people participate in, as well as their movement between these modes over time. So I'm going to give you one speculative observation based on our preliminary results, and it does echo something that Karthik and I felt that we already knew. In this diagram, the concentric circles represent different levels of intensity of participation in rOpenSci activities.
The yellow center is staff: pretty intense, but we get paid to do our jobs. The outer circle would be more peripheral activity, things like attending community calls or reading the blog. One example path is this set of arrows that's marching toward the center. This might be someone who is a user of rOpenSci packages, at some point volunteered to review a package, got the confidence and upskilled so that they submitted their own package for review, and they iterate around here, participating in these modes for a while. And then some people become editors for software peer review. Now, just like people's careers, more typically these paths are not a straight-arrow trajectory, and some of these zigzaggy things are more typical, where someone might submit a package for review, write a blog post about it, and then participate more peripherally, asking and answering questions, but then also externally, in their own workplace, maybe they're advocating for better code review processes. This is something that happens outside the community that we're not necessarily aware of unless we interview people. So practically speaking, it's going to be helpful to expose these paths so that others might recognize themselves, and it also tells me how we can guide people in future based on the things they're participating in now.
So we're left with needing to complete the interviews. We'll do an audit of our current community activities and see how we're signposting participation in those, and we'll be writing a manuscript. I know Lou is particularly excited about mapping these practical things we're seeing onto community practice theory, if we can swing that. So while we haven't yet exposed paths through activities in rOpenSci, I think we've done an okay job at creating some signposts where people can actually get started. For my last few slides I'm going to come back to my audience number two.
If you thought that you might want to get involved in rOpenSci: a lot of people ask me how they can do this, and I realized at some point that it would be more equitable and much more sustainable if we created an online rOpenSci community contributing guide.
Stephanie, two minutes to the Q&A.
Perfect, thank you. The guide asks you to consider your motivations and goals. You choose from a list of actions you might take, and based on your choice it takes you to instructions and resources that help you contribute. Chapter two is called "What brings you here? I want to..." and it's divided into these five sections: discover, connect, learn, build, or help. Based on your choices here, you're taken to chapter three, which has descriptions of resources in rOpenSci like the code of conduct, communication channels, blog guide, packages and docs, task views, use cases, open issues lists, dev guide, stats peer review and more. I will give you just two examples. Let's say you choose to connect: you want to gain exposure in the open science R community. One of the options would be to explore open issues in rOpenSci packages and consider submitting a fix. If you click on "address an issue", that takes you to chapter 3.8, the issues list, and one option for how to contribute is to look at the issues list. If you click there, in the guide but also in the online version of these slides, this takes you to the rOpenSci GitHub organization, to all of the open issues that are labeled "help wanted". People actually want your help. If you see an issue that interests you, you take a look at the project's contributing guide, comment in the issue to discuss your approach first with the author, and then create a pull request and submit your solution.
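The "help wanted" lookup described here can also be expressed as a GitHub search query. The following is a minimal sketch, not part of the talk or the contributing guide: it assumes the organization's GitHub name is `ropensci` and that the label text is exactly `help wanted`, and the function names `help_wanted_query` and `help_wanted_url` are hypothetical. It only builds the search URL; fetching and paging through results is left out.

```python
from urllib.parse import urlencode

# Public GitHub REST endpoint for searching issues and pull requests.
GITHUB_SEARCH_API = "https://api.github.com/search/issues"


def help_wanted_query(org: str = "ropensci", label: str = "help wanted") -> str:
    """Return a GitHub search query for open issues carrying `label` in `org`.

    The defaults are assumptions: the talk names the label but does not
    spell out the organization's GitHub handle.
    """
    return f'org:{org} label:"{label}" is:issue is:open'


def help_wanted_url(org: str = "ropensci", label: str = "help wanted",
                    per_page: int = 20) -> str:
    """Build the full search URL; fetching it is left to the caller."""
    params = {"q": help_wanted_query(org, label), "per_page": per_page}
    return f"{GITHUB_SEARCH_API}?{urlencode(params)}"


if __name__ == "__main__":
    print(help_wanted_url())
```

The same query string can be pasted into the search box on GitHub's website, which is likely closer to what most first-time contributors would do.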
A second option would be if you want to build: you want to influence package development. You could volunteer to review a package. If you click there, it takes you to 3.10, the dev guide, and under how to contribute, "fill out that form to get started"; that link takes you to the form. You'll be contacted by an editor when a submitted package fits your profile, and first-time reviewers are actually paired with more experienced reviewers, so we lower the fear barrier there. So I'll bring it back here for discussion. Where do your community activities fit in the community participation model? What paths do you see people taking? Do you recognize yourself on a path through rOpenSci? Now I think you understand that I do love a good discussion, so I want to know what you think in the question and answer.
Thanks. My question would be for a community that is starting to get large, let's say R-Ladies for example. How would we go about finding somebody as skilled and knowledgeable about community organizing as you, Stephanie? There are lots of great communities out there, but there are very few that have a core team dedicated to organizing. I don't know if that's a good question.
Yeah, I would say first of all, five years ago when I started, this was not a thing, and so I was lucky to have a training fellowship in this area. I started out as a research scientist publishing papers, like many people here. This community, CSCCE, is a wonderful, warm community that I live in, and there are lots of people doing community management there, some who are doing it as volunteers and some who are paid. That's my number one recommendation for anyone who's even slightly curious about this. You could get in touch with me; their Slack is open and I can put you in touch. Highly recommended.
Thank you. So we have two questions now in the Q&A.
One of them is pretty open-ended: what are the next steps for rOpenSci? What do we see on the horizon?
Well, on the community side, that's partly exactly why I'm doing this survey: to try to get rid of my own biases and figure out what community members need. But also, I think some of the talks that are happening later in the week, like our statistical software peer review system and the R-universe, are the big exciting things. I'll be doing community work to support that and get more people involved, particularly groups of people who are typically less visible and less represented in these types of activities. So that's a thing I'll be paying attention to.
Great. Another very important question. This goes to any open source community, really: how are you handling main developers of packages who leave or abandon packages? Somebody's saying they've had that issue. Is that something that you deal with within the rOpenSci community?
Yeah, I have to say, this is where I give big props to my colleagues. Part of what rOpenSci does is make a commitment. When someone has their package pass software peer review and become part of our organization, you still maintain control of your package, but we commit to helping find a new maintainer for packages. We do a lot of that, and that's on my colleagues like Scott Chamberlain and Maëlle Salmon; they do a great job with that. One thing I think we do particularly well is that when a package needs a new maintainer, we do try to balance, let's say, representation, genders and things. Not just balance, but reaching out to people who aren't necessarily what we call the usual suspects and getting people involved in that. They really put an emphasis on helping people get started if they're not initially comfortable with that.
That seems like a huge value: by doing the initial work to get your package into rOpenSci, you're kind of guaranteeing, in a way, that it will be taken care of. I mean, I'm not saying people should do the work and then disappear, but I think that sounds like a huge value, because that's a big problem for people. People get jobs; your own motivations and your own needs change, and it's okay that this is not necessarily your priority anymore. Maybe you've moved on, but lots of people still rely on your package, so we try to help people do that. Yeah, so we should probably wrap up now. And thank you for the questions; we'll try to get those answered in the Slack or wherever we're posting the questions. And thank you very much, Stephanie. We'll have our next speaker join now, so let me just pull it up. She's going to be talking about the R Developers Guide: Saranjeet Kaur Bhogal, and she has two co-authors, Heather Turner and Michael Lawrence. So over to you, Saranjeet. Hello, everyone. I'm Saranjeet Kaur Bhogal. I'm very glad to be presenting my talk on the R Developers Guide at useR! 2021. I would like to give a brief background on how this guide came into existence. After useR! 2020, I got to read the discussion on the R-Ladies Slack about building a user-friendly R guide, which could help a new person to be able to contribute to R Core. During this time, I contacted Heather Turner over Twitter and asked if we could write a funded proposal for such a guide. Heather very promptly replied in the positive, and the seeds for this work were laid then. The next task was to find a funding source. Heather pitched this proposal to the R Foundation, and they agreed to fund me a generous amount. Meanwhile, Heather Turner and Michael Lawrence agreed to be my mentors for this project.
Around the same time the R Contribution Working Group was formed, and I was funded to work on this project from February 2021 to May 2021. Many people joined us on the way as contributors and reviewers. This guide has benefited from the contributions of all of them. A special mention to Carol Willing. This guide is highly inspired by the Python Developer's Guide, and Carol has immensely supported it by being a representative of the Python community in this work. The question might arise in one's mind: why do we need such a guide? Yes, there definitely are documentation and articles written on how to contribute to R Core, but most of them are dispersed and not easy to navigate for a newcomer. We have taken inspiration from many of these materials while preparing several sections of this guide. Our hope is also that this guide will facilitate the onboarding of new contributors to R Core. Some statistics now. If you view the R Project website, it mentions the names of 20 members in the R Core development team. Whereas the Stack Overflow Developer Survey of 2020 states that out of approximately 57,000 respondents, 5.7%, which is approximately 3,000 respondents, said that they mostly use R. This gives us an indication of the large potential audience who could benefit from and use this guide. Where is the R Developers Guide hosted? As of now, the R Developers Guide source code is hosted on the Forwards GitHub rdevguide repository and compiled as the R Dev Guide, which you can find at bit.ly slash rdevguide. Inspiration. The format and structure of the chapters in this guide are highly influenced by the Python Developer's Guide. We thank the Python community for all their support. The image is of the first page of the R Developers Guide. It has the official R logo at its top left. There's a scroll bar which separates an index on the left from the main body on the right.
The main part mentions the guide name, working group name, date and names of the author, mentors, contributors and reviewers. Now let us look at the various chapters of this guide. The first chapter is the Getting Started chapter. In this chapter, we describe how to install R and the tools required to build R and R packages on the Windows operating system. We started with instructions to build R on Windows since Windows is the most widely used platform among general R users, especially in the parts of the world that are currently underrepresented in the contributor community. R uses a major.minor.patchlevel version numbering scheme. Accordingly, there are three main releases of R that are available for installation: the official release, also called R-release; the patched release, also called R-patched; and the development release, also called R-devel. R-devel is the development version for the next minor, or eventually major, release of R. Mostly bug fixes are introduced in R-patched, while R-devel is for introducing new features. If you want to build R and R packages on Windows, then you will need to install Rtools and a distribution of LaTeX. The next chapter is bug tracking. This chapter explains what you could do if you come across a bug in R. Not all issues are bugs, and the first part of this chapter helps you distinguish between what is a bug and what might not be a bug in R. Once you have confirmed that there is a bug, check if it is already reported. If you have confirmed that a bug exists and it has not been reported, then you should definitely report it to the bug tracker. For reporting a bug on Bugzilla, you need to create an account on it. To create a Bugzilla account, please send an email to bug-report-request@r-project.org, from the address that you want to use as your login, briefly explaining why you need an account. Reporting a bug is not the only thing you can contribute.
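To make the version numbering described in the Getting Started chapter concrete, here is a minimal Python sketch. It is not code from the guide or from R itself: the names `RVersion`, `parse_version` and `bump_kind` are hypothetical helpers introduced only for this illustration. It simply splits a major.minor.patch string into its three parts and reports which part changed between two releases.

```python
from typing import NamedTuple


class RVersion(NamedTuple):
    """Hypothetical container for a major.minor.patch version."""
    major: int
    minor: int
    patch: int


def parse_version(text: str) -> RVersion:
    """Split a string such as '4.0.5' into its three numeric components."""
    parts = text.strip().split(".")
    if len(parts) != 3:
        raise ValueError(f"expected major.minor.patch, got {text!r}")
    return RVersion(*(int(p) for p in parts))


def bump_kind(old: RVersion, new: RVersion) -> str:
    """Name the most significant component that differs between two versions."""
    if new.major != old.major:
        return "major"
    if new.minor != old.minor:
        return "minor"
    if new.patch != old.patch:
        return "patch"
    return "same"


if __name__ == "__main__":
    print(bump_kind(parse_version("4.0.4"), parse_version("4.0.5")))
    print(bump_kind(parse_version("4.0.5"), parse_version("4.1.0")))
```

Under this sketch, a change only in the last number is a patch-level release, the kind that comes out of R-patched, while a change in the middle number is the kind of minor release that R-devel leads to.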
You can also test an existing bug or, if possible, provide code that fixes the bug. I will explain this in the upcoming slides. You may find a bug in the code or documentation of either the packages supported by R Core or packages that are not supported by R Core. To identify who the maintainer of a package is, run maintainer("package_name"). Packages that are maintained by R Core will have the maintainer listed as R Core Team, R-core@r-project.org. For reporting a bug in R, run the command bug.report(package = "package_name"), whereas in RStudio run utils:::bug.report(package = "package_name"), and you will be directed either to Bugzilla or to the specific issue tracker for that package. To write a good bug report, you should provide a minimal reproducible example, mention the software architecture and, if data are required, use built-in datasets as far as possible. By doing this, you save the time of the R Core developer who will be addressing the bug report you submit. This chapter also provides examples of bug reports that are already submitted to the bug tracker. The next chapter is reviewing bugs. You could help by reviewing the numerous bug reports that are already present on Bugzilla. Use the advanced search option if you want to find reports on specific topics. What can you do to review existing bug reports? Well, you could check whether the bug is reproducible and whether the steps to reproduce it are mentioned, and if they are not, then you could help to provide those. You could also check whether information like the machine architecture, the version of R, and the operating system on which the bug occurred is provided. If not, please add it. A review of older bug reports might lead to fixing the bug sooner. The next chapter is source of functions. You may want to have an overview of the R code base just out of curiosity, or maybe to gain more insight into what a particular function is actually doing.
This chapter explains how you can find the R and C source code. The next chapter is lifecycle of a patch. Suppose you come across a bug in R and you have an idea of how to fix it. This is a situation when you could submit a patch. But what is a patch? A patch is essentially the set of differences, that is, either additions or deletions, between two versions of code: the original version and your proposed version to fix the bug. To be able to submit a patch, you need a Subversion (SVN) client and the latest development version of R. Follow the guidelines for making good patches mentioned in this chapter. Whenever possible, it is a good idea to get your patch reviewed before submitting. I will share links to where you can get help with reviewing your patch. Also, whenever possible, help by reviewing patches already submitted to Bugzilla. The next chapter is documenting R. You can submit bug reports and patches for the R documentation as well as for the R code. The help files for R functions are written in the R documentation (.Rd) file format, with a header, body, and footer, which are described in more detail in this chapter. You can also help by correcting typos in the documentation. The next chapter is testing pre-release R versions. It is also possible to help by testing pre-release versions of R. Whenever possible, use a fresh library for testing or, even better, use a virtual machine for testing. This will ensure that you do not damage your existing R installation. What can you test? You can test your own programs, your own workflows, your special ways of installing and setting up R, the things that interact with external libraries, and interactive R packages. The next chapter is where to get help. To get help on various topics, please use either the Slack or the mailing lists.
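The description of a patch as the set of additions and deletions between two versions of code can be illustrated with a small Python sketch. This is only an illustration under stated assumptions: it uses Python's standard `difflib` module rather than Subversion (for R itself a patch would be produced with `svn diff`), and the file name `add.R` and the tiny R function inside the strings are made up for the example.

```python
import difflib

# Original version of a (made-up) R source file, one string per line.
original = [
    "add <- function(x, y) {",
    "  x - y",
    "}",
]

# Proposed version that fixes the bug: the function should add, not subtract.
proposed = [
    "add <- function(x, y) {",
    "  x + y",
    "}",
]


def make_patch(old_lines, new_lines, filename="add.R"):
    """Return a unified diff between two versions as a list of lines."""
    diff = difflib.unified_diff(
        old_lines,
        new_lines,
        fromfile=f"a/{filename}",
        tofile=f"b/{filename}",
        lineterm="",  # the input lines carry no trailing newline
    )
    return list(diff)


patch = make_patch(original, proposed)

if __name__ == "__main__":
    print("\n".join(patch))
```

In the printed diff, the line starting with a minus sign is the deletion from the original version and the line starting with a plus sign is the addition in the proposed version; the unchanged lines are kept only as context.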
We have created various channels on the R-devel Slack, like the bug reports for review channel, the R-devel help channel, the code documentation channel, the code translation channel, and the patches for review channel. You could also follow various mailing lists like the R-devel mailing list, the R-help mailing list, and the R-package-devel mailing list. The next chapter is news and announcements. The latest news and announcements from the R Core world can be obtained from various official sources. You can follow the official public blog of the R Project. Announcements and updates about the conferences that the R Foundation supports are available on the conferences page of the R Project website. Short to medium-length articles covering topics that could be of interest to users or developers of R are featured in the R Journal, which is the open access refereed journal of the R Project. There is a separate news section on updates to R, though R Core members also contribute regular articles on big changes to R. You can also follow the R-announce and the R-packages mailing lists. The R Foundation maintains a Twitter account with the handle @_R_Foundation, which can be used as a reference point for interaction with the R development community. The next chapter of this guide is developer tools. This chapter focuses on how Windows users can get the tools essential for development. There are resources provided for getting the Subversion client, grep, and Git, and help with various common tasks on GitHub. Although R Core's code is maintained in SVN, some of the recommended packages are on GitHub. How can you contribute to this R Developers Guide? To be able to edit this guide, you need to have a GitHub account. After you log into GitHub, click on the edit icon highlighted with a red ellipse in the image. This will take you to an editable version of the R Markdown file that generated the page you are on, and you can suggest your edits there.
To raise an issue about the guide's content or to make a feature request, use the issue tracker. We request the maintainers of and contributors to this guide to follow the code of conduct. This is just the beginning. Please join us and help in the further development of this guide. As further work in the near future, we will include materials on how to translate warning messages and errors in R to non-English languages. We also want to expand this guide to include instructions for non-Windows operating systems. Thank you. This is the end of my presentation. These slides were created with the R package xaringan. I'm open to questions now. Thank you for the talk, Saranjeet. Thank you. Let's see, there are no questions on the Q&A right at this moment. Let's see if there's anything on Slack. There are a few questions about where to get the slides; Saranjeet is going to share a link to the slides on Slack after the whole community outreach session is complete. So if you're looking for a copy, if it was a little bit hard to read, you know, the YouTube was a little bit blurry on the text, we'll get you a good copy of that afterwards. We have four or five minutes for questions. Kirill is asking a question on Slack: can you also contribute to the guide by filing issues about resources that are not mentioned? Yes, Kirill, you can do that. So I'm answering over here. You can open an issue using the edit button if you find any issue with any of the chapters that are present. Or if you want to add any materials or resources that are still not mentioned in this guide, please do add the issue and we'll go through it. Saranjeet, I can ask a question. What would you say to people who are only familiar with GitHub but don't know about SVN, and maybe that's some kind of, you know, friction point for people? What could you say about that? So there is a chapter called developer tools.
So if you are inexperienced with SVN, and it's very likely that it is new for many, you can go to that chapter and there are resources that you can read from there. There are many resources that help you with the Subversion client and the commands that are there. Okay, so there's a related follow-up question. David Lindelhoff says, because of the visibility, I'd much rather contribute to projects that are on GitHub. Do you know if there are any plans to move R Core away from SVN to GitHub? I'm not sure I'm the right person to answer this, but yes, there is a GitHub mirror made by one of the R Core members, though it is not official. But if in future it does move to GitHub, that would be the repository that we would be moving to. But as of now it's not official, and it is still on SVN. Yeah, that's a good question. I know that people have asked about that before, and I would just say, you know, it is a huge effort to move all of the infrastructure away, which I'm sure people are aware of. And yeah, I think, you know, had they decided to start R today, they probably would have done it on GitHub. But it's a huge, huge undertaking. You know, perhaps there's some, I don't know, future effort or grants around that where they could get some help on that. But yeah, it might be a while. So thanks for the question. All right, we have one more minute for questions. There's someone on the Slack asking, are you familiar with recent developments in integrating GitHub with the SVN workflow? Yes, this is the one that I was pointing to. Okay. And there are also some links shared on Slack with the different scenarios, things like that. But it sounds like, I mean, just even the developer guide itself is a sort of document that you can contribute to, and it's on GitHub.
You know, that seems like a good place to start, and then, if you get to the point where you're ready to start contributing to R Core, you can sort of clear that last hurdle of moving from GitHub to SVN. Yes, it is. A great effort has gone into this project of writing this guide, and we would want people who are more experienced to contribute to this guide as much as they can, so that the gap that is there between the more experienced developers and the new developers is reduced. So yes, we do invite others to join us in this project. All right, well, let's wrap it up there, and we will leave a minute to get our next speaker ready. Thank you, Saranjeet, for the talk. Thank you. All right, so coming up next, we've got the final talk of the series. We've got a talk by Lluís Revilla Sancho, and the title of the talk is Packages, submission and reviews: how does it work? And we're about 30 seconds early, so here we go. It switched, so feel free to start off, Lluís. Thank you. And, well, hi, I'm Lluís Revilla. I will explain that now in the talk, but I was just a normal user and then I was curious about how it works. And I just wanted to share this experience with you and how it works. But now I think that we really switch to the video. Hi, this is Lluís Revilla presenting Packages, submission and reviews: how does it work? When I first submitted a package, my advisor asked me, whoa, two days just for acceptance, this is quick, do we need to do something else? And I just didn't know if two days was quick, or if we needed to do something else after acceptance or not. And so I gathered some data about package submissions and reviews and I want to present that to you. But first, a brief introduction about submission. When we make a submission to an archive, it's because we want to share something of quality that can be useful to others, and make it easier for them to build upon our packages.
Maybe it's part of our work, maybe it's part of our research, maybe it's a hobby, or maybe it's all of them together. But then we submit to an archive. I have submitted packages to three different archives, CRAN, Bioconductor and rOpenSci, and I will explain a bit about how the reviews work on each of them. But I will focus mostly on CRAN, because CRAN is the most important one and it aims to store all non-trivial, publication-quality packages. Bioconductor focuses on promoting high-quality, well-documented and interoperable packages around bioinformatics, and rOpenSci is like an overlay: you can submit the package to rOpenSci and then to the other archives. rOpenSci focuses on improving the adoption of best practices, with useful, transparent, constructive feedback to the maintainers. So each archive has different objectives for its reviews, and there are further differences between the projects. For instance, CRAN doesn't have any guideline about how to write packages for them. It only has the manual about how to write extensions for R, that is, packages. Bioconductor and rOpenSci have books on their websites to help you understand how they want your packages to be. On CRAN you submit just a compressed file, the review is done by email, and you don't need to set up anything. On Bioconductor and rOpenSci you file an issue on their repositories on GitHub, and then you need to set up something. On Bioconductor you need to subscribe to the mailing list and then set up some credentials, because they clone your repository. And on rOpenSci you need to set up some continuous integration tests, because for all three archives the first step is to check the package. So there are some parameters that you need to check, and almost all of them run R CMD check with the option --as-cran, except Bioconductor, which has a different set of checks and its own package for them.
But all of them run these checks on three different operating systems, Windows, Linux and macOS, and for multiple versions of R. Bioconductor is also a little bit special because it usually tests just for the next release or the current release of R, since it's released twice a year: once a week or so after the annual R release, and then six months later. CRAN has a team of five reviewers and Bioconductor has a team of ten reviewers from the project. rOpenSci works differently, because it has a team of ten editors who select and look for volunteers from a list, and they require two volunteer reviewers for each package, while CRAN and Bioconductor use just one. So each project has its differences, and the submissions are also different. If we look at the rough numbers of submissions, looking mostly at submissions of new packages, CRAN has around 200 new package submissions, Bioconductor an order of magnitude less, around 50, or 100 in the best month, and rOpenSci even less, fewer than 10 submissions monthly. From now on I will focus mostly on CRAN, because I think it's the most important one, all the others depend on it, and I think it's the most interesting for all of you, including me. But given the number of package submissions that CRAN receives, the question is how do they organize? The organization of the CRAN submission process is done with folders. Packages are moved between folders along the process. The most important ones are the pretest folder and the newbies folder. Here we can see how full these folders are, from September to last month more or less, and how many packages are in these two folders. In the middle, in December, we can also see the CRAN holidays, and there is a lot of variation. If we focus just on the packages after the holidays, the first thing we see is a big spike of packages in the pretest folder.
This is because, as I said, all packages must pass the automatic checks that are done in this pretest folder. Later on they are moved to several other folders. If it is a new package, then it's moved to the newbies folder, where the manual review is done and the packages are checked. And here, after the CRAN holidays, it took more or less two months until the volume of packages went back to a lower level of around 25 packages daily in the queue. There is another spike of packages in the pretest folder around a week before the new R release, 4.0.5; maybe a lot of packages needed to update for that release, I'm not sure. But we can see that there are a lot of packages that come and go, so have patience, and also try to submit your package whenever you can. There are also some submission patterns: more packages are usually submitted around the middle of the month, around day 14, as we can see on the left. And there are also more new packages submitted on Sunday and Monday, and on Saturday or Friday. But overall I would suggest you submit your package whenever you are ready and then wait until your manual review is done. It can take some time, so it's better if this time is spent in the queue than waiting on your computer. But if we look at the review time, we can see that most packages spend very few hours in the queue. There is a package that spent more than 2,000 hours, but that's an outlier. That's not usual and that's not expected. Most of them spend less than 24 hours: about 6,000 packages spent less than 24 hours. And even if we go up to a week, there are a lot of packages that went on to CRAN within a week. I had a package that was submitted, and in a couple of days I got feedback from them asking if I could change something. I replied saying that it was not my fault and that I thought something should change on Bioconductor, and it took more than a week.
But usually feedback is quite quick, and you can expect to have replies early and to have your package accepted very soon. But reviews are short, brief and to the point, so don't expect a code review or a style review. They just point out errors and missing parts that you need to fix or address before the package is accepted. But this period of time covers all submissions, and we have seen that there is a different process for new packages and for packages that need to be updated. So if we split them, we see that there are some differences. Updated packages took more or less 5 hours, and that is quite constant along the period studied. But new packages can take a lot more time. In September it took around 200 hours. Before the CRAN holidays it took around 80, and after the holidays, with this queue of packages, there was a spike to 100 hours, and later on, a month ago, it was around 50 hours for a package. For reference, on Bioconductor it takes around one month until your package is accepted, and on rOpenSci it takes around two months, maybe because there are two reviewers and the feedback and the process are more lively, I would say. But anyway, expect something like three to seven days from your submission until your package is on CRAN. Here we don't see the exchange between the reviewers and the maintainers, because the communication is done through email. So we cannot see how many comments or how many submissions were required to get accepted. But if we look at the data from Bioconductor and rOpenSci and the users that comment on GitHub, we can more or less get an idea. Looking at the user roles, on the left is Bioconductor and on the right is rOpenSci. We can see that the users who participate in more issues and do more actions are reviewers or editors. Then there is a group of users that participate in just one issue. They submit a package and then they reply, they comment and so on.
And there is a subset of users that participate in many issues and also comment a lot. This subset of users is very actively involved in the reviews as well, maybe providing help, being asked to help, or providing expert advice to new users. But the most important part of this feedback is the comments. If we look at the comments, we can see that there is a correlation between authors' comments and reviewers' and editors' comments. So there is a conversation between the reviewers and the authors, the maintainers, and this does not hold for the other people who participate on Bioconductor but are not reviewers, as we can see in the bottom left corner. And there is a special user that also provides feedback on Bioconductor: the bot. This bot provides useful feedback along the way, tagging the issue and providing some comments to the maintainers, such as saying that it has found that this issue is already posted, that these packages are already submitted somewhere else, that there is a submission without a change in the version, that the setup is not properly done, like the SSH key, or that there are multiple repositories identified. This bot is so useful on Bioconductor that rOpenSci is now copying it and will have a new bot to provide this useful feedback to all. And this is quite helpful also because it reports the result of the review, or whether the package is accepted, and it removes some of the burden from the reviewers and editors. And finally, the most important part is whether our package is accepted. Looking at the success of the submissions for new packages and updated packages, we can see that most of them are accepted on the first try, and on the second and third tries and so on many of them are also accepted. The rate of acceptance is higher for updated packages than for new packages, but overall the acceptance rate since September is around 80% for new packages and around 90% for updated packages.
So if you submit your package and you reply and fix the issues, your package will most likely be accepted. To get your package accepted, prepare with the manual on how to create packages or with a book; I recommend reading R Packages. Also follow the policies: the CRAN policies and the guidelines from Bioconductor and rOpenSci. I also recommend the rOpenSci guidelines about writing packages, even if you don't submit to that archive. And then check your package, because this is the first step they will do, and if it doesn't pass the automatic checks, it won't go to manual review. Use R-hub, GitHub Actions, your computer, everything that you can to make sure that your automatic checks pass. And then submit, and if there are issues, fix them and explain to the reviewers what you have fixed or the steps you have taken. Even if you cannot fix it, maybe they can help, or you can ask for help on the mailing list. I would like to take the opportunity to thank the R Core members, the CRAN team, Bioconductor core, and the rOpenSci editors and reviewers for the work they have done to provide high-quality packages for all of us. Many thanks. Thank you for the talk, Lluís. So we have about five minutes or so for questions, and we don't have any popping up in the Q&A on Zoom right now. So we'll wait a minute to see if anybody has anything; otherwise I have some questions I can ask. There is one question in Slack, I think. Okay. Oh yeah. There's a question in Slack right now by Mike Smith. He says, is the submission success rate on slide 13 for all repositories or only CRAN? This is just for CRAN packages. Yes. The success rate for the other repositories: Bioconductor is around 50%, and rOpenSci is also more or less the same, 50%. But this percentage on rOpenSci and Bioconductor includes many issues where people ask questions, or duplicate issues, which are not counted on this slide for CRAN. So yes, there are some differences, but this slide is just for CRAN.
There's a follow-up question. Any idea why the success rate has decreased recently quite a lot, or maybe I misread the x-axis? I think this is because package reviews take some time. I said about 200 hours or 100, depending on the package and on whether the maintainer is responsive. So by the time I took the data, the maintainer probably hadn't replied or hadn't submitted a new version of the package fixing all the issues raised by the reviewers. So this is kind of expected, especially for new packages. And I think it's that. I mean, on the first slide there is a link to the repository with all the code to reproduce the figures. So if we ran this slide again, we would probably see that the percentage is more or less the same. I wouldn't expect it to change much. So these days it would be around 80%. So I could ask a follow-up question to that. I think you said it was an 80% success rate for new submissions. Is that right? Yeah. And 90% for the updates. Are you able to get any data about what it is about those 20% of packages that are not getting through? Is it that they're not, you know, quote, notable enough? Or is there any other data that you were able to access, or have you had a chance to look at that at all? It would be curious. There isn't much data available. I mean, almost all of this is just drawn from the queue on the FTP system, set up by Maëlle and Stephanie Locke. So there's only how many packages are in each folder. And then when a package disappears from these folders, I don't know if it disappeared because CRAN cleaned up the packages, or maybe the maintainer withdrew the submission, or maybe it just failed there and they didn't continue. I don't know; it's hard to know. On rOpenSci and Bioconductor there's a lot more information, because they use labels to signal the package status and the review status.
So it's easier to know why packages are failing. And I have also explored that. Most of them fail because maintainers or submitters don't fix the issues raised by reviewers. Maybe they have other projects. Maybe they don't know how to fix them. And then they abandon the submission. Do you have any sort of advice for, I don't know, the leadership of CRAN or elsewhere? Like, what would be helpful in terms of transparency? I think a common problem is that people find it difficult to find all the documentation of what exactly is required to pass CRAN. You know, you run R CMD check. You do all the stuff that you do. You do R-hub. And then there are still a couple of things that are sort of undocumented. If you've done it a while, you know what those things are, but it's especially difficult for new people. Do you have any advice on that? Well, as you mentioned, there are some kinds of unwritten rules on CRAN that maybe at first you don't know. For instance, there is a new rule on CRAN that you need to clean up every file and piece of user information that you store on the computer. And this is not covered by a check. And there are packages that store some information on the user's computer and then don't clean it up. But I would think that more transparency or more discussion before a new rule or a new CRAN policy would help a lot of people to be aware before submitting a new package. Or just to get some consensus about whether this is a good policy, because sometimes as package maintainers we receive news about new CRAN policies after they are already in effect. So it's kind of like we are one step behind and then we need to react. And for new people submitting new packages it is even harder, because they are not used to all the other previously unwritten rules.
So I would suggest having more transparency and discussions that are more open with the community, because there are a lot of ideas and suggestions that could help without adding much work for the CRAN members. I know that reviewing all these packages is a lot of work: it's one package every five hours, or 20 hours for a new package. But it would also help them if we knew in advance what to expect and how to better submit packages to CRAN. No, I think that's great advice. So we're right at the end of the series now, so if people do have suggestions about this last topic, please put them in the Slack. I know I saw there was an open letter on GitHub to CRAN about some of these issues. I don't have the link off the top of my head, but there are a lot of discussions around this, and I think it would be a great follow-up topic for many useR!s to come, so thanks for bringing that up. And that is the end of our session, so please join Slack and engage in the breaks over there, and see you at the next event, next stream. Thank you.