So, I'm Francesca Vitalini, a Senior Solution Consultant at Mirai Solutions, and today I'm here to moderate the panel "R user or R developer? This is the question". And it really is a question, because since I started talking about this with my colleagues and in my network, I realized that there is really no consensus in the community on what these two roles are: what type of skills an R developer should have versus an R user, or what type of background is involved. So I realized it's really important for us to have an open conversation about this, and today we are going to do it with five different experts from our community that I'm really happy to host here today. I'm going to read a tiny bit of biography, sorry. We have Martin Mächler, who is a professor of statistics at the Mathematics Department of ETH, was one of the pioneers of the R language, and is currently part of the R development team and the R Foundation for the R project. Then we have Alexis Seig Lager, who is head of the analytics and technology department at Parnary; he has expertise in the reinsurance business but also in bridging the gap between commercial, actuarial and analytic teams. We have Nic Crane, who is a data scientist and a software developer with expertise in production-grade code and training; she has experience in developing both R packages and Shiny applications, as well as NLP models in Python. Then we have Sandrine Dudoit (I hope I'm pronouncing the surname correctly, my French is terrible). Anyhow, she's a professor and the chair of the Department of Statistics and a professor in the Division of Biostatistics at the University of California, Berkeley, with a research focus on the development of statistical methods and software for biomedical and genomic data.
She's also a founding core developer of the Bioconductor project, together with our last but not least panelist, Rafael Irizarry, who is a professor of applied statistics at Harvard and chair of the Department of Data Science at Dana-Farber Cancer Institute. He is also a developer of open source software for the implementation of statistical methodology and has been part of the Bioconductor project since the beginning. So I'm very happy to have you all here, as well as our audience, and I would also like to thank our sponsors. I was loosely coupled with the organizing committee of useR! 2021, especially in the sponsoring section, so I really know how important it is to have people supporting us. So I'm going to say thank you to the sponsor of the day, which is RStudio. Now, without further ado, because we only have one hour, I would like to dig in and start with the first real question, which is: what is the definition of an R developer, and how does it distinguish itself from being an R user? I would like to ask all our panelists to give a little bit of their own perspective, also based on their different backgrounds. So why don't you start, Martin?

Hello everybody. I have the perspective, yes, of an R Core member, but also of an academic who teaches math and stats and computational stats and things like that. As many of you know, R is really in some sense the daughter or son of S, and John Chambers is the guy who mostly devised the S system. He got the very famous computer science award, the ACM award, which is only given to one person once a year, for creating S. And he wrote the book with the title "Programming with Data", using R, or rather using S. So his approach is: good data analysis, good statistics, good data science, whatever you want to call it, good machine learning, means programming. You cannot do good data analysis if you just click buttons and choose from a menu. So you have to do programming, and creating S, and from there R, really provided tools to use a high-level programming language that doesn't need a computer science education, that doesn't need compilers and so on, as you all know, to do data analysis that is adapted to your data situation, to the problem you want to solve. In this sense, I think all good data analysts and data scientists should be programmers using R, and in that wide sense all such good data analysts are R developers. Because some people told me, well, a developer is somebody who programs, who writes code, and in this sense a good data analyst writes R code, even writes R functions, to make the whole analysis much more reproducible, well documented and so on.

Do you agree with Martin's perspective?

Yeah, absolutely. I definitely agree with the idea that it's quite a wide thing, that we could probably call a lot of people an R developer. And I think with this question we really need to be careful about the context that we're asking it in, because we're at a real risk of potentially sounding quite gatekeeper-y. So at a kind of high level, absolutely, an R developer is somebody that writes R code, and I might even put something in there about a little bit of knowledge of software engineering principles, even something as simple as writing functions or including documentation. But then it gets more complicated, I guess, if we're thinking about that distinction in the context of the job market, right? Because I think a lot of the time an R developer and a data scientist can be very similar, but there's more distinction between data scientists and Python developers, say, than between data scientists and R developers. And I think it gets a bit more complicated when you get to that point.

Yes, thank you very much. And Alexis, what is your perspective from an industry point of view?

So, there's a difference between people who use R to fit models, who are clearly writing code, and what I would call an R developer. For me, an R developer, as distinct from an R Core developer, which is something else, is someone who applies software engineering principles to build production code and environments, and happens to use R to do so. And that makes it rare, because R is not a particularly well-known computer-science-type language in comparison to, say, Python. But in the right context and in the right organization (and we've recently hired our own R developer) there are reasons why we want to build production software in R. And it's not about fitting models, it's not about data science, actually, and it's not about ML or anything like that. It's about building R code which is production ready, which can be deployed, which is reproducible, which is scalable, and using techniques from software engineering to make sure it will be maintainable and robust, also for your colleagues to work with. When you do that, you are, in our context, an R developer.

Thank you, Alexis. And Sandrine, what about academia? Is it the same there?

Hi everyone, and thanks, Francesca, for putting together the panel. So, what's an R developer? That's not such an easy question, and I think it's one that we could ask for other languages as well. It's clear already from talking to different panelists that there are different perspectives in academia or in industry, or even within academia and industry. I'll step back a little bit. There's really a range of expertise in R users, from someone that just uses the software for the first time to someone that advances the language and makes research-level contributions to computing. So I think an R developer is somewhere on that spectrum. The real difficulty is where you put the boundary, and also what features of that person you use to define that boundary. For instance, you could define a developer as someone that contributes a package to a repository like CRAN or Bioconductor, or someone that produces production-level software. But then I think there's a really big difference between someone like me, who contributes packages, and, say, an R Core developer like Martin, who advances the field of statistical computing. I often use this analogy from math: there's a big difference between someone that applies a theorem in a very useful manner and someone that actually proves that theorem. So I do think it's worth adding maybe a third category, to reflect at least three levels that cover the whole spectrum: beginner, intermediate, and advanced. I think the real question is to define the variables and the features that would define these different types of R users or developers, and try to come up with an understanding. So I really look forward to hearing more about this.

Thank you. And last but not least, Rafael, what do you think? Do you agree with this three-category model, or do you have a different opinion?

Yeah, I mostly agree with what Sandrine explained, and my definition is also similar to Alexis's. The first thing I thought about when you invited me to participate in this panel is: why are we defining this? What's the point of defining this term? The one reason I can come up with, and it's something that is actually somewhat important to me, is hiring people. If I'm going to use that term to hire people, I want it to mean something for which everybody has the same definition. That will make my life easier. That's the selfish goal, but let's take that as my goal for now. With that goal in mind, I would define an R developer similarly to Alexis and Sandrine. I have two jobs. I am an academic that writes papers and tries to disseminate statistical methodology or computational algorithms that I, or my students and postdocs, come up with. And I'm also the chair of a data science department at a cancer center, where we actually develop production code for making decisions for patient care. Those are very different, but in the end the R part is similar. In the case of an academic, if I come up with an idea that I arrive at, as Martin described, by programming with data in R, and I want to share this idea with the world and want others to use it, then I want the implementation to be fast, efficient, and well documented. I have become too busy to do that myself, so I want to hire someone to do it for me now, or for my postdocs and students. And there's also the other reason, which is that there are people out there who do it much, much better than me. So I would define an R developer as someone who can write efficient R code, knows how to make packages, knows how to document, and knows how to make code efficient, for example by writing C++ code or using matrix algebra. And that's similar to the other side of my job, which is to implement production software. So in that sense, that's how I would define an R developer. Also,
when I first heard the topic of this panel and thought about "developer" (this is before I thought about it more, after our conversations), the first thing that came to mind was someone that was improving the R language. But since we've had conversations, I now have a more practical definition of what an R developer is.

Yeah, that's great. I'm really happy that this panel is already bringing something useful, even if we started less than 10 minutes ago. Anyhow, let me share my screen again. I would like to show you what the community has said so far. Quite interestingly, almost 50 percent of the people who took this survey (and we are talking about a really small number of people, about 150 who took the survey over the span of a week) thought that an R developer is someone who develops tools for others to use, and this can also be domain specific. Then 21 percent believe it is someone who structures the code as a package, 13 percent believe it is someone who extends the R language, 10 percent that it is someone who applies basic software development concepts to code design, and the rest is other answers. Also in terms of defining an R developer versus an R user, we see quite a bit of variety. We had this question both in the survey and in the LinkedIn poll that I ran for a couple of weeks, and we saw in both cases quite a uniform distribution, slightly peaked towards the R user. This is another sign for me that there isn't a clear definition or a clear understanding of what an R user is with respect to an R developer. Okay, so with this I would like to go back to what Rafa said at the end of his answer, and it's about hiring. So first of all, Nic, what do you think is the concept of an R developer in the job market?

Yeah, so I guess, as I was saying before, it's really pretty hard to distinguish. For example, I've had previous jobs where I've had a job title of data scientist or data science consultant, but actually, when you look at the work I've been doing, it has been mainly R development. So with that it becomes difficult. And then there are so many different facets within R. As an R developer I've ended up doing web app development, I've ended up doing package development, I've ended up doing machine learning, so again it's really hard to pin it down. Although, with all of those things said, as somebody that doesn't come from a traditional computer science background, I think the things that have made me stand out from other candidates, that I've been told have been useful when I've been applying for R-related jobs that have been more developer focused than, say, stats focused, have been things like knowledge of concepts like unit testing, version control, deployment, coding standards, and all those more traditional software-engineering-type concepts. And I think that probably comes from the fact that, traditionally, if we divide data science into R and Python, a lot of R users will come from an academic background which isn't necessarily computer science. I think it's those bits and pieces that can make somebody stand out a bit more.

And I see you, Alexis, nodding there. So what do you look for when you hire an R developer, considering you've just hired one?

So I think what Nic just described is what I would call the carpentry of R development, or of any development, actually: unit testing, deployment, automation, mastery of your tools and so on. And then for an advanced R developer, I would be looking for someone who can, I don't know, think in a computer science way. So here is an algorithm in code: is this heavy on memory or on computation? What if your memory is constrained, how would you modify this? How do you do the trade-offs which you would do in software
engineering: how do you trade off speed versus memory, and how do you understand that? Or, if we go "okay, this is cool, but now we need to parallelize it", how would you do it effectively? And then linking that to R, which has some really sneaky gotchas when you're trying to vectorize or turn something into matrices or so. Do you understand these concepts, do you understand how they apply to R, and do you understand how these trade-offs work, so that you can develop something truly next level in the R way of doing so? Those are software development skills linked to knowledge of the R language, and for me it's genuinely both. It is "R developer": I expect all these things of a developer, but if somebody has all these things and just knows them for Python or for C++ and doesn't know them for R, they will not be able to perform in this role, because they don't necessarily know and understand how the underpinnings of R can help or hinder you as you're trying to solve these questions.

And Rafa, do you agree with Alexis? Do you think this is applicable in academia as well?

Absolutely, 100%. Both Nic and Alexis are basically summarizing what I think we need to explain when we write a job ad. That's what I was referring to when I said I would hire them not only because I don't have that much time anymore, but because they're much better than me at things like unit testing and other software engineering principles. So that's exactly right. That's how I would want to write an ad now to hire a software engineer, explaining all those things. It has to be R, and why does it have to be R? That's something else we haven't talked about. If all you want to do is share your statistical idea, that could be done in Python or whatever else. But I think there are two reasons: one is because the prototype will already be done in R, and second because the users that we're targeting are mostly R
users, at least in my case. Maybe Python is close; in the genomics world it's about even right now. But yeah, I agree.

Okay, so maybe you can continue on this and tell us a little bit what type of skills, practically, this person should have. Like, should they know DevOps, version control, or which ones would be the most important for you? And maybe, Alexis, you can follow up with what would be most important in your case.

Oh sorry, did you ask that one to me, Francesca, first?

No, no, I didn't, but if you have an opinion on that...

No, no, I thought you said Nic, but you didn't.

No, I was asking Rafa to continue on that.

Yeah, I don't want to be repetitive. A lot of the concepts and skills have already been described. The one that I would add for my own area is this: we don't always get it, and you can learn it on the job, but if you get someone that already knows the details and the nuances of the biological problem, or the technology that's generating the data, that is a huge plus. It saves us a lot of time in training them. And of course I imagine that's similar in other application areas. But other than that, I think between Nic and Alexis they mentioned everything I would put on the list.

I would add one. I'll call it a non-technical skill, a behavioral skill: empathy with the end user. If you're technically amazing, but either the end users don't want to talk to you, or you don't understand what they're actually after and you don't build what is actually needed, then you are not as valuable as you could be to the success of the organization you're in.

And Martin, what would be, in your opinion, the top skill that an R developer should have? Unmute yourself.

I'm used to teaching, where I'm never muted, sorry. Rafa, you made a very important remark: there are some skills that you can learn on the job, and you should learn them there, because it's
inside the company you work for; they have their standards anyway. Well, of course I have the point of view that I want to teach people to be creative and thorough and self-critical data analysts, and that's something that needs much more time than learning a new style or a new dialect of C or whatever. So the question is really this, and I see that in the Q&A somebody actually asked it too. Most R users, and that's probably still true, come from a statistical or data science background and not from computer science. At least that's very much the case in Europe; I don't know how much it is on the West Coast or in California now that things change. And of course in industry there are computer scientists too that come to us. But I still think the data analysis skill, which is not just a pipeline of things you run your data through and then you get the output, is something that needs time to be taught. We still put the emphasis on teaching that to our students and not computer science skills, because, well, if they go to computer science, that's fine. At our institute, machine learning is actually taught by both statistics and computer science, with an emphasis on computer science, whereas there are other fields...

Can I comment on something? I agree with everything that Martin is saying, but I would agree to it as a description of a data analyst, which I consider to be a separate task from that of the software developer. In academia, at least the academia I've experienced, we have good academics that are good at data analysis, and we teach it, and it's hard to teach. Like Martin says, it takes time and there's a lot of critical thinking. But, at least in my experience, we're not that great at software engineering. There are good ones out there: R Core and the people developing the tidyverse and data.table, they're great software engineers. But I don't think
that's the majority in academia. We tend not to be trained in it, and also the incentives that are in place make you spend more time performing data analysis and developing methods, either of the two depending on how applied you are. So, in my opinion, there are unfortunately not that many rewards for people who spend a lot of time making their software very efficient and very user friendly. That's why, because there are no incentives from the academia side, we're now trying to use grant funding or other funding to hire people that are really, really good at that and let them do that part. It's hard for me to tell a postdoc or a graduate student "spend the next six months making the software really, really good" as opposed to telling them "write another paper". It's not really my decision, right? That's just the incentive structure that's in place right now. We're all trying to make it reward people who actually take the time more, but it hasn't quite happened yet.

Thanks for this point, because it really brings me to something: in my list of questions there was one about the way we teach R. I would like to ask Sandrine: shouldn't we start to include some computer science elements in the way we teach R, considering that normally the R user or the R practitioner is someone that is maybe coming from a very domain-specific background and not from computer science?

Right, definitely. And I just want to say that I echo what Martin and Rafa have just said about the qualities that we're looking for: self-criticism is key, common sense is key. And then Rafael mentioned this idea of the incentive system. I mean, there's been a lot of resistance in academia to spending time not just on statistical computing but even on applied statistics. So I think we really need to change that and start including these in the
curriculum. You can learn by doing, that's one way, but at some point I think you do also need some form of training to have general concepts about computing, and more concepts from computer science need to be incorporated in the statistics or data science curriculum. Before doing that, I think we have to think about what populations of students we want to train. It's not one size fits all. And then, for each of these different populations, think about the learning objectives. Just from my own perspective, chairing a statistics department, we already have different populations of students. We have our non-majors, to whom we basically want to teach a little bit of data literacy and computer literacy, but they will most likely not end up being developers. So that's a population where we have to think about what we want to impart to them. Then there are our graduate students. For these students we want to teach more advanced notions about data analysis and computing. And there's a real problem here, in particular with data analysis: it doesn't scale up very well. If you really want to teach applied statistics, or the practice of statistics, or computing with data, you want to start with studies, and it's a bit of an art. Here interaction is really important. It's not something you can just do passively, teaching to a huge audience; it takes time and interaction. So I think identifying the different groups of students and the learning objectives would be a start, together with a lot of communication around that, and showing the value of teaching statistical computing to the academic community. It's not computer science, it's not statistics; it's something that brings elements of both, and that will be evolving as data science is evolving nowadays.

I think it's very hard to teach good software engineering practice in the context of a data
science program or a statistical program, because your constraint is time and then, like both you and Rafa have said, incentive. Software engineering is a craft. You speak to software engineers and they'll tell you, and I know this from my experience: the first language you learn is a language, the second language teaches you concepts, but by the time you've learned your fifth programming language you're starting to understand software engineering and to see the deeper patterns. Teaching multiple programming languages in parallel to an entire data science curriculum is challenging both time-wise, since people only have so many hours in their semester, and incentive-wise: if I'm a data scientist, why would I learn these things when I think I'd probably better learn Bayesian methods or machine learning, because that makes me more marketable in the job market? So I think producing R developers straight out of university is really hard. An R developer is the marriage of people using R and having experience, somehow melding those two things together, a lot like Nic has described her journey. And then at some point you go: I have got both sets of skills, I can program in R and I know what it is. That's how R developers are made. I think it's very hard to teach them.

So, with this idea that the R developer can be a bit of a unicorn, but a necessary one in order to have production-ready code, can we say that R is a language ready for production? I would like to ask Nic to comment on this, because you have the experience of producing production-ready code.

Yeah, absolutely. Actually, when people express skepticism about this idea, it really frustrates me. I was in a job interview in about 2018 with a multinational consulting company, which I won't name, and they dismissed R as, oh, it's just something people use in academia, and people don't use that in prod. Which is utter nonsense. All it meant is they didn't know how to put it in production themselves,
which is very different. And with R there are so many tools out there: you can make APIs with plumber, web apps with Shiny, and so many different things. In terms of R in production, there are so many examples as well. I've worked on projects that have involved making a massive Shiny app that can support hundreds of concurrent users and that's now used in the NHS in the UK. I've worked on projects that involved writing packages and APIs that are used by the Office for National Statistics. So I find it completely ridiculous when people say R is not ready for production. And I think, as Mark Sellors said at, I think it was, RStudio conference 2019, it's cultural, not technical, barriers that get in the way of R going into production.

I completely agree.

Oh, I just realized we have questions in both the Q&A and the chat, so, you know, in case you're looking.

Oh, Francesca, you muted them.

I did. I didn't want to mute people, I was trying to unmute myself. Anyway, what I wanted to say is that it's great that we have questions in the Q&A. I will reserve a little bit of time at the end of the moderation to take whatever the audience is interested in and ask it to all the panelists. But please, can you put the questions not in the chat but in the Q&A? You can also vote for questions in there, so that we will take the most popular first. Thank you. So, back to the main point here: we were talking about R being a language for production, and I wanted to ask whether R being a language that is very user friendly makes it more a language ready for production or more a language ready for POCs. Alexis, what do you think about that?

"In production" is a matter of definition, right? In production means, yeah, it's up, it's running, and it can do certain stuff. And well-written code is easy to maintain in production; it's not just about getting it out there and it actually working. The easier your language is to use, generally the easier it is to maintain, the easier it is to
collaborate on a project, the easier it is to modularize pieces, the easier it is to possibly deploy into a more microservice-style architecture. R doesn't have a comp-sci background. If you compare it to languages which have been built to be in production, you look at Rust, you look at Go, it's not there. But that doesn't mean it can't be used in production. C, and see how much of the universe runs on C, is a horrible language to put into production, because it has no guardrails. There are so many security holes in the world's infrastructure because things were written in C by well-intentioned people and didn't get the daylights tested out of them. So I don't think the user friendliness of the language is a way to think about whether something can be put into production. It's about: is the core bug free? If something goes wrong, can you find it? Do you have the tools available to modularize, to collaborate, and to deploy automatically? And all of that is the case for R. Is it the right choice for every organization? No. But in an organization which has a lot of people in it who already use R as part of their daily work, it's a good choice, because that means if somebody is looking at an internal website and something's not working, they can go look at the source code and understand it, as opposed to "this has been written in Lisp, and good luck", because nobody else in the organization, or at the front end of the organization, understands that language.

Nonetheless, R being an approachable language has clearly been one of its success factors, so that it is now one of the most used programming languages. So I would like to ask Martin: what do you think is the reason, or one of the things, that makes it really approachable?

Well, I know about four or five computer languages, but not many more. We started teaching R to our students very early, even to students that are far away from computer science, even much further away than mathematicians, for instance, and
we never used the thing that is called, I forget, John Fox's menu system. Anyway, we thought that it is easy enough. We started by saying, okay, it's just like a pocket calculator, but of course that doesn't work anymore, because people have their smartphones and don't even know what a pocket calculator is. But still, you can start just by adding numbers, and it has a state, and it is interactive. You don't need compilers. There are many things that you don't even have to learn about at the very beginning. You can work in the console, so you don't even have to learn about files and things. That makes the entry part easy, of course, compared to a proper computer language with a compiler, where you have to declare all variables and things like that. It was written to be fast and also good at prototyping, and to quickly get from data to a graphic without much coding. That was a genius thing to invent when they invented it in the 1980s, I think, or even started at the end of the 1970s, when the only other computer language basically was Fortran, and then after a while C came along. So that was really a genius idea, to say: okay, let's try something that is close to how we do formulas in math. In some sense it works vectorized, and at that time nothing else was vectorized. Well, actually there was APL, for those of you who may know it. That was my first computer language, by the way, APL, when I was in gymnasium. Anyway, that meant that you can start using it and get to a plot without really having to learn about syntax. You can even just use visual pattern matching and get the plot, and that's very rewarding to get people started. I think I'm really talking about non-computer scientists here.

And I think that strength is also its weakness, yes. I mean, we've probably all heard the quote: the
advantage of R: it was developed by statisticians. The disadvantage of R: it was developed by statisticians. You can get really far with R without ever knowingly seeing a function or ever writing your own one, and so on. With most other programming languages, as you learn them... and most of the things you describe, Python has those as well: there's a command line, there's no compiler, it doesn't have typing. But that is a very well structured, comp-sci-developed language; Guido van Rossum, who developed it, is a computer scientist. So when you learn Python, you very quickly learn about these paradigms of computer science, but it's not quite as accessible as R is. And you can get very far with R just using it to get stuff done, without having to learn these software development paradigms. And that's why the fork... there have been a few questions in the Q&A: okay, I'm an R user, how do I become an R developer? You might have gone quite far down that road, and that jump is now big, because you're suddenly missing a whole lot of stuff underneath which you never got taught and which you somehow need to backfill, whereas other languages would never let you get that far without you having learned at least some of that along the way.
I agree with what you say, Alexis, but the fact is that many people would stop using R if they had to learn all these concepts.
It's an awesome language for that very reason, and it is so fit for purpose for just analyzing stuff. That's exactly the reason why. Within our organization we have a lot of actuaries, people who know a lot of math, and we want to give them a tool to do more advanced analysis: step away from Excel, please. R is obvious for that; it's so easy to get people going. But when it comes to, okay, now one of these folks has written something which is really nice and you want to industrialize it, you want to productionize it, you want to deploy it, you often need to take 15 steps back and rebuild it from behind to get those
engineering principles in place. And that's fine, it makes it perfectly appropriate, and the same can happen to you in Python, no question. But it is one of the challenges when somebody goes, I'm an advanced R user, how do I become an R developer? There's a chasm. It's bridgeable: if you're an advanced R user, you definitely have the potential to become an R developer, but there's a whole bunch of stuff you need to learn to move across.
And what are, at least for you, the most important characteristics of R that make it so approachable?
Oh, that's a really great question. I guess there are certain things like the R community: it's really easy to go out and get help on things. In the past I've tried to learn other languages, and years ago I was a bit nervous about asking questions on Stack Overflow. It's a bit different now, but I'd be so hesitant about doing that, whereas now I can go there, I can go to Twitter, and there are so easily answers to my questions, and nobody laughs at my questions being stupid, which I think really helps.
Thanks. And do you think that maybe the tidyverse would help make R more approachable?
In my experience it has. I find it easier to teach using the tidyverse for the very simple things, then ggplot also. So yeah, that I think has helped. But I think this relates to what Alexis was saying: at least in my lab, eventually you're going to have to do matrix multiplication operations, so you might get really far using the tidyverse, and then at some point I'll say, oh, you can't use it anymore, you now have to learn linear algebra and do it this way. So yes, it does, absolutely. But I think the main reason why R is approachable is really what has been said by Martin and Alexis: that you can get to a plot really quickly, and that you can do that in different ways. Tidyverse ggplot is perhaps the one that's most intuitive to the
general public, but even the base R way isn't that bad, or whatever else is out there. It's very quick. I think that's why, in academia at least, applied statisticians have kept using R, even though Python might be a better thought-out language, or whatever you want to call it: nobody beats R if what you want to do is explore data, which is the most important part of my work. Exploring data to motivate statistical methodology, to check statistical methodology, and to look for bugs: the main way I look for bugs is by plotting stuff, not by looking at the code. And really, I just don't think there's a doubt that R is the best for that.
Yeah, I agree. But on the other hand, the tidyverse seems to help a new user approach the language, but can it be a problem for a user that wants to become an R developer, maybe due to its non-standard evaluation? I think, Martin, maybe you can comment on this.
Yeah, I don't really know. Coming back to Alexis as well, the chasm that he mentioned: as soon as I get away from just introductory teaching, I really try to tell my students, if you want to become a good data analyst, even, you should write functions. You should get out of the habit of editing your R script: instead of writing a function and calling it three times, they edit their scripts, right? And so what they do is not at all reproducible, because they change this line. So I think that part of teaching can be done even with people who are completely not computer scientists: teaching them some discipline. I don't want to get the spaghetti-code script from you. That helps, and that discipline is more important than whether you're used to the tidyverse or base R. To be honest, I teach them the indexing. Indexing, like subsetting, is very, very powerful in R, and it has some
intricacies, but in some sense thinking of logical vectors to do indexing is much more useful, in my view, than going with the tidyverse very much. But I don't want to get into a fight there, and it's also my history. I always found the indexing idea quite intuitive: the idea that indexing in R can help to double your data set, right, by just repeating the index. There are many, many things you can do once you have understood that this is much more than just extracting one element from a matrix. It's much more powerful, and it's all logical. But anyway, people disagree.
I don't... I agree with you. Let's take the analogy of a kitchen. People want to cook food; they can teach themselves. So you can figure out how to write R code; maybe somebody helps you, or you read the instruction manuals on how to use the machines which can cut your fingers off, because you can then make more advanced meals, but you then need to be careful. And I think what you're doing is that teaching, and it's important. But then it comes to, okay, you want to run a restaurant, you want to be the chef in a restaurant. It's not enough that you can cook good food; you also need to understand, okay, if you leave this standing for more than three hours there's a risk of salmonella. There's a whole bunch of things which the cook in your home kitchen isn't even thinking about, which the restaurant chef is naturally doing, because they have that additional training on top of it. Not everybody needs to be a restaurant chef, but if you want to be a chef in a restaurant there are certain things you will want to be trained in and have done. And I'm absolutely saying that not everybody needs to be everywhere on the spectrum. You have people at home who cook amazing food, and that's awesome, and these are the tools which allow them to do that, and they are free and open source, and thank you to you and everybody else who has helped sort of produce
that building, and there is a certain group of people where we go, yeah, we want you to produce this industrially. Cool, then you need to know certain industrial things. Awesome. And unfortunately we don't have training courses for it at the moment; we just find people who figure it out themselves, and we try and help them get better at it.
Great, I think this is a really nice analogy. Now, for the sake of time, I would like to start looking at the questions in the Q&A, so I'm just going to start with the one that is on the top. I'm sorry, I didn't have the time to read them through before, so I'm going to read them out now. There is a question from Alexander quote one, sorry if I mispronounced your name. Anyhow, the question is: as someone that just started to learn computer science via R, I appreciate the narrow gap between the user and the developer. I started as a user, then started to write functions, then packages, then adding C++. I do feel, however, that the recent spread of packages built around non-standard evaluation has widened this gap a lot. At the same time it has generated many more users, so they may become developers. What is your view about such a gap between users and developers, and how to narrow it? So I guess this goes a little bit in the same direction as my question from before, but I would be very happy to hear some of your thoughts. Maybe Nic, you want to?
Sure. Yeah, that's a really interesting question, and I also find it fascinating, obviously, somebody talking about getting C++ in there, and I'm like, that is definitely a level, and I'm just starting that now. I think the non-standard evaluation one is a really interesting example to look at, because I remember it was about 2018, early on, when dplyr and similar packages started using a lot of that stuff, and I really struggled with that at the start and could not wrap my head around it. And there have been so many resources that have come out since, and
breaking it down; there's now an rlang cheat sheet, which I've used a lot on recent work. I can definitely see that perspective, that it could widen the gap, but I think there was a really good talk a few years ago by Jenny Bryan at one of the RStudio conferences about how there are loads of circumstances in which you might think you need tidy eval but you don't actually need to use it, and it's more just thinking about how you achieve what you're trying to achieve. So I think it can be challenging, it can be difficult in that gap, but I don't think that's necessarily a problem, given the amount of power it can add to doing things.
Then there's another question by an anonymous attendee: considering that many R users are not from a software engineering background, what would be the path from an R user to an R developer, and where and how would an R user learn to be an R developer? Maybe Sandrine, you can comment on this, considering that we spoke about teaching before.
So, I mean, I think there are multiple paths. That's a really good question, where that transition occurs. I think others have provided partial answers to this. I think it's a combination of learning by doing but also taking some workshops or courses, and working with people that are more advanced than you are; I found that to be a really effective way of learning. Looking at packages that are good-quality packages, and learning by looking at the code for these packages and trying to implement some of the techniques that are used in them. So it's really a range of paths to become a developer, I think, and a lot of it goes back to experience. I really want to emphasize that; Alexis mentioned it earlier. For software development, as well as for applied statistics, I think a lot of it is an art. So you can learn some of it in a classroom or in a workshop, but then it's really doing it, and experience, and working with others as well. And I don't think
there's a discrete point where you jump from being a user to a developer. It's a progression, and you become more and more experienced at it. And the community, let me just go back to it again, the community I think is a great way for that: the forums, mailing lists, the documentation.
Yeah, I think that was a really great point that Sandrine made about working with others, because one of the things that I found most useful for improving my R development skills is engaging in things like open source development, and therefore having code reviews and engaging in pair programming. There are so many courses out there, there are so many things you can read, but actually having that one-on-one kind of experience, I found, has accelerated my skills the most.
I think also, going back to what Sandrine said, it's a bit of a progression; there is a spectrum between the R user and the R developer. There is this other question: is it possible for a person to be both, and is it preferred? In this panel we always talked about the two things as if they are distinct, but is it even possible to be an R user in certain projects and an R developer in other projects? I don't know, maybe Rafa, you can comment on this; I see you nodding.
Oh, I've definitely been both. Most of the time I'm an R user, but every once in a while I write a package, more before than now, and then I become an R developer, I guess. I'm thinking about how to make the package efficient and user-friendly exclusively once I've figured out what I want to do statistically, but to figure out what I want to do statistically I did a lot of R using. I think in academia especially you're going to have a lot of people that do both, but as has been mentioned throughout, the day has 24 hours. There are people that are very talented and can be very, very good at both things, but in general, the better
you get at one, the less time you have for getting good at the other. And that's where I'm hoping that, at least in academia, but I think also in industry (I mean, the cancer center I work at is sometimes more similar to industry than academia), we could have the division of labor that makes the whole operation more efficient, where you have in each task someone that's really, really good at it, and they're working in the same environment, so there's enough communication that it won't suffer from the fact that it's two people rather than just one person.
I think it might be a bit of a tadpole question: when does the tadpole become the frog, and when does it stop being a tadpole? I've not yet found anybody who went and trained to become an R developer. When we've interviewed people for data engineering positions, and there we look for R as well, they started using R to do the analysis, and over time they found their own interests drove them further: I'm really much more interested in building tools for other people than doing the actual analysis, so I will sharpen that skill set. And then one day they go, oh, here's an R developer role, I fit that perfectly, and it's not actually asking for PhD-level data science, it's just asking that I can build stuff in R; I can do that. So I think you can be both, no question. I think a specialist is likely to be better than the average at the development, unless you find that unicorn, if they've focused some time on getting good at the development part.
Thank you. Yes, we can move on to the next question, from Arnold Beckers, which is a bit of a different point. He says that in his experience a statistician needs to know R well, but as a statistician it is very important to have good communication skills, to really discover what is needed by the scientists in the field, for example environment or nutrition. Can you
comment on this? I don't know, maybe Alexis; you also mentioned this emotional intelligence before.
I used the word empathy earlier, and that's exactly what I mean. Again, the larger your team is, the more you are able to go, yeah, here's somebody with a very tight skill set, and nobody talks to them because they're a bad communicator, but the team can absorb it. Given how we've spoken about how R developers are rare, and you're likely to be the only one in an organization, you absolutely need to be good at especially the listening part of communication: hearing what the other person wants, then feeding back, this is what actually can be done, this is what we can commit to in the development, and then ultimately delivering. That is a communication skill.
For the sake of having communication and empathy here, maybe we should clarify something for Yet Zanna Adada, who asks: what skills are required to jump from advanced to developer? I guess an advanced user. And it mentions: I was thinking about The R Inferno, but that's just internals. Is it deployment, production, knowing other languages, or are you simply talking about the customer perspective? So maybe here we should clarify the concept of production, and maybe Nic, you can comment on this.
Yeah, that's quite a difficult one to pin down; there are lots of different things talked about there. There are some ideas about whether you need to know other languages as well, and I'd say not necessarily, although certain skills, like knowing a little bit of Bash, a little bit of command line, can be useful. But then again, there are some fantastic packages now, like usethis, with which, in combination with GitHub Actions, you can automatically set up continuous integration for a package on GitHub straight from R. So whilst those other skills can be useful, I wouldn't say they're as necessary as they might have been a while ago. And sorry, I've forgotten what the rest of the question was.
It was about, say,
do you need to know other languages, or is it just a customer perspective?
So I guess this was the question that also talked about moving from an R user to an R developer as well?
Yeah, I guess it was a bit of two questions in one.
Yeah, I mean, I don't think that other languages are necessary, but that's because I don't see that much distinction between an R user and an R developer, really. I just see it in terms of having a few of those software development skills at a minimum. So that's kind of hard to say with that one.
Maybe, because looking at the time we have just one more minute, I would like to close with one question that is about the future: are we expecting to have more support for R in production in future years? Maybe Martin, you can answer this one.
Well, first, questions about the future are always hard, as you know as statisticians. I'm not sure, as we mentioned implicitly somewhere, if it's needed at all, because Nic was making the point that R can be used in many places in production, and in other places maybe not. Yesterday there was a useR! talk where somebody presented kafkaesque, which is a package to interface to Kafka, which is this high-bandwidth message passing system. I don't understand everything, but anyway, I think R is also very much an interface between systems and can be used as that. But then if you have very high-throughput situations, I think the computer scientists need their very specialized software tools, and R is not the right tool. The idea of S, and John Chambers made that point all the time, is not that you use this language for everything. Use it for what it's really good at, and that's quite a lot, and then you should also know its limits, where you really want to go elsewhere. If you get satellite data, gigabytes per minute, or even if it's just per half hour, maybe you shouldn't try to do the front end to that system in R, but rather in a very dedicated
system. So I think production always means several systems working together in the end. In the end it's the web browser that almost all users interface with, more and more, right? Then production means you produce the contents of a web page, but that's not everything. In astronomy you have to collect your data, or in particle sciences at CERN, also in Switzerland, they need different tools, because they even process data when it's generated; they have to aggregate the data as it's generated from their particle physics, so they need very, very special software for that, even very special hardware. So, as Alexis also said, production can be very, very different things.
Okay, so looking at the clock, our time is unfortunately up. I would like first of all to thank all the panelists for being here and taking the time to discuss this topic with us. I see we still have many questions in the Q&A, and I would like to invite you all to join the Slack channel and continue the discussion there. I will also share there the other results from our survey, which I didn't share live because the conversation was just so nice and I didn't want to interrupt. So I guess this one is a wrap, and thank you all again, and continue enjoying useR! I think the next one in the schedule is a mix, no, it's a keynote, or is it a mix event? I'm not sure. Do we have a slide with what's coming up next? Otherwise I will quickly look it up. Okay, so up next in the schedule today is the mix our event, so yes, we have time to network and chat. So thank you once again, everyone, and I'll see you soon.
Thanks, everyone. Thanks. Bye-bye. Goodbye.