Hello and welcome to Training the Next Generation of Open Source Developers. My name is Jeremy and I'm the Director of Technical Community at Datadog. If you have any questions or want to follow up after the talk, hit me up at @linuxquestions on Twitter. I'd like this to be as interactive as possible, so don't be shy. I know we are pre-recording this, but I'll be there for the live Q&A and have left plenty of time for you to ask questions at the end. I would love for you to implement a program similar to this at your organization when this is over, so I'd like to answer any questions that you may have along the way.

With that said, let's get started. So what is Datadog? I do want to note that this is in no way, shape, or form a product pitch, but the product really is part of the story, so having a little bit of an understanding of what the product is will help you follow the storyline. We're a platform for modern analytics and monitoring, designed to help you understand your entire stack, especially if you're in the cloud but even if you're on more traditional infrastructure. To give you an idea of the scale that we're at, we're processing trillions of data points a day, so we're solving really interesting problems at scale.
We think of our services as enabling observability through a variety of tools that are built on top of our platform. You have your traditional infrastructure monitoring and metrics, you have distributed tracing and APM, you have logs, you have synthetics, and we've recently added additional features such as networking, RUM, and security. So it's a very comprehensive platform to enable observability, and you might commonly hear some of these called the pillars of observability within the industry. That should give you a little bit of an idea of what we do as a product.

With that, this is usually a slide that you see at the end: we're hiring. While we really are hiring, the reality is that even in these difficult times many tech companies are accelerating their hiring, and in the long term I think recruiting in the technical industry is always going to be a difficult problem. Normally this is where I'd say, hey, raise your hand if you're not hiring. I can't do that here, but I think you get the idea.

The reality beyond that is that we're all using open source. Sometimes I have to explain that a little bit; obviously here at Open Source Summit I don't think I need to. Open source at this point is just table stakes. But the more we talked to candidates who were coming out of university, the more we realized that they didn't have solid open source experience. They didn't understand the fundamentals of open source, they didn't have the philosophical interest that many developers have had in the past, and they also didn't have a lot of observability experience. On the open source side, we found that this made roles focused on open source really challenging to hire for, especially at the more junior end of the scale, and that's something that we wanted to directly address.

With that in mind, I want to set up a little bit about how open source works at Datadog. We are a proprietary SaaS product, but like most companies these days we have a huge, huge focus on open
source, and I'm going to explain that a little bit more. All of our agents, all of our libraries, basically anything that runs on your infrastructure is going to be completely open source. Typically it's going to be Apache 2, but we also have some BSD and some MIT as well. In addition to all of that being open source, we release quite a bit of open source based on problems that we've solved internally. One example of this is Kafka-Kit, which we open sourced last year. It's some tooling that enabled us to scale Kafka really well, and we found a large community growing around it. So we release open source not only as far as the product goes, but also as additional tooling.

Beyond that, we rely on a ton of open source, and I'm sure this is the same with your organization. I've highlighted Kafka, Kubernetes, Postgres, and Go here. If I tried to include the logo for every open source project that we use, realistically it would not be a readable slide. Like I said, I imagine this is the same for your organization; increasingly, open source is just important for everyone.

We also contribute upstream where it makes sense, whether that is helping others scale as we've scaled, making things more observable for the project, or just regular bug fixes and performance improvements. We very much have an upstream-first mentality, as you should too.

Beyond just open source, though, we believe in open standards as well. As an example, we recently contributed all of our tracers and auto-instrumentation to the OpenTelemetry project. For those of you not familiar with OpenTelemetry, it provides the libraries, agents, and other components that you need to capture telemetry data from your services. That's going to give you better observability and allow you to manage and debug things better. It was the result of the merging of two open source projects, OpenTracing and OpenCensus. So across the
board, the point here is that we are huge supporters of open source and open standards, and it's really important to us that we are good open source citizens throughout. One last thing I should note here: if you are an open source project and you don't currently have an observability solution, we offer Datadog free to open source projects, with absolutely no strings attached and no charge. It's a program that we started a few years back and that I currently lead. We donate to everything from large foundations, such as the Apache Software Foundation and the Python Software Foundation, all the way down to very small projects. So if this is a need that you have in your open source project, do contact me and let me know, and we're happy to help you out there.

So with open source at Datadog, and why it is important to us, out of the way, I would like to cover just briefly a little bit about me and why I think you should listen to this talk. I consider myself an ardent but realistic open source advocate, and I think that realistic part is really important. Beyond the principles of open source that I genuinely believe in, I also recognize that open source has accelerated my career. I started using Linux in the very early days, a little ironically while looking for an SCO alternative. I founded a website called LinuxQuestions.org, which is the largest distribution-independent Linux community on the web, and it has been a really fun ride. In the next week or so we're going to hit our 20-year anniversary, so it's something that I've been doing for a while. I was an early podcaster, mostly around open source and technology in general. But most importantly for this particular talk, I lead the open source programs office at Datadog.

The reality here is that my life would be very different if it weren't for open source, and as I talked to other people in the organization, I realized that there were quite a few of us who felt the same way, and
we wanted to pay it forward and ensure that other engineers, and newer engineers, had the same opportunities that we had and could be positively impacted by open source. So one of the first things we did, as good open source citizens, was look around to see what other programs were out there. It's the same as with code: there's no reason to start something from scratch if you can build on something that already exists. I put a couple of examples on the slide here: Outreachy, Summer of Code, OSL. The reality is that these are all great programs, and we still participate in some of them, but for what we were specifically trying to do, they just weren't a fit. So we decided to launch something new.

What we launched was the Datadog Open Source Co-op program. This is typically a 15 to 20-ish week paid internship, and the paid part is important: co-ops get full remuneration and full benefits, the same as if they were full-time employees at the organization. There's not a different tier for them. They are full employees, and that's very important.

So what do they actually do? One of their main goals is specifically to work on upstream open source projects. I'm going to get into the details there, and I'll explain why I've included these logos in a few minutes. The main goal is to identify, report, and then fix performance issues and meaningful bugs in open source projects, and to do that upstream. We skew towards what we consider impactful projects. How we define that internally is things that are technically interesting for the co-op, which is really important, but also things that we use as an organization. That's where you get the benefits to both sides.

After that, we write about the experience on our engineering blog. We've found that written communication skills are vastly underappreciated by new engineers, and it's a skill that I think is super important. It helps you clarify your thoughts, helps you
solidify lessons learned, and also enables you to share, all things that I think are super important. The key difference from some of the other programs, though, is that we wanted to make it relevant to the business so that we could make it sustainable: something that we could do long term and something that we could do at scale. So we have them do all of this with Datadog, the product.

If this is going to be successful, a core piece is that it has to be meaningfully beneficial for the students. So what does that mean in practice? Here are some of the benefits. They gain real-world experience in engineering that's publicly referenceable, and I think that last part is what's truly important. There are a lot of great internship opportunities around, but a lot of times what happens is you're going to say, "I worked at company X and worked on project Y," and that's about all the detail that you can give. In the case of our internship, because it is all open source and all upstream, you can say, "I worked on project X, and here are the PRs that I submitted." Not only can people see your technical solution, they can see how you responded to code reviews, how you interacted with other people, and how your thought process iterated as you got more and more involved. So it's a pretty comprehensive picture of what you worked on, all publicly referenceable, which we think is a pretty unique benefit that we haven't seen a lot elsewhere.

You're going to learn what open source is and why it's important, and that's a huge piece. You're going to see how to engage with open source communities. Some open source communities have a bit of a reputation for being intimidating, and having someone who is familiar with that community to mentor you through the process has been really critical for this program and has been a huge benefit directly to the students, and
the last thing I'll cover as a student benefit is that you build your reputation in the open source community. This has a couple of knock-on benefits. One is that it just straight makes you more employable in the long term, which is obviously huge. The other is that it allows you to work with a bunch of different projects to see what type of projects resonate with you and, just as importantly, what kind of projects don't.

Now the other side of that coin is that it also has to be beneficial to the organization if it's going to be sustainable for the long term. So what benefits is Datadog seeing, and why are we doing this? Well, one, it raises our open source profile. As I said, we use a ton of open source, we release a ton of open source, and we want to be good open source citizens and genuinely give back to a community that's given us a lot. This enables us to do that.

It also allows us to, pardon the pun, dogfood our own products. What I mean by that is that Waterloo and some of the other schools we work with closely are great engineering schools, and the people who come out of them are super technical and have a lot of technical skill. But, like I said, they don't have a lot of open source skill, and they might not have a bunch of observability experience. So if they're running into onboarding problems with our product, it's very likely that customers are also running into those same onboarding issues. Having someone we can work with to identify those problems and solve them has been really beneficial for us as well.

And lastly, the recruiting pipeline. There are two things I'll mention very briefly here. One is that we obviously hope to recruit the people who go through this co-op program. In addition to that, it allows us to show the great things that we're working on and expose our culture to a bunch of open source projects, and that also helps us with that recruiting pipeline.
So, two benefits there. Now that I've covered why I think this is comprehensively beneficial to everyone, I want to get into a case study, this one involving Andrew McBurney. He is from Waterloo, which is one school that we work with quite a bit. Before he joined as a co-op as part of this program, he had a little bit of experience in the open source world and had looked at a couple of projects, but he really wanted to dig in and learn more. He joined us in his junior year, in 2018, and as the first co-op in this program he would need to prove that the concept was viable: that it was beneficial to him, that it was beneficial for the organization, and that we should continue doing this long term. I'm happy to say the story has a happy ending. We still have the program, and he now works at Datadog full-time as an engineer.

As I mentioned, part of the objective is to find meaningful performance issues or bugs, use Datadog to find and fix them, and then get that fix not only submitted but accepted upstream. Andrew identified a performance issue in Homebrew, and for those of you who use Homebrew, it was specifically in the brew linkage command. Here is part of the PR that he initially submitted. Aside from the fact that you can see here that Andrew is very Canadian, one great thing is that we use Homebrew internally at Datadog quite a bit. This is what I meant when I mentioned a direct impact to us, but it's also something that we can share with the broader open source community.

Here is a flame graph from brew linkage before Andrew submitted the PR, and this is a flame graph from after the PR was merged. It's a little difficult to tell exactly what's happening here. You can tell there's an improvement, but let me get into some specifics. With no caching at all, brew linkage took roughly eight seconds to run. He initially used SQLite, but this introduced an
additional dependency into the project, and that's something the project did not like. This was a great lesson learned for him and something that we mentored him through after the fact: before you just start submitting a PR to a new open source project, you need to understand what its norms are. I explain it as participating a little bit in read-only mode first. This was something that was documented, so had he looked around a little more he would have discovered it, and I think it was an actionable takeaway for him.

Moving to the solution that was actually accepted, which was based on DBM and did not introduce any additional dependencies, you can see here that it took about a tenth of a second longer on the first run, which of course is while the cache is being built. But subsequent runs are quite a bit faster, on the order of 200 milliseconds, so quite a performance win.

With that PR accepted and merged, it was time to write about the experience and the solution for our engineering blog, which is where we share a lot of our engineering stories with the broader community. One thing that was interesting to me is that Andrew enjoyed this process much more than he anticipated, and he noted that learning about the importance of writing was one of the biggest surprises of his time here at Datadog. I underscored the importance of written communication skills earlier, and how underappreciated they are, so for me this was personally very satisfying. I thought it was great that it was something he took with him, and I'm quite sure it will benefit him for the rest of his career.

So what else did he work on? You'll probably recognize Dgraph, and you might recognize fastlane, but what is Gello? The great thing about Gello is that it's internal tooling that we've open sourced, but the part that I really like is that it's something that Andrew identified as an inefficiency in
our internal processes. He worked on his own across teams to fix it and then released it as open source. The resulting Trello and GitHub integration is now, I know, being used at other organizations, which is great to see. Something else that's interesting is that other co-ops after him have also worked on it, and our co-op right now, Ursula, is porting it to work with a different tool, following a change that we've made internally. So it's great to see both a community starting to form around it and subsequent co-ops working on it, which gives the program a little bit of continuity.

So what did he learn? One, how to interact with upstream communities. As I said earlier in this talk, some open source communities have a reputation for being a little bit intimidating. But even with the super welcoming ones, and there are many, if you're new to it and haven't done it before, you're not really going to know the ins and outs. Having a mentor to walk you through the technical part of it, the people part of it, and all the different aspects involved with open source was something that he really took away from this.

He of course learned about modern microservices, distributed systems, modern infrastructure, cloud, and so on. Like I said, we're solving a lot of cool problems at scale, and we do throw you right into production, so that production experience is something he was able to take away.

This is a small one: he learned an additional programming language. Obviously, if you know a couple, learning an additional one is not a huge thing, but he said he enjoyed the process, especially under the conditions that the program facilitated.

Next, the importance of making data-driven decisions, and this was a big one. I know data is in our name, and we are a very data-driven company, and I showed you a
little chart and a couple of flame graphs, but the amount of data he used to iterate on the different solutions as he worked through the technical side of this problem was huge. He noted that a huge takeaway for him was how important making those data-driven decisions was: not just saying, "I think this is going to be a better solution," but actually proving that it is a better solution, showing how much better it is, and having numbers to back that up.

His next co-op actually ended up involving work on one of the projects that he contributed to while at Datadog. This isn't specifically a lesson learned, but it's something that he was really happy with and a takeaway from the program.

The last thing he learned is the importance of a positive work environment. I think this is really a great place to work, and we have a really great culture. He noted that he hadn't realized how important that was, and as he was exposed to some other places he realized that it was super important to him personally. He identified it as one of the reasons that he joined us full-time a little over a year later.

So with that out of the way, I want to cover some of the challenges: challenges for him, challenges for us as an organization, and some challenges for the program itself. I'm going to lump these into some broader categories.

The first one I'll call onboarding. Andrew went through the same onboarding as all engineers do here. I mentioned earlier that we were finding that folks coming out of school didn't have the open source experience that we would like, but I think we underappreciated just how much of an additional crash course they would need on the open source side. It's something that we've since identified and are much better about with subsequent co-ops, but it's something that you can learn from us. Another challenge is
working through what I'll call the vagaries of different open source projects and the communities that he was interested in. As I said, every open source community is going to be a little bit different, and mentoring him through that was generally a challenge. It's certainly not one that's insurmountable, but it's one that you will definitely come up against and should be aware of.

The last one I'll cover as part of onboarding is helping him understand the value of the program to the organization. I'll be honest, this is one that we did not anticipate in any way, shape, or form. We have a lot of co-ops from Waterloo, and he was friends with some of them. As they were submitting PRs to the product itself, there was a clearer direct line: "I'm committing to the product." Because he was working on upstream open source, he didn't have a good understanding of why this program was super beneficial to us as an organization and not just to him, and to be frank, we didn't do a good job of explaining it. Once we were able to explain that and he really understood it, it recharged him completely. His morale shot way up, and he was super happy and continued with it. So that's something you should be aware of if you're going to implement this at your organization.

The next group I'll lump into setting expectations, and this is going to become a recurring theme. Right after he joined, we sat down with him and he discussed a list of things that he was interested in from a technical perspective: projects that he had identified, based on us telling him what he was going to be doing, that he might be interested in working on. The thing to understand here is that co-ops are going to be optimistic, possibly wildly optimistic. Your job is to be realistic. So if they want to work on Kubernetes as a whole, or Rails as a whole, that's probably not a
great fit given the time constraints and the complexity of those projects. There are a lot of eyes on those projects and a lot of people working on them. So what you need to do is set realistic goals, and that's really, really important. Setting expectations, both in terms of those goals and in defining what success is, is really important for a co-op program. Waterloo, as a specific example, has a very rigorous rubric for grading its co-ops, and if the co-ops don't understand exactly what success looks like, they're not going to understand how they're graded, which is going to be terrifying for them given how the program works. So we wanted to make sure he understood what success looked like for him and how he could actually attain it.

The next challenges I'll lump into what I call false starts. Even if you put in the work to pick a project well, some projects that initially seem like a good fit may, for a variety of reasons, end up not being as good a fit as you had hoped. The issue that you identified and wanted to fix might be fixed by someone else before you're able to submit your fix. A problem that seemed pretty easy to fix might end up being intractable, especially within the time box that you have. The important thing to consider here is that that's okay. In fact, you can use it as a very good learning experience. What's important is that you understand when to move on, and that you allow the co-op to time box the effort, realize that it's not going to happen, and see that it was a lesson learned and that it's fine to move on. And while I'm speaking of false starts: go Bills.

The next group of challenges I'll lump into maintainer and project responsiveness. One thing that we failed at is that we didn't work with projects well enough ahead of time so that they knew this was coming. Google Summer of Code, for example, does a phenomenal job here, and we've modeled how we handle this now much more in
that frame. If projects don't know that you have someone working on this, they're just not going to be able to accommodate it a lot of the time, based on how open source projects work. Part of that is that their timelines and your timelines are just not going to be aligned. You have exactly 16 or 20 weeks to do this, and that doesn't fit in with their timelines, or how they do sprints, or how they do releases. So it's very important to reach out to projects ahead of time. Let them know how the program works, let them know that someone from the program is interested in contributing, see what issues they think would be a good fit, and then work on those.

Another thing I'll mention here is that the variance of response times in open source projects is high. It's not only high from project to project, it's high within a specific project over time. Things within a project may change: maintainers move on, maintainers change, maintainers get busy. Understanding that there will be variance, that that's just the reality of it, and baking that into the process is going to be super important.

Lastly, I'll cover the length of the program. Getting everything done in a compressed time frame can be frustrating for the co-op, and that's the reality of it. So once again you're going to have to be super deliberate about time boxing things, and because of that you need to plan very well. This comes right back to how important setting expectations is. Like I said, there's a theme here. If there's one thing you take away from this, it's that setting expectations across the board, with the co-op, with the organization, and with the program, is absolutely critical to success.

With those challenges out of the way, I want to give you some tips on how you can apply this at your organization, and some lessons learned along the way, so that you can avoid the mistakes that we've already made.
So, number one, and these obviously come directly from the challenges that we faced: engage with maintainers and projects, and do this ahead of time. Be super deliberate about explaining what your program is, how much time you have, and what your goals are as an organization. Make sure that it's a good fit for you and a good fit for the project, and then work together collaboratively so that you're setting the co-op up for success.

Next, write the story as you go. I think this one is really important and is probably going to be underappreciated. Like I said, writing the story down, and written communication skills in general, help you clarify your thoughts, help you cement lessons learned, and enable others to learn from you. That is the entire point of our engineering blog, and of most engineering blogs: you want to share those stories. But doing it well takes more time than most people realize. Starting early ensures that you're documenting things along the way while they're fresh in your mind, and, to be frank, it hopefully enables you to finish on time. In Andrew's case, his post wasn't published until after his co-op ended, which is something that we worked very hard to ensure didn't happen again.

Next, the length of the program directly impacts success. We work with a lot of schools. I've mentioned Waterloo specifically a few times; we also work with Northwestern and with a couple of universities in Europe. They tend to have 16-week or 20-week programs, and we've found that that's enough time to really get something done: dig into a couple of different projects, write about it, get it posted, and so on. We've found that with some of the summer internships, where you're getting four or maybe eight weeks, it's just not enough for a program like this and for our goals. If your goals are different it might fit, but we've found that it is realistically not enough time.

Mentoring, speaking of
time, takes time, especially since at Datadog we are a pretty distributed organization. Helping a co-op evaluate projects, then pick a project, then mentoring them through the technical challenges, the open source processes, and the human elements that they're going to run into: all of this takes time. If you don't have the time to do this well, realistically I would recommend that you not institute something like this. Hopefully you can make that time and do it well, but don't underestimate the time that it's going to take for this to be a rewarding experience for the co-op. The reason I mention this is that it's a great opportunity to get more people involved in open source. What you don't want to do is do it poorly and then sour people on open source. That would just be an opportunity lost.

So, in conclusion, four students to date have finished the co-op program. There you see Andrew, but also Taylor, Shiva, and Chelsea, who have gone through it. I mentioned Ursula, who's on the team now and is currently working on a couple of cool things that hopefully will result in some PRs soon. We've submitted dozens of PRs across dozens of different open source projects. All four co-ops have said that they really enjoyed it and found it beneficial. They're all still interested in open source, which has been super satisfying for us as an organization and super satisfying for me individually. And I think that you can implement something like this in your organization.

As I said, I've left plenty of time for questions. I'd love to hear specific thoughts on how you might implement it, and if you're doing something similar, I'd love to hear your lessons learned so we can learn and share collaboratively. Thanks.

Thank you for joining me for Training the Next Generation of Open Source Developers. As I said, my name is Jeremy, and that was the story of how we
implemented an open source co-op program at Datadog. There were a couple of questions that came in during the session that I answered via text. If you do have any other questions, I'll give you a little bit of time (thank you, Steve) to get them in. If not, as it says on the slide here, I'm @linuxquestions, and I'm happy to help you in any way I can to implement this at your organization. Once again, it's an absolutely great way to mentor and foster open source amongst the next generation of developers. So with that, I appreciate it, and thanks again for coming to the talk.

A question from Joe was: do you have any tips on a specific angle or arguments that should be used to get students excited about open source? I think there are a couple of different things that you can underscore that are to their benefit. One is that open source is obviously a great way to learn about different technologies, get exposed to different programming languages and methodologies, and see what is happening in the industry. That part is broadly applicable, though. One of the big things that I would underscore to students specifically is that this is a way for them to work on projects and have the next people who are going to be hiring them be able to see how they deal with code, how they deal with code reviews, how they handle interpersonal interactions, and how their thought processes iterate. In any other co-op program that wasn't open source, you would not be able to showcase those things. So being able to almost have an open source portfolio when you're going through the hiring process is going to be really critical, and it's something that multiple co-ops here have said has been super beneficial to them. So great question, Joe, I appreciate it.

And once again, I will upload the slides, which was the question from Alpash. So it looks like that is all the questions we had. Once
again, if you think of a question, or are watching this on replay and would like to ask a question, hit me up at @linuxquestions and I'm happy to help you out. Thanks again. Bye.
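
For readers who want to see the caching idea from the brew linkage case study in concrete terms, here is a minimal sketch. It is written in Python rather than Homebrew's Ruby and is not the code from Andrew's PR. It only illustrates the general pattern the talk describes: keep results in a DBM file (part of the standard library, so no extra dependency), pay the full cost on the first run while the cache is built, and read from the cache afterwards. The function names `slow_linkage_check` and `cached_linkage`, the key format, and the simulated delay are all assumptions made for illustration.

```python
import dbm
import json
import os
import tempfile
import time


def slow_linkage_check(keg):
    """Hypothetical stand-in for an expensive per-package analysis."""
    time.sleep(0.01)  # simulate costly work; not a real measurement
    return ["lib" + keg + ".dylib", "libSystem.dylib"]


def cached_linkage(db_path, kegs):
    """Return (results, cache_misses) for the given package names.

    Results are stored in a DBM file at db_path, so a second call
    with the same packages skips the expensive check entirely.
    """
    results = {}
    misses = 0
    with dbm.open(db_path, "c") as db:  # "c": create the file if needed
        for keg in kegs:
            key = keg.encode("utf-8")
            cached = db.get(key)
            if cached is not None:
                # Warm path: decode the stored JSON value.
                results[keg] = json.loads(cached)
            else:
                # Cold path: do the work, then remember the answer.
                misses += 1
                results[keg] = slow_linkage_check(keg)
                db[key] = json.dumps(results[keg]).encode("utf-8")
    return results, misses


if __name__ == "__main__":
    kegs = ["openssl", "readline", "sqlite"]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "linkage_cache")
        for label in ("first run", "second run"):
            start = time.perf_counter()
            _, misses = cached_linkage(path, kegs)
            elapsed = time.perf_counter() - start
            print(label, "misses:", misses, "seconds:", round(elapsed, 4))
```

Printing the miss count and elapsed time for each run is also a small instance of the data-driven habit described in the talk: the claim that the cached path is faster comes with a number attached. A real cache would also need some way to invalidate stale entries, which this sketch leaves out.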