How will I know? I really like your tunic. That's very nice. Fabulous. So we've got about three minutes left and then we'll get started. And it's not going to take the full 45 minutes, so we'll just have a good conversation. I brought my personal journalist and photographer with me. My daughter's studying journalism, so she's going to... You're going to tweet, right? You're going to live tweet. You're going to write an article on me. Okay. Totally bought the Kool-Aid. Yeah. Been feeding it to her over the last 20 years. Mother's milk. Someone you know? Please do. The mic is there. About a minute, I think, right? Three o'clock, yeah. There are some pieces here if you want that you can kind of... I've worked for Comcast, then. Thank you.
Thank you very much. I live in San Francisco, so it was very appropriate that I used the Golden Gate Bridge, right, to talk about this topic. And a little bit about myself. I recently joined Comcast. How many of you know who Comcast is? Yeah. So what is really wonderful about joining Comcast as an open source program manager, a director of open source, is that I found a culture that was already adopting open source inside the company. It's actively engaged in open source projects, whether it's OpenStack, or the Yocto Project, which is the embedded Linux project, or Apache Traffic Server in the content distribution network. So it's a very, very active open source culture inside the company. And my job really is to bring more organization and orchestration across the company so that we become more of an open source culture and a technology company from a community perspective. So this topic is very near and dear to my heart, and I'll tell you why. Just to introduce myself: this picture was taken about five years ago and I've not changed it because I like it. Who wants to age your picture, right? So I started working in open source in 1998. How many of you know Silicon Graphics? Yes.
So SGI was a phenomenal company, a highly innovative company behind films like Jurassic Park and Indiana Jones, a lot of post-production rendering and technology. And we used to produce proprietary hardware and proprietary software. The software was called IRIX. It was the operating system that ran on SGI's boxes. And SGI realized in the late 90s that the writing was on the wall, that people were moving to Linux on servers, people were moving to Windows NT on servers. And so soon we started shipping SGI boxes with Linux and also with Windows NT on them. And therein was my introduction to open source. I got to work with people like Jeremy Allison, who works on the Samba project. And I got to work with people like Dave McAllister on the strategy of moving to Linux as a company. It was everything to do with: which pieces of IRIX do we open source so that Linux becomes richer and more capable of doing what IRIX used to do? How the heck do we make money now that we don't charge for IRIX? We used to charge for IRIX. How do we provide support to our customers who use this for post-production and need that 7x24, SLA-based support? So it was really a lot of interesting work that we did. That was my first introduction. I remember asking Jeremy Allison very, very naive questions like: who is in the community? How does one work with the community? How do we guarantee our customers that there will be a patch if there is a problem? And so on and so forth. Those were early days, and frankly, if you saw the IBM campaign on open source and Linux, you found that we were all, as companies, struggling to understand how to message the fact that we were in open source, that we were in community, and that we were cool and hip and not corporate and not commercial, right?
So we were trying to kind of transform and bridge ourselves into the open source world. From there I went to a small company called Tripwire. How many people know Tripwire? Yay! So Tripwire is a very, very cool company based out of Portland, Oregon, and in the days that I worked with them they worked on intrusion detection software. Tripwire had its roots at Purdue University as a freeware project written by Gene Spafford and Gene Kim, and then the company had made it proprietary. They had a version for Solaris, a version for Linux, et cetera, and the community had really not been happy with Tripwire, that they had taken something that was freely available and made it into a proprietary product. So my job was to mend fences with the community, to open source Tripwire for Linux, to build Tripwire.org, and to start building the bridge back with the community. So I really, really enjoyed working for Tripwire. From there I've done various things. I've worked in very small companies and very large companies, and then I went on to a company called Wind River Systems, which is now a subsidiary of Intel, and there is another story there. Wind River was known for its proprietary embedded operating system called VxWorks, and VxWorks was the way the company made money. But again the company realized that the writing was on the wall and that people were moving to open source and Linux. In fact, the embedded world, as many of you know, is so fragmented. There are so many flavors of Linux, and each of us can make our own Linux, right? You just have to go download and create your own embedded Linux for the device that you're making. So what we tried to do was to create a standardized Linux distribution, if you will, for embedded. It was not really a distribution; it was a development environment, which was Eclipse-based, with a bunch of packages that we would ship you that you could then pick and choose from to create your own embedded distribution.
So it was a build system, it was a development environment, plus it was a number of packages, and in the fragmented world of embedded Linux it was the number one commercial embedded Linux distribution, so you could call us the Red Hat of embedded Linux. So I got a chance to understand how to work with upstream communities there, and then how to articulate the value of why a commercial distribution made sense, and also to really work on the business model of Linux in a commercial realm. And as you can see from my experience, I really come from the commercial side. I have not been in community as much. So my job is to understand how to work better with communities at scale, and events like SCaLE and FOSDEM and LinuxFest Northwest really give me a sense of coming back and connecting with community. I got to work on the Yocto Project, which is at the Linux Foundation, as one of the founding members, and I just joined the Linux Foundation Board last year as an at-large director working on diversity and inclusion for the open source community. And I lead a couple of groups, the Women of OpenStack as well as Women in Open Source, and essentially we are trying to change the culture to be more inclusive, to include more people of color and of different genders and races in open source. And I love speaking on what I know best, right? My company perspective on open source, as well as diversity. So the reason this talk came about was that as I wandered through a number of different venues like OSCON, like LinuxCon, and even community venues, I discovered that there was a big chasm opening between community and companies. Communities would say: I think open source is no longer what it used to be. It's not the old coding days, it's not the old hacking days. It's transforming, it's becoming too slick, it's becoming too commercial. There's too much money driving open source and open source interests.
And there was this tension that I was noticing between the ideological groups in open source and the more pragmatic, more commercial groups in open source, around licenses, around how events were run, how money was made, how companies were structured, et cetera. And then on the commercial side, the company side, I would hear things like: gosh, Linux is still not ready for the desktop, it's still not mature enough, it's still not tested enough, it's not stable enough, can we use it? Oh, I'm scared to use this particular license. Can we expect people to give us the SLA that we need from a support perspective? So my talk was really motivated by the question: how do we bridge this divide? Because as Linux and open source get adopted and transform industries, transform society, transform organizations, we really have to get this right. We have to work together. We have to make each other stronger and build a stronger community. Otherwise I think we will not see open source 3.0 or 4.0 coming up. So my attempt is to come up with a few things that I've learned from my own relationship and my own family, and also from observation, that we can do to change this. Okay, so many of you will say it's no surprise. It's not a surprise that there is this tension, and some amount of healthy tension is good, right? Because we make each other better when we pull from different sides. And the reason for this tension also is that there are millions and millions of projects now. I mean, there's a project for everything. One of my colleagues from Comcast is here, Sam Glusky, and Sam was saying to my daughter, who's studying journalism: well, if you're using this commercial package, there's an equivalent in open source. So the challenge is that we are all moving to open source. 78% of companies are using open source, and frankly only 3% don't use any kind of open source.
That's one of the first things I always tell my company also: never believe a vendor who tells you that there is no open source in the code that they're shipping to you. There's bound to be some open source. Look at your iPhone. If you open the attributions in your iPhone, you'll find that there are some open source attributions there as well. And with software eating the world, every industry is getting disaggregated, which also means that there are a lot of new entrants into open source, people who have never seen open source, have never been in open source, who are completely new to this whole world. And so therein lies the need to build bridges even more, right? Because we need to welcome these new people into open source. We need to make them feel comfortable with how to work in open source. And frankly, if you look at communities also, how many of you work for companies? How many of you feel you're also part of communities? Exactly. I think a majority of communities these days is made up of people like you and me who also work for companies. In fact, companies are paying us to be involved in open source. And yes, companies and communities are very, very different. I think a lot of you have read The Cathedral and the Bazaar, right? And when you read The Cathedral and the Bazaar, you realize that inherently we are such different creatures. Companies are more like cathedrals. They are these grand structures. They're organized. They're top-down. They have a certain way of releasing product. Whereas communities are more like the bazaars that I grew up with in India, full of chaos and sound and smells and hawkers shouting out to you. And somehow you manage to sell and buy and get business done. It's very bottom-up. It's a very community-driven, collaborative approach. And I like this slide a lot, which came from one of the guys at the Linux Foundation, Bill Steinberg, because it says companies really are very top-down in management.
There's a very structured way of creating software and engineering. There are methodologies. There are checks and balances. There's program management. There are processes and metrics. Every software engineer is driven by deadlines, the number of FTEs, and budgets. And frankly the measure of success is: are you selling this product, are you making money, are you making a profit, is there an SLA that we give to customers saying I will get back to you within two and a half hours or three hours? And our customers are also formal structures and organizations who expect this contractual agreement and support from us as a company. And as you can see on the other side, open source projects are completely different. They are not at all structured. There's more consensus. There's more collaboration. There's more of a sense of technical merit. And as we were saying at lunch, everyone is equal. It's democratized, and people expect feedback and peer review from each other. And the processes are very informal. They tend to have users, sponsors, communities, contributors. And what I've also observed is that every project is a different culture. There is no written-down culture and norm. You've got to kind of lurk around in the user groups, you've got to lurk around on email, in order to figure out how to behave in that culture and how to get your patch accepted. And it's still very chaotic. I mean, with millions of projects, there is no standard, if you will, for how projects behave and run. I think there are two really distinct characteristics that have been the cornerstone of open source communities. One is the license that they use, and the license is what really shapes the nature of communities. It basically says that the source code is made available. You have a right to study it. You have a right to change it. And you have a right to distribute it to anyone.
We were trying at lunch to explain to my daughter, who is not in open source, what open source is. And we were saying that's what makes it different. Nobody is trying to tell you that you can only use it for this purpose, or that you cannot examine it, or that we will take it away if you don't pay the fee. It is yours to work with. And so that's such a distinctive element of what open source is. The other distinctive element, I think, is the culture of open source. It's one of the only positive and vibrant examples of open collaboration. And I think the license dictates the culture, but the culture also makes sure that we stay faithful to the license and that we preserve the freedoms that open source gives us. And as we talked about, every community is so different. Everything from the very formal, large Linux community, the group of kernel developers pictured here, to some projects which may never see the light of day. It may just be your Git repository. And you created it because it was a project at school maybe, or Google Summer of Code, or you did it because you had an issue and no one else really knew about it. So my point here is that communities are still so different. You can have big or small, you can have very formal communities or very unstructured communities, you can have a wide variety of diversity in a community or no diversity. And from a company perspective, the communities we often look at are those that we consume, right? The ones that we are dependent upon, that we use in our products or in our infrastructure. And not every community gets the same level of attention or respect from a corporate perspective. So there's a lot of change going on in companies as well, and from a Comcast perspective what I've discovered is that we use open source everywhere. There's a ton of open source we use. In fact, one of the philosophies that we're trying to get to is: look for open first.
And if you cannot find open, then see if there's a proprietary equivalent. If you cannot find that, then you make it. So don't always dive in and say, if it's not made here, it's not good, right? Always try to consume, because there's always a good starting point somewhere in the open source community. The second is a little harder for commercial companies to do, and we are trying to do it. I used to work for SanDisk, Western Digital, and it's a very hard road to transform. Many of us consume, but hardly any of us contribute. And the reason is, as I said, that in a very structured engineering organization you tend to consume and then you have to move on to the next project, and there is never budget or headcount set aside for contributing back or upstreaming. And also people don't get the importance of upstreaming. People sometimes tend to think: but my competitor will get to see this code and they'll benefit from it. They don't stop and think: I benefited from somebody else's code too, so I need to do this. And the cost of not upstreaming is less understood. If I don't upstream, I'll spend all my time constantly catching up, integrating, making sure it's maintained. And then I'm forking the distribution. So this is something that is on our goal list as a company: to make sure that we contribute back to open source, that we upstream everything, that we give back. And it's also a good thing to do. The third thing that we're trying to do, and many companies are trying to do, and I see Duane here as well, is use open source and the methods of open source inside the company. I think O'Reilly, and Duane knows this term very well, calls it InnerSource: if open source is so successful and so vibrant as an external method of companies and people working together to create code, why not use it inside the company to collaborate and create products inside the company?
And it's a powerful, powerful way to break down the silos in a company and to have people reuse code that's produced inside the company, to collaborate with each other, to not have technical debt, to not wait for somebody else to fix something. So what if customer support can go fix a bug by doing a pull request and having it committed to the code, rather than wait for the engineering team to fix it? What if I am downstream from groups in the company that have already created building blocks? Why can't I use those to make my product? Why do I need to start from scratch and create everything? So that's a powerful new thing that we are also trying to do as a company: break down the walls and have people reuse and work through open source. The fourth one, and many companies are doing this: they are discovering code that they've written for themselves, or code they've taken in from the outside and improved for their production or their scale, and they're putting it back out again and saying, I'm sure there are so many other companies out there that can use this. Facebook does this a lot. Facebook will ingest a lot of projects. They'll modify them. They'll use them in production. And then, because it's hardened from a production perspective, they'll put it back. So they've put out about 300 projects, I think. We've done a number of projects as well. We've put a ton of projects out there because we felt that others can benefit from them. And one of the side benefits of this has been that it's very helpful for recruiting developers, because developers can now see for themselves the kind of work that you do. If it's interesting to them, they will start contributing to it, and if they start contributing, you can take a look at people who have the skill sets you need and who know the kind of work you do, and it becomes easier to recruit people as well.
So we're trying to do all four of these in companies across the board. I know Duane's company, PayPal, does a lot of this. A number of companies are working through all four methods, not just consuming but really actively participating. And I'm seeing communities grow up as well. You find that communities are learning how to scale through doing better onboarding, better communication, better documentation, better UIs, better websites, and they are understanding that some amount of structure is good: having a community manager, having governance, having a code of conduct, et cetera. So I'm finding that communities are also growing up and realizing that they need to reach a wider audience, because all of us have the same objective, right? We want to attract more users. We want to attract more contributors. We want to attract more money and other things into the project to make it successful. A number of us measure the success of projects this way. When I was with the Yocto Project, we also looked at how many downloads of the project there had been, how many new users we had acquired, how many open issues we had, how many new contributors we had attracted, how many pull requests, et cetera. So the measure of success has been how you scale the project successfully, and I'm seeing a lot of maturity in a number of projects, which is fantastic. The takeaway for me from the last two slides is that there has been a lot of movement towards building a bridge. Companies are becoming more community oriented. Communities are becoming more structured and professional and growing. But there's more work to be done. So how do you build bridges? How the heck do we build bridges across company and community so that we're ready for that next term? So here's where I'll get personal. This is my husband and I when we first got married. This was about 30-some years ago, and as you can see, we're very different people.
I grew up in Bangalore, India, and my husband grew up in Bismarck, North Dakota. So we came from very, very different worlds, and believe you me, it's not been an easy journey. You don't build a relationship over 30 years in a smooth way. So there were a lot of lessons I learned from our marriage and from building a successful relationship, especially with someone who's so totally different from me, and I feel a lot of those parallels apply to community and company as well. I'm more like the bazaar. My husband's more like the company. And so we realized that it's important to understand each other, to understand where the other comes from, what their motivation is, how they grew up, why they are the way they are, right? In community, why are some ideological people so strongly passionate about what they do, and why do companies care so much about what they are doing? One of the things that we learned early on was to value our differences and to really value our strengths. My husband will say that he could never have traveled as much as he has had it not been for the fact that he met me, because he had never been west of the Mississippi, I think. He was a small-town boy, he grew up in Bismarck, and maybe he had gone to Minneapolis once, but he had never really moved past that. And then here I was, all the way from India to Bismarck, and he got to travel the world, he got to see so much. We inform each other, we strengthen each other, because we value differences and we value what each other brings to the table. I learned a lot from him. I'll have you know that when I first came I did not know how to use a washing machine. Here I was saying, how do I wash my clothes? My husband wrote down a chart for me on how to sort clothes: blacks here, colors here, white clothes here, and then look at the temperature and read the label on the clothes. I wish I had preserved that chart. But we learn from each other. Respect each other.
It's so easy, it's so easy to lose respect for each other when we are different, and to think that our way of life is the only way and that it alone deserves respect, and that if somebody else thinks differently or leads a different life, they're somehow not worthy of respect. So it took a lot of work to understand that there is a perfectly different way of being a genuinely good person, and that it's okay, and that we should respect each other's way of leading our lives. Really, all good relationships start with respect and come from respect. Open communication. I must admit it took me a long time to come to that conclusion, because I would say: what do you mean he's not a mind reader? He should know. He should know that I want flowers for Valentine's Day and that I expect him to ask me certain questions, and if he didn't, then clearly he doesn't care. But it took me a long time to realize that people can't read each other's minds, that you really need to tell them what you want and how you want it and what you need from them, right? If you had a bad day: I need a hug. Can I get a hug, please? Instead of expecting that person to know that you somehow had a bad day. There are some people who are very perceptive, but not everybody. And relationships take work, and so also, I think, community relationships really do take work. It doesn't come easy. No relationship that's worth its salt, worth its weight in gold, comes easy, right? We have to work at it. And especially as a woman, I needed to learn that I can't lose my own self and become completely immersed in who my husband is or what the relationship is, that I need to continue to encourage and empower myself as a person. I think a lot of us depend upon the other person for happiness, and we say he's not making me happy, or I'm unhappy because of him or because of her. And I think it's so important that we take responsibility for our own happiness and build on common purpose.
And this is such an important plank, I think, for the community and company conversation as well: that there are some common themes, common purposes, common things that brought us together, and that building on that common purpose is what makes us strong. So what do companies need to do in order to develop better relationships with communities? I think it really comes from relationship building. Respect: you've got to give respect in order to get respect. So we have to respect communities. We have to respect what they bring to the table. We've got to respect the culture of communities, how they work, and work within that culture. The second is that companies need to be transparent, and I think Duane and Denise were talking about this before. It's very, very difficult, especially for someone like me who sits between community and company, to not be transparent, right? Because my bills are paid by my company, they are the ones paying my paycheck, and I need to represent their perspective. But I also need to understand that I am this conduit between company and community, and that I need to represent both perspectives and balance the two. And companies need to be more transparent. The more transparent you are, the more you're trusted by the community and the more you can get your objectives accomplished as a company. The more opaque you are, the more agendas you have, the more difficult it is for organizations to work with you. And we talked about a couple of examples of projects that are very, very narrow and run only by companies, which have no community participation. They create all the roadmaps inside the company and then they kind of put it out there, or you build a big code base and then you just throw it over the fence and expect everybody to start adopting it. Those things don't work. The next is, and this is something that we're working on, that we cannot sit at the table if we don't bring a dish for others to share.
It's like going to a buffet or to a potluck and constantly eating what everybody brings to the table but never bringing a dish of your own. Soon someone's going to kick you out of that potluck and say, you know, forget it. You're kind of a freeloader. You leave. And so I've been trying to tell my team, at a previous company and here as well, that we need to make time for contributions. We need to earn our right to be at the table, and as we talked about just a little while ago, there are many, many good reasons why we need to contribute back to the community. I take compliance very seriously. I think that shows respect for the license. That shows respect for what the developer's intent was when he or she created the project and added the license to it. And so we work very hard as a company on making sure that we are compliant and that we attribute the project author, the copyright holder, et cetera. I think a company can show respect for the community through being compliant and through taking compliance seriously, because, as we talked about, the license is an important element of communities. The other place where we go wrong sometimes is when we try to control projects and we say, we want it to go in this direction, or we want the patch to be this way. Controlling just doesn't work. You've got to earn your right to influence the direction of the project as a company. And companies, I think, are in a good position to provide a lot of support, whether it's people support or financial support or evangelization or community management, et cetera. And so I think companies can do a lot to build bridges. If there are any other ways companies can work better with communities, I'd love to hear about them. You can get back to me on my Twitter account, or we can certainly talk about that at the question and answer hour.
I think communities also need to grow up, in the sense that yes, code is very important. Code is the beginning of everything. But from code we also need to create more complete projects. We need to document projects better, because when we document them it's easier to onboard new people. It's easier to get new users. It's easier to scale the project. We need to work on UI. We need to work on better testing of projects. And we need to onboard new and diverse members of the community. Frankly, we talked about the fact that many new companies and many new industries who have never worked with technology are now entering technology and open source. And it's so important that we teach them how to enter open source and work with open source. But also that we welcome women, we welcome people of color, we welcome other groups into open source, because it only enriches the community. It only enriches the project and it only makes it better. And I think when we use more structured community management and formats, governance, et cetera, then we also tend to make communities more professional, easier to work with, and more capable of being adopted and scaled. So just to end my talk before we break into questions, I wanted to say I'm just so privileged and so lucky to be a part of open source. It's something I stumbled upon at Silicon Graphics and it's been such a joy in my life. It's paid for my kids to go to college, but it's also been very fulfilling from a personal perspective. And I think the keynote speaker this morning also talked about how we have a responsibility to transform the world in a social, cultural, political way. We can. We are the technology fabric, and we do need to build bridges. We need to build a stronger community and company relationship for our growth, for our survival, and I'd love to sit down and talk about how we can do it better and how we can improve working together.
So that's my talk on building bridges, and I'd love to take any questions, or if you have comments to make, I'd love to take those.
Yes. I put a big, big value on projects that build very strong communities, and one of the projects that does a fantastic job is Python. I think it's diverse, it's inclusive, it rewards people for contributing. It doesn't just have drive-by contributors; they actually build community. They kind of keep them together and they work really, really hard to get that going. I think Rust is also very good. Django is very good. Does anyone else have examples of communities that they know have worked very well?
I completely agree with Sam. Inside OpenStack they take community building seriously across companies that usually compete with each other, and they also work very well to build community with people who are not part of these commercial companies. I totally agree, OpenStack does a great job. Exactly, Node.js is also a fantastic community. I've had a chance to talk to some of the members of Node.js, and I think Denise is on the board of Node.js as well.
I guess what I'm looking for is, I've participated in many of these communities. For example, you said that just releasing code isn't good enough, that you need testing and documentation and UI. What I think large companies need to be doing is looking at the successful pieces and slices: this kind of thing was good, and here are basically the building blocks. If we want to open source something, these are the kinds of things that we saw were very successful. And not just "oh, the Node.js community is very successful", but also what part of it made it good. Their documentation is fantastic. They're very clear on what is supported and what the stability is. Is that something that you're looking for? I'm just trying to push into how we can make it repeatable and kind of standardized.
That's a fantastic question and that's an excellent point.
I think there are a number of people who are coming at it. GitHub is trying to work on that, and they're saying, can we create more standards around how to be a maintainer, how to create documentation? So they have created some really good documentation. I forget what it's called, but it's a kind of how-to-get-started guide. That's right, the Open Source Guides. So they've done a good job. I'll try to tweet the link also, because it's fantastic. So GitHub is coming at it from that perspective. Organizations like the Linux Foundation, the Apache Foundation, and Eclipse are also trying to do the same for all of their projects. They're basically holding them to a kind of structured standard: they have to have mentors, they have to have maintainers, they have to have onboarding documentation, how-to-get-started guides, and so on and so forth. So there is that example also, and I know Sam was going to say something here as well. There is no universal standard yet, but I think if you look at what GitHub is trying to do, because so many of us live on GitHub, and if you look at what the Linux Foundation is trying to do, you may find two good examples. Sam, you had something? So you can look up a group where a lot of the Silicon Valley companies collaborate on that, the TODO Group. So check out the TODO Group as well, because they also publish a lot of stuff on how they interact as an enterprise with community, how they do community building, and how to basically properly behave as a citizen, the company in general as a citizen of the community. So that's another good resource. Thanks, Sam. I should have mentioned it, because I'm part of the TODO Group. Duane is part of the TODO Group, and we are a bunch of program managers at companies like Duane's and mine, and we try to share tools, scripts, processes, et cetera, so that companies at least know how to work better with community. And then, from a community-with-company perspective, GitHub and the Linux Foundation, I think, are two examples. The question that
I had was, if a company is benefiting from or using some open source, what kind of responsibilities does it have, and what can you do to figure out when people are contributing to open source at an unsustainable level and are at risk of burning out or leaving the project? So the question is, how do you detect if someone is contributing at an unsustainable level and could be burning out, and how can you support them, help them? Does a company benefit from detecting things like the OpenSSL situation and other cases where you have something that a lot of people depend on, and the companies that need it aren't connected to the people who are doing the work? That's a great question and that's a very, very good point. There are lots and lots of projects that are run by the heroic efforts of one or two people, and yet we all depend upon them, like OpenSSL and others that kind of hold up the internet. I thought it was a fantastic thing that the Core Infrastructure Initiative did, which is raise money from companies, saying, hey, we're all depending upon this infrastructure, and yet it is totally unacceptable that these folks are burning out and not supported, and don't have the funding to do what they're doing. And so they collected money to make sure that these projects are propped up and supported, et cetera. I think companies do have the responsibility to support initiatives like that, so that core projects that we all depend upon are not at the mercy of sometimes people burning out, sometimes lack of funds. And sometimes it also helps to have some professional practices brought into the project so that it runs successfully. Did I answer your question?
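One rough way to approach the detection question raised above is to measure how concentrated a project's commits are among its most active contributors. The following is only a minimal sketch, not a method described in the talk: the function names, the `top_n` cutoff, the 0.8 threshold, and the sample author list are all illustrative assumptions, and commit counts are a crude proxy for workload or burnout risk.

```python
from collections import Counter


def contributor_concentration(commit_authors, top_n=2):
    """Share of all commits made by the top_n most active authors.

    commit_authors is a hypothetical input: one author name per commit,
    e.g. gathered from a project's version-control history.
    """
    counts = Counter(commit_authors)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(n for _, n in counts.most_common(top_n))
    return top / total


def relies_on_few_people(commit_authors, top_n=2, threshold=0.8):
    """Flag a project whose commits come mostly from one or two people.

    The threshold is an arbitrary illustrative value, not a standard.
    """
    return contributor_concentration(commit_authors, top_n) >= threshold


# Made-up sample data: two people account for most of the work.
sample_authors = ["alice"] * 45 + ["bob"] * 40 + ["carol"] * 10 + ["dave"] * 5
share = contributor_concentration(sample_authors)
flagged = relies_on_few_people(sample_authors)
```

A project flagged this way is a candidate for a closer look (funding, contributed engineer time, paid support), not a diagnosis on its own.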
Yeah, and I'm with Mozilla, and we're supporting some research on trying to figure out when people are participating in projects and what helps people stay motivated to be working with open source. I'd love to, yeah, that's so fantastic. I was just going to say, coming from an ops perspective, if your company depends on something, you should pay for support for it. That means you actually pay somebody else money for support, or you hire engineers or advertising people, whatever it is that you're doing that you need support for, or you're contributing to the community. And in general, if you depend on this product, you depend on the group that maintains that product, whether that be a company or a free software group of people that do things on weekends, whatever. You depend on a producer for goods that you then make into your product, and if that producer goes away, for whatever reason, that causes a problem for you. If the monitoring project that you've been using for five years to watch your systems goes away and you stop getting bug fixes and so forth, or you find out you have really massive SSL holes, that greatly impacts your company. So taking care of that community. But yeah, not just giving cash, but also helping the community stay healthy, as opposed to just throwing some code over the wall once in a while: letting people come to conferences like this, letting people help with the documentation, the UI, the testing, the things that you brought up. Yeah. Duane, anything else to add?
Yeah, I think one of the things that we see a lot in the OSPO is engineers who have technologies that they are genuinely passionate about and want to make contributions to, as a separate process from a full grasp of the projects that the company stack is really, really reliant on. And part of a benefit that can come out of your compliance processes as part of an OSPO is that you can get an audit of what your company is using. Then you need to be advocating from within your office to say, these are the top 5 or 10 things that we rely on, and we need to prioritize engineer time to become contributors and committers in these technologies, so that we don't run into exactly that problem. I completely agree. The inventory and the audit of what we're dependent upon as companies is such an important piece, because it allows you to kind of understand how important those pieces are and how you need to support them. So I did have a question. The logo on the screen right now, I'm sure, for many of us in the tech industry, pulls up kind of a visceral reaction. I know it's early days for you over at Comcast, but it's going to be your job to kind of work through that reaction, building community with Comcast, typically a company that's not well loved within the tech community, and to turn that around. I'm trying to be... do you have thoughts on how you're going to attack that particular problem? I know it's very early days for you there yet, but I'd be curious to hear what you have to say. So once you start working for certain companies, you always get asked why you're working for them. I had that when I started here, yeah, and people would tease me and say, oh, so you get free cable, and can we have friends and family discounts? I'm glad to say I do get free cable and I do get free internet, I do get HBO and other things. But seriously, what really drew me to Comcast is a very sincere desire to transform the company into an open source company, into a technology company, to really embrace this role
seriously. I would not have changed otherwise. I had a fantastic job at Western Digital/SanDisk. I was perfectly content and happy, and I lived a very good life in Silicon Valley. As we were talking about, I didn't want to be on planes going to Philadelphia. But I was really struck by a sincere desire to change, and they made room for creating an office such as mine, and I'm fully empowered to do what I need to do to build the culture inside the company and also work with communities outside the company. What I looked for was, do I have executive support for this? So I got to meet with our CTO, Sree Kotay. Some of these slides come from Sree, by the way. Sree has been talking to the executives in the company about the need to be an open source company. So to me, we had executive support. I've been talking to Comcast, by the way, since April of last year. So April through December I was talking to them, and then December is when I agreed to join. Throughout that period I knew that they were serious. They kept coming back to me, they kept talking to me, they kept allowing me to influence how we build this office. The third thing I was looking for was, is there a widespread community of interest? Sam Gliasky is here. Sam is one of our folks who evangelizes inside the company to other engineers on creating a Git culture, a pull request culture. He automates tools, he works with other engineers to take them forward and transform the culture, and with leaders. So I saw a lot of Slack usage in the company. We have active channels around open source, OpenStack, and GHE, which is GitHub Enterprise. 50% of the engineers are in GitHub Enterprise, and another 20% will move soon, I think. Sorry, yes, and I'd like Sam to speak as well. So I saw executive support, I saw a serious need for this function that they believed in, and I am being given full empowerment to go work outside and inside the company. It's such a wonderful feeling when I go meet some people who say, I've been waiting three years for you to get here, what took
you so long? So there's a deep desire at all levels of the company to transform, and part of it includes, to Duane's point, the branding of the company, changing how we are perceived as a company in the community. And did you also note, the other thing I loved is that it's a company that supports a tremendous amount of diversity and inclusion. My boss, the chief security officer, is a woman. I have never seen more women and people from different cultures and areas of the world in a company as much as I've seen here. It's not a Silicon Valley company, but it's such a diverse company. Also, the Comcast Foundation is something that I thought was fantastic: they support free internet services and free community programs and give back a lot. So all of those were factors that really drew me to the company. It's not about money anymore. It's more about, is there a place for me to carry out my mission and do what I want to do? But Sam, give him a microphone. Sorry, and we'll come back to you, apologies. Before I directly address your question, who asked the question about the culture? Duane. So before I answer that specifically, let me give you sort of the one-sentence lowdown of where I come from, just so you have a good idea where I come from specifically. So I'm basically a Linux sysadmin who develops software as a hobby. I wrote my first open source patch a little over ten years ago, and I've been avidly contributing to many projects ever since. Well, anyway, to sort of more directly address the branding question. I came from two perspectives. First, I went to college in Philadelphia, and as a college student you typically hate Comcast, and I'll just be outright about that, because you're on a budget, the bill is expensive, and in general there are a lot of things to dislike about the company. So when I first came on board, it actually took a neighbor like eight months to convince me to even think about applying. So the first thing I noticed when I came on
board in the company is, first of all, the psychological aspect of being in the company as an engineer in general. Because, as Nithya has come to learn, and as I'm sure some of you may know, when an internal company employee goes to introduce where they're from, there's a huge psychological courage that has to come with that, because you have to be willing to take a little bit of a beating when you say it. And it's a challenge even for me today. When I introduce myself, I purposefully take that beating, because me, as a person, I'm trying to change the perspective of the company, just so I can give people sort of a look at what kind of people work for the company. So one of the things that I have to evangelize to other employees, even, is, first off, don't be ashamed of where you work. If you show that shame externally, or if you yourself are ashamed, the first thing you have to get over is, yes, the company has a bad reputation, but you're in the company. You're one of the people who would be changing that company. So don't be ashamed of where you work, and try to relay what it's like. Now, from the reverse aspect, outside looking in, I'm definitely not happy about everything. The company makes a lot of money, and we don't necessarily have to charge as much for certain things as we do. And so part of that also comes with internal evangelizing of what's sensible, what we should be valuing as a company and as a community of people who work there, the company itself being a group of people who are working together to push the ball forward. There's definitely a lot of different aspects, but the most important thing, I think, that comes out of that is, for a lot of the employees, it's the best place I've ever worked as far as culture and community is concerned. One of the things that I relay and we try to build out is, first, try to change internally the sense of where you work, and form your own identity as to where you work and what you value. And then, from there, our reputation is everyone's responsibility.
It's not just, oh, it's the support guys, they didn't support the customers very well and now everyone hates us. It's not that at all. It's also, oh, you have a problem? Tell me more. What's your name, what's your problem? Let me help you. Maybe I can go into the company and look for some resources myself to help you. It's sort of everyone in the company's responsibility, and over time, hopefully, that will raise the reputation of the company in general. So that's kind of some of the ways I try to do it. Thanks, Sam. That's why I'm happy to be there, because I get to work with people like Sam. Please. To build on that a little bit, I'm also in the Bay Area, and this is, I think, the first time I have heard Comcast and open source in the same breath, so there's some evangelizing outside the company that needs to be done. But my question, and you touched on the first part of it already, is the distinction between consuming open source and contributing or publishing to it on the other side. I think they both enter into the problem of licensing, conflicting licenses especially. If you run open source projects, libraries, applications on your own servers, there's no problem. But if you're bundling it into a product, anything that's being presented to the user, then the lawyers get involved, or you say, is there something else we can use that actually matches this license, so we aren't going to run into conflicts on the other end? Do you run into that, and how do you deal with that in the real world? If I understand correctly, you're saying consuming inside the company is fine, but when you include it in products that you're shipping is when challenges come up around licenses and conflicts between licenses and things like that. Yeah, we take it very seriously. When we consume inside the company, we have a very fast-track process: people just create a bill of materials, or track which pieces they're consuming. But when we ship product, for instance our set-top box that we ship, the Xfinity box, and then the X1 box includes
the Yocto Project as a base, the build system, and then we use Linux as a base, and then we build a lot of other components on top of it. We do provide very clear attribution. It goes through a lot of review and scanning, we resolve conflicts between licenses, and we do publish all of the notices on our website. We also make source available where we need to make it available. So we take that very seriously. So there are three or four different compliance things we take seriously: of course consumption inside, then shipping in product, then contributing to a project, and then open sourcing something we built inside the company to the community. So all of that goes through compliance. I don't know if I answered your question. Pretty much, yes. If you're open sourcing a project that you've developed internally, do you have a standard license that you use? The trend that I'm seeing among my peer companies in the TODO Group and others is that Apache or MIT seems to be the license of choice these days, and we tend to stick to that as well, more permissive licenses. Yeah. There's no conflict with that with a GPL library that you might be bundling with something else? If there is a conflict, we'll make sure that we use a compatible license. Yeah. Thank you guys for your time, and I appreciate it. I can talk about it, I have it on my phone. Thank you.
Good. Testing, one, two. All right, everyone, I guess we can get started now. If you'd like just an introduction, I'm St. John Johnson. I'm the engineering manager at Yahoo. So just so everyone knows, there's a bunch of swag in the back if you'd like to get some pins and stickers and all sorts of fun things. Okay, so let's just kind of get a show of hands here. How many people use Jenkins at their company right now? It's almost everyone, that's pretty good. And how many people here do continuous delivery at their company? It's actually not that bad. All right, that's good, 20%. Okay, so today I'm going to be talking about how Yahoo does continuous delivery, and so I'll start out by saying
what is Screwdriver. So Screwdriver is up here on the screen. It is Yahoo's continuous delivery software. It's centralized, so business units don't have to create their own instance. The pipeline is represented as code, so everything is checked into the repository, no clicking through buttons and websites. We offer secure deployments: we isolate your production secrets from other people's production secrets. And it uses crowdsourced patterns, so getting started is ridiculously easy. The idea is, if I want to build a Python module at Yahoo, my Screwdriver YAML has one line that says Python. And we've been doing this for five years. So let's kind of get into how Yahoo got to continuous delivery, because that's kind of the story that we're trying to get to here. So Yahoo's been around for 22 years as of Thursday, which is pretty awesome, and for I'd say three quarters of it we've been using a custom package manager. So we've been shipping dozens of languages to multiple operating systems, and in the past pretty much everything was built by hand on someone's dev box. They package everything up in a TGZ, they upload it, and it goes to production. There's no record of where the source code came from, there's no kind of reproducibility about it at all. And at some point in the last ten years Hudson kind of came along, and people started running their own Hudson servers. They were doing some continuous integration. Some people were doing unit tests on check-in, some static analysis, some people were working on packaging, which was pretty nice. However, most of the time they're tied to some specific Jenkins node, and if it goes down there's no way to reproduce it. So in 2011 Screwdriver was actually created. The idea there was, the Node.js team said, man, we're about to build 200 different Node modules, now I've got to go and create 200 different Jenkins jobs, I've got to go copy Makefiles around, and if I want to change anything it's kind of a pain in the butt. So it was designed to automate Jenkins
jobs, just recreating the same job over and over again and following a basic set of workflows. And they only supported two at first, which were a Node.js library and a Node.js application for our internal PaaS. Over the past couple of years, wow, it really doesn't look good at 1024, but over the past couple of years there were some grassroots efforts across the company kind of looking for ways to improve. I mean, teams were shipping code to production every quarter, maybe. That was kind of the cadence. And so there was this movement to go to continuous delivery, and some teams like Flickr were showcasing, hey, we're shipping out to production multiple times a day, and this is commit to production with no human intervention. That's the idea. They wanted to have a red button pattern to say, stop the line. And at the same time teams were moving in, looking at test-driven development and behavior-driven development, looking at strong contracts, so you don't have to deploy your database and deploy your API at the same time, with the race condition, or take things out of rotation at the right time. It's all independently deployable. And different business units were looking at sponsoring trainings. They'd bring people on site to talk about how to improve their behavior-driven development test model. And we've talked about DevOps in the last couple of days already many times, but in this case we were focusing on kind of the ownership, owning the product that we're actually writing. So as a developer I'm going to write code, I'm going to write tests for it, and when I click that merge button, or someone else clicks that merge button, then I own the product going out to production. If it gets broken somewhere, I have to fix it. If it gets broken in production, I have to fix it. I'm responsible for it. So the way it used to work is, a developer would write code and hand it to their QE, QE would write tests and hand it to the release engineer, who handed it to the service engineer, and they would ship it out to production in some sort of, like,
six-hour window. Obviously it never actually worked that way. It was kind of always falling back and saying, oh, your code doesn't actually work, write tests. Anyway, so we looked at how teams at Yahoo were doing it successfully, how people were actually getting continuous delivery done, and we were looking at the patterns that made it easy for them. So we found a whole bunch of things, but I'm just going to cover four of them here. Deployment pipelines: the idea of any engineer at Yahoo being able to set up a pipeline very quickly. So if I need to build a new pipeline to test out a new change, it should be self-service, no approvals needed, just get me something to deploy code. And a branching pattern: we found that teams generally like the GitHub flow, where master is either production or about to be production. Assume it's production, it's on the way, and everyone works off of short-lived branches. And then pre-commit tests. This is an interesting one, because when you get into continuous delivery you're going to be merging changes all the time. You're writing some code, you're going to approve it, tests look good, merge it. But then you have a two-hour pipeline or something like that, and after like five changes are merged, all of a sudden your integration tests are broken. So which change broke it? Now everyone has to stop and kind of look around and try to fix this. The idea behind pre-commit tests is, how do we move those tests closer to that pull request and have higher confidence, when you click that merge button, that it is going to go out to production? And then the idea of easy rollback. So if we're going to be shipping code dozens of times a day to production, we need, should something go wrong, to roll it back to the last version pretty quickly. So we took that feedback and over a couple of years kind of made that path the easy path in Screwdriver. So one thing is, obviously, deployment pipelines as code. It's all self-service, it's all YAML, it's very similar to what you'd
see in something like a Travis YAML or the relatively new Jenkinsfile, and it uses the same concepts that engineers already look at: environment variables, shell commands. Imagine you're just running in a shell. And the whole idea is to promote best practices but not limit you. So if you just do nothing, it should give you the best practice, but still allow you to get around it and customize and try something new. Now, I mentioned GitHub flow. I'm not sure if everyone here is aware of it, but the basic idea is production, sorry, master is going to be production, and you branch, have a short-lived branch, no longer than generally like a day or two days, and then merge that in: get a review, test it, and merge it. And so we wanted to make this path easy, so Screwdriver is optimized to support this. And with pre-commit tests, people have used Travis before, people are using these kind of things, and the idea was, how do we get not just your unit and static tests but bring all the functional tests in as well? So we worked with the platform-as-a-service team, so our internal PaaS, and said, how can we give people ephemeral instances, so that when you open a pull request, not only are you going to static test your code and regular test it, you package it up, publish it, deploy it to an instance, and then functionally test it? So that when you click that merge button, you're confident it's going to go out to production. You can just go get some lunch, get coffee, that kind of stuff. And then obviously easy rollback. So we also worked with the PaaS team to be able to say, all right, if something goes wrong, someone should be able to click a button, pick the version they want to roll back to, and then it should go back in a short amount of time. And we also generalized this in a way that other teams that aren't using the PaaS can also use it. So when you're getting to continuous delivery, like I said in the beginning of the slides, you really need those secure deployments. You really need pipeline credentials that are split out from other
people. So we worked with actually a recently open sourced product from Yahoo called Athenz. We worked with them to provide tokens and SSH keys that are time-limited and available in the build just for your pipeline, so you can't accidentally push someone else's package and you can't deploy to someone else's machine. And we also tied the authentication model to the same thing that a developer would use normally, so you don't have to go create new groups here and add this user there. No, it ties directly to GitHub. If you can write code for the software, then I assume you can click the build button, that's the idea. If you can delete the repo, then you can delete your pipeline. One thing we found when going to CD, and these are kind of critical things that you need: speed is very important. You want that fast feedback for failure. So if I open up a pull request and it has something obviously wrong with it, you want it to fail fast. You don't want to go get coffee, go play foosball, come back and say, oh man, a build broke, all right, I'll go fix it. We also wanted a fast rollback time. So let's say you do ship something to production that's wrong, and you've just detected that this thing happened. You want to be able to click and roll back in minutes or seconds, as fast as possible. So at Yahoo we used chroots for the longest time, and we've talked about Docker at significant length at the conference already. And so when we moved to Docker we talked about speed improvements. We're talking seconds to start up instead of minutes. We also moved to virtual machines, and that has been a little bit harder of an adoption, because if you go tell people, hey, I'm going to shave 10 minutes off your build, they're like, yes, it's great, but if you go tell someone, hey, I'm going to add 15% overhead to your build, they're not going to be as happy about that. However, we've mitigated the noisy neighbor problem, where, as anyone who's running Jenkins knows, you're on shared infrastructure, and when your build starts getting really slow, it's
because someone is using all 24 cores to calculate something. So when Screwdriver was created, this is kind of the big thing, the meat of it: they wanted to have these kind of patterns that are just simple to use. So someone could say, hey, you know, I just want to build a Ruby module. I'm a Ruby developer, I just got hired, I know how to write Ruby, but I don't know how Yahoo does it specifically. So the idea here is kind of easy. Instead of a copy-paste of how to publish a Ruby module or a Ruby gem at Yahoo, I can just type the platform, Ruby, and I get it to work. It's less user overhead. I don't have to worry about what's actually happening underneath. And then the teams that maintain this, so like the Ruby team, would say, yes, this is how we want to build gems, these are the standards we want to put into place. We had two kind of models, and I just kind of want to cover them real quick. We had this platform concept. So you can see here it's a Node app: calling, linting, doing some tests, checking coverage, that kind of stuff. So this is a prescribed flow. If you just want to build a Node app, you just write Node code and it all works. You don't have to do anything else. And the Node team maintains it: these are the three images you can pick from, that kind of stuff. However, if you want to do something outside of that, let's just say I want to compile some CSS but I happen to want to use a Ruby gem to do so, some Sass compilation, well, this is a Node container, so I don't have Ruby. So I need to install Ruby, install the gem. I mean, this adds time to my builds and it's not really good. So we looked into an alternative way, which is the step pattern, where we would use separate Docker containers. It's kind of composition over inheritance here, connecting different Docker containers, and it's all designed to be small, shareable pieces. The biggest problem here is, now you've lost that abstraction. It doesn't work the same way it does on my laptop. My laptop doesn't have this concept of splitting between
Docker containers. So I've talked a lot about what Screwdriver kind of does now. One of the big things about continuous delivery is, there are teams at different companies that say, yes, let's go to continuous delivery, but there's kind of this barrier, where there's just so much you can do from the grassroots perspective and say, yes, look, it's successful. But there's so much of the rest of the company that says, we've got older technology, we've been doing this forever, this works for us, we don't want to move to CD. Then we had a technical excellence initiative. The chief architect got up and said, you know what, everyone is moving to Git, everyone is going to be moving to a central build farm, and everyone is going to be doing continuous delivery. This is the big thing. I mean, as an engineer this is scary. This changes how I do my daily job. This means I'm going to have to spend a lot of time figuring out how to do continuous delivery safely and successfully. And it's scary, but it's also kind of at the same time exciting, because this means I can move faster, I can ship things, I can get closer iterations on, am I doing the right thing for our customers? One quick thing about centralized infrastructure. I know there are a lot of teams, and I hear this many times, where, oh, some people are using Git, and inside of their company there's a couple of people using Subversion or Perforce, or there's some teams running Hudson over here and Jenkins over here. But the idea behind centralized infrastructure was, we wanted this integration across business units, so that if, for example, the Mail team develops a cool module that we want to use, we can say, all right, whenever that module is rebuilt, rebuild me. The only way to do that is to centralize. And it gives us visibility into all products, so I can go over and say, hey, you know, I really want to add a cool feature to Messenger. I don't know their code base, but I can go look at it, I can go see how
it's built, I can test it out myself, I can give them a pull request, and they can make the choice then, do they want to accept it or not. And obviously security scanning across the company is easier. It's a horizontal initiative. Let's just say, I don't know, another OpenSSL vulnerability comes along and we need to rebuild off of the base. We can do that now. We can just say, we'll rebuild the base, and then everyone else will rebuild on top of it. So just to kind of give you an idea of the crazy growth that we had to deal with. In 2014, and these are rough numbers here, about 15% of the packages were built with Screwdriver, because it's an optional thing. So at Yahoo we have two things: we have a central Jenkins farm, and Screwdriver, which is built on top of Jenkins but is separate. About 15% of the company was using it, and we had about 5 languages supported. And we had some significant usage here, a big user of Jenkins. After this technical excellence initiative and moving to CD, all of a sudden things changed. We have 70% of the company using Screwdriver now, because it was the easier path for them. We support over 250,000 Jenkins jobs, which is crazy. I don't know if it's a good thing or a bad thing, but it's a lot. And I kind of want to get into what kind of weird issues we ran into when running Jenkins at that scale. So, basic overall architecture. Everything here is normal, what you'd expect in a CD pipeline. You've got a UI and an API, and users are either going to interact with GitHub or directly with the UI. Those two are stateless, they can scale, we can just bring in more of them. And what it boils down to is your execution engine, Jenkins. Your limitation is going to be there, because Jenkins is kind of limited in that sense. So according to the Jenkins documentation, they say go with multiple masters to distribute the load, and their suggestions are slicing it based off of either business unit or purpose. So you can have one that's for, like, commits or testing, and then you can have one for deploying to production
and one that's deploying for staging. Their recommendation is about 100 Jenkins jobs per core. With over 250,000 jobs, we're talking 2,500 cores, a lot of machines. We did some load testing and we kind of got to around 500 jobs per core, which is still talking a few hundred machines. That's not really going to work, because the more Jenkins masters you have, the less they can interact: you have to re-implement queue mechanisms, and you have to deal with all the plugins that people are asking for in Jenkins. There's this Build Blocker plugin; well, that doesn't work across Jenkins farms. Most plugins aren't designed for that.
So how do we get better performance? I mean, this is just a high-level view of what we did to split up the farms. We have a number of them right now, and we slice them a little bit based on purpose. So, for example, for pull requests we have a cluster that's just dedicated to pull requests, because those jobs are just going to be created and then destroyed, created and destroyed, so it's there for fast writes and deletes. We have one just for OS X.
And we ran into some interesting breaking points, and I don't know if anyone else here has experienced this. It's kind of a weird thing with Jenkins: when you get into a large number of jobs, your startup time changes from a few seconds to reboot to 90 minutes to reboot. So if you end up having a problem, your reboot is going to take 90 minutes in the middle of the day. That's unacceptable. Granted, the Jenkins community has optimized that down; we're at like a 15-minute reboot time. But still, that's 15 minutes that we have trouble with.
And then disk management. Right now Jenkins puts everything on disk. Some of our farms have tens of terabytes of disk that we have to figure out how to replicate and move across and copy over. And we get into weird plugins like the, what was that one, the JMeter plugin. Great plugin; it renders your JMeter output on the screen. It had two kind of weird side effects. One was we'd see these
weird spikes in our CPU, and we'd see Jetty threads shooting through the roof. It turns out every time you viewed that page, it would re-render the output on the fly. So now the master that's getting all the load from all these different nodes is spiking through the roof because it's doing this rendering. So you have to watch out for those kinds of plugins. And then it had this other interesting feature where every night all the build emails from Jenkins would be in Mandarin. We couldn't figure out what was going on, and it turns out that every time you'd view the website to view a performance page, it would switch the master's locale to your locale. It's the strangest thing. We couldn't figure it out, because in the morning it would be fine. Strange side effects like that.
And then obviously user, I don't want to say abuse here, but kind of accidental misuse. I mean, this is shared infrastructure, so some people can accidentally get fork bombs running, or in one case we had someone accidentally cat out a one-gigabyte file with no newlines. Jenkins treats it as one solid message, and then all of a sudden we see this giant spike in our CPUs that doesn't seem to ever go down. It caused weird side effects for like two weeks after that. We could not figure it out, so we traced it back to when the spike happened and what happened at that exact second.
Anyway, so we're reaching some weird boundaries with Jenkins, and we're trying to do things to improve the performance, including completely removing access to the UI. Users can't make views, they can't do pretty much anything; they can only view it, just what data they can get from it. And then anything that we can automate, anything we can scrape off, we'll throw into our data store and then just represent it in the UI later. This reduces the load on it significantly. And by centralizing, kind of reducing the complexity of each of the jobs, we got rid of those plugins like the JMeter plugin and a whole bunch of
others, just to try to support more jobs, and we're up to about 4,000 jobs per core. But still, the question we really want to ask is: is Jenkins really the tool for this? And I think we've gotten to the point where we may have exceeded it.
This kind of gets to the real meat of it, which is open source. So we've been doing this for five years, and Yahoo is baked into Screwdriver so much at its core level. I mean, it's all tied to the decisions we made five years ago, four years ago, three years ago, and so on and so forth. So we just started open sourcing Screwdriver, and our goal here was to share our learnings, share what we've found with trying to do CD at scale, the ways that Yahoo did CD, and how we can make it successful for them. Really, we want to CD all the things. We saw 20% of the people in here said that they're doing CD, and with all the talks in the last year about continuous delivery, I'm surprised it's not more, and it's concerning that it's not more. So we want to help push the community to do more continuous delivery.
And this one's a weird one. Yahoo has open sourced many things, and we've noticed that every time we open source something, our own quality of code improves, because we have so many eyes looking at it. This is one of the things that I like to talk to other companies about and say: open source the stuff that you're using that could be helpful for other people. Not only is it going to help other people, it's going to help you as well. And we obviously want to increase our contributor base, because we want to have more features faster; standard product manager kind of discussion.
So Screwdriver CD is our open source product. Screwdriver at Yahoo has been so baked into Yahoo that we can't open source that, but we're taking a lot of the learnings, a lot of the things we figured out, like, for example, Jenkins may not be the right
technology. Maybe it can help with things like dealing with virtual machines for OS X, but maybe it's not the right technology for the majority of builds. Are there better things? So we built a minimum viable product. It does continuous integration, because it's continuous delivery. It has, well, it's about to have, reusable patterns; the team's working on that today. And then extending the pipeline, so you can have this kind of nest of chain reactions. And the most important thing is it's a pluggable infrastructure. So today I'm going to show you a demo of Screwdriver working on Docker, so Docker Swarm on my laptop, and then Kubernetes hosted in AWS, so we can swap those out. We support GitHub Enterprise right now, but we have someone in our community already adding GitHub support. So the idea is this is our foundation that we're trying to work on, and what we're working on today is trying to bring it internally and then eventually replace the Yahoo version with the open source version: build it up to feature parity and kind of extend it, so that we'll have a community of people adding new features all the time.
Alright, so I'm going to do a demo here of CD to Heroku and Go. So how many people here know Go? Alright, probably like 5 people. And how many people know Heroku? Alright, that's better. Okay, I'm going to have to adjust this so you can read it. So I have a Git repo here called randomer. It has a single Go script here, which is very simple: it's a web server. When you hit root, it's going to return this HTML here, which is some nice JavaScript using Chart.js and jQuery, and it's going to asynchronously hit a route called /data, and that's going to give you a random number between, in this case, 0 and 10. Can't read it? It's a little better. And then my Screwdriver YAML is saying use the Docker Hub golang repo, and some environment variables I'm sharing around. We've got some commands here where we're doing go get, go
vet, doing some static analysis, going to test it, build it, and then actually do a quick functional test where we're just running it and making sure it returns something. And finally we're going to deploy to Heroku. Heroku is actually pretty simple. This is kind of a fast example here, but we're adding the fingerprint, adding our private key into a location, and then doing a git push to Heroku; that will do the deployment. Here it's called Heroku SSH, so this secret is only going to be available to the deploy job, not the main one. Okay, any questions?
So I'm going to go create a new app on Heroku, if I can click. That's a good idea. Screwdriver demo is available, I like it. Okay, so I've got Screwdriver demo, and then here's the Heroku app. Oh wonderful, that was easy. Alright, and then we're going to go to our hosted solution. We actually run our own version of Screwdriver in AWS using Kubernetes, and it will build and test our own software. So we actually use it to produce itself, which is an interesting bootstrap problem. So I'm going to log in real quick. Awesome, so I'm now logged in. Create a repo; again, typing is really my specialty here. Start a job. I'll try to make it bigger. So this is actually going to be pretty fast, hopefully. Alright, yes, so it's cloning, it's going to be doing all the stuff that we saw before. Alright, it's done, that's what I like to see, and it's deploying. I know everyone loves watching code deploy; this is very entertaining. Come on, Heroku. 14 seconds ago. Heroku, I believed in you. Let's go see if it actually deployed. Heroku's a little... nope, still not loaded. I love live demos, this is what I'm all about. You know, it would be a great time for a Heroku incident. You know what, I bet it's S3, that's probably the issue. Well, while it attempts to... normally, like in my previous deployments... oh, you know why nothing worked?
Silly me, I forgot to put the secret in. Who do I think I am? Should probably look at these things before I type them. Heroku SSH. So I have a key here, a private key that I've generated for Heroku. We'll go add that there. Try running this again. Sorry about that, I knew I forgot a step. We're building things as fast as we can. Thank you, that's very kind of you. I'm not going to tell a joke, that's a bad idea. So now it's deploying, that looks better. Go app detected, doing the things. Cool, now to pull up Heroku. Beautiful.
Again, like I said, every second it's going to go request a new random number, and it's going to keep a list of about 30 of them. So the scale here, which you can't see, which is fantastic, on this side is 0 to 50. We're going to go do a quick trick here, go combine these. Alright, I like live demos. There's a lot of scrolling here, so just bear with my insanity. So our offset is 0 to 50, so we'll set this to 40 to 50, basically a 40 to 50 range. Now we'll say, need more dots. So I'm going to generate a pull request here. I probably should have put a description; that's a terrible person, terrible engineer I am. So Screwdriver gets notified, it'll attempt to figure out where to run your build, it's going to test it, and then in a few seconds, because it goes pretty fast, I can go and see: hey, did this pull request actually work? And what it's doing here is it's checking out master, then it's going to check out my revision and put it right on top of it and say, hey, is it actually going to work? Generally you want other people to look at your code before you merge it, but everyone give me some votes: does that look good? Alright. Oh, we got one down vote, sorry. Okay, so I've merged it. So what Screwdriver is going to do is get the notification, just like you'd think it would. I'm going to go click on this because I always like to do that. We'll go back to here, refresh. Alright, it's already built and it's already going to deploy, because it's really
fast, and sometimes I get a little... okay, I'm on Heroku. So we're going to see a slight blip here on the free plan, so it's going to get stuck for a second as it's deploying to the new dyno. Oh, there it is, wonderful. So it's deploying to Heroku, pretty easy, right?
Alright, so I have a second demo I wanted to show you. So that was our hosted solution that we deployed to AWS. We just brought up Kubernetes and deployed the API, the UI, the data store, the artifact store, so we have three different pieces that get deployed, and then you tell it, use Kubernetes, here's a token to use, and then it will just work. In this case the worker is already in Kubernetes as well, so we deployed to Kubernetes and we also use it for executing builds. So here's the cool thing: that whole thing, no Jenkins. That's the idea, not kicking anything. So that demo was entirely using Kubernetes instead of Jenkins; it does the things that we wanted to do but doesn't require Jenkins. The demo I'm about to show you is actually using Docker itself, Docker Compose, to bring up a build and run it locally as well. There are people that are interested in adding Jenkins plugins, so using Jenkins as the actual executor, but not depending on it to be always up. I believe there is someone in our community already looking at building that.
Docker Compose. Alright, so let's go back to splitting this, because that doesn't give me... that'll do it. Alright, that'll work, good enough for me. Alright, so if you go to screwdriver.cd, there's our home page. Ouch. We'll go grab this Python command here; this will set up Screwdriver in a box for you. We can close our demo up, I don't need it anymore. Can you kind of see that?
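The "pluggable infrastructure" idea described in the talk, where builds can run on Docker, on Kubernetes, or eventually on Jenkins, can be illustrated with a small sketch. This is not Screwdriver's actual code or API: the `Build` and `Executor` types, the `newExecutor` function, and the executor names are all hypothetical, chosen only to show how an execution engine can sit behind one interface and be swapped by configuration.

```go
package main

import (
	"errors"
	"fmt"
)

// Build is a hypothetical description of one unit of work.
type Build struct {
	ID    int
	Image string   // container image the steps run in
	Steps []string // shell commands, in order
}

// Executor is the hypothetical plug-in point: anything that can start a build.
type Executor interface {
	Start(b Build) (string, error)
}

// dockerExecutor stands in for a local Docker-based engine.
type dockerExecutor struct{}

func (dockerExecutor) Start(b Build) (string, error) {
	return fmt.Sprintf("docker: build %d in %s (%d steps)", b.ID, b.Image, len(b.Steps)), nil
}

// kubernetesExecutor stands in for a cluster-based engine that needs a token.
type kubernetesExecutor struct{ Token string }

func (k kubernetesExecutor) Start(b Build) (string, error) {
	if k.Token == "" {
		return "", errors.New("kubernetes executor needs a service token")
	}
	return fmt.Sprintf("k8s: pod for build %d in %s (%d steps)", b.ID, b.Image, len(b.Steps)), nil
}

// newExecutor picks an implementation by its configured name.
func newExecutor(name, token string) (Executor, error) {
	switch name {
	case "docker":
		return dockerExecutor{}, nil
	case "k8s":
		return kubernetesExecutor{Token: token}, nil
	default:
		return nil, fmt.Errorf("unknown executor %q", name)
	}
}

func main() {
	b := Build{ID: 1, Image: "golang", Steps: []string{"go vet ./...", "go test ./..."}}
	// The same build is handed to each engine; only the configuration differs.
	for _, name := range []string{"docker", "k8s"} {
		ex, err := newExecutor(name, "example-token")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		msg, err := ex.Start(b)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(msg)
	}
}
```

The point of the design is that the rest of the pipeline only ever sees `Executor`, so adding another engine means adding one more case rather than changing the callers.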
Alright, so what it'll do is prompt for some things for us. It'll generate new signing keys on the fly. You do want to do your own signing keys, it uses JWTs across the board, but this will just generate one automatically for me. Alright, so it's going to tell me to go to GitHub: callback URL, client ID, and a secret. Everyone write that down. Alright, it'll automatically do a Docker Compose pull and then bring it up and open it for me. So depending on the speed of the internet here, this could take anywhere between... oh, alright, good, getting closer. You really need a joke. Anyone have a joke? I'd like to say it is going... that's better. Okay, perfect. So now I have my own local copy running on my own local Mac, and I can log in, and then GitHub is going to say, do you really want to do this? Yes, I'd like to. And it should prompt me for my YubiKey and two-factor authentication. Really helpful. Alright, now I'm logged in. I can actually go and create the same project, if I could learn to copy. Why is this really a big deal for me? I guess it's the resolution? I'm not sure. Okay, pipeline. So when running on my laptop it's a little bit slower, a lot slower. It's also dependent on the internet access. So this is actually getting completely running... there we go. So now I have a Screwdriver project on my laptop, using Docker Compose. I'll go back here; you can see that I'm actually running nginx and two npm modules. Cool, alright, so that is that demo.
So why am I telling you all this? Again, we built an MVP. It is our starting foundation. It's something that we are trying to improve and work on, to replace what we have there. And so I ask you to follow our progress, give us advice, tell us if you think that we're building the right thing. Is this going to help CD all the things? And just check it out: you can go to screwdriver.cd and take a look at it. At this point I guess I'm open for questions. What can I answer for you?
That is a great question. So, as the little sheet here tells me, I should be repeating the question. The question is, why in 2017 do we only have 70% adoption? What is blocking that last 30%? Surprisingly, it's legacy systems: things that have been around for 20 years that have Jenkins jobs. I've spoken to teams that said, we have this setup, it works, we don't know how it works though, it just works, it gets things done. And we are trying to find ways to help ease that migration. I've been at Yahoo for over 6 years now, and I've seen a number of hack day events where people are coming up with ways to automate Jenkins jobs, automate the creation of them: I want to build this, automate it this way. And you have a lot of these people creating these pipelines as freestyle pipelines, connecting this to this, and you have teams with systems that try to view the map of how it works. But if those people leave the company, if the product is de-prioritized then re-prioritized, you lose the knowledge of how to actually build it again. So while we do build 70% of the packages, we have a higher number of versions that we build, because most of the people that are doing CD at Yahoo are actually using Screwdriver. So there are some packages there that are built 20 times a day, 30 times a day, or more.
That's a good question. So the question is, do we have templates for deploying on Kubernetes?
So we do. In our documentation we link to a sample repo where it says, here's the service and deployment file for each of the different pieces, and kind of the areas to configure. So yeah, it is there, and it's relatively easy. By default it'll actually look for the mounted service token, so if you grant a service token to your deployed API, you just make sure that credential has the correct permission to create new pods. Okay, well, thank you very much for coming.
Good, okay. Hello everybody. We have Rami Aldanmi here to do a talk, Lawyers versus Developers: the Fight over FOSS in the Enterprise. Rami is the DevOps Technical Lead for Endpoint Cloud Protection at Symantec, a very big open source advocate, working with a lot of exciting technologies in the tech stack there at Symantec. So thank you all for being here at SCALE; we're all looking forward to hearing Rami talk.
Now it's on. Thank you very much for the nice introduction. Hello everyone, my name is Rami. I'm a software engineer at Symantec Corporation. I do DevOps by day, parenting and husbanding by night, when I get the time. Today I'm here to talk about the experience I had going from more of an academic background, where everything is de facto open source, to a proprietary company that develops proprietary software and uses open source as a daily activity, and then the lessons I learned there that I want other companies to follow, and also what I think developers should follow to make their open source packages more enterprise friendly.
So before I start, there are a bunch of things I need to get out of the way. The first one is I need to pay the bills. I work for Symantec Corporation, on endpoint security. We have a whole suite of cloud security products for handling endpoints, including Linux servers. We have a booth, number 318; if you're interested, please come over and we can talk to you about our offerings. Now, two more things to get out of the way. First
one is a disclaimer: the opinions expressed here are solely my own. I do not express the legal, political, or any other position of that sort of Symantec. These are my lessons learned from my experiences as a person who works with open source, who comes to SCALE every year religiously, and who interacts with people who work with open source. So if I say something good, it's because of Symantec; if it's something bad, it's because of me. I've added disclaimers since I noticed that all the talks in this track have a lot of lawyer and legal things, so I want to make sure that I set the record straight. I'm not a lawyer. I do not have any sort of legal training. I do not enforce any licenses. I'm not a server hugger. I do not follow Richard Stallman or his doctrine to the letter, and I'm not here to talk about legal matters. I'm mostly talking about things that I'm passionate about. There are a lot of people here, maybe a bunch of them in this room, who can speak to these legal matters better than I can, but this is just what I've seen.
So now that that is out of the way: what I really want to do is write code. That's all I want to do, and that's the reason I'm doing a talk here. I love open source. I want to have code that's really good. So, the motivation behind this talk. Some of you remember a couple of years ago, SCALE 13x, that was the last SCALE at the old venue. I had a lightning talk, which was my first public speaking opportunity, where I discussed how I embraced the dark side of the source and how I moved to work for a company that does closed source software. At the end of that talk I gave some promises. Now, unfortunately, because of my academic background, we like to show the overarching theme of how things go over the years, so you need to suffer through this a little bit more. During that talk I promised that I would do my best to promote open source and make sure that open source tools can be adopted in the
company and promote them, and I'm glad to say there is quite a bit of success there. Also, we try to promote open source culture, and I guess this year this is illustrated by the fact that Symantec, for the first time ever, has a booth at a place like SCALE. So kudos to Symantec for encouraging their employees and actually allowing us to do things like that. Lastly, I said that we need to support the open source community. Now, that was in the days where people didn't have Docker and mirror servers were considered a thing, so we actually had a mirror server, but people don't care about mirror servers anymore. And we used to say we need to support open source projects like Puppet Labs. Now, for all I care, Puppet should die, no one should be using Puppet. But I love Puppet, we use it well enough, but now it's just not working for me. But also lend expertise and things like that.
Now, to be honest, I thought my claim to fame was the UpSCALE talk, but you notice here that I explicitly put Git, because I really love to use Git, but we were using Perforce at Symantec, and I was only two months into the company and I was really having a lot of trouble. So if you guys remember, there was the first ever Bad Voltage Live, and they had a one-minute rant: you can go and rant about anything you want. And my buddy who was with me convinced me to rant about Perforce. Now, I was extremely happy to do that, because my buddy Don here, who's sitting in the audience, used to work for Perforce but had left them when I did this talk. So I had a lot of anger inside me that I needed to get out, and I ranted for one minute about Perforce. And I learned a bunch of things in that. Number one, I became a celebrity within the circles, like, oh, this guy can rant. Number two, I'm not good at ranting; I lost. Number three, if you work hard at it, well, we now use Git where we are, so we actually got what we wanted, but it took us a while to do that. Number four, I'm actually 50 pounds lighter, thanks to the support of my family
so that's a good thing, by the way. So thank you very much. Thank you. So I'm trying to redeem myself, so people remember me for something that's not an UpSCALE talk and this just goes away.
Okay, so now to the talk. In order to make sure everybody's on the same page, I wanted to give a quick review of free and open source. The reason I did that is one of the things I learned when I'm at the Symantec booth: we've got a bigger mix of people coming to SCALE. We have people who are not necessarily open source enthusiasts but a lot of tech professionals, so I wanted to make sure we're on the same page. So free and open source software is about freedom. Okay, we say that, it's a nice word; what does that mean? Well, the best thing to do is leave it to the people in the Free Software Foundation to define it for you, and they define four pillars of freedom when it comes to software.
The first pillar, which is aptly called freedom zero, not freedom one, of course, we're geeks, we start with zero, is the freedom to run the program as you wish, for any purpose. Kind of obvious, makes sense. That's a freedom that any free software license should be able to support. The second freedom, freedom one, is the freedom to study how the program works and change it so it does your computing as you wish. So basically the second freedom implies that you can see the source code of any application you're running and you have the freedom to change it. This freedom, however, requires that you have access to the source code, which is a key point here. Freedom two, which is the third freedom, is the freedom to redistribute copies to help your neighbors. So you don't only run the code, you don't only see the code, but you can also ship the code to other people. The fourth and last freedom, freedom three, is the freedom to distribute copies of your modified versions to others, which is a very, very touchy point in open source licenses. By doing this you can
give the whole community a chance to benefit from your changes. Access to the source code is a precondition for doing this. So in freedom one and in freedom three, access to the source code is required.
The result of this is we have two types of licenses in the open source world. We have copyleft licenses, and this is a little bit dry, but I'll show you how it ties into everything at the end. With copyleft licenses, the main issue is you need to make sure that you have all four freedoms, because it needs to guarantee that if you receive a binary, you are entitled to get the source code and any modifications made to that source code. And if you use the four freedoms, you are required to pass on the four freedoms. So if I release a piece of software that I own that is based on the four freedoms, and I give it to someone else, I cannot tell that someone else you can only have three of those freedoms. I must pass on all four freedoms, whether I like it or not. Okay, and we're going to see examples of that.
With a permissive license, or what they call a non-copyleft license, you're allowed to receive the four freedoms, you can get the four freedoms from anyone; however, you can choose which of the four freedoms to give to someone else. So if you take the four freedoms, you change the code, and you decide you want to close it up and make it proprietary, you can do that, and that's what a permissive license allows you to do. There is a third type of license that they call weak copyleft, but I don't think it falls within the scope of what we're talking about today.
So basically, if you look at it: the first freedom is you can run your code. The second freedom is you need to make sure you can inspect the code, and in copyleft licenses you must get the code plus modifications; in non-copyleft, permissive licenses, you don't need to give the modifications out
and then you can redistribute the copies, with the modifications that you made.
So in a use case that we have, you grab some code from GitHub. That code has a free and open source license, copyleft or not. You fork the code and you edit it, and sorry, Emacs guys, I'm a Vim user, so that's what I have up here, and then you build the code and you ship it. So this is generally what you do: you took a piece of open source code, you played with it a little bit, you did some modifications, and you want to ship it. So let's see how this works in permissive and non-permissive licenses. Now, in a copyleft license, like the GPL, the GNU Public License, you fork it, you edit it, you build it, and you're not forced to do anything; you're okay. The only issue comes up when you want to ship it. If you want to ship it, that is when you are forced to ship your code changes too. You cannot just ship the binary; you need to ship your code changes. With a permissive license, for instance Apache 2.0, MIT, or BSD, you can ship your binary as is, or you can ship it plus the source code. It's up to you; you decide what freedoms you give.
Okay, so there are a number of licenses; I'm going to point out some of the famous ones. The Apache 2 license: the Apache Foundation uses it for most of its products, and Android and OpenStack are Apache 2 licensed. The MIT license: jQuery, for example, is one of the famous ones. The GNU Public License: the Linux kernel is under GPL v2, and GPL v3 has things like Ansible and Bash and so forth.
This is an interesting figure that I grabbed from Black Duck Software, which, based on what I read, is updated from their database. Black Duck is a company that does a lot of research when it comes to open source software, and one of the things they keep track of is licenses. And because I use Black Duck a lot, I decided to bring a black duck with me here to accompany the presentation. Some interesting things you see here: one of the things you notice is that the MIT license has quite a bit of interest there
and we'll discuss in another slide why. And then you have GPL v2 and the Apache license having a bigger part of the pie, and GPL v3 also is noticeable here. There are a lot of other licenses that are available, but you can see the distribution in general. Yes, I believe it's by project, by licensed piece of software. And this one is a little bit dated, it's 2015, but we can read a lot out of it. As you know, GitHub started in 2008, and one of the things that GitHub did when they first started is they mirrored a lot of the repositories for a lot of the public projects, just as a public service. So they mirrored the Linux kernel. The Linux kernel doesn't use GitHub for its Git repository, it uses git.kernel.org, I believe, but GitHub mirrors it as a public service, and they mirrored a lot of projects. So you notice that when they started, only 60% of the projects had licenses. Think about it this way: GitHub just started, and they mirrored a lot of projects, and people started going there, and only 60% had licenses, and the number started going down quite a bit over time, which is a phenomenon that people have been discussing for a while.
When people started going for free and open source software, they were believers in the movement of free software. They cared whether something was GPL 1 or GPL 2 or GPL 3. They looked into the licenses and they really cared about it and they thought about it a lot. There were a lot of interesting conversations on mailing lists about picking licenses, and there were projects that were forked because of licenses. But then we have more and more software developers coming in, and more and more CS graduates, and if you're not open source you basically don't count when it comes to platforms and web and things like that, and they didn't really care a lot about licenses. So everyone who created a small package, even if it's a school project, even if it's a senior project, didn't attach a license to it, mostly out of pure ignorance, because
you just didn't care. So that was an interesting component that I saw there. Then you notice a spike here. It took me a while to figure out what the spike is. I couldn't read it, but there is one thing that happened in 2014 that could cause this: GitHub released their Choose a License tool. So they released a tool that says, oh, you're developing a new piece of software, how about you license it, just drop a license file in the GitHub repo, and they walk you through a wizard of how to choose a license. So that would explain how new projects and people coming in say, oh, I'm creating a GitHub project, license, that sounds like a good idea, let me put it in. Which is a good thing, I believe. But you can see the trend that licensing is no longer something that people care about as much as they did before. I'm not sure that's a good thing or a bad thing, but for the purposes of this discussion it's kind of a bad thing, because if you want to share your software, you need to specify how you want it to be shared. That is your right, to decide how your software is shared: whether it's public domain, whether people can copy it, whether people can modify it and keep it for their own or not. So this is the interesting information that comes out of here.
So, open source and the enterprise. Traditionally, enterprises did not like to use open source. If you go to a lot of companies, specifically ones that do not work in a field that is directly related to open source, a lot of companies that develop for Windows or develop for Mac or develop their own custom appliances, they do not use open source a lot, because there is not a lot they can use there. When is the last time you used a library under Windows that's not based on a platform that's open source, like Python or Go or something like that? You hardly did that. Companies were kind of fearing the use of open source for a number of reasons. One of them is the fear of litigation, and since 1989 there has been license enforcement going
on, and I picked only a couple of examples. I promised I was not going to talk legalese, but these examples are actually things we work with.
So, NeXT. Who created the NeXT company? Steve Jobs. When he got kicked out of Apple the first time, he created NeXT, and NeXT created Objective-C. What NeXT did was add Objective-C support to GCC, the GNU C compiler. But the GNU C compiler is developed under the GPL, and they started distributing the compiler to users so they could write Objective-C code and compile it on their NeXT devices. So they didn't only modify the compiler, they also distributed it, and that's when they crossed the line. After some enforcement back and forth between the Free Software Foundation and NeXT, NeXT ended up submitting a patch to the public upstream, so GCC by default, for everyone, can now compile Objective-C. So you see that there is value in this: something that would have been hidden came out.
Now, did Steve Jobs repeat that mistake again? Not really. We have, what do people call it, a silver, pretty BSD box. Mac OS X and what followed is basically OpenBSD or FreeBSD, I forgot which one, that they just forked and started adding their stuff on top of. They can do that and distribute it as such without contributing back. They do contribute things back, but they're not contributing their OS back, because BSD is a permissive license, so it allows them to ship their product without shipping the code that goes with it. Okay?
Another thing: in 2004, Netfilter/iptables won an injunction against a company that was selling firewall software. This company cannot sell its product anymore because it infringed on iptables and Netfilter. So imagine: you can go out of business because you did not do your licensing right. And then my favorite one is this one, the 2008 one. Not because they sued Cisco, sorry, that came out the wrong way, but because actually Linksys had a
modified Linux kernel, in which they made modifications to some of the networking components of the Linux kernel, to create their world-famous WRT54G family of routers. Any self-respecting geek owned at least two of these at some point. Who owned one of these? Just a show of hands. Okay, most of you are self-respecting geeks. So why do we own this? Because this is the most hackable piece of network equipment that you can ever get your hands on. In 2008, any piece of equipment that you could program would cost you in the thousands of dollars. You could get this for 50 bucks at that time. Believe it or not, this old model is still sold brand new today. Why? Because people love it. Because you have Tomato, OpenWrt, DD-WRT. Because Cisco and Linksys decided, "We're better off distributing this code and giving it to the public rather than paying royalties or some amount of money that we don't know the size of, and going through litigation." Okay? And this was big, especially for the maker community and the do-it-yourself guys, because now we have open hardware that we can work with. Personally, I believe it's a good thing for Linksys and Cisco too, because guess what: now they also sell hardware that they tell you is DD-WRT ready. You don't even need to flash it; just tell us and we'll ship you one with DD-WRT. So it's pretty awesome that way.
It's pretty awesome for us that we got the benefit out of it, but think of an enterprise coming in and saying, "Hey, we want to use this open source library." They're like, "See all that litigation that happened around it?" And the cost of any such litigation is kind of huge, because you can go up to 300K just for legal fees, whether you win or not. Okay? And then you can get injunctions, where you cannot sell your product anymore. You can actually be sued for damages. Damages cannot really be calculated; they just figure out how much you need to pay the company back. I was talking with people in the
Symantec booth earlier. They said, "You have a talk; what is the talk about?" and I explained. And somebody stepped up and said, "I'm from a startup. We had a million dollars in funding to do a project." This was like a month ago. "We had a million dollars in funding to do a project, and one of our web designers decided to use a font that he liked online. He grabbed the font and designed the website using that font." The website got used for a while, and they got hit with a lawsuit for a million dollars because they used an unlicensed font. So this doesn't only hit the big guys, it even hits the small guys. I saw this guy, fresh out of college, in a startup, trying to do their best, and they got hit for almost all of their funding.
So the question is this. You go to your management, and remember, guys, I'm talking from an enterprise perspective, I'm talking about big companies that can hire 20 developers or 20 contractors for three months to develop something that does something like it without this risk. So how do I sell them on it? Because what are they going to tell me? They're going to tell me, "Why bother? Why bother with this open source thing if I'm going to have all that trouble?" I tell them, "Guess what: open source won." Okay? We have the best tools, we have the best developers, we have people who put their hearts and souls into this. There's a reason that Linux is everywhere, that all the servers in the world run it and all the companies use it: because open source won, and it decided that it won. Okay?
So you tell them that's why you're using it. This is you talking to your manager. You go to him: "Just tell me, who do I talk to in order to convince you guys to do it?" And they tell you, "Talk to the lawyers." And it's like, really? Me, a developer, go talk to the lawyers? You might as well tell me to go talk to the Ghostbusters. It's all the same, who cares. Okay? And I have an interesting story about lawyers, from my personal interaction with them, that I'll tell in a little bit. So
who do you talk to? These are really the people you need to talk to. You need to bring in your developers. You need to bring in your security guys. Okay? Those security people are a key component in this, because for companies that don't adopt open source a lot, the first excuse they bring up is "people in their pajamas doing software development at night on coffee and pizza" and security concerns. I know this was the stigma for a long time. It's not the stigma anymore, and I believe you're here because you agree with that, but you need security people in, for reasons we'll discuss later. You need your program managers and your project managers to ensure that there's a process in place that you can follow. You need your compliance people and auditors on board. And of course you need your lawyers. But believe it or not, companies that don't use open source a lot use something else a lot: they use vendors. They use vendor software, they use licenses from vendors that they pay for. And those people, the finance guys and the procurement guys, know how to handle these products. The only difference is that for open source software the cost is traditionally not monetary. So that's how you bring those people in.
So what pieces of information do you need to collect in order to make sure that you use your open source packages properly? The thought process is: I need to use open source, I have collected the people in place, so what do I need to give them to make sure that we come up with a good, clean adoption of open source in my company? You need to give them the name of the tool. You need to give them a version, and this is important: you need to specify explicitly which release you are using in order to get everybody on board. And then you need to give them the license; we're talking about licensing, that's what this whole talk is about. Is it good enough? If you tell them it's good enough, and your lead engineer says it's good enough, they say, "Okay, how much will it cost us?"
You'll hand them over the license. And then you need to keep the source code that you used and the runtime that you used. I have been bitten more than once by projects that went belly up, or, because remember, no one likes mirrors anymore, that didn't host their source code in a place where I could get it. I ended up reverse engineering the package structure for Debian: I actually needed to reverse engineer the Debian package in order to extract the source, and I needed to convince the lawyers that this was an official distribution from the author, not a version that Debian just came up with. Okay? Because they were hosting their own infrastructure and the infrastructure went away. Also keep a local copy of the binary and runtime, but we don't worry about that a lot, because people use DevOps practices and do a lot of caching, so this is mostly something you'll find everywhere.
Now, after you collect that information for every piece of open source software you have, you document it, you keep it updated, and you get it approved by legal. After you do this, you as a developer can go forward feeling comfortable. I like this because I work in DevOps a lot, and I like the concept of separation of worries: if I can have someone else worry about this problem, it makes my life a lot easier. So once you have it like this, the lawyers worry about it instead of you, and that's a good thing to have, I believe.
So how do we go about it? In some of the work I do I deal with Java a lot, so here is how you do this process in Java. If you're using Java, most likely you're using Maven. So let's go and check out a Java project. I got the Spring Boot code, I checked it out, I went into the directory that has it, and I did a Maven install (mvn install), so I have all the packages and all their dependencies, and I installed everything there. What I do next is run a command called Maven dependency tree (mvn dependency:tree). What does Maven dependency tree show you? It basically shows you the tree structure of every package you have and anything it depends
on. So if you wrote a piece of software that depends on the Java Servlet API, and you bundle the Java Servlet API in your library, it'll show you that you have that dependency and what you're pulling in. And here's what it will look like. I know it's scary, and actually it is scary. It tells you this is the Maven dependency plugin that is running on Spring Boot, and this is the JAR for the whole library. Of course, more complex projects will have a bigger one; I chose this one because it's the only one that can fit on one page, on one slide. So you can see the list of dependencies. For example, this is the top project, and then under it you have three main libraries that are being pulled in. You have Hibernate Validator, FasterXML, and you can see each library and any library that it depends on, so you actually have three levels going in here, for example. And you see the specific version. And not only that: some companies care whether a dependency is needed at compile time or at runtime, and some people like to make that distinction. Maven gives you that, and Gradle and all the other projects have similar ways of doing this, which is pretty awesome and saves you time. So when I started, at the beginning, I ended up writing a small Python parser: I run this command, pipe it to a text file, run the parser, get these numbers, and start the paperwork from there. It's a meticulous process.
Now why did I do that? So remember I told you I joined Symantec. I was this open source guy who joined the proprietary company. I joined this team, awesome team, loved them all. Second sprint in, still learning the ropes, and they were like, "Hey Rami, you love open source, right?" Yes. "You use Linux a lot, right?" Yes, I'm the only guy in the office with Linux set up on their box. "Okay, we have this open source thing that we want to do. Are you interested in it?" And this is how it was framed: "This is the open source thing that we want to do. Are you interested in it?"
Oh yes, definitely. Symantec, open source! I didn't know that they were dumping on me the biggest set of paperwork the company ever had. Okay? Of course, if you know me, you know I'm kind of OCD a little bit, so they kind of regretted that I did the process. Because what I did is I went back through the legal documentation that Symantec requires for open source, and what I show you here is half of it. So I got the actual process that needs to be implemented, I documented it, and I made sure that everyone on the team followed that process going forward. And I told them, "By the way, I did this for the whole project, like 200,000 lines of code. From now on, each one of you guys does it for your own part." And of course after that we ended up doing something that is more important, which I will talk about in a little bit.
So remember, one of the things we talked about is that we wanted to include the security people, right? And there's a reason we wanted to include the security people: CVEs. A lot of the projects that you use get CVEs against them, especially if you use Tomcat or anything Apache-based, any of the Apache Commons code. You always get CVEs against them. Some of them are small, some of them are big, but as a project, as a company, you need to decide: do I really want to follow up on this myself, or depend on the upstream to patch it for me? For some companies a CVE is a deal breaker: "I need to figure out a way around it; I cannot deploy my software with such a CVE against it." So this is something where we need to make sure we have security follow-up. But not only that. Remember, I showed you how to work on your code base specifically, but there is something we kind of forgot about. We forgot about the platform; we forgot about Java itself. You're deploying a Java project, and you still need the Java runtime, which has to take care of some miserable CVEs like this one. And you need to make sure that you file for it, because once you have the process in line
and your security guy looks at their DB and says, "Oh, you're using the vulnerable version of Java," they're going to shoot you an email or create a Jira ticket for you to go and update it and make sure that you're using the latest platform. So there is a process. It is tedious, trust me, I did it, but there is good that comes from it. And I know my team is going to haunt me with this and make me do it in the future.
Also, you're still using Linux, right? And you're using Linux packages, right? So you actually need to make sure that you follow the same process for those too. And guess what, Linux is awesome, it makes life easy. This is an example on CentOS: if you run rpm -qa with a query format of name, version, vendor, release, license, the same things we talked about, it'll spit it all out for you. It'll give you every package in your environment: its name, its version, and its license. All you need to do is plug it into a CSV and bootstrap your request. And once you have a master file, after that it's just a diff that will take you from one state to another. Okay? Make sense?
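The rpm inventory step described above could be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not the speaker's actual script: the function names (parse_inventory, to_csv, diff_inventories) are made up for illustration, and the query is assumed to be tab-separated rather than comma-separated, because vendor strings can contain commas.

```python
import csv
import io

FIELDS = ["name", "version", "release", "vendor", "license"]

# Assumed command used to produce the input text (tab-separated):
#   rpm -qa --queryformat '%{NAME}\t%{VERSION}\t%{RELEASE}\t%{VENDOR}\t%{LICENSE}\n'


def parse_inventory(text):
    """Turn tab-separated rpm output into {package name: record}.

    Simplification: packages are keyed by name, so several installed
    versions of the same package collapse into one entry.
    """
    packages = {}
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) != len(FIELDS):
            continue  # skip blank or malformed lines
        record = dict(zip(FIELDS, parts))
        packages[record["name"]] = record
    return packages


def to_csv(packages):
    """Render the inventory as CSV text, sorted by package name."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for name in sorted(packages):
        writer.writerow(packages[name])
    return buf.getvalue()


def diff_inventories(old, new):
    """Return (added, removed, changed) package names between two inventories."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(n for n in set(old) & set(new) if old[n] != new[n])
    return added, removed, changed
```

The idea is that the first CSV becomes the master file sent for approval, and later runs only need the three lists from diff_inventories to see which packages require new paperwork.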
So now, after that, you actually need to organize your open source process. Once you organize your open source process, you want other teams in the company to follow it too. Remember, you're an enterprise, so you need to make sure that everyone follows the process. At some point you want to have a review committee for it. It's not one or two people; we already talked about who: security people, developers, a project manager, a product manager. I know in some companies that's all one person. It could be you, but then you're a one-person committee. It also helps to create a list of pre-approved packages and a list of pre-approved licenses that fall into buckets, so it doesn't slow down your development process. Because you need to decide: am I going to stop something if it doesn't get approved? If I want to add a new Spring package, am I going to stop all development until I get the lawyers in to approve it? They take their sweet time doing things. So you need to define a process for it, and you need to share experiences with other people.
You also need to monitor CVEs, and don't forget your transitive dependencies. You don't stop at the first level or the second level; you also go to the third level and keep going down for every level of package that you have. But here's the good part: usually, if you have a large, enterprise-sized project with a lot of dependencies, those dependencies share a lot of libraries, so you end up shrinking your requirements quite a bit. And never forget Linux. Never forget the Linux packages that you use. You say, "Oh, I need this Python library in my application, but I'm not going to file for it because I'm just going to install it using apt-get." That doesn't fly, because remember, the whole idea is that we want to sanitize everything; we want to make sure we have a process that works for everyone. But like you said, it's an extremely tedious process. Monitoring CVEs, for example, is not easy, especially if you're a small shop. So one of the things that you can do, and
remember, I'm a DevOps guy by trade, I would not be doing my job if I didn't tell you that you can automate it. There are ways to automate it. It's not only Python scripts and CSVs; there are actually tools out there to do it, and in my experience I had the pleasure of working with Black Duck. Black Duck has Jenkins integrations and other tools that you can include in your CI/CD process. I'm not trying to sell Black Duck; I talked to the guys for the first time today because I wanted a duck. But basically, and remember, I don't represent Symantec, just myself, they have tooling that you can integrate into your Jenkins pipelines so that whenever it builds anything it will grab the packages, analyze them for you, and generate reports for you. So if you're a big company that is probably already using Black Duck for something, or you're looking for a solution to streamline this, this is a way to automate it. And I know that on the Expo floor there are also a couple of other companies that do the same thing.
So we discussed how you do this from the enterprise side, but guess what: developers have homework to do also. Let's assume that the enterprise did their job, and we hope they do. As I said, if you're an open source enthusiast and you love open source, you make sure that your company has this in place, not to protect the company but to protect the thing that you love and care about, which is open source software. So how can developers help? These are the items we discussed that the enterprise will be taking care of: name, version, license, copyright, and I have license twice for some reason, and a copy of the source code and binaries. Okay.
So, names. Please use suitable-for-work names. I ran into three packages whose names I could not say to the lawyers; I needed to send them over email because I could not pronounce them respectfully to the lawyers. I know people don't do that anymore, but I need to put it out there. Okay, versions:
please use release versions. Although as developers we like to use the SHA, the first eight or last eight characters of the commit SHA, as our version numbering, when you release your software for people to use, use semantic versioning or something that is a little bit more incremental, so people can actually keep up with you and with your versioning.
Also, please, please, and I don't know where the cameras are, please, please, dear developers, include copyright notices. Okay? I personally needed to submit GitHub pull requests to people to add copyright notices to their README file. I don't know why, but lawyers really care about them, and if a project doesn't have one they may say no to using it. So please put copyright notices in your codebase. It doesn't need to be in every file; as long as you have it somewhere in your documentation, that is something we can work with.
So now I'm going to show you examples of things that kind of drive me crazy. Look here: this is a license, an MIT-licensed piece of software. Can anyone tell me what's wrong? It's not like I'm highlighting it for you. This is a license that I pulled out of a Git repo for a project that I needed to use. They provided the license template as opposed to an actual license. They needed to put the year and the copyright holder's name in the license file, as opposed to just putting the template there. I'm fine with it, I know it's an MIT license. Go explain that to a lawyer. Okay?
Talking about explaining things to lawyers. Thank you, my joke didn't come yet, but I do appreciate that. So I mentioned at the beginning that I work on cloud products, and one of the things about cloud is that you don't actually give your clients hardware to install. You basically serve them bits over the internet, mainly an HTML page or a JSON blob. So I was talking to this lawyer who had a lot of experience with on-prem solutions, so they have that going for them, and they had worked on developing and deploying software with companies that deployed
software like that. They kept telling me, "You need to make sure that you file for the Linux that you use, so your customer can use Linux." I said, "My customer is not using Linux. My customer is using Firefox or Chrome. Linux is compiling the bits that go to them." And we started going through this discussion, explaining what cloud is and how it works, using examples like Pinterest and Twitter. That didn't work. I actually used LexisNexis and things like that, and it sank in a little bit, and we started talking about it. After a little while they got frustrated: "I'm not understanding what you're saying." And pardon me, I'm just a developer, so they didn't understand what I was saying. And they're like, "Okay, how about we reach this middle ground: if you can make your clients install CentOS for you, then you don't need to do the legal paperwork." And I'm like, "I'll do all the paperwork. I'm not going to make my clients install CentOS." And I'm not blaming the lawyer; the way they're trained leads them in a specific way. But there's a huge gap, and sometimes just doing more paperwork is easier. So you can imagine somebody like that throwing a fit over something like this. Okay?
The next thing. Sorry, there's another thing. This morning I was talking with a nice gentleman who represents Canonical in the Southern California region. His name totally escapes me... Richard Gaskin. I was telling him about the talk, and he mentioned that, with respect to licensing, there's a license that's called, and my kids are in the room, the WTF license. Okay? And there is Edubuntu, which is a distribution of Ubuntu that cares about education and creates custom distributions for schooling. They were working with LAUSD, the Los Angeles Unified School District, and they wanted the students to explore free and open source software. As part of explaining open source software, they tell them they should go and look up the license. So basically they're teaching them the right thing, and they're
teaching them how to do open source right. So a student goes in and, excuse me, I know this is a license, but I changed some of the writing a little bit, and this is the license. Okay? Now, it's open source software, it's free software, and I respect everyone's opinion, but unfortunately there are people who use this license, people whose software could have been used in education. So we kind of need to make sure that we have proper licensing in place. It's just something I wanted to bring up as one of the pitfalls that happen. You may be a super-duper developer for whom spinning something off takes no time, but there are other people who could really use it and would love to, and something like this stops them from using it.
Another issue that I faced is with dual-licensed software. Someone can say, "Oh, this package is available as Apache 2 and as GPL also." Okay, so you can use it as Apache 2 or as GPL. Now, enterprises in general tend to like permissive licenses, so if I want to go and download the Apache version, great, I go get the Apache version of the code and use it. But there's a problem: the developer wrote their code with the GPL license embedded in the code. Which is fine, but they say, "If you want to use it as Apache, you need to replace the GPL license with Apache." So basically, just write a script that goes into the code, snips out the GPL, and puts in Apache. That's fine, I wouldn't see a problem with that if I were a personal developer. But there is one big issue with it that may make some lawyers squirm. Remember the four freedoms: I get to decide what freedom I give others, right? But guess what: if I touch their source code, I have actually modified the code, right? So if I go to a lawyer and say, "This software package is available as GPL and as Apache, and by the way, I downloaded it, and my internal intention is to use it as Apache, but there's no way to prove it; the only way to prove it is to touch the code," they say, "Do not
touch code, especially code that has GPL in it." Okay? It's common practice, and I've seen in more than one place people actually go through the trouble of providing two source packages, one that uses Apache in the code and another that uses GPL, and I believe that's the right way to do it in order to make it more accommodating for the enterprise. So please don't make users who do not want to touch your code touch your code. And remember, touch is a command in Linux.
Also, for security. One thing that package developers may not pay a lot of attention to: you used, for example, Logback or Log4j. Log4j has another dependency under it, and there is a CVE against that third-layer dependency. You may not care that much, because you say, "Oh, I don't even call the runtime or the class that has that CVE against it." But guess what, there are other people who care. So how do you deal with it? Well, somebody might say, "You know what, I'll grab your code. I know Maven well enough, unfortunately, to manipulate it and say: for that specific dependency, don't use the one they're specifying in their POM; use the one that I'm going to override it with in this POM, this other dependency." But here comes a problem: I'm injecting a library into someone else's code, and they never tested their code against this library. It's up to me to test their code against this library. So the lawyers may like that approach, but for me as a developer it puts me in a risky situation. If I push that code into production and something comes up, and I go to the IRC or Slack or whatever they're using for supporting that package, they say, "Well, we didn't configure it that way." Okay? So if you have a project you care about, keep an eye out for the CVEs and deploy new versions. Or alternatively, and I was happy that one of the projects I faced this with had this, they had comprehensive enough unit tests that even if I replaced the internal libraries, the
unit tests touched everything they needed. So they told me, "You know what, build it, compile it, and if it works, do it. Run the unit tests. I personally have confidence, as the author of the package, that if the unit tests pass, it'll work." And guess what: it failed on one thing. They patched it, and it worked. Two months later he sent me an email saying, "Hey, by the way, you told me about this. I made a change to accommodate the update and I updated the underlying package. Thank you." Okay? So good things come out of these, and for that specific package, God knows how many people got the benefit of it.
Also, provide means of communication. You may not want to put out your email; if you create a library, create a distribution list or just say what IRC room you hang out in. Speaking of IRC: remember the team who gave me that job they didn't want? One of the processes that I put in, since Slack was not available at that time, was an answer to "Where do we get support? How would you reach the author?" You say, "Well, there's a mailing list." Well, not everyone has a mailing list, but there is IRC. So think about it: new guy coming in, talking about Linux and open source software, coming to one of the most established security firms in the world, talking to developers who have been writing security code for a long time, and telling them, "Please log into IRC from your workstations." It took some convincing to get that running. So please do provide means of communication.
So, in closing. After we saw what enterprises need to do and what developers need to do, I believe there is a sweet spot that we can reach. Because once people come knocking on your door saying that you have a problem with license enforcement in your organization, it becomes an enforcement matter. However, when it comes from inside the organization, to adopt and accept open source and try to be good stewards of it as you go forward, then it actually becomes a culture. And if there is one thing we know about open source, it is culture. And if that
culture comes from within, you actually come out with something your company will benefit from, and you won't get into any trouble, for the most part, when it comes to open source. I thought this was an awesome quote, which I just put my name under in case someone wants to use it. So thank you very much, this is my talk. If you have any questions, please.
Yes? So when I deal with a language like that, I try to enforce versioning. Usually, although it is not inherently available, when you import a package it will always grab the latest; however, there is a mechanism where you can specify a specific version. And generally, when you work in a big enterprise, in a big company, you always try to enforce specific versioning practices. You always want to know that whatever is coming in the pipe will come out the same way each and every time. Generally, big organizations have their own repositories internally, so whatever versioning happens in the outside world, they are always bound to the versions they have internally. But it is always good practice to go specifically with the internal versions.
That's it. Because when you package your application, unless you are using Docker or something like that, you package it in multiple layers, and once you package it for production it is generally frozen in that state. So when you are in dev, you keep upgrading as you go along, but once you reach the staging environment you are kind of stable with the versioning. Generally you are not going to upgrade a library once you are in stage, so that is a good time to do that process, and the time it takes to soak your application in stage is long enough for the process to go through. By that time you pretty much know what the version is. Okay?
Well, thank you everyone for attending. I hope you enjoyed the talk; I know I did. Thank you very much. You know, well, Shaden, my daughter, was like, at 10 minutes to start time, she was bored already. Were you bored, honey? Okay, I am sorry, I will make it up to you.
at least you are honest about it thank you are your daughters coming are your daughters coming oh come on they don't need a pass to come I just cleaned up awesome