Hi, welcome to my talk on promoting open source methods at a large company. I've got some slides, but I'm not going to inflict the corporate template on you. We'll post them on the site later, but we'll go without them.

So here we are at FOSDEM. We're all open source developers. We all know about the great benefits of open source. We've got our differences, but really we're all in the same boat. We may choose a different license, or a different operating system, or perhaps a different editor, but really we're all doing the same things. We all know how great open source is, so I'm not here to sell open source to you. I'm doing that inside my company, and it's a bit of an interesting sell, and that's what I'm going to talk about today.

What I'm really going to talk about is our efforts at work to bring modern open source tools and methods into the software environment at the company. In that context, I'll be talking about our system, Aerosource, which is a repository similar to, say, SourceForge: how we've built it up, a fair bit on the technical side of what we've done there, and what's worked and what hasn't as we've scaled out. Then I'll talk about our experiences in terms of adoption, what sort of resistance we've run into, and what sort of forces have affected the evolution of our system.

So first, a little background on the company I work for. I work for The Aerospace Corporation. We run a federally funded research and development center in the US. We work in the area of national security space.
We work for the Air Force, primarily Space and Missile Command, as well as the National Reconnaissance Office, plus any other civil or commercial organization involved in space. We literally work in every area that has anything to do with space systems, including things you might not think of, such as communication networks and ground systems. We have civil engineers because we work on things like launch pads. Literally every engineering discipline, plus the things you might think of more commonly, such as launch vehicles, satellites, sensors on satellites, and then post-processing of data from those sensors. So a wide range of things. And we have about 2,500 engineers who work in these areas.

My role at the company is in the technical computing services department, and we provide technical computing expertise to all of these engineers so they can get their jobs done. On the whole these engineers are very experienced in their particular discipline. They have on average over 20 years of experience in space systems. But they're typically not software engineers, and so one of the things we're trying to do is connect them with open source tools and open source methods, really get them up to speed, and get them using modern technology in their software development.

As you might guess, these engineers write an awful lot of code. We did a survey back before Y2K to try to figure out how much software we had, and we determined that we had well over 3,000 packages, either external software or internal software, that we were running throughout the company. That works out to somewhat more than one per employee. So an awful lot of software. These engineers are writing much of this or modifying much of it for their use. Some of it is just private use: they write code, they run it, and then they give results to people. Other stuff is shared within the company. And then a small amount of it is
externally distributed. Some of it's purely internal, some of it's modified code.

Historically much of this code has been hidden away, and this has been a big problem. We've had a fair amount of code that is just sitting on people's desktops, waiting for disaster to strike and wipe the machine out. In other cases we might be lucky and it might be on a shared volume. But often there's no revision control at all. I probably don't have to sell too many of you on the benefits of revision control, but it's amazingly difficult sometimes to sell it inside.

We've had projects where we found they were working as a collaborative team, and they were relatively modern: they were using a Windows file server somewhere, and they had a whiteboard where, when you wanted to work on a file, you wrote the file name on the whiteboard. They were one of our more advanced users at the time, and even then they had problems with functionality disappearing between revisions. Someone would add a feature, they'd build the release and ship it, and then somebody else would take the code they'd been working on in that same file and just throw the old version of the file back on with their changes, and poof, there went the feature. So one of the things we're trying to fix here is to get this code out in the open and get it into revision control so that they stop losing stuff. That's one of the many problems we're trying to solve.

We recognized this problem early on, five or six years ago. We saw this and said, you know, we should do something about this. At the time we decided to build a sort of SourceForge-like site using GForge, which was based on the last open source release of the SourceForge code. We wanted to try to solve this problem within the company, but also, selfishly, we wanted to have a single server set up so that when we had a new project we could quickly stand up a new CVS repository and a new bug tracker and all
that sort of stuff.

We did get that set up, and we attracted a handful of projects, a couple of users, and it sort of worked, but we really didn't get good traction. We did keep it running, but we ended up with the problem that you'll see if you go to SourceForge and pick a random project. Most of them are dead, after all, and you'll see the standard ugly default template, and that's what we ended up with on our site. It's the same problem, really. It wasn't a good way for people to find each other's work, but it did sort of function.

The other problem we found is that it was amazingly difficult to perform an upgrade. It was typically a two or three day full-time process, which was kind of ridiculous given that the system served maybe ten projects. Every time a revision came out, the whole innards had changed. It was pretty awful. So we sort of let it go. We had some success, but it didn't really work out.

In 2006 we looked at the problem and said, okay, we're pretty sure this was a good idea, so let's try again. There were better tools available by then. We were really interested in moving to Subversion. We also looked around and said, you know, web apps have gotten a lot better. The experience is better, the interfaces are better, there's much more integrated stuff. Really a lot better. We also thought we should really try to change the culture a bit within the company, so that it's more sharing, more open, and really just try to get people out there and say, okay,
let's actually share this code. We probably only need one piece of C code to read two-card element sets for satellite ephemerides. One is probably enough. The format, after all, hasn't changed since it was actually distributed on cards. So we decided to try to work on that.

As a result of that decision, we decided that we were going to have a rule that any project that was in there had to be fully open, just like it is in, say, SourceForge, so that everyone within the company could read the code. That limited our base a bit, in that there were a number of projects out there that we knew couldn't use the system because they had NDAs or whatnot, where only a restricted set of people could actually use the code. But we thought there would be enough stuff out there, and we wanted to take that strong stand that people would have to be open. We called this concept enterprise source. We actually don't use the term much at the moment, but the idea was basically open source, but within the company, so within the company anyone could get in there.

It turned out that there was some short-term funding available internally to work on a project. We got, I think, three months of funding under an innovation grant so we could work on this. We looked around and we ended up selecting Trac as the basis for the web interface and Subversion as the version control system. What we liked about Trac is that it doesn't impose any particular process on the user. There's a very small amount of workflow.
You could add more, but you don't have to. We also liked that it was fully integrated, unlike the classic SourceForge model, where they took best-of-breed tools such as Bugzilla and whatnot and sort of mashed them together. The nice thing there was that the tools were all really powerful, but they didn't integrate well. So we looked at Trac and we decided that we liked the integrated model. None of the pieces is as good as the best standalone piece: the wiki is not up to MediaWiki standards or TWiki standards, and the bug tracker is not up to Bugzilla standards. But the fact that the markup works across them and the interface is all the same was really useful to us.

So, a quick aside before I get more into the implementation details. Overall, the whole system is built out of open source software. It's a FreeBSD hosting environment with the Apache web server. We started with mod_python as our method of running Trac, with Postgres on the back end, and we use the ports collection to create custom meta-ports to manage all the software. So we bring in Trac and Postgres and all the various tools we need, plus all the plugins. All of those are handled through a custom meta-port, which makes it very easy to replicate the system. If we want to stand up a dev server, it's a couple of hours' work to build and install everything and then schlep a copy of the data over. So it's really nice.
I highly recommend this approach if you're building any system like this. And I will say that if your packaging system makes it hard, you really need to fix your packaging system.

So, on to the implementation details. One gotcha with our selection of Trac is that Trac is designed for single projects. It has a very tiny amount of multi-project support, and it's fairly weak. So we had to do a fair bit of work to get that working.

In our initial effort we targeted simplicity of the configuration. We used the fact that both Trac and Subversion have the ability to say in the Apache config: here's a directory, and all the subdirectories are either Trac environments or Subversion repositories, respectively. We used that because it was simple, and the total config was 20 lines. We stuck all the Trac environments in one place and all the Subversion repositories in a Subversion directory, and that was very simple. Internally, if you go to the Aerosource host, it just redirects you immediately to the Aerosource project, because we eat our own dog food here and Aerosource is entirely managed within Aerosource.

The initial implementation worked pretty well, but it did have some issues. One of the interesting ones, which we ran into early on, didn't matter too much to start out because all of our projects were basically the same, but it did once we started doing custom plug-in development.
We discovered that oh Mod with mod Python every instance of track was living in the same Python interpreter So if you loaded a module that hooked something like say the custom email Like like like the the email notification thing so that you could tweak the results Which we did for one project if you did it wrong it changed who you emailed to in every project When if you think about it, there's the additional problem that since any project can upload a plug-in you then have arbitrary code injection Which is not maybe the best thing We we also another issue was that because we were using because of the way we were using subversion we ended up Not supporting Subversion over HTTP for a commit because there was no we would have had to trust each of our users to manually Configure the access controls, and I didn't feel that our users really up to that task We provided subversion over SSH instead which worked quite well, but It was a little too much for some of our windows users and did eventually start to hurt adoption There's also a minor issue that because we had this one directory over here of track bits and this other direct bit of Directory of subversion bits. It was a little tricky to tell how much storage your project was actually using But enough implementation for the moment It was it was it was in fact fun to put all that together get it working, but you know the real proof is In adoption rate I've got the one slide here that is probably worth looking at that. I'm not going to show it The it's a graph of project growth overall So for about the first year and a half of Aerosource we grew pretty consistently about three and a half projects a month And then about a year ago we started to get more pickup Due to a number of internal factors And we're now the last six months. 
We've actually been growing at an average of about ten projects a month. So I've been pretty happy about that. When I put these slides together we were at about 204 projects internally. Since then we had a little lull in December, as no one was really doing any work, or at least not starting big new projects, and now we're up over 210.

In addition to just the numbers, we have had some really nice wins. I think one of our biggest wins was that about six months in, management pressure started to be applied to people, saying you guys need to get your code into the repository and start working in this environment. One of the big ones was a program called the Satellite Orbital Analysis Program, or SOAP, for maximum acronym confusion, especially when a couple of years ago we started writing web services modules for it. So we got that in, and that was a big one. It's one of our few external software products that we actually give out to places like NASA and some of the military sites. It's a very pretty graphical program for plotting satellites and whatnot, and getting that in, we thought, was a really big win. We've definitely got some holdouts, and actually one of the annoying holdouts is a major library that SOAP uses, but I think we'll get there eventually.

We have had a number of things that have delayed adoption in a number of cases. In some cases, we've had a legitimate issue.
We've had legitimate needs for a more secure system, one where not everyone within the company can see the code. Everyone inside has a clearance, but it's not appropriate in all cases to make the code available. So we've had some legitimate issues, but most of our issues have come down to a real lack of a sharing culture. We've really been trying to change that, with, I think, some success.

In many cases there's an odd perceived need for security. People are convinced they need to hide this stuff from their fellow employees, despite the fact that they should generally be able to trust them. That was one of the reasons why early on we said, look, it's all got to be open. We won't provide any alternative. If you want to be in here, you've got to keep it open. That gave us a wedge, and we could say, look, you want these features, we know you want them, so you need to really look and see whether you actually need to keep this stuff secret. In many cases people didn't. But it's hard to convince people that that's really the case.

We've also had interesting issues where there's a real fear of misuse: that if people put their code out, their fellow engineers will go off and use it in some inappropriate way. Sometimes that's legitimate. Something that's not written for human space flight shouldn't be used for human space flight, obviously. But for the most part we've been able to convince people that you really can trust your fellow employees to do it right. That's been an interesting challenge, and it's something we're still working on: that we really need this greater openness.

There are also some interesting cases of a misplaced sense of ownership. People are like, this is my code and I'm not going to let anyone else use it. But in fact it's the company's code.
It's not your code. So it's really interesting to get that sort of thing. Trying to encourage people to give credit is a real challenge. It's something the open source community does very well, and we're trying to work on ways to get people to do that. We've had some limited success in that area, but it's a little tricky.

The strangest objection we've had so far to Aerosource, though, was: if we put it in there, people are going to see it, and they're going to like it, and then they're going to write new code. We actually had a long argument one day with a group, and they were like, no, it would be horrible, people would come and they would write code and make it better. And I will admit we lost that one. We were like, isn't this a great problem to have? Yes, you'll have to spend some time doing review, but wouldn't it be awesome if you had to hire another person so you could do review of new code? That one still floors me.

Fortunately, it's not all gloom and doom. We've had internal pressures as well that have really helped us out in terms of getting new projects. One big thrust that's helped is a big push over the last decade for continuity of technical operations. It's similar to continuity of business operations, where you have a second SAP instance set up somewhere so you can print your paychecks if your data center gets flooded or whatnot. Continuity of technical operations, or COTO as we typically call it inside, is basically the same thing, but how do we get engineers back up and running, for instance?
We have tasks that are go/no-go for rocket launches. We have people analyzing weather data to make sure that the upper atmosphere is right. So it's really important that if we have, say, a fire or a flood or an earthquake (we are, after all, based in LA), we be able to get back up and running. With all that hidden code I talked about, that's really hard. If the only copy of some critical code is on some guy's laptop, then even if it's in backups you still have to dig the backups out. If we lost the data center, we'd have to retrieve them from Iron Mountain and then figure out where they were, that sort of thing. So getting code into a repository and getting it properly handled and backed up is going to help there.

So there has been a push, and Aerosource has been named the COTO repository for internally developed software. That's been really helpful and has really increased the number of projects going in. One of the things we're seeing is that once we get somebody in for that reason, they'll see the tools. They'll like them.
They'll say, wow, I don't have to maintain this PASCII server anymore. And then they start putting in more projects. So every time we convert a new group, it starts to snowball. So that's really good.

One thing we did have to do when we became the COTO repository is relax our everything-has-to-be-open setup, because there were in fact projects of critical importance that did have NDAs associated with them, either because they incorporated data from a particular contractor or because they were using somewhat sensitive sensor information or whatnot. So we did actually create a separate project, sort of within Aerosource, called Aranda, which stands for Aerospace Restricted and NDA. The idea is that it's just like Aerosource, except that it's access controlled. Project requests use the same form; you just check a box that says I need this. And because we didn't want people going, oh, I'm going to go get an Aranda project because I don't want other people to see my code, because maybe it's awful, or maybe I want my job security or whatnot, we require that people submit a justification with a traceable security requirement. It's then approved by my boss, who reviews all those things.

Now I'll talk a little bit about the implementation changes we had to make to make Aranda work. First off, we needed to harden the system.
So we used the Center for Internet Security benchmarks for both the operating system and Apache to tighten down the security config. We were pretty good already, but a few tweaks here and there didn't hurt.

We also switched to per-project WSGI (that's Web Server Gateway Interface) instances. This got us away from the single shared Python instance, so that we had a Python instance per project. We also moved to group-based access restrictions so that these Aranda projects could be accessible only by certain groups of people. And because we did that and had to do all this splitting out, as a side benefit we got HTTPS Subversion commits working.

For the Aranda projects we've also switched to per-project virtual hosts for the primary name. We're not really using that for anything at the moment, but the idea is that later on we could move to a model where we had per-project jails, or we could even cluster the projects so we could have multiple servers and use smaller servers to scale. In the process we also consolidated the storage structure so that we had one single point of file system entry for every project, and therefore we could more easily manage the security and verify it.

This implementation has also worked reasonably well, but it has some new issues. Because we now have a Python instance per project running inside an Apache process, the server rapidly ran out of memory. We originally had two gigs of RAM in it.
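As a rough illustration of the per-project WSGI arrangement described above: with Trac, each project can get its own small WSGI entry script, which the web server then runs in a separate process. The sketch below generates such scripts. It is not the Aerosource tooling; the directory layout (`<project>/trac`), the file name `trac.wsgi`, and the helper names are assumptions made for illustration. The `trac.env_path` key and `trac.web.main.dispatch_request` are the hooks Trac documents for WSGI deployment.

```python
from pathlib import Path
from string import Template
from typing import List

# Text of one project's WSGI entry script. Only $env_dir is substituted.
# The layout and file names used here are hypothetical.
WSGI_TEMPLATE = Template('''\
import trac.web.main


def application(environ, start_response):
    # Point this process at exactly one Trac environment.
    environ["trac.env_path"] = "$env_dir"
    return trac.web.main.dispatch_request(environ, start_response)
''')


def render_wsgi_script(project_root: Path) -> str:
    """Return the WSGI entry script text for a single project directory."""
    env_dir = (project_root / "trac").as_posix()
    return WSGI_TEMPLATE.substitute(env_dir=env_dir)


def write_wsgi_scripts(projects_dir: Path) -> List[Path]:
    """Write <project>/trac.wsgi for every subdirectory of projects_dir."""
    written = []
    project_roots = sorted(p for p in projects_dir.iterdir() if p.is_dir())
    for project_root in project_roots:
        script = project_root / "trac.wsgi"
        script.write_text(render_wsgi_script(project_root))
        written.append(script)
    return written
```

Each generated script would then be mapped to its own process in the web server configuration. That separation is what keeps one project's plug-ins from affecting another's, and it is also why the memory footprint grows with the number of projects.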
We put the server up to four gigs, but it's still a bit tight. Fortunately, we've got a new server on order, so that will fix the problem the brute force way and go from four gigs to 72. That should keep us for a little while. As a workaround, we've also set the processes up to die after five minutes. Given that most projects aren't being hit all the time, that kind of helped keep the memory footprint down.

One problem we've discovered, though, is that the internal web crawlers for our search engines have the world's worst access pattern for us, in that they'll go into each project and read exactly the same sub-page all the way across, at several per second. So the server gets kind of sluggish on the weekends. Fortunately, I think the upgrade should fix this.

One other issue is Subversion access. We went to per-project WSGI instances, and for Aranda projects those run as a different user for each project. We can't do that with Subversion, because the Subversion DAV module can only run as the web server. It'd be nice if that could be fixed. Once we get Kerberos integration into the system, I think we'll switch back to promoting SSH-based Subversion access to address this issue, especially in the more secure projects. We also have the problem, again with the Subversion module, that it leaks memory like crazy under SSL. You can't check out a large project without running out of memory; you basically have to keep retrying.

Overall, Aranda adoption has been not what I would have wished. When I put these slides together we only had 10 projects. We'd announced in August and that was in November.
So I think we're up to maybe 15 now. We seem to be growing, but it's slower than I'd anticipated. We knew there was some pent-up demand, and I'm a little surprised that we haven't had more pickup there. I suspect that in part the reason is that the users who have these security requirements tend to be in their isolated little worlds and maybe just haven't really seen the value of the open source tools yet. I think one other problem may be that when we say we're offering them revision control, they're thinking, oh, it's going to be one of these massive heavyweight systems where you have to fill out six forms in order to make a change, like you would for flight safety software. But really, we're not doing that. I think others may just not know we exist, so we're in the process of an ad campaign for that. But we haven't quite got the pickup I would have liked there.

I'll talk a little bit about future directions now. That's the current state of affairs. We're up at a bit over 200 projects. Not bad: about one for every ten engineers. So pretty decent.

At the moment we're working on getting new servers and also moving to a system where we'll be using FreeBSD 8 and ZFS to allow us to replicate projects. We're going to have a new server on the East Coast, so we'll replicate data regularly to actually provide the kind of backup we should have, in addition to merely being on tape.

We're also looking at some Trac enhancements. One of the things we've found, for instance, is that users will get one project. They'll get it going.
They'll be happy, and then they'll get excited and create four more projects, because they have all these different things they're working on. That's great, and then they say, oh, but I wanted to see all my tickets at once. Right now there's no mechanism for that, so we're working on creating one. Hopefully we'll be able to write something that we can contribute back to the community in a useful way.

We're also contemplating some ideas for a sort of cloud or clustered approach, where each project can live on a different server. We're looking at that.

Another thing we'll almost certainly be doing is adding more revision control options. I think for many of our users centralized revision control is probably what they can get their heads around at this point, but we're definitely seeing a lot of advantages to distributed version control, so we're probably going to start bringing that in. We've got some sort of unofficial Mercurial support at the moment, but it gets a little more complicated. We need some new or up-and-coming Trac features to make that really work.
Also, because of this need for COTO, where we need to have a reliable central repository for backup purposes, we're going to have to carefully manage how people use distributed version control so that it doesn't all just end up back on their laptops. That's something I'm really interested to see: how that's working in the open source world, and what lessons we can apply in that regard.

Overall, what we're really continuing to work on is trying to convert more users to this whole open source frame of mind. Really get them more into it, get them more open, get them to share with each other, because I think that's really powerful. It's one of the best things about open source, and I think if we can make that happen inside, that will really improve our ability to deliver.

So in conclusion, I think Aerosource and the Aranda component are introducing people to new tools and methods. We are seeing some real traction there, but we've got a ways to go. Overall, I think it's a positive influence.

I've got a few minutes left, so I'd be happy to take any questions people have, either now or later. Also, if people have ideas for how to promote open source in an environment like mine, or want to talk about that sort of thing, I'd love to do that during the conference. I also want to plug Richard's talk. He's coming up next, and I've seen a version of it before. It's great fun.

Any questions?

Hello. Hi. I'm Robert Wiggetman from Eurocontrol here in Brussels. It's the European equivalent of the FAA, let's say. There was one remark that you made which surprised me, when you said that for the so-called SOAP project management pressure was positive. Why did management suddenly realize that this was a good idea? How did you convince them? Because for me, clueless management is usually what blocks everything.
The management pressure came from the fact that, at the time, I was in the Computers and Software Division, and so were they. I was in a research arm; we were a bunch of crazy open source people. The other guys had been working on this thing for 10 years, and they were in fact one of the projects that had a history of feature loss. A customer would pay them to add a feature and then it would disappear, and things like that. And we were spending a couple of man-years every year on this thing, so it's a significant expense. So management said, look, you guys, we've got this great Aerosource thing. You guys should use it. And that's where the positive management pressure came from.

Is any of your software likely to be of interest for external consumption as well? Is that even a discussion you might... Sorry: your software, isn't it going to be of value outside your organization? Is that even a discussion you're willing to start?

Well, I guess we're trying. As we develop larger modules where that makes sense, I think we're going to try to push those back. Most of the software that's in Aerosource is stuff that we couldn't ever release. Our release process is kind of a pain. We do a certain amount of open source, and right now I think we've got a couple of projects within Aerosource that are also released as open source. For instance, some people in my old department wrote the Cell back end for LLVM, and that lives in Aerosource.

Hello, Daniel Brown. I'm working for a company called markets. Our company's gone through something similar to yours. We've got a lot of different companies that have sort of come in, and we've got a lot of different types of technology. I just wanted to ask you about adoption. You said adoption was something you struggled with a bit. How have you tried to educate people in the
How have you tried to educate people in the Organization about what you're doing So I guess how do we educate people? It's it's kind of I mean a lot of its word of mouth Just you know we get one person hooked and they talk to other people The other the other thing is we have some internal Like tech forums and whatnot where people are supposed to come and talk about new new IT Technology within the company or whatnot and so we're using that we're using those forums to broadcast One of the things we're working on is to develop some curriculum for our internal education program Both to teach the particular tools as well as doing general overviews We need we need to build more of that stuff up or adapt existing stuff For instance, you know, we need a subversion course You know and there's good stuff out there But we need a short to the point version and then maybe advanced for later years, but that sort of thing Okay Thank you very much for coming