Hello, everyone. My name is Sarah Miller, and my co-presenter here is Melissa Robertson. We're here to talk about container factories in aerospace and defense. We both work for a company called Collins Aerospace. It's actually not a company; it's a brand that brings together a whole bunch of aerospace and defense legal entities with a long history of providing equipment for commercial and military aircraft. We're also part of a much larger organization called Raytheon Technologies, which produces the Pratt and Whitney engines and Raytheon's missiles that help keep us safe, as well as Raytheon Intelligence and Space. Within Collins Aerospace, we're in a division called Mission Systems, where we provide all sorts of material and products to the US military, as well as to ministries of defense across the globe.

Today we're going to talk a little bit about our problems. If you're in the aerospace and defense industry, you may know that we maintain systems that have been delivered to our customers for over 30 years. That equipment is expected to stay operational, continue to run, and get updates, because of course customers want to upgrade things. But we're still using our legacy technologies. We're kind of stuck in the early 90s, with the CRT monitors and the computers of that era, which, you know, I have trouble using, and which my information technology organization (relabeled Digital Technologies at some point in our history) does not want running on their network: things like Windows 95 and Windows NT. How do you continue to maintain products that are 25 and 30 years old? We're trying to figure that out for our developers and our product engineers, to help them sustain the old but also adapt and move to the new.

So we have all of the same questions that I'm sure your organizations have encountered. It builds on my system; why doesn't it build on yours? Why doesn't it run there? Which package did I miss? Many of our services and setups live on servers that have been maintained since the early 90s. That's not something we want to continue. We don't really want to go buy hardware on eBay. We don't really want to trade with the National Computer Museum for hardware that actually works. We also don't want to keep our material on floppy disks, especially when the new engineers I hire don't know what a floppy is; they think it's the save icon. You hand them one, and it's very funny, actually. But we have fun trying to upgrade our systems and keep our material in digital form. Where do we store it? How do we archive it so it's there for the next generation? And the systems are old: if you remember your original UNIX flavors, bash was not a thing. We really need to get to a state where our systems are usable for the modern developer, the modern engineer.

We also want to prevent environment disparity. We want our engineers all operating in environments with parity, with the same things as, hopefully, our deployment environments. What I really want is for my build environments to all match, from the developer who's been here for 25 years to the new hire who started yesterday. And with all of these legacy technologies around,
I really want to enable my developers to start using something new, to bring in new technology they can use, and to start using emulation so we don't have to keep that Windows 95 computer operational and keep replacing the little lithium battery that runs its clock. Don't forget about that. And we really want to update our old practices, like storing our material on a floppy disk that we know has bit rot, or on a CD that we know has bit rot. If you've ever had to get old material off a CD-ROM disc created in the early 90s, you may know something about bit rot; it just does not come out.

So we're trying to think transformatively for our engineers, and our transformational thinking today is declarative, ephemeral build environments. We want everything declarative so it can be recreated at a moment's notice, and we want it ephemeral, so we're not having a Windows 95 machine hang around for years on end where it could be exposing vulnerabilities across the network. We only want it running for the short time it takes to build the application that still needs that silly Borland C++ compiler.

Melissa and I work in an organization you could probably describe as platform engineering. We don't think of ourselves that way; we're just the transitional organization keeping DT away from our engineers and our engineers away from DT, providing that interface and helping our engineers get across. We have a lot of needs as we migrate to containers. Our engineers have a lot of requirements for what they want to do. Digital Technologies, or IT, has things they want to do. And then we have governance, risk, and compliance: we build safety-critical things, so the FAA has a lot of say in how we build software, NASA has a lot of say, and the DOD has a lot of say.

We're starting with our infrastructure. My engineers, all they want is quick access to compute: give me a machine, let me run something, I want SSH, log in, do stuff. Digital Technologies wants to manage and control everything and make sure what you're running is good. And governance says: I don't want anybody to have access to your machine, we have to control it and protect it, and I want to know how it's going to be used.

Building on our compute infrastructure, we really want to be using orchestration platforms. Our developers have a huge need to dynamically scale their build servers and their testing. Collins Aerospace builds things like flight simulators for aircraft, and those simulators use a tremendous amount of compute, but we want to scale that dynamically in our cloud. We also need policy enforcement for Digital Technologies: they want to control access, give us guardrails, and prevent us from destroying someone else's work on the network. We've got to keep our developers honest. And our governance teams want to be able to monitor and audit what we're doing, so we can say that, yes, no international person accessed our ITAR data.

And then we move to our container factories, which is what we're going to talk about today: how we're developing automated workflows to help our engineers build containers so they can build and test their software.
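To give a flavor of what "declarative and ephemeral" means in practice, here is a minimal sketch of one of these build-environment definitions. Everything in it, from the registry path and digest to the package names, is an illustrative placeholder, not our actual factory output:

```dockerfile
# Declarative build environment: every input is pinned, so the image can be
# rebuilt on demand instead of living on for years as a pet server.
# Registry path, digest, and packages below are placeholders.
FROM registry.example.com/base/ubuntu@sha256:<pinned-digest>

# Pin the exact legacy toolchain the product still requires.
RUN apt-get update \
    && apt-get install -y --no-install-recommends gcc-4.9 make \
    && rm -rf /var/lib/apt/lists/*

# Build as an unprivileged user; nothing in the container outlives the job.
RUN useradd --create-home builder
USER builder
WORKDIR /home/builder/src
```

The ephemeral half comes from how it's run: something like `docker run --rm -v "$PWD":/home/builder/src build-env make` gives the developer that toolchain for exactly one build, and then the container is gone.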
We're not yet at the point where we're putting containers in our aircraft equipment, most of the time. We have some systems that might want to do that and are trying to do that. But really what we want to do is scale how we maintain, build, and develop our software in our organization. We want to make sure we can get authorizations to run all the tools within our network, that we're not going to break someone else's software that's out there running, and that we meet all of our company's guidance and rules. As you can see, we're a conglomeration of organizations through acquisition; we've gone from 16,000 employees to almost 200,000 employees, so there are a lot of rules we have to follow.

We also need a layer for container visibility and observability. We want to make sure the containers we're building can operate in the network and that we know what they're doing. And we want these things to have little impact on our developers' performance: they shouldn't need to change their app or their build system to get logging, because it's all built into the images, and we can test it.

And lastly, we have the app. This is my stuff, don't touch it; this is what I'm building for you. Our Digital Technologies organization really wants to understand the risk of what I'm building and what it will do in the network. That's very important, especially when we start running tests: we're running non-production software, testing the things we're building, in our networks, and we need to understand that risk to limit the vulnerabilities we might add to the network. For governance, we really want to understand: where did that code come from? Who wrote it? Did you pull it from the internet? Did you generate it with ChatGPT? Who owns the copyright? So we want to understand where the code came from, from a governance perspective.

We're trying to build on all of these things to give our developers a chance. But as you can see, what my developers want isn't necessarily what DT and our governance system will give them. So we have a bit of a conflict, and it's been ongoing as our organizations grow; we end up with shadow IT services, things that are just not kosher in the environment. So through our program, Melissa and I are trying to create self-service environments for our developers so they can get easy access to compute. We want to reduce our developers' cognitive load through automation of policies and procedures, so they don't have to know the 150 controls they'd have to meet for RMF or CMMC when all they want is to launch a container to compile some software. We want to create immutable audit records so we know who did what, when. We want to provide curated pipelines with controls and stages, so when we go to meet our FAA requirements, or DO-178 requirements for how we manage and build our software, teams have stages they can pull into their jobs and pipelines, or leave out. And we want standardized containers, so teams can actually use containers within their networks. We want to create pathways of least resistance for developers so that they stop doing shadow IT.
We want them to stop putting secrets in their software just so they can log into that service account over there to get access to the Git repository, or heaven forbid the Subversion repository, or that old ClearCase repository that's still out there. We're really trying to bring our teammates up and get them the experience. We also hope they get some good cloud education, so they can spend their time learning about the cloud, not learning how to navigate the policies of our organization. So now Melissa is going to tell us a little about the work we've been doing to create some of these automated workflows for our developers. And as we move forward with this discussion, remember it's not just our team: we've brought our product engineering teams in to build this together.

Let me quickly introduce myself: hi, I'm Melissa Robertson. I've been working on our container workflow, trying to create a development workflow for our engineers to use. We have a generic set of stages we're trying to go through, and we need to separate out the controls based on the different customers and requirements we have. We need to onboard our teams, teach them how to use these requirements, and have the thresholds for the different environments and networks agreed upon, not only by the developers but by the security and quality experts as well. We need to make sure we choose tools (we currently have multiple) that meet the minimum requirements. We need a record of review that is auditable to our quality standards, immutable, and, per our company policy, retained for 99 years.

We need controls around our source code, and we really want to stop source code being downloaded onto local PCs and instead enable developing in the cloud. We want to pull dependencies managed within our organization, so we're not going out to the internet and we know exactly what we're pulling down. We want continual remediation of those dependencies, and we want the ability to pull pre-approved images from registries such as Iron Bank, or CNCF containers that meet a security threshold. Once we pull those dependencies, we need to check the signatures and certificates assigned to them. Then we need to lint the container definition and make sure, depending on the customer, that we don't open ports we don't need, we don't have access to things we're not supposed to, and we don't grant sudo access where it shouldn't happen. Currently we're using Hadolint for this in our build pipeline.

Then we actually build the container. We need our agents running in an authorized environment. We need it to be a declarative and ephemeral environment, and we need to limit the blast radius of anything being compromised, so that if you build one container and it messes something up, you move on to the next one and nothing carries over. We need audit logging and monitoring of the environment, to make sure there's no unauthorized access and that the pipeline fails if there is. Then we scan the container at the OS level to make sure we stay in compliance, with at least two vulnerability scanners. Sometimes the customer requires a specific one, but right now we use Grype and Trivy, built into GitLab.
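As a purely illustrative sketch, the lint, build, and scan stages just described might look something like this in GitLab CI. Hadolint, Grype, Trivy, and the GitLab variables are real; the image tags and thresholds are placeholders, not our production pipeline:

```yaml
stages: [lint, build, scan]

# Lint the Dockerfile for unsafe practices (open ports, sudo, etc.).
lint-dockerfile:
  stage: lint
  image: hadolint/hadolint:latest-debian
  script:
    - hadolint Dockerfile

# Build and push the candidate image.
build-image:
  stage: build
  image: docker:24
  services: [docker:24-dind]
  script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" "$CI_REGISTRY"
    - docker build -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA" .
    - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"

# Two independent OS-level scanners, per the two-scanner requirement;
# either one exceeding its threshold fails the pipeline.
scan-grype:
  stage: scan
  image: anchore/grype:latest
  script:
    - grype "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA" --fail-on high

scan-trivy:
  stage: scan
  image: aquasec/trivy:latest
  script:
    - trivy image --exit-code 1 --severity HIGH,CRITICAL "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"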
But we must show that the container is secure enough before we actually build our code in it. The thresholds can differ, but we really need a separation of duties, so the developers can't change those requirements; that should be done by the quality and security experts for the program. We also need to test the container, and we need baseline test results so we can do continuous remediation any time there's a vulnerability and pull fixes in automatically. It's a delicate balance between remediating quickly and sometimes accepting risks that are necessary, such as running Python 2 and not upgrading to 3.

And we need to generate the artifacts. We have a lot of requirements for the documents we need before we release something: some people want Excel sheets, some people want a pretty website. But we really need to work with our governance, risk, and compliance team to have not only the vulnerabilities at the point in time the container was built, but continuous remediation and tracking of vulnerabilities while people are running these containers, and we need to automate it so there are fewer mistakes. Then we need to sign the containers and the generated artifacts within each container, and limit how long a signature stays valid, so we're not running the same containers 20 years later.

We have a bunch of different environments: our corporate network, which allows some access to the internet; our developer network, which has a little more leeway; and then our cloud and closed environments. We need signatures, with an automated approval process, for use in each of these environments. But we also need to continually scan for vulnerabilities based on what is actually running, and have the ability to revoke a signature when vulnerabilities are flagged above a certain threshold for a given environment. And in a generic network with different clusters, we need to restrict signatures per cluster as well, making sure we're not running something in dev that has access to prod data, and that a simulation running for a long time gets stricter vulnerability thresholds. We also really need to educate our developers, so we're not spinning up 2,000 containers in the cloud and ruining resources for other developers, or racking up a $4 million charge in a day. And we really need the same environment across the board, instead of five different versions of our Git repository. Something we're leveraging here is Zarf, from Defense Unicorns, to try to make all of this declarative across the board.
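A Zarf package is itself just declarative YAML. Here is a minimal, purely hypothetical sketch (names and paths are placeholders) of how a build environment and its supporting manifests might be declared so the same bundle can move into cloud, on-prem, or closed networks:

```yaml
kind: ZarfPackageConfig
metadata:
  name: build-environment
  version: 0.1.0
  description: Declarative bundle of build containers and supporting manifests

components:
  - name: build-images
    required: true
    images:
      # The container images the package carries into the target network.
      - registry.example.com/app/builder:1.4.2
  - name: git-service
    required: false
    manifests:
      - name: gitea
        files:
          - manifests/gitea.yaml
```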
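And on the signing step described a moment ago: the talk doesn't name a tool, but with something like Sigstore's cosign, signing plus a policy-checked expiry might look like this sketch. The key paths, image name, and expiry convention are all assumptions:

```bash
# Sign the image and attach an expiry annotation. cosign signatures don't
# expire on their own, so the 'expires' annotation is only meaningful if an
# admission policy in each environment checks and enforces it.
cosign sign --key cosign.key \
  -a expires="2025-12-31" \
  registry.example.com/app/builder:1.4.2

# Each environment verifies before running; revocation can be implemented by
# rotating keys or by the policy rejecting flagged digests.
cosign verify --key cosign.pub registry.example.com/app/builder:1.4.2
```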
So, yeah, we're building this and learning. This week I've learned that we're doing some platform engineering, which is a new term for me; I've read some articles on it, but I didn't realize how much I was actually involved in it. We're really trying to provide a developer experience that brings this environment parity. We don't want our developers going into their secure lab and having to go back to Subversion, or heaven forbid ClearCase, and all the old tools they've grown accustomed to, just to maintain that Solaris or SunOS server that's been around for many, many years. We want to bring them up to a really modern environment, with the modern tools they expect to use.

We want to be able to recruit new people who already know how to use those tools, so I don't have to train them all the time, and I also want to be able to upskill my existing engineers, so the folks who are still running Borland C++ can learn how to use GCC, learn what those errors mean, and discover that, hey, color coding is a thing. We really want to bring our teams together to grow together. We're not doing this in isolation with just this team: we've involved our product teams, and we have experts from our cybersecurity teams as well as from our Digital Technologies organization working with us.

But we do face a tremendous amount of challenges. Going forward, we're still trying to figure out the appropriate ways to run containers in our environments at Collins Aerospace. We get pockets of people doing a lot of shadow IT: I have Docker installed, she might have Docker installed, I've got Rancher installed on my desktop or laptop. We try to use these things, but then every other day or so McAfee kicks off and says, no, you can't do that, and I go file my exception. We want to make those things normal. Maybe we even get the developer off their laptop, so I don't have to carry around this nine-pound laptop with an hour of battery life; I can have a reasonable one and use Codespaces, or some really cool Gitpod-style workspaces (there's a small sketch of that below), so I can just access my app in the cloud, go somewhere else, and do it again. So we're working hard to build those experiences, to understand what our developers' needs really are, and to look at how we can support some of the older technologies and lift them up, because we will have to keep supporting development systems built with custom tools that run on Windows NT and Windows XP. We're still trying to figure out how to get from Code Composer 4 to Code Composer 6 for some of our DSP processing. It's going to be a long road, but we're working on it. Thank you. Any questions?

It's everything. I think the biggest issue we have is a misunderstanding between what our engineering product teams need and what our Digital Technologies teams expect. What does production mean to you? For me to put my source code, which is generally owned by the government, onto a network, it has to be a production network. But my IT guys say: that's source code, you're building source code, that's dev, in your production network? And I say: no, no, no, I have to meet all the requirements of prod just to get my source code there, and then I'm going to have a dev environment inside prod. We talk a different language, and we continue to have the same discussions over and over again. We are working through that.

We do see that developers really want to be local; they want to run local. But we've migrated many teams to servers, so being able to access things remotely from their computer, without having to be on exactly the right VPN, would open them up to more, and would let people from the different heritage organizations work on the same project. Collaboration between RMD, Raytheon Missiles and Defense, and Collins Aerospace is a little difficult because we don't both have access to the same networks yet. It's been, let's see, in April it'll be three years. So it's just really hard.
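On that Codespaces and Gitpod idea: those services are typically driven by a dev container definition checked into the repository, so every engineer who opens the repo in the cloud gets the same workspace. A minimal, purely illustrative one (the image name is a placeholder) might be:

```jsonc
// .devcontainer/devcontainer.json: declares the cloud workspace every
// engineer gets, instead of a hand-maintained laptop setup.
{
  "name": "legacy-build-env",
  "image": "registry.example.com/app/builder:1.4.2",
  "customizations": {
    "vscode": {
      "extensions": ["ms-vscode.cpptools"]
    }
  },
  "remoteUser": "builder"
}
```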
Yes, sir. We have tried virtual machines for developers, but that runs into our IT policies: they do not want us running virtualized environments on our mobile or desktop devices. Apparently that's a DFARS violation for them, because they can't control the image inside the VM; that image is mine. And that's the point: when I'm building my app in there, I want to control the compilers, the tools, everything that goes into it. That's why we're trying to build these container factories, so my teams can create all the tools they need and get them into the factory. We have a process for them to iterate quickly, so when they're creating their environments at the beginning of a project, or reusing someone else's environment, they can bring it through that pipeline, get its initial certificate to field and accreditation, and then iterate on it as they go forward.

Yes, sir. Ansible is what we're using right now; we run our cloud builds in EC2 instances. We're really trying to move to that declarative baseline, making sure everything is declarative so we can recreate it in cloud environments regardless of which CSP we're on, in an on-prem data center, or in our classified environments, because we really want to be able to recreate those environments. Now, none of us are Kubernetes experts; we don't yet know how to manage Kubernetes clusters, so we're learning. Our security officers who maintain our classified systems are also learning what it means to be cloud native, to manage Kubernetes clusters, and to bring those forward. But I hope my developer doesn't have to understand everything about a Kubernetes cluster. Maybe a little bit.

No, not really. If I do it, it's shadow IT. They might issue me a Linux machine that I can use, and that's one of the things we talk about: the cost of having to issue me multiple machines to match the development environments I work on. I've been issued a number of machines that sit in labs and that only I use to build software: a couple of Linux machines, a couple of Windows machines. That's a lot of money. Not only is it a lot of money just buying the hardware, which also takes a lot of time to procure now, it's a lot of money managing those machines. I'm not good about doing my own updates, so I know I have an Ubuntu 14.04 box running out there somewhere. Maybe I have a lot of servers running 14.04, I don't know. That's part of the management problem: when I add assets that are not ephemeral, that don't get recreated at the time of need, they stay in legacy mode. They don't get updated; they're not sustainable. We had a couple of those 14.04 servers we wanted to migrate to the data center, and they were migrated as they were, just duplicated into the data center. I'm like, what were you guys doing? You mean you didn't switch to Red Hat, stand up an enterprise version of the base OS so it was all in compliance, and install the software on top of that? And they said, well, the compiler doesn't work, because we need to use GCC 4.9. Why do you have to use GCC 4.9? Because otherwise it doesn't compile. Can you fix it? Yes.

Yes, we are using that to baseline our images, and we're also using it to audit our images, so anything running out there in the cloud in our EC2 instances, we're using it to audit.
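The Ansible-driven declarative baseline mentioned a moment ago might look something like this minimal sketch. The module names are real Ansible built-ins; the host group, packages, and file paths are placeholders:

```yaml
# Baseline an ephemeral build agent so it can be recreated on demand in any
# CSP, on-prem data center, or classified enclave.
- name: Baseline an ephemeral build agent
  hosts: build_agents
  become: true
  tasks:
    - name: Ensure required build packages are present
      ansible.builtin.package:
        name: [git, make, podman]
        state: present

    - name: Create the unprivileged build user
      ansible.builtin.user:
        name: builder
        create_home: true

    - name: Install the corporate CA bundle
      ansible.builtin.copy:
        src: files/corporate-ca.pem
        dest: /etc/pki/ca-trust/source/anchors/corporate-ca.pem
        mode: "0644"
```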
We're also venturing a little bit into Azure now, which is kind of a new experience for both of us. AWS is our heritage organization's preferred cloud provider; Azure seems to be the corporate preferred cloud provider, so we're kind of playing with both. Someone told me they were using Google, but I don't know how many people use Google at Collins Aerospace anymore.

There have been some conflicting requirements in that area, especially in our build environments, which tend to be a lot more dynamic. When we're testing our software in particular, we want access to change priorities: we write real-time embedded software, so if we're running a unit test on a piece of software that creates a thread, we want that test to be able to set the thread's priority, and my DT department says no, you can't do that. So we try to work within the SELinux access controls, figuring out how to do it with privileged or unprivileged containers. What happens if we don't set the priority and run the software anyway? Does it operate? How erratically does it operate? We're also looking at emulation environments: how can we run QEMU within containers to emulate our ARM processors, and can we cleanly separate that emulated kernel from the underlying OS, so we can set priorities inside QEMU and do what we need to do without having root access to the host operating system?

Right now it's very risk-management focused: they look at exactly what I'm doing today, approve that, and then tomorrow I can't say I need to do some other thing, like open a port. It's a very manual process, so we end up with: install my own server back in the lab, put my software on it, don't update it, and go be merry. Which we want to stop. I don't want my developers having to maintain their own servers anymore; I want them on corporate LDAP, all that good stuff.

All right, are we out of time? Do we have a few more minutes? Ten minutes. Any more questions? Well, thank you for attending, we really appreciate it. Thank you.
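A footnote on the priority and emulation questions from the Q&A, as two hypothetical sketches (image names are placeholders). Granting CAP_SYS_NICE plus a real-time rlimit is one way a container can set thread priorities without being fully privileged, and registering QEMU's binfmt handlers is a common way to run ARM images on an x86 build host:

```bash
# Let a containerized unit test call sched_setscheduler() without root on
# the host: grant only CAP_SYS_NICE and a real-time priority rlimit.
docker run --rm --cap-add=SYS_NICE --ulimit rtprio=99 \
  registry.example.com/app/rt-tests:latest ./run_unit_tests

# Register QEMU user-mode emulation once on the host...
docker run --rm --privileged multiarch/qemu-user-static --reset -p yes

# ...then arm64 images run on the x86 host; this prints "aarch64".
docker run --rm --platform linux/arm64 arm64v8/ubuntu:22.04 uname -m
```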