Okay. Good morning. Sorry, I should have tried that out in advance; I think I've used up several minutes of my time just getting set up. I want to thank you all for giving me the opportunity to speak here this morning. My name is Jason Sonic. I'm a research scientist at Adventium Labs, a small research company in Minneapolis, Minnesota, and today I'm going to talk about the Secure Server project.

This might be a somewhat unusual presentation because, although this project has been incubating for a while, we only started working on it officially about four weeks ago, so it's still pretty early in development. This morning I'm going to talk primarily about the customer requirements that are driving this effort, and also advocate for some architectural changes in XenServer that I think have been on the roadmap for a while. I believe this is important to the XenServer community and also to the broader Xen community, so I wanted to take this opportunity to tell you what we're trying to achieve and to solicit feedback from you as potential stakeholders. I will definitely save time at the end for questions and feedback.

So here's a little outline of my talk. First I'm going to talk about the motivation and objectives for the Secure Server project. Then I'm going to talk about the threat landscape that we're using for the development of Secure Server, the current design of Secure Server, the status of the project to date, and then our near-term roadmap for future work.

So first I should tell you that this has been a customer-driven effort, and the desired outcome is a server virtualization platform based on Xen that supports secure server multiplexing. Now clearly Xen has the server multiplexing aspect pretty well wrapped up.
So this morning I'm going to focus primarily on the "secure" part, and specifically on why we feel that something more secure is required and what that more secure system looks like. I'm presenting this in the context of a cloud-like environment where we have multiple tenants who are mutually suspicious. If you read the abstract for this talk, you may have noticed that I originally pitched this as a multi-level secure environment, so I apologize for the bait and switch. I've couched the presentation today in terms of multiple tenants because I think that's the more generally interesting case, but if you were to look at multi-level security, I think you could apply many of the same design principles there.

The goals that we're trying to achieve here are twofold. First, we want multiple tenants to be able to share a single platform while ensuring that the confidentiality and integrity of each tenant's data and processing are safe from co-tenants. The other goal is to support controlled information sharing between tenants on that system while still maintaining that isolation, and to ensure that the controls on that information sharing and on the flow of information are satisfied.

One question that Xen developers may ask is: doesn't Xen already support a lot of what we're trying to achieve? And yes, in many respects it does. The big problem that we've identified is that most deployed systems still rely on a monolithic control domain, and there are many shared privileged components running in dom0 that present a threat in this sort of environment. So in this slide I'm showing a simple scenario that I'll return to throughout the talk.
We have two virtual machines, orange and green, belonging to different tenants, that are sharing a single system running SecureServe, which is the working name for this high-assurance server virtualization platform based on Xen. From our perspective there are five key requirements for the system. I already talked about the first two on the last slide, so I won't go over those again. Among the others raised by our customer, the system needs to support enterprise-ready management: compatibility with management tools like XenCenter, and especially with cloud management systems like OpenStack and CloudStack, is very important to our customer. The other thing that was raised is that the system needs to be highly scalable. We don't really have a firm requirement here, but thousands of VMs on a server has been floated. I know that we're not quite there yet in terms of scalability on XenServer, but considering all these different requirements, we decided that XenServer was the best fit as a basis for building this server virtualization platform. It worked out very nicely that Citrix recently released XenServer to the open source community, because that really gives us the opportunity to do this work.

In this diagram I move to the current state of the system. On the last slide I presented what we'd like, and on this slide I present what we have in a XenServer-based system today. For the sake of simplicity I've illustrated just a few of the major components that live in dom0: QEMU, xapi, XenStore, the backend drivers, device drivers, and so on. In this case we have two guests, but there are potentially many guests sharing a system. And the weakest guest is, in my opinion, the weakest link in the system. I'll justify that in a couple of slides.
But once an attacker has a foothold on one of those systems, whether the tenant was malicious to begin with or has been compromised in some way, that VM can be used as a springboard to attack the confidentiality, integrity, or availability of the other tenants on the system.

There are other possible attacks as well. I'm sure many of you are familiar with cross-VM side-channel attacks. There's quite a body of research on using micro-architectural side channels to compromise confidentiality and integrity indirectly by relying on shared hardware. If you're not familiar with that, I'd be happy to refer you to some of the work. Cloud management provides an additional attack vector. Users need to be able to manage and configure their VMs, and the xapi toolstack is relatively complex. I'm sure you've all heard the apocryphal figure of one exploitable bug per thousand lines of code. I'm sure xapi is a lot better than that, but it's still a relatively large and complex code base that interfaces with a lot of other components. These are the kinds of threats that concern our customer and that have driven the development of the Secure Server project. Today I'm going to focus primarily on the first threat, VM escape attacks, to motivate the rest of the architecture that I'll talk about.

This is a bit of a tangent, but I mentioned on the requirements slide that we're also interested in controlled information sharing between domains that share a system, so I feel obligated to say a little about it. Today you can easily set up inter-VM networking using separate networks in dom0. But of course the isolation between those separate networks is a lot weaker than the separation that the hypervisor provides between different domains.
What we want to be able to support is high-assurance private networks. Xen now provides the spatial separation and some of the low-level primitives necessary to define these trustworthy inter-VM communication channels. So we're interested in developing a solution that would ideally maintain compatibility with existing filters, such as the VM that I've labeled "compliance" here, which enforces some policy on the information sharing. We have an approach for this that we favor, but I'm not going to talk about it this morning; that's another talk. If you have a mutual interest in this, I'd be happy to talk to you offline.

So now I'll move into what I'm calling the threat landscape for Secure Server. I call it that because we have yet to formalize this as a threat model. Rather, it's more of a survey of the vulnerabilities against Xen reported as CVEs over the past two years. We have a little tag cloud there in the bottom corner. There are 73 different vulnerabilities represented, and many of these have multiple potential effects. And here comes the part where I justify the statement that I made earlier. Of those 73 vulnerabilities, 65 include the word "guest", meaning that they in some way originate from a guest on the system: "allows local guest administrators to" do something. A number of these involve attacks that allow the guest to escalate privileges, or to mount an overflow attack against some shared component running in dom0, and at that point to completely compromise the system. Looking at those in a little more detail, the attackers target the toolstack, the hypervisor, and the management software with varying goals, and there are a number of vulnerability types that appear again and again.
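A minimal sketch of the kind of keyword tally described above, assuming the survey amounts to checking each vulnerability description for the word "guest". The function name `count_mentions` and the sample descriptions are hypothetical placeholders, not real CVE text; only the 65-of-73 figure comes from the talk.

```python
import re


def count_mentions(descriptions, keyword):
    """Count descriptions containing `keyword` as a whole word, ignoring case."""
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    return sum(1 for text in descriptions if pattern.search(text))


# Hypothetical placeholder descriptions for illustration, NOT real CVE entries.
sample = [
    "allows local guest administrators to cause a denial of service",
    "buffer overflow in a device model allows guest users to gain privileges",
    "toolstack mishandles a configuration file supplied by the host administrator",
]

guest_related = count_mentions(sample, "guest")
print(f"{guest_related} of {len(sample)} sample entries mention 'guest'")

# Figures quoted in the talk: 65 of the 73 surveyed vulnerabilities.
talk_share = 65 / 73
print(f"share in the talk's survey: {talk_share:.0%}")
```

With the figures from the talk, roughly nine in ten of the surveyed vulnerabilities are reachable from a guest, which is the basis for calling the weakest guest the weakest link.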
I'm not going to walk through all of these, but you can see that there are a number of possible attack vectors that an attacker can use to escape from a guest and compromise components running in dom0.

So the near-term objective for the Secure Server project is to improve the security posture of dom0 in XenServer. What we're looking to do during the next couple of months is to isolate the network stack, isolate the storage stack, and isolate the device model, because we've identified those as some of the most vulnerable components, and then to adapt the existing toolstack to support this configuration. We're applying some well-known security principles: securing the weakest links, separating privileges, and avoiding shared mechanisms by using dom0 disaggregation, a technique that I'm sure you're all familiar with; granting and enforcing least privilege using hypervisor mandatory access controls; and finally using attestation to verify the integrity of the system and provide defense in depth. A number of these design principles have already been applied and validated on other Xen-based platforms. So what we're really looking to do here is establish a baseline based on XenServer that can be used for additional research and development. I would argue that we're successful if we can produce a prototype that demonstrates the value and feasibility of these proposed architectural changes and, again, serves as a baseline for additional research and development.

So now I'll dive into a little detail on the work that has been done so far and that will be done on this short-term prototype. We defined a set of technical requirements for Secure Server using the NIST SP 800-53 catalog of security controls and the CNSSI security control overlays as a guide.
The purpose there was, again, to establish a baseline for systems that require high confidentiality, medium integrity, and medium availability. You might be asking yourself why we used these security controls. These requirements are out there and they're well defined, so we don't need to reinvent them. We can use them as a baseline and then iterate on them as a set of requirements for the project.

As I mentioned already, we're looking to apply dom0 disaggregation to move some of the components out of dom0 and into isolated stub domains and driver domains. Raise your hand if you haven't seen this sort of dom0 disaggregation architecture before. So this is not a new idea, but it may be new to you to learn that we have this configuration running on XenServer 6.2 right now. That's really what we're looking to do: apply some of these ideas to the existing code base. In summary, we're moving the network stack into a separate VM, and we're looking at targeting either a physical network controller or a virtual function on an SR-IOV device. For the storage stack, we're looking at supporting both a local SATA controller and network-attached storage; right now that means iSCSI.

In terms of hypervisor mandatory access controls, Xen has support for XSM, and we're looking to leverage that support to limit privileges in two dimensions. One is to ensure that each of the components running in a separate domain is granted limited privilege. The other is to maintain separation between the different tenants, and the components associated with those tenants, that are running on the system. Some of the recent changes made in Xen 4.3 and to the privcmd driver in upstream Linux have facilitated the work that we want to do here, so we're looking at rolling that support into XenServer.
One question that I've been asked a few times when I talk to people about this is: isn't that overkill? One proposal is that, if you really want to protect the I/O traffic between these two tenants, why not just use software encryption in the different VMs to protect the I/O before it ever leaves the domains, so that everything is encrypted in dom0? I went out and looked for a vulnerability that I could use to illustrate this point, and it just so happens that the community was good enough to accommodate me. Just a few weeks ago there was CVE-2013-4344, a buffer overflow in the SCSI implementation in QEMU that allowed a guest to compromise QEMU and gain privileges. Even though we have a per-instance QEMU process running in dom0, once that process has been compromised by an attacker, it has unfettered access to the memory assigned to the other tenant, and at that point your encryption is not very useful because the encryption key can just be extracted from memory. And if you don't like that example, I can list a whole bunch of others based on other components running in dom0.

So now I'll talk a little about how disaggregation can address this problem. Again, we've isolated these components in different domains, and we're using the mandatory access controls to restrict the privileges assigned to the various components. If the attacker compromises the green domain and then uses that as a springboard to compromise the QEMU emulator, the only memory that they're able to map is the memory assigned to that green VM, so there's been no compromise of the other tenant that's sharing the system. You can make similar claims about providing stronger data confidentiality assurance by isolating the storage stacks.
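The difference between the two scenarios can be sketched as a toy model. This is an illustration under stated assumptions, not XSM policy syntax or the project's actual policy: the `Domain` labels, the role names, and the `may_map_memory` rule are all hypothetical. It assumes each domain carries a role label and a tenant label (the two dimensions mentioned earlier), and that a disaggregated device model may map only the guest of its own tenant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    name: str
    role: str    # hypothetical role label: "guest" or "device_model"
    tenant: str  # hypothetical tenant label: "green" or "orange"


def may_map_memory(source: Domain, target: Domain, disaggregated: bool) -> bool:
    """Return True if `source` may map memory that belongs to `target`."""
    if not disaggregated:
        # Monolithic dom0: a per-guest QEMU process runs with dom0's
        # privileges, so once compromised it can map any guest's memory.
        return source.role == "device_model" and target.role == "guest"
    # Disaggregated with mandatory access control: the device model lives
    # in its own stub domain and may map only its own tenant's guest.
    return (
        source.role == "device_model"
        and target.role == "guest"
        and source.tenant == target.tenant
    )


def reachable_guests(compromised, guests, disaggregated):
    """List the guests whose memory a compromised domain could map."""
    return [g.name for g in guests if may_map_memory(compromised, g, disaggregated)]


green = Domain("green", "guest", "green")
orange = Domain("orange", "guest", "orange")
green_qemu = Domain("green-qemu", "device_model", "green")

monolithic = reachable_guests(green_qemu, [green, orange], disaggregated=False)
isolated = reachable_guests(green_qemu, [green, orange], disaggregated=True)
print("monolithic dom0:", monolithic)
print("disaggregated:  ", isolated)
```

In the monolithic case the compromised green device model reaches both tenants, so an encryption key held in the orange guest's memory is exposed. In the disaggregated case the damage stays within the green tenant.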
Right now we're looking at using static mandatory access controls, because a static policy is all that XSM supports. One thing that we're very interested in, and something that we're working on right now, is dynamic mandatory access controls, because in a cloud environment tenants are going to come and go and the relationships between those tenants evolve over time. So we're interested in how we can support a dynamic XSM policy. I don't really have time to go into much detail on this, but if you're interested in the topic, I would encourage you to attend Phil Chirka's talk later this afternoon, which covers this a little.

Until now we've assumed a trusted computing base that includes Xen and the hardware. One other research and development thrust that we're looking at is supporting both static and dynamic attestation, so we don't really intend to trust these things. We'd like to use measured launch to check the integrity of the system at boot time, and we're also very interested in using dynamic attestation to verify integrity at runtime. This is especially important on server platforms, which are typically going to be long running. I have a little graph here to illustrate the problem, but I'm sure all of you are familiar with it. You can take a measurement at boot time using the existing dynamic root of trust support: tboot, the TPM, TXT; all of that trusted computing support is there now. In fact, XenServer 6.2 has a supplemental pack that supports measured boot of the Xen hypervisor and the dom0 kernel. So we're looking at how we can extend that to cover some of the components that have been moved into disaggregated VMs, and also at how we can support dynamic attestation so that we have a chain of trust all the way up through the trusted components.

So, the current status of the project. As I mentioned, we just
started working on this four weeks ago, but we've made pretty good progress so far. We started with the XenServer 6.2 appliance; in parallel we're also looking at using XenServer Core on a couple of different distributions. So far we've been successful in building a network driver domain and a storage driver domain. We've demonstrated the network driver domain using both the Open vSwitch distributed virtual switch and traditional bridge networking, and for the storage domain we have both an iSCSI target and a local SATA controller. The QEMU stub domain is in development. The big hang-up there, so I've been told, is that stubdom support is not compiled into the XenServer 6.2 appliance. It is in XenServer Core, or at least in some version, so we're looking to reconcile those things so that we can have one platform running all these components simultaneously. We also defined a mandatory access control policy for a specified use case, and we were able to verify and validate it. XenServer 6.2 does not support XSM: it's on an older version of the hypervisor and, again, that support isn't compiled in. It actually turns out that backporting it would be relatively straightforward, but the next release of XenServer, which I believe is coming soon, is based on Xen 4.3, and the support should be there.

As for some of the challenges that we've faced so far, coming at this as an outsider, the main one has really been deducing the relationship between the xapi constructs in XenServer and the Xen constructs that we're all familiar with. All of these concepts have been applied before in versions of open source Xen, but getting this to work on XenServer without breaking the management interfaces, while still allowing users to interact with the system through XenCenter and other management systems, has been something of a challenge, and
so one of the things that we're interested in is adapting the toolstack, moving forward, to support this disaggregated operation. Just to give you one example: certain components of xapi run in dom0 now, and some of them we can move into a driver domain VM. For instance, we can install the xapi network daemon (networkd) in our network driver domain VM. The problem is that it won't function as you'd expect, because it expects to be communicating with xapi in dom0 via a local domain socket.

I guess that's a good segue into my next slide, which is the roadmap. I'm going to jump ahead a little to secure inter-VM communication. An important component of this is supporting some mechanism for all the different VMs running on the system to communicate, and adapting the toolstack to use that mechanism. We've done our own survey internally, and there are more than a dozen different published mechanisms. This is one instance where we're definitely interested in working with the community and getting feedback, because we could prescribe something, but our long-term vision is to enable interoperability with components that are part of XenServer or Citrix XenServer, and also with components that may come from third parties. So developing an API to support a standard interface that virtual machines can use for this is important to us.

The other item on this list in that vein is the service VM model. Right now our driver domains are based on the DDK VM that is distributed with XenServer 6.2, simply because it was expedient, but it is definitely not the right environment to use for these driver domains. So we're interested in reducing the footprint, both in terms of resource consumption and in terms of the attack surface that's
provided by the service VMs, while still maintaining the generality that would allow third-party components to be developed and to interact via the standard interfaces. I've been updating these slides; Mirage, for instance, is something that's new to me, and it would be one possibility for this, and there are others as well.

I'm running out of time here and I want to leave time for questions, so let me just jump to contributions and what we're looking to use this for. We're looking for ways to engage the community and to feed the work that we're doing back to the community, so that other people can use it, as I said, as a baseline for future research and development. One thing that I think we could do in the short term is publish some blog posts or instructions on the mailing list, because one of the things we've worked through is coming up with recipes for supporting these driver domains on XenServer and identifying some of the pitfalls and roadblocks involved. Longer term, we're looking at contributing code, which may involve some changes to xapi and the way its components interact with one another; at interacting through the XenServer developer mailing list and participating in the XenServer Core project; and potentially at contributing templates for these driver domain VMs and the XenServer configuration that's involved.

So at this point I'm running out of time and I'd like to leave some time for questions, so let me just wrap up. We're developing a secure server virtualization platform based on XenServer. I've listed some of the key goals there at the top: building a baseline prototype by drawing on past research and development in a number of different areas. We're looking to release a prototype at the end of this year, and we're hoping that prototype can be used as a
foundation for future research and development. The second phase of the project will be to identify outstanding challenges in the long-term R&D roadmap. So feedback is encouraged. I didn't leave a lot of time for questions, but you can also contact me later.

We have time for two questions; I have to set up the next talk, which we have to do remotely via Skype. So, questions?

Sir, you say you're adapting a toolstack for this application. Are we going to see patches for that? Because that's not really exciting.

That's the goal. That's one way that we'd like to collaborate with the community. As I said at the start, we're a small research company, so we're probably not the right people to take on the job of completely rewriting the xapi toolstack. But there may be some aspects of this that we can address, and then we can feed those patches back to the community, and maybe that will drive additional interest in adding support for disaggregation to xapi. There are certainly some small changes, like the xapi network daemon and the way the rest of xapi interacts with it, where we could make those patches. So yes.

Can you just go back to slide 12?

Okay, let's see, 12. Yes.

You talked about "is this overkill?" I personally don't think it's overkill, because I've written several slides like this before as well. But would it make sense to consider a kind of dom0-per-level sort of model? What we're doing here is breaking dom0 up into a series of separate domains. Would an alternative be, for example, a nested virtualization solution, where you have a whole system with its own dom0 for each tenant?

It's certainly a possibility. Nested virtualization is something we're interested in and are looking at. I have XenServer 6.2 running on, maybe I shouldn't say this, VMware Fusion on my laptop here, so that works pretty well. I had not considered that approach, and I guess it's
something I'd have to think about. In some sense we're doing that through the disaggregation, by having these separate per-tenant components; in a way it kind of is a subset of dom0 per tenant.