Well, we've gone through a few introductory sessions so far, and now we start getting into the meat and potatoes. Leading off this section is Larry from Calxeda, and Calxeda is part of the team of corporations that are backing Xen — I'm sure he'll go into that as well. So how about a big round of applause for Larry?

Okay. So we may be a small group, but do we have the best badges you guys have seen? I don't know if Lars gets credit for this or Russell, but there are a lot of people walking around with badge envy when they see these big boys. That's pretty impressive — I was impressed when I got mine. So we're going to turn the focus a little bit in the filter we look at Xen through, and I'm going to come at this from an ARM server perspective. You heard a little bit about that in the first presentation this morning; I'm going to go quite a bit deeper here. It's going to give you a perspective relative to a lot of things you're hearing this week — cloud, big data, and obviously virtualization, being here today — but looking at it with some significant innovation that's really coming into the data center via ARM servers.

A little background on myself: I'm one of the co-founders of Calxeda. We're based in Austin, Texas, and we were founded exclusively with the focus of bringing ARM technology into servers and into the data center. We were actually seed-funded by ARM, so we've got a tremendous relationship and partnership with ARM, and it's an exciting time in the data center. For the last five years I've been spending a lot of my time helping to build the software ecosystem around ARM and ARM servers — you'll see some examples of that today — and I'm part of the advisory board for the Xen Project here at the Linux Foundation. I'm going to cover a range of things today.
I want to first start from a market perspective and really give you a sense of why this is a significant movement now, tying it to workload and usage, because that's one of the questions we get a lot: what's ARM good for in the data center? What's the fit? I'll show you some of the progressions on where we're going, talk a little bit about technology trends that are foundational to what we're doing, and then from there look at some roadmaps, because I think that's a good anchor for understanding both the hardware and software perspective from key contributing elements in this space. I'll obviously talk a little bit about Xen — of all the presentations today this may be the least Xen-heavy, but that's okay, because you've got the right people filling in all the elements around it — and talk a little bit about where we go from here.

So let me start by looking at classic enterprise IT workloads, and think about them at two levels: think about scale of application, and think about volume in the segment. A number of classic examples here. In your upper left, IT infrastructure, which historically has driven the volume in the enterprise space, but not necessarily the scale of the application. Contrast that with decision support, which has really driven more of the scale — but there it's been scale-up instead of scale-out. As you think of these workloads, the figures of merit that matter are typically peak compute performance and peak single-thread performance. Power hasn't really been a focus area, and you see that in the cost of owning some of the heavier-duty big-iron systems, and in some of the challenges of running the data centers. And when we think about virtualization here, it's been more about consolidation than management or operational efficiencies.
So that's really where we've been historically. But as you think about where we are and where we're going, the intersection with classic Web 2.0 workloads — big data analytics, scale-out storage, a large web tier — really changes how you look at these workloads and where the growth is going. Then you add the cloud element, moving those workloads into the cloud, and it changes how you think about scale and how you think about volume from a segment standpoint. Those figures of merit really get flipped on their side. In a lot of those workloads it's now less about single-CPU performance and more about collective throughput and I/O. I/O becomes such an element of efficiency and capability — especially in big data workloads where you're moving a lot of data around, it's how fast can I get data in and how fast can I get it out, much more than the compute itself. Operational efficiency becomes huge. Power — the cost of running the data center — means the cost of acquisition becomes less than the cost of ownership if you don't do it right, and we'll talk in a lot more detail about how that plays out. Operational efficiency in terms of how do I manage, how do I control, how do I provision becomes a much bigger metric in how I evaluate success here. Then there are things like PUE: how do I measure efficiency of the data center as a whole? Do I have governments getting involved?
From carbon to overall power footprints to even control within the data center and power-per-rack footprints — it's really changed how you build data centers, how you operate, and how you measure. This is a core driver of why we start to look differently at the underlying architecture of servers, and at how we invest around that.

So I showed you growth from the application perspective, but look at the market and the projections of where we are. Over the next few years, projections are that the dollar spend for servers will continue to grow, but at a much flatter rate. Within the overall server market, though, the growth of those Web 2.0 workloads becomes significant — to the degree of about 40 percent by 2016. So it's massive, accelerated growth within that space, while actually decreasing outside of that space within the server market. Now, within the Web 2.0 space, projections are that 40 percent of that space will become microserver or ultra-low-power server based. So we're going from basically zero to about a nine-billion-dollar spend in four or five years — huge growth, and a huge change in terms of what that means to the data center, to the DevOps guys, and to the development community. And one of the key drivers, and one of the key areas of interest, for all those reasons I talked about in terms of figures of merit and capability, is the underlying ARM architecture. So we've got this interesting thread of 40-40-40 — that'll be an easy quiz, right? 40 percent of the market becoming Web 2.0, 40 percent of that space becoming ultra-low-power or microservers, and 40 percent interest in ARM.

So what's the relevance of ARM? Well, first and foremost, the ARM architecture and design comes out of a mobile technology perspective, and the focus on energy efficiency is intense. I've kind of learned this firsthand.
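That energy focus ultimately gets measured as performance per watt per dollar, and as a minimal sketch of how you might compare systems on that metric, here's a toy calculation — every server spec below is an invented illustrative number, not a Calxeda or vendor figure:

```python
# Compare hypothetical servers on performance per watt per dollar.
# All figures are illustrative assumptions, not vendor data.

def perf_per_watt_per_dollar(perf, watts, dollars):
    """Throughput units per watt per dollar -- higher is better."""
    return perf / (watts * dollars)

servers = {
    # name: (relative throughput, wall power in W, acquisition cost in $)
    "big-iron scale-up": (100.0, 800.0, 20000.0),
    "low-power scale-out node": (10.0, 20.0, 500.0),
}

for name, (perf, watts, dollars) in servers.items():
    score = perf_per_watt_per_dollar(perf, watts, dollars)
    print(f"{name}: {score:.2e} perf/W/$")
```

With these made-up numbers the small node wins handily even though its raw throughput is a tenth of the big box — which is exactly the flip in figures of merit the talk is describing.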
My background has really been exclusively servers — HPC, high-end type servers. The other two co-founders of Calxeda came out of the ARM space, out of XScale: David Borland drove a lot of designs and delivered great product in that space, and Barry Evans was a GM on the business side. As we built the company and spent a lot of time together, I was continually reminded if we ever wasted a milliwatt. For a server guy in the HPC space, that's a rounding error. But if you're coming out of that mobile space, where you worry about battery life and capability, you care about every single milliwatt. A lot of times I get the question, what's the big home run for you? There isn't a single home run — it's a collection of many, many optimizations to save each and every milliwatt. So that performance/energy trade-off is a continuum in this space, and performance per watt per dollar is the metric that really drives investment and purchase.

The ARM space pioneered the system-on-chip capability. I'm going to go into a lot more detail on what that really means, but it's the ability to integrate all of your core IP into a single piece of silicon, which gives great efficiency but also allows you a level of flexibility and, frankly, creativity in design. And then, at a very fine-grained level, there are things like power and clock-frequency gating. The number of power domains within an ARM architecture chip is significant — in one of our chips we're up to about 20 power domains that are constantly going on and off, and they're architected at a level of efficiency where you're not paying a price for turning those domains on, because they're small enough, controllable enough, and granular enough.

Now, the other thing that's happened is that the performance demands in that mobile space have really taken off — and it's no surprise; you guys know as well as anybody — from tablets to higher-performing cell phones. Apple's announcement last week was 64-bit, v8-based iPhones.
You're seeing a tremendous push in that space, and we saw it in the 2007–2008 timeframe with the Cortex-A9 from ARM, where you started to see performance at a level that could drive those workloads that are more I/O-driven and less about peak CPU. Okay, so that kind of sets the tone relative to the market and some of the dynamics of where we've been and where we're going.

So let's talk a little bit more from a technology standpoint. There's a really interesting graph that shows the historical trend over the last thirty-plus years of single-core performance growth. In the heyday, from the late '80s to the early 2000s, we were sustaining 52 percent per year, year-over-year performance improvement at single core — pretty stunning. But as we've gotten into the last five, six, seven years, you start to see that tapering off. It's still pretty impressive at 22 percent per year, but we're starting to hit some walls, from manufacturing walls to other design elements that we've already run through. So as workload priorities are changing, you're also seeing the technology pieces start to change around them. While single-core performance growth is slowing, the demand from the workloads on the I/O side is skyrocketing. Just a couple of examples here. Think about distributed application-level storage — great examples are Ceph and Gluster.
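To get a feel for why a replicated scale-out store like that stresses I/O and the interconnect rather than peak CPU, here's a rough back-of-envelope — the ingest rate and the Ceph-style three-way replication factor below are illustrative assumptions, not measurements:

```python
# Back-of-envelope: network traffic generated by a replicated scale-out store.
# All parameters are illustrative assumptions.

ingest_rate_gbps = 10.0    # client writes arriving at the cluster, Gbit/s
replication_factor = 3     # each object stored on 3 nodes (a common default)

# Every byte written is forwarded to (replication_factor - 1) peer nodes,
# so internal east-west traffic is a multiple of the client-facing traffic.
east_west_gbps = ingest_rate_gbps * (replication_factor - 1)
total_fabric_gbps = ingest_rate_gbps + east_west_gbps

print(f"client-facing (north-south): {ingest_rate_gbps:.0f} Gbit/s")
print(f"replication (east-west):     {east_west_gbps:.0f} Gbit/s")
print(f"total on the fabric:         {total_fabric_gbps:.0f} Gbit/s")
```

Under these assumptions the cluster's internal traffic is twice what the clients ever see — the collective-throughput story in a nutshell.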
There are other open source examples as well. But in a world where we don't want to throw any data away, the growth rate in this storage layer is amazing, at a time when the high-end classic storage providers are really getting attacked from the low end by what's become known as software-defined storage using commodity hardware. Now, when people talk about commodity hardware in the storage space, what they really mean is a commodity CPU — the I/O around it, the networking and drive support, certainly is not commodity, because of the demands on that system.

A couple of other factors are really interesting here. I'm going to talk more about fabrics in a little bit, but one aspect of some of this scale-out storage is algorithms that really run more in an east-west traffic pattern than north-south. What that means is you care more about the traffic to your nearest neighbor, back and forth. Well, in a classic cluster world, that means I've got to go up to a top-of-rack switch and back down, or maybe out to another rack — the latency and loss of bandwidth can be significant. In a fabric world, with much tighter integration across a cluster, that path to your nearest neighbor, or nearest set of neighbors, can be really short-circuited. So again, the underlying architecture to support this kind of scale-out — in this case storage — is a mapping that really comes together well, and again it's much less about peak compute than collective throughput.

So now that I've got all that storage, here comes big data from the analytics side. Hadoop obviously is the space that gets a lot of attention, with the combination of map and reduce. And frankly, I really think the storage guys end up winning this battle, because if you're investing that kind of money and that kind of real estate in storage,
I'm going to optimize around that. Hadoop's a starting point, but frankly it's the wild wild west here, and there's a lot of interesting innovation coming in this space as well.

Now, if you look at the classic benchmarks, we're also moving away from just simple SPECint and SPEC CPU — or what used to be called "dusty decks" in the HPC space, with a lot of Fortran and other codes like that — to really a next generation of benchmarks. One I'd like to call out is Graph500, which still comes out of the HPC space but is very intense on the data side: a graph-model type of analysis, a network-type model. You're seeing a lot more attention and focus there, because it really represents these kinds of workloads, these kinds of challenges from the scale-out space that are pushing where this market is going.

So now let me come back to the system on chip that I touched on earlier. In the ARM architecture world, by virtue of the type of IP and the delivery mechanism from ARM, I can take a standard ARM core and build around it, in the silicon, an optimized solution. That's what's been happening in the embedded space for a long, long time. It could be specialized peripherals, it could be offload engines, it could be GPU accelerators — it really depends on what your target market is and what your domain expertise is as a company,
and it really creates a thriving, competitive market because of that flexibility, yet with a standard building block. So one example that Calxeda is certainly focused on is the management space for servers. In addition to the ARM cores, we have an additional core in every SoC that delivers high-value, standard server management, controlled via IPMI, which is well accepted in the data center — but bundled into every single SoC, so I don't need a separate card or a separate system for that management. And because it's integrated, the granularity and low-level control I have is way beyond what I could get from some separate box or chip. That includes power optimization — I mentioned the power-domain aspect before, turning power domains on and off — and I'll talk about fabric and how we can control that dynamically or based on policies, how I handle provisioning, how I handle boot. It's that optimization. And by the way, by embedding and baking that in, I mask away that complexity from the end user and from the data center operator. All they care about is IPMI — standard IPMI, with custom extensions that are approved and a well-accepted model within the IPMI standard itself. And when you're thinking of literally thousands of these in a rack, any concept of an attached display, serial cables, or keyboard-video-mouse type of support and service is long gone. That is not the support model you're going to see; it doesn't make sense for this kind of density and model, which I'll continue to talk about.

Fabric is the other space Calxeda has chosen to invest in, and you're really seeing it as a core of the next generation of server architecture. All you have to do is look at that picture in the lower right, with cabling strung throughout — we've all seen it, unfortunately, probably in our own facilities, so we know what that means. But back to that SoC: if I can now also bake in fabric technology that allows — and I won't go deep into
this, but — a transparent and reconfigurable node-to-node interconnect, it greatly decreases cost and decreases power for what I'm designing. What that means is I don't need an Ethernet cable coming off of every chip. I don't need the controller or port costs that I would otherwise have to pay, because I've baked it into the silicon itself. I can make the connection on my system board, and I can control it via software, so I can dynamically affect the routes, I can dynamically optimize where I want to carry traffic, and I'm not plugging in and removing cables. It's a significant step forward, and again a mapping to the workloads we care about. I can build in redundancy, I can build in different topologies, and I can enable system partners to create appliances and workload-optimized systems that leverage different fabric implementations based on the kind of workload they care about. And because those volumes are so large in individual workloads, it's well worth that kind of investment.
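As a toy illustration of how an integrated fabric short-circuits nearest-neighbor traffic compared with bouncing through a top-of-rack switch, here's a hop-count sketch — the ring topology and cluster size are invented for illustration, not a description of any particular Calxeda fabric layout:

```python
# Toy comparison: nearest-neighbor path length on an integrated fabric
# (nodes wired in a ring) vs. a classic tree where every packet goes
# node -> top-of-rack switch -> node. Hop counts only; purely illustrative.

def ring_hops(a, b, n_nodes):
    """Shortest hop count between nodes a and b on an n-node ring fabric."""
    d = abs(a - b)
    return min(d, n_nodes - d)

def tor_hops(a, b):
    """Classic cluster: up to the top-of-rack switch and back down."""
    return 2 if a != b else 0

n = 64
a, b = 10, 11  # nearest neighbors, e.g. partners in an east-west exchange
print(f"ring fabric:  {ring_hops(a, b, n)} hop(s)")
print(f"top-of-rack:  {tor_hops(a, b)} hop(s)")
```

The point isn't the exact numbers — it's that on a fabric the topology can be chosen so that east-west traffic never leaves the board, while in the tree every east-west exchange pays the switch round-trip.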
I used the example of erasure coding before, in terms of removing that top-of-rack dependency on the storage side — but then also removing the top of rack completely and being able to expand fabrics beyond an individual rack. It really changes the dynamics and the metrics as you think about latency and bandwidth. Today we're running up to 10-gig bandwidth node-to-node on our fabric, but we can also ratchet it down dynamically to one gig, with incremental steps in between. So the concept of million-node data centers is not far away. Now, as you think about virtualization and control — I think the first presentation this morning talked about breaking the 1024 barrier for the number of VMs — when you think about a million nodes in a cluster, we're at a whole new level there.

So let me use — and I'm not here to give a Calxeda commercial, but I just want to use our EnergyCore as an example of a system on chip and what that really translates to. There are really four building blocks that we integrate on our server SoC. The upper right is the CPU complex from ARM — in this case an ARM Cortex-A9 — with integrated L2 cache and an integrated memory controller. The lower right is our fabric switch: think of it as an eight-port L2 switch, that's probably the best analogy I'd use, with five ports coming off of the chip. So I can connect up to five other SoCs to it — any topology you can create with five links, go for it. The lower left is I/O integration: SATA, PCIe, up to 10-gig Ethernet, SD and eMMC, all integrated in the silicon, so I don't need a separate chip for I/O, I don't need separate chips throughout the system design. And then, finally, our management engine. I mentioned that before, but this is a separate core, separate from where I'm running Linux and my Linux stack.
So there are security benefits, control benefits, and operational benefits — again by integrating, but also by separating it from the core itself. That simplicity and integration translate to benefits in system design as well, both in decreasing complexity and in improving efficiency.

So here's one example of one form factor. On about a ten-inch card we've got four of our SoCs — think of it as a four-server cluster, no cables, all in. You can see the simplicity: that's all I need from a design perspective. In this case I've got four gig of DDR per SoC, for a total of 16 on that one card, and I support up to four SATA drives per SoC. All my power, my SATA, my fabric — everything comes through the connectors that go into a passive system board. So a cluster of four servers at about five watts per SoC: four servers at 20 watts. An example configuration on the lower right — that's actually one of the Moonshot systems from HP — is a 4U with 288 servers, 288 nodes in that 4U. That changes your metric for density when you think about ten of those 4Us in a standard rack, in terms of capability and what you can build into a single rack.

Now, I've included the Open Compute logo here as well. For those of you who aren't aware of Open Compute, it's a project started initially by Facebook — think of it as open source for hardware, a community effort to look at system designs in a different way: different form factors for racks, looking at power optimization as it relates to power supplies in a different way. It is changing the way the industry looks at system design, which is a great intersection with how we're looking at the individual SoC or node level. And it's really called out
typically in the industry as being exclusively for the hyperscale guys like Facebook, but we're really seeing adoption in much broader segments — that doesn't get talked about as much publicly, but I think you're really going to continue to see traction there.

So let's talk about roadmaps, starting with ARM. I talked about the A9, going back to really 2008–2009. The Cortex-A15, right in the middle here, is in systems that are now starting to ship in the server space. There are a couple of key things the A15 brings to market. One is a larger physical address space: we're still in a 32-bit virtual world, but now we're in a 40-bit physical world. What that translates to is more physical memory per SoC, and in particular more physical memory per core — I can move to typically a two- or four-gig-per-core model, beyond the A9, where I was limited to a single gig per core. Also of particular relevance today is the virtualization support: with the A15, ARM has what's called the MMU-400, their standard IP that provides the hardware virtualization hooks that are key as we push the virtualization space forward, and I'll talk a little more about that as well. And then following are the A57 and the A53. This is v8 of the ARM architecture, and what the industry is getting extremely excited about is full 64-bit delivery — there are certainly server workloads that really just plain require 64-bit, and this is where we'll see those additional workloads starting to come over. On each of these iterations we're seeing improved performance as well, both at the core and on the memory side. And I should also mention that, exiting 2012, I think the number was 40 billion ARM devices in the world — a pretty stunning number — so I think the ARM architecture is here to stay for a while longer. That's the hardware.
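Before moving to the software side, a quick sanity check on a couple of the hardware numbers above — the Moonshot-style 4U density and what 40-bit physical addressing buys (the 42U rack size is an assumed standard, and the per-core split is just the 2–4 GB/core model mentioned):

```python
# Sanity-check the density and addressing numbers from the talk.

# Density: HP Moonshot-style 4U chassis with 288 nodes, ~5 W per SoC.
nodes_per_4u = 288
watts_per_soc = 5
rack_units = 42                              # assumed standard rack height
chassis_per_rack = rack_units // 4           # ten 4U chassis fit, as stated
nodes_per_rack = nodes_per_4u * chassis_per_rack
print(f"nodes per rack: {nodes_per_rack}")
print(f"SoC compute power per 4U: {nodes_per_4u * watts_per_soc} W")

# Addressing: Cortex-A15 moves from 32-bit to 40-bit *physical* addresses.
max_phys_32 = 2**32  # bytes addressable with 32 bits
max_phys_40 = 2**40  # bytes addressable with 40 bits
print(f"32-bit physical limit: {max_phys_32 // 2**30} GiB")
print(f"40-bit physical limit: {max_phys_40 // 2**30} GiB")
```

That's nearly three thousand nodes in a rack from the SoCs alone (drives, DRAM, and chassis overhead excluded), and a jump from a 4 GiB to a 1 TiB physical ceiling — which is what makes the multi-gig-per-core memory model possible.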
Let's talk about the software side. There are always a lot of questions — "Hey, that's great, but what do you have from a software ecosystem?" — and I'm really thrilled to say this has built real momentum over the last couple of years. From the Linux distro standpoint, we worked closely with Canonical from the beginning; Canonical has been closely collaborating with ARM going back to 2008 with Ubuntu. In 2011 you started to see early releases supporting ARM servers, and it's been fully in place since 12.04 with the LTS, long-term support, release model that Canonical has. We'll be seeing 13.10 come out here shortly, with full support for the A15, virtualization, and large physical memory.

Major progress in the Fedora space as well. We've worked closely with Red Hat and Fedora. Although not technically publicly stated, in reality ARM is being treated as a primary architecture now within Fedora, going into the Fedora 20 release. We've got a number of systems in the build center in Phoenix, in their data center, that support all of the tools and infrastructure needed to really hit that primary level. And F19 came out on ARM the same day it came out on x86, which was a first. So that's been major progress from that standpoint. Similar on the SUSE side: openSUSE 12.2 was well supported on ARM, 12.3 is out now, and similar to F20 on the A15, we'll see 12.3-plus on the A15 as well. And then CentOS — I'll just say watch this space. The guy speaking after me today will be able to go into a lot more depth on CentOS, but we are excited about figuring out opportunities for CentOS support on this architecture as well. It's a great match, especially when you think about some of the workloads you see in the CentOS space.

Okay, so now to the Xen and virtualization side. This has really moved very quickly when you think about it. In 2011, thanks to Ian and Stefano and some of the team at Citrix,
we had our first working code for the Cortex-A15, on emulation, supporting hardware virtualization. In 2012 — Linaro; let me step back for a second. Linaro is an industry group started by ARM back in 2010, originally with, I think, about five silicon vendors, all delivering ARM silicon to the market but primarily on the embedded and client side. The real driver was to improve ARM's capability in the Linux space — at that time Linus had some pretty tough words for that community — so it's really been a chance to pull together a collective engineering effort around optimizing and delivering Linux in that space. In 2012 we formed a working group specifically for servers, the Linaro Enterprise Group, including not only silicon guys like Calxeda but also OS vendors — Canonical and Red Hat — OEMs like HP, and end users like Facebook. That's a pretty stellar group when you think of the investment and the commitment to open source support around Linaro. Citrix joined early in 2013, and there is a virtualization group within LEG focused specifically on driving Xen and other virtualization capabilities in the ARM space. And now, this year, we've already talked about the Xen Project — Calxeda was one of the founding members of the Xen Project, in particular to help drive the ARM focus and the support for multiple architectures in that effort. With the 4.3 release this summer we saw really the instantiation of that first real ARM support, and with 4.4, early next year, we're looking at a really solid release for that core virtualization capability.

Okay, so advantages for Xen on ARM. One is just the history, quite honestly, of Citrix and ARM — it's no coincidence, the common geography of Cambridge in the UK. I think in this industry that's always a benefit, and culturally those two organizations click. Having helped drive this ecosystem over the last five years, I can tell you there are some cultures that don't click when I talk about ARM and ARM servers, and I lose them
quickly. But this one has been a lot of fun because of the common foundation and baseline we're coming from — there's a lot of cross-pollination there that helps.

Optimizing for a small footprint: the team at Citrix understood that from the beginning. There was a great presentation on the ARM work last year in San Diego that went into more detail on lines of code, optimization, and targets. That's a key point — understanding the landing zone as you're doing the design is really important.

Then think about what I said about fabrics and the capability that brings to this architecture, and think about things like migration. If I've got a low-latency, high-bandwidth interconnect that I can migrate around within a cluster, that's a great enabler, and a great combination as you think about larger clusters, elements moving around that cluster, and migration maybe becoming more important over time. I really like that as a space for optimization and improvement that, frankly, we're just starting to think about and are only scratching the surface of. That's just one example of where I believe the combinations of these technologies are really going to play well off of each other.

Consistent management is also critical. I like to talk about removing friction when I introduce a new architecture to a data center. It's like walking in and saying, "This is the greatest thing since sliced bread — as long as you change your compiler, and your OS, and everything you've ever done." That's a short discussion. It's a much better discussion if I can come in and say, hey, for the most part everything's just going to work. If you've got Java code, it's going to work. If you use IPMI, it's going to work. If you're using standard Linux distros, it's going to work. If you're using Xen, from a management, provisioning, and control standpoint,
it's going to work. That's a key piece of how you bring in this very disruptive technology in a way that doesn't disrupt how people go about their jobs.

So let's talk about a specific example: how we look at OpenStack on Xen. There was a comment this morning, in one of the presentations, that somebody was surprised OpenStack was even working with Xen. Well, it's really a great combination for us, because of the underpinning Xen brings to the table, because of the optimization we talked about — and private cloud instances on ARM servers, at the scale you think about for where those workloads are going, are, we think, an excellent combination. So with Havana coming out this fall, with Canonical Ubuntu 13.10 — and by the way, the Canonical team has done a lot in the OpenStack space, so it's a great combination there — and with the Xen 4.3 and 4.4 releases, this is an absolute priority for Calxeda and the partners we're working with. In particular, we can take advantage of that common, existing management and use those APIs to control and move things around, knowing that we're going to have a production-quality, production-grade hypervisor with the support of the full community. And I can't tell you how important that is as we go out and talk to new users in an existing space.

So in closing, I'll say this is disruption to the data center space on legal steroids — not the illegal kind. We're looking at an efficient core architecture; SoC integration at the pieces that matter, which also enables great innovation and competition in this space — you're not going to see a monopoly in the ARM server space with the kind of competition and capabilities we all bring to the table; open source software, obviously, but also open source hardware, as we think about things like Open Compute; and those changing workloads — like I said, it's the wild wild west early
on those workloads themselves. So where do we go from here? Well, that ARM roadmap — that was the public one; the private one is even more exciting, so there's a lot going on there. ARM is continuing to invest more and more in that architecture as it relates to servers and the data center. The ecosystem is simply accelerating: we've got more and more projects, programs, and capabilities going. And it's all about that workload, with integrated efficiency and optimization that really makes a difference. Questions?

[Audience question about the fabric, off-mic.] Yeah, so it goes up to 10 gig, with about 200-nanosecond latency point-to-point, and we can adjust down to one gig dynamically, based on a policy — a policy from the standpoint of, say, from midnight to six I'm going to ratchet it down just to save power because my demand is lower — or I can optimize certain sections of a cluster based on workload, for example. There are five 10-gig links coming off the part, and in addition it supports 10-gig Ethernet, so you can optionally have Ethernet on any of your nodes as well — think of that as your egress coming out of the system.

[Audience] The core server — the EnergyCore processor — what process is that on, if you can tell me? And is there an FPGA component, or is it all ASIC?

Sure — it's a 40-nanometer process today. Candidly, we've been fairly conservative, because we don't really think the process is what matters; we're optimizing more for high volume from that standpoint. On the FPGA question: no, this is all in silicon; there's no FPGA element at all in any part of the chip.

Time for one more question.

[Audience] Has anyone suggested that your dense server is sort of replacing the need for software virtualization, by the fact that you just have so many processors in a 4U box?
Yeah, so bare metal is certainly an option, and until we had the A15, that was kind of the only thing we talked about, because that was the leverage we had. There are workloads, and there are certainly environments, where people are going to be happy with bare metal and stick with it, and what we want to do is provide that range of support. So we've got bare metal, we've got Linux containers, we've got Xen, we've got KVM — the range of capability is there, and it really comes down to workload and operational environment. If I have a fully dedicated cluster for nothing but Hadoop, it's probably going to run bare metal. But that same user may want the ability to execute that in a private cloud, in a virtualized environment. So it's the flexibility that they want.

Yeah — virtualization, the way I look at it, gives you two things. One is consolidation, which obviously we're used to on big systems. But the other one, which is more relevant to the smaller systems, is abstraction. Even in a world where there's one VM per host — and I know a number of examples of that happening on large x86 systems at the moment — the fact that we've got something between the workload and the hardware gives us hook points for software-defined networking, doing sort of OpenFlow-type stuff, and abstraction around the sort of storage we can use. Typically a bare-metal workload is going to be limited by the local storage available in a node, or what the OS can natively do — like iSCSI boot, for example — whereas if, say, you want to use Ceph, or you want to bring in something that maybe you haven't already got hardware integration for, the hypervisor allows you to use storage much more flexibly. And then there are things like the migration element: even in a Web 2.0 world where your VMs are failure-tolerant, being able to move things around at a finer granularity for load balancing,
maintenance, et cetera — I think that's quite a nice facility. So as long as it's low-overhead, low-cost, I think that abstraction is very, very useful.

Okay, that's all the time we have right now. Let's give Larry a good round of applause.