Okay, I guess it's time. Let's start. Welcome to our talk. It's late in the day, so thank you for still being here and not spending the time on beer and socializing. Our talk is called Baking Clouds, and we did an experiment with Raspberry Pis and Cloud Foundry. I guess you've all seen this slide a couple of times today: in case of an emergency, panic, and there's the exit.

A little bit about us. I'm Ruurd Keizer, the cloud native lead at ITQ. This is Chris; he's a consultant at ITQ. A little bit about ITQ for the American public: we're from Europe, the Netherlands to be exact, it's over there. We are a VMware partner. We design and implement data centers. We don't sell licenses; we're strictly a knowledge partner. We don't sell hardware either. And last year VMware Global awarded us for being the best partner in Europe on that front, so that says something. We're a small consultancy, around 50 people. Apart from VMware, we're a Pivotal partner, so we also design, implement, and help application teams build their Cloud Foundry installations and the ecosystem around it, automating everything around it with Concourse, that sort of stuff. And we're a member of the Cloud Foundry Foundation.

So, back to what we did: we attempted to run the Diego cell on Raspberry Pi 2 and deploy the whole thing with BOSH. The question is: why? Dr. Nic already asked at the unconference why you would subject yourself to such an experience. One of the reasons is this. I like to think of it as the awesome pattern: you do something cool, you tell people about it (that's why we're here), and that could lead to interesting projects, maybe even a new job and a change of career. Another reason is the learning experience. We both had the feeling that we understood Cloud Foundry quite well, but we found out during this project that there was a lot more to know about the intricacies of all the components within Cloud Foundry.

But the actual business reason for this is something else. All the jokes about being awesome aside, we are really interested in running Cloud Foundry on different CPU architectures. You might have heard of a thing called ARM; it's in all your phones. That's basically the CPU we're talking about. And ARM has a nice feature in that it's a lot more energy efficient than a typical x86 processor. We're also seeing this in high performance computing: they're starting to use ARM CPUs simply because you use less power for the same, or slightly less, compute. So we think it's very interesting going forward: saving energy while doing the same thing, or maybe doing more with the same amount of energy.

So here's what we thought we had to do. Let's dive into this. We'll get some Raspberries, right? Then build an IaaS for the Raspberry Pis; sounds easy. Then we create a BOSH CPI for the newly created IaaS. Then cross-compile some packages, and: world domination. These are the points we thought would be the most complicated. Creating an IaaS from scratch sounds kind of complicated, right? It's like creating your own AWS; sounds pretty difficult. Then creating a BOSH CPI sounds really difficult. So we thought we were going to spend a lot of time on those. After that, we'd just cross-compile some packages and be done.

So back in December, Chris told me: we have this functional Raspberry Pi IaaS, it's working, and, well, we were just in time for the talk submission deadline. So I said: let's just submit it. We'll finish it.
The rest is easy, let's do it, what can go wrong? We're almost there. Let's just submit it and we'll see what happens.

So now the question is: does it work, right? We won't let you sit here for half an hour without doing a demo. So we'll just start out with a demo, a tiny one, and then we'll get back to some more demos a little later in the talk. Let's see if this actually works on my internet connection. What we have here is a very simple application I created. The only thing it does is read /proc/cpuinfo, so it gives you all the CPU information of the box it's running on, and it displays that in HTML format. That's it. (I'll show a sketch of roughly what that app looks like at the end of this part.) So I pushed it once to a regular x86 Diego cell, and this is what it shows: we can see it's an Intel Xeon, blah, blah, blah. Not that interesting. Then I pushed the app a second time, but this time it runs on a Diego cell running on a Raspberry Pi. As you can see here, we're running ARMv7, and if you scroll all the way down, you can see it's actually running on the system-on-chip, the BCM2836, which is the CPU in the Raspberry. So this proves it's running on a Raspberry. Or does it? Maybe it's just a web page displayed from somewhere else. So take a look at this; it's rather small, I think, but take a look at the route. It says cpuinfo-arm.apps dot, and then our automated lab domain. So let's check out cf top and let it collect data for a little bit. As you can see, we have two apps running here, cpuinfo-arm and cpuinfo-x86. And if we go into the route stats... where are you... there we go: there are the two routes. So with that, we prove it's actually running on an ARM CPU on a Raspberry. You do have to take our word for it that it's an actual Raspberry. We didn't bring them with us; too much hassle with airport security and everything, so we just left them at home. We'll get back to that.

So, this diagram of layers represents what we had to work through to get this going. At the very top you see the running application. All the layers below that have to work on the ARM architecture, and that starts at the very bottom, where the physical world is: that's the IaaS. From there we work our way up. Well, this sort of looks like a building, so we thought we'd make a nice analogy with an elevator and start by going down to the basement level, where we find the physical world.

So this is what it looks like, and it's currently running at my home, in my home office. It's three Raspberry Pis in nice Lego enclosures, a switch, and on the right-hand side you see our power manager. In order to build an IaaS, you need to be able to turn the things on and off, right? They need to be rebooted and such, so we need an out-of-band power manager. Now, I know a little bit about electronics, just enough to be pretty dangerous, so I basically smashed something together: an Arduino with some power FETs on top and an Ethernet shield, and it switches the five-volt power supply to the Raspberries. That's basically all for the physical world. The rest of the Cloud Foundry machines run on my home lab, which is the tower sitting on top; it's just a vSphere machine, so we run vCenter, the vSphere CPI, et cetera. We only use the Raspberry Pis for the Diego cells, not for the other Cloud Foundry components. Maybe we'll do that later.
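Before we leave the demo behind: as promised, here's roughly what that cpuinfo app looks like. This is a minimal sketch, not the actual demo source; it only assumes the app reads /proc/cpuinfo and binds to the port Cloud Foundry passes in the PORT environment variable.

```go
// Minimal sketch of the cpuinfo demo app: serve /proc/cpuinfo as HTML.
package main

import (
	"fmt"
	"log"
	"net/http"
	"os"
)

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		info, err := os.ReadFile("/proc/cpuinfo") // CPU details of whatever cell we landed on
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "text/html")
		fmt.Fprintf(w, "<html><body><pre>%s</pre></body></html>", info)
	})

	port := os.Getenv("PORT") // Cloud Foundry tells the app where to listen
	if port == "" {
		port = "8080" // fallback for running locally
	}
	log.Fatal(http.ListenAndServe(":"+port, nil))
}
```

Cross-compiled with GOOS=linux GOARCH=arm, a binary like this runs fine on the Raspberry cells.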
All right, so that's the physical world. Let's get back into the elevator and move up a bit. We had to invent some kind of IaaS layer, so I built something in Go. It's called bakery, because: baked Pis. Bad joke, I know. It's available on GitHub if you want to check it out. What it actually does is provide a REST API for the CPI, of course; it runs NFS, TFTP, and ProxyDHCP, next to the bakery software itself. When you request a new Raspberry Pi through the REST API, bakery talks to the power manager over Ethernet and turns the machine on. The machine boots from TFTP and then mounts an NFS root volume, and bakery does some tricky things to make sure it mounts the right NFS volume, et cetera. If you want to know more, it's up on GitHub. It's a hacky project, so there are no tests or anything; it's just one of those things.

Then we needed a CPI to talk to this, right? So that's the next thing we built, and it's also up on GitHub. We used Dmitriy's package for the Go integration. It's a pretty simple CPI, and with that working, we were able to have BOSH talk to Raspberry Pis. Pretty cool, actually. (There's a rough sketch of what a CPI call into bakery boils down to at the end of this part.) The next step after that was somehow getting a stemcell running on top of the Raspberries. And this was supposed to be the easy part.

Before we go into the intricacies, maybe we should show that this is actually working. Yeah, let's do that. Let's go right here. I'll show you the different stemcells; let me zoom out a little so you can read it. As you can see, we have an actual Raspbian stemcell, which is the stemcell I used first. I built it manually; there was no stemcell builder involved for the Raspbian one. It took me weeks to put it together and get it to actually run the Go agent. I learned a lot about what the Go agent, the BOSH agent, actually does; it's a lot more than just launching scripts. It also formats disks and whatnot. That's quite complicated. Then we managed to port the actual stemcell builder to produce an ARM stemcell, and that's what we're using currently. So you see this one: the OS is Ubuntu Trusty, but built for arm32v7, and you can see the nice asterisk next to it, meaning it's actually in use. And if I do this and show all the VMs (I need to zoom out again): here you can see we have a separate availability zone called RPI, and the serial numbers of the Raspberries are used as the VM names in the IaaS. So that's what it looks like from the BOSH point of view.

So, how did we actually get there? From this point on, I could maybe change the title of the talk to "revealing all the dirty shortcuts and x86 dependencies in the stemcell builder and the releases". The stemcell building was kind of a tedious job, because the Raspberry Pi is a nice thing, but it's really, really, really slow. Typically, building a stemcell takes four to six hours, and then you find out somewhere along the way that something went wrong, so you go back, check the error, and repeat the steps. For instance, the stemcell builder starts from a Docker image, and that Docker image is based on Ubuntu, but not on Ubuntu for ARM; it's not as generic as that. So you have to rebuild it from a complete Ubuntu ARM base image. There are some x86 binaries in there that you have to replace or build from source. And there's a very tricky thing I found out: if you run GNU tar in Docker on OverlayFS, you actually get corrupted tar archives. That's a known bug, if you search well enough, so I had to replace it with bsdtar. Apart from that, we had to make some customizations to work on the infrastructure we have: for instance, when we build, we mount a disk over an NFS share, stuff like that.
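Here's that CPI sketch, to give an impression of how thin the CPI is: a create_vm call more or less boils down to one REST call to bakery. The endpoint, payload, and response below are invented for illustration; the real API is in the bakery repo on GitHub.

```go
// Sketch of the CPI-to-bakery interaction: BOSH asks the CPI for a new VM,
// and the CPI asks bakery to power on and boot one of the Raspberry Pis.
// Endpoint and field names here are illustrative, not bakery's real API.
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

type createMachineRequest struct {
	StemcellID string `json:"stemcell_id"` // which NFS root image the Pi should mount
	AgentID    string `json:"agent_id"`    // handed to the BOSH agent on first boot
}

type machine struct {
	Serial string `json:"serial"` // Pi serial number, used as the VM CID
	IP     string `json:"ip"`
}

// createVM is roughly what the CPI's create_vm action does: one POST to
// bakery, which flips the power FET, serves TFTP, and mounts the NFS root.
func createVM(bakeryURL, stemcellID, agentID string) (*machine, error) {
	body, err := json.Marshal(createMachineRequest{StemcellID: stemcellID, AgentID: agentID})
	if err != nil {
		return nil, err
	}
	resp, err := http.Post(bakeryURL+"/machines", "application/json", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusCreated {
		return nil, fmt.Errorf("bakery returned %s", resp.Status)
	}
	var m machine
	if err := json.NewDecoder(resp.Body).Decode(&m); err != nil {
		return nil, err
	}
	return &m, nil
}

func main() {
	// Hypothetical bakery address; in our lab this runs next to the Pis.
	m, err := createVM("http://bakery.local:8080", "raspbian-arm32", "agent-123")
	if err != nil {
		fmt.Println("create_vm failed:", err)
		return
	}
	fmt.Printf("booted Pi %s at %s\n", m.Serial, m.IP)
}
```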
So, the next step. Well, we thought that was going to be easy, because this is all source, and the releases carry a Go package as a dependency. So we just swap in an ARM Go and build everything again; sounds easy, right? Should just build. Yeah. Well, not exactly. It turns out there are also some binary packages in the various releases we used. Going back one slide: this is the list of releases needed to deploy one Diego cell, so that's quite a few releases in one deployment. We thought: just replace the Go package and we're done. Not so easy; we also had to replace a few binaries. Honestly, it wasn't that bad: Envoy isn't really in use by us currently, and the busybox image isn't really needed either, as I found out after going to a lot of trouble getting it into the release. So the only real dependency was jq, which for some reason ships in binary form in the Diego release; I don't really know why. So we had to replace those as well. And then we were able to deploy it and run it, right? So let's throw it in the elevator and see what happens. Yep: errors.

So this is where we got into a lot of x86-versus-ARM compatibility issues. I'll take you through some of them; I won't bore you with all of them. I wrote them up on my blog, ultimate-it.today, so if you're interested, go to the blog; there's a bit more detail there.

One of the first ones we encountered is actually a bit of a weird thing in Go. In Go, an int is an int32 on a 32-bit platform and an int64 on a 64-bit platform, which kind of makes sense, but it was a problem here and there, because we had some overflows and such. Another weird thing is that in the os package, or rather the file system packages in Go, int32 is used for 32-bit ARM instead of int64, even though the platform is perfectly fine supporting int64. A bit weird. For instance, this one: we got an error on UTIME_OMIT, which was an int64 in the original source code. But the number fits fine in an int32, so we just changed it to an int32, and then it was compatible with some comparisons made against the file system package.

Then the ID mapper, which is used to map user IDs and group IDs from the real operating system on the machine to the IDs inside the containers it's running. To do that, it tries to determine the maximum ID it can use, and it turns out that maximum ID overflows an int32. So we had to build a hack around that, which looked like this. We read the number into an int64 first (that's what we added, the part in bold in the code), then we try to convert it to an int32, and if that comes out below zero, it's apparently overflowing, and we just set it to a fixed number we more or less randomly chose. This is a shortened version of the code; the blog has the whole thing, and there are some more checks in there as well. But that's basically what we had to do; a simplified sketch of the idea follows below.
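Here's that sketch: a simplified reconstruction of the overflow guard, not the actual patch (the real one, described on the blog, has a few more checks).

```go
// Reconstruction of the ID-mapper workaround: parse the maximum ID as an
// int64 first, and if it doesn't survive the trip down to int32, clamp it
// to a fixed "big enough" value instead of overflowing.
package main

import (
	"fmt"
	"strconv"
)

const maxSafeID int32 = 1 << 30 // the fixed number is arbitrary, like ours was

func parseMaxID(s string) int32 {
	wide, err := strconv.ParseInt(s, 10, 64) // read into an int64 first
	if err != nil {
		return maxSafeID
	}
	narrow := int32(wide)
	if narrow < 0 || int64(narrow) != wide { // went negative: it overflowed 32 bits
		return maxSafeID
	}
	return narrow
}

func main() {
	fmt.Println(parseMaxID("4294967294")) // overflows an int32: clamped
	fmt.Println(parseMaxID("65534"))      // fits: used as-is
}
```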
Then onto Guardian. So we kind of had it running: all the software came up, it compiled, and everything seemed fine. But then we tried to start a container, so we pushed an app, and before we even got to that point, the containers kept dying somehow. I think it took me two weeks to find out why. It turns out it was seccomp, so something to do with security; long story. There is a seccomp policy, which is set by Guardian, I think, and I found some references to x86 inside that policy. I made a brief attempt at replacing those with ARM references; that didn't work. So here's the solution: just omit the whole policy. Worked fine. It only took two weeks of my life to figure out where to set the nil. And sure, if you want to run this in production, it's probably a bad idea, but it works for now, right?

Then everything was running and we could push containers, but Metron kept crashing. That was actually a really neat bug, just the kind of low-level thing I like, and luckily I found the solution in the Go documentation. What it says is that the first word in a variable or in an allocated struct, array, or slice can be relied upon to be 64-bit aligned. The error I got was a segmentation violation, meaning something was not 64-bit aligned. Here's the code that caused the problem: it did an atomic operation on these two int64s, but because they're not at the top of the struct, they're not 64-bit aligned. And here's the solution, and it actually works. This pattern occurs in a few places in the code; I did a search and replace, and all was fine. It seems to be working, actually. So yeah, I liked that one. There's a small illustration of this alignment trick at the end of this part.

So, back to Ruurd. Yeah, so everything is working, we go up one level, and we end up at the stack. If you want to run a container inside Cloud Foundry, it needs a root file system, and within the Cloud Foundry context that's called the stack. It's built from a Docker file system, BOSH-deployed, and copied to the container's root file system on first use. This was actually pretty easy; we just had to change a couple of x86 binaries, and all was well.

Then there's a final part to getting a container to run. You don't only need the root file system; you also need an initial process for the regular buildpack lifecycle. There are a couple of those initial processes (I'm not sure about the exact names): one is responsible for staging, another for running, et cetera. That dependency is called the app lifecycle, or, as it's documented, "the bit that takes some bits, meshes them together with other bits, and produces a new bit". So that's completely clear, right? This is actually an interesting one, because it's part of the Diego release, but it's not installed on the Diego cell directly. It's deployed to the file server and loaded by the container on first use. The container is an ARM process, so this piece also has to be ARM, but it has to be deployed to the file server as part of the Diego deployment, and the rest of our Cloud Foundry was all x86. So we had to cross-compile it for ARM from the x86 side, and then all was fine.
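And here's the promised illustration of the alignment trick from the Metron fix. The struct below is invented, but the rule is the real one from Go's sync/atomic documentation: on 32-bit platforms, only the first word of an allocated struct is guaranteed to be 64-bit aligned, so any int64 you touch atomically has to sit at the top.

```go
// On 32-bit ARM, sync/atomic operations on an int64 crash unless the value
// is 64-bit aligned, and Go only guarantees alignment for the first word of
// an allocated struct. Moving the counters up is the whole fix.
package main

import (
	"fmt"
	"sync/atomic"
)

type counters struct {
	// These must come first: fields are laid out in declaration order, and
	// only offset 0 is guaranteed 64-bit aligned on 32-bit platforms.
	sent     int64
	received int64

	name string // smaller fields go after the atomically-updated int64s
}

func main() {
	c := &counters{name: "metron"}
	atomic.AddInt64(&c.sent, 1) // would crash on arm32 if sent weren't aligned
	fmt.Println(atomic.LoadInt64(&c.sent))
}
```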
So let's push it. Will it blend? No. There was one final hurdle to take, and it was called nstar. Funny story, actually: I saw the exact same name, I think, on the manhole covers in Cambridge this week. Yeah. So, nstar. I'm not really sure what it is, and if you read the source code, I'm not sure the developers are really sure either. In the code there's something about jumping through some hoops and doing some magic with namespaces. It stands for namespace tar, I think; it makes sure you can get data into and out of a namespaced container, basically. And it kept crashing on us with this error: execve, invalid argument. Sounds complicated. This was the point where I thought: hmm, that sounds really low-level, I'm not sure we're going to fix this. Until we found this in the source code. That's a define, and it turns out to be a syscall number. I was in Slack with Ruurd, we did some quick googling for syscall number tables, and we found the syscall number for arm32v7. Changed it in the source code, recompiled: works. I wasn't expecting that, so that was really nice. After spending a lot of time on very simple things, this one seemed very complicated and was fixed within an hour. What's less nice is that it's also something you can never do a pull request for, because it would change the number for everyone, and then it would never work anywhere else, unless you build some logic in at compile time, I guess. I don't know how to do that yet; we'll see.

So, final step: let's push an app. At this point we had done so many things already that we didn't feel like also porting a buildpack, so we just used the binary buildpack. Because that was easy, yeah.

So it's time for a few more demos, I guess, if we still have time; I think we have five minutes left. We already showed a running app, so let's see if we can push something. We have another demo app, it's called cfmeow, and you'll see why. So let's push this, fingers crossed it actually works. Hopefully we don't run out of resources, because that happens pretty easily on the Raspberry. That's one thing that wasn't in the slides: there's actually a setting in the deployment manifest where you can specify the amount of RAM the rep reports back to the brain, so to speak. I had to set that to one gig, because the Raspberry Pi reports something like 980 MB max, and that's just not enough to push an app; it simply refuses to run the buildpack and keeps saying "not enough resources". So I forcefully set it to one gig, and it actually works. Same for disk space, by the way: just report more disk space than there actually is, and it works.

All right, so that was a cf push to an ARM cell. Let me switch to the browser; let's see if you can see it. cfmeow. Well, that's why it's called cfmeow; now you know. It uses the Cat API? Yeah, thanks to the Cat API; I didn't build that myself. It also reports the CPU model, and as you can see, ARMv7; works fine. And we can actually scale this too. It reports the instance number and the instance ID, so let's see if we can run more than one instance. Let's run three: three Raspberries, right? And let's start cf top and see... there we go, they're spinning up. It takes a little longer than on a regular cell, but then it's only about a hundred times slower, so that's to be expected. But they've started; it's just a very simple Go app, so there's not a lot to start there. Refresh the page: well, there's more than one, and they all report ARMv7. Nothing new here; it just does its job.

Now, there's one more cool thing I can demo, just because it's cool. We also have an x86 cell running, so I can push the same app to x86. I should have created a manifest for that; did I? Yeah, I did. Let's push, using the manifest for x86, and that one actually does use the Go buildpack. So for ARM we use the binary buildpack, and now I push the same code, but using the Go buildpack, to my vSphere cells. Then you'll see that we can load balance between code running on ARM and code running on x86, which is kind of cool, I think, because it shows that you can run on multiple platforms and easily migrate, as long as your code is compatible.
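As an aside, here is roughly what an app has to do to report the instance and CPU information you see in this demo. CF_INSTANCE_INDEX and CF_INSTANCE_GUID are standard Cloud Foundry environment variables; the rest is a sketch, not the actual cfmeow source (cats not included).

```go
// Sketch of a cfmeow-style app: report which instance you hit and what CPU
// that instance runs on, so you can watch requests bounce between cells.
package main

import (
	"fmt"
	"log"
	"net/http"
	"os"
	"strings"
)

// cpuModel pulls the "model name" (x86) or "Processor" (older ARM kernels)
// line out of /proc/cpuinfo.
func cpuModel() string {
	data, err := os.ReadFile("/proc/cpuinfo")
	if err != nil {
		return "unknown"
	}
	for _, line := range strings.Split(string(data), "\n") {
		if strings.HasPrefix(line, "model name") || strings.HasPrefix(line, "Processor") {
			if _, after, ok := strings.Cut(line, ":"); ok {
				return strings.TrimSpace(after)
			}
		}
	}
	return "unknown"
}

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		// Cloud Foundry sets these for every running app instance.
		fmt.Fprintf(w, "instance %s (%s) on %s\n",
			os.Getenv("CF_INSTANCE_INDEX"),
			os.Getenv("CF_INSTANCE_GUID"),
			cpuModel())
	})
	port := os.Getenv("PORT")
	if port == "" {
		port = "8080"
	}
	log.Fatal(http.ListenAndServe(":"+port, nil))
}
```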
And I think most regular application code will be compatible with ARM. If it's Go code or Java code, it will run fine on ARM; it won't be a problem. Stop pushing. There we go. All right, let's refresh a few times. There we go: one instance running on Intel. Refresh again: an ARM instance. Refresh a few more times: and back to Intel. So that shows we can run them mixed in one environment. It takes some effort, but it actually works.

Now, this is all fun and games, basically, but we really want to draw some attention to Cloud Foundry running on multiple platforms. Cloud Foundry itself, with a few modifications, will run fine on 64-bit ARM, I think, because most problems we encountered were 32-bit versus 64-bit incompatibilities. So if you run this on ARM64, I think you'll only hit a few minor problems. Most of the work, I think, has to be done on the BOSH side: supporting multiple platforms, being aware of multiple platforms, so you can have compilation workers for multiple architectures, that kind of thing, without changing your cloud config for every run. We'd like to continue working on this, maybe porting more things to ARM. So if you happen to know somebody with a huge ARM machine, a 96-core thingy, we're open to sponsors; let us know. And if you know a cloud provider that wants to offer ARM-based virtual machines, or actual physical ARM hardware, let us know too; we're interested in building a CPI for that kind of thing.

So I think we're out of time, right? Yeah, I think so. Are there any questions? Quick questions? If not... ah, there's one. [Question from the audience.] No, actually not. I did have brief contact with, I think, the BOSH team when I was stuck on the stemcell thing, but I finally figured that out on my own. From the other teams, no; we just put in our own time and managed to figure it out. Yeah, which was a very good learning experience. All right, that's all. Thank you very much.