Test, test. All right, I want to thank everyone for coming. We're starting off the openSUSE Summit today, and we'll be in here the entire day with various different talks. I'd like to start off the summit with Harry Schoen, and he'll be talking about... OK, I don't know what this button does. If you have any questions, please come and talk with me. And if you registered on events.opensuse.org, please come see me later, all right? OK, I can't control this other one. This is not from my notebook; this is the site one or whatever. So this is my presentation. I did want to post up the abstract again. My name is Harry Schoen. I'm from Intel Corporation. I'm part of what used to be the UEFI team, the team that actually built and constructed UEFI back since the early days of Itanium. This session is really about network boot, how we do network boot in UEFI, and we'll talk about it in detail. Let me switch back to the presentation here. This talk is part of our push to get the open source community using the network stack that we've put inside the firmware, which should show up in most of the PCs, both server and consumer systems, that you have these days. We've had the code out there for quite a while, so we would like to evangelize to the public and the open source community: get off of the old PXE boot or network boot constructs that have been around since the early PC days and move over to the network stack that we've made available on TianoCore. It has been available there for a very long time, but we're still trying to get the open source community to come out and actually use it. There are a variety of reasons why you may want to use this particular network stack from the pre-boot space, and we'll cover those in this talk. As part of our agenda, we'll go over the current state of how we network boot from pre-boot or firmware space today, and the limitations of that legacy model. Then we'll talk about the more modernized UEFI network boot and how that works, the basic network stack that appears in the source code and in the spec, and then some use cases for why you may want to use it. There are both security and functional reasons, and then a call to action: the reason I'm here is to try to get the open source community to actually utilize this network stack in the deployment of your Linux distros and server software, whatever particular OS or backup software you're trying to write. My role is typically to evangelize with the open source community: to get people to learn and use the UEFI ABI interfaces in actual products, to reach out to the developer community to pay attention to what we put in the UEFI specification on uefi.org, and to get you actually working on real open source projects and code that utilize what we put up on tianocore.org in the EDK2 project. So we'll first go over the current state of the art, what's shipping out there today, what you find in typical server systems and typical client consumer systems on the market, and then move on to the stuff we put into UEFI. This is sort of a history lesson. The current state of what the industry uses today is something that was invented at Intel under the Wired for Management specification.
So originally PXE boot was designed by IAL, part of our architecture group, to put another network connection into a server or enterprise-class system that would give you a backend connection into the system, so that if you booted to that connection on a private network you'd be able to deploy the OS: bring a bare-metal system up from nothing, nothing on the storage, nothing in the system, to having an OS or whatever firmware or software you wanted to deploy on that system. In the early days there was typically a NIC card, and Intel makes NICs, right? You would plug that into the system, or lay it down on the motherboard, and be able to boot to your private network connection in your enterprise, your data center, or a private network, to deploy the system. The Wired for Management spec was released back in the 90s to the manufacturers, the OEMs in the PC industry, and people have been making these 16-bit UNDI option ROM NIC cards for years. You still see that today on a lot of the NIC cards out there for x86, PowerPC, and other architectures. So it's been around for a very long time, and all the infrastructure that you see today uses this to boot. A lot of cloud computing still has deployment software that uses this legacy PXE. What we call legacy PXE is 16-bit option ROM code that your PC runs. On your typical PC, you may have a special key in your BIOS, F8 or F10 or something like that. If the physical user presses that key while booting through the BIOS, then you come up and launch the option ROM that was dispatched during POST, start that network stack, and boot to it. So your local BBS or your BIOS setup will have a network boot option which boots PXE on the system. Now, that requires some infrastructure to be put in place, and this is a typical setup you'd have on your little private network. There'd be a DHCP server that you have to configure specifically for PXE boot. It has to advertise where the PXE server is on the network, and also advertise the architecture type of the network boot image that you're going to post, whether it's IA32, x64, Itanium, or ARM64; whatever it is, that needs to be in the DHCP packets to notify the target boot device, when it asks for an IP address, what is available out there if you're going to PXE boot. There's also the PXE boot server, which you need to deploy. That's typically what a lot of Linux systems do: you have a deployment server up on your network, and then, in a very simple topology, there's just one switch and a bunch of client target systems, servers or client notebook systems or whatever you have on your network, booting to your PXE targets. Up in the right-hand corner, this is taken out of the Wired for Management spec. These are the exact steps between the DHCP server and a client when the BIOS says, go run the option ROM and do a PXE boot from POST. What do you actually do when you go off and do a PXE boot? Step one, you go out to the DHCP server and say, get me an IP address. Then there's an ack that comes back and says, okay, here's your IP address. And then when I say I want to boot, I send a bunch of packets, ARP packets, saying: hey, do you have PXE services? If you do, please acknowledge that, tell me where that server is on the network, the IP address, and also advertise what architecture option ROMs are available on that system.
If it's IA32 or x64, x86 or ARM or PowerPC or whatever, then you come back, and there's a chooser program on your system that determines which one of those images, which NBP or network boot program, you're going to execute off the PXE server, and then the PXE server starts a TFTP copy of the NBP down onto your system. So, in a nutshell, that's the typical sequence you'd see on the private network: you go off and communicate with a PXE server somewhere on the network, and your BIOS POST hands control off to it on your particular system. Typically, if you're installing an OS like openSUSE or some Linux distribution, or Windows, WinPE or something like that, this is the sequence you'd go through to actually run the deployment software, the install software, the setup program that goes off and installs the OS on the system. At the bottom of the page there: once you load the NBP or the installation software, the OS, then the kernel runs, the local OS drivers take over the system, and you're running Linux or the deployment software, whatever that package is, and the BIOS is really not doing any of the actual work of pulling anything off the network. Once you hand control off to YaST or Anaconda or whatever your favorite installation program is, the BIOS is not in the loop; the OS native drivers and kernel have taken over and are providing all the network services to do the rest of the installation, pulling over the necessary files to install the OS. There's also something called iPXE. If you talk about enterprise servers, a lot of people run iPXE, where you take the iPXE option ROM, put it on the NIC card or include it in your system as a plug-in card, or flash it into the BIOS on the system. It has an additional scripting language which then allows you, on a more complicated network, to find various deployment servers and hunt down the deployment software you want to deploy onto your blade or particular system. This allows enterprise systems to have no local storage on the compute blade or whatever: you can just PXE boot or iPXE boot an enterprise blade off the network and then pull all of your necessary boot files for the OS off of SAN storage or iSCSI storage, whatever the storage in your particular data center is. So that's sort of the state of the art; this is how most systems work today. Remember, this was designed in the 90s, long before virus scanners were invented and before people worried about rogue things running on your particular network. So when we talk about a network boot environment today, a zero trust environment, we're talking about booting in an environment that is not necessarily friendly. Back then, the networks were separated; typically there were internal networks or VPNs. You weren't sitting on the open internet, or on a hostile network that may have access points or Wi-Fi or things like that, where you may not have physical control over all of the elements on your network. So there are some basic limitations with the current PXE model today. The first: does it scale to large networks? Say I have 1,000 or 10,000 servers in a network, and they all turn on and all do their PXE boot simultaneously.
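Before getting into those scaling problems, the sequence just described can be made concrete. Below is a minimal sketch, assuming an EDK2 build environment, of how a UEFI application would drive that same DHCP/discover/TFTP exchange through EFI_PXE_BASE_CODE_PROTOCOL, the UEFI-side equivalent of the legacy flow. The protocol and constants are from the UEFI spec; the flow shown, with error handling trimmed, is illustrative only:

```c
/* Minimal sketch: drive the PXE DHCP/discover/TFTP sequence from a UEFI
   application via EFI_PXE_BASE_CODE_PROTOCOL. Builds as an EDK2
   UefiApplication; error handling is trimmed for brevity. */
#include <Uefi.h>
#include <Library/UefiBootServicesTableLib.h>
#include <Protocol/PxeBaseCode.h>

EFI_STATUS
EFIAPI
UefiMain (IN EFI_HANDLE ImageHandle, IN EFI_SYSTEM_TABLE *SystemTable)
{
  EFI_PXE_BASE_CODE_PROTOCOL  *Pxe;
  EFI_STATUS                  Status;
  UINT16                      Layer = 0;

  /* Find the first NIC handle that exposes the PXE base code stack. */
  Status = gBS->LocateProtocol (&gEfiPxeBaseCodeProtocolGuid, NULL, (VOID **)&Pxe);
  if (EFI_ERROR (Status)) {
    return Status;
  }

  Pxe->Start (Pxe, FALSE);   /* FALSE = use the IPv4 stack                */
  Pxe->Dhcp (Pxe, TRUE);     /* steps 1-2: DHCP discover/offer/ack        */

  /* Steps 3-4: ask who offers PXE boot services and pick a boot server. */
  Status = Pxe->Discover (Pxe, EFI_PXE_BASE_CODE_BOOT_TYPE_BOOTSTRAP,
                          &Layer, FALSE, NULL);

  /* Step 5: the NBP named in the reply would then be pulled over with
     Pxe->Mtftp(..., EFI_PXE_BASE_CODE_TFTP_READ_FILE, ...) and handed
     to gBS->LoadImage()/StartImage().                                   */
  return Status;
}
```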
If you look at the underlying protocol, we use ARP and UDP packets to do all of this information exchange between the DHCP server and the PXE server, and you'll find on most networks that when you have thousands of servers each individually asking for that exchange we just talked about, you'll bring the network to its knees very quickly. You typically only get 60% of the bandwidth on a one-gig or 100-meg connection, right? So figure out the size of the network boot file that you're pulling. In Linux, GRUB 2 is only a meg or whatever, but when you start looking at the actual deployment file that you need, it may be 160 meg, or in the Windows case it could be several gigabytes. Pulling that initial file over your network when you have thousands or tens of thousands of systems doing it, you'll find that you simply can't sustain that bandwidth. So you have an underlying scaling problem: in a large network, the typical switches and routers are not designed for scaling out large amounts of UDP and ARP traffic. In practice, we found most people have to bring the packet size down to the 802.3 packet size, roughly 1,536 bytes or whatever it is, to get it through all the switches and routers in a more complicated network. So now you have to ACK and NAK individual packets that are about 1.5K each across an entire network, transferring anywhere from a one-meg to a 160-meg file with TFTP. This is not very usable in a large environment; I don't hear about large data centers using this to scale out and boot tens of thousands of servers on an individual network. We hear about people switching the packet size to 32K to get rid of all the ACKs and NAKs. There's also fast spanning tree, which Cisco and the other switch and router vendors implement, which gives you the fastest path through the network, and those devices typically do not route UDP and ARP packets efficiently using that method on the first boot. It typically takes several boots for the routers and switches to figure out where your client system is in the network. So there are a lot of limitations. The networks were not designed for UDP; they're designed more for web traffic, TCP/IP, and that's part of the point about the network stack. If you're in a corporate network, or a network open to the public with hostile elements, or a network where IT doesn't have control over every system that plugs into the network or the Wi-Fi, you have to modify the DHCP server to do that initial advertising of where the PXE servers are, and to do the load balancing of which servers end up responding to the PXE queries. Typically, the PXE servers advertise to the DHCP server that they're on the network, and whichever one is closest and responds first to the client wins. So you don't deterministically get to choose which PXE server will respond to the request from that UNDI option ROM or that initial PXE boot from the BIOS; it's just whoever answers first. On a network that is not balanced or tuned, that can be a severe problem. You also have a security issue in that anybody can create a PXE server, and there's nothing that will stop you from booting to that rogue PXE server on a network. You could even put it over Wi-Fi if you wanted to, right? So there are a lot of inherent security issues that were never anticipated.
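A back-of-envelope calculation makes the earlier scaling claim tangible. This small host-side C program uses the talk's rough numbers (60% utilization of a 1 GbE link, ~1.5K TFTP blocks, a 160 MB deployment file) plus an assumed 0.5 ms ACK round trip; the exact figures are illustrative, not measured:

```c
/* Back-of-envelope estimate of lock-step TFTP deployment cost.
   Inputs are the rough numbers from the talk plus an assumed RTT. */
#include <stdio.h>

int main(void) {
  const double payload   = 160e6;          /* deployment image, bytes      */
  const double blocksize = 1468.0;         /* TFTP payload per ~1.5K frame */
  const double rtt       = 0.5e-3;         /* assumed ACK round trip, s    */
  const double rate      = 1e9 * 0.60 / 8; /* ~60% of 1 GbE, in bytes/s    */

  double blocks = payload / blocksize;     /* each block individually ACKed */
  double wire   = payload / rate;          /* pure serialization time       */
  double stalls = blocks * rtt;            /* lock-step ACK wait time       */

  printf("blocks to ACK: %.0f\n", blocks);
  printf("one client: %.1f s wire + %.1f s ACK stalls\n", wire, stalls);
  /* With N clients sharing one uplink, the wire time alone scales by N. */
  printf("1000 clients, serialization alone: %.0f s (~%.1f h)\n",
         wire * 1000, wire * 1000 / 3600);
  return 0;
}
```

Even under these generous assumptions, the per-block ACK stalls dominate a single client's transfer, and serialization alone balloons once thousands of clients share an uplink, which is the scaling wall described above.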
The original Wired for Management spec, as I mentioned before, was designed for a private network that you would add to a server or a client system, separate from the main network that you would normally use in your enterprise infrastructure. So there are a lot of issues with the legacy model: it was designed in a day when security wasn't a concern, and the PXE services and that sort of thing were done with very small images that didn't necessarily scale to large numbers of systems. These are the basic scaling and security limitations we were looking at when we started inventing the UEFI network stack, so keep them in mind as we talk about the next thing. Here's the zero trust network statement. Some of the marketing people and evangelists talk about how we have all these security teams now reviewing all of the firmware code as a standard process within Intel and a lot of our customers. They pulled this quote out of CSO Online: you can't assume, when you PXE boot on a network that you plug into these days, that this is an environment you should trust implicitly. Letting any PXE server take over your system is like walking out into the street and saying, hey, do you want to help me set up my network and the OS installation on my systems? Asking the first person who comes along and responds to you is probably not the smartest thing to do, security-wise. So now we want to add some additional trust anchors and find a security model, a network stack, that allows us to apply some sort of security to the modern systems we're producing today. Okay, so starting from UEFI: we took a look at that network stack early on, in the 2008-2010 days, and decided that maybe it'd be a better idea, looking at the OS network stack, to adopt a stack similar to what the internet uses today, the TCP/IP stack that's been around for a very long time. So in the specification we created a UEFI networking model; there's a whole section of the spec dedicated to it. The first thing we did was duplicate PXE boot: we do the same thing as legacy PXE on the wire, except we do it with UEFI drivers, using the UEFI driver model and API interfaces instead of the legacy BIOS. And from the very beginning of UEFI we also added secure boot. There's the SEC phase, which allows you to boot the firmware securely, and there are choke points within UEFI where you can check the images and make sure they're signed before you blindly hand off control to something you're executing in the pre-boot space. Then, in addition to duplicating PXE, we also ported HTTP, or HTTPS, to our network stack. We've had that since UEFI 2.5, or a little bit before that. So you can make a secure connection out to the network over a URL, with DNS services added. So let's take a look at the new network stacks that were added, starting from UEFI 2.3.1. In 2.3.1 we added UEFI PXE boot, duplicating on the wire what is done in legacy PXE, except now through UEFI. On the left-hand side you can see the original IPv4 stack; that's what all the legacy PXE does today.
This is the new network stack that was released with 2.3.1, and it duplicates the PXE boot services you see on the wire from a typical PXE server. The only thing we had to change was in DHCP: in the advertisement, we had to add some new IANA architecture types so the boot image could be UEFI IA32, x64, or Itanium, versus the legacy architecture images. That way you know the NBP coming over is a UEFI image rather than a legacy executable, an IA32 or x64 executable. These network drivers were all included in the initial network package on TianoCore and are implemented in most of the system BIOSes today. So your typical Linux or Windows distribution would consume this UEFI PXE network stack to install the OS or do a PXE boot from a PXE server somewhere. One of the things we added in the initial 2.3.1 implementation was IPv6. We created a parallel stack: in addition to the native UNDI driver, the NIC UNDI driver that is ported over to UEFI and produces the SNP/MNP stack, we added an IPv6 stack. I had that taken to UNH, the University of New Hampshire, and run through the full IPv6 certification tests at the IOL. So we have both an IPv4 and an IPv6 network stack, and depending on what you chose in the BDS in your particular UEFI BIOS, you could pick which stack, which option in the boot option list, you wanted to boot through. The reason we added IPv6 was that the server vendors and a lot of the data centers and enterprise clients came back and said: we're moving our internal networks to IPv6; we've run out of IPv4 addresses. When we talked to the storage guys, the people making large SANs and Fibre Channel storage and that sort of thing, they said: we have thousands of LUNs on the network, each with its own World Wide ID. We've run out of IPv4 addresses in the data center if every single drive has its own IP address, right? So in a large data center with tens of thousands of drives, IPv4 is simply not large enough to handle all the possible boot targets out there. They asked us to put IPv6 in so that we could address all of the storage in a typical data center. So that was the first implementation in UEFI: migrating over the existing PXE model that all of the deployment software used back in the day. This is the stack you can typically get today in most server systems; it's what ships in most HP, Dell, Lenovo, and IBM systems you can purchase. In addition to PXE, we also added iSCSI at the request of the enterprise server people, because those SANs typically have a World Wide ID per LUN, and the storage sits in racks separate from the compute blades or nodes. They wanted a way for us to name the actual LUN, the World Wide ID of an iSCSI or physical SAN storage device out in the data center somewhere, rather than going through legacy PXE. So this was the answer at that time, back in the early 2.3 days when UEFI was first adopted by Windows and Microsoft, and then by the Linux distros as well. This is the first instantiation that we implemented in the firmware stack and got into most of the servers out there.
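As a concrete illustration of that UNDI/SNP/MNP layering, everything above the NIC's UNDI driver is generic, shared code. Here is a hedged sketch, assuming the EDK2 build environment, of a small UEFI application that enumerates the NIC handles the shared stack publishes, via the Simple Network Protocol that sits directly on top of UNDI; the protocol calls are standard UEFI APIs, while the program itself is just an example:

```c
/* Sketch: list the NICs the UEFI network stack has bound, by walking
   every handle that carries EFI_SIMPLE_NETWORK_PROTOCOL (SNP). */
#include <Uefi.h>
#include <Library/UefiBootServicesTableLib.h>
#include <Library/UefiLib.h>
#include <Protocol/SimpleNetwork.h>

EFI_STATUS
EFIAPI
UefiMain (IN EFI_HANDLE ImageHandle, IN EFI_SYSTEM_TABLE *SystemTable)
{
  EFI_HANDLE  *Handles;
  UINTN       Count, Index;
  EFI_STATUS  Status;

  Status = gBS->LocateHandleBuffer (ByProtocol, &gEfiSimpleNetworkProtocolGuid,
                                    NULL, &Count, &Handles);
  if (EFI_ERROR (Status)) {
    return Status;   /* no NIC with a working UNDI/SNP binding */
  }

  for (Index = 0; Index < Count; Index++) {
    EFI_SIMPLE_NETWORK_PROTOCOL *Snp;
    gBS->HandleProtocol (Handles[Index], &gEfiSimpleNetworkProtocolGuid,
                         (VOID **)&Snp);
    /* The MAC address comes straight from the NIC's UNDI driver. */
    Print (L"NIC %u: MAC %02x:%02x:%02x:%02x:%02x:%02x\n", (UINT32)Index,
           Snp->Mode->CurrentAddress.Addr[0], Snp->Mode->CurrentAddress.Addr[1],
           Snp->Mode->CurrentAddress.Addr[2], Snp->Mode->CurrentAddress.Addr[3],
           Snp->Mode->CurrentAddress.Addr[4], Snp->Mode->CurrentAddress.Addr[5]);
  }

  gBS->FreePool (Handles);
  return EFI_SUCCESS;
}
```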
Typically, client systems don't have iSCSI; network boot is not a normal mode of booting on a client system. So it was really more the enterprise server vendors that went out and implemented this particular piece of the stack. From here, you can see in the red dotted line area the chunks of UEFI DXE drivers, as we call them, in the network package that you'd have to implement and port over to your particular BIOS. All of these pieces are standard; they don't need to change from system to system. The only component you'd need to change is the UNDI driver for the particular NIC card you're booting off of. This basically eliminated the major ports that you'd typically have to do on different platforms: all you had to do was find a NIC card, or a NIC manufacturer that would post their UNDI driver for UEFI in x64 format, and have that included in your server's system BIOS. So this standardized a lot of the stack and eliminated a lot of the porting work. Just go to your local NIC vendor, get their UNDI driver included in the network stack in your particular firmware, and you have the full stack that enables you to boot UEFI PXE or iSCSI on your system. So, as to the models we used: like I said before, a lot of the UEFI PXE boot and iPXE boot was over IPv4 or IPv6. That was the main migration in that period of time. We still had all the same limitations we just talked about; the scaling and security issues were not solved. This was just getting people to move over to UEFI first. The only added security feature we had at this time was UEFI secure boot: when we got the NBP image off the PXE server, we would check whether it was signed, and if you had secure boot turned on and had the cert in your system's db, we would check the NBP image from the PXE server and make sure it was signed and matched the key in the db for UEFI secure boot. Typically that was the Microsoft key, or the UEFI CA key for the shim that's signed for your particular Linux distro. We would check that image, and we would not hand off control to the NBP unless we saw that signature on the NBP file handed off from the PXE server. So we had some really basic security there. At least you can develop a chain of trust where, in the firmware, you're checking the image you're handing off to, as opposed to legacy BIOS, where there was no checking at all; you just blindly executed whatever was handed off from your PXE server. It didn't establish a chain of trust in whose PXE server you're booting from, but at least the image you're executing off of the PXE server is checked for a signature from a trusted source that you want the system to execute. In UEFI 2.5, we started moving forward on the scaling concerns, using a more modern network stack and getting away from PXE servers. There was a request: why can't you boot off of a web server, a standard URL, so I don't have to alter the DHCP services at the front end of the network?
Most IT departments won't let you change or alter the DHCP server; they have to keep track of all the PXE servers in the network and decide whether they want to allow PXE servers to boot network images on the network at all. And with the growth of all the networks and Wi-Fi, it's very difficult to control who's on your network, so that's not really a good thing. Moving from PXE to a standard web server, booting off HTTP at a URL or an IP address, would be a little bit safer, in that from the user boot options I can pick the specific URL and control where the NBP is coming from: boot a specific file off a known web server that you trust, or one that's hard-coded into the BIOS if it's for servicing. For the newer network stack, we looked at the OS driver stack and started pulling over the pieces that we needed to implement, basically TCP/IP. If we look on the right-hand side over there, we still have the standard DHCP services that we typically use to get an IP address off a DHCP server, but now all we care about is getting a normal IP address. So you do your standard static or dynamic DHCP acquisition of an IP address, and from there we implemented DNS and the underlying TCP and HTTP stack you need to talk to a web server on the network. We spent a bit of time getting that working. Initially it was HTTP without the S, without the TLS cert; we'll talk about that a little later. At a minimum, we wanted to be able to boot off of a specific URL or web server somewhere on the network. You don't need to alter the DHCP server to advertise where the HTTP servers are on the network. You can do that if you want to, and we have provisions to duplicate the PXE service model, but you can also just boot straight off an HTTP or URL server if you don't want to ask the DHCP server where the HTTP server is. That requires, on the left-hand side, some of the basic HTTP driver stack, and we put that on tianocore.org in the EDK2 project in the UEFI 2.5 timeframe, around 2015, and we've been advertising that we can do this on the network ever since. Once again, we reused some of the basic drivers that were originally in the network stack. We used the same UEFI UNDI driver, particular to your NIC, hard-coded to your NIC. Then we have the same SNP and MNP driver stack at the bottom, shared code that we reuse again in our network stack. This allows us, without going back and having everybody rewrite all of their firmware code for their NICs, to reuse as much of the network stack as we could from PXE and just add the upper layers needed for HTTP boot in the system BIOS, rather than going back to every card vendor on the planet and having them rewrite their network stack code for their NIC. So this reuses the existing code: once a vendor has moved over to UEFI, they can just add the new code for TCP/IP and HTTP, and you're good to go to boot off of HTTP. The environments we're looking at, obviously, are the data center and the corporate enterprise server environment, and then also your home network.
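Under the hood, the HTTP layer just described surfaces to applications as EFI_HTTP_PROTOCOL (UEFI 2.5 and later). The sketch below, again assuming the EDK2 environment, shows the shape of an HTTP GET for an NBP; the URL and address are placeholders, and real code would also handle DNS resolution, TLS configuration for HTTPS, and the full response:

```c
/* Hedged sketch: issue an HTTP GET over EFI_HTTP_PROTOCOL. Each HTTP
   instance is a child of the HTTP service binding protocol. */
#include <Uefi.h>
#include <Library/UefiBootServicesTableLib.h>
#include <Protocol/Http.h>
#include <Protocol/ServiceBinding.h>

STATIC volatile BOOLEAN mDone = FALSE;

STATIC VOID EFIAPI
RequestDone (IN EFI_EVENT Event, IN VOID *Context)
{
  mDone = TRUE;   /* signaled when the HTTP driver completes the request */
}

EFI_STATUS
EFIAPI
UefiMain (IN EFI_HANDLE ImageHandle, IN EFI_SYSTEM_TABLE *SystemTable)
{
  EFI_SERVICE_BINDING_PROTOCOL *HttpSb;
  EFI_HTTP_PROTOCOL            *Http;
  EFI_HANDLE                   Child = NULL;
  EFI_STATUS                   Status;

  EFI_HTTPv4_ACCESS_POINT Ap   = { TRUE, {{0}}, {{0}}, 0 };  /* DHCP-assigned */
  EFI_HTTP_CONFIG_DATA    Cfg  = { HttpVersion11, 0, FALSE, { &Ap } };
  EFI_HTTP_REQUEST_DATA   Req  = { HttpMethodGet,
                                   L"http://192.168.0.10/bootx64.efi" };
  EFI_HTTP_HEADER         Host = { "Host", "192.168.0.10" };
  EFI_HTTP_MESSAGE        Msg  = { { &Req }, 1, &Host, 0, NULL };
  EFI_HTTP_TOKEN          Token = { NULL, EFI_SUCCESS, &Msg };

  Status = gBS->LocateProtocol (&gEfiHttpServiceBindingProtocolGuid,
                                NULL, (VOID **)&HttpSb);
  if (EFI_ERROR (Status)) {
    return Status;            /* platform firmware lacks the HTTP stack */
  }
  HttpSb->CreateChild (HttpSb, &Child);
  gBS->HandleProtocol (Child, &gEfiHttpProtocolGuid, (VOID **)&Http);

  Http->Configure (Http, &Cfg);

  gBS->CreateEvent (EVT_NOTIFY_SIGNAL, TPL_CALLBACK, RequestDone, NULL,
                    &Token.Event);
  Status = Http->Request (Http, &Token);
  while (!EFI_ERROR (Status) && !mDone) {
    Http->Poll (Http);        /* crank the driver until completion */
  }
  /* A matching Http->Response() call would then collect the status code,
     headers, and the NBP body (omitted here). */
  return Status;
}
```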
For consumers and people out there, we wanted some way of reaching the end user sitting on the open internet, so they can update the firmware on their system, or boot off of some cloud storage or network storage to deploy an OS, repurpose the system, or whatever. We wanted a more flexible pre-boot firmware network stack that lets you get at the machine and boot off the network in a more modern network environment, one with no special requirements on the DHCP server and no PXE servers on the network. Having web servers on your network is very common; your switches and routers are scaled and designed for steering TCP/IP traffic, and hosting web servers everywhere on the network is not particularly difficult and scales to a very large number of systems. So your typical HTTP or HTTPS boot flow: you just have a DHCP server handing stuff out, and then up in your network you have a DNS server and an HTTP server. The DNS server can be combined with your DHCP server, but there's nothing special there that you wouldn't already have on a typical network today. The DHCP server still hands out the IP address. You have a choice with your deployment servers: you can advertise, in UEFI, the HTTPS or HTTP servers with a particular NBP architecture type, x64 or IA32, for whatever NBP images you want to host. You can put that in the DHCP server, or, if you already know the file name and the architecture type you're trying to boot, you don't need the DHCP server for that at all. The other thing we added, obviously, was DNS, so that instead of typing in a hard-coded IP address, whether v6 or v4, you can just type in a URL name, and the DNS services we've added to the network stack let you type in www.OEM.com or whatever URL you want for wherever you're hosting your NBP files on your network. And then from UEFI you just name the NIC MAC address, the specific NIC you're trying to boot off of, and the URL, and you can download your NBP file off a standard web server and transfer control to it. So this is essentially the model we want people to move to. It uses a more modern network topology and standard network services, and it does not require large alterations of your networking topology in order to support booting the end system off the network. Now, booting off of HTTPS requires a cert, basically a TLS cert, that needs to be installed on the host UEFI system. In the spec we talk about the format of the cert and where it goes in the stack, but we don't talk about the actual implementation in the specifications. A lot of enterprise servers put the cert behind the BMC, for example, or someplace on the server that's safe and reachable out-of-band, and that's one of the issues the security people are very worried about: cert management and deployment of the TLS cert is something of great concern. When I came to the Linux Foundation and asked whether they wanted to host some certs, they told me, well, we're working with something called the Let's Encrypt project; we'd like to do our own signing of certs for TLS. They have the exact same problem up in the browsers, Firefox and the other browsers, right?
Utilizing that infrastructure for managing the certs would be a good thing, and once again the community really needs to work on providing a way to get at certs from the pre-boot space, and then, of course, on the vetting on the open web and how all of that gets updated securely. That's not part of UEFI, obviously; it's part of managing the open web. So we're going to the community and asking how the certs get distributed, how they get formatted, where to put them, and where you get them from. That's something that is obviously not solved by the BIOS engineers or the firmware people. From a security standpoint, we'd like to see the community step up and own that portion of the infrastructure out on the open network, right? Some of the advantages of HTTPS boot versus PXE boot: of course, we use a DNS description of where the server is for finding the UEFI file we're trying to boot. We can use this on any typical network without altering the topology, since most networks support TCP/IP and HTTP and HTTPS. We can use both IPv4 and v6. And you're not worrying about where the PXE server is or which subnet it's on: you don't need to worry about load balancing or placing the PXE server on the same subnet as the target system. You can cross multiple subnets, routers, and switches; you don't have the problem of co-locating a PXE server with the target system that's trying to boot. We have some basic minimal security from UEFI secure boot: for the NBP image, we check the signed image before we transfer control to it. And once you get HTTPS, we can also vet that we're actually booting to the URL, the actual website you're trying to reach, once we include the TLS cert in the system and the network stack. And if you combine this with a RAM disk: UEFI has a RAM disk, and I believe Linux has a RAM disk as well, right? You can then copy over not just the NBP file but the entire install image, everything you find in the ISO image, onto the RAM disk, run it from the RAM disk, and do the full install process. So this gives you what we think are the ingredients you need to deploy an OS on a bare-metal system that has nothing on it, just the ability to boot off the network stack from pre-boot space. Some use cases of what we think you'd want to use HTTPS for, and that people have come to us for: the most obvious one is installing the OS image, or just booting the OS if you have no local storage on your server in the data center. As I just mentioned, you can install from an ISO image you've copied up onto the web server and then moved to the RAM disk, and install from there. Or, as in the case of openSUSE, your installer can recognize and support HTTPS directly and install directly from the OS NBP image. We've also had certain organizations within the government and other places, security people, ask for firmware updates. We see cases today, with your typical IP camera and other devices on the network, that need firmware updates, whether it's your system BIOS or the BMC; virtually everything inside the system today has firmware on it. Your SSD storage devices, your NVMe cards; even your NIC card itself has an UNDI driver that may need to be updated to the correct EFI image or architecture type.
There are lots and lots of things inside the system. So UEFI invented something called firmware capsule update, and we've been working with the community on getting the capsule images standardized so they can be deployed from either Windows Update or something called LVFS, Richard Hughes' fwupd.org site. We would like to be able to deploy these firmware images out to the end target systems directly through, ideally, this safe network stack, and then update the system firmware without worrying about whether or not the OS can boot. You don't need to be in a completely functional state in order to update your firmware. So firmware update, for the security people, seems to be a really big action item and something they would all like to see. And then there's the case where maybe your system has crashed, the OS doesn't work anymore, or it's been attacked by some malware in the firmware; the system's been taken out, it's now compromised. I just want to wipe the OS and the current state of the system, reinstall the OS, restore the backup from cloud storage, and bring the system back to the state it was in before the malware attacked or infected it. I'd like to be able to do that from a network boot. So if you've been backing up your software and your system, as you should have, having a network boot function that's HTTPS-capable gives you the ability to put the whole system back the way it was before it died or crashed. There's an entire NIST document with recommendations, typically for enterprise and government systems, on implementing all of this firmware security and network stack, and the things you should go do. I highly recommend taking a look at that NIST document if you're looking at the requirements for system recovery and cloud recovery and storage and that sort of thing. The other case, for systems in the data center, blades and things like that: they just want to boot off HTTPS on their network and have no local storage on the blade systems, keeping the blade or compute nodes separate from the storage nodes. And for thin clients, your typical Chromebook-type environments, you don't necessarily want local storage; you want to boot off a network location to update or install the OS. That's another use for HTTPS that we see people wanting the stack for. So these are just the first few use cases that were brought to us, or that we've seen utilized today, that people are working on. We'd like to extend the traditional PXE network boot cases to these new cases that have been asked of us; that's the reason we put this network stack into UEFI to begin with. So, if you want to play today, and to give credit where credit's due: we have Gary Lin from the SUSE community, Gary Lin and Joey Lee and some people in Taipei, who have been watching all of this and have implemented the full HTTPS capability in openSUSE Leap 42.3 and SLES 15. So right from the installation you can install directly, and that's the little demo I have here: with the latest releases, they're able to boot off of HTTPS and install openSUSE or SUSE SLES 15 onto your system. These guys have been watching us very closely and were the first to go implement this and put it in a shipping distribution.
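Stepping back to the RAM disk ingredient mentioned a moment ago: UEFI 2.6 exposes it as EFI_RAM_DISK_PROTOCOL, and a downloaded ISO can be published as a virtual CD in a few calls. A hedged sketch, with the buffer assumed to already contain the image fetched over HTTPS:

```c
/* Hedged sketch: publish an in-memory ISO as a UEFI RAM disk so the
   firmware and installer can boot it as if it were local media.
   Requires a UEFI 2.6+ platform that produces EFI_RAM_DISK_PROTOCOL. */
#include <Uefi.h>
#include <Library/UefiBootServicesTableLib.h>
#include <Protocol/RamDisk.h>

EFI_STATUS
RegisterIsoAsRamDisk (IN VOID *IsoBuffer, IN UINT64 IsoSize)
{
  EFI_RAM_DISK_PROTOCOL    *RamDisk;
  EFI_DEVICE_PATH_PROTOCOL *DevicePath;
  EFI_STATUS               Status;

  Status = gBS->LocateProtocol (&gEfiRamDiskProtocolGuid, NULL,
                                (VOID **)&RamDisk);
  if (EFI_ERROR (Status)) {
    return Status;           /* platform predates UEFI 2.6 RAM disks */
  }

  /* Expose the ISO as a virtual CD; the firmware's block and file
     system drivers then mount it, and a boot option can point into it. */
  return RamDisk->Register ((UINT64)(UINTN)IsoBuffer, IsoSize,
                            &gEfiVirtualCdGuid, NULL, &DevicePath);
}
```

A boot option pointing into the registered disk then behaves like local install media, which is the install-from-ISO path described above.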
We'd like to see the next level of the community above Gary and Joey, the deployment software people, for things like SUSE Manager or whatever, also get off of legacy PXE and use HTTP or HTTPS to do their deployments. This is something you can do today: if you can get hold of a UEFI system that supports the newer network stack, you'll be able to try this out. I have a system here sitting on the table, and a video that I've taken, so in this demo you'll see an actual boot off of a physical device that has HTTPS fully implemented from TianoCore. And Gary and Joey have done the work of implementing it up in OVMF. So if you can't get hold of hardware that boots HTTPS (I think there's an HP ProLiant server that boots HTTPS today), or this MinnowBoard MAX board, there's an OVMF environment: you simply download it onto your build system and try everything out inside the OVMF environment off of EDK2. If you want to do some real work and play, we highly encourage you to contact us, or just try it out: download it and play with it inside the virtual environment if you're trying to get some tools and deployment tools working. There's a basic getting-started guide here on how to set up the OVMF environment, or download EDK2 and build the stack, if you're really interested in pulling the stack down and trying it out for yourself with the source code. I'm backing up for a second here. There's some other information here on where the source code is, or excuse me, guides on how to boot HTTP and HTTPS, and wikis we've put up on the EDK2 TianoCore site, and some GitHub links. We're transferring a lot of the tutorials and educational materials to the wiki site, the GitHub wiki, in GitBook format. And there are some URLs here on where to get the source code for OVMF, to build and test in the virtual environment, as well as the MinnowBoard MAX project, which is an open source hardware board that we use to implement EDK2 and test out the code on a real platform, something open source that you can get and try out yourself. And I named the HP ProLiant Gen10 series servers, which have HTTPS boot built in; they're shipping today. So if you're in the enterprise and you have one of those servers, you can actually test out the stack and try it for yourself. There might be other Dell and Lenovo servers that have this as well; I wouldn't be surprised, but I didn't do a full survey of all the firmware that's out there on all the servers. You can ask them whether they've implemented this network stack in their systems. So I have a little demo here, just to prove to you that this is not all vaporware. I'm going to go off... I have this up here, but you can't see what's going on up here. Actually I can, let's see if this works. I'll plug the video into the MinnowBoard MAX system here, which I've booted up already. On my notebook over here, I have an HTTP server that I've set up with instructions from one of the URLs up there, from Gary Lin and Joey Lee, and I have the MinnowBoard MAX target system that's going to boot over the network. There's a tiny Netgear switch here, a one-gig switch, and just the MinnowBoard MAX, and then my notebook as the host server, hosting SLES 15 as the URL host using Apache 2 and running a DHCP server as well.
So I'm showing you here: the MinnowBoard MAX system is booted up to what we call the BDS, the boot manager, and what I'm going to do is select among the boot options that are set up in this BDS. I've already installed openSUSE on the storage device, an SSD storage device, just to prove to myself that it has the drivers to be able to boot openSUSE. In addition, there are a bunch of options here to boot off of the internal shell and the flash, and then all the UEFI PXE boot options; there's a v4 and a v6. There are two NIC connections on this particular board, so you have two NIC addresses, and then, remember, there's PXE IPv4, PXE IPv6 under UEFI, and then HTTP. So I have at least six combinations I can boot from, and that's what all these entries are; I can pick which stack in the firmware I'm booting through. This is UEFI PXE v4 and v6, which require a PXE server, and then from the first network connection there's HTTP v4, so I'd have several boot options depending on which stack I'm trying to use, v4 and v6. Sorry, there are four options for the first NIC, and then another four options here for the second NIC connection, the same thing: PXE v4, PXE v6, HTTP v6, and HTTP v4. On this particular server, this little notebook over here, I've implemented a web server that is doing HTTP over IPv6. So I'm going to select that boot option and boot off of the URL. In my BDS, in the far right-hand corner here, you can see I'm going to let the DHCP server pick the URL that we're booting off of, whatever it decides. There's a second boot option called UEFI HTTP, and this names the specific IP address of the HTTP web server. I could have typed in a URL, and I should have typed in the URL instead of the direct static IP address of the HTTP server, but in this particular entry I entered the IP address; there's a v4 address on there as well. So we'll go back and boot off of the HTTP v6 option and let the DHCP server name the URL. It's bootx64.efi, the name of the installation software from the SLES 15 DVD or ISO image. Right now it's just gotten the IP address, the v6 address, from the DHCP server on the notebook, and then it's handed the bootx64.efi image file off of the web server; that's GRUB, and it runs. I don't have secure boot on in this system, so it's booting without checking the image, but typically you would want to turn that on and put in the cert for the shim that's in the system. And once GRUB 2 starts running, its kernel loads, its network stack loads, and we go off and start booting the OS... and, as usual, it never works at the actual time you show it in front of a large number of people. Just so you know: GRUB is running, so as far as UEFI is concerned, we've already handed control off to the UEFI image from the network server; what it does after that is really what Gary and the SUSE people have done. So we'll try this once again. Actually, we'll try the... I don't think I've started the v4 server, so it's only v6. I've forgotten whether there's a timeout built into the GRUB 2 loader, so I don't know if I'm supposed to be pressing space or return, or if it does that itself. Yeah, so this time it worked fine. I think there's a timeout inside the GRUB config file, and I was talking and didn't press the space bar or something.
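As an aside on what those boot entries actually are: each one is just a UEFI load option whose device path ends in a Uri() node, and the firmware's device-path-from-text service can build one from strings like those shown in the menu. A hedged sketch, with the MAC address and URL as placeholders mirroring the demo setup:

```c
/* Sketch: build an HTTP boot device path (NIC MAC + IPv4 + URI) from
   its text form, using the standard device-path-from-text protocol. */
#include <Uefi.h>
#include <Library/UefiBootServicesTableLib.h>
#include <Protocol/DevicePathFromText.h>

EFI_DEVICE_PATH_PROTOCOL *
MakeHttpBootPath (VOID)
{
  EFI_DEVICE_PATH_FROM_TEXT_PROTOCOL *FromText;
  EFI_STATUS                         Status;

  Status = gBS->LocateProtocol (&gEfiDevicePathFromTextProtocolGuid,
                                NULL, (VOID **)&FromText);
  if (EFI_ERROR (Status)) {
    return NULL;
  }
  /* Placeholder NIC MAC and server URL, as in the demo's boot entry. */
  return FromText->ConvertTextToDevicePath (
           L"MAC(001320F57FA6,0x1)/IPv4(0.0.0.0)/Uri(http://192.168.1.10/bootx64.efi)");
}
```

Stored in a Boot#### variable, a path like this is what the boot maintenance menu creates when you type in the URL by hand.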
So now it's already transferred control to the RAM disk; it's copied all these files over from the ISO image on the web server, and now you're watching the standard install program from SLES 15 running. UEFI is not in the picture anymore: the installer has loaded the network drivers, we're using the SLES 15 network stack, and control has been transferred. So it's now going through the install process. I'm not sure you want to stare at an OS installation; this is all YaST and those things running on the system. As for how I got this set up: through the boot maintenance manager, when you're adding boot options, you can go through and select, using the BIOS setup, to boot off of the SD device here. I won't go through that right now; it's implementation-dependent on my particular version of UEFI on the system. I'm just going to go back and finish up on the slides here. How much time do I have left? Down to the last 10 or 12 minutes here.

So, the call to action, the reason I'm up here talking to the community: we'd like you to stop using legacy PXE if possible, for the variety of trust and security reasons I just mentioned. If you really just want to boot off of whatever PXE server on the network answers you first, that's probably not the smartest thing you should be doing. Even if you have iPXE, its trust models are a little bit different: for HTTPS they have a fixed location on the network that it drags the cert over from, and security people have a cow when they hear that. We'd like you to take a look at our HTTPS boot implementations. Try out some of the stuff in the marketplace today, the ProLiant servers, or get the MinnowBoard MAX, or just pull down OVMF and build it yourself, and work on the deployment tools to actually stop using legacy PXE and move over to HTTPS as your deployment method if possible. And pay attention to the security aspects of where you're booting to: you don't necessarily trust the network you're booting from, and we typically don't have dedicated backend networks for deployment anymore. And then, really, on the openSUSE documentation: help out with documenting this for end users, as well as how-to guides and that sort of thing for your particular distribution, and get the open source community onto your stack and educate them on how to use your deployment tools once you migrate over. So I did want to open up to... oh, there's one other slide here besides these references to these locations. Our community, the UEFI community, gets together once a year in the US for a plugfest. We're holding one next month, in April, up in Bellevue, Washington, at the Embassy Suites. There'll be a bunch of lunch sessions educating the industry on the state of the art of UEFI and the latest advances we're doing, and then testing: all the vendors bring their systems there, and you can go off and try out UEFI on various OEM systems. The OS vendors might be there as well; Microsoft is definitely there, we're down the street from them. And in the test room suites where we hold the plugfest, you can go in and try out your software or hardware on real UEFI systems. Intel will typically bring our latest systems there, in the closed rooms, for you to try out, and the OEMs typically also have client and server systems
for you to try out. So we highly encourage the community to come out and visit us and meet the people who are actually building and constructing the firmware and the specifications with the industry. We also cover ACPI, we've absorbed that spec, so there are some ACPI people there if you want to talk to them as well. So I think I have maybe a few minutes here if there are any questions; there's a microphone up here if anybody has any questions.

Hi, my use case is a contract manufacturer who's going to do the initial install on our new hardware; that is a very untrusted environment in some sense. We're running ARM processors; we don't have GRUB. One question is: is there a hard dependency on GRUB? You showed the MinnowBoard; does that actually have one of those UNDI drivers? I'm just wondering, for an ARM platform, how practical it might be to implement some of this stuff. And also, just a quick question: would you advise, in a situation like that, hosting the images inside the contract manufacturer, or hosting them in our own infrastructure and pulling them down via HTTPS?

Okay, so first of all, for ARM I would contact the Linaro community. A lot of the people in the Linaro community are in the middle of porting to UEFI right now, so you'd have to talk to them about getting an UNDI image, as well as the ARM... well, they ship ARM images with SUSE and openSUSE, right? But as far as getting an NBP, if it's not GRUB 2, whatever they're using for ARM, the installation program you would use, you'd have to talk to the Linaro community about that, along with the SUSE community, as to whether they support ARM. I'm pretty sure they do, but whether they include an NBP image for installation using HTTP boot, versus, I'm sure, a PXE boot image, I'm not sure that's been ported, so you'd have to contact them. The plugfest and the Linaro Connect events are where you can get the latest on what they're doing. And then your other question: should you host the NBP image internally, locally in your facility, on your intranet, or in your cloud? With this stack, it's a question of security, of which environment you trust. There's a certain fruit vendor that we know, right, in Cupertino, where you hold the key down and it goes out to a URL, checks your Apple ID, and says, hey, are you a legal customer for this unit, and they do it over the open web. That's one model. Another model is, if you have blank systems and you don't have that security business model in place already, then yes, you'd want to do this behind a closed intranet and not have it out on the open web. Some other BIOS vendors and OEM vendors have a hard-coded URL; they're still using the TFTP and PXE stack to go to a location on the open internet to drag down an NBP image to reflash your firmware. So it depends what you're trying to do, and you need to look at the security aspects, what kind of environment you trust and how much security you have in your system, to decide whether you should allow control to be transferred in that environment. You can talk to me afterwards; I'd like to understand your use case better on that one. I'm going to hand over the microphone.

I had a question about... sorry, where can we find the slides for this? I actually work on a tool that does PXE TFTP boot installations, called Stacki.

So I assume SCALE is going to post the slides... oh, okay, I mailed them to you, but I'll post them on the event page, and then they should appear on the SCALE website.
Okay, yeah, thank you. One more question after this, and then we're going to have to wrap up. I'll be around here, or just outside, if you want to talk to me afterwards.

So when you did the demo, it looked like you had a lot of stuff already set up on the board. How much manual configuration in the UEFI configuration interface do you have to do to get this up and running?

What I didn't have time for was to step through that. In the reference BDS, the reference boot manager in EDK2, there are some menus you can go through, so you do have to add the boot entry yourself. Once you have the stack implemented, there's a selection for typing in the URL, basically. So all I had to do to set up that boot option was go into the boot maintenance menu and type in the URL, basically the IP address, for that particular MAC device. That was all the user setup I had to do inside that particular implementation on the MinnowBoard MAX. It's not really quite plug and play; you can't buy 500 brand-new servers and provision them using this. In the server world there's something called Redfish, and what's not standardized in UEFI today is a standard way to deploy configurations. They're working on that with the DMTF. If you're interested, at the OCP Summit, or in the DMTF organization under Redfish, you should take a look at that. If you want to deploy these network configurations on mass numbers of servers, both in-band and out-of-band, and not just the network configuration but the rest of the BIOS setup, you need to talk to the Redfish organization about the work that's going on right now as we speak. Okay, last question.

I have two questions. First question: is there a published resource somewhere, or is there interest in building a published resource, on UEFI 2.5 compliance among manufacturers? Basically my concern is building a product around UEFI 2.5 and knowing how many servers support it.

One of the things we have on the UEFI side is something called the SCTs: you can run the self-certification tests to see whether the API interfaces in the stack are there, whether your system is capable, right? Because, yes, you're right: walking up to a system, unless you boot to the shell and you're very familiar with UEFI drivers and which driver stacks are available, you don't know what it's capable of. So running the SCTs is one way of checking which API interfaces are available on any given system. I also recommend FWTS; the existence of an API does not mean it's actually implemented and working. FWTS, from Ubuntu, is another test suite that everybody uses for UEFI and ACPI. I'd have to check with Alex Hung whether they're up to 2.5 or 2.6 or not; I haven't asked them on Launchpad to implement the FWTS tests for HTTPS, something we would obviously want them to go do. And then there's something called LUV: Intel has, on 01.org, the Linux UEFI Validation tests. They combine FWTS, chipset tests, a whole bunch of things together to functionally test your system. So there are a number of places you can go to test your UEFI system and see what it's capable of. Okay, great, thanks. Last question: are there resources on leveraging the TPM as the backing feature for validation? So, as a call-out, one of the things we were working on was where to store the TLS certs for the system, right? That's something that's not well spelled out today.
We just talked about the format, and that you should have this capability, but not the implementation details of where to safely store the certs in a system. We do have, on the website... we do implement TCG. This session was not about measured boot, which is more than secure boot, right? I didn't talk about TXT or Boot Guard or any of those things. You can talk to me afterwards; that's a whole one-or-two-hour session in itself. So I think they're going to want us to move on. Thanks. Thanks. All right, thank you. Thanks a lot.

Either a slide deck or a website; then you can just put the link in, everyone sees your talk, and then I can also get that. Testing, test. Am I on? Can you hear me? Do you want to hear me? That's the real question. Yeah, thanks. Is it time? Oh good, yeah, it's time. My name is Craig Gardner, and I'm speaking for the first time here at SCALE, but I've been giving this particular presentation, or variations of it, for 20 years. And, oh my, I've got to turn my clicker on. There we go. I subtitled it this time after one of my favorite poems by one of my favorite poets, Robert Frost: "The woods are lovely, dark and deep, but I have promises to keep, and miles to go before I sleep." I'm a software engineer by background, professionally, but I'm also an instructor at a university, and I spend a lot of time teaching computer science there. I also spend some of my free time trying to get into the elementary and secondary schools to talk about this kind of topic. I've adapted this particular presentation for you lot: usually when I give this kind of presentation, it's to students, to educators, to people who are less familiar with open source. We're at SCALE, and you know all there is to know about open source. I'm not going to teach you very much about open source in particular, but I would like to take a moment to talk to you about some of the challenges we have in the industry that revolve around this thing, this concept, this ideal, this movement that we all feel very strongly about. As I mentioned, I have a lot of engineering background, but at the university I teach computer science classes, particularly in software engineering, and I'm also currently teaching a class about technology and ethics, which is strangely related to the topic we're talking about today. Likewise, I create training materials for a variety of different SUSE-oriented products, which I won't get into today. The real challenge I would like to talk about is that we have a very large demand for people contributing to the growing set of open source projects. Now, you all know the power and importance of open source projects. It runs the internet, right? Open source is a movement that is providing new solutions in new ways, in new sectors, through a variety of different means, through a slowly growing number of people. The challenge is that we don't have enough people entering the realm of writing open source code, solving problems in an open source way. As a matter of fact, our friend Greg Kroah-Hartman, who works at the Linux Foundation, was speaking at CoreOS Fest not too long ago, it's now three years ago, and he made a very insightful observation that I would echo: it's hard to find people to hire in these fields, people to hire to work on these projects, because everybody's already been hired.
And so we need to invest in the growth of people adopting these paradigms and methodologies, and do so in a way that grows the field of people who will continue to contribute, at a pace that matches the demand. I've titled this "Looking for Rock Stars and Ninjas in All the Wrong Places." There have been lots of interesting news articles and even academic studies about this myth of rock stars and ninjas in our industry. Now, make no mistake, my friends, there really are quite a few of our friends that we would consider rock stars and ninjas. And maybe there are some of you in this room; maybe you're humble and wouldn't call yourself that, but you may be a ninja or a rock star. But the recruiters in the industry are putting out job descriptions for people to join the software world and giving those job descriptions titles like ninja and rock star, and they're failing to attract the people they're trying to entice. Now, why is this? Two reasons. Number one, no one has any idea what a ninja or a rock star job is. And number two, the people who may be able to do the work the recruiters are advertising are humble enough, or maybe even dumb enough, to think that they don't have those skills, when they may really have exactly the skills the company wants. So we're failing to match these things up. We need to get away from this whole concept of ninjas and rock stars, as cool as it is and as appealing as it sounds. We're not doing ourselves any favors when we try to hire ninjas and rock stars. We seem to think that this world of open source can only be filled by ninjas and rock stars, and I'm hoping I can convince you that that's not the case. So as we talk today, I'll talk a little bit about open source, which you already know. I'll talk about where you and I can make the biggest impact, and I'm gonna start by saying that the schools are a good place, but I'm not gonna end there. I wanna talk to you folks about how you can make a difference in or out of a classroom. We'll talk a little bit about how we use open source to bring people into open source. And lastly, I wanna make sure you understand that your everyday, run-of-the-mill staff software engineers are going to be, and need to be, the people you're trying to describe as ninjas and rock stars. You've probably seen this kind of chart before; I shamelessly ripped it off from Wikipedia, right? Why do we use open source? We don't need to talk about that here today. But when I'm talking to educators, to students, to people outside our realm of influence who need those ninjas and rock stars, I have to explain to them the value of open source, and there are many values, some obvious, some less obvious, but they're the kinds of things you already identify with. The point I wish to make as we move on is that in our schools and institutions of learning, we recognize that we have lots of jobs available, and a growing number of jobs that will become available, that we need to fill with people who can program. And so in our institutions we teach programming, and that's a very common thing. Now, I'm going to show you a little later that it's not as common as it needs to be, but it is something we've all taken part in. We've all gone to classes.
We've all perhaps even gotten a related degree at a university, where we've learned how to program. We sit in front of an instructor, and the instructor teaches us: this is an algorithm, this is a procedure, this is recursion, this is a subroutine, all these different things we come to appreciate in terms of programming. That's a very common thing, and it's very worthwhile, but teaching open source is a foreign concept in our institutions of learning. We'll talk a little bit about that in a moment, and I would assert that if we were to apply open source principles, and if we would use open source in our education, we could make a bigger difference in preparing people to fill those roles in the industry. Now you're thinking to yourself, and rightly so: what does Google care whether people learning to program are doing so in an open source way or not? I just need programmers. We'll get to that. Not too long ago, the GNU revolution started, and that kind of opened the door to the free software and open source movement, and you have these values that were established by Richard Stallman that you're all very familiar with. Let me provide some variations on that foundation from Richard Stallman; these are Craig's alterations of the free software freedoms. Number one, with open source we need the freedom to move from project to project. Have you sensed that? Have you been in organizations where project A has been good, but then it kind of dies and becomes less important, and now there's project B, and you need to figure out how you're going to staff project B? With open source methodologies, with an open source mentality, with open source values, we have a greater ability to move from project to project; we have greater agility and greater flexibility. Number two, the freedom to grow and adapt our own talents according to our own needs. I would assert that open source facilitates that; it's even at the heart of what open source is. Number three, we need open source to be able to redistribute ideas: the ideas that come out of project A proliferate into project B and all the way down to project Z. And number four, the freedom to improve ourselves, not only for the good of the projects we're working on, or the industry we're trying to improve, or the problems we're trying to solve, but the freedom to improve ourselves. This leads toward an open source mentality, and that gives us greater freedom. Now, about our schools: I'm not trying to bad-mouth the schools, and I'm not trying to make any particular school or school system look bad. But I want you to think about what schools do, particularly in the context of technology. They're trying to tell their students, we're preparing you for the real world. And what's the real world all about? Well, it's a lot of things, but at the end of the day, it's putting food in your mouth and a roof over your head, supporting yourself and your family, whether that's for hardcore day-to-day needs or for the toys you wanna buy and the extracurricular activities you participate in. You're telling these students they're getting ready to make money later on, and they don't understand how free software leads to paying jobs. And schools are generally funded by taxes, and to a certain extent by benevolent corporations that donate their time, their hardware, and their software, but with their own agenda.
Number three here is that most schools just don't understand what open source is, what open source does, and that it has this broad reach of business applicability. And it requires that schools be more adaptable and more open, and change their old paradigms, right? But schools don't have time. Schools have so many distractions, so many different things they're trying to juggle, and it's hard for them to be flexible and adaptable. I'm simply saying I can understand why it's hard for them to see the value that open source can bring. The standard approach of schools to technology is broken. And again, I'm not trying to poke holes, I'm not trying to offend; I'm simply saying that if these schools are going to succeed in preparing this next generation of open source geniuses, ninjas and rock stars, they need help. They need help understanding what open source can do for them, because they're tied to their tried-and-true academics. And they're a bunch of good people, but old people, who understand things in a strictly academic way, okay? And I'm an old guy too, right? You can see I'm kind of a graybeard kind of person. They're closed systems, typically. They're not open to new ideas; they're very rigid, tied to tried-and-true forms and paradigms that are all about the individual being tested, the individual being given an A through an F or an E. You either know the stuff, or you kind of know the stuff, or you don't know the stuff, and there's no collaborative involvement in the learning. Do you know how to write the algorithm or don't you? Open source isn't like that. Open source threatens that, but they can get along; I'll tell you about that in a moment. We use the tools, methods and hardware that come from the vendors. Now, please do not misunderstand me: Microsoft does a great job of donating its wealth to schools, and in the process they give a lot of hardware and software, with the intent that the students are going to become Windows users, aren't they? Right? That's fine, I'm not saying that's wrong, but Microsoft is buying mind share in the schools and making it harder for the open community to work its way in. That's a difficult thing for schools to negotiate. I'm not trying to say Microsoft is bad in this; I'm grateful for what Microsoft does, and I'm only using Microsoft as an example, it's not the only case. But they have a vested interest in making sure they have future purchasers of Microsoft technology. We get it, we understand it; that's part of their business. This ends up being a broken approach, because in an individual-only environment you're not really approximating what the world is all about. I understand that a student needs to prove that he or she can read at a certain level, can perform algebra, and can create some subroutine in C or Python or C++ or, C-sharp-sharp, what do you call that? C#, right? We understand the individual attributes of learning, but schools are tied too much to that, and they don't understand that collaboration needs to be part of the education, because collaboration is life. Everything we do in this life is a team effort, especially in our industries, and especially in software. And they also get tied down in writing software for writing software's sake.
Instead of realizing, and making good use of the fact, that there's a lot of code that's already been written, and that part of the real world is reusing code that already exists. Yeah, I understand, you've gotta figure out how to write recursion; you've gotta be able to show that you can write recursion. But they need a collaborative environment wherein students can learn how to work as a team. And those educators are thinking: I can't do that if they're borrowing somebody's code. I can't give Johnny an A or a C based on his borrowing Mary's material. That's a tough thing to overcome. Now, we understand all these benefits of institutions using open source, of all companies using open source, and schools are no less the case: typically lower cost, freedom from vendor lock-in, all these different things. Yeah, that's all great and cool. But let's think about where you're trying to get these students to go, where you're trying to get them hired, where you're trying to get them involved, and where they're going to wind up solving some of the world's most challenging problems. 77% of software companies, and we'll talk about non-software companies in a moment, but this is information from a variety of different sources, largely from this one down here at the bottom, 77% of software companies foster open source. So why don't the schools teach about open source, if 77% of the software companies need to hire people who know something about it? That's up from 50% just four years ago. 73% of software companies have a formal open source strategy. Well then, why shouldn't the students be learning about open source? Beyond that, 78% of tech companies, and of course that's kind of a squishy definition right there, have some reliance on open source. Moreover, there has been, and it depends on who you ask, so I don't put numbers on this particular datum, a monumental increase in the use of open source even in non-software-oriented companies. Mining, materials management, medicines, medical, mom-and-pop: all these industries are using more and more technology, and what's behind that technology? Open source technologies, okay? We recognize this, right? But our schools are challenged by this, and we need to be working with these schools to help them understand. That's why I've got miles to go before I sleep: I'm trying to spread the message that this is not just a school problem, it's a general problem that we all need to help solve. 87%, this is from the Linux Foundation's 2018 Linux Jobs Report, 87% of hiring managers are having a hard time recruiting enough open source talent. That's a huge space to fill, and Greg Kroah-Hartman says we can't hire them because they're already hired; we need to fill the pipeline with people who understand this. Now, Microsoft is doing a great job. Back in 2015 they launched a stance and approach that embraces open source, and I applaud them. Let me tell you, I have many friends who work at Microsoft, and they're moving there, and this is a great thing; they're going in the right direction, and Apple is too, right?
A little bit, anyway. But if you open up your iPhone or your iPad and look at the license information, you'll see the hundreds of open source licenses of projects that they're incorporating into their products. They buy it, they understand it; they're a little less outspoken about it, but they're moving there, they're using it, they take full advantage of open source. And these are the big companies that every junior high school student knows a lot about. Our future outlook, from the Bureau of Labor Statistics, is that 71% of new STEM jobs are in computing, yet only 8% of STEM graduates are in computing; they're usually in some other math or engineering discipline. The imbalance is obvious, right? Only one in four secondary schools teaches programming. That's tough, that's a problem. And one of the reasons they're not teaching programming is that they can't get teachers who know how to teach programming. And I don't blame them, because every year it's different; it's hard to stay current and up to date as an educator teaching programming. And just because you've got the one great lady who took a class so she could teach that particular programming class that year, she's likely not gonna teach it again a year later. It's a tough place. Now, there are lots of schools working to remedy these kinds of things. You've got Harvard, which has fully embraced this open mentality, including all of their MOOCs and all the material they're putting out in their massive online education efforts. This is great. RIT does this. Oregon State is huge on this. San Diego State is embracing this. Yeah, good. What about Caltech? Does anybody know if Caltech, here next door, is into open source? Nobody knows. Good, thank you very much. Yeah, great, thank you for sharing that. There are even some high schools trying to do this. There's one of my favorites: I'm from Utah, and there was a school that failed called the Wasatch Institute. It failed for financial reasons, not because of its attempts to get an open source mentality into its students, but they tried it. And Penn Manor in Pennsylvania, let me tell you, they're killing this. They're doing the right stuff; they have understood what the future is. But we don't have enough of this. A few examples; we need more. All right, you probably know this as well: open source is the new CV, the new resume. I have been a hiring manager; I've hired scores of employees in my day, and I'll tell you how I hire people. I like seeing their resume, but the ones I end up hiring are the ones where I can see what they've done. I see what code they've contributed. GitHub, these guys sitting right outside here, gosh, they're smart. And for all of those people, the way I find out what they really can do as potential employees is that I can see what they've done on GitHub. As this changes, we need to make sure the students understand this, and that they're putting what they're learning out on GitHub. Now let me tell you a sad story that I'm embarrassed to tell. I don't even know if I'm authorized to tell you about it; I'm gonna go out on a limb and tell you anyway. At the university where I teach, they recently rescinded the diplomas of students who had graduated from the university, because the department found out that they had posted their code on GitHub. That's a travesty. Now, I know why they did it, right?
What they're saying is that the students are giving away information about the courses, and the answers to the tests that will be given at the university, and that that's a violation of their contractual obligations to the university. I understand that, but students need to be allowed to put what they're writing out there: for their own benefit, for the benefit of their peers, and for the benefit of anybody who might stumble on it and say, holy cow, look what this person has done, this is going to be instrumental in solving the next great problem. We need to be more open about this stuff, and this approach helps employers know who the smart ones are, who's really going to be able to solve the problems at my place of employment. All right, so turning people into programmers is one thing, but turning programmers, turning students or newbies, into contributors is hard. Getting programmers to contribute is hard because they might not know where their interests lie. They have to figure out, well, where do I fit? And then, after they figure out what they're interested in, they have to get inserted into the project, and depending on the project, that might be a big hurdle. And we're all grateful that Linus has taken a class on interpersonal relationships. But before that, trying to contribute to the Linux kernel was a Herculean effort, reserved only for ninjas and rock stars. They're getting better; I'll tell you, Greg Kroah-Hartman has done a lot to help Linus figure that out. All right, so I've got a little metaphor here: hurdling. The running part of hurdling is generally an easy thing to do, right? You just kind of fall forward, controlled, in a fast way. Then you get to the part where you have to jump over something, and that might not be very easy. All of a sudden you have to leap, and you have to leap at a certain height with a certain amount of timing, and, you know, if you miss, it hurts. This is a lot like trying to contribute upstream, and I've got a few examples here. This guy didn't make it. This happens to be in my backyard: this is at Brigham Young University, at the steeplechase event, and BYU typically produces some really good steeplechase competitors, but this guy didn't make it. That hurts. And this guy, obviously, that hurts, right? It's the same thing with contributing to an open source project. Programming is pretty natural: I learned how to do it in school, or from that book I read. But contributing upstream doesn't come quite as naturally, and it can sometimes be hard, and trust me, when you fail, it hurts. Somebody flames you; somebody makes sure you're not interested in contributing anymore. We've gotta change that. That's on us. So where can we make the biggest impact? Some of it is in the schools. Students in China and Thailand start programming in grade three. Do I think that's the answer for the United States? No. I mean, there's some good and some bad; there are pros and cons to this, I get it. But I'm telling you: students in China at grade three outperform United States high school seniors on standardized computing tests. We've gotta do something about that, and I'm not saying we've gotta start US students at grade three, but we've gotta do better. 88% of global businesses can't fill positions.
We gotta do something to fix that, and there's a huge pipeline, and it's fairly well defined: if we can get people started earlier, they have a better chance of succeeding. It's harder to train people later; I'm not saying it's impossible, you can teach an old dog new tricks, but generally speaking, the earlier people start, the more they latch on to these kinds of things. So let's get this going at a younger age and get students interested in the open source projects that exist. Well, let's think for a minute, though: the schools aren't necessarily interested in open source. They need the money to do the educating, so they're gonna follow the money, and the schools that can't get the money from government get it from industry, or they go without. And we had a great talk yesterday, thank you, Dr. Richards... Sam Coleman, thank you, how did I forget your name so grotesquely? A great talk about his secondary-education approach to handling this kind of stuff. We've gotta get industry to understand this, right? If 88% of the jobs are going unfilled, and 73% of the industry needs open source talent, then let's get these schools to understand it. Let's push back and get the industries to support open source. But I recognize that we have limited control over this. Maybe we've got more; maybe there's somebody here close to Tim Apple. He's called Tim Apple now, right? President Trump calls him Tim Apple, isn't that great? Yeah, Tim Cook. Somebody, Tim Cook's right-hand man, is probably sitting in this room, and he's gonna go back and say, yeah, that Craig Gardner, he says we need to change our song. Do it, right? I'm not very hopeful that we're gonna have a huge influence on industry to change its tune, but I'm not giving up hope. Now, what if we did control the classrooms? I teach this stuff at the university, and I want other schools to do it. Oregon State does this, and San Diego State is starting to, right? We're doing better at this, but we've got to get more university environments and secondary environments to use open source in their curriculum. Pick a project; have the students pick something they can contribute to during the semester. There's lots of variety, from easy to hard, that can match your students' skill sets in a real-life environment. Okay, we have to put the fundamentals first. We've gotta teach them: this is syntax, this is structure, these are classes. We understand; we've gotta start with the basics. But as you're moving along, get them to where they're actually contributing, not to some arbitrary, contrived coding example. Pick a project that matches your interests and your skill set. Have your class assignments modeled after what's going on in open source. Now, that takes time and energy from the instructor, but some of us are willing to do that. Recognize and reward collaboration. Yes, you need to be able to say, okay, you understand what an algorithm is, I'll give you an A on that assignment. But get to where you have the students working together, because that's what the real world does. It takes effort, takes time, takes patience, but good instructors can and will do this. And the last thing: don't punish failure. I hate giving students C's and D's and E's. Our approach to failure needs to change. We've got to start teaching these people that failure is simply an element of success. It's part of the process of success.
Now, it's okay to tell students, well, that didn't quite go the way you intended it to; what did you learn from it? And move on from there, instead of saying, sorry, you didn't cut it, you suck. What good comes out of that? All right, it might motivate a few people, but it just doesn't work; it's not human nature to have someone tell you you're a failure and to pull success out of that. So start with programming, but solve real problems with real software, and give them a grade based on how well they progressed. Demonstrate that failure is part of success. As far as Nietzsche goes, I don't buy a whole lot of what else Nietzsche said, but he was pretty good when he said: that which does not kill me makes me stronger. Yeah, that's decent, right? So let's move on beyond the classroom. Yeah, we're going to have some involvement in classrooms: we've got our kids, our neighbors, our grandkids; we may even be teachers. But how are we, in our jobs, in our spare time, going to make the biggest difference? Outside of the classroom, we have to be the recruiters. Let me tell you something, and I know I'm gonna offend somebody when I say this, but recruiters in the tech world are not very good at their job. They just don't know what we know about what it takes to hire someone in this industry. Now, there are exceptions to this, obviously, but generally speaking, recruiters know how to gather resumes, search for keywords, and try to line them up. We are the real recruiters. Let's make the industry a place where people want to come and contribute and be part of something. We need to be inclusive. Instead of the flame wars, instead of telling people, yeah, where did you learn to program? That is the worst example; there's no way I'm accepting your contribution. We're driving people away. We need to be more inclusive: hey, that was pretty good. Now let me tell you, there are a few things missing from this contribution you're trying to make, and let me try to help you understand what needs to happen for us to be able to accept it. We need to be inclusive in our communities. We need to encourage and not discourage. We need to make sure we follow our own rules and that our code is maintainable and good. It doesn't have to be perfect, we get it; we're coding against deadlines, and we're not all rock star programmers, but we need to make sure our code is maintainable. We need to write our code as if the one who's going to come after us is a homicidal maniac who knows where we live. Be kind in how we code stuff, and don't say, well, whoever comes behind, it's their job to figure out what crap I put together. We need to hold ourselves to high standards, and we need to live by them. Do we have standards? Sometimes we do, sometimes we don't. Okay, parting shots here. I didn't draw this; it came from a Google search, and I put the references, the links to all of the images I've used, either on the pages themselves or in an epilogue at the end of the slides. We would love to have ninjas. We'd love to have rock stars in our organizations. And we may run across a few. And you may even be one. But most day-to-day accomplishments are of that middle-of-the-road, good-enough variety, by your staff programmers, who are ordinary mortal humans.
And so we need to invest in mortals and make sure they feel like they have a place to go, that they're welcome, and that they have the means whereby they can get better. That they feel like they are contributing to solutions, and not part of some vast set of problems for which we ostracize them. And as we invest in those mortals who are doing the day-in, day-out grind of solving some of the world's most challenging problems in technology, if we tell them they're doing a good job, and we encourage them, and we make them feel included, and we certainly correct them too, making them feel like, yeah, you can do a little better, let me help you do that; as we invest in that way, we will probably have a better payoff in the long term, and even in the midterm, than going out and trying to hire the ninjas and rock stars. And the simple truth is, in the right environment, with the right encouragement, we may even grow a few of those ninjas and rock stars in our own organization, if we do it right. My call to you is to find someone, whether it's a young person or some sort of newbie to the industry, and become a mentor. Now, I'm being simplistic here in pointing to code.org. That is certainly not the only place, right? And we all have our problems with Scratch. But whoever it is, and whatever his or her skill level, whatever he or she may be willing to do, find that person and teach that person not just how to program and code, but inspire that person to contribute, to be part of something cool, to become part of a community of software engineers, to find some open source project that interests him or her, and to start figuring out what it takes to contribute. Now, this might not even necessarily be code, right? There are lots of ways to contribute to these communities: graphics, project management, processes, a variety of different things that are part of the overall engineering strategy. And lastly, I would say, to the extent that you're able: you're already spending a lot of your free time coding, aren't you? You're coders for your jobs, some of you, not everybody, but you're involved in technology in some way, shape, or form in your day-to-day job, and then you go home and you do something that interests you. As part of that free time, find an opportunity to invest yourself in a school. Now, it might not be a big thing. It might not be donating 72 Dell Latitudes to the school so they can have a better lab; you might not have the means to do that. But go find that high school teacher. Go find that junior high school administrator and say, I'm in the business, and let me tell you about open source. And not only am I going to try to indoctrinate you in open source, but let me come once a month and donate some of my time with your students, to help them understand what open source is, where it's going, why it matters, and how it will help those students become better at what they do. Whether they end up being full-time programmers at IBM or at Microsoft or at Apple, you don't know. But they'll start to understand the ideals and the values of open source. And they'll start to understand that life is a team sport. And they'll start to understand how to collaborate. And they'll start to recognize: hey, somebody cares about me, someone is invested in me, someone's encouraging me. That's what open source is about. This means something.
I think the world ahead of me is worth working toward. I'm gonna do better. I'm gonna get better grades. I'm gonna get more involved. You'd be surprised what kind of influence you can have in the schools. Now, if you've got the 72 servers you wanna donate, yeah, do that too, please, that's great; the schools need that. But the real love we have to share comes from sharing our time. And you may not care about those snot-faced students today, but if you spend a little time with them and start helping them understand what this is and where it's going, you'll start to see that, yeah, this is that future rock star. This is that future ninja that's gonna make a difference. So I leave that with you, hoping that what I've said today is meaningful and helpful, and I would love to hear what questions or criticisms or difficulties you have. Anyone? Doug's got the mic. Go get him, Doug. Questions?

I help a lot in the schools. Thank you! But I'm a little concerned about one thing. What I help with a lot is the robotics programs, which are huge in the schools these days and have gotten a lot of kids interested in STEM, but what I see is very little to no open source component to that. It's all proprietary software, it all runs on Windows and little microcontrollers, and there's no open source component. Have you experienced that, or do you know any way around it? I'm sympathetic. Now, the Lego folks, what's the name of that C compiler they use? NXC? Well... Yeah, NXC, that's pretty good, but it's not open source, right? But it's general enough that what they're learning in NXC can be meaningful. But I don't know, they're not using NXC to program it; what are they using, that other language? So the organization that runs the biggest program I know of is called VEX. Yeah, yeah, yeah. And I mean, it ticks a lot of the boxes for the kids, the collaboration part, the STEM part, but on the open source part, it's not even close. It's very closed source and proprietary, and I'm kind of disappointed with that, and I don't really know how to get around it. Because they have this huge community; it's almost like how everybody uses Facebook, you've got a whole network effect type thing going on, where there's a ton of people involved, and to try to make something separate from that that is open source is a big uphill climb, right? Because you're working against that network effect, where you've got a ton of people involved already. And my only answer to that, well, I don't really have an answer to that, right? Perhaps those companies will start to see the Microsofts and the Googles, and Google's a double-edged sword there as well, right? They love open source to a certain extent, and then they hate it, and they're writing all their own proprietary stuff. It's a balance. But Apple and Microsoft and IBM, right? IBM buys Red Hat; they're committed to open source. Some of the other companies will start to take notice, but that's not enough. We need to speak up, and contacting the manufacturer, contacting the producer, would be very instrumental. It might be a drip in the bucket, but if we get enough drips, we can do it. We've got a response to that, is that correct? Yeah, thank you, thank you. And I knew that unofficially, so having someone bolster my knowledge of that is very helpful, thank you.
This is a little more of a comment than anything else, but along the lines of turning programmers into contributors and your Challenge slide: Google has this thing called Google Summer of Code for university students, and Google Code-in for high school students, for open source organizations to have students contribute to their projects in a meaningful way. And they use GitHub, they do pull requests, they do things from design to project management to whatever. It helps out the organization and also helps get students interested in open source. So I would definitely recommend that if you are a member of an open source organization, please try to get your organization involved in this. And if you know young students, get them involved in it as well. Google Summer of Code is awesome, and I'm so happy for that. And I do encourage my university students to get involved in Google Summer of Code. There's also Outreachy, which I also give props to, right? There are a few approaches like this that pay money for students to work on open source projects during the summer. Great, thank you for that comment. Next.

Hi, I'm from Long Beach City College, and I'm part of the Cyber Security Club and the Women in Information Technology Club. We are trying to do more open source projects for high schoolers and middle schoolers, but I wanted to know how to get more projects started, and whether there are other places to go get information. We'll talk with you about that outside the doors here; we're glad to help you with that. All right, thank you. Swing the mic back over to him. Dr. Coleman's got one.

Raspberry Pi. So, one of the things I'm involved with is the Raspberry Pi Foundation, and they provide teacher training at no cost, but you have to kind of apply. I guess the nearest place would be in Irvine, so if you're a teacher, you might want to go online and check out the Raspberry Pi Foundation. Also, we have Raspberry Pi Jams, and I'm a member of the Raspberry Pi Club in Riverside. We work in collaboration with this place called Vocademy, which is a makerspace. So we have Pi Jams, and me and some of the other graybeards actually bring Raspberry Pi projects there, and we get middle schoolers, elementary school kids, and high school students interested in these different projects. I've done that and collaborated with Harvey Mudd College to do some of the same things at schools in Upland and Rialto. So if you're interested in that type of thing, get in contact with your local Linux users group in your community, because there might be somebody there who's a Raspberry Pi guy or gal. Or if you have questions along those lines, give me a call, because sometimes I'll bring projects to different schools and show kids, this is how you make this, this is how you make that. I mean, there are lots of options out there. You just have to reach out to people in your community, but I would always start with the Linux users groups in your neighborhood, so to speak. And those Pis are cheap, right?
I mean, if you want to make a big difference with a small amount of money: not every middle school or high school has a teacher who would welcome a gift of five Pis, but there are plenty that would just die to have that kind of donation, to help spur the interest and curiosity of the students, beyond a Facebook-consumer perspective into the creative contributor realm, which we really need. Next question.

So in this education process of trying to expose people to open source, do you think this is the right time in the process to let people know about misleading open source pitfalls? You know, like, oh great, everyone's using Linux, but we're all using Google Docs. Or is that too far down the line? What do you think? What do I think? I think you have to be a realist. Google Docs has changed the substrate wherein we write and collaborate. We certainly hold some resentment for the fact that they're not willing to keep that as an open source environment, but it is a collaborative world they're trying to promote, and I think they should be applauded. And then every now and then we just dig a little bit and say, come on, open source or not? Yeah, good question. And certainly worthy of criticism.

Hi, I'm volunteering once a week, every Friday, teaching programming to second grade kids, and I've been doing this for the last three years. But what I notice is that after they move on to third grade, they don't receive any more training or teaching. So I wonder if somebody has an idea of what the next step is, what I should do. This is in LAUSD, with second grade kids. So, I can't give you specifics about the Los Angeles school district, and there are thousands of districts, at least throughout California, but in my experience with school districts in a variety of different settings, pressure from outside helps, but what ends up helping more is finding someone inside the school district to champion it. So if this lower-grade educator is buying into this, but it's not getting upstream enough, give that lower-grade educator some tools and some help and some insights, so that he or she can start pressing: hey, the middle school needs something like this, and I'd be glad to help organize and implement it at the middle school or the high school level. You've got to get some of that rallying from within. Pressure from outside is always a double-edged sword. Going to the school board meetings and clamoring, oh, we've got to have this, usually makes the school board dig in their heels and say, you don't understand my problems, buddy. But if you get somebody inside who can own it, you can go a lot farther. Dr. Coleman's got a thought about that. Beyond the pressure, get the parents fired up, because parents want the best for their kids. Thank you. All right, and I think we've got to cut it off. Okay, thanks for coming. I really appreciate your attention and your attendance; enjoy the conference, thanks.

Hello, hello, hello, perfect, perfect. And Doug, do you have the thing to pass the slides? No, we'll just go from here. Looks good. Should we get going, Michelle? It's 12, right? Good morning, everybody. Thank you for coming. How many of you are from the healthcare industry? Good, good.
So today we're going to talk about libre software, free software, and GNU Health, specifically for the public health sector. Of course we can deploy these in private institutions too, but the idea is building large health networks in a country or a region. About GNU Health: it's pretty much a social project. That's the main idea of the project, to deliver health and education with free software. It has very cool technology behind it, but we cannot forget the social part, because there are many cool technology projects around that don't take the social part into consideration. So who's behind GNU Health? It's GNU Solidario, an NGO. This is a group picture from GNU Health Con in Gran Canaria last year. And again, the main idea is to deliver health and education with free software, and GNU Health is one of the projects we do. It is an official GNU package. One of our main requirements is that all the components must be libre software, because the moment we find one that is not free software, the whole chain goes down the drain, right? Just a month ago, we actually had to replace MongoDB with Postgres on the health information system, because they changed the license and there were a lot of issues around that license change. But luckily enough, we have other databases that work very well in that scenario too, so we were able to move to Postgres on the health information system. So, GNU Health is an ecosystem. It deals with different areas of work. We always start with the person in the community: domiciliary units, health institutions and so on. Then we go to the traditional patient-doctor relationship. We also take care of hospital management, stock management, finances, pharmacies, hospitalizations and so on. And at the end, we want to make sense of all the data we've been collecting through these transactional systems and bring it to the health authorities, so they can improve health promotion and disease prevention campaigns, among other things. This slide shows the main components, the main projects, within the GNU Health umbrella. The first one is the hospital information management system. We also have GNU Health Embedded: the same functionality we have on the server, we also have on the Raspberry Pi. There's also the LIMS, the Laboratory Information System. We have the Federation, which is what we'll be talking about today. A work in progress is the mobile application, and we also deal with bioinformatics, the molecular basis of disease. Very quickly, these are some snapshots of the functionality you will find in this ecosystem: from the very transactional system you have running hospitals and health centers around the globe, to reporting, imaging, labs, agendas, calendaring. And now let me talk specifically about dealing with large implementations. One of the issues we have today, and we'll talk about it now, is that most systems in the health arena are silos, meaning they don't share data with each other. So we tried to build something that works as a distributed network of autonomous nodes. That means every single node is independent, and you decide what type of information you want to share with the Federation. So if there is a network outage, nothing really happens, because you are still independent.
You will be able to run your health institution without the network. And this is done through three components. On the left side you have the nodes, which can be a hospital, a person with their application, a research institution, a laboratory or whatever. Then you have a message and authentication server, and finally a health information system that also deals with patient and person master data. The way we handle information on the health information system is a bit different from the way we handle it in a transactional system. It's semi-structured, okay? It's data meant for making sense of things; it's not a transactional system per se. We already have that on the LIMS and on the hospital management system. Now we want something that is easy to read by humans and by machines, and that is also easy to share among the different nodes within the Federation. So I came up with this concept of the book of life. The book of life is made of many pages of life for each person, and here you have different domains and categories. You have pages that represent a medical context or an event: medical, demographic, biographical and social. Within those, you have categories: under medical you have health condition, genetics, encounters and so on; and under social, again, this is a social project, we cover all the socioeconomic determinants of health and disease. Whenever you do something on the transactional system, whether it's your lab information system or your hospital information system, it will immediately create or update one of these pages of life. So you don't have to do double work. The main information collected on the transactional system creates this page of life, which then goes to the Federation and is shared, provided, of course, that you give authorization for that specific model to be shared on the Federation. It's a very simple way of dealing with complexity, because you need something that, once it's in this health information system, can be easily grabbed from different heterogeneous systems.
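To make the "page of life" idea concrete, here is an illustrative sketch of what one such semi-structured page might look like, rendered as a plain Python dictionary. Every field name and value is a hypothetical stand-in for illustration only, not the actual GNU Health schema:

```python
# Illustrative "page of life" document; all field names here are
# hypothetical stand-ins, not the real GNU Health data model.
page_of_life = {
    "page_id": "a3f9e2d1",           # unique id for this page (hypothetical)
    "person_id": "FED-ES-0001234",   # person's federation account (hypothetical)
    "domain": "medical",             # medical / demographic / biographical / social
    "category": "health_condition",  # e.g. health_condition, genetics, encounter
    "summary": "Essential hypertension recorded during encounter",
    "author": "hospital-a/dr-perez", # node and professional who wrote it (hypothetical)
    "date": "2019-03-09",
    "fed_share": True,               # the node decides whether this page is shared
}

# Whatever gets logged in the transactional system (HIS or LIMS) would
# create or update a page like this automatically; no double work.
```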
So we take a transdisciplinary approach to health. Now we have many people, doctors, nurses, psychologists, social workers, patients, community members, feeding the Federation and feeding this health information system. If we don't do that, we end up with a reductionist system: more a system of disease than a system of health. We need a holistic picture of who we have in front of us, not just the medical records; a medical record is, most of the time, the consequence of something we have been doing wrong that made somebody sick. So let me give you an example dealing with genetics and the molecular basis of disease, in this case cancer. What you're seeing here are different natural variants of the BRCA1 gene, which is involved in breast cancer. Here we have about 70 different natural variants: at different positions, amino acids change, and that produces a conformational change within the protein, making it non-functional and making that person more susceptible to cancer. Now, what we can do with the Health Federation is contextualize all that information. It's easy when we know the clinical implications of a specific natural variant: if we have that list, we know, well, this one is benign or this one is pathologic, and then we can do something about it. But what happens most of the time is that we have natural variants of unknown clinical significance, and that's where the problem comes. You send somebody to get genotyped, and a mutation comes back on that specific gene, but hey, we don't know whether that variant is pathological or not. One way to tackle this is a federated network: with information coming from diverse nodes across the country, or even across the world, we can correlate genotype and phenotype and find a direct link for a specific mutation that we didn't know before was pathologic or not. The GNU Health Federation is a great tool for this type of thing. Big data, a lot of information coming from different sources, and not only the molecular basis: there is a lot in epigenetics, lifestyle, nutrition, and other things we should take into account to get this global picture of a specific mutation. The mutation gets contextualized to the person. So if you look at this slide, not only do we have the gene, the locus, and the protein code, and can go and see the OMIM information and the family history, we also have the lifestyle of the person, okay? Putting all this together provides good information to tackle that specific mutation within the context of that person. So, as you see here, each of these nodes, each of these participants, people, health institutions, research institutions, social workers, puts in information contextualized to this person. We also need a way for applications other than GNU Health to retrieve from and feed the GNU Health Federation. That's why we provide Thalamus, a Flask-based RESTful API that lets you send, retrieve, and update information, no matter which system you have on the other side of the federation.
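Since Thalamus is described as a Flask-based RESTful API, talking to it from any node reduces to ordinary HTTP requests. Here is a minimal sketch with Python's requests library; the server URL, route, and credentials are assumptions for illustration, not the documented Thalamus endpoints:

```python
# Minimal sketch of a node pushing and pulling pages of life through a
# Thalamus-style REST API. URL, route, and credentials are hypothetical.
import requests

THALAMUS = "https://federation.example.org"    # hypothetical message server
session = requests.Session()
session.auth = ("node-hospital-a", "secret")   # hypothetical node credentials

page = {
    "domain": "medical",
    "category": "encounter",
    "summary": "Follow-up visit, blood pressure controlled",
}

# Push one page of life for a person up to the federation (hypothetical route)
resp = session.post(f"{THALAMUS}/people/FED-ES-0001234/book", json=page)
resp.raise_for_status()

# Any other authorized node, running any software, can read it back
book = session.get(f"{THALAMUS}/people/FED-ES-0001234/book").json()
print(len(book), "pages of life on record")
```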
And this is what I was talking about before: most health systems today are silos. You log information in, and outside that system the information has no value whatsoever. In many places, and we're here in the States, but I don't think it differs much from Europe: if you go today to health center A because you had a knee issue, and tomorrow you go to health center B, you will have to provide all the information again, because health center A doesn't talk to health center B. They are silos. And I know this not only as a person; I'm also a health professional, a physician, and if somebody from Barcelona comes to the Canary Islands, I have to start her or his health history from zero. Private institutions don't talk to public institutions, and vice versa. The GNU Health Federation will actually break these silos. So, right down here, that's the Canary Islands. When I was asked to do this, I said, hey guys, make sure we're still on the map, because we're far away from Spain in every single sense of the word. So why don't we do this? Why don't we build a public health system, first of all, with free software? The Free Software Foundation Europe came up with this beautiful saying, public money, public code, and it applies not only to education but also to health. The benefits are not only being able to read the code and empowering the health institutions to make that program their own; there are also a lot of savings on licenses that can go to benefit the public health system. Also, as a physician in the public health system, you have pretty much seven minutes per patient. If I already have the history of the person, I'm saving a lot of time, and I can put that time into actually improving the encounter and the care of the person in front of me. We are becoming oversophisticated and getting rid of the human factor, and that's something we really have to face and improve in the healthcare arena. So with this proposal, with this model, no matter where you go within Spain, the information will be updated automatically, and if you go from health center A to health center B, your information will be fresh, not obsolete as happens in the other cases. Most cases, by the way. So this is what we propose to our government officials: move away from proprietary software in healthcare and in public administration in general. We already have a solution; it's just a matter of having the will to move away from proprietary software. And if we do it at a national level, why not do it worldwide? If I'm here in the States and I break my knee, God forbid, I will have to give all my clinical records again to the US health professional who sees me. If we have this in place, and we have a unique ID per person in the whole world, which is perfectly achievable technically, that wouldn't happen. The doctor would just ask for my ID or take my card, and all my clinical information would be there, provided, of course, that I give permission for her or him to read it. It might look a bit utopian, but this is something that can perfectly be done. I think the unique person identifier is something we really need worldwide. Now, why choose openSUSE, or why do we work with openSUSE on this federated model? Well, first of all, the people from GNU Health and GNU Solidario are also part of the openSUSE community. The packages are already there, both on Leap and Tumbleweed, and it's well documented and well supported. So you can install GNU Health on Tumbleweed, the rolling release, which is the one I usually run; for production you move to Leap, which gives you more stability; and for large implementations you can switch that Leap installation to SUSE Linux Enterprise, which provides professional support. Again, I get a lot of calls from people who installed GNU Health in hospitals, and this is not a word processor. This involves a lot of things, from performance to security to privacy, and we're dealing with people here. This is serious. You should have a technical department that knows about databases, operating systems, security, and encryption, to make sure the implementation runs smoothly. We do the best we can to provide something that is reasonably easy to install, but if you don't have a grounding in databases, you shouldn't be installing GNU Health. Get somebody who knows how to run databases, how to make backups, how to set up standby databases and replication, so that if something happens the information is safe. If you have those people, perfect; otherwise, ask the people who know to get GNU Health installed and running for you. GNU Solidario does not provide implementations, because we are an NGO; we focus on the community and, of course, on the development of the software. We work hard with local people. We have Dr. Roof here from Jamaica.
They have their own team at the Ministry of Health. That's what makes a project sustainable. You cannot go there, do the implementation and leave, because that will not work unless you build capacity locally. And openSUSE has been showing commitment to the GNU Health project, not only by providing support and packages, but also by being sponsors and donating; the Raspberry Pis were donated by openSUSE. I'm going to Haiti now, and they already have these Raspberry Pi computers, which will be placed in a local hospital as part of the GNU Health implementation. So we are very grateful to the openSUSE community for all the commitment it has shown over the last years, and that's why we are actually here; they invited us to give a talk on GNU Health and openSUSE. We also have, talking about openSUSE, this federation community hub that allows the community to log into a public server and play around with the hospital information system, the laboratory information system, the federation and so on. Not only developers, but also health practitioners and research institutions. This server resets itself every night, so no matter what you put there, don't worry; if you break it, in a couple of hours it will be up again with some demo data. It's more to get a feeling for what the GNU Health ecosystem is about. And finally, let me give a glimpse of real implementation projects. We go from small clinics in the rainforest in Africa, in this case Cameroon, where small hospitals cover a specific area of the rainforest, to very large implementations. AIIMS is the largest hospital in Asia. It handles over one million evaluations per year; they adopted GNU Health a year ago, and the implementation has been done by them. They have a huge IT staff department and they are autonomous, and that's what we need. We cannot be behind everywhere. I mean, in the case of Africa, sorry, in the case of Cameroon, yes, because they don't have an IT department, so it's part of our role to provide them support. But these guys have huge IT: over 50 people working just in the IT department, so they installed it themselves. I just went there for a week to give an overview and a bit of training, and from there they took over. We work with the World Health Organization on different projects. This is the Bafia district hospital, and we also adopt many WHO codings: ICD-10, for example, immunizations, pediatric growth charts and so on. And that was one of the training sessions we gave to the people from WHO in Western Africa. Jamaica is one of the pioneer projects; I think we started back in 2013, Dr. Roof, yeah. It's now six years that they've been using GNU Health, and again, a local team: they decided to build their own team to run the project, or otherwise it won't work. I mean, these are huge public health projects; first of all, you have to be from Jamaica to understand the context and deal with your people and what they need, and second, at GNU Solidario we don't have the resources to staff every single implementation. Red Cross: this is Mexico, with over 300,000 evaluations a year. Laos public health also: two implementations, one at Mahosot, which is the largest hospital, and the other at the CMR, for rehab. And finally, we have this alliance for academic institutions that want to collaborate with our project.
And within this context, we have people from the University of Montreal, for example. Remember this project that we were talking about in Cameroon? They are taking it over, and people from Argentina will actually go to pick it up and collaborate with them. And that's wonderful, because that's what we need: we need a large community that tackles the challenges in different areas of the world. And I always thought that academia is probably the best way of finding people that are enthusiastic and have the time and the resources to actually do these types of jobs. So, just to name Mr. Stallman: this was the very, very first project that we did as GNU Solidario, and this is over 10 years now, 13 years actually. It started in Argentina, in Santiago del Estero, at a school. These kids are now full-grown ladies and gentlemen, and it's great to see that that seed we planted someday there is actually making sense today. And that makes me very proud of this little but beautiful project that we have. And I'm also very proud and happy to be able to share it with you guys today. And hopefully some of you will join us in the quest of making sure that health remains a non-negotiable human right. Thank you very much. Thank you. Questions? Anybody? So the interconnect probably uses HL7 messages? There is a project, we didn't go through every single sub-project, but there is an HL7 messaging server that connects with GNU Health in specific areas, for example the lab and the encounters. It's not all of them, but HL7 FHIR is the one that we are actually implementing. Yeah, HL7 FHIR. Yeah. Oh, sorry. Is there a small guide to setting up GNU Health for a small clinic in the United States? We don't have specific localization projects. We have the documentation on Wikibooks, but we don't yet have a guide for localizing GNU Health for specific countries. We have members; actually, the guy that is doing the HL7 FHIR work is from the United States, and he knows all the specifics well. The States is hard: there are a lot of things that you must comply with in order to have it, but he knows all the requirements well, and we can actually put you guys in contact with Chris. I'm sorry, can you repeat his name? Chris, Chris Simonman. He's also working on DICOM, not only on HL7, but also on DICOM, and he's also becoming a doctor very soon, so he pretty much has everything. Oops, sorry. I came in a little bit late, sorry, but just real quick on some of the obvious questions, I think, for use in the US, for HIPAA compliance, quote unquote, and for the EU, for GDPR compliance: is there encryption built in, and how far along is that? My wife works for a large health system here in the US and they're using proprietary software; it drives me crazy, the costs. And you did mention some of the orgs using it, and I'm sorry if I missed it, but are there healthcare orgs in the US, you mentioned the Red Cross, or elsewhere in the US, using it or trying it out? And how widespread is it? You're proposing it in Spain, you were saying, right? How far along is that proposal being considered? Thank you. Well, thanks for the question. GNU Health is pretty much a framework, okay? So we provide the framework to be able to deploy into a health institution or into a network of health institutions.
We provide encryption, we provide digital signatures, we have the packages to connect with the GnuPG libraries, and all that is already there, and it has a very fine-grained way of dealing with security and access control. But again, I'm not the right person; I just don't know well what all the requirements are. I know there are a lot, okay, for every single country. I know of a cooperative, and this is from a couple of years ago, that is using GNU Health within the United States. Now, when you move into the public health arena or whatever, I just don't know. We have some specific modules, for example, for Argentina and other countries. Whenever you go to a country, say Jamaica, Laos, Argentina, Spain or whatever, you do need to have your own localization in terms of reporting, in terms of security, in terms of privacy, but that goes more into that specific context and having somebody actually adapt GNU Health for that country. Yeah. Any other questions? I'm a physician as well and I've dealt with several EHRs here, and yes, unfortunately they're all proprietary for the most part, except for the Veterans Administration's VistA. That's an open-source project that's been around since '95, I believe. Wondering if you've ever reached out to them to try to interface with them; that might be a good place to start. Yeah. Yeah, we know VistA and it's been doing very well in the States. From the technical point of view, they are completely different projects. The type of database that VistA uses makes it a bit hard to actually interconnect. Now with this HL7 and also the Federation RESTful APIs, it's easy to send messages across heterogeneous systems. To be honest, yeah, I kind of checked it a while ago, but it was a completely different infrastructure from the technological point of view: we use Python and PostgreSQL, and they use MUMPS and other types of databases, which kind of made it incompatible. Now, again, with the RESTful APIs, it makes things easier. Yeah, Michele. Yeah, thanks, Luis. Just a couple of things. So as you mentioned, we have done some implementation in Jamaica. The security aspect, somebody had a concern about that: what we find works best is first defining the security and privacy requirements and using that as part of what's required to build out the system. We're still in a transition mode, as you know, working with the government and various other agencies. The other thing, though, is the Federation; I find it very exciting, the use of HL7 FHIR and so on. But if we look at it even on a national scale, say a country like Jamaica: 23 public hospitals, about 11 or 15 or so private hospitals, and everybody will be using their own software in the private sector. The public sector is under the one authority, the Ministry of Health, so we all have to use the same thing. So it's just interesting how to handle the data standards, et cetera, to arrive at that Federation, coming from different nodes, and how that might work out. Right, so in terms of privacy and security, that's a great comment, and that's what I was saying before. We shouldn't rush to implement the software, because this type of software requires a lot of design in terms of human resources. Who can do what? What can each health professional do? What can the patient do? And you do that before actually implementing the software. You do that on paper. And once you have that on paper, it's easy to map it into software.
Otherwise, you are compromising the security and the privacy of the data that you have in your health implementation. But people usually tend to rush and just install it, and then the problem is they have to go all the way back, do a rollback, and waste a lot of time. So yes, designing is key, not only for security but also in terms of scalability: what type of software, what type of infrastructure am I going to put in, so we can actually scale for the next five years or whatever. On the messaging: we have defined, for each of the resources and endpoints, the type of message that you will send with each method. So if I make a PUT or if I make a PATCH request, you need to know what the specifications for that message are, how it will go. And that's how the federation works. Now, if you use the HL7 FHIR server, then the coding specification is already there. There are always pros and cons, you know; the way we do it provides you maximum flexibility. So the people making the mobile application for GNU Health know: okay, if I want this node to be part of the federation, how am I going to send the message for a new encounter or a new domiciliary unit or whatever I want to do, or a lab test, you know? But it's clearly documented in Thalamus. So it's just a matter of going there and saying, okay, this is the way I need to send the message. It's JSON based, by the way, all the messages that we send and receive, and we love that standard. It's very nice and quite universal too. But it's documented in Thalamus. Any other question, please? I noticed the previous slide mentioned the lab information system. Do you guys have a lab information system or are you interfacing to an external lab? We have, I don't know what I have in the slide here, no, I don't think so. We do have a LIMS. It's actually called Occhiolino, okay? And it's a lab information system. Actually, in Ghana they are using it today, specifically as a lab information system. So it interfaces with the hardware, with the analyzers. But all the processes within the lab are already there, okay? So samples and everything are already there, and you can print out the results. So it's a LIMS by itself, yes. Independent, yeah. Any other questions? Thank you. Yeah, thank you. Thank you guys.

Test, test, test, okay. Is the microphone working? Yes? Yeah. Okay, thank you. Okay, maybe better now. Okay, no, I know where to place it, thanks. I'll be showing some code snippets, so if you don't have good eyes, like me, then I would recommend sitting in the front. No, don't need to, thanks. Okay, so it's 1 p.m. Let's start. Welcome to my talk about buffer overflows and mitigations. First let's start with the obligatory who-am-I slide. My name is Johannes Segitz. I'm a security engineer at SUSE and I'm based in Nuremberg, Germany. It's my first time in LA, and finally I see some sun, because I made my wife jealous by telling her, hey, I fly to LA, and then it was raining all the time, which was kind of annoying. Okay, my main daily tasks are code review and product pen testing, and that's the reason why I'm interested in those mitigations: they make my life easier or harder, depending on which side I'm currently on. I joined SUSE in April 2014. That was a great time to join SUSE, because that was the exact day Heartbleed broke. So joining a security team was really the perfect timing there, no panic at all. Okay, this is the outline of what we are covering today.
We will talk about buffer overflows and how to protect against them. We will talk about stack canaries, FORTIFY_SOURCE, ASLR and no-execute memory. These are all used by SUSE products. There are other protection mechanisms that we are using, and of course these are also used by other distributions. This talk will require some C and assembly background. I will explain most of it on the fly. If you space out at a certain point of the talk, don't worry, we will cover those four areas and you will be able to resume, hopefully; at least that's my hope. And please stop me if I'm going too fast with the examples. We will probably need almost all of this hour, so if it's going too fast, please tell me so. We will have to keep going a little bit faster than usual, because usually I do this talk in one and a half hours. This is also of course just a short overview, so you're not going to be elite hackers after listening to this, but that is usually the case: if you go to a talk, you are never an expert after hearing about a topic for one hour. And I will also try to keep it at least a little bit interactive. That depends on how much time we have, and it also depends on you. So if I ask questions and no one answers, I will stop with that. So if you don't want me asking questions, then just stay quiet. Okay, so we are talking here about stack-based buffer overflows. There are other kinds of buffer overflows. The stack is a certain memory region in a process that is used to handle data. This is usually a problem in languages where you manage your own memory, so especially C. If you have a language like Java or Python, where memory is managed by the runtime, then this is usually not a problem. And if it is a problem, then it's a bug in the runtime or virtual machine, which can be fixed in this one component, and then everyone is safe. On the other hand, with C we will probably never run out of buffer overflows, because managing memory is hard. So this is a really simple example here. We have a small program. It has a statically sized buffer of 20 characters, so usually 20 bytes. And then we copy the first argument given to this program into this buffer without any size check. So this will work until we give it more than 20 characters on the command line, and then it will start crashing. So this is really the basic principle of what we will see here. The general mechanism is that you have a given buffer size and you put too much data into this given buffer. Usually the problem is that you are just missing a size check. If you think about the example that I just showed, there was no check at all for the size, so whatever size the argument is, it will just be copied in there. Sometimes you have a check, but it's either faulty or it can be circumvented, and circumvention usually involves some form of integer overflow. So you might have a multiplication in the check, the multiplication might overflow, you don't check what you think you're checking, and the attacker can still use the buffer overflow. So why is this a problem? I mean, you could think, well, we overflow some data, why do I care? This is a problem because in our architectures we mix the data of our applications and control information about execution on the stack. So here's a really simple example. We have two functions: a function A and a function B, and function A calls into function B. And so this data structure called the stack currently holds the local variables of function A. So we are here before we invoke function B.
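To pin down that first example before continuing with the call into function B: here is a minimal C sketch of the kind of program described, my reconstruction rather than the speaker's exact demo code:

    #include <string.h>

    int main(int argc, char *argv[]) {
        char buffer[20];              /* statically sized buffer, 20 bytes */
        if (argc > 1)
            strcpy(buffer, argv[1]);  /* no size check: an argument longer
                                         than the buffer overflows it */
        return 0;
    }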
And now if we call into function B, we first place the parameters we want to have for function B on the stack, and then we also place the return address, where we want to continue execution after function B, on the stack. So if a buffer overflow happens, you can overwrite this control information. And if you overwrite the control information, you can execute arbitrary code. But only if you do it carefully; if you just do it blindly, you will just crash, because you overwrite this control information and the program doesn't know what to do with that. So this is a little graphical representation. If you do it carefully, by first adding a NOP sled (NOPs are instructions that don't do anything, short for "no operation"), then having your exploit code, and taking care that you overwrite the return address that is stored on the stack, then you might jump back into your NOP sled, fall through to the exploit code, and then run your exploit. The return address is not the only thing you can overwrite. RIP and EIP are registers in CPUs that store the address of the currently executed instruction. And you're not only able to overwrite this; you can also overwrite, for example, function pointers. So if you use a function pointer in your code and I'm able to overwrite it, the next time you invoke this function pointer, I'm in control. You can overwrite exception handlers: an exception handler also has an address stored somewhere on the stack. If I'm able to overwrite the exception handler and an exception is triggered, once again, I'm in control. Sometimes you don't even have to overwrite this control information; sometimes you can just overwrite some application-specific data. Maybe there's a flag that indicates if you're a privileged user or something like that. If you're able to write whatever represents true into that field, then you might already have won. So what can we do against these problems? The simple solution would be: just use Java for everything and we're done. But based on the reactions, I think you also know that that's not really going to happen. First of all, we have a lot of legacy code that we will continue to run, so we can't just port everything to Java. Also, I don't like Java. And the next thing is that for some applications you just need low-level access, and that usually requires you to have some ability to work with memory directly. So let's dive into a simple 32-bit exploitation example. We basically have a main function that calls a function that is helpfully called "vulnerable". It has a statically sized buffer and it reads too much data into this buffer. And we will now exploit that. So that is why I mentioned: maybe come to the front if you have bad eyes. I'm not sure how readable this is for you. It's readable for me, but I'm standing here. So if you can't read it, just come a little bit closer. I mentioned that we have some exploit code, or shell code, that allows us to run commands. In this case, we have a shell code that adds a new root user to the system. You can either craft that by hand or you can use a tool like msfvenom. This is a tool shipped with Metasploit that helps you generate shell code. So this nifty thing here is the shell code; that is what we actually want to run. Then we generate a NOP sled, so those are the instructions that don't do anything.
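Schematically, the exploit buffer being assembled here looks something like the following sketch; the sizes, the address and the helper names are illustrative, not taken from the speaker's script:

    #include <string.h>

    /* shellcode[] stands for the msfvenom output described above. */
    extern unsigned char shellcode[];
    extern unsigned int  shellcode_len;

    void build_payload(unsigned char payload[128]) {
        memset(payload, 0x90, 64);               /* NOP sled: 0x90 is the x86 NOP */
        memcpy(payload + 64, shellcode,
               shellcode_len);                   /* the actual exploit code       */
        unsigned int ret = 0xbffff3a0;           /* guessed static stack address  */
        memcpy(payload + 124, &ret, 4);          /* lands on the saved return
                                                    address, pointing back into
                                                    the NOP sled                  */
    }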
It's usually helpful, if you have more space than you need for your shell code, to prepend a NOP sled before the shell code, because then you don't have to aim very carefully; you just need to land somewhere in there, and the NOP sled will not do anything, so you fall through to your exploit and then it runs. The final thing here is that with this line we set a static address where we think our code will be and where we want to jump. And this only works without any mitigations. And then in the last line we just put everything together and run it. So we jump to the next window. Currently we don't have a new root account on the system. And now we will run this in GDB. GDB is a debugger on Linux. I use a little extension called PEDA that's helpful for exploit development. You will see a lot of colors right now, so let me cover that briefly, because we will see it a few times. Up here. Oh, okay. That is not great, because my screen is apparently at a different resolution. Okay, I'm sorry for that. The problem is that I have my screen here set for the exact same resolution as the projector, but for some reason you see something different. You should see a little bit more, just two characters, but we'll have to deal with that for now. So here we have the registers that are currently used in the processor. Registers are small variables in the CPU; they can be accessed very quickly and they are used for various calculations. Down here we see the current assembly instructions, so where we are currently executing, and this is the command that will be executed next. Here we have the stack, the structure we want to overflow. And if we now continue with that: every time I press a key, we will step one step further down here, and potentially something changes here, which will be indicated in red. We step through, and at this point you now see that we are at the read instruction. If you remember the code that I showed you before, at the read instruction we will cause our buffer overflow, because we read in too much data. So if you look at the stack, you don't have to remember what's exactly here, but if we step over the read instruction, in the next step you see that a lot changed, and you also see those 90 90 90 hex values; 0x90 is the NOP opcode, so this is our NOP sled, where we actually want to land. Stepping a little bit further, we are finally at the ret instruction, which returns control to the previous function, and the ret instruction will take the topmost stack value and start execution at that place. And PEDA has this helpful feature that shows me directly where it's pointing to, and it's pointing directly into our NOP sled. So if I step one step further, we are right here in our NOP sled, and if I continue execution, then we will see that we now have a second root user on the system, and we're also able to log in here and have root privileges. Okay, so that was the good old days. Exploitation was really easy, or the bad old days, depending on which side you're on. The first mitigation we will talk about are stack canaries, so you see a little canary here. The name comes from the canaries that were used in mines to alert miners to the buildup of deadly gases, and stack canaries basically do the same for software: they alert you to stack overflows.
The general idea is that the compiler generates some extra code that puts a so-called canary value at a certain location in the stack frame, and before we return from the function, so before this ret instruction that we just saw, we can check if this canary value is still untouched, and if it's not untouched, we can bail out. We have three different canary types. There are so-called terminator canaries: they consist of a null byte, a carriage return, a line feed and minus one, and the reason behind that is that you're usually not able to write those bytes, because they are not really friendly to string handling functions. Then there are random canaries: you just generate a random number and write it there, and the rationale is that the attacker doesn't know the random value, so he can't write it. And then there are so-called random XOR canaries: you use a random value and XOR it with certain data on the stack, which gives you a little bit more entropy. We have not three, as it stands here, but four different variants in GCC; that's the joy of changing your slides right before the talk. We have -fstack-protector, which will add this code only for functions that have buffers of eight or more bytes on the stack. Then there's -fstack-protector-strong: after a while people noticed that -fstack-protector and its heuristic were not good enough, so there are several other criteria that cause this checking code to be added. I will not go through all of these to save a little bit of time, but in general, if you have for example an array, you will get this checking code. Then there's -fstack-protector-all; it will just add the code everywhere, which is not very great for performance. And then there's -fstack-protector-explicit, where you can annotate functions where you think you want to have it, and then GCC will add the code. So, just a short reminder of the example code: we have our statically sized buffer and we put too much data into the buffer, or we are able to put too much data into the buffer. This is the assembly listing of the original function. If you don't read assembly, it doesn't really matter, because this is the protected code, and now I will show you what changed between those two versions. The first thing that changed is this snippet: here we take the canary value, the known-to-be-good canary value, and place it into a register, and then we take this register value and write it onto the stack. So here we place our canary at the point where we want to check it later. And then here, after we executed whatever we wanted to do in the function, we get our canary value back from the stack, and then we XOR it with the known-to-be-good value. XOR has the interesting property that if you XOR the same two values together, you get zero. So in the next line we can check if we have zero. If we have zero, we jump to our normal instructions to leave the function; but if we don't have zero, then the stack canary was touched, and then we call this function called __stack_chk_fail, which is a pretty descriptive name, that then bails out. So, another demo. Hopefully you will see what I want to show you. This is a binary compiled with the stack canary; it's exactly the code that I just showed you before. If we run it and give it 12 A's as an argument, then everything is fine, because we have 20 bytes of buffer. We do the same thing with 32 bytes, then it fails and we get this helpful message that stack smashing was detected. We can again look at this in the assembly.
So if we run it here, we are right now at the place where we take the stack canary from this predefined memory location and move it into the register; it will be placed into RCX. This is unfortunately not visible for you, but you will see it soon. So this is the stack canary that we have for this execution, and if we now continue with our execution, we are at the place where we check the stack canary, so in the next step we will take it from the stack back into RCX. So right now we have the stack canary that was on the stack back in RCX, and if you noticed, the stack canary we had before did not end in 41, and 41 is the hex code for a capital A, so apparently we overwrote part of the stack canary. In the next step we will now XOR the current stack canary with the known-to-be-good stack canary, and we should receive zero in ECX, which we don't, because they are not the same. And now in the next step we will not take the jump, because we didn't get zero as a result, and we will call the __stack_chk_fail function, and if we continue, then we would actually see it, if the screen is big enough up there. So this is the interaction I was talking about. There are several limitations. Do you have any idea what the limitations of this technique might be? Yeah, performance definitely is a limitation. Okay, so I will keep asking you, because you answered; you can thank those two. Yeah, yep. Yes, you might be able to predict it, but in practice that is rather hard, because if you have a decent random source, then you should not be able to do that. It depends a little bit on your architecture, but especially on a 64-bit architecture it's going to be really rough to predict 64 bits. But of course it depends; if you have a weak source of randomness, that might happen, and that actually happened with some implementations. So, my list: it does not protect data before the canary. You write the canary at a certain point in the stack, and if you have local variables, for example a function pointer, in there, you might still be able to overwrite them. To counter this, some implementations try to reorder the variables to ensure that sensitive stuff like function pointers is also protected by the stack canary. It does not protect against a generic write primitive: with some exploits, or with some vulnerabilities, you are able to specify "write me that value at this location", and then of course you just jump over the stack canary and write whatever you want to the place where you want it. You can circumvent it with exception handlers: if you're able to overwrite the exception handler address, and you're able to cause an exception before we return, so before we get to this checking code, then you're still good. You can chain a buffer overflow with an information leak: if you're able to get the value of the canary, then of course you can write it. And there's also no protection for inlined functions: if you have higher optimization settings, the compiler might start inlining stuff, and since this is all tied to the function prologue and epilogue, if you don't have those because the code is inlined, then you won't have these checks. And of course you can still use it to cause denial of service. Denial of service is definitely better than remote code execution, but still, if you run a public DNS server or whatever, you probably don't like denial of service either. Okay, so that was the first mitigation, three to go.
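To summarize the mechanics before moving on, here is roughly what the stack-protector code from the assembly walk amounts to, written as C-like pseudocode. __stack_chk_guard and __stack_chk_fail are the real glibc symbols, but GCC emits this inline in the prologue and epilogue, and places the canary between the buffer and the saved return address rather than as an ordinary local:

    #include <string.h>

    extern unsigned long __stack_chk_guard;  /* the known-to-be-good value */
    extern void __stack_chk_fail(void);      /* prints "stack smashing
                                                detected" and aborts       */

    void vulnerable(const char *arg) {
        unsigned long canary = __stack_chk_guard;  /* prologue: place canary */
        char buffer[20];
        strcpy(buffer, arg);                       /* may overflow           */
        if (canary != __stack_chk_guard)           /* epilogue: XOR-compare
                                                      in the real assembly   */
            __stack_chk_fail();                    /* bail out               */
    }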
FORTIFY_SOURCE is a pretty nice one, because it transparently fixes so-called insecure functions to prevent buffer overflows; for example memcpy, memset and strcpy get additional checks. I have "insecure" in italics here because of this little tweet: we tell people not to use strcpy, and then they do stuff like that. So strcpy in itself is not insecure if you know what you're doing, but usually you tell people not to use functions like that, because otherwise they can just forget those checks, or forget how to use the functions correctly. But yeah, well. This only works for statically sized buffers, or at least, maybe for even more situations, but the compiler needs to understand what you're doing. So if the compiler sees you have a statically sized buffer and you're using strcpy, it can add an additional check, but you need the compiler to understand what you're actually doing there. If you do some fancy buffer magic, the compiler will probably not understand it and will not add the checks. You can enable it with -D_FORTIFY_SOURCE=2, and the interesting thing here is that you need to enable optimization, because this FORTIFY_SOURCE magic works somewhere in the optimization stage, and if you don't enable optimization, you will not get the benefit. We have a different example here; it's based on Matthias Gerstner's work, a colleague of mine who uses it for our trainee trainings. Basically we have the same structure: we call into a function, it has a statically sized buffer, and we copy too much data into that buffer with strcpy. And we have that little snippet here to prevent GCC from removing our buffer, because since we don't use the buffer, the compiler would say, well, nice buffer you have here, you don't seem to use it, so I will just kill everything here. So with that, we keep the buffer. This is the corresponding assembly, and if we now go to the assembly with FORTIFY_SOURCE, the first thing you might notice is that it's become shorter. The reason is that we enabled optimization, and therefore the compiler worked a little bit harder to generate better code, and because of that we have fewer lines. And we have two interesting lines here. First, the compiler noticed the size of our buffer and places it into EDX, and it doesn't call strcpy but __strcpy_chk, so it replaced the call to strcpy with a call to a different function, and with that it can then check if we move too much data into the buffer. So, another demo. We are again in GDB. I will first run the code with an input that is too small to cause a buffer overflow. So once again, we are at the beginning of the program, we have the hex 100 where the compiler noted the size of our buffer, and we have our strcpy function with the checking. If we now step through, we reach the part where we call __strcpy_chk, and if we step over that, nothing happens, because the copy just worked, and if we continue, the program exits normally. If we do that again and provide an argument that consists of 700 A's, we will now overflow the buffer. We are again at the beginning, we start to step through, and if we try to step over the __strcpy_chk function, we will not come to the next line, because the function realizes that too much data is being placed into the buffer and it terminates. So, empty slide again: what are the limitations of that? In general, it's just limited to some functions and situations. If the compiler doesn't understand what you're doing, it can't help out.
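As a sketch of the rewrite just described, assuming the 0x100-byte buffer from the demo; __strcpy_chk is the glibc helper named above and __builtin_object_size is the GCC builtin that recovers the buffer size, though in reality the compiler performs this substitution itself rather than expecting you to write it:

    #include <stddef.h>

    /* glibc-internal helper; aborts if src (with its terminator) does
       not fit into destlen bytes. */
    extern char *__strcpy_chk(char *dest, const char *src, size_t destlen);

    void copy(const char *src) {
        char buf[256];   /* 0x100 bytes, as in the demo */
        /* Source as written: strcpy(buf, src);
           With -O2 -D_FORTIFY_SOURCE=2 the compiler emits, roughly: */
        __strcpy_chk(buf, src, __builtin_object_size(buf, 1));
    }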
Back to the limitations: it can still lead to denial of service, same problem as before. You don't get remote code execution, but your service still dies. One other problem is that developers might just keep using these functions, thinking, well, writing all those pesky checks is rather annoying, I just won't do it, the compiler does it for me. So if you're the security guy in a company, you should probably still ensure that people don't use those functions. But it comes at almost no cost, so you should enable it. We have it enabled in our build system, so every package that is built is automatically protected with FORTIFY_SOURCE, and as far as I know we did not really have problems with that. And the performance cost is rather negligible, because you need those checks anyway; if you don't have them, it's not a performance benefit, you are just insecure. So, the question was why we need to have optimization turned on for this to work. The reason is that FORTIFY_SOURCE works in the stages where optimization also works, so if you disable optimization, this whole engine just isn't triggered. I can't give you the detail of why GCC does it that way, just this. Yeah, that is definitely a limitation; I should add that to my slides, thanks. Okay, okay. So then let's go to ASLR. ASLR stands for address space layout randomization, and this is a technique whereby certain memory segments, like the stack, the heap or the code segment, are loaded at random locations. If you think about it, computers are deterministic, at least the ones that we use, and unless you have some sort of randomness in there, you will have the same memory locations every time you start a program. So that's the way it used to be. That is pretty nice if you do debugging and stuff like that, but it's not great if you want to prevent security problems, because if you have randomized locations, then attackers don't know the return addresses for their exploit code, or for the C library, for example, to jump to. If you remember the first example that I showed: there I hard-coded the address where my code would be and then jumped there. But if my stack is moved randomly in memory, then I can't do that anymore, because I don't know where my buffer will reside later on. So if you cat /proc/<pid>/maps for a given process, you see at which places certain memory segments are; for example, for this process the stack is at this location, the heap is at this location. And if you now do that five times and just grep for the stack segment, you will notice that these numbers change. So this is run on a system that has ASLR enabled: the stack is moved to a different location every time this code is run. You can check what setting your system has: if you check this file (on Linux, /proc/sys/kernel/randomize_va_space), it can have three settings. You can either have zero for no randomization, or one or two, which randomize various parts. If you have a recent distribution, you should definitely see two in there. If not, you should talk to your distribution; I think all major distributions have that. But if you're maybe working in an embedded space or something like that, it is definitely something you could check. To get the full benefit of this, you need to compile your programs with -fPIE. That ensures that the code is generated in a way that the binary works no matter where it's located. And this is rather important, because some binaries assume that they will be loaded at a fixed address, and then the loader actually has to load them at that address, because otherwise it will not work.
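A quick way to see this randomization yourself, as an aside from me rather than part of the speaker's demo: build the following as a position-independent executable, for example with gcc -fPIE -pie, and run it a few times; with ASLR enabled, the printed address changes on every run.

    #include <stdio.h>

    int main(void) {
        int local;
        /* The address of a stack variable differs per run under ASLR. */
        printf("stack variable at %p\n", (void *)&local);
        return 0;
    }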
But if you compile it like that, then the binary is generated in a way that it can be relocated anywhere, and then the loader can relocate it to that address that is randomly chosen. So what are the limitations of ASLR? Spectre, yeah, definitely. Come again, please? It works, but not really that great, I mean, yeah. I have those points. Okay, so first point: on some systems that had a pretty hefty performance impact. Yeah. Sorry, come again? I can't hear you without a microphone, sorry. Stack Clash? No, that's not directly related. I mean, you could have exploitation of Stack Clash without ASLR, but you could have a situation where it becomes easier, because those segments are nearer to each other than they would otherwise be. You have limited entropy on 32-bit systems; at least, that is what I know. I'm not sure why you think it doesn't work at all on 32-bit systems. Yeah. Yeah. And of course you can't really use all those 32 bits, because you have various memory segments, so you can't place things everywhere. That is definitely a limitation on 32-bit systems, but fortunately there are not too many of those around anymore, at least where I work; if you work in an embedded space or whatever, it's probably different for you. So brute forcing can still be an issue if restarts are not handled properly. Even on systems where you have more entropy: you get this randomization at the start of the process, but if you only restart part of it and still keep the same memory mappings, you might still run into problems there. And it can be circumvented by chaining an information leak into your exploit; that is with regard to the Spectre comment. Of course, if you're able to leak information from the process, you might use that to gain some knowledge about where those memory segments currently are. You also might have some exotic software that relies on fixed addresses; for example, if you have some inline assembly, this might be problematic. I don't see a lot of those problems, but that also depends heavily on which area you're working in. And sometimes you might have some usable memory locations in registers: you might just have a usable pointer to whatever you need in a certain register, and then you're lucky, because you don't have to find out any addresses, you just have it around to use. So for ASLR I don't have a demo, because it was pretty clear from the five executions of this cat program what it does. The next mitigation we will talk about is non-executable memory. Modern processors support the ability to set certain rights on memory mappings, and they can say that certain memory should not be executed. Another term for that is NX, or this thing I don't know how to pronounce. And the most interesting memory regions for that are of course the stack and the heap, because in well-written software you usually should not execute code that lives on the stack or the heap, unless you're doing something like writing a virtual machine; if you write normal code, you probably should not do that. So the stack overflow can still take place, but it's not possible to directly return into your shell code, because your shell code lives on the stack, the stack is non-executable, so you can't jump there and execute it. If you try, the program will die. And when you see my italics on "directly", there's probably a way around that.
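To make non-executable mappings concrete, a small sketch of mine using the standard POSIX calls (error checking omitted); it also hints at why the exploit shown later reaches for mprotect:

    #include <sys/mman.h>

    void demo(void) {
        /* Readable and writable but, like a modern stack or heap,
           not executable: */
        void *page = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        /* Shellcode copied here cannot be jumped to; the process would
           fault. It only becomes runnable if something later flips the
           permissions: */
        mprotect(page, 4096, PROT_READ | PROT_EXEC);
    }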
If we remember what we saw before: if we again cat this maps file in the proc directory and grep for the stack, we will not only see the address where the current stack is, but we also see some modes here. In this case, we are allowed to read the stack and to write the stack, but we are not allowed to execute the stack, because otherwise we would see an x here. So what limitations of NX can you think of? Come again? JIT? So JIT stands for just-in-time compilation. If you do something like that, then it's going to be tricky. I mean, you can still work around it by first compiling the stuff and then afterwards changing the permissions on the memory map. But you can use existing code in the exploited program. So maybe there is a privileged operation you want to run; maybe you don't need shell code, you just want to call the delete-all-databases function or whatever, and so you just jump there, and since it's part of the code segment, it's executable and you can jump there. You can use something like return-to-libc: pretty much every program is linked against libc, there are a lot of useful functions in there, so you just jump into libc and use the code that's already there. And you can use return-oriented programming: you structure the data on the stack in a way that you always jump to small sequences, called gadgets, that end in a return instruction, and try to get your functionality just by jumping through those small gadgets. Since we'll be using that shortly, I will show you a small graphical representation of it. You add data to your stack so that you overwrite the instruction pointer with this data, which then causes you to jump here, and there you only have one useful instruction: you just pop EAX. That means you take the topmost stack value and place it into EAX, and the topmost stack value will be this one, after the address here is removed. So you will then set EAX to one, and then you end in a return, and the return causes the next address to be taken and control to be transferred here, which you can repeat, and get basically whatever behavior you need. So you use the stack only for control information; you don't put code directly on the stack and execute it. And this is used a lot in modern exploits. So, we talked about quite a few mitigations: we talked about stack canaries, we talked about ASLR, we talked about no-execute memory and we talked about FORTIFY_SOURCE. So you should probably be safe, no more exploits, right?
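As a schematic of the pop-EAX chain just described, with purely illustrative addresses of my own:

    /* The overflowed stack, starting at the saved return address: */
    unsigned int rop_chain[] = {
        0x08049abc,  /* overwrites the return address: gadget "pop eax; ret" */
        0x00000001,  /* the value the gadget pops into EAX                   */
        0x08049def,  /* the gadget's "ret" transfers control here: the next
                        gadget, and so on, until you have the behavior you
                        need                                                 */
    };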
Yeah, unfortunately not. I will now show you a counter-example. It's taken from a certain blog post; I changed it quite a bit to make it work on my machine, and also to make it a little bit more understandable for this talk. I will leave out FORTIFY_SOURCE for this example, just to be able to create an easily exploitable program, but I probably don't have to tell you there are enough vulnerable code pieces out there; this is just me cheating to make my life easier, not because you can't circumvent that. So this is the code of the program that we will now be exploiting. We have a main function that calls into a function called memleak, which gives us our information leak, where we get the information we need for further exploitation. And then, after the call to memleak, we call into a function that is a classic buffer overflow: we have a statically sized buffer and read too much data into it. So, not very complicated. Since we want to be able to execute our own shell code, we need to make the stack executable again. By default the stack is non-executable, but you can use functions like mprotect to make it executable again. And since we want to run our own shell code, we need a way to call this function. It accepts an address, a certain size, and then some protections, the permissions we want to set on this memory mapping. On the architecture I'm using here, the first few arguments go into registers; they are not placed directly onto the stack. So we need to be able to set the registers, and to be able to set registers, we need to be able to execute code, but since we have non-executable memory, we can't directly execute code. So we need to use ROP, return-oriented programming, what I just showed before. For that we need to find some gadgets, those small code sequences that end in a return. There are a lot of programs out there that help you do that; I used ROPgadget. This just prints out a list of all the gadgets it finds; in this case I just checked the C library, and I wanted an instruction that allows me to load data into RDI. There are programs out there where you can give them a binary and they give you complete ROP sequences, so if you don't want to build it yourself, you can just use those. So, demo time again, and let's hope it actually works, because the last time I gave this talk it did not, which was kind of disappointing. This is the exploit we will be using. We are running in a virtual machine, and we want our shell code to connect back to the hosting machine so that we get a reverse shell. The shell code is generated with msfvenom and will connect back to my host machine. In this line we open the vulnerable program, which is just the compiled version of what I showed you before. The system has ASLR, it has non-executable memory and it has stack cookies enabled. First we read what the program writes, and then we write something into the program to trigger the memory leak. This memory leak is triggered via a format string vulnerability. Format string vulnerabilities are not that common anymore, but they used to be really nice, because you can do a lot of stuff with them; you have your memory leak if you want one. In this case we just use it to leak some memory. Then we read the memory that we just leaked and start calculating the data that we need: first we calculate the base of the libc, then we calculate the address of the buffer, and then we extract the stack cookie, because we leak so much memory that we have the
stack cookie in there. Next we calculate the address of mprotect in the C library, because we want to jump there to make our stack executable again. And since mprotect needs its arguments in registers, we first need three different gadgets to set those arguments in the corresponding registers: with the first gadget we set the value in RDI, the second one sets RSI and the third one RDX. Then we do a little bit more calculation, not that interesting. In the end we add a NOP sled, the same as I showed you before; that just makes our targeting a little bit easier. And we put it all together. That is the final exploit: we have our NOP sled, we have the shell code that we generated, we have our stack cookie at the exact location it needs to be for the check, then we add our gadgets and the data they will put into the registers, the address of mprotect, where we want to jump first, and our buffer address. I set up a listener on my local machine, so this is my laptop, and what we will see after that is just a virtual machine on my laptop. And if we now call this exploit, we see that this is the result of the memory leak; this is what we are getting out of the other process. Then we use that to calculate all the data that we need: we get our stack cookie, we get the addresses that we need, and then finally we send our payload to the program. And if we go back now, we see we got a connection. We are root, which is because this program runs as root, and we are in the virtual machine. So, what we didn't cover is quite a lot. There are some other mitigations that we use in SUSE products. For example, the Stack Clash protection really saved us a few weeks ago when the systemd exploit broke; fortunately we had enabled it for all our current products, and because of that we were not vulnerable. This is also one of those mitigations that are not that costly, so it was really easy to deploy. Then there is something called RELRO: there are relocation tables in binaries, and if you want to move binaries around in memory, you need some place where you relocate your symbols. Those are pretty nifty targets for attackers, so you want to make them read-only after you have done your relocations; that's also pretty basic. There is definitely more that should be done, because, as you've seen, exploitation becomes much harder with those mitigations, but it doesn't become impossible. And if you look at exploit prices, the effect is that exploits become a lot harder to construct and therefore more expensive. That definitely has a positive impact for us, because if an exploit is more expensive to produce, people really think about whether they want to burn it just to break into your database server; but it depends on what kind of enemies you have. The ROP technique that I just showed you is used in a lot of exploits nowadays, and because of that there are some mitigations coming up. You have something called shadow stacks: we were just talking about the normal stack, where control information and data are mixed together; shadow stacks hold just the control information, and then you can have checks that say, we have the return address here and we have the return address there, we compare them, and if they're not the same, we have a problem. Then there is control flow integrity, which should ensure that you only jump where the programmer intended you to jump, and luckily we will see that in hardware pretty soon, or maybe you already have it, because if you do it in software it's rather expensive. And then
there's something called data flow integrity, which goes even a step further and really ensures that data is not changed in a way that was not intended by the programmer; but that is not something you see in the standard systems you run nowadays. The main problem is that those mitigations are rather costly; control flow integrity especially is rather costly in terms of performance, so we have a rather hard time convincing our customers that taking a 10% performance hit is worth it. And of course they can also be circumvented. I mean, it's the same thing every time: you invent some mitigation and the attackers just get better, but you make it harder every time. And those mitigations are the reason why you currently don't see, or seldom see, exploits just posted somewhere: if someone can nowadays write an exploit for iOS or for Chrome, they can earn really big money from it, and because of that they don't post it publicly anymore. And the reason the prices are so high is that it's really hard to do that nowadays, and that is partly thanks to the mitigations that are now in those products. So, thank you for your attention. Do you have some questions? In that case, the duck will come to you. For ASLR, you said there is no performance degradation on the x86 64-bit platform; is that true, or is it just that you didn't mention it, to go back to the limitations? ASLR does not really have a performance issue, because you do these relocations once at the beginning, before you start your program, and that costs a little bit of performance at that time, but then when you run your code it's not really a factor on modern hardware. I'm sorry, I have a hard time understanding you. I'm not sure I can follow, because ASLR only moves your program in memory, and then later on you just execute your code as usual, so there is really no performance impact. Questions? So, I'm not really sure how to phrase the question, but I think the bottom line is: what the heck do we do now? So, concretely: you talked about some hardware features that are coming online, there are some improvements in programming languages, Go is maybe better than Java, Rust is probably better than C; what do you see system software doing in the next few years, say 5 or 10 years? So in general, if you're able to choose a programming language nowadays and you don't have to use C or a really low-level language, you don't want to do that, because you just solve a whole class of problems, or rather, you don't have them; there are still enough problems left for you to care about with regard to security. I think going forward we will play this game probably forever: we will have a mitigation, then the attackers will become better, and in the end it really depends on what you're defending. With the mitigations that we have nowadays, I think we're in pretty good shape, unless you're the target of a state actor or something like that. And if you look at how large organizations really get owned, it is seldom the newest zero-day exploit; it's usually a user just clicking on a link or executing something because they got told, well, I'm the IT guy, do that. So I'm not too optimistic about the future of security, because we as humans are just too trusting. That is definitely something we can think about, but in my experience, unless something is really horribly broken and a lot of people lose money because of it, we usually don't change it. So, I mean, programming languages like Rust are really cool and great,
but it will take decades until we have replaced all the C code that we still carry around with anything else. Alright, thank you. Thank you.

Testing, one two. Is it working? Kind of, sort of, maybe a little bit. Alright, no judging of non-professionals, I don't do this part for a living. Alright, I have two o'clock on my watch, so we can get started. Alright, so here we are. What are we doing, what are we talking about in this talk? Who am I? My name is Patrick Schwartz. I am in my first year at SUSE, and I've been a systems admin and systems architect for the last 20-some-odd years for multiple companies. That's how you can find me, whether it's on Twitter or LinkedIn. So what are we here to talk about today, what are we trying to solve? As systems admins, we're responsible for everything, right? Whether it's patches, configs, installing software, account management, system security, tuning; I mean, the system admin, the system architect, we've got to make sure that things are right, things are stable, things are where they're supposed to be. The users, the developers, whoever is using that system, the consumer of the system, depends on the system admin to make sure it's right. Can we all agree on that? Whether it's watching for software drift or making sure the containers are safe. I was reading an article recently talking about some massive percentage of the containers that are deployed today being vulnerable: they're finding containers that are not safe, that have vulnerabilities already in them, and we're still deploying them even though they've got vulnerabilities. So as system admins we're responsible for a lot of things to manage, and, as in the sister presentation I give about our commercial product called SUSE Manager, our world is getting more complex, not less, with the OSes we have to work with, but also the hardware platforms we're dealing with, whether it be x86, ARM, Z, you name it; we deal with it in some way, shape or form. So the bottom line is, what do we tend to do? Stay in control. That's what we're after. So the solution I want to present today is called Uyuni, and you're going, what in the world is that? Well, I might answer that question for you. It's not this Uyuni, although we'll give this Uyuni the credit: this one is the largest salt flat in the world. Why did we name this project after the largest salt flat in the world, in Bolivia? Well, it's because it's based a lot on SaltStack's Salt, which I'll talk about in a minute. What does Uyuni do for us? Well, like this slide reads, it deploys and manages all kinds of workloads, no matter where they're deployed, from a single UI. It gives us the ability to automate audits and reporting capabilities, and when I get into the demos we'll do an audit, we'll look for a CVE and we'll see what we can do to fix it. We can do hardware inventory: a lot of times we get out there in the world, whether you've been in the job for years or you're brand new to a particular position, you need to know what's out there, and a lot of times we get into systems and go, oh my gosh, the system's been running this for how long? We didn't know. Being able to maintain standard configurations: one of the key issues that we have to fight constantly is configuration drift, right? I'm going to propose that it gives us the ability to maintain that. And last but not least, and this is just a short list, building and deploying physical, VM and container images, also from a single interface. And think about that: the ability to
deploy physical systems, virtual machines or containers from your interface, and at the same time know that when they're deployed, they're going to be deployed the way you meant them to be, in a managed, controlled infrastructure. Alright, to let you know, I'm not a big slides guy, so we're not going to have a ton of slides; we're going to get into more demo stuff, is that all right? So where did Uyuni come from, other than that big salt flat way down south from here? My slide kind of got cut off here, but we started off with a project Red Hat sponsored back in the 2005 time frame, I believe, called Spacewalk. Have you ever heard of Spacewalk? A couple of you. So we get our origins way back in the Spacewalk project. Red Hat decided that was the foundation for Red Hat Satellite 5, a great product, and SUSE took the same code stream and developed our SUSE Manager 2 and 3 product line from it. But in the meantime, Red Hat has moved on from Red Hat Satellite 5 to Red Hat Satellite 6 and completely rewrote the interface; the whole project has been completely redone, and they decided to no longer maintain Spacewalk, which has an end of life somewhere around 2021, whenever Satellite 5 completely goes end of life. So SUSE said, well, we need a project that we can continue and keep putting upstream code into, so we've decided to fork the Spacewalk project into Uyuni. Uyuni will now be the foundation for our upcoming enterprise product called SUSE Manager 4.0, which is due out this summer, with lots of advantages. So the Spacewalk origins way back here continue their life through Uyuni and through SUSE Manager 4. All updates will go from Uyuni into SUSE Manager 4. So what are we really talking about here? Well, Uyuni describes itself as an opinionated branch of Spacewalk, providing a simple installation using Salt. How many of you are familiar with SaltStack?
Okay, good, good. We use Salt extensively behind the scenes underneath Uyuni. And because the world is becoming containerized and Kubernetes is becoming a huge opportunity in our world, Uyuni ties into containers as well. We've redesigned the web UI to use React, and I'm not a web developer, so that's a little bit foreign to me, but anyway, I'm liking what we've done with it. Python 3 and JDK 11 will be the foundation for everything going forward. These are the currently supported clients: openSUSE, SLES, CentOS, RHEL, and by the summer even Ubuntu. So from this single project I can manage just about every Linux flavor out there. These down here are the platforms that we support, and the ones in bold are the ones you can actually run the Uyuni server on; the other ones can be strictly clients. You're not going to want to run the server side on, say, a Raspberry Pi type box; it's a little bit beefy for that. So that's where we are right now. Alright, so just like openSUSE Tumbleweed, Uyuni is a rolling release; it's constantly being updated. There are no community versus enterprise editions, it's just Uyuni, with a rolling release, but it is the upstream, like I mentioned before, for SUSE Manager: from this point forward the code goes up into Uyuni and then comes down into SUSE Manager 4.0 and on. This is what makes up Uyuni today. Like I say, it uses Salt to do all the configuration management and the provisioning of physical, virtual or cloud boxes. It has the ability to tie into Kubernetes, it has the ability to tie into VMware, and the next version, or the next dot release, will even have the ability to tie into KVM as well. We use PostgreSQL behind the scenes. And to get to it, typically you are going to use the web UI, that's the most common way; however, there is a very extensive API behind this, and you can use any of the typical API tools you want to get to the back end of Uyuni. Also, from the command line you can do just about everything you can do from the web interface, either through the APIs or the command line. Okay, and I'm talking really fast, that's okay. So like I said, those of you who know about Salt, this isn't for you so much, but for the others: we're backed by SaltStack, and if you want to know more about SaltStack, there's the GitHub link for it. Salt isn't just a configuration management tool; they'll tell you that it's a remote execution framework, meaning that you can do lots of stuff to your systems, including configuration management, all from Salt. Orchestration and automation are the foundation that Salt was originally designed for; Tom Hatch and I have had long talks on the thinking behind Salt, and he's a fascinating guy to talk to. Okay, so Salt pretty much runs everywhere. It's written in Python; it runs on Linux, traditional flavors of Unix, even Windows, so it runs just about everywhere. It can be run agent-based or agentless, and behind the scenes it uses ZeroMQ to put things out on the bus, so you've got a somewhat more secure method, because you don't have to open up ports directly to every client, nor do you have the behind-the-scenes issues of clients behind firewalls. Alright, so let's jump into some demos and see if I made my appropriate offerings to the demo gods today. So what are we going to talk about? I'm going to try to show you some salty goodness. Alright, we're going to start back at the command line and
We're going to talk about this thing called formulas with forms, and to do that there's a whole set of functionality I want to show you that uses Salt and Salt pillars behind the scenes to deal with a lot of complex installs. We're going to look at some audits — see an audit and then fix it. We're going to talk about these things called action chains and what we can do with them. And then lastly, if I have time — which, because I'm really fast, I should — I'm going to talk about the software channels we can do in our world, because repositories are known to be one of the most complex things to deal with: having to manage multiple layers of systems, whether it be test, dev, QA, prod, pre-prod in an SAP world, and having to manage all of those repositories in a unified way without spending terabytes upon terabytes of disk space on full repository copies, which used to be really challenging. Uyuni gives us the ability to do that. So let's see if I can do this. All right — do I need to switch that to white on black? Would white on black look better? I forgot how to do that... oh, which one... oh, Breeze, okay, let's try black on white. Is that any better? All right, you can see that now. Let me move this over a little bit — and I totally left my cheat sheet back in my room, so this is all from memory here. So, step one — what do we want to do? Salt is very easy; you get to see my bad typing. I can target my systems, and the first step is targeting my boxes. Right now I'm targeting everything, and I just want to do a simple test.ping to see what I've got out there, and boom — that yellow is ugly, though — it comes back and tells me, hey, all these systems are true. I got one error, though: this one's in blackout mode, and we'll talk about that when I get to it; that's there on purpose. But just like that I was able to get back individual results. Now, one of the things with Salt is a back end called grains, and by default Salt pulls in a ton of information about the system. Let's look at that real quick on one client: grains.items. If I scroll back up here to the top, I can see all kinds of data about my system that Salt pulled in all by itself without me saying anything — what it's running, that it's got an SSD, the BIOS release version — and every one of these things in blue I can now target on. If I want to target every box that's x86, every box in a particular domain, or target on GPU, host name, or Ethernet address, I can target on any of these things going forward. And we'll show one real quick, strictly on OS: salt -G with the os grain, os:SUSE, and test.version. So I'm targeting all my SUSE boxes — the ones that came back with the OS grain equaling SUSE — and sure enough I only got back that list of clients. Now I can start doing all kinds of stuff with those individual systems.
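For reference, the targeting he's demoing looks roughly like this from the master's command line — a minimal sketch, run on the Salt master (which the Uyuni server is); the minion name here is hypothetical:

    salt '*' test.ping                         # every minion on the ZeroMQ bus answers at once
    salt 'client1.example.com' grains.items    # everything Salt auto-discovered about one box
    salt -G 'os:SUSE' test.version             # grain targeting: only boxes whose os grain is SUSE answer
    salt -G 'osrelease:15*' test.ping          # same idea, one level more granular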
If I wanted to get even more granular, I could target on the OS version — is it SUSE 15, or 42.3 — and start targeting my systems in a much more granular fashion, and they all respond. What's nice about the ZeroMQ message bus is that the Salt master — which Uyuni is — puts the messages out on the bus, all the clients are listening to the same bus at the same time, and they can respond immediately. There's no more of this shell loop over a list of systems you had to compile. How many of us have done that, right? We create a list of systems, hoping it's complete, put it in a systems.list file, and then run our bash loop across it; it hits each one individually, in serial, and hopefully you got every system you needed into that list. With Salt, it goes out on the message bus, all the Salt minions — that's what the clients are called — are listening on the bus, and they all respond at the same time, so I don't have to wait on that one-after-another method to get my data back. And I can compile all this data from the command line and export it to a collection program — Splunk, for example, or any other log collection and correlation engine. So that's my targeting; I can target all kinds of stuff. I can even put my own grains into the system if I want — you know, it's in data center 2, rack 5, U32, whatever — those can be individual grains you put into Salt as well. So Salt is the foundation Uyuni is built on; everything starts here. All right, so if I want to do something with my systems, I've got a couple of very, very simple, demo-worthy Salt states, as they're called, that I can show you. Let's look at one of them — a root profile, because everybody loves their own root profile, right? I certainly have one: root_profile.sls. In these few lines of code I'm able to push down the bash profile I want onto my systems. I tell it the file — this is where it's going to go, /root/.bash_profile; that's the destination where the file will end up. It's file.managed — these are called modules within Salt, there are hundreds of them, and each one can do a lot of different things. Where's the source, though — what file am I pushing down? It's in the root of my Salt infrastructure, and it's called bash_profile. And then of course I make sure the permissions are set correctly on that box. And just so you know I'm not lying, here's the root path — /srv/salt — and there, of course, is bash_profile. So let's push that bash profile out.
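A minimal sketch of that root-profile state and the apply, using the paths from the demo (/srv/salt as the file root); the minion name is a guess:

    cat <<'EOF' > /srv/salt/root_profile.sls
    /root/.bash_profile:                   # destination on the minion
      file.managed:
        - source: salt://bash_profile      # the master copy under /srv/salt
        - user: root
        - group: root
        - mode: '0644'
    EOF
    salt 'client2.example.com' state.apply root_profile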
So I target Uyuni client one and say state.apply on my root profile. Cross our fingers — oh, except this one's in blackout; I think I picked the one that's in blackout mode. It helps me prove a point, though, because I've set a pillar on this box that says don't allow any changes. So I tried to push a config change to this box and it said no — the pillar says you're not allowed. The one thing I am allowed to do is run a pillar refresh on the box, which actually updates the pillar data. So let's try a different box, client two — and sure enough, just like that, I pushed that out. I get a lot of data back about it: I get a diff — if there'd been a root profile already there, I'd have gotten the differences between the two in nice git-style format — and I can see it was successful, it changed one file, nothing failed, and it took 66 milliseconds to run. I could have targeted all of them and they'd all have happened that fast, because again it's happening in parallel: it goes out on the message bus and they all just do it. Okay, so that's the foundation behind Salt — a very, very quick overview. Yes, sir? Well, I can push down a different profile; there's no rollback feature per se. The question is, is there a rollback feature, and it kind of comes down to what your underlying OS supports. If you're using openSUSE with Btrfs, there are a lot of rollback options; if you're using ext3, there are other ways you have to go about it. I could have made this state file more rollback-proof — for example, I could have made it take a copy and keep the original file — so I could do a lot more in that state file; I kept it very simplistic on purpose. Good questions, though, and I don't mind questions along the way. All right, let's move on — let me make sure I don't run out of time here. The next thing I want to show you is how this looks in Uyuni itself. Let's get a quick overview of the system... oops, it's local, I didn't test that, my apologies... let's try this, get a network here... all right, much better, we're back on track. Okay, so let's look up here at the home overview real quick. This is the interface — I'm having to squeeze it all in to fit this screen, so it's a little cut off, but that's okay. I can see an overview of my systems: I have no inactive systems, I've got one pending minion waiting to be accepted, I can see my most critical systems — the ones with updates — and I can see what's happened recently. We're going to come back to these relevant security patches and actually deploy them; this is just an interesting overview. I can see I'm fairly evenly distributed between my CentOS and my openSUSE clients. But what I wanted to show you — let's close this down and go to Systems, System List, All. I can see I've got a lot of clients, where they're at for patches and packages, and what base channel they're on — I'll talk about that in more than a minute. I want to scroll down here and look at my configuration channels, and you can see I can do the same thing I did before: I had already set up one called bash profile, and I can look at the init file for it, and even edit that init file from here — same format. So I could have pushed this out from the UI as well, across lots of systems. I want to show you a couple more real quick, since that one's kind of simplistic.
This next one I had set up is called no-root-permit, where we can turn off root access over SSH. In these ten lines of code I'm basically saying: this service, sshd, needs to be running and stay running, but watch this file — if it changes, restart sshd — and I'm going to push a new config file down to the box, with my permissions, landing in the exact path I'm watching. If that changes, it restarts sshd. So in my ten lines of code I was able to easily push down a config and restart SSH without much trouble. For the sake of time I just wanted to show a couple of examples of what we can do, so one more real quick. This one — because this is a lab, and I do a lot of stuff on my lab box — I don't want all the Spectre stuff slowing me down, so I turn it off. I'm replacing things in my /etc/default/grub, and of course I could have done more targeting here — I could say if it's a SUSE box use this path, if it's a Red Hat box use that path, for example — but here I'm saying: in this file, look for this pattern. Think of sed: I'm looking for the line that ends with "showopts" and the closing quote, and I'm going to replace it with the same line plus my option to switch the Spectre mitigations off. And then down here: on changes of that state up there, run this command — I need to run grub2-mkconfig because I've made changes to my default grub. Because I told it on changes, run this command, and that state changed, it runs the command. Again, very simplistic — only twelve lines of code, eleven if I take the blank line out. So that's a very simplistic look; there are some states that get very, very detailed, but I wanted to give you a quick overview of what we can do with Salt from the UI.
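Sketches of the two states he just walked through — the sshd watcher and the grub tweak. The file names, state IDs, and the exact kernel option are guesses (his flag wasn't audible, so mitigations=off stands in for it), but the watch and onchanges patterns are standard Salt:

    cat <<'EOF' > /srv/salt/no_root_permit.sls
    sshd:
      service.running:
        - enable: True
        - watch:                           # a change to the config file triggers an sshd restart
          - file: /etc/ssh/sshd_config
    /etc/ssh/sshd_config:
      file.managed:
        - source: salt://no_root_permit/sshd_config   # ships PermitRootLogin no
        - user: root
        - group: root
        - mode: '0600'
    EOF

    cat <<'EOF' > /srv/salt/no_spectre.sls
    /etc/default/grub:
      file.replace:
        - pattern: 'showopts"$'                  # the line he greps for with his sed-style pattern
        - repl: 'showopts mitigations=off"'      # hypothetical stand-in for his Spectre-off option
    regenerate_grub:
      cmd.run:
        - name: grub2-mkconfig -o /boot/grub2/grub.cfg
        - onchanges:                             # only runs when the file above actually changed
          - file: /etc/default/grub
    EOF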
All right, let's go back up. One of the things I want to show you — let's pick on a system here — is this idea we talked about of formulas with forms. You can see I've already rolled out one formula, called Splunk. What are these formulas? This one is very simple: it's asking me whether the install is a universal forwarder or a full version of Splunk. Very simple — but the formula behind the scenes says if it's a UF, install from this path with these parameters, and if it's a full install, use this package from this location and make all the other kernel changes I need for a full Splunk install. From this simple UI, all I had to do was pick one or the other. Let's look at a different formula, though — there's the one I mentioned we have in blackout — let's look at one that's a little more detailed, like DHCP. Now I've got another menu item up here, and you can see this one has more to it: I can add my domain name, my domain name servers, my interfaces — all the things you would configure on a DHCP server, I can now do via this formula, via this form, and push that out. And if I add it to a particular group — DHCP would be a bad example, but — all the machines I add to that group would now get this formula and these parameters. Okay, let me clear the values there. Any questions so far? That's a little bit longer talk, yeah — actually there is a presentation at SUSECON, coming up the first week of April, where we get into that: a whole two-hour session on formulas. There's a web page on Salt formulas, and then there's documentation on how to create the form portion of it as well. Am I doing okay on time? Yeah, half an hour. All right, let's go back up here and look at our systems real quick, and from here — yeah, let's do this — configuration channels. Here I can see my configuration channels, and I've already pushed this one called zoomer host: because of the way my little lab is built on this one single laptop, and the way I'm doing DHCP, I had to push a file and edit the local hosts file on each box so it pointed back to my laptop — so I had to do a config file there. But if I wanted to, I could search for other configuration changes — search for SSH — and I can see one I can add. I'll click on that, and I can even see what it's going to do: I'm making sure SSH is running, I'm changing the message of the day, and I'm pushing out a banner. So I can add that, I can even edit it from here if I wanted to, and I can apply it — I'm going to assign it and save changes. I can then rank these, depending on which one needs to get done first if something has to happen before another, and confirm, and now the config channel change has been scheduled. If I go up here to highstate, when it builds the highstate I can see where it validates, and if it builds cleanly I can see that, sure enough, there are my changes that will be pushed down to the system as soon as I apply this highstate — these banners, this stuff, would change. Let's go ahead and do that, and I can see my highstate has been scheduled; from here I can see what it's doing and what the details are, and when it completes it will show up here — I can see it's in progress, or if it fails it shows up here and gives me a bit of a clue as to why. Okay, so that's our scheduling. One thing I wanted to do — oh golly, yeah — I wanted to show you our audits, for example, and there are two ways to go about this. One: I can see from my overview page that I've got relevant security patches on different systems, with the most current ones at the top, as recent as two days ago. This openSUSE 2019 one — there was a Docker change, and I've got one system affected by it. If I click on this one, I can, right from here, click Apply Patches, and done — that security vulnerability has now been scheduled for remediation on this system. Boom, it's scheduled for that box. And if I want to see what that patch does, I can drill into the details: what it's fixing, the CVE that's relevant to it, the packages it pushes down, which repository I'm pulling those packages from, and of course the affected systems. And — ooh, it failed. Okay, let's find out why... okay, I picked on the blackout box again, typical. So let's go to that box, get rid of the blackout, and then we'll be able to do things to it. Now, if I go back, I can quickly come down here and see my actions: this one failed — the highstate failed because the box was in blackout mode. I click here now, go back and do it again, quickly apply patches, confirm, schedule — and there it goes.
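By the way, the build-and-apply-highstate flow he's clicking through in the UI has a straightforward command-line twin — a sketch, again with a hypothetical minion name:

    salt 'client2.example.com' state.highstate test=True   # dry run: report what would change, change nothing
    salt 'client2.example.com' state.highstate             # actually apply everything assigned to the box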
Okay, so anyway, we're moving on. Another way I can do it: if I need to check a particular audit across my whole environment — I'm looking for, let's say, CVE-2018... 365, I think, is the one I have in the back of my head — from here you can see that I'm allowed to not only audit the servers I already have, but also audit my container images, to make sure that before they're deployed they're clean for this particular CVE. I don't have any containers set up right now, so we'll just audit our servers, and you can see I've got a couple of different results: some that are clean, and one where the patch is available, meaning it's affected. I can do the same thing I did before: click on this guy, see the affected clients, the details, the packages, highlight the system I want — or all of them — and apply patches. Okay. All right, now let's talk about repos real quick. In our world you're managing all the repositories — whether it's SLES 11, SLES 12, openSUSE 42, openSUSE 15, CentOS 6, CentOS 7, and soon Ubuntu — and they're changing every day, right? There are new patches being deployed to the upstream repositories practically every night, it seems. Well, you don't want to apply changes to a system today based on today's upstream repository, then apply an update to a new system tomorrow and have it pick up newer patches than the one from the night before. You need a baseline. You need to be able to say, I can guarantee that from this point — say my prod baseline is dated March 1st — every box I apply it to is going to be exactly alike. Agreed? Well, Uyuni gives us the ability to create those baselines, either from the command line or from the UI, and you can see here, for example, I've got several: a February 5th baseline, a February 18th, February 22nd, January 28th, March 4th. I've got all these different baselines, and I can point my systems at those individual repositories, so that when I push a system to one of them I know it's going to get the same patches as every system before it. And I've just barely scratched the surface of the interface and what we can do here. There's our images capability — we can deploy systems from within Uyuni, whether they're virtual machines or containers. This is one area I didn't plan on demoing, but I wanted to show you the interface for it: I can use a Kickstart file or an AutoYaST file to do it. And I have a presentation we're getting ready for SUSECON on doing system upgrades from SLES 11 SP4 to SLES 12, all automated, all through SUSE Manager — and SUSE Manager today is synonymous with Uyuni, okay?
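Those frozen baselines can also be driven from the command line with spacecmd, which ships with Spacewalk-descended servers. The subcommand names here are from memory, so treat this as a sketch and check spacecmd's built-in help before relying on it:

    spacecmd -u admin -- softwarechannel_list    # see the existing channels and cloned baselines
    spacecmd -u admin -- softwarechannel_clone   # clone today's repo state into a frozen baseline (prompts for details)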
So Uyuni is a great place — if you've got that development space, you've got that lab, and you need system control over lots of machines, you can do that effectively from Uyuni. Let me jump back to my slides real quick. So I encourage you guys: let's get involved, and there are lots of ways to do it. The project page is here, and from it you can find all of these: our GitHub page, the Open Build Service page, the mailing lists, the user forum, and of course IRC — all the developers are on IRC. I really encourage you: it's a brand-new project, there's a lot of functionality we've already brought in, but there's a lot more we want to get to, and it's a very active community that I want to build up. You guys get to be the first US-based Uyuni talk, so let's make it happen — let's get some US people involved in this project, just like the couple of talks we've had today about getting the community involved. This is a great project for that, whether it's documentation — there's a lot of documentation that was carried over from SUSE Manager to Uyuni that needs updating, and we need people to jump in; I'm not a developer, so I can't commit code, but I can commit documentation updates and changes, and that's highly needed in this project. The project also needs testers — people to go out and try this, find where the bugs are, where things are missing, what critical features are needed — getting involved to find those things from a user's perspective, instead of just a bunch of developers writing code who don't do this on a daily basis. That's where we come in as the admins, to make things happen. So get involved; that's what we really need. That's my presentation — I'll open it up to questions, with the typical disclaimer. Any questions? "I was just wondering, is this dependent on an openSUSE environment, or will it work on any?" The server today requires openSUSE 42.3 — actually, I think it will run on Leap 15 now as well on the server side. The client side can be Red Hat all the way back to 6 if you want to use Salt, and if you want to use the traditional client, which I didn't talk about, you can even go back to CentOS or Red Hat 5 — that's still available today. But the server itself needs openSUSE Leap 15 or openSUSE 42.3. "So I walked in late and missed the first part, but is this just a front end to Salt, or to any configuration management solution?" This is our front end; we use Salt underneath. It's not a front end to Salt per se, other than the fact that we use Salt and it does a lot of the heavy lifting for us. It's not a replacement for SaltStack Enterprise — we're not trying to compete there; that's a completely different market and a different mindset. They do things we don't do, we do things they don't do. No, that's agreed — it runs Salt underneath, and only Salt. However, there are several SUSE developers — we have an internal Salt team that writes code for Salt — and several of them have pushed upstream code that will allow us to run Ansible playbooks within SUSE Manager. So we can run Ansible code via Salt inside SUSE Manager — you don't have to take all the Ansible code you've already written, throw it away, and start over; you can continue using those playbooks. Other questions? Thoughts? Comments? Rotten tomatoes? Eggs?
All right — well, thank you very much. Thanks, guys.

We're on the cusp of getting... oh, my voice is much louder now that I've turned on the microphone. I probably don't even need the microphone, because I do this a lot, but for recording purposes they told me to keep it on. They also warned me not to dance — they told me to stand in place and try not to wander around, which is impossible for me; I'm completely incapable of standing still — to repeat questions from the audience, and don't forget to tip your waitresses; I think that's an important one. I was just telling everybody, this is one of my favorite places on the planet: this is Henry's Lake, just outside the Tetons and West Yellowstone — it's in Idaho, it's called Island Park, Idaho, Henry's Lake. I used to spend a lot more time there than I do now. But that's not my presentation — this is. And does it work? Good, it does. Thanks for coming today. I know there's high competition for your time — everybody's got lots of interesting presentations they want to see, or booths they want to visit, or Simon Cowell out there with the America's Got Talent show — but I'm going to spend the next 40, 45, 50 minutes talking about software-defined storage. And if you don't like my presentation, Alan Ott will be giving a similar one on Sunday — I don't remember when — and he's a lot smarter than I am, but he doesn't have a beard. Up until recently I was one of the engineering managers at SUSE for the software-defined storage product that SUSE calls, unimaginatively, SUSE Enterprise Storage, and now that I'm no longer one of the engineering managers for that team I feel a lot less qualified to give this presentation, but I'll do my best. I'm currently writing technical training materials for the SUSE Enterprise Storage product, so that still qualifies me to give this presentation. In addition to that, I'm an adjunct instructor of computer science at Utah Valley University, and the only credential that gives me is that I can stand up here and talk to you fairly comprehensively — it does not allow me to do interpretive dance, though. For a long time, people have kept records and wanted to keep track of their stuff. And if you can imagine, you know, Thag — if he were still alive today, and if he'd never deleted his data, you can imagine the huge store of communications data he would have saved up. How many caves of data do you think Thag would have, right?
And they used rocks to communicate things — this is some sort of Mesopotamian writing, some script-oriented language stored on stone; really incredible stuff we have about storing data. And this is Greek — some sort of official edict from back around 300 BC. And this is my office right here. And this — get this — was 5 megabytes of IBM storage, around about 1956. So, you know, we've been dealing with this data thing for a while, and this is data stored digitally — and in some ways we're not much farther along than that. It's a big problem: obviously we need data, and we're keeping lots of it, and whether we want it or not, we never delete the stuff. From 2009, where we're down here, to here we are in 2019, data growth is about 40% per year. Ouch. If you think that blows your mind, wait until you see the next slide. Now, it's interesting that storage costs are declining — that's good, we're grateful — however, comma, you can see the disparity — that's the word I was fishing for. Although costs are going down at roughly 25% annually, the rate of growth is 40% annually, so it's still very expensive to store our stuff. All right, and if that doesn't blow your mind: nobody ever deletes stuff — you're not allowed to delete anything anymore — and IDC predicts there'll be ten times the amount of stored data in less than six years; by 2025, there will be 175 zettabytes of data stored on this planet. Now if that doesn't blow your mind, consider what a zettabyte is: a zettabyte is one trillion gigabytes, or 10 to the 21st bytes — a one with 21 zeros after it. I forget how many DVDs' worth of data that is, but you can't watch that many DVDs, trust me. And just to make it all the more complicated, last year the same IDC folks said that by 2025 we'd be at 135 zettabytes, so they're already inflating the number. Where are we storing this stuff? That's a complicated answer — there's a shifting paradigm in how we store data, and we have to do that. If you're a mom-and-pop shop, or you're at home, you probably don't need to listen to this talk: we've got the standard today model of storing data on legacy storage systems, and that's fine — that will probably do us for our lifetimes. But if you're in any sort of business, any sort of organization that's not allowed to delete data, you've got to figure out larger and faster and more comprehensive ways of not only storing it, but making sense of it, keeping track of it, knowing where it is. Just because you've saved it doesn't mean you necessarily know where it is and can get it back — you've got to be able to manage and organize all that ridiculous data. In our current day-to-day of storage, our legacy approach is storage silos with direct-attached storage, and if you're lucky enough to have something at a little greater scale than that, at least you have the means of expanding that direct-attached storage into some sort of array within — what do you call those things — a closet, a cabinet, that's the word I'm looking for, a cabinet. And then when you get to the point where you have to have two cabinets, it gets a lot more complicated: it's hard, it's a lot of work, and it's a big headache. Then you've got to back all that stuff up, which is really nasty, and the process behind it is a very slow reaction to change. And as a matter of fact, the way most proprietary vendors handle their hardware for storage
these days is what we call in the industry a forklift. You don't come in and just add storage — although you can: you come in and add a shelf, then another shelf, and you at least get some scalability there. But when you hit a certain max size for that architecture, what your vendor does is a forklift change-out and replace: they basically come in, take your cabinet out, throw it in the dumpster, and bring in another cabinet with higher-density, higher-capacity disks. It's hard to scale that kind of stuff and difficult to manage that kind of growth — and, though I haven't put this particular detail on the slide, it ends up getting really expensive, really fast, and the hardware vendors know it: once they've got you, they've got you for life. What people want is not that. What people need going forward, long term, is something far more agile, and we end up calling this a software-defined infrastructure — software-defined-everything infrastructure. As part of that you've got software-defined networking, software-defined this and software-defined that, and in that whole hodgepodge of software-defined everything you've got software-defined storage, where your ability to deliver storage to the people using it ends up being more agile, more flexible, more along a DevOps delivery mechanism, with more ability to move the data around and change its size. It typically ends up being more business-driven: your ability to react to business needs is far better facilitated, and you can be faster at deploying what your users and customers need for whatever it is you're keeping track of. This is where most organizations want to go — and even when organizations don't know that's where they want to go, when they try to tell you their use case, they're pretty much describing this, which is a software-defined storage approach. Now, one thing a lot of people start looking at as they head this direction: they say, oh yeah, there's cloud storage. And great — cloud storage is awesome, there's a great application for it, it's very flexible. You don't have to do any capacity planning, because all you have to do is tell your cloud storage provider, hey, give me some more disk, and they say, hey, we'll be happy to give that to you, here it is. You know what they're running on the back end? Software-defined storage — but that's another story. And from a hardware perspective, you're really happy that managing all that hardware is somebody else's headache — good stuff, right? You don't want to be in the business of managing all that hardware. But there's always a but, and that comes at a cost — I mean literally a cost. Remember, growth is about 40 percent per year; today, two petabytes costs you $50,000 a month. And of course prices are coming down, but your growth is going up, and they don't intersect — so it just gets more and more expensive for you.
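That don't-intersect point is easy to see with a little compounding arithmetic — a rough sketch, using his $50,000/month for two petabytes as the starting point:

    # 40% more data each year, storage 25% cheaper each year => the bill still grows ~5%/year
    awk 'BEGIN { bill = 50000
                 for (y = 2019; y <= 2025; y++) { printf "%d: $%.0f/month\n", y, bill; bill *= 1.40 * 0.75 } }'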
Plus, in your cloud environment, although there are a few options, typically speaking you're talking about object storage — and maybe you need something different than object storage. What if you need block storage? What if you need file storage? And once you get hooked here, too, if you ever want to change — has anybody ever tried to get out of the cloud? If it's your MP3s, that's pretty easy to move out. But if you're talking about all the medical data for your hospital, getting that out of the cloud is next to impossible. So there are some challenges with cloud storage solutions. There's a lot that's appealing about it — that's fine, that's good, and there are good reasons to be there — just be aware that it isn't always what everybody wants. So, how do I store all my stuff? Well, along came an open source project that was free and open source and was a clustered, distributed model of software-defined storage. It predates the cloud movement, right? The first stable release of Ceph was in 2012, by a guy — it was his PhD project; he was here in California, I think he was at... yeah, there you go. His name is Sage — Sage Weil. Good guy, really smart guy. He starts a company called Inktank, and within a few years Red Hat comes along and says, look how valuable this is, and buys up Inktank. So now all those Ceph guys, all those Inktank guys, have moved to Red Hat — but they're really good friends of ours. I work for SUSE, and we're also involved in the development of Ceph — I'll show you a bit more about that in the next slide. Ceph has a lot of really good goodies — really incredibly powerful and useful features for a distributed storage cluster. It stores all of its data in the back end as an object store, but then makes it available in three different formats that we'll talk about in a moment. Another important thing about Ceph is that it excels at durability: the redundancy and fault tolerance of the stored data is second to none — very, very rock solid at making sure you can get to your data when you need to get to your data. It allows for very efficient and fairly effortless scale-out, to the tune of exabytes — if you need more storage, you just add another node. And it's built on commodity hardware. A lot of the proprietary storage vendors today are selling you their hardware, which only works with certain disks, which only have their controllers, and you have to use their software — it's all proprietary. Whereas with Ceph — keep in mind there are limits to this: you don't want to go out and buy the cheapest controller card or the cheapest hard disk you can find — but generally speaking it all runs on commodity hardware that you choose. You don't have to buy the vendor's hardware; you can buy the stuff you like and have confidence in — HP, Dell, EMC, whatever you want. And it's all open source — entirely open source, almost entirely C++, although there are some other bits of code in it. And to be honest, year after year it is still the most popular choice for the back-end storage mechanism for OpenStack. I have a reference down here, the 2017 survey report from the OpenStack summit — they didn't do a survey in 2018 for some reason, probably because surveys are expensive — but even today, unofficially, with no report to back it up, OpenStack pushes the fact that Ceph is the way to go to store your stuff in an OpenStack environment. That's a big testament to how robust and how relevant Ceph is. And just in 2018, Ceph was removed from direct Red Hat control, and the organization is now governed by the Linux Foundation. That's a really cool thing — nothing against Red Hat, right, but even the old Inktank guys, Sage included, are saying: as much as we like Red Hat, and Red Hat's paying us very well, it provides greater flexibility and greater openness to be managed by the Linux
Foundation. Now, that doesn't mean Sage is employed by the Linux Foundation — Sage is still a Red Hat employee — but Ceph as a community is governed by the Linux Foundation. So we have a new Ceph Foundation, and its founding members are listed here. Of course, because I work for SUSE, I'm biased, and I put SUSE here at the top. It's interesting if you look over here at the number of lines of code contributed to Ceph in the most recent release, which is the Mimic release, in 2018 — we're currently working on the Nautilus release, which is actually on the verge of being released; it'll come out probably the beginning of April. For lines of code contributed to Mimic, you have 537,000 contributed by Red Hat, and a close second — well, to the tune of one third the amount, but still the second greatest contributor to Ceph — is my company, SUSE. And a lot of other great people: there's a bunch of people who are unaffiliated — just their own, everyday, run-of-the-mill garage contributions. A company known as Mirantis — a great company, but they've kind of left the game and de-emphasized their involvement with Ceph — so although their number in the early part of 2018 was significant, it has dropped off a lot since they made that announcement. And then a company called ZTE is one of the other founding members, and a bunch of others — I just picked the top five that were here. SUSE has about 30 engineers involved in contributing to Ceph. Red Hat has — I don't know what the last count was — more than 100 working on, though not exclusively on, Ceph; they're still committed to Gluster for their storage solutions, but they have significantly more people on board for the development of Ceph, so they have a lot more contributions here. And of course they've got Sage, who's running the project, and a lot of the other original Inktank guys who are senior officials in the governance of the project. But it's a great cooperative effort — those folks at Red Hat, the old Inktank guys, and we at SUSE work very hard, we work very closely together, and I'll mention an example right here. One of the great things coming along — first as sort of a preview in Mimic, and now seriously supported for Nautilus — is the Ceph dashboard, which is a graphical interface for management and graphical interaction with Ceph that Ceph didn't have before Mimic. We at SUSE had a product adjacent to our SUSE Enterprise Storage product called openATTIC, and we got together with the Red Hat guys and said, you know what we really need to do is bring these two things together. So our openATTIC developers ported all of the functionality from openATTIC to make it native within Ceph — it's now called the Ceph dashboard. That's a really great cooperative effort between both Red Hat and SUSE. At SUSE we make two variants of Ceph — that is to say, we're involved in making sure we have an enterprise product, which is called SUSE Enterprise Storage, but alongside the work we do on the enterprise product, we publish all the bits we're working on, along with the rest of the upstream Ceph efforts, compile it, and release it on openSUSE, for both Leap and Tumbleweed. On Leap we're currently shipping the stable Ceph Mimic, and in Tumbleweed today we actually have the latest and greatest pre-release bits of Nautilus, which is going to be released here in the next few weeks. All the
work that we do on SES gets rolled down into the open distributions — openSUSE Leap and Tumbleweed. The openSUSE bits are community supported, whereas for those who want to buy into a support model, where we'll help you fix whatever bugs you may have, there's SUSE Enterprise Storage. And I have the links here where you can get the Leap and Tumbleweed bits of Ceph. Ceph is composed of a storage cluster — which makes perfect sense — of N number of nodes, and it's basically self-healing and self-managed. That's kind of misleading, isn't it? What it means is that it keeps track of where it's storing everything and all the technical details, and just says: hey, when you ask for the file, you don't care where it is, I'll get it for you. Generally there are no bottlenecks — there are obviously caveats to that, because you can implement a poor design of any cluster and have the whole cluster itself be bottlenecked, but when it's properly designed according to best practices, according to what we recommend, there is no bottleneck: it'll find the data wherever it is, at equivalent speeds, no matter what you're trying to get. And as I mentioned earlier, it excels in data durability — let me make sure I'm clear about that: that's how it was originally designed, that has always been its particular focus, making sure you don't lose your stuff. There are proprietary solutions that are going to be more performant than something like Ceph. Now, don't let me mislead you: Ceph is fast, it's zippy — those little cephalopods, if you've ever seen them swim, man, they can get around where they need to go — but there are faster alternatives if speed is your value. If you want the IOPS, you can buy a product that's faster. But for the whole combination of things — the durability, the scalability, and the open-sourceness — you can't get any better than this; this is absolutely the best. There are basically three different ways to access the data. It's got its native object store, which you don't access natively, but it provides an interface for object access for Amazon S3 or OpenStack Swift. There's block access — if you've got some native Linux application, you can just attach to the data that's there, or use one of a few different gateways that are available; you can get to the storage through KVM or iSCSI, a variety of different ways. And then, in a recent release of Ceph, we now have the introduction of distributed file system storage, which is called CephFS — that's a really unique name, right? Everybody loves it. It's great. CephFS — you couldn't have come up with any better name than that. And associated with CephFS, you can also access that file system data with the likes of NFS, through NFS Ganesha. So there's lots of flexibility in how you get to your stuff. A basic Ceph architecture looks like this: you have the baseline object store, which is called RADOS — and that is a much more imaginative name than CephFS, right? The Reliable Autonomic Distributed Object Store: RADOS. You can say it like the Arby's commercial guy with the really deep voice — we have the meats — RADOS. And on top of RADOS you have these basic interfaces for those three access methods: you've got the RADOS gateway, which provides the object storage interface for S3 and Swift — a nice RESTful interface; you can get to the block storage through the native LIO
access, obviously, but with a gateway to allow provisioning and snapshots for iSCSI and the like; and then you have CephFS, which we were just talking about, providing a POSIX-compliant file system mechanism that's very tantalizing for the likes of Hadoop, or just regular file system storage you might want if you don't care so much about your own ext- or Btrfs-style environment. Three great ways, all rolled up together. So, a quick look at how this works — and I pilfered this from an old set of slides I have; it's actually slightly wrong, and I haven't had the disposition to change it yet, but I need to. On a storage node you have three basic sets of functionality. You have the physical disk itself, and then sitting on that disk you have some file system — and this is the part that's a little misleading, because in the recent past we just took advantage of XFS: we thought, hey, we don't want to be in the file system business, we'll let somebody else take care of it. And that's good, right — we let it be somebody else's headache. But then we thought, for performance reasons, maybe we don't want some of the overhead associated with XFS, so we're going to build our own — there are pros and cons with that — and we created a new storage layer that's exclusive to Ceph, called BlueStore. And of course it takes advantage of a few other not-ours technologies — this whole database piece is based on everybody's favorite Facebook database, what's the name of their database — RocksDB, right. So we use RocksDB to help manage all the metadata on that storage, but it's our own proprietary... proprietary? It's open source — what am I saying proprietary for? It's our own invention of a storage layer that provides for effective Ceph usage. (Audience question.) Correct, yeah — that's part of the Ceph development; I'll try not to say stuff that's just SUSE specific, since this is a Ceph discussion here. And then sitting on top of that is a daemon called the object storage daemon, and every disk in a node has its own OSD that runs and keeps track of: what am I storing in my database, and what am I storing on this disk? So if you have eight disks attached to a node, you will have eight OSDs running on it. And these OSDs all know how to talk to each other across all the different nodes: they all know what they're storing, and they're able to serve their stored data out to whatever client says, hey, I need my file — oh, I've got it, here's the file you're looking for. And if anything happens to disk N inside a node, all the peers know: some disk has gone bad — where's the other replica for that? Let's make up for the fact that it doesn't exist anymore. So they all talk to each other to keep the data relevant and replicated throughout the entirety of the cluster.
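You can watch those OSDs and their peering from any admin node with the stock Ceph tooling — a quick sketch:

    ceph -s        # overall cluster health: monitor quorum, OSD up/in counts, recovery activity
    ceph osd tree  # the node-to-OSD hierarchy: one OSD per disk, grouped by host
    ceph osd df    # per-OSD utilization, i.e. how full each disk's OSD is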
Those disks usually come in three different varieties. The most common is still our everyday, run-of-the-mill, cheap HDDs, which we affectionately call spinners. There are also solid state devices, SSDs, which nowadays are really getting less and less expensive and more and more affordable — and I'm told we're on the verge of another SSD innovation, a revolution there, that's going to bring the capacity up and the costs way down yet again, so that's good. Now don't be fooled: if you're going to buy the SSD you can get at Walmart to put into your home system — no, that's not going to do it for you. You've got to get enterprise-class SSD devices. As a matter of fact — Kifu? No, is it Kifu? It's not Kifu — somebody in the community keeps a web page up to date on a weekly basis about which SSDs are available, which ones will give you the mean-time-between-failure numbers you're expecting, and the performance you're expecting to get. So make sure you're getting enterprise-class SSDs — and the cheaper they become, the less you'll be inclined to use spinners, but there's still a good case for spinners that we'll get to in a minute. And then the most current technology we have, which is kind of similar to SSDs: NVMe, the non-volatile memory express flash capability. Still pretty dang expensive, but if you've got the money, this will give you the performance you want. The best thing you can do, though, is use all of these things in concert: you can tier them, or you can say, keep my raw data on the spinners and keep all my metadata and journals on the faster devices, and you end up with a great combination of storage capabilities. So: you have a bunch of disks, their associated file systems, and the variety of OSDs running, and they're all represented as a particular node that you'll see on the next few slides — a node with the disks, the file systems, and the OSDs all combined; each physical disk has an OSD. And of course you don't want to give all the disks on your system to Ceph, because you have to have an operating system on that server — so save one disk, or maybe two if you want to mirror the OS; that's a good best practice, to have some redundancy for the OS on that node. But then you just give all the rest of the disks to Ceph, and it takes care of it, self-managed: it just takes the disks, sucks them up — hoovers them up, if you're English — and makes them all part of the storage cluster. There's also one other kind of node in these clusters, called the monitor. It's the brain of the cluster, and it's not in the data path at all — none of the OSDs really care what the monitor does — but the monitor keeps sanity going in the cluster. It knows what the OSDs are, which ones are healthy, which ones are sick, which ones are overworked, and it keeps this map of what's going on with all the different nodes in the cluster. And it has to form a quorum — it uses the Paxos protocol to make sure the data it's managing is authoritative — so you have to have at least three monitor nodes in your cluster, and you need to have an odd number of them. Now, strangely enough, you don't have to have more than three monitors in your cluster. Some customers come around and say, oh yeah, well, I'm going to have this really ginormous cluster, I'll need five monitors. Between SUSE and Red Hat we've been monitoring this monitoring — sorry, that was a pun — for a really long time, and even with CERN, one of the largest clusters we're associated with, they only need three monitors. If you want to have five, go for it — you've got the hardware, hey, knock yourself out — but you basically only need three monitors; they'll handle the load.
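Checking on the monitors and their quorum is a one-liner each — these are stock Ceph commands:

    ceph mon stat        # how many monitors, who's in quorum, who's leading
    ceph quorum_status   # the Paxos quorum detail, as JSON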
So you put those storage nodes together with three monitors, and you've got yourself a nice little handy-dandy Ceph cluster — I can say that. And maybe we want to get some data in there — of course, you don't want to store data if you can't get at it, right? So you've got some sort of client running — maybe it's a CephFS client — and it's trying to write some file system data. It goes over to the cluster and says, I want to write this file, and the cluster says, hey baby, we got it, leave it to us. So somebody in that cluster grabs the data, and it makes, by default, three different copies — replicas of three. That's configurable, and you don't even have to use the replica mechanism: you can use erasure coding, which is one of the other types of technologies we implement in Ceph — but for the purposes of this demonstration we're showing replicated distribution here. So he takes the copy and he makes two friends, and now we've got the stored data, and as soon as the cluster has verified that it has the data — that it has given the instruction to three nodes to write the data — it sends the acknowledgment back: that's ours, we got it, your data is safe in my cluster. And no clustered storage solution would be complete without the ability to get the data back out. So you've got a client that comes along and says, hey, you know, remember that file I gave you a little while ago? I need that back. You talk to the cluster, the cluster says, hey, I know where that is, let me give you the one that's fastest, this guy answers, here it is, and the client's got it. Now — do you need a little more storage? Just add a few more nodes. Need a lot more storage? I'm simplifying it, obviously: all of a sudden you're in two cabinets, three cabinets, and you've got to have some robust networking at the top of the rack there to work effectively. But if you can think it up, we can do it: you get the right networking, you add as many nodes as you want, you have your three monitors, and you're good to go. Now, can you do data center replication with this? Yeah, you can. It's not great performance, but it's hugely robust. You want to put a backup set of your data in a data center across the hall? Piece of cake. Now, if you want to put that backup data center across the country, or across the planet, things get a little more complicated — but it certainly can be done, and it's because of this robust mechanism of the clusters all talking to each other. If you want to use CephFS instead of block storage, you do have to have one separate, dedicated node called the metadata server, which provides all the POSIX capability. Now, it's entirely technically possible to have a CephFS system without a metadata server, but any time you want to do a find — anything POSIX — you've got to have something keeping track of all that POSIX metadata for each of the different files being stored, and that's what the metadata server does. And you could conceivably, for high availability purposes, have N number of metadata servers and have them fail over from one to the other. It's not necessary to have multiple metadata servers for performance — at least not that we've found yet — but we do allow for duplicating metadata servers in case you want that kind of redundancy. Now let's see, what do I have on this slide? RADOS, the native object store. This is its base design — large scalability is exactly what object stores were designed for, and with Ceph that was exactly what Sage Weil was after — so with the object store we simply keep everything together in our native object store.
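And speaking of the native object store, you can exercise that store-it, get-it-back flow by hand with the rados CLI — a sketch, assuming a pool named demo already exists:

    echo "hello ceph" > /tmp/hello.txt
    rados -p demo put greeting /tmp/hello.txt    # write: the cluster replicates it behind the scenes
    rados -p demo ls                             # list the objects in the pool
    rados -p demo get greeting /tmp/answer.txt   # read it back from whichever replica answers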
And then we can certainly make it available to whatever other object application you have, whether that's S3 or Swift. It's better if we're dealing with large-ish objects, but it's okay — you can have small objects; that's certainly not a problem. And of course we're avoiding single points of failure: everything is very robust and redundant — at least three replicas, or erasure coding — spread out in a calculated way throughout the entirety of the cluster. And it's entirely distributed, so you can configure it to have replica one in cabinet one, replica two in cabinet two, and replica three in cabinet three, and with those three replicas you've got a lot of the robustness you expect to have. To do all of this, though, you do have to have some robust servers, with some RAM, some cache, some CPU, and of course some disk — so you don't want to go out and just buy the cheapest hardware you can get; you want to have something enterprise-grade. If you're toying with this in your garage — you're just making a Ceph cluster for storing your DVDs in your basement — then you can buy whatever you want; but if you want something that's going to do what Ceph's distributed software is intended to do, make sure you're getting hardware that's equal to the task as well. So there are three essential components to storing and managing all this stuff, and we'll talk a little about pools, placement groups, and CRUSH maps here — I'm checking the time. Pools are just a way of logically grouping your objects. You could create a pool for department A in your business, a separate pool for department B, and a separate pool for department C, so those different departments have control over their own sets of storage. And typically these pools are all organized in whatever comprehensive way makes sense to you. You have to give the pool a name; you need to tell it what kind of redundancy goes with it — either the number of replicas, or the K-plus-M erasure coding ratio you've decided is best for your organization; and you have to identify a number of placement groups — I'll talk about placement groups in a minute. You'll also have a bunch of CRUSH rules — you can take the regular CRUSH rules as they are by default, or you can customize them; we'll talk about CRUSH rules in just a moment. And as part of that pool you also need to set an owner, whereby department A is who has access to it, and department B isn't going to have read and write access to what's in department A's pool. And of course, as with any good storage mechanism, that set of core functionality lets you create objects, read them, write them, and also create snapshots of those objects.
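Creating a pool like that department A example is a couple of commands. The placement-group count here uses the rule of thumb he gets to in a moment — OSD count times 100, divided by the number of replicas, rounded up to a power of two, so 12 OSDs with 3 replicas gives about 400, rounded to 512. Pool name and numbers are illustrative:

    ceph osd pool create dept-a 512 512 replicated   # name, pg_num, pgp_num, redundancy type
    ceph osd pool set dept-a size 3                  # three replicas, the default he describes
    ceph osd pool mksnap dept-a baseline-snap        # pool-level snapshot
    # a custom CRUSH rule, if you've defined one, attaches the same way:
    # ceph osd pool set dept-a crush_rule fast-ssd-rule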
You have more options here. Placement groups are just the way Ceph balances the data across N number of OSDs, and there's a kind of ridiculous math algorithm you use; there's actually a tool on the Ceph.com website that helps you calculate the best way of doing that, and also how many placement groups you should create for a particular pool. One placement group will typically span several OSDs, as shown in this diagram, and one OSD typically serves many placement groups, and it's entirely tunable based on what's going to happen in this particular pool of storage. If the data in your pool is going to change really fast, you should have a larger number of placement groups, and the formulas that we provide, or your consultant provides, or the website provides, help you make that calculation appropriately. If you're going to have really, really static data, where you just write it once and then people read the brains out of it, then you don't need a large number of placement groups for that particular pool. And then there's this other acronym, CRUSH: Controlled Replication Under Scalable Hashing. It's cool, right? Somebody got some money for that, I'm sure. That is the way that you, as the storage administrator, designate how best to organize the data in your data center. You can decide that for this particular group of users, or this particular application, the data needs to live on a set of nodes with the fastest disks and the most CPU and RAM, so you get the fastest IOPS for that application. Or you can customize your CRUSH map so you're spreading the data out over a specific set of racks in your data center, or organize it so it has the most optimal disaster-recovery resilience. The CRUSH map is the system administrator's description of how all the data is supposed to be organized across all the different nodes. The users don't really care how this works; they just need to be able to get their data. So when the client system comes to find out where the data is, the user has no idea about any of the weighting and measurement and access decisions going on. The client just asks, and the CRUSH map and the monitors all know: okay, go find the data over there, and boom, the client goes over, gets the data, and brings it back. So, with all the components working together: there's some object, some file, you're trying to store. It goes through the client; it gets lined up in some pool, spread out over a bunch of placement groups. The CRUSH algorithm says, well, the best place to put that is in this placement group. That placement group says, I'm talking to this bunch of OSDs, and that OSD is not doing anything right now, so put the data over here on this OSD. Then the OSD says, now I have that data, let me give it to my two brothers over here for three-way replication, and then it sends the acknowledgments back: hey, we got your data. And the object says: great, I'm safe, safe, and safe.
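As a rough sketch of the rule of thumb that calculator encodes (the numbers below are illustrative, not from the talk): you size placement groups per pool from the OSD count and the replication factor, then round up to a power of two:

    total PGs ≈ (number of OSDs × 100) / replica count
    e.g. (15 OSDs × 100) / 3 replicas ≈ 500  →  round up to 512

A fast-changing pool gets a number toward the high end; a write-once, read-mostly pool can get by with far fewer.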
So, a few bits with the time I have remaining: let's talk for a minute about legacy storage arrays. Now, if you go to your NetApp organization they're going to have a different list; I accept that I'm biased, and I accept that they're biased. We've got these kinds of limits associated with traditional storage mechanisms: it's a tightly controlled environment, it's not open at all; there's some scalability, but it's limited compared to what we can provide in a clustered system like Ceph; and it's relatively expensive. Don't be too fooled by that last one, though: once you get competent hardware in your software-defined system, it's also going to be expensive, just not as expensive as the proprietary guys, and with them you don't have as much flexibility or as many options. Now, they're going to tell you something different, and to be honest with you, the folks at EMC and NetApp have a lot of functionality. They've been in this business for a long time and they're good at it; I'm not faulting them at all. Boy, they're good at it. They have really well-designed hardware and software; I love them, they're really great. But you can only use their specific drives; you've only got the slots they give you in their shelves; you don't have as many variants in what you can do with RAM, you're just stuck with what they provide; you can't use commodity networking hardware, you've got to use NetApp's networking technology; and you're limited to the CPU and RAM they can provide. The benefits, though: they do have a long track record, they're really good at what they do, you know you're going to get value out of it, and you know they're going to give you support as long as you're paying your support contract. They're really going to take care of you, and it's a very predictable pricing structure. Now, for software-defined storage, NetApp's list is going to be very different from mine, but the biggest complaint about software-defined storage is that it's slow, and I'm going to tell you that's not true. Ceph is really fast; it's just that speed is not its forte, not what it was designed for. Its emphasis is durability of the system. So don't think that just because we say it's slower than NetApp or slower than other proprietary systems, it's slow. It's not; if you design it right, you can get good performance out of it. That's one of the challenges, though (I should have put this as a bullet point): who's responsible for making sure it gets tuned right? You are. If there's something about your NetApp system that's not tuned quite right, you pick up the phone and they'll tune it for you. So, pros and cons: the best thing about software-defined solutions like Ceph is that you've got basically infinite scalability, infinite adaptability, a nearly infinite number of choices, and all the flexibility in the world in what you want to do with the beast. When you're thinking about a software-defined storage system like Ceph, though, make sure you have a good idea of what you're trying to accomplish, because maybe your legacy, everyday, run-of-the-mill, current-day shelf technology is the right thing for you, depending on your use case. You need to think about the pros and the cons, and the balance you need to strike between availability of your stuff and the density of the stuff you're trying to store. Ceph does not do well at 90% full; you've got to keep it at 80 or 82% full or you're going to start seeing problems. Your data is going to be rock solid, safe; it's going to be there, it's never going to get lost, it's never going away, but it's going to
slow down. You need to think about your performance in terms of IOPS. Now, whenever anybody talks about storage they say, oh, IOPS, I need IOPS, I've got to have the fastest performance. That's not true; you need to really think about your data usage and your use case, because you might not need the very fastest access you think you need. If you do the calculations, you'll have a much better idea of how to balance these kinds of things. And then of course you always need to consider cost. If you just want the latest NetApp gear that gives you the best iSCSI performance, and you've got the coin for it, do it. But if you want a lot more flexibility, a lot better long-term growth, and you want to be open, then NetApp is not the way to go. I'm not bad-mouthing NetApp; I love NetApp, I've been a NetApp customer off and on for a long time, and boy, they know the business, they're good at what they do. But it isn't everything, and I'll mention that here on this last bullet point: you have a lot more hardware options, you can buy anything you want, but you need to also understand that once you get into something like software-defined storage, you're in control of that software environment, and you're the one who has to do the management of all of it. So there are trade-offs. One size does not fit all; you may not really be the best use case for software-defined storage. Software-defined storage doesn't claim to solve everybody's problems on the planet; that's just not the case, and Ceph is not the best answer for everything. One thing you need to keep track of is the network. The network is just as important as the nodes in your cluster. You've got to have a competent network in terms of hardware, a competent configuration of that network in terms of software, and some competent people who aren't going to screw up your network configuration. So you as a storage administrator have got to cooperate and coordinate with your data-center network people, and if you're able to, get it into your agreement with them that they don't touch your hardware, especially your network: I'm in control of my network, because I don't want you guys coming in and messing up my storage network. Then nobody can get their data, and who are they mad at? Me. Not the infrastructure guys, not the data-center guys; they're mad at me, because I'm the storage guy and I let them mess up my data. As with everything, choose the fastest network you can afford, and please don't buy a network you can't afford; it's not worth it. You'll need two separate networks. This may be two separate physical networks, but it could be two separate logical networks; Ceph is perfectly capable of managing that. Typically you need to make sure the back-end communication between all of the OSDs, the private network, the storage network, is roughly two times faster, has two times the capability, of the public network. The public network is the network where the clients come in. And of course, the fatter your infrastructure the better. You want at least 10 gig E. Can you use 100? Yes; 100 is better than 40. But 40 isn't necessarily better than 10, because 40 is just four 10s, and it's just kind of weird. Anyway, we can talk about that more outside if you want, because I'm running out of time.
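For reference, that public/private split is typically expressed in ceph.conf; the subnets below are placeholders, not anything from the talk:

    [global]
    # client-facing traffic
    public network  = 192.168.10.0/24
    # OSD replication and recovery traffic (the back-end storage network)
    cluster network = 192.168.20.0/24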
You need to have competent hardware: good CPUs and good RAM, but just get what you can afford. You'll want good storage controllers, and you don't want to just get some storage controller that can put twenty disks on it, because then where's the bottleneck, and where's the liability? If that storage controller goes bad, did you lose data? If you've got it all crammed into one shelf of twenty disks, you've got kind of a problem, don't you? So spreading it out is better, and having good storage controllers is a good idea. Use your SSDs and NVMes for your journaling and for your RocksDB, taking care of all the metadata on the disks, and use your spinners for the raw data. It's easy to configure this with BlueStore: it's simply a matter of saying, yep, put my data on the HDDs, and now I've got one SSD with these four or six spinners, and that one SSD is the journal and metadata device for all the rest of the spinners. It's a great performance combination. That's all I meant to say about journals and spinners; that's really something for a different talk in a different circumstance, and if we had more time we'd talk about these kinds of things. The great takeaway here is that SDS, as good as it is and as much as I'm trying to sell it to you, is not a golden hammer; it doesn't solve all problems. In small, simple environments, if you only have terabytes of data and your data is fairly static and not going to change very much, then just go get yourself a nice shelf from Dell or from HP and call it good. And particularly for single-threaded, ultra-high-performance requirements, Ceph is probably not your best choice today. We keep working on higher-performance, higher-throughput changes and tweaks, but the emphasis is on reliability and durability of data. So software-defined storage in Ceph best accommodates long-term growth: it gives you the best opportunity to increase capacity and manage more complex data over a long period of time. But it's not the best if there are other things in play that might have you think not to use Ceph. So, sizing considerations, and then we're here at the end. Thank you for letting me spend some time talking about software-defined storage. Like I said, if you don't like my presentation, go see Alan Ott on Sunday; I forget what time it is, but I think he's over in room 107, I can't remember; look him up in the schedule. Alan's a really smart guy, and he'll tell you all the stuff I didn't tell you today. I'll open it up for a few questions; Doug's got the mic. I have two questions. The first question is: you mentioned that if it's designed poorly, that leaves a lot of problems; you hinted at it with how you set up the components and how many you have, but what are the big three things people do wrong when they set it up that cause problems? And my second question: isn't Ceph good for certain types of workloads, as is Gluster, while there are other workloads it's just really not suited for and you shouldn't go down that road? Great questions; let me answer the second one first. He's talking about specific workloads; definitely true, and if we had had more time I would have drilled down into that in a little more detail. Gluster: so IDC, the consulting company, the analysts, classify storage mechanisms into two types; in fact they call them type one and type two, and type one is the traditional variety, which is more what Gluster is after, and that's the
more legacy, common, traditional storage that we see today, and then IDC calls type two the software-defined approach, intended to be more flexible, more agile, with greater scalability capabilities. So if you just have a type-one application for your business, for your work need, for your use case, Gluster is great, or just your everyday run-of-the-mill ext4 on a Dell shelf, and you're happy. But if you have long-term scalability needs with high demand for durability, reliability, and manageability, then Ceph and SDS is the way to go. Likewise, if you're talking about a bunch of small files, then Ceph is not the best; Gluster is a better choice, traditional storage is a better choice. If you're talking about a large variety of files, particularly large files and rapidly changing files, then Ceph and software-defined storage is the better choice for you. Now, back to your first question, which I've already forgotten; what was it? Just shout it at me real quick. Oh, design, yeah, the three bad things that people do. The first wrong thing is that they forget to enable, big, big, what do you call it, jumbo frames. They forget to enable jumbo frames, and you have to have jumbo frames node to node, all through the data path. That's one mistake people make. Another one is they just go too small: they try to put their cluster on 100-megabit network cards and switches, and that ends up being a problem, and you'll be surprised how often it happens. And one mistake a lot of customers make, particularly in terms of hardware, is that they don't realize cables matter. You know, they go into the closet and pull out those old dusty Cat 5 cables that aren't being used and plug them in, and you're not going to get what you need out of those Cat 5 cables. You need to test your infrastructure, hardware and software, going into it. Those are the kinds of mistakes people make. Next question. Just a question on SSDs: you mentioned enterprise SSDs, and I remember researching this when I did my cluster setup (I haven't used Ceph yet), but it seemed to me like the main advantage of enterprise SSDs was the over-provisioning, and it was actually cheaper to buy more prosumer SSDs than to get the over-provisioned enterprise ones. The only other difference I saw was in the firmware of the SSD, and that was related to the RAID controller; it seems to me like you're better off using a JBOD setup and letting Ceph manage it. Fair enough; I'm still going to tell you to use enterprise class. If you go look at Kifu, I can't remember if it's Kifu or some other engineer (I can't believe I'm drawing a blank here), but one of the engineers keeps a website with the performance data of all of the SSDs, and you can see that those consumer-grade SSDs perform great for three weeks and then the performance just crashes. We're not talking about tailing off; we're talking about a spike straight downward, and it's just because they're not engineered to the same specifications. I highly recommend, for a few pennies more, that you get the enterprise ones. Next question? So actually we've got Frank up next, and if you have questions for Craig you can catch him after. Test, test. OK, let's get started, because I have too many slides for not enough time, so I have to be really fast. My name is Frank Karlitschek. I've been an open source guy for a long, long time, for 20 years; it's a bit scary. I started contributing to all kinds of different projects, KDE at the beginning; part of the open source community, obviously, like everybody in
Germany, I guess. And then I founded ownCloud and its successor Nextcloud, which is also the topic of this talk. So I want to talk a little bit about Nextcloud, obviously: what it is, how it can be used, how to get it, how to contribute, and then hopefully time for questions. So, who knows Nextcloud? OK, that's about half. I want to give you a little bit of an introduction, the elevator pitch, at the very beginning. Nextcloud is an alternative to Dropbox, to G Suite, and to Office 365, with fundamental differences. Obviously it's self-hosted: you can host it where you want, you can run it where you want. You can host it on a tiny, tiny Raspberry Pi, you can run it in a big data center, you can actually run it on AWS or somewhere else; you choose where it's running. It's 100% free software; all the components, the server, everything, are 100% free software. And it's all built around distributed and federated ideas; I'll talk about that later. So this is the web interface. If you log into Nextcloud you see something like this: it looks basically like a file manager. You have files in your folders, you can search them, you can filter them, and in the sidebar you can do all kinds of things; you can manage your files with it. But a lot of Nextcloud users actually never see this web interface, because they use our clients, and we have really rich clients. On the left-hand side is an Android app for smartphones and tablets, where you can do all the same things natively on your phone or tablet. In the middle, you might think this is the iOS client; we actually do have an iOS app, but this is the normal file manager from iOS itself, where we have a plugin. You can see there's iCloud from Apple, and just below it is Nextcloud. So Nextcloud is a first-class citizen on iOS: you can actually open, save, store, and do all kinds of things from all iOS apps directly inside Nextcloud, very similar to iCloud from Apple. On the right-hand side you can see a tablet interface, also iOS in this case, where you can see there's more you can do on a bigger screen: manage your files, look at PDFs, and other stuff. So we have very good clients, and we also support all kinds of other rich use cases on modern devices. Here, for example, push notifications: we have rich push notifications, where you can get a notification if someone mentions you in a comment, or if you're running out of quota, and all kinds of other things, and it even works on other devices like smartwatches and televisions, anything that supports this kind of stuff. And we also have desktop clients, of course, for Mac, Windows, and Linux, built with Qt and C++. It's basically for synchronizing folders from your laptop or desktop to your Nextcloud server, very similar to OneDrive or Dropbox, but it's more flexible, because you can actually synchronize to different servers: you can say, I want to synchronize my photo folder to my private Nextcloud, and I want to synchronize my work documents folder with my work Nextcloud, and so on. And there we also have very rich integrations: you can just click on a file directly in the file manager (and again, this works on Mac, Windows, and Linux; on Linux we support GNOME and KDE Plasma) and share with a person directly from the desktop. You get this nice dialog where you can configure expiration dates or set passwords and all kinds of things, and you also get push notifications
on the desktop. So all of this is basically possible on all platforms, and you never have to use the web interface: you can do everything the web interface does, but it also works natively, integrated into all the common desktops. OK, I have all my files in Nextcloud, that's great; what can you do with them? First of all, you can look at them, of course. Here, for example, is our built-in PDF viewer; you can look at PDFs if you want. If you upload pictures, then a picture gallery is automatically generated from all your pictures; again, all on my own infrastructure, self-hosted. It might look similar to Facebook and Flickr and other services, but this is all on my infrastructure. And there's the share dialog on the side: you can always share with other people, set expiration dates, set a password, and so on. We support two-factor authentication. Usually you integrate some kind of LDAP or Active Directory in the back end, and that can do some kinds of advanced security, but we also support two-factor authentication with hardware tokens, or a second factor via SMS or an OTP protocol, so it's very secure. We have this federation feature, which is very flexible. Let's say we have many, many of these Nextcloud servers: that one is my Nextcloud server at home, this is my work Nextcloud server, and here's another Nextcloud server at a service provider, and we can all share folders with each other. From a user's perspective it basically behaves as one big service: it has the same usability and can do the same things as OneDrive or Dropbox, but there is no central server, no central component; it's completely federated and distributed. We use the Open Cloud Mesh API, something we designed together with CERN and other organizations, and other vendors support it too, so it's a fully distributed system. Another concept is the data access engine, where you can mount other existing data sources into your Nextcloud: a Windows drive share or SharePoint, S3, FTP storage, and so on. You can mount all of this into your Nextcloud and consolidate your data into one place. A lot of organizations have legacy Dropbox accounts or legacy Google Drive accounts, and you can say: look, employees, you're no longer allowed to use it directly, but we'll mount it into your Nextcloud, and then you have a migration period of, let's say, a year, moving things over one after another, and then we kill the connection completely. So it's useful for that. And once you have your data aggregated in your Nextcloud, you can do useful stuff like full-text search, where we can use Elasticsearch to index all your files; it doesn't matter where they are, they can be in SharePoint or on some other Windows server, but it's all visible in Nextcloud and can all be found. You can also use other features like file access control, or the file firewall, where an administrator can say: my employees can share Word files bigger than 10 megabytes that are tagged 'secret' to iOS devices, on a Saturday, to an IP address from China, only if they're in the marketing group, or something like that. It's just an example, but I think you get the point: you can really model these kinds of permissions in a very nice, fine-grained way. We support end-to-end encryption, so you can actually encrypt your files on your clients automatically, and on the server there are only encrypted files, and we also support recovery keys and enterprise-grade key management, so it's actually usable in bigger organizations. So
Nextcloud is used at all scales: you can actually run it at home on a Raspberry Pi for three, four, five users, that's totally fine, but it also scales to many, many users. We have a bit of the opposite problem, in that we also have very big customers: our biggest installation actually has 20 million users. It's a big service provider; 20 million users. With that kind of architecture you want more than one data center, obviously. This is a cluster setup, sure, but even a cluster setup comes to its limits, so an installation like that you want to run across different hosting centers on different continents, distributed, and that's possible with the Global Scale architecture; we can support that. We have a plugin for Outlook. A lot of companies have Outlook used by all employees, and we have a plugin where, when you attach a file to an email and press send, the attachment is automatically uploaded to Nextcloud, a share link is generated and put into the mail, and then the mail is sent. The advantage is, first of all, you can send unlimited attachments of unlimited size, and also you don't waste space on the Exchange server, which is very expensive, and you can still know who opens the file, because you have full auditing if you want it. There are groupware functionalities built in: calendar and contacts management, and email management if you want. Not everybody uses every feature, by the way; you can enable and disable them as you want. But that's actually one of our favorite features for people who want to get away from Google: there's very nice calendar, contacts, and email functionality. We even have project management functionality as a plugin in Nextcloud; this is a Kanban board very similar to Trello, 100% open source and 100% on your infrastructure. A new feature is in the social area: we actually implemented the Mastodon API. I don't know how many people know Mastodon here; a few, yes. It's a very nice distributed, federated, open source social network, very similar to our philosophy. So last year we implemented the API, so every Nextcloud server can also be a node in the Mastodon distributed social network: you can easily share files and photos into the social network, and also see, in the activity stream, messages from other servers and other users. There's a feature called Nextcloud Talk; that's a communication feature, and it can do different things: audio and video calls and chat, for one-on-one communication but also for group communication. The basic functionality is something like chat rooms, so it looks very similar to Slack or Microsoft Teams, where you have different rooms and communication messages, and we also have native iOS and Android applications. Here's the iOS application, I think, where you can also chat and get push notifications if you're mentioned in a room, for example. But again, everything is open source, it's always open source, and this can all run on your infrastructure; not like Microsoft Teams, not like Hangouts, not like Slack or the others: on your infrastructure, very secure. Of course you can also share documents in your chat channel: if you have a document, a picture, a Word file or something, you share it into the chat channel, you see a preview, you can click on it, you can even view or edit the document in this room. Press the start-call button and it transforms into a video call. This is again something that's using WebRTC; it's an open standard, and all the communication is again peer-to-peer and fully end-to-end encrypted. With many, many
users, I mean, out of the box it supports like three, four, five users in a room, and then it just reaches the capacity of this peer-to-peer communication, but we have an additional component called an MCU, and if you also install that component, then you can really have 100 people in a room. Then, of course, document editing is another very important feature. We can actually edit Word files and Excel documents directly in the browser, and we have two solutions: one is Collabora Online, which uses LibreOffice on the server, and another component called OnlyOffice, which does something similar with a different approach. So you can edit your documents directly in the browser, similar to Office 365, and of course you can also do this with other people in a team: in the sidebar you can invite other people to the same document, and then you have several cursors at the same time in the same document, and it can be a LibreOffice ODF/ODT file or a Microsoft file; it all works. Now, if you have five people editing the same document, maybe it's a bit confusing and you need to coordinate; because of that, you can also have, in the sidebar, a chat channel and a video call with the people editing the same document at the same time. I'm actually very proud of this feature, because here we're actually able to innovate above Google and Microsoft, because they can't do that: Skype for Business, for example, is a completely different window; it's not the same thing as Word Online. So this is something that's very innovative. And then we can also do spreadsheets, of course, and I picked an example with charts and formulas, where you can see that even complex Excel sheets can be edited like that. Here's a screenshot of the same interface on an iPad, and the same exists on Android devices too; we have native integration, and the native applications can do the same: you can browse your documents, click on a Word file, and get an editor on the side, and the same with the chat and video calling feature. So it's actually very powerful, and it's just one of the many, many examples of the functionality available in Nextcloud. We have something called an App Store; it's not really a store, because everything is free software and free of charge, of course, but it's a repository from our community, and we have over 200 plugins for all kinds of extended functionality. OK, so this is all great, but of course you're asking: how do I get this, and what's the real difference from Google and Microsoft and the others? Well, first of all, the very easiest way is to just download one of the clients: you go to the iOS App Store or the Google Play Store, download the client, and then there's a button to create an account on our server; no, that's not what happens, because we don't host any of your data, and we never will. We're not a service provider; we only do the software. But if you click the button, you're redirected to one of the many, many service providers who already offer Nextcloud, and they can create an account for you if you want that. But of course, in this audience, you really want to host it yourself, and for that there are also many, many options. You can download it from our website; there are also packages, for SUSE for example and for other distributions, so you can very simply install Nextcloud and everything works out of the box. If you prefer something a bit more preconfigured, there are also virtual machines available for download, Docker containers, snap packages, basically all kinds
of ways to get the software. There are even companies and organizations that offer devices; there are some companies where you can buy a device with Nextcloud pre-installed (I think there's someone here who sells one of those, yes). Again, we don't do this, because we're only an open-source software organization, but many, many other organizations actually sell hardware with Nextcloud pre-installed, which is very nice. And of course, if you're a big organization with a few 10,000 to 100,000 users, then we're also happy to provide enterprise support and an enterprise subscription. We have the same business model that SUSE and Red Hat and others have: you get an enterprise subscription, like you would for SLES, if you want. If you want to contribute, that's also very easy, because Nextcloud is a real open-source project. We counted over 1,800 volunteers lately; over 1,800 volunteers contribute to Nextcloud. And it's not only code, of course; there's more: translations, testing, marketing, and so on. We're really completely open: everything is on GitHub, all the feature requests, all the bugs, the roadmap. You can really be part of it; all the software components are there, you can have a look. We also care about diversity: we have a new initiative called Nextcloud Include, where you can fill in a contact form if you want support in the area of mentorship, or internships, travel support, and all kinds of other things, so you can become part of it. OK, the summary; I'm running out of time, I know. I want to repeat a slide that I showed at the very beginning: Nextcloud is a real alternative to Dropbox, to G Suite, to Office 365, with the difference that it is self-hosted, open-source, and distributed and federated. So in a way we're a bit like the anti-cloud solution: we're basically giving you the same functionality you can get from these nice big organizations, but you can run it on-premise, open source on top of open source, on openSUSE or SLES or something, and you have it under control and supported, and you keep your data where you want to keep it. You don't have to give your data to Google or someone else; you can keep it in your organization, in your stack, in your hosting center, together with your proven other Linux tools as before, but your users will have the nice new features they know and love from these cloud companies out there. Thanks a lot. Question: does your 2FA, your hardware key support, also work with the apps and desktop sync clients, and if so, how?
I think it should, yes. About one and a half years ago we introduced a new login flow, which is a bit inspired by SAML, which means that in the clients (and this is the same for the iOS, Android, Mac, Windows, and Linux clients), if you want to authenticate against the Nextcloud server, you go through a web view where you authenticate with the browser, and an access token is passed back to the client, and that token is then used to authenticate. So this should work; I'm not an expert in this area, but I've heard it should work. Anyone else have a question? Hi, this is a very interesting platform. I'm wondering about the clients you mentioned: the browser client, is that something Nextcloud develops, sort of a client bound to the server? And if, for example, I were to set up hosting myself and offer it as a platform as a service, how would the clients access it? Is there an alternative to the browser, or does one develop their own client? Sorry if I wasn't clear about it; we're just running out of time. So there's the web interface, of course, but a lot of people never use the web interface; we have native clients for Mac, Windows, and Linux, and you can run them on your desktop. They run as a small application in the system tray, synchronizing the different folders between your desktop or laptop and your Nextcloud server. Can you speak up a bit about the Windows client? There are still so many people running Windows machines; do they have to package it to run within a virtual window, or run it through a browser in order to access all of this? You don't have to use the browser at all; there's a fully working standalone Windows client. You can install it, and it runs in your system tray in the background, very similar to, really the same as, Google Drive and Dropbox and OneDrive. You don't need the browser at all if you don't want to use it. One other question: so I assume all of the actual file transfer happens through WebDAV? Exactly. And how do you deal with the performance hit? So, on performance, there are two things. The performance in our experience is not really a big problem. I mean, sure, it all goes through HTTP and through TLS, and there's some checksumming and stuff, so there might be a little more overhead than using, I don't know, CIFS or something, but on the other hand I wouldn't use CIFS over the internet; you need the kind of encryption that HTTPS gives you. So I don't think there's a real performance hit there. There is, of course, a bit of a problem if you talk about big files, the second part of your question, and that's true: this kind of transfer isn't really reliable if it runs for an hour or something, if you transfer a five-petabyte file or something. For that we implemented chunking: files are actually split into pieces, transferred separately, and collected and combined together on the server side, and if the network connection goes down in the middle, it doesn't matter; it remembers which chunks were already transferred and which weren't, and retransmits the missing chunks when the internet connection works again. So we don't really have a file size limit.
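For anyone curious what that WebDAV transfer looks like from a client's point of view, an upload is just an authenticated PUT against the server's DAV endpoint; the hostname, username, and app password here are placeholders:

    # Upload a file into alice's files over WebDAV
    curl -u alice:app-password -T report.pdf \
        https://cloud.example.com/remote.php/dav/files/alice/report.pdf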
It should be quite fast, too. If you just used a more native protocol like scp or something, I'm sure you'd get a few percent better performance, but all this happens in the background, asynchronously, so most of the time users don't really see the difference. This is impressive work; I hadn't heard about this until today. Can you talk a little bit about the finances? There are obviously lots of people involved in producing a product like this; are you getting most funding from large enterprises or governments? So, our business model; it's actually quite nice to talk about this in this audience, because you probably understand it: it's very similar to what SUSE is doing, I have to say. It's free software; you can download it, you can use it, same as you can use openSUSE, and that's all fine. We think that if you run something like this in a mission-critical way, in an organization with a few thousand employees, then you probably want an enterprise subscription: you probably want 24/7 support, you probably want a hands-on workshop to deploy everything, you probably want security patches guaranteed within a certain timeframe, and so on; the same stuff you get with a SLES subscription is the same thing you get from us. So we make basically 100 percent of our money with the big companies, and governments too; the German government is a large Nextcloud user, with 350,000 users. So what is the difference between ownCloud and Nextcloud, and why would I prefer one over the other? I have a talk about this tomorrow, and I'm probably a bit biased, sorry, so you really should come to my talk tomorrow. The quick answer is: I founded ownCloud, and there were some things that didn't work out so well (more about this tomorrow), and because of that the core team, including me, founded Nextcloud as a successor. I think a few things are better now, including, for example, the iOS app, which used to be non-free software and now is free software, and also free of charge. So I'm sorry you lost some money paying for the ownCloud app, but the app is free now, and many other things are also better now; more about this in the talk tomorrow. My questions are about encryption: are the voice and video calls all encrypted, do you have a secure encrypted file transfer solution, and are you using some kind of private key infrastructure for that? It's really hard to understand up here, so let me repeat: the question is about encryption. Are the voice and video calls fully encrypted? Yes. And for file transfers, do you have secure encrypted file transfer mechanisms, and a public/private key infrastructure you have to implement on top, and what does that look like? So, I could talk about this for a long time, for hours. There's encryption on a lot of different levels. First of all, the transfer is of course all encrypted with HTTPS, with TLS, obviously, and this is for everything: files, video calls, audio calls, push notifications, chat messages, everything. So transfer is always encrypted. Then, for files, we also have encryption of files on the server, server-side encryption on storage, and this is useful, for example, if your storage isn't running together with the Nextcloud server. Earlier we had a talk about Ceph; let's say you use some Ceph object store, and it's provided by a different organization, or somehow you don't trust the administrator of the storage. Then we can store only encrypted data on the object store if you want: the Nextcloud server encrypts the file and stores it encrypted on the storage. That's possible. And the third way that's possible
is full end-to-end encryption. In this scenario we encrypt the files on the clients, on the macOS, Linux, Windows, iOS, and Android clients; all our clients support this kind of encryption, and the files are stored encrypted, through the Nextcloud server, on the storage. This is also possible; it's the newest, most advanced feature, but it comes with some drawbacks; for example, the web interface doesn't work anymore. So, because of that, we support different kinds of encryption for different use cases, and you have to decide what you want and what exactly your attack vector is. OK, we're going to shut this off; if you have a question for Frank, you can probably talk to him outside or go to the openSUSE booth, he'll be around. And if you want to learn how to install Nextcloud on one of the distributions, including openSUSE, Matt's up next. So thanks, Frank. OK, are we live? Can you hear me? Awesome. OK, so my name is Matthew, and the purpose of this workshop is to walk through installing Nextcloud, from a fresh install of openSUSE Leap 15 up to a running Nextcloud instance. I don't see very many laptops, so I'm guessing nobody wants to walk through it with me, but we have an hour to do this, so if you have a VM, or if you have a Leap laptop and you want to walk through it with me, we're going to talk through the installation process, and it's going to be, like I said, from a fresh install to a working server. Before we get started: this is my first time giving this presentation; in fact, this is my first time giving any presentation. I am, whether you can tell or not, a bundle of nerves up here. OK, good. So we're going to install the LAMP stack, we're going to configure the database, and we're going to install Nextcloud; that's what we're going to do. In order to do that, you're going to have to place a little bit of trust in me, and if you're going to trust somebody, then you really ought to know, well, two things: who are you, and what gives you the right? So let's talk about who I am and what gives me the right. I said my name is Matthew McGraw. I'm a full-time dad; those are my three kids, a little girl and little boy on top and my big boy in the middle. This picture is unfortunately a couple of years old, but I didn't have a newer one with all of them together, and of course my other half is there as well. I have done some podcasting: I have contributed to Hacker Public Radio, if you're familiar with HPR, and I have a show called The Geek Dad Show, which I do post to occasionally; the website is there at the bottom, norcalgeekdad.com. I'm on Twitter and I'm on Instagram and I'm on Telegram, and I'm 'geekdad' just about everywhere, sometimes with 3s and sometimes with Es, depending on what's available. So feel free to follow me; if we get to the end and you have questions, I'm happy to communicate with you and do whatever I can to help you get a Nextcloud instance running. OK, so what gives me the right to come up here and stand in front of you and give you this information? The most important thing is that it's open source, and one of the best things about open source is that anybody can dive in, learn something, and then share it with other people. I have been a volunteer at the Nextcloud booth here at SCALE three different years, and I really enjoy talking to people about Nextcloud and what it can do to make your life better and solve
your problems; if you've got something we can solve, I want to help you try to do that. I am not a developer; I'm not a contributor of code, that's not really my wheelhouse, so if things mess up, I still use Google and try to figure out where I went wrong. I do run my own instance of Nextcloud; I run it on a DigitalOcean droplet. This is not an advertisement for DigitalOcean, but it's a great way to get started if you want to do that. They don't have a native openSUSE image yet, but they do allow custom images now, so you can go to the Build Service, get a qcow2 image, upload it, and bootstrap your system that way. I have done that 15-20 times over the last couple of weeks as I've been walking through this talk, and it works remarkably well. Also, in doing this, I'm making some assumptions about you: like it says, you're probably an openSUSE user if you're at the openSUSE Summit, right on. I'm actually a new convert to openSUSE; I've been running it on my laptop here for about six weeks, Tumbleweed, and I really enjoy it. We're going to do some command-line work in order to get this installed, so I'm assuming you're somewhat comfortable with that. Thanks to Frank's wonderful talk before this, I'm hoping you know what Nextcloud is, and that you're interested in leveraging it for your own use. And I think those last two are the most important: sometimes things don't go right the first time, but I've been through this talk, I've given it to myself and walked through the procedure, and it works, so I'm hoping it works today; and if it's not fun, then we're doing it wrong. So we're going to skip 'what is Nextcloud anyway', unless anybody has a question about what it is. Yes, sir, in the back; you're asking about the VM image. OK, I don't have exact numbers for the image, but I do know that with everything installed it will run off of a 16 GB Raspberry Pi SD card. That doesn't give you a lot of storage, but I have run it that way. In terms of just having the LAMP stack and the Nextcloud app installed, I think the image I used on DigitalOcean was approximately 200 to 300 megs; I didn't start with a full install of Leap off the CD, I used the basic image. I'm sorry, I can barely hear you; oh, definitely not, yeah. Oh, sure, sure; so the question, thank you, the question was: how much space do you need to allocate on your VM to run a bare-bones openSUSE Leap plus Nextcloud and the LAMP stack, and the final bit of the question, if I heard correctly, was whether it will be under three gigs, and there shouldn't be any problem with that. Obviously your storage container, or whatever you're using for your storage backing, is going to have to be bigger than that, or that's all the files you'll be able to store on the server, but as far as just running the application, that should be fine. OK, so this is where I started when I was doing the research for this talk: a fresh install of Leap, using the JeOS, the 'just enough operating system' image of openSUSE. The first thing I did was a good old 'zypper up' to update the system, and I also added a command-line web browser, w3m, and wget, which we'll need for downloading the Nextcloud app.
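On that fresh JeOS image, where you're still working as root, those first steps boil down to something like this (sudo comes in the next step):

    zypper refresh
    zypper update            # the good old "zypper up"
    zypper install w3m wget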
The JeOS image doesn't include sudo, so if you want to run an unprivileged user with privilege escalation, you'll need to install sudo as well. If you need to create an unprivileged user to use with escalation instead of running as root, it's a really simple 'useradd -m', which gives you the skeleton files, the home directory for the user, and the standard shell, usually bash, and you're all set to go. When it comes to sudo, the default for Leap is that the user running sudo types root's password, not their own. If you're just running a single instance where you're the only person that needs escalated privileges, that's probably okay, but you probably don't want to hand out root's password to everybody if you're running in a multi-user environment. So there are a couple of lines in the visudo file that you delete; if you want to search in the file, you're looking for the term 'targetpw', t-a-r-g-e-t-p-w. By default openSUSE sets that target password to root, and if you delete those lines, then you have more control over who gets access to escalated privileges. At that point you're probably also going to want to add your user to the wheel group and uncomment the line that allows the wheel group privilege escalation. That's how I would recommend doing it on a multi-user server. So, to get started, you log into your system and you're staring at that blinking black-and-white screen, or green, depending on how you set your terminal up, and this is where we start. A lot of this is command line, and you're going to need escalated privileges, so if you've set up an unprivileged user and you don't want to keep typing sudo, then 'sudo su' will switch you into a superuser shell until you exit out; that's one way to get around it. If you're like me and you get scared entering commands, because I don't want to erase my system or something, then you can type sudo every time, and that works just as well. Again, I'm assuming you're installing this on a headless server or a VM, where you're just at a console and not at a desktop with a terminal emulator. So YaST is your friend. You can install the stack with zypper, but one of the benefits I find to using YaST instead of zypper at the command line is that you don't have to know the exact name of the package: if you run YaST, you get a search box, you can type 'lamp', and it will show you the LAMP stack on screen; you click it, hit apply, and it will get that full stack. Obviously Linux is already installed; Apache is the web server; MariaDB, or Maria, I've heard it both ways, anybody know what's official? OK, I'm going to say Maria because that's what I've always said; and PHP. In Leap, the versions of these tools are fairly current, and I haven't had any problem installing and running the Nextcloud application on the tools from the repo, so you should be set to do that, and then obviously you're going to continue to do your updates as you maintain your server. That step will take as long as your internet connection takes to download the packages and for YaST to run through the installation.
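Pulled together, the user and sudo setup might look like this from the command line; the username is hypothetical, and the pattern name is what I believe the YaST 'lamp' search resolves to on Leap:

    zypper install sudo
    useradd -m matthew                      # -m creates the home directory with skeleton files
    usermod -aG wheel matthew               # wheel group for privilege escalation
    visudo                                  # delete the 'Defaults targetpw' lines, uncomment the wheel rule
    zypper install -t pattern lamp_server   # Apache, MariaDB, and PHP in one shot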
The next thing we do after that is tell systemd that we've installed Apache. We use the systemctl command; we want to enable apache2 so it starts on boot, but we don't want to have to reboot the system, so we also go ahead and start Apache. Yeah, that's the default when you install a LAMP stack from openSUSE Leap; and if you want to install Apache with Postgres, or Nginx with Postgres and PHP, as far as I know (Frank, correct me if I'm wrong) those tools work just fine, it's not an incompatibility, but for a new user, where maybe this is the first time you're setting up a web server, this is the default when you install LAMP from Leap. And the best thing to do after you have started a web server is to make sure the web server is working. If you're running a VM on your laptop and you've got a graphical web browser, you can go to localhost in your browser, and that's great, but we installed w3m at the beginning, so if you're just at the command line you can type 'w3m localhost' and it will pull up the Apache 'It works!' page. That's all it says; it's a blank page that says 'It works!' at the top, and if you see that, you know you've got Apache running. Now, because of the way Leap handles the firewall, if you are not at the system you're installing on, if you're only SSHed in, for example, then you're going to want to go ahead and run these firewall commands to open up port 80 and then reload the firewall rules. I went round and round, because I was doing this over SSH from my laptop to my home server, and I could not figure out for the life of me why Apache wasn't reachable, even though when I did a test it said it was running; I went round and round about that until I realized that SUSE basically denies everything through the firewall out of the gate. So, as soon as you open up port 80, you're usually good to go. Then we have to tell systemd about MariaDB, and those two commands at the top, you'll notice, are the same pattern: we enable it so systemd knows to run it at boot, and we start it so we can mess with it. And then this next step, the mysql_secure_installation script: I would recommend doing it even if you're just running this on your laptop, in a VM, whatever, mostly to get in the habit, because if you ever install it on a public-facing system, this is a great script to harden your database installation. Out of the box, Leap 15 does not set a password for the root user of the database; it is 'root' with an empty password. So the very first thing that happens when you run mysql_secure_installation is it asks you to set the root password, and as you can see in bold type: don't forget this password. It's possible to recover, but it's not pretty and it's not fun. Then you're basically going to hit enter or 'y' through the rest of the script. What it's doing is removing the demo databases that are installed as an example, deleting an unprivileged database user, and flushing all your tables so you're ready to start. So if you run through that step, you'll have some initial hardening of your setup for security purposes.
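Collected in one place, the commands from this stretch of the walkthrough look something like this (run as root or via sudo):

    systemctl enable apache2 && systemctl start apache2
    w3m http://localhost                          # expect the "It works!" page
    firewall-cmd --permanent --add-service=http   # open port 80
    firewall-cmd --reload
    systemctl enable mariadb && systemctl start mariadb
    mysql_secure_installation                     # sets the root password; don't forget it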
PHP modules are tricky sometimes. When Leap installs PHP (it's installing PHP 7 right now), it installs a bunch of modules, but these are the six or seven that, when I got to the end of the install, Nextcloud complained it didn't have access to, so these are the PHP modules you may have to install separately. I don't know if Tumbleweed installs any of these; I didn't run through it on a Tumbleweed install. Just so you know, and I should have mentioned this already: all the slides, the markdown, the art assets, everything is going to be on my GitHub, or available through my website, so if you want to follow along and you don't have a machine to do it with today, you will have access to all of this information to do it at another time; don't feel like you have to memorize all of these or write them down. Excuse me. phpMyAdmin is a great way, if you're not a database god, to add users and create databases; it just makes it easier, particularly for newbies. If you are a MySQL god and you can do this from the command line, go for it. What you need is a user, you need a database, and then the user has to be connected to the database; I believe that if you know what you're doing with SQL commands you can do that all in one line, but I can never remember how to do it, and phpMyAdmin doesn't take up that much disk space, and it makes it just that much easier to create your users. So, again, we're going to create a user with a password; do not forget this password. Yeah, on my instance that I use every day, my personal instance on DigitalOcean, I don't have phpMyAdmin installed; I don't know all the steps to hardening that particular attack vector, so depending on your situation and your security level, you may not want to use phpMyAdmin in this case. The first time I set this up was on shared hosting, and they had a cPanel plugin and I could set up the database that way. That's a good question, though; thank you. So now we have our LAMP stack installed, we have Apache set up and listening, we have our database ready for data (excuse me), and we have PHP ready to process, and the next thing to do now is actually get down to installing Nextcloud. The version of Nextcloud in the repos for Leap 15 is Nextcloud 13, which is not current; our current version is 15-point-something. (Can you hear me now? Aha, there we go, much better. Yeah, go ahead.) So you can, I believe, get it from the Open Build Service, and if you are on Tumbleweed, a 'zypper in' of nextcloud installs version 15, so that's a little bit of a cheat: if you're willing to run Tumbleweed on your server, then you can just zypper-install Nextcloud and get at least the base version; you may have to upgrade to the point release, but you can get the base version straight from the repos. But the other thing is, if you are setting this up on another operating system, the package archives are not always kept up to date, and the best way to get the freshest version is to go straight to the source. Obviously you're not going to be able to copy-paste this URL, because the version number changes, but you should be able to find the correct URL on the website. So you can either download it through your browser, or you can use wget in your web server's document root, and then it's just an unzip. On openSUSE, the Apache user is wwwrun and the group is www, so we want to change ownership of the whole nextcloud directory, recursively, to that user and group.
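As a sketch of the download-and-unpack step (x.y.z stands in for whatever the current version is; check the download page for the real URL):

    cd /srv/www/htdocs          # Apache's document root on openSUSE
    wget https://download.nextcloud.com/server/releases/nextcloud-x.y.z.zip
    unzip nextcloud-x.y.z.zip
    chown -R wwwrun:www nextcloud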
That way, when we launch our web browser, the web server has permission to read and write the files it needs to configure and run the application. After you've done that, we need to restart Apache so that it's aware of the new files in the document root.

There's one more step that is not always recommended; I've seen people do demos and install tutorials without it, but it's something I always do when installing Nextcloud, and that is to move the data directory outside your web server's document root. When Nextcloud unzips, it installs a folder called data under the nextcloud root, and that's where all your user data lives: all the stuff you're syncing, all the content you're putting on your Nextcloud server. If that data directory stays under your document root, there's a chance somebody could exploit your web server and get access to that data. So I usually make a data directory in the root of the filesystem on the server. You can put it on your NAS, on an external drive, wherever you need that data stored; as long as it's mounted into your server, you can put the path in when we configure it in the next step, and I'll show you where that information goes. So we just create the directory and, again, change the ownership so that the web user, the Apache user I should say, has access to the files (the commands are sketched below).

After we've unpacked the files and restarted Apache, we should be able to open a browser and go to localhost, or 192.168-dot-whatever your server's address is. By default it goes into the /nextcloud directory under the document root, so the web address will be mydomain.com/nextcloud. If you don't want to run it that way, you can copy all the files from the nextcloud folder up one directory and have it at the root of your web server. There are hidden files, though, so make sure that however you move them, whether through the command line or a file manager, you take everything, including any hidden files that might be there.

You can see I've circled in red where you put the path to that data directory you created. By default it's going to want to put the data in your document root, under the nextcloud folder, so if you moved it like we did in the previous step, be sure to change it here as well. At this stage you're also going to create an admin user. This is another thing that depends on your personal philosophy and workflow. You can just create a user called admin or administrator; probably not admin, because anybody who's going to bang on your box is going to try the user admin first. Make it webmaster, make it administrator, make it NextcloudManiac, I don't care, but the very first user you create is going to be the admin user for Nextcloud, and you're going to create a password. Remember this password.
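Back at the shell, the restart and the data directory move from a moment ago look about like this. The /data location is just my habit; any mounted path works:

    # make Apache aware of the freshly unpacked files
    sudo systemctl restart apache2

    # keep user data outside the document root; /data is my
    # preference, a NAS or external drive mounted in works too
    sudo mkdir /data
    sudo chown wwwrun:www /data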
At the bottom there's an option if you're running a really light system. When I run this on a Raspberry Pi I don't usually use MariaDB or a MySQL clone; SQLite works if you're only dealing with small amounts of files, like you might on a home network. But for this installation, since we installed MariaDB, we're going to select that here, enter the database user and password we created earlier and didn't forget, and the database name. I usually use nextcloud as both the database user and the database name, and I've not had any trouble doing that, as long as I have KeePass generate me a nice long password.

Then comes the moment of truth: right down there at the bottom we hit Finish setup, and hopefully this is what we see. This is your first look at your Nextcloud instance: a little splash screen, and the Files app right behind it. I don't remember exactly which apps are in core, but I believe it's Files, Calendar, and Contacts; I don't remember whether Gallery is in core. It's going to install a few apps, because, as you may remember, Nextcloud isn't just an app, it's also a platform, and we have open source contributors and Nextcloud developers creating additional functionality that you can plug right into your existing instance. I use the Calendar and Contacts functions every day. I decided I didn't want Google to have my contacts and my calendar anymore, so I pulled everything out, synced my Android smartphone right up to my Nextcloud instance, and it holds all of my contacts and my calendaring. There's functionality within the Calendar app to subscribe to Google calendars, and I believe there's some way to connect to Outlook. Is that right? Okay, maybe not that. But it is running a full CardDAV and CalDAV server, so any application you have on the desktop or on your phone that speaks those protocols can sync your contacts and calendaring that way. Thunderbird has calendaring, and there are several calendar apps for Android that will also connect up that way. So hopefully that is a full installation, ready to go.

Are you talking about when you log in from a new... yes, right, so it's going to ask you to approve the domain or the IP address that you're serving it from. I've only ever run it on one server at a time; I haven't done any load balancing or anything like that. Sure, that's a good point. His comment was that when you first log in to Nextcloud, it's going to ask you for your admin username and password to verify that the domain name you've typed in is the one you meant to type in. You're telling the system: yes, this is the name of this server and I am approving it. That is something you have to do on a new install.

After that, you can create users. Users can be put into groups, so you can give access to different parts of the system based on which user group people are placed in, and any of the users can be granted administrator privileges. If you don't ever want to log in as admin again, you can give your daily user admin privileges, which is what I do, since I'm the only user on the system. But if you were in a situation with many users and you wanted to control access, you could do all your admin work through the admin account and parse out privileges to the other users depending on your use case.
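If you'd rather handle that trusted domain approval from the command line, Nextcloud ships an occ admin tool that can set it. This is a sketch; the path, the array index, and the domain are example values:

    # run from the nextcloud directory as the web server user;
    # the index (1) and the domain are examples
    cd /srv/www/htdocs/nextcloud
    sudo -u wwwrun php occ config:system:set trusted_domains 1 --value=mydomain.com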
Nextcloud is all about syncing. Frank mentioned a few of the clients that are available: there are Windows, Mac, and Linux native desktop apps for syncing, and also Android and iOS apps. I've used the Android, Linux, and Mac apps, and they work flawlessly; I haven't had any trouble. I've heard people say there's a lag between when they hit save on a document and when it shows up on their server, but I've never had more than a couple seconds between hitting save and my sync icon in the system tray changing from a check mark to busy and back to a check mark. Again, it's going to depend on your upload speed and how big the file is, but in terms of initiating the transfer, it's very fast in my experience.

As I mentioned, it's an app platform, so you can go to apps.nextcloud.com and browse through personal information managers, media players, and more. One of my favorite apps to run is the Notes app. If you're an Evernote or Simplenote user and you like being able to sync your notes, this is a great app to use. I have a desktop app on openSUSE called QOwnNotes that speaks to the Nextcloud Notes API, so I can edit my notes in a native app on my desktop and it syncs right up to my Nextcloud instance, and I also have the Nextcloud notes app on my phone. So if I need to reference my grocery list, or the meal menu the kids and I made, or the list of movies I want to rent or borrow from the library, I can get to all those notes from my phone on the go, which is really nice.

Anybody familiar with Ampache for streaming music? Anyone? Depending on your bandwidth, it's possible to create your own self-hosted Spotify by installing the Ampache app and any number of Ampache clients from F-Droid or the Google Play store; I'd assume iOS as well, but I'm not an iOS user, so I don't know. You can connect to your server and play the music stored on it, streaming over your network connection, to your desktop too, it doesn't have to be your mobile device. And there are pages and pages of apps, with new ones being developed all the time, so it's very possible you'll find something you didn't know you always needed but will soon find you can't live without.

The last thing I'd say is that if you're running a public-facing, internet-facing server, you have to do your Let's Encrypt. Be sure to apply your TLS certificates so you have HTTPS, so you have the encrypted file transfers Frank was talking about in his talk.
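The Let's Encrypt step is roughly this. The package names are my best guess for openSUSE, so treat the whole thing as a sketch:

    # certbot with its Apache plugin -- package names vary by distro
    sudo zypper install certbot python3-certbot-apache
    sudo certbot --apache -d mydomain.com
    # certificates expire after 90 days; schedule "certbot renew"
    # via cron or a systemd timer to keep them current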
So I think that's it. Thank you for coming. There's the link to my website, which of course is in blue so you can't read it: norcalgeekdad.com/scale17x, all put together. You'll find links there to the slide deck, the markdown, the icons I borrowed off the openSUSE Summit website, all of the assets I used, including a link to my GitHub repository and a link to download a JeOS build of Leap 15 in qcow2 format that you can upload directly to your hosting provider, or boot directly through KVM or Xen on your local network. I have those things stored on the Internet Archive, so you should be able to get to them without any trouble, and those links are all available at that URL if you want to check them out.

That's all I have. I'm happy to answer questions if I can. Please come to the expo floor, to the openSUSE / Nextcloud booth, and feel free to chat more. If you go back to the hotel tonight, or your house, and you try this and it doesn't work, come let's talk about it. Are there any questions that haven't been asked?

So, I am one user on my server and I don't use a lot of storage. I run it on DigitalOcean's lowest-level VPS, which is five bucks a month: one processor, a gig of RAM, and a terabyte of transfer. If you're not uploading the family video library or your 2,000-CD music collection, you're probably not going to go over that. What I do do... okay, we can all laugh at that... what I do is also pay for 20 gigabytes of additional block storage attached to my server, which I just mount in as a mount point through the Nextcloud interface, and it handles that no problem. I like to store my podcast files up there so they're not taking up room on my hard drive, and I don't sync those to all my machines.

I don't know if Frank mentioned this during his talk, but another really cool thing about the desktop sync client, at least on Linux and Mac, is that you get some granularity in what you're syncing. For example, on my ThinkPad I sync the whole server except for that podcast folder. On my netbook I sync my documents folder, some photographs, and some SSH keys so I can set up new systems, but I don't have to sync the whole thing: I have some granularity over which folders and subfolders of my instance I sync to which device. The other nice thing about the Android client is that it will browse your file system but won't download anything until you tell it to. So if you want access to your files on your mobile device without taking up all its storage space, you can browse through your subfolders, find the song or the movie or the document or whatever you want on your device, and download it at that time. I believe that same functionality is coming to the desktop client, so you won't have to sync your entire instance to still be able to browse the whole thing and download as needed. Is that right? Okay, so that's coming to the desktop client as well. Anything else? Yeah? I'm sorry... oh, yes, right. So the comment, for the benefit of the video, was that you don't have to use only the local storage on your server. You can mount in Dropbox, and I believe Google Drive, AWS, SharePoint, not to mention FTP, SFTP, and Samba. You can mount all these different storage solutions onto your instance, and Nextcloud just treats them like storage.
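To make that block storage point concrete, attaching an extra volume looks something like this. The device name and filesystem are assumptions; check lsblk for yours:

    # format and mount the attached volume -- /dev/sdb is an
    # assumption; check lsblk for the real device name
    sudo mkfs.ext4 /dev/sdb
    sudo mkdir /mnt/blockstorage
    sudo mount /dev/sdb /mnt/blockstorage
    sudo chown wwwrun:www /mnt/blockstorage
    # add an /etc/fstab entry so it survives reboots, then add
    # /mnt/blockstorage as "Local" storage under Nextcloud's
    # external storage settings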
That's another thing Frank was mentioning when he was talking about crypto at the end of his talk. If you already pay Dropbox for gigs of storage, and your concern is that you don't want to be responsible for managing all the backups and you like the convenience of using Dropbox's servers and not yours, you can utilize the encryption client-side, or even through the Nextcloud server, and what you end up storing on Dropbox's servers is fully encrypted blobs that they couldn't read even if they wanted to. And that's true, I believe, for any of the storage options you can mount into Nextcloud: it will treat that storage like dumb storage and just store encrypted data on it. So if you already have existing infrastructure, you can mount it right in and utilize it that way.

And then you have WebDAV access to your files, so if you don't want to run the sync client but you've got a Windows machine and you want to mount a web folder through the file manager, you can do that directly onto your instance. So it's super convenient, it's super robust, it's a lot of fun to play with, and it's a lot of fun to know that you're hosting your data, not Google or Apple or Amazon, unless that's where you choose to store your stuff.

Anything else? All right, well, thank you guys for coming and muddling through it with me. I appreciate it.