Okay, please welcome Jeremy and another presentation about Samba.

Thanks very much. So I'm going to try and speak up because there's no amplification. Can you hear me okay at the back? Yeah? Okay, cool. So my name is Jeremy Allison. I work for Google in the open source program office. But I really want to make it very clear: there's my samba.org email address on the presentation, and this is not a Google-badged presentation. Just in case there are any reporters hiding in the corners or something: this is not an official Google presentation. This is not what Google thinks. This is not what Google says, et cetera. Any Google lawyers in the room should run screaming for the hills because nobody's reviewed it or seen it.

So, having said that, let's dig into SMB3 Unix Extensions. This is going to be a reasonably technical talk. It's not going to dive down as deeply as I could. If you want more detail, please feel free to ask questions at any point. Stick your hands up; I'm happy to be interrupted and take questions in the middle of the talk.

So, first of all, let's go back one step: does everyone here in the room know what Samba is? Anyone who doesn't know what Samba is? Okay, cool. And SMB3: is there anyone here who doesn't know what SMB3 is? Okay, so we have one person. That's always good to know. So SMB3 is the third version of a protocol originally developed by IBM and taken on by Microsoft. It is the protocol that Windows uses to talk to Windows to access files and directories. So if you bring up your Windows Explorer and attach to another server, you can see the files and directories. SMB is the mechanism that allows that to happen over a network. And SMB3 is the third version of that protocol that Microsoft has developed. And it's actually quite nice, I think.

So why extend SMB3? Why add Unix extensions to it? Well, SMB3 is a Windows-to-Windows protocol.
So it doesn't have much accommodation for the specific things that Unix machines need. Originally, back in the SMB1 days, various vendors wanted to add features to SMB. Now, SMB doesn't have a standards organization. It only has Microsoft, and what Microsoft does pretty much becomes the standard. So it has evolved only to fit the needs of Windows to Windows. So other vendors who needed to interoperate with Windows, like HP and SCO (before they went insane and started suing IBM over Linux, et cetera), thought: well, we already have an SMB redirector in our kernel, which is what allows us to mount Windows systems. It wouldn't be so hard to extend this to talk to the SMB servers that we also have on our Unix systems. And Samba was the most ubiquitous server out there and was the easiest to extend, because you could just type the code in.

So SCO, HP, and a few other vendors started proposing additions to SMB1 to make it work better Unix to Unix. As you may be aware, Windows to Windows has case-insensitive path names, deleting files works differently, renaming files works differently, and file locking is completely different. The semantics don't match. So by extending SMB1, they could make the semantics match Unix to Unix much better, so that Unix applications would get the semantics they expected when they mounted a remote file system.

And excuse me, is the Ganesha maintainer in the room? He was earlier. Okay, if he's not here, I can get away with more of my NFS jokes. So SMB1 Unix was actually better at Unix-to-Unix, POSIX-to-POSIX semantics than NFS. NFS has a boatload of hideous things in it, like silly renames, that basically hide the fact that it's a stateless protocol (at least NFSv3) and doesn't work very well POSIX to POSIX. SMB1 with Unix extensions had better fidelity for Unix to Unix than NFS does. So plus, you know, NFS sucks and it should just die. So I like this slide.
I made it because you can actually choose who you are depending on which protocol you like. You could say that NFS is Kirk screaming "Khan!" and SMB is saying, "From hell's heart, I spit at thee." Or it could be the other way around. And the other thing I really love is, meanwhile in the cloud: "We will add your biological and technological distinctiveness to our own. Your culture will adapt to service us. Resistance is futile." No better summation of the cloud storage vendors could I possibly imagine. So yes, Khan and Kirk are having fun there while the Borg are out there waiting for us all.

What happened with the SMB1 Unix extensions? Well, they became a hideous train wreck. Why was this? Partly my fault. Partly the fault of a now-Microsoft employee, because I wouldn't say no to him. The issue was there was no real oversight on what got added. It was a case of: well, if it works, it's good. It doesn't have to be good protocol design, it just has to work, which, you know, you can see the point in that. It was an enabler for some security problems, and I'm going to go into those in a lot more detail later on. But allowing server-side symlinks, in other words symlinks that are created on the server-side file system by the client and that the server then follows, is a recipe for security disaster. And I'll explain why in a little while.

So we just added anything we thought of. Oh, we want ACLs? Right, we'll stick POSIX ACLs in there. We want encryption? Okay, let's put transport-level encryption in there. We want case-sensitive path name semantics? Okay, that's a given, we'll put that in there. We want new info levels for POSIX semantics, even if there was a Windows info level that was kind of similar? Yeah, let's just throw a boatload more info levels in. So, like I say, it was just ad hoc, thrown together.

Having said that, it had some very useful effects. I've been chatting to the Microsoft engineers who implemented SMB3 transport encryption.
They told me that they originally had terrible pushback from their internal security teams, who said: no, no, don't do this at your file system layer. We've got IPsec to handle all security in the network layer. And as everyone knows, IPsec is widely adopted and easy to set up. That was a joke, for people who don't have to support IPsec. It's terrible and doesn't work very well. So they were able to add transport-level encryption directly into SMB3 just by saying: well, look at those open source guys. They've got transport-level encryption. We need it for feature parity. So that was actually quite useful. And the SMB3 transport-level encryption is actually very well implemented and secure. So I'm happy about that.

So: monorail, monorail, who needs a monorail? Sorry, Simpsons. A clean slate. We can design something glowing, sliding into the future like the Springfield monorail. But one thing that we came away with, and one of the reasons this has taken a great deal of time (also because I'm kind of lazy and don't do things unless people are shouting at me), is that I finally came down to a philosophy of utter minimalism: rather than the train wreck that was the SMB1 Unix Extensions, the SMB3 Unix Extensions would only add things that we could not do any other way. So don't reinvent anything that SMB3 already has. Just tweak and add in very small ways. Now, because the SMB3 protocol is actually quite nicely designed, that made this not too hard.

So what does this mean? Cut down on the extra stuff you need to add. Don't duplicate stuff. If you have an info level in Windows that returns create timestamps or whatever, don't add another POSIX level that does the same thing. You've already got that data. Get it the Windows way. Allow semantics that aren't 100% POSIX but are close enough. And I'll explain a little more what that means in a later slide. Reuse the existing SMB3 features that Windows uses. So what does that mean?
It means that we don't have to invent our own mechanism for encryption. Windows already has ACLs, and love them or hate them (and I certainly do hate them), they are the most widely deployed ACL mechanism on the planet. Everybody understands... okay, nobody understands Windows ACLs, but everybody uses Windows ACLs. That's probably the best way to put it. So they're very common, and we should adopt the same. Microsoft already added some features to SMB3, not widely used, info levels and such, that make their NFS server work against the Windows backend file system. So when we have mechanisms that Microsoft already invented, just reuse them. Don't try and invent our own. Don't try and say ours are better. Just say: hey, this is the Windows way of setting things. Let's use that.

So what did Apple do? Let's contrast the Apple approach. Apple standardized on SMB2 and above for all of their file sharing needs. They replaced their proprietary AFP file sharing protocol. But they did it in a way similar to the SMB1 Unix extensions. In other words, I think they made a real mess of it. What they did was say: oh, we're just going to keep a regular SMB2 connection, and then at the point that you actually attach to the network share, the first operation against an Apple server is a special magic open. When you send that special magic open with a create context of AAPL (I'll get into what a create context is in a minute), the server knows "I'm talking to a Mac client" and therefore behaves differently. Now, that's basically the same kind of magic-action horror: send an info level that has no meaning other than to change the behavior of the server long term. And I'm trying to avoid that in the SMB3 Unix extensions. So, you know, this magic action means the server must now respond differently to various different operations.
And the effect of this falls on poor Ralph from SerNet, who was the original maintainer of the AFP server on Linux and who now maintains the Apple compatibility in Samba. Samba used to claim we were bug-for-bug compatible with Windows. Now we have to claim we're bug-for-bug compatible with Windows and Apple. So he has to duplicate the same bugs that the Mac server has in order to make the Mac clients run. So this is a pain.

So what do we end up with? How should it look? The nice thing about SMB3 Negotiate is that you have this concept of negotiate contexts, which are basically arbitrary little pieces of data you can attach to the first packet. So we added a new one. I originally thought we could get away without adding this context. So you send an empty context saying: hey, I'm a client that is capable of talking POSIX, SMB3 Unix extensions. And if the server then replies saying, yes, I can also speak SMB3 Unix extensions, then the client knows it's talking to a Unix-capable server.

So why is this there? Well, firstly, if the server doesn't respond with that, the client knows not to try any other POSIX, or Unix, extensions. The protocol spec says that negotiate contexts are non-mandatory. You can send as many of these contexts as you like, and the server only responds to the ones it understands, which is nice. And here's the kicker. Here's the reason why it's there. Theoretically, you don't need it, because every single operation in SMB3 has to have a handle, a handle is opened with a create that can carry contexts, and those contexts could specify SMB3 Unix. So I originally thought we could get away with simply doing contexts on the creates rather than on the initial negotiate.
And then one of the Microsoft engineers, in discussion on this, pointed out that we really need a negotiate context. If you start sending these Unix create contexts to a Windows server, or to a server that you think might support them, and it responds to the rest of the create but not the Unix part, you don't know whether it doesn't understand what you're sending and is ignoring it, or whether it understands what you're sending but is choosing to reject it on this particular operation. And so that's what the negotiate context is for. If the server responds to the initial negotiate context, you know that the server is at least capable of doing the Unix extensions, and then when you send them in further operations, if it refuses to do them, you know that the server is making a policy decision. You can tell the difference between "I'm ignoring you" and "I don't understand you", which can be kind of important in protocols.

And the other thing it allows us to do, if we mess this up and need to rev the extension version number (which is possible, probable in fact), is that we have a way for the client to tell the server: oh, I want to speak Unix extensions version two, not version one or version three, et cetera.

So the secret to making this work is essentially to leave alone all of the protocol that Windows to Windows speaks. The first thing an SMB3 operation does is say: here's a path name. I want to turn this into a handle. I will then do operations on this handle. Turning a path name into a handle is done by a create. If you're familiar with the Win32 API, that's the CreateFile API. It's turned into a create operation on the wire, and to modify the create, where you can say things like "I want to open an older snapshot", "I want to do this", "I want to do that", you add these contexts called create contexts.
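A minimal sketch of the negotiate-context reasoning described above, written as a client-side decision. The names `UnixCreateOutcome` and `interpret_create_reply` are hypothetical and do not come from Samba or the Linux client; the two booleans stand in for "the server answered the Unix negotiate context" and "the server answered the Unix create context on this particular create".

```python
from enum import Enum


class UnixCreateOutcome(Enum):
    """What a client can conclude about one create (hypothetical names)."""
    GRANTED = "server applied Unix semantics to this handle"
    REFUSED_BY_POLICY = "server speaks the extensions but declined this time"
    NOT_UNDERSTOOD = "server never advertised the extensions"


def interpret_create_reply(negotiate_acked: bool,
                           create_context_echoed: bool) -> UnixCreateOutcome:
    # Without the negotiate-time answer, a missing create-context reply is
    # ambiguous, so the client treats the server as not Unix-capable.
    if not negotiate_acked:
        return UnixCreateOutcome.NOT_UNDERSTOOD
    # The server is known to be capable, so its answer on this create is
    # either a grant or a deliberate policy decision.
    if create_context_echoed:
        return UnixCreateOutcome.GRANTED
    return UnixCreateOutcome.REFUSED_BY_POLICY


if __name__ == "__main__":
    for acked in (False, True):
        for echoed in (False, True):
            print(acked, echoed, interpret_create_reply(acked, echoed).name)
```

The point of the sketch is only that the negotiate-time answer is what turns an otherwise ambiguous silence into a clear "refused by policy".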
And so the way the Unix extensions are structured is that adding the Unix create context, saying "I want to do a Unix create", turns on Unix behavior for renames, deletes, locking, read, write, the rest of it, for that handle only. So you can open a boatload of Windows handles, then you can open a Unix handle. Now, at least for Samba, this was easy because we already have to handle the case where we're serving SMB1 Unix extensions clients and Windows clients simultaneously. So we already have the code that can differentiate between the two. So it was kind of a cheat, in that it was already easy for us.

So what does a create context look like? I was thinking about showing a picture of it, but really, it's pointless. It's just a GUID. It's just a 128-bit random number. I generated it right on this laptop and then stuck it in the code. If the server responds with a reply saying, yes, I understood this, then you know that you've got Unix semantics.

So the first thing you run into is case sensitivity. Windows to Windows, it's case insensitive. Unix to Unix, it needs to be case sensitive. Adding in that Unix create context tells the server: I want you to parse this path name case-sensitively, as a proper Unix-to-Unix connection would. Now, it is only a request for the server to do so. It is not mandated in the spec. Why is that? You might say, well, Unix to Unix has to be case sensitive, but that's not true. If you look at Linux, imagine that you are mounting an NTFS file system on a USB stick or a USB drive or whatever. Unix already has to cope with case-insensitive path name lookups. So asking for a Unix create context on SMB3 and getting something back doesn't 100% mean that you've got case-sensitive semantics. It just means you asked the server for that and it made its best effort. If you really want to know: did I get case sensitivity?
Can this share, can this file system give me case-sensitive semantics? There's a separate info level with a bit in it that says whether the share can do case sensitivity or not. So it's a best-effort request.

The rest of the changes that you need are remarkably small. You do need one info level. I tried to get away without it, went back and forth, but there really are a few extra pieces of information that you need to return for the Unix stat system call, and to get the same thing otherwise you would have to do multiple round trips to Windows info levels. And even then you'd have to fetch ACLs for every single file, for instance, to get the mode bits. So it's a lot easier to have one very finely modified info level that returns all the information a POSIX stat needs.

For setting the mode, for doing chmod, the easiest way, given that you have to support ACLs anyway, is to do a set access control list call with a special set of ACL masks that say user, group, owner. Microsoft already designed this. They already did it for their NFS server. So just reuse it, which we did.

POSIX ACLs: anyone here use POSIX ACLs on their Unix server? Tough. They're gone now. Windows ACLs only. Your POSIX ACLs will have to be mapped into Windows ACLs on a best-effort basis. ZFS is becoming very, very common on FreeBSD and Linux systems (yes, I know about the licensing; people still use it), and that supports NFSv4 ACLs. I think XFS does. The world is moving to Windows-like ACLs. So for SMB3 Unix to Unix, you can take the incoming Windows ACLs on the wire and turn them into NFSv4 ACLs. That code is already in the Linux kernel to support NFSv4 servers. So let's just reuse that. I'm tired of fighting Christoph Hellwig over the Windows ACLs piece. Let's just put the NFS ACLs in.

The other thing you don't get: no UIDs and GIDs. Everything comes back as Windows SIDs. Why is this?
This actually was a big argument. We went around for many months over this. And the reason for it ultimately is that any SMB client already has to cope with mapping Windows SIDs, security IDs, into Unix numbers, UIDs and GIDs. The client already has to do this work, so I'm not going to make the server do it. What you would end up with, and we did end up with this in the SMB1 Unix extensions, is being able to make POSIX calls to get UIDs and GIDs back, and then Windows calls to get SIDs back, and you would be exposing the mapping that the server was using between SIDs and UIDs. That mapping may be different from the one that the client is using. And therein lies the source of a million bugs and possibly security issues too. If you have one Windows SID that's mapped to one UID on the client and a different UID on the server, that way lies madness and security errors. So, eliminate the problem: Windows SIDs on the wire, nothing else. Changing ownership is simple. You just do a set-ACL, nothing different there, exactly as Windows would do.

So, the file name handling is the largest change. How many people are familiar with Windows stream names? Yeah, okay, a few of you. Windows has this hideous concept, and I will explain in one slide why it is the worst idea in the world, but that will be the last slide of the presentation, because it's nothing to do with the rest of it really. I just want to show you why it's a terrible idea. So, Windows has this concept of a special file name. You say colon, colon, name. And what that does is: anything before that is the actual file. So, you know, foo, bar, baz is the file, and then colon, colon, stream one, stream two. They become different independently seekable data streams within the file. This is the world's worst idea. So, for Unix to Unix in SMB3, don't expose that. Leave that in Windows hell.
Just, you know, if you want a Windows stream, open a Windows handle without a Unix create context on it.

Windows uses the UCS-2 encoding of Unicode, not UTF-8. We were very tempted to say: oh, okay, if you get a Unix create context, the path name must be in UTF-8, just to make our lives easier. We decided to drop that. It just complicates the path name processing. It means you have to handle path names coming in in two separate encodings. It's just easier to say: no, stick with the Windows encoding. The Windows encoding is good enough to encode any Unicode in UCS-2. So, yeah, just leave that alone.

Hard links are already supported by Windows. We don't need to add that. You get goodies like snapshots, quotas, and timewarp tokens for free. And you get all the other wonderful things that SMB3 gives you that David talked about: clustering, SMB Direct, leases, encryption, everything you want.

So, I've got to speed up a little, I think. What's kind of ugly? What's close enough? Extended attributes. Windows extended attributes predate any of the SMB3 work, so they are ANSI character set, case-insensitive encoded. I didn't add a separate Unix extended attribute call because EAs are, well, to be honest, usually ANSI encoded and case insensitive. It's very rare that you would have a UTF-8 encoded, case-sensitive extended attribute. Maybe we'll run into some apps that need that. Maybe not. For now, I'm going to leave it alone. The other thing is, EAs on Windows and also in Samba are not valid on symbolic links. The only issue is that SELinux uses this. But given that the first thing people do is usually turn off SELinux, I'm not terribly worried about that. Maybe we'll have to revisit that.

For weird things like FIFOs and block and character device files, Windows already has a way of encoding those. They needed it for the NFS server. Just reuse it. So I'm doing that.
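To make the path-name encoding decision above concrete, here is a small sketch of a single encode/decode path shared by Windows-style and Unix-style handles. It assumes Python's `utf-16-le` codec as a stand-in for what the talk calls the Windows (UCS-2) encoding; the helper names are hypothetical and are not taken from any SMB library.

```python
def encode_wire_name(name: str) -> bytes:
    """Encode a path name the same way for every handle type (assumed utf-16-le)."""
    return name.encode("utf-16-le")


def decode_wire_name(raw: bytes) -> str:
    """Decode a path name received from the wire (assumed utf-16-le)."""
    return raw.decode("utf-16-le")


if __name__ == "__main__":
    name = "Grüße-日本語.txt"
    wire = encode_wire_name(name)
    # The name survives a round trip, so no second UTF-8 code path is needed
    # just because a handle asked for Unix semantics.
    print(decode_wire_name(wire) == name)
    # Plain ASCII costs two bytes per character in this encoding.
    print(len(encode_wire_name("a")))
```

The design choice being illustrated is the one from the talk: one encoding for all path names, rather than one encoding per kind of handle.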
And so the hardest thing I had to do was basically to say no to the feature creep that people kept wanting to put in this thing. Some people really wanted to add some crap into this protocol.

So why is following symlinks on the server a bad idea? There you've got an isolated root, /safe/export. And then the client sends a relative path name. While you're in the middle of processing that, if another client can come in, after you've processed past a component, and replace it with a symlink pointing outside the safe exported zone, you have a race condition and it's a security hole. This was actually the security hole that I talked about in the presentation yesterday, which Jann Horn, the Google developer, found. But the reason that's a problem is that the server transparently follows symlinks. If that was not a symlink and the client could never create a symlink, that problem just goes away.

So I hate symlinks. They've been a security disaster in Samba. Symlinks are a kind of useful holdover from when you had to add a new disk and you wanted to move some stuff over and create a symlink from where it used to be, et cetera. Great flexibility for scripts. Security disaster. As soon as SMB1 could write those server-side symlinks (and it was a terrible decision, and it was my fault, to store them as real symlinks on the server file system), it became a horrible mess. It's convenient for NFS and SMB1 servers, and I absolutely want to forbid this in the SMB3 Unix Extensions.

So how are we going to do symlinks? Well, Windows already has a concept of reparse points. Unfortunately, as with many things in Windows, they're an over-designed mess. You can have reparse points on directories, on files. You can turn a file into a reparse point. You can turn it back. There's all sorts of weird crap that you can do with it that I just don't want to deal with.
So for SMB3 Unix symlinks, what I've decided to say is: you can create a zero-length file and turn it into a reparse point and store the symlink there in a special extended attribute, but nothing else. There was a minor modification needed to the protocol. I've still got to run it past Microsoft to make sure that we can return the symlink error on traversal. And storing symlinks this way breaks the local file system: it means that local apps will not follow symlinks created by SMB3 clients. I'm okay with that. I'm okay with that because it's secure.

So, one of the easy parts: we've already got working all the code for locking, reads, writes, rename, unlink. We did that for the SMB1 Unix extensions. So with some tweaking, the code is remarkably small, a few thousand lines of code changes. You can map the SMB3 Unix create context onto an existing request for Unix semantics in the SMB1 stack. And then when they get down below into the NTFS layer inside Samba, they act the same way. I need to write down this behavior to make it explicit, and write tests to make sure that everyone understands what the protocol should do and be.

So what works now? There's a wiki link, a wiki.samba.org link, that explains the details of the wire protocol changes. There is a Wireshark extension (I think Aurélien did it; I think he's just left though) that will decode these extensions inside Wireshark. Here is my test branch, which I keep rebasing as master changes. This will be outside of master for a while yet. I'm hoping in a few months I can get more of it merged in. And then in the Linux kernel, Steve French is rapidly implementing this on the client side. So the server-side code is mostly complete in terms of functionality and features.
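The symlink storage scheme described above (a zero-length file carrying the target as stored data, which the server reports rather than follows) can be modelled with a toy in-memory share. Everything here is hypothetical illustration: `ToyShare`, `SymlinkEncountered`, and the dictionary standing in for the special extended attribute are invented for this sketch and are not Samba code or the wire format.

```python
class SymlinkEncountered(Exception):
    """Raised instead of following a stored symlink (hypothetical error type)."""

    def __init__(self, path: str, target: str):
        super().__init__(f"{path} is a stored symlink to {target}")
        self.path = path
        self.target = target


class ToyShare:
    """In-memory stand-in for an exported tree; paths are relative to the root."""

    def __init__(self):
        self.files = {}         # path -> file contents
        self.link_targets = {}  # path -> target (models the special attribute)

    def write(self, path: str, data: bytes) -> None:
        self.files[path] = data

    def symlink(self, path: str, target: str) -> None:
        # The link is only a zero-length file plus stored data, never a real
        # symlink that the server's own path walk could be tricked by.
        self.files[path] = b""
        self.link_targets[path] = target

    def open_for_read(self, path: str) -> bytes:
        walked = ""
        for part in path.split("/"):
            walked = part if not walked else f"{walked}/{part}"
            if walked in self.link_targets:
                # Report the link to the client; the server never follows it.
                raise SymlinkEncountered(walked, self.link_targets[walked])
        if path in self.files:
            return self.files[path]
        raise FileNotFoundError(path)


if __name__ == "__main__":
    share = ToyShare()
    share.write("docs/readme.txt", b"hello")
    share.symlink("escape", "/etc")
    print(share.open_for_read("docs/readme.txt"))
    try:
        share.open_for_read("escape/passwd")
    except SymlinkEncountered as err:
        print("client must resolve:", err.target)
```

In this model it is the client, not the server, that decides what to do with the target, which is why a link pointing outside the export cannot pull the server's path walk out of the share.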
So the real work, you know, the 90% of the 10% left to be done now, is basically that it needs to be split up into micro-commits so that it's an easy-to-follow upstream merge. Then I need lots of tests. I need to finish the client piece, which I haven't done yet, and I absolutely need to add lots and lots and lots and lots of tests. So this is going to take a while. The Linux client code already works. It's still evolving to match the server. This is the big missing piece: the libsmb client code needs enhancing to negotiate and implement the SMB3 Unix extensions. The APIs are mostly there from SMB1. And to be honest, I could do with some help getting this finished. You know, I will do it. It will just take longer if it's just me.

So how do we know we've won? Well, SMB3 Unix to Unix replaces NFS. Yay! So Steve French, who's the Microsoft Samba team member, has some spectacular numbers using the Linux client code against the Samba server. He's really happy about that. We were very careful and collaborated with Microsoft. They are making no promises whatsoever, but we would really like other server vendors to implement these extensions, and we don't want to make it too hard for them to do so. And if you think about why Microsoft might be interested: right now they have greater than 50% Linux hosts in the Azure cloud. They would really like those Linux hosts to be able to mount Azure file storage over SMB3. And the closer to POSIX and Unix they can make that, the more traction they think they're going to get in the Azure cloud, which is fair enough. There's another third-party SMB3 vendor who's also interested and has started to implement. So I'm hoping that this will eventually take off. And eventually, once we've got it all written up, we'll publish it as an addendum to the SMB3 spec.
Because of legal issues, Microsoft can probably never publish it as an official Microsoft doc, but they can link to it and say, you know, we've carved out the info level space for these guys, and basically make it as official as they can. That's what I'm hoping.

Oh, okay, so here's my last slide and then we can go to questions. So here's me going on about why alternate data streams are a terrible idea. You probably can't see that terribly well, if there are any Linux kernel file system people in the audience. But this was actually a slide from Ted Ts'o, who is a Linux kernel file system person. And it's his personal reason why alternate data streams, this ability to have multiple different streams inside a file, are a terrible idea. What it shows is the Windows Task Manager, because Windows has alternate data streams, running a list of processes. If anyone's familiar with Windows processes, they're the standard ones, you know, spoolss, svchost, Explorer, you name it. Then at the bottom, there's a process running called myfile.txt. How does that happen? Well, what's actually running is a virus that was embedded inside myfile.txt in an alternate data stream with the name of, you know, myhappyvirus, bitcoinminer.exe or something. And guess what? Windows will execute those directly. So alternate data streams only exist to gladden the hearts of CIA exfiltrators, who genuinely use these to steal data from people, which we've seen from the tools of theirs that got hacked and dumped, and virus writers. They're the only two groups of people that I can see who have any use for alternate data streams. I've had this to the Linux kernel, and I don't want to expose the ability to access them over SMB3 Unix extensions. Windows is bad enough.

So, does anyone have any questions or comments? You were very polite and quiet throughout the entire thing. Yes. Oh. Uh-huh. So, the question is: with NFS, you can have a single mount with multiple users. Can you do that with SMB?
Can you do that? Since SMB1. Ha-ha-ha-ha. Someone's got to go around and cuff him. Yes. On every single connection, you basically do a new session setup, you create a new context ID, and there you go. Multiple users, same connection. Yes. Any others?

Yes. So, the question is: Microsoft really is pushing multi-channel on SMB3. Do you need to make any Unix extension changes? No. We get it for free. Ha-ha-ha. It's just a speed-up feature. We already have multi-channel in Samba. I think Steve may have implemented multi-channel in the Linux client, I'm not sure. But that's one of the reasons why we can get wonderful performance.

Yes. Question. So, is there any chance of getting something like Linux inotify over SMB? Well, good news for you. It turns out that Linux inotify was based on the existing Windows notify, which has been in since, you guessed it, SMB1. So, yes, cuff your friend for me, will you? So that's another feature that has been in the protocol since time immemorial, and we will continue to enjoy the benefits of it just by existing in the same space.

Yes. Any other questions? Oh, cool. Well, I guess everyone wants to go and have some beer. Well, thank you very much.

Be sure to provide some feedback on Jeremy's talk [inaudible]. Yes, please. Thank you. Very well done. I'll just turn it on because we don't have a PA anyway. I'll just put it in the pocket. Okay. Yeah, I'll give you time hints. Okay.