Yeah, okay. Welcome, everyone. So, as you can see, we've just started back up. This talk is about puffs and refuse, user-space file system implementations, and that's what we're here to talk about. So yeah, I'm actually happy about this. You know, there's some play. Some new form of logic you can read out that these houses are different. And they're way too complete to talk together. So I'm going to tell you a little bit about the basics of what a user-land file system is and what you need, and I'm going to show you how it's done, and whatever impression that gives you, I apologize for that already. But anyway, yeah, this is the only hope. So it's probably twice as good as anything else. So that's the outline of my part. Actually, that's not the outline of my part. My part is mainly concepts and puffs and a little bit of FUSE, and Alistair will tell you about refuse and anything else you like. I was supposed to have my secretary here write his own agenda there, but he did something fancy with rainbow colors.
Okay, so let's start from user-land file systems. I won't assume that you're all familiar with user-land file systems, because when I started this work, well, first of all, in 2003, I was chatting around with an infamous and I thought, maybe a chance to do that and a beautiful, beautiful way to do something. A few years passed and, thanks to a lot of support, I finally got a chance to start working on it. That was in 2005. And then my application, actually, I think people knew me in advance in the NetBSD project, but my application was open four months, I think. During the summer, I accidentally heard someone say that there's this thing for Linux called FUSE, but I still had no idea about it.
Then I got my initial version done, and people started telling me that this is nice and handy and fancy and cool, and that there are 10 billion file systems written for this Linux interface called FUSE. Then Alistair came along and started working on that, and he'll be telling you a bit more shortly.
But now I just want to define some important concepts in the context of this talk. So here we have an architectural diagram. When we talk about FUSE, we mean the interface at this layer, between the file system and the library which implements the FUSE interface. That's what a FUSE file system is written against. Then we have refuse, which is an implementation of that interface. And finally we have puffs, the framework which enables it all, which is a bunch of interfaces and a bunch of implementations.
So I'll say a few words about puffs. First of all, what does the name stand for? It's obviously Pass-to-Userspace Framework File System. And why is it called that? Well, there was this funny thing when I was doing it in the summer of 2005. I was sitting in a chair reading a cookbook, and there on one page, I think it was page 304, although that's not very important, there was this text which said "puff". And that kind of got me going. If you add an S to "puff", you have FS at the end, which is obviously a file system. So then I just had to come up with something suitable for the letters to stand for, which took 5 or 6 months, but I finally came up with one. The interesting thing about this name is that when I gave a talk at AsiaBSDCon about this, I was collecting what other user-space file systems are out there. That's the way it goes: we first do the work and then we kind of fit prior work into it somehow. And then I came upon something which someone had written in 2003, something called Puffs, which was the practical user and fake Puffs system.
Okay, now that's certainly an interesting name for this new puffs. Well, I didn't think it would be worth changing the name. So how is it different from that Puffs? Well, I'll get to that in a bit.
But first the quick part, which you'll see on the next slide with the architectural overview: how does this work? Well, we have a VFS module within the kernel, which interfaces with the kernel's VFS layer. Applications don't realize that they're using a user-space file system; they're just using the puffs module. So we interface with the kernel VFS framework, and then whatever the kernel asks for, read, write, open, create, rename, all the fun stuff, we somehow marshal into a format which we can transport to user space, and it's moved through a device. Then there's this library in user space called libpuffs, which handles most everything of what you need to do. And that, specifically, we'll avoid in this talk. On top of that, we have this refuse library, which is the FUSE implementation on NetBSD. It interfaces with whatever puffs gives us and transforms it into something that any FUSE file system you can find out there, download and build, can understand, and so forth. So, yeah, I guess that's about it.
So this is just to tell something about puffs. Some of these features may be supported by FUSE, but I think they're kind of unique to puffs. First of all, we have real file handle support, which means that the kernel VFS module actually goes and asks the user-land implementation: okay, here we have this node, what's this node's file handle? So we can actually do proper NFS. Of course, it depends on the file system backend whether it actually has proper file handles. If it doesn't have them, it can fake them or decide not to support them or anything. But if you have that ability, you can have proper NFS for puffs file systems.
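To make the layering above a little more concrete, here is a minimal sketch in Python (not the C interfaces of libpuffs or librefuse, and not code from the talk). It assumes a toy FUSE-style file system with path-based callbacks that return a negative errno on failure, and a toy refuse-like adapter that receives node-based requests, remembers a path for each node, and flips error codes to positive values, two details that come up later in the talk. Every class, function and field name here is hypothetical.

```python
import errno

class ToyFuseFs:
    """Hypothetical FUSE-style file system: path-based callbacks that
    return 0 (or a byte count) on success and a NEGATIVE errno on failure."""
    def __init__(self):
        self.files = {"/hello": b"hello, world\n"}

    def getattr(self, path):
        if path == "/" or path in self.files:
            return 0
        return -errno.ENOENT

    def read(self, path, size, offset, out):
        if path not in self.files:
            return -errno.ENOENT
        data = self.files[path][offset:offset + size]
        out.extend(data)
        return len(data)

class ToyRefuseAdapter:
    """Hypothetical translation layer: node-based requests come in,
    path-based FUSE-style callbacks are called, and results go back
    as (positive errno, value) pairs."""
    def __init__(self, fs):
        self.fs = fs
        self.node_paths = {1: "/"}  # node id -> path; node 1 is the root
        self.next_node = 2

    def lookup(self, parent_node, name):
        path = self.node_paths[parent_node].rstrip("/") + "/" + name
        rv = self.fs.getattr(path)
        if rv < 0:
            return (-rv, None)      # flip the sign of the error code
        node = self.next_node
        self.next_node += 1
        self.node_paths[node] = path
        return (0, node)

    def read(self, node, size, offset):
        out = bytearray()
        rv = self.fs.read(self.node_paths[node], size, offset, out)
        if rv < 0:
            return (-rv, b"")
        return (0, bytes(out))

def dispatch(adapter, request):
    """Stand-in for the user-space library loop: take one marshalled
    request (here just a dict) and call the matching handler."""
    handler = getattr(adapter, request["op"])
    return handler(*request["args"])
```

A request such as `{"op": "lookup", "args": (1, "hello")}` stands in for whatever the kernel module would have written to the device; the real transport and message format are not shown.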
Then, I once had this weird idea that I don't like threads. Well, that's actually a weird idea, but I just don't happen to like threads. I don't like how they schedule behind my back like they're plotting against me or something like that. So I figured, well, why not do something where we can schedule things explicitly? And for file systems that's important, especially in the user-land case. Take network file systems, for example. You fire a query off somewhere. You don't want to be waiting there until the response arrives. You want to be doing something useful in that time, like getting more requests from the kernel, that kind of stuff. But if every file system has to implement this kind of thing, or explicitly save its state every time it wants to send something, that's not the way you want to program: you need to effectively save stack state, save registers and so forth. So what I did was, well, if you were listening to the pipeline talk earlier, we talked about coroutines, and these are effectively coroutines. Every function gets passed what I call a continuation cookie. You can yield on that cookie and then continue from it later, so you can explicitly schedule yourself wherever you want. The implementation is something you don't want to read if you want to remain sane, but I think the user interface is easy enough. So you can use that in your own file systems if you want.
Then, built on that concept, I have a generic buffering and event framework for network file systems. We'll soon see a couple of example file systems which really use that. It basically abstracts all the memory management and buffer handling, you know, when you send something, when you've received something, how you handle it, for network file systems and others also.
I should mention at this point that, besides being twice as good as anyone else, we actually wrote a paper.
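The yield-and-continue idea described above can be sketched with Python generators. This is only an illustration of explicit scheduling under assumed names, not the puffs continuation interface: the generator object plays the role of the continuation cookie, `yield` hands control back to the event loop while a reply is outstanding, and `send()` resumes the handler where it left off.

```python
def fetch_attrs(name, log):
    """Hypothetical request handler for a network file system."""
    log.append(f"{name}: send query")
    reply = yield ("wait", name)   # give control back until the reply arrives
    log.append(f"{name}: got {reply}")
    return reply

def event_loop(requests, replies):
    """Start every handler, then resume each one as its reply comes in.
    Replies may arrive in any order; no threads are involved."""
    log, waiting, results = [], {}, {}
    for name in requests:
        handler = fetch_attrs(name, log)
        next(handler)              # run the handler up to its first yield
        waiting[name] = handler
    for name, reply in replies:
        handler = waiting.pop(name)
        try:
            handler.send(reply)    # continue the handler from its yield point
        except StopIteration as done:
            results[name] = done.value
    return log, results
```

Calling `event_loop(["a", "b"], [("b", "B-attrs"), ("a", "A-attrs")])` sends both queries before either reply is handled, which is the point made in the talk: the file system keeps doing useful work instead of blocking on the first response.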
So all the details that I would rather not go into now are available in the paper. You can get it in the conference proceedings or you can get it from the URL which is in the paper. All right. That also contains the references to prior work that I've done and, for example, paper dish planning.
Then you have suspension support. The kernel file system interface supports suspending file systems so you can take a snapshot. Well, that might be interesting by itself, but it opens up plenty of other opportunities. For example, let's say you want to migrate your user-space file system to another machine or something like that. You flush everything from the kernel, you get suspended, then you do whatever you need to do for your file system and restart again.
Then there's integration with the kernel caches, well, mainly the page cache, actually. You can flush pages, you can invalidate pages, you can do whatever to the pages; well, yeah, that's possible. So everyone probably understands that. I'm seeing from that, for users' basis, some kind of race conditions which you need to have or what anyway that supports, so you don't need to wait for the kernel to eventually do that. What you can also get is asynchronous notifications when the kernel reads or writes a page which is in the page cache. So if you want to do caching in your user-land file system, you can get notifications that the kernel has modified a page, and you can handle the caching accordingly in user space.
And then this is what I've recently done, which is actually, well, not useful if you're a user, but I think it's very cool if you're a developer. You can take kernel file system code, compile it and mount it so that the kernel file system is running in user space.
So, for example, well, it supports most file systems now, so if you're developing, for example if you're hacking on a file system, you just run the program in user space, and if you happen to mis-hack it a bit, then all you get is a crash instead of a kernel panic. And you can do all kinds of other neat tricks: say you want to test some error paths in your kernel file system code, you do some fault injection there, and it's completely isolated from the rest of the system. And tmpfs is mentioned specially just because it took some special work to do.
A few example file systems we have in base. We don't have many file systems available for this interface; most of the fun file systems are available for the FUSE interface, because there are 10 billion Linux guys working on those, so there are more possibilities available there. But one we do have is psshfs. If you haven't used sshfs, I encourage you to try it. When I wrote the initial version of psshfs, I was like, wow, this is really useful stuff. It's modeled after the FUSE sshfs file system, but it uses, for example, continuations and the generic buffering I mentioned. Then we have support for 9P. Well, that was mainly a motivation to do the generic buffering, because I saw that these two file systems shared a lot. And then there's also an implementation of the portal file system, which is the 4.4BSD file system where you have portals in user space which open a file descriptor and then pass it to the calling process. So you can have, for example, TCP sockets in your file system, things like that. This just adapts a lot of the pre-existing mount_portal code to the puffs interface. But what's surprising about this guy is that it's very similar to these guys.
So it doesn't do file descriptor passing, it is a real file system, but it really shares a lot of the characteristics of the distributed file systems. Then some sysctl stuff. And the last thing that's interesting there is a file system layer which makes everything case-insensitive. This is something that Alistair initially wrote a version of for FUSE, and I think the motivation was something like testing pkgsrc on case-insensitive file systems. Something like that. But then I thought, well, there's some neat stuff in the puffs interface which could be used to make this a lot simpler and so forth. So I wrote it, and there's a comparison of the two in the paper. It's probably interesting reading.
And then just a few words about the interface before I hand over to Alistair. It's very, very VFS-like. The point is that it's really kind of hard to decide whether you want an abstract user-land interface, which means you need to translate everything that arrives, or on the other hand a really kernel-like interface, where everything is as it is in the kernel. And the puffs interface happens to be very kernel-like. So I was initially worried about that, but then Alistair came along with refuse, and I started thinking that, because I'm more interested in the research aspect, maybe I'll just keep investigating, get a few years more of experience, and then see where that leads. And I don't need to worry about interface stability: if people want to write user-land file systems against a stable interface, they can always use refuse.
What's your biggest problem? We've been implementing Parla on top of it.
Sorry?
Yeah, the question is how much Parla fits into this, and the answer is we've been implementing Parla on top of it.
On top of it? Does it provide infrastructure that puffs fits into?
Yeah, it kind of requires another, different user-land process. But, for example, it doesn't provide a library interface. As I understand it, it provides just the raw device, which is quite a bit more difficult to integrate with.
Let's move on. It's quite weird for a double act to have the straight guy do the warm-up, but we don't want this. There's a structure to my part of the talk too. We'll be talking about what refuse is and another type of known as refuse; compatibility with FUSE and some of the issues there; the development strategy that we used to make this work; some of the file systems that we have in pkgsrc, and some of the hoops that we have to jump through to make it work both on NetBSD and on all the other operating systems that pkgsrc supports. We've got some performance figures, very, very rough, but they give an interesting feel, and then some analysis and the future work as well.
My motivation for this actually wasn't what Antti said. He talked about his work on puffs, and I was very interested and asked him about the relationship with FUSE. At the time he gave me an answer and I thought, great, yeah, fine, and went away. And then at the land last year Kirk asked me exactly the same question, but a few months later I had completely forgotten the answer, and so I thought I'd better get something implemented straight away or I'm going to sound absolutely stupid. So that's the motivation for refuse.
Moving on. Refuse to live with pain. Refuse, I don't know how the hell you killed it, but we'll move on. This is what FUSE looks like, and Antti had an interesting slide earlier about what puffs does. Very much the same kind of idea here. We've got the hook into the kernel going down to the VFS, out through the FUSE device, and up into the callbacks at the user level. And this is what refuse looks like. About that. Enough geek, you remember? The compatibility here and some of the sources look compact.
I apologize if you can't read these, but I absolutely had to show these slides, these are great. These are the people that think you're getting married and things like that when not to hydrate the names. And there it says, best laid out.
Then we have a lot of pollinators on top of the source code for compatibility issues. We have a header file called fuse.h which is included by everything. The structures are the same as the ones in FUSE. So we're layering refuse on top of puffs and creating an interface that looks exactly like FUSE. We have a shared object, a library called librefuse.so or whatever, and all the FUSE functionality is built into that. You just link with that, and pkgsrc has the necessary smarts to do that kind of thing. You can have that on other platforms as well; pkgsrc has the smarts for the other platforms too.
FUSE itself has different interface versions, and they're labelled very, very confusingly. They start off at 2.2, I think, and go through now to 2.7, although the one that's mainly used is 2.6. There's absolutely no relation to Linux kernel numbers or anything like that. But you'll often see things like FUSE_USE_VERSION 26: you put a definition in your code beforehand saying which version you want to use, and you get that interface.
Let's move on to the next one. Backwards compatibility is also an issue with this. We can go back to 2.2, as there are some file systems out there that still use the old one. There are some old FUSE callbacks. What one is it? I think it's called. If you can't read that one I'm sure you'll be okay with that one. There's more on that in the paper. Can you see that one all right? That's poor sap. There's more in the paper on compatibility issues. And there are also some language bindings as well. I'm going to go through this. Any better? We need word. All right, the language bindings that we have.
There are obviously the C and C++ language bindings that come straight with refuse. There are also Python language bindings, which are in pkgsrc at the moment, if you haven't seen that. Have you got that on it? Okay, let's move on. There are Perl bindings too, and they're not in pkgsrc at the moment because they won't pass the regression tests. The reason for that is that the regression tests require loopback file systems, and I haven't had time yet to get the Linuxisms out of that. There are also Mono or C# bindings, and also, any Haskell programmers out there? Haskell has one. Those work too. All right, moving swiftly on. And we'll get to this, and my favorite is right at the end at that moment. That was hardly hard if you didn't get it.
Licensing. There was a chance to put up a picture there of Paris Hilton. Her licensing problem was that she went out driving without a license, because she had already been disqualified, and it all fell apart for her. The licenses that we have: FreeBSD has got its own version of FUSE that's written under a BSD license. The OpenSolaris one is written under the CDDL, which is surprising. Some people have problems with the CDDL. It's certainly more restrictive than an ordinary BSD license, but not that much more. MacFUSE is written under the three-clause BSD license, mainly because it's derived from the FreeBSD one, and refuse is of course under a three-clause BSD license. And if you can recognize some of the players there. There we go. That's Richard Stallman. This is the launch of the GPLv3 license. I thought I'd put this up a bit further. I was sitting behind him and said a moment and that's the master exit paint glue there. But unfortunately the licenses have a problem of hooping up behind him and getting him in the back side. So there we are. Also within refuse we keep our...
I noticed the file systems that we build up have file path things attached to them. This is fine for ordinary file systems that have the file paths hard-coded within them except for the very other versions of it. But first obviously moving on safely. For virtual file systems you don't have that kind of thing. You need to have a way of squirrelling away path information so you can get back to the entries with directory operations on them, things like that. So I've got a set of routines in the base source called virtdir. Basically, the ones at the top manipulate the virtual directory, so you can add entries to it, delete them from it, and find them based on various criteria, and there are some routines here which traverse the directories. You can see they're intentionally made to look like the library routines that we have for traversing directories as well.
Right, now on to something else: there are two levels in FUSE. One is the low level and one is the high level. I apologize for this analysis here for you. I went back trying to show the karate and the multiple levels of pipe belts down to the sense they can have and it looks like it's focused in the end of the can-canners I believe in Google there are image sorts of them but there's another one to a level training platform. I think, as one of the older ones here, I can remember the day when there were systems programmers and application programmers, and they tended to work at different levels. Systems programmers worked at the bottom; they were very interested in announcing goals and working time slices more than before to microstate and something like that. Application programmers were much more interested in the overview, the nice friendly interfaces that they were used to using from libraries and things like that.
That's some high-level training board, and that's a high-level system. That's some low-level dumping ground on some continent over to the left of history. This is some low-level work in Iraq, and that's a low-level system, this time all along the theme of refuse, and some low-level vising there.
Okay, I'll go on and explain a bit about the packages that we have in pkgsrc. Hopefully you can see that. I could read them all out and give you an exhaustive rundown of what they all do, but I would encourage you to look at them and see for yourselves whether there's anything you need there, anything that you find interesting, or maybe something you can add to or do better. There's a couple of times that we've heard ntfs-3g mentioned; that's the read/write support for NTFS volumes, very, very useful, and that's also one that we did some performance benchmarking on in the paper, and I'll speak briefly about it later on. Another interesting one up there, which I'll show you later on, allows you to access and manage the tunes on an iPod as a file system; this is actually a typical from the laptop or something like that for our attention. GmailFS is an interesting one in that you can use Gmail to provide file system space for you, so storage for you. They give you something like 2.7 gig at the moment for free, you could only go into that connection, and remember, of course, you can always invite yourself through a different email address, and you have up to 100 of these invitations, so you can get a fair amount of storage on the internet if you go around it that way.
CurlFtpFS is a nice one as well. It basically allows the contents of an FTP space to appear as a local file system, and it's basically the content of Kerala I didn't it seems like games basically have there, so I've written something else, and I'll talk about that later on. Okay, so that's some of the instances of the iPod FS.
Antti was talking earlier on about sysctlfs. That's what it looks like in operation. You can see the sysctl nodes down there, and if you go into the directories you can manipulate those. It's based on puffs, it's not based on refuse. And that's what Antti was talking about for sshfs, another interesting application and, as he says, very useful, and he did quite a lot with it.
There was some talk earlier on about ZFS running on FUSE. They're actually at the stage where they can read and write, although they have problems at the moment. One of the interesting pieces of this email here: if you count the number of threads that zfs-fuse is running, it's 154; in the previous version, 113. So performance isn't going to be good for you. But this is one of the limitations of refuse as well: zfs-fuse uses the low-level interface, which we haven't finished off within NetBSD yet, but it will be coming.
As an aside on ZFS: we had a look at it at work, and we did some comparison benchmarks between a Sun 6900 and a 25K, the 25K using Solaris 10 and ZFS and the 6900 using Veritas, and it was a hundred times slower moving to ZFS. So the data integrity does come at some price. The other problem we had with it was that we found it unstable. For DR purposes we were trying to restore a 25-terabyte database for a data warehouse on this 25K. We got through 23 terabytes of it and then ZFS decided to die. We actually went and asked our Sun support people, and I
apologize if there are any Sun people here, you can come and talk to me afterwards, and our Sun sales representative, who had it in production at the moment, and they couldn't tell us of any customer. So, ZFS: it's got some excellent features, it really does, and I'm not trying to knock it or anything like that, but I'll just say that now might not be the time.
Okay, this is some of the stuff I'm hacking on as well, and I apologize about this, but this is just a local meaner, which is a file system that we have here, and they should be making it into pkgsrc or into NetBSD. We've got some of them there already, such as icfs. My motivation for that is actually web servers: I got fed up typing URLs into Firefox and having it tell me that the file is not found because somewhere in there is an uppercase or a lowercase letter. I wanted something where I could just type and not worry about whether caps lock was on. So that was the reason for icfs, and I would encourage you to have a look at the comparison in the paper. Yeah, there are a few that don't know what I'm doing someone would throw away like TNTPFS. I asked him, do you learn about that, he said, you know, it's not, I'm just going to make an idea of TNTP which would get to be fine and we all know and love. Right, so that's the work in progress.
On to performance now. Enough of the amateurs, let's go to the professionals here. That's me driving this around the race track last year. Great fun, if ever you get a chance to do it. Um, so goodbye, goodbye, goodbye, this isn't very good, this isn't very good. Anyway, so why do we do file systems in userland? Is it because we're going to shout about performance? The answer is we're not. We're really looking at some of the interesting features we can get out of it. As I mentioned earlier, it's a fast vehicle for debugging, for trying things out, and also if you want to migrate stuff. There's a whole lot of
things you can do, and using a file system in userland does bring home just the possibilities that we have. There are some things that people do well and some things that people just, well, they do pretty well.
Right, I'll also talk about some of the differences that we have between puffs and FUSE. FUSE uses a stat buffer for the node. FUSE is very much user-level oriented, so if you have somebody coming along who wants to make a file system out of some user-land code that they have, FUSE is excellent for that. It's written at exactly the right level for that kind of thing. For the more serious ones, you want to look at the low level. FUSE returns errors as negative integers and BSD returns errors as positive integers, so we have a shim layer in there to change all the negative error codes into positive ones, so FUSE can do its work in the right way and puffs can do its work in the right way. puffs gives us access to the caller context; in FUSE you actually have to call something at a high level to get it, so we get all the information from the puffs caller context. Directory reading is done differently by FUSE and by puffs, and the way the buffering is done is a different thing as well.
I did talk about some of the file systems that are out there, and now the development technique. We didn't go all out at it. We thought we'd try one nice, easy file system, and Antti kindly pointed out one that would be a very, very good one to do, and then we built this up a little at a time until we got the whole of refuse working, and we have that right now. I think we still need to do extended attributes, but I'm sure that's coming, given that I've written a few file systems lately that want extended attributes and so forth, and this is going to be itself so, couldn't mind go off the cart again. This kind of stuff, we thought the development technique would be
used. The paper points out that the refuse way of doing it is that we typically have one thread, so less parallelism, doesn't need to be a wrong result with the file systems. Okay, close up with that one, and that's something you get to get wrong. The idea again is, once you've got userland file systems, you can try over and over and over again. It doesn't matter if you have a crash or anything like that; you're not picking up bits of the file system. Typically you've got snapshots of it, and so we're not worried about parameters of the feeling.
Performance figures. I must note these are very, very rough. However, anti-interface with QE and QE, we're running on these snapshots there. The top figures are coming from a NetBSD system and the bottom ones are coming from Linux running on a live CD. This probably looks a bit better. What we see character we have views. Again, these are rough. I don't think it was possible, but what I would like to stress is that the graphs are roughly the same, especially in that one. There's a lot of USB outperforming views, I'll take it at the end if that's what we know, and then this one as well as the 6. So they're roughly in the same ballpark, which I was quite surprised at. I didn't expect such fast performance, actually.
Right, you'd be glad to know that this is the talk finally over, and we're about to go out. I think this is a picture of a service agent who has taken a nasty question. Any questions?
The character of the foot, did you actually verify that the commands didn't do different things, so that the actual file system gets called the same number of times? You know what I mean?
That was just running, like students do, this stuff. Well, the purpose is, it gets buffered in the kernel, so I didn't turn on all the caches, because I didn't really want to measure that. I wanted to measure the user experience. So when it's doing its work, when the writes are going to the file system, they come through the kernel, where they're buffered into large chunks.
Yeah, well,
the question is, is it a different actual account doing different things, or is it a character, or is it just some account over here somewhere that's interfacing it up?
Yeah, it might be, for example, from a system call. Like I said, it's just really rough. And one of the points is just to show that even though we're not emulating the FUSE interface in the kernel, where it would be most efficient, it's still very efficient to emulate it in user space, because you're really in a way with a context which is an all of data mapping you need to do in user space, it's not really that.
Do you want to ask a question? I'll have one of the phones without them.
Can you just give us a very quick comparison between user space and real?
Well, it's a bit slower, but it's about, about 5 seconds.
Very good.
Sorry, how about me talking to the previous, I mean that's the previous thing, guys. I don't see any problems with it, the interaction maybe, but it certainly should be portable. I'd love to see anything.
What's this Google search that you have?
Google published an HTTP interface, so you can actually program against it. What I have there is a file system where, if you want to search on the terms "classical rangers" or something like that, you put them in a file name, a path name, and send that up to Google. One of the things you have to do is pretend to be Lynx, otherwise it starts to use everything, it's a feature on a funny browser, but really it's a very...
Yeah, that's right. But you send the query over, the file system sends the query over to Google, and the results that come back are presented as files and stuff, as long as you've got the HTTP in there. It's difficult to know how best to present this information, whether you want to present it as separate files, because then you have to cat each of the files when you look. With Google search in the browser you get the context in the paragraph, you get cached instances of it, you get the URL there as well; there's all of that. That's one of the problems I have, actually: displaying the information you get back, how best to do that. You could display it as a huge symbolic link. Preferably not.
Right, because there are other things that involve that, but the useless ones don't really care, they just have to take the stuff that comes back from the car and all that.
The reason I say preferably not for that is because you're probably going to run into 256-character lengths on a symbolic link, and you get truncated search information.
Oh shit.