This is just going to jump through a bunch of things in our lives, and it's very JVM-centric; if you're wondering what the JVM is, this might not be the talk for you. But my background is primarily as a Scala programmer on the JVM, so that's what my mind reaches for most readily, though I think we should be adaptable. So one thing we do as Scala programmers is sit around cursing at sbt, or at whatever the build tool is, and it's likely to be sbt. So we need to reimagine everything, right? Some of these ideas seem extreme in some sense, but what's actually extreme is that we haven't done it yet, that we live with these ridiculous, arbitrary distinctions. We have these piles of bytes on disk, and we funnel particular little bits of bytes into different files, and then we make that semantically meaningful. We say: if it's in this file, then it's got to do this, but if it's in this file, it's got to do that. And if you screw it up, we just fail. Java says: that's a public class called Foo, you've got to put it in Foo.java. Geez, computer, can't you handle that for me? And in fact, it can. Because the file system that we show for Java's benefit could be exactly what Java wants. Meanwhile, we can stick every bit of Java source code into one big file, put it in a database, and synthesize files as needed, so that at compile time we can do a just-in-time selection of exactly what we need and simplify our lives a whole bunch later. We can reuse code without any dependencies, because it's all in the database. We can analyze all of our code statically much more simply than when it's spread out across all these files. We can basically break through all these boundaries that are so pervasive we largely don't see them. That's the thing: people go, what? You can do that? Well, yeah. You need to keep scraping the assumptions off your brain, because it takes several passes before they're all gone. And at some point this stops looking like a crazy idea and just looks inevitable, about 20 years behind where it should have happened.

So among the intuitively obvious, wonderful prospects here are infinite file systems. Anything we have the metadata for, we can have locally. For all intents and purposes, we have the entirety of Sonatype, Maven Central, every repository of interest, on our local machine. It's all sitting there. The metadata is there; we know everything about that stuff. And then, when the time comes, we actually go and get a jar: either it's there for real on a physical disk, or it's downloaded transparently. But there's no parade of broken XML and specific repository management; there's no "I don't have this thing, I've got to go get this thing." That's a step we can cut out of any number of things that infuse our lives. Now, in principle, if everything's working like it's supposed to, then if you have Maven or sbt or Ivy or whatever, you don't have to personally go get the thing. You just get to watch it say "resolving, resolving," or watch it bottleneck on access to ~/.ivy2 for 20 minutes because two things are trying to resolve and it only lets one. So that's the working version, but we can do way better than this.
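A minimal sketch of that resolution model, in Scala. The premise is that the metadata index is local and complete while the bytes materialize on first read; JarMeta and the lazy-val trick are illustrative, and only the standard Maven Central URL layout is real.

```scala
import java.net.URL

// Coordinates plus the standard Maven repository layout.
final case class JarMeta(group: String, artifact: String, version: String) {
  def repoPath = s"${group.replace('.', '/')}/$artifact/$version/$artifact-$version.jar"
}

final class VirtualJar(meta: JarMeta) {
  // lazy val is the whole trick: zero cost until someone asks,
  // fetched exactly once, transparent to the reader.
  lazy val bytes: Array[Byte] = {
    val in = new URL(s"https://repo1.maven.org/maven2/${meta.repoPath}").openStream()
    try in.readAllBytes() finally in.close()
  }
}

// Nothing happens until .bytes is touched:
// new VirtualJar(JarMeta("org.scala-lang", "scala-library", "2.13.14")).bytes
```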
And then the last thing there, on the flip side: we don't need to make the active decision to burn a bunch of disk so that all the jars relevant to our classpath are sitting somewhere locally where they're convenient. We can always have them there. They're virtual; they take no space at all. And they are, for all practical purposes, real files, and they're read-write. So our lib_managed is always maintained. And that's just the tip of the iceberg: anything inside the build tool's model is visible to us.

With sbt, it's a constant source of frustration. Why on earth can't I make foo be equal to five? I've tried setting foo in Global, in build, in all projects, equal to five, but it still says it's seven. Completely maddening, because you don't know the crazy internal process it goes through to combine all these sources of settings and come up with something. But you could know, if it would reveal that to you in a file system. For one thing, it would show you the entire hierarchy, so you could see every value. And then, in nice little commented source code, all of which is virtual, just a mapping of its internal model, it would explain why it has that value: the build.sbt in this subdirectory says it's this, and that's the most important thing for determining it, so that's where it came from. And there are many examples of this: many programs that are configurable via a number of similar files, where it's left to you to figure out why the hell something has some particular value.

Nobody should ever have to write or see XML. It passes from comedy into tragedy that we actually use XML as a configuration format in any way, shape, or form. It's completely sick. So those days can be gone forever. We just come up with some less insane thing that maps directly to XML. That less insane thing could be YAML or, I don't know, I don't want to endorse any particular format, but it won't be XML. And then we can have what I call implicit files. Implicit files mean that for any given type of file, right next door in the virtual file system are the other types we know how to create from that type. So any time we have equivalent formats, say some arbitrary format that we actually like to write in, and XML, then if you come looking for foo.xml next to foo.someUnknownFormat, there it is, and it's exactly the XML you would get by converting the source at this moment. It's always current. It's not something you have to create for the tool's benefit; it is created by virtue of being requested.

Oh yes, a versioned database. So much of our pain arises from being ignorant of the fact that databases can figure a bunch of stuff out, if we would just store our data in a useful way. But we need files in the end, so we have this problem. But we can generate files, and I don't like to say generate, because what I really mean is materialize; generate implies an imperative process, I've got a thing, I'm going to make another thing. This is a pull process. We can have these holograms surrounding the database. They're file holograms, and if you come and try to grab one, it turns into Michael Jackson, right? Like the hologram Michael Jackson. But before that, it doesn't, and it costs you nothing. There's no effort expended on a virtual file that nobody ever asked for. It's truly the tree that falls in the forest.
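A sketch of how implicit files could resolve, assuming a registry of pure converters keyed by extension pairs. Every name here is illustrative; the point is only the shape: a miss on foo.xml finds foo.yaml plus a yaml-to-xml converter, and nothing runs until someone actually reads the result.

```scala
object ImplicitFiles {
  type Bytes = Array[Byte]

  // (fromExt, toExt) -> pure conversion. Populated by hypothetical plugins.
  private var converters = Map.empty[(String, String), Bytes => Bytes]

  def register(from: String, to: String)(f: Bytes => Bytes): Unit =
    converters += ((from, to) -> f)

  // `existing` stands in for the real files on disk. A lookup miss searches
  // for a sibling "foo.<ext>" that we know how to convert to the requested
  // extension; the conversion happens at request time, never before.
  def read(path: String, existing: Map[String, Bytes]): Option[Bytes] =
    existing.get(path).orElse {
      val (base, want) = path.splitAt(path.lastIndexOf('.'))
      val wantExt      = want.drop(1)
      existing.collectFirst {
        case (p, bytes)
            if p.startsWith(base + ".") &&
               converters.contains((p.drop(base.length + 1), wantExt)) =>
          converters((p.drop(base.length + 1), wantExt))(bytes)
      }
    }
}
```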
Build outputs. All right. So let's take some highly redundant set of Java files, or for that matter, Scala. Scala's got Function0 through Function22, which are deterministically derived from a script that walks from zero to 22, keeps expanding the number of type parameters, and a bunch of other stuff. What if all those zero through 22 existed virtually and were just a consequence of some other file, which in turn leads to class files, which are just a consequence of source files, which in turn lead to jars, which are just a consequence of class files? So now you have three levels of virtualization here, and you have build products immediately, which actually reflect everything before them, and the dependency chain is self-apparent. In fact, it's implicitly encoded in the file system: the fact that the jar bundles up the classes means it's dependent upon them; the class depends on the source; the source depends on the generator. You change the generator, nothing happens yet. Then somebody comes along, tries to read the jar, and everything happens again, because it's dirty.

So, I don't know, on a scale of one to ten, how much better is this than the way this sort of thing is generally done now? Because generation of sources, boilerplate management in real-life builds I'm familiar with: these things are monsters. They're imperative management monsters, and they can all go away. Truly, there is a silver bullet for some things. They say no silver bullet; all right, no silver bullet, but there are some things we can shoot. And of course the published jars are just one part of it. All the generated documentation that's derived from the source files: how long have I spent staring and waiting for the Scaladoc publication to finish even though I couldn't care less? A lot of time, and I'm never going to actually access that stuff, and if it wasn't generated until I went and accessed it, it would never happen. I'd get all that time back. That's not how it works, but it could.

All right, a classpath is a really clunky encoding of a union file system. It says: here's a bunch of paths, separated by colons or semicolons, each of which should be treated as a bunch of stuff for the classpath. Well, we can have a thing, it's about a line of code, and it's a FUSE file system, that splits the classpath into those paths and then takes the union of all the classes visible there. In addition, it can expand all the jars, virtually. It doesn't actually do anything except expose their contents, because all jars, in fact all containers of any kind, are two-way transparent. So you never again have to think: oh, I've got a jar and I need a tgz or a 7z or a bz2 or whatever else there is; there are like a hundred container formats. If we know how to unzip it, then we know how to see what's in it, so we can just have it there, always exploded, virtually. And if it's read-write, which is up to you, then you can actually edit the compressed thing by editing the virtual exploded thing, which feeds back into it the next time you request it, but not before. So these are crazy awesome things. What can I say?
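A sketch of the classpath-as-union-file-system idea: an ordered set of roots (directories here; in the full version, virtually exploded jars too), where lookup takes the first hit and everything behind it is shadowed. The shadowing diagnostic mentioned in a moment falls out for free. The class and method names are made up.

```scala
import java.nio.file.{Files, Path}

final class UnionFs(roots: Seq[Path]) {
  // First root that provides the relative path wins, exactly like
  // classpath resolution order.
  def resolve(rel: String): Option[Path] =
    roots.iterator.map(_.resolve(rel)).find(Files.exists(_))

  // Every other root that *also* provides this path but is hidden by an
  // earlier entry: the "what is shadowed" report.
  def shadowed(rel: String): Seq[Path] =
    roots.map(_.resolve(rel)).filter(Files.exists(_)).drop(1)
}

// e.g. new UnionFs(sys.props("java.class.path").split(':').toSeq
//        .map(s => java.nio.file.Paths.get(s))).resolve("scala/Option.class")
```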
The classpath, and we do this with PATH as well: the classpath you hand over for the JVM's consumption is now two characters or whatever. It's /foo, where you have brought everything that was on your big ugly classpath into one place. Same with your regular PATH; it's ~/bin for me, that's it. In addition, it can report on all kinds of interesting things, like: here's a path that actually occurs in several different members of your union file system, and only this one is being used; maybe that's not what you want. Just being able to trivially see what is shadowed on any given kind of union path is a very useful thing.

So here's a ridiculous idea. I wouldn't actually suggest doing a lot of this in development, but here I am taking every single jar, virtual jar, mind you: none of them are actually here. Taking every single virtual jar, building a classpath out of all of them, their contents, and then just using the Scala REPL or something. Just make something up; I feel like using Bippy, right? Anything you've ever heard of. It's slow the first time, because in the background the attempt to load a particular class went to the metadata index, found out what the dependency was, downloaded it so that the virtual jar became concrete, and then finally the really slow class loader comes back and says: instantiated, here's your thing. And you have it. There's not even a point at which you'd have to do anything other than wait a little bit. And of course, in these scenarios, A, that only happens once and then you have it locally, and B, you can prefetch as much as you want. So it's a little weird in that scenario if you think new Foo shouldn't take 20 minutes while Maven Central is throttling you, but in general the magic aspect is really promising.

Class files are blizzards of redundancy. There are numerous situations where the compression you'd get out of just LZW on a bunch of class files is going to be like 95 percent, because you have the constant pool massively replicated with tiny differences. Every single one of those class files is saying the same things over and over; if you disassemble a big directory of class files and run stats on the most common strings, you'll see some ungodly repetition. And all of this is unnecessary, because our class files can be virtual. There are a number of ways we can do this. We could be generating bytecode on the fly. But more simply, we can have a meta class file format that implies a bunch of class files, like Scala specialization. Here's a really unfortunate non-solution: let's bloat the standard jar by several megabytes so that we can have a subset of possible combinations of a small subset of the function types specialized. Instead, I can specialize every single function type on every single possible type at no cost. No joke: every single one, zero cost. This is not an exaggeration. Because this is a deterministic process, we have a template for a class file that we can specialize for any set of types. When you come along looking for the one that's specialized on Double, String, Byte, whatever, we'll just make that one for you and give it to you. It's there, and it loads, and it's highly specialized, or further optimized; who knows all the things we could do by getting in there at the moment a class is attempted to be loaded.
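A sketch of that load-time hook as a class loader. ClassLoader.defineClass and the override of findClass are the real JVM mechanism; the name pattern follows the real Scala specialization suffix ($mcDDD$sp), but the specialize template function is entirely hypothetical, standing in for reading the meta class file.

```scala
final class SpecializingLoader(parent: ClassLoader) extends ClassLoader(parent) {
  // Matches e.g. "scala.Function2$mcDDD$sp": base class plus type letters.
  private val Specialized = """(.+)\$mc([A-Z]+)\$sp""".r

  // Hypothetical: stamp out bytecode for the requested primitive types
  // from a stored class-file template (the "meta class file").
  private def specialize(base: String, types: String): Array[Byte] = ???

  override def findClass(name: String): Class[_] = name match {
    case Specialized(base, types) =>
      // The class never existed on disk; it exists because it was asked for.
      val bytes = specialize(base, types)
      defineClass(name, bytes, 0, bytes.length)
    case _ => throw new ClassNotFoundException(name)
  }
}
```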
So when I say implicit files, and this is a term I'll try to use consistently: an implicit file is a file we know how to make from an existing file. When you do an ls in the directory, you don't see the implicit files. Thanks to tools like convert and ffmpeg, I don't want to see 150 different audio formats next to my wave file. Yes, it can turn into 150 audio formats; wait for me to ask. They're there: if I ask for foo.obscureaudioformat, I'm going to get it. But when I look in the directory, I only see the wave. That's an implicit file. This is distinct from, say, a lazy file, which is something you can see but that hasn't been resolved yet, where work remains to be done.

Binary compatibility: huge problem, totally solvable now. Because what's missing is a link stage, which we don't have, but this is the link stage. It doesn't have to be rigid and predefined: I'm bytecode, I think I'm going to get a particular type in this method signature, and if you give me anything else I'm just going to throw a LinkageError and go home. It doesn't have to be like that. It could be: I need to resolve this situation, and I know how to resolve this situation. I'm going to give you the bytecode you need for what you're looking for. I'm going to please you, JVM, so that you may continue. I'm going to be the computer doing the computer's job.

One of the things about this talk, for me, is that several of these slides are totally world-shaking from my point of view. Having a mechanism for solving the binary compatibility problem of the JVM is huge. And yet I'm flying through them, because that's the kind of time I have. I only say this to emphasize that, again, your choices are that I'm delusional, or that this is the lowest-hanging fruit in the computing world for everyone in our lives. So I already kind of got at this one. Oh yes: "just solve binary compatibility." Well, there's a whole bunch of hand-waving here. Let me restate that: it doesn't solve the binary compatibility problem. It allows a solution to the binary compatibility problem, without changing anything else, because your opportunity now is that you have a mechanism to get into the process when a class is loaded. Here's what's going to happen in real life. You ship a version of a library; you ship a slightly different version of the library; something was compiled against the first version. Somebody has the second version, and it says, let me load a class, and it hasn't got quite the right signature, and it fails to link. But now we are the omniscient problem-solver watching this happen, and we can say: oh, wait a minute, that's not going to work. We know way before the JVM does that this doesn't actually work. And so we can go to a database of compatibility resolutions. It could be specific to these artifacts, or it could just be general logic. A lot of the time you just need to weaken or strengthen a type: it became a supertype or something. It's the same method; it hasn't changed; the signature changed by accident. But the JVM has no room for error; it's got to be an exact match. Well, we can smooth that over. It's like happy jelly for the sharp edges; the sharp edges become jelly edges, which is not clearly an improvement. I'll work on that metaphor.
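A sketch of what that resolution database could look like, as data. None of these names are real APIs; a real implementation would sit in a class loader or a JVM agent, and the rules table stands in for the crowd-sourced "this signature in v2 is the same method as that one in v1" knowledge.

```scala
// A member reference as the JVM sees it: owner class, name, descriptor.
final case class MemberRef(owner: String, name: String, descriptor: String)

sealed trait Resolution
case object Unchanged                              extends Resolution
final case class Redirect(to: MemberRef)           extends Resolution // e.g. type widened to a supertype
final case class Synthesize(bytecode: Array[Byte]) extends Resolution // emit a bridge on the spot

object CompatDb {
  // Hypothetical database of compatibility fixes, keyed by the reference
  // the old bytecode is asking for.
  private val rules = Map.empty[MemberRef, Resolution]

  // Consulted at load time, before the JVM gets a chance to fail the link.
  def resolve(requested: MemberRef, available: Set[MemberRef]): Resolution =
    if (available(requested)) Unchanged
    else rules.getOrElse(requested, Unchanged) // a real version would fail loudly here
}
```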
So I guess your hypothesis is that you can reproduce it?

That you can reproduce... I'm sorry?

You have this recipe and you can reproduce it, so you can lazily output different bytecode.

Yes, right: we're going to give them different bytecode depending on the situation. That's the thing. And it doesn't even have to be lazily determined; we can just have a validation check. It's the opportunity to give them something other than a fixed pile of bytes. Because the real-life version is: there's a pile of bytes, the operating system is going to come get it, and it's going to fail. And there's no point in there, no generic way, to say: oh, wait a minute, let me just look at this situation and maybe divert you a tiny bit, or give you something a little more to your liking. But this is that way.

So, a couple of quick questions. First off, how does this relate to the approach that, say, Paul Chiusano's Unison is taking, where every function, every data constructor is uniquely identified? And with that specific example you just gave, I'm wondering about something like: we don't have, in our programming languages, enough metadata to say, I'm referring to some name here from library A, and I'm referring to the same name from library B, but they depend on different versions, and consequently there may be some fundamental incompatibility, and then we have an end-to-end problem of translations to solve.

Absolutely. There's a chicken-and-egg thing here, though. There's not been that much motivation to have highly generic machinery, because we only really need a theory of name-to-name; it doesn't go any further than that. That's the main thing you need: you need to know that, within a given context, this name and this name refer to the same thing, or they don't. And then you can get a whole bunch of reusable logic. But the main thing is that the programming languages that exist today have all been designed around the nutty world that has no ability to do any of this. Of course: to every hammer, everything looks like a nail. The opportunities for doing things better just open right up. I find Unison to be completely complementary, in the sense that it's much further up the stack. I see this as an enabling technology that Unison should be able to put to use a hundred ways, but this is not an end in itself; this is only about the things it makes possible. So yeah, there are interesting questions in that regard, but that's a layer up from where I'm targeting at this moment. You said a couple; did I answer two questions? I learned the lesson early: don't answer the question you were asked, answer the question you wanted to be asked. I got that from having to stand in front of actual people, so it probably sticks with me some. Anyway, if you feel dissatisfied with your answers, you know where I'm at. Anything else before I go on? Yeah.

I was going to ask: you've mentioned implicit files and such as we've gone along, and you mentioned recognizing when we need a different binary version of something. But something interesting about that is that it takes time to make those adjustments.

Yes, absolutely. It's the Haskell problem: if you make everything lazy, then you get these moments when a whole bunch of work suddenly has to be done, which might not be when you want it.
Yeah, and that's a big deal for a file system, because we consider reading and writing to be almost a known constant.

I agree. So let me put this into the hand-wave category for now. There are a number of interesting problems, or let's just say consequences, that arise. I think I have good answers for all of them, but I'd like to get through the first breadth-first look at the possibilities and then come back for those.

Totally understand.

Great, blah, blah, blah: specialization. Yeah, $mcDDD$sp. This is what we're talking about here. Each of those Ds means that if you're folding over a bunch of doubles, it's going to be really fast, because you can pass a function to fold over a bunch of doubles and it's going to go: hey, I'm folding doubles, I can call this thing, just pass doubles, and not box doubles. And it'll be ten times, maybe more than ten times, faster, because there are two of them going in and one coming out; who knows, a lot faster. But if it happens to be, I don't know, pick a combination that doesn't turn up in the specialized set, and now you drop off two orders of magnitude in speed because of what seems, from any sensible standpoint, a transparent change. You're actually hung up on these implementation details. But every single one can be there: DDD, every letter of the alphabet, Unicode, anything you can think of. The total size of the bytecode, if you totaled it, could be a billion petabytes. It's all right there for the taking, because you're only going to come for specific classes, and we'll make it when you come for it. If they come for it, you will build it. I guess it's the reverse of Field of Dreams.

Ah, that's good. I don't usually get that for something off the cuff; the prepared stuff generally gets that.

Yeah, so, great. So there it is in an .mcf, a meta class file. That's just a made-up thing, and this is supposed to be conceptual, but it could actually be a real thing as well. At the very least, everything you do that has a bunch of duplication with no obvious way to get rid of it is likely to lend itself to this approach.
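To make those Ds concrete: Function2[Double, Double, Double] is one of the combinations scalac really does specialize, emitting apply$mcDDD$sp alongside the generic boxed apply. The cliff for an unspecialized combination looks like this (Byte is genuinely outside Function2's specialized set):

```scala
// Dispatches to apply$mcDDD$sp(Double, Double): Double. No boxing at all.
val plus: (Double, Double) => Double = _ + _
val sum  = plus(1.0, 2.0)

// Byte is not among Function2's specialized types, so every call goes
// through the generic apply and boxes both arguments and the result:
// the two-orders-of-magnitude cliff described above.
val addBytes: (Byte, Byte) => Byte = (a, b) => (a + b).toByte
```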
Here we have, and again I've already sort of gotten at this one, metaprogramming from the higher layer. Most of us, I assume, feel our hackles rise when we're metaprogramming, and your intuition about metaprogramming is correct: it's a great way to make something completely impossible to understand. But it's a bit different here, because we are not metaprogramming in the Scala-macro way. We are generating code, and in a much better way than code is generally generated now; and it is generated now, when we deal with it at all. The thing is, we can make it transparent. When we have three meta-levels of expansion, we can see each of them as a bunch of files. We can get errors against those files. We can generate beautiful code, not the usual junk "I'm generated code" stuff. It can all be part of the process. And what it amounts to is just better tools for abstraction, more flexible ones, that can do things not thought of in advance, because we can make essentially arbitrary decisions.

Now, we wouldn't want this to be enormously slow. You probably don't want to be compiling Scala code for the purpose of generating Scala code and have it be 20 more minutes, every time it comes by, before the thing goes. But that's okay. We can use a really fast language, Go or something, God knows, something fast, for that part if we want to. Because it's easy: it's like a shell script generating code. And then it's no slower than it was before, but way easier to manage.

Other ideas. Among the implicit files derived from any class file is its javap disassembly. So here I have java/lang/String.class, and I say vi java/lang/String.javap, and here I am looking at the disassembly. But why stop there? I'm going to run down in there into the constant pool and change something. That's going to update the class file, but it isn't going to update the class file naively, because of the way I've written this particular implicit file, with a two-way mapping. The constraints the Java virtual machine requires of a class file will be re-established. So when I change something, move something around, whatever, the other things that have to happen so we don't land in VerifyError land, those will happen. So this is the magic editor. The thing comes to you as this cool expanded binary view, and then you change text and the binary updates and gets it right. This is totally doable. Maybe that sounds like science fiction, I don't know. This is totally doable.

And then there are a bunch of variations on that kind of thing. Things that are really hard to edit: we don't need to see them as the hard-to-edit thing. We can see them as the easy-to-edit thing. We can encapsulate the knowledge necessary to make something easy to see and easy to change, and then that is implicit in the file system, and it's just there. So the opportunities for crowdsourcing this kind of thing are really high. How much work have all of us put into parsing something ridiculous, where it's one-off work that's never going to get reused by anybody else? But this is a mechanism by which, once we have a type system in place (I haven't gotten to typed files yet), we can just have things like type A and type B; somebody writes the transformer between them and puts it in the crowdsourced database of type conversions, and then, just like magic. I think Eric had the brilliant idea of using Mechanical Turk, so you could take the earlier example, where it has to pause waiting for the download, but instead it pauses to issue a Mechanical Turk task to write you a REST API wrapper. So it takes several hours, but the initial attempt succeeds. That's pretty cool.

So I guess this is the example I was just running with. Oh yes: rm indexOf. I just removed a method from java.lang.String. I look again, and the class file is smaller. And there are many, many other things one might wish to edit in this fashion. I would love for scalac's symbol table to be exposed as a file system, so I could do rm -rf collection and that sort of thing, just trim it way down. Obviously the compiler is not really designed to have half its symbol table disappear out from under it, so it probably won't deal with that. It's not that we can necessarily spoof every program in this way, but these opportunities do exist.
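A sketch of that javap round-trip as a two-way implicit file. The read half is real (javap is the standard JDK tool); the write half is the hypothetical part, where assemble would have to re-encode the edited text and re-establish class-file constraints (constant pool offsets, StackMapTable) before writing, exactly the "magic editor" work described above.

```scala
import scala.sys.process._
import java.nio.file.{Files, Paths}

object JavapView {
  // Reading foo.javap materializes the disassembly on request.
  // javap -v -p accepts a class file path; this call is real.
  def read(classFile: String): String =
    Seq("javap", "-v", "-p", classFile).!!

  // The hypothetical inverse: nothing standard does this today.
  def write(classFile: String, editedDisassembly: String): Unit = {
    val bytes = assemble(editedDisassembly) // re-encode, fix up, re-verify
    Files.write(Paths.get(classFile), bytes)
  }

  private def assemble(text: String): Array[Byte] = ???
}
```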
All right, so it's not just about boilerplate-style code. There are a bunch of situations where there's a clean way to express logic that is just buried in a blizzard of irrelevant complexity by the languages themselves. Parser combinators are a great example of this. The only really fast parser-combinator situation is Parboiled2, which uses macros, and I guess I don't have to tell you any more than that it uses macros to pass around a mutable thing. You know this never ends well. It sure doesn't, right? But we don't need to use freaking macros. Macros are the world's most complicated metaprogramming. We can do the world's simplest metaprogramming instead. We'll write super, super simple rules, without letting them be infected by all the accidental complexity, and then we'll write something extremely simple that translates those rules into some programming language. Probably not Scala; at this point I'll generate Java, because that's assembly and I can make it really fast. Java is definitely our assembly language. And so now I get to work in my dream language. It's exactly the language I want, and it implies code for which there's already a compiler, and so this basically enables the escape that I've always wanted from all the languages we have to deal with. This might be my last JVM slide. I have tons and tons more slides, but that's a good place to take stock of where we are with Java.
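A toy version of that "world's simplest metaprogramming": rules as a tiny Scala ADT, Java source as the output. The emitted shape (expect/and/or helpers, the class skeleton) is illustrative, not a real parser generator; the point is that the generated file is just another implicit file, materialized only when asked for.

```scala
sealed trait Rule
final case class Lit(s: String)         extends Rule
final case class Then(a: Rule, b: Rule) extends Rule
final case class Alt(a: Rule, b: Rule)  extends Rule

object Emit {
  // Translate a rule into an expression in our "assembly language".
  def expr(r: Rule): String = r match {
    case Lit(s)     => s"""expect("$s")"""
    case Then(a, b) => s"and(${expr(a)}, ${expr(b)})"
    case Alt(a, b)  => s"or(${expr(a)}, ${expr(b)})"
  }

  // The whole (hypothetical) parser class, i.e. the contents of Foo.java
  // as it would appear next door in the virtual file system.
  def classFor(name: String, r: Rule): String =
    s"public final class $name { boolean parse() { return ${expr(r)}; } }"
}

// Emit.classFor("Greeting", Alt(Lit("hello"), Lit("hi")))
```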
Yes, Chris?

So, in a previous life I dealt a lot with metadata for programming languages. One of the things I remember is that, at the time, Wiser 4 actually had some of these same ambitions with respect to the translation between metadata and data. That was sort of one of my questions.

It goes back like 30 years. I don't claim a single original idea.

And the second thing was... sorry, I hope I didn't displace it.

For emphasis, though: even though this is, quote unquote, the best idea I've ever had, there's no actual idea here other than that we're nuts for not having done it already. That's the closest thing to an idea that I have.

From your previous talk, you said that the metadata semantics are different in terms of laziness.

Well, yes, that's right: the data is lazier than the metadata.

But metadata is data.

Yes, absolutely. So when I say data: in a file system there's an implied structure that doesn't stop with the metadata. It isn't just encoding a tree; it's encoding a tree with data at the leaves. Which is to say, somewhere there's a bunch of bytes; we're storing files, in the normal definition of files, somewhere in here, in general. And so yes, metadata is data, but the data proper is the part that's likely to be big, and metadata in our case is things like: what is it, what type is it, is it enumerable? The number one thing, actually, is the separation of the type from the thing that embodies the type, because the metadata is our compiler-level knowledge of things.

Right, but what I'm saying is we might want an infinite tower, where we can treat metadata as data, where we're generating metadata.

I understand. You very likely will want that, and you can have that. It's not that metadata can't be lazy; it's that data has to be even lazier. And the point is that you can analyze the metadata separately from having to actually acquire the data. This is already the way file systems are; I'm only pointing out that aspect of them, which is essentially an implied two-level lookup. I don't give it a path and get back all the bytes; I give it a path and get back a mechanism for understanding what that thing is. That's the key to this. So I'm not excluding any kind of lazy metadata. I'm only excluding the data being a one-level map. It can't just be paths to bytes; it's got to be paths to byte-getters.

Yes. Seems like Nix would be a great solution for some of this.

That's a perfect next slide, actually. Perfect next slide, thank you. Nix is definitely of interest, though I don't actually have a Nix slide. Yes, Nix is exactly in line with this. There are several things like that; basically I've got four linchpins or so, and there are a bunch of existing things that each have two of them and are perfectly aligned, so that everybody can go, isn't this cool? So git is totally critical: it gets me my immutable file system. But git's interface is a nightmare; I don't ever want to actually use git. What I want is for git to be my mechanism for ensuring an immutable file system, and that's it. With a different backend, in fact: I don't want loose files, I want a SQLite backend for git. This sort of existed, and then it's not maintained, which is annoying. There are a number of fronts on which I need the state of the art to be advanced to some extent, but I have the motivation and the means for that to happen. So that's definitely an intention, yes. But your question was about the distributed-access part of it?

I think Intel has a project, maybe; something like what Akamai has, some kind of distributed resource system.

Have I mentioned anything really distributed yet? I have stuff like that, but I don't remember actually saying anything about it yet.

Yeah, it's because of the earlier item; you said it could be split, I thought.

Ah, yes. So there's a crazy universe of possibility in this zone: what a file is, where it comes from, how it's replicated, how it's synchronized. For instance, synchronization just between my desktop and my laptop is still, in 2015, a giant pain in the ass. But you can see a world here where synchronization is, A, driven very much by the actual reads and writes, and not by the fact that I need to do this in advance; and also where you can expose the synchronization status of a thing in its metadata. So, one of the things I've gotten into here, and it's an implementation detail, but: extended file attributes basically give us a channel to say a lot more about things than is said in the standard Unix file system. It's nuts that we sit around typing octal numbers all day to manipulate permissions. I mean, it's nuts. We can actually have this whole namespace of possibility. Now, unfortunately, native extended file attributes are broken, slow, or both, in general. But again, some of these things are there and they suck because nobody uses them, and nobody uses them because they suck. So we can make them suck less. There's no reason they have to be slow like they are.
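For concreteness, extended attributes are already reachable through the standard java.nio API (the "user." namespace on Linux); only the attribute name below is made up.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets.UTF_8
import java.nio.file.{Files, Paths}
import java.nio.file.attribute.UserDefinedFileAttributeView

object Xattr {
  private def view(path: String) =
    Files.getFileAttributeView(Paths.get(path), classOf[UserDefinedFileAttributeView])

  def set(path: String, name: String, value: String): Unit =
    view(path).write(name, ByteBuffer.wrap(value.getBytes(UTF_8)))

  def get(path: String, name: String): String = {
    val v   = view(path)
    val buf = ByteBuffer.allocate(v.size(name))
    v.read(name, buf)
    buf.flip()
    UTF_8.decode(buf).toString
  }
}

// e.g. Xattr.set("report.pdf", "origin.url", "https://example.com/report.pdf")
```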
There are a few things that do use extended file attributes, and it's really neat. Whenever Chrome downloads a file, it attaches the URL it came from as an extended file attribute. So after the world's craziest blizzard of obscure command-line incantations, you can actually get that data out. It's not stored in any way you'd ever just stumble upon; it's comical, it's like they want to be sure nobody can figure out what it says. But once you can see what it says, that means that for all the files lying around, you can see where they came from. That's really neat. Lots of programs could be doing stuff like that, and we could be utilizing it, and we could be virtually exposing it all the time. That's the thing: there's no cost to having a bunch of extended file attributes on a virtual file that nobody looks at. But when they do look, then that's essentially a database lookup, or somehow a data source is being cultivated and fed to that particular request, because that's what they need to know. So that can be slow. We're getting a huge range of new behavior, and it's only slow on the rare occasions we need it. But when we need it, it's really handy.

Git, SQLite, Datomic. Clojure is not really my bag, but if you read the Datomic stuff, it reads like they were designing it for this purpose. Imagine this. Your symbolic links right now: you go ln -s bob, and then you've got to give it a fixed string, a path, and that's all you get. So it's the world's worst programming language: it only knows how to echo constant strings. But what if, instead of a path to bob, that was a Datalog query? So you go ln -s bob some-specific-Datalog-query that does something sophisticated and gives me exactly the file system I want, right here. Now you have ad hoc, trivially created file systems, which you can then turn around and treat like a bunch of real files, using all the same tools.

One thing you discover, if you do all the research I've done on this matter, is that the world is just overflowing with reinvented file systems. I don't know how many programs there are out there called things like lsbippy, rmbippy, mvbippy, because they all have to reinvent everything we use in the regular file system to manipulate stuff inside their file system. And all of that is madness, because we can do it with the same tools. We just need to expose it, and we need a mechanism to do it, which we're missing. Well, not entirely: we have FUSE. The problem is, it's not pervasive and it's not good.
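A sketch of the query-valued symlink. Here the query language is just a Scala predicate over file metadata, standing in for the Datalog query against a Datomic-like store that the talk imagines; all the names are illustrative.

```scala
final case class FileMeta(path: String, ext: String, size: Long)

// "ln -s <query> <name>": the link's target is a query, not a constant string.
final class QueryLink(index: () => Seq[FileMeta], query: FileMeta => Boolean) {
  // Resolving the link materializes a listing of everything that satisfies
  // the query *right now*; unlike a path string, it can never go stale.
  def resolve(): Seq[String] = index().filter(query).map(_.path)
}

// e.g. ln -s 'ext = jar, size > 1MB' bigjars
// val bigjars = new QueryLink(scanIndex, m => m.ext == "jar" && m.size > (1L << 20))
```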
I'm getting a bit confused here because of that. It seems that most of the time you speak about this as an interface, but then you also speak about implementing. So I see the point that everybody exposing a file system is cool. I don't see how it means you just don't need to implement your rm or your ls for your specific file system. I mean, it's just a nicer interface. I don't see how it actually makes my life easier.

Well, so, A: mostly you won't need to write your own file system. Most of these file systems are completely redundant. If you have Suffuse, you have a few core file systems that do the major things people are doing with their file systems, which are almost trivial once you take away all the ridiculous boilerplate and the need to satisfy FUSE's desire for 40 low-level methods. That's not the level at which we want to write file systems. We want to say things like... in fact, I must have a slide in here somewhere, but basically: if you can build a map in memory of paths to metadata, which then leads to data, you're done. That's it. (See the sketch after this exchange.) So no, you don't have to write a bunch of stuff. You can write more, depending on how much fine-grained control you need, but you almost never need it; that's not what people are here for with custom file systems. It becomes a matter of just mapping. File systems are collections; that's the whole thing here. File systems are collections in numerous ways, and the ways we manipulate them all amount to collection operations. You join them, you divide them, you manipulate aspects of them, you traverse them, you break them up this way and that. So if you do write a file system, it definitely makes your life easier. In fact, mostly you don't write a file system; at most you might have a config file, because the whole big virtual universe can be config-file driven for all common tasks. There's no reason I can't just save a .sfs config, put a few little commands in there, and exploit the already-written file system, which does what I need because I configured it to do what I want. So does that make it any more convincing, or do you still feel...

Actually, this is not very code-oriented, but I've written a lot of code, and writing a file system is literally too much. So I see the point you're making; I just don't see, for example, something like Ivy or Maven becoming less complicated from the fact that you're doing this. I can see it becoming more magic. I can see, oh, cool stuff, but I don't see it becoming less complicated.

Yeah, so, I don't like to say magic if I can help it. What we can take out of the equation is all of this: Ivy and Maven are largely opaque creatures that make a bunch of decisions and badly manage a single-threaded cache, which gets corrupted if too many people look in there at the same time. Now consider an alternate universe in which all the files are actually managed in a way that's concurrency-safe, so we no longer have this whole business of: oh no, I ruined my Ivy cache, I need to delete it because I'm getting all these mysterious errors. Where it's all database-driven, rather than having important configuration data strewn over XML files all over the place. I mean, this is nuts. This is nuts. We can cultivate data so much better than this. We could just sit down and write a database thing that was trivially far superior in every aspect, in terms of associating requirements with versions with artifacts. And then we can present it as an Ivy repository or whatever. But this is much less magic. The only thing that seems magic is that we've allowed the actual act of trying to get a file to cause something to happen. That's something we're not accustomed to, but it's not any more magic than anything else we do on a machine. It's an arbitrary limitation that has bound our hands for 30 years for no good reason. So I choose to lift it. And we can remove all the Ivy support from all the build tools, however much of it is welded into sbt: pretend it's all a local directory containing jars, right? The whole process of managing the network side, they're really bad at it, and it's a bunch of complexity. But regardless of the exact avenue chosen, the opportunities are there for basically getting free of these things. I mean, adding magic is the opposite of my ambition.
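That "map of paths to metadata, which leads to data" claim, as code. This is roughly the entire interface a Suffuse-style core would ask you to provide; the names are illustrative, and the byte getter ties back to the two-level lookup from earlier.

```scala
final case class Node(
  size: Long,
  isDir: Boolean,
  attrs: Map[String, String]  = Map(),
  data: () => Array[Byte]     = () => Array() // the byte *getter*, not the bytes
)

final class MapFs(tree: Map[String, Node]) {
  // stat: path -> metadata, no data acquisition involved.
  def lookup(path: String): Option[Node] = tree.get(path)

  // ls: every path under a directory (a real version would take
  // immediate children only).
  def list(dir: String): Seq[String] =
    tree.keys.filter(p => p != dir && p.startsWith(dir + "/")).toSeq
}

// Build the map, hand it over, and the generic machinery does ls, cat,
// stat, everything, without you writing 40 FUSE callbacks.
```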
For example: doesn't the inclusion of the network in a file system introduce errors, break too much? What if the network is down? How do we expose that in the file system?

So, among the most critical aspects of this, for actually getting anywhere, is the recognition that you cannot be flaky as a file system. There's no room for that. You must be the most robust component in the system. And whatever your mechanism is for dealing with that: my intention is that nothing happens without an absolutely bulletproof core for dealing with all manner of file system failure. But what this really is, is an opportunity, not a weakness. It's an opportunity to manage the various brands of failure in a coherent way. And we can also do things like expose the reality of whether a file is already here locally or not: purely virtual files have different metadata than virtual files backed by a physical file on the machine, and we can see that. So we know what's here and what's not here. We can prefetch. We can be opportunistic. There can be intelligence, crowdsourced intelligence: other people who get this also get that, whatever; let's just go prefetch stuff. What's trending in Ivy repositories, you know? This isn't science fiction, and it would be hugely beneficial, because that's the thing: all of us are laboring, laboring, laboring, applying human intelligence to a problem, and then all of that is wasted. Whereas coalescing it and then utilizing it is not that hard, and here we have a reason to do it.

Wait, wait, let me get to... welcome back to you, Chris, what have you got?

I've got kind of a simple question. I missed your previous talk, but I think it was very exciting. I just did a report on Plan 9, and they heavily used virtual file systems.

I did mention Plan 9 in the other one. Only as a throwaway joke, but yes.

So I'm curious, and again, it's kind of simple, but: at the core, is there a way to get the current file systems to expose all this virtually, or would you kind of need a window that you look through at the file systems?

That's among the biggest problems. That's how it is: basically, we need operating system support to own the root. Without that, people can keep escaping from our happy universe. We can redirect symbolic links, for instance, back into our window, but when somebody's hard-coded /usr/bin/whatever, that's where it's going to go. This is in the problems-to-solve category. There's no fundamental technical obstacle, but it's not something that can be fixed from where I sit. And it really limits a lot of the best ideas, like security: you could be hardening file names dramatically, and their contents, and enforcing all kinds of cool constraints, but security all breaks down if anybody can just go around you through /. So yes, very much a problem, but also not one that especially concerns me, because it's a little way off.

So my question is actually related to this one. You've done a ton of research on this. Why hasn't this happened earlier? I mean, this was the goal of the Unix system, in some ways, where you have devices that give you bytes. Is it just that when you get bytes from a Unix device, it's streaming rather than random access?
Well, I'd say the biggest reason it hasn't happened yet is that conditions were very different in previous eras. Disk used to be, what, like ten to the ninth slower than memory? So getting too interesting with your virtualization of disk was probably the road to debacle; performance was probably a very serious consideration. There's every indication of that when you read papers from the 80s: you'll read about stackable, or layered, file systems, as they call them, and stuff has shipped before. There are a number of examples of things that have elements of this in them. But it's never been pervasive, and you have the classic chicken-and-egg problem: to really enjoy the benefits of this, you've got to be able to anticipate that the virtualization exists on the target. Just because SunOS shipped some interesting virtualized file system in the 1980s didn't mean anybody else did, so nobody could really take advantage of it. We all lowest-common-denominator ourselves into oblivion.

Now, my ambition for Suffuse is to ship something that's dual-natured, in that it's capable of exploiting virtualization where it exists and falls back to being a regular archive-style thing that works like a normal thing where it doesn't. Every idea I have is based on the premise that nobody will try anything that's the least bit inconvenient. So it must be absolutely zero pain whatsoever, nothing but pure good. And I think that's essentially doable. The one thing you'll have to do is install Suffuse. If you can type brew install suffuse, that's all you've got to do. You do have to do that right now; you won't have to do it ever again, though, because one of the pieces is a Suffuse file system with every binary in brew sitting there on your PATH. So you need bippy, and you try to run bippy, and it's not there... oh wait, in the background it's running brew install on the bippy package, and then it finishes, and it works. This is another thing I've actually implemented. It works exactly as described. This is not science fiction.
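A sketch of that brew-on-demand trick, compressed to its essence. Assumes package name equals command name, which is an illustration-only shortcut; the real version intercepts the lookup inside the virtual PATH rather than wrapping the invocation.

```scala
import scala.sys.process._

object LazyExec {
  // Ask for a command; if it isn't installed yet, install it first,
  // then run it. To the caller it just looks like a slow first run.
  def run(cmd: String, args: String*): Int = {
    if (Seq("which", cmd).! != 0)     // not on the real PATH yet
      Seq("brew", "install", cmd).!   // block while it materializes
    (cmd +: args).!                   // now it exists; run it for real
  }
}

// LazyExec.run("tree", "-L", "2")
```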
Yes? Just a quick question on that: what about power characteristics? How would such a system use power on, say, a battery-operated device?

I don't see any reason to expect this to be some kind of massive battery drain. Sure, we're talking about adding logic to file system accesses, but one of the critical things we have the opportunity to do here is this: not every file in the system needs to be wrapped in these layers of blankets. We're only interested in a subset of files, a small subset, really. We're interested in the files that make the system be the system, and we're interested in the files that we as humans mess with. For the most part, the rest are all derived from those. So we can have an identity transform. It's critical, in the way you architect this stuff, that you can get out of the virtual world when that's what's called for: if it's the identity transform, then when all is said and done, just pass the thing back out to the regular file system and let it do the thing the way it did before. So in principle, you really don't have a major issue there.

What if you're bunching together work to do? Say you have to go out somewhere remote to fetch some bucket of bytes. You're bunching all that stuff together, whereas some of the work could have been done beforehand, at a time when more power was present.

Well, sure, nothing is stopping us from pre-doing work. It's designed for laziness, but just as in real life we attach eagerness annotations because we want something done now, there's no reason we can't do exactly the same thing here. It's just that, as a default, for an arbitrarily large universe of data, we would really like things to generally happen lazily. But yes, pre-doing stuff is very important, both for performance and for robustness. As you mentioned, the network is going to go away; well, let's make sure I've got everything I need to keep working when the network goes away. That's something I definitely don't have now: GitHub goes away, and my whole life shuts down.

Oh, sorry, am I... oh, I see, there's an actual time limit. Sorry, I'm not used to having any constraints on my behavior. Okay, just telling you, I'm done. I don't really have anything else to do around here, so if you have more questions, I'm here. Anyway, thank you very much.

Thank you.