I want to welcome Jakub; we've debated his name already. I'm just going to hold this. This is impossible to pin. Oh, I think you can just take it away. Yeah, that also works. Hello. Is it OK? Can you hear me well? Yeah? OK. Great. So welcome. We're going to start here. Yeah, nice to see and learn more about WASI. So hey, guys. I'm going to talk to you today briefly about WASI. But before we get on, let me, I guess, introduce myself. This is actually my first time at FOSDEM and first time presenting. So far it's great. So my name is Jakub Konka. I'm an R&D researcher at Golem Factory. For all the links and stuff, if you want to see more, you can find this on the FOSDEM events web page for the talk. I tried to put everything there. If you have any more questions about stuff, just feel free to ask me, probably after, so that we can get through this. And then I want to show you some live coding as well. Hopefully we have time for that. So I'm a regular contributor to Wasmtime and WASI, and one of the authors of the WASI Common library that's used in both the Wasmtime and Lucet WASI runtimes. And I'm going to get to that later on as well. I'm also a member of the WebAssembly Community Group. And if you have any questions, just feel free to reach me on any of the handles. Private emails are fine as well. I don't mind. So what is WASI? Very briefly, WASI stands for WebAssembly System Interface. It's a pretty new thing. I think it was officially announced last year around March, April time. Currently it's being standardized by a community group called the Bytecode Alliance. Now, the Bytecode Alliance was formed in November 2019 by four founding members: Mozilla, Fastly, Red Hat, and Intel. All of those companies actually have something to do with WebAssembly and computation outside of the browser using Wasm. So why WASI? Because we already have the Emscripten target. We have the unknown-unknown target for Wasm.
So WASI is trying to take Wasm outside of the browser, unlike Emscripten, and actually do it with security in mind first. So the security is based on a capability-based security model, kind of like CloudABI, Capsicum, or whatnot. So basically, it will give you safe and portable access to the host resources. If you want to learn more, have a look at the official website, which is wasi.dev. So how does it work? It's actually pretty simple. You take your C or C++ binary or library, mainly. You cross-compile that to the wasm32-wasi target. You get your Wasm module with some exports defined, and imports that the embedder or the runtime should provide. You put that in the box, which is the sandbox. Could be Wasmtime or Lucet. Now, the tricky bit here is that if you don't give it any capabilities, you can't really do anything except, for example, read from stdin or write to stdout or stderr. If you want to do something fancy with your Wasm app, you have to give it capabilities. So for instance, you could allow it access to the workspace directory there under /, right? You could give it access to the host entropy, but all of this requires a capability. Because by default, you don't get access to anything. So for instance, in this case, we're going to give it access to workspace and entropy, but we're not going to give it access to /dev or clocks or in general time. This way, it's actually easier to give capabilities than take them away, because you can always forget about something, and things just happen. So to put it into more context, for instance, oh, by the way, I'm using Rust, I hope that's OK. I'm not really a good C developer. I mainly use Rust, so sorry for this, if you guys are not into Rust at all. So we can actually create a file in the workspace directory. We can access the entropy with rand's thread_rng, but we cannot, for example, open /dev/null, because we don't have a valid capability for that.
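A minimal sketch of the deny-by-default idea described above, as a toy model rather than real WASI or Wasmtime code. The names Capability, Sandbox, can_open, can_use and talk_sandbox are hypothetical and exist only for illustration; a real runtime also resolves paths against pre-opened directory handles instead of comparing strings.

```rust
use std::collections::HashSet;
use std::path::Path;

/// Hypothetical capability kinds, named after the resources mentioned in the talk.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum Capability {
    /// Access to one directory tree, e.g. "/workspace".
    Dir(String),
    /// Access to host entropy.
    Entropy,
    /// Access to host clocks.
    Clocks,
}

/// Toy sandbox: it holds only what the embedder explicitly granted.
#[derive(Default)]
pub struct Sandbox {
    granted: HashSet<Capability>,
}

impl Sandbox {
    /// Grants one capability and returns the sandbox for chaining.
    pub fn grant(mut self, cap: Capability) -> Self {
        self.granted.insert(cap);
        self
    }

    /// A path is reachable only if it sits under a granted directory.
    /// Assumption: plain component-wise prefix check, no ".." handling.
    pub fn can_open(&self, path: &str) -> bool {
        self.granted.iter().any(|cap| match cap {
            Capability::Dir(root) => Path::new(path).starts_with(root),
            _ => false,
        })
    }

    /// Non-path resources are usable only if granted explicitly.
    pub fn can_use(&self, cap: &Capability) -> bool {
        self.granted.contains(cap)
    }
}

/// The configuration from the talk: workspace and entropy, nothing else.
pub fn talk_sandbox() -> Sandbox {
    Sandbox::default()
        .grant(Capability::Dir("/workspace".to_string()))
        .grant(Capability::Entropy)
}
```

With this configuration, a file under /workspace and the entropy source are reachable, while /dev/null and the clocks are not, because nothing was granted for them.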
So opening /dev/null is a no-go, and we also cannot call SystemTime::now, because we're lacking that capability as well. So this is cool. You can actually run apps that you don't really trust, because you can enforce some degree of security using capabilities. So now what is the setting? The setting is fairly specific. So I work for Golem Factory. Golem Factory is a blockchain startup. It's trying to create a decentralized market for computing, where you can buy and actually sell computation power. Basically, if you have a problem, you call yourself a requester. You want to basically do a MapReduce kind of an app. For the moment, you write your task. You then specify how it's meant to be divided into chunks. And then a provider, so a person who's actually renting out the computation power, is going to compute each of the subtasks for you. The trick here is that obviously it's untrusted. It's not like Amazon AWS or Google Cloud. You don't really trust them, because you have no idea who they are. So things can go wrong. They might have an incentive to lie to you or whatever. So that's why Wasm and WASI come in handy, because they give us some degree of security. In the sense that, even if there's a malicious requester who wants to do something crazy, with WASI you can restrict what they're going to do. So in Golem, one of the use cases is you can write your own Wasm apps and then send them to the Golem network. Now, how it works currently, you're allowed to compile your app to Emscripten, or rather you have to if you want to run it on Golem. That is because when we started looking into this, WASI wasn't even announced yet. That was end of 2018, I think. So I think Wasmer started doing some stuff in the sense that they were trying to take Emscripten and then do something useful with it. And we basically, we were caught in this wee bubble. So we use Emscripten to do our bidding. Our sandbox is SpiderMonkey based.
So we took SpiderMonkey, the JS engine, and embedded that. And essentially speaking, you basically have to preload all the resources you want into memory, and everything is in the host memory. So that's also not ideal, plus you have to deal with JavaScript. So that's why when WASI was announced, it was perfect, because you don't have to deal with JS and you have a better way of tracking what capabilities you give. So the other thing that's actually quite tricky, and this is really tricky, is if you, as a requester, want to get something computed and the node is not trusted, how the heck do you tell whether what they returned is valid or not? Right? There is no trust, so to speak. So one of the ways of handling this automatically, so that the user doesn't have to worry about this stuff, is duplicating the work. So we call this verification by redundancy, and you can duplicate one task or subtask into two, send it to two different nodes and then compare the results. But in order to compare the results byte by byte, which is the simplest way, you need determinism. If you don't have the determinism, then good luck. For every single use case or app you would have to come up with your own way of verifying, which is just annoying. So this is where the determinism requirement comes from. And I know it's also useful for blockchain guys who actually do the lower level stuff, like smart contracts. Obviously there, you need determinism on all levels. Right, so is WASI deterministic? And unfortunately it's not. There are actually many avenues which give you non-determinism. Wasm itself has certain things that might make it non-deterministic, and with WASI, we add even more things. So one of the obvious ones is access to the random device. And this is done with the syscall random_get, which under the hood should call something like getrandom to actually access the host entropy.
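The byte-by-byte comparison behind verification by redundancy, described above, can be sketched as follows. This is an illustration under assumptions, not Golem's actual verification code: the function name verify_by_redundancy is hypothetical, and it simply demands that every provider returned identical bytes.

```rust
/// Returns the agreed-upon result if at least two providers answered and all
/// of their outputs are byte-identical; returns None otherwise.
///
/// This only works if the subtask is deterministic: any use of entropy or
/// clocks inside the computation would make honest providers disagree.
pub fn verify_by_redundancy(results: &[Vec<u8>]) -> Option<&[u8]> {
    let (first, rest) = results.split_first()?;
    if rest.is_empty() {
        // A single result cannot be cross-checked against anything.
        return None;
    }
    if rest.iter().all(|r| r == first) {
        Some(first.as_slice())
    } else {
        None
    }
}
```

A requester would collect the outputs for one subtask from two different nodes and accept the result only when this returns Some.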
So the good news is that with the upcoming snapshot, which is in the works now in WASI, random_get will get its own module, like a Wasm module, and will require a capability to be enabled. So by default, it's gonna be disabled. So this is good. So one less thing to worry about. The other one is system clocks. Same story as with random_get. Currently it's enabled by default, which is essentially a spec bug, but we were just speccing that out. But in the future it's gonna get its own module and will require a capability. So that's another thing. Now, more subtle ones, and those are the interesting ones, are for example the contents of filestat. So this has an almost direct mapping to what's happening on the host. And filestat returns things like the inode, the file type, and the worst thing is it's got access, modification, and change times. This is tricky. It's not that easy. If you want to keep it in WASI so that you can use it in your deterministic program, you have to do some crazy stuff here, like, I don't know, filtering out parts of the struct. I don't think sticking anything emulated inside, or random numbers, like zeros, is gonna work. So this will require some more time. The easiest thing would be to just disable it somehow, and we're gonna get to that in a minute. This one is one of my favorites: something as useful as listing the contents of a directory can give you a notion of non-determinism, because guess what, the order of entries is dependent on the host's file system. So nobody makes any guarantees how you're gonna get the results back. So this is also tricky, but it's also useful, right? Because it's better if I give you just, hey, this is the directory, just list the entries yourself, rather than me explicitly passing everything to you. That's not ideal. And actually, the list is quite long.
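To make the directory-listing problem concrete: a guest program can remove the dependence on host ordering by sorting the entry names itself. This is one possible workaround sketched with the Rust standard library, not something the talk proposes; the function name sorted_entries is hypothetical.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Lists the entry names of `dir` in sorted order, so the result does not
/// depend on whatever order the host file system happens to return.
pub fn sorted_entries(dir: &Path) -> io::Result<Vec<String>> {
    let mut names = Vec::new();
    for entry in fs::read_dir(dir)? {
        // Non-UTF-8 names are converted lossily; fine for a sketch.
        names.push(entry?.file_name().to_string_lossy().into_owned());
    }
    names.sort();
    Ok(names)
}
```

Sorting fixes only the ordering; other host-dependent data, such as timestamps in the file metadata, would still differ between providers.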
I was surprised how long that list got, because I started with a couple of items and then people joined in, and it turns out that there are a lot of things that can go wrong. So I encourage and invite you to actually join the discussion, have a look. It's pretty good. So the link is there. It's under WebAssembly/WASI, issue 190. Right. So can we actually make WASI deterministic as it is now? That is the question. The answer is, pretty much. It's missing a couple of things, but we can still deal with that, in a hacky way, but still. So the model that I wanna present today is very straightforward. You can think of it as delegated functions, I guess. Basically you export a function, let's call it compute, that takes in two WASI file descriptors. Now I'm gonna get into what the file descriptors and rights are in a minute. And basically what we require here is that the input file descriptor only allows you to read from it. So you can only read bytes, and out, you can only write bytes to it. Okay, and that's it. And because of the capability-based model, we can actually enforce that with ease nowadays in WASI. So now what is the WASI file descriptor? It's an index into a table where we store capabilities, or actually entries for the host resources. So the simplest one is zero, which doesn't have to but normally points to standard input, one to standard output, two to standard error, and then the rest is actually what you pre-open. So in WASI you call this pre-opening. So for instance, under WASI file descriptor 11, you're gonna store an entry, which has quite a few fields, but the most important ones are the actual OS handle, which is the host-specific handle to the resource. On Linux, that would be a file descriptor. On Windows, that would be a Windows file handle, for instance, so it can be anything. It can even be a socket and whatnot. And you also have base rights and inheriting rights.
Now base rights describe what you can actually do with the said file descriptor: whether you can read from it, whether you can write to it, whether you can ask for the filestat, et cetera, et cetera. And inheriting rights are a bit more tricky. They will be the base rights of any file descriptor that's being derived from this one. So this is mainly useful if you do things like path_open. So if, given a file descriptor to a directory, you want to open a file under that directory, right? So this one is gonna get the inheriting rights. So in our case, actually, it's not really important, but it's good to remember. Right, and then the WASI file descriptor rights. So as I said, this is a powerful concept. So if the file descriptor has got fd_read, you can only do two operations. I think actually three, you can also set rights. Anyway, so you can do mainly fd_read, right? But you cannot do, for example, fd_pread, which is the one with the offset, because you don't have seek or tell, which is pretty useful. And you can do fd_fdstat_get, but this is harmless because all it does is return the information about what rights it's got and what the file type is. So that's perfect, right? And you can actually do it nowadays. It's no magic, you can actually enforce that. And the same with write: you can invoke fd_write to write bytes to a file descriptor. And I wanna emphasize the fact that a file descriptor can be anything. It's an abstract concept. So it can point to any resource on the host. And again, fd_fdstat_get, right? But nothing else, and this is cool because this already weeds out all of the syscalls that use file descriptors and paths. However, have we just achieved determinism? Almost, because there's still this. And nowadays, unfortunately, this is implicit. So you get access to all of those. Call it ambient authority or whatever, right? So you can actually call that. And there's no nice, elegant way of forcing the runtime not to do this.
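The rights check described above can be modelled in a few lines. This is a toy model under assumptions, not the wasi-common implementation: the bit positions are arbitrary, the Entry and Syscall types are hypothetical, and the rule that fd_pread needs seek on top of read follows the talk's description.

```rust
// Illustrative right bits. The names follow the WASI rights mentioned in the
// talk; the bit positions are arbitrary and chosen for this sketch only.
pub const FD_READ: u64 = 1 << 0;
pub const FD_WRITE: u64 = 1 << 1;
pub const FD_SEEK: u64 = 1 << 2;

/// Toy version of an fd-table entry (the real one also holds the OS handle).
pub struct Entry {
    pub base: u64,
    pub inheriting: u64,
}

/// The handful of syscalls discussed in the talk.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Syscall {
    FdRead,
    FdPread,
    FdWrite,
    FdFdstatGet,
}

impl Syscall {
    /// Rights a descriptor must hold for the call to be allowed.
    fn required(self) -> u64 {
        match self {
            Syscall::FdRead => FD_READ,
            Syscall::FdPread => FD_READ | FD_SEEK,
            Syscall::FdWrite => FD_WRITE,
            // Only reports rights and file type, so it needs no right itself.
            Syscall::FdFdstatGet => 0,
        }
    }
}

impl Entry {
    /// True if the base rights cover everything the syscall requires.
    pub fn allows(&self, call: Syscall) -> bool {
        (self.base & call.required()) == call.required()
    }

    /// Simplification: a descriptor derived from this one (e.g. via
    /// path_open) starts with this entry's inheriting rights as its base.
    pub fn derive(&self) -> Entry {
        Entry { base: self.inheriting, inheriting: self.inheriting }
    }
}
```

An input descriptor with only FD_READ then allows fd_read and fd_fdstat_get but rejects fd_pread and fd_write, which is the one-way stream the compute model relies on.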
There are hacks to get around that implicit access. And I'm gonna show one in a minute. But the good news is that this will get sorted with the upcoming snapshot when it stabilizes. Do you guys have any questions until now? Yeah? Hi, Martin, by the way. Say again? If, okay, I'm not sure. Yeah? Sure, sure, but okay. But there is no automatic mechanism actually, wait, I see. I think, repeat the question for the recording. Oh, right. So I think what Martin is asking is why can we not just check, when importing the module, if it's importing the function random_get, right? Yeah, yeah, exactly. And this is my hack, but it's a hack, right? So no, no, that's a good question. But the nice and elegant way is actually if you can do that when you have the runtime spinning: you just say, import this or include this module, but nowadays you can't do this. And that's the elegant way. Next question, yes. So any more questions on this? Or can we move on to the examples? Are we good on time, by the way? Yes, Martin. Cool. So when I was trying to figure this out, I prepared a short set of examples, some of them very simple, one of them actually a bit more complicated, as a proof of concept to see whether this would actually work. And this is exactly what Martin was suggesting. I added that as well. So you can find them on my GitHub under kubkon/wasi-compute. And there is the description of how to compile it using current Rust, et cetera, et cetera. There are three examples. The first one is hello-compute, it's that easy. You read from an input file descriptor, you uppercase and then write out, just to test that it actually works. The second one is to verify that indeed the two FDs have the right rights. And the third one, a little bit more tricky, is actually taking some existing library, in this case, the Flite text-to-speech library or engine.
So what I did, I pre-compiled that from C to Wasm objects. And I included them for convenience in the repo, and then you can actually statically link that into this example. And it's gonna run Flite not using a binary as such, but it's actually gonna create this deterministic function compute that allows you to read in the text input and then write it out as a WAV file. And it works. I'm not sure I can, maybe I'll be able to demonstrate this. But anyway, so feel free to have fun with this, break it, play with it, right? So, since we've got some time and it's perfect, I can actually show you some of this stuff. Now I got rid of most of the things so that we can actually do it live here. Can you actually see this? Or is it too small? Is it okay? Or should I make it bigger? Okay. Bigger? Is it better? Yeah, cool. Right, so, yes, this is a bit tricky and there was some discussion about this on the WASI Discord lately. WASI still implicitly assumes that you're gonna define your main, so that the main entry point to WASI is _start. That is true because WASI is mainly based on the concept of the pre-opened file system, and you know that you actually pre-open the directory and then you insert some stuff into it. If you do it my way, this stuff is not gonna work, because essentially using libc here doesn't make any sense. You basically call the syscalls directly, so it's kind of lower level, and all we do here, we actually export compute and that's it. So that's why in Rust you need the no_mangle, because otherwise we get mangling, right? We actually need to be able to read it without mangling the name, so pub extern "C", and then we have in as the input WASI file descriptor and out as the output WASI file descriptor. And right, what Martin was saying, so I modified Wasmtime just a little bit so that I'm actually blacklisting random_get and all the syscalls that can cause non-determinism.
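The import check raised in the question from the audience, rejecting a module when it asks for a forbidden WASI function, might look roughly like this. It is a sketch under assumptions and not the modified Wasmtime from the talk: the import list is passed in as plain (module, name) pairs instead of being parsed from a Wasm binary, the function name first_denied_import is hypothetical, and the deny list contains only the entropy and clock syscalls as an example.

```rust
/// WASI functions treated as sources of non-determinism in this sketch.
/// Assumption: the talk does not list exactly which syscalls were blocked.
const DENIED: &[&str] = &["random_get", "clock_time_get", "clock_res_get"];

/// Scans a module's imports, given as (module, name) pairs, and returns the
/// first denied WASI import, if any.
pub fn first_denied_import<'a>(imports: &[(&'a str, &'a str)]) -> Option<&'a str> {
    for &(module, name) in imports {
        // WASI import modules are named e.g. "wasi_unstable" or
        // "wasi_snapshot_preview1"; imports from other modules are ignored.
        if module.starts_with("wasi_") && DENIED.contains(&name) {
            return Some(name);
        }
    }
    None
}
```

A runtime using such a check would refuse to instantiate the module when this returns Some, which has the same visible effect as the missing import described in the talk.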
So those syscalls are not there, and if you do it at home or after this and you want, just check it. Try to invoke one and it should panic. Basically it should trap because the import is not there. And the next thing is, there's also a manual for this. Well, you're gonna see how we're gonna run it. Basically, I added two flags to Wasmtime that allow you to assign a resource to a descriptor directly. Because currently the way it works is that you can pre-open directories, but what I did, I tweaked it a little bit so that you can pre-open files and actually reduce the rights to either read or write and that's it, okay? So that this actually works. Right, so as I said, the hello example was meant to basically open in, read from it, uppercase and then write to out. So very quickly, we're gonna create some temporary buffer here. Let's say a thousand. Obviously you would do it in a different, safer way, but we don't care about this here. Now, the thing you need to actually read with fd_read is an array of iovec structs in WASI. So this is iovec, and it requires a pointer to the buffer, in which case that's gonna be buf, and then we're gonna take a pointer out of this, and then also the length. Okay, so now we basically pack the buffer into a WASI struct that we can then pass into fd_read and read the contents into it. So then what we do, and this is unsafe for obvious reasons, we call the fd_read syscall, which basically takes the input file descriptor, in this case our in. Actually, if you try to write to it, it's gonna trap immediately because we don't have the fd_write right on the input file descriptor, which is exactly what we want. We just want a one-way stream, right? Then we pass the array of iovecs, in this case it's only one, and that's it. So this is gonna read the contents of whatever we point at with the input WASI file descriptor into the buffer.
Then what we do, let us actually do the upper case. So this is from_utf8, oh, and I should probably have that, yeah. You should probably do the proper error handling here, but I'm not gonna even bother with this. So from_utf8, and that is gonna be the slice we just read into, oops, the buf, up to however much we read, and then unwrap, and then we can do to_uppercase. Okay, so that should be to_uppercase. Then unfortunately we need to create a ciovec struct, which is used for actually writing out in WASI. It's pretty much exactly the same as the iovec except for the fact that the pointer is const, it's not mutable. Oh, and it's not, there's actually to_string. And we just call the syscall, which takes the ciovec struct, and that's it. So that's the first example, which reads in, uppercases it and writes out. Any questions on this, yeah? No, but again, if we were doing a library out of this, then yeah, it actually would even make sense to create something like libc. That would be the layer above the raw syscalls, like, I don't know, a stream in, stream out thing, but you know, I didn't have time to actually do this, so yeah, that'd be a great idea to actually have a From or TryFrom or whatever, right? So having this, let's try and compile this. Oh, I highly recommend the cargo wasi tool that was written by Alex Crichton, it's great. It actually saves a lot of hassle with passing in targets and stuff. You can even do testing. So I'm gonna actually use nightly because I know that nightly is up to date with the current snapshot for WASI Rust. I'm not sure whether stable caught up yet or not, but you can have a look. So this is gonna build it. Oh, it's gonna be all, yeah, okay. Let's go into hello-compute. All right, so it built it in target. We have the hello-compute Wasm, right? So now about invocation. So I'm gonna do this with tracing on so I can actually have a look at what the call sequence is for the syscalls as well.
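Putting the steps of the hello example together, a sketch could look like the block below. It is a reconstruction under stated assumptions, not the code from the repository: the WASI imports are declared by hand instead of coming from a crate, the import module name wasi_snapshot_preview1 and the guest module name are assumptions, and the WASI-specific part is gated on the wasm32 target so that the pure uppercase helper also compiles natively.

```rust
/// Pure part of the example: uppercase the bytes that were read.
/// Panics on invalid UTF-8, mirroring the unwrap() mentioned in the talk.
pub fn uppercase(input: &[u8]) -> String {
    std::str::from_utf8(input).unwrap().to_uppercase()
}

// Everything below only makes sense inside a WASI guest.
#[cfg(target_arch = "wasm32")]
pub mod guest {
    /// Scatter buffer for reads: mutable pointer plus length.
    #[repr(C)]
    struct Iovec {
        buf: *mut u8,
        buf_len: usize,
    }

    /// Gather buffer for writes: same layout, but the pointer is const.
    #[repr(C)]
    struct Ciovec {
        buf: *const u8,
        buf_len: usize,
    }

    // Hand-written declarations of the two syscalls; assumed to match the
    // runtime's WASI snapshot.
    #[link(wasm_import_module = "wasi_snapshot_preview1")]
    extern "C" {
        fn fd_read(fd: u32, iovs: *const Iovec, iovs_len: usize, nread: *mut usize) -> i32;
        fn fd_write(fd: u32, iovs: *const Ciovec, iovs_len: usize, nwritten: *mut usize) -> i32;
    }

    /// Exported entry point: read from `input`, uppercase, write to `output`.
    #[no_mangle]
    pub extern "C" fn compute(input: u32, output: u32) {
        let mut buf = [0u8; 1000]; // fixed-size buffer, as in the talk
        let iov = Iovec { buf: buf.as_mut_ptr(), buf_len: buf.len() };
        let mut nread = 0usize;
        let errno = unsafe { fd_read(input, &iov, 1, &mut nread) };
        assert_eq!(errno, 0, "fd_read failed");

        let upper = super::uppercase(&buf[..nread]);
        let ciov = Ciovec { buf: upper.as_ptr(), buf_len: upper.len() };
        let mut nwritten = 0usize;
        let errno = unsafe { fd_write(output, &ciov, 1, &mut nwritten) };
        assert_eq!(errno, 0, "fd_write failed");
    }
}
```

Built as a cdylib for a WASI target, the module exports compute and imports only fd_read and fd_write, so a runtime that withholds everything else has nothing further to refuse.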
So the tracing is useful for tracking what's going on. Sometimes it's not enough, sometimes you have to add some debugging because there's too much going on, but for starters, this is pretty good. So the flags I added are pre-open read, which is gonna take some input file in, which I haven't created yet, so this should actually shout at me, and this is gonna use that as the stream in, and then pre-open write, which is basically gonna save it to a file out, and that's it. And I'm gonna use the experimental argument invoke for Wasmtime, which basically allows you, instead of running _start of the Wasm module, to run any export, okay? I'm not saying there's absolutely nothing there. Oh, and because our function requires input file descriptors, we're gonna pass them as the last arguments. So, okay, that's pretty good. So we indeed haven't passed in anything, okay? So we have this, and when we run it, there is a lot of stuff happening here. The first bits are not really that important for us now. It's basically telling you that it's inserting certain pre-opens into the WASI context, which is basically the table that we were talking about. One of them is fd 11, and fd 12 should be somewhere here but I can't see it, maybe it's fd 10. Anyway, so there we have fd_read and it read 20 bytes, which should be about right, and then it should write out the same amount, okay? Even though the buffer was a thousand. So we can test that and yeah, there you go. So this works. This is kinda cool because you can play with it now, and if you tried to call something crazy, right, like actually random_get, it's mutable, right? Yeah, cool. So this should not work. Let's have a look. Right, yeah. So I guess this is what Martin was suggesting here. Basically the import random_get is not found in the module. So you can actually do that now, but it's not clean, right? You'd have to maintain your own runtime and I'd like to avoid this.
I'd like everything to be according to the spec. That'd be amazing, right? So the next thing I wanted to show very briefly, and I'm not gonna write that out myself, is the test example that you can find in the repo. Essentially, it doesn't do anything complicated. It basically calls fd_fdstat_get on the file descriptor and then we can actually compare the rights. So we're expecting the errno thing to be zero. For the input file descriptor, we want the WASI rights to be read only. So this is WASI rights fd_read, and for out it should be rights fd_write. So if we run this, yeah, it's fine, right? Everything was success, you can see it here and here, and then the output is zero. So that means no errors. So the rights are there and they're properly constrained. So for the most complicated example, there is slightly more code here. So this is actually taking the C library and adding Rust wrappers on top of it, and you can actually run Flite. I wonder if I can show you that it works. So I'm not gonna bother actually going through this code because there is quite a lot of it and it's very Flite specific. It's just trying to hook in the C library from Rust and basically just invoke it. That's it, there's absolutely no magic there. It still follows the same architecture: you read in, you do some processing and then you write out. So I wonder if I can show you guys, right? So now out should be a WAV file, and it is. I'm actually gonna move that to be easier. Right, I have, right, forgive me for this. We'll see if it works. So it actually works, right? So you can actually do, oops, sorry. You can actually do some more complicated stuff with this as well. Obviously this is just an experimental thing so it's meant to break. But I guess it's a good start and we can build from there. So going back to this, right. So any questions? Cause I think we've got like five minutes left or something. Thank you.