Cool. So, hi, I'm Francesc Campoy, and I'm the VP of Developer Relations at source{d}, and I'm going to be talking about a project called Babelfish. I'm going to start with why we need Babelfish, why it is a thing at all, then talk a little bit about how we solve the problem we have and the concepts we're working with, and then about how we actually implement it: the architecture, the languages we use, the technology, and things like that. There are going to be demos in between, and maybe the Wi-Fi works, so maybe they actually work. If you want to follow along with the slides, they are online already, so you can just use the link, or the QR code, if you want to use one.
So, why Babelfish? And this is the part where there's no way it's going to work, but wait. Okay, so the Babel fish is a fish that feeds on brain waves and then outputs things, and it allows you to understand anything said to you in any language. You can go and watch it yourself. This is from a movie, The Hitchhiker's Guide to the Galaxy, which is a great movie that I've watched a bunch of times, but after watching this, now I don't know why. But basically, the Babel fish is this whole concept of a universal translator. And what we're trying to do is pretty much the same thing: not really a translator, but a universal way of understanding a program in any language.
At source{d}, we do machine learning on source code. And machine learning on source code has the same challenges as any kind of machine learning, which are mainly: what is the data, and how do you get it? How do you gather it? So we have a whole team on that. There's a whole team at source{d} that has a bunch of projects, and all of them are open source.
And those projects are the ones that go and fetch source code from repositories and put it together in a format that is more compressed and more useful to learn from later. Then there's the problem of creating those datasets and analyzing them. We have another project, which we call the engine, that is specifically for that.
Now, the problem is that if you have source code, you can understand it in many different ways. One of them is as a sequence of bytes. That's what it is; that's what source code is. But if you understand it that way, you're limiting what you can actually learn. Has anyone done anything with neural networks or something like that? Okay. There's a very good example, which is recognizing handwritten digits. Suppose you try to recognize handwritten digits without anything that gives you the structure: there's an image that is a square, these are the pixels, and for each pixel you know which ones are up, right, down, and left of it. Yeah, those four. If you don't have that, and all you have is just a sequence of bytes, predicting correctly is actually way harder, right? So structure is important.
What is the structure in source code? The structure in source code is the structure of the language, right? If you write a program, well, you have a program, you have instructions, you have function definitions, all of those things. And those things are defined by the language grammar. A language grammar, as the name says, belongs to the language, and every single programming language has its own grammar. So what we're trying to do is take the result of extracting the structure from a program, by applying the language grammar, and use it as the input for a machine learning program. And that is the part where it gets interesting, because we don't want to handle just one single programming language.
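The contrast between the two views of source code described above can be sketched in a few lines of Python. This is only an illustration that uses Python's built-in `ast` module on a Python snippet; it is not how Babelfish itself works, and the names `SOURCE`, `as_bytes`, and `as_node_types` are made up for the example.

```python
import ast

# A tiny program to look at in two different ways.
SOURCE = "def add(a, b):\n    return a + b\n"


def as_bytes(source):
    """View 1: the program as a flat sequence of byte values."""
    return list(source.encode("utf-8"))


def as_node_types(source):
    """View 2: the program as a tree, obtained by applying the language grammar.

    Returns the type name of every node in Python's own AST.
    """
    tree = ast.parse(source)
    return [type(node).__name__ for node in ast.walk(tree)]


byte_view = as_bytes(SOURCE)
tree_view = as_node_types(SOURCE)

print(byte_view[:3])  # the bytes for "def": [100, 101, 102]
print(tree_view)      # node type names; includes FunctionDef, Return and BinOp
```

The byte view carries no notion of "function" or "return"; the tree view names those concepts explicitly, which is the kind of structure the talk argues a learning algorithm should receive.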
So, extracting that structure is actually kind of hard, and once you've done it, you have a different kind of data structure per language. ASTs differ across languages, and it's hard to learn from all of them together.
Why do we even try to do this? Well, there's a bunch of reasons, right? One of them is that we're trying to do machine learning, and with machine learning you're able to do cool things. A very, very interesting one was written by one of the engineers at source{d}, where he's learning from all of the identifiers in all of the source code that we have, and creating an embedding. What an embedding does, basically, is take a space with a huge number of dimensions down to fewer dimensions, so we can learn and understand things. It allows you to do cool things, like saying that the distance from boy to king is the same as from girl to queen. That's kind of interesting. And, you know, this one is also learned from source code: it understands that query is to database as tune is to settings, or that send is to receive as push is to pop. And this starts to be quite interesting, because we're actually extracting information. You can imagine, for instance, saying: given what everybody else does with send, here you should be doing pop, right? You could apply this same analogy over many different concepts that are somehow analogous. If you want to read this article, it's pretty interesting; the link is right there, on the source{d} blog. And I definitely recommend it. It's quite long, but it has pictures, so it's good.
Okay, but in order to generate an embedding of identifiers, first we need to extract the identifiers, right? And this is where things start to get problematic, because what is an identifier? And how do you extract it? You cannot just guess, right, and hope for the best. You can have "main" as an identifier; "function" could be an identifier in Go, but not in JavaScript.
"func" could be an identifier in JavaScript, but not in Go, because the keywords are different. So there are a lot of different concepts, right? So we created Babelfish, and what Babelfish is, is a self-hosted server for universal source code parsing, turning code files into universal abstract syntax trees. And now what I'm going to do is explain what that means.
So, universal... what is a universal abstract syntax tree? Well, before that, what is an abstract syntax tree? How many of you know what an abstract syntax tree is? I'd think so, given that you're into source code. But really quick: you have source code, you scan that source code, and you obtain tokens, just a sequence of tokens. And thanks to the language grammar, you're able to use the parser and extract the structure, and you get an abstract syntax tree. So for "three plus five", you get the tokens three, plus, five, and then you get the actual tree. What you want to do is learn from that structure. The problem is that every language gives you a different structure. So what we want to do is, somehow, by magic, move from an abstract syntax tree to a universal abstract syntax tree. The difference is this process. But what is the difference? What is a UAST? It is simply an abstract syntax tree; it's just normalized. Specifically, normalized and annotated.
So what we have is... this is green on gold, but I hope you can read it. Every node has a bunch of different fields. The first one is the internal type. The internal type is the kind of node in the language, and every single language has its own, a different way of naming them. For instance, in Go you may have a goroutine, but in Java you may have, I don't know, a class or something like that. Every single one of these is language dependent. Then you have properties.
You can have as many as you want; that's up to the driver. But the whole idea is that every parser should try to give as much information as possible. You may not use all of that information, but it's good that it's there for you. Then you have children, because it's a tree. The token is the actual text that we have. So, for instance, if you have a for loop, it will be "for" right there. Then the positions, and then the roles. And the roles are the interesting part here, because the roles are the point where we start annotating an abstract syntax tree, basically saying what each node is according to a common vocabulary across all languages.
So, there you go. There's a bunch of them, but you have function, and we have the documentation. A function is a sequence of instructions packaged as a unit. We're not saying what it looks like. We're not talking about keywords. We're not talking about structure. We're talking about the concept. And this concept applies to many common languages. If you're writing Haskell, or if you're writing JavaScript, you know what a function is. There are also things like number. We don't talk about float64, or int, or whatever. No, it's just a number. And you know what a number is, so you can apply it to whatever it is. Maybe you even know that a variable is a number, and you can apply this to that. And then we also have identifier. An identifier is any form of identifier: it's something that identifies things. That's it. So you can use it for many different concepts.
So, now that we have this, we're able to... and it works. Amazing. It's not amazing that it works; it's amazing that the Wi-Fi works. Just in case. So, now you're... no, the Wi-Fi doesn't work. Let me show you what it would look like if it worked. Let me try the other connection. Oh, it's a drama. No, that's not a screenshot. You can try it the other way. There's a second one.
Oh, yeah, that has never worked for me. Yeah, sometimes it does. Oh, okay. Whoa, whoa. Wait, wait, wait, wait. So, as you can see, at the end we end up having an abstract syntax tree on the right side. The abstract syntax tree has information. Okay, cool. You see it. This is what it would look like if it worked for me. So you have a bunch of nodes. And for every single one of those nodes you have the internal type, like a line comment or an import declaration. And this is specific to the language. But you also have roles. So you can see that this one is a declaration and an import, when you're importing a package.
Now, once you have this, the interesting thing is that we have a common format for all languages, and we can query on it. So, how do we query trees? Well, we didn't invent trees. XML is a tree, and there's a query language for it, which is XPath. So what we did is implement an XPath filter on top of UAST, and we call it the UAST query. The way we map it is: for every node, the internal type is the XML element name, and the rest are basically attributes of different kinds. Once you have this, you're able to query things. So you can say, well, this is the internal type, so the query selects the nodes that happen to have the internal type for a package. You could also narrow it down to, say, just the package name. Or you can ask across all the languages, and there what you're actually checking is the roles. So you're saying role import and role declaration, and you can also get all the arguments with the function and argument roles. So now, with a single line, you're able to extract information from... we have terabytes of data from source code, and you're able to run that query on it. And I would demo it, but it doesn't work.
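The mapping described above (internal type as the XML element name, everything else as attributes) can be sketched in Python with the standard library's limited XPath support. This is a minimal sketch under stated assumptions, not the actual Babelfish implementation: the `Node` class, the `to_xml` and `query` helpers, the `role<Name>` attribute naming, and the two small example trees are all hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical, simplified stand-in for a UAST node (not the real type)."""
    internal_type: str                            # language-specific node kind
    token: str = ""                               # the actual source text, if any
    roles: list = field(default_factory=list)     # language-independent annotations
    children: list = field(default_factory=list)


def to_xml(node):
    """Map a node to XML: internal type -> element name, the rest -> attributes."""
    # Assumption for this sketch: each role becomes an attribute named role<Name>.
    attrs = {"role" + role: "true" for role in node.roles}
    if node.token:
        attrs["token"] = node.token
    elem = ET.Element(node.internal_type, attrs)
    for child in node.children:
        elem.append(to_xml(child))
    return elem


def query(root, xpath):
    """Return the internal types of the nodes below the root that match the query."""
    return [elem.tag for elem in to_xml(root).findall(xpath)]


# Roughly `package main; import "fmt"`, using Go-style internal type names.
go_tree = Node("File", children=[
    Node("Ident", token="main", roles=["Identifier"]),
    Node("ImportSpec", roles=["Import", "Declaration"], children=[
        Node("BasicLit", token='"fmt"', roles=["Literal", "String"]),
    ]),
])

# Roughly `import os`, using Python-style internal type names.
py_tree = Node("Module", children=[
    Node("Import", roles=["Import", "Declaration"], children=[
        Node("alias", token="os", roles=["Identifier"]),
    ]),
])

# Querying by internal type only works for the language that uses that name.
print(query(go_tree, ".//ImportSpec"))  # ['ImportSpec']
print(query(py_tree, ".//ImportSpec"))  # []

# Querying by roles works across both trees with the same single line.
ROLE_QUERY = ".//*[@roleImport][@roleDeclaration]"
print(query(go_tree, ROLE_QUERY))       # ['ImportSpec']
print(query(py_tree, ROLE_QUERY))       # ['Import']
```

`xml.etree.ElementTree` supports only a subset of XPath, so the sketch chains two attribute predicates instead of combining them with `and`.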
But basically, here you can type the UAST query, you can actually query something, and you get only the nodes that match the query.
I'm almost out of time, so, the architecture. This is the architecture. We have a client, we have a server, and then the server has drivers. We have one driver per programming language. The interesting thing is, I used to work on the Go team at Google, so I write Go. So, the drivers: a driver has a normalizer, which we write in Go, and a parser written in the language that you're parsing. And this part is very important, because we don't want to write parsers, right? Every programming language has a compiler, and very often that compiler is self-hosted. Which means that very often we have a Python parser written in Python, a Java parser written in Java. We want to use them. So what we have is a process written in the native language that creates an AST, which is then normalized and annotated into a UAST. And then we have an SDK for the normalizer, annotating every single thing for the language. All of that is packaged as a container, because that allows the server to download drivers in a very easy way. Every driver is a different one. And then we use gRPC.
So, you can contribute; everything is open source. We have a bunch of references. doc.bblf.sh is where you're going to find the documentation for the project. We have sourced.tech, where you can find all the other projects that surround it, and all of them are also open source. And then we're also on GitHub and on Twitter. And again, if you want a picture of that, just go to the slides or watch the video. And we're out of time. Thank you.