Today we have Harsh Mishra with us. He will be presenting a talk on building polyglot applications using MetaCall Core. Over to you, Harsh. Thanks, sir. Hello, everyone. My name is Harsh, and I will be presenting a talk at PyCon India on building polyglot applications using the MetaCall Core library. Before I jump into my talk, I would like to extend my heartfelt gratitude to the organizing and volunteer teams at PyCon for providing this great opportunity and helping me put in a word about MetaCall before a general audience. A little bit about me: I was a Google Summer of Code 2021 student developer at MetaCall, whose core library we will be talking about today. Over the past few months, I have been tinkering with the core library to build an IPython kernel for cross-language function calls. Currently, I'm working at Quansight Labs as a software developer intern on the DevOps and infrastructure side of the PyData projects. The agenda for today's talk is simple and flexible. First, we will get to know polyglot development and the use cases that make it important. Next, we will talk a bit about MetaCall Core, a library that offers one of the best polyglot programming experiences, and we will look at a few things under the hood. Finally, we will walk through the implementation of a machine-learning-based news scraper using Python and Node.js. If you have any questions, you can simply put them in the chat, and I will take all of them at the end of the session. So let's get started. Before we get into polyglot development, let us talk a bit about the history of how we arrived at this polyglot development library. A few years back, a couple of engineers were working on the development of a game engine, as simple as that.
One engineer was developing it entirely in the C programming language, while the other was working on the interfaces and the graphics. After some initial releases, they hit a blocker in the development process. First, they wanted more people to join their team, but not many people had the expertise needed to work with them. The developers also wanted to use different programming languages, like C# or maybe Python, in the same code base, which was not technically possible before. And migrating the code base from its original C to some other language was hellishly expensive. It was at this time that they thought they could possibly use scripting to make this approach work. With this use case, we discovered that there are a lot of problems that can be solved. First, engineers face a lot of problems in migrating their code base. If someone is using PHP and wishes to migrate to, let's say, Node.js or maybe Golang, that carries its own set of problems, especially when you are developing in a completely different environment. Another problem we saw was that we cannot use libraries available in one language from another. This lack of interoperability restricted our choices to the few libraries available and supported in one language, and we simply didn't have the freedom to choose the right tools for our purpose. On top of that, we wanted to embed low-level scripts in high-level environments. To take an example, we have libseccomp in C, which is used for sandboxing on Linux and can be used from Python, but it would be great to have something like this in Node.js as well. Some of these problems were particularly concerning, and the engineers wanted to develop a solution for them in an easy and flexible manner. This brings us to the most important part of our talk: what is polyglot programming?
Polyglot programming, or simply polyglot development, is the practice of using multiple programming languages in the same code base. This means that developers with different areas of expertise can collaborate with each other without much ado. They don't have to restrict themselves to the choices available in one language; they can simply pick the right tool, library, or framework for their purpose through a completely language-agnostic interface. With polyglot development, you can combine two or more programming languages; we will see an example of that today. We can also address specific types of challenges: maybe you want to use strong types, or maybe you want a fast interpreter with an interactive environment. All of this is possible with polyglot development. Now, this might sound good in theory, but it is not quite that good in practice. To make this happen, we need to create a connector between these languages, and we need a protocol to send information between them. People have already been doing this with existing development kits like GraalVM, but that involves a lot of boilerplate code. With polyglot development, we wanted development teams to build and scale without much friction. They should be able to embed code between high-level and low-level environments without any performance issues. And all of this brings us to the next item on the agenda, which is MetaCall Core. As we were saying about polyglot development, MetaCall allows us to make transparent cross-language function calls without any boilerplate. If I show an example of this, it can be something as simple as this.
You have a function in Python, and you can simply import it inside your JavaScript code, or maybe your Rust code, and execute it using the MetaCall runtime itself. As it says here, MetaCall is an open-source polyglot runtime. Using MetaCall, you can call functions, modules, methods, and more between multiple programming languages. Right now, MetaCall supports a lot of languages, like C, C++, JavaScript, and Ruby, just to name a few. Recently, we have added support for Java and WebAssembly as well, which is now integrated into our runtime. One thing we need to understand is that MetaCall is not just a library. It is a runtime in itself. MetaCall maintains its own versions of package managers like npm and pip so that it can take care of the packages needed for interoperability. It is also available through a command-line interface. You can easily install MetaCall and use it through the standard CLI that it provides, which I will be talking about a bit later. And it is also end-to-end interoperable. This simply means that we can use Python with Ruby or vice versa, or maybe Node.js with C# or vice versa. With the help of the CLI, we can inspect our code using MetaCall, and we can load functions and methods in a pretty easy manner. On top of this multi-language interpreter, MetaCall has also developed a function as a service, but we will discuss this a bit later or maybe after the session. So, let's check out a very simple demonstration of how exactly MetaCall works. Just like the demonstration with Ruby that I showed here, this is pretty much the same. We have a function in Python. We have a simple Node script right here which imports this particular file, and we can simply use it with the help of MetaCall. Let's check this out through a live example.
So, we have a simple Python file here, app.py, with four functions defined in it: sum, sub, multiply, and div. If you're using the standard Python interpreter, you can easily execute this code, but how can we make it work using the MetaCall CLI? As I said before, MetaCall offers its own command-line interface, and if you just type metacall, it will start the MetaCall CLI for us. The MetaCall CLI offers a number of high-level commands that we can use for our purpose. Let's have a look at them one by one. First, we can load a script from a file into MetaCall. If your script contains some functions, you can easily load them into the MetaCall object protocol, which is the standard protocol defined by MetaCall to connect different programming languages. Second, we can check all the runtimes, modules, and functions that are loaded into MetaCall with the inspect command. Then we can evaluate a code snippet in a pretty easy manner. We can also call a function that has previously been loaded into MetaCall. We can also use async/await through MetaCall, and finally we can delete a script that has been loaded into MetaCall. So, let us have a look. Our functions are in this Python file called app.py, so let us try to load it right here. As you can see, the script has now been loaded correctly. We can type the inspect command, and now we can see that the MetaCall host is a runtime in itself; this is pretty much the default one. Additionally, we have the Python runtime that has been loaded into MetaCall, and along with it we have all these functions right here: div, sum, sub, and multiply. You can use all of these functions through the standard CLI itself. So, I can say call sum 2, 3, and this will simply give me the value 5.
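The app.py file is shown on screen but not captured in the transcript. A minimal sketch of what it might contain is given below; the function names follow the talk (with div taken from the inspect output), while the bodies and parameter names are assumptions made for illustration.

```python
# app.py: hypothetical reconstruction of the four demo functions.
# Only the function names come from the talk; the bodies are assumed.


def sum(a, b):
    # Note: this shadows Python's built-in sum() inside this module.
    return a + b


def sub(a, b):
    return a - b


def multiply(a, b):
    return a * b


def div(a, b):
    # True division; raises ZeroDivisionError when b is 0.
    return a / b
```

With these definitions, the CLI calls described in the demo correspond to sum(2, 3) and sub(5, 3) on the Python side.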
I can say call, let us say, sub, and we can pass in 5, 3, and this will give us the value 2. With the help of the MetaCall protocol, we can easily export all of the functions that are loaded into it to some other code. Let us check how we can do that. Additionally, we have main.js, a simple Node script right here, which imports all of these functions from the app.py file, and we compute a sum, a subtraction, a multiplication, and a division of two integers here. If I say metacall main.js, it will automatically load the script for me and execute everything right here. There is no need to port the whole logic of this code to the main.js file. Admittedly, anyone could just write all of these functions in JavaScript and make it work, which is why I have a slightly more complex example that I will demonstrate later on, but this conveys exactly how the inter-language function call happens. Let us jump back to the slides. Now that we have reached this point, we know what MetaCall is and what exactly it does. So, let us check how exactly we can install it. MetaCall provides an install infrastructure so that we can completely do away with complex installations using CMake and all of that. You can simply use this shell script to download MetaCall on your local machine. The install script will download a pre-compiled tarball, and you can get started with the CLI and the runtime itself. As a note, this install script only works on Linux-based distributions right now. So, if you wish to use it on something like, let's say, macOS or Windows, you can either build it yourself using CMake.
There is already install documentation right here, which documents all of the various ways in which we can install MetaCall. Or else you can simply pull it from Docker Hub, which will provide you a handy command-line interface and runtime on a Debian-based distro. Once you download MetaCall, you will have access to a runtime which you can use to run these scripts. You will get the standard CLI that I showed you a few minutes back, which is a very handy way of debugging these MetaCall scripts and inspecting the loaded modules. So, let us talk a bit about the architecture of MetaCall and how exactly it is able to make the languages work with each other. Before MetaCall, we had a lot of examples of interoperability being implemented. For example, there was the Component Object Model, which allowed us to implement a library in C# and use it in Visual Basic. We also have LLVM, which provides the compiler backend for languages like Rust and Julia, and all of this is made possible using an intermediate assembly format. We have here the structure of MetaCall, which makes use of a standard C API to make all of this happen. It simply means that we are using the MetaCall core library to embed different runtimes into C. We have ports and we have loaders, which I'm going to explain a bit later on. With the help of the MetaCall C API, you can do away with the standard Python C API or the Node.js N-API if you want to create a wrapper around some low-level code. If you're trying to call these low-level libraries from high-level languages, you don't have to create a C or C++ wrapper on top of them again. You can simply use MetaCall to make this happen. The first step to understanding the MetaCall architecture is understanding the ports. As I mentioned before, MetaCall offers a standard C API that integrates with different languages and runtimes.
So, using the ports, we offer the MetaCall API in different programming languages. The standard APIs are metacall_load_from_file and metacall_load_from_memory, which allow us to load different code or different files into MetaCall. If you remember the example I showed you before, we were using the standard Python and Node.js ports to make the inter-language function calls happen. With the help of these ports, we are also able to extend the MetaCall functionality and make it possible for developers to use MetaCall without having to write any C or C++ code. One of the most interesting parts of the MetaCall core API is how we use monkey patching with dynamic languages like Python at runtime. It can be thought of as a hack, but it is what makes it possible for us to use Python scripts in the MetaCall runtime. The MetaCall object protocol is the actual core, basically the heart of this project. This protocol allows us to tightly integrate different languages with each other. For our convenience, we have noted down how we represent native and complex values through the MetaCall protocol. We can see that a standard 64-bit number in the JavaScript object protocol is represented as a 64-bit floating-point value, as we would do in the C programming language, and this is further translated into a float in the Python object protocol. We do the same with complex values, like here, where an object in the JavaScript object protocol is translated into a map in the MetaCall object protocol and finally into a dictionary in the Python object protocol. So, through our MetaCall core library, we are abstracting all of these through the MetaCall object protocol, so that the representations are easily carried over and we don't have to worry much about them.
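The value mapping described above (JavaScript number to 64-bit double to Python float, and JavaScript object to map to Python dict) can be illustrated with a small sketch. This is not MetaCall's implementation or API: the tag names and the helper functions to_intermediate and to_python are hypothetical, and JSON decoding stands in for values arriving from JavaScript.

```python
import json


def to_intermediate(value):
    """Tag a JSON-decoded value the way a language-neutral layer might.

    The tag names are illustrative only, not MetaCall's type identifiers.
    """
    if value is None:
        return ("null", None)
    if isinstance(value, bool):
        return ("bool", value)
    if isinstance(value, (int, float)):
        # JavaScript numbers are 64-bit floats, so every number becomes a double.
        return ("double", float(value))
    if isinstance(value, str):
        return ("string", value)
    if isinstance(value, list):
        return ("array", [to_intermediate(v) for v in value])
    if isinstance(value, dict):
        return ("map", {k: to_intermediate(v) for k, v in value.items()})
    raise TypeError(f"unsupported value: {value!r}")


def to_python(tagged):
    """Translate the tagged intermediate value into native Python objects."""
    tag, payload = tagged
    if tag == "array":
        return [to_python(v) for v in payload]
    if tag == "map":
        return {k: to_python(v) for k, v in payload.items()}
    return payload


# A JSON document stands in for an object coming from JavaScript.
js_value = json.loads('{"price": 42, "tags": ["a", "b"]}')
tagged = to_intermediate(js_value)
py_value = to_python(tagged)
```

The point of the intermediate layer is that each language only needs a translation to and from the shared representation, rather than one translation per pair of languages.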
And last but not least, we have the loaders. These loaders can be thought of as the back end, while the ports act as the front end for the core library. So, just as the Python and Node.js ports were the front end, these loaders can be thought of as the back end for the same. The loaders are responsible for wrapping the languages and embedding their runtimes in MetaCall itself. MetaCall offers a plugin-based architecture, so the loaders carry the responsibility of embedding the languages into MetaCall. You can also implement your own loader using the MetaCall specification if you want, and it will be lazily loaded during execution. This makes MetaCall inherently lightweight compared to the alternatives, and quite easy to use as well. When we talk about loaders, we have to follow some basic steps: init, load, discover, clear, and destroy. In the init phase, we initialize the runtime, and the load step loads all of the handles that are available to us. In the discover phase, we inspect the functions and populate them inside the MetaCall protocol. Finally, we have the clear phase, where we remove these handles, and the destroy phase, which closes the event loop and frees up the resources for us. MetaCall Core is still in the beta phase as of now, and we will soon be releasing version 0.5. We did some benchmarking and testing to find out how viable MetaCall is for production use cases. For this purpose, we used an Intel Core i5 with a VirtualBox VM, and we used Google Test and Google Benchmark. All of the results that you see right here were obtained on a Debian-based distribution.
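To make the loader lifecycle described above more concrete (init, load, discover, clear, destroy), here is a toy sketch of a loader for Python files. The class ToyPythonLoader and its method signatures are hypothetical and are not taken from the MetaCall specification; the sketch only mirrors the five phases named in the talk, using the standard importlib and inspect modules.

```python
import importlib.util
import inspect
import os
import tempfile


class ToyPythonLoader:
    """Hypothetical loader with the five phases named in the talk."""

    def __init__(self):
        self.initialized = False
        self.handles = {}    # script path -> loaded module (the "handle")
        self.functions = {}  # function name -> callable

    def init(self):
        # A real loader would start the embedded language runtime here.
        self.initialized = True

    def load(self, path):
        # Load a script from a file and keep a handle to it.
        name = os.path.splitext(os.path.basename(path))[0]
        spec = importlib.util.spec_from_file_location(name, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        self.handles[path] = module

    def discover(self, path):
        # Inspect the handle and register every function it defines.
        found = inspect.getmembers(self.handles[path], inspect.isfunction)
        for name, fn in found:
            self.functions[name] = fn
        return sorted(name for name, _ in found)

    def clear(self, path):
        # Remove the handle and the functions that came from it.
        module = self.handles.pop(path)
        for name, _ in inspect.getmembers(module, inspect.isfunction):
            self.functions.pop(name, None)

    def destroy(self):
        # A real loader would shut down the runtime and free resources here.
        self.handles.clear()
        self.functions.clear()
        self.initialized = False


# Walk through the whole lifecycle with a throwaway script.
with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "demo_script.py")
    with open(script, "w") as f:
        f.write("def add(a, b):\n    return a + b\n")

    loader = ToyPythonLoader()
    loader.init()
    loader.load(script)
    discovered = loader.discover(script)
    result = loader.functions["add"](2, 3)
    loader.clear(script)
    remaining = list(loader.functions)
    loader.destroy()
```

Keeping each language behind the same five operations is what lets a plugin be loaded only when a script in that language is actually requested.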
Pitting it against the hard-coded Python C API, we discovered that MetaCall's foreign function interface allows us to perform more than 1 million calls per second, which is still slower than the standard Python C API, which performs nearly 1.7 million calls per second. This leaves us scope for improvement and optimization, which is on the way before a general release. With this, we jump into the last agenda item of today's talk, where we will navigate through a polyglot machine learning application. This was a small example that I built when I was introduced to the MetaCall community, and this is what we are going to use today to see how we can do polyglot development using MetaCall. The use case is pretty simple. We have a Python script which loads some machine learning models, and we are trying to call some of its functions inside a Node.js application without having to implement any of the modeling in JavaScript itself. Let us see how MetaCall helps us here. We have the following script right here, and as you can see, it picks up a news URL link and tries to find similar news articles from all over the web. For this purpose, it uses cosine similarity and a few machine learning models that have been pre-trained for our purpose. Starting from the top, we have the extractor function, which gets the article body from the URL. We have the text data extractor, which extracts the text. We have this Google search function, which performs a Google search with the title specified in the URL. We are using the newspaper package to make all of this happen. Then we have the similarity function, which checks the similarity of the news articles using cosine similarity, and finally we have handle link, where we do all of the classification using the pre-trained machine learning models that we have right here.
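The talk does not show how the similarity function computes its score. As a rough illustration of cosine similarity between two articles, the sketch below compares plain word-count vectors using only the standard library. The helper names tokenize and cosine_similarity are assumptions, and the actual script may well use a different vectorization (for example TF-IDF from a library).

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase the text and split it into alphanumeric word tokens.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the word-count vectors of two texts."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    # Only words present in both texts contribute to the dot product.
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty text is treated as similar to nothing.
    return dot / (norm_a * norm_b)
```

A score close to 1 means the two texts use nearly the same words in the same proportions, and a score of 0 means they share no words at all.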
The last function is our driver code, which takes a news URL link and tries to find similar news articles from all over the web, along with a similarity score computed using cosine similarity. And we have this very simple Node script right here; even those who don't know anything about JavaScript can read this code. We are using the readline package to take input from the console prompt, and we are importing this particular file right here, the machine learning news scraper, into our standard code base and making use of it. If we launch the MetaCall CLI here, we can load the file itself, the machine learning news scraper, and once we do an inspect, we can see that all of the functions present in our Python script are actually loaded into our runtime and we can use them. What we need to do right now is make this script work. So, just like I said before, let us try an interesting experiment. Let us try to say node app.js and make it run. As soon as we do that, we see that Node gives us an error. It says that we cannot use an import statement outside a module. It simply means that the standard Node runtime is not able to import this Python function right here. This is where we understand the point that MetaCall is not just a library. We are not using MetaCall as a library anywhere right here. What we are using instead is the MetaCall runtime to make this work. So if I run metacall app.js instead, I can see that the script is working now. Let us try it out. We have a few links here which I saved beforehand, so let us pass one to our script. It will automatically start the scraping process through the Python script that we have right here and load the result inside our Node script, and pretty much everything will be logged to the console right here.
That will take a couple of seconds, and yes, we have all of these similar news articles that have been found using the standard machine learning models. The models are not yet perfect, but they do the trick. So this is exactly how you can use MetaCall to start developing your own polyglot application. MetaCall can be used for a lot of use cases, some of which can be viewed right here. Even though MetaCall Core is still in a beta phase as of now, it is used across a number of projects. For example, we have Acid Cam, which is used to distort videos using OpenCV in order to create art, and it uses MetaCall to embed Python scripts into its plugin-based architecture. You can also check out the sub-projects of Acid Cam on GitHub, such as the OpenGL-based version, a command-line tool, and an Android application. The second one is Pragma, which is a language for building GraphQL APIs; under the hood, it makes use of Scala and MetaCall Core to import and execute functions in multiple languages. Lastly, we have a polyglot kernel, which uses the IPython library to wrap the MetaCall core library and exposes a Jupyter notebook interface where we can load and execute multiple programming languages. You can check out all of these projects in their GitHub repositories and play around with them to understand the further use cases of MetaCall. We also have a few more community examples in our examples repo, and you can check out some of the other things that you can possibly build through these examples. So this brings us to the end of this session, and I would be more than happy to take any questions now. Thank you, Harsh. It was a wonderful talk. I remember teaching C#.
As per my knowledge, it was the first polyglot programming language, which allowed us to integrate several programming languages. Yeah. So we have some questions. Let me read out the questions for you. Sure. The first question is: what is the performance implication, or what kind of overhead does the MetaCall runtime have? Just like I said in one of the previous slides, the standard MetaCall Core still falls behind in performance compared to the standard Python C API, but there is an interesting catch here. MetaCall also provides its own function as a service, a kind of enterprise layer built on top of the core library to develop and deploy serverless functions, and you can check out all of these benchmarks in the FAQ that we have right here; some of them are among the most wonderful benchmarks that you can see. MetaCall plus Node.js is almost 1.3 to 1.4 times faster than Express on a standard HTTP server, and MetaCall plus Python is almost 17 to 30 times faster than Flask compared to a standard Python HTTP server. This FaaS offering is completely on an enterprise model. It is not completely open source, though there is a smaller version being developed for open-source contributors. But comparing the standard foreign function interface with the runtime call performed by the Python C API, MetaCall simply falls behind a bit, which is exactly what is being improved right now. I hope that answers the question. Thank you. So let's go to the next question. Just curious what the stack trace looks like when an error happens, in your example, divide by 0. Okay. Let's jump back to the calculation demo. Even I'm slightly curious about it. It looks something like this, and I had a hell of a time debugging this when I was working on my GSoC project. Basically, the Node.js loader is sending out an error that it cannot convert the value to N-API.
So basically what we are doing is trying to use the Python API and connect it with N-API, which is simply not happening. So this load from file part is simply failing. The next question is: how do things that are specific to a language work, for example passing pointers? I'm afraid I don't have any idea about this. I have never worked on calling a Python function inside C/C++ code, even though we have a few examples of the same which you can check out. Okay. So we have another question. How can we deploy and use it in production? For deployment and production use, we are right now developing a tool which is called deploy. It is still in development right now, and using it we can simply connect our application to the standard MetaCall FaaS. The MetaCall FaaS, just like I explained before, is a function as a service that we are using to develop and deploy MetaCall applications. So you don't exactly have to use any other script to make it work. You can simply drag and drop your code as a zip file and use it to deploy your code automatically. Otherwise, MetaCall also works very well with CI/CD toolchains and all of that. So it would simply work. Okay. Thank you. Let me check if we have any other questions. Okay. I am not seeing any other questions.