For those who haven't heard of Varnish, a quick introduction. Varnish is an HTTP reverse proxy, a kind of web caching daemon. If you compare it with other solutions on the market, like NGINX or HAProxy, they are very similar products but with different objectives. Varnish is focused only on caching. NGINX is more of an application server, where you can add a lot of modules and plug-ins, and HAProxy just does routing and load balancing. We can do both, but we are really focused on the caching side.

Here is the event website. We will also exhibit at broadcast events, so if you want, you can join us there as well.

The reason the original developer built Varnish is that slow websites suck. Go back ten or twenty years: you are sitting at a Windows machine, the browser will not load, and the page just crawls along. So he built something to accelerate websites, and that is still what we do: software for web acceleration and content delivery. Nowadays we focus a lot on CDN development, OTT streaming, web APIs, e-commerce and online shopping; social and video services like TikTok are also our targets.

Here is our commercial partner base. Most of them used the open source version at the very beginning and later moved to the enterprise edition because they needed more support and more features, but they usually started with the open source project first. If you look at your own projects and hit scaling problems, say the database or the application itself is hard to scale, look at the caching layer first, because the caching layer is much easier to scale; for us it is just a replicated instance.

We serve different sectors. In streaming, Hulu and Netflix use our solutions. In manufacturing, Tesla uses our solution internally as a web API accelerator to speed up their whole workflow: they have factories all over the world fetching blueprints, designs and so on, and they use Varnish to scale up and speed up that system.

So what is Varnish? Varnish simply sits between the client and the backend. The client can be your browser or your phone, but it can also be another system. In today's world of Kubernetes, Docker and other cloud platforms you have microservices, so the client can be another microservice talking to yet another backend. In between, Varnish caches all these HTTP requests and responds to them, and we can run some logic in the cache to speed up or aggregate the responses. That is where web API acceleration and edge computing come in.

Some figures to share: just two months ago we reached about 1.4 terabits per second from a single server. So if you are doing OTT streaming or video broadcasting, we are very good in that area, and we save a lot of power per gigabit per second delivered, pushing more than one gigabit per second per watt. If you care about energy savings and carbon footprint, that matters a lot for a company's future.

Now some fairly generic technical background on Varnish and HTTP. If you have no caching in between, the client simply comes in, gets the content from the origin, and goes back.
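As a minimal sketch of what sitting in between looks like in practice, a VCL configuration really only needs to know where the origin is; the hostname and port below are placeholders, not values from the talk.

```vcl
vcl 4.1;

# Placeholder origin; point this at your own application server.
backend origin {
    .host = "app.example.internal";
    .port = "8080";
}

sub vcl_recv {
    # Send cache misses to the origin; cacheable responses are stored
    # and served to later clients without touching the backend again.
    set req.backend_hint = origin;
}
```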
In between you may already have some HTTP daemon or an in-memory cache, such as memcached, inside your application; all of these components can speed things up a little. But as more users come in from the internet or from other places, the system will still fail, because it is really difficult to scale at that level.

What you can do instead is put Varnish Cache between the client and your origin or application. Varnish Cache then does two things. The first is queuing up all the requests: if we do not find the object in the cache, in memory or on disk, we send only one request back to your backend to fetch the content. So from that point of view, if five requests arrive right now, the origin only receives one, and we respond to all the clients from that single fetch. The latency added here is minimal: we just hold the requests until we get the first byte of the response from the origin and then go back to the clients. No matter whether it is a web API or OTT streaming, where latency is critical for live content, we add at most around 10 milliseconds in between.

The second thing is scaling: just add more instances and you scale your application without touching the origin. The origin stays more or less the same. And once we have cached the content, we do not even go back to the origin; the origin is completely isolated from the client. That adds an extra tier of security in front of the application, because the client never reaches the backend directly.

Now the core components, the more technical side of how Varnish works. Varnish divides the system into several components: the core engine, the cache API, and the thread pools; we have a multiple-thread-pool architecture. On top of that sits VCL, the Varnish Configuration Language, which configures how the system behaves, and the VMODs, the modules or plug-ins that add extra features. You can write these yourself, open source, and there are also enterprise VMODs that you just plug in to get more features for manipulating requests.

The bottom two parts are the storage engines. First we use jemalloc to allocate memory and a linked list to cache the content; in the enterprise version we also have MSE, the Massive Storage Engine, which combines memory and disk to deliver content and use your capacity more efficiently. Then on the right-hand side is the VSM, the shared memory segment, which lets you pull logging and system status out of the core without affecting the incoming requests.

One thing worth talking about is the logic in the core: it is a finite state machine. The state machine lets Varnish manipulate the request at different states. Once a request arrives you can manipulate it, decide how to respond, and also manipulate the response coming back from the backend. In each of these states you can make very specific modifications and decisions, which lets us control delivery in much more detail. Here is the high-level view of what the state machine looks like.
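As a rough companion to that diagram, each box in the state machine maps onto a VCL subroutine you can hook into. A minimal sketch, with placeholder backend and TTL values:

```vcl
vcl 4.1;

backend default { .host = "origin.example.internal"; .port = "8080"; }

# Each subroutine corresponds to one of the boxes in the state machine.

sub vcl_recv {
    # Request just arrived from the client: decide whether to look it up
    # in the cache or pass it straight through to the backend.
    if (req.method != "GET" && req.method != "HEAD") {
        return (pass);
    }
    return (hash);
}

sub vcl_backend_response {
    # Response just arrived from the origin: decide how long to cache it.
    set beresp.ttl = 2m;
}

sub vcl_deliver {
    # Response about to leave for the client: last chance to adjust headers.
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
    } else {
        set resp.http.X-Cache = "MISS";
    }
}
```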
When a request comes in, we start by hashing it to look it up in the cache. If we find it, that is a hit and we go on to deliver. If we do not find the cached content, we go to miss and fetch it from the backend. In each state, each square box, you can put your own logic to decide how you want to manipulate the request. Most of the time we manipulate the backend fetch: multiple backends, redundancy, load balancing, all that logic is done at this level. If you want to manipulate the response, you usually do it in vcl_deliver.

That brings us to VCL and VMODs. As I said, VCL lets you manipulate the incoming requests and decide how and where to cache the response. You can of course modify the cached objects and their lifetime, decide how to respond, or delete objects. Internally there are different objects for manipulating the transaction: the request, the response, the backend request, the backend response, and some generic objects. All of them are accessible through different APIs, and you can extract the counters and use Prometheus or other modules to display all these statistics graphically.

A bit more about the backend fetch. We use it to do round robin across backends for redundancy, and request routing or sharding with consistent hashing of backend requests. All these backends can have health checks, so we make sure a backend is healthy before sending it requests and keep your service up and online all the time.

Then the backend response. Whether the backend gives a positive result, say a 200, or an error code like a 503 or 404, you can specify how to handle it. When you get the response you set the TTL, which controls how the object is saved in the cache and for how long. At that level Varnish uses three different values: the TTL, the object's regular lifetime; the grace period; and keep. Grace and keep are the bit of magic that lets the cache hold an object a little longer than its TTL, so if your backend has a problem you still have the option of serving the content from cache even though it has expired. There is a lot of other logic, and many metrics, you can play with in VCL around how the cache is used.

Then the VSM, the shared memory, which exports all the status, the memory statistics and so on. We use it for analytics and logging, and we provide a very powerful query language, the VSL query, to read the log straight from memory. By default we do not persist the logs; unlike Apache or NGINX we do not write to syslog or anywhere else. Logging is done by external tools: you integrate with other systems to collect the logs, because in a CDN or in OTT streaming the logs are massive, so usually you export them to Elasticsearch or a similar database to keep them for a longer time. Because the system itself does not keep the logs, the I/O spent on that is minimal, and that is one reason we are so efficient at handling requests: we focus on the requests and do not waste I/O writing logs.
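To make the backend fetch, the health checks, and the TTL, grace and keep values concrete, here is a rough VCL sketch; the hostnames, probe settings and lifetimes are placeholders, not numbers from the talk.

```vcl
vcl 4.1;

import directors;

# Placeholder health probe and backends.
probe healthcheck {
    .url = "/health";
    .interval = 5s;
    .timeout = 2s;
    .window = 5;
    .threshold = 3;
}

backend app1 { .host = "app1.example.internal"; .port = "8080"; .probe = healthcheck; }
backend app2 { .host = "app2.example.internal"; .port = "8080"; .probe = healthcheck; }

sub vcl_init {
    # Round-robin director: healthy backends take turns serving misses.
    new pool = directors.round_robin();
    pool.add_backend(app1);
    pool.add_backend(app2);
}

sub vcl_recv {
    set req.backend_hint = pool.backend();
}

sub vcl_backend_response {
    if (beresp.status == 200) {
        set beresp.ttl   = 5m;  # regular lifetime
        set beresp.grace = 1h;  # may still be served stale if the origin is struggling
        set beresp.keep  = 1d;  # kept around for conditional revalidation
    } else {
        # Cache errors only briefly so a recovering origin is picked up quickly.
        set beresp.ttl = 10s;
    }
}
```

On the logging side, the same shared memory can be queried live, for example with varnishlog -q 'RespStatus >= 500' to show only the failing transactions.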
Here is the log. If you are familiar with Varnish, this is varnishlog and how you extract the details; we pull all this information out of the VSM. This is just one request, and you can see how it comes in and goes out, when, and why. So if you are troubleshooting and you know the HTTP spec, you have all the details here. And here are the logs and statistics together: we keep our own internal counters as well, so it is not only the logs but also the counters.

Next, the caching API and the store. I will skip most of the store details, but I want to focus a little on the Massive Storage Engine, which is how we store the cached content. Normally when you write a file into a cache you have file tables keeping the file headers, the file names and the content. In Varnish the cache is actually split into two parts: one part is the metadata and the other part is the actual cache body. Doing it that way gives us the flexibility to delete an object from the metadata only, not from the storage itself, which reduces the I/O a lot; only once a whole block of cache has been deleted do we use the metadata database to clean up the space. So we save a lot of I/O, use the storage more efficiently, and get far less fragmentation. With this approach some systems have run for more than three or four years without a reboot: the disks do not crash and the fragmentation stays very low. Invalidation I will have to skip.

So let's move on to the web API part, which is more of a marketing topic. For web APIs we mostly work at the VCL and HTTP level: say Ajax calls, HTTP requests, POSTs, requests with query strings and so on. The engine itself does not care what kind of query or POST parameters you have; we can cache anything, using it as the index. So when you request a resource with GET or POST, it can be cached by Varnish and shared across multiple requests, and all the control of TTLs and so on, that magic, is done in VCL. Of course you can follow the Cache-Control headers or other headers, but most of the time for API calls you override them and are very specific about how you want to handle things.

For these web APIs, some users use ESI, Edge Side Includes. It is like a template: the server just sends a template with an ESI call, an esi:include, which pulls in another file, and the system caches it. Dynamic content, with some programming logic, can dynamically trigger another request on the fly to generate a fragment. It is a bit like having JavaScript in your browser, except it is not on the client side but on the server side.

You can also create more HTTP requests at the same time, not only the original one. For that we rely on another VMOD, the HTTP VMOD. During each call we can create additional HTTP requests, just like this: when a request comes in, I create another request, copy the headers, and then check the authentication.

Another one is Edgestash. It is a kind of template, much like a Mustache template, but on the server side. You put some logic in it, combine it with a JSON object structure, and generate the content on the fly. It is a bit like putting a web server at the edge, although not exactly the same: we aggregate content from different backends. This example fetches JSON objects from another application server, massages them, and then generates the layout.
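A rough sketch of how ESI gets switched on in VCL; the URL patterns and TTLs are placeholders, not values from the talk.

```vcl
vcl 4.1;

backend default { .host = "origin.example.internal"; .port = "8080"; }

sub vcl_backend_response {
    # Templates under /templates/ contain tags such as
    #   <esi:include src="/fragments/cart"/>
    # Turning on do_esi makes Varnish parse them and stitch in the fragments.
    if (bereq.url ~ "^/templates/") {
        set beresp.do_esi = true;
        set beresp.ttl = 1h;   # the template itself rarely changes
    }
    if (bereq.url ~ "^/fragments/") {
        set beresp.ttl = 10s;  # the dynamic fragments stay fresh
    }
}
```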
Yes, as long as you want. Some of these files are 1,000 or 2,000 lines; it does not matter, we do not limit you. The same goes for loops or nested loops: we do not limit how many loops you have. Of course there is some memory control, because it uses memory in your cache, and that is mainly about how much workspace memory you give each worker at startup. If you have more loops inside a block of objects you will probably need more workspace memory, but otherwise we do not care. Logically there is no limit; the limit is the memory.

Like any template language it is light on computation: just a template, and then everything is aggregated together. We also support gzip, which means that if you fetch some content like a JSON object you can compress and decompress it on the fly.

More integrations, for example Redis. This one is completely open source. If you want to share sessions across a whole cluster, say a cluster of 10 to 20 PoP instances, you can use Redis to share that information across all the nodes. Just like in this example: the Redis client is an additional VMOD, it talks to the underlying Redis daemon, fetches some session data, and then updates the shopping cart here on the fly. It really is programming at the edge, and it is more efficient than writing the same thing in NGINX or PHP.

There is also some configuration around networking, controlling speed and rate limiting. This one is about rate limiting: you can rate limit, filter, or check what kind of methods you want to accept, the kind of security things you can do at the edge in Varnish.

You can also check rules on the incoming requests. The mechanism is the same, but this one follows the HTTP standard specification more closely: if a request comes in asking for English, Chinese, or Korean, you can block it, accept it, or create a different version of the cache. This affects how you cache the objects.

And there are more VMODs you can put into your system. One VMOD manipulates the response: the response might be XML or HTML, and you can replace or update parts of it. MMDB is a geolocation and device database, so you can make additional decisions based on the device or the location. The HTTP module, which we showed, creates additional requests to make extra API calls. The JSON module lets you parse JSON on the fly without resorting to regular expressions or anything like that. JWT, JSON Web Token, is a built-in module that verifies the token itself to allow or deny the request. UDO and the directors control how you spread backend traffic across multiple backends. And the synthetic backend generates a new page or a new response without going to a backend at all.
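To make the rate-limiting and method-filtering idea concrete, here is a rough sketch. It assumes the open-source vmod_vsthrottle from the varnish-modules collection is installed; the limits and the method list are placeholder choices, not values from the talk.

```vcl
vcl 4.1;

import vsthrottle;   # open-source vmod from the varnish-modules collection

backend default { .host = "origin.example.internal"; .port = "8080"; }

sub vcl_recv {
    # Deny a client that makes more than 15 requests in any 10-second window.
    if (vsthrottle.is_denied(client.identity, 15, 10s)) {
        return (synth(429, "Too Many Requests"));
    }

    # Method filtering: only accept the methods this service actually uses.
    if (req.method != "GET" && req.method != "HEAD" && req.method != "POST") {
        return (synth(405, "Method Not Allowed"));
    }
}
```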
OK, I think that is it from my end, so thank you very much. Any questions before we wrap up? Thank you. These are only examples; there are a lot of other modules you can integrate, Redis, memcached, and most of them are open source, because there are lots of things happening all the time. Do you have any existing projects you are trying to connect? Then we can talk about it.

Yes, GraphQL is actually supported by default; it is just a matter of how and what kind of query you want to cache. There is nothing to it, no magic, because GraphQL to us is just another HTTP request; there is nothing specific we need to handle. The only thing we found a little tricky, and it is not really difficult, is deciding exactly what you want to cache and for how long. That is the only thing; otherwise it is straightforward. We have had some issues with some integrations, but nothing too difficult to solve.

Yes, exactly, that is all you need to do; we can support it. There are a couple of things you may need to change: you can adapt the application, change the POST to a GET, or we can allow you to cache the POST. If you cache the POST, that means you have a body on the request, and the request body is usually ignored by default, so you need to open up the parameters that control how much memory you want to spare for buffering the POST body. That is something to consider.

There are two versions. The open source version still relies on Hitch; if you know the Hitch project, it is still actively maintained. It is a separate component that does only the HTTPS part, the TLS termination. The enterprise version has TLS termination built in, so you do not need to do anything; you just bring it up and it works, no problem.

I think that is already there. I have not tested the new open source release myself, but version 7 already supports it. Yes, version 7, the open source version, they claim it has already been tested there, but I have not tested it myself. And the enterprise version is still based on version 6 or so, so it is not the latest one.

On the ESI syntax: we do not implement the whole ESI specification, only most of it, because the rest of those functions can be done with other VMODs, which is more efficient than doing it through ESI.

OK, cool. Any other questions? Yes. I worked at a CDN company myself for many years. The other part is the actual service: do you work on those tools yourselves, or do you rely on the service? We rely on our partners to provide the platform, but we work with Intel very closely. There are actually some extra slides about the Intel work: we do a lot of VMOD integration to use the CPU more efficiently, especially in the cloud environment, because in the cloud they are not really NUMA-aware. When you have the I/O and the NIC all together you only get 10-gig links, and we are talking about 1.4 terabits per second from one server, so the bottleneck is not the NIC, it is the memory, and the memory and the I/O. Yes, so you keep it in system memory and I/O. Exactly, exactly, that is a very good point.

OK, thank you very much, guys. If you have more questions, let's speak again. Thank you. Thank you.
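As a footnote to the question about caching POST and GraphQL requests, the usual pattern looks roughly like this sketch. It assumes the open-source vmod_bodyaccess from the varnish-modules collection; the /graphql path, the 64KB limit and the X-Cache-Post marker header are placeholders, not values from the talk.

```vcl
vcl 4.1;

import std;
import bodyaccess;   # open-source vmod from the varnish-modules collection

backend default { .host = "origin.example.internal"; .port = "8080"; }

sub vcl_recv {
    if (req.method == "POST" && req.url == "/graphql") {
        # Buffer up to 64KB of request body so it can be hashed and replayed.
        if (!std.cache_req_body(64KB)) {
            return (synth(413, "Body too large to cache"));
        }
        set req.http.X-Cache-Post = "1";  # marker so later states know the body matters
        return (hash);
    }
}

sub vcl_hash {
    # Make the query in the body part of the cache key.
    if (req.http.X-Cache-Post) {
        bodyaccess.hash_req_body();
    }
}

sub vcl_backend_fetch {
    # Varnish fetches misses with GET by default; restore POST so the origin
    # receives the original query.
    if (bereq.http.X-Cache-Post) {
        set bereq.method = "POST";
    }
}
```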