Hello, I'm Tomaž, I work for Collabora, and I have recently been working on the OpenGL VCL backend, where I wanted to improve the performance of rendering LibreOffice with OpenGL. I don't have any benchmarks or anything; I will just review the techniques I used to increase the performance of the rendering.

When the OpenGL backend first appeared it was in a very bad shape, so I don't think anybody really used it. By LibreOffice 5.1 it was in a very good shape: most of the rendering bugs were fixed and LibreOffice was mostly rendered correctly. The problem was that it was not really fast, or not as fast as software rendering, so we had to work on improving this.

There is a usual misconception here: people assume OpenGL means hardware accelerated and therefore fast, but it is not really hardware accelerated in that sense, it is just a different way of rendering. So for a drawing application like LibreOffice you would expect it to be really, really fast, but it's not so simple. The problem generally is that 2D rendering with the GPU is quite cumbersome compared to a software renderer for 2D objects. OpenGL evolved for 3D rendering, and the API evolved with it to be friendly for that kind of rendering, but this was not really true for 2D rendering: we still use canvas-style rendering, while with OpenGL you have to render everything with triangles. 3D scenes are mostly some kind of manipulation of triangles, and the same thing we have to do for 2D rendering.
OpenGL does support GL_LINES for line rendering, but what you get from it is quite implementation dependent, so it is not really something we can build on; in practice lines too end up being drawn with triangles. Another thing to keep in mind is that in OpenGL you submit a queue of commands to the driver, and every command has some overhead. The GPU also has its own local memory, and you have to upload textures and vertices to the GPU memory before you can draw. Compared to a CPU renderer, where you just manipulate the coordinates directly, getting the data to the GPU in the right form is quite complex.

On top of that, we have to deal with the native controls. We have to render them with the native toolkit on the CPU and upload the result to the GPU every time we draw them, which is expensive. So we want to cache them. Luckily, there are some controls that never change, like check boxes, and there are some other controls that change only when we are resizing, so caching them makes a lot of sense. So we cache the textures in a least-recently-used cache and just drop entries when there are more than 200 textures. But this can still be improved in various ways.
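As a rough sketch of the caching idea described above, a least-recently-used texture cache could look like this. All names here are invented for illustration; this is not the actual VCL code, and a real implementation would also delete the evicted GL texture.

```cpp
#include <cstdint>
#include <list>
#include <unordered_map>
#include <utility>

// Illustrative LRU cache mapping a control's cache key to a GL texture id.
class TextureCache {
public:
    explicit TextureCache(std::size_t maxEntries) : mMaxEntries(maxEntries) {}

    // Returns the cached texture id, or 0 if the key is not cached.
    std::uint32_t lookup(std::uint64_t key) {
        auto it = mMap.find(key);
        if (it == mMap.end())
            return 0;
        mOrder.splice(mOrder.begin(), mOrder, it->second); // mark most recent
        return it->second->second;
    }

    void insert(std::uint64_t key, std::uint32_t textureId) {
        auto it = mMap.find(key);
        if (it != mMap.end()) {
            it->second->second = textureId;
            mOrder.splice(mOrder.begin(), mOrder, it->second);
            return;
        }
        mOrder.emplace_front(key, textureId);
        mMap[key] = mOrder.begin();
        if (mOrder.size() > mMaxEntries) { // evict the least recently used
            mMap.erase(mOrder.back().first);
            mOrder.pop_back();
        }
    }

    std::size_t size() const { return mOrder.size(); }

private:
    std::size_t mMaxEntries;
    std::list<std::pair<std::uint64_t, std::uint32_t>> mOrder; // front = newest
    std::unordered_map<std::uint64_t,
                       std::list<std::pair<std::uint64_t, std::uint32_t>>::iterator>
        mMap;
};
```

With the 200-texture limit mentioned above, `TextureCache cache(200)` would reproduce the described behaviour.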
Another improvement is a texture atlas: instead of having a lot of small textures, we have one big texture which includes many images, and in this way we decrease the overhead we have on every draw. One problem here is how to pack many textures of variable size into one big texture. There are many algorithms available; we have two. The simple one is just to divide the big texture into equal-size regions and put the subtextures inside these regions. This is simple and fast, but it can waste a lot of space, so it is currently used for icons, which mostly have equal size. If you have a toolbar, there are many of these icons, so we save some overhead this way.

Then we come to text rendering, which is quite similar. The problem with text is that GPUs have no support for rendering glyphs, so we have to render them on the CPU and then upload the result as a texture, similar to icons. The problem is that glyphs don't have equal size, so the simple equal-size texture atlas is not enough. One thing we could do is just render the whole string into a texture and upload that texture to the GPU on every draw call, but this is quite slow. So what is usually done here is to cache the glyphs of the text: we draw individual glyphs as subtextures in the texture atlas. Then drawing a string can usually be done in one draw call, which is very, very fast. For example, if we want to draw "abracadabra", the texture atlas holds the glyphs A, B, R, C, D, and in one draw call we just set up the coordinates: A comes from this region of the texture, B from that region, and so in a single draw call we can render the whole word. Currently this is implemented for Windows, but not for Linux, so OpenGL text rendering is generally still quite slow on Linux.
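The simple equal-size atlas described above can be sketched like this. The class and names are invented for illustration; VCL's actual packing code differs.

```cpp
#include <optional>

// One big texture divided into a grid of equal-size slots; each icon
// (or other roughly equal-size image) gets the next free slot.
struct AtlasRegion {
    int x, y, w, h; // sub-rectangle inside the big atlas texture
};

class FixedGridAtlas {
public:
    FixedGridAtlas(int atlasSize, int slotSize)
        : mAtlasSize(atlasSize), mSlotSize(slotSize), mNextSlot(0) {}

    // Reserve the next free cell, left to right, top to bottom.
    // Returns nothing once the atlas is full.
    std::optional<AtlasRegion> allocate() {
        const int perRow = mAtlasSize / mSlotSize;
        if (mNextSlot >= perRow * perRow)
            return std::nullopt;
        const int col = mNextSlot % perRow;
        const int row = mNextSlot / perRow;
        ++mNextSlot;
        return AtlasRegion{col * mSlotSize, row * mSlotSize, mSlotSize, mSlotSize};
    }

private:
    int mAtlasSize;
    int mSlotSize;
    int mNextSlot;
};
```

Variable-size glyphs need a smarter packer than this fixed grid, which is exactly why the simple atlas is not enough for text.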
So the next thing I did was to decrease state changes. OpenGL has a lot of things you can enable and disable, like the GL scissor test, the GL stencil test, blending, and you usually do this with the glEnable and glDisable commands. The problem is that OpenGL implementations are quite stupid in this way: when you call enable, the internal state changes immediately, and an internal state change resets many things. We don't want to do this if it's not necessary, so we track the state of these flags ourselves; we don't ask OpenGL for the state. Another thing is binding textures. You usually bind textures to texture units, and you don't need to unbind them, but when you execute any draw that needs a texture, the texture must be bound to the specific texture unit. So we track which textures are bound and only rebind a texture when this is necessary. The same goes for the GL viewport and the GL scissor: the scissor is for clipping, and the viewport defines the size of the surface we want to draw into.

Next, we can combine the shaders. The problem is that if you have a lot of shaders, you have to switch between them very often, and this is also a state change, so if it's not necessary it is again overhead in OpenGL. What we did is combine the shaders into one big shader and just switch what it does inside the shader with an if statement. One concern is that branching inside shaders is not really recommended, but in the end it was better to use if statements or a switch statement inside the shader: it produces less overhead than switching the shaders themselves.

The next topic is polyline drawing. I have here the polylines I used for testing the implementation. In the basic case, polylines can be open or they can be closed.
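Going back to the state tracking described above, the idea can be sketched as follows. `RealGlEnable` and `RealGlDisable` stand in for the real GL entry points, and the call counter is only there to show which driver calls get skipped; the names are invented.

```cpp
#include <unordered_map>

// Stand-ins for the real driver calls; the counter shows what was skipped.
static int gDriverCalls = 0;
static void RealGlEnable(unsigned) { ++gDriverCalls; }
static void RealGlDisable(unsigned) { ++gDriverCalls; }

// Track glEnable/glDisable state on the CPU so redundant calls never
// reach the driver. Unknown capabilities are passed through once.
class StateTracker {
public:
    void enable(unsigned cap) {
        auto it = mState.find(cap);
        if (it != mState.end() && it->second)
            return; // already enabled, skip the driver call
        RealGlEnable(cap);
        mState[cap] = true;
    }

    void disable(unsigned cap) {
        auto it = mState.find(cap);
        if (it != mState.end() && !it->second)
            return; // already disabled, skip the driver call
        RealGlDisable(cap);
        mState[cap] = false;
    }

private:
    std::unordered_map<unsigned, bool> mState; // capability -> enabled?
};
```

The same pattern extends to texture bindings, the viewport, and the scissor rectangle: remember the last value set and only call into the driver when it actually changes.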
They usually have line caps at the beginning and the end, which can be round or just square, and line joints; the possible line joints can be round, miter, bevel, or no joint at all. Before, we decomposed the polyline on the CPU into trapezoids and drew the trapezoids. This was quite expensive, so what I implemented is to do this on the GPU instead. What we have to be aware of here is anti-aliasing: we want to do it inside the shader. There are different possibilities, like MSAA, which applies anti-aliasing to everything, but for performance reasons we wanted to do it without MSAA enabled. As I said, this is mainly for line drawings, but it can also be used for poly-polygons, mainly for the outlines of poly-polygons and polygons, so that they appear anti-aliased.

So this is how we do it. If we want to draw a line segment on the GPU, we have one triangle and a second triangle forming a quad. Depending on the line width, we calculate the extrusion vector, which is perpendicular to the line and as long as half the line width on each side. If we also want anti-aliasing, we add a half-pixel-wide feather on both sides: the feather just fades the alpha of the line from 1 to 0 on both sides, and this gives the anti-aliasing effect.

Next we have batching and combining. We want to decrease the overhead on the GPU and reduce the number of draw calls. What we can do here is batch the draw commands, reorder them, and combine some of them. In the current state we do this for polygons, rectangles, polylines, and text rendering, but not yet for gradients, and most texture rendering is not batched yet, so we don't have fully batched drawing enabled yet. But I think once we have everything batched, the performance can be improved quite a lot.
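The extrusion-vector math from the polyline discussion above can be written out like this. It is plain C++ rather than shader code, with invented names; in the real backend the equivalent computation happens per vertex in the shader.

```cpp
#include <cmath>

struct Vec2 {
    double x, y;
};

// For a segment p0->p1, compute the offset applied to the quad's vertices:
// the unit normal scaled by half the line width plus a half-pixel feather.
// The fragment shader then fades alpha from 1 to 0 across the feather,
// which produces the anti-aliasing effect without MSAA.
Vec2 extrusionVector(Vec2 p0, Vec2 p1, double lineWidth, double feather = 0.5) {
    const double dx = p1.x - p0.x;
    const double dy = p1.y - p0.y;
    const double len = std::sqrt(dx * dx + dy * dy);
    const Vec2 normal{-dy / len, dx / len}; // perpendicular to the segment
    const double extent = lineWidth / 2.0 + feather;
    return Vec2{normal.x * extent, normal.y * extent};
}
```

Offsetting the two endpoints by plus and minus this vector yields the four corners of the quad (two triangles) that the GPU rasterizes.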
Here is what we do with batching. For example, we want to draw a big rectangle, two smaller rectangles, and two lines, so we have these draw actions: the background rectangle, then rectangle, line, rectangle, line. First, there is an overlap, so we have to check the order in which the scene is rendered: the first rectangle, which is actually the background, is drawn first, and then we can draw everything else. The other rectangles and lines don't overlap in this case, so we can proceed. Next, we change the order, so we now have two draw-rects one after another and two draw-lines, and we can combine each group and execute it with one draw call. We had five draw calls at the beginning, and now we have reduced this to three draw calls, which is quite nice.

I also wanted to mention a little bit about backend testing. I created a program for visual backend testing, which I used to check that OpenGL still draws correctly. This can be used for a lot of things, not just OpenGL, for example to test other backends. What the test does is draw primitives to a virtual device and check for pixel matching. This cannot be done very exactly, so we have three possible states: either the test passes or it fails, which is normal, but the third one is "passes with quirks", which means that the rendering is not exactly what is expected, but it is something that we know the backend has problems with. So it still passes, but there are some quirks. This is especially the case with OpenGL: sometimes the first pixel isn't drawn because of mismatches or changes in the OpenGL backend, so I had to invent something like "passes with quirks". This can also be used for finding rendering bugs in existing backends: we can see how the different backends draw, and maybe there are some bugs that need to be fixed.
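The reordering example above, where five draw actions collapse into three draw calls, can be sketched with a toy counter. This assumes, as in the example, that only the first action (the background) overlaps the others; real batching has to check overlaps pairwise, which is omitted here.

```cpp
#include <set>
#include <string>
#include <vector>

// Count draw calls after batching: the background stays its own call;
// the remaining non-overlapping actions can be reordered freely, so each
// distinct action type collapses into one combined draw call.
int batchedDrawCalls(const std::vector<std::string>& actions) {
    if (actions.empty())
        return 0;
    std::set<std::string> types(actions.begin() + 1, actions.end());
    return 1 + static_cast<int>(types.size());
}
```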
It would also be very helpful when we get new backends; for example, a Qt backend could use these tests to see whether its rendering is OK or not. The main purpose, though, is that when the user runs LibreOffice for the first time, we could test whether the OpenGL driver misbehaves, and if it does, disable OpenGL and fall back to the software renderer instead.

Now for future improvements. For drawing filled polygons there is a possibility to draw them with the stencil buffer, but that is quite expensive, so generally this is not implemented and we rather do it on the CPU for now. Then there is Bézier curve rendering, where we could use, for example, the Loop-Blinn algorithm. Unfortunately, this is patented and we cannot use it; it was invented at Microsoft, and I think they use it inside Direct2D quite extensively to accelerate text rendering. The alternative for Bézier curves is to do the decomposition in a geometry shader: from two vertices, you can program the geometry shader to create more vertices, using the usual algorithm of drawing Bézier curves by decomposition into line segments.

The last point is to make the API more GPU friendly. Currently the API is still canvas based, and what we really want is something that is more tailored towards GPUs. This would be a scene graph API: instead of draw calls, we have a tree of objects that are persistent between calls, and we can optimize the tree depending on the rendering target, so for the GPU we can do one kind of optimization and for the CPU a different one. A scene graph usually uses matrix transformations extensively: instead of modifying all the coordinates of the vertices or of the 2D draw commands, we use a transformation matrix, and in the case of the GPU we can just apply it inside the vertex shader, which is quite a lot faster.
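The matrix idea in the scene-graph point above amounts to this: a toy 2D affine transform with invented names. In the GPU case the same multiplication happens per vertex inside the vertex shader, so only the six matrix values need to be updated instead of every coordinate.

```cpp
struct Point {
    double x, y;
};

// 2D affine transform: x' = a*x + c*y + tx,  y' = b*x + d*y + ty.
// A scene graph node stores such a matrix; changing the node's position
// or scale only rewrites the matrix, never the stored geometry.
struct Affine {
    double a, b, c, d, tx, ty;
};

Point applyTransform(const Affine& m, Point p) {
    return Point{m.a * p.x + m.c * p.y + m.tx,
                 m.b * p.x + m.d * p.y + m.ty};
}
```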
And we can extend this to have a separate rendering thread, which would also be very nice to have: in the normal thread we prepare the drawing, and the rendering thread just draws everything. And that is all I have. Thank you.