So, as everyone knows, elephants never dance in nature. But if somebody has a lot of patience and time, they can teach elephants to dance. Let's see how we can apply this general approach to PostgreSQL. Let's start with me and the company I work for. It is Data Egret; just one month ago Data Egret was known as PostgreSQL-Consulting. We provide database maintenance solutions, 24x7 consulting and so on, for small, medium and large clients. Personally, I have worked with PostgreSQL for more than 15 years, starting from the year 2000, ages ago. My specialization is general database performance optimization, starting from the network stack and going up to query optimization, index tuning, PL/pgSQL optimization and so on. An additional specialization is what I call emergency database troubleshooting: something is broken really badly and needs to be fixed fast; that is usually what I do. Now let's move on. So, they say that elephants don't dance, but they can be made to dance. Really, yes: in practice, in PostgreSQL and in every other database, some queries cannot be automatically optimized by the database engine. In such cases, implementing an alternative approach, an alternative algorithm, can provide a really huge performance boost: 10 times, 100 times, 1000 times, depending on your workload. Of course, not every query can be optimized this way, but in many practical, real-world situations there exists some alternative approach that gets the required data out of the database faster. So, the goals of this presentation: first, to demonstrate some very useful alternative approaches to executing queries. The second goal is to provide a method and ideas for converting PL/pgSQL code to plain SQL queries. And I have tried to arrange this presentation as a self-contained learning package, so it can be used outside of today's talk as ready-to-use material.
As a result, this presentation contains quite a lot of queries and a lot of text which I'm not going to read in full, but it will be useful for anyone who reads the slides afterwards. Of course, this presentation also has some non-goals, points which I'm not going to discuss. First, it will not be about basic query optimization, because there is a lot of material on the internet, and there are books available, about how to optimize your queries with indexes and so on. Second, it will not be about query planner and optimizer limitations; that is an independent topic. I'm just showing how to solve practical real-world problems with practical solutions. And third, it's not about comparing PL/pgSQL performance with the same plain SQL queries. That is why the PL/pgSQL code samples here are real-world code written not in the fastest possible way, but in the way which is easiest to understand, because there will be some complicated concepts. This presentation is targeted mostly at SQL developers, and there are some prerequisites which are really good to have in order to understand what's going on. It requires some basic PL/pgSQL knowledge, which is very good to have anyway, and knowledge of advanced SQL features such as WITH, WITH RECURSIVE, JOIN LATERAL and UNNEST. Every sample in this presentation was tested on PostgreSQL 9.6, the latest available, but it should work on previous versions with minimal modification. If you try to run it on an older version, like 9.3 and older, it's still possible, but it really requires workaround implementations for some missing features. Part of this presentation was originally presented four years ago, for a very old PostgreSQL version, in St. Petersburg, Russia, in 2014. Since then it has been greatly improved, with a lot of additional information added. The presentation is arranged as multiple independent blocks, and every block has the same structure. It starts with a problem description.
Then I show the naive approach, a simple SQL query with EXPLAIN ANALYZE, where I explain why it's slow. Then comes an alternative algorithm implemented in PL/pgSQL, where it's a lot easier to analyze the algorithm. Then I provide the same algorithm implemented in plain SQL, as a single query, followed by EXPLAIN ANALYZE. And at the end, of course, a performance comparison of all three approaches. Okay, let's go quickly through the initial data preparation. I'm not going to concentrate on it a lot; the structure is very easy to understand. It is a very simple schema with just two tables. The first is the blog_posts table with id, ctime, payload and author_id, and the second is the authors table with id and name. The blog_posts table has a primary key on id and a unique index on (author_id, ctime). This very simple structure allows me to present seven different problems with different solutions, which we are going to discuss. The next few slides just create and populate the required tables; you can analyze them in your free time. No magic, and you can run all these queries on your own computer and check every provided solution. We continue with populating the tables, and now let's move to the first practical issue: using index-only scans to speed up queries with large offsets. Everyone who works with PostgreSQL knows that queries with large offsets are almost always slow, whatever you do. That's because to produce the millionth row, the database has to iterate through the first million rows anyway. However, iterating through the rows requires reading from the heap, from the table itself. Alternatively, we can try to use a fast index-only scan to skip the first million rows: because an index-only scan doesn't touch the table data and reads only the index, it is supposed to be faster. Let's see. So, a very simple query: select from the blog_posts table, order by id,
offset one million, limit 10. Let's see how the database executes this query. What do we see? The database fetched 1,000,010 rows using an index scan. An index scan means the database fetches not only entries from the index: it also fetches those 1,000,010 rows from the table itself, and only then applies LIMIT 10 to produce the required result. That means we really need only 10 rows, but the database goes into the table and fetches 1,000,010. Of course, that's not good for performance. Let's see how we can implement an alternative idea in PL/pgSQL. In this case, it's a very simple idea. Initially we select only the id at the required offset, with LIMIT 1. Because we are fetching not all the data from the table but only the id, the database can use an index-only scan to skip the offset rows. So the database runs through the first million rows using an index-only scan, without going into the table itself. Once we have found the starting id, we can fetch the required LIMIT rows from the table using a normal index scan, but it will run only for LIMIT rows. In PL/pgSQL this looks very simple. So let's try to implement the same idea as a plain SQL query. In plain SQL I'm using LATERAL: initially I select the starting id using exactly the same approach, just converted to SQL, and then use a LATERAL join to fetch the required 10 rows. Of course, such a query has to be written manually; no ORM or automatic query generation will produce this approach for you. Now let's see whether it really helps or not, because for every such idea you should test whether it really provides a performance benefit. Looking at the rewritten query's plan, we see that the database used an index-only scan to run through the first million rows and then switched to an index scan to fetch the required 10 rows. So it's supposed to be faster. Let's see. For small offsets we have about a two times difference in performance.
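As a toy illustration of this two-step idea, here is a hedged sketch. It is not the speaker's code: the data is invented, and SQLite (via Python) stands in for PostgreSQL, so there is no real index-only scan here, only the query shape of "skip the offset touching only id, then fetch the limit rows".

```python
import sqlite3

# Hypothetical mini-version of the talk's blog_posts table; table and
# column names follow the talk, but the data is made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blog_posts (id INTEGER PRIMARY KEY,"
             " ctime TEXT, payload TEXT, author_id INTEGER)")
conn.executemany(
    "INSERT INTO blog_posts VALUES (?, ?, ?, ?)",
    [(i, f"2017-01-{i % 28 + 1:02d}", "x" * 16, i % 10)
     for i in range(1, 1001)])

OFFSET, LIMIT = 500, 10

# Naive version: the engine walks OFFSET+LIMIT full rows.
naive = conn.execute(
    "SELECT * FROM blog_posts ORDER BY id LIMIT ? OFFSET ?",
    (LIMIT, OFFSET)).fetchall()

# Step 1: skip OFFSET entries while touching only id (in PostgreSQL
# this can be an index-only scan on the primary key index).
(start_id,) = conn.execute(
    "SELECT id FROM blog_posts ORDER BY id LIMIT 1 OFFSET ?",
    (OFFSET,)).fetchone()

# Step 2: fetch only LIMIT full rows starting from that id.
fast = conn.execute(
    "SELECT * FROM blog_posts WHERE id >= ? ORDER BY id LIMIT ?",
    (start_id, LIMIT)).fetchall()

assert naive == fast  # same rows, far less heap traffic in PostgreSQL
```

Both versions return identical rows; the win in PostgreSQL comes from step 1 never visiting the heap.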
So the advanced version is almost two times faster. As the offset gets bigger, the performance difference gets bigger as well, and for big offsets there is a four times difference between the naive SQL version and the advanced SQL version. Not spectacular, but a four times difference means that your page or report will run three to four times faster, or that you need four times less hardware for the same task. So with minimal query changes we made a real-world query three to four times faster. Okay, let's move to the second optimization case. As I said, every block is independent of the others; each one just shows an interesting idea for speeding up a real-world situation. In this part I'm going to push ORDER BY and LIMIT down under a join condition. If a query has a combination of JOIN, ORDER BY and LIMIT, the database will often join first, sort all the intermediate results, and only apply LIMIT at the end. However, you don't need to join the rows which will not get into the final result. So in many cases it will be a lot more efficient to perform the LIMIT plus ORDER BY part first, and only then join the required tables to the result. So, the naive query again, pretty normal for everyone who works with SQL: select from the blog_posts table, join the authors table on the author id condition, order by ctime, limit 10. Just a simple query which demonstrates the problem. Let's see how the database executes it. We can see that the database starts by fetching 50,000 rows from the blog_posts table, then for every row it joins the authors table, performing the inner loop 50,000 times, and only after that does it sort to return the 10 required rows. But we need the author information only for those final rows; we don't need all 50,000 author lookups in this case. So, how can we implement the alternative idea in PL/pgSQL? It's very easy to do.
We start by selecting the required number of rows from the blog_posts table, applying the sort and limit before we join with the authors table. So in this query we limit the intermediate result to only 10 rows, and then, only for those final rows, we join the result with the authors table. In PL/pgSQL it's quite easy to understand what's going on, so let's try to convert this query to the plain SQL version. Again we use the LATERAL construction with the same query: we select with ORDER BY ctime LIMIT 10, and use a LATERAL join to attach the author info. The second ORDER BY is required because the database can decide to use some alternative plan and lose the initial ordering. What I want to show is that this kind of loop in PL/pgSQL can be converted to the same query in plain SQL using the LATERAL construction: you start from the initial query and use LATERAL for the loop body. Let's see how the database executes this query. The database again fetches 50,000 rows from the blog_posts table, does the sort and limit, and only after the sort and limit does it join the authors, only 10 times instead of 50,000 times. Let's see whether it really helps or not. For the situation with only a few authors and LIMIT 10, the naive SQL runs in 150 ms and the advanced SQL runs 3 times faster. If in the same situation you join not just the authors table but, say, 5 more tables, the performance difference will be much bigger. So in many practical cases, if a join is used only to attach additional information, it may be a lot more efficient to perform the query with ORDER BY and LIMIT first, and only then join the other tables to the required rows. In practical situations I have seen more than a 10 times difference between the naive and advanced approaches when a lot more joins are involved. Again, we stop treating the database as a magic black box that knows better than the developer what to do.
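To make the limit-before-join idea concrete, here is a hedged sketch with invented data. SQLite (via Python) stands in for PostgreSQL, and a derived table stands in for LATERAL; the point is the query shape, not the exact plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE blog_posts (id INTEGER PRIMARY KEY, ctime TEXT,
                             author_id INTEGER);
""")
conn.executemany("INSERT INTO authors VALUES (?, ?)",
                 [(i, f"author{i}") for i in range(10)])
conn.executemany(
    "INSERT INTO blog_posts VALUES (?, ?, ?)",
    [(i, f"2017-{i % 12 + 1:02d}-{i % 28 + 1:02d}", i % 10)
     for i in range(1, 501)])

# Naive shape: join every matching post with authors, sort, cut to 10.
naive = conn.execute("""
    SELECT p.id, p.ctime, a.name
    FROM blog_posts p JOIN authors a ON a.id = p.author_id
    ORDER BY p.ctime DESC, p.id LIMIT 10""").fetchall()

# Rewritten shape: sort and limit first, join only the 10 survivors.
fast = conn.execute("""
    SELECT p.id, p.ctime, a.name
    FROM (SELECT id, ctime, author_id FROM blog_posts
          ORDER BY ctime DESC, id LIMIT 10) AS p
    JOIN authors a ON a.id = p.author_id
    ORDER BY p.ctime DESC, p.id""").fetchall()

assert naive == fast
```

The tie-breaking `id` in the ORDER BY keeps both versions deterministic, so the row sets are provably equal.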
Instead, we apply a manually crafted alternative idea which can produce a great performance benefit. Let's move to the next block. This block should already be known to some of you, because it has been published on the PostgreSQL wiki and is available in many places on the internet, but it again shows the same idea of using an alternative approach. I'm talking about speeding up DISTINCT. Every developer finds out at some point that a DISTINCT query over a large table is always slow, because it always scans the full table: even if it is looking for 100 unique ids, it will scan 100 million rows if there are 100 million rows in the table. The alternative is creative use of an index and a query to perform the DISTINCT calculation. This technique is known as a loose index scan, and a good description is available on the wiki page. So the naive query looks like this: select distinct author_id from the blog_posts table. Let's see how the database runs it. We can see the database scans all 10 million rows of the table to produce only 1,000 unique authors. Of course, scanning 10 million rows cannot be fast. Now let's see how we can implement a loose index scan in PL/pgSQL. We start from the minimal author_id in the table; if you have an index on author_id, it will run really fast. Then we perform a loop: look for the next author_id bigger than the one we already found, and return each found author_id into the result. So the database finds the minimal author_id and then uses the index to find the next author_id after that one, which again is really fast, because a query with such a condition uses the index very efficiently. Now let's see how we can convert this query into plain SQL form. Well, in this case, because we modify variables inside the loop, we cannot use the simple JOIN LATERAL approach. We switch to WITH RECURSIVE, which allows us to perform iteration with variables changing inside the loop. So again we start from the lowest author_id; it's actually the same query.
Then we do the WITH RECURSIVE loop, again using the JOIN LATERAL construction, to select the next author_id bigger than the one found on the previous iteration, and finally return the found values. The query looks very complicated, but if you start from the PL/pgSQL version and have debugged it, the conversion to this version is actually pretty straightforward. Let's see how the database runs this query. In this case the database starts by looking for the minimal author_id and then performs 1,000 loops over the index for the next author_id. It's supposed to be faster; let's see whether it really is. We see almost 3 seconds for the naive SQL on the 10 million row table, but 10 milliseconds for the advanced SQL version, so it's almost a 300 times difference. A speed-up like this is the difference between an unresponsive page and a usable one, and it means you can use this approach in many situations. So again, an alternative algorithm provides a huge performance benefit. Okay, let's move to the next block, which I call "calculate DISTINCT ON with an IN list". DISTINCT ON is used when an application needs to fetch the latest data for some selected list in a single query: say you have 10 authors and would like to find the single latest post from each of them. Unfortunately, DISTINCT ON is usually slow in the database, and again we try creative use of indexes and advanced query techniques to speed it up. In naive SQL, the simple implementation looks like: select distinct on (author_id) from the blog_posts table where author_id in (some list). This query should return 5 rows, with the latest post for each listed author_id. Let's see how the database runs it. What we can see is that the database fetches all the posts of the listed authors, so it fetches 50,000 rows from the table and then starts sorting and deduplicating. Unfortunately, that's not really efficient. What can we try in PL/pgSQL? Let's say we iterate over the author_id array, and for every author_id we perform one single query on the blog_posts table: where author_id equals the given author, order by ctime, limit 1.
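Before going further, the loose index scan from the previous block can be sketched as a runnable toy. Again the data is invented and SQLite (via Python) stands in for PostgreSQL; the recursive-CTE shape, jumping from one distinct value to the next, is the point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blog_posts (id INTEGER PRIMARY KEY,"
             " author_id INTEGER)")
conn.execute("CREATE INDEX blog_posts_author ON blog_posts(author_id)")
conn.executemany("INSERT INTO blog_posts VALUES (?, ?)",
                 [(i, i % 37) for i in range(1, 10001)])

# Naive DISTINCT: scans every row of the table.
naive = [r[0] for r in conn.execute(
    "SELECT DISTINCT author_id FROM blog_posts ORDER BY author_id")]

# Loose index scan: find the minimum, then repeatedly jump to the next
# larger value -- one index probe per distinct value instead of a scan.
loose = [r[0] for r in conn.execute("""
    WITH RECURSIVE t(a) AS (
        SELECT MIN(author_id) FROM blog_posts
        UNION ALL
        SELECT (SELECT MIN(author_id) FROM blog_posts
                WHERE author_id > t.a)
        FROM t WHERE t.a IS NOT NULL
    )
    SELECT a FROM t WHERE a IS NOT NULL""")]

assert naive == loose
```

The recursion terminates when the inner `MIN` returns NULL, i.e. when there is no author_id left above the last one found.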
If you have an index on (author_id, ctime), this query will again run very fast. Now, after we have debugged the new idea, we can try to implement the same idea in plain SQL. In this case, because no variables change inside the loop, we can switch back to the normal LATERAL construction: we start from the array of authors, and for each author we run the same query as in the PL/pgSQL version, so it will loop over these 5 authors and perform one query for each of them. Let's see how the database executes it. Of course, in every EXPLAIN I'm showing only the critical part, because you don't want to see 5 pages of EXPLAIN ANALYZE for every sample. In this EXPLAIN ANALYZE we see that the database performs only 5 index-only scans, one for each author. Now let's see whether it really helps in our case. Well, the naive SQL runs in 110 ms and the advanced SQL in 0.4 ms, so again more than a 200 times difference between the basic and advanced approaches; with timings like these there is no performance issue at all for this situation. I just want to mention that all timings were measured on an Amazon 2xlarge memory-optimized instance with the whole database in memory, so I'm not worried here about disk performance or disk access times. We have only two blocks left, and every block is more complicated than the previous one, so it's getting interesting, I hope. Now suppose we need to calculate DISTINCT ON not over an author list but over the whole table. When you start developing an application, DISTINCT ON works perfectly, but once your table grows, a DISTINCT ON query becomes a performance issue quite fast. In my experience, DISTINCT ON over large tables is always a performance killer: a no-go for a web page, and even a no-go for online reporting. So what can we try if we need to calculate DISTINCT ON over a really big table? The initial naive query looks simpler than the previous one: select distinct on (author_id) from the blog_posts
table, ordered by author_id. As we have 1,000 unique authors, it should return 1,000 rows. However, let's see how the database produces those 1,000 rows. Oops, there is a problem: the database again scans all 10 million rows, then uses disk to sort 10 gigabytes of data, an external disk merge of 10 gigabytes. Of course, performance will be awful, and in the end it produces only 1,000 rows. Not good at all. Let's see what we can do about this problem. This is a bit more complicated, of course, but again we start from the lowest author_id and at the same time fetch the latest post of that author; again, if we have an index on (author_id, ctime), this query will run very fast. Then inside the loop we use a slightly more complicated query which selects the latest post of the next following author_id, so actually a slight modification of the fast DISTINCT query which we discussed in block 4. Again, with the index on (author_id, ctime) this query will run really fast. After we have debugged this version, we can try to convert the stored procedure to a plain SQL query, if for some reason you don't like stored procedures. And again, because loop variables change inside the loop, we unfortunately cannot use JOIN LATERAL alone; we have to use the WITH RECURSIVE construction: start from the same query and then iterate over the same query. It's actually a pretty straightforward conversion of the already written PL/pgSQL code to a plain SQL query. Let's see how the database runs it. We can see that the database performed 1,000 index scans to fetch all the required data; it no longer reads all 10 million rows from the table. This time I will show one more alternative approach: we can use the fast DISTINCT author_id calculation which we discussed in part 4, and then use that fast DISTINCT result to implement the algorithm from block 5. So we combine block 4 and block 5 to produce the same result. It's actually a bit easier to understand what's going on here, but
it's a bit slower. It shows, though, that you can combine any of these approaches if your query requires speed. Let's see the performance difference: 4 seconds for the naive SQL approach versus 13 milliseconds for the advanced SQL (sorry, I automatically switched into Russian there), so again something like a 300 times difference, and a query with this performance is quite usable on web pages. Now we are moving to the most interesting part of the presentation. I spent about 3 weeks trying to make the idea of this block easier to understand, because it is an unusually complicated algorithm, but a very practical query. The basic news feed looks very simple: you are subscribed to some authors and would like to see the 10 latest posts, not the latest post for each author, but the latest posts across all of those authors, like the friends feed in Facebook. The query looks easy to write, but it's very hard to make it run fast if the database is really big. It's a very common query for many, many web applications; in my experience it usually has some more conditions, but the general structure stays the same: where author_id in (subscription list), order by ctime, limit something, maybe with an offset. Let's see how the database runs it. The database again fetches all posts of all the listed authors and then starts sorting 50,000 rows to produce the final 10. Not good, of course, from the performance point of view. Problem number one: it fetches every post of every listed author, and if the database has been running for 10 years, that is a huge number of posts from many authors. Problem number two: it fetches whole rows, not only author_id and ctime but the entire payload, including the post text and so on, increasing memory consumption. The more posts in the database, the slower the query; the longer the list of subscribed authors, the slower the query. Not good at all. An ideal implementation of such a query should require no more than limit plus offset index probes on the blog_posts table, and ideally it should be fast for small offset and
limit values, like 100 or 1,000; it's unlikely anyone pages past the first 100 pages of a feed. So let's start from an array of author_ids, say 20, 30, 160; we have an array with this structure, which simply assigns positions to the rows. Now, using the ideas from block 5 which we already discussed, we populate a second array with the latest post for each of the listed authors; these are actually the same queries we discussed in block 5. So now we have a second array with the dates of the latest posts. Next, let's find the position of the latest post in the second array; in this case it will be position 4, the end of February. Now that we have position 4 for the latest post overall, let's fetch that row into the final result. We have found the first row, the latest post overall from these authors: it is author number 4 with this ctime. Then we replace this value in the array with the date of the previous post by the same author: we just select that author's previous post. And now just rinse and repeat the previous two steps: select the latest post from the array, which in this case will be the row at position 5, from the 16th, and push it into the result table; then rinse and repeat the two previous steps, adding a found row to the result on each iteration. Great, it looks easy, but implementing this algorithm in SQL is not going to be easy; even in PL/pgSQL it looks a bit scary. This part just populates the arrays; then, until we have found the required rows, we do step 2 and then step 3; then, once we are past the offset, we start returning the required results, increasing the counter of found rows and replacing the author's time with the previous message time. So it's actually a straightforward implementation of what we discussed on the earlier slide. And now let's try to implement the same, really complicated, algorithm in plain SQL. The plain SQL version I came up with
looks like this one. Don't be scared; we will split it into parts and go through it quickly. The structure is very simple: again we use recursive iteration, with an initial part of the recursive algorithm, the loop body with an exit condition, and production of the final result. The initial part: we have an empty result so far, zero rows found so far, the author_id array which we use at every step, and the populated time array. Then comes the main loop; unfortunately I have very limited time left, sorry, 5 minutes. If we have already skipped the offset rows, we start returning results and increasing rows_found; we pass through the author_id array, replace the entry inside the author time array, and use a combination of two LATERAL queries to perform the steps which we have debugged and produced in PL/pgSQL. And here are the performance results with 10 authors: if we are looking for just one entry, the naive query takes 180 ms, but the alternative approach is 150 times faster, and in almost every situation the alternative approach is faster by 200 down to 60 times; the bigger the offset and limit, the smaller the difference between the approaches. So this approach is very efficient for limit and offset up to about 1,000 or maybe 10,000; for most real situations, like limit 10 offset 10, there is around a 100 times difference. Because this kind of query is very, very popular (in almost every project I see problems with such queries), this approach can be very useful. Now just some final notes. Efficient execution of some really popular real-world queries genuinely requires implementing an alternative algorithm. At the same time, implementing a custom algorithm is far easier, 100 times easier, when you start in PL/pgSQL, because it is a procedural language after all. But an algorithm implemented in plain SQL usually runs faster: 2 times, 3 times, or at least 50% faster. And maybe for some reason you just don't like stored procedures. So the process is: if you have an idea for an alternative algorithm, implement and debug the algorithm in PL/pgSQL, then just convert it to a plain SQL query.
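The feed-merge algorithm of this last block can also be sketched in ordinary procedural code. This is not the speaker's PL/pgSQL or SQL: it replaces his array bookkeeping with a heap, the data is invented, and SQLite (via Python) stands in for PostgreSQL, but each step is one small indexed lookup, which is the whole point of the approach.

```python
import heapq
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blog_posts (id INTEGER PRIMARY KEY,"
             " ctime INTEGER, author_id INTEGER)")
conn.execute("CREATE INDEX by_author_time"
             " ON blog_posts(author_id, ctime)")
conn.executemany("INSERT INTO blog_posts VALUES (?, ?, ?)",
                 [(i, i * 7 % 1000, i % 5) for i in range(1, 2001)])

subscribed, LIMIT = [1, 2, 4], 10

def newest_before(author, ctime, post_id):
    """One index probe: the author's newest post strictly older
    than (ctime, post_id)."""
    return conn.execute(
        """SELECT ctime, id FROM blog_posts
           WHERE author_id = ?
             AND (ctime < ? OR (ctime = ? AND id < ?))
           ORDER BY ctime DESC, id DESC LIMIT 1""",
        (author, ctime, ctime, post_id)).fetchone()

# Step 1: one probe per author for their single latest post.
heap = []
for a in subscribed:
    row = conn.execute(
        """SELECT ctime, id FROM blog_posts WHERE author_id = ?
           ORDER BY ctime DESC, id DESC LIMIT 1""", (a,)).fetchone()
    if row:
        heap.append((-row[0], -row[1], a))  # max-heap via negation
heapq.heapify(heap)

# Steps 2-3, rinse and repeat: pop the newest post overall, then refill
# that author's slot with their previous post.
feed = []
while heap and len(feed) < LIMIT:
    neg_ct, neg_id, a = heapq.heappop(heap)
    feed.append((-neg_ct, -neg_id, a))
    prev = newest_before(a, -neg_ct, -neg_id)
    if prev:
        heapq.heappush(heap, (-prev[0], -prev[1], a))

# Cross-check against the naive sort-everything query.
naive = conn.execute(
    """SELECT ctime, id, author_id FROM blog_posts
       WHERE author_id IN (1, 2, 4)
       ORDER BY ctime DESC, id DESC LIMIT 10""").fetchall()
assert feed == naive
```

The heap always holds one candidate per author, so producing the feed costs roughly `len(subscribed) + LIMIT` index probes instead of sorting every post.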
Check whether it really helps. Thank you very much for listening to me; I hope your minds have not exploded. If you have questions — yes? [Q&A, partially inaudible] Q: Yes, this is a limitation, but couldn't these algorithms be built into the planner with a patch? A: I wouldn't hope for a patch that builds the algorithm from block 7 into the planner, because it is very complicated. Q: I understand the last one is hard, but for the previous examples, why doesn't it work out of the box? A: I try to stay with the real world, with real-world databases, because clients run real-world setups. I could discuss why certain features exist or not, but every database has limitations, queries which it cannot execute efficiently, and changing that is not easy. What I want to say is that I use these algorithms for real applications. Q: But when you write such queries manually, things can go wrong. A: It's not about a patch, it's about logic. If I ask for distinct names and they come with an ORDER BY, it's not a simple DISTINCT anymore: if the ORDER BY includes another column, the result effectively has to be distinct over that other column as well. Q: That's not really DISTINCT then. A: I'd say it technically still is. There is, of course, room for a patch, but you work with the database which is currently available, on a current project which doesn't run fast enough because of this problem, and I'm just showing how you can fix that problem now. Q: Technically, yes, it could be...
Q: I think the first optimization... optimizations 2, 3 and 4 could be implemented in the planner. A: Well, patches welcome, you know. If it could already be done differently, why would I do it this way? [partially inaudible exchange about the plan] Q: So you take the maximum... yes... and then the previous author... and then you have one probe for this author, right? A: Yes. Q: And one probe for that one too. Okay. This really impresses me, because when I looked at the first plan I expected something better; honestly, this is one of the most optimized plans — in most RDBMSes an index skip scan. It's one of the most optimized plans in any RDBMS. A: Well, it should be a skip scan, of course. Yes, it should be. Q: Okay, but maybe with two tables it doesn't survive. A: That's my whole point: it's all about limitations — limitations of the database, of the surrounding tooling, of developers. Q: It's just one plan for which access can be optimized; there are a million other plans that could be optimized. A: Yes, of course. Q: From his perspective, he's simply asking why not optimize the engine instead of doing what you are doing. A: Theoretically, yes, but nobody has the time for that optimization. Q: Then there is something interesting you should bring to the table; you should hear what the community says about it. [partially inaudible] So maybe it will get implemented someday. Surprisingly, that is apparently how another SQL engine already does it.
Q: Because you have a project right now, and even if you sponsor the development, you won't get the results until future releases. A: We'll put it on the slides. Q: Yes, of course, that would be good. A: We also posted it to the PostgreSQL community; I hope people have heard about it, and from what I know PostgreSQL gets a lot of feedback, so you will hear about it one way or another. Thank you very much. I hope it wasn't boring.