 Good evening, ladies and gentlemen, and I see Fred just tweeted about Sveta Smyrnova is ready for the challenge performance schema is 20 minutes and this really will be challenge because performance schema is big future rich and very complicated product actually. Okay, let's get started. My name is Sveta Smyrnova. I am support engineer for many years and through this years my work is becoming more and more easy to do and performance schema is actually main reason why. Today I will try to explain all the features of performance schema, all features I myself think are important, how it works, so you can, how to tune it, so you can tune it effectively, so you can get data from performance schema and you can also don't affect performance of your servers much. And finally I will show few tools, two tools which help to work with performance schema much easier. Okay, so why performance schema? Why it's needed? Performance schema is instrument for runtime performance collection and what is the difference? We always have in my school metadata collection, metadata collections and what is metadata? Metadata is schema like create tables, like tables, like databases, views, etc. It's like kind of support which doesn't change often. And runtime is information, this is resource usage, logs which are required, which queries is currently running. Metadata information always will exist in some very, very early versions of my school. First show comments, then last information schema tables. Runtime data, which you can examine, also existed in my school, but in very, very small amount of information, you can get really small amount of information. It was not perfect, it was just status variables, so you can see, for example, how many temporary tables server created, sign start, and how many temporary tables created your own session, but not more, you cannot identify where was peak, for example, for that. It's got some extension for information schema, but now it's like movement is here, so like information schema will contain data about metadata, and all about runtime will be in performance schema and actually are, and like, for example, in 8.0 to information schema in ADB logs to move it to performance schema as data logs tables. Why this change, why we are using performance schema for runtime collection? Performance schema has no design issues, which information schema has, no locking reads, no performance, almost, I put almost, because this is working QR code points to, and on slides, it has working link, it's points to current open bug report, which actually is about locking read, but I hope it will be fixed. It's optimized, it's MySQL engineering team put high effort for optimizing performance schema. When it first started, you probably read many complaints about like it work not as fast and that affects performance of the server, but they work it, and now it's almost all performance issues are gone. It's consistently extending, it's also, you may add your own extension for performance schema again, you can just go to user manual and read it, how to do it. I'm sorry, I do not like, I always have like kind of, what should I put on my slide? Long, long link, which is not readable, shorting link, which you cannot identify, which is this, or this probably QR code, so now I start with QR codes. And it's begin measure, it's begin most stable. So just like to show how it's grow, it's growth of versions. So first, it's, first the performance schema was cited in version 5.5, it has only 237 instruments, and now it's more than 1,000. I put here a Percona server, because it has a couple more instruments than MySQL server for Percona server extensions. It's what we can do with performance schema. First, it's all version 5.5, we can have insights on what's serving. It's like, it's very low level information, it's about like logs, mutics, it's about data reads. For this short talk, I put here information about IO, so like you can check for system tools, something which is written. But usually, even if it's dedicated server for MySQL, with operating system tools, you won't know what's written here. And with performance schema, you can clearly see what's written. You just query table called event weights history, or event weights history log, and you find out what's written, writes your data. Version 5.6 added information for statements, such like runtimes, you can have pretty good statistics of all your effective or not effective load. These tables called event statements are used by many graphical tools which collect data about your queries. Also, we have, in 5.6, we have event stages tables which show statistics by these things which you usually see in show process list. Like here, you may, for example, sort these tables and find out which stage takes more time and probably tune and maybe make adjustment for your server. Version 5.7, here in orange color, I put features which first introduced in performance schema. Before 5.7, we, once we created prepared statement, you could not find out if it still exists, if it's not to drop it, if it's not to be allocated, which, how good it's performed. So, with 5.7, you can see all current and existing prepared statements by querying this table. It also has same information, like in these event statements tables, like if you index use it, if other optimization used and just try it, and you find a doubt, watch it, please. 5.7 also adds information about storage routines, it's again, it's something which is, like in year 2006, I remember how people discussed on conferences, how to write even UDF to find out what storage routine doing, especially it has kind of flow control. And now, like you see, it's storage routine, it has this event name, it's called from flow control part. So, like, I didn't put definition of the storage routine, but it's, this part goes from handler part of the storage routine. So, you may find all these things, what is happening inside the storage routine. 5.7, there is also information about, like, anybody knows, like, this alter table, it's weights for metadata log, but anybody can say which thread holds the log. One, two, three. I cannot say either, actually. To find it out, I need to query performance schema metadata log table, which clearly says, like, go back here. Hour of read is number 95, which weights. And you see, it's weights, that's it's shared upgradable log, and it waits for exclusive log. And thread 96 holds the log. So, now you can just go, you can see that thread 96 actually does nothing. It's probably just not close the transaction, or you can either kill the thread or wait when the job is finished. What else, it's, like, my favorite feature, it's memory usage. For example, again, it's, like, because, like, I support engineer, I always receive these complaints, like, we are seeing something like this. But from this output, you cannot say what is my school doing. And with performance schema, you can just, like, run query and you can find out. This is, it looks maybe a little bit cryptic, but if you know my school options, it's very easy to understand, actually, like. This part, it's a join cache, it's probably a big join, big join used. This is part, it's about my NDB. So, you, like, again, like, you get usage to it, you can find it out. For replication, we have, always have pretty nice shows, status comment, issue with them. That's all the data was in the same comment, if you, like, if you have monitoring tool, you have to parse it. With performance schema, there is a different approach. It's many, many, many tables. But if you need all the data, you need to join them or still stuck with show slave status. For variables, before 5.7, it's, we have nice access for, good access for variables. But we only have access only for either global variables and status. And all, for variables and status of your own session. With performance schema, you can check all variables with all sessions. Like, for example, here, it's a pretty big joined buffer size. It's, you can find. 8.0 also has a summary about errors, like you see, like, for example, Worn-deprecated syntax, you may just check, what is your application doing and probably rewrite statements. And, actually, that's not full list, but we have limited time. Okay, how it works? It works, it's important to understand how performance schema works in order to tune it effectively, so it doesn't affect performance as a server. So, this is part of one of instrumented, the part of MySQL code, of which instruments this function. So, what's happening first? First, we call performance schema, make a call for this instrumented. So, we just tell performance schema, please start instrumenting our call. Now, we are actually doing a real call for the function. And then we call again this macro for performance schema, just telling it, please stop instrumenting, please record data. After that, performance schema stores recorded data. Here is events, performance schema has special name convention for tables, they call consumers, which stores the data. So, it starts, many of them starts with events, the name, like weights, statements, stages, etc. And then it goes currently, which is currently running history, which is last 10 per fret. And history long, last 10,000 per fret by default, but there's configurable. For events statements, it's table structure like this. And for tables, which are not events, which not store events, they have like try to make it again, it's very similar structure, very easy to understand. Return the data, it's again, you can just query it with normal SQLs, you're using MySQL, you use SQL, and you can query performance schema comfortably. Here, that's a picture, how it works, and what is important here, this one. This one is, you need to understand, that each instrumented part, it's wrapped into two calls of performance schema. And that means, for a rarely used calls, like statements or memory allocation, there is almost no performance impact. Because like, if you execute one statement, it's like two low level calls, it's almost nothing. But if this is a significant part, like for example, you are re-drawing, like you are reading 1 million rows. For this operation, you will always execute same code. So, for example, I have like kind of benchmark of table 5,000 rows. I truncated corresponding tables from performance schema before running. And what I am doing, I am just selecting all the rows. For events weights, you see that's pretty big numbers of instruments, and you see it's like, for example, this thing, it's true to instrument, this thing, it's expensive, because it's called it often. But for instrumenting statements, you don't know how effective, how it's like statistics, like if we index it, or if you need to turn the query, it's not expensive. I'm not telling you, like, you should disable all events weights instruments, but you need to think, to use them with care, and you need to don't be shy about performance schema at all. I often like see advice, like, if you have bad performance, just turn it off, but I believe this is wrong. You should turn it on, should keep it on, but just care about watching instrument. It's again, it stores the data into tables, which is again, what is important about this storage, so that performance schema must be supported by the company. So, if no developer wrote code for performance schema, it's not instrumented. It collects data only after it's enabled. So, like, if you started a server with memory instrumentation disabled, receive it memory leak, want to instrument it, you won't receive information right now, because memory allocation already happened. So, you have to turn instrumentation before the event happens. And after data storage into consumers, they never return it back. Okay, so how to tune performance schema? These are defaults for versions. And there are two places where you can tune performance schema. These are configuration options or setup tables. Startup configuration is always, like, performance schema on. It's now its default and recommended option. And then limits number of table sizes and number of instrumented instances. They are not, these options, not dynamic. You have to change them during restart. And also you can enable consumers and instruments, which you want to be at startup. You see, it's like a little bit different syntax, and I put it in orange color, so you must, it also supports wild cards, so you don't need to type much. For on-time configuration, there are setup tables. And you can, again, you can set up in various details. For example, you can just tell, you want to enable instrumentation only for particular user. Or for particular thread, for example, this is user connections, this is thread IO thread. You can just enable instrumentation only, like, for example, for tables or for events, and don't have it for something else. Also, these are enable consumers, which to store, which information again you want to store. These are instruments, which to instrument. You see, it can be time. It means it shows how much time. It was running. It can be enabled, disabled. Okay, so how to make it looks probably a little bit complicated for make it work with performance schema easier. There are two options. It's C schema. C schema goes, it's written originally mark lead. And now it comes as independent project. It goes to the version 5.7. It's practically views on performance schema tables and storage routines. It's just shortcuts for typical use cases. You do not mix this schema with something that doesn't add anything new. It just makes it much easier to work with performance schema. It's started. And just like to show you more useful views, it's views, which help you to find out which your statements are not effective and this is for memory problem. And like just example, here we are querying performance schema around tables. And much nicer with C schema. Same with configurations. So like you can just call storage routines to enable consumer instrument. Same for disabled, same for show. And C schema as well as performance schema supports wildcards. So you can use wildcards here as well. Another tool is called Percona monitoring head management. It's a tool, which is made by Percona. It has performance schema dashboard. Unfortunately, it doesn't use all the performance schema. But for this event weights, which is like locks, mutixes, IO operations, it has very nice graphic view over time. For example, you see here are some IO issues. You see like big peaks and you may probably want to know what happens here. And we can just move our mouse here and see like it was in a db data file read. Then we compare it with another point of time, we see there is no such reads happening. And again another point of time, it's another peak, but smaller. Okay, so here is more information on performance schema, which didn't exist in this 20 minutes. Here is my slide, share it with your GitHub and share I will put slides after the conference. Thank you.