Actually, it's okay, we can start earlier. This is a pretty information-heavy session, so let us start early. We have dozens of slides, so it's pretty dense. I will do the first part, and Kevin will be responsible for the second half. Starting with my part: I won't go into great detail due to time limits. We have recorded a video, and if you're interested you can follow us at Ben Shushu Linux and follow our communities too.

About kdump: well, this is not an official study, just a chat. Personally I think it is pretty useful, especially for production work. AI and cloud are the buzzwords now, but the foundation of these platforms is Linux. The building blocks are still Linux, and many organizations still need support for Linux. If you're using Linux, that means you're likely to run into panics and crashes. There are panics of different kinds, so today I'm going to talk about that. I think it's the right topic for today.

At the beginning I would like to spend several minutes on x86_64 and ARM64. These two architectures are the mainstream, so I picked one of them to focus on today. What we talk about is a mechanism, or a methodology; once you get the hang of it, you will understand the other architecture better. I have a lot of slides, so I'll skip the self-introduction and the Ben Shushu introduction.

We talk about practicing Linux crash and panic issues on production and cloud servers. For my part, I would like to introduce kdump and crash. I'll give a brief introduction, and we have nine practical examples. I will briefly go through eight of them; the ninth one is very complicated, and Kevin will be responsible for it.

So why do we need kdump? Linux has been developed for 28 years. It's pretty strong, pretty robust. Have you ever seen a black screen in Linux? Well, I think it's not
frequently seen, but if we deploy Linux at large scale, then a black screen becomes possible. So we still have to consider the reliability of the system. If you run tests, you will find that the whole system can crash. Why did the system panic? That is why we need kdump.

What is kdump? Kdump is what we use when there is a panic: it takes a snapshot and saves it, so that later we can reconstruct the situation and find clues. It's like what the police do at a murder scene: we preserve the crash scene for later investigation. Maintainers and developers need it, especially for embedded systems and smartphones; those developers, and the people working on product optimization, also need kdump, especially on physical devices. When your system is not responding, we use kdump, so it's pretty handy. A CPU or bus problem might cause a panic, or the system may not respond at all, and that is where kdump comes into play: it provides you with a very good snapshot for later investigation. If you work with Linux, whether in development, operations and maintenance, or at a cloud provider, all of these people have to know kdump well.

This is the structure of kdump. When we have a crash, there are two kernels: one is called the production kernel and the other is the capture kernel. We use kexec to switch to the capture kernel when the crash happens; it takes a snapshot of memory and writes it to a local or network disk.

How can we trigger it? There are multiple ways. You can do it manually, or you can have the system do it: you can configure a panic on a kernel oops, a watchdog, a soft lockup, or out of memory, and so on; they all work. If some task in the system is blocked, a commonly seen call trace is the one for "blocked for more than 120 seconds". It's probably a deadlock or something similar. If you're using kdump, then you can solve this problem better, so you
would know who holds the lock and who is waiting on it; we use kdump to get this information.

If we want to set up kdump, we deploy it on a server. My recommendation here is an x86 server: it's pretty new, pretty solid, and not easy to break. We can also do it on an ARM64 platform; ARM64 is commonly used now. We have an experiment platform with Linux 5.0 on ARM64, and this embedded platform lets you run all the tests you want.

How do we trigger kdump manually? This is how to do it. After kdump triggers, this figure shows how makedumpfile writes out the memory dump file, and this is using the crash tool to open the vmcore. When we're analyzing dump data, crash is a pretty handy tool; it has dozens of commands that you can use.

Starting from here are the examples. We have nine listed; due to time, I'll go through several. The first is a pretty simple oops error. This is the whole analysis process, and this is how we analyze the call trace. For example, this is the first parameter, and this is the kernel source for the crash; you can read it after the session if you're interested. This is how we run the analysis, so due to time I'll skip ahead, and this is how we check the second parameter.

Experiment two is about accessing a list entry that has already been deleted, and reconstructing how the panic happened. Experiment three is like experiment one: it's about accessing a nonexistent page, which is a bug. Also, when we're developing a driver and use a mapping like this, this is what happens; basically we're creating an issue. Experiment five adds one parameter compared with experiment four: say the hardware is not in the expected or ideal state, then there is a deadlock, and kdump can capture the information. Experiment six is about computing values on ARM64, and we have to analyze
the function call stacks, so it's better for you to understand how the stack frames relate to each other; you can read through the ARM64 handbooks. This is the ARM64 function calling convention, and this is how the stack is organized. In this figure, these are the individual stack frames; each has an FP and an LR, and this is how they are stored. This is how we do the computation: we have several equations and these preconditions, based on which we can calculate the relationships in kdump. Due to time limits I'll skip this; it's how we work out the function call relationships.

Experiment seven is like this: how can we know where each parameter is located in each stack frame? This is very important for on-site analysis. First we have to understand ARM64 and its stack; we also have to deduce the parameters by working backward, and we have to calculate the location of each stack frame. This is before and after the experiment. This experiment is about how to obtain the second and third parameters and their exact values. This is from my previous slides: how we get the backtrace, find the disassembly, and locate the second parameter. And this is the ARM64 stack layout. I only have five minutes, so I'll skip.

Experiment eight is pretty complicated and is loosely related to what Kevin is about to talk about: how can we use kdump and the stack to get the values of parameters, so that we know who the holder of the lock in a deadlock is? Kdump is the key to solving the deadlock; we can get basically every piece of information we need in the analysis. This is an experiment I developed myself. One thread takes the read-write lock, mmap_sem, and here another thread also needs to take the read-write semaphore. If the two threads run simultaneously, there is a crash. How do we analyze this example? Briefly, since this is pretty lengthy, look at the ps output.
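
The frame layout described above, where each ARM64 stack frame stores an FP and an LR, can be illustrated with a small sketch. This is a toy model only, not the crash tool's implementation: it assumes the standard AArch64 frame record (saved FP at the address FP points to, saved LR 8 bytes above it), and the `memory` dictionary and all addresses are made-up values standing in for what you would read out of a vmcore.

```python
def walk_frame_records(memory, fp, max_depth=64):
    """Follow the chain of AArch64 frame records starting at fp.

    memory maps an address to the 64-bit value stored there.
    Returns the saved LR of every frame, innermost first.
    """
    return_addresses = []
    for _ in range(max_depth):      # guard against a corrupted, looping chain
        if fp == 0:                 # a zero FP marks the outermost frame
            break
        previous_fp = memory[fp]    # [FP]     holds the caller's FP
        saved_lr = memory[fp + 8]   # [FP + 8] holds the return address
        return_addresses.append(saved_lr)
        fp = previous_fp
    return return_addresses


# Hypothetical stack contents: three nested frames.
memory = {
    0x1000: 0x1040, 0x1008: 0xA0,   # innermost frame
    0x1040: 0x1080, 0x1048: 0xB0,
    0x1080: 0x0,    0x1088: 0xC0,   # outermost frame
}
chain = walk_frame_records(memory, 0x1000)
```

Once the frame addresses are known this way, the slots between two consecutive frame records are where saved registers and spilled parameters have to be looked for in the disassembly.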
We have four threads in the uninterruptible state, and we can analyze them one by one. Mainly we use the disassembly to deduce the clues we want, and you will find where each clue is located in the different stack frames. Then you analyze the locks, and you have to know the locks very well. If you don't have a good understanding of the read-write lock, you can read the blue bubbles. I will skip this part due to time limitations.

This is the analysis of the test threads. This is the call trace; we have the thread's PID, which we have circled, and this is how the test thread gets the lock. We analyze the disassembly, and we also use features of ARM64: for example, ARM64 has the registers x19 to x28, whose values are saved on the stack, and we use those saved values for further analysis. I have two minutes, so I'll skip. There is more further down: we analyze each thread one by one to see the exact location of the lock. Later Kevin is going to talk about the same kind of thing, so, sorry, due to the time limit I'll skip these parts, and you can read the figure on site. These are my self-developed threads and how they end up in a deadlock.

Briefly, to summarize: kdump can help us solve the black screen issue. You have to know the crash tool and how to use it, you have to learn the basic analysis process using crash, and you have to locate parameters through the disassembly and solve problems that way. Then there will be no more black screens for you. Now let's invite Kevin, an expert on the kernel, who will talk about a kernel-level deadlock issue and how to solve it. Welcome.

I would like to introduce an issue we have encountered where NUMA balancing crosses over with KSM. It's very interesting but very complicated, so I won't go deep into things like assembly; I will talk about our approach to debugging. I hope you can learn a lot from it, so let's continue. So, this case: if you want to see the detailed case description,
yes, it is provided in the material; I have recorded the whole process in it, and you can click it open and view it yourself.

So the problem is this: one of our biggest clients has a machine with over a hundred CPUs and 512 GB of memory. Numad is enabled, and there were soft lockups. Yes, numad is enabled and there were lockups, so I wanted to go inside and check. If you use kdump and look at the vmcore, you can see that the lockup is in the smp_call_function path. If you see this line, it means it is stuck on an IPI, because a TLB flush is unlike an ordinary operation: it uses an IPI. You can see from this line that it is stuck. But why would it get stuck after the IPI is sent? Because it is waiting for the other CPU. For example, I am CPU 0 and there is another CPU, CPU 1. I ask CPU 1 to flush its TLB, and I wait for it to finish so that I can continue running. But it is stuck here, and there is no response from that CPU. This is what you can see from the dump.

When you investigate, you can see that on the Linux side people have been discussing this for about five years. They have seen lost IPIs, and many people were involved. They gave us some configurations, and we asked the client to test them. We searched for a long time but did not come to a conclusion about why the IPI goes missing; we wait for it for a long time and it's missing. So I wrote a patch: I send the IPI and wait for it, and if it doesn't come back, I send the IPI again. If the one I sent hasn't come back, then I send it again. But the problem was not what I had thought. When you are debugging, you can see a lot of discussion going on, but you should just trust your eyes. When you find the IPI missing, you need to ask yourself whether you are right, because the probability of a lost IPI is that low, but
why do I have this kind of problem? So maybe I was wrong: I shouldn't be looking at the VM, I should be looking at the host. The client won't give us very detailed information, and with some clients, what they tell you is not true; you have to analyze for yourself whether it's true or not. But the client depends on you for help. So I turned to the host side to check it. This can be reproduced on a test machine and verified there, so it is not a production-only problem. In the test environment, after we enabled numad, it showed up after two to three days. Yes, that is pretty fast.

Once we reproduce it, how do you get a vmcore? You can echo c to sysrq-trigger and the kernel will crash and kexec will start; or, if you enable panic on lockup, it will panic by itself and you will find the vmcore. So I've got the vmcore, but how do I find what's in it? How do I know who is busy? I use backtraces. Under crash there is a command for this, foreach bt. You dump its output into a log file, write a small script, and filter out all the idle ones; from what remains you can see who is idle and who is still working. What we want to find is who is constantly working and who just sits idle, because in some cases a thread takes a lock and then sits idle while the whole system is waiting for that lock.

So I found some victims. The first is ksmd. KSM, as you know, is used for memory management: if you want to save some memory, you take all the identical pages and merge them together, so you can run more VMs on the host. That's being done by a lot of vendors. Ksmd was blocked on the rwsem, similar to the one Ben has mentioned. So this is the first victim that is blocked. Another one is khugepaged. What is a huge page? With huge pages, TLB misses become fewer and fewer.
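
The filtering step described above (dump every backtrace from crash into a log file, then script away the idle ones) might look roughly like the sketch below. It is only an illustration: the sample text is invented, the parser assumes the usual `PID: ... TASK: ... CPU: ... COMMAND: "..."` header and `#N [addr] function at addr` frame lines that crash prints for `bt`, and the list of function names treated as "idle" is an assumption you would adjust for your own kernel.

```python
import re

HEADER = re.compile(r'^PID:\s*(\d+)\s+TASK:\s*(\S+)\s+CPU:\s*(\d+)\s+COMMAND:\s*"(.*)"')
FRAME = re.compile(r'^\s*#\d+\s+\[\S+\]\s+(\S+)\s+at\s+')

# Assumption: a backtrace containing one of these names is an idle CPU.
IDLE_MARKERS = ("cpu_idle", "do_idle", "cpuidle_enter")


def split_backtraces(text):
    """Split saved 'foreach bt' output into one record per task."""
    tasks, current = [], None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            current = {"pid": int(header.group(1)), "cpu": int(header.group(3)),
                       "command": header.group(4), "frames": []}
            tasks.append(current)
            continue
        frame = FRAME.match(line)
        if frame and current is not None:
            current["frames"].append(frame.group(1))
    return tasks


def busy_tasks(tasks, idle_markers=IDLE_MARKERS):
    """Keep only tasks whose backtrace has no idle-loop function in it."""
    return [t for t in tasks
            if not any(m in fn for fn in t["frames"] for m in idle_markers)]


# Invented sample, shaped like crash output; addresses are placeholders.
SAMPLE = '''\
PID: 0      TASK: ffff880000000001  CPU: 1   COMMAND: "swapper/1"
 #0 [ffff880000001000] cpuidle_enter_state at ffffffff81000001
 #1 [ffff880000001010] do_idle at ffffffff81000002
PID: 77     TASK: ffff880000000002  CPU: 0   COMMAND: "ksmd"
 #0 [ffff880000002000] smp_call_function_many at ffffffff81000003
 #1 [ffff880000002010] flush_tlb_page at ffffffff81000004
'''
remaining = busy_tasks(split_backtraces(SAMPLE))
```

In practice you would read the log file written from the crash session instead of `SAMPLE`, and probably extend the marker list so that tasks that are merely sleeping are filtered out as well.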
The probability of a miss is reduced; you can check it yourself later. If you enable it and set it to always, it keeps checking whether there are contiguous pages; if there are, it merges the 4K pages into a 2M page and maps it at the PDE level. While it was scanning these pages, it got stuck on them, and from this victim alone you cannot tell why.

The third victim is QEMU. This is virtualization: when it is running, it goes through the KVM run call and enters the guest. But something happened: a VM exit occurred, a two-dimensional paging fault, a page fault for the VM. When it tried to handle it, it got stuck. So ksmd, khugepaged, and QEMU: these are the three victims. Now we want to find who the killer is, who holds the lock.

This one is the most important part. The client told me that something happened when numad was enabled. In the beginning you don't think too much about it; you may think NUMA has nothing to do with it, but it does. We found that this backtrace is in NUMA code. What is it for? It's automatic NUMA balancing. It checks whether an access is a remote access; if it is, it moves the page to the local node to reduce memory latency. This is its backtrace. It found this page, which happened to be a KSM page, and it wanted to migrate it. Yes, NUMA balancing wants to migrate pages, and it happened to find a KSM page. But there is an issue during this process. When migrating, you move the page from one node to another, so you need to change the PTEs that map it, and if you change a PTE, the cached TLB entry needs to be flushed so that it is refilled from the page table, step by step. And this is the conclusion we just mentioned: we want to migrate this KSM page, and it has over 200 pages, so they cannot
be copied in one single operation. Yes, over 200 pages, and they are merged into one page. I will skip the details; we have already uploaded our PDF, where you can see my disassembly analysis. It would take two to three hours if I went into detail on how to recover the backtrace arguments; it is very complicated.

So how do we solve this problem? You need to understand the trees. KSM has two trees: one is the unstable tree and one is the stable tree. When we merge pages, if a page is not stable you cannot merge it immediately, because it may change again right away. So we place it in the unstable tree first, and after it has been scanned twice, if it has not changed, we place it in the stable tree. So we need to look at the stable tree. We can see three identical pages here; they are merged into one and put in the stable tree, and all the PTEs point to that one page. The chain has over two million entries, so if you want to migrate the page, we need to send IPIs for all of them. With so many, it takes a lot of time. I have a list of the crash commands; just follow these steps and you can see it too.

So let me explain KSM and NUMA balancing. What is NUMA balancing about? You can see in this picture: say process A was put on CPU 0. If process A accesses memory on another node, that is a remote NUMA access, and the latency becomes longer. The pages process A needs should be put on the local node, and NUMA balancing does this; it tries its best to migrate them back. Now say the process is on CPU 4 but, as you can see, the pages it accesses are on two nodes. If there are too many pages on one node, I should not just migrate pages to node 1; it may be better to schedule the process onto another CPU in node 0. So NUMA balancing has one more way: besides migrating pages, it can migrate the process. But that's not so easy, because if you migrate the process, what about the CPU run queue and its load?
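
The two-tree idea described above (a page waits in the unstable tree and is only promoted to the stable tree once a later scan finds it unchanged) can be modelled with a toy sketch. This is a deliberate simplification and not the kernel's code: the class name `ToyKsm` is hypothetical, plain dictionaries stand in for the kernel's trees, SHA-1 stands in for the kernel's page checksum, and pages are just byte strings keyed by a made-up page id.

```python
import hashlib


class ToyKsm:
    """Toy model of KSM's unstable/stable trees (dicts instead of trees)."""

    def __init__(self):
        self.last_checksum = {}  # page id -> checksum seen on the previous scan
        self.unstable = {}       # content -> one candidate page id (rebuilt per scan)
        self.stable = {}         # content -> set of page ids sharing one merged copy

    def scan(self, pages):
        """Run one full scan over pages, a dict of page id -> bytes."""
        # Drop sharers whose content no longer matches the merged copy.
        for content in list(self.stable):
            self.stable[content] = {p for p in self.stable[content]
                                    if pages.get(p) == content}
            if not self.stable[content]:
                del self.stable[content]
        self.unstable = {}
        for page_id, content in pages.items():
            if content in self.stable:          # already merged: just join it
                self.stable[content].add(page_id)
                continue
            checksum = hashlib.sha1(content).hexdigest()
            changed = self.last_checksum.get(page_id) != checksum
            self.last_checksum[page_id] = checksum
            if changed:                         # new or volatile: wait a scan
                continue
            partner = self.unstable.pop(content, None)
            if partner is None:
                self.unstable[content] = page_id
            else:                               # two unchanged twins: merge
                self.stable[content] = {partner, page_id}


pages = {1: b"A" * 8, 2: b"A" * 8, 3: b"B" * 8}
ksm = ToyKsm()
ksm.scan(pages)                    # first pass only records checksums
merged_after_first = dict(ksm.stable)
ksm.scan(pages)                    # second pass merges the unchanged twins
```

The size of each set in `stable` corresponds to the length of the sharing chain discussed in the talk: every member is one more mapping that has to be visited, and its TLB entry flushed, when the merged page is migrated.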
Will the load be affected? Is there enough capacity? But that's not the key point. I want to make this clear: when numad is enabled, there is NUMA balancing, and the page happens to be a KSM page. When the pages merged together are over 200 MB, it needs to do TLB flushes, and that causes the performance issue. You can follow this kind of performance issue with something like Brendan Gregg's CPU flame graphs: the IPI, the TLB flush. When I view the flame graph, you can see NUMA, you can see migrate pages, migrate pages, and yes, these are the key points. You can see the bar is too long in the walk. What is the walk? It's rmap, the reverse mapping: it walks every PTE, and for each one it does the IPI. This one is here, and the total is here.

So what can we do? You want to shorten it. Upstream did not have code for this. In 2015 Andrea Arcangeli proposed it, but it was rejected by Hugh Dickins. He wanted to shorten the time. You can see our discussions: we sent the flame graph to him, and after a few months the KSM merge list was shortened. That's the whole story. Ben Shushu asked me to come here, and he knows this: you can refer to this material, because nothing you can find online tells you how to do this in detail.

That's the end of my presentation. Any questions? The time is already up; it's ten past two, so time is limited. If you have any questions, you're more than welcome to talk to us privately.