Good afternoon, my friends. My name is Milind Gandhi, and I work at Red Hat as a senior software maintenance engineer in the production support group for RHEL. In the next 15 minutes we are going to talk about some of the common pain points we encounter while doing vmcore analysis, and how the PyKdump framework helps make that work easier.

The PyKdump framework I just mentioned was developed by Alex Sidorenko, who works at HPE. He has been developing and maintaining it for a long time, since the early 2000s, and I joined as a contributor a couple of years ago. It is used quite extensively within Red Hat and HPE, and by a couple of partners such as QLogic as well. Initially the plan was to present this talk together with Alex; we created all the material together. Unfortunately, because of his travel plans he couldn't make it to Brno this time, so I am presenting alone.

Starting with the basics: what is vmcore analysis, or dump analysis? A dump can be obtained when a system has a working kdump configuration, and it allows you to carefully review all the minute details present in the data structures within your kernel: structures, unions, linked lists, spinlocks, mutexes, IRQ status, CPU and memory information, and many other vital pieces of information. What we have seen many times in the support organization is that without a vmcore captured at the time of the issue, it is really hard to deterministically pinpoint the root cause of a problem and find a potential fix. And that's not only for crash, panic, or hang scenarios; even for performance issues, a dump collected at the time of the issue provides a lot of useful information. For example, there could be processes waiting for a lock that is already held by another process, which is itself waiting for something else. A dump collected at the time of the issue lets us review the lock status and see which process is holding which lock.

The standard tool for dump analysis is crash; everyone in the room knows the crash utility quite well. It provides an extensive set of commands and options, but to use them effectively you need detailed knowledge of the structures and unions involved, in other words internal knowledge of the kernel subsystems, be it the SCSI subsystem, memory management, or networking. If you want to dig into the vmcore and pull out structure and union details, process and thread information, and so on, you have to know the kernel internals and the relevant code paths; then you can use the crash commands effectively and pull the right set of information out of the dump easily. That makes life easier, but for L1/L2 support staff, for kernel newbies just starting with kernel development, or for system administrators who are used to administration work and not very familiar with kernel internals,
it becomes quite hard to use those commands effectively, and we might end up spending a lot of time just pulling basic information out of the vmcore before we even get to establishing a root cause and finding a potential fix.

When I say basic information from the vmcore, it could be as simple as LVM volume information. On a running system all you have to do is run lvs and you get your LVM volume information; now, how would you do that from the dump in crash? Or multipath information: people who work with SAN storage multipathing can just run multipath -ll and it lists the multipath devices, the underlying SCSI paths, the I/O routing mechanisms or path-selection algorithms, and so on. Getting the same information out of a vmcore might take hours of investigation, and it also requires internal knowledge of the structures and unions.

In those situations it is desirable to have a facility to extend the crash environment with an additional set of commands that pull out all this information quickly, so we can spend our valuable time actually identifying what went wrong instead of spending hours extracting the basics. In fact, this PyKdump framework is used quite extensively at Red Hat: every vmcore uploaded by a customer to the Red Hat customer portal goes through checks performed by this framework. Every support engineer can readily use those reports and verify whether something matches a known Bugzilla. It really reduces the time required for vmcore analysis at Red Hat, and other partners use it in the same way.

For kernel hackers and developers it is easy to pull this information from a vmcore, because they know the internals of those subsystems. But even for them, doing this manual review repeatedly, for many vmcores, on a daily basis is cumbersome; we can't keep extracting the same information from vmcores by hand every day. We automate so many things with Ansible for other purposes, so why not automate these vmcore analysis checks as well? That is where the PyKdump framework helps you automate vmcore analysis. If a kernel developer working on a particular Bugzilla knows that a particular code path leads to that bug, that certain structure members hold particular values, or that locks are being handled in a particular way, they can program all those checks into programs written with the PyKdump framework, so that when the programs are executed the checks are performed automatically and the user is notified that this could be the potential bug, perhaps with the Bugzilla number flagged. That reduces analysis time quite dramatically.

Now, crash is written intelligently enough that it already allows you to extend the environment: there is a way to create modules, or crash extensions, written in C, which can be compiled and dynamically loaded to extend the crash environment.
It's just a matter of using the extend command with the path to that extension, and you get the additional set of commands right there in your environment. Quite a few extensions have already been developed; they are nicely documented on the crash project page. In addition to those, there are crash extension languages, which we'll look at on the next slide.

These are the two most popular crash extension languages. One is EPPIC, the Embeddable Pre-Processor Interpreter for C. We had crash extensions written in this language at Red Hat for quite some time, and eventually we started running into the limits of the flexibility it provided. We had to maintain different crash extensions for different kernel versions: one extension compiled and usable for RHEL 5.6, then another for RHEL 5.8, for RHEL 6.2, 6.4, and so on and so forth for 7.1 and 7.3. Maintaining all these extensions for different releases was quite cumbersome; every now and then a support engineer had to work out which extension to load. EPPIC is good for small projects, and you write it in C, but once a crash extension project grows large, implementing all these programmatic checks in EPPIC becomes cumbersome.

The other is PyKdump, the main focus of this session. It provides Python bindings to GDB and crash internals. I would say the beauty of the Python programming language is that programmers do not need to formally learn Python first: you can just open a Python program, start reading it, understand it, and go on modifying it. In fact, I didn't know Python before I started working on this project; I simply opened the programs and started working on them.

As for the features of PyKdump: you can split your crash extension across multiple files, and the time required to load all those files is quite reasonable; it doesn't take long to load the modules. It is based entirely on Python 3, so all the powerful features you already have in Python 3 are ready to use within the extension language itself. It uses the robust error handling mechanism derived from Python 3, and it provides the ability to execute crash commands, with their options, from within your extensions: you get the command output back as a string, and that string can be processed further in your Python program, so you can format a nice report for users based on the checks your programs perform.

There are some more features of PyKdump that make it a premier choice for writing crash extensions. The same extension can be used for multiple different Linux versions. The current stable and usable build of the PyKdump binary is available from its upstream project page, and it is used within Red Hat and by a couple of other vendors. It can process vmcores collected from RHEL 5 and RHEL 6 with all their minor releases, RHEL 7 with all its minor releases, Fedora, Ubuntu, SUSE, OEL, any other major Linux distribution, and the upstream Linux kernel; the same binary processes vmcores from all these kernels. How it does that is the beauty of this framework, and we'll discuss it in the next slides.
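To make this concrete before moving on, here is a minimal sketch of what a PyKdump script can look like. The file name is hypothetical, and I am assuming the script is run from the crash prompt with the epython command that the extension provides; the two API calls shown are the real PyKdump ones discussed later in this talk.

```python
# hello_vmcore.py -- a minimal PyKdump script (illustrative sketch).
# After loading the PyKdump extension into crash, run it with:
#   crash> epython hello_vmcore.py
from pykdump.API import *

# Read a global kernel symbol directly as a Python value.
print("jiffies =", readSymbol("jiffies"))

# Execute any crash command and get its output back as a string.
print(exec_crash_command("sys").splitlines()[0])
```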
With all of this, developers get a chance to automate their skill at extracting the right set of information from the vmcore. Next time they don't have to do the analysis manually: they just run the program, the analysis is done, and a report comes out pointing at the potential bugs or race conditions in the kernel.

Now let's talk a little about the PyKdump framework design and how it achieves all these features. You might be wondering how the same binary works on RHEL 5, RHEL 6, Fedora, Ubuntu, SUSE, other Linux distributions, and the upstream kernel. Basically, it initializes Python objects based on the C structures and unions present in the Linux kernel source code. Once a Python object is initialized, its attributes correspond to the member variables of those structures and unions. So as soon as the object is initialized, all you have to do in a Python program is use it as a normal Python object, with the attributes representing the structure members; from that point on it is just normal Python programming. Beyond that, the C data types are mapped to the corresponding Python data types, for example a C integer maps to a Python integer, and the operators in C are mapped to similar operators in Python. We'll see a demo at the end of this presentation, which should make it easier to understand how all these features work together when we actually do an analysis.

There is one caveat to mention while talking about operators. C provides two very powerful member-access operators, the dot and the arrow: one dereferences a pointer, the other accesses a member embedded directly in a structure. Python has only the dot. For example, take a dereference chain in C: a pointer ptr to a structure that has a pointer member a, which leads through embedded members b and c to a further pointer d, written in C as ptr->a.b.c->d. In the PyKdump framework the same chain is simply ptr.a.b.c.d. The framework is intelligent enough to identify whether it is dereferencing a pointer or accessing a member embedded directly in the structure. Any doubts or queries so far? Yeah, thank you.

So, we discussed that, based on those mapping rules, the PyKdump framework initializes Python objects from the corresponding C structures and unions. To ease the analysis and initialize these Python objects, the framework provides a set of built-in routines. The set is quite extensive, and in the interest of time I'm not going to mention all of them here; that would go beyond the scope of this session. The two most important PyKdump API functions are readSU and readSymbol. readSU, as the name itself suggests, reads a structure or union from the vmcore and initializes the corresponding Python object; the API call is directly usable within a program written for PyKdump. readSymbol gets a Python object corresponding to a kernel object defined as a global variable in the C code. The global symbol might be a string, an integer, or a structure; based on its type, readSymbol automatically initializes a Python object, integer, or string.
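Here is a small sketch of both ideas. It uses init_task, the kernel's global struct task_struct, so these reads should work against most vmcores; treat it as illustrative, not as a fixed recipe.

```python
from pykdump.API import *

# An integer-typed global comes back as a plain Python int...
jiffies = readSymbol("jiffies")

# ...and a structure-typed global comes back as a structure object.
init_task = readSymbol("init_task")   # the global struct task_struct

# Attribute access stands in for both '.' and '->' from C.
print("pid=%d comm=%s" % (init_task.pid, init_task.comm))

# Embedded members chain naturally; C's init_task.tasks.next
# (a struct list_head member) is just:
print(init_task.tasks.next)
```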
And once the object is initialized, processing it is all the job of a normal Python program. Here are a few more quite powerful built-in functions readily available in PyKdump. exec_crash_command, as the name itself suggests, executes a crash command in the background and returns its output as a string, ready to be processed within the Python program so you can provide a nice report to the users. Again, sometimes, suppose a vmcore was collected from a system that had thousands of processes running: if you just run ps -m, it might take a couple of minutes to complete and list the information about all the processes. In those situations we can even specify a timeout, so that we simply time out instead of waiting on those checks forever.

Another one is the enum-info routine, which pulls enum information out of your vmcore. Enums are used quite extensively within the Linux kernel to represent the state of, for example, a particular device. Take a SCSI device: it uses an enum to represent its state, with values like SDEV_RUNNING, SDEV_BLOCK, SDEV_OFFLINE, SDEV_TRANSPORT_OFFLINE, and so on. All of that can be retrieved easily with this routine. member_size is used to inspect the size of a member variable within your Python program, and member_offset to inspect the offset of a particular variable within a structure or union.

This is the syntax of the readSU (read structure/union) API call that we discussed on the previous slides. The first argument is the name of the structure we want to retrieve; the second argument is the address to retrieve it from. It returns an initialized Python object that we can start using within a Python program. This example should make it pretty clear. Here are the manual steps for retrieving the hd_struct member within struct gendisk. Those of you who are familiar with the Linux block and SCSI subsystems, and have been doing vmcore analysis for some time, know that gendisk is the fundamental structure holding the disk device information. This gendisk has an embedded hd_struct member named part0. In the crash environment, all you have to do is print struct gendisk with the address of the gendisk structure and the part0 member, and you get the address of the hd_struct. Many people here are familiar with that command.

To do the same thing in a Python program written for PyKdump, all we need is the address of the gendisk. For example, if you have a block_device structure, it has a pointer to the gendisk called bd_disk, and that pointer can be used within the Python program just as it is. Here, the Python object hd_struct is initialized using the readSU API call, passing as the first argument the name of the structure we want to retrieve and as the second argument the address to retrieve it from, which is bd_disk.part0: bd_disk is the gendisk pointer, and part0 is its embedded member. So in just one statement we have a Python object initialized for this hd_struct member variable, ready to use in our programs. And to make things even clearer: if you want to print that hd_struct object, all you have to do is a normal Python print statement. Just print it, and it will print out its address.
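Consolidated as a short sketch; bd_disk is assumed to have been obtained earlier, for instance from a struct block_device, and the address cast mirrors what the Q&A near the end of this talk discusses.

```python
from pykdump.API import *

# Assumption: bd_disk is a 'struct gendisk *' obtained earlier, e.g.
#   bdev = readSU("struct block_device", bdev_addr)
#   bd_disk = bdev.bd_disk
# part0 is the hd_struct embedded in gendisk on RHEL 6/7-era kernels.
# Talk-era scripts wrote long(...) around the address (see the Q&A near
# the end); on Python 3 int() serves the same purpose.
hd_struct = readSU("struct hd_struct", int(bd_disk.part0))

print(hd_struct)             # prints the object together with its address
print(hd_struct.nr_sects)    # members are plain attributes
```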
If you want to print its member variables, just write hd_struct dot the attribute name, that is, the member variable name, and you get its value. And you can use the usual formatting options in the print statement to print in hexadecimal, binary, decimal, whatever you prefer.

Here is one more example, for the built-in that executes a crash command in the background. This is, I would say, an ugly example that we used a long time ago, but it still serves the purpose of explaining this facility. sys is the fundamental command in crash; it gives you the basic information about the dump, such as the panic string, the load average, when the dump was collected, the number of CPUs on the system, the tasks running, and so on. You can execute it with exec_crash_command, passing "sys" as the command, parse the command's output as a string, and match the RELEASE line within that output to determine the kernel version: if it's 2.6.32 it's a RHEL 6 kernel, while 3.10.something means RHEL 7, and so on. It's as easy as that. We are not using this particular check in the current version of PyKdump; it's just an example to demonstrate how to execute a crash command in the background and parse its output as a string.

Linux kernel development is rapid-paced, and there are a lot of changes happening all the time. Chances are that the definitions of various structures within the Linux kernel will change over time. There will be very good reasons for that, justifications, mailing list discussions and all of it, and eventually those structure definitions are bound to change for good reasons. But if we write our crash extensions with one particular structure definition in mind, and that structure changes over time, our extension might stop working with those dumps. In the PyKdump framework we have an API called member_size that we can use to verify whether a particular structure has a member variable with a particular name and a valid size. If that member exists, we go on processing that variable; otherwise we look for another match, or modify our programs to look for another match corresponding to the new structure definition. This is how the same binary works with different kernels, be it SUSE, OEL, Fedora, Ubuntu, and all the rest.

Here is a better way to explain it. This is from the latest upstream code, the elevator.h header: the definition of struct elevator_queue contains a pointer to struct elevator_type, and the name of that pointer is simply type. Before that change, the code looked different: there was a pointer named elevator_type. A particular commit changed its name to just type. If we had written our crash extensions with the old pointer name in mind, chances are the extension would start failing when it doesn't find that variable in your vmlinux. We can use the member_size API call to intelligently verify which member variable name is present inside the structure, as in the sketch below.
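A sketch of that fallback check, with the try/except pattern the talk turns to next appended at the end. Here eq is assumed to be a struct elevator_queue object obtained earlier, and addr a hypothetical address of a possibly stale SCSI command; the two member names are the real old and new ones from elevator.h.

```python
from pykdump.API import *

# member_size() returns -1 when the named member does not exist.
if member_size("struct elevator_queue", "elevator_type") != -1:
    elv_type = eq.elevator_type          # older kernels
elif member_size("struct elevator_queue", "type") != -1:
    elv_type = eq.type                   # newer kernels: renamed to 'type'
else:
    # Neither name exists: tell the user instead of dying mid-analysis.
    print("WARNING: no elevator type pointer in struct elevator_queue")
    elv_type = None

if elv_type:
    print("elevator:", elv_type.elevator_name)

# Exception handling, discussed next: wrap risky reads in try/except so a
# freed or corrupt structure yields a warning rather than aborting the run.
try:
    cmnd = readSU("struct scsi_cmnd", int(addr))
except Exception:
    print("WARNING: struct scsi_cmnd at %#x does not look valid" % addr)
    cmnd = None
```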
The member_size function, if it identifies a valid size for that particular variable within a structure, returns a valid value, something that is not -1; it could be 10, 20, whatever the size of that variable is. So the first statement calls member_size to check the size of the elevator_type member within struct elevator_queue. If a member called elevator_type exists, it returns something other than -1, and in that case we can go on processing that elevator_type pointer, maybe get the elevator name right from it, and continue with the rest of the program. If it returns -1, indicating that the member variable does not exist, we apply our knowledge of kernel internals in another check, verifying whether that member variable is named just type, and use that in the else branch. If that isn't present either, we at least have the chance to flag an error message to the user saying it didn't exist. Instead of just bailing out of the program with an error, we have the chance to print information that is actually usable.

Now, exception handling. It's quite robust, as it is derived straight from Python 3. Remember that a dump is collected at a moment when something unexpected is already happening on the system: it may have crashed due to unexpected reasons, race conditions, or bugs. A typical scenario is storage ports flapping, with devices going offline, coming back online, and going down again. Whatever SCSI commands we were sending to those devices at the time might no longer be valid: they might be in the middle of the error handling mechanism, they might get freed, aborted, or retried. If the crash extension then tries to process a SCSI command that has already been freed, or SCSI device structures that have been freed, there is a chance it will simply blow up: we are trying to analyze a crash, and our crash extension itself fails. In that case, just apply the try/except block from Python 3 and put your statement inside a try block. If it works, all you get is an initialized Python object; if it doesn't, you at least get a warning message that this particular structure doesn't look valid and there was an error with it, and we can perform a manual review of that structure to identify what went wrong. Any doubts or queries so far? Are we good to go? Thank you.

Using all these different features and capabilities of the PyKdump framework, there are ready-to-use programs already available within it. As its name shows, xportshow is one of those programs: it displays the TCP/IP information, the sockets established at the time of the crash, straight from the vmcore. crashinfo shows the general information from the vmcore. scsishow prints detailed information about the SCSI subsystem: your SCSI devices, the SCSI commands, the sectors referenced by those commands, the elevator type, the age of the SCSI commands, everything. dmshow prints LVM information and multipath information, and also runs some automatic heuristics and checks against the vmcore to identify potential known bugs and report them to the user.
taskinfo, as the name itself suggests, prints the task information from the vmcore. The last two are nfsshow and hanginfo. nfsshow prints the NFS client and server information from the vmcore, so you don't have to dig through the RPC-related or NFS-related structures yourself: just run nfsshow with its client or server options and the information is displayed right at the crash prompt. hanginfo is for vmcores collected from systems that got stuck, whether for a few minutes or for hours: just run it and it displays summarized information about the hang, and it will even try to format a tree of the hung tasks, so that if one task was waiting on another, which was waiting on yet another, you see the whole chain.

Now, about the upstream project page: it is hosted on sourceforge.net. A ready-to-use binary containing all the programs discussed on the previous slide is right there. All you have to do is download it, use the extend command to extend your crash environment, and start using these programs; there is no extra need to compile anything separately, just download and use it with the extend command. If you do want to modify it, or compile it separately for an architecture other than x86, maybe for PPC or s390, the manual steps to build the PyKdump binary are documented: all you have to get is the source code for the zip utility, for the crash utility, and for Python, and once it is compiled you get the ready-to-use program.

Now I will show you a quick demo of vmcore analysis and how it makes life easy. In fact, this is from a real-world example that I encountered during my work at Red Hat.
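One aside before the demo: here is the kernel-release check mentioned a few slides back, reconstructed purely as an illustration of parsing exec_crash_command output (as noted earlier, this check is not part of current PyKdump).

```python
from pykdump.API import *

# Run crash's 'sys' command and inspect its RELEASE line.
for line in exec_crash_command("sys").splitlines():
    if line.strip().startswith("RELEASE"):
        release = line.split()[-1]
        if release.startswith("2.6.32"):
            print("RHEL 6 kernel:", release)
        elif release.startswith("3.10."):
            print("RHEL 7 kernel:", release)
        else:
            print("kernel release:", release)
```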
This vmcore was collected from a system that crashed via sysrq, and that sysrq was triggered because the hung_task_panic sysctl parameter was enabled: the khungtaskd thread found processes stuck in D state for more than 120 seconds and triggered the crash. From the panic string that is quite visible. Is it readable at the back? OK. So the panic string itself says there were hung tasks, because of which khungtaskd crashed the system, the hung_task_panic sysctl parameter being enabled.

At this point, just running the bt command does not give very useful information, because it was sysrq that crashed the system; in this particular case bt is not useful at all. What we do instead is go to the ps listing and get the list of processes that were stuck in D state for more than 120 seconds. Here they all are. The longest-stuck task had been in uninterruptible state, D state, for more than 3 minutes and 47 seconds, so let us set the context to that particular task. With the context set, here is the backtrace of the task that was stuck in D state for more than 120 seconds.

Now, all of this only leads us to the program that was stuck; what next? We want to identify why it got stuck in D state for more than 120 seconds. The backtrace and the function calls say it was trying to get write access to the journal; it was trying to commit something in the journal writes, but that took more than 120 seconds, and that is where khungtaskd crashed the system. Here, kernel hackers or developers could apply their knowledge of these functions to verify which arguments were passed to them and how, which would also require knowledge of assembly language to pull out, and then verify which device was involved: an LVM volume, a multipath device, or just a partition like sdb1. Doing all that analysis manually would take at least an hour: going through the structures, then the member structures and pointers, finding out whether it is an LVM volume, then going into the LVM and device-mapper code to find the underlying devices. For example, one of these functions takes a super_block as its first argument, the superblock corresponding to the filesystem, so we would have to trace out that argument and its address, and from there check which filesystem it was and whether it sat on an LVM volume, a multipath device, or just a partition on a SCSI device.

We don't want to do all this analysis manually; we have the programs for it now. So all you have to do is extend the crash environment with the normal extend command. The crash extension is loaded; if you want to review the programs available in it, just type extend and it lists the set of commands, the programs we discussed on the previous slides: crashinfo, taskinfo, scsishow, dmshow, nfsshow, and so on and so forth. Now I will just quickly load all the modules required for this analysis, so the device-mapper module is loaded as well, and then all I have to do is run dmshow's lvs-style report, and
this is output similar to the lvs output on a running system: you get the LVM volume name, its volume group, whether it is open and its open count, its size, and its underlying physical volume, right there in the crash output. And if you want the UUIDs of those LVM volumes, you get the LV UUIDs right there as well, along with the volume group UUIDs.

Now, back to the particular task that was stuck when the system crashed. Here some basic checks, together with knowledge of this extension, can be applied. The crashing task was flush-253:3. The 253:3 are the major and minor numbers of the device on which the journal writes were stuck, and major number 253 will, in most cases, be used for something created on top of device-mapper objects. So in this case it is easy to say it was dm-3: the major number says device-mapper, and the minor number is 3. Going back to our previous command output: here is dm-3, it was lv_app1, hosted in this particular volume group and created on sdc. So within 0.04 seconds we got all this information, while doing the same review manually would have required more than two hours.

Next we want to trace out this disk sdc and see which commands or I/Os were pending on it, so we run scsishow, the program discussed on the previous slide that prints detailed information about the SCSI subsystem. This is the output; I will just reduce the font to make it more visible. So this is our disk sdc, connected to a host adapter of the VMware PVSCSI type. You also get the address of that SCSI host adapter, the number of requests pending, the status of that particular device, the number of pending requests in brackets, that is, the difference between the done count and the dispatched count, and the number of I/O errors on that device. It prints the same information for all the devices. So this has brought us to the device on which the I/O was stuck; the next task is to find out why the I/O was stuck.

Let's do that. There is one more option, --check, so that you do not have to do that analysis manually. These programs are written with the experience we have built up working with customers on any number of cases, and there are a lot of support engineers, I would like to put their names on the slides too, like David Jeffery and Lauren Silverman, people at Red Hat who helped build this extension to the point where all these checks can be performed automatically. If you run this check, it shows that there are many SCSI commands pending on this particular device, along with the HCTL values, the host, channel, target, and LUN IDs. They have a large timeout of 180 seconds, and they had passed that 180-second timeout, yet error handling had still not been triggered on these commands. Since the timeout had already passed, well beyond 120 seconds, khungtaskd identified the tasks as stuck for more than 120 seconds and crashed the system. Now let's also see what all these SCSI commands were doing. Printing the information for all of them: this disk sda is one of the disks, with 13 commands pending; for each command you see the struct request, the bio structure associated with it, the scsi_cmnd structure associated with it along with its opcode, and the sector that the particular SCSI command
referenced. The same stands for the other device, sdc, which had 20 commands pending; so there were 33 commands pending in total, 20 here and 13 on sda. Going back to scsishow: this is the host adapter it all went through. It was not triggering error handling because the busy count was not equal to the failed count; the number of busy commands was not equal to the number of failed commands, so error handling was not triggered in the first place. Now, which command was still pending? We can identify that using this command once again: all of these commands had already reached timeouts of more than 180 seconds, but there was one more command, on sdc, whose age was just 80 seconds. So we were waiting for that one particular command to time out as well, and then error handling would have been triggered. Basically we were on the right path: error handling would have been triggered once that command timed out, but by that time khungtaskd had already found tasks stuck in D state for more than 120 seconds, so it triggered the panic. The follow-up went to VMware, to identify what went wrong with the VMware virtual disk such that it was taking so long, more than 180 seconds, to complete the I/Os.

I will just quickly show one more similar thing. But first, any doubts or queries regarding this? We have got five minutes. Anything about the Python extensions or bindings, or how they are being used? Any questions? Yes, please.

So, let me repeat the question for the benefit of the recording: the question was, why can't we use the Python programming language's own facilities to verify whether an attribute exists or not? The reason is this: we write certain checks within the program to use those variables, identify the value of a particular variable, and based on that raise flags for errors, for example, as we just saw, that a command had timed out and error handling was not triggered. But once the Python object is initialized, we have no way to compare it with the structure definition from the vmlinux without actually querying it. The elevator_queue has a member variable called elevator_type, so we can just query that structure and verify its member variable name: if it is there, we won't get -1 and we can start processing it; if we get -1, it doesn't exist. Of course we could do that in plain Python, but the framework is already doing that part for you and provides this function, so it is there for flexibility; that's the only reason for using it. If you want to do it manually, Python allows you to. Thank you. Any other question? We still have five more minutes. Yes, please.

The question was about having to cast the address to long when passing it to the readSU function: why can't readSU do that itself? That could be a suggestion for improvement; we can definitely take it into account and eliminate the need for writing long(), since we already know the addresses come in long format. That is definitely an area of improvement for us; thanks for that. If there are no questions,
I'll just quickly show another vmcore as well that I have loaded, for the NFS information. Sorry, this one is from a storage issue first. This particular dump was collected from a system that was experiencing storage issues. Here it is printing the whole set of information about SCSI commands on various devices; there are many SCSI commands pending. While processing these SCSI commands you might encounter a request structure that has a NULL SCSI command; for example, there were this many requests on this particular device with a NULL SCSI command. Such a NULL SCSI command could cause your system to crash during error handling, or while trying to abort or retry those commands. All this is not information we can get through manual analysis quickly; it would require hours of analysis, maybe a day's investigation, to pull it all out, but here you have it right at your fingertips for all these devices. It also gives you each device's vendor and model, its HCTL values, and its elevator type, be it deadline, CFQ, noop, and so forth.

And one more vmcore, for NFS. Here it is printing the NFS client information: there were three mounts present on the system. We can re-verify that with the mount command from the crash environment: yes, three shares were mounted on the system at the time of the crash. If I want to pull out information about those clients and their structures, all I have to do is run nfsshow, and here is the nfs_client structure pulled out from the vmcore: how many seconds ago it was last used, its state, and all that information.

So if you have any suggestions for improving this framework, we will be more than happy to hear from you about what checks you would like to add. Please don't hesitate to contact me or Alex; our email addresses are right there on the first slide, which has already been shared. We would be more than happy to hear your feedback on this framework and to make it more usable. Yes, please go ahead.

My pleasure. So the question was whether, when we initialize the Python object from the C structures or unions, we create a class or not. We do in fact create a class: if you go into the definition of readSU in the PyKdump framework source code, readSU itself uses Python classes to selectively initialize the class and populate the individual attributes from the member variables within the structure. So yes, it does use classes. And for anything that is not valid, it returns -1. The try/except is also already there: for example, on this slide, with pylog.warning you also get some more detailed information about the error that was encountered, in addition to the error message we print ourselves. With that information in hand, we can at least try to analyze the structure manually; instead of bailing out without analyzing it, we at least get some hints or pointers.

Any other questions, or suggestions for improvement? We would be more than happy to hear them. I think we are all done, then. Thanks for joining this session, I really appreciate it.