 Yeah, an outstanding missing feature that we all know about, and that user space, sitting in the back of the room, keeps pestering me about, is the ability to query mount properties, file system properties, and so on. In short, we have seen various patch sets: the fsinfo system call by David, Miklos worked on getvalues, or going through the xattr interface, and there were various other proposals in the mix. So we somehow need to come to an acceptable compromise for all of us so that we can move forward, because I think this will be a pretty crucial thing going forward. There are certain requirements or preferences that people had. My main preference is that it's not exposed in the form of a file system, because this is usually a giant pain for user space. But Lennart can speak to this, because he will probably be a user of this, so it might also be good to hear what kind of API he would prefer and what he would find useful, even if it's just by describing the use case.

Yeah. So one of the things that Linus suggested was that we have a stat-like system call with a part that is just a string from /proc/self/mountinfo, so you have some binary bits and some text bits. I'm not sure it's worth it, so maybe the simplest thing could be that just the line is returned, but we could have such a hybrid system call with binary bits and text bits as well. That's one part. The other part, which Lennart mentioned, is the listing of child mounts, and again, I guess we could have a new syscall for that which just lists... I guess the simplest is to return a list of mount IDs, extended mount IDs, so the 64-bit mount IDs. But those are just ideas, if anyone else has ideas.

I've had a couple of people contact me privately by email after Linus had his input on this. I think one was from the crash dump point of view, and he said: we would like something like this, but please can it not be text, because we don't like text parsing and it's slow.
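As a rough illustration of the text parsing being discussed, here is a minimal Python sketch that splits a single /proc/self/mountinfo line into fields. The dictionary key names are this sketch's own invention, and the sample line in the usage note is the well-known example from the proc man page, not output from any system mentioned in the session. It assumes the usual layout (mount ID, parent ID, major:minor, root, mount point, mount options, optional fields, a "-" separator, file system type, source, superblock options) and the octal escapes the kernel uses for whitespace and backslashes.

```python
import re

# Octal escapes used in mountinfo fields:
# \040 space, \011 tab, \012 newline, \134 backslash.
_ESCAPE = re.compile(r"\\([0-7]{3})")


def unescape(field):
    """Decode \\NNN octal escapes in one mountinfo field."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), field)


def parse_mountinfo_line(line):
    """Split one mountinfo line into a dict (key names are hypothetical)."""
    fields = line.split()
    # A lone "-" terminates the variable-length list of optional fields,
    # which starts at index 6.
    sep = fields.index("-", 6)
    major, minor = fields[2].split(":")
    return {
        "mount_id": int(fields[0]),
        "parent_id": int(fields[1]),
        "dev": (int(major), int(minor)),
        "root": unescape(fields[3]),
        "mount_point": unescape(fields[4]),
        "mount_options": fields[5].split(","),
        "optional": fields[6:sep],
        "fs_type": fields[sep + 1],
        "source": unescape(fields[sep + 2]),
        "super_options": fields[sep + 3].split(","),
    }
```

For example, `parse_mountinfo_line("36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root rw,errors=continue")` yields mount ID 36 with parent 35. Even this small parser has to know about the separator and the escaping rules, which is the kind of burden the non-text camp is objecting to.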
I would prefer a non-textual interface as well.

So I think, if I may, the honest objection probably was that the mount notification stuff we discussed yesterday and the general getting of information about mounts were stuffed into one patch set. I think why Linus also reacted to it, as far as I understood it, was that it was just too much code. I tried to review it, I really tried, but it was just a wall of code with nested and serialized structs, and it was really difficult to review. So I think if it had done a little less it would have been way better. If we need to split this over multiple system calls (it's just my opinion), then so what? I mean, these aren't the times where system calls were like "oh my god, don't add a system call"; I think that's over.

So I think it's okay to have binary interfaces, but with mounts and superblocks we have some things that are really difficult to represent in binary, and I guess that was one of the issues with that patch set: it tried to do this too generically, and that's difficult to do, so I'm not sure what the solution to that is. We already have a text format, and yeah, it's difficult to parse, but we already have parsers for it, and the big performance issue is really not with parsing a single line, it's with parsing the whole file. Do you want to say something about this?
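To make the "whole file" cost concrete: with today's text interface, answering "what are the children of this mount?" means reading every line, because the parent relationship is only recorded on the child's entry. The following is a minimal Python sketch of that scan; the function names are hypothetical, and it reads only the first two fields (mount ID and parent ID) of each line.

```python
from collections import defaultdict


def children_by_parent(mountinfo_text):
    """Map parent mount ID -> list of child mount IDs.

    Every line must be visited: the work grows with the total number
    of mounts, not with the number of children being asked about.
    """
    children = defaultdict(list)
    for line in mountinfo_text.splitlines():
        if not line.strip():
            continue
        mount_id, parent_id = line.split(maxsplit=2)[:2]
        children[int(parent_id)].append(int(mount_id))
    return dict(children)


def read_children(mount_id, path="/proc/self/mountinfo"):
    """Return the child mount IDs of mount_id by scanning the whole file."""
    with open(path) as f:
        return children_by_parent(f.read()).get(mount_id, [])
```

A dedicated call that returns just the child mount IDs of one mount, as floated in the discussion, would avoid this full scan.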
So from my user space perspective, the problem I always have with text formats is that splitting out the fields is always nasty, because of escaping and figuring out what the delimiters are, and usually the kernel is not very good at having a uniform logic there; sometimes it's completely crazy how it does that. So from my perspective I'm actually fine with text formats as long as the fields are separated in the structure. What I specifically mean by that is, for example, I'm totally happy if we get "ro" and "rw" or something back as a string, as long as those are separate fields. But just to summarize from my user space perspective: what I would love is to get at least the idea of an atomic reply to things. I don't want to execute a bunch of system calls to get the information about a superblock, even though that's usually fine. I think it would be much nicer if we could get the fields that we want, à la statx, basically in one block back. The other thing is what I just mentioned: for something like the mount ID, which is numeric, I don't really care if you give that in binary or in a string. Pick something, just split the fields up in a binary way.

But, for example, would you be fine with this: let's say we had a mount statx system call, or whatever, a system call that is extensible and gives you all the information for a superblock and/or for a mount, and then you get the mount ID for that mount. Would it be okay if you needed to use a separate system call to query for all the child mounts?

Yeah, that's fine by me. Basically I want the atomicity per object, but I have no illusion about getting an atomic view of more than one object; that's racy anyway. I mean, the mount table can constantly change, so you will have to protect against this
anyway, so a snapshot of the mount namespace doesn't really make sense. I'm not asking for that; I just want something per object.

But what if you had, as a starting point at least, an fsinfo (whatever the name is, we can squabble about this), an fsinfo system call with a struct that encodes a core set of information that would be useful to user space? That struct is extensible; we know how to do this, we've done this before. I know some people don't like it, but: extensible structs versioned by size. We use the scheme that you had, or that I guess you all came up with, for the statx system call, where you have a request mask, "this is the information I want", and then a reply mask, "this is the information that I have available", and then just extend it. There can be textual format in there if it makes sense, and there can be non-textual format for stuff where text doesn't make sense, because in my opinion it doesn't make sense to encode RO or RW in textual form; that should just be bits. But there's certainly information where we want a textual format.

I think it's important to be able to export file-system-specific fields, like Miklos's getxattr proposal. I don't know how you want to encode that, but I think it's important, and it would have been nice to get something similar for statx, for that matter. I mean, file systems do have inode-specific information that they want to export, and sometimes today it's exported via getxattr. I mean, some file systems already do that, right?

Yeah, there are virtual extended attributes, but you might not be able to read the extended attributes; I think you have to have read permission to be able to do that. You could add a new namespace... The check happens before you get to look in the namespace. That sounds like something you could fix. But anyway, I'm just saying, for the FS-specific stuff, extended attributes sound okay to me. But that was proposed last year. But yeah, I mean, like in
the NFS standard there's a lot of stuff, in the SMB standard there's a lot of stuff, but I expose it in proc. So all of those file-system-specific fields are in proc for the share and the server, all that stuff, right? The problem is how do you tie that back to a file descriptor? How do you tell that this file descriptor goes with all this file-system-specific stuff that I've exposed? I don't know how to do that. But all the info is visible, and I assume NFS does the same kind of thing and makes the file-system-specific stuff visible somewhere.

So I guess I'm very nervous about file-system-specific information being mixed in with whatever infrastructure we do for the generic mount information. And again, I think part of it is that if we can do something small, it's much easier to get it reviewed. I think we understand what the requirements are for why systemd or any sort of user space thing is trying to understand what's currently mounted. There is an awful lot of additional information that is very, you know, SMB-, XFS-, or ext4-specific, but if that's not needed by this core use case, I think how we actually handle the file-system-specific stuff just adds a lot more complexity to the solution, and maybe we should just address that separately. Because a number of us do have solutions, /sys/fs/ext4/<block device> or whatever, but the thing is it's really only used for debugging, and therefore it's not critical that it be used by something like systemd. Now, if someone can articulate the use case where it's not for debugging but in fact something they really want for continuous production use, then maybe we can design that. But the thing is, statx was fairly simple because we weren't trying to encode arbitrary amounts of file-system-specific information. Doing things using synthetic xattrs is contentious; there are some people who think that is a radical abuse of that interface, and, you know, whether or
not Christoph is right about that, I'd rather not have that discussion.

So I'm curious: the only FS-specific information that you would conceivably get out of this right now is mount options. Do you care about mount options aside from, like, read-only, read-write, and global things?

Yeah, there are a couple of other things we read. One is the UUID, for example, the superblock UUID, and that's not really FS-specific info.

But that's another topic; that's the third point that I had. Some file systems generate a UUID, but not all of them do. It's sometimes used to generate an FSID, but not always; for example, XFS generates the FSID from the block device. So exposing the UUID is something that would require additional work. But if we have an extensible struct, done in an extensible way, then it doesn't matter, and we can just expose it once every file system has a UUID, which ideally is something that we could do in the future, but it doesn't need to be part of this right now. But back to this file-system-specific question: we can punt on this question for now, I think, and just have a core information struct that is generic for all file systems. Then we could add an additional system call that maybe even has to be textual, so you query and call into the file system and say "give me all your mount options and tell me the size of the buffer that you need", you give it a buffer, and then it gives you the mount options that are file-system-specific, encoded as a string if it has to be.

That all sounds good to me. What I would like to see is basically that the information that in the new mount API you pass into the file system with fsconfig, I could get out of it in a similar way.

You mean fsquery?

Yeah, but again, the fsconfig stuff is bit by bit, and that's fine that way, but I'd rather have it atomic.

Right now I want to add something like that
as well, so that I can add a mount supervisor, which we discussed earlier. You can start a mount supervisor sitting in the mount namespace, so when something inside a container does a mount system call, the mount supervisor can intercept it, read the config options, and say yes or no, or adjust the config options. Another use for this is automounts, particularly NFS automounts. We can't parameterize them because they're done by the kernel, but if we can have a mount supervisor, it can intercept that, change the parameters, and let it continue.

So it's, like, change rsize?

Yeah, but I need some way to read out the mount options.

But anyway, for my personal use cases I generally only care about the generic stuff. But I know util-linux, with the libmount APIs, which systemd happily uses by the way, wants to have the FS-specific stuff like the mount options too, and they display them by default. So if the goal is to implement what libmount typically calls, and hence what the mount binary typically calls and what findmnt typically calls, then you probably have to cover both: the generic stuff and at least the mount options.

So I think in this case it is completely okay to have all of the generic information and then you just get a fucking string with all the mount options stuffed in there. If you want to itemize, we could also do that: you give me the buffer and then I'll tell you, okay, these are mount options a, b, c. You would prefer that? Okay. Because I think that's a reasonable enough thing to do, because we already have all of this stuff for mountinfo, and it's not like every individual file system is going to do something crazy, because we all do the same thing for showing our mount options, and we just do this new thing to spit out the
options into the buffer that you give us.

Yeah, yeah, that's what I was going to say, which is that we have show mounts, and in some ways my mental model of fsinfo is a scalable version of /proc/mountinfo that works when you have a gazillion mounted file systems. So if it just returns a string the way mountinfo currently returns mount options, and we aren't trying to do a crazy structured data structure thing, we can solve that problem. If in fact someone needs an exhaustive structured thing that is more than just the mountinfo string, let's do that part later, right?

Yeah, so that would be my preference, and we already have a system call for that, which is getxattr.

I will let you fight with Christoph about using getxattr for something that isn't a real extended attribute. I don't think that's a good generic approach. If a specific file system wants to do it so that they don't have to actually get that reviewed by fsdevel, great, but Christoph has flamed people to a crisp for that before, and there have been problems. So whatever, right? I mean, you know, consenting implementations in file systems, sure.

There is another problem with using getxattr. One thing I did in fsinfo is a way to specify the file system you want to query by the file system ID, because paths are not unique and there can be file systems you can't reach. So with a mount underneath something, you've stuck a file system there but you can't reach it.

How would your query thing deal with a file system that gets, I beg your pardon, a descriptor number as a mount option? And yes, that's real, there is such stuff. What sense could it make to any kind of supervisor that the caller of mount has something or other opened with descriptor number 69?

Well, thank you. Now what do we do about that?

Sorry, I'm not sure I caught the question.

A question about the fsquery, actually. I mean, the thing is, if you have
an autofs right now, you specify an fd where you get the notifications, and then it shows up in /proc/self/mountinfo as fd=5, and you're just like, what the fuck is that supposed to mean, right?

So one thing I did was I gave every mount object a unique ID, a new 64-bit ID, and you could get that out with fsinfo, or we could do something else to get it, and then you could give it to fsinfo: "tell me about this ID".

By the way, the weird thing about the extended attribute approach is that people generally understand extended attributes to be a property of the inode, and then you suddenly return information about the file system, so it's kind of weird. This is hard to explain to people.

I did that in here first, and I kind of wish I hadn't. This is a better way.

Hey everybody, this is Darrick. I've got a couple of questions for you. The first is: can you actually use fspick to pick a file system by FS ID? And the second is: does the file descriptor you get back from it actually do anything with extended attributes currently? Because I don't know if Christoph was objecting to the use of fake extended attributes generally, or just the part where people were mixing them in with actual extended attributes from real files, such that you would never know if someone had simply created an xattr with the same name as an FS attribute, or where it had come from.

fspick just takes a path; well, a path and an fd. And as far as these pseudo-xattrs go, at least stuff like Ceph usually puts them in their own namespace; Ceph has a "ceph." prefix on the xattr name.

I think the whole xattr... sorry, I don't have your patch set in mind anymore, but at least the xattr interfaces that we currently have, in the way they are implemented, just make me go "nope, this is not where we want things". This is really a broken API in my opinion. I'm usually not
someone who calls something shitty, but I think xattrs, the current way we do them, are really shitty. It's type-unsafe, it's really convoluted and complicated, and the call paths get insane as soon as you have something like a stacking file system in the mix. So it's really not a path that I would like to go down. We moved ACLs out, so hopefully we can move file system capabilities out of there as well, because it's the same sort of problem that we have. But, not deeply thinking about this, why couldn't you just have a new inode operation or superblock operation that gives you the mount info for the superblock?

I mean, well, you have it: it's show_options.

Oh yeah, right. So I think your textual idea for generic mount options is good. I think that's good enough; that works for util-linux and probably works for you as well. So why don't we make this a separate system call, but go forward with a slimmed-down, elegant version of fsinfo?

You mentioned that we would have to put a UUID in every file system first, but if we have the request bits and then the response bits like we have in statx, you don't even need to do that; the ones that have a UUID can just report it.

Yeah. So okay, why don't we aim at merging an fsinfo (I don't want to be settled on the name, although I like the name), let's try by the end of this year or early next year. I don't care who does it. I don't think this is magic; it's like a statx system call interface. It's like copying and pasting statx, in a way, and making it suitable for generic file system information, and making it extensible.

Did you just volunteer?

No, Al did. I just repeated him because I have the mic.

So what you're talking about is, I mean, fairly easy: we just add a, I don't know, a super
block operation. We just export one other optional thing, if a file system has file-system-specific stuff, and the syscall looks to see... I mean, Al correctly pointed out that we already have this for showing the mounts, right? So this would be like "show full info" or "show something info", and we're done. The only problem is that it has to be sensible.

Extensible?

Extensible, yeah.

So okay, that would deal with the information for the mount, but what about the list of child mounts? That's what I asked at the beginning. Unless there are strong objections, that could also be a separate system call: query for the child mount IDs and then be able to query information based on those IDs.

So the fsinfo, whatever it's going to be called, gives you the parent, and this other call would give you the children?

Yes. Sounds good to me.

Otherwise that would probably require you to have a substruct within the fsinfo struct, and then you're back to... I mean, then you have a variable length.

Yeah, but wouldn't it be nicer if you had it in a separate system call and you could query how much space you need for these mount IDs or whatever?

Yeah, I think having arrays embedded in structs, variable-length arrays, is just terrible. I fully agree. I mean, you're basically implementing your own marshalling system and type system. Don't get into this if you don't have to.

Remember, we're looking for consensus, and we have a coffee break to get to.

The thing is, if you can split functionality: mounting doesn't necessarily have to be fast, and getting mount information also doesn't necessarily have to be fast. Mount notifications have to be fast. So what is so problematic if
we say let's not stuff it into a single system call, let's just split it over multiple system calls?

Races.

Sorry, could you repeat, Al?

Theoretically, if you want an atomic picture... well, you don't get any kind of atomicity if you spread it over several system calls.

But I don't think that's a strong requirement. The atomicity, I think, is only about information about the object itself. Finding out the children is not so much information about the object itself; that is constantly changing. So I think it's fine if that part is separate.

So yes, I like atomic interfaces where I can get something of a consistent view of the object, but once it touches the relationship to other objects, I don't think... I mean, that sounds like asking for too much.

I wanted to mention: can you make sure that any new system calls have documentation and tests?

Okay. So, about the atomicity: David's patch set had added a version number to mounts, so basically anything that changed the mount would bump the version number, and this could also be returned from this syscall, and then you would know if anything changed.

Is that version per object or per mount tree? Does the version apply just to the object, just one mount, or to the entire mount tree?

What I did was put a version number on each mount, because you've got a lot of mounts. So when I return the list of children, it's a list of tuples: child, version; child, version; child, version.

Good. Interesting.

Okay, I think we need to call it, and we can go to the coffee break. Hopefully we can all remember the good spirit on the list, please.
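The per-mount version number described at the end lends itself to a simple change check in user space. Below is a minimal Python sketch, assuming a child listing arrives as a list of (mount ID, version) tuples as described in the session; the function name and the sample values are hypothetical, and no particular system call is implied.

```python
def changed_children(old, new):
    """Compare two child listings of (mount_id, version) tuples.

    Returns which child mounts appeared, disappeared, or had their
    version number bumped between the two listings.
    """
    old_versions = dict(old)
    new_versions = dict(new)
    common = old_versions.keys() & new_versions.keys()
    return {
        "added": sorted(new_versions.keys() - old_versions.keys()),
        "removed": sorted(old_versions.keys() - new_versions.keys()),
        "modified": sorted(
            m for m in common if old_versions[m] != new_versions[m]
        ),
    }
```

With `old = [(10, 1), (11, 4), (12, 2)]` and `new = [(10, 1), (11, 5), (13, 1)]`, the result reports mount 13 as added, 12 as removed, and 11 as modified. A caller would only need to re-query the per-object information for mounts in the added and modified sets, which fits the "atomic per object, not per tree" position taken in the discussion.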