[Start of recording unintelligible] ... the file system can fill in the gap for you. So by using a bit more disk space and filling it with zeros, it can reduce the extent list from three extents to one extent and make for a smaller extent list. It basically inserts a bridging block of zeros. The second problem is the converse of that, where the optimiser looks at the data, sees there's a huge block of zeros in the middle, and thinks it can free up some disk space by removing the zeros and splitting the extent. But there are problems with that. Things like bmap, FIEMAP, SEEK_HOLE and SEEK_DATA can give you false positives and false negatives, because the data can change and the extent list can change under you at any time. With ext4, if I remember rightly, there's a flag you can turn on, a sysctl or something, to make this happen, but defragmentation tools or fsck can also do this, so you can't rely on the data from those interfaces.
So in cachefiles, I was using a hole in the file system, and I'm still using a hole in the file system, to indicate there's some data I haven't fetched from the server yet. I've been looking at content encryption, where a hole is used to mean this block doesn't exist, and you can assume it's equivalent to decrypting to zeros.
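As a rough illustration of the interfaces named here, the Python sketch below walks a file with SEEK_DATA and SEEK_HOLE and lists the byte ranges the file system currently reports as data. The helper names `data_segments` and `demo` are made up for this example. Per the discussion, the result only describes how the file happens to be allocated at the moment of the call, so it should not be read as a reliable record of which blocks were ever written.

```python
import errno
import os
import tempfile


def data_segments(fd):
    """Return [(start, end), ...] byte ranges the file system reports as data."""
    size = os.fstat(fd).st_size
    segments = []
    offset = 0
    while offset < size:
        try:
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as exc:
            if exc.errno == errno.ENXIO:  # no data at or after this offset
                break
            raise
        # The end of the file always counts as a hole, so this call succeeds.
        end = os.lseek(fd, start, os.SEEK_HOLE)
        segments.append((start, end))
        offset = end
    return segments


def demo():
    """Write a fully populated 8 KiB file and report its data ranges."""
    with tempfile.TemporaryFile() as f:
        f.write(b"\x01" * 8192)
        f.flush()
        return data_segments(f.fileno())
```

A fully written file is reported as a single data range. For a sparse file the answer depends on the file system and on any hole bridging or zero punching it has done, which is exactly the ambiguity being described.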
It's an optimization, because otherwise, if you want to store a block in an encrypted file way out over there, you have to fill in everything in between. Otherwise you can end up with encrypted data. This only appears to affect cachefiles, but it looks like this information can also be accessed remotely by NFS and Ceph, and possibly CIFS, though [inaudible] is not here to ask. If we're building content encryption on top of that, using holes to indicate blocks we don't have, then this could be a problem there as well. There's fscrypt in ext4. I think that's not a problem, because ext4 doesn't allow you to do hole bridging and extent filling if the file's encrypted.
That's correct.
To mitigate the problem, is it possible for us to ask the backing file system not to do that for our purposes, because the file is content encrypted?
It seems like this is a philosophical question, to some extent. Many file systems have considered sparseness to be simply an optimization. Therefore, a chunk of a file that contains all zeros may be expressed as a gap in the extent tree, or as unallocated blocks in the indirect block map, or as allocated blocks that are then zeroed, maybe because you're also trying to avoid free space fragmentation if you know, or have cause to believe, that the file is eventually going to be filled. I think SEEK_HOLE is something that is used just to, again, optimize a copy, but it is not considered data, right? The conflict arises if you have applications that expect the SEEK_HOLE/SEEK_DATA information to be actual information, as opposed to how things happen to be allocated at the moment, and that depend on it, because that's what fscache is doing.
And also content encryption.
The thing is that if you're doing encryption, presumably the encryption layer knows about that and can just simply suppress that, right?
If you encrypt file systems, you tend to use a Galois counter or other type of cipher that means you're blind to the holes anyway. That's the whole point of using XTS or GCM ciphers. If you're not doing that, you're doing encryption on files wrong.
Well, this is how fscrypt does it. It encrypts each block based on the block number and the file's inode. It doesn't store any extra data. The problem is you can have a block that, when you encrypt it, ends up all zeros, and that can be thrown away. But the file system can also insert a block of zeros, and then you say, ah, there's a block there, I'll decrypt it, and you get some rubbish out of it.
I think a block that encrypts to all zeros is so theoretical that you don't have to worry about it.
The false negative isn't really the problem. The false positive is the problem here: the file system inserts a block of all zeros, you don't know that it isn't a valid block, so you try to decrypt it and you get rubbish. Unless you can stop it inserting the bridging block.
So file encryption systems are either integrated into the file system, a la fscrypt, in which case they deal with it, or they are external to the file system, and then they don't write sparse files, because SEEK_HOLE/SEEK_DATA is a relatively new interface. I'm not aware of a user space content encryption system that wants to write a sparse file and relies on SEEK_HOLE/SEEK_DATA to find out whether any data was ever written to a particular area of the disk. I agree that it is a theoretical possibility, but it should also be remembered that not all file systems even support SEEK_HOLE and SEEK_DATA, and therefore user space tends not to use them to store semantic meaning.
What I'm trying to do is avoid re-implementing an FS on top of the FS to store "I've got this block" or "I've not got this block".
Hold on guys, Layton wants to say something.
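The false positive described above can be reproduced with a toy model. The Python sketch below uses a made-up XOR keystream derived from a key, an inode number and a block number. It is not fscrypt's actual construction and is not secure; it only stands in for "each block is encrypted based on block number and inode". All names here (`keystream`, `crypt_block`, `read_block`) are hypothetical.

```python
import hashlib

BLOCK = 4096


def keystream(key, inode, block_no, length=BLOCK):
    """Toy keystream from key, inode and block number (illustration only)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        seed = (key
                + inode.to_bytes(8, "little")
                + block_no.to_bytes(8, "little")
                + counter.to_bytes(4, "little"))
        out += hashlib.sha256(seed).digest()
        counter += 1
    return bytes(out[:length])


def crypt_block(key, inode, block_no, data):
    """XOR data with the keystream; the same call encrypts and decrypts."""
    ks = keystream(key, inode, block_no, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))


def read_block(stored, key, inode, block_no):
    """stored is None for a hole, which is defined to read back as zeros."""
    if stored is None:
        return bytes(BLOCK)
    return crypt_block(key, inode, block_no, stored)
```

In this model a hole (`None`) reads back as zeros, but a zero-filled block that the backing file system inserted to bridge a gap is indistinguishable from stored ciphertext, so `read_block(bytes(BLOCK), ...)` returns the raw keystream, which is the "rubbish" the speaker refers to.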
So the first thing is that we only really need the granularity of an fscrypt block. If we're using 4K blocks and we can guarantee that we get at least 4K granularity out of the backend store, then we're fine. The other problem with an all-zeros block is that you don't know it's all zeros until you read the whole thing in and memcmp it. It can be mostly zeros and still have some set bit at the end, right? Anyway, that's all I've got.
I think this is not a problem with fscrypt, and Btrfs uses fscrypt: anything that we zero will be in a page that's being allocated in a block that's being written, and that will be encrypted. If there is a literal hole, there's no extent for it, so you're not going to read it back, and we don't do this optimistic filling. Does ext4 do this? Oh, okay. And XFS does, apparently. Okay, Btrfs doesn't do this, so we don't care.
What if... Sorry, I mean when you...
Go ahead, Jeff.
Oh, I was just saying that I lost my train of thought, because I had something I was going to say. But never mind, go ahead.
Okay, so my question: if you had a larger block, like the erasure block or I/O block, you could query the file system, "what is your optimization block?" And you would know, for example, that one megabyte gaps are not filled, right? You could rely on one megabyte holes, but you don't know the limit.
I don't know that I can rely on that.
If you could.
If I could, I could encrypt at that block size. But I don't know whether I can, because that's a function of ext4 and XFS.
Look, Ted can answer this question, but I think the reason for optimizing extents is to keep a smaller B-tree and consecutive blocks. I don't think that filling one megabyte holes is an issue.
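The granularity point can be made concrete with a small Python sketch. It assumes you already have a list of (start, end) byte ranges reported as data and a 4K block size, and it maps those ranges onto block indexes. The function names are hypothetical, and the sketch says nothing about whether the reported ranges can be trusted, which is the open question in this discussion. The second helper mirrors the remark that a block can only be declared all zeros after every byte has been examined.

```python
def blocks_with_data(segments, block_size=4096):
    """Return sorted indexes of blocks touched by any (start, end) data range."""
    present = set()
    for start, end in segments:
        if end <= start:
            continue  # ignore empty or inverted ranges
        first = start // block_size
        last = (end - 1) // block_size  # end is exclusive
        present.update(range(first, last + 1))
    return sorted(present)


def is_all_zero(block):
    """True only if every byte is zero; the whole block has to be scanned."""
    return not any(block)
```

For example, data ranges covering bytes 0 to 4096 and 12288 to 16384 map to blocks 0 and 3, leaving blocks 1 and 2 as candidates for "not present", provided the backing store really preserves holes at that granularity.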
Again, ultimately the question is whether a file system is obligated to maintain a semantic difference between a sparse file where there's a gap in the file and a block that's written with all zeros. I was actually trying to look, because as I recall, if someone uses the fallocate zero-range operation, I think it says that it's preferred that you write all zeros. It's not clear to me whether it's invalid to actually deallocate to get the effect of fallocate zero range. I'd have to go and look, because I don't remember. Historically, they haven't been considered to be different. I could imagine that file systems could agree to implement a flag that says "don't do this", because the user of that file actually wants to assign a semantic difference between blocks that are not allocated and blocks that are allocated and contain all zeros. At the moment it happens to be the case that ext4 and XFS are the only ones pursuing this particular optimization. The basic rationale was that if you're going to be writing 32K of zeros on a hard drive, that's actually faster than having to seek around and mess with the extent tree, so why not just do that? We could implement a flag that says don't do this optimization, but that doesn't completely solve the problem unless we also effectively prohibit all file systems from doing this optimization without implementing the flag that says "I promise not to do the optimization".
Jeffrey on the virtual call has comments. Go ahead.
Several years ago we had...
You muted yourself again, man. We had you in the beginning.
Sorry. Several years ago we had a series of bug reports filed against AFS from CERN regarding applications that they wrote that leveraged parallel I/O systems. These systems essentially protect access to different portions of a common file, potentially two petabytes in size, which they allocate sparsely.
They actually rely upon reading where the holes are in the file from their applications to figure out where their records begin and end. One of the problems they reported to us is that a hole they left that was 12 bytes in size would suddenly disappear out from underneath them, corrupting their data. Clearly, we pointed out that they were relying upon behavior that was not guaranteed, but there certainly are userland applications unrelated to encryption that want this behavior to go away.
It sounds like something that would be reasonable to turn off per file, like how we have no-compress and stuff like that.
Right, so with ext4 today there is a tunable where you can say what's the maximum size of gap that we will bridge with allocated blocks that are all zeroed, and the default happens to be eight file system blocks. The workaround that I told David about was that you could simply set the tunable to zero, and that solves the problem on a per-file-system basis. I think a file flag is probably more appropriate, so that it's on a per-file basis as opposed to a per-file-system basis. I don't have an objection to doing that, modulo finding a free bit in the chattr bit namespace that isn't being used by somebody else, but sure, right? I can't speak for the XFS folks as to whether they would be willing to implement a similar feature, and I note that this is the sort of thing where it's best if all of the file systems that do this optimization agree on the same flag to turn it off.
Do we have any of the XFS guys on the...
No, Eric bailed.
I would like to do a similar optimization for Btrfs, at least for bridging. I would be fine with that. I don't have a great picture in my head of how often these bridgeable holes actually happen. I think the most compelling use case I can think of is people shoving hole punches down on VM images, where a 4K hole punch isn't that useful for us to do internally.
It actually just makes it much worse, but we could still send a discard down to the device and just not punch the hole, something like that.
One of the reasons why we were doing this was that, at the time, libbfd, which is used by binutils, had this really bad habit of writing files randomly that would eventually become sequential. It would leave these weird 4K gaps that would later get filled, and the optimization made sense if you were doing some silly benchmark like kernel compilation.
I think it still does that.
I just wanted to point out that there is already an API in XFS. The same API has been used to set the project ID. It can be used to set the extent size hint and the CoW extent size hint. It's not exactly the same thing, but it's a similar thing. I don't know if it can be used or shared or extended or whatever. I really need Christoph in the room for this. That's all I've got on this. John.
I think Christoph was trying to talk and he's muted himself. Go ahead, you're good now.
Hello? The API Amir has been speaking about is an ioctl called FSSETXATTR, set extended attribute or something like that. There are definitely flags available there. If we don't have standard flags, we could use this call to set the flag. I believe we already set project IDs and the project ID inheritance via this API. There are precedents in both ext4 and XFS for setting inode flags with this API. That should definitely be available.
You're good, David?
I'm not sure I totally caught what he said.
The interface that Amir was talking about could definitely be used for this. This is in ext4. But that's only if we don't have space, and I think we do. It sounds doable.
Excellent. Thank you very much. I think that's it for the file system track. We're going to wander off and get drunk. We will be back for the evening wrap-up and lightning talks. People on the call, you wander off for three hours and seven minutes. Let me look before I say that for real.
Five o'clock. Three hours and seven minutes until we're back, and then it's the lightning talks and the little wrap-up. Thanks, everybody.
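For reference, the ioctl interface mentioned near the end of the session appears to be the FS_IOC_FSGETXATTR / FS_IOC_FSSETXATTR pair from linux/fs.h, which carries the project ID, extent size hint and CoW extent size hint in a struct fsxattr. The Python sketch below reads that structure. It assumes the common Linux ioctl number layout (as on x86 and arm; some architectures differ) and a 28-byte struct fsxattr. The function name `get_fsxattr` is made up, and no "don't bridge holes" flag is shown because none existed at the time of this discussion.

```python
import fcntl
import struct

# struct fsxattr: fsx_xflags, fsx_extsize, fsx_nextents, fsx_projid,
# fsx_cowextsize (all __u32) followed by 8 bytes of padding.
FSXATTR_FMT = "=5I8s"
FSXATTR_SIZE = struct.calcsize(FSXATTR_FMT)

_IOC_WRITE, _IOC_READ = 1, 2


def _ioc(direction, type_char, nr, size):
    """Build an ioctl number using the common Linux _IOC bit layout."""
    return (direction << 30) | (size << 16) | (ord(type_char) << 8) | nr


FS_IOC_FSGETXATTR = _ioc(_IOC_READ, "X", 31, FSXATTR_SIZE)
FS_IOC_FSSETXATTR = _ioc(_IOC_WRITE, "X", 32, FSXATTR_SIZE)


def get_fsxattr(fd):
    """Return the fsxattr fields as a dict, or None if the ioctl fails."""
    buf = bytearray(FSXATTR_SIZE)
    try:
        fcntl.ioctl(fd, FS_IOC_FSGETXATTR, buf)  # fills buf in place
    except OSError:
        return None  # file system does not implement this ioctl
    xflags, extsize, nextents, projid, cowextsize, _pad = struct.unpack(
        FSXATTR_FMT, bytes(buf))
    return {
        "xflags": xflags,
        "extsize": extsize,
        "nextents": nextents,
        "projid": projid,
        "cowextsize": cowextsize,
    }
```

A new per-file flag of the kind proposed in the session would presumably be one more bit in `xflags`, set through FS_IOC_FSSETXATTR after reading the current values, but that is speculation about a feature that was only being discussed.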