If we could settle down, then we'll begin. Good evening, ladies and gentlemen. I'm sure you're all dying to know what's been going on in the kernel — I'd certainly love to know what he's been doing in there. It is my distinct privilege — and by privilege I mean I bagged it when the sign-up sheets were going around — to introduce to you the most brilliant and beautiful kernel hacker, I'm sure you'll all agree: Ben Hutchings, telling us what's new in the next kernel.

I'm hoping that my slides will appear… that doesn't look like my slide… yes, that looks like it. Okay. So, I gave a somewhat similar talk last year about what was new in the Linux kernel, and this year some different new things have happened, so this talk will not be exactly the same, although it has a pretty similar shape.

To just gloss over who I am: I'm a software developer, or software engineer, depending on what my employer chooses to call me this year, by day, and a Debian developer by night — although since I've started working at home, sometimes the timing is the other way around. I've been working on the Linux kernel since about 2008, both in Debian and in my paid work. I'm currently doing most of the maintenance for uploads to unstable, aside from all the non-x86 ports, which I don't really know very much about. And I'm also maintaining the stable updates to Linux 3.2 at kernel.org, which then feed into Debian and the various other distributions that are based on 3.2.

As has been advised for free software projects, Linux releases early and often — currently about five times a year. There's no real schedule, but it works out at about that, and there are stable updates every week or two, which are just supposed to fix bugs without introducing regressions. Some of the new features that appear in a release aren't completely ready.
Some of them need support from user land. In the last year there have been six releases, 3.11 to 3.16, so we have lots of new features, some of which need integration, and some of which we just need to turn on in the kernel package.

I'll recap what happened to the features I talked about last year. The team device driver needed a user-land support package called libteam, and that was uploaded in October. The team devices are supposed to be a better replacement for the network bonding device.

Transcendent memory: I did think we needed to have a bit of a think about what to turn on, and whether we needed to do some scripting for that. None of which has really happened, but when I was preparing this talk I looked at the options that were there. The transcendent memory framework was present, but a lot of the specific plug-ins for it were not. Transcendent memory is about having a layer in between the working memory, which can be mapped into processes, and the swap file or swap partition and files on disk. Because disks tend to be a bit slow, it can be useful to have an intermediate layer between those. So we now have zswap.
Instead of writing pages out to the swap partition, zswap compresses them and keeps them in memory. That serves the purpose of reducing the amount of memory in use, while also being much, much faster: decompressing those not-quite-swapped-out pages is quicker than bringing them back in from disk. Xen also supports this sort of intermediate state, with the aid of the hypervisor. So that will be enabled in the next kernel upload — Linux 3.16.something, whatever it's going to be — going to unstable soon.

The new KMS drivers that I talked about: as far as I know, those are now supported by the X.org drivers in testing.

Module signing hasn't been enabled, but that's mostly because we haven't really made progress with Secure Boot. At the same point that we get kernel signing for Secure Boot, we should also have module signing, and that should give you a lot more assurance that the kernel you're running is the one you're meant to be running.

I also talked about having more support for discard, which is a way of improving the efficiency of SSDs: if you tell the SSD that some blocks are not currently in use, it can do a better job of wear levelling, and your disk should be a bit faster and last longer. Unfortunately, we're still not enabling discard for SSDs automatically when you install Debian — you have to know what options to turn on later. There's an open bug for that, if you want to work on it.

Last year I also talked about improvements for containers.
We finally had user namespaces implemented properly, so you can create a container where the user IDs can start with zero for root and still be completely distinct from the user IDs that are numerically the same outside that container. You can also take the special capabilities that are normally associated with root and give those to the containerised root, and it will have power over the processes in the container, but not outside the container. One of the blockers for that was XFS, because every file system needs to be able to distinguish the user IDs in the current user namespace from the global user IDs, which get written out to disk. I'm glad to say that has been fixed, so we've been able to enable user namespaces, and I believe LXC and possibly some other container systems are using that.

bcache: I believe people are using that — I think someone talked about it on Planet Debian, I don't remember who. However, the bcache-tools package that's needed to configure it is not packaged; there's an open ITP bug.
I think there's some kind of licensing mix-up there that needs to be resolved, but if you're interested in bcache, please go and look at that bug and see if you can help resolve it for jessie.

ARM multi-platform: I believe we now have the Debian installer working for some of the ARMv7 boards with a multi-platform kernel. I don't know if anyone can give a specific… does anyone know? [Audience: I've got d-i working on ARMv7.] Well, I think it works, then.

There's some progress on GPU drivers, I believe. Nvidia, somewhat surprisingly, has helped to get Nouveau supporting their Tegra SoCs, which have somewhat similar GPUs to those on their PCI Express cards, so Nouveau is suitable for both. And the Novena project is sponsoring development of the Etnaviv driver for the GPU that's used on their laptop-slash-development board. Quite when that will be ready, I don't know.

So, getting on to the new features that have appeared in the last year. Unnamed temporary files: not very exciting, but kind of useful. Currently you can create a file that is not linked into the file system using the C library tmpfile() function, but that actually does have to create a file with a specific name. It will try to generate a random name, usually in /tmp, and if that fails it'll try another name, and another name, and another name, until finally it comes up with something that no one's using — and then it will immediately remove that file, which didn't really need to have a name. So there's now a kernel feature: the open flag O_TMPFILE. If you specify that, and you specify the name of just a directory, then — assuming you have permission to create a file in that directory — you get a new file on the same file system as that directory that doesn't have a name and never had a name.

One of the interesting things you can do with that, which so far as I know you can't do if you use tmpfile(), is that you can actually give this nameless file a name later, using the linkat() call. The result is that you can put content into your file, set all its metadata — permissions, ACLs, whatever extended attributes you want — and then link it into the file system, and it's as if the file had been atomically created. No other process is going to see this file in an incomplete state. That's probably kind of useful for some applications — I'm not sure what, but think what you can do with that. Unfortunately it's not supported on all file system types; it needs specific support in each file system, so you're going to need a fallback unless your application can depend on specific file system types. And of course it only went in in 3.11, so if you need to support older kernel versions you still need a fallback.

I'm going to skip over this and come back to it if I have time at the end.

The Lustre file system is apparently quite popular in cluster computing applications. It's a distributed file system, which is a bit different from things like NFS — it doesn't have a single server. It's been around for a long time, since 1999, so why am I talking about it as a new feature? Well, now it's in the Linux staging directory and it's being updated for each new version of Linux, so at the very least it builds against current versions of Linux. Previously it was kind of lagging behind the kernel, and although we had Lustre in squeeze, it wasn't released in wheezy because it didn't work with Linux 3.2. Unfortunately it's dropped out of Debian completely now, so while we have the kernel side of it working again, we are missing the user-land side. I dropped a mail to the former Lustre maintainers.
Maybe they'll add it back, or maybe they need help — so if you're interested in getting Lustre back into Debian, that's something to look at.

Btrfs has gained support for deduplicating files. You can kind of do that already by using hard links, but the problem with hard links is that an update through one path affects the other path; what you want is to share storage but with copy-on-write behaviour. Now, Btrfs generally doesn't update data in place; instead it makes a new copy and drops a reference to the old data. So you can have data shared between files without necessarily linking them. Snapshots are very cheap, and you can also do a kind of cheap file copy. Currently that requires an ioctl — although possibly cp has a special option for it, I don't remember now — but you need to ask for it specifically. So you may well end up with multiple copies of files anyway, and if you want to save space, you want to deduplicate those. Btrfs isn't going to do that for you automatically; it's not going to go out there and scan — it leaves that to user land. So you still need a dedup tool, and you probably want one that enables copy-on-write sharing rather than hard-linking. There is one of these tools, called bedup, but it's not in Debian yet. So, any Btrfs fans out there who want deduplication: think about packaging that. Could it be part of an existing package?
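The cheap, extent-sharing copy mentioned above can be driven from user land roughly like this. A sketch only: the ioctl number below is the Linux BTRFS_IOC_CLONE value (later generalised as FICLONE), which I'm assuming matches your ABI, and the helper falls back to an ordinary byte copy on file systems that can't share extents:

```python
import fcntl
import shutil

# BTRFS_IOC_CLONE / FICLONE ioctl number (assumed value, Linux).
FICLONE = 0x40049409

def reflink_or_copy(src, dst):
    """Copy src to dst, sharing data extents copy-on-write where the
    file system supports it, and copying bytes where it doesn't."""
    with open(src, "rb") as s, open(dst, "wb") as d:
        try:
            # Ask the kernel to make dst share src's data blocks.
            fcntl.ioctl(d.fileno(), FICLONE, s.fileno())
        except OSError:
            # Not Btrfs (or no clone support): plain byte copy instead.
            shutil.copyfileobj(s, d)
```

Either way the two files are then independent: writing through one path never affects the other, which is exactly what hard links can't give you.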
[Audience comment] Okay — so it looks like this is going to be added to btrfs-tools.

nftables. It's yet another firewalling API — as if we didn't have enough already. Okay, well, let me explain why this is actually a good thing. Currently we have iptables for IPv4 firewalling, we have ip6tables for IPv6 firewalling, we have arptables for the ARP protocol, and we have ebtables for Ethernet bridges, working at the Ethernet level. All of them are protocol-specific; they need a kernel module for each kind of matching you might want to do, and they need a kernel module for each action — I think some of the actions are somewhat shareable between them. They're all based on the kernel's netfilter API internally, which is somewhat more flexible, but only if you're prepared to write another module.

So the nftables API exposes more of that flexibility. It adds a kind of virtual machine, similar to the Berkeley Packet Filter that's commonly used for packet filtering on sockets. I'm not quite clear on why it needed a different virtual machine, but apparently it did — BPF wasn't quite good enough. So user land can generate matching code and upload it into the kernel, and it will all be safe, probably, because it's limited to what can be sandboxed within this specific virtual machine.

We have a user-land tool, nftables, which uses this API, and it's already packaged, so that's great. But the next stage is going to happen somewhere down the line — I don't know quite how far off it is. All those old firewall APIs are now redundant, because you can generate all the matching code in user land now; you don't need specific native code for it. So the user-land tools — iptables, ip6tables and so on — will need to be ported to use nftables, and hopefully the upstream maintainers will do that.
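For a flavour of what the user-land side looks like, here is a minimal example ruleset for the nft tool. It's illustrative only — the table and chain names are my own choice, not anything mandated:

```
# A small nftables ruleset (load with: nft -f <file>).
table inet filter {
    chain input {
        type filter hook input priority 0;
        ct state established,related accept   # keep existing connections
        iif lo accept                         # allow loopback traffic
        tcp dport 22 accept                   # allow SSH
        drop                                  # drop everything else
    }
}
```

Note the single `inet` family table covering both IPv4 and IPv6 — one ruleset where the old tools would have needed parallel iptables and ip6tables configurations.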
All right, so: lock debugging. Multi-threaded programs often have bugs involving locks, and the kernel is a massively multi-threaded program. Every single task that exists in user land can also run in the kernel, so you've got a thread for each of those; you have kernel-internal threads; you have interrupts; you have soft interrupts; you have non-maskable interrupts. So you have a huge number of interesting interactions there, and lots of different synchronisation mechanisms — mutexes, spinlocks, RW locks — and locking operations that may inhibit interrupts or soft interrupts temporarily. Hopefully we avoid data races that way, but with all this locking going on, unfortunately we might take locks in the wrong order and get deadlocks. They're easy to introduce, and of course they'll bite users in the field, and we won't know how to reproduce them.

For some years now the kernel has had a system called lockdep, which dynamically tracks the sequences of locking operations, but then does a sort of static comparison of those dynamic sequences and works out: supposing these sequences occurred in parallel, could that potentially result in a deadlock? So although it's not a pure static analyser, it can very quickly find many types of deadlock bug. That's helped to find and fix a lot of bugs, and in conjunction with the Trinity fuzzing tool it has resulted in a lot of improvements in the robustness of the kernel. Now you too, in user land, can use lockdep — just as soon as we get around to packaging it. It's available as a library, which is in the Linux source tree and should be built from the linux-tools source package; only it isn't yet. I hope to find time to do this, but if it's something that sounds really interesting to you, then please, please help.
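The class of bug lockdep looks for is easy to demonstrate in user land too. The sketch below shows the discipline that avoids it — every thread takes the locks in one agreed order — rather than the deadlock itself, which would simply hang:

```python
import threading

# Two locks. The classic "ABBA" deadlock happens when one thread takes
# lock_a then lock_b while another takes lock_b then lock_a: each ends
# up waiting for the lock the other holds. The fix lockdep pushes you
# toward is a global ordering rule: always acquire lock_a before lock_b.
lock_a = threading.Lock()
lock_b = threading.Lock()
finished = []

def worker(name):
    # Both threads respect the agreed order, so no cycle of waiters
    # can form and both always run to completion.
    with lock_a:
        with lock_b:
            finished.append(name)

threads = [threading.Thread(target=worker, args=(n,)) for n in ("t1", "t2")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Lockdep's trick is that it doesn't need the two orderings to collide at run time: observing lock_a→lock_b in one execution and lock_b→lock_a in another is already enough to report the bug.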
So, we have a couple of new ports. Well, actually there are lots and lots of new architectures being added to Linux all the time, many of which are not supported in Debian. But the arm64 architecture — although it sounds a lot like ARM, and comes from the same company, it is actually very, very different from 32-bit ARM, and it's currently treated as a completely separate thing in the kernel. The initial support for it was added over a year ago, but it wasn't really usable. In the last year it has become useful — in fact it has reached the point where you can run it on both emulators and real hardware. I believe that the Debian packages of the arm64 kernel do run on real hardware, although I haven't seen it happen for myself. If anyone wants to donate me an arm64 board to test that, I'm perfectly willing to take it.

The kernel has had support for PowerPC for a very long time, and there are several different variants of that. We've had 32-bit PowerPC in Debian, and we've had an unofficial port to 64 bits, both of which are big-endian. The kernel has always run big-endian, although it supports little-endian user land — I don't think we supported that in Debian, but the kernel did. Recently the OpenPOWER consortium has decided that POWER is going to be far more popular if only it could be little-endian.
That's the thinking, anyway. So now we have the ppc64el port — 64-bit little-endian, both kernel and user land. That landed in Linux 3.13: the kernel can run little-endian as well, and there's a subtly different user-land API for that. Both of these ports are being bootstrapped in unstable as we speak; I think they're making quite good progress, and they might even make it into jessie.

File-private locking is another one of those things that's not particularly exciting, but it sort of fixes a bug in POSIX — POSIX being the standard that Linux, Unix and similar kernels attempt to follow as a kind of core API. POSIX says that if you lock a file, then as soon as you close any file descriptor to that file, your process drops its locks on that file. The thing is, you can open the same file multiple times, giving you multiple file descriptors. If you have a multi-threaded process, you might well have multiple threads that don't know about each other opening the same file and locking different ranges of it — because these are range locks, not whole-file locks. As soon as one thread closes the file — oops, it dropped the locks that belong to the other thread as well. So your multi-threaded process now needs serialisation around opening and closing files, which is a bit silly. What's more, you have the problem of hard and symbolic links, which mean that maybe those files you thought were two different files are actually the same file, so that serialisation still doesn't help you.

So we have a solution to this: a new type of lock. Well, it's almost the same type of lock, but it has the right semantics: it's associated with the open file description, so two threads that open the file separately now have completely separate sets of locks, each associated with its own open file.
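The difference between classic POSIX locks and the new open-file-description ("file-private") locks can be seen from within a single process. A sketch, assuming Linux 3.15 or later and a 64-bit struct flock layout; the value 37 for F_OFD_SETLK is the Linux constant, used here as a fallback where the Python fcntl module doesn't expose it:

```python
import fcntl
import os
import struct
import tempfile

# Open-file-description lock command (Linux >= 3.15); newer Pythons
# expose it as fcntl.F_OFD_SETLK, older ones don't.
F_OFD_SETLK = getattr(fcntl, "F_OFD_SETLK", 37)

path = tempfile.NamedTemporaryFile(delete=False).name

# Two independent opens of the same file, in the same process.
f1 = open(path, "wb")
f2 = open(path, "wb")

# struct flock: l_type, l_whence, l_start, l_len, l_pid
# (native 64-bit layout assumed).
lock = struct.pack("hhlli", fcntl.F_WRLCK, os.SEEK_SET, 0, 10, 0)

fcntl.fcntl(f1.fileno(), F_OFD_SETLK, lock)  # f1 locks bytes 0-9

# A classic POSIX F_SETLK here would "succeed", because the process as a
# whole already owns the region. OFD locks belong to the open file
# description, so the second open is a different owner and the request
# fails (EAGAIN on a non-blocking attempt).
conflict = False
try:
    fcntl.fcntl(f2.fileno(), F_OFD_SETLK, lock)
except OSError:
    conflict = True
```

And crucially, closing f2 no longer silently drops the lock that f1's open file description holds.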
Multi-queue block devices don't need anything new from your application; this is a performance feature. Every block device that corresponds to a physical device — a physical disk — is likely to have some kind of command queue or request queue, which holds all the reads and writes that have been started, or are about to start, on that block device. Depending on the capabilities of the hardware, you might only be able to send one command to the hardware at a time, or you might be able to send multiple commands — that's called NCQ. In any case the queue is maintained in software, and if you only have a single queue for your single device, then adding things to the queue and handing things over to the hardware has to be serialised, and the completions are also handled in a single context. So you potentially have inter-processor interrupts to wake up processes on other CPUs. If you have a really fast SSD, that can be a bottleneck. A really fast SSD might actually support multiple queues, but that doesn't help you so long as the kernel is using a single queue.

That has finally changed in 3.16: if the driver supports it, there can be multiple queues for a block device. You can have multiple CPUs adding to these queues in parallel, sending commands to the device in parallel, and completions come back in parallel, so hopefully you don't have so much contention between different tasks using the device. So far, however, there's only one driver that supports this: mtip32xx, which is for a very expensive family of SSDs. I believe the SCSI drivers that work with some other kinds of SSDs are going to support this soon, but they missed 3.16. [Audience comment] Oh, I missed that — okay, I'm repeating it with the microphone: NVMe is already multi-queue. Okay, great.
So I think that covers all the really fast SSDs, probably. Great. So, okay — if anyone has questions, I can take them now. I can also go back to talking about network busy polling, which is another sort of interesting high-performance feature. Is this working? Yeah, okay — I'm going to find out whether there are any more questions.

[Q] So, with jessie having frozen at 3.16 — well, not frozen, but "slushy", I heard, is the description… [A] Well, we haven't frozen — we're not freezing until November. And 3.16 isn't in unstable yet; the next upload will be based on 3.16. [Q] So what is the opportunity for getting patches that were accepted and queued for 3.17 into jessie's 3.16? [A] The chances of that happening are very good if you ask now, less so as we get towards the freeze. Beyond the freeze we can still take patches for hardware enablement — so driver support for new models and so on, those are still okay — but the earlier you ask, the better.

[Q] I don't believe, personally, that grsecurity is magic, but so many people have been requesting that it just get into Debian that it's getting on my nerves. Is there a chance that we'll ever get maybe a grsec flavour in Debian? [A] grsecurity is not going to support 3.16 for very long, so far as I know, so that's not really going to work as a patch within the Linux package. There was some discussion several months back about the possibility of doing a separate linux-grsec package, which would be based on the 3.13 or 3.14 branch — whichever it is that's going to have long-term support from the grsecurity developers. But I haven't seen any sign that that's actually happening; no one has tried uploading that to NEW yet. It's not going to come from the kernel team, because we've kind of got our hands full.

[Q] Hi — thanks for all your work on the kernel team, first. Maybe just to make sure I understood the answer to the previous question: I think your answer was that if
someone wants to upload linux-grsec, then that's plausible — is that basically the summary? [A] Yes. I mean, the usual objection to that would be that it's more or less duplicating code, and the security team doesn't like there being duplicate code in the archive. However, in this case I think there was a tentative "yes, it would be okay", because both branches are fairly well supported, or something like that.

[Q] Okay, and then my other question is about backports kernels. I use backports kernels, but I kind of just install them willy-nilly, whenever. Is there a schedule or a plan? I'm glad they happen, but when and why do they happen? [A] They're kept more or less up to date with testing. I don't always remember to upload as soon as a testing propagation happens, but I hope I'm doing the uploads in a fairly timely fashion.

Any further questions? …That's right, that goes for wheezy-backports. I haven't done any uploads for squeeze-backports for some time now; possibly I may do that to support squeeze LTS, but I haven't got around to it yet.

[Q] Sorry if this is a question from an outsider's view, but is there any qualification process or testing done on the kernels as they go into either backports or testing or unstable?
[A] Not really. For testing propagation it's the usual rule, which is that if there are new release-critical bugs raised against the version in unstable, that will block propagation to testing. And anything that goes into backports was previously in testing, so barring compiler bugs or incompatibilities, if it's okay in testing it should be all right in backports as well. It's not really very satisfactory, but without hardware qualification that's the best we can do at the moment.

[Q] I was asking in particular because a lot of Google Compute Engine's customers — in fact the vast majority — are running the backports kernel. Around the upgrade from 3.13 to 3.14, customers would get the kernel as it came out, and we had a very large number of bugs start appearing. So I was wondering if there were any plans for a testing process or a qualification process around kernels being put into backports. [A] If you want to help with that — you know, if you want to provide some of your engineering time to help with that — that would be very much appreciated, because no one wants to upload broken packages. But as it is, no one on the kernel team has a lot of time to spend on this, unfortunately.

Right, looks like we're done here. So, thank you all very much for coming out in droves to hear about the kernel.