Hi, welcome everyone to this session, where Julian, who is one of the apt maintainers, will tell us all about what he and his co-maintainers have been up to in the last year, and maybe what they will do next year.

Thanks. Okay, in the past year a lot of things happened in apt, and I want to start by talking about fancy new security stuff. First of all, file fetching. File fetching in apt works using methods, which are processes that are run to fetch files using a single protocol, and usually a method runs for a single host, so it fetches files from one host. Those files are downloaded to a partial subdirectory and then moved into the parent directory when they have been fetched successfully. What we did in 1.1 was change the user the methods run as from root to the _apt user and the nobody group, and we gave that user write access only to the partial directory, so it cannot write anywhere else on the system, or read home directories, configuration data, or other things it is not permitted to read. The files are then moved from partial to the main directory by the parent process. You can see the permissions: the lists directory has the normal permissions you would expect, which is 755, and the partial directory can only be modified, read, and traversed by the _apt user.

That is quite nice, but it also has a few problems, because currently the method verifies the file hashes, which is obviously a huge loophole: if we verify the file in the method and the method is compromised, it could just say "hey, the file is correct", and the file would still be used by the parent apt process. Then, if it is a package, it would be installed and could have a backdoor or whatever. We can fix that. For example, we could check the file in the parent process, but then we would do the checksumming as root, which is also not so nice. So maybe we invoke a helper process which we start as nobody. That would be safe: it could read the file, which we can pass to it as an open file descriptor, and then it could just exit with zero if the file verifies correctly, or with an error code if it does not. What we are wondering about is whether there is a performance regression, because currently we verify the file while we are writing it: we take a block of data, add it to the checksum, and then write it out. If we do it in another process, we have to read the file again, which might be a bit slower, but it should normally be cached by the operating system kernel.

Another problem is that methods can write to any file in the partial directory. They can also read any file in the partial directory, or list all files in it, even the ones they are not responsible for, which means my method could modify a different file belonging to another method, and that would not be very nice. One idea we had was to open the files in the parent process and send them to the methods using sockets, but that might take some time, because it is a huge change to the current protocol, which is basically text-based. We could also remove the read permission, so a method does not know which other files are in the directory. That might work; I am not sure about the pdiff case, which requires multiple files, because the patches for a Packages file are merged into one patch and then applied in one go by the method responsible for that.
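Going back to the checksum helper idea: here is a minimal sketch, not apt's actual code, of a parent that opens a downloaded file and hands the descriptor to a checksumming child running unprivileged. The path, the nobody uid/gid of 65534, and the toy rolling checksum are placeholders; a real helper would run a proper SHA-256, and dropping privileges only works if the parent has them.

```cpp
// Sketch of the proposed verification helper: parent opens the file, forks,
// the child drops privileges, checksums via the inherited fd, and reports
// the result through its exit status. Placeholder hash and uid/gid values.
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>

static bool VerifyInChild(int fd, unsigned long expected) {
    pid_t child = fork();
    if (child < 0) return false;
    if (child == 0) {
        // Drop privileges before touching data (gid first, then uid);
        // this needs the parent to run as root, as apt's parent does.
        if (setgid(65534) != 0 || setuid(65534) != 0) _exit(2);
        unsigned long sum = 0;            // stand-in for a real SHA-256 context
        unsigned char buf[4096];
        ssize_t n;
        while ((n = read(fd, buf, sizeof(buf))) > 0)
            for (ssize_t i = 0; i < n; ++i) sum = sum * 31 + buf[i];
        _exit(n == 0 && sum == expected ? 0 : 1);  // 0 = file verified
    }
    int status = 0;
    waitpid(child, &status, 0);
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}

int main() {
    int fd = open("partial/Packages", O_RDONLY);   // hypothetical path
    if (fd < 0) return 1;
    printf("verified: %s\n",
           VerifyInChild(fd, 0 /* expected digest, placeholder */) ? "yes" : "no");
    return 0;
}
```

The exit-status convention keeps the protocol between parent and helper trivial, which is the point of the idea: the unprivileged helper never has to be trusted with anything beyond one read-only descriptor.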
Another idea for further securing the methods is seccomp sandboxing, so we can restrict the allowed system calls to limit the attack surface: if one syscall has a problem and it is not on our whitelist, we are not affected by it, which is quite nice. But we must maintain whitelists of the system calls that are needed by each method, which is a bit complicated. It also has a problem with proxy scripts: you can have proxy auto-detection scripts, specified in a config file, which are run by the method, and they would inherit the restrictions of the seccomp sandboxing. I implemented this in a branch in my repository for the HTTP method, but I have not published it in an apt release yet, because it is still work in progress.

Another topic, a much more publicized one I think, was the SHA-1 removal. Starting with 1.2 we consider SHA-1 an unsafe hash algorithm, and a repository that is missing SHA-2 fields now causes an error. GPG signatures that are not SHA-2 only cause warnings, because there were just too many of them and treating them as errors would break a lot more repositories than it currently does. We already broke some repositories: for example, all the Google repositories were broken, but they have been fixed now. It took some time, but I am glad they are working. As I said, there were lots of warnings: for example, the Launchpad PPA repositories all produced warnings, because they were all signed with a SHA-1 hash. But they luckily had the right type of key: high-bit RSA keys rather than DSA keys. DSA keys would have required a migration from the DSA key to a new key, which is a bit more complicated than changing the hash, which could be done very easily in the Launchpad case.

One thing a lot of users complained about (and it is a good thing that they complain, but it is not that easy to fix) is that you cannot disable the errors or warnings right now. So it is not really possible to fetch repositories that miss SHA-2 fields, because apt just errors out and you cannot do anything about it. That might be fixed in 1.3; I am not entirely sure yet, but the plan is that you can say: I want to allow this repository to have a SHA-1 hash. For example, if you point apt at an old Debian snapshot, from etch or something like that, which is weakly signed, you can just say: hey, I want to trust this, because I need to do stuff with it for historical reasons. And in January 2017, so next year, we will start to treat weak signatures of the Release files as errors as well. I think the browsers are doing the same thing; that is the common SHA-1 deprecation point, so I thought, let's do this too. Probably also for the Ubuntu LTS release, which already has this whole thing enabled, but we can think about that later.
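Sketching what that per-repository opt-in could look like in sources.list: the long-standing trusted=yes option already exists as a blunt instrument that skips authentication entirely, while the allow-weak spelling shown here is my assumption about the planned finer-grained interface, not a confirmed option name.

```
# Hypothetical opt-in for a weakly-signed historical repository,
# per the 1.3 plan described above ("allow-weak" is an assumed spelling):
deb [allow-weak=yes] http://snapshot.debian.org/archive/debian/20070101T000000Z/ etch main

# The pre-existing blunt instrument, which disables verification altogether:
deb [trusted=yes] http://example.com/historical-archive ./
```

The difference in intent matters: trusted=yes silences everything, whereas a targeted allow-weak style option would keep full verification and only tolerate the weak hash in the signature.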
That was it for security. Another very interesting topic was performance. You might have noticed that apt got a lot faster between 1.1 and 1.2. The obvious reasons: we had forgotten buffering in the pdiff applying, and introduced read buffering in 1.1.6 and 1.1.7, which made the update take four seconds instead of 41 seconds, so it actually becomes useful now. In 1.1.9 we introduced write buffering, which further improved the runtime, so applying a patch now takes half the time in my test scenario, which involved a huge Contents file and a huge patch. In 1.2 we start the methods in parallel, so you can patch multiple Packages files in parallel, which vastly improves performance again, because it now scales with the number of repositories, up to the size of your CPU core count.

Now comes another trick, especially for apt-file: we introduced LZ4 support in 1.2, which means we can dynamically recompress files we fetch using the LZ4 compression algorithm. We are doing this for Contents files especially, because Contents files are really huge and they took a lot of time to compress with gzip, and you do not want to store them uncompressed. So whenever we have to update them, we have to decompress the Contents file, apply the pdiffs to it, and then recompress it, which is insanely slow with gzip: it takes multiple seconds for one file. With the LZ4 support we can do this in a far shorter time. I have not measured it, or at least I do not remember how much it was, but I think it is less than a second now, which is quite nice.

There is also the gzip indexes option, which has existed for far longer already. As the name implies, it was originally used for gzip indexes, so all files were stored gzip-compressed in the lists directory, but now we can use LZ4 for that as well. That is configurable with a config option somewhere, but I do not currently know where that is; it is probably documented somewhere, I just do not have it in my mind right now.

The effect of the whole thing is that, apart from updating much faster, apt-file improved its performance as well. If you search for something with apt-file, so you search for a file, or you search for which package contains which files, it should now be six to seven times faster than it was before 1.2, which I think is really, really nice. And gzip indexes are now only about 20 percent slower than uncompressed files, which is good, because you can just keep the files compressed and not really notice the difference in performance.
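For reference, a sketch of the configuration as I understand it: Acquire::GzipIndexes and Acquire::CompressionTypes are real apt.conf knobs, but the file name and the exact ordering semantics here are my assumptions, so check apt.conf(5) for your version before relying on this.

```
// Hypothetical /etc/apt/apt.conf.d/99compressed-indexes sketch:
// keep downloaded index files compressed in /var/lib/apt/lists.
Acquire::GzipIndexes "true";

// Prefer lz4 when recompressing locally (ordering semantics are an
// assumption here; apt.conf(5) documents the authoritative behaviour).
Acquire::CompressionTypes::Order { "lz4"; "gz"; };
```

This trades a small amount of read-time CPU (the roughly 20 percent mentioned above) for considerably less disk usage in the lists directory.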
Now we come to somewhat more complicated, more internal things that affect performance. The first is string views. When we are reading a file, a Packages file for example, we want to get the data from the Packages file into our cache file. What we did previously was read it into a buffer, then create a string out of the buffer, and then maybe trim the string or do some other modifications on it, and that copied a lot of data. So in 1.2 we introduced string views, which are classes similar to strings, but unlike strings they do not hold the data themselves: they only reference a block of memory and have a length. String views are in the C++17 standard, but we have our own limited implementation, so we can do this without depending on C++17, which would not be a good idea. The original idea for the string view thing came from the port to the iPhone that was done by the Cydia maintainer, because on the iPhone things were really slow, the memory was really slow, and this copying took a lot of time apparently. So there he replaced the strings with his own kind of string view; he named it differently, but that is not really relevant, it is very similar to a string view. This reduces data copying a lot, because now we can basically read the Packages file into a buffer and then write directly from that buffer into our cache file without doing any copy in between. That is probably a huge benefit for devices that have low memory bandwidth, such as embedded devices like the iPhone was, or other devices like the Raspberry Pi.

Another thing that was really annoying, because it was very slow, was syncing the header of the cache file when we wrote it. After we wrote the cache file, it had a dirty bit set; then we synchronized the file with the storage device, then we unset the dirty bit and synchronized again. That meant the entire update was blocked at the end on a huge fsync call, which really bound the performance of the update call to the performance of the storage device. What can we do to improve this? We introduced a checksum: we write an Adler checksum into a checksum field in the cache. We use Adler checksums because they are available in zlib, on which we already depend, so that was easy, and they are faster than CRC checksums. The idea is now obvious: when we read the cache, we verify the checksum, and if the checksum matches, the cache is okay; if not, the cache is maybe truncated, or some bits flipped, or something like that. So that is even better than a dirty bit. But it means that read performance is now slower: if you run apt-cache show, it now takes a bit longer, I think 80 milliseconds instead of 8 milliseconds on my system, for example. That is not really noticeable on most systems, but the performance in the update case is much better now, because it does not need to sync data anymore, which was really stupid: it is a cache file, we do not really need this file on the device, and if it is broken, we just generate a new one. The data integrity is also nice to have; I have wanted that for quite a long time, because we always have segmentation faults in apt that nobody knows the cause of, and mostly it is just because the cache is broken in some way, so it is good to have this.

And before I forget it: we also increased the hash table size. We have a fixed hash table size, so every package name is inserted into a fixed-size hash table, and it was 16,000 slots, which was obviously not a good idea anymore, because we have a lot more than 16,000 package names. So I chose to increase it to 50,000, and we also switched to a different hash algorithm, the DJB algorithm. You probably know that algorithm, it is quite popular.
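To illustrate the two ideas just mentioned, a non-owning string view and the DJB hash over it, here is a minimal sketch; apt's real classes (its own string view and cache hash table) differ in detail, and the names here are made up.

```cpp
// Minimal illustration of a non-owning string view plus the classic DJB
// (djb2) hash. A sketch of the concepts described above, not apt's code.
#include <cstddef>
#include <cstdio>

struct StringViewSketch {
    const char *data;   // points into someone else's buffer; nothing is copied
    size_t length;

    // djb2: h = h * 33 + c, starting from 5381.
    size_t Hash() const {
        size_t h = 5381;
        for (size_t i = 0; i < length; ++i)
            h = h * 33 + static_cast<unsigned char>(data[i]);
        return h;
    }
};

int main() {
    // Imagine 'buffer' is a freshly read (or mmap'ed) Packages file; the view
    // references a slice of it without allocating a std::string.
    const char *buffer = "Package: reprepro\n";
    StringViewSketch name{buffer + 9, 8};          // the "reprepro" slice
    printf("slot = %zu\n", name.Hash() % 50000);   // 50,000-slot table, as in 1.2
    return 0;
}
```

The point of the view is exactly what the talk describes: the package name is hashed and copied into the cache straight out of the read buffer, with no intermediate string allocation.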
But one thing I really like in apt 1.1 is pinning. Pinning did not really work that well in the past. In the old pinning algorithm, each package has one pin. There are pins we call specific-form pins, and sources also have pins, which we call general-form pins; those are the "Package: *" pins. The first specific-form pin matching a package is the specific pin that applies to it. That worked a bit. You can see this case works: it has two general pins, experimental is pinned to 100 and unstable is pinned to 900, and it correctly picks the unstable version of Firefox. But now let's say we do not want to install Firefox from experimental, so we pin it to -1, which means "do not install this", with a package-specific pin. This did not really work: apt tells you, oh no, I do not want to install any of these versions. That is obviously not what we want, right? We want it to still install the unstable version and just ignore the experimental one. In apt 1.1, that is exactly what we do: we now pick the unstable Firefox again. If you instead pin the unstable version to -1, it automatically picks the experimental version. Automatically installing experimental packages might not be a good idea, but you have the same issue with backports of a new package that does not exist in your stable distribution, and there it looks much better.

Okay, so what is the difference? We assign the pins not to packages anymore, but to versions. The first specific-form pin matching a version applies to that version, and if there is no specific pin for the version, its priority is the maximum priority of all the sources it appears in. This also means that choosing a candidate version is now much easier, so we reduced that code by two-thirds of its size. It now just finds the version with the highest pin that is not a downgrade, unless the pin is above 1000, which is the magic allow-downgrade pin value.
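The Firefox example above, written out as an apt_preferences(5) file, would look roughly like this (archive names and priorities as in the talk):

```
# Two general-form pins ("Package: *"): unstable at 900, experimental at 100.
Package: *
Pin: release a=unstable
Pin-Priority: 900

Package: *
Pin: release a=experimental
Pin-Priority: 100

# A specific-form pin: never install firefox from experimental.
# With the old per-package algorithm this blocked every firefox version;
# with per-version pins (apt 1.1) it only vetoes the experimental versions.
Package: firefox
Pin: release a=experimental
Pin-Priority: -1
```

Under the new scheme the -1 applies only to the experimental versions of firefox, so the unstable version at priority 900 remains installable, which is the behaviour users expected all along.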
Now a bit of other stuff. You probably do not know about this yet, but in 1.3, apt's autoremove will only keep the latest provider of a virtual package. So if you have multiple Linux images installed, and they all provide the same virtual package, let's say they provide linux-modules, and you have a package depending on linux-modules, then apt 1.2 or older would not remove any of the Linux images you have installed. It thinks that your package depending on the modules package depends on all of the kernels, because it just transitively walks the tree, sees that they reference the kernels somehow, and says: oh, the kernel is used, we do not want to remove that. In 1.3 we look at the source package and only keep the latest provider from that source package, which means the kernel example now works. This is very important for some Ubuntu people, because they have ZFS modules in their kernel and packages depending on the ZFS modules virtual package, and that meant they did not get any kernels autoremoved anymore. Now it works again.

Hashsum caching: you might have noticed that apt-ftparchive was really slow; I think I have read a lot of bug reports about that. That was because the hashsum caching was broken in 1.1, but in 1.3 it is fixed again, thanks to a pull request on our GitHub mirror. That is cool, because it is now actually usable for larger things again. Another interesting fact is that I merged a patch for the FTP method that had been lying around in the BTS since 2007, and it applied cleanly, because nobody takes care of the FTP method anymore. The patch was about passive-mode FTP servers returning responses we did not expect; you can look at the bug report or the commit message if you want to know more about that.

Also, we now have a systemd service and a systemd timer. Previously that was a cron job, and it had a check for whether the system is running on battery. The cron job automatically updates the indices on your device, so it runs apt update automatically, and previously we checked whether we are running on battery and then basically exited with zero. That was not a good idea, because it is not overridable: you cannot say, I want to always run this update even if I am on battery. In 1.3 you can do this, because you can now override it in the systemd unit by setting ConditionACPower to false in your override unit, the drop-in thing, I do not know exactly what they call that.

A bit of a recap of things from 1.1: we introduced the ability to install local Debian packages. You can just say apt install ./file.deb. You can specify any absolute or relative path, but you have to start a relative path with a dot and a slash, because otherwise apt will not recognize it, for safety reasons: we do not want a file name to be mistaken for a package name by accident, or the other way around. You can do the same thing with build dependencies: you can say apt build-dep on a directory, or you can say apt build-dep on a .dsc file, and it will do that. I have that here, so I can run it, and it installs all the build dependencies for the package reprepro, which is quite cool. I also had another example with a .dsc file, but that is not really important.

Then there is this by-hash thing; you might have noticed it, there was an announcement about it. This works basically by storing the files under their hash value: you take the hex digest, and you store the files in a by-hash subdirectory, where there is a subdirectory per hash algorithm name and the files in it are named by the hex digest, and the conventionally named Packages files just link to them, or the other way around, but this way around it makes the most sense. This is really useful because it prevents hashsum mismatches, which were quite common previously, because you can now basically update a mirror transactionally: you update everything except the Release file first, and then you update the Release file, and it just works as you would expect it to.
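To make that layout concrete, a repository using by-hash looks roughly like this (digest shortened for display, paths illustrative):

```
# Illustrative acquire-by-hash layout inside a repository:
dists/unstable/main/binary-amd64/Packages.xz                  # conventional name
dists/unstable/main/binary-amd64/by-hash/SHA256/1f8e2a...d94c # same content, named by digest

# Clients that saw the new Release file fetch the file named by the new
# digest; clients holding the old Release file still find the old digest's
# file, so a mirror can stage all index files first and flip Release last.
```

That last property is what makes the mirror update transactional: at no point does a client resolve a Release file to index files that do not match it.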
Now for future stuff. Actually, I wanted to work on some of this before, but did not really get to it. Patterns, for example: I wanted to work on patterns during DebCamp, but I did not manage to write anything useful yet. You know patterns from aptitude; we are trying to bring the same patterns to apt, so you can then use patterns on all apt command lines and in preferences files, which I think will be quite useful.

Also, David is working on improving the communication between apt and dpkg. You might have heard about installation ordering: the problem of specifying the order in which packages are passed to dpkg and grouped into dpkg calls. David is implementing a protocol for external installation planners, where you have external programs that can return an order in which the packages should be installed. This is his Summer of Code project, and there is already a 0.1 version in the git repository that is about to be uploaded. I do not know if it actually works, I have not really tried it out yet, but you can ask David in the IRC channel if you want to know more about that.

And finally, debdelta support: I think it might make sense to introduce debdelta support natively, or another thing that is similar to debdelta, for users in areas that do not have good internet connectivity. I think a lot of Indian users have that problem, and in Africa probably as well, so that would be a good thing to have.

If you want to help us, we are especially looking for people to maintain apt-ftparchive and our dselect integration, because we do not really use the dselect integration or really know what it is doing. It is a shell script, so it should be accessible to maintain. And apt-ftparchive is also not really used by us anymore, so we do not do much work on it; we just fix bugs if we can and do not spend too much time on it. If you want to help, you can submit patches at the bug tracker or on our mailing list, or pull requests on our GitHub mirror. The GitHub mirror is a bit manually maintained, so I currently have to manually push new commits there; it used to work automatically, but currently it does not. But you can submit pull requests and issues there if you want to, and we will take care of them. You can also contact us on the mailing list and in the IRC channel: if you have other questions, if you want to discuss bugs, patches, or whatever, just contact us and we will find out what we can do. That is it already, so: questions?

First of all, thanks to all the apt maintainers for making our lives easier. I have quite a few questions, simple ones. The first is: when I am installing, let's say, any .deb package, for example maybe I took it from a Debian mirror, or I took it from somewhere, and I am not so sure about the authenticity of that package, is there a way to know whether the signature is there, or a checksum, so we know that it is all well and good?

Well, if you manually downloaded the file and then pass it to apt install, then apt cannot verify it. But it might, in some sense, use the version from the mirror instead. I am not sure if that still happens, but it used to do that. But that is probably not what you want.

So is there some sort of checksum? Because once you have updated the index, there are times when you are somewhere where you cannot fetch from the whole mirror, but you are able to get the one package, or a group of packages, from the web, and just install them. Is there some way of figuring out, of knowing, that this is the right thing? Of course the idea is that because you are taking it from a mirror, or from Debian somewhere, it will all be good, but internally, is there a way to check that?

Theoretically, if you have the indexes updated, you could verify it against the index, but we do not have any command for that yet. Okay.
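As a rough manual workaround for what the questioner describes, checking a separately downloaded .deb against an already-fetched index, one could compare digests by hand; this assumes the downloaded file is exactly the version listed in the (signed) index, and the package name here is just an example.

```
# Hypothetical manual check of a downloaded .deb against the local index.
# apt-cache show prints the package record from the Packages file,
# including its SHA256 field, assuming the index lists this exact version.
sha256sum ./reprepro_*.deb
apt-cache show reprepro | grep -E '^(Version|SHA256):'
# If the two digests match, the file is the one the signed index describes.
```

This is essentially what a hypothetical built-in verify command would automate: the trust chain from the signed Release file down to the per-package hash already exists, only the user-facing command is missing.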
Okay, the second question I had: at some points, and this has happened to me quite a few times, the index gets corrupted, so I need to remove all the files and then try to update again. Is anything coming in the future that would help with that? Maybe a backup, even if it is not today's index, say a two-day-old index, so I can rebuild from that and have a smaller update, something like that?

So you basically want to back up the local files? Yeah. One way, of course, is that I just do that myself, but that becomes kind of a headache to do all the time, because, for instance where I come from, sometimes the electricity goes out while I am updating the index, and it becomes a pain to do that again and again; if you are doing it from scratch, it takes quite a while to get the whole index updated.

We do not have anything planned for that yet, so maybe we can think about it, but it is probably a bit difficult to do. There are also more issues with the whole index corruption thing, because apt does not automatically detect it and remove the files, which would be nicer. So maybe we can have some kind of fsck command which verifies that the files we downloaded are still correct and then automatically asks you to remove them, or something like that.

And lastly, I think this feature has been requested for quite some time, or at least I think I saw it on the BTS: let's say when I am upgrading a package, or a bunch of packages, and updates have come in, sometimes a severe bug or an RC bug is known. Could it be done so that the RC bugs are shown before downloading the packages, so we know which packages we do not need to download?

Well, that is apt-listbugs; it is just showing the bugs, and that is a hook that runs after the download, I think. Maybe we can do some pre-download hook, but I am not quite sure whether that exists yet or whether somebody is working on it. Okay, thank you.

Are you around with some time over the next couple of days to discuss a new feature request?

Yes.

I will fill in a little bit more. I have floated the idea with some other guys, and I have been totally slow and rubbish at getting through to the apt team about it yet. The existing apt-cdrom support is very brittle and rotten and old, and I know you guys hate it and are not really interested.

Yeah, we do not use CD-ROMs at all.

Well, exactly, you guys do not; unfortunately, a lot of other people do. What I would like to propose is throwing it away and replacing it with something called apt-removable instead. Say, for example, at the moment people install off a USB stick, which is much, much more common, but they do not have a good net connection. If later on they want to install more packages, apt will happily tell them "please insert the CD-ROM" that matches the information that was on the USB stick, but it will never actually pick up on the USB stick if it is inserted. You know, this is a bit rubbish for people who do not know any better, and we have had a whole slew of complaints about it over the last few years. It would be really, really awesome if we could plug into, say, udev or similar to recognize that a piece of generic removable media, rather than a CD-ROM, has been plugged in, pick up on the metadata, and do the right thing. But I do not want to take up all the time here; I would like to talk to you about it, and I will sit down with you at some point over the next couple of days if I can.

Yes, that is good. Okay, I think the idea is a good one. Sure.
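As an illustration of the kind of hook the proposal would need, here is a hedged udev sketch; the rule syntax keys are real udev match keys, but the helper program and its behaviour are entirely imaginary, since apt-removable does not exist.

```
# Hypothetical udev rule for the apt-removable idea: when a filesystem on
# removable block media appears, hand the device to an (imaginary) helper
# that scans it for repository metadata and registers it as a source.
ACTION=="add", SUBSYSTEM=="block", ENV{ID_FS_USAGE}=="filesystem", ATTRS{removable}=="1", RUN+="/usr/lib/apt/apt-removable-scan %k"
```

The heavy lifting would of course live in the helper: mounting read-only, finding a dists tree or Release file on the stick, and recording it the way apt-cdrom records a disc today.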
It then expands into crazy ideas. What might also be useful would be, say, automatic discovery of mirrors: a local archive mirror on your network could potentially be advertised using multicast DNS (Avahi) and so on, and then it just looks like another removable-style source.

Yeah, you can currently do something like that for proxies, with a proxy detection script.

I am thinking there is probably a good way that a lot of this code could even be common, but let's talk over the next couple of days.

Other questions? I missed most of your talk, so maybe you addressed this already, but I saw you talked about pinning. Do you plan to support pinning per source package? For example, it would be really useful to pin all binary packages from the gcc-defaults/experimental source package.

Yes. I think maybe not direct support for source package pinning, but if we have patterns, then we will automatically have source package pinning support, because you can specify a pattern that matches on the name of the source package.

Anyone else? Okay, let's thank Julian again.