I want to talk a bit about our system for multi-arch, or actually not so much about the system itself, but about the way we think and talk about it. I'll first recap, for those who are not clear on the details, how multi-arch is currently set up, then some problems that have cropped up with that, and then how I would like you to think about this kind of problem. I'll add a little part about some recent discussions that have been going on here at DebConf, and basically I want to open the discussion about how we approach this in the future and what problems are still to be solved, which is basically the topic of the following BoF. So if we run out of time, just stick around: Wookey will take over in an hour, and I think we will have enough chance to collect all the problems and ideas that are currently floating around. I introduced myself this morning, so just really quickly: I'm a freelancer based in Berlin, I studied informatics, and now I do IT consulting and some teaching of informatics as well.

Let's look at what we currently have in terms of multi-arch. This is basically a stripped-down version of the multi-arch spec that Steve, mostly, pushed forward and that a bunch of others have been discussing since, I think, 2004; it has been a topic at least since DebConf in Edinburgh, that much I know. The spec says we have three possible annotations for a package. Multi-Arch: same says the package is co-installable with itself and must only be used to satisfy dependencies of packages of the same architecture, not dependencies of a different architecture; that's what "same" stands for. Multi-Arch: foreign packages are not actually co-installable with themselves; they know about multi-arch, and they export something that can be used by packages of a different architecture.
Then we have the ominous Multi-Arch: allowed, which allows reverse dependencies to annotate their Depends fields, and we have a bunch of problems we've been phrasing in terms of these three categories. Also from the multi-arch spec, closely related, I found two sentences which I want to quote here. For foreign: "If a package is declared Multi-Arch: foreign, preference should be given to a package for the native architecture, if available; if it is not available, the package manager may automatically install any available package, regardless of architecture, or it may choose to make this an option controlled by user configuration." Which is a nice sentence for a technical spec, but it really, really makes my head hurt. And for allowed we have the definition: "This value exists to prevent packages from incorrectly annotating dependencies as being architecture-neutral without coordination with the maintainer of the depended-on package." So what this actually means is: we wanted to allow packages to declare this information on the dependencies they declare, but we were afraid that doing so might have broken our archive, so we have this flag that says "my reverse dependencies are now allowed to do something fancy".

How did we get there? Well, we were basically just trying to allow multi-arch while doing a seamless transition of the archive. And allowing multi-arch of course means solving a lot of problems for actual co-installability. We had timestamps that gzip inserted into changelog.gz, so these two files would differ between builds, which per our rather strict definition meant dpkg wouldn't allow co-installation. A lot of really great work went into making this possible, and I think it's fair to say that, with this scope, it was solved.
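That head-hurting resolution rule for Multi-Arch: foreign can be restated as a few lines of code. This is a minimal sketch in Python, not actual dpkg or apt behaviour; the function name and data shapes are invented for illustration:

```python
# Sketch of the quoted resolution rule: for a dependency on a
# Multi-Arch: foreign package, prefer the native architecture, otherwise
# fall back to any available one. Names and structure are hypothetical.

def pick_foreign_candidate(available_archs, native_arch):
    """Return the architecture to satisfy a Multi-Arch: foreign dep from."""
    if native_arch in available_archs:
        return native_arch                 # "preference should be given to native"
    if available_archs:
        return sorted(available_archs)[0]  # "may install any available package"
    return None                            # dependency unsatisfiable

print(pick_foreign_candidate({"amd64", "i386"}, "amd64"))  # amd64
```

The point of the sketch is just that the spec's prose describes a simple preference order: native first, then anything, then failure.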
All the people who over the years have worked hard to create a spec that the Debian and Ubuntu archives could actually transition to: with the scope that we started out with, the system works as designed. Really, again, a great thank-you to everybody who was involved; I know it was a long and painful process. Now we have this nice multi-arch system and we want to do fun and fancy and cool stuff with it in new, exciting directions. In the session before this one we talked about building cross-compilers and making that easier by actually using the full archive: if I know there's already a compiled libc in the archive for my target architecture, why should I need to cross-build it just to get my cross toolchain working? We've also talked a bit about partial architectures; MinGW came up earlier as an example that would most certainly be partial. And people who have been deploying multi-arch and converting their packages to it have been finding some interesting problems around embedding of interpreters and so on. I will talk later about some of these problems, or we will talk about them together, hopefully. But the problem is that we define multi-arch in terms of the implementation spec, which is a very formal spec. I heard earlier, "too much mathematics"; in my opinion, maybe not enough mathematics in there. But as we have it, we have these three classes of packages, with Multi-Arch: this and Multi-Arch: that, and it's really hard to reason about. It's hard to build your own mental model of a bunch of packages and have it be really consistent. What I'm proposing is: before we change anything, before we say we need to solve things differently, before we start hacking on dpkg again, let's work on our terminology. Let's talk about it in a different way, and then hopefully our mental model will change with it.
And this is something I don't only want to address to the few people who understand the multi-arch spec and who were in the BoF on Tuesday discussing possible futures. This is something I think every DD will hopefully find helpful: if we start discussing our problems in a more consistent terminology, we should be able to come up with better solutions for each small problem we face. So instead of enumerating all three use cases, what are we actually trying to describe? Two things. One is a per-package flag: the maintainer has reviewed this package and declared that it should be co-installable. This package is co-installable with itself, as far as we know, so dpkg will even attempt to do that. That's the one piece of information we want to know about each package. And then we have a bit more graph-theory stuff, which I will get to in a second, and which unfortunately is a subset of mathematics. But graph theory, as we need it, is pretty simple. We have three packages; these are called nodes in graph theory, each package is a node. And we have a depends-on relationship, so far so easy. The depends-on relationship creates what is called an edge in the graph; to be exact, a directed edge. So we know package A has something to do with package C: it depends on it. And what we actually want to do next is to think in terms of properties of these edges. Namely: this one specific dependency can be satisfied only by packages of the same architecture, packages that provide the same ABI, because it is a dependency that does actual C-style method calls, or uses some C++ calling convention, or whatever. Or: this dependency can be satisfied by packages of any ABI, because the API we're using is actually just "call /usr/bin/whatever with the following parameters", so we expect that API to be consistent across all architectures. That would be an edge property or annotation, in mathematical graph theory sometimes also known as a coloring.
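A toy model of this edge-coloring view, with invented names and data (this is not how dpkg stores dependencies, just the mental model expressed in code):

```python
# Each dependency edge carries a "color" saying whether it must be
# satisfied by a package of the same ABI or by a package of any ABI.
# All names here are illustrative.

edges = [
    # (depending package, depended-on package, color)
    ("gedit", "libperl", "same-abi"),  # C-style calls: ABI must match
    ("gedit", "perl",    "any-abi"),   # runs /usr/bin/perl: any arch is fine
]

def edge_satisfiable(color, depender_arch, candidate_arch):
    """Can a candidate of candidate_arch satisfy this edge for depender_arch?"""
    if color == "any-abi":
        return True                          # architecture is irrelevant
    return depender_arch == candidate_arch   # same-abi: architectures must match
```

A wrapper-script dependency gets the any-abi color; a shared-library dependency gets same-abi. The color belongs to the edge, not to either node.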
And of course this is still a bit of implicit information, even if we say "only the same ABI". We could get really explicit and say: a package of architecture amd64 has this dependency, and this dependency can only be satisfied by another package of amd64. Why the distinction between implicit and explicit matters, I will come back to a bit later. So what does our nice Multi-Arch field actually mean, then? First, as I said: the package is co-installable with itself. And then, if we look at it from this graph-theory point of view, we will actually say: well, this package exports interfaces that are architecture-dependent, or architecture-independent, or both. So a package can say, at the moment with just this one field, that all incoming edges, its reverse dependencies, have to be from the same architecture, or can be from any architecture, or one or the other. So why am I proposing this, standing in front of a bunch of Debian developers talking about graph theory? I really think it helps me, and others I've talked with, to sort your thoughts, or our collective thoughts, about this problem. Because the other problems that have shown up since then are hard, and I've always heard things like "ah, if you have this exact specific problem, then just create a wrapper package with Architecture like that and Multi-Arch like that", and I always have trouble wrapping my head around that, because we're phrasing the problem in terms that are really unintuitive. So I'd like to propose that we all try to think more in terms of what the actual dependencies are that we're trying to describe and to solve.
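Read this way, the Multi-Arch field becomes a constraint on the colors of a package's incoming edges. A hedged sketch of that reading, with invented names (the default for packages without a Multi-Arch field is my assumption here, not quoted spec text):

```python
# The per-package Multi-Arch value, reinterpreted as "which edge colors
# may point at me". Illustrative only, not dpkg's resolution logic.

INCOMING = {
    "same":    {"same-abi"},              # only same-architecture rdeps
    "foreign": {"any-abi"},               # exports an arch-independent interface
    "allowed": {"same-abi", "any-abi"},   # the rdep chooses, e.g. via foo:any
}

def accepts(multi_arch_value, edge_color):
    """Can a package with this Multi-Arch value satisfy an edge of this color?"""
    # Assumption: a package with no Multi-Arch field behaves like same-arch only.
    return edge_color in INCOMING.get(multi_arch_value, {"same-abi"})
```

This makes the asymmetry visible: same and foreign fix the color of every incoming edge, while allowed pushes the choice out to the edges themselves.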
The con, of course, is that after we've described a problem like this, we still need to append to our email a description of how to solve it within the current system, or state that it's not possible within the current system and should be solved in the future with a different system, maybe by some day extending dpkg. Sorry, a few more ideas in this frame of reference. If we change dpkg at all, we could start doing explicit architecture annotations, which would make the solver a bit simpler in many cases. There are some cases where, at the moment, we have for instance an ia32-libs package that's Architecture: any but only ever installed on amd64 as far as I can tell, which then depends on an ia32-libs-i386 package that is only built on i386 but never actually used there, because both of these packages exist just to describe this forced cross-architecture relationship. So if we start changing things around (and I'm not saying we will; I know that changing dpkg is a long and hard process), we might try to go for explicit annotations there. For thinking about possible problems, it might make sense to reinterpret or redefine for yourself an Architecture: all package as a union: a package that is installable on all the different architectures, and once you've installed it for one architecture, you have installed it for all of them. That simplifies the reasoning a bit, but it's not really that important for the discussion we'll have from here. We might need to propagate dependency information along the graph, across multiple edges (I will show an example of that a bit later), and if we stay with a system where we describe packages in terms of the interfaces they export, we might also annotate those interfaces with architecture information.
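The arch:all-as-union reinterpretation can be sketched as a tiny expansion step. This is purely a thought-experiment helper with invented names, not anything dpkg actually does:

```python
# Reinterpret an Architecture: all package as the union of one instance
# per architecture: installing it once means it is "installed" for every
# architecture at the same time. Illustrative only.

ARCHES = ["amd64", "i386"]

def expand(pkg, arch):
    """Expand a package into explicit per-architecture instances."""
    if arch == "all":
        return [f"{pkg}:{a}" for a in ARCHES]  # one node per architecture
    return [f"{pkg}:{arch}"]

print(expand("perl-doc", "all"))   # ['perl-doc:amd64', 'perl-doc:i386']
```

With this reading, an arch:all node never "breaks" an architecture chain, because there is always an instance of it for whatever architecture the chain requires.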
So that is, first, my request to everybody: if it makes sense for you, try to phrase your multi-arch problems a bit in graph terminology. I hope that will make our understanding of the problem space and our discussions more productive. Any questions so far? Then I thought I'd use the rest of my time to give a quick outlook. As I said, there was a BoF on Tuesday discussing some of the harder multi-arch problems. We sat down and allowed ourselves to do some green-field thinking and said: well, we have a bunch of problems; if we were designing multi-arch completely from scratch, how would we go about it, and what would we like to have to really solve this problem once, in a really beautiful way? Some of the solutions we came up with would need a modified dpkg, maybe even dak; some don't. But I thought it might be interesting for everybody who couldn't attend that BoF to get a short overview of where the discussion stands within the Debian project. Two related problems that came up are fakeroot and NSS. As a normal user you might actually have experienced the NSS problem: maybe you install libnss-mdns for local hostname lookup with Avahi, but you only install it for amd64, and if you have a binary that runs as i386, it cannot actually load that NSS module, because you haven't told the system to install it for i386 as well. So we have something on the system saying "it would be nice to have an NSS module", and some other process running as i386 that is not able to make use of it. Fakeroot is similar; it's a problem that will affect developers more.
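The libnss-mdns situation boils down to a set difference: the architectures that run binaries on the system, minus the architectures the plugin library is actually installed for. A trivial sketch, with invented names:

```python
# Toy check for the NSS-module problem described above: a plugin-style
# library must be present for every architecture that binaries on the
# system run as, or those binaries silently lose functionality.

def missing_plugin_arches(binary_arches, plugin_arches):
    """Architectures that run binaries but lack the plugin library."""
    return set(binary_arches) - set(plugin_arches)

# amd64 has the module, but i386 binaries exist and cannot load it:
print(missing_plugin_arches({"amd64", "i386"}, {"amd64"}))  # {'i386'}
```

What makes this hard for the packaging system is that nothing depends on the module directly; the relation runs the other way, from "whatever happens to be running" to the plugin.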
So with fakeroot: libfakeroot is built for many architectures, and as you are probably aware, fakeroot is just a wrapper that sets LD_PRELOAD, so it tells the processes running under it to preload the library that intercepts the actual open() calls and so on. But if you are, say, building for i386, and there's something generated in your package that's run at build time and has been compiled for i386, it cannot actually be LD_PRELOADed, because your libfakeroot might be missing in its i386 version. Another problem that has been discussed since November, I think, on a number of Debian mailing lists, and that rears its head repeatedly, is the embedded-interpreters thing: an application, let's say gedit, embeds an interpreter, or a library version of an interpreter. In this example I talked about libperl; libperl is built for a number of architectures. libperl possibly has dependencies, directly or indirectly, or gedit has a dependency on a certain Perl module so that a gedit plugin can be loaded. This Perl module is possibly an arch:all package, because it only contains Perl code, but it in turn depends on a different Perl module which is arch:any, because it's built from C code. So now we have the situation that the gedit I install must match, somewhere very far down the dependency chain, a Perl module. If I just have any build of that XS-based Perl module installed to satisfy the dependency, I have completely lost the information that there is an architecture dependency between the application far above, gedit, and the actually loaded Perl module. So this is a bit of a transitivity problem: we have the dependencies between these different things, and the information "this dependency is actually wanted for this specific architecture" needs to travel along with the dependency information.

I just wanted to go back to your comments about the general graph-theoretic underpinnings of multi-arch and our handling of the whole thing.
I would like to say that, from my memory of the original design discussions around multi-arch, we were certainly aware that many properties were essentially edge-based. However, the one thing that I think you haven't really taken into account here is that the number of edges in the Debian system is very significantly greater than the number of nodes; it's probably somewhere between n log n and n squared, I suspect, though I haven't actually looked at the data. We considered it a very bad idea to make people annotate all of the edges, simply because there's so much more metadata involved, and I would defend the decision to put most of the information on nodes where possible, because that has better defaulting semantics and a better chance of making things work by default. Now, you're right that some additional reasoning needs to be done in that case, but I think that is worth it. We certainly need some capability for edge-based annotations, so we have things like the :any qualifier in Depends on Multi-Arch: allowed packages, which is one restricted case of that, and I can see arguments for extending it. I haven't quite puzzled out whether you're saying that we should convert all of the annotations we currently have to be explicitly edge-based; it wasn't very clear from what you were saying earlier.

No, I am saying that you should convert them in your head: when you're reasoning about the system, just relying on "I depend on something, this declares some Multi-Arch field, and then dpkg will invoke some reasonable defaults" does not scale to all the edge cases we have in our dependency trees. As for the question of whether all these dependencies should be explicitly mentioned in the Packages file:
One of the points against doing that at the time was that it would blow up the Packages file too much, and I'm still not convinced that gzip wouldn't eat most of that overhead.

It's not just physical size; there is a cognitive cost for people who have to read all of those repetitive annotations, and the more you can do by default, the more people can read the content that's actually different rather than the content that's just repeated lots and lots of times. I don't know about you, but I actually do read package metadata quite a lot. It is for human consumption.

I'm also of the opinion that if we were to say every dependency must be declared explicitly (and I'm not saying we have to), the toolchain should generate the expanded line in the Packages file, so that you don't have to write it in the control file all the time.

I'm talking about reading the output as well as reading the input, which I do quite a lot.

I'm sorry for you.

But the people who do this quite a lot are exactly the people that everybody complains about being overloaded, so you'd be proposing to overload them further by making it harder to see what's going on.

No, I'm not proposing to overload them further. I'm saying that the normal standard DD at the moment has trouble describing the corner cases that she or he encounters using just the Multi-Arch field. There was another question over there.

Yes. Please observe that the problems we are currently facing are inherently not edge-based, so by moving the specification of what is architecture-dependent, or where the architecture barriers are, to the edges, we gain exactly nothing in this respect. For fakeroot and the NSS modules we have exactly the case that the architecture matching does not occur on dependencies but on a different kind of relation. And the example you're giving over here is the very same case: it's not a single dependency that it travels along, it's a dependency path.
Yeah, I'm not saying that just putting "depends on X:amd64" into one certain dependency solves anything; that's not what I'm saying. I'm saying that what we are trying to do is solve these problems, and I'm just giving some examples. You were at the BoF, you know the ideas we've had about solving these problems, and to solve them we first must be able to state them well. And that's what's missing at the moment.

I think one excellent thing that somebody with some interface-design skills could do would be to write some kind of visualizer where you could plug in a set of, you know, package control data, without having to mess about configuring apt and getting your exact setup right, which can be quite challenging, and ask: what's the result of this? Which packages are installable? What happens if I select this package here? Something that could assist people in reasoning about this would, I think, be very valuable.

That's an excellent idea. I would like to include the others who were not at the BoF in what we've been discussing, and then I think we should open it up to everybody. This was an error case, one could say, something that we're not able to solve with the current system at the moment. Just to get back to my proposed terminology: the question of whether a dependency must be architecture-matched is something that cannot just be written onto one edge; it's a problem that needs some sort of transitive view to solve. With the current dpkg and so on, we cannot describe that clearly. We have a nice bunch of hacks around it, but it would be nice to find some way to describe it, at least at a high level, for us as DDs. It doesn't necessarily have to be at the dpkg level, but it might also make sense to do it there.
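To make the transitivity point concrete, here is a toy sketch with invented package names, modelled on the gedit and Perl example from earlier: the architecture requirement at the top of the chain has to reach a package several edges down, across an arch:all node that carries no architecture information of its own.

```python
# Toy illustration of the transitivity problem: an application embedding
# libperl needs an XS module far down the chain to match its architecture,
# but the intermediate arch:all module says nothing about architecture.

deps = {
    "app":           ["pure-perl-mod"],  # app embeds libperl: wants matching arch
    "pure-perl-mod": ["xs-mod"],         # arch:all, pure Perl
    "xs-mod":        [],                 # arch:any, built from C (XS)
}

def transitively_constrained(root):
    """Packages whose architecture must match `root`'s, found by walking
    the whole chain instead of stopping at the first arch:all node."""
    out, stack = set(), list(deps.get(root, []))
    while stack:
        pkg = stack.pop()
        if pkg not in out:
            out.add(pkg)
            stack.extend(deps.get(pkg, []))
    return out

# xs-mod is reachable, so its architecture must match app's, even though
# nothing on the direct edge app -> pure-perl-mod says so.
```

Per-edge annotations alone cannot express this; the constraint only emerges from the path as a whole, which is exactly why some transitive computation is needed.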
So I'll quickly go through the solutions that were proposed, and if you're unclear on one of them, ask and I will direct your questions to the team that was at the BoF. The first, really simple solution: take all the arch:all packages that are affected, for example a Perl module that depends on native Perl modules, and mark them as Architecture: any and Multi-Arch: same. That solves the problem, even the gedit one. It means a huge increase in archive size, because I think our dependency trees often end up at native C code, but it's a very simple solution, if one of the things I would describe as a hack around the current system. Then it was discussed to have a field along the lines of "install for the same architecture as libperl". So if our gedit depends on libperl, we know we have a libperl matching gedit's architecture; and libperl may be installed for three architectures, say, so any module that could be loaded by libperl and contains native C code also has to be installed for exactly those three architectures. Then, from April I think, we have a proposal by Helmut about calculating a set of "running architectures" for every package: even arch:all packages go through the dependency tree, and for each package you look at which of the possible architectures all its dependencies are cleanly satisfiable for, and you even store that persistently, so that if you have a cyclic graph or something you can still do the calculation. That is a rather big change, but I think this, or something like it, is probably what we'll have to do in the long run, from my feeling. Are these three proposed solutions clear, or are there questions?

Yeah, I think we might have a slight refinement to the details of the last one, which we'll probably bring up in the next session, I guess.
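A toy version of that running-architectures idea, iterating to a fixpoint so that cycles are handled too. All package names and data here are invented for illustration, and this is my reading of the proposal, not Helmut's actual algorithm:

```python
# "Running architectures": shrink each package's set of architectures it
# can run under to those for which all its dependencies are satisfiable,
# iterating until nothing changes (a fixpoint, so cycles are safe).

ARCHES = {"amd64", "i386"}

deps = {
    "app":     ["mod-all"],
    "mod-all": ["mod-xs"],      # arch:all pure-Perl module
    "mod-xs":  [],              # arch:any, only built on amd64
}
built_for = {
    "app":     {"amd64", "i386"},
    "mod-all": set(ARCHES),     # arch:all: nominally runnable anywhere
    "mod-xs":  {"amd64"},
}

def running_arches():
    run = {p: set(a) for p, a in built_for.items()}
    changed = True
    while changed:
        changed = False
        for pkg, ds in deps.items():
            for d in ds:
                new = run[pkg] & run[d]   # intersect with each dependency
                if new != run[pkg]:
                    run[pkg] = new
                    changed = True
    return run

# app ends up runnable only on amd64, because mod-xs only exists there,
# and that fact has travelled through the arch:all intermediate.
```

Note that this strict intersection is exactly what breaks down in the Perl-and-Python example later: without partitioning the tree, two disjoint interpreter stacks would intersect down to the empty set.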
The one thing I'll say is that the more I look at the XS case, the less I'm convinced it's actually a problem, because the XS modules depend on perl; so I think it may have been an analysis mistake to think that was actually a problem. But there are other kinds of situations where the same thing arises. If we just limit the scope to the XS and libfakeroot kinds of situations, and that's all we want to do, then this middle solution might actually be enough.

But the exact reason why it depends on perl is why I cannot cross-compile my other modules which depend on that module: because then the wrong perl gets installed.

That's with the current perl, yes.

I think we should be a bit more precise about problems and solutions, because there are currently two problems, and the presented solutions sometimes only solve one of them. The one problem is the transitivity issue, where architecture specifications need to travel across arch:all package dependencies; the other issue is that we need libraries like the libnss modules and the fakeroot libraries installed without even knowing what is going to use them later on. These are distinct problems, and the very first solution only solves the transitivity issue; it doesn't solve the missing libfakeroot and NSS modules. So we need to evaluate each of the options against both of the problems. Also, the third solution, with running architectures, only solves the transitivity issue; it doesn't touch the fakeroot issue at all.

Yeah, sorry, this should have been clear: this slide only relates to the previous slide, with the Perl problem. And you're right that the first solution solves the transitivity problem at, in my opinion, an unacceptable cost in complexity, while the third solution solves it at the cost of getting a patch into dpkg, which might also be a pretty high cost, I don't know.
We have five minutes, so I will leave you with one more problem we came up with: a package has dependencies on both interpreters, Perl and Python, one of them as libperl, the other as a Python interpreter, and it doesn't care which one it uses for what. As I said before, Perl and Python actually both export two kinds of interfaces: for Perl we're using a native C-style interface, for Python we're using a command-line interface, or something like that. When we get that deeply into the problem space, and we want to solve this mathematically, really clearly and nicely, we actually need to partition the dependency trees, so that, for example with the running-arches solution, we don't end up with an empty set of possible running arches for the actual application. So we would need a way, using grouping or something else, to say: the running arches for Perl and the running arches for Python may be two disjoint sets. That is something to discuss in more detail, possibly in the next part; I don't know what Wookey wants to concentrate on. For me it's important to get everybody else who's interested into this discussion. Do you have questions so far? Do you have an error case or use case that we have not discussed yet?

I see you've looked at the graph from the dpkg-and-apt point of view, but running programs don't know about packages; they only know what's on the file system, and if a program tries to enumerate possible plugins, for example, then it will see the modules that it cannot use.

Yes.

Well, and I don't see any solutions here other than a variant of the solution one you mentioned, with every such module put into a multi-arch directory. That would be the solution I would recommend in all such cases. And if archive size is a problem, then they could even be deduplicated in some way, because the packages are identical for all purposes other than the directories.

No, it will not be the same, because you only care about, for example, compiled extensions, compiled modules and compiled plugins, and those will by definition already be ABI-specific.

No, I mean all the modules that might depend on those. For example, in Python you can have the same location for both arch:all plugins and arch:any plugins or extensions, and it enumerates both.

Yeah, but then you have the transitive problem again, where an arch:all plugin depends on further plugins which end up being architecture-specific.

What I'm thinking about is putting all amd64 modules, both arch:all and amd64, into one directory, and all i386 ones into another, possibly with some symlinks to deduplicate them; but this way a running program would know what it can run.

I think there are multiple ways to get around that. For example, in Python we already name these extensions differently: for an amd64 extension we embed the multi-arch name into the extension's file name, so we don't have this problem there. And I could imagine that for other plugin or module systems you could do the very same, or for Perl you could just add all the architecture-dependent library paths to the interpreter's default search path and discard some at runtime.

Well, whether it is in the file name or in the directory name, it's the runtime package which should handle runtime exceptions.

I understand the problem, but you will always have runtime exceptions. I can write a file which is pure shell and which will still not manage to run on some architectures, because of the embedded things I do with it, and that's something you will have to handle. If you write a Python module that imports an arch:all module which should run on any native system, it may or may not run if you try to embed it in another application, and you're back to the same transitive problem; this is not a different problem from the transitive dependency problem.

Okay, the video team has signalled that our time is over, so we at least need to cut the stream. I invite you to keep discussing this; we can get some fresh air for the next ten minutes, and then Wookey's BoF will cover a very similar, in fact identical, topic, so we can pick it up again there. So thank you. You're now given a short chance to escape, but there'll be more.