Good morning. This is Zack. He is going to give a talk, and if you stay afterwards, we have more wonderful prizes for coming to a talk this morning. Thank you, Molly. So this talk is nothing related to that DPL thingy; it's actually related to my job as a researcher. The main topic of this talk is how to reuse dependency solving across different package managers, both within a single distribution like Debian and among different distributions. The work presented in this talk is shared by many people of the Mancoosi project. One is Ralf, who is taking a picture of me right now, and there are other people who work with us on this project and generally care quite a lot about Debian. This is a good moment to thank the Mancoosi project for also supporting my general work in Debian as DPL. In this talk I basically go through three parts. The first one is what we are lacking in dependency resolution in today's package managers. Then I will present a format called CUDF, which aims at being a common format to describe dependency resolution scenarios in package-based distributions, and I will talk about the implementation of this format. And finally, even if it is not in the outline yet, I will talk about a competition we have run recently to find, kind of, the best techniques for dependency solving. So just take a step back and think for a moment about the notion of a distribution, and about why we do what we do in a distribution like Debian. Essentially, a distribution is a man in the middle between upstream software authors and final users, and the first goal of a distribution is to ease software management: to make it easy for a system administrator to have a coherent set of software installed on a given machine. And the key idea that makes software management easy is the idea of a package.
So we have this granularity of the package, which denotes at which level you can install, uninstall, upgrade, maybe even downgrade a specific piece of software. And the killer application which we have built for the past 17 or so years on top of the notion of package is the package manager. Package managers are the killer application that we use to sell, even if we don't actually sell it, what we do to our final users. So what actually is a package manager? The idea is to make software upgrades both easy and flexible. And with upgrade here I mean something very generic: I will use the term upgrade for installing, removing, upgrading, downgrading, any kind of change we do to a package on our machine. So the first task of a package manager as we know it today is to abstract over package retrieval. The package manager avoids that the user has to manually download software, or to worry about security, about verifying that a package really comes from its author; all this kind of stuff is taken care of for us by the package manager. The second feature of a package manager is the actual low-level deployment of packages on disk. Once it has retrieved the software embedded in a package, the package manager takes care of installing it on disk, and takes care of all the fancy details which can be very, very complex to get right: triggers, transactions, the management of conffiles to avoid that user changes are lost upon upgrades, diversions, alternatives, and all these fancy features that we have in dpkg and that exist also in other distributions. The third point of a package manager is dependency solving.
So the idea is that when the user wants to install a specific package, they just say the name of that package, but to install that package you might need a lot of other software, stated in the metadata of the package, and the goal of the package manager is to install all missing dependencies so that you can have a working system; to identify conflicts, so that if you want to install two packages that cannot be installed together it will warn you; and, more importantly, to compute upgrade paths. A very important task of a package manager is to help you migrate from one release of a distribution to another, and to do that there are usually a lot of different ways. You have different choices, and the role of the package manager, with its dependency solving engine, is to compute the best upgrade path between the status you are in now and the status you want to reach. So my question here is: is dependency solving in today's package managers as good as we want? And the big claim here is that it is not, for various reasons. The first reason is what we call incompleteness. Incompleteness means: you request a package manager to install a specific package; there is a way to install that package, fulfilling its dependencies and avoiding all conflicts; but your package manager is not able to find that solution and propose it to you. Here is a very simple example which works, in the sense that apt-get is not able to fulfill your request in this case. You have two packages, a and b, two versions of each of them, version 1 and version 2, and a kind of cross dependency: version 2 of package a depends on version 1 of package b with a strict (=) dependency, and vice versa.
If you ask apt-get to install both of them by default, without any pinning, it will fail, because it will try to install the most recent version of both packages, and that's impossible, while it is clear that there are at least two solutions fulfilling the user request: installing version 2 of package a together with version 1 of package b, and vice versa. This is a very simple example, there are others, and it just shows what incompleteness is. What we really want is completeness: a tool such that, each time a solution exists to what we ask it, it is able to propose us that solution. The second reason why we claim that dependency solving is not yet good enough is that it's not very flexible; it has poor expressivity. It does not allow the system administrator to specify very specific and interesting policies, like: you know what, among all the possible ways you have to satisfy my request, I want you to choose one which minimizes the installed size on disk. Another example: I want to install KDE or GNOME or something big, and I want you to propose the solution which minimizes the download size of all the packages I need to retrieve. This is for instance interesting if you have a very slow connection: if you have an alternative between two packages, and choosing one package means retrieving a very, very big set of packages as dependencies, while the other alternative means retrieving just one very simple package, then, requesting this kind of policy, you want the second choice to be taken. There are tons of other examples.
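To make the incompleteness example concrete, here is a small brute-force sketch of the a/b scenario. This is just an exhaustive check over a four-package universe, not how any real package manager works, and it assumes (beyond what the talk states) that version 1 of each package has no dependencies:

```python
from itertools import combinations

# Hypothetical encoding of the a/b example: the newest version of each
# package strictly depends on the OLDER version of the other; v1 of each
# is assumed dependency-free.
universe = {
    ("a", 1): [],
    ("a", 2): [("b", 1)],
    ("b", 1): [],
    ("b", 2): [("a", 1)],
}

def valid(install):
    """One version per name (implicit conflict) and all deps satisfied."""
    names = [n for n, _ in install]
    if len(names) != len(set(names)):
        return False
    chosen = set(install)
    return all(dep in chosen for pkg in install for dep in universe[pkg])

# every way to co-install some version of a with some version of b
solutions = [set(c) for c in combinations(universe, 2)
             if valid(c) and {n for n, _ in c} == {"a", "b"}]
```

Installing the newest version of both packages, which is what apt-get attempts by default, is not among the valid solutions, while the two mixed-version installations the talk mentions are.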
For instance, you might want to selectively not trust specific maintainers, so you might want your package manager to completely avoid installing packages maintained by a person you do not trust. Whatever kind of preference you have on how to satisfy your request is a policy that we might want to ask the package manager to implement, and we are not there yet. Finally, there is a kind of engineering problem. Implementing dependency solving seems trivial at first, but in fact it is not: it is an NP-complete problem, so the complexity is potentially explosive, and we have experience with naive implementations of dependency solving: they are either incomplete, as we saw, or they can loop, and all this kind of problem. So the engineering point is that we really would like to factor out the dependency solving engine and reuse it across different package managers, because reusing is what we do best in free software. This is the idea: to have a way to not reimplement dependency solving again and again in all the package managers out there, but to reuse it. And where can we reuse it? Well, first of all, within specific distributions. For instance, in Debian we have different tools which do dependency solving, sbuild, pbuilder, apt-get, aptitude, and the way they do dependency solving changes. So you have some kind of unpredictable choice, and that might mean that while a package builds on your machine, maybe it doesn't build on the buildd network. So an advantage of sharing dependency solving within the distribution would be this kind of uniformity. But then let's be bold and see whether we can share dependency solving across different distributions, because the problem that we have is the same problem that Red Hat has, or Fedora, or Ubuntu, or openSUSE. The basic problem is the same.
And finally, it is interesting to reuse a solver between us geeks and the community of scientists, because in science there are quite some people who do this kind of dependency solving and constraint solving. On the one hand they are interested in having our data, because our data are real data, not built just to test some specific constraint solver; and we can benefit from their expertise in dependency solving. Let's make clear that the end goal is not having a standard package format. That is a goal I don't think is worthwhile, because the package format is very specific to a distribution and is used to implement the policy of the distribution. The end goal is, rather than finding the best solver, to identify what we need and find the best implementation of it, possibly with the flexibility which enables us to specify policies that differ from one distribution to another. And then, if we find something that fulfills this need, at that point we can either deploy it in every single package manager, or deploy it as a separate library that we will use across different package managers to do dependency solving. Our proposal to do that is a format called CUDF, which stands for Common Upgradeability Description Format. It is a format in which you describe upgrade scenarios: the problem that apt-get has to solve when you ask it to install a package that cannot be trivially installed. The structure of this format is based on three parts. The first part is what we call the package universe, and it is essentially the set of all packages which are known to the package manager; in terms of APT, it is all the Packages files that lie in /var/lib/apt/lists. Then we have the package status, which is the set of packages which are currently installed.
It is usually a subset, actually it is always a subset, of the package universe, because the package manager is aware of all installed packages, and it contains only the currently installed ones. And finally, the third part this format can describe is the user request: what did you ask the package manager, what is the goal of the user? So what is difficult in designing such a format? As always when you think about designing a common format, the problem is abstracting over the specific features of each package manager. So the first challenge we have is version numbers: the semantics of versions changes significantly from distribution to distribution, so this is the first challenge we need to face, how to have a common representation of versions. Then we have package relationships, which are the same in all Debian-based distributions but different in, for example, RPM-based distributions. Then you have lexical conventions, like the fact that valid package names differ between Debian and other distributions. Then you have the semantics of virtual packages, which is peculiar to each distribution. And finally you have a sharp difference between the world of Debian-based distributions, in which you can install only a single version of a package at a time, and the world of RPM-based distributions, in which you can install different versions of the same package at the same time. So the format is a plain text file format, inspired by RFC 822, and it is a list of stanzas as we have in Packages files. Each stanza is a set of typed key-value pairs. Here is a very simple example, a kind of hello world: we see just two packages, with version and dependency lines, and we have comments; and this is another example with the package openssl.
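The slide itself is not in the transcript; as a sketch, borrowing the car/engine running example that the CUDF documentation uses, such a document might look like this (stanza layout per the CUDF key-value syntax; the specific packages and versions are illustrative):

```
# a toy CUDF universe with a request at the end
package: car
version: 1
depends: engine, wheel > 2

package: engine
version: 1
installed: true

package: wheel
version: 3
installed: true

request: 
install: car
```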
I said that the key-value pairs are typed, meaning that each property can only accept values of a specific type, and here I have a brief overview of the types we have. We have integers, positive integers, booleans, package names, which are quite liberal because in the world of RPM-based distributions you can have a package name like /bin/bash. That is a package name. It's not acceptable in Debian, but if you want to support it in such a format, either you allow those characters to be part of a package name or you need some form of escaping. Then we have package formulas, which are what dependency lines are for us in Debian: boolean formulas in which you can express disjunctions and conjunctions of packages, and where each package can be associated with a specific version constraint. And then we have package lists, which are just a degenerate case of those where you have only commas, which are conjunctions, and no disjunctions. A CUDF document is then made of several stanzas. First comes the preamble stanza, which is optional. Then you have the whole universe of packages, one stanza per package, and at the end you have the request stanza, which encodes the user request. So this is the skeleton of a CUDF document: a preamble, a lot of packages, and a request at the end. Let's see what package stanzas are. A package stanza describes a single package, and there are several properties you can use in it; some of them are mandatory, some are optional. The package property is the package name, as in an APT Packages file, and it is mandatory. It is also mandatory to have a version; versions are integers, and I will explain why in a bit.
And then you have a very important property, installed, which distinguishes between a package which is in the universe but not installed and a package which is installed on the user machine at the moment the request is stated. And then we have the usual suspects: depends, conflicts and provides. Let's see some highlights of packages in CUDF. The first important difference with what we do in Debian is that versions are not strings but integers. This is the best way we found to abstract over the differences of versions across distributions. The idea is that every single distribution has its own semantics for versions, but in every single distribution you have a total order: you can take the set of versions of the same package and order them totally, and once you have done that you can map that order to integers. Provides is used to encode what in Debian we call virtual packages and what in RPM-based distributions are called features. What is different with respect to Debian is that provides are versioned. You can write something like provides: http > 2, and you are not forced to give a single version; you can have an inequality like in this example. That package, which provides http greater than 2, will be able to satisfy all dependencies on http from version 3 upwards. And if you don't specify a version in a provides, it means that the package provides all possible versions of http, so it will be able to fulfill all dependency requests on http. The last important highlight is that conflicts are not implicit. What does it mean for a conflict to be implicit? It means that in distributions based on dpkg, if you have two versions of the package bash, version 1 and version 2, you have an implicit conflict between the two packages: you cannot install them together. You don't need to specify a conflict on bash, because it's implicit.
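The version-to-integer mapping can be sketched in a few lines. This is toy code: a dotted-numeric comparison stands in for the distribution's real version-ordering function (e.g. Debian's), and the function names are invented for illustration:

```python
# Within a closed universe, map each package's version strings to CUDF
# integers by totally ordering them with the distribution's own comparison.
def version_key(v):
    # stand-in for the real comparison: compare numeric components
    return tuple(int(x) for x in v.split("."))

def to_cudf_versions(versions):
    ordered = sorted(set(versions), key=version_key)
    return {v: i + 1 for i, v in enumerate(ordered)}  # CUDF versions start at 1

print(to_cudf_versions(["1.10", "1.2", "1.2", "2.0"]))
```

Note that a plain string sort would put "1.10" before "1.2"; ordering with the distribution's own comparison before numbering is exactly what preserves the semantics.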
This is not the case in RPM-based distributions, but in fact this is not a big difference. The point is that in Debian, when we need to install different versions of the same software at the same time, we just encode the version in the package name. Think of the Linux kernel: it is a single piece of software, but we have different packages which have the kernel version in the package name. So if you want to achieve what we have in Debian, you do something like this: you have package bash, version 5, and you declare conflicts: bash. As it happens in Debian, self-conflicts are ignored, so this essentially means that you cannot install this version of bash together with any other version of bash. And this also works for virtual packages, which is something we know pretty well in Debian: how do we do mutual exclusion between sets of packages providing the same feature, like postfix and exim? We provide a specific name, mail-transport-agent, and we also declare a conflict with that name, and it is exactly the same in CUDF. What is interesting about CUDF is that the set of properties you can associate with packages is not closed but open-ended. You can attach whatever extra property to each package: download size, installed size, maintainer string; you can declare some priority, you can declare the suite the package comes from. It is very much free form, as long as you declare the property in the preamble of the document. So here we are stating that the packages in the universe have a suite extra property, which is an enumeration among stable, testing and unstable and defaults to stable; a bugs property, which is an integer and defaults to zero; and a property called pin-priority to encode pinning, which is an integer and is mandatory: you have to specify it on every single package.
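In CUDF stanzas, the two conflict idioms just described might be written as follows (a sketch; the package names are the talk's examples, and the preamble line follows my reading of the specification's property-declaration syntax):

```
preamble: 
property: suite: enum(stable,testing,unstable) = [stable], bugs: int = [0], pin-priority: int

package: bash
version: 5
conflicts: bash

package: postfix
version: 2
provides: mail-transport-agent
conflicts: mail-transport-agent

package: exim
version: 4
provides: mail-transport-agent
conflicts: mail-transport-agent
```

The self-conflict on bash excludes every other version of bash (self-conflicts being ignored), and the shared conflict on the provided name mail-transport-agent makes postfix and exim mutually exclusive.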
A request stanza contains a request header and then one or more of install, remove and upgrade, and each of them is a package formula, like bash > 3. Having a request to install bash > 3 means that the user wants to have installed whatever version of bash is greater than 3. And you can have some weird stuff, like less than 2 or that kind of request, which is usually not expressible with APT. And when you request an upgrade, you are also enforcing the fact that you want only one version of the upgraded package installed in the end. So this is a complete example of a CUDF document: there are a lot of packages, there is no preamble because it is not needed, and there is a request at the end. All of this has been specified in a technical document which you can find at www.mancoosi.org/cudf. It is a specification document with some features, like having a formal semantics, which is very useful to double-check implementations of the format, so that you can verify whether they comply with the semantics of the specification or not. There is a separate format, DUDF, to which I will come later, which is used to capture upgrade problems from users in a way similar to popcon. And there is a primer document, very simple to read, to get started with the format, which is available at this URL. What the format does not support yet, even though it is forthcoming, is multi-arch. The problem with multi-arch is that it is no longer true that a package name plus a package version is enough to identify a package in the universe, because you can have different packages which have the same name and the same version but different architectures. The fact that (package name, package version) was a key in the format was a basic assumption, so we need to change the specification a bit to support that. And another thing we don't support yet is locally rebuilt packages.
So if you look at the code of APT, there is a heuristic which makes it possible to distinguish a locally rebuilt version of a package from an official version of the package which has the same name and the same version; this is something we still need to add to the format. The format is implemented in a library called libcudf, which implements parsing of the format, pretty printing, consistency checking, meaning that you can check whether an installation has dependency problems or not, and solution checking: verifying whether a solution proposed by a package manager actually fulfills the original user request. The code is OCaml, and I think this is no surprise for anyone who knows me, but there are also C bindings which completely hide the OCaml layer, so you can use just the C library without even knowing that there is an underlying OCaml library. It is LGPL, and you can find it at www.mancoosi.org/cudf. There are also Debian and RPM packages available, still in a separate repository; I will come back to that later. Okay, this is just an example of how the command line tool works: you provide a universe and you can check whether it is consistent or not. And this is an example where it finds something which is not consistent: cannot satisfy dependency turbo of package gasoline-engine, which means that you have installed gasoline-engine but not one of its dependencies. Well, you know, the usual kind of stuff that APT can tell you. So where is this deployed? We are quite happy about the uptake of this format. There is initial CUDF support in a package manager called cupt, which is a kind of alternative package manager to APT we have in the archive, even though it does not follow the last version of the specification yet, but there is interest from the package maintainer to go that way.
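The consistency check just described can be sketched in a few lines. This is a toy model, not libcudf's actual API: packages are plain dicts and dependencies are bare package names:

```python
# Toy consistency check: every installed package must have all its
# dependencies installed, and no two distinct installed packages may
# conflict (self-conflicts are ignored, as in CUDF).
def consistent(universe):
    installed = {p["package"] for p in universe if p.get("installed")}
    for p in universe:
        if not p.get("installed"):
            continue
        for dep in p.get("depends", []):
            if dep not in installed:
                return False, "cannot satisfy dependency %s of package %s" % (
                    dep, p["package"])
        for c in p.get("conflicts", []):
            if c in installed and c != p["package"]:
                return False, "%s conflicts with installed %s" % (p["package"], c)
    return True, "consistent"

# the talk's example: gasoline-engine installed, its dependency turbo not
universe = [
    {"package": "gasoline-engine", "installed": True, "depends": ["turbo"]},
    {"package": "turbo", "installed": False},
]
print(consistent(universe))
```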
We are discussing with the APT and APT2 maintainers how to have CUDF support in the official package managers, and the idea is not to have different APT package lists; the idea is to enable package managers to spit out a CUDF representation of the dependency problem they are trying to solve. And it is also supported in distributions other than Debian: for instance urpmi, the package manager used in Mandriva, is able to spit out reports in this format, and support is forthcoming in RPM5. So what did we do with all this? As part of the Mancoosi project we have run a solver competition, where we have a big set of CUDF documents, meaning upgrade scenarios encoded in the CUDF format. We offered them to the scientific community: there is a community which is interested in seeing how good and how fast they are at solving constraint problems, most notably the SAT community, people doing SAT solving, and we tried to check whether out of this community we can raise interest in using their technology to solve dependency problems, in particular the shortcomings I discussed at the beginning of the talk. So we held a competition at a workshop called LoCoCo, which stands for Logics for Component Configuration; it was June, I guess, right, it was last July, sorry. We had two different tracks, so you could compete for different goals, and we had 11 solvers participating across the two tracks. The competition is over, and I will briefly go through the results, but all the information is available at www.mancoosi.org/misc-2010.
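The tracks' objectives, described next, are lexicographic optimizations over the outcome of an upgrade. As a toy sketch only (the exact criteria were fixed by the competition rules), a "fewest removals, then fewest changes" comparison could look like this:

```python
def paranoid_score(before, after):
    """Toy 'paranoid'-style objective: (removed packages, changed packages),
    compared lexicographically; smaller is better.
    before/after are sets of (name, version) pairs."""
    removed = {n for n, _ in before} - {n for n, _ in after}
    changed = before ^ after  # any install/removal/version change counts
    return (len(removed), len(changed))

before = {("bash", 4), ("gcc", 5)}
upgrade_bash = {("bash", 5), ("gcc", 5)}   # change bash only
remove_bash = {("gcc", 5)}                 # remove bash instead
best = min([upgrade_bash, remove_bash],
           key=lambda s: paranoid_score(before, s))
```

Here the upgrade wins because it removes nothing, even though it touches more package states than the removal does; tuple comparison gives removals absolute priority over changes.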
So we had two tracks. One track we called the paranoid track, for the really paranoid sysadmin, and in that track the goal was to satisfy the user request while minimizing the number of removed packages, so that you do not remove software which has nothing to do with the user request, and minimizing the number of changes in the installation status. Essentially the intuition behind this track was: okay, I need to get something done, but please touch the least you can, so that you don't screw up other stuff which is installed on my machine. And this, we believe, is the typical policy that a system administrator of a machine at risk would like to have implemented. A completely different track was what we called the trendy track. The trendy track is for the desktop user, or anyhow for a user who would like to have the most recent software. In fulfilling the request of the user, this track required to minimize the number of removed packages, because we think this is a general requirement; to minimize the number of packages which are not at the most recent version, which is written this way to get the formal encoding right but essentially means upgrade as much software as possible to the most recent version; furthermore, to minimize the number of unsatisfied recommends of installed packages, meaning that each package you install has a set of recommends, so please install those recommends as much as you can; and finally, to minimize the number of newly installed packages. This might seem a bit counterintuitive, but when you ask APT to install bash, you don't want it to install 15,000 packages just because they are in the archive. Okay, in the various tracks we used different sets of CUDF-encoded upgrade scenarios. We had some problems coming from Debian via dudf-save, which is a tool you
can use on a Debian machine to submit your upgrade scenarios to us; I will show where you can find it in a moment. And then we created some artificial problems, trying to make the life of package managers miserable, so we have easy, difficult and impossible problems. Essentially we took different mixes of Debian suites. In easy, for instance, we took Debian unstable, a basic desktop scenario, meaning you just install Debian unstable, you choose the desktop task and see what you get, and on top of that we artificially created various requests, in which we request to install 10 packages at a time, or remove 10 packages at a time, that kind of stuff. For the more difficult part, the slide says stable and unstable, but it was testing and unstable, oh sorry, stable and testing, well, whatever: in this case increasing the number of suites, and doing installs or removals of lots of packages. And in the impossible track we took all of stable, testing and unstable, doing upgrades from one release to another and increasing the number of available packages, to see how the dependency solvers react. As I told you, we had different solvers, coming essentially from six teams. The first one is a solver based on a technique called answer set programming. Then we had a solver from a Portuguese university, based on a SAT-based solver. We had apt-pbo, which is a nice hack done by the distribution called Caixa Mágica, in which they took the APT package manager and equipped it with MiniSat+, an open-source pseudo-boolean (PBO) solver. We had the university of Louvain providing its own solver. We had p2, which is the package manager integrated in the Eclipse platform; Eclipse has its own SAT solver used to solve dependencies, and it was participating in the competition. And then we had a team from the university of Sophia Antipolis participating
with proprietary software. The results are that, unfortunately, the winner in both tracks was the proprietary solver, but the second one was the solver of p2, based on the SAT solver called Sat4j. Okay, this is it for what we have done; this is how you can help. If you go to mancoosi.debian.net you will find various things, and in particular you will find the... oh, I'm offline, oh well, I have some network problem, but anyhow, here you will find an unofficial APT repository, and in particular, inside it, you will find a package called mancoosi-contest, like popularity-contest, but mancoosi-contest. In it there is a utility called dudf-save, which is just a wrapper around apt-get or aptitude on the command line. So if you hit a dependency problem for which the solver is not able to find a solution, you can just rerun the command prepending dudf-save in front of it, and that will produce a DUDF document which you can then upload to our server. We guarantee anonymity and all that kind of stuff, but it is a way to contribute to a corpus of upgrade problems on which we can then try to improve dependency solving over the years, offering these problems to the scientific community which is interested in working on them. So that's it, I'm ready for questions. If you don't have any question on the content, please consider also suggesting your favorite upgrade policy: we are really looking for fancy upgrade requests that a system administrator might have and which we can offer as a challenge to people doing solving. Thanks. There was a question over there, I think. Okay, Bastian. Okay, now it's on, okay, it works. Okay, I have actually two questions. The first one is: how do you deal with different package names for the same package? Like, you said bash, and in Fedora it's called, say, /bin/bash. Say the example again? You have the package bash, and in Fedora it's called /bin/bash; how do you deal with that? So, in fact, with this kind of format we are not saying that you should use
packages from different distributions on the same machine. It is just a way to capture in a single format upgrade problems coming from Fedora and coming from Debian, and have them encoded with the same semantics. So it is not something that users can or should use to mix packages from different distributions on a single machine; that is not the goal for us. The goal for us is improving dependency solving technologies, so that both Fedora and Debian users can benefit from them. Second question: when you map versions to integers, I think, you know, that's a pigeonhole problem in mathematics, which tells you that you can't do it; but you could do it with rational numbers, because then you can always squeeze one in between, and a rational is just two numbers. Well, the universe of packages we consider is closed, so a single CUDF document only deals with a fixed set of packages, which is actually already the case in package managers: when you ask APT to solve a dependency issue, it only considers the packages it knows at that moment. And given that the universe is closed, you can map the versions of the same package directly to integers without losing any information. Yeah, but every time you really do the encoding from scratch. Next question: well, one policy that I typically need as a sysadmin is that I have a domain of packages that I want to be as up to date as possible, but the rest of the system must be stable. So imagine I'm a Python programmer: I want to have all the Python-related stuff as up to date as possible, on an otherwise stable system. So basically you are saying that the two kinds of policies we had, trendy and paranoid, you want them applied to different clusters of packages in your system, okay. Don? And how are you dealing with the different possible paths for dependency resolution? I mean, I know that it's always possible to have multiple ways of resolving the dependency graph, including uninstalling everything, so how is that handled? Okay, so the long-term plan: at
this level, in CUDF we have encoded only the semantics of per-package upgrades. The specification tells you formally when a solution really is a solution to a user request and when it is not, and all the tracks we had in the competition were not defined within the CUDF semantics: we had a separate way to compute the value of a solution. The long-term plan is actually to have a language in which system administrators can encode their own requests. The idea is to have a small language in which you can express what you want to do, like minimizing the number of installed packages or whatever. This is not yet in the CUDF semantics, but it will be coming. At the moment we just did the easy part, let's say: we just said, okay, track one optimizes in this way, and how that is done has been up to the solvers.

So... where are you? Ah, I am on film.

I actually have two questions. The first one is: why weren't multiple architectures considered in the first place? Because I know that RPM has been doing it for quite some time, to allow co-installability of packages for multiple architectures, and it's just Debian that doesn't yet.

Okay, I will reply to that. In the, let's say, ad hoc standardizing committee for this stuff, on the Debian side we have been more active than the yum people who were on it, so it should have been up to them to raise the problem. We didn't really think about it in the beginning, and it occurred to us only later on. But the solution is kind of easy: it is just a matter of generalizing the notion of a unique identifier. You basically add to each package a unique identifier, and in that identifier you encode the architecture, or in fact whatever you want. So that's easy; it is kind of a historical reason. Of course there are also the changes on the architecture side.

Yeah. And the second question is that, somehow, that format seems to be useful for upgrade reports, to collect the
information on upgrades: what the user wanted, what was available at that point, and what was installed at that point. Are the two tools mature enough to do such a thing?

For what concerns apt-get, yes. It is fairly easy, because apt-get has only a single run of dependency solving, so with an external wrapper you can easily capture the situation. So yes, for capturing the user's problem, these upgrade scenarios, the tools are mature enough. In fact, one of the commercial distributions participating in our project is using this technology to collect problems from their own users, offering them support via their own support channel. For what concerns aptitude, no, the tools are not yet good enough. The point is that what we have now is an external wrapper, while aptitude internally can do several iterations of solving, so we would need to hook into the code to capture the scenario at each step. It is totally possible, and Daniel Burrows is actually interested in doing that; we simply haven't yet gotten around to it.

With the versioned Provides, is it possible to give a range of version numbers, or can you only do ranges which are unbounded on one end?

You mean in the dependency specification, or...?

Well, with a dependency you can depend on versions greater than some bound and conflict with versions less than another, and that will give you a range. But you can't do that to say: I provide this interface only between versions three and eight.

So we cannot express all of the possible ranges: we can express the closed ones in dependencies and the open-ended ones elsewhere, so we do not have the complete ability to encode all the ranges. But as far as I can tell, all the ranges we have found in real package managers can be encoded. And I can say that in a future version of the format we are going to add an explicit operator that will enable us to encode all possible ranges of versions.

I didn't see anything in there
which allowed you to encode the fairly important property of whether you actually chose a package or whether it was automatically installed. Apt has gained the functionality to understand the difference between something you explicitly asked for and something that was just pulled in as a consequence.

Okay, so that is exactly an example of something you can encode with extra properties, but for which we didn't see the need to have it in the core set of properties. You can very simply add a boolean property, called leaf or root or whatever, I don't remember what it is called in apt terminology, and with that boolean encode whether the package has been explicitly installed by the user or not. And you can easily define an upgrade policy saying: please remove all the packages that have this boolean set to true, if you can do that while respecting the dependencies.

Okay, fine, so that works. I'm glad you already mentioned multi-arch; the other, similar aspect is differently optimized packages. What we would like is to be able to provide different variants. At the moment we encode that in the package name, you know, mplayer-i686, which is a really crappy solution to the general problem of different variants, different builds of effectively the same thing, which might differ in local build options. Maybe the unique identifier provides that functionality, but that is certainly a class of things you may wish to optimize on, and I don't know if you need extra functionality to express it or not.

Yes. More generally, and this generalizes the answer to your previous question, the format is open-ended: you can define extra properties, and the idea is that you do that exactly so that you can then ask the dependency solver to optimize a specific function of those properties. So yes, you can have an extra property, a custom property, which expresses how much a package is optimized, and then ask the package
manager: please maximize this or that.

Yeah, okay, I have another question. If I understood your format correctly, you don't have any notion of what we have in Debian as Recommends and Suggests. How are you planning to deal with this kind of soft dependency?

Okay, so we have actually used Recommends already in the trendy track, and once more, the reason why that is not explicitly in the format is that the correctness of a solution does not depend on the fulfillment of those soft dependencies. Even with aptitude, or the other package managers we have in Debian, if a solution is correct ignoring the Recommends, then it is also correct considering the Recommends. So what we did, in the track where we asked to optimize on Recommends, was to have an extra property called recommends, and to specify that what the package manager should optimize is a function of that property over the whole package universe.

So you are planning to implement this kind of policy for the sysadmin?

Yes, that's it.

How do you handle the removal of a package from the archive, when they stop supporting the package?

Okay, so the point there is that it is already the case that package managers only care about the current status of the system they see; they don't have a perspective on the evolution from one state to the other. With apt, when you do apt-get update, you basically throw away the previous status, retrieve the new one, and join it with the status of the installed packages.

But it will not remove the unsupported package.

Sorry?

It will not remove the unsupported package.

Yeah... well, do you want to say something about that?

Yes, sure. Okay, so that is in fact a good point, and I think this is also a point which can be handled with extra properties. Currently we model the packages which are installed on the machine and the packages which are available to apt in the same way, but you could change that and add an extra property, a flag, which
says, well, this package comes from unstable, or this package comes from testing, or this package comes from stable. In that way you could just add a policy which says: remove all packages which are not also available in stable, for instance.

Thanks. Over there... oh my, you have a lot of questions.

Sure. I was wondering if you had looked at all into optimizing upgrade pathways, or whether you just consider the end point. Because, for instance, it could be a problem, if you have a package that provides a service, to have it just unpacked and have the service not running for an extended period; or, in the case of hard Conflicts as opposed to Breaks, to have the package uninstalled altogether temporarily.

That's true. So this is on the one hand a shortcoming of this approach, and on the other hand a deliberate design choice. The point is that the semantics of the format is what we call transactional: we have the initial state, the package manager computes the final state, and we just assume that you will magically go from one to the other. This, again, is already the case in current package managers, which first compute a solution and then decide how to implement it, asking the underlying low-level package manager to do so. How to compute an optimal upgrade path from the current status to the status found by the package manager is part of our future research, and we actually have a project which will start on that. But we are quite confident that the two things are separate: you can safely first compute a solution, maybe encoding there all the extra requirements you get from these kinds of constraints, and then decide how to deploy it on the system.

Anything else? Okay.

Okay, so, since this is NP-complete, are you also looking at how to encode the strategies that solvers use? Because a solver may come back with a strategy... well, it will come back with
a solution, and it would be nice to somehow templatize that and turn it into a strategy for the future.

Okay. Let's say that in the project we have various kinds of expertise, and we are on the side of encoding the problem formally; these kinds of low-level strategies are actually being implemented in the solvers. The teams of people working on the solvers are doing those optimizations. Unfortunately, as I said, not all the solvers we have in the project are open source, but most of them are, so the strategies they implement are actually available and can be looked at, to mimic them and, as you say, to templatize them.

Anything else? Franklin?

Are you aware of any efforts to have a database for the package names, to connect the projects among distributions?

You mean like mapping different properties to the names?

Just the names of the projects: being able to say that bash is named bash in this distribution, but maybe something else...

Oh, that's an interesting question, because as part of our future research, at some point, we envisage having a separate infrastructure which would actually be in charge of doing what you just said: mapping the name of an upstream project to the different packages. And this gets interesting when you start thinking of the cloud, because you might have different machines, say a Debian machine and a Red Hat machine, which have dependencies on one another, like a web server on one side depending on a database on the other, and at that point you need to know that a dependency as seen by one distribution may have a different name in another distribution. So we acknowledge that need, but we don't have it yet, and actually we don't have a plan to go in that direction.

We have time for one more question, if anyone has a last question. Okay, thank you.
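As a footnote to the exchange above about mapping versions to integers: the following is a minimal sketch, in illustrative Python (not part of CUDF or of any tool mentioned in the talk), of how a closed package universe allows an order-preserving integer encoding of each package's versions. It assumes a toy dotted-numeric version syntax for the comparison; real Debian or RPM version comparison is considerably more involved.

```python
# Sketch: in a closed universe, each package has finitely many known
# versions, so they can be mapped to integers per package while
# preserving their relative order. Solvers can then compare versions
# with plain integer arithmetic.

def encode_versions(universe):
    """universe: dict mapping package name -> list of version strings.

    Returns a dict mapping (name, version) -> int, order-preserving
    within each package. Versions are compared as tuples of integers,
    a simplification of real version-comparison rules."""
    def key(v):
        return tuple(int(part) for part in v.split("."))
    encoding = {}
    for name, versions in universe.items():
        for i, v in enumerate(sorted(set(versions), key=key), start=1):
            encoding[(name, v)] = i
    return encoding

universe = {"bash": ["4.1", "4.2", "4.10"], "python": ["2.6.5", "2.7.1"]}
enc = encode_versions(universe)
# Order is preserved: 4.2 sorts before 4.10, even though the string
# "4.2" sorts after "4.10" lexicographically.
assert enc[("bash", "4.2")] < enc[("bash", "4.10")]
```

The key point the speaker makes is that this encoding loses no information precisely because the universe is fixed at solving time: no version outside the document ever needs a number.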