Excellent. Thank you for your patience. Alright, hi everybody. After that wonderful display, which I'm sure inspired confidence in my ability to computer: package management unites us all. We're all here for this, right? You didn't run into the wrong room, or sit through that ridiculousness for the wrong talk? Cool. Let me get my headset. I'm going to put this over here, and this over here. Whoa, that scrolls fast. Alright.
Hi, I'm Sam. By day I'm a software engineer at Stripe. I also work on package management for Go. And almost exactly two years ago, around February 11th, 2016, I wrote an article entitled "So You Want to Write a Package Manager". It was absurdly long. At the time, my goal was to work on package management for Go, because, well, does anybody in the room use Go? Alright. So you know that package management for Go has been a giant trash fire for years. We really needed to improve things, and there were a bunch of reasons that I embarked on writing this. It's a 14,000-word article with lots of cute diagrams. Do people know this article? A couple. Cool. So read it at some point, or not. I tried to write it with enough sarcasm that you could actually get through 14,000 words, so it's kind of entertaining as you go.
My goal was to lay out the whole problem, not just for Go, but to say what package management really is and what the best practices and ideas are in a general sense. It was certainly tilted towards what I was thinking about with Go, but I wanted to work on it generally, and that spirit is the same one as this talk, this "unites us all" idea. So I want to talk about this title.
I give talks a fair bit, and I'm definitely a "let's get along when we can" kind of guy, but this is one of the more kumbaya talk titles I've given. I want to tease it apart a little, because I think there are three different ways to look at this title that are kind of interesting.
The first possible interpretation of "unites us all" is a kind of American idiomatic thing, so please forgive that. There's this pseudo-intransitive use of the verb "unite", which is definitely not an intransitive verb. It's interesting because it refers to an experience that a group of people share, a common thread that can be a foundation for camaraderie and relationships. But the connotation is passive. It's not that we are actively brought together by something; it's more like, oh, we have this thing that we share, and that's kind of nice. If we put it in terms of "hey, we have this thing in common", then we're actually using the verb "to be", which of course is intransitive.
I think there's a fair bit of shared experience when it comes to package management, and it's important to think about. The obvious one: raise your hand if you've had the experience that package management is terrible and you want to rip your hair out and you just hate everything. Right, so you're all honest. Pretty much, because package management is the worst. So the experience everyone has is certainly the easy one to point to. There's nothing like being in a foxhole with someone to inspire camaraderie.
There is also something bigger in the shared experience of package management that I think is important, and it comes from the way package management is deeply tied into the process of creating software, the difficulties of creating software, and all of the unknowns we are trying to deal with. So that's one sense of "unites us all".
The next one is the more standard definition of "unite": it actually brings us together. This is another idiom, "the ties that bind": the things that bring us together, the things that bind us together. In this sense, it's interesting to think about package management as the thing that mediates and defines the relationship between people who work on software systems, at least to the extent that our interaction with other developers is defined by our interaction with their code. Let's not push that metaphor too far, because it gets kind of weird, but at a basic level, a fair bit of the time I'm interacting with other people through the code they have published, and I'm doing that in a way that's mediated by my package management tools.
So an important boundary in modern programming languages is the boundary established by package managers. This talk is going to be somewhat slanted towards language package management, but I am actively trying to move away from that and generalize, and I'll come back to that. Certainly, though, language communities are strongly defined by the package managers that they have.
This is actually why, if I'm being honest, I find a moniker like "the C++ community" kind of confusing. To be completely clear, I have never written C++ in a serious way, so I don't really know how they would define that community, but not having any consistent package management makes it difficult for me to understand where the boundaries of a community sit.
So, the third sense. If this were the title of the talk, it would be fine to talk about individual communities or individual package managers. But the title of the talk is not "package management unites small, reasonably uniform groups into larger tribes". It is "unites us all", which means I want to talk about what it means to look at the problem of package management as a whole. It's kind of a reach even to say "the problem of package management" in the first place. We refer to package management, but it's not a problem domain in the same way that compilers are a problem domain. I think it should be, and I think that's starting to emerge right now. Compilers exist independent of any individual language; there are techniques that you can see applied across different compilers regardless of the language they are attached to. I think we would all be well served if we could bring package management to the point where we can talk about it as a problem domain independent of a particular language.
I also think the analogy to compilers is apt, because the way I tend to think about language package managers, and to an extent system package managers, is that the output of the language package manager is the input to the compiler. Compiler phase zero, if you will.
So this is the outline for the talk. This happens to me a weird amount, where I picked
the title months ago without really thinking about what was going to be in the talk. Then, when I actually sat down to write the talk, I asked myself what the words in the title mean, which I do because I love words. I started tearing it apart, and it turned out that the whole talk is in these three different interpretations of the idea of package management uniting us all. So these are the three parts.
Okay, let's talk about the first one: shared experiences. There are two parts to this. First, the part that we know. Package management is one of the very few problems that is blessed with its very own kind of hell. Really, how many other places get not just one circle of hell but a whole hell of their own? A lot of this particular brand of hell has to do with what happens when there are shared dependencies in a system. We've got A and B, both of them depend on C, and there is a knock-down, drag-out fight between A and B as to which versions of C will work with both. There are different words for that: shared dependencies, diamond dependencies, a lot of things. That's one of the main things people think about, but it is not the only circle in dependency hell. There are circular dependencies and other kinds of self-referential issues. There are overly tight or overly loose version constraints that get you all sorts of stuff that doesn't actually work. My take is that there are a lot of different circles, and it's a mistake to focus on any one of them to the exclusion of the others, even though the diamond dependency, the shared dependency situation, is probably the hottest one.
Of course, with most package managers, no matter which problem you're facing, you have two problems. Unless you are actually the person who wrote the package manager, you're only about 60 percent sure that the problem it's telling you about, or that you think that
it's telling you about, is the actual problem that you have to deal with. This domain is arcane and complex enough that maybe it's reporting an erroneous error or an ephemeral error, or maybe it's reporting the surface version of an underlying error, and when you solve that, something else will happen. The problems just keep compounding.
Like I said, this first bit is about the shared experience of pain. We see this a lot in package managers. I am of the belief that it's because it's a complex domain, but maybe it's just because all of us who work on package managers are bad at writing software. I don't know. Either way, it's easy to look at the frustration of working with a package manager, compare it to some imaginary ideal where we don't have any of the problems we're running into, get really frustrated, and say: "Package managers are terrible. Why do I even need this? I'm just going to roll everything into a Makefile and everybody can go F themselves." This is usually not the best way to go.
That brings us to the second shared experience, the thing that I think is harder to see, for the same reason that a fish doesn't really know that it's in water. I think it's important to keep in mind the role of package management in the inherently difficult process that is creating software, and what you're doing when you are working with a package manager. When I think about the process of developing software, clearly I'm not going to write everything top to bottom myself. I'm not that person, and I don't think that person exists anymore. So I know that I'm going to have to rely on other people's code, and I know that's going to bring some risk into the equation. Maybe they are just crazy people. Maybe they're going to pull the code out from under me. Maybe there are bugs hiding in
there that aren't easy to know about. Maybe there are security holes hiding in there that aren't easy to know about. But the thing I'm counterbalancing against is that I still need to ship something, and if I can't pull in the things other people are working on, then I can't actually progress. I can't do my job.
Given that the role of the package manager is essentially to mediate between that deeply unknown, very risky world and what we are working on in front of us, it's unsurprising that we see a fair number of complicated and annoying things coming into our package managers. It is trying to organize a very chaotic space.
So when I think about a good package manager, I have some basic rubrics that I apply. My sense is that, given that the world of dependencies out there is chaotic and it's hard to know how much we can trust it, the best approach is experimentation. If I need to test out whether a dependency works well, I want to be able to do that easily, and I want to be able to pull it out easily. I don't want it to be an arduous process just to perform my experiment and get at the underlying question of whether this code is any good or the maintainers are any good. Similarly, I don't want the process of doing this experiment to push me out on a limb where it's really difficult for everyone else I work with, or my build process, or whatever else, to replicate my experiment. I want to be able to iterate quickly and flexibly and explore the crazy, chaotic world of dependencies, but be insulated from it at the same time, and be able to push things through rapidly so that my experiments give me feedback fast.
It's a very weird balance of two conflicting forces. On the one hand, we want something that lets us be fast and flexible, and on the other hand, something that is
exceedingly reliable in terms of its outputs.
Okay, the ties that bind. It's important to talk about the goals of package management, the high-level stuff. As with any software, if we don't have a solid grasp on its foundational purpose, it is way too easy to get lost in the minutiae of the particular little problem right in front of us, and there are a lot of those in package management. Discussing package management as a risk management system, as I've been doing, is kind of high-level and hand-wavy. I get it. For this part I want to make another high-level assertion, which is that modern package management is constitutive of communities. However, the argument for this emerges directly not from high-level pitches but from the actual guts of package management. So if you were really annoyed at all the abstract, hand-wavy talk, you can wake up now, because I'm actually going to talk about things.
We're going to start from the beginning: what is a package? I don't have a terribly satisfying definition that I can put into a single sentence yet. I'm working on one. What I do know is that the floor, the minimum possible number of different definitions, is at least the number of package managers that are out there. None of them are entirely the same, and even subtle differences in the way these package managers define their packages make them deeply and surprisingly incompatible.
Given that degree of variance, let's start super simple. Let's say that this box represents a package. First and foremost, the package is a boundary line: it separates this package from other packages. You can think of that a couple of ways. In one sense, it means that wherever this package physically exists, it's going to be possible to retrieve something like a tarball of just this discrete package. That's a distribution-time boundary. In another sense, it strongly
implies, although it does not actually entail, that any software compiled or interpreted using this package will likely incorporate at least the name of the package in the software that it generates. That is a boundary we can see at compile time or execution time.
Packages also impose rules on what they contain. Whatever kind of software this is, there's going to be some amount of rule imposition on its contents. These rules can be really simple, and will of course vary from system to system. Maybe we require that some sort of metadata file must exist, like a package.json file for npm. Maybe we mandate that the file layout must conform to a certain pattern, perhaps because that's the pattern that the compiler or interpreter that will look at this package later on is expecting. Maybe we go even further and say that the logical objects in there, the files, the classes, whatever it is, must conform to certain linting or test-passing rules.
Even if you have a system that manages to gatekeep the creation of packages to make sure the rules are followed, there's still going to be plenty of gray area in the possible set of packages, and it'll be possible to make judgments about what the best practices are for constructing good packages versus bad ones. So we have some rules, and if we can enforce them in a way that prevents totally invalid packages from existing, great. But there's still going to be a gradation of package quality that means something specific to the package manager's environment.
The other major thing... I keep going in the wrong direction. In addition to imposing rules on the logical guts of the package, packages also expose some kind of metadata. The examples here are super simple, but we can say they expose author and
maintainer information, or dependency and constraint information: which versions of its dependencies it's allowed to work with. This is necessary, by and large, for the functioning of the manager. Some of it is information that is presented in interfaces, such as who the author or maintainer is and what website is associated with the package. Then there's more functional information that the package manager may need in order to construct a dependency graph.
There is a big piece missing, of course, when we talk about packages that way. We can't just say that they are a name. Usually they are at least a two-tuple: a name and a version. If we say they're a name and a version, then we also have to realize, and forgive my horrible notation, that there are going to be families of packages that all share a name but have different versions. Maybe there is some ordering relationship we can understand between these, a v1, a v2, a v3 or something like that, or maybe there isn't. That's not a generalized guarantee of the domain.
If we roll this up one level further, we get to the idea of package universes. A simple way to think about a package universe is that it is the set of all packages at all versions that exist, more or less, that a tool can access. So, yay, however many exist in this universe. Saying that it's "the set of all" is actually a problematic oversimplification, though. Like our own actual universe, only part of the package universe is going to be observable or reachable at any one time for any given tool. There are a few factors in this. The simplest one by far: if we said the package universe is the set of all packages at all versions, that would include packages that sit uniquely on your laptop right now, and I can't reach those from my laptop. So clearly we're looking at, at least, some subset
of the total universe of packages that our package manager might be able to interact with. There's also a time component: in at least some kinds of package management universes, maybe some packages go away, or maybe they get moved. Either way, the time dimension matters. And then, yeah, the metaphor kind of breaks down.
The way that you constitute the universe matters. System package managers and most language package managers have some kind of central, or perhaps distributed, registry or repository that actually is the set of packages. I'm in a very weird case here, because Go is more or less the exception to this, simply because Go uses what are more or less FQDNs for its import paths, which means the thing that actually decides the universe is DNS, not some service that some people are running somewhere. For the most part, though, we're saying there is a registry or repository out there that is capable of taking a name and mapping it to an actual physical package object. As soon as we say there is a registry, we have to accept that maybe there's one public registry, or a couple of public registries, but there's also the possibility of private registries, or ancillary additional public registries. The interaction of all these together ends up creating a view into the larger universe, and your package management tool at any one time is going to have a view into just some particular slice of the universe. That's weird, that wasn't supposed to be in the slide. Oh well.
So why talk about all this? The argument I'm trying to make is that these elements tell us something about how we constitute communities using package managers. When I say "constitute", what I really mean is that
package managers define both the boundaries of a community of software and the way that you participate in it. The rules they apply are the rules you end up having to follow. They are effectively norms that people are expected to obey when creating packages. And because those rules are, at the end of the day, guidelines, learning how to follow them in a way that is less harmful to the rest of the ecosystem and more helpful to the other people in it, while still serving your immediate purpose, more or less ends up being the definition of being a good actor inside the system. You learn to trust other people by whether or not they do a good job of following the best practices in those gray edge cases that we seem to constantly spend so much of our time in.
Additionally, the package must be in a reachable part of the universe, so the slice of the universe that our tool can actually see is kind of the bounds of the community. Prior to the advent of modern package management, there were different ways. I guess you just curled down tarballs all the time, and that was fine. You were used to doing that, you stuck it in a Makefile, and it was cool. But once the automation of package management came into play, that's just not how we do things anymore, for the most part. So the package management tool's ability to reach something ends up, for most purposes, determining whether that thing is in or out of the community.
Alright, one last binding type: history. A little bit of a story. A few years ago I started a deep dive down the rabbit hole that is proof and correctness in software. I have no CS background, so this turned out to be quite the trip, and I am still very much a dilettante with all this. I've never
cranked out a TLA+ proof, or even written things in Eiffel or Idris. But I came to a question that I think is a pretty normal one for people who've been down this rabbit hole. Who here is familiar with software correctness in general, and proofs and the things that go along with them? Okay. It's a super fun, super weird rabbit hole, and it gets super confusing super quick. That's lots of supers, but it's because you come to this question here, which is the title of this paper that the legendary Tony Hoare wrote. As soon as you start looking at the idea that it's possible in the first place to construct a proof that tells us the software does what we think it does, that there can't be bugs... There's a great quote from Dijkstra, that tests do not prove the absence of bugs in a program, only their presence. The space you haven't explored yet with your tests is where all of the craziness and the risk is, and the only real thing you can do to deal with that space is proof.
So for a long time in the earlier history of computer science and software engineering, it was believed that of course we were going to use mathematical proof to be sure our systems were correct. Tony Hoare won his Turing Award at least in part for Hoare logic, which he wrote up in a paper in 1969. It's a precondition/postcondition way of asserting that the state of a program is good, and it has now been kind of codified into Eiffel, which I mentioned before. Super interesting. Anyway, I'm getting off topic. He wrote this paper in '96, and he's like: alright, I've worked on this for 20 years, so how is it that we built an entire software industry which seems to work, despite the fact that we have only explored the tiniest little set of paths of the things our programs could do? It seems to
be good enough, and it's weird.
The answer to this question, basically, is software engineering. It's a profoundly unsatisfying answer. There are a lot of way stations on this rabbit hole, and seriously, it's fun to jump down. But in some way we've made software sufficiently reliable, for some interminably, infuriatingly imprecise definition of "sufficiently", through the craft of software engineering itself: writing tests, making incremental changes, modeling things formally and informally, and then, crucially, to bring this back, isolation and modularity.
If you look back over the history of software, modularity has not always been a thing. It's so familiar to us now; again, it's one of these fish-in-water things. But prior to the advent of compilers and higher-level languages that could do fancy things like stitch files together, we didn't really have the capacity for modularity. In the '70s there were two major things happening. On the one hand, structured programming was coming around. This was Dijkstra and Hoare and a couple of other people, and the basic argument was that we need to get away from goto and jump and move towards subroutines, blocks, and loops, the primitives that we are familiar with in a lot of modern programming. At the same time there was modular programming, which is co-emergent with structured programming. The big things emphasized there were separation of concerns and information hiding.
Now, for the most part, modules are not themselves functional objects. Again, there's a little rabbit hole here: what's the difference between a module and a function, when both of them hide information? I can't even say flatly that modules aren't functional objects. For the most part, though, they are sets of functional objects. You can
yell at me about Standard ML later; it's totally fine. But the reason all this matters is the presence of modules. This is John Regehr, a computer scientist at the University of Utah: the presence of modules, when they are organized well, makes the total size of the software almost irrelevant. We can focus on just the little piece that we are actually working on.
If you look at this and cross your eyes a little bit, it starts to sound an awful lot like packaging as well. Packages are also generally non-functional objects; they are containers around functionality. I'm trying to stitch it all back together here: we've got a whole lot of different logic that we drop into the bucket that is packages, and we have some rules that we enforce on it from the top. But the last layer, when it comes to actually defining what a package is in your system, usually comes from figuring out the relationship between whatever your programming language, or whatever system you're packaging, thinks of as a module and what you think of as a package. Usually it's either one-to-one or one package to n modules, but there are cases where it's one module to n packages. This relationship ends up being one of the really crucial ones to understand, because in general it makes far more sense to think of a package in relation to a set type of object, like a module, than to an individual functional element, like a function.
Which is all to say that packaging sits at the intersection of humans and software. It's the spot where we have this handoff, this interchange, from a module, which may have some functional aspects to it, to the thing that humans actually interact with. I'm not attaching my name to a module, but I am attaching my name to a package, and that's important, because
the package is the thing that gets distributed, the thing that other humans are making decisions on the basis of, the thing that is getting organized.
Alright, 19 minutes. Okay, last section, with this terrible "unbinding the boundaries" title. I'm kind of already transitioning into it by having this general discussion about what constitutes packages, but I want to move beyond that to talk about some individual things that really do span boundaries, and to look towards the future a little bit.
Maybe I'm spoiling the panel at five o'clock, but we're at an interesting moment now. For a long time, and certainly still in some places, the relationship between folks who worked on different package managers, on different distros, on different languages, was really fractious. It was not a friendly place to be. The past few years seem to have changed that, and it's really heartening. I think I put a slide in... no, I did not. Okay. Literally just this morning, Kat Marchán from npm tweeted something about how a bunch of the design of npm 5, which was released a year ago now, or nine months, something like that, was informed by the article that I wrote, and then how a bunch of things that I worked on in dep have been informed by their experiences with npm. We now have this package community space, which, if you are sitting in this room, you might be interested in checking out. There's a Discord chat that we have open, with something like 25 channels for discrete package managers in there. The point is that we're getting away from this fractious relationship. We're getting to the point where at least the people involved are interested in talking about what the fundamental parts of package management are.
I am very interested in that question, and what I have come to is that one of the first things we have to do is define a
taxonomy of package management, if we want to be able to grow this into some kind of compiler-esque, language-independent, system-independent domain. This means talking about what the objects are, which we've already done a little with packages, but then also talking about what the different choices are and what they mean, and classifying the different systems that are out there. It seems to me that this is a prerequisite, specifically because I've seen a lot of conversations now between people who work in different areas, and we don't have a shared vocabulary. We have to hash it out every time. So many behaviors end up getting encoded slightly differently at slightly different parts of the process, and all of that turns into things that are just assumptions for people, and it's very difficult for us to tease it apart. So teasing it apart is the first step.
Here's an example of what a taxonomy might look like. To be clear, the taxonomy is mostly a forward-looking project. If you're interested in that, talk to me, because I really want to work on this.
System versus language. I think [name unclear] actually talked about this this morning too, so I'm curious how mine compares with his. When we think about system package managers, they usually aim to create combinatorially safe package universes, which is a really interesting property and also a really difficult thing to do, because the possible combinations are a combinatorial explosion. It means that we are going to publish a repository where we have tried to test at least a reasonable set of the combinations of software in that repository, because we want to insulate anyone working on our distro from the possibility of an unintended interaction between software that produces some crappy outcome. That is generally not what folks who are using system package managers are trying to spend their time doing.
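
To get a feel for why combinatorial safety is hard, here is a minimal Python sketch. The toy universe, the package names, and the helper `count_selections` are all made up for illustration, and nothing here reflects how any particular distribution actually tests its repository. It simply counts how many distinct selections exist if you pick exactly one version of every package.

```python
from math import prod

# Hypothetical toy universe: package name -> versions available in the repository.
universe = {
    "libfoo": ["1.0", "1.1", "2.0"],
    "libbar": ["0.9", "1.0"],
    "app": ["3.1", "3.2", "3.3", "3.4"],
}


def count_selections(universe):
    """Count the ways to pick exactly one version of every package."""
    return prod(len(versions) for versions in universe.values())


print(count_selections(universe))  # 3 * 2 * 4 = 24

# The count is a product, so it grows exponentially with the number of packages.
# Assuming 100 packages with only 3 versions each:
print(3 ** 100)
```

Even this simplified count, which ignores which packages are actually installed together, makes it plain why a repository can only be tested against a reasonable set of combinations rather than all of them.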
This also has a social implication, which is that there is a finite group of trusted people who are responsible for populating the universes. Version constraints are also usually correct, because they've actually been tested together. And also, system package managers tend to encode more at the metadata level. I mean, realistically, you can play Sudoku with dpkg. It can be done. These package managers are almost pseudo programming languages. They are pseudo programming languages in many cases. And because system package managers are so interested, in general, in keeping the contents of the package opaque, they tend to encode more information at the metadata level. That's where they put stuff.
By contrast, language package managers are usually ungated. There's no attempt at a combinatorial guarantee, because that's literally no one's job. I'm a developer, I push a package. My job is to make sure that my package works, according to the definitions that I have set out, with all the dependencies that I may have. It is not my job to make sure that I work with all of my dependents. That's their individual job. So we don't make the attempt. Version constraints, then, are usually optimistic, because we want to say: well, I'm publishing this now, but maybe I'll leave it, and I don't want to say that my stuff only works with this current published version of my dependency. Maybe it'll work. It'll probably work. SemVer is supposed to help with this, hand wave. It'll probably work with future releases of it, at least within this range. So I'll leave that constraint open, so that I don't create a pain in the butt for anybody who is using my software three years from now, when a whole bunch of new versions of my dependency have been made and it's trying to reconcile with somebody else. Of course, overly open constraints end up meaning that you have
things that don't actually work together all the time. So, you know, problems. But because language package managers are very much about the details and the contents of those packages, very often details may flow through from the contained package's symbols and be used in the work that the package manager does. Actually, I should not say very often. I should say it should happen more often. That's in contrast to a system package manager, where they're going to try to encode that as metadata, as opposed to inspecting the actual contents of the package. Encode as metadata, as in: a maintainer decides that the package has some property and sets it as additional metadata. It's not inferred from any sort of analysis of what is in the package itself.
Besides the system versus language distinction, the single most important thing in a package manager is whether you allow duplication of packages or not. This is far and away the most important, and it is relevant for both system package managers and language package managers. Nix and npm are two examples of package managers that duplicate. When I say duplicate, what I mean is, you know, our shared dependency issue again, right? We have A and B, and both of them depend on C. If you duplicate, then effectively A and B get their own copies of C. Everybody's happy. A and B don't have to agree on everything. Everybody just gets their own little play space, and it's fine, kind of. Most others don't. The benefit of this, and the reason why it's an estimable goal, is because you can avoid SAT. You can avoid Boolean satisfiability, which I'll come back to in just a second. The costs vary. In something like Nix, I read the cost as relatively low. What, are we like wasting disk space? Oh, buddy. But in a language package manager, it's trickier. What you often end up with there is global state that gets duplicated, and if you have multiple instances of a package, both
of which are trying to connect to a database, and they both create database singletons, now we have two instances of things both talking to the database at the same time, and all of a sudden your app is super incorrect in very, very weird, racy ways that you didn't realize, all because the package manager made a decision about duplicating this thing. So the question of whether duplication is safe or not is often really quite intertwined with the way that the language itself works. But the benefit of no SAT is huge, because this is the giant thing on the screen, and also sitting in the middle of this problem.
When I say SAT, who knows what I mean? Cool. Okay. Boolean satisfiability, yay. It is the first of Karp's 21 NP-complete problems. NP-complete means ouch. Or, if you know big O notation, it's O(2^n) in general for the solving part. Or just "oh shit", that's fine as well. So Boolean satisfiability problems (I'm way behind my notes) are problems which take sets of Boolean propositions and try to work out a combination of values that actually satisfies all the requirements. It is a tremendously difficult problem to solve, at least generically, and there's a whole field of research dedicated to it. In fact, I think it was the 50s or 60s that Karp articulated Boolean satisfiability, and it wasn't until the late 90s that they figured out a technique that actually made it computationally feasible to solve anything more than the smallest trivial problem. But the other big costs from this tend to be that it is such a complex problem to work on in general that it's very, very difficult to reason about the answers that we get from SAT solvers. There have been plenty of cases, well, not really plenty, there have been at least some cases where some just wonderfully well-researched PhD thesis made some improvements to some SAT solver, and it appeared to work, and then like three years later they figured out that none of the
reasons why that wonderfully researched thesis thought that it was improving things were actually accurate, and all the PhDs just cry about it. So it's a very tricky domain to work in. There's very good reason to want to avoid SAT if you can.
At the same time, well, so here's the thing that I'm engaged in right now. Despite SAT being a very nasty problem, it's also potentially very useful. That's the reason it's so nasty. Now, the reason that we end up at SAT is easy to explain if we're just looking at version numbers. Russ Cox, the BDFL of Go, has a great post up called "Version SAT", which he wrote 15 months ago or so, in which he just lays out that, yes, version satisfiability is NP-complete. But we don't just have to be looking at version numbers. We can also encode deeper language rules as things that we want to put in the constraints, although they get really tricky to solve otherwise. But we have the basic tension here between approaches, and this is something that Russ and I are going back and forth on right now, which is sort of SAT as the problem versus SAT as a platform for working on things.
So here's this thing called Schaefer's dichotomy. I've got six minutes. It's a really weird result. Schaefer's dichotomy tells us that all SAT problems are either in P or NP-complete. There are a whole mess of complexity classes between those two, but no satisfiability problems fall into them. And we can know for certain that if a given Boolean satisfiability problem can be expressed using a restricted subset of Boolean relations, just which sort of forms of clauses are allowed, then it will always be in P. This is interesting because Russ (he hasn't put this post out yet, but I expect this is going to be out in public in the next couple of weeks, or soon in any case) has something that he's describing as minimal version selection. This approach is
his attempt to work on package management where we treat SAT as the problem to be avoided, more or less. So the basic preconditions are, and I'm not doing the full dive here, I'm out of time, and he's going to have a write-up, but: one, assume that SemVer can adequately define compatibility within a major version. I know, big assumption. It certainly doesn't work for dynamically typed languages; maybe it works for statically typed ones. But so, assume SemVer works. Two, we only allow minimum versions to be specified. And three, a mechanism exists to allow duplication, but only across major versions, meaning that we can have exactly one version in the 1.0 range and exactly one version in the 2.0 range. Then you can actually work it out. We're on the P side of Schaefer. You can express it within the restricted set of Boolean relations that's dictated by Schaefer's dichotomy, which is super cool. It's really, really super cool. Getting around SAT while still being able to do essentially what we need to do is very powerful. There are a lot of trade-offs that come with this, but no time.
Now, here is an alternative. When I say that Russ and I are going back and forth over this, we're going back and forth over a bunch of things related to Go package management. These two are not directly in opposition, necessarily. But this is something that I've been thinking about for a while, which is the idea that, well, version constraints cause unnecessary pain both when they're too loose and when they're too tight. And even if a constraint was good when you set it, over time they tend to age poorly no matter what, which just means that versions aren't great. It shouldn't be a big surprise. They are a really, really compact representation of a whole giant pile of information. That they would not be the best approximation all of the time should surprise no one. But given that they are approximating a
whole bunch of lower-level information, all of the symbols that are exported by a system, maybe we could do better if, instead of just looking at versions, we peek deeper at the logical relationship between packages. So I'm tentatively calling this shape analysis. A CS professor friend of mine tells me that this is maybe most analogous to gradual typing. I have not really dug down this rabbit hole very far yet. But the essential idea is that instead of just saying, hey, a depends on b, let's look at the bits in a that depend on the bits in b. If we know that a.foo references b.bar (doesn't matter if it's a function or a value or whatever, we just know that there is a named reference to b.bar), then we can do a very simple inspection on b. We can look and see if its exported symbol set includes b.bar, and if it doesn't, we know it's not going to work. For some languages, which have, you know, not dynamic typing, but dynamically exported symbols, not so great. But the idea of the API shape analysis is that maybe it gets us close enough to doing most of the work that we end up only needing to use version constraints to sort of massage things at the end, instead of needing to use them to define the whole domain. And then people can do a lot less declaring of awkward version constraints, and we can instead be a little bit smarter about how we pair packages up.
All right, see you. Thanks, everybody.
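The shape analysis idea from the end of the talk can be sketched in a few lines. This is only an illustration, not how any real package manager works: the function names (`missing_symbols`, `compatible_versions`), the package names, the version strings, and the exported symbol sets are all made up, and it assumes exported symbols can be listed statically, which the talk notes does not hold for languages with dynamically exported symbols.

```python
# Hypothetical sketch of "shape analysis": every name and data value here is
# invented for illustration and is not taken from any real tool.


def missing_symbols(references, exports):
    """Return the (package, symbol) references that the given exports lack.

    references: set of (package, symbol) pairs that the depending package uses.
    exports: dict mapping package name -> set of exported symbol names.
    """
    return sorted(
        (pkg, sym)
        for pkg, sym in references
        if sym not in exports.get(pkg, set())
    )


def compatible_versions(references, candidates):
    """Keep only the candidate versions whose exports cover every reference.

    candidates: dict mapping version string -> {package name: exported symbols}.
    The result preserves the order in which candidates were listed.
    """
    return [
        version
        for version, exports in candidates.items()
        if not missing_symbols(references, exports)
    ]


# Package a refers to b.bar and b.baz by name; the kind of symbol is ignored.
a_references = {("b", "bar"), ("b", "baz")}

# Made-up releases of b, with the symbols each release exports.
b_candidates = {
    "1.0.0": {"b": {"bar"}},
    "1.1.0": {"b": {"bar", "baz"}},
    "2.0.0": {"b": {"baz", "qux"}},
}

print(compatible_versions(a_references, b_candidates))
```

In this toy data only one release of b exports both names that a refers to, so the symbol check alone rules out the other two before any version constraint is consulted. That is the sense in which version constraints might only be needed to massage things at the end.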