So I'm a mathematician, so like Dorothy's talk, this talk will be sort of about methods. And I've also benefited from Tetsuo's talk, because he introduced some of the tools that I'm going to talk about. So here's the idea. Suppose we have one of our DNA knots. Then the question that you want to ask as a mathematician is, okay, what's the space of these curves? I mean, this knot is a sample from a probability space, from a probability distribution, and you're like, well, what's the space? What's the distribution? And an idea that's been around, implicitly at least, for a while is that the knot is an equilateral polygon with n edges or something, and you're going to sample these polygons one way or another and you're going to talk about polygonal knots. Or you're going to talk about lattice knots, or knots made of chains of beads. And I'm interested in those as mathematical models because they have a lot of interesting structure that lets you do computations and theory in a really nice way. So here's what I want to do. If you're a physicist, you might know of action-angle coordinates. They're this thing from mechanics where, if you write things in action-angle coordinates, everything is nice. And really my main point for this talk is that the space of polygons has action-angle coordinates, and they're quite simple. And I'm going to suggest that we do computations in them as well as prove theorems. Okay, so here are the next two ideas. If you think about it, it's a very easy, classical thing to say: I'm going to take a random flight, which is to say I'll just sample a bunch of points on the sphere and stick them end to end to make a random walk in space. Now, if I want the walk to close, then it's weirder, right?
Because it's some conditional probability thing where I have to condition on the probability-zero event that the walk happens to come back to the same place. If you're a geometer, you would say: I'm going to restrict to the submanifold of the space of n points on the sphere where the cloud of points has its center of mass exactly at the origin. But of course, an idea that goes back a long way is that instead of conditioning on all of them summing to zero, you condition on half of them summing to the same thing as the other half. And if you do that, then you can write down a very nice probability distribution on the segment that connects the origin to the place where the two arcs join. Okay, so this goes back to Rayleigh, really. This is the probability density of the random flight in three dimensions. And it has this wonderful closed form, wonderfully awful maybe. The little plus means the expression counts if it's positive and it's zero if it's negative. So this is piecewise polynomial in the end-to-end distance, of degree n minus 3. One thing you can work out from that is that the probability density of the segment joining two points on a random closed polygon is just the product of these two PDFs, the one from the first arc and the one from the other arc. And this is some piecewise polynomial thing of degree n minus 4. And I guess it's worth saying that there are a lot of exact computations you can do just from that page of formulas. Like for instance, here are some expectations for chord lengths on a closed polygon. They're exact: this is not a sampling thing, this is a theorem. And if you like, you can look at something nice like a pentagon with length-one edges. I'll just draw it kind of flat. It has two diagonals, let's call them D1 and D2.
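To make the random flight and the Rayleigh density concrete, here is a minimal Python sketch. The function names are mine, and `flight_density` is one common way of writing Rayleigh's closed form, assuming unit steps; a sketch, not code from the talk.

```python
import math
import random

def random_unit_vector(rng):
    """Uniform point on the unit sphere, via normalized Gaussians."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 1e-12:
            return [x / norm for x in v]

def random_flight(n, rng):
    """Open random walk: n unit steps in R^3 stuck end to end."""
    verts = [[0.0, 0.0, 0.0]]
    for _ in range(n):
        step = random_unit_vector(rng)
        verts.append([a + b for a, b in zip(verts[-1], step)])
    return verts

def flight_density(n, r):
    """Rayleigh-style closed form for the density of the end-to-end
    distance r of the n-step flight.  Each term uses the 'plus part'
    convention: x_+^k = x^k if x > 0, else 0, so the result is
    piecewise polynomial with breaks at integer values of r."""
    if r < 0 or r > n:
        return 0.0
    total = sum((-1) ** k * math.comb(n, k) * (n - 2 * k - r) ** (n - 2)
                for k in range(n + 1) if n - 2 * k - r > 0)
    return r * total / (2 ** (n - 1) * math.factorial(n - 2))
```

For two steps this gives the familiar density r/2 on [0, 2], and the plus-parts switch pieces on and off at the integer breakpoints, which is exactly the piecewise-polynomial structure being described.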
And the computation up there is that the expected length of D1 is, if all these edge lengths are one, 17/15. And of course, by symmetry, the expected length of D2 is exactly the same. And using the PDFs, you can plug this into Mathematica, and there's the expectation of the length of the chord connecting vertex 0 to vertex 37 in the 112-gon. And it's exactly that. This is mathematics. But if you're a physicist, you're like, oh, it's 4.6. Okay, and you can do other functions of chord lengths. Like for instance, Tetsuo was interested in the approximation to hydrodynamic radius, which I think of as an average of one over the chord lengths. And you can do that too. For instance, the expectation for the hydrodynamic radius of the 15-gon is exactly that, with the weird, interesting series of descending log terms. Or about 0.768. But okay, if you think about the chord length expectations, it's kind of interesting that they're rational numbers. And it's kind of interesting that the degree of the polynomials is n minus 4. So now, this is the part where we make the big leap. I'm going to say that those aren't accidents; those are actually really significant. And here's where we move from classical old stuff to very new stuff. So here's an idea. If we built the polygon by joining one arc to the other arc, we can fold the arcs around each other, right? This is an old idea in random polygons, and I don't know who the first person to do fold moves was, but Ken knows. Okay, Ken did it. So that's cool. Okay, the thing I like about the fold moves is that a fold move is a continuous symmetry of the space. So that's good. Everybody who studies physics knows Noether's Theorem, right? Which is that if you have a continuous symmetry, you have what? That's right, a conservation law, right?
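The 17/15 number can be checked numerically. This is a sketch, not the exact Mathematica computation: `flight_density` is Rayleigh's formula again (repeated so the snippet stands alone), and `chord_moment` is my name for a midpoint-rule integration against the chord density, taken proportional to the product of the two arc densities divided by r squared.

```python
import math

def flight_density(n, r):
    """Rayleigh's piecewise-polynomial density for the end-to-end
    distance of an n-step unit random flight in R^3."""
    if r < 0 or r > n:
        return 0.0
    total = sum((-1) ** k * math.comb(n, k) * (n - 2 * k - r) ** (n - 2)
                for k in range(n + 1) if n - 2 * k - r > 0)
    return r * total / (2 ** (n - 1) * math.factorial(n - 2))

def chord_moment(n, k, power=1, m=100000):
    """E[length^power] of the chord splitting a closed equilateral n-gon
    into arcs of k and n - k edges.  The chord-length density is taken
    proportional to flight_density(k, r) * flight_density(n-k, r) / r^2,
    integrated here by the midpoint rule and normalized."""
    hi = float(min(k, n - k))     # the shorter arc caps the chord length
    h = hi / m
    norm = mom = 0.0
    for i in range(m):
        r = (i + 0.5) * h
        w = flight_density(k, r) * flight_density(n - k, r) / (r * r)
        norm += w
        mom += w * r ** power
    return mom / norm
```

For the 112-gon expectation quoted above you would call `chord_moment(112, 37)`, though at that size the almost-cancelling terms mean you would really want exact rational arithmetic, as the talk notes later.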
Every continuous symmetry should have a conserved quantity. And it's actually not so hard to see what the conserved quantity ought to be: it's the length of the chord connecting the two points, because that's what's conserved when you rotate one arc around the other. Now, you could say, oh, well, maybe it was supposed to be the square of that length, or the sine of that length or something. But in a very precise sense, it's supposed to be that length. So here's the idea. There's a ramped-up version of Noether's Theorem called the Duistermaat-Heckman Theorem, and it says a little more. If you have an even-dimensional space, it says that if you have more than one symmetry and they commute, then the collection of d symmetries gives you d conserved quantities. Okay, that's really still just Noether's Theorem. But it says more. It says that the joint distribution of the conserved quantities is continuous, piecewise polynomial, and, critically, has degree m minus d: half the dimension of the configuration space minus the number of symmetries. Okay, so let's just check. If we think about polygons, an n-edge polygon has n points sampled from the sphere. Those are the edges. And it's conditioned on three things adding up, right? The three coordinates of the end point have to be the three coordinates of the start point. So we had 2n dimensions for the edges, minus 3 because the polygon closes, and minus another 3 because there are three rotations of R3. So if we consider the space up to rotation in R3, then we have a 2n minus 6 dimensional space. And that is an even-dimensional space. And that turns out to be a key point: by modding out by rotations, you get back to an even-dimensional space. Closed polygons, before you mod out by rotation, were 2n minus 3 dimensional, which is odd, and nothing works. Okay, so half of 2n minus 6 is n minus 3, and we had one symmetry.
So the chord length PDF should have been degree n minus 4, and it was, and that was cool. Okay, but wait, there are more commuting symmetries than just one fold move, right? There are lots of fold moves. Now, they don't all commute. If you have a fold over D1 and a fold over this guy, these are not commuting folds. And you can work out the exact failure to commute and so forth, but trust me that this rotation and this rotation do not commute. But as long as the chords don't cross, then the fold moves pretty obviously commute: we have these rigid triangles and we're just folding, we're just pleating the thing up. Okay, great, so how many of these diagonals are there? Well, if we triangulate the polygon, there are only one, two, three vertices that aren't the end of a diagonal, right? So there are n minus 3 diagonals in the triangulation. So there should be n minus 3 commuting symmetries, each of which has an angle and each of which has a conserved quantity. So here's the theorem: if you take the joint distribution of these n minus 3 chord lengths and the joint distribution of the n minus 3 angles corresponding to them, both those distributions are uniform. After all, the space has dimension 2n minus 6, half of that is n minus 3, and we had n minus 3 symmetries. So the distribution is piecewise polynomial of degree zero, but it's continuous, so it had better be constant. The only thing left to say is: well, constant on their domains, right? Because D1 and D2 aren't actually completely independent of one another. There are some constraints on the possible values; like, this one can't be 10 while this one is 2. Well, what are the constraints? They're just the triangle inequalities. There's the picture. So let me just show you what the inequalities are.
If you look at the inequalities over here, what you see is there are these conditions that look like: D2 is less than 2. Okay, well, D2 is part of a triangle with two edges of length one, and D2 is the third edge. So the most it could be is 2, if the triangle is flat. The least it could be is 0, if the edges are crushed together. But in fact, it being zero says something about the next diagonal as well, because D1, D2, and the edge on the left are part of a triangle, and the edge on the left has length one. So D1 plus D2 is at least 1. On the other hand, D1 minus D2 can be no more than 1, because D2 can't be too short: if I've gone out that far and I come back by an edge of length one, I can come back at most one unit towards the origin. So these linear inequalities determine an (n minus 3)-dimensional polytope. This is called the moment polytope; it's the polytope of momenta, of conserved quantities. In general, it's always a polytope. And on that polytope, the distribution is uniform. So if I wanted a random pentagon, I would pick a point randomly in that polytope in the plane and two random angles, and I could put the random pentagon back together, and it would be a perfect random pentagon. It would be sampled perfectly. Another way to say it is that if you want to do these expectations, what you're doing is an integral: you're integrating over polygon space with respect to volume and dividing by the volume of the whole thing. And the volume form in these coordinates is the simplest it could be; it's just dD1 through dD(n-3) times dtheta1 through dtheta(n-3). And then you're good. And if you want to go back to vertex coordinates, well, you know all the diagonals. So you've divided the polygon into a bunch of triangles and you know all three edge lengths of each triangle.
So you can just build the triangles and then stick them together, one after the other, until you've built the whole thing. I should say that this is a lot like integrating over the sphere using the coordinates z and theta. If you know this trick: the area form on the sphere in cylindrical coordinates is just dz dtheta. There's no weird sine term or something. And the point is actually that the Duistermaat-Heckman theorem applies to the sphere as well: z is the conserved quantity and rotation in theta is the continuous symmetry. Okay, so now what we're going to do is use this picture to sample equilateral closed polygons quickly and perfectly, with no Markov chain and so on and so forth. But you realize that kind of the hard work is already done, because we've already shown that what you're really sampling is a polytope. All you're sampling is this region in R^(n-3) cut out by linear inequalities that you can write down and know. Okay, so here's what it looks like. That's the picture for a hexagon. There are three diagonals, and it cuts out some region in R3. It looks like this funny pointy box thing. So the triangle inequalities are: the first and last guy are between 0 and 2, and all the middle guys are within one of each other and sum to at least one. That's not such a bad system of inequalities, and in fact the trick is: if I write things in terms of the differences between the consecutive D's, and I call those differences S, for steps, then as long as each step is between minus one and plus one, the condition that D_i minus D_(i+1) is at most one in absolute value is automatically satisfied. So really I just have to pick the right steps. But what's the condition on the steps? The condition is that they start at zero, they vary between plus and minus one, and as I keep stepping I build up diagonals, and the diagonals have to come back to zero for this thing to close.
So it's like you're sampling a Brownian bridge or something. Well, not quite a Brownian bridge; it's really more like an excursion, because the sums have to stay positive as you go. Now it turns out that if you just sample the S_i's from the hypercube, then the rebuilt system of diagonals will work about 1/n^(3/2) of the time. That was a neat estimate that we had to do, and it was way better than I thought it would be. You would think that if you're trying to build points in a polytope by sampling some hypercube that contains the polytope, the polytope is going to occupy exponentially little of the space. So your chances of getting anything usable should be exponentially small, and it should take exponential time to build the samples. But this polytope occupies tons of the hypercube: only polynomially little of it, and not even a bad power. So it's actually really fast to construct samples of diagonals that work. And it's really short, too. Tetsuo said I should convince people that this is not hard to program. So this is the whole thing in three lines. It's three lines of Mathematica because it had to fit on a slide, but it's not many more lines in C. And I should say that I'm not going to read the code to you; I just wanted to point out that I could fit it on a slide. I should say first that this is not the first direct sampling algorithm. Yuanan's group came up with a direct sampling algorithm based on the sinc-integral PDFs directly, and so did Grosberg and others at around the same time. The difference, maybe, is that if you're dealing with the whole PDF, you have to sample from that thing. And that's hard to do, because these sinc-integral things are lengthy sums of alternating binomial coefficients times polynomials, and that's the density. And the point is that there are these huge terms with hundreds of digits that almost but not quite cancel, and what's left over is the PDF you're trying to sample from.
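Here is a sketch of the rejection step just described, in Python rather than the talk's Mathematica, with my own convention that the fan diagonals are padded by the unit edges d_0 = d_(n-2) = 1 at the two ends. Steps are proposed uniformly from the hypercube and a proposal is kept when every consecutive pair of diagonals satisfies the triangle inequalities.

```python
import random

def sample_diagonals(n, rng):
    """Sample the fan diagonals (d_1, ..., d_{n-3}) of a closed equilateral
    n-gon uniformly from the moment polytope by rejection: propose steps
    in [-1, 1], accumulate diagonals starting from the unit edge d_0 = 1,
    and accept when every consecutive pair satisfies
    |d_i - d_{i+1}| <= 1 and d_i + d_{i+1} >= 1."""
    while True:
        d = [1.0]                       # d_0 = 1: the first unit edge
        for _ in range(n - 3):
            d.append(d[-1] + rng.uniform(-1.0, 1.0))
        d.append(1.0)                   # d_{n-2} = 1: the last unit edge
        if all(abs(a - b) <= 1.0 and a + b >= 1.0
               for a, b in zip(d, d[1:])):
            return d[1:-1]
```

By construction the sampled steps already satisfy the difference conditions between consecutive sampled diagonals, so the rejection really tests the sum-at-least-one conditions and the final triangle against the last unit edge. To finish building a polygon you would pair these diagonals with n - 3 independent uniform dihedral angles and assemble the triangles as described above.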
If you do it this way, you kind of automatically generate all that complexity without ever dealing with big numbers. But you can do more, right? Because I'm a differential geometer, I like to really think of this as integration over a space. And if you want to integrate, Monte Carlo is only one way to do it, and in fact for lower-dimensional examples it's not the best way. So it would be nice to have some explicit grid system on polygon space, so that you could look in each box and see what kind of knot is in that box. And it turns out you can do this too. If you look at the moment polytope and you round off every point to the nearest point whose coordinates are all half-integers, then you divide the thing into this attractive system of boxes and triangles. It turns out that you want to subdivide those guys into right triangles, or simplices, depending on the signs of what you rounded off, and also on the absolute value of what you rounded off. And when you do so, you break up the space you're trying to sample into a collection of simplices that are all exactly the same size and exactly the same shape, they're all orthogonal, and you can count them. And then the idea is that by picking one of these simplices at random, you can then subdivide it however you want, barycentrically or whatever. It turns out that the number of these half-integer points that you're condensing to is exactly the Motzkin number. The Motzkin number counts a kind of excursion: a path that takes plus one, minus one, or zero steps, starts at zero, comes back to zero, and stays nonnegative along the way. So in this sense it really is explicitly a standard random bridge. But of course each attracting point doesn't attract the same number of simplices. If you look at the one in the upper right-hand corner, it's got a whole square that goes to it, whereas all the other guys have only a large triangle. Now, there are tons of these simplices.
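The Motzkin counting is easy to sanity-check with a few lines of standard dynamic programming over path heights (my code, not from the talk):

```python
def motzkin(n):
    """Number of paths of n steps from {-1, 0, +1} that start and end
    at height 0 and never dip below 0 (Motzkin paths)."""
    dp = {0: 1}                  # dp[h] = number of partial paths at height h
    for _ in range(n):
        nxt = {}
        for h, count in dp.items():
            for step in (-1, 0, 1):
                if h + step >= 0:
                    nxt[h + step] = nxt.get(h + step, 0) + count
        dp = nxt
    return dp.get(0, 0)
```

These counts grow roughly like 3^n / n^(3/2), which is one way to see why the grid decomposition is only practical up to modest n.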
So this is only going to work up to a point, but it really is going to work until then: 12-gons or something, maybe 15-gons. Okay, so we did some experiments. We could sample 60-gons, for example, and the sampling is perfect. And this is like 10 million samples. And what we did was we sampled them and then took the HOMFLY polynomial of all 10 million things. We saw about 6,000 HOMFLY polynomials, and there were like 30 or 40 samples that were too crumpled for us to resolve. And then we made this plot, which is the log of the probability of each HOMFLY type (they're not really knot types) versus the log of the rank. So the most common thing is rank one, the second most common thing is rank two, and so on and so forth. And if you look at this plot, it's really kind of suggestive, right? Because it's like a line. Now, I should say this was first observed by Enzo's group, and we didn't know that at the time we made this plot; we were really surprised. And Enzo's group observed it in a model of condensed globular polygons on the cubic lattice. So a really completely different model from this one. These, again, are equilateral polygons in space: an off-lattice model with no repulsion, whereas their model was an on-lattice model with self-avoidance. And yet, nonetheless, they both had this linear phenomenon. Now, you'll notice that I said I observed 6,000 HOMFLYs, and if you can read the right-hand edge of that plot, you will see that I'm cheating, right? So where does my plot end? Okay, so what happened to the rest of the data? Well, the point is, I claim those types are undersampled, because when you get out there, what happens is that, sure, you see something one time in 10 million samples, but that doesn't mean its probability is one ten-millionth or anything close to it.
It could mean there were a billion guys, each of which had probability one hundred-millionth or something, and you just picked one thing in that crowd, and you're going to assign the whole cell to that one guy. So if you really do the plot all the way out, you get something that looks like this. And Erica Uehara in Tetsuo's group made some plots like this for us, and was pretty convinced that this log-linear thing was maybe not so good, because of this tail. So we decided to check it in a third model. And the advantage of the third model is that it's a finite model and we can check every state, so we can know everything. (Question from the audience.) Oh, that would be cool. Well, it also gets noisy. Yeah, that's true; our data gets a little noisy at the end too. Okay, so here's our finite model. There are only so many ways to draw the circle on the sphere, up to diffeomorphism, which is to say combinatorially. So if you view such a drawing as a division of the sphere into faces, edges and crossings, there's a finite number of ways to do it, and we listed them, all of them. Here are the first 40 or so on one screen. They start with a diagram that kind of looks like the trefoil, and then a diagram with some curly things in it, and so on and so forth. And then, to each of those pictures, we assigned pluses and minuses to the crossings uniformly at random. And if the diagram had a symmetry, like the trefoil diagram in the upper right-hand corner (my right, I guess, your left), then we said, okay, we're only going to think of sign patterns as different if they're not related by a symmetry of the diagram. So if you have two pluses and a minus on the trefoil, it doesn't matter where they are, because there's a symmetry of the whole diagram. So with that count, we could list all of the 10-crossing diagrams.
There are about 1.6 billion of them, and we classified all the knots in all the diagrams, going all the way from the unknot, which is about 70% of the space (that's in the upper left there), down to the 10-crossing knots, some of which occur exactly once (that's at the lower right-hand edge). There are about 550 knot types that you see. That includes both prime and composite knots, but of course there are no links, because the way we built the model there's only a single curve. And you see that, kind of amazingly, these things look pretty linear, at least to me, and they even look like they have about the same slope. And there's no bad thing at the end, right? It doesn't bend, it doesn't get noisy, and this goes through nine orders of magnitude, from the unknot on the left to the single 10-crossing knots on the right. Eric Rawdon is rechecking this data for us, just to make sure we got all the knot types right, and so far he hasn't found any errors. So I claim that means we're passing Eric's tests so far, and it seems like this is really right. So this is my question for you: should we have expected this? Is this some kind of universal feature of random knotting models? These log-linear laws occur a lot: they occur in the frequency of words in large texts, they occur in social networks, they occur in random graphs and things. So why is a knot like a word? Okay. So let's see, I want to talk a little bit more about the model that Tetsuo discussed earlier today. Here's a way to say it in differential geometry language. The idea is that there's an identification between closed framed curves in R3 and a Grassmannian of 2-planes in a complex vector space. So let me just draw a picture, okay?
So for instance, all I'm saying is that I have C^N, and then I'm just going to take two Hermitian-orthonormal vectors A and B. This is Tetsuo's quaternionic picture, where the entries of A and B combine to make quaternions. Those quaternions are pushed through the Hopf map to become framed edges, and those are the edges of the polygon. But notice that this is an amazing structure, because it's not just a measure of volume; it has a full Riemannian metric. So for instance, if you want to compare one protein shape to another protein shape, you can ask what the geodesic distance is in the Grassmannian from the one 2-plane to the other 2-plane. And it's actually an easy, fast and interesting computation to do. If you want to think about smooth curves instead of polygons, you can do that too. All I need is a finite-dimensional complex vector space, say the one generated by Fourier functions on the circle of degree no more than N or something. That's a nice L2-orthogonal basis for complex-valued periodic functions on the circle, and 2-planes in that space are Fourier curves of bounded degree. Something I find really interesting is that naturally framed space curves are not one Grassmannian but two Grassmannians. In the first Grassmannian, the functions are periodic. In the second Grassmannian, the functions are anti-periodic; that is to say, the function at zero is minus the function at 2 pi. Both of these map to curves downstairs with closed frames, but which curves? The pretty theorem is that one Grassmannian is the curves with even self-linking number, and the other is the Grassmannian of curves with odd self-linking number. Because when you pass a curve through itself and use link equals twist plus writhe, the self-linking number of the frame changes by plus or minus two. So there can't be any way to get from one Grassmannian to the other one.
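Here is a minimal numerical sketch of that Grassmannian picture, with my own code and my own convention for the Hopf map components: orthonormalize two random complex Gaussian vectors in C^N and push each entry pair (a_j, b_j) through the Hopf map to get an edge vector. Orthonormality of A and B is exactly the closure condition, and the total length comes out to |A|^2 + |B|^2 = 2. These edges are generally not equilateral; fixing the edge lengths is an extra condition.

```python
import math
import random

def random_closed_polygon(n, rng):
    """Draw two complex Gaussian vectors in C^n, Gram-Schmidt
    orthonormalize them, and map each entry pair through the Hopf map
    (a, b) -> (|a|^2 - |b|^2, 2 Re(a conj(b)), 2 Im(a conj(b))).
    Returns n edge vectors that sum to zero, with total length 2."""
    a = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    b = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    # Gram-Schmidt in the Hermitian inner product <u, v> = sum u_j conj(v_j)
    na = math.sqrt(sum(abs(x) ** 2 for x in a))
    a = [x / na for x in a]
    overlap = sum(y * x.conjugate() for x, y in zip(a, b))   # <b, a>
    b = [y - overlap * x for x, y in zip(a, b)]
    nb = math.sqrt(sum(abs(y) ** 2 for y in b))
    b = [y / nb for y in b]
    # Hopf map entry by entry: each edge has length |a_j|^2 + |b_j|^2
    return [(abs(x) ** 2 - abs(y) ** 2,
             2 * (x * y.conjugate()).real,
             2 * (x * y.conjugate()).imag) for x, y in zip(a, b)]
```

The geodesic distance mentioned above then comes from the principal angles between the two 2-planes spanned by the (A, B) pairs, which is a small linear algebra computation on top of this.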
A really pretty fact here is an interpretation of the total twist. I told you that every continuous symmetry has a conserved quantity, right? Well, there's a continuous symmetry of curve space where you change the base point: you push the base point around the curve. What's the conserved quantity? Twist. So twist is the Hamiltonian of that symmetry of the space of framed space curves. There's more you can do, but one thing that I'm really looking for co-authors on, because I know how to do the technical part but I don't really know what the interesting experiment is, is that we can generate random framed space curves which are smooth and have frames from beginning to end. There are these cool experiments that I saw once, I think by Steve Levene, where you take curves that get close to each other at the ends, and the alignment of the frames relates to the probability that the polymer will cyclize and form a loop. If the frames are matched up, the ends will bond, but if they're rotated in some weird way, they won't. And this is a really natural way to generate ensembles for that. But of course I don't know any of the constants, so I don't know for which experiment this would be an interesting computation of cyclization probabilities. But maybe one of you does. Okay, so it's late. So I will just say thank you for listening, and these are the references.