the weights, which are essentially ordinary probabilities. So the density operator comes in two forms: a discrete form, with a set of probabilities and a set of states to which the corresponding probabilities are attached, and a continuous form, where the probabilities range over a continuum. In any case, the average value of an observable A across the ensemble is given by the trace of the density operator times the observable. And I want to point out that a pure state is a special case of the density operator, in which there is only a single term in the discrete sum, entering with 100% probability. In that case, rho is the projection operator onto the one-dimensional subspace spanned by the single pure state. If the state is not pure, that is to say, if it can't be written in this form and you have to write it as a sum of terms, then it's said to be a mixed state. So we're conceiving of a pure state as a special case of a density operator; in effect, we take the density operator as primary, and then a pure state is something that's a special case of that, derived from it. This, in fact, is the most satisfactory point of view. However, the logical structure of what I've presented so far is not very satisfactory. It is, in fact, circular, because here I've defined the density operator in terms of pure states, and then I end up defining a pure state in terms of density operators. I'll fix that up in a minute. But in the meantime, I just want to say that this conception of the density operator as a statistical weighting of pure states is nevertheless useful. It's a useful way of thinking about the density operator, as, for example, when you have an experiment with a beam that is polarized in a random direction. So in any case, what I'd like to do is not be too particular about the logical structure, but just take these expressions for the density operator and derive certain properties from them.
So let me move to a fresh part of the board to do that. I'm going to derive now some of the important properties of the density operator; there are three that I want to mention. Call them A, B, and C. The first property of the density operator is that it's Hermitian. Looking at the formula again, it's obvious that it's Hermitian, because the probabilities are real numbers, these outer products are Hermitian operators, and a sum of Hermitian operators is Hermitian. The second property is that the trace of the density operator is equal to 1. This is easy to prove from the formula with the discrete sum. For simplicity, I'll deal with the case in which we've got a discrete set of probabilities and a discrete set of states. If I take the trace of rho, I get the sum over the discrete probabilities times the trace of the outer product, which is the projection operator here. The trace of an outer product is the corresponding inner product; that's how you take the trace of an outer product. So it turns into the inner product of psi i with psi i. But we're assuming the psi i are normalized here, so this just becomes the sum of the probabilities, which is equal to 1, because probabilities sum to 1. So the property of the density operator that its trace equals 1 should be interpreted as a normalization condition on the probabilities. The third property of the density operator is that rho is non-negative definite. For shorthand, one often just writes rho greater than or equal to 0; it means the same thing. I'll remind you of the definition of non-negative definite: an operator is non-negative definite if its expectation value with respect to an arbitrary state is non-negative. So let's take an arbitrary state phi, sandwich rho between phi and its bra, and ask what we get. Again using the discrete sum, this is a sum over i of the probabilities p_i.
Then we have the scalar product of phi with psi i, times the scalar product of psi i with phi. These last two factors are complex conjugates of each other, so together they give the absolute value of the scalar product of phi with psi i, squared. And you can see that in the entire sum the probabilities are non-negative and these squared moduli are non-negative, so the whole thing is non-negative, and we get the conclusion that rho is a non-negative definite operator. This non-negative definite property is also connected with the interpretation of rho as describing the probabilities of the system. The reason is that, as you may be aware, this scalar product squared is interpreted in quantum mechanics as the probability of finding the system in the state phi when you measure, given that it was in the state psi i initially. These sorts of interpretations come out of the postulates of quantum mechanics rather easily. My point is that this quantity has an interpretation as a probability, and the p_i's are the probabilities of the psi i's. So altogether, what we're getting here is something that has to be a probability, and that's why it has to be a non-negative number. In any case, these are the three principal properties of the density operator. Now, to go back to the point I was making earlier, that the density operator is primary and pure states are to be considered a special case of it: an important part of this interpretation is the fact that the density operator is measurable. So in a sense, we don't have to define it; we just have to say how you measure it. The idea is that if you compute the expectation values of enough operators, or enough observables, say, which are given by the trace of rho times A, one can determine rho from the results of measurements.
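The three properties just derived can be checked numerically. Here is a minimal sketch, not from the lecture, that builds a density matrix as a statistical mixture of made-up pure states and verifies properties A, B, and C:

```python
import numpy as np

# Sketch: build rho = sum_i p_i |psi_i><psi_i| from made-up probabilities
# and random states, then verify the three properties A, B, C.

rng = np.random.default_rng(0)

def random_state(dim):
    """Return a normalized random complex state vector."""
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

dim = 3
p = np.array([0.5, 0.3, 0.2])            # probabilities, sum to 1
states = [random_state(dim) for _ in p]

# rho = sum_i p_i |psi_i><psi_i|
rho = sum(pi * np.outer(s, s.conj()) for pi, s in zip(p, states))

# Property A: Hermitian
assert np.allclose(rho, rho.conj().T)
# Property B: trace 1 (normalization of the probabilities)
assert np.isclose(np.trace(rho).real, 1.0)
# Property C: non-negative definite (all eigenvalues >= 0)
assert np.all(np.linalg.eigvalsh(rho) > -1e-12)
```

The eigenvalue check in the last line is exactly the non-negative definiteness condition: the expectation value in any state is non-negative if and only if all eigenvalues are non-negative.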
Now, I won't prove this in the general case, but what I will do is illustrate it for you in the case of the spin one-half system, which is actually an important case in practice. So, going to a new board, the immediate project here is to show how you can measure the density operator in a spin one-half system. In a spin one-half system, the spin operator can be written as h-bar over 2 times sigma. It is, of course, proportional to the magnetic moment operator. And if we were doing the Stern-Gerlach experiment, then from a physical standpoint the Stern-Gerlach experiment really measures magnetic moment. Since they're proportional, I'll just talk about spin. All right, so the idea is the following. If we measure the average value of the spin operator, that means we take our ensemble of systems and measure the average values of Sx, Sy, and Sz, getting three real numbers constituting a real vector. One can measure that. So given this, the problem is then: how do we find the density operator from that? The spin one-half system has a Hilbert space which is two-dimensional, so operators are represented by two-by-two matrices. And so the question is, how do we determine what the density operator, that is, its matrix, is? Well, to answer this question, I need to make a digression into the properties of two-by-two matrices and Pauli matrices. Let M denote an arbitrary two-by-two matrix. Then it's a fact, which you may well know, that M can be represented as a linear combination of the identity matrix and the three Pauli matrices. Let's call the coefficient of the identity matrix a, and let's call the coefficients of the Pauli matrices b, which is a three-vector, since there are three Pauli matrices. Altogether there are four coefficients, which makes sense because a two-by-two matrix has four entries.
In fact, the identity and the three Pauli matrices form a basis of the two-by-two matrices, in terms of which an arbitrary matrix can be expanded. To show that, you just need to show that these four matrices are linearly independent. I won't go through that, but it's straightforward to do. The question is: how do we determine the expansion coefficients a and b? To answer that question, I want to make a little digression into some further properties of the Pauli matrices, which I probably should have put into last week's homework, but didn't. Anyway, here they are: the trace properties of the Pauli matrices. First, let's talk about the trace of the identity. The trace of the identity is equal to two, as is obvious, because the trace is the sum of the diagonal elements, and the identity has ones on the diagonal. The trace of each individual Pauli matrix, all three of them, is equal to zero, as you can see just by looking at them. And the third formula I want to quote is that the trace of a product of two Pauli matrices, sigma i times sigma j, is equal to twice delta ij. This third property actually follows from last week's homework problem, where you showed that sigma i sigma j is equal to delta ij plus i times epsilon ijk sigma k. The notation here is that delta ij is understood to be multiplied by the identity, though I won't write that out explicitly. If we take the trace of both sides, then on the left-hand side we get what we want. On the right-hand side, the trace of the identity is two, so the first term gives twice delta ij. And the trace of sigma k in the second term is zero, by the middle identity here, so that term drops out of the trace. So what you get is the quoted result: the trace of sigma i sigma j is twice delta ij. Now I'll erase the proof and just keep the trace identities for later use.
By using those identities, we can solve for the coefficients a and b. To solve for the coefficient a, first just take the trace of both sides of the equation. The trace of M equals a times the trace of the identity, which is 2a, plus the trace of the second term, which is zero, because b is just a vector of numbers and the trace of the sigmas is zero. So the trace of M is 2a plus zero, and the result is that the coefficient a is one-half the trace of M. Likewise, if I multiply M by one of the Pauli matrices, say sigma i, I get: sigma i times M equals a times sigma i plus sigma i times b dot sigma, which I'll write as b_j sigma_j, using the summation convention on the repeated index j. The b_j are just numbers, so I can pull them out in front; what remains in the second term is the product of matrices, sigma i sigma j. Now take the trace of both sides. The trace of sigma i M gets zero from the first term, but in the second term we get b_j times twice delta ij, using the trace identity up there, which is the same thing as twice b_i, doing a little bit of three-dimensional tensor analysis. And this is equivalent to saying that the vector b is equal to one-half the trace of the vector of Pauli matrices, sigma, multiplied by M. So these two boxes here are the results for the expansion of a two-by-two matrix as a linear combination of the identity and the Pauli matrices; they give you the coefficients. All right, now to return to our problem, which is: given the expectation value of spin, how do we find the density operator? First of all, the density operator in the standard plus-or-minus S_z basis is a two-by-two matrix, so it can be written as a times the identity plus b dot sigma. The question is just: what are the coefficients? To avoid any misinterpretation, I'm now calling the matrix rho instead of M.
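The two boxed formulas can be verified directly. The following sketch, with a made-up test matrix, checks the three trace identities and then reconstructs an arbitrary two-by-two matrix from the coefficients a = (1/2) Tr M and b_i = (1/2) Tr(sigma_i M):

```python
import numpy as np

# Sketch: verify Tr I = 2, Tr sigma_i = 0, Tr sigma_i sigma_j = 2 delta_ij,
# then expand a made-up 2x2 matrix as M = a*I + b . sigma.

I2 = np.eye(2)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),      # sigma_x
         np.array([[0, -1j], [1j, 0]]),                  # sigma_y
         np.array([[1, 0], [0, -1]], dtype=complex)]     # sigma_z

for i in range(3):
    assert np.isclose(np.trace(sigma[i]), 0)             # traceless
    for j in range(3):
        assert np.isclose(np.trace(sigma[i] @ sigma[j]), 2 * (i == j))

# Decompose an arbitrary matrix and reconstruct it from the coefficients
M = np.array([[1 + 2j, 3], [4j, -0.5]])
a = 0.5 * np.trace(M)
b = [0.5 * np.trace(s @ M) for s in sigma]
M_rebuilt = a * I2 + sum(bi * s for bi, s in zip(b, sigma))
assert np.allclose(M, M_rebuilt)
```

Note that for a general complex matrix the coefficients a and b come out complex; they are real precisely when M is Hermitian, which is the case of interest for the density operator.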
So you see right away that a is equal to one-half the trace of rho. But the trace of rho is equal to 1; that's the normalization condition on the probabilities. So a is equal to one-half. As for the vector b, that's equal to one-half the trace of rho times sigma, again just replacing M by rho in the boxed result. Now allow me to multiply and divide this expression by h-bar over 2, because h-bar over 2 times sigma is the spin. If I do this, it becomes one-half times 2 over h-bar times the trace of rho S. But the trace of rho S is the average value of the spin, which is something you can determine experimentally. So this turns into one-half times 2 over h-bar times the expectation value of the spin vector. The result is that if I take these a and b coefficients and plug them in here, what we get is: rho is one-half of the identity plus 2 over h-bar times the expectation value of spin dotted into sigma. And that's the answer to our problem, namely: given the average value of spin, how do you determine the density operator? This is an interesting example of the general principle that the density operator is measurable if you measure a sufficient number of observables. This is a simple case, because there's only a two-dimensional Hilbert space; in higher dimensions, you obviously have to measure a larger number of observables. But it illustrates the general principle. Yes, there's a question: how do I know rho is a two-by-two matrix? Because rho is an operator on the two-dimensional spin Hilbert space, so it's represented by a two-by-two matrix, and sigma is a vector of three two-by-two matrices. So here we have b equal to the trace of sigma times rho, and there we have rho. Oh, I see, you were concerned about the order of the factors. Yes, one of the properties of the trace is that you can cyclically permute the factors and the answer doesn't change. So it doesn't matter; I was probably sloppy about the order I wrote.
But in fact, it doesn't matter. All right, so this is an example of that. Now before leaving this example, let's ask a further question: what are the conditions under which this density operator represents a pure state? Perhaps you'll recall that in the case of the thermal beam, we found that the density operator was just one-half the identity. That was because the expectation value of the spin was zero, remember? So you see, this formally agrees with our earlier result. But let's ask the question: what is the condition for this to represent a pure state? Well, a reminder: a pure state is when the density operator can be written in terms of a single normalized ket, like this. It's the projection operator onto the one-dimensional subspace spanned by the ket in question. And since it's a projection operator, its eigenvalues, the eigenvalues of rho, are either zero or one. Now this is not generally true; it's only true when rho is a projection operator. So for a pure state, the eigenvalues of rho are zero or one. And moreover, the eigenvalue one must be non-degenerate. That is to say, its eigenspace must be one-dimensional, because then psi, the state vector of the system, is the vector that spans that one-dimensional subspace. So this is the criterion for purity of a density operator: it has to have only these two eigenvalues, and the eigenvalue one must be non-degenerate. Well, what about this density operator; under what conditions does it represent a pure state? I'll leave it as an exercise for you that if you take this expression for an arbitrary two-by-two matrix, with a and b real (if they're real, this implies that rho is Hermitian, and rho is Hermitian, so they are real numbers now), then the eigenvalues lambda are equal to a plus or minus the absolute value of the vector b. It's just a straightforward exercise with a two-by-two matrix.
But in our particular example, where rho has this form and the a and b coefficients are given by these expressions here and here, we can plug this in and ask what the conditions are for one of the eigenvalues to be one and the other zero. Well, this is going to require that a plus the absolute value of the vector b be equal to one, and a minus the absolute value of the vector b be equal to zero. These two conditions imply that a is equal to the magnitude of b, which is equal to one-half. We already knew that a was equal to one-half, because that was required by the trace condition on rho. But now we find that in order to get the two eigenvalues to be zero and one, the magnitude of b must also equal one-half. The b vector here is this thing, 2 over h-bar times the expectation value of S. So what we conclude is that rho is a pure state if and only if the expectation value of S is equal to h-bar over two times a unit vector, call it n-hat, some unit vector. In fact, one can show quite generally that the magnitude of the expectation value of S is less than or equal to h-bar over two; this is the maximum value that the magnitude of the expectation value of the spin can take. In that case, we can write rho in this form: one-half times the identity plus the unit vector in question dotted into the Pauli matrices. And for a spin one-half system, this is the general form of the density operator for a pure state. Now I'll leave it as an exercise for you to work out what psi is equal to; it is of course the eigenvector of this matrix with eigenvalue plus one, and so it's straightforward to figure out. But this then gives you the wave function. This is how you get a wave function starting from a density operator. Notice that if you don't have a pure state, then there is no wave function. Now, that's all I'm going to say about the measurement of the density operator.
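The purity criterion is easy to check numerically. The following sketch, using a made-up unit direction n-hat, confirms that rho = a I + b · sigma has eigenvalues a ± |b|, so that |b| = 1/2 gives eigenvalues zero and one (a pure state), while |b| < 1/2 gives a mixed state:

```python
import numpy as np

# Sketch of the purity criterion for a spin one-half density matrix:
# rho = (1/2) I + b . sigma is pure iff |b| = 1/2, i.e. |<S>| = hbar/2.

sigma = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)

def spin_half_rho(b):
    """rho = (1/2) I + b . sigma for a real 3-vector b."""
    return 0.5 * np.eye(2) + np.einsum('i,ijk->jk', np.asarray(b, float), sigma)

# Pure state: |b| = 1/2, with b along a made-up unit direction n-hat
n_hat = np.array([1.0, 2.0, 2.0]) / 3.0                  # unit vector
rho_pure = spin_half_rho(0.5 * n_hat)
ev = np.sort(np.linalg.eigvalsh(rho_pure))
assert np.allclose(ev, [0.0, 1.0])                       # eigenvalues 0 and 1
assert np.isclose(np.trace(rho_pure @ rho_pure).real, 1.0)  # Tr rho^2 = 1

# Mixed state: |b| < 1/2 gives eigenvalues strictly between 0 and 1
rho_mixed = spin_half_rho([0.0, 0.0, 0.2])
assert np.trace(rho_mixed @ rho_mixed).real < 1.0
```

The quantity Tr(rho squared), checked in the code, is another common way to state the same criterion: it equals one exactly for a pure state and is less than one for a mixed state.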
It's important to get some practice with that. Now, what I'd like to turn to is quantum statistical mechanics. Statistical mechanics applies whenever you have incomplete knowledge about the state of a system. In classical mechanics, that means you only know the state of the system in a probabilistic sense, and it is usually described by a probability distribution on classical phase space. In quantum mechanics, the repository of all the probabilistic information you have about the system is exactly the density operator. So the density operator is the central object in quantum statistical mechanics. Now, different density operators are appropriate in different circumstances, depending on what knowledge you happen to have. Just as in classical mechanics there are different ensembles one might consider, so one can think of different density operators corresponding to different ensembles, to different physical situations in which one has more or less knowledge about the system. Now, as you know, one's knowledge about the system is quantified by the entropy. To give you a little history of the entropy: entropy was originally conceived in the 1850s and 1860s by Kelvin and Clausius as a quantity that applied to a system in thermal equilibrium, like a body of water at a given temperature. And that's still one of the ways entropy is used. However, somewhat later, Boltzmann realized that the concept of entropy could be generalized to non-equilibrium situations, and the entropy became, in effect, a functional of the probability distribution of the system. And Boltzmann wrote down his famous formula for what that definition of entropy is.
So to make a long story short, if I take Boltzmann's formula, which was classical, from the 1870s or 1880s (I'm not sure of the exact date), and make the obvious transcription into quantum mechanics, then what you get is an expression for the entropy, which now becomes a functional of the density operator. And this is the formula: the entropy is minus k, which is Boltzmann's constant, times the trace of rho times the logarithm of rho. This can also be written as minus k times the expectation value of the logarithm of rho, because, remember, the trace of rho times an operator is the average value of that operator. So this is the average value of log rho across the ensemble. I'm not going to justify this; that's for another course. But this is the definition of entropy in quantum mechanics. Now, I will remark, however, that what we see here is the logarithm of an operator. Remember, rho is an operator; it's a Hermitian operator, an observable. Back when we were doing notes number one on the mathematics of quantum mechanics, I think I mentioned what it means to talk about a function of an observable. The idea is that you go to the eigenspaces of the given observable; the function of the observable has the same eigenspaces, and therefore the same projectors, but the eigenvalues get replaced by the function of the eigenvalues. That's how you make a function of an operator. If the function is a polynomial, or something you can expand in a Taylor series, you can also define the function of an operator by substituting the operator into the polynomial or the series; in that case, the two definitions are the same. The logarithm doesn't have a Taylor series expansion about zero anyway. But the concept of a function of an operator is defined for all observables, even for rather strange functions.
In this case, it's just the logarithm, which is not the strangest function we'll ever see. All right, so as I say, the entropy is a functional of the density operator; different density operators give different entropies. One remark is that for a pure state the entropy is zero, and for a mixed state the entropy is always positive. So the state of minimum entropy is exactly a pure state. Now, the game we frequently play here is to impose constraints on the system. We say, for example, that we know a quantity, the average energy, which is defined as the average value of the Hamiltonian, or more exactly, the trace of rho times H. Let's suppose this is given; in other words, suppose we have an ensemble for which the average energy is known. There's also, of course, the normalization of rho, the trace being equal to one. That's also given; it's another constraint on rho. And if we maximize the entropy, this is really Boltzmann's game we're playing here, subject to the constraints that the average energy and the normalization are given, then it's possible to determine what rho is. It turns out that rho, in this case, becomes proportional to e to the minus beta H, where H is the Hamiltonian. The beta here enters as a Lagrange multiplier in the constrained maximization process, but it is otherwise 1 over kT, the usual beta parameter of statistical mechanics. This is easy to do, but I'm not going to do it, because it belongs to another course. But I do want to point out that you get this density operator under these conditions; it describes the canonical ensemble. That is to say, it physically describes a system in contact with a heat bath. You create an ensemble of systems by putting them in contact with a heat bath, then removing them and doing an experiment, such as measuring the energy or some other observable. All right.
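Both remarks, zero entropy for a pure state, positive entropy for the canonical state, can be illustrated in a few lines. This is a sketch with a made-up three-level Hamiltonian, computing the von Neumann entropy through the eigenvalues of rho (which is how the operator logarithm is defined):

```python
import numpy as np

kB = 1.0  # Boltzmann constant in natural units

def von_neumann_entropy(rho):
    """S = -k Tr(rho ln rho), computed via the eigenvalues of rho."""
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-12]           # 0 ln 0 -> 0 by convention
    return -kB * np.sum(p * np.log(p))

# A pure state has zero entropy
psi = np.array([1.0, 0.0])
rho_pure = np.outer(psi, psi.conj())
assert np.isclose(von_neumann_entropy(rho_pure), 0.0)

# The canonical state rho = e^{-beta H} / Z for a made-up Hamiltonian
H = np.diag([0.0, 1.0, 1.0])   # energies E_0 = 0, E_1 = 1 (doubly degenerate)
beta = 2.0
w = np.exp(-beta * np.diag(H))
rho_canonical = np.diag(w / w.sum())
assert np.isclose(np.trace(rho_canonical), 1.0)
assert von_neumann_entropy(rho_canonical) > 0.0   # mixed, positive entropy
```

Computing the entropy through the eigenvalues is exactly the prescription from the lecture: the projectors stay the same and the eigenvalues get replaced by p ln p.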
The normalization constant that appears here, this is just a convention, is traditionally called one over capital Z, so we write rho as equal to that exponential with one over capital Z in front. The normalization is determined by demanding that the trace of rho equal one. And if you do this, it implies that Z, which is a function of the thermodynamic parameter beta, is the trace of e to the minus beta H, where H is the Hamiltonian. We put a box around that, because it's a basic result in quantum statistical mechanics. Z, of course, is the partition function; the letter stands for the German Zustandssumme, "sum over states". The trace is, of course, the sum of the diagonal elements of an operator; here, we've got the exponential of the Hamiltonian. A convenient basis in which to do the sum of diagonal elements is the energy eigenbasis. So let's introduce an energy eigenbasis; let's write the basis kets as ket N, r, and say the energy eigenvalues are E_N. Let's say these kets are normalized. So N labels the energy eigenvalue, and r is an extra index used to resolve the degeneracies, if there are some. Let's say r goes from one up to g sub N, the number of degenerate energy eigenstates at level N. Then this trace, the one in the box, can be written this way: it's a sum over N and r of diagonal elements, bra N, r sandwiched around e to the minus beta H, ket N, r, like this. However, since these are energy eigenstates and we're acting on them with a function of the Hamiltonian, the function just brings out the corresponding function of the eigenvalue. So this gets replaced by e to the minus beta E sub N, the energy eigenvalue of the state. And the result is that this turns into a sum over N and r of e to the minus beta E_N times the scalar product of N, r with itself, which is one, since the states are normalized. So we can write this as a sum over N and r of e to the minus beta E_N.
Or, since the summand does not depend on the degeneracy index r, we can write this as just a sum over N of the order of the degeneracy, g_N, times e to the minus beta E_N. So we end up with the formula you probably learned in stat mech courses for the partition function: it's given by this, in terms of the degeneracies and the energy eigenvalues. Now, there's a little secret that I never tell you when I teach stat mech courses. You spend a lot of time calculating the partition function, because it's useful for finding things like the equation of state, and a lot of important things can be derived from it. But it actually does not give you complete information about the statistics of the system. For that, you really need the density operator itself, and not just its trace. So for example, if you're talking about a gas, maybe not an ideal gas, and you want to know correlations among the particles, that's something you can't get from the partition function; you need the full density operator. All right. Now, I think that was the main thing I wanted to say about that, except that I'd like to give you an example of the density operator, to see how it works in practice. Roughly speaking, the basic idea is that the different energy eigenstates in thermal equilibrium have probabilities that are proportional to the Boltzmann factor, e to the minus beta times the energy. It's an easy rule. These are not normalized probabilities, but if all you want is relative probabilities, it's easy to apply. Let me give you a physical example of that. I want to talk about hydrogen gas, but I mean atomic hydrogen, not hydrogen molecules; ordinary hydrogen gas in the laboratory is H2, and I don't want to talk about that.
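The two forms of the boxed result can be compared directly: the sum over levels weighted by degeneracy, and the trace of e to the minus beta H over the full eigenbasis. Here is a sketch with a made-up table of energy levels:

```python
import numpy as np

# Sketch: Z(beta) = sum_N g_N e^{-beta E_N}, computed two ways.
# The energy levels and degeneracies below are made-up numbers.

levels = [(0.0, 1), (1.0, 3), (2.5, 2)]   # (E_N, degeneracy g_N)
beta = 0.7

# Sum over levels weighted by degeneracy
Z_levels = sum(g * np.exp(-beta * E) for E, g in levels)

# Same thing as a trace over the full eigenbasis |N, r>:
# each level E_N contributes g_N identical diagonal entries e^{-beta E_N}
energies = np.concatenate([[E] * g for E, g in levels])
H = np.diag(energies)
Z_trace = np.trace(np.diag(np.exp(-beta * np.diag(H))))

assert np.isclose(Z_levels, Z_trace)
```

The second computation is literally the derivation above: acting with a function of H on its own eigenbasis just replaces each eigenvalue by the function of that eigenvalue, and the trace collects one term per basis state.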
Gaseous atomic hydrogen occurs in astrophysics: galaxies are full of large clouds, and many of these clouds contain atomic hydrogen, not molecular hydrogen. Atomic hydrogen, of course, has an electron and a proton, and they both have spin; both are spin one-half particles, so either spin can be up or down, for a total of four spin states. Because of magnetic interactions between the electron and the proton, these spin states are actually split into two energy levels, called the singlet and the triplet. So let me indicate how they sit. Schematically, in the singlet the spins are antiparallel, while in the triplet states they are parallel. If I write these states as ket S, M, where S is the total spin and M is the total magnetic quantum number, then the singlet state is ket 0, 0, and the triplet states have S equal to one, with M equal to zero or plus or minus one; that's why it's called a triplet, because it's three states. Call the energy of the singlet E_0 and that of the triplet E_1; they actually have different energies. In fact, E_0 is the true ground state of the hydrogen atom, and E_1 is slightly above it. They're split by the hyperfine interaction, which is magnetic in origin, as I mentioned; the energy difference between these two levels is called the hyperfine splitting. The energy difference between these two states, translated into a wavelength for the photon emitted in the transition, is 21 centimeters; this is the famous 21-centimeter line, important in astrophysics. Translated into temperature units, it's about 0.07 kelvin.
The actual temperature of these gas clouds out there is comparable to this splitting or, in fact, larger, but in any case, the point is that the temperatures are of such a magnitude that the Boltzmann factor does matter in determining the relative populations of these two energy levels in clouds of atomic hydrogen. In particular, the relative probabilities of the two levels are e to the minus beta E_0 and e to the minus beta E_1, except that it's a little bit tricky, because E_1 has a threefold degeneracy, so I need to multiply it by three. Those are the relative probabilities. To put it another way, the partition function Z of beta is just equal to the sum of these two terms, like this: the Boltzmann factors times the degeneracies. What is the density operator? Well, it's one over Z, that's the normalization, times: for the singlet state, e to the minus beta E_0 times the projection operator onto the singlet state, like this; and then, for the triplet state, e to the minus beta E_1 times a sum over the magnetic quantum numbers M of the outer products of ket 1, M with bra 1, M. This sum is the projection operator onto the degenerate triplet level, and this other outer product here is the projection operator onto the non-degenerate ground state. And here is the density operator describing those atoms out in the galaxy. Now, the probabilities can just be read off as the coefficients of the various terms here. You see that the three excited states all have the same probability, because their energy is the same; they're degenerate. Now, what we've got here are states with certain probabilities. If I took a linear combination of the ground state and one of these first excited states, then from the coefficients of the linear combination I'd also get certain probabilities.
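This two-level example with degeneracy is small enough to write out completely. The following sketch uses illustrative numbers for the splitting and temperature (not the real hydrogen values) and builds the diagonal density operator in the basis of the singlet plus the three triplet states:

```python
import numpy as np

# Sketch of the hyperfine example: a singlet ground state at E_0 and a
# threefold-degenerate triplet at E_1. Energies and beta are illustrative.

E0, E1 = 0.0, 1.0     # level energies, made-up units
beta = 1.0            # 1/(k T), illustrative

# Partition function: Z = e^{-beta E0} + 3 e^{-beta E1}
Z = np.exp(-beta * E0) + 3 * np.exp(-beta * E1)

# Density operator in the basis {|0,0>, |1,1>, |1,0>, |1,-1>}
weights = np.array([np.exp(-beta * E0)] + [np.exp(-beta * E1)] * 3)
rho = np.diag(weights / Z)

assert np.isclose(np.trace(rho), 1.0)
# The three triplet states share one probability; the singlet/triplet
# population ratio per state is the ratio of Boltzmann factors
p_singlet, p_triplet_each = rho[0, 0], rho[1, 1]
assert np.isclose(p_singlet / p_triplet_each, np.exp(beta * (E1 - E0)))
```

Reading off the diagonal entries of rho is exactly the "read off the coefficients" step in the lecture: each projector in the expansion contributes its Boltzmann weight over Z.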
The question arises: what's the difference between this density operator and some particular linear combination of the ground state and the excited states? The answer is that there is a difference. This is what you call an incoherent mixture; it's a mixed state. Whereas a particular linear combination of the ground state and excited states is what you'd call a coherent superposition, which would be a pure state. Let me try to describe this in more general language, without reference to the hyperfine transitions in hydrogen. Suppose we have a density operator for some generic system, and for simplicity, suppose the energy eigenstates are non-degenerate, so we can write rho like this, as a sum over n of p_n times the projector onto the n-th energy eigenstate; this is really the same notation I was using earlier for a general density operator. The p_n are probabilities, so they sum to one. That's the density operator. Let's also consider a pure state psi, which is given by its expansion in terms of energy eigenstates with some complex coefficients c_n, like this. Now if you have this linear combination of states, then you know the probability of finding the system in the n-th energy eigenstate is the square of the coefficient, the absolute value of c_n squared. These are non-negative numbers that add up to one, so they're just like the probabilities p_n here. The question is: what's the difference? Is there some physical difference between these two? Supposing the absolute value of c_n squared is equal to p_n, is there any difference between the two? The answer is that if you were only going to measure energy, you couldn't tell the difference, because you'd get the same probabilities for the energies. But if you want to measure other observables, then there is a difference, a physical difference between them.
This is a mixed state; this is a pure state. To understand this, allow me to do the following. First of all, let me take this complex number c_n, and let's call its absolute value a_n, like this. And then let's write c_n as a_n, which is its magnitude, times a phase factor e^{iφ_n}. So this is c_n in terms of an amplitude and a phase. Then let's take this pure state ψ, and let's compute the density operator that corresponds to it. That's of course just |ψ⟩⟨ψ|: taking the expansion, forming its complex conjugate, forming the bra, and multiplying them together, you get a double sum that looks like this, a sum over n and m of a_n times a_m times e^{i(φ_n − φ_m)} times the outer product |n⟩⟨m|. It's represented as a matrix in the energy representation. But as you see, it's not diagonal. This matrix over here is diagonal; this one has off-diagonal elements. And so that's one of the differences between this density operator and this one, which is the pure state. Now in many circumstances, the phases which appear here are not known as well as the amplitudes. The amplitudes are related to the probabilities; the phases in many circumstances are not as well known. One of the reasons is that the phases are time dependent. In fact, by the Schrödinger equation, φ_n = −E_n t/ħ, where E_n is the energy eigenvalue. So these things evolve in time, and the phases keep growing in time. If you're looking at an off-diagonal term where n is not equal to m, then the energies are not equal, and so the difference between these phases grows in time, and ultimately it'll get to be an arbitrarily large multiple of two pi. If there were some errors in the energy eigenvalues, where you didn't know them precisely, then those errors would ultimately result in phase differences that were completely unknown to within a multiple of two pi.
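The contrast between the two density matrices can be made concrete. Below, the amplitudes and phases are made-up values for illustration: the pure-state density matrix ρ_nm = a_n a_m e^{i(φ_n − φ_m)} has the same diagonal as the incoherent mixture with p_n = a_n², but it also carries off-diagonal coherence terms.

```python
import numpy as np

# Assumed amplitudes a_n (with sum of a_n^2 equal to 1) and phases phi_n.
a = np.sqrt(np.array([0.5, 0.3, 0.2]))
phi = np.array([0.0, 1.3, 2.7])
c = a * np.exp(1j * phi)              # c_n = a_n e^{i phi_n}

# Pure state: rho_nm = c_n c_m* = a_n a_m e^{i(phi_n - phi_m)}
rho_pure = np.outer(c, c.conj())
# Incoherent mixture with the same energy probabilities p_n = a_n^2
rho_mixed = np.diag(a**2)

# Same diagonal, i.e. same energy-measurement probabilities...
print(np.allclose(np.diag(rho_pure).real, np.diag(rho_mixed)))
# ...but the pure state has nonzero off-diagonal elements:
print(np.abs(rho_pure[0, 1]))         # magnitude a_0 * a_1, not zero
```

Measuring energy alone probes only the diagonal, so it cannot distinguish the two; an observable that is not diagonal in the energy basis is sensitive to the off-diagonal terms.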
In other words, they become effectively random phases after a sufficient amount of time. So what this suggests is that we go from this pure state to a statistical mixture in which the statistics are given by a random phase ensemble: the phases are all statistically independent of one another, and uniformly distributed between zero and two pi. If we do that, then we replace this pure-state density operator by its statistical average over the phases. Well, the average over phases only affects this factor, because nothing else depends on the phases. So let's take e^{i(φ_n − φ_m)} and average it over the phases. What do we get? If n is equal to m, the two phases are equal and cancel out, so we're averaging the value one, and the answer is one. But if n is not equal to m, then you've got independent random variables that are uniformly distributed around a circle, and the average is zero. So this average turns into δ_nm. And so if we take the statistical average of this pure-state density operator, what do we get on the right-hand side? The δ_nm collapses the double sum into a single sum, which becomes the sum over n of a_n² times the outer product |n⟩⟨n|. In other words, it looks just like the Boltzmann ensemble, if you take the a_n² to be the Boltzmann factors. More generally, this is an ensemble that's diagonal in the energy representation, and that's what we get from this random phase ensemble. So in particular, the canonical ensemble, a system in thermal equilibrium, can be thought of as an ensemble in which the probabilities of being in the different energy eigenstates are known, but the phases are completely unknown. So it's equivalent to this viewpoint here. Now, so that's all about the canonical ensemble.
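The phase average can be checked numerically by Monte Carlo. This is a sketch with assumed amplitudes: averaging the pure-state density matrix over independent uniform phases makes the off-diagonal elements average to zero, leaving the diagonal mixture Σ_n a_n² |n⟩⟨n|.

```python
import numpy as np

rng = np.random.default_rng(0)
a = np.sqrt(np.array([0.5, 0.3, 0.2]))   # assumed amplitudes a_n

# Average rho = |psi><psi| over many independent draws of phases
# uniformly distributed on [0, 2*pi).
n_samples = 50_000
rho_avg = np.zeros((3, 3), dtype=complex)
for _ in range(n_samples):
    phi = rng.uniform(0.0, 2.0 * np.pi, size=3)
    c = a * np.exp(1j * phi)
    rho_avg += np.outer(c, c.conj())
rho_avg /= n_samples

# <e^{i(phi_n - phi_m)}> = delta_nm, so the average is diagonal with
# entries a_n^2 (off-diagonals only vanish up to Monte Carlo noise).
print(np.round(np.abs(rho_avg), 3))
```

The diagonal entries come out exactly a_n² for every sample; only the off-diagonal elements fluctuate, shrinking like one over the square root of the number of samples.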
In fact, that's all I wanted to say about the density operator. What I'd like to do now is to go back to the postulates of quantum mechanics. This you all know, so you don't have to copy it down. Go back to the postulates of quantum mechanics which I presented earlier; they were incomplete because we were using pure states, which at that point were undefined, so let me revise those postulates now in terms of density operators. Some of the postulates are the same as before. One is still the same: the system corresponds to a ket space. Two is different: the state of the system now corresponds to a density operator, which is a Hermitian operator with unit trace and which is nonnegative definite; that's what that expression means. Three is still the same as before, as we mentioned: an observable corresponds to a complete Hermitian operator acting on the ket space. Item number four is the same also: the results of a measurement are the eigenvalues of the observable, either discrete or continuous, depending on what you have. Five is the probability postulate; it now becomes modified. The probability of finding the result A = a_n in the discrete case is the trace of ρ times P_n, where P_n is the projector onto the eigenspace of the operator A corresponding to the eigenvalue a_n. In the continuous case, the probability of finding A in some interval, let's say from a_0 to a_1, is the trace of ρ times the projection operator that corresponds to that interval, which is the integral of the outer products over the interval of the continuous variable a; here I also put in a sum over the degeneracy index, in case there's degeneracy within the interval. So this is the revision of the first five postulates.
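The revised discrete probability postulate, Prob(A = a_n) = Tr(ρ P_n), can be illustrated with a small toy example. The observable and density matrix below are assumed for illustration; the observable is taken diagonal with a two-fold degenerate eigenvalue so the projector spans a two-dimensional eigenspace.

```python
import numpy as np

# Toy observable A = diag(1, 1, 2): eigenvalue a = 1 is two-fold degenerate.
P1 = np.diag([1.0, 1.0, 0.0])   # projector onto the a = 1 eigenspace
P2 = np.diag([0.0, 0.0, 1.0])   # projector onto the a = 2 eigenspace

# Assumed density matrix: a diagonal mixture with probabilities 0.5, 0.3, 0.2.
rho = np.diag([0.5, 0.3, 0.2])

p1 = np.trace(rho @ P1).real    # Prob(A = 1) = 0.5 + 0.3
p2 = np.trace(rho @ P2).real    # Prob(A = 2) = 0.2
print(p1, p2, p1 + p2)          # the probabilities sum to one
```

For a pure state ρ = |ψ⟩⟨ψ|, Tr(ρ P_n) reduces to ⟨ψ|P_n|ψ⟩, recovering the earlier pure-state form of the postulate.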
And as far as the sixth postulate goes, I'll remind you that in the case of pure states, which we talked about a couple of lectures ago, the last postulate said that after the measurement, the system has been projected onto the eigenspace of the observable whose eigenvalue you got in the measurement. That so-called last postulate is replaced by something else when you have density operators, and I'm leaving it as an exercise for you to fill in item number six; it's a good exercise for people learning density operators. All right, so that's all for density operators. Are there any questions about density operators? Yes. [Student asks whether there is a more general criterion for recognizing a pure state.] Yeah, there are several criteria, and I've mentioned some of them in the notes. Here in lecture I only mentioned the most obvious one, where you diagonalize and look at the eigenvalues. That's also the most work. There are easier ones, though. It turns out that ρ² = ρ if and only if ρ is pure. You can see that: if ρ is |ψ⟩⟨ψ|, you can square it, and you get ⟨ψ|ψ⟩ in the middle, which is just one, so ρ² = ρ for a pure state. And the converse works too: it's pure if and only if that condition holds. Another criterion involves the trace of ρ². Actually, the trace of ρ² is always less than or equal to one, and the trace of ρ² is equal to one if and only if ρ is pure. And yet another one is the entropy: the entropy is equal to zero if and only if the state is pure. This one is maybe not so easy to use because, as we've mentioned, to evaluate the log of ρ you need the eigenvalues and eigenvectors, and if you're going to do that, you might as well use the simpler criterion.
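The purity criteria just listed can be sketched directly. This is an illustrative check, using a pure qubit state and the maximally mixed qubit as assumed test cases: Tr(ρ²) = 1 exactly when ρ is pure, and the von Neumann entropy S = −Tr(ρ log ρ) vanishes exactly when ρ is pure.

```python
import numpy as np

def is_pure(rho, tol=1e-10):
    """Purity test: rho is pure iff Tr(rho^2) = 1 (equivalently rho^2 = rho)."""
    return abs(np.trace(rho @ rho).real - 1.0) < tol

def von_neumann_entropy(rho):
    """S = -Tr(rho log rho), computed from the eigenvalues of rho."""
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-12]                      # drop zeros: 0 * log 0 = 0
    return float(-np.sum(p * np.log(p)))

psi = np.array([1.0, 1.0]) / np.sqrt(2.0)
rho_pure = np.outer(psi, psi)             # projector onto a single state
rho_mixed = np.diag([0.5, 0.5])           # maximally mixed qubit

print(is_pure(rho_pure), is_pure(rho_mixed))
print(von_neumann_entropy(rho_pure))      # zero for the pure state
print(von_neumann_entropy(rho_mixed))     # log 2 for the maximally mixed state
```

Note that the entropy test requires diagonalizing ρ anyway, which is the lecturer's point: if you have the eigenvalues in hand, the trace criterion is no extra work.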
Still, it's interesting, because zero entropy means that you have the maximum information you can have about the system; that's when it's a pure state. That doesn't mean that you've eliminated statistical uncertainties in the measurement process, because quantum mechanics is in principle statistical, so there will still be statistical distributions in the answers for different measurements, but this is the maximum information you can get. Are there any other questions? So what I'd like to do now is to turn to the question of the spatial degrees of freedom; later we'll also talk about spin. So let's talk about spatial degrees of freedom now. To make this simple, let's first of all take one dimension, so that we're thinking of a particle moving on a line, the x-axis, and let's also suppose that there's no spin or any other variables around except the position variable. That's not quite what I mean. There are other variables around, like momentum and so on, but what I do mean is that a complete set of commuting observables, which is what you need to measure in order to pin things down to a one-dimensional subspace of the ket space, consists of just the x operator itself. That's all there is. I'm going to use the notation of putting a hat on the x to indicate an operator. I'm sorry about that, because sometimes I use the hat to indicate a unit vector, and it's hard to get a good notation for this. So in the present case, the hat means operator, and this is to be contrasted with x without the hat, in the sense of a number. There's something I should have told you a couple of lectures ago, so I'll tell you now; it's a little bit of a digression. There's an old terminology here, due to Dirac, who talks about what he calls q-numbers and c-numbers. People still use this terminology, so I'll tell you what it means. A q-number basically means an operator, and a c-number means an ordinary number.
We usually use this terminology when we want to draw a distinction between, let's say, an operator and its value. So in the present case, x̂ is a q-number, and x, the variable for the coordinate, is a c-number. Now, we'll just take it as given that x̂ by itself is, in this one-dimensional problem, a complete set of commuting observables. When we measure x, measure position, of course we get a continuous set of values. So if I set up the eigenvalue problem for the operator x̂, it looks like this: I label the eigenket by its eigenvalue, and operating with x̂ brings out the eigenvalue. This must be a member of the continuous spectrum, because we know that measuring position gives continuous values, at least until we get down to the Planck scale, where nobody knows what happens. And as a result, the normalization of these position eigenkets must be understood in a delta-function sense, like this. Moreover, the resolution of the identity is an integral from minus infinity to plus infinity of the outer product |x⟩⟨x|. This all follows from the basic formulas of the formalism. Now, let's inquire as to the probability of measuring x to lie in some interval. Let's say this is with respect to a pure state. Having done the density operators now, you're going to find that most of the rest of the course talks only about pure states. As I've said, I think, in lecture, it's a bias of quantum courses that they dwell on pure states and wave functions, whereas in real experiments you're very often dealing with density operators. Anyway, let's say we have a pure state ψ, and we're interested in the probability of measuring x to lie in a certain interval. By the postulates of quantum mechanics, this is ψ sandwiched around the projection operator that corresponds to that interval.
That projection operator P_I, where I is an interval here, from x_0 to x_1, is the integral from x_0 to x_1 of dx of the outer product |x⟩⟨x|. It looks just like the resolution of the identity, except the integral doesn't run over the whole real line, just over the interval. In any case, this means that the probability is equal to the integral dx from x_0 to x_1 of ⟨ψ|x⟩ times ⟨x|ψ⟩. Now at this point we'll make a definition. We define the quantity ⟨x|ψ⟩ to be the wave function ψ(x). This is how you derive wave functions from the ket formalism, much as we found earlier. With that definition of the wave function, you can see that in this integral, the first factor is the complex conjugate of the wave function and the second factor is the wave function. So this is the integral from x_0 to x_1 of dx of the absolute value of the wave function squared. The result is that the probability of lying in an interval is the integral over the interval of |ψ(x)|². Therefore |ψ(x)|² is the probability density, which of course we knew; that's how the probability comes out. Now, this definition here, let's see. Yes, this definition here can be thought of in another manner as well. It is, in effect, part of a translation table that takes you from ket language to wave-function language. If you're given the ket |ψ⟩ and you want to find the wave function ψ(x), you just take the inner product with ⟨x| from the left, and there's your answer. How do you go the other way? Suppose I'm given a wave function and I want to find the ket |ψ⟩. The answer is that I take the resolution of the identity here and multiply both sides by the ket |ψ⟩. Then I get the ket |ψ⟩ on the left-hand side. On the right-hand side, I get the scalar product of ⟨x| with |ψ⟩, but by our definition, that's the wave function.
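The interval probability can be illustrated numerically on a grid. The wave function below is an assumed example, the normalized Gaussian ψ(x) = π^{-1/4} e^{-x²/2}; the integral of |ψ(x)|² over the interval is approximated by a simple Riemann sum.

```python
import numpy as np

# Assumed example wave function: psi(x) = pi^(-1/4) exp(-x^2 / 2), normalized.
x = np.linspace(-10.0, 10.0, 20001)
dx = x[1] - x[0]
psi = np.pi ** -0.25 * np.exp(-x**2 / 2.0)

density = np.abs(psi) ** 2              # probability density |psi(x)|^2
print(np.sum(density) * dx)             # normalization check: close to 1

# Prob(x0 <= x <= x1) = integral of |psi(x)|^2 over [x0, x1]
x0, x1 = 0.0, 1.0
mask = (x >= x0) & (x <= x1)
prob = np.sum(density[mask]) * dx
print(prob)                             # about 0.42 for this Gaussian
```

Summing |ψ(x)|² over the whole grid recovers one, the resolution of the identity in discretized form, and restricting the sum to the interval gives Tr-style probabilities exactly as the projection-operator formula says.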
And so the result is that we get this formula: the ket |ψ⟩ is equal to the integral dx of the ket |x⟩ times ψ(x). This goes the other way: if you're given ψ(x), you can now find the ket. It does something else too: it shows you that the wave function is nothing but the set of expansion coefficients of the quantum state, the vector in Hilbert space, with respect to the position basis. This is one of the reasons why we should think of ψ(x) as, in some sense, not having a terribly privileged role, because you can take the ket |ψ⟩ and expand it in lots of different bases; the position basis is just one out of a great many choices of bases. All right. Now, there's another point to be made here, which is that the wave function ψ(x) is single valued, precisely because it's nothing but the expansion coefficient of the state vector |ψ⟩ with respect to the position eigenbasis. So for a given value of x, it has only one value. This is an issue which arises sometimes in solving the Schrödinger equation, familiar from introductory courses: when we separate variables and so on, at some point we have to say, well, we require the wave function to be single valued, and what's the physical justification for that? Well, this is really it; this is the reason the wave function is single valued. I've done all this in one dimension, but it's easy to generalize to three dimensions. If we do it in three dimensions, then there are naturally three position coordinates, x, y, and z, and each one of them, of course, corresponds to an operator, as a measurement of one of the three coordinates, like this. And now the complete set of commuting observables, still ignoring spin, assuming there's no spin, is the three operators x̂, ŷ, and ẑ.
I'll also write these as x̂_i, where i runs from one to three. Now, one thing to say right away about these three observables is that they commute with one another: the commutator [x̂_i, x̂_j] is equal to zero. How do I know that? I'll remind you that the commutativity of operators is something that can be subjected to an experimental test. It has to do with the fact that if the operators commute, then you can measure them in either order and get the same statistics on the doubly-filtered system. So in this case, if we measure x, for example, by putting in a slit oriented along the y-direction, which selects a narrow range of x, you have a measurement of x; if you want to measure y, you put in a slit along the x-direction, and what the two slits together leave behind is a little square. And one can say that it can be determined experimentally that the statistics of the ensemble you get don't depend on whether you put the x-slit first or the y-slit first; it's the same answer. This is equivalent to commutativity of these operators. By the way, they have to commute if they're going to form a complete set of commuting observables, since that's what the C stands for: commuting. All right. Well, in this case, we have simultaneous eigenkets of the three observables, which we can call |x⟩, with x a vector, like this. These are simultaneous eigenkets of the three operators: x̂_i applied to |x⟩ is equal to x_i, the eigenvalue, times |x⟩. This is actually three simultaneous eigenvalue equations. Most of the formalism that I just described in the one-dimensional case goes over, with only obvious modifications, to the three-dimensional case, and so I won't bother to write it all out. For example, the resolution of the identity now becomes a three-fold integral over all space, instead of an integral over an interval.
One talks about the probability of finding the particle in a region of space; the absolute value of ψ squared is the probability density in three-dimensional space, and so on. It's a straightforward generalization to three dimensions. OK, so I'll stop now. I'll just tell you about next time, which is coming up tomorrow: I'll be telling you about translation operators.