This lecture is part of Berkeley Math 115, an introductory undergraduate course on number theory, and it will be about primes. So I'll just recall the definition of a prime. First of all, we define it for positive integers: a prime p is an integer greater than 1 that is not divisible by any positive integer except 1 and p. So of course we know the first few primes are 2, 3, 5, 7 and so on. The letter p is often used for prime numbers; p obviously stands for prime. Notice, by the way, that we have this condition that p has to be greater than 1. 1 is not prime. And the reason it's not prime is that I've defined it not to be prime. And the reason I've defined it not to be prime is that it turns out to be very convenient to have 1 not count as a prime, for reasons we will see fairly soon. Not all mathematicians agree with this. There was a mathematician called Lehmer who used to compute primes a lot, and he sort of annoyed all other mathematicians by counting 1 as a prime. So he would produce counts of the number of primes less than a million or a billion or whatever, and his counts were always off by 1 from everybody else's because he thought 1 should be a prime.
So that's a prime if you're talking about positive integers. There's an alternative definition where we allow negative integers. So here we want -2, -3, -5 and so on to count as primes. And here we would say a prime is an integer p, not equal to 0 and not a unit, such that if p = ab then a or b is a unit. A unit is something that has an inverse, and the only units in the integers are 1 and -1. For more general rings we will sometimes allow other things to be units; in fact we will see a few examples of this a bit later.
Let's start by recalling simple ways to test for primes. So suppose we want to test whether some number, say 101, is prime. We just need to check that none of the numbers 2, 3, up to 100 divides 101. Well, that's kind of stupid, because we can simplify this a bit.
First of all, we only need to check primes, because if it's divisible by, say, 6, then it would have to be divisible by one of the prime factors of 6. Another way to speed it up is that we only need to check numbers less than or equal to the square root of 101. That's because if 101 = ab and a and b are both greater than the square root of 101, then a times b would be greater than 101, which is a contradiction. So we only need to check numbers up to the square root. So we just need to check 2, 3, 5 and 7, because these are the primes less than the square root of 101, and we can easily check it's not divisible by any of these. So we can easily see that 101 is prime.
You can see the number of steps for this is going to be roughly bounded by about the square root of whatever number we're checking. So if we're checking whether n is prime, then we're going to need to check about the square root of n numbers. Well, a bit less than that because we only need to check primes, but whatever, this gives a crude bound. If n is up to about a thousand, or up to about a billion if you've got a computer, this is quite reasonable. It is too slow if n is large: if n is a really large number, say it has a hundred digits or something, you wouldn't use this test. Well, actually you would use this test part of the way. We have some fast ways to check whether numbers are prime, but they take a long time to warm up, and before trying one of these heavyweight tests what you should do is try the easy tests. You check n to see if it's divisible by 2 or 3 or 5, because these are really fast tests to do, and if you've got a computer you might, for example, check that n isn't divisible by any prime up to a million or so before starting one of these more complicated tests. So checking a number for divisibility by a few small primes is nearly always a good thing to do when you first come across a number.
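The test described above can be sketched in a few lines of Python. This is only an illustrative sketch, not an efficient implementation: the function name is_prime_trial is made up here, and for simplicity it tries 2 and then every odd number up to the square root, rather than only primes.

```python
from math import isqrt

def is_prime_trial(n: int) -> bool:
    """Trial division: look for a divisor d with 2 <= d <= sqrt(n)."""
    if n < 2:
        return False          # 1 is not prime, by definition
    if n % 2 == 0:
        return n == 2         # 2 is the only even prime
    # If n = a*b with a <= b, then a*a <= n, so a <= isqrt(n).
    for d in range(3, isqrt(n) + 1, 2):
        if n % d == 0:
            return False      # found a proper divisor
    return True

# For 101 the loop only tries 3, 5, 7 and 9 after the check for 2.
print(is_prime_trial(101))
```

The loop runs about half the square root of n times, which matches the crude bound above; for a hundred-digit number that is hopeless, which is why this is only used as a quick first filter.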
Now we have the fundamental theorem of arithmetic, which says that any integer n greater than or equal to 1 is a product of a finite set of primes in a unique way. Well, it's not quite unique because 2 times 3 is equal to 3 times 2, so we say in a unique way up to order. And you may think you've thought of an obvious counterexample, because you might say that n = 1 is not a product of primes. But note that 1 is the product of the empty set of primes, because by convention the product of an empty set of numbers is taken to be 1, in much the same way that the sum of an empty set of numbers is taken to be 0. It's actually very convenient to have this convention about the product of an empty set of primes being 1; it makes several theorems rather easier to state.
So that's a product of primes for positive integers. There's an alternative version if you allow positive and negative integers, where you'd say that if n is not equal to 0 then n is a product of primes, where here we allow primes to be possibly negative, in a unique way up to, first of all, order and, secondly, units. And it's actually not quite correct to say it's a product of primes, because -1 isn't actually a product of primes. So what you have to do is remember to say it's a product of primes and a unit, and it's unique up to order and units, so you can have 2 times 3 is equal to 3 times 2 is equal to (-2) times (-3) and so on. So these two versions are pretty obviously equivalent. I'm just going to prove the first version, for positive integers, and the second version follows immediately.
So the proof is in two parts. There's an easy part, which is existence of a factorization n = pqr and so on, where these are primes, and there's a hard part, which is uniqueness, which says that this is unique up to order. So let's do the easy part. Let's take a number n. If n = 1 or n is prime then we're done, because it's a product of an empty set of primes or a product of just one prime.
Otherwise n is equal to ab where a and b are both greater than 1 and less than n, because if you can't write n like this then it must either be 1 or a prime, more or less by definition of a prime. So by induction a is a product of primes p1, p2 and so on, and b is a product of primes q1, q2 and so on. That's because these numbers are less than n, so we can apply our inductive hypothesis about every number being a product of primes, and so obviously n is equal to p1 p2 and so on times q1 q2 and so on. So by induction we've shown that every positive integer is a product of primes.
Now we've got to do the hard part, uniqueness, and here we need a key property of primes, so this is the key step. Suppose p is prime and suppose p divides a times b; then p divides a or p divides b. So this is the central key result we need in order to prove uniqueness, so how do we prove this? Well, suppose p does not divide a. Then the greatest common divisor of a and p is equal to 1, as p is prime: the only possible positive divisors of p are 1 and p itself, and we've just assumed p doesn't divide a, so the only positive number that can divide both a and p is 1. So by Euclid, ax + py = 1 for some integers x and y. Now we multiply by b, so bax + bpy = b. Well, p divides bax because we assumed it divides ab, and p obviously divides bpy because p divides p, so p divides b, which is what we're trying to prove. We've shown that if p doesn't divide a then it must divide b.
So with this key step it's now easy to prove uniqueness. So let's suppose n is equal to p1 p2 and so on, a product of primes, and suppose this is also equal to q1 q2 q3 and so on. Then p1 divides q1 q2 q3 and so on.
And we've shown that if p1 divides a product of two numbers then it must divide one of them, and obviously you can just repeat this by induction and show that if a prime divides any product of numbers then it must divide one of them. So p1 divides qi for some i, but qi is prime, so p1 equals qi. And now we can just cross off p1 and qi from the two factorizations and repeat. So we repeat this after dividing by p1 = qi. So the factorization is unique up to order. Of course we can't insist on the order, because p1 doesn't necessarily divide q1; it just has to divide one of the q's.
So this proof of uniqueness of factorization into primes is more or less in Euclid's Elements, written more than 2,000 years ago. He didn't quite state it, and the reason is that the Greek mathematical notation was so clumsy that it was actually really difficult to even state that n can be written as a product of primes. The trouble is the Greek notion of multiplication was a bit tricky. They tended to represent numbers as line segments, and you can multiply two line segments to get an area and three line segments to get a volume. But if you haven't yet discovered four dimensions, it's a bit difficult to do a product of four numbers geometrically. So what did Euclid do? Well, what Euclid did was prove the key step here. He proved that if p is a prime and divides ab then it divides a or it divides b, and that's the hard part of proving the fundamental theorem of arithmetic. So Euclid is sort of credited with proving this theorem because he gave the essential hard part of the proof.
We can also give a similar result for polynomials over, say, the reals or any other field. So suppose you've got a polynomial f(x) = a0 + a1 x and so on. Then we can talk about polynomials being prime, or irreducible, if they are not constant and you can't write them as a product of polynomials of smaller degree.
Notice here the units are the constant polynomials that are not identically zero. So the result here says that any non-zero polynomial can be written as a product of irreducible polynomials and a constant polynomial. And this is unique up to order and multiplying the factors by constants. I'm not going to give the proof of this. You more or less just copy the proof I gave for the integers.
Now I want to give some examples to show why we actually need to prove this and why it's not entirely obvious. You can ask for a similar result in any collection of objects that you can multiply. We can multiply integers, so we can define primes and talk about unique factorization. If you've got any other collection of objects you can multiply, you can again try to define primes as things that can't be factorized, and so on. So let's look at some cases.
Suppose you look at the reals that are greater than or equal to 1. Again you can multiply two reals that are greater than or equal to 1 and you will get another real greater than or equal to 1. There's one unit, which is 1. And what would a prime be? Well, a prime would be something that can't be written as a product of two things that aren't units. But the problem is if you take any real a greater than 1, you can write it as the square root of a times itself. So there are no analogs of primes. Let's put primes in inverted commas. So here you can't write a real number greater than 1 as a product of a unit and "primes", where we adjust the definitions of unit and prime for the reals greater than or equal to 1.
Well, the reals are sort of continuous in some sense, and we can ask whether we can do something like this for things that are discrete. So now let's take the integers 1, 5, 9, 13, 17, 21 and so on. These are the positive integers of the form 4n + 1. And notice you can multiply them, because (4n + 1) times (4m + 1) is equal to 4(4mn + m + n) + 1, which is again of the form 4k + 1.
So here we can multiply these integers, and you can ask what the primes are here. Well, the number 1 is obviously just a unit, and 5 is prime and 9 is prime. I mean, you can write 9 as 3 times 3, but that doesn't count because we're only allowing numbers that are 1 mod 4. So 13 and 17 are prime, 21 is again prime, but 25 is not prime, and 29 is prime again, and so on. 25 is not prime because it's 5 times 5, and similarly 45 would count as not prime because it's 5 times 9. So we have an analog of primes, and every number is indeed a product of "primes". I'm writing primes in inverted commas because I'm talking about these funny primes where I'm counting 21 and 9 as being prime. However, this is not unique. The problem is we can do things like 9 times 49 is equal to 21 times 21, and 9, 49 and 21 are all primes in this funny sense. So here we get a non-unique factorization. Of course, if we allow integers that are 3 mod 4, then 9 becomes 3 times 3, 49 becomes 7 times 7, and each 21 becomes 3 times 7. So we restore unique factorization if we add the numbers that are 3 mod 4. So anyway, the point of this example is to show that you actually have to do some sort of non-trivial proof that ordinary integers can be written as a product of primes in a unique way. You've got to eliminate some sort of weird case like this going on.
Well, the problem with these two examples, you could say, is that the reals greater than or equal to 1 and these numbers are not actually closed under addition and subtraction. So can we find examples with addition and subtraction where we don't get uniqueness? And the answer is yes. So let's look at, say, functions defined on the reals that are greater than or equal to 0. Here we can multiply them and add them and subtract them pointwise. But if you look at the function x, you can see x is equal to the square root of x squared, and the square root of x is equal to the fourth root of x
all squared, and the fourth root of x is equal to the eighth root of x all squared, and so on. So you can keep on splitting this function into a product of smaller and smaller things, and none of these are units. So there's no way to write this as a product of prime functions, because we can always split each factor as a product of two smaller things. If you're looking at general functions on the reals, there's no good concept of a prime function that you can write all functions as a product of.
Another example: let's take all numbers of the form m + n times the square root of -5, where m and n are integers. This is an example of the ring of integers of something called an algebraic number field, and a lot of number theory can be extended from the integers to algebraic number fields. However, one thing that can't be extended in this case is unique factorization, because we can see that 6 is 2 times 3 and it's also (1 + root -5) times (1 - root -5). This gives two different factorizations of 6; they don't differ from each other by units, and you can't write any of these four factors as a product of non-units. So there's no uniqueness of factorization in this case. Incidentally, we will see later on that the numbers of the form m + n times root -1 do have unique factorization into primes. These are called the Gaussian integers, and we will later adapt Euclid's proof to show that it can be extended to these as well. Here the units, by the way, are plus or minus 1 and plus or minus i, where i is root -1. So it has slightly more units than the integers. So in some sense we're quite lucky that the integers do have unique factorization. I mean, we could easily have ended up with the integers behaving like the numbers m + n root -5, which would make life a lot more difficult.
Next, we will have Euclid's proof that there are infinitely many primes. I discussed this in the first lecture, so I'll just quickly recall it.
So you remember that if we've got any finite set of primes p1 up to pn, what we do is multiply them together and add 1. Then we take a prime p that divides p1 p2 ... pn + 1, and we notice that p is not equal to any of p1, p2, up to pn, because if it were p1, say, then it would divide the product p1 p2 ... pn and it would also divide the product plus 1, so it would have to divide 1, which is nonsense. So p is a new prime. So there are infinitely many primes. Well, Euclid didn't actually say there are infinitely many primes, because he didn't like the concept of infinity. So what he said is that for any finite set of primes you can find another prime not in that set, which is another way of saying the same thing. Of course, he didn't use the word set because set theory hadn't been invented, but whatever.
As I said in the first lecture, there's a common mistake: some people think that p1 p2 ... pn + 1 is always prime, and you can easily check that it isn't. Let's do the first few cases. First of all, we take the product of the empty set of primes, which is 1, and add 1, and we get 2. Then we take the product of all the primes we've found, which is 2, and add 1, which is 3. Then we take 2 times 3 plus 1, which is 7. And we take 2 times 3 times 7 plus 1, which is 43. Then we take 2 times 3 times 7 times 43 plus 1. And this time we get 1807, and this factors as 13 times 139. So we don't always get primes; sometimes this number really does factorize.
There's something you might wonder about. What happens if, instead of taking 2 times 3 times 7 times 43, you just take the first few primes, 2 times 3 times 5 times 7 times 11 or whatever, and add 1? Is this always a prime? See, it looks as if it's got a much bigger chance than normal of being prime, because it's obviously not divisible by any of these small primes. So these numbers are certainly more likely than most numbers to be prime, in a sense.
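The calculation above is easy to repeat by machine. The sketch below is one way to do it in Python. The function names are invented for this illustration, and picking the smallest prime factor at each step is an assumption made for definiteness (the argument only needs some prime factor). The second function forms the product of the first k primes plus 1, so that the question just raised can be explored by direct calculation.

```python
def smallest_prime_factor(n: int) -> int:
    """Smallest prime factor of n (n >= 2), found by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor up to sqrt(n), so n itself is prime

def euclid_steps(steps: int):
    """Repeat Euclid's construction: (product of primes found so far) + 1.

    Returns the list of numbers formed and the list of primes picked.
    Assumption: at each step we pick the smallest prime factor.
    """
    numbers, primes, product = [], [], 1   # empty product is 1
    for _ in range(steps):
        candidate = product + 1
        p = smallest_prime_factor(candidate)
        numbers.append(candidate)
        primes.append(p)
        product *= p
    return numbers, primes

def primorial_plus_one(k: int) -> int:
    """Product of the first k primes, plus 1."""
    product, m, count = 1, 2, 0
    while count < k:
        if smallest_prime_factor(m) == m:   # m is prime
            product *= m
            count += 1
        m += 1
    return product + 1

numbers, primes = euclid_steps(6)
print(numbers)   # the fifth entry is 1807 = 13 * 139
print(primes)

for k in range(1, 8):
    value = primorial_plus_one(k)
    print(k, value, smallest_prime_factor(value) == value)
```

Trial division is fine for numbers of this size, but both sequences grow very quickly, so going much further would need a better factoring method.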
However, even these are not always prime. We should check the first few cases and just see what happens. So we take 2 plus 1, which is 3; that's obviously prime. 2 times 3 plus 1 is 7; that's prime. 2 times 3 times 5 plus 1 is 31. 2 times 3 times 5 times 7 plus 1 gives you 211; that's still prime. If you take 2 times 3 times 5 times 7 times 11 plus 1, that's 2311, which is again prime. However, if you take 2 times 3 times 5 times 7 times 11 times 13 plus 1, this is 30031, and this actually splits as 59 times 509. So the reason I'm doing this is to emphasize that number theory is really a rather experimental subject. If you've got a question like whether the product of the first few primes plus 1 is prime, the first thing you do is just calculate the first dozen or so examples and see what happens. And quite often you will be able to answer your question by just doing a bit of calculation.
We can ask another question. Suppose you take Euclid's method of generating primes. So you remember we got 2, 3, then we got 7, then we got 43, then we got 13, and so on. And you can ask, do all primes appear in this list? And this is actually an incredibly difficult question. Well, in some sense it's a very easy question, because it's easy to see that the answer is almost certainly yes. And the reason for this is as follows. Suppose we take some prime, say 101, and ask whether it appears. Let me try to convince you that it is almost certain that 101 appears, without actually doing any calculation. The chance that it appears at step n is about 1 over 101, because we're dividing a sort of random number by 101, and the chance that a random number is divisible by 101 is about 1 in 101, because there are 101 possible remainders. So the chance that it does not appear at step n is about 1 - 1/101.
So the chance that it does not appear at any step less than or equal to n is about (1 - 1/101) to the power of n. And you can make this as small as you like by making n sufficiently large. So if you take n to be a million, this will be a ridiculously small number. So it's almost certain that 101 appears in the first million steps. This is a bit like saying, suppose you toss a coin a million times; it's very unlikely it's going to turn up heads every time. And similarly, if you threw a die with 101 faces three or four times, you probably wouldn't get a 101. But if you threw it millions of times, it's as certain as almost anything can be that this number is going to turn up somewhere. So the number 101 almost certainly appears. It's very hard to imagine how it couldn't appear. And the same obviously applies to any other prime. So this strongly suggests that all primes do in fact appear.
However, it seems to be almost impossible to prove. You see, we're saying these numbers are random. Well, the trouble is they're not random. They're produced by Euclid's method, which is a specific computer program. And it's conceivable that there's some very subtle structure that we don't know about that somehow forces all these numbers to be not divisible by 101. And I've no idea how you could possibly rule this out. So this is an example of a statement in number theory that is almost obviously true, but we have just no idea how we can actually prove it. I think someone once said number theory is a subject where any fool can ask a question that nobody knows how to answer. And this is a typical example of such a question. Number theory is full of questions like this.
Now we have another question. We've shown there are infinitely many primes. And if you look at the primes, you notice that the last digit of a prime must be 1, 3, 7 or 9, unless the prime is 2 or 5.
That's because if the last digit is not 1, 3, 7 or 9, the number would have to be divisible by 2 or 5. And if you write out a lot of primes, you see these last digits seem to be roughly equally common. So we can ask, are there infinitely many with last digit 1? In other words, are there infinitely many primes of the form 10n + 1? And the answer is yes. This is a very famous theorem proved by Dirichlet, which we will actually discuss later in the course. What I'm going to do now is give a few easy cases of this, or easy variations.
So instead of 10n + 1, we could obviously ask about any other arithmetic progression. We could ask, suppose we're given numbers a and b, are there infinitely many primes of the form an + b? Well, there obviously aren't if a and b have a common factor, so we should say a and b are coprime. And then I guess we should also say a is not equal to 0, otherwise you could just take 0 times n plus 1 or something like that. So there are some obvious easy cases. If you look at primes of the form 2n + 1, there are obviously an infinite number because all primes except 2 are of this form.
So what about primes of the form 4n + 3? Well, what you can do is a sort of variation of Euclid's proof. Suppose we take some primes p1, p2, up to pk of the form 4n + 3. Then I'm going to multiply them all together, and then I'm going to multiply this by 4 and subtract 1. So 4 p1 p2 ... pk - 1 is of the form 4m + 3, obviously. And we notice this is not divisible by any of the primes p1 up to pk that we found, because if it were, then that prime would divide 4 p1 p2 ... pk - 1 and it would divide 4 p1 p2 ... pk, and so it would have to divide -1. However, it must also have at least one factor not of the form 4m + 1. That's because the product of any number of factors of the form 4m + 1 is also of the form 4m + 1, as we saw earlier.
If you multiply numbers of the form 4m + 1, the result is still of the form 4m + 1. So this must have at least one prime factor that's not of the form 4m + 1, and it can't be divisible by 2 either. So it must have a prime factor of the form 4m + 3. And it can't be one of the primes of the form 4n + 3 that we've already found. So there are infinitely many primes of the form 4n + 3. We can do a similar thing for 3n + 2, and I'll just leave this as an exercise: there are infinitely many positive primes of the form 3n + 2.
What about 4n + 1? Well, this is rather more difficult. Earlier we used the fact that the product of numbers of the form 4n + 1 is of the form 4n + 1. The trouble is, if we multiply numbers of the form 4n + 3 together, they don't need to be of the form 4n + 3, so the previous proof breaks down. Well, we can do 4n + 1, but we need to use the following fact: if a prime p divides m squared plus 1, then p equals 2 or p is of the form 4n + 1. We're going to prove this later on in the course, but for the moment I'm just going to assume it. And as I said, whenever we have something in number theory you should always start out by checking the first few cases just to get an idea of what's going on. So let's look at the numbers m squared plus 1. We have 1 squared plus 1 is 2. 2 squared plus 1 is 5; that's of the form 4n + 1. 3 squared plus 1 is 10, which is 2 times 5. 4 squared plus 1 is 17; that's of the form 4n + 1. 5 squared plus 1 is 26, which is 2 times 13, and 13 is of the form 4n + 1. 6 squared plus 1 is 37; that's okay. 7 squared plus 1 is 50, so that's 2 times 5 times 5. And you see all the primes that appear here are always either 2 or of the form 4n + 1. So I've at least made this statement reasonably plausible. Anyway, using this fact we can now copy Euclid's proof.
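The table of small cases above can be extended by machine. Here is a small Python sketch that factors m squared plus 1 for the first few m and checks that every prime factor is 2 or of the form 4n + 1. The helper name prime_factors is made up for this illustration, and of course a finite check like this only makes the fact plausible; it is not a proof.

```python
def prime_factors(n: int) -> list:
    """Prime factors of n (with repetition), by trial division."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)   # whatever is left over is prime
    return factors

for m in range(1, 21):
    value = m * m + 1
    factors = prime_factors(value)
    # The claimed fact: each prime factor is 2 or leaves remainder 1 mod 4.
    fits = all(p == 2 or p % 4 == 1 for p in factors)
    print(m, value, factors, fits)
```

Notice that no prime of the form 4n + 3, such as 3, 7 or 11, ever shows up in the factor lists, however far the range is extended.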
Suppose we take some primes p1, p2 and so on up to pk, where all the pi's are of the form 4n + 1. What I'm going to do is multiply them together and add 1. But that's no good, because I can only say something about the prime factors of a number of the form a square plus 1. So I square the product first. But then the problem is this might be divisible by 2. So I'm going to get rid of the possibility that this is divisible by 2 by sticking in a 2, and take (2 p1 p2 ... pk) squared plus 1. So we now know the following about all prime factors of this. First of all, they're of the form 4n + 1 or equal to 2, by the result I stated earlier whose proof I postponed until later. Secondly, they're not equal to 2, because I stuck a 2 in here so the number is odd. And they're also not equal to any of p1, p2, up to pk. So they must be new primes of the form 4n + 1. So we've got infinitely many primes that are of the form 4n + 1 and infinitely many of the form 4n + 3.
Well, you can't really push this rather simple idea much further. You can push it slightly further, so we can show that there are infinitely many primes that have last digit 3 or 7 when written in base 10. The point is that if m and n have last digit 1 or 9, then so does the product. That is, m can have last digit 1 or 9 and n can have last digit 1 or 9, and if you multiply them together the result will still have last digit 1 or 9. So if you take something like 10 p1 ... pk + 3, it must have a prime factor with last digit 3 or 7. So if you take p1 up to pk to be all the primes you've found so far with last digit 3 or 7 (leaving out 3 itself, since otherwise 3 would divide this number), then a prime factor of this with last digit 3 or 7 will be a new one. The trouble with this is it's very hard to separate these out. So if you want an infinite number with last digit 3, or an infinite number with last digit 7, it's much harder to prove. But this is what Dirichlet managed, which we will discuss later.
Another question you can ask is, what about gaps between primes?
So if you look at our primes 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, we can see that we sometimes get gaps. So here's a gap of 6, here's a gap of 4, here's a gap of 1, here's a gap of 2, and so on. So the gaps are sometimes quite small, they sometimes go down to 2, and they sometimes get a bit big. And we can ask, is there a bound on the gaps? And the answer is no, there's no bound on the gaps, and this is quite easy to see. Let's just take the number n factorial, and we can add 2 or 3 or 4 or anything up to n. And these are not prime. That's obvious because n factorial plus k is divisible by k, and bigger than k, if k is at least 2 and at most n. So here we've got a run of n - 1 consecutive numbers that are not prime. We notice here that n - 1 is less than the log of n factorial for large n. And in fact it turns out that, in some sense, if you've got prime numbers up to n, the average size of the gap between prime numbers is about the logarithm of n. That actually follows from something called the prime number theorem that I'll discuss later in the course.
And it's incredibly difficult to show that the gap is sometimes significantly bigger than this or sometimes significantly smaller than this. So I'll give a couple of examples. There's an amazing breakthrough by Zhang, who showed that the gap is less than 70 million infinitely often. Well, of course it's obvious the gap is sometimes less than 70 million, because we've found lots of gaps less than 70 million. What Zhang showed is that you get infinitely many gaps that are less than 70 million. In other words, if you keep going forever, the minimum size of the gaps between primes doesn't keep increasing. It's believed that there are infinitely many cases when the gap is actually 2; that's the famous twin prime conjecture, but that seems to be way beyond anything we can prove at the moment.
The bound of 70 million has since been reduced to something a bit more reasonable, two or three hundred or something, and will probably come down a bit more. On the other hand, you can ask, can the gap sometimes be much bigger than log of n? There's a sort of conjecture that the gap is sometimes about log of n squared, so that's about the square of the average size. This seems to be far beyond anything we can prove. Let me give you an example of some of the things we can prove. There's a result by Rankin, who found it in about 1938, which says the gap is sometimes bigger than this rather spectacular expression. You take a third of log of n, where log of n is the average size, then you multiply it by log log of n, then you multiply it by log log log log of n, and then you divide all this by log log log of n all squared. And your reaction when you look at this is probably to say this expression is utterly and completely ridiculous, and you're right. The thing is, log of n increases very slowly. Log log of n increases so ridiculously slowly that it's constant for all practical purposes, and the triple and quadruple logs are even more ridiculous. However, analytic number theorists are really fond of long chains of logs like this. Chains of three logs occur reasonably often. Chains of four logs like this are unusual even for analytic number theorists, so this probably comes close to setting the record for bizarreness of bounds in analytic number theory. This has actually been improved very slightly, but improvements on this are extremely difficult to do. So I think I'll pause there and continue the rest of the lecture on primes in the next video.
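As a footnote to the discussion of gaps, the n factorial construction is easy to check numerically. The Python sketch below lists the numbers n! + 2 up to n! + n for one small n and confirms that each is divisible by the corresponding k. The function name composite_run and the choice n = 10 are just assumptions made for this illustration.

```python
from math import factorial

def composite_run(n: int) -> list:
    """The n - 1 consecutive numbers n! + 2, n! + 3, ..., n! + n."""
    f = factorial(n)
    return [f + k for k in range(2, n + 1)]

n = 10
for k, value in zip(range(2, n + 1), composite_run(n)):
    # k divides n! (because 2 <= k <= n) and k divides k,
    # so k divides n! + k; since n! + k > k, the number is not prime.
    assert value % k == 0 and value > k
    print(f"{n}! + {k} = {value} is divisible by {k}")
```

This only exhibits a run of at least n - 1 composite numbers; the actual gap between the primes on either side of the run may well be longer.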