I believe you can answer your own data analysis questions. Do you? If you do, stick around for another edition of Code Club. I'm your host, Pat Schloss, and my goal is to help you grow in confidence to ask and answer questions about the world around us using data. This week, we'll build upon our experiences with the read_sheet function from the googlesheets4 package and the geom_line and geom_smooth functions from the ggplot2 package. We'll use these functions, along with the separate function from tidyr and geom_ribbon and labs from ggplot2, to look at seasonal trends in lamb prices here in southeastern Michigan.

As I've mentioned over the past few Code Clubs, I have sheep on my farm. I like raising sheep because they have a pretty quick growing period of about six months. They're small enough that I can have a decent number of them. They taste good. And in southeastern Michigan, there's a pretty strong Middle Eastern clientele that eats a lot of lamb. Last week, we created a figure showing that the number of lambs being sold at my local livestock auction had been falling off for the last two years. That's only part of the equation. Lamb numbers could be dropping because prices are falling, or, with the decline in sheep, they could be getting more valuable. Also, you may have noticed that the number of sheep sold in last week's figure was the overall total. I don't know how many animals were sold in each category. Although I've been recording the number of sheep sold and the price for each class over the past five years, I've never really plotted the data. Today, we'll plot the price for each weight class to see what has happened to the prices over the past five years.

For today's Code Club, we'll build off of what we did last week using the googlesheets4 package to read the spreadsheet. We'll remember how to use the rename function to fix my horrible column names.
We'll learn to use the separate function to split my price ranges into two columns, and we'll see how we can use the geom_ribbon function from the ggplot2 package to plot that range of data. Along the way, we'll also see the mutate, geom_line, and geom_smooth functions again. Finally, we'll use ggplot2's labs function to give our plots descriptive titles and axis labels. Please don't watch the video straight through without firing up RStudio and trying the code and exercises yourself. If you haven't already followed the setup instructions, go ahead and pause the video so you can go over to the Riffomonas Project website for this Code Club and make sure you have everything you need.

Hopefully you found your way over to the Riffomonas website and the blog post for today's episode of Code Club titled, Predicting Lamb Futures. As you scroll down, you'll see there's a section called prompt. In here, we're going to use some of the steps that we used in the previous episode of Code Club. I'm going to copy the link to the spreadsheet that we used before, and we'll go now over to RStudio. I'll set up a new R script, paste the link in, and assign it to a variable called google_sheet, just to keep things nice and tidy. At the top of the script, I need library(googlesheets4) and library(tidyverse), and I'll make sure that these lines are all run. Very good.

So google_sheet, of course, just holds the URL that we want. To read it in, we're going to need to use read_sheet like we did last week. We read this in and, again, it's going to ask for authentication, so I'm going to enter 1 to use my Gmail account. It's reading in the lamb prices from the sheet, numbers and prices.
If you recall from last time, we get this extra column because in the final column I had a series of holidays or different things that came up. So we're going to spiffy this up to make it a little bit more polished and easier to read. Something we also talked about last week was that the date and the total are not being read in in a great format, so we'll fix that.

The first thing we'll do is to say sheet equals numbers_and_prices. Make sure that works. Good. And then we can set range. Last week, we read columns A and B. This time we want A through J. So A, B, C, D, E, F, G, H, I, J. A to J, we'll read those in. And we see we get rid of that last column, the ...11, which wasn't really necessary.

The other thing that we talked about last week was telling read_sheet what type of data appears in each column. So we can set col_types. And you know what, I'm going to put each of these arguments on a separate line, so it's a little bit easier to read what's going on here. We can use a single character to indicate the type of data in each column, so we're going to have 10 different characters here. We'll have a capital D for the date column. We'll have an n (we could also use a lowercase d) for numerical data for the total column. And then we're going to have the eight following columns all being characters. Eventually these will be numeric, but for now they're characters because they represent a range of prices, say 40 to $100 per 100 pounds of animal. So I'm going to put in eight c's. One, two, three, four, five, six, seven, eight. Let's double check that that's 10. Six, seven, eight, nine, 10. I think that's right. If I run that, it'll complain if I didn't get the right number.

44 warnings. It's warning me because it doesn't know what an NA value is. Remember, the default NA value for read_sheet is a blank string, so quote, quote. But what we have in the sheet are NA's.
So we'll say na equals quote NA. It reads it in and everything is good, okay? Very good. So again, this is our data frame. We're going to do a variety of things with these different columns, and having spaces in the names isn't really helpful. Having numbers in the column names isn't really helpful either. And there's kind of a mixture of capitalizations, right? So we're going to harken back to a previous Code Club and use the rename and rename_all functions, which will allow us to modify these column names.

I'll go ahead and add to this pipe. We'll do rename_all and give it the function tolower, and it's going to run the tolower function on all of our column names. So now we see that everything is in lower case, which is great.

Next I'm going to do a rename, because I want to get rid of the spaces and I want to get rid of these numerical ranges in the column headings. So I'll do aged_sheep equals quote aged space sheep. I'll do feeder_lambs with an underscore. Again, the syntax is the new name equals the old name. I could put quotes around the new names, but I'm feeling lazy and I'm not going to do that. Let's do hair_lambs. Hair lambs are a type of sheep that don't produce wool. They produce hair, so they shed. I'll keep replacing the spaces, so next is new_crop. I've never quite been able to figure out what new crop is. I think these are really young lambs that are just weaned, so they're kind of small.

I think that's it for the named ones, and then we have these weight ranges: 40 to 85 pounds, 85 to 105, 106 to 130, and things over 131. I'm going to call these small, medium, large, and extra large. So we'll do small equals 40 to 85. Medium equals 85 to 105. Large is 106 to 130. And extra large is greater than 131. I think that was the way they wrote it. Yeah, greater than 131. So now when we run this, we get our column names the way we want them.
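Pulling the reading and renaming steps together, here is a minimal sketch rather than the exact script from the video. The spreadsheet link isn't given in this transcript, so the hypothetical read_auction_sheet helper takes it as an argument and is never called here. The sheet name, the original column headings and their order, and the toy values are assumptions reconstructed from the narration.

```r
library(tibble)
library(dplyr)

# Hypothetical helper wrapping the read_sheet call described above.
# It is only defined, not called: calling it needs the real link and
# Google authentication.
read_auction_sheet <- function(sheet_url) {
  googlesheets4::read_sheet(
    sheet_url,
    sheet = "numbers_and_prices", # sheet name as heard in the video (assumed spelling)
    range = "A:J",                # columns A through J
    col_types = "Dncccccccc",     # date, numeric total, eight character price ranges
    na = "NA"                     # cells containing NA are missing values
  )
}

# Toy stand-in for what read_sheet would return (headings and values assumed).
raw_data <- tibble(
  Date = as.Date(c("2015-08-31", "2015-09-07")),
  Total = c(500, 450),
  `Aged Sheep` = c("50-80", "45-75"),
  `Feeder Lambs` = c("160-190", NA),
  `Hair Lambs` = c("150-180", "155-175"),
  `New Crop` = c(NA, NA),
  `40-85` = c("180-220", "175-210"),
  `85-105` = c("170-200", "165-195"),
  `106-130` = c("150-170", "155-180"),
  `>131` = c("130-150", "125-150")
)

auction_data <- raw_data %>%
  rename_all(tolower) %>%          # lower-case every column name
  rename(
    aged_sheep = "aged sheep",     # new name = old name
    feeder_lambs = "feeder lambs",
    hair_lambs = "hair lambs",
    new_crop = "new crop",
    small = "40-85",
    medium = "85-105",
    large = "106-130",
    extra_large = ">131"
  )

names(auction_data)
```

With the real spreadsheet, raw_data would instead come from calling the helper on the copied link.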
The names are now nice and tidy and descriptive, and the data are well formatted. So I'm going to assign this to auction_data, just so that I'm not re-hitting Google Sheets over and over again. I've had problems in the past where if you read the same Google Sheet many times in a short period of time, Google gets kind of upset. So we'll run this, and now we have auction_data as our variable and we're good to go.

What I want to do with these data is look at how the price of different types of lambs has changed over the past five years. What I'm going to be most interested in are these large lambs between 106 and 130 pounds. That's kind of the ideal market weight for lamb. A lot of lamb that's sold is right around 125 pounds. This is what generally shows up in your restaurant or in your grocery store. As I was saying, we have a really strong Middle Eastern clientele here in southeastern Michigan. They actually like the smaller weight class, between about 45 and 80 pounds, because it's the size of animal that a family can purchase and eat together for feasts like the end of Ramadan.

So that's the goal. We want to make a plot showing the variation in large lamb prices over the past five years. The data are here in this large column, but right now each value is a range. On August 31st of 2015, the prices were $150 to $170 per 100 pounds, so about $1.50 to $1.70 per pound. To get at these data so that we can plot the actual prices over time, we're going to need to separate the large column into two columns, one for the minimum price and one for the maximum price. The nice thing about the way the data are formatted is that there's a hyphen in every row of these columns. So we can use a function called separate that allows us to split the large column into two new columns using that hyphen as our separator.
The separate function is really powerful, and it's worth digging into if you have anything more complicated than a hyphen or a single character separating the data you want to split into multiple columns. So we'll try this and do separate on auction_data. The column that we want to split is large, and then we set into, which needs to be a vector that's the same length as the number of values in each row. So I'm going to say min and max. What you've perhaps noticed is that I didn't tell separate what to separate the column on, but we still get min and max as our new columns, okay? That's because, by default, separate splits on anything that isn't a letter or a number. To be a little bit more explicit, what I like to do is to define the separation character. That way, if I'm reading through the code without running it, I can see what I separated those columns on. So I'll say sep equals quote hyphen quote. Again, that makes it a little bit more explicit.

If we go ahead and look at the separate help page, we'll see there's a variety of arguments that we could use: remove, convert, extra, and fill. Here we'll talk about the remove and the convert arguments. The first thing that we perhaps notice is that the large column is gone, and this gets us to the remove argument. If we say remove equals FALSE, then we keep the large column and we also get min and max. But if remove is TRUE, then we get rid of that large column. TRUE is the default value for remove, and I'm happy with that. I don't need to keep that range around.

The other thing that we notice is that my min and max columns are characters. They're not numerical at this point. So we can use the convert argument and say convert equals TRUE, and this will convert our new columns into a numerical column type.
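As a small illustration of the sep, remove, and convert arguments just described, here is a sketch on a toy data frame. Only the first price range comes from the video; the other rows are made-up stand-ins, including a missing week.

```r
library(tibble)
library(tidyr)

# Toy stand-in for auction_data (values assumed except "150-170").
auction_data <- tibble(
  date = as.Date(c("2015-08-31", "2015-09-07", "2015-09-14")),
  large = c("150-170", "155-180", NA)
)

large_split <- separate(
  auction_data,
  large,                   # the column to split
  into = c("min", "max"),  # one name per piece
  sep = "-",               # split on the hyphen, stated explicitly
  remove = TRUE,           # the default: drop the original large column
  convert = TRUE           # turn the character pieces into numbers
)

large_split
```

The missing week simply comes through as NA in both min and max.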
And so we see now that min and max are an integer type, a numeric type of data. The default for convert is actually FALSE. Great. So this gets us to the point where we have min and max as numerical columns for our large animal class.

I'm going to do a couple of things here to tidy this up a bit. I'm going to pull auction_data out to make a pipeline. I can get rid of this initial comma, and then let's go ahead and use select to get out the date, the min, and the max. So now what we have is a data frame that has the date for each week of data and the min and the max prices. I'm going to name the output of this pipeline large_prices. That way it'll be easier to work with this data frame going forward. Again, we can look at large_prices and we'll see that we've got the date, the min, and the max. Excellent.

The first thing I want to do is plot the midpoint price for large animals for the last five years. So I'll take large_prices and add a column. Hopefully when you hear add a column, you think mutate. I'll do midpoint equals max plus min, in parentheses, divided by two. I misspelled it: large_prices. And we now see that we have an extra fourth column for midpoint, which is between the min and the max.

To plot this, as we talked about last week, you can use ggplot with aes to set the aesthetics for the x and y axes. So we'll say x equals date, y equals midpoint, and I'll add a geom_line at this point. It's giving me a warning that it removed 29 rows because they contained missing values. If you looked at the end of the spreadsheet, you'll know that I've got rows in there for every week going through to the end of 2020. But we see, in the bottom right corner here in our plotting window, the annual trends in the price for large sheep in this 106 to 130 pound weight class at my local livestock auction.
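The midpoint column and the line plot can be sketched as follows. The prices are toy values, not the real auction data, and the names large_midpoints and midpoint_plot are only there so the pieces can be inspected. Note the parentheses around max + min: without them, only min would be divided by two.

```r
library(tibble)
library(dplyr)
library(ggplot2)

# Toy weekly prices in dollars per 100 pounds (assumed values).
large_prices <- tibble(
  date = as.Date("2015-08-31") + 7 * (0:3),
  min = c(150L, 155L, 160L, NA),
  max = c(170L, 180L, 175L, NA)
)

# Add the midpoint of each weekly range.
large_midpoints <- large_prices %>%
  mutate(midpoint = (max + min) / 2)

# Line plot of the midpoint over time; weeks with NA are dropped
# with a warning when the plot is drawn.
midpoint_plot <- large_midpoints %>%
  ggplot(aes(x = date, y = midpoint)) +
  geom_line()

large_midpoints
```

Printing midpoint_plot in RStudio draws the figure in the plotting window.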
It's surprising to me to see that the prices peak around April and May and then fall off over the course of the year. There is year to year variation, but this seems like a pretty strong trend. Something that I notice is that 2020 looked like it was set to have a great spring, with prices really heading up. But of course COVID hit, and the prices just plummeted before these large animals could really take off in value. That's because, again, these large animals are consumed at restaurants. If restaurants are closed, then no one's going to buy these large animals. So that's quite a problem.

Okay, so this is a bit of a noisy plot. As we talked about last week, you can use geom_smooth to smooth through the data. Now, when we do this, we get a pretty flat smooth curve. We talked last week about the span argument, which allows you to control the wiggliness, the amount of wiggle in the line. The lower the number that you give to span, the more wiggle; the higher, the less wiggle. So let's try 0.1. And we see that we get a pretty good fit to the data.

Something that I might do to clean this up a little bit would be to make the geom_line color gray, and to set se equals FALSE in geom_smooth to hide that standard error confidence interval. So we can see the real data in gray and the smoothed data here in blue, which to me makes it kind of attractive. As we talked about last week, we could also do things like size equals 0.2 to make that blue line less bold. Well, it's maybe too thin. Let's do 0.5. Anyway, we talked about these ways to change the appearance of our smooth line last week.

Of course, the data we have is a range. We don't know what the actual average price was, so this is why I'm calling it a midpoint price.
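Here is a rough sketch of the gray line plus smoothed curve. The data are a synthetic seasonal wave, not the real prices. One assumption to flag: the video sets the line thickness with size, which was the argument at the time; since ggplot2 3.4.0 the same setting for lines is spelled linewidth, so that is what this sketch uses.

```r
library(tibble)
library(ggplot2)

# Synthetic weekly midpoint prices with a yearly cycle (assumed values).
weeks <- 0:119
large_prices <- tibble(
  date = as.Date("2015-08-31") + 7 * weeks,
  midpoint = 180 + 30 * sin(2 * pi * weeks / 52)
)

smooth_plot <- ggplot(large_prices, aes(x = date, y = midpoint)) +
  geom_line(color = "gray") +   # raw data pushed into the background
  geom_smooth(
    span = 0.1,                 # smaller span = more wiggle
    se = FALSE,                 # hide the confidence band
    linewidth = 0.5             # thinner smoothed line
  )
```

Trying a few span values, say 0.1, 0.3, and 0.75, is a quick way to see how much wiggle each one allows.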
And the reality is that if you go and watch the auction, you'll see that there are large groups of animals that are fairly uniform in weight and are sold together, but they might range between, say, 100 and 110 pounds. So we don't know if heavier animals are priced higher per pound than lighter animals. The reality is also that the quality of the animal says a lot about the price that you're going to get, right? If you have a tall, skinny animal, it's going to be heavy, but it's not going to be valued very highly because it just doesn't have as much meat on the bones as a smaller framed animal that has the same weight, okay?

So let's think about how we can represent this as a range. To do that, we're going to learn a new geom from ggplot called geom_ribbon. We'll take large_prices and pipe it to ggplot. The aesthetics that geom_ribbon wants are an x value, a ymin value, and a ymax value. So we'll say x equals date, ymin equals min, ymax equals max, and then we'll add geom_ribbon. And you can see the ribbon diagram that's generated here. It's a little bit spiky. That's, I think, because the prices come out in 10 cent per pound increments, or $10 per hundred pound increments. So it is a little bit jagged, but you can see the range, and you can certainly see that down here in 2020 there's quite a bit of variation in the price. That's something that we'll look at in the exercises.

Now we can play with the appearance of the geom_ribbon. Something that we might like to do is to change the color inside the ribbon. We might also like to change the border of the ribbon. If I say geom_ribbon with fill equals dodgerblue, we see that the ribbon is now blue. By default, there is no border line on that ribbon, but we can set the border with, say, color equals black. And now we get this black line that is the border of our ribbon.
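A ribbon of the weekly price range could look roughly like this. The prices are toy values that move in $10 steps like the real ones do. The border thickness uses linewidth, the current ggplot2 spelling, as an assumption about which package version is installed.

```r
library(tibble)
library(dplyr)
library(ggplot2)

# Toy weekly price ranges in dollars per 100 pounds (assumed values).
large_prices <- tibble(
  date = as.Date("2015-08-31") + 7 * (0:5),
  min = c(150L, 150L, 160L, 160L, 170L, 160L),
  max = c(170L, 180L, 180L, 190L, 190L, 180L)
)

ribbon_plot <- large_prices %>%
  ggplot(aes(x = date, ymin = min, ymax = max)) +  # ribbon needs x, ymin, ymax
  geom_ribbon(
    fill = "dodgerblue",  # color inside the ribbon
    color = "black",      # border line, off by default
    linewidth = 0.2       # thin stroke for the border
  )
```

Dropping the color argument gives the borderless ribbon that geom_ribbon draws by default.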
I think we can also do size equals 0.2 to give that border line a thinner stroke. But again, whether or not you have that border is up to your own personal preferences for what you want the plot to look like, right? It might be nice to think about fitting a smooth line through the ymin and ymax values. That's a little bit more advanced than where we are now. Perhaps in a future Code Club we'll come back and look at how we could add those smooth lines as the borders for our ribbon.

Anyway, the last thing I want to talk about with our plots is that as we've been making plots over the last few weeks, I've felt a little bit uneasy about them because they don't really tell a nice story, right? If you look at this, well, I have no idea what's on the y axis. The x axis thankfully tells me the date, but I have no idea what's going on here. And perhaps we'd like to tweet this out to say, hey, large lamb prices seem to peak in the spring and fall off the rest of the year.

To fix that, there's a function in ggplot called labs. We can give labs a variety of arguments: a title, a subtitle, a caption, an x axis label, and a y axis label, okay? To figure out where these all go, where the title is, where the subtitle is, where the caption is, I'm going to plug in the argument name as the value that I'm passing to each argument. And sure enough, what we see is that the title is in a larger font at the top left of the plot, followed by the subtitle. The caption goes in the lower right corner, and then we have our x and y axis labels. Now, if you go back to the ggthemes, fun with themes Code Club from a while back, you'll see how you can modify the location, the font, and the sizes of these different labels. So for my x axis, I'll put date.
For my y axis, I'm going to put price, and I'll put dollars per 100, oh, not percent, dollars per 100 pounds. For my title, I'll say large lambs reliably are most valued in, let's say, April and May. For our subtitle, I'll say prices of 106 to 130 pound lambs from 2015 to 2020. And then for my caption, I'm going to say data reported by United Producers in Manchester, Michigan. For the title, I like to be declarative and to tell you what you should see when you look at this plot. The subtitle is perhaps a more bland description of what's going on. And the caption we can use to indicate where we got the data, or perhaps some caveats or nuances that you want to make sure people appreciate about your visualization. So we run that and we get a pretty good looking plot.

Now, I'm not such a fan of this gray background. In the past, I've been using theme_classic, which has a nice clean presentation. What I'd like to see here are a few grid lines. I normally don't like grid lines because I think they're just too much, but it would be nice to see where the vertical grid lines are for each year, to know where the break in the year is. To do that, I'm going to use theme_light. And again, this gives a pretty nice presentation of the ribbon moving over time.

Of course, we can copy this labs and theme_light up to our previous line plot for the midpoint data and add them there. In a way, this gives two different plots for effectively the same data, presented in slightly different ways. If I'm looking for a general trend in what's going on and I'm not so worked up about the actual values, then I like the presentation of the smooth line better than the range. The ribbons are just kind of noisy. Perhaps if I could figure out how to smooth ymin and ymax, I would deal with that. But again, these are two different options for looking at the same type of data.
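The labels and theme described here can be sketched like this. The label text follows the wording spoken in the video, while the price data are toy values and the exact punctuation of the labels is a guess.

```r
library(tibble)
library(ggplot2)

# Toy weekly price ranges in dollars per 100 pounds (assumed values).
large_prices <- tibble(
  date = as.Date("2015-08-31") + 7 * (0:5),
  min = c(150L, 150L, 160L, 160L, 170L, 160L),
  max = c(170L, 180L, 180L, 190L, 190L, 180L)
)

labeled_plot <- ggplot(large_prices, aes(x = date, ymin = min, ymax = max)) +
  geom_ribbon(fill = "dodgerblue") +
  labs(
    title = "Large lambs reliably are most valued in April and May", # the takeaway
    subtitle = "Prices of 106 to 130 pound lambs from 2015 to 2020", # plain description
    caption = "Data reported by United Producers in Manchester, Michigan", # source
    x = "Date",
    y = "Price ($ per 100 pounds)"
  ) +
  theme_light()  # white background with light grid lines

# The labels are stored on the plot object and can be checked directly.
labeled_plot$labels$title
```

Because labs and theme_light are just added layers, the same two calls can be appended to the midpoint line plot as well.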
I have to admit that although I've been recording these data for the past five years, I find the result that we just looked at pretty surprising. I hadn't looked at the data in that kind of detail. So I have other questions that come to mind, and these are questions that you're going to get to grapple with now in the exercises that I have here on the screen. Questions that I wonder about are: Is there an ideal time to sell my older ewes that are past their prime and ready to move on to greener pastures? How much has the variation in price changed over the past five years? Is the recent change due to COVID, or has that always been with us? So take time now to pause the video and work through the exercises on your own.

I hope you found these exercises interesting. Perhaps the application is a little bit different from what you encounter in your day to day life and the types of questions you have. But I can assure you that these types of questions are really important to me as a sheep producer.

For the first question, I asked you to look for seasonality in the aged sheep class. Aged sheep, as I said before, are the sheep that are perhaps past their prime: ewes that are no longer able to support their lambs, or rams that perhaps need to go, if you know what I mean. So let's go ahead and look at these data in RStudio to see if there's any evidence for seasonality in this class. I'm going to copy my large_prices definition down into exercise one. Instead of large_prices, I'm going to say aged_prices. And instead of separating large, we're going to separate aged. What did I do wrong? Oh, it should be aged_sheep. And if we look at aged_prices, let me move this up a little bit, we see now that we have a column for the date, the min, and the max. As you can see, these prices are quite a bit lower than what we see for large animals, right?
These aged sheep, like I said, are typically ewes and rams that are several years old. And they're typically destined, sorry to say, for dog food. So even though your Fido's kibble says it's got lamb in it, it's most likely mutton. It's just too expensive to put real lamb into dog food.

What we want to do is look at the range and then make a fitted line using the midpoint, very much like what we did earlier. This should be aged_prices, piped to ggplot with aes x equals date. To make the ribbon plot, we do ymin equals min, ymax equals max, plus geom_ribbon. I'm going to do color equals dodgerblue. Not color, fill. I always mix up color and fill. And we can see that it's pretty flat. Maybe there's some spikiness here and here, it seems at the beginning of the year, but it's really hard to see what's going on there.

So let's go ahead and look at the midpoint. We'll take aged_prices. Remember we did mutate, midpoint equals max plus min, divided by two. Then we pipe that to ggplot with aes x equals date, y equals midpoint, and then geom_line. And yeah, it looks like there's a spike towards the beginning of each year. Let's go ahead and fit our smooth line to that. We'll do geom_smooth with span equals 0.1, and I'll do se equals FALSE to get rid of that confidence interval. And yeah, we can see some spikiness in the data at the beginning of each year. Maybe to make this look a little bit more attractive, I will add theme_light, and I will make my geom_line gray so it stands in the back a little bit more. Let's see, it's not happy with me. Gray needs to be in quotes. And sure enough, we can see the spikiness.

This kind of makes sense, because people typically will have their lambs born in January through May. If a ewe can't support a lamb, you probably know that pretty soon after she's had the lamb, sometime in January through May.
And it's pretty expensive to keep around a fairly non-valuable animal that you have to feed into the winter. So yeah, you could hold on to a ewe until January, but you're going to be feeding her for several months knowing that you're not going to get much out of her. This is one of those cases where we could hold on to a ewe until January, when we'd get the best value, but it's probably too expensive to hold on to her until then.

So let's go ahead and give these plots some more descriptive titles. Again, we'll do labs. Title equals, let's say, aged sheep are more valuable in January. For the subtitle, we'll say aged sheep prices from 2015 through 2020, or through present, since I just got this week's data in here. For x, we'll say date. For y, we'll say price, in dollars per 100 pounds. And for the caption, we'll say data reported by United Producers for Manchester, Michigan. Excellent. Of course, we can copy this up to our ribbon diagram, add it to the pipeline, and get our labeled ribbon diagram. Then we can toggle back and forth and see which representation we like. Again, for these data, I like the smooth curve a little bit better than the ribbon diagram, but it depends on your application. Also, looking at these data, I do see a bit of a blip in the middle of the year for at least 2017, 2018, and 2019, where they increase in value a little bit. I'm not sure why that would be, but regardless, it's interesting. Certainly the strongest signal of value is early in the year.

The next question is, how has the weekly range in price for large lambs varied over the past five years? Here, what I want to know is, for each week, what is the difference between max and min for our large lambs? We'll remember we have large_prices, and what we want to do is create a column that we're going to call range. So I'll pipe this into mutate.
We'll say range equals max minus min to get the range, and then I'll pipe this into ggplot with aes x equals date, y equals range. I'm going to use geom_line here because I don't have a range of ranges, just a single line. And I'll go ahead and add theme_light. Here again, the prices come out in $10 per 100 pound increments, so we get this stair step look. Generally the prices range between, say, 10 and 30 cents per pound across time, but there are a few fluctuations. And certainly here in 2020, it went from there being no range, so perhaps there weren't many animals being sold and there really wasn't a difference, to being a very high range. That perhaps reflects a pretty wide variation in the quality of the animals, or just uncertainty as to whether or not people should be buying these large lambs as we were coming into COVID and restaurants were shutting down.

I like to make my theme line be the last line of my ggplot, so I'll add geom_smooth before it. Let's do span of 0.1 again and se equals FALSE. Don't forget our plus sign. And we can see that there is some seasonal fluctuation over the year here. But certainly we see that in 2020, the variation has really taken off.

We can again add our labels. I'm going to copy what I had up here down here as a good place to start. For the title I'll say range in weekly price for large lambs has spiked in 2020 due to COVID-19. The subtitle will be difference in min and max prices for 106 to 130 pound lambs. Let me run that just to check. What I'm seeing is that my title, at least at my resolution, is running off the right side of the screen. To deal with that, I can put in a backslash n where I want the line to break. Backslash n means line break. So if I run that, you can see the title now wraps, and I'll go ahead and do that here too. I could also perhaps think of a more concise way to say the same thing.
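The range calculation and the backslash n line break can be sketched like this. The prices are toy values, and where the title breaks is a guess, since the video doesn't say exactly where the backslash n was placed.

```r
library(tibble)
library(dplyr)
library(ggplot2)

# Toy weekly price ranges in dollars per 100 pounds (assumed values).
large_prices <- tibble(
  date = as.Date("2020-01-06") + 7 * (0:2),
  min = c(150L, 155L, 160L),
  max = c(170L, 180L, 160L)
)

# Width of each week's price range.
large_ranges <- large_prices %>%
  mutate(range = max - min)

# "\n" inside the string forces a line break in the rendered title.
range_title <- "Range in weekly price for large lambs\nhas spiked in 2020 due to COVID-19"

range_plot <- large_ranges %>%
  ggplot(aes(x = date, y = range)) +
  geom_line() +
  labs(
    title = range_title,
    subtitle = "Difference in min and max prices for 106 to 130 pound lambs",
    x = "Date",
    y = "Range in price ($ per 100 pounds)"
  ) +
  theme_light()

large_ranges
```

A week where min equals max gives a range of zero, which is the flat stretch described for early 2020.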
So perhaps I don't need the from 2015 to present, because that's kind of obvious. And so we get that. The title's good, and date for the x axis is good. For the y axis, we'll put range in price. And there we go. A pretty attractive plot that tells a story. Again, maybe I'd come up to geom_line and say color equals gray, just so it's not so pronounced and the trend really pops out at us.

All right, this final question is a bit more involved. I get to choose when I take my lambs into the auction, and I can take them in for any of these different weight classes. That's why it's good to have these data, because I can perhaps forecast when lambs of different sizes are going to be more valuable than others. So I might ask: if I've got a 95 pound lamb, should I pay to grow it to 125 pounds, since I have to feed it to gain another 30 pounds? It might take four to six weeks to gain that final 30 pounds, and that's not free, right? So is my increased revenue going to more than pay for the cost of feeding the animal over that time? That's what this question is asking.

To do this, we need to look at two columns, the medium and the large columns, and separate those. And we're going to assume that we get the midpoint price for the medium and large weight classes. So I'll break this down. We need to separate medium into min and max. We need to separate large into min and max. We then need to calculate the midpoint for medium and the midpoint for large. Then I need to calculate the total price for a 95 pound lamb and the total price for a 125 pound lamb. And then I need to calculate, oh, all the typos, the difference in total prices, and plot it. So these are the steps, okay?
So we'll take our auction_data, and I'm going to do separate twice. We separate medium into medium_min and medium_max, with sep set to my hyphen. We'll also separate large into large_min and large_max, with sep on the hyphen. And if we look at this, it's telling me it expected three pieces. That's because I put my closing parenthesis after my sep rather than after medium_max. So, good. We have our medium_min and medium_max. And we have, where did my cursor go? We have large_min and large_max. Excellent.

Now we're going to do a mutate. For mutate, we'll say medium_midpoint is medium_max plus medium_min, divided by two. Great. We'll do large_midpoint as large_max plus large_min, divided by two. Then medium_total is going to be medium_midpoint times 95, and large_total is large_midpoint times 125. And then we want the difference, which will be large_total minus medium_total.

So again, looking at our pipeline and the outline I made: separating medium into min and max and separating large into min and max are lines 112 and 113 of my script. Calculating the midpoint for medium is here at line 114. For large, it's at line 115. Calculating the total price for a 95 pound lamb is at line 116, and the total for a 125 pound lamb is at line 117 down here. And then the difference, finally, is at line 118.

If we run this, we get an error in medium_max plus medium_min. It's not happy with something. It's saying error, non-numeric argument to binary operator. I think I know what's going on. Non-numeric argument is making me think that I forgot to include my convert argument. If I look at the output of these three lines, I see that medium_min is a character type and medium_max is a character type. So I need to say convert equals TRUE there and here. And that worked.
And I'm going to clean this up a bit and say select date comma difference. Something that I'm seeing is that it's still per hundred pounds. This price was per hundred pounds, and I didn't actually divide the total by a hundred. So I need to divide this by a hundred. So I run that, and then I see the difference between the large class and the medium class. Excellent. So I'm going to give this a name. Maybe I'll call it large medium diff. And now I want to plot it. So we'll take large medium diff and pipe that to ggplot with aes x equals date, y equals difference, and use geom_line. And I'll go ahead and add theme_classic. And because I'm sure I'll want it, I'll do geom_smooth with span equals 0.1. It can't add theme_classic because I needed the parentheses. But I think what I really want is theme_light anyway. Excellent.
And so what we see is pretty strong variation in the difference between the large and the medium animals. Let me go ahead and make se equals false. I'll make my line gray to make it stand back a little bit. And what's interesting to me is that in January, the price difference is very low. And so it's probably not affordable to raise an animal to 125 pounds in early January. It's probably better to sell it a month earlier as a medium lamb than in January as a large lamb. And that's probably because a lot of people have brought smaller lambs in that were born in May. They brought them in in September and October. Those are getting fed out. And now they're filling the market in early January. So there's probably an influx of large lambs in early January. But we can see that we generally have a difference of $20 to $50 per animal. And so that tells me again as a producer that I need to spend less than $20 to get my animal from 95 pounds to 125 pounds to be confident that I'm going to make any money on that.
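Putting those pieces together, the whole pipeline and plot might look roughly like this. It is a sketch under assumptions: the data are synthetic weekly price ranges generated just so the code runs on its own, and the names auction_data, large_medium_diff, and diff_plot are illustrative rather than taken from the actual script.

```r
library(tidyr)
library(dplyr)
library(ggplot2)

# Synthetic weekly price ranges (invented, per hundred pounds) standing in
# for the real spreadsheet so that the example is self-contained
weeks <- 0:103
medium_low <- round(200 + 20 * sin(2 * pi * weeks / 52))
large_low <- round(180 + 30 * sin(2 * pi * (weeks - 8) / 52))
auction_data <- tibble(
  date = as.Date("2018-01-02") + 7 * weeks,
  medium = paste(medium_low, medium_low + 20, sep = "-"),
  large = paste(large_low, large_low + 20, sep = "-")
)

large_medium_diff <- auction_data %>%
  separate(medium, into = c("medium_min", "medium_max"), sep = "-", convert = TRUE) %>%
  separate(large, into = c("large_min", "large_max"), sep = "-", convert = TRUE) %>%
  mutate(
    medium_midpoint = (medium_max + medium_min) / 2,
    large_midpoint = (large_max + large_min) / 2,
    # prices are per hundred pounds, so divide by 100 to get dollars per animal
    medium_total = medium_midpoint * 95 / 100,
    large_total = large_midpoint * 125 / 100,
    difference = large_total - medium_total
  ) %>%
  select(date, difference)

diff_plot <- large_medium_diff %>%
  ggplot(aes(x = date, y = difference)) +
  geom_line(color = "gray") +           # raw differences, pushed to the background
  geom_smooth(span = 0.1, se = FALSE) + # smoothed trend without the error band
  theme_light()

diff_plot
```

With the real data, the only change should be replacing the synthetic auction_data with the data frame read from the Google Sheet.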
And again, as we saw from plotting the large lamb prices, selling those in late spring or mid spring is probably driving a lot of this trend. So we can go ahead and add labs, and for the title I'll say don't sell lambs in January, or rather, don't sell large lambs in January. And then our subtitle will be total price difference between a 125 and 95 pound lamb. Caption, like we've been saying, is data from United Producers for Manchester, Michigan. X is date. Y is going to be total price, and this will be dollars per animal. And there we go. We have a pretty nice plot that tells an interesting story about the difference between raising animals for these different weight classes of sheep.
Thanks again for joining me for this week's Code Club. Be sure that you spend the time going through the exercises on your own to help reinforce your new skills. Even better would be for you to take the data and tools we've worked on today to answer your very own question. I'd love to see what you did. Please feel free to drop a line in the comments below to tell us what question you were eager to answer. Also, please let me know what types of data analysis questions you have and I'll do my best to answer them in a future Code Club. Be sure to tell your friends about Code Club and to like this video. Subscribe to the Riffomonas channel and click on the bell so you know when the next Code Club video drops. Keep practicing and we'll see you next time for another episode of Code Club.