Good morning, good afternoon, or good evening, depending on where you are logging in from. It's my pleasure to welcome you to the first session of the GLOSOLAN soil spectroscopy webinar series. My name is Yi Peng. I'm from the Global Soil Partnership at FAO, and I will be moderating this session. Today's webinar will introduce you to a very important topic for soil analysis: soil spectroscopy. Before starting, I kindly ask all of you to check the Zoom chat, as some rules and information on this session will now be posted. I would kindly remind you that the session is organized in a webinar format in which participants cannot activate their audio and camera. However, participants are encouraged to post their questions in the Q&A box, which will be moderated by the GLOSOLAN coordinator. We will choose a few questions to be answered live. The rest will be answered via chat. In addition, a chat box is available and can be used for interaction between participants. Please use the chat responsibly. The co-host of the meeting is Isabel. She's here to help you with any technical issue, so please don't hesitate to write to her directly using the private message option of the chat box if needed. Before digging into soil spectroscopy with our renowned speaker, I would like to give the floor to my colleague Lucrezia, who will provide a bit of background on the Global Soil Partnership and GLOSOLAN. Lucrezia, over to you.

Good morning, good afternoon, good evening to everybody, and thanks for being with us today. Just give me a second to share my screen. I hope that you can hear me well. Can you see it?

Yes.

Okay, so as you mentioned, I would just like to give you some brief information on the framework under which this series of webinars is organized, starting from the very beginning. So who is organizing this webinar and this webinar series?
The webinars are organized under the framework of the Global Soil Partnership, a partnership established in 2012 to position soil in the global agenda through collective action. Our key objective is to promote sustainable soil management and improve soil governance to guarantee healthy and productive soils. How do we do this? Well, all activities are downscaled through, and with the support of, seven regional soil partnerships, and additional support is provided by GSP partners and the Intergovernmental Technical Panel on Soils, which is our highest body of technical experts on soil. Our work is currently organized in the thematic areas you see on the screen. In addition to this, we have technical networks, like the one on laboratories, GLOSOLAN, the one on black soils, INBS, the one on fertilizers, INFA, and many others that help us implement activities on specific scientific topics. I mentioned GLOSOLAN: GLOSOLAN is the Global Soil Laboratory Network, which was established in 2017 to build and strengthen the capacity of laboratories in soil analysis and to respond to the need for harmonizing soil analytical data. So in 2017 the network was established, and it started to work on wet chemistry. We focused a lot on training, harmonization, standard operating procedures, the execution of interlaboratory comparisons, and much more. In 2020 we launched the GLOSOLAN initiative on soil spectroscopy, and right at the end of last year, again under GLOSOLAN, we launched the international network on fertilizers, which is called INFA. I'm being very brief to leave space for the webinar, so if you are interested in learning more about all of this, you can visit our web page, and if you have any question on GLOSOLAN in general, you can contact me at the email address you see on your screen. Otherwise, if you have more specific questions, you can contact us directly.
To close this very brief presentation, I would like to remind you that, since the main objective of GLOSOLAN and the GLOSOLAN initiative on spectroscopy is to develop national capacities in soil analysis, we decided to start organizing a series of webinars. Because the audience has very different backgrounds and levels of knowledge of spectroscopy, the first three webinars provide basic knowledge of soil spectroscopy. Then, because one of the most frequently asked questions we received was how to build and use spectral libraries, we decided to focus the fourth and fifth webinars on soil spectral libraries, providing concrete examples from France and Brazil. Finally, the sixth webinar, the last one we have organized so far, will be about spectral measurement. Please note that this is just the beginning of our training program on spectroscopy, and that more webinars, online courses, instrument demonstrations and other activities will come in the next few months. We are also working to make these webinars available in multiple languages. This was my last slide. So I thank you very much for your attention, I wish you a very successful webinar, and thanks to Bo for being with us today and providing this information. Back to you again.

Thank you for this nice introduction to the GSP activities. I now have the great honor of giving the floor to Professor Bo Stenberg from the Swedish University of Agricultural Sciences. He is a soil scientist who is very well known worldwide. His research focuses on digital soil mapping and variable-rate fertilizer application in precision agriculture, proximal soil sensing, and diffuse near-infrared spectroscopy for soil analysis. He is also one of the first soil scientists who started research on soil spectroscopy. He and his colleagues also conducted the first comprehensive literature review on the subject.
I remember that this paper was one of the first things I read during my PhD studies, 10 years ago. I still read it sometimes if I have a question or a doubt. The link to the paper, and to another two important research works from Bo, is now being posted in the chat. Without further ado, I would like to give the floor to Professor Stenberg, please.

Do you see the correct screen now?

Yes.

And you can hear me as well?

Yeah, go ahead.

Good. Thank you, and thank you for the opportunity to give this seminar about soil spectroscopy. This one is the first of the series, as you heard, and it will mostly be an introduction to soil spectroscopy. But at the end, if time doesn't fly too fast, there will also be an example where we use a soil spectral library to map soils at the field or farm scale, because my interest in near-infrared spectroscopy is as a fast, simple and cheap method to get the high-density information you need for precision agriculture. So my background is as a practitioner when it comes to near-infrared spectroscopy. What I will try to talk about here is what I have learned by practicing near-infrared spectroscopy on soil, and by trying to understand the things I think I need to understand. So it will be from a user's point of view, pretty much. My colleagues and I have also worked a little with trying to understand more about what is going on in the soil spectra, what is actually affecting the spectrum.

Just jumping into it directly: these are the spectra of two different soils. They are reflectance spectra, because that's what we measure. What is reflected from the incoming energy source, which is a light source, is what is not going somewhere else.
By "somewhere else" I mean that it could be scattered away from the detector, or it could be absorbed by something in the soil, something chemically active, and that is mainly what we are after. That's the information we want to read and extract from the spectrum. These are two different soils; one is very organic and one is less organic, and you can guess which one is which, I think. Later on I will come back with another example. I will not talk only about the near-infrared region of the spectrum but also about the visible part, because the instruments we use today often cover both, and both are valuable for detecting different properties of the soil. The visible is of course what we can see with our eyes, and it runs from 300 or 400 nanometers, depending on who you ask, up to 700 or 780, also depending on who you ask. After that comes the part of the infrared nearest to the visible, which is why it's called near infrared. It goes up to about 2500 nanometers; sometimes you can also see 3000. So in the visible and near infrared we usually talk about wavelength in nanometers. When you come up into the mid infrared, for some reason everyone talks about wavenumbers and frequencies, which run in the other direction: the higher the wavelength, the lower the frequency. So the frequency decreases towards the right as the wavelength increases. Here are the frequencies for the visible and near infrared, if you want to see them. But I will talk almost only about wavelength, with at least one exception.

There is one source of confusion, and that is that in some research or application communities, near infrared is defined as going up to about 1000 nanometers, and from there up to 2500 or 3000 it is called shortwave infrared.
It's especially in the remote sensing community that you will see this, so I guess many of you have come across this definition. It's just how it is, but it's good to know that both definitions exist. Of the speakers in this webinar series, I think at least one, the last one, will use this other definition. I'm quite sure about that.

The reason why you want to use visible and near-infrared spectroscopy is, first of all, that it's extremely fast. If you have the spectrophotometer and the soil in place, it's a question of seconds to get the spectrum. And afterwards, nothing has happened to the soil. There are no chemicals or anything like that involved. And you don't really need to prepare your sample very much. You get better spectra, and spectra that are easier to use, if they are from dried and sieved soil, but it's not absolutely necessary. The instrumentation can be built in almost any way to present your sample to the instrument or vice versa, depending on what instrument you use, and with some instruments you can use very many different sample presentation techniques. In addition to being flexible, the instruments can be built rugged and durable. So you can use an instrument in the field, you can put it on a tractor, you can put it in an airplane, you can put it on a satellite, and that is done too. And of course, once you have the spectrum and the calibration you need, which we will get back to, you can analyze several different properties of the soil from one single spectrum. But with a few caveats.

So what is the near infrared in more detail? Well, the general idea is that absorption at different wavelengths holds information on the chemical composition of the material. In the visible region, the energy coming from the light source into the sample is higher than further up in the spectrum.
And it can cause the electrons to be excited to a higher energy shell or orbital. But at the longer wavelengths in the near infrared, the energy level is lower, and the molecular bonds start to vibrate instead, and that's what causes absorption. The good thing here is that a specific molecular bond requires a certain amount of energy to vibrate, and each wavelength or band in the spectrum holds a certain amount of energy. So that means that a bond absorbing its specific amount of energy will only affect one band in the spectrum, and that's why we can gain information from the spectrum.

Now, in the near infrared we don't have the fundamentals, that is, the bands where there is a one-to-one fit between the energy requirement and the energy in the band. Those are found in the mid infrared. So, for example, if we have a fundamental at about 3000 nanometers in the mid infrared, then we can have an overtone in the near infrared with twice as much energy, which can be shared between two bonds. That would be at about 1500 nanometers. We also have combination bands. For example, you can see that there are three different types of vibration here, and each of those requires a different amount of energy. One of the stretching vibrations can be combined with a bending vibration. The fundamentals for each of those are at different places in the mid infrared, and if you sum that energy, you will find the combination band in the upper range of the near infrared. So two different types of vibration in the same bond can share one quantum of energy. That's how it works, as simple as that.

To look a little more closely into this, we have an example of water. The fundamentals ν1, ν2 and ν3 are the same as in the example before. Those are frequencies, and λ1, λ2 and λ3 are the corresponding wavelengths, which are the inverse.
The first overtone has twice the energy: twice the frequency, or half the wavelength. The combinations are much easier to calculate from frequency: you just sum the frequencies, and then you can take the inverse and get rid of lots of zeros, and you have the shorter wavelength in the near infrared. If we take the three different fundamentals for these three vibrations, you can see the first overtones at half the wavelength, as I said, and we can also combine ν1 with ν2, and ν2 with ν3. When we do that, we get absorption bands, the combination bands, in the upper part of the near infrared.

So what kind of information can be found in the visible and near infrared? Well, I gave you the example of water, and that's no coincidence, because water absorbs very strongly in the near infrared, especially at about 1900 nanometers. What also absorbs are vibrating molecular bonds between a large and a small atom, like carbon to hydrogen, oxygen to hydrogen, and nitrogen to hydrogen. As you can see from that, it's about organic molecules. The near infrared is quite a narrow range, so there are not very many kinds of bonds that actually absorb there. It's a manageable number, but since these are overtones and combinations, it's a bit messy. There are overlaps: the same thing will show up in several places, and different things will show up in the same place. So the spectrum as such is quite confusing. Still, it's used for several different things: in agriculture for forage and grain quality, and that, I think, is where it all started. Karl Norris, Phil Williams and John Shenk, for example, are regarded as the pioneers of near-infrared spectroscopy development during the 60s and 70s.
They all worked with forages and grain quality, not with soil, but still in the agricultural arena. Then there is the food industry, for process control and quality control, the pharmaceutical industry, medicine, a little bit in the petrochemical industry; everywhere you have organic materials, you can be sure to find near-infrared spectroscopy in use somewhere. It's actually used quite little commercially for soil. It's starting to show up in different labs, and it has been there for a couple of years, but not as for forages and grain, where it's the main method used. At least in Sweden and large parts of Europe that's the method used, and in the United States and Canada as well, and probably in many other places I don't know about.

Okay, so let's move over to soil. Compared to those other materials I just mentioned, soil is extremely complex and diverse, although you can say that there are only two main constituents apart from water: the mineral and the organic fraction. But these two fractions are highly diverse and variable, and there can be a mixture of lots of things in the same soil. So no two soils are exactly the same, even when there are only a few meters between them. The structure of the soil will also influence the spectrum. Of course, if you go out into the field and measure directly, you have lots of influence from the structure, but you will have an effect of the structure in the lab as well. Clay soils aggregate in various ways, and you don't necessarily destroy that in the sample preparation, while sand is a single-grain soil, so you won't have that effect. The coarser the structure, the higher the scattering of the reflected light, and if the scattering is high, a larger part of the reflection misses the detector. So we'll have a lower total reflectance, which will look like a higher absorbance.
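To make the last point concrete: the talk does not give a formula, but a common convention in diffuse reflectance work is to express a reflectance R as apparent absorbance, log10(1/R). The sketch below assumes that convention, and the reflectance values are made-up numbers for illustration only.

```python
import math

def apparent_absorbance(reflectance):
    """Convert reflectance fractions (0 < R <= 1) to apparent absorbance log10(1/R).

    Assumption: the log10(1/R) convention; the talk only says that lower
    reflectance "looks like" higher absorbance.
    """
    return [math.log10(1.0 / r) for r in reflectance]

# Hypothetical reflectance at three bands for the same material:
# a smooth surface, and a coarse one where more light misses the detector.
smooth_surface = [0.60, 0.55, 0.50]
coarse_surface = [0.40, 0.35, 0.30]

print(apparent_absorbance(smooth_surface))
print(apparent_absorbance(coarse_surface))
```

The coarse surface gives a higher apparent absorbance at every band even though nothing has changed chemically, which is why scattering effects need to be handled before spectra are interpreted.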
So the main factors influencing soil spectra are the main constituents of soil, clay minerals and organic matter, but also, of course, water and, as I said, the structure. I will tell you a little bit about each of these. There are two or three slides about each.

We start with water, which probably has the greatest effect, if the soil is not dry of course. The main features in the spectrum are at around 1400 nanometers and around 1900 nanometers. In all soils you will find these two peaks, even in dry soil, because you can't dry soil completely. The one at 1400 nanometers is, as we said before, the first overtone of the stretch fundamental at 2870 nanometers, so that's 1435 to be exact. Now, the world isn't perfect, so it may shift a little from the calculated value. The one at 1900 is a combination of the bending and stretching vibrations at 6080 and 2870 nanometers. If you have a calculator, you can add the corresponding frequencies to each other, take the inverse and move the decimal point quite a few steps to the right. That will actually end up at 1950, not 1900, but that's just a technicality. So that's where the main features of water can be found. There are other places as well.

I was hoping to have a... yeah, there it came, there is a delay. This is the absorbance spectrum from one and the same soil. The line at the bottom here is the dry soil, air dry or dried at 105 degrees, also with a flattened surface. The absorbance is comparatively low, or the reflectance high, compared to when we add 12.5% water in the first step. Then comes the dotted line, 20% to 30%, and "wet" is when the soil is saturated; there is simply too much water, not very much too much, but it's a wet surface, not just moist. What we can see here is that the absorbance increases, or the reflectance decreases, when we add water, especially with the first step.
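The calculator exercise described above can be written out as a short sketch. The two fundamental wavelengths, 2870 nm for the stretch and 6080 nm for the bend, are the values quoted in the talk; the function names are made up for illustration, and the factor 1e7 converts between nanometers and wavenumbers in cm^-1.

```python
def nm_to_wavenumber(wavelength_nm):
    """Wavelength in nanometers to wavenumber in cm^-1."""
    return 1e7 / wavelength_nm

def wavenumber_to_nm(wavenumber_cm):
    """Wavenumber in cm^-1 back to wavelength in nanometers."""
    return 1e7 / wavenumber_cm

def first_overtone_nm(fundamental_nm):
    """First overtone: twice the frequency, so half the wavelength."""
    return fundamental_nm / 2.0

def combination_nm(fundamental_a_nm, fundamental_b_nm):
    """Combination band: sum the two frequencies, then invert."""
    total = nm_to_wavenumber(fundamental_a_nm) + nm_to_wavenumber(fundamental_b_nm)
    return wavenumber_to_nm(total)

stretch_nm = 2870.0  # O-H stretch fundamental, as quoted in the talk
bend_nm = 6080.0     # H-O-H bend fundamental, as quoted in the talk

print(first_overtone_nm(stretch_nm))               # 1435.0
print(round(combination_nm(stretch_nm, bend_nm)))  # 1950
```

These are idealized positions. As noted in the talk, observed bands shift a little from the calculated values, which is why the water features are usually quoted as "around 1400" and "around 1900" nanometers.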
We can also see that the water peaks I mentioned increase even more, which is kind of logical. We can also see a difference between the visible part and the near-infrared part of the spectrum. In the near infrared, the absorption just continues to increase. That is because the water itself absorbs; the molecules in the water continue to absorb. And that will continue until there is only water, so to say, when the soil is too deep down to have an effect on the spectrum. In the visible part of the spectrum, on the other hand, the absorption quite soon stops increasing. As soon as all the surfaces are covered with water, nothing more happens. That's because this is not direct absorption by the water itself. It's more like a diffusion, or rather, as I understand it, it has to do with the difference in refractive index. The difference in refractive index between air and soil is high, and that means that the light is reflected at a small angle, and lots of it is reflected back and hits the detector. The difference in refractive index between water and soil is low; they are more similar, or you can say both are kind of dark. Then the angle is wider, or more open, and the light will just penetrate. You can compare that with a window and a mirror. With a window, the difference in refractive index between the air and the glass is small, and you can see right through it. It's the opposite with the mirror. It's kind of the same thing here. That's my picture of it anyway.

Now moving over to clay. The clay fraction is simply a size fraction. That's the definition of clay: mineral particles smaller than two microns.
That could be, and to a large extent is, secondary layered clay minerals like illite, smectite, montmorillonite, kaolinite, vermiculite and so on. It could also be primary minerals like quartz and feldspar, but those primarily turn up in the silt and sand fractions. Some soils are very rich in carbonates, and those can turn up in the clay fraction. In old weathered soils, tropical soils mainly, you find these red or yellow soils with lots of goethite and hematite, for example. Those are the sesquioxides, or metal oxides, which are of course small and will turn up in the clay fraction. So different things can turn up in the clay fraction, and that adds to the complexity of soil, of course.

Just to look at a few different minerals: to the left we have three different minerals that are quite common in soils. Again, in old weathered soils you will find lots of kaolinite. It's what you call a one-to-one clay mineral: one octahedral and one tetrahedral sheet joined together. Those double sheets can be joined quite hard to each other. There you can find adsorbed water on the surfaces, also in dry soil, at 1400. And in kaolinite, for some reason, there is a very characteristic doublet. It's quite hard to see here, but it's there, and it's more visible here. And here, actually, it's not directly water; it's a hydroxyl group joined to the aluminium in the octahedral sheet of the mineral. You will find that in other clay minerals as well. Sometimes that aluminium is substituted by iron or magnesium, and then you will find the absorption higher up, and that's what you can see here, and also in the illite. These are pure minerals, by the way; they are not from soil, they are mined minerals. Also characteristic is that in kaolinite we don't have any water trapped in the interlayer, that is, between the one-to-one sheets.
But you can find lots of that in smectite or montmorillonite, and a little of it in illite. As you may know, montmorillonite and smectite are known for their swelling, and that means that lots of water comes in between the layers. And comparatively lots of that is still trapped in dry soil.

To the right we have soils, and here is one high in gypsum; it's from Iran. Gypsum is a crystal with lots of hydration water, and that hydration water shows up as distinct shoulders on this water peak, and also as this one here at 1700 nanometers, which is quite unique. When you add water to any other soil you will get some absorption here, but not this strong. So this is quite a unique feature of gypsum. Calcium carbonate doesn't show very much; there is little difference from the others, but there is an absorption feature at about 2300, which can be confused with things in illite, for example, but also with organic matter. Then there are goethite and hematite; they are a bit similar. In the visible especially: goethite is the yellow one, and in hematite the absorption is a little bit more to the right. Also, in the very beginning of the near infrared, around 900 nanometers, there is this wide bump, so to say, which peaks just before 900 in goethite and just after 900 in hematite. But I guess in most soils you will find a little bit of both.

I'm sorry about this delay, but I'm starting to worry now.

Maybe you can stop sharing and then just go to the next one.

Yeah, I actually need to, but I can't do anything. I can't stop sharing. Sorry about this. I'm afraid I need to close my... You can still see me moving as well, so it's not the network.

Maybe you want to share on behalf of Professor Stenberg now.

Yeah, I'm working on it.

Otherwise we can share it from our side.

Okay, great. Let's leave that one and continue with organic matter.
These are three different soils. One is more or less organic. One has low soil organic carbon but is high in clay; that's the dotted one at the bottom in the visible. And another one is high in sand and also low in soil organic matter; that's the one in the middle in the visible. We start with the visible. It's actually no coincidence that it looks like this, because, as you know, an organic soil is dark. Its absorbance is high, so its reflectance is low, which makes it dark. The clay is the brightest of them, so it absorbs less and reflects more.

Since this is the visible part, we can actually look at it with our eyes instead of at a strange spectrum. To the left we have two sands. It wasn't easy to find those, but they are 100% sand, no clay at all. In the one at the bottom there is no organic matter at all, and in the other one on top there is a little bit of organic matter, not very much, but still it's obviously much darker. So despite the very low amount of organic matter, you can see an obvious shift in color. And these are two illitic Swedish clay soils, which are quite bright. Clay soils have a much larger surface area, of course. These have 40% clay, but it doesn't really need to be 40%; with 10% you will have a similar effect. Look at the difference between the soil with 0% and the soil with 2% soil organic carbon. I think I won't use this laser pointer, because I think things start to slow down when I open it. You don't really see any difference in color. The one on top is slightly darker, but not very much. That's because with the larger surface area you get a kind of dilution of the organic matter. The layers are much thinner, and you can't see it as easily as you can in a coarse sandy soil. So that's why we have that effect.
Then, if we focus on the near infrared: here the spectrum starts at about 1000 nanometers. There are bands that you can find in the literature; I think you will have one of those papers where this is described a little more. One band that you can find in the literature is 1660. I have never seen that. Maybe it shows up in some other soils, but I've never seen any effect of organic matter there. On the other hand, there is this doublet at 1728 and 1754, which I have found to be very important for measuring the organic matter content. At least the one at 1728 relates to alkyl groups. There are strong absorption features in the mid infrared, the overtones of those are found here, and the combination is found here at 2306. So these are alkyl groups. There are more of those in well-degraded organic matter, because they sit in the short chains and at the ends of the longer chains, and in well-decomposed organic matter the long chains are shorter, so you have more of those end-point groups. Johanna Wetterlind, a colleague of mine, has done some work on that, trying to establish what turns up where, and that work is continuing. It's quite interesting. One reason for these studies is that for Swedish soils, for some reason, our calibrations for organic matter are quite bad on a national scale, much worse than in most other countries or parts of the world. Our calibrations for clay content are much better. So you can't win them all.

So, over to the effect of structure. The two in the middle, the blue and the green, are the sand.
They are the sieved sand and the milled sand, and there isn't really any difference, because sand is a single-grain soil, and the ball mill that we used here doesn't really affect it. It might crush some of the grains, but there is no real effect. The aggregated clay, on the other hand, goes from quite a coarse powder to a very, very fine powder. What we see here is that in the sieved clay the small aggregates, up to two millimeters, are still intact, and we have a lower reflectance because more of the light is scattered. When we put it through the ball mill, we get a very smooth surface, and more of the energy is reflected. So that's a kind of albedo effect, which is not really what we're after. As you can understand, when we compare different types of soil, this albedo effect would be the main thing we see, so we would really like to get rid of it. That can actually be done with different types of scatter correction or pre-transformation of the spectrum, which is usually done. There are many, many different ways to do that, but a good start for soil is the first derivative, because it usually works well. It's often found to be the best of several tested. Then you can see that we accentuate the actual features of the spectrum. This is also only the near infrared.

So there are some drawbacks also. I mentioned the positive things before. I think I did. Yes, I did. The drawback is, as I mentioned, that the spectra are quite messy or complex. You can't read a peak height or a peak area and directly get the content of organic matter or something else from that. You need a reference data set. To use the method on new samples you need to make a model, and you need to make that model from known samples. The number of samples you need depends on the diversity of the soils for which you want to predict or estimate something.
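As a rough illustration of the first-derivative pre-treatment mentioned above, here is a minimal sketch using a plain central finite difference. That choice is an assumption made for brevity: the talk does not say how the derivative is computed, and in practice a smoothing derivative such as Savitzky-Golay is often used instead. The spectra are made-up numbers.

```python
def first_derivative(spectrum, step_nm=1.0):
    """Central-difference first derivative of a spectrum sampled at equal steps.

    Assumption: a plain finite difference, for illustration only.
    The result is two points shorter than the input.
    """
    return [
        (spectrum[i + 1] - spectrum[i - 1]) / (2.0 * step_nm)
        for i in range(1, len(spectrum) - 1)
    ]

# Hypothetical absorbance values for one sample, and the same sample
# with a constant baseline offset, as a stand-in for an albedo difference.
sample = [0.30, 0.32, 0.35, 0.33, 0.31]
same_sample_shifted = [value + 0.20 for value in sample]

print(first_derivative(sample))
print(first_derivative(same_sample_shifted))
```

Both calls give the same derivative up to floating-point rounding, so the constant offset is gone while the shape of the features is kept. A first derivative only removes additive offsets; baseline slopes or multiplicative scatter effects need other corrections.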
Also, and this can be seen as a drawback although it isn't necessarily one, the calibrations tend to be empirical. So they are kind of a black box.

As for the principles of making a calibration, I will not go through all the techniques; there is no time for that. But you need to start off with a set of samples and pick out calibration samples from those. It may be good to save some for validation, or to pick out new ones. That's actually the best, to pick out new ones, so that they are totally independent. But you need a calibration set with spectra and with reference data, from a reference method or standard method for what you want to measure. Then you need to make a calibration with some multivariate technique that can use all the data points in the spectrum; it could be 1000 or 2000 data points. Partial least squares regression, PLS, is probably the most common, or at least used to be. It is a linear multivariate technique, similar to multiple linear regression, but you can't really use that one because there are too many variables. Instead you use PLS or PCR, which are dimension-reduction techniques. With a few latent components or factors, maybe a handful, or 10 or 12, you replace all those 1000 or 2000 spectral bands in your spectrum. Then, over the last 10 or 20 years, and with the larger data sets now available, there has been more and more use of fancier methods, like memory-based learners, which are a kind of PLS, but also non-linear data mining or machine learning techniques, whatever you want to call them: support vector machines, neural networks, regression trees and all that. I have actually compared all these methods, and I found, not very surprisingly, that you won't do miracles with a fancier method. It might be a little bit better, but no miracles. And then you use this model to predict your unknown samples.
For those we don't have a reference. But it's very important that the calibration samples are representative of the samples you later want to estimate. And also that you validate; it's very, very important that you validate, and that you validate on independent data sets. If you read the literature on near-infrared spectroscopy, and probably other techniques, there are very many papers which are a bit doubtful in that respect. There is something called pseudo-replication. For example, if you pick your samples from several clusters, and you then pick out your calibration and validation samples at random, you will have more or less similar samples in both the calibration and the validation data set, and then they are not really independent. So it's important to make sure that your validation data are actually independent in relation to where you want to use your model. You need to think about that, and too often it's not thought about enough. So that's a take-home message. The calibration is a way of relating the Y space to the X space, or the dependent to the independent variables. In this case, for example, it is clay to spectra: clay is the Y space, or the dependent variable, and the spectra are the X space, or the independent variables. In the validation situation you have the answer, and you can relate your measured values to your predicted values or vice versa. But in the prediction scenario you don't have the answer. So you don't actually know how good it is, but the software packages you use often provide some kind of uncertainty measure, which is related to how well the predicted sample relates to the model or is represented by the model. That gives you some indication. There are also lots of different statistics used.
And the most common, I think, is the R squared, which is just how much of the variation is explained by the model. The root mean square error is also very common, RMSE. RMSE is often followed by an additional letter: P for root mean square error of prediction, CV for cross-validation, or just C for calibration. The RMSEP is calculated very much like a standard deviation. But instead of measuring the distance to the mean, it measures the distance to the model, which is the line here, the one-to-one line. RPD is the ratio of performance to deviation, which is usually the standard deviation divided by the error. So that should be a large number. R squared should be as close to one as possible, and RMSE should be as low as possible. I think I'll jump ahead. Now I will give an example, if you are not falling asleep. Most of you are still there, so we continue. It's been popular to build large spectral libraries that can be used as a rational method for predicting new samples over a large, wide area. And many of these calibrations perform well when you validate them at their own scale. This is the Swedish national library. It's about 12,000 agricultural soils. We built a model on about one third of those and validated on the rest, and the validation you can see here is the best we could do. It's actually done with one of those memory-based learners. The memory-based learner is a kind of PLS where you take one sample and compare the spectrum of that sample with all the spectra in the library, and then you pick out, according to some predefined definition, a number of samples that are as similar as possible to the sample you want to predict.
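The validation statistics mentioned above can be sketched as follows. The measured and predicted values are invented, and the definitions are assumptions where conventions differ: R squared is taken here as one minus the residual sum of squares over the total sum of squares (some software reports the squared correlation instead), and RPD uses the sample standard deviation of the measured values.

```python
import math


def validation_stats(measured, predicted):
    """Return R squared, RMSEP and RPD for paired measured/predicted values."""
    n = len(measured)
    mean_measured = sum(measured) / n
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    rmsep = math.sqrt(ss_res / n)        # typical distance to the 1:1 line
    sd = math.sqrt(ss_tot / (n - 1))     # spread of the reference values
    return {"r2": 1.0 - ss_res / ss_tot, "rmsep": rmsep, "rpd": sd / rmsep}


# Hypothetical validation set (for example clay %, measured vs predicted).
measured = [10.0, 20.0, 30.0, 40.0, 50.0]
predicted = [12.0, 18.0, 33.0, 39.0, 48.0]

stats = validation_stats(measured, predicted)
print(stats)
```

Note that RPD depends on the range of the validation set: the same RMSEP gives a lower RPD on a field with little variation than on a national data set.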
So spectrally, you may build a unique calibration for each sample or each group of samples, however you want to do it, and you build it from spectra that are as similar as possible. When you do that, you assume that if the spectra are similar, the soils are also similar. That's the intention, and it often works quite well. It has become a popular method. But now, how well will a large-scale calibration work when, as in my case, I'm interested in the variation at the farm or field, actually the field scale? So samples collected just 50 or 20 meters apart. How well can such a model resolve that kind of variation? It's a completely different scale. And can these large-scale models, built on the large-scale data set, perform better than if we make a unique small calibration on just a few of those farm or field samples, say 10 or 50 or somewhere in between? Which one will perform best? That's what we tested. But before I show you that, I will show you what will actually happen when we apply a large-scale calibration to a small-scale field. If this is a large-scale calibration, and the red dots represent soils in a field or a farm, you can see that the red dots were carefully chosen to be far away from the model. So they are not the best samples in this calibration. But if that's the type of soil we have in the field, then the ranking of the samples in the prediction of these new field samples will be quite good, but there will be a quite strong bias. And that's also what happens in real life, sometimes. So this haxa field, as it's pronounced, to the left: there we have a very small bias. The soil in that field is actually very well represented in our national model. But then the other one, which is close to Brian: you can see that the ranking is good, but the bias is very strong.
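The situation described above, good ranking but strong bias, can be made concrete by splitting the prediction error into a bias and a bias-corrected part, commonly called SEP. This is a minimal sketch with invented organic matter values. It divides by n throughout so that the squared RMSEP equals the squared bias plus the squared SEP exactly; some software divides by n - 1 for SEP instead.

```python
import math


def bias_and_sep(measured, predicted):
    """Split the prediction error into bias and bias-corrected error (SEP)."""
    n = len(measured)
    errors = [p - m for m, p in zip(measured, predicted)]
    bias = sum(errors) / n                                     # systematic offset
    rmsep = math.sqrt(sum(e ** 2 for e in errors) / n)         # total error
    sep = math.sqrt(sum((e - bias) ** 2 for e in errors) / n)  # error after shifting
    return bias, rmsep, sep


# Hypothetical field: predictions rank the samples correctly but sit too high.
measured = [2.0, 2.5, 3.0, 3.5, 4.0]
predicted = [2.5, 2.9, 3.5, 3.9, 4.5]

bias, rmsep, sep = bias_and_sep(measured, predicted)
print(bias, rmsep, sep)
```

In this made-up case almost all of the RMSEP comes from the bias, so shifting the predictions onto the one-to-one line would leave only a small error.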
And the SEP here, that's the RMSEP but corrected for the bias. So if we assume that we just move it up to the one-to-one line, then the SEP is just about point 12, and here it's still around point four or five, as it was from the beginning, so there is a very small difference here. But here the difference is large, and it's actually much lower here, so the prediction here is actually better than that one. It might also be due to the smaller range. So this is something to think about. And that's a problem we will have not only with near-infrared spectroscopy, and not only with organic matter as this is; it will happen with most techniques, at least for soil. So this is a kind of universal phenomenon. Not very strange, actually. So, validation at the field scale. We compared the national-scale global PLS, just the straightforward PLS, and the memory-based learner, which selects the most similar samples, so the samples are selected to suit the field, to get rid of this bias, for example. And the national soil spectral library spiked, where we tried to move the model in the direction of the type of samples we had in the field. And also, without using the national soil spectral library at all, just making a calibration from a few samples for the particular farm. We had 11 farms: four measured with the same instrument as our national library, and seven farms measured with a completely different instrument, a very different instrument actually. So this is an important feature of this project. It was a bit unfortunate, but we got some interesting results from it. The bars here show the variation in soil organic matter and clay content. And as you can see, the variation within each farm can be almost, or at least a large part of, the variation we have for the whole of Sweden, for all of these 12,000 samples. So the variation at the small scale can be quite big. We have two different instruments.
We tried to make the spectra from those instruments similar. There are techniques for that. We used something called piecewise standardization. And you can see that A and B here are the original spectra, which were from the deviating instrument, not the one used for the library. And then we have the FieldSpec down here, which we also used for the library. So that's the kind of spectrum we wanted. The albedo here is just to make it clear. So we transformed them based on 18 standard samples that were measured with both instruments. We tried to make them similar, and by eye they looked very, very similar. We were very, very happy. But then we projected them onto the large data set. And then we can see that the four farms with the same instrument were projected onto the library, while those from the FOSS instrument, despite the transformation, were not. So it didn't really work. It did something, but it wasn't good enough. But we continued anyway. Why give up? So this is the difference between PLS, the small ones here, and the memory-based learners. And as I said, we have problems with organic matter. Clay works very, very well: a very small root mean square error, just 3.5, and the reference method isn't much better than that. But organic matter should be better. So this is how it is. And this is the result. The red one here, for the four farms with the same instrument, is PLS, and it's not very good: a high RMSE and a large deviation between the four farms. Using the memory-based learner improved things quite a lot, but the error was still quite high, about 6% clay. This is an example for clay. But with spiking, with only 10 samples, the calibration improved quite substantially. And it was even better than with just a local calibration. And that's what it seems to be: the large national library makes the modeling a little bit more robust. It was even more interesting when we moved to the seven farms with the deviating instrument and the strange spectra.
PLS was still about the same as with the four farms, with more deviation, but there were also three more farms. And the memory-based learner just failed. It was worse. And that's not so strange, because if you try to pick out samples similar to something that is deviating from the beginning, it's just stupid. So in a way, it's good that it failed. It should fail. But the interesting thing is with spiking, with just 10 samples. As I said, there are rarely any miracles. We can improve different techniques a little bit, but no miracles. But this is at least close to a miracle. It's probably not a miracle, but it's a much bigger effect than I would have expected. So this spiking thing is quite good. And you can probably do this in other ways, maybe just using two or three samples to adjust the baseline, at least in the case of the four farms, where we had the same instrument to begin with. It might be a little bit more complicated here with the seven farms. That was for clay. For organic matter it looks more or less the same; it's the same type of result. And here the results are visualized. With the normal, straightforward PLS, you see the strong bias. We get rid of most of the bias with the memory-based learner. This is one of the four farms. And with spiking, it improves even more. It's actually better than the local calibration with 10 samples from just that farm, without the national library. So those are the results from that. The conclusions are that large systematic errors are to be expected if you apply a large-scale model or large-scale reference data set to a small-scale situation. That can be managed, but with some extra effort. So we actually lose some of the advantages of the quick and easy near-infrared measurements. We need to do some more reference measurements. But probably not very many.
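The idea mentioned above of using just two or three local reference samples to adjust the baseline can be sketched as a simple bias correction. Everything here is hypothetical: `library_predict` stands in for an existing large-scale calibration, each "spectrum" is reduced to a single number for brevity, and the local values are invented. It is not the spiking procedure used in the study, only the simpler alternative the speaker suggests.

```python
def bias_adjust(library_predict, local_spectra, local_reference):
    """Wrap a calibration so its predictions are shifted by the mean local offset."""
    offsets = [
        ref - library_predict(spectrum)
        for spectrum, ref in zip(local_spectra, local_reference)
    ]
    offset = sum(offsets) / len(offsets)

    def adjusted(spectrum):
        return library_predict(spectrum) + offset

    return adjusted


def library_predict(feature):
    """Stand-in for a large-scale calibration (hypothetical coefficients)."""
    return 5.0 + 60.0 * feature


# Three local samples with reference analyses, chosen to cover the field's range.
local_spectra = [0.2, 0.4, 0.6]
local_reference = [25.0, 37.0, 49.0]  # invented clay %, about 8 units above the model

field_predict = bias_adjust(library_predict, local_spectra, local_reference)
print(library_predict(0.5), field_predict(0.5))
```

This only corrects a constant offset. If the slope also differs in the field, or the instrument deviates, a shift like this would not be enough.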
And you can also look at the situation where neighbors, for example, join together, so you can use just 10 samples but for twice the area or something like that. But there are things that need to be further studied there. There are opportunities. So that was all from me. Many thanks, Bo. It was a little bit long, but actually not too much longer than an hour. I think it's a great presentation. It gives us a very comprehensive introduction to soil spectroscopy, and also some future perspectives on what we can do with this technology in real applications. I think we have 10 to 15 minutes left for some discussion. I saw quite a lot of questions. I tried to answer many of them, but there are too many. I think we can only select a few of them to answer live. Unfortunately, we cannot answer all of them, but if there is a question we couldn't address live, please feel free to write an email to Bo. We shared his webpage previously, and his email is there, so he can probably answer in more detail. In the very beginning, I saw a question that is quite interesting, I guess. He's your colleague or former colleague or friend, Thomas. He asked: are you aware of a proficiency test aimed at spectral analysis? Sorry, the... Proficiency test? No, I don't think so. Okay. Maybe I am, but not under that name. No. Is it a validation technique or what? I guess he's referring to the variation of the spectral data between labs or instruments. Okay. Yeah. That is actually an obstacle in general for near-infrared spectroscopy. Even if you have the same brand or the same model of instrument, they differ. So model transfer between instruments is a big issue in near-infrared spectroscopy. That's actually another drawback that I didn't mention. But lots of work is going on to manage that. And there are some, for example, FOSS Tecator.
They have networks of instruments where they try to handle that within the network. And they charge for that as well. I saw a few questions. One comes from, I guess, a region where there is salt-affected soil. They ask whether, with high salt content, the spectra will be affected or not. High salt. The salinity. Yeah. We don't have much of that, so my experience is limited. It's usually said that, I mean, pure water has a special type of spectrum. And to answer one of the questions that pops up all the time: can we predict nitrate and phosphate or other salts? We can't really, because they don't absorb. It's quite simple. The salts don't absorb in the near infrared. But if you put salt into water, the spectrum of the water will change. It's not the salt that absorbs, but the salt affects the water. So the matrix affects the bond, so that it absorbs in a slightly different way. And I've actually said that several times. For example, with gypsum, the hydration of the gypsum makes a new kind of feature pop up that we can't see in other minerals. The water is bound to the gypsum in a different way compared to other minerals. So we see other features, and different minerals absorb differently. And it's often water in connection with the mineral, the clay mineral, that gives the spectrum its features. I didn't say that before, but we often distinguish between primary and secondary calibrations or absorbers. Organic matter, of course, absorbs directly, and water absorbs directly. But when it comes to clay, I kind of tend to say it's a primary absorber, but it really isn't, because if you are interested in the clay content, it's built up, as I said, of many different things. And it's the mineral that absorbs. But that relates so strongly, and in a quite stable way, to the content. So it's almost like it's a primary absorber, but actually it isn't. Thank you.
Another interesting question, from what I could see so far, is that most of the colleagues in the audience are interested in how this technology can help with fertilizer recommendations. I don't know if you have any experience, or whether we can hear something from you about the perspective. Yeah, that's kind of how we want to use it, but it's not direct, because, as I just said, the plant-available nutrients generally don't absorb. What you can usually do is that, in mechanistic models, you try to model the roots and the cycling and so on. It's not that easy. We try to use that in precision agriculture, and the models are simply not good enough. But I mean, there is a relationship with the soil type, especially if you don't just look at the surface. And that's a problem we have with almost everything, that we just look at the surface. But the soil profile is important. So, from a more general point of view, the soil type is relevant for fertilization too, but we can't really measure the nutrients directly. We get a better view of the soil. And how much nutrient is needed also depends on how much water you have. I mean, there are other things that you need to consider. Even though the crop could use more nitrogen, for example, and you add that, if it's then dry and there is no irrigation, it doesn't help. So, I mean, agriculture is complex. But you can't read the nutrients directly, the plant-available nutrients. I'm not entirely sure about phosphorus, because there are some papers that might indicate that, in some circumstances at least, there might be more or less direct information about plant-available phosphorus. The problem with plant-available phosphorus is that it's measured in so many different ways. And also it's moving around. Yeah. It's not so stable. No. No, I think maybe it's more stable than nitrogen, but yeah. Yeah.
I think fertilizer recommendations also need to take more soil properties and other factors into consideration, not just the measurement of NPK. That's also my understanding. And I think we can make the last question an easy one. Since this is part of our training program: one question was asking what specifically spiking means. I think for future applications it's better to let people understand it. I've done some work with Raphael Viscarra Rossel, and we debated the word spiking quite a lot, but after a while he was satisfied with spiking. I'm not sure it's the best word. He thought spiking is something you do with drinks. And that might be true. And he should be better at English than I am. But what we mean by spiking is that we extend the calibration data with a small number of local samples, to move the model in the direction of that type of soil. So if we add 10 samples, we don't just add 10 samples, because if the spectral library is very large, we multiply them, perhaps 50 or 100 times or so, to make their influence larger. But you can probably do that in another way, just using the spectral library. If it's just a bias, you can use the calibration from the spectral library as it is and then just move it using two or three samples, if you pick the right two or three samples that cover the range. Then you can move the baseline, so to say, to the center of the model. So maybe you can do it with far fewer than 10 samples. I guess there is a limit, because you will never know exactly which are the best samples. You will never have the right answer, because if you had that, there would be no use for the method. Yeah, I think spiking will work because it brings in some local information. It's about moving the model in the right direction, just giving it a little push in the right direction.
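Spiking as explained above, adding a few local samples to the library and replicating them so that they carry more weight, can be sketched like this. To keep it short, a straight-line fit on one invented spectral feature stands in for the real multivariate calibration, and all numbers (library size, offset, number of copies) are assumptions for illustration only.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict(model, x):
    intercept, slope = model
    return intercept + slope * x


# Hypothetical library of 80 samples following one relationship...
library_x = [i / 100 for i in range(10, 90)]
library_y = [5.0 + 60.0 * x for x in library_x]
# ...and three local field samples that sit 8 units above it.
local_x = [0.3, 0.4, 0.5]
local_y = [13.0 + 60.0 * x for x in local_x]

# A new field sample with a known (invented) true value, used only for checking.
new_x, new_true = 0.45, 13.0 + 60.0 * 0.45


def spiked_model(copies):
    """Calibrate on the library plus `copies` replicates of each local sample."""
    xs = library_x + local_x * copies
    ys = library_y + local_y * copies
    return fit_line(xs, ys)


errors = {
    copies: abs(predict(spiked_model(copies), new_x) - new_true)
    for copies in (0, 1, 50)
}
print(errors)
```

With no spiking the new sample is off by the full local offset, a single copy of the three local samples barely moves the model, and replicating them 50 times pulls the prediction much closer, which is the reason for multiplying the local samples.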
I think we are more or less at the end of our webinar, and for sure we cannot answer all of the questions. I can see there are still 50 questions waiting for an answer. Otherwise, we would finish at six o'clock today. So, first of all, I apologize to the participants that we cannot answer all of your questions. I could answer one more, though, something I didn't mention because I mainly talked about clay and organic matter. That's because we know that there are strong and real relationships to the spectrum. But take pH, for example: it's often claimed that you can't predict pH. Actually, if we do that with our soil spectral library, the Swedish one with 12,000 soils, we get a pretty good relationship between the spectrum and the pH. Not really a good prediction model or calibration, but a relationship. But protons are not absorbing. We know that. So what probably happens is that we measure the buffering capacity. Clay and organic matter, the main constituents and the main things affecting the spectrum, influence the pH. They have a large influence on the pH through the buffering capacity. So it works, in a way. But agriculture changes that relationship. If you lime, you just destroy that relationship, and then it won't work. So it might work, but it's a bit risky. If you know what you're doing, it's okay to use that kind of relationship, secondary relationships. Yes. But you need to know what you're doing, I think. During my PhD thesis, in part of a chapter, I was trying to link the spectral information, directly or indirectly, to the liming application. Trying to get that. That's fine. But anyway, I think we really are at the end. And sorry, again, that we couldn't answer all of your questions, but you are very welcome to write an email to me or to Bo Stenberg, and then we will address your question by email.
We will be happy to make a connection with you after this. After this seminar, you will still hear from us: you will shortly receive an email with the link to the recording and the presentation, because some of the colleagues couldn't join this seminar due to the different time zones. So we will record and edit it. Finally, we will upload it to our webpage and inform you. And a very big thank you to our presenter today and to all of the participants. I think we have reached around 600 participants, maybe even a bit more, joining the first webinar on soil spectroscopy. My colleague is now posting the link to the other five webinars scheduled on soil spectroscopy. You are all invited to register and join. Remember to check this page regularly, as another series of webinars on wet chemistry, health and safety, equipment purchasing, quality assurance and quality control, and laboratory management will be organized in the next few months. And thank you all once again. I wish you all a pleasant end of the day or evening. Thank you. Thank you. Thank you. Thank you. Bye.