Hello everyone, and thank you for inviting me to present to the National Academies about this work on hyperspectral virus detection in vineyards. I'm Luca Brillante, and I'm a professor in the Department of Viticulture and Enology at Fresno State. The title of this talk is Empowering Autonomous Virus Detection in Vineyards: Hyperspectral Vision Systems Bridging Science and Industrial Application. This talk will review the work of the HyperVid project. We'll go into the basics of hyperspectral imaging, then present the application to virus detection in vineyards, and I will summarize the main performance results. I will conclude with a recap and perspectives. The HyperVid project started operationally in 2020 and is currently ongoing. It's an acronym that stands for Hyperspectral Virus Detection in the Vineyard. As cooperators from the industrial standpoint, we have Bronco Wine Company and a winery estate as of now, and we are currently expanding to other partners. As universities, the partners involved are Fresno State, which leads the project, in association with Cornell University, particularly Dr. Marc Fuchs, and the University of California Agriculture and Natural Resources with Dr. Monica Cooper. The funding came from the CDFA, the USDA, the California State University Agricultural Research Institute (CSU-ARI), and F3, the Fresno-Merced Future of Food Innovation Initiative. To summarize the basics of hyperspectral imaging, I will start by describing what remote sensing is. It is simply the measurement of some property of an object by a recording device that is not in physical contact with the object. So if we take the electromagnetic radiation of the sun that strikes a leaf of a grapevine, in our case, that will be the amount of light incident on the leaves. A portion of that energy will be absorbed and used for photosynthesis.
A portion of that will be reflected, and a portion will be transmitted, or passed through the leaf. Of course, we are currently working with the reflected portion, and we are working with passive devices; that means we use the sun as the source of energy. Why are we interested? Because we know that diseases can affect the biochemical and biophysical properties of the plants, changing their optical signatures. And so a healthy leaf will have a different reflectance with respect to an infected leaf. These differences can actually be recorded in different regions of the electromagnetic spectrum through a camera sensor, which is what we have been using and what our spectral cameras are. Now, what in particular are hyperspectral cameras? If we consider any digital device for taking pictures, such as what we have on our phones, for example, we normally acquire information in only three specific bands, which are the red, green, and blue, or RGB, color codes. That roughly corresponds to 16 million unique combinations of colors. So by combining just these three bands at different intensities, from 0 to 255, therefore in 8 bits, we can have a complete representation of all the nuances that we can see through our eyes. When we work with multispectral data, we expand the number of bands: not only three color codings but a larger amount, which is still a limited amount. Also, the chunks of light, if you want, that are taken into consideration are fairly large. The difference is that we also start acquiring information in regions of the electromagnetic spectrum that we cannot normally observe with our eyes, such as the near infrared, for example. However, the reconstruction of the electromagnetic radiation remains discrete, and it is somewhat incomplete.
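The band-count arithmetic above can be sketched in a few lines; all the numbers here come from the talk (3 bands, 0–255 intensities), only the code itself is illustrative.

```python
# Sketch of the RGB band-count arithmetic from the talk:
# 3 bands at 8 bits each give the ~16 million color combinations mentioned.
rgb_bands = 3
bit_depth = 8
levels = 2 ** bit_depth            # 256 intensity levels per band (0-255)
rgb_colors = levels ** rgb_bands   # total unique RGB combinations

print(rgb_colors)  # 16777216, i.e. roughly 16 million
```

By contrast, a hyperspectral sensor trades a few wide bands for hundreds of narrow ones, which is why it can rebuild the spectrum quasi-continuously.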
With hyperspectral data, the electromagnetic radiation is instead rebuilt in a quasi-continuous way, because we acquire a very large number of bands, hundreds of very narrow chunks of light, and so at that point we can actually reconstruct the electromagnetic radiation in a quasi-continuous way. Now, if we look at the reflectance of a grapevine leaf, on the x-axis we have the different wavelengths, from 400 to 2500 nanometers, and on the y-axis we have the reflectance. In the visible region, which is actually the focus of this study together with the near infrared, we have the information related to chlorophyll functioning, particularly to leaf pigments, while in the near infrared we have information on the cell structure. In the shortwave infrared, we have information regarding water content, but also regarding leaf biochemicals such as cellulose, sugar, starch, protein, lignin, and so on, particularly C, O, and H combinations. At the beginning of the HyperVid project we worked in the visible and near infrared, roughly from 400 to 900 nanometers, while now we are working in the shortwave infrared domain as well. The HyperVid project proceeded in multiple steps. We worked first in controlled conditions in the laboratory: we have a dark cabinet with specific illumination, and we work with detached leaves. In this part we mostly worked with visible information. Then we brought the cameras into the field and mounted them on a tripod, so we were working with still cameras, and we acquired images of the full canopies. And then we worked with aerial imagery from a UAV. The first part of the work is complete and was published last year. The second part is about to come out in Computers and Electronics in Agriculture. And the third part will be coming out hopefully toward the end of this year. All three components of this work use the same structure.
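The spectral regions just described can be summarized as a small lookup; the region boundaries below are approximate textbook conventions, not values from the project.

```python
# Illustrative helper mapping a wavelength (nm) to the spectral regions
# discussed in the talk. Boundaries are approximate conventions (assumptions),
# not values measured in the HyperVid project.
def spectral_region(wavelength_nm: float) -> str:
    if 400 <= wavelength_nm < 700:
        return "visible"             # leaf pigments, chlorophyll functioning
    if 700 <= wavelength_nm < 1300:
        return "near infrared"       # cell structure
    if 1300 <= wavelength_nm <= 2500:
        return "shortwave infrared"  # water content, leaf biochemicals
    return "outside sensor range"

print(spectral_region(550))   # visible (green)
print(spectral_region(2000))  # shortwave infrared
```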
So we acquire data in the field. For each single vine that we acquire images from, we also sample leaves for the analysis of the virus through molecular diagnosis, or PCR. And this is a very strong difference with respect to similar works on the topic: we are not building a system based on visual observation, but a system validated against molecular diagnosis. We are also comparing the performance of the machine, or the sensing system, with the performance of an expert assessment. So the leaves go to the laboratory for PCR analysis, as I said, and the images go to the lab for further processing through computational systems. In the controlled condition, we have a dataset of roughly 500 samples, PCR-tested and imaged in the dark cabinet. In the tripod conditions, we have roughly 700 samples, and on the drone, roughly 300. The machine learning model is then built on the spectral imagery in order to predict the results of the PCRs, and then compared to the classifications, both on the images and in the field, from a couple of experts. In the dark-cabinet conditions, we also attempted to differentiate between multiple viruses. So in this dataset we had non-infected, red blotch, leafroll, and co-infection of both leafroll and red blotch. And it is very hard: the images here are some of the ones showing the strongest symptoms, but otherwise symptoms tend to be very mild. We also sampled across multiple dates, therefore the symptoms are at variable levels of expression. The expert accuracy with this dataset is roughly 50%, while a deep learning system based on convolutional neural networks actually achieves 87% when the symptoms are visible and 86% when the symptoms are not visible.
In the distinction of red blotch from leafroll, most of the samples are predicted as leafroll, partly because leafroll was more abundant in this dataset with respect to red blotch, but also because red blotch tends to have milder symptoms and so is actually harder to predict. The co-infection has the lowest accuracy. If we look instead at the spectra in the visible and near infrared, from 500 to 900 nanometers, the green line is the average reflectance of the non-infected samples, the red line is the average of the infected samples, and the ribbons are the standard deviations. You can see that the samples, and these are data from the tripod, are very similar, very close to one another. There is not a very strong difference, so there are nuances that need to be analyzed. In particular, in the green domain we have a large amount of reflected light from the non-infected samples, while the infected samples reflect more in the red domain and in the red edge. This is consistent with the visual symptoms, where we normally see the canopy progressively becoming more red. Even in this case we acquired data over multiple dates, and so at different levels of expression. We have the ability to differentiate, when no symptoms are visible, with 75% accuracy through a PLS-DA model that only uses 5 bands of 16-nanometer width, while the expert on the RGB images only achieves 54% accuracy, which is normal because at this point we don't see the symptoms yet. When we consider the symptoms after veraison, on multiple dates, not only at harvest when the symptoms are most expressed, we have an accuracy of 76.6% with an SVM model that uses 18 bands, also of 16-nanometer width.
And we have compared, I won't go into the details here, but besides comparing multiple machine learning models, we also compared multiple dimensionality reduction algorithms, different ways to reduce the noise and boost the signal, and the 16-nanometer bands actually seem to be the ones that, across our multiple trials, have given the best performance. The expert with RGB images is at 78.6%, and the expert in the field is at 75.3%. So as you can see, the machine learning models of course outperform the expert before the expression of the symptoms, and are comparable to the expert performance after symptom expression. We also realized, as we were acquiring data at multiple times, from before symptom expression, through symptomatic expression in multiple steps, until harvest, that when the symptoms are not visible we actually have a larger difference in the near infrared, while when the symptoms are visible the difference in the near infrared attenuates and we have a larger difference in the red area and the red edge. This is particularly important to understand if you want to develop systems that can be used over a longer, extended period of time, and that are therefore more operationally capable in production: we probably need to focus on different areas of the spectrum depending on the time of the season. We attempted a multiclass classification in this case, where we had three classes: instead of only infected and not infected, we had non-infected, visibly infected, and non-visibly infected. The last are the ones that the expert classifies as not infected from a visual standpoint, but that come out positive in the PCR results.
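The 16-nanometer band reduction mentioned above amounts to averaging a dense spectrum into contiguous 16 nm-wide bins; this is a toy sketch where the wavelength grid and reflectance values are invented, and only the binning idea comes from the talk.

```python
# Toy sketch of 16 nm band binning: average a densely sampled spectrum
# into contiguous 16 nm-wide bands. All values here are invented.
import numpy as np

wavelengths = np.arange(400, 912, 2.0)     # assumed 2 nm sampling, 400-910 nm
reflectance = np.random.default_rng(1).uniform(0.1, 0.6, wavelengths.size)

bin_width_nm = 16
samples_per_bin = int(bin_width_nm / 2.0)  # 8 samples per 16 nm bin
n_bins = wavelengths.size // samples_per_bin

usable = n_bins * samples_per_bin
binned = reflectance[:usable].reshape(n_bins, samples_per_bin).mean(axis=1)
bin_centers = wavelengths[:usable].reshape(n_bins, samples_per_bin).mean(axis=1)

print(n_bins, binned.shape)  # 32 (32,)
```

Binning like this both denoises (averaging) and reduces dimensionality, which is why a small set of wider bands can feed a simpler, more operational sensor.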
And so overall here the accuracy is roughly 71%. We have higher accuracy in the classification of the visibly and non-visibly infected categories because of the ability to work on different regions of the spectrum, that is, giving the machine learning model the ability to differentiate these two classes using different regions of the electromagnetic radiation. Now, finally, the results from the drone. We had roughly 264 vines sampled: 122 were non-infected and 142 were infected with red blotch. These are spectral images from a drone with a pixel resolution of roughly 7 mm, and a very balanced dataset. In this case we acquired a limited number of bands, only 40; this is a limitation of the hyperspectral camera that we used for this purpose. We then clean up the spectra, interpolate over a larger number of wavelengths, apply a smoothing filter, and compare different machine learning algorithms. In this case the imaging was performed only one time, at harvest, the time of maximum symptomatic expression. And here we go back to the maximum performance that we observed in controlled conditions, because we do not confuse the model with milder symptomatic expression; we only go there one time, when the leaves are red and the symptoms are the most expressed. If we compare the accuracy with the expert, the expert is at roughly 83.6% accuracy on this dataset, and so overall we have comparable or superior accuracy. But if you look at the confusion matrices, on the left we have the expert performance and on the right the machine learning performance. We have a slightly higher false positive rate with the model, roughly 10%, with respect to the expert at 8%, but a much lower false negative rate with the model, 17%, with respect to the expert at 24%, which makes the overall accuracy of the system higher with respect to the expert assessment. If you look at the importance of the different bands, we see that the green, the red, the red edge and the
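The clean-interpolate-smooth step for the 40-band drone spectra can be sketched as follows; the filter choice (Savitzky-Golay), the wavelength range, and all the numbers are assumptions for illustration, not the project's exact pipeline.

```python
# Rough sketch of the drone preprocessing step: interpolate a 40-band
# spectrum onto a denser grid, then smooth it. Savitzky-Golay and all
# numeric choices are assumptions, not the project's published pipeline.
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(2)
bands = np.linspace(450, 900, 40)          # 40 camera bands (assumed range)
spectrum = 0.3 + 0.1 * np.sin(bands / 80.0) + rng.normal(0, 0.01, bands.size)

dense_grid = np.arange(450, 901, 5.0)      # denser 5 nm grid
interpolated = np.interp(dense_grid, bands, spectrum)
smoothed = savgol_filter(interpolated, window_length=11, polyorder=2)

print(dense_grid.size, smoothed.size)      # 91 91
```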
near infrared are the most important for accuracy, which also opens up the possibility of very simplified sensors, if we only use a few bands and if we go at the right time during the season. We also applied an inversion of a radiative transfer model, the PROSPECT model, in order to estimate what the hyperspectral images really see. We obtain an estimation of the chlorophyll concentration, and the non-infected vines have a higher amount of chlorophyll with respect to the infected ones, as well as a higher amount of carotenoids, so overall the photosynthetic apparatus is in better shape with respect to the infected vines. And of course, we have a higher amount of anthocyanins, the red-colored compounds, in the infected vines with respect to the non-infected ones. Now, if I summarize the different performances: in all conditions, the models always have higher or similar accuracy with respect to the expert assessment. When we compare hyperspectral sensing and expert visual assessment against molecular testing, the machine learning model reaches an accuracy of 87% in controlled conditions, and similar accuracies from the drone at harvest, the time of highest symptomatic expression. It is harder to keep such high accuracy when we focus, in the vineyard, on times of variable expression of the symptoms, as when we expand the sensing campaign from before veraison, or the onset of the season, all the way to harvest. And it is important that we try to simplify the hyperspectral sensing, in order to build an operational tool, based on a limited number of bands, that growers can directly use. Upcoming publications coming out this year will further expand on the results that I presented. As for what we are working on now: thanks to the latest grant from the California Department of Food and Agriculture Pierce's Disease and Glassy-winged Sharpshooter Board, we are working on developing systems for presymptomatic vines and white varieties.
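The drone comparison can be checked with simple arithmetic: recombining the per-class error rates and class counts quoted in the talk (122 non-infected, 142 infected) into overall accuracies. Only the recombination is mine; the rates come from the talk.

```python
# Worked check of the drone confusion-matrix numbers from the talk:
# 122 non-infected, 142 infected; false positive rate on the healthy class,
# false negative rate on the infected class.
n_healthy, n_infected = 122, 142
total = n_healthy + n_infected

def overall_accuracy(fp_rate: float, fn_rate: float) -> float:
    errors = fp_rate * n_healthy + fn_rate * n_infected
    return (total - errors) / total

model_acc = overall_accuracy(fp_rate=0.10, fn_rate=0.17)   # ~0.86
expert_acc = overall_accuracy(fp_rate=0.08, fn_rate=0.24)  # ~0.83
print(f"model: {model_acc:.3f}, expert: {expert_acc:.3f}")
```

The model's much lower false negative rate outweighs its slightly higher false positive rate, which is exactly why its overall accuracy comes out above the expert's.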
White varieties in particular are an important target for us at the moment, because there has been no work at all on these varieties, and it is very hard to assess the infection visually. We are also working with a full-range handheld spectroradiometer in order to understand which region of the electromagnetic radiation is actually the most sensitive to the virus infection, and we are combining that with the chemical analysis of leaf metabolites, also to try to understand the fundamentals of the spectral signature. And we are trying to go more and more toward an operational tool, so we are expanding our capability to varieties that we have not currently worked on, and to new vineyards, in order to see what possible confounding factors need to be accounted for, to allow the growers to use this kind of sensing for their assessment needs. Thank you to all the people that have worked over the years on this project, in particular Torla Roche Pinel, who is leading the work in the lab, but also all the students that have collected, analyzed, or managed the data and information, and all the cooperators over the years. If you want to know more about what we are doing, you can connect through this QR code. Thank you for your attention; I look forward to your questions.