Good afternoon, everybody. It is 4 p.m., so of course we would like to start right away. I would like to welcome everybody to this interesting webinar hosted by KU Leuven and Agroknow in the context of the BigDataGrapes project. My name is Robin Krone, I will be moderating this webinar, and I will give a bit more information before we start with today's interesting speakers. As you can see on the title slide shared today, this webinar will be about predictive analytics for food risk prevention, with a special focus on how to design predictions that are both visually appealing and interactive. Before we start the webinar, I would like to give you some hints on how to get the most out of this online session. First, feel free to take notes. At the end of the talks, we will have approximately 30 minutes for Q&A. We would therefore like to ask you to use the live chat function here in Zoom to ask your questions. We, as the organizers, will bundle these questions and organize the Q&A afterwards. Finally, I would also like to ask all members of the audience to please turn off your camera and microphone. Then I'm happy to introduce the five interesting speakers we invited for you today. First, we have Professor Katrien Verbert, who is an associate professor at KU Leuven and also the head of our research group. Second, we have Dr. Nikos Manouselis, who is the CEO and co-founder of Agroknow. Then we have Dr. Giannis Stoitsis, who is the CTO and also a partner at Agroknow. Giannis is followed by Dr. Nyi Nyi Htun, who is a postdoc in our research group. And then finally, we also have Mihalis Papakonstantinou, the data services team leader at Agroknow. I will ask every speaker to introduce themselves in approximately one minute at the start of their talk. Before we finally jump into the talks, I would like to briefly go over the agenda with you. First, Nikos and Katrien will go over the challenges, from both a business and an academic perspective.
Then Giannis will guide us through how to translate these business questions into a risk dashboard. Nyi Nyi will go over the design guidelines for making such a dashboard visually appealing. And then finally, Mihalis will show us the FOODAKAI food risk prediction dashboard that was developed in this project. We estimate that these talks will take approximately one hour in total, and then we have 30 minutes left for the Q&A at the end. So don't forget to post questions in the chat as soon as you think of them. With that, I would now like to ask Dr. Manouselis to please introduce himself and talk about the questions from the food industry about predictive analytics. Thank you, Robin. Thank you everyone for joining us. I am Nikos, the CEO of Agroknow, and I'm very pleased and happy that we are organizing this together, Agroknow together with KU Leuven. Our working relationship with KU Leuven goes back to the years before Agroknow. Back in my PhD student times, we started working with the team, which Katrien was a member of back then, on topics that had to do with how we collect data and metadata, data about data, from a variety of sources in different formats, and how we can combine them to build a way to discover useful information. That was, I think, many years ago; I would not say how many exactly. And part of the legacy that got our company where it is right now comes from research work that the team at KU Leuven has been doing. So I'm very thankful as well for having the opportunity to present this together with you guys. My intro will be about the business challenges and the business interest that the food industry sees in this technology, AI, and in particular the prediction part: how can we use algorithms to calculate meaningful predictions?
So if we go to what the industry says, literally says: one of the colleagues that is leading the predictive modeling and analytics project at one of the largest food manufacturers in the world said, okay, there are so many solutions out there, or so-called solutions. Everyone that is working in a company like yours comes and says, we have the greatest solution of all time. But I think that in most cases, these are solutions to a problem that is not defined yet. So that's why I find it very important to focus on the definition of the problem. And as part of this exploration, this journey to define very well the problem and the questions that we have to respond to, we set up an interest group that brings together people from the food supply chain, people working at food manufacturers and retailers in different companies, that are either very much interested in using or would be interested in using these kinds of technologies. This is a collaboration that includes more teams, like the team of Professor Chris Elliott at Queen's University Belfast, people from the food testing field, and data modeling and analytics people. And we already have 25 companies represented in this interest group. Before joining, we asked three questions. The first was: what is your experience with predictive analytics? The second was: how much money would you pay for such a solution? And the third one I will save for another webinar. What did they say? On experience with predictive analytics, the majority said: I know the technology, I've heard a lot about it or I've read a lot about it, but I haven't tried it yet. So it's a buzzword that I hear a lot, but I don't know what it looks like or what it is for. And how much money would you pay? Most of them said: not more than 20,000 per year, in the range of 10,000 to 20,000.
This says a little bit about the starting point when we start talking about this technology to our audience, to our users, as well as the potential investment that they are willing to make. And we see a discrepancy there: the challenge is the connection between the value that we can provide and what will justify this investment, and maybe justify a higher investment in such a technology. I leave it here on the table for everyone to consider. Okay, thank you very much, Nikos. Katrien, could you also please introduce yourself a bit? Then we're happy to listen to your talk as well. Thank you. Okay, thank you very much, Robin. Just to briefly introduce myself in a bit more detail: I'm an associate professor at KU Leuven in the computer science department, where we work in the human-computer interaction group. So most of our research is focused on human-computer interaction issues. We focus a lot on, for instance, explainable AI and visual analytics, where we research how we can increase the adoption of predictive analytics and machine learning models for end users. And not only expert users, but also non-expert users, so users with little or no knowledge of machine learning, which of course poses additional UX challenges. Can you go to the next slide, Robin? A first key challenge that we try to address is uncertainty. Uncertainty is present in almost all data, and of course this uncertainty also propagates when data is transformed. Then, when data is used in models, additional uncertainty is added on top of it. So part of the research that we do is to visualize that uncertainty, uncertainty in the data but also uncertainty in models, to support awareness of this uncertainty for end users. That's a big research line in our group. Secondly, we also research how we can enable end users to interact with models.
The key goal is that users, first of all, can understand the rationale of models; this has also been shown to play a key role in trust building when users then use machine learning models. Secondly, we also research how we can use user interaction to incorporate feedback from end users. What we sometimes see is that machine learning models don't have very high accuracy, and that by incorporating the domain knowledge of end users, we can increase the accuracy of a model. Then, Robin, can you go to the next slide? Model explanations are of course crucial here. Users need to understand the behavior of a model if they want to give feedback. But of course, very recent models, such as deep learning networks, are often very complex, often way too complex to explain to end users, and definitely to non-expert users. So most of the work that we do here uses model-agnostic techniques: for instance, approximations, simpler models that approximate the more complex model, which we then try to explain to the end user. Other approaches are, for instance, techniques like example-based methods; maybe we'll show some examples later in this presentation. There is LIME, and you also have Shapley values, which try, for instance, to give insight into which features are important in a model and in which particular value ranges. Now, a wide range of explanation methods has been proposed in the literature, but there is actually very little work on evaluating with end users whether these actually improve model understanding, whether they can interpret an actual model. So a lot of our work is focused on evaluating with end users in real applications whether end users indeed understand models and whether this increases their trust. Now, what we see in these user studies is that there are a lot of differences. Users are very, very different.
Depending on the technical knowledge that they have, and also on their domain expertise, the effectiveness of user interfaces on top of predictive models is very, very different. So in most of the research projects that we do, we employ user-centered design methodologies, where we involve end users in the design of the user interface: for instance, starting with focus groups and co-design sessions, then elaborating low-fidelity prototypes, and then evaluating these with end users, so that in the end we come up with designs that are tailored to the needs of the actual end users of the applications. And with that, I would like to wrap up my short introduction on the challenges; Nyi Nyi, as I mentioned, has chosen some very concrete examples of the work that we've been doing, a bit later in this presentation. Thank you. Okay, thank you very much, Katrien. Indeed, before we go to Nyi Nyi, I would first like to give the floor to Giannis, who will also introduce himself first and talk about how to translate all this business side towards this risk dashboard. So, Giannis, the floor is yours. Thank you very much, Robin. Thank you very much for inviting me to this presentation. My name is Giannis Stoitsis. I am the CTO at Agroknow, and I am also a partner in Agroknow. My passion is working with data in the food industry and providing solutions to the critical business questions that the experts in the food industry have. And this is what I will be talking about today, following the very interesting intro by Nikos, and also the challenges that Katrien mentioned, which we face when we are developing a risk dashboard. So, I will speak about how we can move from a business question to a live risk dashboard that can help the experts working in the food industry to take critical decisions. Next slide, please. So, I think that we will all agree that everything starts with a business question.
When it comes to artificial intelligence, there is no single answer to the question of which is the best strategy or which are the best AI tools that you can use to answer a business question. This is because applying artificial intelligence and developing prediction models is not just about using some tools that are available; it involves trade-offs, such as more speed versus less accuracy, more data versus less privacy, more autonomy versus less control. So, there are many critical trade-offs that you need to study. Our method to address the critical business question consists of something that we call the food safety intelligence equation. This equation has one part that has to do with the data that we need to use, a part that has to do with the methods that we can use, and of course, the predicted indicators, the prediction outcome. So, let's start first of all from the business question. It's very important to sit together with experts and understand the problem that they have, understand very well what exactly the business question is that they need to answer, which are the limitations in their current workflows and in the processes that they are using, and what the nature is of the problem to be solved. Then, going to the data part, it's very important to answer questions about what data we need to use in order to answer these critical questions, and how we can collect this kind of data; usually the data collection and processing, if the data are scattered and heterogeneous, costs a lot in terms of effort. So, this is something that you need to take into consideration when you are designing such a process, when you want to answer such a critical business question.
The other step is the prediction method: which are the right artificial intelligence methods that you will use to answer the critical question and to estimate the predictions, and also which metrics you can use, and what kind of metrics you need, in order to assess the accuracy or the recall of such prediction methods. Next slide, please. So, in our case, I will go to a very specific question that we hear from the food safety experts that we are working with, which is: how could I predict which food categories will have more incidents within the next few months, within the following months? In this case, since we are dealing with incidents for specific product categories, we need to collect all the data for those incidents, like border rejections and food recall reports that are announced for the specific product categories. So, I need to collect data from all around the world, from all the national authorities that are announcing recalls and border rejections for product categories and for specific ingredients. And this is one part, a very serious part, of data collection, data processing, translation, and also enrichment. Then you need to select the best prediction method that can be used to predict the number of incidents. Since we want to predict the number of incidents, here we are dealing with a time-series prediction problem, because we have incidents at different points in time, historical incidents like food recalls and border rejections. So, we need a very good method that can forecast the incidents, and the outcome of selecting a very good data set, a very good prediction method, and also very good metrics to assess the accuracy of the prediction methods will be the number of predicted food safety incidents for the following months. Next slide, please.
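To make the time-series framing concrete, here is a minimal Python sketch of forecasting monthly incident counts. The seasonal-naive-plus-trend model and the example below are illustrative assumptions, not Agroknow's actual method; a real pipeline would compare several forecasting models against held-out months using an accuracy metric such as MAPE.

```python
# Minimal sketch: forecast monthly incident counts from a cleaned history of
# recalls/border rejections. A seasonal-naive forecast repeats last year's
# monthly pattern, shifted by the average year-over-year trend.

def seasonal_naive_forecast(monthly_counts, horizon=12, season=12):
    """Forecast `horizon` future values from a list of monthly counts."""
    if len(monthly_counts) < 2 * season:
        raise ValueError("need at least two full seasons of history")
    # Average year-over-year change estimates the trend per month.
    recent = sum(monthly_counts[-season:])
    previous = sum(monthly_counts[-2 * season:-season])
    trend = (recent - previous) / season
    # Repeat last season's shape, shifted by the estimated trend.
    last_season = monthly_counts[-season:]
    return [max(0.0, last_season[h % season] + trend * (h + 1))
            for h in range(horizon)]

def mape(actual, predicted):
    """Mean absolute percentage error, for validating on past years."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted) if a) / len(actual)

# Two years of (made-up) monthly incident counts for one ingredient:
history = [5, 7, 6, 8, 9, 7, 6, 5, 8, 10, 9, 8,
           6, 8, 7, 9, 10, 8, 7, 6, 9, 11, 10, 9]
forecast = seasonal_naive_forecast(history)  # 12 forecast months
```

This is only the simplest possible baseline for the problem Giannis describes; it mainly illustrates why both a good data set and a validation metric belong in the "equation".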
So, just to share a very simple example of what the results can look like for such specific business questions, like: for which ingredient categories will recalls and border rejections increase? Here we have some of the main categories, like nuts and nut products, milk and milk products, fruits and vegetables. The first two columns refer to the validation of the algorithms that we have done for previous years, and how well they performed using an accuracy metric, followed by what these algorithms have predicted for the next 12 months. In this way, if we have such predictions for the different categories, it's very easy to identify the main categories for which we will have more incidents within the next 12 months. We see here, for instance, that incidents seem likely to increase for nuts and nut products within the next months, whereas they will decrease for milk and milk products. In this way, you can identify the categories on which to focus in terms of testing and other verification activities. Next slide, please. And then comes a different question: how do you deliver such prediction methods in a way that will also deliver value to the expert? In this case, to address something like this, we sat down with the experts and we understood the critical processes involved in providing an answer to the business question. We identified that one very critical step is to check which are the ingredients for which we will have increases, to ensure that the verification activities I will use focus on the right ingredients. The second step is to identify and predict which hazards will likely increase within the next few months, to verify that your laboratory plan and audits include these kinds of hazards.
The next step is to check which will be the increasing and emerging risks. So, without any manual work, the value is that you will be able to identify emerging and increasing risks. And finally, to know which suppliers and products will be affected, so that you can immediately activate the mitigation actions you need for these specific products, ingredients, and suppliers. Based on this modeling, we developed and delivered a live FOODAKAI prediction dashboard, which will be presented by Mihalis later on, that models these four steps and delivers predictions in such a way that they also deliver value to the expert and answer a very critical question. Thank you so much for your attention. Thank you very much, Giannis, for the interesting talk. I will immediately go over to Nyi Nyi, who will talk about how to design such a dashboard in an interactive and visually appealing way. So, Nyi Nyi, do you want to share your slides yourself? Yes, if I can. It says I need permission to do that. Alina, can you give Nyi Nyi permission? Oh, if you stop sharing, I think I can share the screen here. Yes. Hi. So, my name is Nyi Nyi. I'm a postdoctoral researcher in the group of Katrien, and a big chunk of my research has focused on building recommender systems and visualizations, mainly trying to solve real-world problems. To begin my presentation, I want to start by talking about what prediction models are to end users, and how trust in prediction outputs can change depending on how we present the prediction results. To give an example, let's have a look at this diagram. Most prediction models appear to users as a black box: we have an input, then something happens in the black box, and then there is an output. This is how many of us see prediction models. So, there is a possibility that the output could differ from what the user might expect.
When this happens, the first thing they're going to ask is: how did it happen, and why? The worst thing that can happen in this situation is that the users don't trust the system anymore, or the predictions at all. What that means is they're not going to use the prediction system to make any important business decisions. Trust in systems, much like trust in people, is based on competence, benevolence, and integrity. So, what do we do? How can we mitigate these kinds of issues? Well, the first thing we can think of is to simply explain the predictions: why did it happen, and how did it happen? We could also think of involving end users in the different steps of building the system. And finally, we can visualize uncertainty and show: okay, here is the possible range where my output might land. These are the three aspects that we have looked at in our group, and I will talk about each of them in detail in the next slides. So, the first one: explaining the predictions. When we talk about trust, there are two types of trust. The first one is trusting a prediction: the users trust the individual prediction outputs sufficiently that they can act on them, that they can make decisions based on them. The second one is trusting a model, which means they trust the whole model to behave in a way that they can predict. So, it's a predictable model, something that users can understand. These two definitions are quite different and at the same time related. They can both directly affect why users behave the way they do when it comes to prediction outputs, and how much they understand the model's behavior. Now, there are some models that can explain themselves: for example, decision trees and logistic regression can explain themselves very well.
If we look at a decision tree, for example, it's very easy to extract the rules from the tree. But there are also other models, from support vector machines to neural networks, that are very complex and very difficult to explain. For these kinds of models, we can consider using model-agnostic methods, which is what I want to focus on in this part. The first method that I want to introduce is called LIME. LIME is a method that we can use to explain how the inputs actually contribute to what is predicted. In this figure, we can see a model that predicts that the patient has the flu. And LIME says: okay, sneeze and headache had a positive contribution to the prediction of flu, but on the other hand, no fatigue had a negative contribution. Based on this kind of explanation, the doctor can make informed decisions. There is also another method called SHAP, which was introduced quite recently as well. The difference between LIME and SHAP is that in practical use LIME seems to be a bit faster, and SHAP a little slower. But I'm not going to go into much detail, because I don't have much time, so I will skip to the next part, about involving end users. This has its roots in user-centered design, applied to building user-centered machine learning models. In user-centered design, we include end users from the very beginning until the project is launched. This is something that Giannis has also highlighted in his presentation: how important it is to include the end user in different stages of the development process. We can do something similar in machine learning, of course. The first stage where we might consider involving end users is the feature selection process. So, we have built this interface; we call it GaGovic, short for grape correlation visualization.
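As an aside, the core LIME recipe described a moment ago can be sketched in a few lines of Python. The "flu" classifier and its coefficients below are made up for illustration; the real `lime` library adds regularized regression, distance kernels, and support for text and images on top of this basic idea.

```python
# Core LIME recipe: perturb the instance, query the black box,
# and fit a local weighted linear surrogate around the instance.
import numpy as np

def black_box(x):
    """Stand-in flu classifier (hypothetical coefficients, for illustration)."""
    sneeze, headache, no_fatigue = x
    return 1 / (1 + np.exp(-(2.0 * sneeze + 1.0 * headache - 1.5 * no_fatigue)))

def lime_explain(instance, predict_fn, n_samples=500, seed=0):
    rng = np.random.default_rng(seed)
    instance = np.array(instance, dtype=float)
    # Perturb: randomly switch features of the instance on or off.
    samples = rng.integers(0, 2, size=(n_samples, len(instance))) * instance
    preds = np.array([predict_fn(s) for s in samples])
    # Weight perturbed samples by proximity to the original instance.
    weights = np.exp(-np.abs(samples - instance).sum(axis=1))
    # Weighted least squares (sqrt-weighting) with an intercept column.
    X = np.hstack([samples, np.ones((n_samples, 1))])
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], preds * sw, rcond=None)
    return coef[:-1]  # per-feature local contributions

contributions = lime_explain([1, 1, 1], black_box)
# sneeze and headache get positive local weights, no fatigue a negative one
```

The signs of the fitted coefficients are exactly the kind of explanation shown in the flu figure: positive contributions push the prediction toward flu, negative ones away from it.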
On the left-hand side, we can see the target variable, the output, and on the right-hand side we can see the features, the input variables. The bars indicate the correlation strength of these input variables with the output variable: the longer the bars, the higher the correlation strength. On the right-hand side, we can also see the correlation between each pair of features that we want to consider; the shade highlights the strength of the correlation between them. There are two things we can learn from this interface. The first is that we can identify the features that have very little to no contribution at all to the prediction outputs. The second is that if there are feature pairs that contribute very similarly to the output, we can select the feature that the user understands the most. So, this is the purpose of this interface. Then I will also show the next interface that we designed, for the next step, model selection. We call this interface AHMoSe, short for augmented by human model selection; it's a bit of a mouthful to pronounce. In the center of the interface, we can see the contribution of the different features to the output of the model visualized. The output in this case is the grape quality, as we can see on the y-axis, and the x-axis shows the different features, the grape variables. These values are calculated with the SHAP library that I mentioned earlier. There is also a green box that you can see overlaid on top of these dots. The green box represents the knowledge of a human expert; the expert in this case could be a viticulture expert.
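The SHAP values plotted in such an interface can be illustrated with a tiny brute-force computation of exact Shapley values. The toy grape-quality model, its features, and the baseline below are all hypothetical; the actual SHAP library approximates these values efficiently for real models rather than enumerating every ordering.

```python
# Exact Shapley values by brute force: each feature's value is its average
# marginal contribution over all orderings in which features are revealed.
from itertools import permutations

FEATURES = ["sunlight", "rainfall", "soil_ph"]           # hypothetical inputs
BASELINE = {"sunlight": 0.0, "rainfall": 0.0, "soil_ph": 7.0}

def model(x):
    """Toy grape-quality score (not a real viticulture model)."""
    return 2.0 * x["sunlight"] - 1.0 * x["rainfall"] + 0.5 * (x["soil_ph"] - 7.0)

def shapley_values(instance):
    values = {f: 0.0 for f in FEATURES}
    orderings = list(permutations(FEATURES))
    for order in orderings:
        current = dict(BASELINE)
        prev = model(current)
        for f in order:
            current[f] = instance[f]       # reveal one feature at a time
            now = model(current)
            values[f] += now - prev        # marginal contribution
            prev = now
    return {f: v / len(orderings) for f, v in values.items()}
```

A useful property to check: the per-feature values always sum to the model output for the instance minus the output for the baseline, which is what makes them readable as "contributions to this prediction".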
This represents the knowledge of how the experts expect each of these variables to behave for the prediction of grape quality. When we overlay the two on top of each other, what the model expects and what the human experts expect, we can see the agreement between the expert and the model. This allows us to diagnose two things. The first is the data itself: is the data that we have good enough for making predictions? The second is diagnosing the models. As you can see, we are showing two models on top of each other here, and we can do this for several different models at the same time and compare across all of them. So: which of the models agrees with the experts the most, and can we select that model to make predictions for this kind of data? Moving on to visualizing uncertainty, this is a very nice visualization from a hurricane tracker that I took from The New York Times. It shows Hurricane Irene from 2011. We can see that as the prediction is projected forward in time, the boundaries get wider and wider, showing that there is more and more uncertainty in the predictions. If you look at the right-hand side of the figure, we can see that the path is projected to hit New Jersey, but people from as far as Baltimore and Washington D.C. might need to prepare for an emergency. We have built something as well to visualize uncertainty. This is the paper where we compare different ways of visualizing uncertainty; as we can see, there are three different types here.
And as it turns out, the one on the top left came out as one of the best performing ones. Because of that, we continued and started working on a similar visualization, but this time focused on price prediction. This is something where we collaborated with Agroknow to build a system where we can predict prices for the future and also visualize the uncertainty in the prediction, and of course the variation in the price that might occur in the future. Okay. There are a few visualization tools that you can use to actually do the visualization. The very famous one that we use is D3. We also use Vega; compared to D3, Vega is a high-level approach, so it's also quite interesting if you're starting out. We also have Plotly, which is available in a few different languages, and Chart.js, which is very nice actually, but doesn't allow you to customize very much. So, I'm running out of time, and to end my talk, I'm just going to briefly point you to the 10 general usability heuristics for interaction design by Jakob Nielsen. If you have time, please go and check out that website; it's very, very nice. They are not specific visualization or usability instructions, but rather broad guidelines and principles for interaction design. Very interesting stuff. And with that, I want to conclude my presentation. Thank you very much. Okay. Thank you very much, Nyi Nyi. I would also like to remind the audience: please feel free to ask any question you like in the chat, and we will gather them and save them for the Q&A. But before we go to the Q&A, we finally have Mihalis, who will give us a demo of the FOODAKAI food risk prediction dashboard. Nyi Nyi, if you can stop sharing your screen, then Mihalis can share his screen for the demo. Thank you very much. Perfect.
So, hello, everyone. I am Mihalis. And before I start presenting myself: Robin, quick kudos on pronouncing my last name, you got it right there. So, I am Mihalis, and I'm a data engineer and team leader here at Agroknow. I've been involved with providing data-powered solutions to a variety of sectors, from finance to the media, and over the past five years I've been doing this for the food safety one. But enough with this; hopefully you can see my screen now. From now on, let's move into a parallel universe where I'm no longer a data engineer but a food safety expert. In my line of work, I deal with various ingredients. Let's see how I can make actionable decisions based on the prediction dashboard that we've built in our platform, namely FOODAKAI. Now, in my line of work, I deal with many ingredients: rice, sesame seeds, ginger, almonds, peanuts, and so on; it's a mess out there. I want to utilize the historical open data available out there, take advantage of this data using this fancy tech, the machine learning and deep learning algorithms that Giannis described, use the visualization techniques that Nyi Nyi described, and turn this all into actionable decisions. How can I get a quick overview of all of the ingredients of interest to me? Which ones should I pay more attention to over the next 12 months? And this is the first look that we have at our dashboard. In a quick overview, we have all of the ingredients that are of interest to me: which of them are going to increase in terms of total number of incidents over the next 12 months, you can see them highlighted in red here, and which ones will have a decrease in the total number of incidents. And this is by taking into account the historical data available out there.
Just a quick note before we dive into the specifics of the prediction dashboard: FOODAKAI has collected data going back forty years. So what we attempt to do here is make use of all of this data in order to provide actionable insights over the next 12 months. We use this data and we attempt to go 12 months into the future and say what we will most probably have. Okay, let's focus now on the peanuts case. I have selected peanuts, and I want to know what the distribution of cases will be, based on the historical data, over the next 12 months. This is visible here on the chart on the right. I know what has happened over the past 40 years; this is the green line here. But based on our algorithms, on our most accurate model, can I have a quick overview of the total number of incidents on a monthly basis for the next 12 months? This is the yellow dotted line that you will see throughout this prediction dashboard, and it is based on our most accurate models. So, okay, I have a quick view as far as the overall monthly distribution is concerned. But what about a quick aggregation of the total number of incidents? This is why we have this block here on the left. We know that for peanuts, over the past 12 months, FOODAKAI has collected data originating from 50 different data sources throughout the world, and these cases amount to a total of 95 incidents. Okay, we know this, we have access to this, and we also know the past 40 years of data. Can we feed this into a machine learning model and attempt to predict the future? And if so, with some accuracy, what does our best model believe will take place in the next 12 months? And this is the number here: next year's incidents, 132. On its own this may not mean a thing; however, it shows a sharp increase in terms of tendency.
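As a sanity check on the tendency just shown on screen (95 incidents collected over the past 12 months versus 132 predicted for the next 12), the relative change can be computed directly:

```python
# Back-of-the-envelope check of the dashboard's tendency for peanuts:
# 95 collected incidents in the past 12 months vs. 132 predicted.

def percent_change(historical, predicted):
    """Relative change of the predicted total versus the historical one."""
    return (predicted - historical) / historical * 100

change = percent_change(95, 132)
print(f"{change:.1f}% increase")  # prints "38.9% increase", i.e. the "roughly 40%" on screen
```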
And of course, this is by taking into account that, as you all know better than me, over the past year we have had this pandemic. The pandemic has affected various aspects of our lives, and among them the food safety checks performed by national authorities throughout the world. However, by taking into account the historical data, we can still make accurate predictions. Okay, the past 12 months have been an outlier year in terms of data, but by taking into account all of this historical data, we can limit the effect of this outlier year and perform accurate predictions. And our best model believes that there will be an increase of roughly 40% in the total number of incidents for peanuts. However, this quick overview does not yet enable us to move from being reactive to actually being proactive. In order to do this, we have to dive in deeper, into the specific dangers, the specific hazards and the specific cases that will take place over the next 12 months, both in terms of risk and, of course, in terms of specific hazards. Okay, so what about specific hazards? Why will the total number of incidents for peanuts increase? This is why we have this table here. It dives deeper into the initial overall analysis that we performed, down to the specific hazards that will take place for the ingredients that we've chosen over the next 12 months. And as you can see here, the prevalent danger, the prevalent hazard for peanuts will of course be mycotoxins, which will increase roughly 30%. More specifically, the highest contributing one will be aflatoxin, which will increase, as far as percentages go, by 37%. And the same goes for the other hazards, and so on. But okay, what about the risk? Can we perform a risk assessment at all?
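The tendency figures quoted in this walkthrough (95 to 132 incidents is roughly a 39% rise) follow from a plain percentage-change computation, applied both at the ingredient level and per hazard. The per-hazard counts below are illustrative assumptions chosen to match the percentages mentioned:

```python
# Sketch of the tendency percentages shown on the dashboard.
def pct_change(past, predicted):
    """Rounded percentage change from past to predicted count."""
    return round(100 * (predicted - past) / past)

print(pct_change(95, 132))  # peanuts overall: a sharp increase

# Per-hazard breakdown, as in the hazards table (hypothetical counts):
hazards = {"mycotoxins": (30, 39), "aflatoxin": (19, 26)}
for name, (past, pred) in sorted(hazards.items()):
    print(name, f"{pct_change(past, pred):+d}%")
```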
We have access, we mentioned, to many years of data. Can we perform a risk assessment? The quick answer is, of course, yes. You can see here, in the block on the right with the 'Actual' tab activated, the current snapshot of the risk assessment as far as peanuts are concerned. The prevalent hazard is, of course, mycotoxins, aflatoxin, followed by Salmonella, and so on. And this is based on the historical data. But we just mentioned that we produce predictions going over the next 12 months. So the main question here would be: can we make sense of this data and utilize it to perform a risk assessment in the future? And of course the answer is yes; you can see it here in the tab named 'Predicted'. This is a snapshot of the same risk assessment, the same algorithm running, however 12 months in the future. And you can see here that, okay, mycotoxins have increased a bit, aflatoxin again, but you no longer see Salmonella; you see the absence of health certificates, among others. So you see a somewhat different view, of course, moving 12 months into the future. But okay, we have identified so far that mycotoxins will be the prevalent hazard for peanuts. Can we be a bit more specific? Can we actually dive deeper? As a food safety expert working in a food company, can I make decisions on specifically which months, when should I be on alert as far as this risk, the mycotoxin risk, is concerned? And again, yes. You see it here on this chart on the left. This, again, is by taking advantage of all of this historical data, all of the predictions that we've made so far, and the live evolution of risk can be visualized here. Again, in the green line, we have the risk evolution throughout the years that FOODAKAI has been collecting data.
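One common way to produce such an "actual" versus "predicted" risk snapshot is a frequency-times-severity score, computed once on historical counts and once on forecast counts. This is a hedged sketch, not the platform's actual risk model; the hazard counts and severity weights are invented:

```python
# Assumed risk formulation: risk = incident count * severity weight,
# ranked highest first. Counts and weights are illustrative only.
SEVERITY = {"mycotoxins": 5, "aflatoxin": 5, "salmonella": 4,
            "absence of health certificate": 2}

def risk_ranking(counts):
    """Return hazards sorted by count * severity, highest risk first."""
    scores = {h: c * SEVERITY[h] for h, c in counts.items()}
    return sorted(scores, key=scores.get, reverse=True)

actual    = {"mycotoxins": 30, "aflatoxin": 19, "salmonella": 12,
             "absence of health certificate": 8}
predicted = {"mycotoxins": 39, "aflatoxin": 26, "salmonella": 6,
             "absence of health certificate": 20}
print(risk_ranking(actual))     # salmonella ranks high today...
print(risk_ranking(predicted))  # ...but drops in the 12-month view
```

Running the same scoring function over two different input windows is exactly what makes the two tabs directly comparable.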
But now we know what will most probably happen in the future. So can we utilize this data to perform a risk assessment over the next months? And if so, can we also perform a quick analysis and alert our end users as to when they should be more on alert? And this is the reason for this red bar here. As far as our prediction scores are concerned, we believe that 10 months in the future from now, the sharpest increase in terms of risk will take place. Okay, but so far we've analyzed the product, the ingredient, in terms of total number of incidents, risk and so on. But this is not all for a food company. A food company is producing finished products, specific product recipes that contain various ingredients. And this is something that is offered among FOODAKAI's capabilities: one can input his or her product recipes and perform a kind of risk assessment on them. And this is what is visualized in this tab here, among all of the product recipes that I have inputted as far as my customization of FOODAKAI is concerned. I have the chocolate bar here, for instance, that contains peanuts, cocoa, butter and so on. And I'm right now analyzing the peanut product. Which of my finished product recipes that contain this ingredient will most probably be affected? And the top candidate is the chocolate bar. And what is the danger, the hazard, behind this recipe being on higher alert? It most probably is due to the cocoa presence in the product recipe I have inputted here. And what is the specific risk that is important for it? So, okay, so far we've analyzed an ingredient and we've analyzed a finished product. However, in my line of work, in my food company, I'm mostly importing from specific countries, from specific continents. I'm getting ingredients from India, for instance. Can I use this information?
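The red-bar alert described above can be sketched as a search for the forecast month with the sharpest month-over-month rise in predicted risk. This is an assumption about how such an alert could be computed, with illustrative risk scores:

```python
# Assumed "red bar" logic: find the forecast month with the largest
# month-over-month jump in predicted risk, so the user knows when to
# be on alert. The 12 forecast scores below are made up.
def sharpest_increase(predicted_risk):
    """Return (months_from_now, jump) for the steepest predicted rise."""
    deltas = [(m, predicted_risk[m] - predicted_risk[m - 1])
              for m in range(1, len(predicted_risk))]
    month, jump = max(deltas, key=lambda t: t[1])
    return month + 1, jump  # 1-based "months from now"

forecast_risk = [2, 2, 3, 3, 4, 4, 5, 6, 6, 10, 10, 10]
print(sharpest_increase(forecast_risk))  # the month to highlight in red
```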
Can I make some kind of prediction on a country or continent level? Again, the quick answer is yes, and you can see it in this chart at the bottom. Now, this chart is generated based, again, on the historical data that we have available for all the food safety cases that originate from India. I'm interested in India, but of course I may be interested in a continent level; I could have Asia here, or North America, and so on. And similarly to what we did at the top, we've made the same analysis. Okay, we know all this data, we have a line chart, we have time series data. Can we feed this into a prediction model that will attempt to go 12 months into the future? And the quick answer again is yes, you can see it here. Similarly to what we've done throughout this dashboard, in the green line you see the historical data available in FOODAKAI, how many cases we know of originating from India, and then what our most accurate models believe will take place over the next 12 months. And finally, enough with this. So far, we've focused on a product, on a country, or on a finished product recipe point of view. But we've also talked about the highest contributing hazard, and the highest contributing hazard is mycotoxins. What about mycotoxins as far as the rest of the food safety sector is concerned? Can we utilize this highest contributing hazard, mycotoxins, in order to assess its footprint across the overall food safety view? And again, this is why we have this block on the left. Now, in this block, what we've done is, okay, we know that mycotoxins are the highest scoring hazard for peanuts, but mycotoxins are a food safety hazard that will also affect other ingredients throughout the food safety sector. And here we've identified the top five of them that will mostly be affected over the next 12 months.
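Ranking the top five ingredients most affected by one hazard is, at its core, a sort over predicted increases. A minimal sketch; the figures per ingredient are invented for illustration:

```python
# Hypothetical hazard-footprint block: given a forecast % increase of
# mycotoxin incidents per ingredient across the sector, list the top
# ingredients most affected. All numbers are illustrative assumptions.
forecast_increase = {
    "dried figs": 55, "figs": 48, "peanuts": 30,
    "pistachios": 22, "almonds": 18, "rice": 5, "ginger": 2,
}

def top_affected(increases, n=5):
    """Ingredients sorted by predicted increase, largest first."""
    return sorted(increases, key=increases.get, reverse=True)[:n]

print(top_affected(forecast_increase))
```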
And scoring highest we have dried figs, with figs coming up next, and you can see the risk assessment here. And this is why we have this specific block. Now, this concludes this part of the presentation. However, before I close up, let me mention that if any of the information available here sounds interesting and you want to give it a go for your specific supply chain, bear in mind that what we just experienced was actually tailor-made to the specific customization I've made. So if it sounds interesting, or you just want to give it a go in order to identify specific emerging hazards for your supply chain, please feel free to let us know. There will be a link and a QR code later on where you can book a specific demo, and we can go over your specific cases, your specific ingredients, visualize the results, and attempt to derive some kind of actionable outcomes from them. Thank you very much for your attention. I will stop sharing my screen now. Okay, thank you very much, Michalis. I will quickly share my screen again. So, like Michalis mentioned, if you're interested in a demo, you can of course just contact Michalis, or you can also use the QR code or the link to register for the demo. So now we have some time for Q&A. We haven't seen that many questions yet in the chat; however, if you have any questions, please feel free to post them. I can maybe kickstart the session with a first question. Let me see. What impact, maybe it's targeted to everybody, so maybe everybody can comment a bit on it: what impact do you believe that predictive analytics are having on the food supply chain? I don't know if we have any volunteers to start with an answer. Can I? Yeah. This is a very interesting question, because I think that we are looking at two different time dimensions, or two different universes. One has to do with what is happening in real time within my system.
So I have the control points, I'm measuring data in real time within my plants, within my facilities and my supply lines, and then I can utilize the power of the algorithms to try to predict very quickly if something is going to come up before it does come up. And then there is another universe, and this is, I think, the universe that we are presenting, that is looking outside the system of a particular organization, of a particular manufacturer or retailer, and that is looking at the rest of the world at a more macro scale, let's say, trying to pick up the signals of things that have already happened, dig inside the signals, and try to understand if we can predict something at the macro level. I think we are looking at a tremendous impact on the way that people are taking decisions at both levels, in both universes. Okay. That's actually excellent, Nikos, and if I may build on what you just said: what we visualized using FOODAKAI's dashboard, with the visualization techniques and the machine learning methods that Giannis and Nini described, is actually something that takes into account the open data out there. Now, just an open question: imagine this being fed with the vast volume of internal data available in food companies out there. My quick point of view is that the impact would be amazingly huge. Okay. Thank you very much. In the meantime, I see that we also have a question from the audience, a question from Katrien. So thank you very much. The question is: to what extent do you take into account that food processing methods or risk detection methods have evolved over the past decades when processing historical data? I will provide my views on that, and of course the other speakers can complement them as well. So that's an excellent question. Thank you, Katrien. Right now, we don't have a factor that normalizes these differences, this evolution in food processing and detection methods.
In one of the cases where we are working with a company from the food industry, we had also added a factor about the detection methods: about how easy it is to identify such risks with the current detection technology. So the answer is that it is possible to include such a factor in the risk estimation, in the risk assessment model, that will also take into account the feasibility of identifying very important problems, hazards, in the supply chain. And this factor can be different, it can change over the years based on the evolution of the technology. So by adding such a factor, we could also take into account this kind of evolution. Robin, if I may also add a little bit to what Giannis is saying: this is exactly why we believe that predictive analytics for food safety is not just a numerical exercise, where we take numbers of incidents, we put them in a model and it generates a number. This is why we really believe that the devil is in the details, hidden inside the question that we want to answer. And then there is a trade-off, as Katrien also mentioned in a very nice way. There is a trade-off that starts even from the negotiation around the assumptions that we have to make when we frame the question, the right question, so that we can then choose the right data, incorporate the dimensions that are important, that we want to look for, and agree on the indicators that we believe are potential signals of a higher risk. So this is an iterative, interactive process, and it takes time and effort, and it requires human feedback and intelligence to make sense. Otherwise, it's just an exercise, an academic exercise. So that's why I really believe that the process, the method, is a very essential piece of the whole solution. Okay, I see that in the meantime more questions are coming in. The next one is from Elias, and he asks: what about the cost of data mining and hypothesis analysis?
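The detection-feasibility factor described above could, for instance, be applied as a year-dependent scaling of the observed incident counts, so that older, under-detected years are not read as genuinely low-risk. This is a hypothetical formulation with illustrative factors, not the model actually used:

```python
# Assumed normalization: scale each year's raw count by how detectable
# the hazard was with that year's technology (1.0 = current methods).
# The detectability values below are made up for illustration.
detectability = {2000: 0.4, 2010: 0.7, 2020: 1.0}

def normalized_incidents(year, observed):
    """Estimate the true incident level, correcting for detection capability."""
    return round(observed / detectability[year], 1)

print(normalized_incidents(2000, 20))  # 20 observed under weak detection
print(normalized_incidents(2020, 40))  # 40 observed under current methods
```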
How far can we go, can we go, in the Agroknow universe? Excellent question. If I may take a first go at this question. Now, of course, we've been dealing with food safety data over the past five years, and let me tell you that it's not the most data-savvy world out there. There are vast amounts of data, many different sources, many different languages. We're talking here about the official food safety authorities throughout the world, and each of them announces data in its own internal format, in its own national language. So really, it's a challenge. As far as the actual cost goes, of course it's quite a big one. However, at least for food safety, what we do is collect this data across different languages; PDFs in Chinese are an actual case out there, but also other languages and different formats, XLS, XML and so on, and we can dive into the details deeply. So yeah, it's a challenge. And it's not only this; you also have to automatically categorize this data, to enrich it, as Giannis mentioned, because the raw data doesn't make any sense if you're not harmonizing it with some kind of internal vocabulary. You need to know what kind of ingredients, what kind of products, which was the company, the date of the recorded recalls or border rejections. So a quick answer here: yeah, it involves a cost. We've been doing this over the past five years, and it's a huge challenge out there. But as far as the second part of your question goes, how far can we go? I believe we can go quite far. The food recalls and border rejections are not the only data available out there. We have country indicators. We have weather data that affect these agricultural commodities. We have price data. We have lab data, monitoring results from official sources. And as Nikos mentioned in the first question, we also have internal data: internal data available in companies, as food companies perform their own lab tests.
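The harmonization step mentioned above, mapping raw announcements in different languages and formats onto an internal vocabulary, can be sketched roughly as follows. The vocabularies and record fields are invented for illustration, not Agroknow's actual pipeline:

```python
# Minimal harmonization sketch (assumed pipeline step): map raw records
# from different national sources onto one internal vocabulary so they
# can be counted together. All mappings here are illustrative.
INGREDIENT_VOCAB = {"arachides": "peanuts", "groundnuts": "peanuts",
                    "erdnüsse": "peanuts", "peanut": "peanuts"}
HAZARD_VOCAB = {"aflatoxine b1": "aflatoxin", "aflatoxins": "aflatoxin"}

def harmonize(raw):
    """Normalize a raw announcement into the internal schema."""
    return {
        "ingredient": INGREDIENT_VOCAB.get(raw["product"].lower(),
                                           raw["product"].lower()),
        "hazard": HAZARD_VOCAB.get(raw["hazard"].lower(),
                                   raw["hazard"].lower()),
        "country": raw["country"].upper(),
    }

record = {"product": "Groundnuts", "hazard": "Aflatoxins", "country": "in"}
print(harmonize(record))
```

Only after records from, say, a Chinese PDF and a European XML feed land in this shared schema can they feed one time series.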
And this is a huge volume of information. So my quick answer to the second part of your question, how far can we go, is: really, really far. And I believe we have just scratched the surface here. Robin, if I may also add to that, especially for the 'how far can we go' part: when we apply artificial intelligence and these kinds of prediction approaches, it's very important to take into account that we have some trade-offs. The data collection and processing can be costly. But if we decide that we are more interested in speed, and not so much in accuracy, then we may collect data only up to the point where the accuracy is good enough for us to take a good decision. You can continue working on the data, you can keep collecting data, but the added value may not be so big. So it's an iterative process, where we need to work with the experts that need the answers to these critical questions, and understand what the trade-off is between accuracy and data collection. And if I may make the connection with the next question that I read from Jeff, because Giannis is mentioning something that gives a connection. I want to use an example: the example of food adulteration incidents, food fraud incidents. If we look at the number of food fraud incidents historically in a particular product or ingredient category, let's say paprika, it says something. And then suddenly we may see a spike, and if we try to do the predictions based only on the number of incidents in paprika, the algorithm may predict something that is wrong. This is where the expert knowledge, the root cause analysis, comes into play and says: you know something, around that time there was an increase in paprika prices, in particular in the countries that are producing paprika, and we saw it, and we knew that adulteration incidents were going to increase as well.
Then this is what we do in the equation: we go back to the data and we say, okay, what if we include in the model the analysis of paprika prices, and we try to rebuild the model, a new version of the model, that will predict something also taking into account this data signal. And then another expert will come and say, okay, but you know something, there is another event, another signal that we have to incorporate in the model, and then again we go back to the data, and so it goes. Just to add to what Nikos is saying: for the problems that we have in the food industry, we surely cannot have a fully automated approach where the machines are taking the decisions. These problems are so complex that we need to integrate the knowledge of the experts in the way that Nikos is describing. So this is a process that needs collaboration. By understanding which are the main factors and dimensions that we need to include, it is possible to integrate all this knowledge, but we still need to be constantly in contact with experts in order to understand and integrate this knowledge into the prediction systems. So it's not a game that is played only on the side of the technology, by the people that are building the technology; it's like a tango, it takes two. Thank you very much for going into that as well. In the meantime we received a related question, maybe more geared towards the academic side. With all the involvement of the experts, how can we enhance the trust of the experts in these prediction services? Yeah, I think it's a very interesting question, and it's somewhat related to what Nikos and Giannis said. So we try to focus on how we can bring the experts together, and we also try to find a way to show this. How can we show the experts that there is a problem in the data? How can we let them diagnose that there is a problem with the data, or a problem with the model?
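The paprika example above amounts to adding an external signal as a model feature. A minimal sketch of that idea, under assumptions: a one-feature least-squares fit of incident counts against a lagged price index, so that a price spike shifts the prediction. All figures are illustrative, not real market or incident data:

```python
# Sketch of folding an external signal (price) into the model, as in
# the paprika adulteration example. Data below is made up.
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x, in pure Python."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b  # intercept a, slope b

price_index = [100, 105, 110, 140, 180]  # lagged paprika price index
incidents   = [10, 11, 12, 18, 26]       # adulteration incidents
a, b = fit_line(price_index, incidents)
print(round(a + b * 200))  # predicted incidents if prices spike to 200
```

Rebuilding the model with each new expert-suggested signal, as described above, is exactly this loop repeated: add a feature, refit, compare.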
This is also why we focus on these sorts of visualizations as well. When it comes to trust, I think, as I highlighted earlier, the explanations themselves are of course quite important, and the uncertainty visualization also seems to be quite important. So these are the two things that I would add on top of what Nikos and Giannis have already added: including the experts themselves in the process of machine learning, and of course being able to highlight where the problem really is and allow them to actually steer or control the model if needed. Okay, thank you very much. I see that another question just appeared from Jeff, maybe first of all for Nini and Katrien. How do UX requirements change in the mobile world, where screens are smaller and visualizations need to be more specific and crisp? So of course this is also an excellent question. In the mobile world the screen space is much smaller, so visualizations need to be even more simple and condensed, and again I think the user-centered design process here is key to coming up with representations that also work on mobile devices. Maybe, because I see that the comment from Jeff was mainly positive feedback to everybody, so thank you for that, maybe a last question if no other questions are coming in: which are the main difficulties and challenges in finding all the data that you need to build all these predictions? Maybe everybody can briefly comment on it, because it's a general question and of course there are many difficulties, so it's definitely interesting to hear everybody's viewpoint on that. Is everyone okay if I take the first go here?
Yeah, so the main challenge, I would say, as far as official sources and official announcements are concerned, from a data engineer's point of view again, is identifying these sources, given the language barriers out there, the availability of the web, internet connections, firewalls and all of the tech stuff that is out there. At least from a data engineer's point of view, the most difficult part is actually identifying which is the official food safety authority in China. Where can we get the Excel files or PDFs in the Chinese language? If we get these, we can make sense of them, we can translate them, we can edit them, but where are they located? And the same goes for other countries throughout the world. And just a quick point here, mostly an open question: the whole continent of Africa is somewhat of a black box for us as far as information and information announcements are concerned. We will be looking at this; however, we do not have access, or there are no official announcements so far. So as far as the data engineer is concerned, I would say the identification of these sources is the biggest challenge out there. I know that Giannis wants to say something, but if I may complement what Michalis is just saying, because he's focusing on the public sector data part: I think that there will always be political or economic reasons for which we will still have some data silos. This is not something that we can avoid; there will be a reason, or many reasons, why an official authority will not open up access to all the data that it is collecting, managing and using for decision-making on a national level.
I think that the challenge and the question there is: can we build, for a subset of the data that can be served, something like the Google Maps of food safety data that everyone can rely upon and use? Can we build this public sector data infrastructure that will make this information available to everyone as a common resource, so that at least there is a minimum set of data that everyone can access and use? I think if this is possible, and when this becomes possible, because I really hope that we can get there, we will have made a tremendous step forward. I would like only to add two more dimensions to the difficulties and challenges that Michalis has mentioned, but also to the vision that Nikos shared with us. One of the main issues is that, still, in the food industry, the part of the data that follows standards and that can be interconnected is very small. So we have many different data sets that are totally disconnected and that may be relevant and refer to the same thing. This is one challenge, and the solution to it could be the adoption of standards; one standard, for instance, that we all know is the barcode, or the product brand, or the GLN for the companies. Because especially in the case of the companies, you know how many variations we have in the names of the companies, and how complex the structure of the companies is: they have parent organizations, they have subsidiaries, and how dynamic this is, with things changing constantly. So this is one very important challenge, and having interoperability between the different data sets will open new possibilities in the way that we can use the data. And the other part is, of course, that there are still concerns about the security and the privacy of the data, so the industry is reluctant. There has been very important progress during the last years on anonymization of data, on protection of data, and still we need to apply it and to show how this can work, and to build the trust in the industry that
this data can be shared in a very secure way, and that it can be used to predict very important events in the global supply chain. Okay, thank you very much. With that, I will briefly share the final slide again. I do want to thank everybody for participating in today's call. I think it was very interesting, we also learned a lot, and there were very interesting talks for everybody, so thank you very much for that. And before you go away, I would first like to ask both Nikos and Katrien to maybe give some closing words so that we can wrap up this session. Thank you very much. Okay, thank you very much from my side as well. It was a great pleasure for me to be able to present some of our work and the challenges that we are addressing. For us it is a very interesting domain to work in, and we've been doing very interesting collaborations with Agroknow in this domain, and we're very keen to expand on the work. So if you're interested in potential collaborations, of course feel free to contact us; we're very much interested in expanding this work in this domain. But thanks again for attending. Thank you. I also want to focus on the collaboration part that we wanted to highlight in this webinar. It was not about presenting the way that we solve such problems in our company; we wanted to highlight the wealth of scientific problems and computer science related problems that exist when we are trying to solve such an important societal challenge: preventing people from getting ill from food. It comes back to essential research questions that we can pose to excellent teams like the team at KU Leuven, and this can pave the way and provide a very rich environment to work together on solving additional problems. So I hope that, to an extent, we managed to share the complexity of the problems at hand, as well as how we believe that they can become real practical solutions and steps. Thanks to everyone that presented and to everyone that has attended. Before you go away, maybe one final
administrative mention: the recording will be put online and sent to everybody early next week. So thank you very much again.