It's lovely to be with everybody, and I'm looking forward to sharing some work that we've been involved in with colleagues. You can see the logos there, but if I just move on one slide, these are all the colleagues, and some of them, I know, are involved in this conference, which is great to see. So I'm going to talk about the use of compositional data analysis with geochemical data, and also more recent work with air quality data, to explore the associations with chronic kidney disease. So I suppose, first of all, we have to think about what is in our environment. We're aware now that, with carbon footprints, we want to eat locally and use more local produce, and that obviously involves produce that's grown in the soil. But we're also very aware of what's happening around us in terms of air quality and the sorts of environmental toxins that we might be subject to, as we all try to do a little bit more exercise, even within lockdown, and how that has impacted us. And especially as we open up from lockdown, it's very clear what's happening around us. So the World Health Organization has been very aware of how trace elements affect our health, and they have divided trace elements into three different categories. There are ones that we're probably familiar with, such as iodine, zinc and selenium, and even copper, molybdenum and chromium; then those that are probably essential to our bodies, including nickel; and then also ones that are potentially toxic elements. Some of these we're very aware are toxic, and some in particular have been linked to chronic kidney disease, such as cadmium and mercury, which we'd be aware of, and arsenic and lead. So I'm going to concentrate on chronic kidney disease. Our kidneys have incredibly important functions within our body.
They are actually the hardest working organs in our body. They keep our bones healthy by balancing the levels of minerals in our body, and they also remove toxic waste from our body. So they are incredibly busy and important organs, and it's really important to look at how the environment might be affecting our body in terms of our kidneys. Chronic kidney disease is, I suppose, a collective term. Basically it is progressive renal failure, and it is increasing worldwide due to aging, obesity and diabetes. But there's also this part that's known as chronic kidney disease of unknown etiology, and it has been linked to environmental factors. This is certainly a worldwide concern, and again, the World Health Organization has set up a high-level organization to look into this. There is no definitive cause known, but there are known nephrotoxins, and the ones that have been highlighted are individual elements such as lead, cadmium, mercury and arsenic. The link between those and chronic kidney disease of unknown etiology still needs to be established. But obviously anything that affects the unknown etiology is also very relevant to chronic kidney disease, because it's relevant to the whole progression of renal failure. And again, it might be linked to diabetes and hypertension. So it's incredibly important to look into all of these different factors. I'm going to introduce now the different data sets. The first one is the chronic kidney disease data set, and this has been provided by the UK Renal Registry, who have a requirement to collect data on all patients. They collect the data before the patients go on to renal replacement therapy, so basically before they start dialysis, and they've provided the data for this study in terms of a standardized incidence rate.
The data are collated over the period between 2006 and 2016 to avoid any small-area data issues. They provided the data by census administrative output area, which is the smallest that we have available within Northern Ireland. These are equivalent to the lower-level super output areas in England, and they're called SOAs, or super output areas. The data are also provided within age brackets, so we have 16 to 39, 40 to 64, and 65 plus, then all data for the greater-than-16 age group, and then also for this unknown etiology. What a standardized incidence rate really means is that the data have been compared with what would be expected from the average age-specific rates for a region. So for Northern Ireland, an SIR above one means that the incidence is higher than would be expected, and below one means the incidence is lower than would be expected. So these are the maps for all of Northern Ireland. Northern Ireland is part of the UK and the northern part of Ireland, so we're in a unique and special location. You can see across Northern Ireland the administrative areas, the super output areas. If you look at all of the data, so basically all data for ages greater than 16 years, and as you progress through the different age brackets, you can see that it is spatially variable where the SOAs have an SIR greater than would be expected for Northern Ireland. And certainly in the map for unknown etiology, you can see again that there seems to be an interesting spatial distribution. And if we look at the maps for Belfast, Belfast is the largest urban area within Northern Ireland, and just over about 1.3 million people live in the greater area of Belfast. Again, the maps show this different spatial distribution, certainly for the 16 to 39 group, where we're getting some SOAs that are showing higher than would be expected.
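Editorial illustration: the standardized incidence rate described above can be sketched as observed cases divided by the cases expected from the regional age-specific rates. This is a minimal sketch of indirect standardization in general, not the UK Renal Registry's actual procedure; the function names, age-band labels, rates and population counts are all hypothetical.

```python
def expected_cases(population_by_age, reference_rate_by_age):
    """Cases expected if each age band had the regional reference rate."""
    return sum(
        population_by_age[band] * reference_rate_by_age[band]
        for band in population_by_age
    )


def standardized_incidence_rate(observed, population_by_age, reference_rate_by_age):
    """Observed / expected; above 1 means more cases than expected."""
    expected = expected_cases(population_by_age, reference_rate_by_age)
    if expected <= 0:
        raise ValueError("expected case count must be positive")
    return observed / expected


# Hypothetical regional rates (cases per person over the study period).
reference_rates = {"16-39": 0.0002, "40-64": 0.001, "65+": 0.004}
# Hypothetical population of one super output area (SOA).
soa_population = {"16-39": 1000, "40-64": 800, "65+": 300}

# Expected cases here are 0.2 + 0.8 + 1.2 = 2.2, so 4 observed cases
# give an SIR above 1 (higher than expected for the region).
sir = standardized_incidence_rate(4, soa_population, reference_rates)
print(round(sir, 2))
```

Collating counts over several years, as described in the talk, makes the expected count in each small area larger and the ratio less noisy.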
And for chronic kidney disease of unknown origin, or unknown etiology, the SIRs show up to 12 times higher than would be expected from Northern Ireland's average incidence rates. So the second data set that I want to introduce is the deprivation measures. For Northern Ireland, and I'm showing Belfast here, we have deprivation measures which are provided by our Northern Ireland Statistics and Research Agency. They've been divided up into 890 units and ranked, so an SOA that scores 890 is basically the least deprived, and one is the most deprived area. In these maps the darkest blue areas are the most deprived, and unfortunately they show West Belfast and North Belfast as being some of the most deprived areas. The overall deprivation index is broken up into six different measures, and employment and income make up 25% of the overall deprivation measure. So that's our second data set. The third data set is the Tellus data set, which is a large regional geochemical data set. I'm showing it for the whole of Northern Ireland, but the one that we're going to look at is the particular urban data set. So there are 1000 urban samples, and they were analyzed by XRF. We do know that there's geological variation across Belfast: we have an influence of the Palaeogene flood basalts and the weathering of those, including glacial weathering, but also some variation from metamorphic rocks, and also some sandstones across the area. Now the interesting thing about Belfast, like many other cities, is that it has seen a history of industrial growth.
In Northern Ireland our industrial growth really coincides with the opening of Harland and Wolff, which is our large shipbuilding and manufacturing area. You can map the development from 1858, which is really associated with the shipbuilding, through 1901 and 1919 to 1939. These areas developed across Belfast, and they are linked to different types of manufacturing, so we know that there are anthropogenic sources of things like copper, zinc, tin, antimony and lead. Our most recent development is Belfast City Airport, and again that's obviously another potential anthropogenic pollution source. So mapping CKDu, which is the one of unknown etiology, across those different time periods for Belfast, you can again see interesting, different spatial distributions. And again, the deeper red in this case are the SOAs which have a standardized incidence rate above what you would expect for the average age distribution for Northern Ireland. So how do we model the data, then? How do we actually look at the relationship between the different data sets? Obviously, in this conference what we're going to concentrate on is the use of compositional data analysis, but we're dealing with several different data sets that are all compositional in their different ways. So we have the geochemical data, we have a ranking of the deprivation indices that we want to look at, and then we also have these standardized incidence rates for chronic kidney disease. We've actually compared different types of log ratios. We're using particularly the isometric log ratio and a pairwise log ratio, and we're also going to use this balance approach for log ratios.
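Editorial illustration: the log ratios mentioned above can be sketched in a few lines. A pairwise log ratio compares two parts; a balance is an isometric log-ratio coordinate comparing the geometric means of two groups of parts, scaled by sqrt(r*s/(r+s)) for group sizes r and s. The element values below are hypothetical concentrations, not Tellus data, and the helper names are assumptions for this sketch.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def pairwise_log_ratio(a, b):
    """Log ratio of two parts of a composition."""
    return math.log(a / b)


def balance(numerator, denominator):
    """Balance (one ilr coordinate) between two groups of parts."""
    r, s = len(numerator), len(denominator)
    scale = math.sqrt(r * s / (r + s))
    return scale * math.log(geometric_mean(numerator) / geometric_mean(denominator))


# Hypothetical soil concentrations (mg/kg) for one sample.
sample = {"Ni": 40.0, "Cr": 160.0, "Mo": 2.0, "As": 8.0}

plr = pairwise_log_ratio(sample["Ni"], sample["Cr"])
b_two_parts = balance([sample["Ni"]], [sample["Cr"]])
b_four_parts = balance([sample["Ni"], sample["Mo"]], [sample["Cr"], sample["As"]])

# Log ratios depend only on relative amounts, so rescaling the whole
# sample (for example, changing units) leaves the balance unchanged.
rescaled = {k: v / 10 for k, v in sample.items()}
b_rescaled = balance([rescaled["Ni"], rescaled["Mo"]], [rescaled["Cr"], rescaled["As"]])
print(plr, b_two_parts, b_four_parts, b_rescaled)
```

The scale invariance shown at the end is the reason these coordinates suit compositional data such as element concentrations.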
So once we have opened up the data into coordinate space, we use a number of different types of regression models to look at the relationship and to assess whether there's any statistically significant correlation between the data sets. The first type of compositional data analysis that I want to mention is the balance approach that we're using, with the selbal algorithm, which was introduced by Rivera-Pinto et al. We used it for all of our data, so we integrated all of our data. This allows us to look at the relative abundances that might be most closely associated with elevated incidence of chronic kidney disease and also chronic kidney disease of unknown etiology. The process uses an n-fold cross-validation procedure, and it allows us to identify the best balance, that is, the balance that comes up with the highest frequency. A mean squared error is used to determine the number of components that is best used within the balance. And then we can test that through different types of regression model to see if the balances are statistically significant. The results that I'm showing are just for chronic kidney disease of unknown etiology, and this is with all of the soil potentially toxic elements and the six individual domains of deprivation. The results in this case have shown that there are three most common balances, and the frequencies are shown there in the last row. One of those balances the selbal algorithm names as the global balance, but you can see that the different elements come up a certain percentage of the time, and nickel, molybdenum, arsenic and chromium are all coming up 45 percent of the time or more.
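Editorial illustration: the forward selection idea behind a balance search can be sketched as follows. This is a simplified stand-in, not the selbal package: it scores each candidate balance by the mean squared error of a straight-line fit, starts from the best two-part balance, and adds one part at a time to the numerator or denominator. Where the talk describes cross-validated mean squared error choosing the number of components, this sketch uses a hypothetical `min_gain` stopping threshold instead. The data are synthetic, built so that the response depends only on the Cu/Sb log ratio.

```python
import math


def balance(row, num, den):
    """Balance between the parts named in num and den for one sample."""
    r, s = len(num), len(den)
    log_num = sum(math.log(row[k]) for k in num) / r
    log_den = sum(math.log(row[k]) for k in den) / s
    return math.sqrt(r * s / (r + s)) * (log_num - log_den)


def mse_of_line_fit(x, y):
    """Mean squared error of the least-squares line y ~ a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx if sxx > 0 else 0.0
    return sum((yi - my - slope * (xi - mx)) ** 2 for xi, yi in zip(x, y)) / n


def score(rows, y, num, den):
    return mse_of_line_fit([balance(row, num, den) for row in rows], y)


def forward_select_balance(rows, y, max_parts=4, min_gain=1e-6):
    """Greedy search: best pair first, then grow numerator or denominator."""
    parts = list(rows[0])
    pairs = [(score(rows, y, [a], [b]), [a], [b])
             for a in parts for b in parts if a != b]
    mse, num, den = min(pairs, key=lambda t: t[0])
    while len(num) + len(den) < max_parts:
        candidates = []
        for p in parts:
            if p in num or p in den:
                continue
            candidates.append((score(rows, y, num + [p], den), num + [p], den))
            candidates.append((score(rows, y, num, den + [p]), num, den + [p]))
        if not candidates:
            break
        best = min(candidates, key=lambda t: t[0])
        if mse - best[0] < min_gain:  # stop when an extra part barely helps
            break
        mse, num, den = best
    return num, den, mse


# Synthetic samples (hypothetical concentrations, not survey data).
rows = [
    {"Cu": 10.0, "Sb": 1.0, "Zn": 50.0, "Pb": 20.0},
    {"Cu": 20.0, "Sb": 1.0, "Zn": 40.0, "Pb": 35.0},
    {"Cu": 5.0, "Sb": 2.0, "Zn": 60.0, "Pb": 15.0},
    {"Cu": 40.0, "Sb": 4.0, "Zn": 55.0, "Pb": 50.0},
    {"Cu": 8.0, "Sb": 4.0, "Zn": 45.0, "Pb": 25.0},
    {"Cu": 30.0, "Sb": 1.5, "Zn": 70.0, "Pb": 30.0},
]
# Synthetic response driven only by the Cu/Sb log ratio.
y = [math.log(row["Cu"] / row["Sb"]) for row in rows]

num, den, mse = forward_select_balance(rows, y)
print(num, den, mse)
```

On this synthetic data the search settles on the two parts Cu and Sb and stops, because no third part reduces the error. In a cross-validated setting the same search would be repeated on each fold, and the frequency with which each balance is selected could then be tabulated, as in the results described above.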
And you can see the frequency of the balances coming up as well, so balance one, nickel and chromium, and balance two, molybdenum and arsenic, are coming up with the greatest frequencies. So what do the results actually show us in terms of urbanization and our health, then? Well, if we look at the overall results from this part of the analysis, we can see that they suggest a correlation. I'm only presenting the significant correlations here, and these were between CKD for all ages, so that's the data for all ages greater than 16, and the multiple deprivation indices of employment and income. Those are the ones that were found to be statistically significant. The interesting thing about these MDMs, these multiple deprivation index domains, is that they have been used as an indication of socioeconomic factors and linked to things like smoking, so there is a health connection there. In terms of the historical industrial Belfast analysis, the largest area is obviously the most recent, 1919 to 1939, so it had the most data within it. And the strongest correlation for CKDu, the unknown etiology, is with the elemental balance of copper and antimony. These elements have been linked to the smelting of various industries, so there is a natural link there, and the balances that are shown do make sense. And this is just historical Belfast with the CKDu overlaid. The results then show this potential link with air pollution. For the greater Belfast area, the strongest correlation with CKDu at this point is found with a balance of arsenic and molybdenum, and if we just plot the road network on top of the map, you can see that there is an interesting potential association with the road network. Things like air pollution, traffic and brake wear emissions have been cited as sources for these heavy metals.
Arsenic and molybdenum have also been linked to atmospheric pollution, as well as, obviously, pollution from traffic. Interestingly, brake wear emissions have also been linked to sources of antimony and molybdenum. The research into air pollution and kidney disease is very recent. So this is ongoing work, and as I say, not just the work that we're doing: others in the field of kidney disease are exploring this potential link with air pollution. There have been studies that have shown that these ultrafine particles, and that does include lead, molybdenum and antimony, can become blood-borne; they can get into our bloodstream and then be translocated to other tissues, including the kidney. So what this initial part of our study shows is that there is evidence of air pollution deposition of the modern pollutants. The graph I've just shown on the right-hand side is the amount of molybdenum that is actually being produced worldwide. This is partly in response to the need for brake pads. We do hope that electric vehicles will need less of this, but at the moment our anthropogenic source of molybdenum is actually increasing worldwide. So the results from this part of the study show that urban soils can be used as a proxy for the availability of nephrotoxins from environmental pollution. Briefly, I would like to expand the study on to more recent work that we've done. The outputs of the previous work, which is now being published, do indicate this potential link with air pollution, and maybe ground pollution such as traffic. And again, there has been some experimental work looking at the link between this and chronic kidney disease.
Obviously, there are other links with respiratory disease, and unfortunately the recent ruling that air pollution did contribute to the death of a nine-year-old in London is the most recent and the only ruling actually linking air pollution to health. But the link with chronic kidney disease is still not completely confirmed. There has been some work that's looked at air pollution, and particularly exposure to the very fine particulates, PM2.5. For Northern Ireland, we have access to air pollution data that includes PM2.5, nitrous oxide, sorry, different types, and also PM10. In this final part of the study, then, what we want to do is look further at the urban data sets: as we've already introduced, the urban Tellus soil geochemical data set, and the standardized incidence rates that I mentioned earlier, the CKD for all data over 16 years and the chronic kidney disease of unknown etiology. So we're going to look at the environmental toxins and the air pollution, and again link that potentially with social deprivation. The techniques that we've used are again the forward selection method using the selbal algorithm and generalized linear regression, but also looking at the influence of spatial regression. So what is the impact of potential spatial dependence in our data? Just to confirm the types of data we're using: we've used 10 geochemical elements, and these again have been informed by the literature on the links with air pollution and health. We've computed the geochemical data according to the GSNI, Geological Survey of Northern Ireland, requirements. Then we also use a geometric mean value of each PTE for each of the super output areas within the greater Belfast urban area, and there are 365 SOAs. For the air pollution data we use the 2006 data set, to try to coincide with the Tellus survey and the UK Renal Registry data.
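Editorial illustration: summarizing each potentially toxic element (PTE) by a geometric mean per super output area, as described above, can be sketched like this. The sample records, SOA labels and field names are hypothetical, and the sketch assumes every concentration is strictly positive.

```python
import math
from collections import defaultdict


def geometric_mean(values):
    """Geometric mean; requires strictly positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean needs strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def geometric_mean_by_area(samples, element):
    """Group soil samples by SOA and summarize one element per area."""
    grouped = defaultdict(list)
    for sample in samples:
        grouped[sample["soa"]].append(sample[element])
    return {soa: geometric_mean(values) for soa, values in grouped.items()}


# Hypothetical soil samples: molybdenum in mg/kg, tagged with an SOA label.
samples = [
    {"soa": "A", "Mo": 1.0},
    {"soa": "A", "Mo": 4.0},
    {"soa": "B", "Mo": 2.0},
]

mo_by_soa = geometric_mean_by_area(samples, "Mo")
print(mo_by_soa)
```

The geometric mean is the natural centre for compositional parts because it is the ordinary mean taken on the log scale, which is the scale the log-ratio methods work on.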
The air pollution covariates that we used were benzene, carbon monoxide, nitric oxide, sulfur dioxide, and then the particulate matter PM10 and the finer particulate matter PM2.5. For SOAs with missing pollutant values, we used ordinary kriging with cubic models for imputation. So, just looking very briefly at the results: these are the results for both chronic kidney disease for all age groups, so basically above 16 years, and also for chronic kidney disease of unknown etiology. Again, the selbal method is very useful in that it highlights these potential balances that we can then explore further through regression. For the PTEs, for all ages of chronic kidney disease, molybdenum and zinc is again the balance that is coming out most frequently, 64% of the time, and for the air pollutants, sulfur dioxide and carbon monoxide, again a very high percentage of the time. For chronic kidney disease of unknown etiology, it is the balance between chromium and nickel, and for the air pollutants PM2.5, so the fine particulates, but also carbon monoxide. Five minutes, sorry. Okay, that's okay. I'm just finishing up. So, in terms of the results, just to summarize: using generalized linear regression with a log link, we did find a statistically significant correlation between CKD for all age groups, specifically all age groups above 16, and these compositional balances of molybdenum to zinc, and also the multiple deprivation indices of employment, income and health, so they were the ones that were found to be statistically significant using GLM. However, when we introduced a spatial lag, so used spatial regression, the coefficients were not found to be statistically significant. In terms of the data for the air pollutants, then, for chronic kidney disease, again for all age groups above 16, sulfur dioxide and carbon monoxide were found to be the most important.
For chronic kidney disease of unknown etiology, the air pollutants carbon monoxide, again, and the fine particulates PM2.5 appeared most within the balances. But the associations with the air pollutants, for chronic kidney disease of unknown etiology and for ages greater than 16, again, when we looked at spatial regression, were not found to be statistically significant. So these are our preliminary findings, and as we say, we've more work to do, but the findings are very informative and very interesting, and they do support the argument that atmospheric pollution in the form of sulfur dioxide, carbon monoxide, and certainly the particulates PM10 and PM2.5 may be negatively associated with renal disease. Obviously we do need to look at the impact of this further, but one of the highlights is that it is raising the profile of some of these anthropogenic sources, such as molybdenum from brake pads, that we need to consider. The work needs to be discussed and developed further, but I think compositional data analysis has shown us a very robust way to analyze the data. So just finally, to say thank you to our sponsors. Some references. Thank you.