Hello everybody. Today I would like to talk about the geochemical characterization of jade in the greater Caribbean and about source discrimination using multi-class regression analysis. Characterizing Caribbean jade geochemically and building a database of it is a precondition for provenancing jade artifacts, as we already heard from Sebastian. There's a problem if we do it only petrographically. I would say the best way is to do it destructively, using petrography and geochemistry, because this gives us the most information. Unfortunately we are not always able to get a piece of an object, so we have to find ways to do it macroscopically, non-destructively or non-invasively. That is something I will talk about later in my second talk; for now I will give you the basis for artifact provenance analysis.

We all know that there was extensive exchange and trade between different islands, and also with the mainland. This is evidenced by finds of exotic materials on certain islands, and one example is jade. We already heard throughout the talks that jade was highly valued by the Amerindians because of its physical properties. It is really hard, so it is well suited for making tools, but its color is also really appealing, which made it interesting to Amerindians for producing beads and pendants. By geochemically fingerprinting those artifacts and comparing them with a database of known sources, we are then able to unravel those mobility networks.

This is a global distribution map of known jadeite sources, and we already heard before that jadeite jade is much rarer than nephrite. I would also like to make a comment about the term jade. Lasso already told us that it refers to two minerals. I would like to add a third one and talk about omphacite jade. Jadeite and omphacite are both pyroxenes.
Jadeite is the sodium pyroxene, and omphacite also contains some calcium or iron. When I talk about jade I am mainly referring to jadeitite or omphacitite as end-members, and then we also have rocks that contain both minerals. These occurrences are always bound to metamorphic complexes; they form in subduction zones. One plate is subducting beneath another, and fluids are released either through the breakdown of hydrous minerals or from marine pore waters. These fluids can precipitate in veins in a rock, which gives vein-precipitated jade, but we also have metasomatic reactions, meaning we have a parent rock: the fluid enters the rock and changes its mineralogy.

In the Caribbean we have three sources: in Guatemala, in eastern Cuba, and in the northern Dominican Republic. Guatemala, because of the Motagua fault zone, can actually be subdivided into two sources, one north of the Motagua fault zone and one south of it. As we can already see on that graph, all these sources are aligned along this major fault zone, which is an inactive subduction zone that was active for roughly 65 million years. That is why it is really challenging, not only mineralogically but also chemically, to discriminate between these three jadeitite sources: they feature more or less the same formation ages, the same tectonic settings, similar fluids, and similar protoliths. This is a challenge we have to deal with.

We analyzed over 100 jade rocks, that is, jadeitite and omphacite-bearing rocks. We did strontium and neodymium isotope composition analysis using TIMS, the thermal ionization mass spectrometer, and we also analyzed the rocks for their lead isotope composition using the multicollector ICP-MS.
What you see here is the 87Sr/86Sr variability of the jade sources from the Dominican Republic, Cuba, and Guatemala as one source, and then Guatemala split into north and south of the Motagua fault zone. The crosses are the average of the data, and these lines indicate the mode, so basically where the majority of the data lies. What is important is that these are not outliers; everything is part of the range. These dots are just 0.7 percent of the data, but all of it is actually the geochemical range, the variability of the strontium isotopic composition.

First of all, what we see is that all sources overlap heavily, but what is also notable is that Cuba has quite a small range. The reason might be that Cuban jadeitites probably have a really short history of formation. This is consistent with the age of formation of the jadeitite and with the age of peak metamorphism, which are really close to each other, just one to ten million years apart. So there we have a really short and probably just one single event, whereas for the other sources, which have a really high variability, we might have multiple stages of jade formation, and we also have late-stage hydrothermal alteration, which we clearly do not see in the Cuban samples.

When we look at the neodymium isotopic composition variability, it is even worse: the samples overlap even more, so neodymium is clearly not something that we can use for source discrimination. This is a plot combining the strontium and neodymium isotopic compositions, and again you can see that everything overlaps heavily. The Cuban samples cluster quite closely, which is nice, but they still overlap heavily, especially with the Dominican Republic. What we also see is this late-stage alteration, which mainly affected jadeitites from south of the Motagua fault, but also from the Dominican Republic.
If we look at the lead isotopic composition, we see that there is a high variability in the data itself. There are different reasons for this: we might have fluids derived from different types of rocks, and we also have different parent rocks that might have served as protoliths. We also see again, for south of the Motagua fault and for the Dominican Republic, that some rocks have really high time-integrated uranium-thorium-lead ratios, pointing to really old rocks that were involved in precipitating these jadeitites or in forming those fluids.

We also analyzed trace elements by ICP-MS, and this is just an example of how a normalized trace element pattern might look. I picked one for Cuba and one for Guatemala, subdivided into jades from north and south of the Motagua fault zone. Just from the zigzag pattern we already see that these elements are mobile, and this mobility in the different regions is something we can actually use to discriminate the sources from each other. For example, we have high field strength elements like niobium, tantalum, zirconium, and hafnium, which differ regionally. These are normally not really mobile, but if you have the right ligands in the fluids then they are mobile as well. Then we have large differences in the large-ion lithophile elements like cesium, rubidium, and barium, and these are the differences that we are using to discriminate between the sources.
Nevertheless, what we are not using are pure element abundances, because you see there is also a high variability there. What we are using are trace element ratios; these are more significant when it comes to discriminating the sources from each other. But even if we use bivariate plots and plot two trace element ratios against each other, we will always have overlapping sources, so we cannot fully discriminate them. That is why we thought we could use a statistical approach to better, or fully, discriminate the sources from each other.

We tried several things with our colleagues from ETH Zürich. We used these statistical methods to reduce the number of trace elements that we have, just to find out which are the most discriminatory trace elements we can use. One approach is principal component analysis, but even here you see that everything overlaps heavily, so this is not the right method for our problem. Then we tried something quite new; the second approach is called t-SNE. As you can see, everything overlaps heavily here as well. We have a lot of possibilities for tuning this model and we tried different options, but there was not one tuning setting that gave us a good split.

Then our colleagues came up with the decision tree algorithm. The decision tree is basically a flowchart structure, so we have 100% of the data set in a top node. Here you see the samples from Cuba, the Dominican Republic, and Guatemala. We had fewer samples and analyses when we started with this method; as I said, we now have more than 100 samples in our analysis. On each node we have a test, and our test is: does a rock have a value greater or smaller than a certain value for a trace element ratio, yes or no? Then it falls either to the left or to the right, and we can do that until we have pure classes at the end, but then this is what we would call an overfitted model.
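
The flowchart idea described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the use of Gini impurity to choose each test, the two ratio columns, and all numbers are invented and are not the speaker's data or implementation. It grows a tree of yes/no threshold tests on trace element ratios, once without a practical depth limit (every leaf becomes pure, the overfitted case) and once capped at a single test, so that a mixed leaf reports class proportions instead.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (impurity, feature, threshold) of the best yes/no test, or None."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        # Candidate thresholds are midpoints between consecutive observed values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def build_tree(rows, labels, max_depth):
    """Grow a tree; max_depth caps the number of tests along any path."""
    split = best_split(rows, labels) if max_depth > 0 and gini(labels) > 0 else None
    if split is None:
        # Leaf: store the class proportions of the samples that ended up here.
        n = len(labels)
        return {"proba": {k: v / n for k, v in Counter(labels).items()}}
    _, f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left], max_depth - 1),
        "right": build_tree([r for r, _ in right], [y for _, y in right], max_depth - 1),
    }

def predict_proba(tree, row):
    """Follow the yes/no tests down to a leaf and return its class proportions."""
    while "proba" not in tree:
        tree = tree["left"] if row[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree["proba"]

# Invented ratio values (column 0 and column 1 are two hypothetical ratios).
rows = [(30.0, 2.0), (32.0, 2.4), (31.0, 4.0), (33.0, 4.4), (45.0, 3.0), (47.0, 3.2)]
labels = ["Cuba", "Cuba", "Dominican Republic", "Dominican Republic",
          "Guatemala", "Guatemala"]

deep = build_tree(rows, labels, max_depth=10)    # grows until every leaf is pure
shallow = build_tree(rows, labels, max_depth=1)  # a single test, leaves may be mixed

artifact = (31.5, 3.0)  # hypothetical unknown sample
print(predict_proba(deep, artifact))
print(predict_proba(shallow, artifact))
```

The deep tree always answers with a single source, even for a sample that sits between two groups, while the capped tree returns proportions for the mixed leaf. Capping the depth while growing is used here as a simple stand-in for cutting a full tree back afterwards.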
So then the model fits our source data very well, but if we have, for example, an artifact that we would like to fit in, it might not be assigned correctly. Another thing we can do is prune the tree: we cut it at a certain point to keep it short, and that keeps the tree quite robust. We might not end up with pure classes, but in the end we could give a kind of probability, saying that an artifact has a 20% probability of coming from Cuba and an 80% probability of coming from the Dominican Republic.

There is just one issue with this approach, and that is that we do not take the analytical error into account. For those big samples, which are basically 80 milligrams of rock powder because this is destructive, the analytical error is more or less negligible. But in my second talk I will show you that we are working with really tiny amounts of artifact material, microgram amounts, and then the analytical error gets bigger. Then this yes-or-no value is not valid anymore, because it will lie in a certain range.

That is why we decided to use another approach, which is called the multi-class regression approach. What we do there is that we do not have one trace element ratio at a split, but we use multiple trace element ratios at a split at the same time, and therefore the analytical error is negligible. The decision tree is really elegant because we do not need to normalize the data, and you saw there is a huge variability in the values of the data. For the multi-class regression analysis, however, we have to normalize the data. What we did is normalize each trace element ratio for each source in such a way that the mean is set to zero and the standard deviation is set to one. And then we already saw that there are certain trace element ratios which are really well suited to
discriminating one source from the other. We also performed a so-called t-test, or Welch test, which shows us the significance of the trace element ratios that are suited for discriminating the sources from each other. For example, if we look at the ratio zirconium/hafnium, here you see south of the Motagua fault zone, north of the Motagua fault zone, the Dominican Republic, and Cuba, and here again Cuba, the Dominican Republic, north of the Motagua fault zone, and south of the Motagua fault zone. So if we want to separate Cuba and the Dominican Republic from Guatemala, we can use zirconium/hafnium, as this is a very significant ratio. If, for example, we would like to discriminate Cuba from south of the Motagua fault zone, we could again use zirconium/hafnium, which is a very significant ratio. We performed this test to filter which ratios to use at a certain split.

Then we came up with a three-class model in which we are able to separate Cuba versus the Dominican Republic versus Guatemala as one source. First we have the group Cuba and Dominican Republic versus Guatemala, and we split the source rocks using zirconium/hafnium, lanthanum/thorium, and yttrium/thorium, which gives us more than 98 percent correct classification of the source rocks. After that we split Cuba from the Dominican Republic using those trace element ratios, and here we still get 83 percent correct classification. Splitting the Dominican Republic from Cuba is quite challenging, as those sources are pretty close and therefore have more geochemical similarities.

But we were really optimistic and also tried a four-class model, splitting Guatemala into two sources as well. We use six ratios for that, and we still achieved more than 89 percent correct classification. But I would say, when it comes to provenance work, just the first split, which already says whether something comes from Cuba or the Dominican Republic, or from Guatemala, is already a big step in
Caribbean archaeology, just to be able to discriminate between those two main areas.

To conclude: I have shown you that Caribbean jade sources overlap heavily geochemically. Outcrop and hand-specimen heterogeneity are significant, which is why we use trace element ratios, which are better suited for discrimination than trace element concentrations. Because of the relatively young age of most protoliths and of the time of jade formation, isotope compositions are not really distinct between source regions. Nevertheless, Cuban samples have the least radiogenic 87Sr/86Sr ratios, and this might be quite helpful in characterizing artifacts in the future. Also, given the complex tectonics of the region, we cannot rule out that there are more sources that are still not known, so the model has to be adapted if other sources are found. By using multiple trace element ratios at a specific split we decrease the significance of the analytical error, which is important for future predictive models for assigning artifact data. And I have shown you that we are able to separate the three main sources from each other, and that we are also able to do a kind of four-class model. Thanks for your attention.
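
The two preparatory steps mentioned in the talk, rescaling a ratio to mean zero and standard deviation one and ranking ratios with a Welch test, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `zscore` and `welch_t` are made up here, the numbers are invented and are not measured values, and the sketch stops at the test statistic and degrees of freedom because the standard library has no t-distribution for a p-value (SciPy's `scipy.stats.ttest_ind` with `equal_var=False` is one common way to obtain it).

```python
import math
from statistics import mean, stdev, variance

def zscore(values):
    """Rescale values so that the mean is 0 and the sample standard deviation is 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Unlike Student's t-test, this does not assume that the two groups
    have equal variances or equal sample sizes.
    """
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Invented values of one ratio for two hypothetical source groups.
group_a = [30.1, 31.4, 29.8, 32.0, 30.6]
group_b = [44.9, 47.2, 41.5, 49.0, 45.3, 43.8]

t, df = welch_t(group_a, group_b)
print(f"t = {t:.2f}, df = {df:.1f}")
print(zscore(group_a))
```

A large absolute t value for a ratio means the two groups differ strongly in that ratio relative to their scatter, which is the sense in which a ratio would be called significant for a given split.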