The chemicals that plants make protect them against pests and diseases, and attract herbivores, pollinators and seed dispersal agents. Of course we use them as flavours, fragrances, colours, scents and drugs, and for industrial biotechnology applications. They're a hugely valuable resource, but they're very complicated. My interest in this area started with a particular group of chemicals called triterpenes, which are one of the largest and most structurally diverse types of specialised metabolite produced by plants. The triterpenes share a common pathway with sterol biosynthesis, which of course is essential, because sterols are essential components of membranes and they're also hormones. Sterols are part of primary metabolism, and they are made via the mevalonate pathway here, which goes to this precursor, 2,3-oxidosqualene. 2,3-Oxidosqualene is linear, but it can be cyclised by sterol synthases (in plants this would be cycloartenol synthase) to make the first committed precursor in the sterol pathway here, cycloartenol, which can then go on and be modified and turned into various sterols. Alternatively, this molecule can be cyclised to make a range of alternative products that come under the heading of specialised metabolism. Now there are some caveats around that, which we can come back to. So this is a fascinating molecule, and the enzymes that do this origami process are equally fascinating, if not more so. And these are just a few examples of the alternative cyclisation products that can be made from 2,3-oxidosqualene. And I'm going to focus in a little while on this one here, which is called beta-amyrin. It's a pentacyclic triterpene. It's one of the most common triterpenes found in plants. So beta-amyrin happens to be very common in plants, but it's also a precursor. Oh sorry, before I say that, I've just mentioned that these are some of the alternative cyclisation products of 2,3-oxidosqualene.
Actually there are an awful lot more of them out there, and these are just a few of them. So these are structures that have been reported from nature, and their structures are consistent with their being made by cyclisation of 2,3-oxidosqualene. So just with this very simple precursor, we're already getting a huge amount of structural diversity. And simple triterpenes have important properties in their own right. For example, oleanolic acid, which is a derivative of beta-amyrin, the molecule that I just mentioned to you, has weak anti-inflammatory and anti-cancer activity, and there has been a lot of interest in trying to make improved versions of this that can be used as therapeutics. But of course a molecule like this is pretty difficult to modify chemically and selectively, because the scaffold has very little functionalisation to work with. So simple triterpenes are important, but in plants these molecules are often converted to far more complicated things. And here I'm just showing you a few examples of rather more complex molecules that are all made from that simple starting point, beta-amyrin. And you can see that I've highlighted a few properties associated with these. So they all have this simple scaffold, but they have various other modifications. They have oxygenations, they have sugars, here is an acyl group, and you can see that there is a lot of diversity. So there is diversity in the scaffold, and then all of these further modifications, and there is a whole suite of enzymes that mediate this and can convert these molecules into all kinds of interesting bioactives.
Very few of these can actually be purchased, and there are lots of problems with accessing them from nature: you have to access the plant material, they're produced in small quantities, and they're often present as complex mixtures. So there are a lot of issues with trying to get hold of triterpenes so that we can do systematic analysis to understand the relationship between structure and function. And you can see the properties listed here are diverse. This one, which I'm going to come back to, happens to fluoresce bright blue because it has an N-methyl anthranilate group here. This is a very unusual property amongst the triterpenes, and as I will show you it's been extremely useful to us. This molecule is produced only in the roots of oat, the genus Avena, and it's antifungal. It is also phytotoxic, so it's a natural herbicide. And then you can see down here we have another triterpene produced by pea, called chromosaponin I, and this, instead of being phytotoxic, is actually a plant growth stimulant. So we're really interested in understanding whether those molecules act antagonistically on the same pathway or whether there are different things going on. This molecule produced by legumes is associated with bitterness and antifeedant properties, whereas this one from licorice is 50 times sweeter than sugar. And then we have various pharmaceutical properties listed here as well. So this is just a snapshot. This is a very, very small amount of the diversity that's out there in nature. And what got me interested in this whole area is this molecule here, which is called avenacin A-1, and this is one of those stories of serendipity.
And it started because I was interested in some papers published in the 1940s by a lady called Elizabeth Turner at the University of Oxford, who had seen that the tips of oat roots, Avena, fluoresced bright blue under ultraviolet light. She made extracts from these roots, and she found that there was a substance in there which was antimicrobial. She called this the root tip glycoside because she'd shown that it was glycosylated. At the time she didn't know the structure, which was subsequently determined and shown to be this molecule, which has a beta-amyrin scaffold, the fluorescent N-methyl anthranilate group here, and a trisaccharide chain. So this molecule is produced specifically by oat. Most of these specialised chemicals produced by plants are only produced within particular narrow taxonomic windows, and that's why they're called specialised. They're also often called secondary, but the term secondary is going out of fashion because they're clearly important molecules. So Elizabeth Turner in the 1940s proposed that this molecule might be protecting oats against attack by soil-borne pathogens. And I embarked on a somewhat reckless set of experiments to address this, because at the time people were really interested in molecules produced in plants in response to pathogen attack. And this molecule is just sitting there in the soil. It's produced as part of normal growth and development. Nobody had asked at the time whether preformed chemicals might be important in protecting plants, and to me that seemed like an obvious possibility. So we took diploid oat and we treated it with a chemical mutagen, and we simply screened by putting germinating oat seedlings onto a UV transilluminator. We looked for seedlings with reduced fluorescence, and this is one of them here. Clearly this molecule is very complex. We expected to find many genes involved in its synthesis.
At the time that we started this, nothing was known about triterpene biosynthesis for any plant species. Oats are very unusual amongst the cereals and grasses in making antimicrobial triterpenes, whereas the dicotyledonous plants make a huge array of triterpenes. So this was an unusual situation in the grasses. And people were interested in why oats were making this and other cereals such as wheat and barley and maize were not. So we did this mutagenesis. We got a bunch of mutants. In fact, we have nearly 100 mutants in this very complex pathway. And then we tested the mutants to see if they were compromised in disease resistance, and the answer was very clearly yes. So here is the wild-type oat, and this has been challenged with a soil-borne pathogen that causes massive disease losses on wheat, and it's resistant. The mutants all show severe lesions and are clearly susceptible. So we had addressed Elizabeth Turner's question, and we had shown that this molecule is involved in protection against soil-borne pathogens. And we would have stopped there, because diploid oat is not a model species by any means. It's not Arabidopsis. And this is a very, very complicated molecule. But the reason why we kept going was because the genetics started to tell us that something very strange was happening. The loci that we had defined by mutation but had not yet characterised were not segregating independently when we did genetic analysis. They were sticking together. And this suggested that the genes for this very complex pathway were genetically linked in the genome. And many years on, we've now characterised all of the genes in the pathway. Here I'm just showing five. This is the gene for the first step, the second step. These are three genes for the acyl group. And these genes are physically adjacent. Those of you who work on bacteria might think this is really funny because this is 400 kilobases here. This is a big distance.
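The observation that the loci "stick together" can be illustrated with a small calculation. The sketch below is only an illustration under stated assumptions: it supposes a testcross between two loci, the progeny counts are entirely hypothetical (they are not data from the oat work), and the function names are invented for this example. It computes the recombination fraction and a chi-square statistic against the 1:1:1:1 ratio expected if the two loci assorted independently.

```python
# Minimal sketch: are two loci linked? Hypothetical testcross counts only.

def recombination_fraction(parental: int, recombinant: int) -> float:
    """Fraction of progeny carrying a recombinant combination of alleles."""
    total = parental + recombinant
    if total == 0:
        raise ValueError("no progeny scored")
    return recombinant / total


def chi_square_vs_independent(counts: list[int]) -> float:
    """Chi-square statistic against a 1:1:1:1 ratio over four progeny classes."""
    if len(counts) != 4:
        raise ValueError("expected four progeny classes")
    expected = sum(counts) / 4
    return sum((observed - expected) ** 2 / expected for observed in counts)


# Hypothetical progeny classes: two parental types, two recombinant types.
progeny = {"parental_1": 48, "parental_2": 46, "recombinant_1": 3, "recombinant_2": 3}

rf = recombination_fraction(
    parental=progeny["parental_1"] + progeny["parental_2"],
    recombinant=progeny["recombinant_1"] + progeny["recombinant_2"],
)
chi2 = chi_square_vs_independent(list(progeny.values()))

# Critical value of chi-square for 3 degrees of freedom at p = 0.05.
CRITICAL_3DF_P05 = 7.815
linked = chi2 > CRITICAL_3DF_P05

print(f"recombination fraction = {rf:.2f}, chi-square = {chi2:.2f}, linked = {linked}")
```

A recombination fraction close to 0.5 would be what unlinked loci give; a value far below that, with a chi-square well above the critical value, is the signature of loci that are inherited together.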
But actually in plants, this is a gene-rich region. And there are no other obvious genes in between the genes that we find. And I should emphasise that these genes all encode entirely different types of enzymes. This is the oxidosqualene cyclase that converts 2,3-oxidosqualene to beta-amyrin. This is a P450 that introduces modifications to beta-amyrin. This is an acyltransferase, a sugar transferase, a methyltransferase. So this is not a tandem duplication event where we have an array of genes that have been duplicated. And I can also tell you with confidence that this region has not originated from microbes. It is not a horizontal gene transfer event. And I can further say, and there is a lot of evidence for this, that this cluster has evolved since oats separated from other cereals and grasses. So something very strange is happening. Now when we started this work, the general understanding was that genes for the synthesis of specialised metabolites and for other multi-step processes were scattered around the genome. But here we have something that looks a little bit more like a bacterial operon. It's not an operon, because the genes are transcribed separately, but nevertheless they are physically clustered. Together they make this molecule, which is required for plant defence. The molecule, if you take a cross-section of an oat root, is in the epidermis. So it's in the right cell layer to provide protection. It makes sense. The genes are expressed in the roots but not in the other parts of the plant. And when we do messenger RNA in situ hybridisation, we can see that, in fact, they're also expressed in the epidermis. So these are longitudinal sections of the root and this is where the signal is. So we have physical clustering and co-expression. This is a recently evolved pathway. These genes have somehow been recruited from existing components elsewhere in the genome and assembled, I'm choosing my words carefully here, assembled into a cluster.
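The features just described (physically adjacent genes, entirely different enzyme types rather than a tandem array, and an oxidosqualene cyclase for the first step) can be written down as a simple search rule. The sketch below is a hedged illustration, not the method used in the work described: the `Gene` record, the gene names, the coordinates, the 400 kb window and the threshold of three enzyme families are all assumptions made for the example.

```python
# Minimal sketch: flag windows around a seed gene that contain several
# different enzyme families. All data and thresholds here are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # position on the chromosome in base pairs
    family: str  # predicted enzyme family, e.g. "OSC" or "P450"


def candidate_clusters(genes, seed_family="OSC", window_bp=400_000, min_families=3):
    """Return lists of gene names found within window_bp of each seed gene,
    keeping only neighbourhoods with at least min_families distinct families.
    Counting distinct families excludes simple tandem arrays of one gene type."""
    ordered = sorted(genes, key=lambda g: g.start)
    clusters = []
    for seed in ordered:
        if seed.family != seed_family:
            continue
        nearby = [g for g in ordered if abs(g.start - seed.start) <= window_bp]
        if len({g.family for g in nearby}) >= min_families:
            clusters.append([g.name for g in nearby])
    return clusters


# Hypothetical annotation: one mixed neighbourhood, one tandem P450 array,
# and one isolated oxidosqualene cyclase.
annotation = [
    Gene("g1", 100_000, "OSC"),
    Gene("g2", 180_000, "P450"),
    Gene("g3", 260_000, "acyltransferase"),
    Gene("g4", 350_000, "glycosyltransferase"),
    Gene("g5", 480_000, "methyltransferase"),
    Gene("t1", 5_000_000, "P450"),
    Gene("t2", 5_010_000, "P450"),
    Gene("t3", 5_020_000, "P450"),
    Gene("g9", 9_000_000, "OSC"),
]

clusters = candidate_clusters(annotation)
print(clusters)
```

In this toy annotation only the neighbourhood around `g1` is reported. The tandem P450 array has no seed gene, and the isolated cyclase `g9` has no neighbours. A real search would also need the co-expression evidence mentioned above, which this sketch does not model.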
Now we know a lot more about this pathway, which I won't go into the details of. We've characterised all of the genes and enzymes now. We haven't published on all of them. We know that the blue fluorescent molecule ends up in the vacuole. We know where a number of the proteins are. We do know that these proteins do not exist as a single multi-protein complex because we know, for example, that the acyltransferase is in the vacuole, that the early steps are likely to be on the cytosolic face of the endoplasmic reticulum and that the other intermediate steps are probably cytosolic. And here is the molecule in the vacuole. So one of the very ambitious projects that we have is to see, having learnt what we have from oats, whether we can take these genes and engineer them into wheat and into other plant species to make antimicrobial triterpenes: not necessarily the blue fluorescent one exactly, but to make triterpene scaffolds and modify them to give antimicrobial compounds. And if we can do this, then perhaps we can prevent this dreadful soil-borne disease of wheat, which causes hundreds of millions of pounds of yield loss per year in the UK alone. So this is a very ambitious project. One of the things that surprised me when I first started thinking about plant engineering is that despite all of the effort that has gone into this area since the 70s or 80s, there are very few promoters available. And there are certainly very few promoters available that we can be confident will drive coordinate gene expression if you want to express multiple genes in the same tissue at the same time. There's still a big challenge there. And we found something rather surprising when we looked at our gene cluster. We can come to the reasons why genes might be clustered, and I'm very interested in your views on this.
But clearly one explanation, or one partial explanation, could be that physical clustering enables a higher level of regulation of gene expression at the level of chromatin and potentially also higher nuclear organisation. However, if we take the promoters of these genes out of this region and we hook them up to a reporter gene and we put them into other plant species, they work. So this is oat, and here you can see fluorescence in the root tips as I've shown you. What I haven't shown you is that we also see fluorescence in the lateral root initials. So this is the pattern of expression. If we hook the promoters up to a reporter such as GUS, we get blue staining where we see expression. So in diverse plant species such as Arabidopsis and rice and legumes (this is just with the promoter for the first gene in the oat cluster, but it's true also for the others) we see GUS expression in the root tips and the lateral root initials. So remember that this pathway has evolved recently. It's specific to oat, and so this provides a further conundrum. It looks as though the promoters for the oat cluster have somehow plugged into some sort of ancient, highly conserved root development process which is conserved across the monocots and the dicots. It's also very useful because it means we can use these promoters in other plant species, and here they are in wheat. So this is really nice. We now have a set of 11 promoters that we can use to drive the expression of multiple genes in wheat roots. That's a really useful resource. We still don't quite understand how it works. Coming back to the triterpenes, I mentioned that 2,3-oxidosqualene could be cyclised. It can be converted into all of these different cyclisation products by enzymes known as 2,3-oxidosqualene cyclases, and I showed you examples of all of these diverse scaffolds. And we now know, oops, very many years on, a lot more about the enzymes that make these scaffolds.
We still haven't characterised very many of them, but these are some examples. The oat one is up here. Interestingly, dicots make beta-amyrin in different ways, and then there are other enzymes that make all sorts of diverse scaffolds. So it's a very interesting group of enzymes, because they take one substrate but they're able to convert it to all of these different products. Most of these enzymes are specialised; some of them make multiple products. We're beginning to learn about the cytochrome P450 enzymes that are able to oxygenate these scaffolds. So this is beginning to provide us with tools that we can use to selectively modify scaffolds in different places, and these are all beta-amyrin-modifying cytochrome P450s, not just from oat but from a whole range of other plant species. So collectively now, a number of groups around the world have discovered a lot of genes and enzymes for the synthesis of triterpene scaffolds and their modification, and we're putting together a toolkit that we can use to make suites of structural variants of these molecules. So these are beta-amyrin-modifying P450s. There are also P450s available now that modify other scaffolds, which we and others have characterised, and you can see that we're building up a very powerful set of resources here for the modification of scaffolds, which can then be further modified enzymatically or of course could be modified using chemistry. And we're beginning now to learn also about the sugar transferases, because we've had to work our way through the cyclases and then the P450s to get to the next step, which is glycosylation. There are now quite a few sugar transferases that put sugars onto various positions around the scaffold, and of course glycosylation also changes the physical and the biochemical properties of these molecules quite markedly.
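The way such a toolkit multiplies into suites of structural variants can be sketched as a simple enumeration. This is an illustration only: the part names below are placeholders invented for the example, not real enzyme names, and the sketch lists candidate combinations without saying anything about which ones would actually produce a product.

```python
# Minimal sketch: enumerate candidate enzyme combinations from a toolkit.
# Part names are hypothetical placeholders, not real genes or enzymes.
from itertools import product

scaffolds = ["beta-amyrin", "other_scaffold"]    # products of different cyclases
p450s = [None, "P450_a", "P450_b"]               # None means no oxygenation step
sugar_transferases = [None, "GT_x"]              # None means no glycosylation step

# One tuple per candidate pathway: (scaffold, P450 step, glycosylation step).
designs = list(product(scaffolds, p450s, sugar_transferases))

for scaffold, p450, gt in designs:
    steps = [scaffold] + [step for step in (p450, gt) if step is not None]
    print(" -> ".join(steps))

print(f"{len(designs)} candidate combinations")
```

Even this very small toolkit gives 12 combinations to test, which is why a fast way of trying combinations matters as the number of characterised enzymes grows.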
One of the things that has been very helpful to us: I mean, there's a general feeling out there that working with plants is extremely slow, and when you're doing genetics, you know, it is. But one of the things that's really accelerated what we have been doing is this very nice transient plant expression system developed by my colleague George Lomonossoff at the John Innes Centre. It is a very simple system where you can take your gene of interest, in this case GFP. George accidentally discovered a number of years ago, through his work on cowpea mosaic virus, that if you have a little bit of cowpea mosaic virus sequence from RNA-2 inserted in front of your gene of interest, this gives a massive elevation in the amount of protein that's produced, and this effect is post-transcriptional. So if you have your construct with your cowpea mosaic virus sequence here and a suppressor of gene silencing, P19, you put these into Agrobacterium, the workhorse that we use for plant transformation. You just squirt your Agrobacterium into the leaves, and within five or six days you have, in this case, very clear expression of green fluorescent protein, and the levels of protein using this system, because of these sequences, are massively elevated. So we wondered whether this would also work for the molecules that we're interested in. We made constructs and we did this expression in this Nicotiana benthamiana plant, which is particularly amenable to transient expression. Here we've got GC traces: this is an extract from an empty vector control leaf, this is an extract from a leaf expressing our beta-amyrin synthase, and this is a peak of beta-amyrin, so this is the black scaffold here. And then here we have the second step in our oat pathway, which is a very interesting cytochrome P450 which we knew modified beta-amyrin, but we weren't sure exactly how. When we co-infiltrate the genes for these two constructs, the beta-amyrin peak goes right
down and this peak comes up, and we were able to easily get enough to purify and determine the structure using NMR. Interestingly, although I won't go into this in any detail, this P450 is able to modify two rings on this structure. It's a very interesting enzyme. So this then was a proof of concept, and we've now gone on, taking our triterpene toolkit. You'll hear more from Jim about the common plant syntax and the ways in which we've been getting our DNA synthesised and formatted so that we can mix and match and exchange with others. So we have a toolkit of genes and enzymes for triterpene synthesis and modification to make known and novel molecules, and we can now play with these. We put them into the Nicotiana leaf expression system and within five or six days we get our answer: we know which enzymes will do what. So this is very rapid and very powerful, and we've been using it to assemble multiple steps in the synthesis of triterpenes to make very complex molecules. We've been able to easily make several hundred milligrams, which is very useful for doing bioassays, and recently somebody in the lab has made a gram of beta-amyrin, so we can go up to gram levels. The question now is, can we go down to high-throughput robotics? You'll see how that can be even more powerful fairly soon. Now I mentioned this phenomenon of gene clustering, and I said it was unusual, it was surprising. At the time we found the oat gene cluster there was only one other example of a gene cluster for a specialised metabolic pathway in plants, and that had been reported by Alfons Gierl in Munich, who was working in maize on molecules known as benzoxazinoids, which are also implicated in plant defence. That was the only other example, and that was about 15 years ago now. But, sorry, the general view until quite recently has been that genes for multi-step pathways are dispersed around the genome. The anthocyanin pathway in maize is a great example of this, and the genes for
anthocyanin biosynthesis are on different chromosomes in maize and are used to follow the segregation of genes in genetics. It's well known. You could say, well, are anthocyanins really specialised, because all plants make anthocyanins. Carotenoid genes similarly are dispersed, and carotenoids are also very widely distributed. Nevertheless, there are examples of other specialised metabolic pathways, for example for glucosinolates in brassicas, that are not clustered. So we wanted to find out how common this clustering phenomenon might be. There was the original maize cluster and the oat cluster. So we went to Arabidopsis, because at the time there weren't many plant genome sequences available and the Arabidopsis genome was very high quality, and we did a very crude mining experiment. We simply looked for genes that were predicted to encode 2,3-oxidosqualene cyclases, so genes that look like they might be able to take 2,3-oxidosqualene and turn it into one of those cyclic products that I've mentioned. In the Arabidopsis genome there are 13 genes that belong to that family. One of them is required for the synthesis of cycloartenol, which is, as I said, the precursor for essential sterols, but the others all make diverse products, and that's been shown in yeast, so they're not involved in essential sterol biosynthesis. This is one of those genes. When we look around that gene in the genome, we see that its neighbours are genes that are predicted to encode two very different types of cytochrome P450, which have not arisen by tandem duplication (they're very different), and an acyltransferase. We also saw that these genes were very tightly co-expressed in the available gene expression data. So this looked like a candidate metabolic gene cluster, and Ben Field, who was in the lab at the time, did some very careful analysis using mutant lines that were blocked in these various steps, and also using overexpression lines and silenced lines and biochemistry, and he was able to show that these
genes did indeed form a pathway. We called it, rather mischievously, the first example of an operon-like gene cluster in Arabidopsis. It's not an operon, as I say, but the point is these genes are not tandem duplicates, they encode steps in a pathway and they're co-expressed, so they have operon-like features. We and others have then gone on to predict and validate other metabolic clusters in Arabidopsis, and there's a second one that we found. The first one was for the synthesis and modification of thalianol, which is a triterpene, and then another one for the synthesis and modification of marneral, another triterpene. I should say at this point that although the oat avenacin pathway and the Arabidopsis triterpene pathway which I've just shown you are both triterpene pathways, there is compelling evidence that these pathways have evolved independently of each other, and if anyone wants to know more I can tell you afterwards. So this would appear to suggest that triterpene pathways may be predisposed to clustering. We found a second triterpene cluster in Arabidopsis, the marneral cluster, and through careful analysis, looking at phylogenetics and intron-exon patterns, we were able to put together this model, which may be a little bit too generous. I mean, it's possible there was an ancestral gene pair that was the foundation for both of these clusters, but we can't be sure of that. This contained an oxidosqualene cyclase and potentially also a particular type of cytochrome P450. So conceivably this then duplicated, but nevertheless what has happened since then is that other genes have been recruited into these regions, and these clustered pathways make entirely different products. So this is a possible scenario for the ways in which, in Arabidopsis at least, these clusters may form. Now since the maize and the oat story and the Arabidopsis work, others were discovering other clusters in other plant species. Work from Reuben Peters' lab and Professor
Yamane's lab in Tokyo had led to the discovery of clusters in rice, again for the production of defence-related compounds, diterpenes. And this figure, which you won't be able to see because it's too small, is from a review that we published at the beginning of last year, and this shows clusters from a growing number of different plant species for different types of chemical. Excuse me. And this just gives you an example of some of those. These molecules are all produced by clustered pathways. Here is the oat compound, here is the maize compound, Arabidopsis. There have been clusters reported for the synthesis of anti-nutritional compounds in tomato and medicinal drugs in poppy. These are the rice diterpene clusters. Interestingly, the cyanogenic glycosides have traditionally been thought of as the most ancient and highly conserved group of plant natural products, and these are sporadically distributed across the plant kingdom. It turns out, and this is work from a group in Copenhagen, that those pathways are clustered in Lotus, in sorghum and in cassava, but importantly those clusters have arisen independently. So we have independent evolution of genes that give rise to cyanogenic glycosides in diverse plant species. So, the reason for clustering. Now depending on your background you may have different explanations for this. This is clearly a non-random organisation of genes in the genome. Plant genomes generally contain around 30,000 genes, so to have genes right next to each other that are not tandem duplicates and that are delivering these beneficial pathways is definitely not random. As I mentioned earlier, physical clustering has the potential to facilitate higher-level regulation at the level of chromatin and at the level of nuclear organisation, and I've shown you that the genes in the oat cluster have this very, very tight expression pattern. In Arabidopsis, this is in silico gene expression data. These are the four genes of the thalianol cluster and you
can see their expression patterns across different tissues are very similar. When we get to the edges, to genes that are not involved in this pathway, the expression pattern is different, so these are windows of co-expression. We showed some years ago, using DNA fluorescence in situ hybridisation with the oat cluster, what happens if we take probes for the first gene in the pathway and the second gene in the pathway, in green and red here. This is looking at DNA now. You can see that in the epidermis these signals are quite strung out, they're quite extended, and of course the pathway is active in the epidermis, whereas in the cortical cells, where the pathway is not expressed, the signals are more discrete. It was possible to actually quantify these differences. So this is descriptive, but it suggests that chromatin decondensation is associated with expression of this pathway in oat. We've gone on to do a lot of work in Arabidopsis, where the resources are much better for looking at chromatin modification and the effects of various modifications on cluster expression, and a picture is emerging now, both in Arabidopsis and from what we've been able to do with other plant species where the resources are available. So in Arabidopsis, for example, genome-wide chromatin immunoprecipitation has enabled regions to be identified that are marked with a histone modification. This is histone H3 lysine 27 (H3K27) trimethylation, and this is usually a repressive marking. These clusters have a very, very discrete and localised marking of this H3K27 trimethylation, and this is generally installed by the Polycomb complex, so this is associated with inactivity. But we also see that in tissues where clusters are active we have exchange of the histone H2A protein with histone H2A.Z, which again has been associated with poising and with readiness of genes, for example in yeast, to enable rapid expression in response to environmental change. So we still have a lot to learn about this, but I
should also say that while in animals regions of contiguous marking of genes with H3K27 trimethylation are well established, in plants they aren't. There has been a lot of work in plants on Polycomb-mediated regulation of gene expression, most notably in the context of plant development, but these genes are generally isolated, they're not contiguous. And it transpires that we can use these markings as an additional tool to try to mine genomes to discover new pathways. Here you can see these are the genes in the thalianol pathway in Arabidopsis, and this is the H3K27 trimethylation marking, which is very clearly significant and pronounced. We can actually go through data from the Arabidopsis genome: we can look for windows of co-expression, we can also look for windows of H3K27 trimethylation in tissues where the pathways are not active, and we can use that to find new candidate pathways. Similarly, I mentioned that H2A.Z, this alternative histone H2A variant, is associated with pathway activation, and we can use mutants to look for windows in the Arabidopsis genome where we see a very clear effect on transcript levels of cluster components. So this is the thalianol cluster. This dotted line is the wild-type line, but in a mutant that is lacking H2A.Z you can see that the transcript levels go right down, and there's this very discrete window. This is very striking. Similarly for the marneral cluster. And this is a new cluster that we've recently discovered and that others have also done some work on; this is a much larger cluster and the effect is very, very pronounced. So we're beginning to learn about the features of these clusters, not only in terms of the genes within them but also these kinds of markings, and this can all be fed into what we hope will ultimately turn into a machine for mining plant genomes for discovering new pathways and chemistries. So, the arguments for clustering. I've talked about co-expression. I've also told you that these clusters have either been
shown to make compounds that provide protection against pests and pathogens, or they're likely to have some role in survival in the environment. So it's generally accepted that these specialised metabolic pathways are serving a useful purpose in nature. If you have assembled good combinations of genes that together are able to make a protective molecule, presumably once you've done that you want to co-inherit that gene set in order to be able to continue to make the molecule. So that leads to another theory that has been proposed, which is the co-inheritance theory. On top of that, we also know that in some cases, if we interfere with these pathways with mutants or with overexpression of pathway genes, then we get elevated levels of intermediates accumulating, and this can have clearly detrimental effects, not always, but in some cases. So in oat, for example, failure to add a glucose to the trisaccharide chain of the triterpene leads to a very sick root phenotype. Here is the inside of the wild type, a normal wild-type root cell, and this is inside the mutant. The epidermal cell layer is very distorted, there are these big membranous sacs, and this is callose staining; an accumulation of callose in this way is typical of a response to toxicity. And in fact, if we combine this mutation with a mutation in the first step in the pathway, we lose the ability to make the intermediate and we restore the normal root phenotype, although clearly we don't make the molecule. So this is a toxic intermediate effect. Similarly in Arabidopsis, if we mess with the thalianol cluster or the marneral cluster, we can have these clearly detrimental effects on plant growth. So if you interfere with clusters, you not only lose the ability to make a protective molecule, you can also generate molecules that are very bad, that cause serious effects on growth and development. So this is very simplistic, and as I said, it depends on your background which of these you might jump to, and for the population genetics in the
audience, the geneticists in the audience, there's another level, obviously. So co-expression, co-inheritance, toxic intermediates: these are not mutually exclusive arguments, and they're also not the full story. And certainly we don't always see toxic effects when we interfere with pathways. And I think we can learn something. So, on Victor's comment at the beginning of the meeting about separating out where do biological systems, biological objects, come from and how do they work: to me that's a bit of a blur. I mean, you can't recreate evolution, obviously, but I think we're getting glimpses into what might have happened in terms of cluster assembly, and how clusters work, through these kinds of experiments.

So this issue, then, of how do these pathways assemble ("evolve" is a bad word, certainly in this forum), how do they assemble? And the answer is we still don't really know. I've shown you a model from Arabidopsis. We have a gene encoding the first step in the pathway, which makes the scaffold, or at least is the first committed step in the pathway, and then we have genes encoding enzymes that modify the product of the first enzyme. We call these tailoring enzymes, by analogy with microbial systems, and these are things like cytochromes P450, sugar transferases, acyltransferases, methyltransferases. So often the gene for the first step in the pathway, as I've shown you for the triterpenes, has almost certainly been recruited from primary metabolism, in the case of the oat and the Arabidopsis triterpenes from the sterol pathway, but then it has diverged: it has acquired a new function and also a different expression pattern. Somehow, then, this gene has seeded the formation of a cluster. And again, I'm using my words very carefully here, but there is lots of evidence to indicate that these clusters are forming de novo in plant genomes, and this is where, if the mathematicians have any ideas, I would be really grateful to know them.

So I've shown you examples of toxicity when you accumulate intermediates, but there are also some other very subtle effects, and this is one of them. So going back to our oats: this is the wild-type oat line. This is a mutant that is blocked in the first step in the pathway; it's not able to make the beta-amyrin that is the precursor for the rest of the pathway, at least not in significant quantity. This is a late pathway enzyme, this is the acyltransferase. These roots all look normal. We noticed, and we're growing these seedlings on big square Petri dishes, and we had to do that to see this effect, because initially we were growing them on small Petri dishes and we kept thinking the roots of this mutant look shorter, but it was a bit variable. And here you can clearly see that these roots are shorter. These mutants are blocked here, so they're accumulating 40-fold more beta-amyrin than the wild type. So this is a beta-amyrin mutant that is blocked in the first step in the pathway; there's still a basal level, there's a very low level of beta-amyrin detectable in the wild type and a trace detectable here, but these mutants are accumulating 40-fold more beta-amyrin, which, as I said, is a common plant metabolite.

And when we look closely at what's going on in these roots, these are the root hairs. This is the wild type. These are mutants blocked in the first step, and the roots look normal. These are our sad2 mutants, the mutants that are accumulating beta-amyrin, and the roots are super hairy. These are two independent mutants; this is an intermediate mutant that is partially blocked. So somehow accumulation of elevated levels of beta-amyrin is triggering this super hairy root phenotype. And when we looked at this very closely, we saw something really interesting. So this is patterning in the wild-type root epidermis. This is before the root hairs emerge; we're looking at the meristem. The long grey cells are the cells that will not give rise to root hair cells. The pink cells are shorter, and they are the cells that will give rise to root hairs. In the wild
type we can detect low levels of beta-amyrin; in the mutant we detect, as I say, a lot more. And what we're seeing is actually a change in cell specification. This is really important: this is happening very early on, when the cells are deciding what they're going to be. They're being told, before they form, that they're going to be root hair cells, and this is why we have this super hairy root phenotype. So this is a bit more like a plant growth hormone effect. So when you start to think about all of these things we're seeing, the things that intermediates do, trying to make sense of it is a big challenge, but there are definitely some really interesting things coming through this. So super hairy roots, at some point in time, may actually have been an advantage. Maybe you take a cycloartenol synthase gene, you duplicate it, it acquires a new function, the ability to make beta-amyrin, it's expressed in the roots, and maybe that was a good thing. And we're now looking to see whether these mutants have enhanced biotic and abiotic stress tolerance.

As an aside, we also see interesting, more subtle effects in other situations as well. So, for example, I've shown you toxic effects with the thalianol pathway in Arabidopsis, but we also see some more subtle effects where, if we accumulate elevated levels of thalianol in the roots, we actually see longer roots. If you want to make longer roots and make a fitter plant, that's quite a useful thing to know about. And in legumes, which form symbioses with nitrogen-fixing bacteria, there is a different triterpene called lupeol, and there is a lupeol synthase that is expressed in nodules. When we silence it, we see a window of clearly elevated nodule formation, so lupeol appears to be suppressing nodule formation. So we don't understand all of this, but I'm just saying there's stuff going on, and it gets more complicated. And so these are just more examples of different effects that we can see when we tinker with clustered pathways.

So just before I finish, I
want to tell you a little bit more about what we've been doing in a more big-picture way, because I hope you can see how, once this is all sorted out and automated, this could be incredibly powerful, the whole clustering phenomenon. One of the things that we've been doing, and this is with Alex Boutanaev, who is a bioinformatician based in Russia, is taking multiple plant species, in this case 17 species for which the genome sequences are available. We focused on the terpene family. So the terpenes include the triterpenes but also a whole load of other types of terpenes that I won't go into, but it is a massive family of diversity in plants. What Alex did was he mined these genomes for all terpene synthase genes, so these are the engines of generating diverse scaffolds, and he also mined them for all cytochrome P450 genes. And then he asked, using a sliding window approach, how often he found a terpene synthase gene in proximity to a P450 gene, relative to what would be expected by chance. And the answer was: very often. So we've classified all of the terpene synthase genes according to the nomenclature, and all of the P450 genes, and what we can start to see is that there is a clear skewing: there's a clear tendency of P450 genes to be physically linked, within 50 kilobases, which in plants is close, to terpene synthase genes. And we can also look at the patterns that are emerging in terms of the types of cytochrome P450 genes that tend to be associated with terpene synthases.

When Alex did this, he found the known examples of terpene synthase genes that have already been reported. I've told you about the diterpene clusters from rice; he also found the steroidal glycoalkaloid cluster from tomato, the Arabidopsis clusters and so on, but also a whole bunch of other candidate clusters. One of the things he did, which was very interesting in terms of trying to get more insight into how these pathways evolve, was he took a couple of the major pairings, the major groupings of terpene synthases
and P450s, and this is a little bit, just bear with me, this is easy to understand once you get it. So he took these pairs of terpene synthases and P450s and he did sequence similarity searches. He took a terpene synthase within a pair and asked which one was the most closely related terpene synthase in the other pairs, and then he did the same for the P450 partner of that terpene synthase. What he found was something very interesting, which was that in the dicots, when you found the terpene synthase that was most closely related to it, its P450 partner was also the one most closely related to the P450 partner of that terpene synthase. In other words, that's suggesting an iterative duplication of pairs of terpene synthase and P450 genes. That was not the case in the monocots; something strange is happening there. So it's as if there is a microsynteny in the dicots, and this is kind of the opposite way round to the way that it's normally thought of. I've spoken to a lot of genomics experts and they don't know how to explain this. And we're only looking at this on the basis of terpene synthases, not the whole organisation of the genome, but it suggests that there is microsyntenic duplication in the dicots, and that would be consistent with our Arabidopsis hypothesis for the thalianol and marneral clusters, but in the monocots it's all chopping and changing and de novo combinations.

He also looked at all of the genes within these pairs and compared them to genes across the genome, and looked at the frequency of transposable elements in the flanking regions, because you might expect that transposable elements would play a role in moving genes around. And as we might expect, he found that there were significantly more transposable elements in the flanking regions of the genes within these pairs than in the ones that were not paired. We then went on and took some of these new candidate clusters that Alex had found and we validated them experimentally, and this
is work with Reuben Peters in Iowa. So this was a predicted cluster for diterpene synthesis in castor, this was another cluster in Arabidopsis, again for triterpenes, and this is a cluster from cucumber. So again, it's more grist for the mill; it's a very powerful way of finding new chemistries.

So in summary, I'd just like to say that I've talked about examples of these metabolic gene clusters in plants for the synthesis of diverse chemicals. We don't yet know what the balance will be of pathways where the genes are dispersed relative to those that are clustered. Nevertheless, clustering is very useful for mining genomes to discover new pathways and chemistries. We can look at these clusters to try to understand what their defining features are and to understand how they work. And then we need to know whether we can edit them down (we don't know whether we can) to make minimal gene clusters, which will make it much easier to move these pathways around, certainly in plants, and make synthetic gene clusters. One of the things we've done, for example, is to take the promoters from the oat cluster, which you will remember are expressed exquisitely in the root tips, and we've used those to drive the expression of a three-gene pathway for the synthesis of cyanogenic glycosides in Arabidopsis roots. And this opens up opportunities to start to tailor the rhizosphere by making designer chemicals in plant roots.

And with that I'd like to finish, and just draw your attention to the many people who have contributed to this work and also the funding sources. I think you'll hear more about OpenPlant from Jim in the next talk. Thank you.
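
[Editorial sketch, not part of the talk.] The sliding-window question described above (how often a terpene synthase gene sits within 50 kilobases of a P450 gene, relative to chance) can be illustrated roughly as follows. This is not the analysis that was actually run. The function names, the input format (chromosome, position, family) and the choice of a label-shuffling permutation test as the "expected by chance" baseline are all assumptions made for illustration.

```python
import random
from bisect import bisect_left, bisect_right

WINDOW_BP = 50_000  # 50 kb, the linkage distance mentioned in the talk


def count_linked(tps, p450, window=WINDOW_BP):
    """Count TPS genes with at least one P450 within `window` bp on the same chromosome.

    `tps` and `p450` are lists of (chromosome, position) tuples (hypothetical format).
    """
    by_chrom = {}
    for chrom, pos in p450:
        by_chrom.setdefault(chrom, []).append(pos)
    for positions in by_chrom.values():
        positions.sort()

    linked = 0
    for chrom, pos in tps:
        positions = by_chrom.get(chrom, [])
        # Indices bounding the P450 positions in [pos - window, pos + window].
        lo = bisect_left(positions, pos - window)
        hi = bisect_right(positions, pos + window)
        if hi > lo:
            linked += 1
    return linked


def permutation_p_value(genes, n_perm=1000, window=WINDOW_BP, seed=0):
    """Compare the observed linkage count with counts after shuffling family labels.

    `genes` is a list of (chromosome, position, family) with family in
    {"TPS", "P450", "other"}. Shuffling labels over fixed gene positions is an
    assumed null model, chosen here only because it is simple.
    """
    rng = random.Random(seed)
    coords = [(chrom, pos) for chrom, pos, _ in genes]
    labels = [family for _, _, family in genes]

    def linked_for(current_labels):
        tps = [c for c, f in zip(coords, current_labels) if f == "TPS"]
        p450 = [c for c, f in zip(coords, current_labels) if f == "P450"]
        return count_linked(tps, p450, window)

    observed = linked_for(labels)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        if linked_for(shuffled) >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (at_least_as_extreme + 1) / (n_perm + 1)
```

On real data the gene coordinates would come from genome annotations, and a more careful null model (for example one that preserves local gene density or tandem arrays) might well be preferable; the sketch only shows the shape of the comparison between an observed count and a chance expectation.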