So I am Ewan Birney and my current position is Director of EMBL-EBI. It's worth saying, if anybody's going to look at this, that my first name, Ewan, is actually a sort of family nickname. I'm formally John Frederick William Birney, and Ewan is the nickname that everybody uses for me. So Ewan Birney is very much my scientific name; my legal name is John Frederick William Birney. I was born in London on December 6th, 1972.
So it probably was a guy called Bob Stevenson. I had an English private education, what's called public school in England, though obviously the interpretation is quite the opposite: privately funded. I went to Eton College, and at Eton there was a guy called Bob Stevenson, who was a really good teacher and scientist, and he fired me up, for sure, in biology. At the time Jim Watson, who's basically a tremendous Anglophile, I mean he absolutely loves all things traditionally English, had, I think, gone to Eton and talked to some of the teachers there and been entertained by some of the boys, and decided to take a boy every year from Eton to work in what's called a gap year, which, as you'll know, is sort of traditionally English: a year that you take out after school, at 18, and before university. So he made an offer to do that, and Bob Stevenson was responsible for selecting the boy. Rather than there being some process, it was just basically Bob's choice, and Bob suggested me.
So when I was 19 I ended up at Cold Spring Harbor, living with Jim Watson and working with Adrian Krainer. Again, when you're 19 you're not aware of quite how odd and unique the situation is, and slightly strange. I lived in Jim Watson's house with Liz, Jim's wife, and Jim there sort of in loco parentis. His eldest son Rufus was also there in the house at the same time.
It was very nice. I wouldn't necessarily spend too much time talking about it, but we still exchange Christmas cards or whatever with Jim and Liz even now. They have a lovely house at the top end of Cold Spring Harbor, the director's house, and I would walk in every day and go to the lab. And Cold Spring Harbor was really, really fun, a really enjoyable experience, and it also showed me just how open science is. I hadn't appreciated just how many unknown things there were. So working in the lab, where suddenly you're testing things and you have to work things out and everything else, was really, really cool. I was only 19. It was a great privilege.
When I was there in Adrian Krainer's lab, he studies RNA splicing, the databases had really started to get big and people needed ways of searching them and looking at things. And so I taught myself to program in that year. There was a very lovely guy called Sanjay Kumar. I knew he was a programmer, and when I said, please, how do I do this, he bought me the famous C book, this very thin book about how to program in C, and I taught myself.
And so my first paper was with Adrian, before I went to university. It's a very small paper about the presence of RNA-binding domains in a particular protein that people didn't think had them; it was basically an alignment by hand. But my second paper, which was far, far more substantial, came about in my first year at university, when I was still communicating with Adrian about this. And that paper, which is on the RNA recognition motif, I mean it still gets cited now a little bit, which is kind of cool. It was a good paper, so that was a good thing to do. I must have started that paper when I was 19, and it was published when I was 20, in my first year at university.
So obviously it gave me, in just a sort of timing sense, a huge head start. But I think the other key thing it really gave me was a far better understanding of how science is done: that what you read in the textbook wasn't necessarily what was right, that knowledge is founded on experiments and discussion and everything else. And I had a really, really great time. I mean, being British in America, your accent makes you sound like you've got ten more IQ points. I worked really, really hard at Cold Spring Harbor, and then I'd either go out in New York City or, a friend of mine from Eton had gone to Harvard at the same time, he hadn't done the gap year, he basically went straight to Harvard, so I'd take the train all the way up to Boston and hang out with him in Boston for a weekend, and then I'd come back down again. So it was a really, really fun time.
The other thing that I remember very distinctly is talks. When people came in to give a talk at Cold Spring Harbor everybody would gather, and one scientist, Winship Herr, would, as soon as he didn't understand a slide, put his hand up and say, you know, I don't understand what you're showing on this slide. And he did it because he wanted to understand what the other person was doing. He didn't mind looking a bit stupid sometimes. He really didn't care about that; he just wanted to make sure that he was following all the slides. And again, seeing really, really high-end scientists feeling happy to ask stupid questions, it is a good attitude to have to those talks: do I understand everything that's being presented at this time? It's a kind of good rigor to have in your own head. Have they done all the controls? Has it been set up with the right fundamental thing behind it, basically?
So, Adrian Krainer, who ran the lab at Cold Spring Harbor.
I mean, again, I think it's remarkable to think about someone who let a 19-year-old come into their lab, and then by the end of that year write a paper with them, and then in the next year write a second paper with them. I'm not sure everybody would encourage that to happen, and let me follow my own approach in this computational work, which Adrian wasn't necessarily so strong on but could understand what was going on underneath. So Adrian was really, really good.
And then in Oxford, the Oxford system is four years, and in the fourth year you have to do your own project, and there was an NMR spectroscopist called Iain Campbell. By that time I had published: I'd published a paper in my first year, and then in my second year I actually went to EMBL Heidelberg, which is in fact now my parent organization, to work with a guy called Toby Gibson. And I wrote a second paper there, called PairWise and SearchWise, and that is the precursor to a lot of my algorithms. So I spent a whole summer basically in Germany, in hot Heidelberg, working with Toby, and I published two more papers then with Toby over my time. And then in my fourth year I went to Iain Campbell and effectively said, please, I think I know what I'm doing with my own research, I've got four publications already or something like that; can I just have a computer? All I need is a computer and space, and then that's fine, and you can be my supervisor, I'll obviously talk to you. And he said fine.
At that point I knew that I wanted to do more things with databasing and more things around sequence analysis, and the only open-source database at the time was written by Richard Durbin and Jean Thierry-Mieg, called ACeDB. At that point I had done all these profile things, so I wrote an email to Richard saying, please can I come and learn how to use this?
And so he invited me, and I also gave a talk about my profile work. I had actually incorrectly implemented an algorithm, because I'd sort of taught myself all the way through. So I'd done something quite interesting in PairWise and SearchWise which wasn't quite correct. But Richard Durbin, and Sean Eddy, who was a postdoc there, both, I think, found my self-taughtness in these computational methods really quite interesting. So that was my fourth year at Oxford, and at the end of my fourth year at Oxford I had to decide what I was going to do.
A couple of extra things here. My father is an investment banker, and my uncle was an electron microscopist; that was my uncle on my mother's side. My father was basically rather skeptical that I would make a good career. He felt that I was over-focused on science; he felt that I was doing well, but that I was going to choose this because it was interesting and then regret it in ten years' time because it didn't have enough money, basically. I mean, not that he's very money-orientated, but he worried that I was closing down my options too quickly. So I did a summer working for an investment bank, where I did all sorts of fun things, including option pricing, and, it never happened, but I got quite close to spotting something that they would have created an option around, which was kind of cool, one of these exotic options. So that was one summer; I was mainly in equity research on pharmaceuticals there. And then in fact the year after Oxford I spent a summer working for the mayor of Baltimore, here, just up the road in Baltimore, which was a big education in American life and American politics.
I mean, obviously not very scientific at all. So both of those were slightly to prove to myself, and prove to my father, that I was making a good decision to stay in science.
At the time the Wellcome Trust had something called Prize Studentships, which were really designed, I think, for people like me. I applied to investment banks after going and working in the City. I went to Goldman's, I went to SPI; I had offers, not from Goldman's, but from a whole bunch of investment banks. And I think if there had been just a straight British-salary MRC studentship, where really it wasn't enough to live on at all, I don't think I'd have gone for it. But the Wellcome Trust Prize Studentship was healthy enough that you could live and, you know, you could have a car. Anyway, so I said yes to the Wellcome Trust studentship with Richard, and that's how I got to Sanger and working with Richard, basically.
He's still one of my collaborators. It's really interesting now, being, whatever, mid-40s, thinking about that, because of course I started knowing of him in my 20s and then working with him as a student when I was 24, something like that. But he's a remarkable guy, Richard; you should get an oral history from him. He was a mathematician from Cambridge. He's got these curious things: I think he had a basketball scholarship that he came to Harvard for, something like that. He was a basketball player, volleyball player, something like that. And maths in Cambridge is the highest level, and he's an excellent mathematician, but he's not one of those pure mathematicians, and so he had to slightly decide precisely what he was going to do. One of the things he was doing, and I can't remember quite how this worked with his undergraduate or his graduate work, was confocal microscopy: he programmed the confocal microscopy software and stuff like that. I think that was before he did his PhD.
And then he did his PhD with John Sulston, and that was very much to bring computation in neural networks alongside the developing C. elegans brain. I think he mapped out all, or many of, the neuronal connections of C. elegans, which for somebody who's a maths person is a pretty big journey already. John Sulston had, obviously, a passion for the worm, but also a passion for genomes and a passion for getting these things done. So as the worm genome project was coming up, it was kind of clear that they needed information systems to do that. I'm not sure how this happened, but somehow Richard met and started working with this slightly mad Frenchman called Jean Thierry-Mieg, who always came with his wife Danielle, sort of this husband-and-wife pair. Jean Thierry-Mieg was an ex-physicist, and Richard and Jean wrote an open-source database. These days we'd call it a document-orientated, semi-structured database; at that time they just wrote something that worked for all of this.
It looks very clunky now, and you've got to remember that this was just as Oracle, the database company, was coming up and selling Oracle for massive amounts of money. No academic could ever afford that. It was also before the web. And actually their database was an open-source database before MySQL and the other famous open-source databases came through. It also had the concept of hyperlinks inside the system; ACeDB had an integrated graphical interface with the database. So, pre-web, you had this rather amazing thing. In fact, back in Adrian Krainer's lab, I remember: one year I did my gap year, and the next summer I came back to Cold Spring Harbor as an undergraduate, and in that summer we installed ACeDB on their Sun machine, and I remember getting it up and thinking, this is absolutely amazing.
So Richard wrote that piece of software for the worm project, and basically has been this common thread through the worm genome project, and then the human genome project, and then in fact into the HapMap and then the 1000 Genomes and everything else. Where his real skills are is algorithms and understanding what one should do with sequence data. He also, practically, did write this database, and he practically did a number of things. There was a period when I was a student with Richard where the Sanger really ran on his software, and occasionally things would break and Richard was the only person who could go and fix it, and he would have to go and fix it. It was all not how you would run a system now. But he's a total legend. His maths and algorithms are just really top-notch, and so is his understanding of biology. It was a real experience actually working with him. Up to then I had been self-taught and I had been the best programmer, sort of the best person, and then I met Richard and it was very clear that there was somebody better, exactly, somebody better than me at all of these things.
So that was great. In my PhD I did both practical things and very algorithmic things. The practical thing was running the Pfam database, which was also slightly, again, you just wouldn't do it this way now. It was gently chaotic; not chaotic, but a kind of Heath Robinson-like approach to these things. At the same time I was focusing on developing algorithms, which I thought about from PairWise and SearchWise, and on generalizing those algorithms, and to match how the maths worked with how the algorithm works I wrote my own little miniature programming language. Now again, because of my slight self-taughtness, it's not a good piece of computer science. However, it did elevate dynamic programming, which is probably the key sequence alignment method, to a first-class language primitive. So I was able to write far more complicated algorithms, I think I probably still am able to write far more complicated algorithms, and just throw them around, change things, test things, do things in different ways, and then know that my programming language would accurately generate all the right code for it, so I didn't have to worry about debugging it and things like that. As well as being practical for the final set of algorithms that I wrote, it actually gave me huge amounts of freedom to experiment. The best algorithm that came out from that was an algorithm called GeneWise, which takes a protein sequence and matches it to the genome, but handles splicing and handles errors. In my thesis, the major part was about the language, which was called Dynamite.
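[Editorial note: for readers unfamiliar with dynamic programming in sequence alignment, the following is a minimal sketch of the general technique, assuming a simple global-alignment scheme with match +1, mismatch -1, and a linear gap cost of -1. It is not Dynamite or GeneWise code, and it models neither splicing nor sequencing errors; the function name `global_alignment_score` and the scoring values are illustrative assumptions.]

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the best global alignment score of strings a and b.

    Needleman-Wunsch style recurrence: each cell holds the best score
    for aligning a prefix of `a` with a prefix of `b`.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty string costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    return score[-1][-1]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
```

[The recurrence in the inner loop is the part that a language with dynamic programming as a primitive would let the author state directly, leaving the table allocation and loop code to be generated.]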
There's only one user; that's me, really. There was a second user, but she didn't... unsurprisingly. It's one of those things where there's really only one user, which is the person who created it in the first place. But actually I think there's a lot of computer science like that. It's actually a very interesting debate about how you provide abstraction, because ultimately you start realizing, oh, I could write my own computer language that does this, and very often that starts off by itself writing a different, lower-level computer language like C, and that's the major thing that Dynamite did. But actually I did go all the way to producing things where I could manipulate everything. So although I never wrote assembly compilers, I got quite close, because I tried to target it to different architectures where I had a different set of primitives to use. So that was kind of fun, obviously.
But the practical thing that came out was GeneWise, which took a protein sequence, say from mouse or from rat, and could map it to a genome of a different species, of human. It could deal with the fact that you didn't know where the splice pattern was and you didn't know where errors were. And that algorithm ended up being the most robust algorithm for predicting genes in the human genome, which was a big topic of conversation at the time. It was extortionately computationally expensive, but very robust, and did a better job than nearly anything, and it's still running today.
At the end of my PhD, the worm genome was chugging along very, very well at Sanger; it was very clear it was going to work. Next door to us were the people analyzing the worm genome, which they did in a computer-assisted but fundamentally by-hand way. And there was a project to do a similar thing on the human genome, matched to the rate of flow coming through the genome project. Remember that the human genome project was projected to finish in 2010 or something like that, with a pretty nice, slow, steady beat all the way across the genome.
So, sort of halfway through, Celera happened, and it was this kind of explosion of fear and excitement for the Sanger Institute. Excitement that the Sanger Institute, or the Sanger Centre at the time, was one of the most important parts of this, and the sort of validation that this was an incredibly important project. So everybody got quite excited that there was this slightly mad American raising money in the stock market and doing all of this stuff. But then there was this sort of awful realization that everything would have to speed up, and so one had to work through how one sped everything up. Some things were quite easy, and other things were very unobvious about how that was going to work out. There's a whole kind of saga here about moving away from a mapping-first approach to sequencing BACs as they come through, and stuff like that. I wasn't in the middle of it, but you knew that that debate was happening and how that was changing.
What was interesting was that, as the public project accelerated the speed of doing this, and made announcements that they would accelerate and match Celera's run rate, Celera was also saying: well, that's great, we'll use the public data, that's all wonderful; of course, the public people don't know how to analyze it.
So everybody's going to come to Celera anyway, because we've got the brains. And they had this very, very clever computational scientist called Gene Myers, who was the person who originally said that it was feasible to do the assembly. And he was a good friend of Richard Durbin's, actually. Just to go back, there's another thing: despite all the nastiness between the Celera project and the public project, especially at the computational end there was a lot more mutual respect for each other. I'll come back to that.
So anyway, it became clear that we had to have a solution to the analysis. We at Sanger had to have a solution to the analysis, and then three kinds of things came together around that. One was someone whom Richard had hired to take over the running of the human annotation. That was Tim Hubbard, and he had seen this and he had created an even more extremely Heath Robinson system than the Pfam thing, to try and keep track of all the bits of DNA that were being sequenced publicly and run them through some very, very basic analysis. At the same time he had hired Michele Clamp. Michele I had known: she had been a postdoc with Geoff Barton, whom I'd known from Oxford. So I'd known them because of my undergraduate time at Oxford, and I knew there was this sort of slightly physics woman who could code really well, and she had somehow done great things with Geoff. She and her then boyfriend, James Cuff, got hired to the EBI, and she got hired from the EBI to Sanger to help run some of this annotation stuff as well. And me and Michele and James really hit it off, actually, in terms of we all wanted to just make new things happen.
Me and Michele also thought that ACeDB really wasn't going to work for this. It just wasn't going to scale. And so we got this newfangled thing, MySQL, which nobody knew whether it was really robust enough and stable enough and up to the job. We brought that in-house at the same time as Sanger had brought Oracle in-house to run its major database thing, also realizing that ACeDB really wasn't the long-term solution. That kind of annoyed Richard, because, to be fair to Richard, lots of things in ACeDB worked well, but some things just worked absolutely awfully. There was this sort of massive write lock on the database; it was just not going to work. So we brought in MySQL, and Michele and I were effectively exploring how to make a database schema for genomics. I had written GeneWise and I was getting more and more confident that any prediction that came through GeneWise was going to be pretty good, almost certainly correct. Asterisk: except for pseudogene predictions. So there is an asterisk there. And then this need to respond to Celera drove this next phase of us really creating something, and this was just as my PhD was about to finish.
Another slightly complicated thing to this. Oh, God. So you see, this is why oral history is a good idea. I'd written this programming language called Dynamite, and I was thinking about targeting it to different machine architectures, not just standard C programming. And there was a company called Paracel, whose major customer was the NSA, for text mining and stuff like that, and they had identified DNA matching as their big opportunity. My code, PairWise and SearchWise and then GeneWise, was incredibly computationally intensive but worked directly off DNA, which most things didn't do, and was clearly a really sensible thing to do. So they really thought that was a match made in heaven, and they flew me over to California a number of times.
Oh, and at the same time I was also doing the same thing with an Israeli company called Compugen, which had also done alternative hardware. So I was going to Israel to see how my programming language could work with Compugen, and I was going to California to see how it worked with Paracel. And because of that, and it was the dot-com boom, I was then invited to be on Paracel's scientific advisory board. I was kind of a kid then, but the other person on that scientific advisory board was Gene Myers. So we were the two computational people, and then there were a couple of other people, and it was an independent company at the time. So I had a lot of good interactions with Gene on the scientific advisory board, discussing how one does dynamic programming, or less how one does the algorithm and more what it's useful for, and how to construct it, and how to do different things.
So as I ended my PhD, it was a bit unclear to me what I would do next, but I had lots of options. It was the dot-com boom era, and Celera had been founded with lots of money, and there were startups and all sorts of different things. So I took myself off and went around America and I visited all sorts of places. I visited Celera for a job.
I visited Paracel for a job. I visited academic labs. I went to all sorts of different places, and somehow my last one was Paracel, and I remember them giving me a job offer for more money than I ever thought was sensible to give to a scientist. And I sort of said to myself, I'm going to take this job, this is my opportunity. And then I said, no, look, I'd better go home, talk it through with my girlfriend, who's now my wife, a little bit, and also just go home and talk it over before I sign on the dotted line.
So after this sort of week and a half... This is still the height of the human genome stuff, and so it's quite surprising: there's this whole narrative about people throwing shit at each other publicly through press releases, and yet there were perfectly pleasant and collegial conversations between many of the scientists in the different parts of the project. So there's quite a funny contrast there.
I think there is a very particular Sanger and Wellcome Trust view as well. I personally think there are already alternative histories being written about some of this, because I think it is true that John at some point got the backing that, if necessary, the Wellcome Trust would finance the Sanger to do all of it. It's true, and for John I think that was an incredibly important thing: that he could then come to all of these conversations and say, it does not matter what you guys decide, we will do x, y and z.
Yeah. And how important that is, is quite an interesting question. But it's part of the Wellcome Trust and Sanger mythology: if that commitment hadn't been made, and if John hadn't made those statements, there was potentially a different branching pattern for what happened next.
So anyway, going back. So the geeks, we got on, I think, well. I'd gone around all of these different places, and before I'd left, Richard said to me, no, we want to keep you at Sanger, and why don't you run the mouse annotation group, which was going to be two people and wasn't really the heat of the problem, which was the human annotation stuff. I kind of said, you know, I'm not sure I really want to do that. So when I came back, Richard was incredibly keen to talk to me almost immediately, and he sort of almost physically dragged me off to the EBI to talk to Graham Cameron. And Graham and Richard had, sort of in that time, cooked up the idea that EBI would offer me a position and back half of the annotation project. That was what triggered the Ensembl project being a joint project between EBI and Sanger, and the commitment of EMBL was really some money, but also making me a PI. So I became a PI. I actually became a PI before I got my PhD, which was a little bit late.
I got appointed to EMBL before I got my PhD. There was this letter from Fotis Kafatos that says, basically, if you don't submit your PhD in the next month then, you know, things are going to explode, awful things will happen. And at the time we were building up to what ended up being the big Tony Blair announcement thing, so this was the January before that announcement. So I was working like an idiot with Michele and other people, trying to get everything to work, and then in the evening I would be trying to finish off my PhD. It was just excruciating. But anyway, that all worked out.
So we set up Ensembl, and then we also realized that it had to be funded, so we had a meeting with, yeah, exactly, we had a meeting with the Wellcome Trust, kind of in the back of the room. We did a number of proposals. Our first proposal had something like eight people, and John came back with, I remember, almost a one-line email that said: double it. And so we wrote it for 16, and John said double it again, so we wrote it for 25 or something like that, which was a huge grant. I mean, for most people it's, what are you doing? And it's now a team of about 45 people. I'm now totally used to the idea that with this kind of engineering work, quite a lot of it just requires a lot of personnel and muscle and stuff like that. And we had a meeting with the great and the good there, and again, it's one of those cases where the Wellcome Trust clearly had to take a risk.
I mean If it did go through normal peer review We would just been torn into lots of little pieces plus the fact that it has 25 people to it You know, no no panel would have would have swallowed that At the time especially with a someone who was a straight out as a graduate student I mean, it's ridiculous But I'd nobody really knew what to do and they they knew there wasn't enough computational biologists And they knew that I had a lot of kind of You know self, you know delivered By hook or by crook by persuading all sorts of different things And we there were for three or four PIs Richard Graham Myself and Tim Hubbard and it's to my regret that Michelle Clamp wasn't a named PR. I think that's right Because I think it's quite easy I see ensemble being founded by Tim Michelle and myself and The fact that Michelle was sort of under Tim was a was always a bit of a Something slightly wrong about that setup anyway so ensemble started up and that gave us then a very big You know effort we and we and we really did deliver that make that happen over time So if you look back at it again, it looks clunky But a lot of the for example the fundamental You know the fundamental data model that we had was Especially of the genes transcripts proteins and it's all obvious stuff, but it's actually the the No, it's it's stood the test of time the same Concepts hang around basically And gene-wise for example was right in the middle of the gene prediction But Michelle's code as well about how you call gene wise how you make how where you do it? How you call it where you do it how you tidy up afterwards? 
which is a key part of the whole process, is still, I think, almost the same now as it was way back then. I can't remember which Cold Spring Harbor this was. So this was 2000, I think, yes, it was 2000, and one session was all about gene prediction, and I was there explaining Ensembl and GeneWise and stuff like that. Jean Weissenbach and Hugues Roest Crollius from Genoscope were sequencing Takifugu or Tetraodon, I can't remember which one it was. Very smart. And they had used those reads to estimate the number of genes in the human genome, and they had come out with a shockingly low number, and it was already kind of rumored, before we had the actual session in Cold Spring Harbor, that this was gonna happen. And if you remember, I did spend some time in this investment bank. If you spend time with traders, you discover that they bet on anything and everything; anything you can bet on, they will bet on. And that, you know, teaches you a lot about running markets. So I actually ran a book once, so being the bookie, on go-kart racing, with just people. And it's actually very, very interesting when you're the bookie rather than the person betting, because you have to offer odds which are long enough to attract bets and short enough, you know. As soon as you run a book you have this phrase, being over-round or under-round. You always want to be over-round, over-round meaning that you win no matter who wins the race. Yeah, no matter the outcome, exactly. And that's basically where traders have to be. You want to be in an over-round situation, not in the situation
where you're asking for results to go your way. So I've had that experience. So I realized that it would be fun to do this, and there was a time when I wondered whether I should kind of offer odds, but I realized that wouldn't work at all. It was pretty clear that you can't offer odds on the number of genes, so I set it up as a sweepstake. And I didn't realize this, but Francis Collins had the same idea. I was a bit pissed off. Suggested this, I just sort of waved this book in the air, and of course it was at Cold Spring Harbor, so I had to go to Adrian Krainer's lab and say, could I just have a lab notebook, and I'd pasted, you know, 'GeneSweep', whatever, on that. So I had great fun that night. So I had to write down the rules of the bet and then take on bets, and I decided, this was all decided sort of on the hoof, and this was the only slightly different thing I did, that you could buy one number for one dollar in the first year. So it was going to be decided in 2003, and then it was going to be five dollars a bet in the second year, and then twenty dollars a bet in the third year, and the fourth year we would decide. And so that's all written in there. So the idea being that the information was much better as you got later in this. But it was great, you know. I met lots of people. Lots of people remember me because I was there.
I was this young Brit. I used to swear a lot as well, I had a much more sort of dirty mouth at the time. Young Brit, precocious, doing this, and then I came around with this book and persuaded effectively everybody in the meeting to put a dollar in and put a number in. And of course the really amusing thing about it is that everybody was wrong. I mean, everybody was absolutely horribly wrong; everybody overestimated the number. And it shows you that crowds with data probably will make a good estimate; crowds with no data absolutely don't make a good estimate. We were all doing it from, you know, sort of crappy back-of-the-envelope stuff like that. And don't forget that Incyte was around as an EST company, and they were selling access to a hundred thousand human genes. So it was considered to be quite radical to put down fifty thousand. You were really saying that Incyte was lying, basically, to put down that kind of number. And for Hugues to get up and say there are twenty-six thousand protein-coding genes in the human genome, I mean, you know, people thought he was a mad Frenchman. People really genuinely thought he had lost it at some level, for there to be that low a number. So the interesting thing is Hugues put down his estimate, which was, I can't remember what it was, like twenty-six thousand two hundred, and a couple of people, realizing the sweepstake's rules, sensibly went below. But only two people went below, which is, when you think about it, very poor strategizing. You know, there's probably 500 votes, yeah, and it's only two that go below that. You don't go down.
Yeah. So when we came to 2003, by the rules of the bet we should settle it. And actually the number still wasn't known at this time. In fact you wouldn't actually have a number now; you'd have a range, which would be above 19,500 and below 20,500, yeah, with a lot of kind of definitional stuff going on. So we decided in 2003, or the suggestion was, that the pot was split three ways between the last two people and Hugues. And Lee Rowen, from Seattle, she actually had the lowest bet, so she kind of won. But I really give Hugues the most amount of credit. I mean, nobody was down there except for the fact that he was down there. If Hugues hadn't gone down there, nobody would have written a bet below 30,000, I think. It's kind of interesting, for sure. Yeah, so that's the story of the GeneSweep. I also met lots and lots of people at Cold Spring Harbor there. The draft human genome was the first time we'd done anything of that scale in terms of genomics, and it was the first time we'd done this big consortia sort of method. And it showed that it was the first time: the analysis showed, the paper structure showed, whatever. By the time we got to the mouse genome it was better, far better. By the time you got from the mouse genome to the chicken genome, even better. You know, we really honed it by the end. So it was done in an ad hoc way, for sure. An absolutely key individual here, and again, I really hope you get his oral history as well, is Jim Kent. Because, you know, David Haussler and I, and I think Jim, on one of the Cold Spring Harbors, and I don't know if it was 2000 or 1999, there's a whole business of putting together the human genome and even how one thought about an assembly. And so we had this thing called a golden path, which was a concept from Phil Green in his assembler, which is a golden path of reads, and we sort of took that up into an assembly concept, and a very early version of
Ensembl allowed different golden paths to exist simultaneously, so different alternative assemblies. So our very first design, Michele's and my very first design, had this. And, you know, people talk about this now for graph genomes, the idea of having flexible, more than one assembly with the same backbone, stuff like that, and we've never really done it here. And it was quite funny that we wrote the system at the start to do that, and in fact we ended up throwing away that part and just dealing with a single linear reference. Slightly sad that we did that, but still. So Jim was absolutely key in doing this, and in some ways it became clear that Jim was making this wonderful browser. We were, as academics are, competing and collaborating in some sense at the same time, and he had thought through a way of making the assembly. And so Michele and I focused on making genes, and then, you know, putting the paper together. The person who really made a drumbeat for the paper was Eric Lander, for sure. There were many people sort of around him, so John Sulston and Bob Waterston were sort of with him, but Eric was the person who wanted to put his fingers in everything during the analysis, and I think the name 'hardcore analysis group' was from him. And I think that's both because of his enthusiasm and his desire to control, so both of those things are wrapped up together. So that's how that kind of process happened. And there were always these phone calls to coordinate the production of the genome, and then there came another run of phone calls around the analysis. Michele and I and Tim would go to the G5 calls occasionally, but those were always very focused on production. Yeah, and it became more and more obvious as things developed that we needed a separate thing that was more analysis-focused, and that's how the hardcore analysis
group came about. Now, there was a meeting, I remember going to Boston when, you know, it wasn't the Broad, it was still the Whitehead, in that funny semi-factory-like building, and it was all snowy. And I sort of remember that as one of the first physical get-togethers of all these analysis-style people. But it was a real raggle-taggle bunch of people. And when you read the analysis, it's not very good. I think we've got to be honest about it. It was the first time we did it, but it's not really very good. In our defense, the Celera analysis wasn't very good either, and the Celera analysts were also using GeneWise, for example, in the middle of their pipeline. And there's this whole hoo-ha, more or less, for this Tony Blair, Craig Venter, Bill Clinton announcement about the number of human genes. Yeah. And it was all sort of wrapped up in the same thing as this betting book thing. And that was a great example where I was tempted to send a little message to Mark Yandell at Celera and say, okay, what number have you got? Why don't we just agree together, so that we're not totally out? Because there was this huge fear. Michele and I were making estimates in the 20,000s, what we called 20,000 confident human genes, and we were basically being told that that number looked too low. I remember Michele's first estimate was up at about 24,000, whatever, and it was felt to be too low when we presented on one of these G5 calls: it's too low, Celera is gonna, you know, this is going to be awful, we're gonna look bad that we can only find 24,000 genes when Celera can find 35,000 genes or whatever. Yeah, and Incyte has a hundred thousand genes.
Yeah, that did color me for some of the later things, for example within ENCODE, where you've got to stick by the data analysis. You've got to have a good talking-to with yourself before you open the box and look at it. You've got to say to yourself, okay, which things of my analysis am I confident about, which things am I going to question, if the result doesn't look like how I'd like it to look? I wish I'd stuck to my guns, Michele and I had stuck to our guns, a little bit harder. We were closer to the money on our first analysis, far closer than the initial draft paper, and it has an awful phrase, something like, we can find confident evidence of up to 26,000 genes and, you know, exploratory evidence up to 35,000. I mean, you know. But that's kind of interesting, because we were putting together a paper in the hardcore analysis group, and there's just another awful bit in that paper, by the way, which is evidence for horizontal gene transmission that got shoved in at the end. Again, you know, we got much better at doing this process of checking things, of not having these sort of last-minute, oh my god, I found the most amazing thing ever, things coming in. But anyway, it came out, and then Francis, on one of these phone calls, said we need a poster, and I want to have a poster with every gene in the genome on the poster. Now, it's quite hard, you know, you can't do this by hand.
Yep, you can't do it by hand. And it's actually quite hard to do good layout; it's hard to automate good layout of these things. So it's huge. And so eventually I said, well, I can program, I can write PostScript, and so I wrote a system to write these posters. And that's why I took a photo, because I love it when I see it, because I don't think anybody appreciates the level of detail. It's one of the more complicated, it's not the most complicated algorithm, but I had to use my Dynamite programming language to write a very specific model to get the layout to look aesthetically correct. Yeah, and so afterwards I can go and point out all the little things that the algorithm has to do to get this lovely sort of shape. So yeah, there I am, you know, writing some PostScript layout program. And I can remember doing this vividly because the production timelines of this had to be earlier than some of the other ones, so that the foldouts would happen and stuff like that. And so I worked with an artist here at NHGRI, Darryl Leja. And so in the middle of having all these other deadlines, I had another deadline, which was producing posters that people liked the look of. So I'm very proud of those posters at the end of the day. I certainly had the feeling that the draft wasn't anywhere near the end of the human genome.
It was, you know, I knew how messy it was, and I knew that we couldn't leave it like that, basically. So that's one thing. At the same time there was a whole thing about the mouse genome going on. It's very interesting there, because of course there was this whole business that whole-genome shotgun wouldn't work, and then suddenly the public project, Eric Lander, kind of threw himself in and took a complete 180 and said, oh no, I'm sure it's going to be totally fine for mouse. And, you know, it shows that Eric is smart. It also shows that he will set his sail, he will change his mind on a sixpence, when necessary. Okay, that's all an aside. So there's a kind of theme of genomes, and the mouse paper, the mouse assembly was better, the mouse gene set was better, the analysis was better. We knew what we were doing better across the board there. And then by the time we got to chicken, for example, there were problems with the chicken genome because of these microchromosomes, but how one thought about doing the analysis was now becoming really a much more structured process in many ways. So there was that theme. But it was also very clear that there were going to be two threads afterwards. I mean, NHGRI said this, but also it was clear inside of Sanger as well. So one of them was the human variation map, that ends up in 1000 Genomes, and the other one is basically beyond protein-coding genes, right? So what is beyond protein-coding genes? And I was interested in both, but because of Ensembl being so focused on annotation,
I really felt that the thing I should throw myself into was the stuff beyond protein-coding genes, and so that of course ends up being the ENCODE project. Though you have the bit before the ENCODE project, which is: what should we do, and how big should it be, and all of that. So we had those meetings. And I think something that people sort of forget about that time is that we really didn't have good technologies for studying the human genome at scale. So this was in the period of arrays, microarrays and then tiling arrays. They were awful. Some people are nostalgic for them, but I am not nostalgic for them. I mean, that was the only way you could do things, and in the first ENCODE that would be the only thing you could do, but they were absolutely awful: batch effects all over the place, completely impossible to compare between platforms. So for the first ENCODE it was clear that this technology, tiling array technology, was not going to scale across the entire genome. It was clear that we didn't know what we should even be measuring; lots of people had different ideas. So I think that the right decision was not to try and do genome-wide things, but instead to constrain to 1% of the human genome, which was this 1% project, for everybody to do. But we also were bringing in a completely different community of people, so all of these sort of transcriptional cell biologists, chromatin people, this sort of stuff, and so there was also a sort of big social thing going on about this. And I think it's right. I mean, it's interesting: are these very big international consortia the right ways of exploring and doing these things? You know, there's post hoc justification, because this is the way things happened. It's far less clear-cut what are the right structures for these things.
However, that one should do it was very clear-cut. I mean, that scientifically one should go and work out what was going on in the rest of the genome was very important. So, ENCODE 1. I can't quite remember quite how. So Ian Dunham from the Sanger had one of the projects that were funded by NHGRI in ENCODE 1. We were involved partly because of Ensembl, and maybe for the genes as well, through Sanger or something like that. Anyway, I turned up at all the meetings; it's probably the most important thing to mention here. And in ENCODE 1 it was very unclear how we should talk about the results, and so there was a decision that there were going to be five different papers. This was an awful decision. But, you know, it was cat herding, and not with cats, with lions. They really didn't want to get on. Everybody had their own view. Everybody was jostling for position. So you had five different sort of things and therefore five different analysis groups. And I think I did one, that's right, around comparative genomics with Elliott Margulies, who was here at NHGRI intramural. Anyway, it was just a big mess, basically. I'm buzzing. Let me just have a little check. Okay, cool. So it's a big old mess for these five things, and probably the most important thing is that we sent them in to review at the same time. So two things about this. Somehow, I think because of my experience with both the human and the mouse genome papers, Francis appointed me basically as chair of the analysis group, and I ran these phone calls. I was used to this now. And I obviously had the computational smarts, but there were a lot of people who really didn't, you know, and I didn't know everything. I didn't know a lot about chromatin. I didn't know a lot about quite a few things there. I mean, flip side, they didn't necessarily know about computational techniques and things like that.
So it was very interesting, kind of just trying to manage all of these people as we tried to put together these five papers. I mean, there was an awful moment where there was a forced marriage of two groups to lead one of the papers, John Stamatoyannopoulos and Anindya Dutta for example, and I had to effectively chair a four-hour phone call where we went through the ordering of the authors for that paper. We went down to sort of, I can't remember, like number 10 in the list before we went into alphabetical or whatever, and then the last thing. And I hadn't myself been exposed to that level of distrust between scientists on writing papers and things like that. It was quite novel for me to see all of this. So we had those five papers. The science was, you know, all over the shop. It was very clear to everybody: every time you did statistical analysis, you had this massive lab effect. So experiments clustered by lab more than they clustered by, you know, any other feature. This is actually very, very common on lots of microarrays; lots of microarrays have the same aspect. So it wasn't like we were awful. Well, we were pretty bad, but it was a feature, an aspect of the platform that people didn't really like talking about, including in gene expression. So Affy arrays and Illumina arrays, if you took them and clustered them, you would cluster by array before you clustered by anything else. And also the experimental design was awful, in the sense that people would be given freedom to do experiments on the cell lines they wanted to do them on, that they could make work, rather than having any sense of a common cell line. So we could also not really do comparison.
We didn't have anything that looked like a good matrix of comparisons. So anyway, the reviewers unsurprisingly read this and said it was crap. They came back with, effectively, this is awful, what is going on? And in particular, I think the thing that particularly annoyed the reviewers at that time was that, you know, paper two says something and paper four says something, and when you read these paragraphs they are in conflict with each other. And it's okay to have two papers from two different groups saying that, but not two papers that are coming off the same data sets, supposedly coordinated. You go work it out. So then we had an awful kind of meeting where I chaired the analysis group, and I think I had talked to Francis, or Francis and Eric together, and I said, well look, obviously we've got to have only one analysis, like the genome papers, and obviously that's the only way through it. And we've got all these problems. We've got to chuck out a whole bunch of these things. We're not going to talk about everything that everybody could talk about, because some of it's just not solid enough for us to be able to talk about the things that we had hoped we'd talk about, when you can't. So I kind of agreed that with people here. Maybe that was more with Eric than with Francis. And then I went into a huge meeting with everybody there, and basically I talked them to death, so that the only answer left on the table was one paper. So then we spent another year or something putting together this one paper. It was like the march from Moscow. It was really not a nice experience at all. But we did, and the paper's, you know, I see it as very similar to the draft human genome paper. So it's not a good paper. It is the first paper that tries to do a number of those things at that scale. Perhaps a bit unlike the draft human genome, actually, the take-home message is:
We don't have the right technologies. But interestingly enough, after all of this hiatus, what actually had happened was that Solexa had come online, and was in this period being bought up by Illumina, I guess, and it was very clear that there was a completely new option on the table that could scale incredibly well, which is, rather than using tiling arrays, use sequencing as your readout. And I remember then writing to people. There was a question about what ENCODE 2 should be, and whether it should be whole-genome or not. And again, I think if you just took the results of ENCODE 1 at face value, you'd say, you must be kidding, this is not going to work out, we've got bad enough batch effects at 1%, it's just going to be impossible. But enough of this new technology had come out to persuade people it was doable. And so lots of people wrote the grants for ENCODE 2 very much with a, well, plan A is tiling arrays, but as soon as we can go to plan B we'll go to plan B. And in fact by the time the grants were awarded, everybody had already gone to plan B, which was do it with Illumina Solexa readout. And that was much better. And the other good thing between ENCODE 1 and ENCODE 2 is, I think, because of the experience of pulling together the paper,
I mean, I had saved the project. Well, not I as an individual, but pulling this together had prevented the project looking like a complete disaster, to it looking like, you know, a success of some sort. So I had got the trust a lot more, I wonder when all of these histories come out, but I got the trust a lot more of NHGRI project staff, so Elise Feingold and Peter Good. And I think they had a lot of people pushing lots of different opinions on them about how to structure things and how to do things, and so I became a stronger voice in that. And one of the things which I was kind of happy and proud about was not only justifying ENCODE 2, but saying we must have common cell lines. We must have a core set of cell lines that you must do these experiments on. And those were called tier zero, tier one and stuff like that. There were six of them we chose, and that meant that ENCODE 2 was much, much better designed. So the fundamental experimental design of ENCODE 2 was better. We then also had learned about the QC process better, and so we were at least trying to address QC upfront, though in fact I don't think that really came to fruition until ENCODE 3, realistically. But at least we were much, much better. We were in a completely different space from the first. So ENCODE 2 was good, actually. And although there was, you know, still people rubbing, also all the social stuff that had gone wrong in ENCODE 1: effectively some people had just got so pissed off they didn't bid for ENCODE 2.
So that's one way of solving this problem. But also people had sort of calmed down a little bit and knew where people were coming from, just understood each other a little bit better. So although there was still jostling about the PIs, and about who was doing what, who was in charge and stuff like that, whose assay was the most important or whose viewpoint was the most important, it was a much more collegiate process. The consortium acted like a consortium far better. We had a very functional, I think, analysis group, but that was saved for sure by two individuals: Ian Dunham and Anshul Kundaje. So Ian Dunham: Sanger made an awful decision, an absolutely crazy decision, where they cut about 20% of the institute out, and they sort of pretended that it was strategic rather than quality. Because of that, there's lots of people who weren't, you know, tremendously good who were cut out of Sanger when 20% got lost. Two really, really excellent people were cut. One was Ian Dunham, the other one's Stephan Beck. I do not understand why somebody didn't say, for God's sake, guys, these guys are the guys that, you know, it's crazy. And it meant that Sanger didn't have the depth of functional studies on the genome for a long period, and they had to rebuild it. Anyway, so Ian had got effectively a redundancy package from Sanger, a very nice redundancy package. He could go wherever he wanted, and he was being interviewed for institute directors; people were trying to court him to do all sorts of different things. But Ian, and he's a great guy, two things about it. He didn't really want to move his family, so he wanted to be in the Cambridge area. So that was quite a driver. The second thing is he doesn't actually want to be the person having to be the salesman and justifying the money and all of that.
He just wants to do the science far more. So he came to me and he said, and he was an experimentalist, I want to learn more bioinformatics, could I use my redundancy money to work in your lab? And I was like, you know, Ian had been the supervisor of my journal clubs when I was a PhD student at Sanger. I mean, he's 15 years my senior, he is extremely clever, wise. It was very odd to think that. So I said, sure, great, come along, feel free, come play with ENCODE, you know, learn Perl, it's fine for you to be there. And then when ENCODE 2 came around, we actually argued for deliberate funding of this analysis group, and I said to Ian, why don't you do this? Why don't you be the person who makes all of this happen with me? We'll go and do it together. We've worked enough together. And he said yes, thank the Lord. So he and I, and it was far more him, organized the analysis meetings. Though on the phone, you know, I'm the talker, chit-chat, all of that kind of thing, so I'm definitely the frontman for the Ian and Ewan show, as it were. And of course our names are very close; lots of people can't even keep our names straight. So they would call me Ian, and it's all sort of funny at some level. So we were doing quite well at doing this organization, not perhaps as well as the 1000 Genomes Project, but we did have more complex data. But then we really wanted to run everything through a standard pipeline with good QC and everything else, and we just couldn't persuade people to do it in one way between the different groups. And there was this great graduate student, Serafim's graduate student, that somehow got attached. He was at Stanford, and I don't think Mike Snyder had moved to Stanford at that point. Called Anshul. Yeah, that's right. It was him.
Look, he runs DNAnexus now. You can really visualize him. He is German, I'll have to find his name. Anyway, Anshul was somehow between Arend and Serafim, and Anshul basically put his hand up and said, well, I'll run this, I can do this, I'll run all these things. And so Anshul created a standardized pipeline for the major ChIP-seq data sets. And coupled with that, we had this legend of non-parametric statistics called Peter Bickel. I'd met Peter in Oxford, no, not Oxford, Cambridge, when he was coming over. So he is a COPSS prize-winning statistician. You know, statisticians don't quite bow to him, but he's a major force in non-parametric statistics. And he's a charming guy, absolutely lovely. And when I first met him in Cambridge, it was around ENCODE 1, and he wanted to talk to me about this and that, and he'd done something around genome heterogeneity, and he was very tolerant of me asking stupid questions. I mean, I was like, I don't understand what these things mean, how does this thing work? And so we brought in the Bickel group as part of this analysis group, and that was really great. So the Bickel group developed the statistics, Anshul made a pipeline, Ian corralled all the experimentalists with their metadata and stuff, and I was the ringmaster on this analysis phone call to make this work out. So yeah, ENCODE 2 worked far better from my perspective. And by this time I'd done a lot of genome papers, stuff like that, so I'd also really wanted this ENCODE 2 paper not to be the mess that the ENCODE 1 was. And so we structured it better. We built up to it better. We then went into this slightly surreal business, because it was very clear, we were presenting at meetings, you could see the data, that we would be making a publication.
And so we sort of had both Nature and Science come to us, and so we had a whole kind of, well, what can you do for us, sort of thing, and basically Nature pulled out all the stops. And so we had, you know, epic amounts of print space. We had the main paper. We had other papers. We got them to do dynamics, all sorts of things, which I think was really good. So it's with some regret that this whole 80% function thing kind of blew up. It was always a question inside of the consortium. So the definition was to find the functional elements in the human genome, and so you end up sort of circling around the definition of function. Yeah, and you can have a long old debate about this. There's stuff that we measure, you can measure things on the genome sort of doing, or being done to it, depending on your perspective, and pragmatically that was most of what ENCODE was: it was measurements of biochemical things on the genome. And then the question is whether you call that biochemical activity and then reserve the word function for something else, or you use the word function for this. And this debate went on and on and on inside of the consortium. There were a group of people who were much happier with the words biochemical activity, some of which has function, and a group of people who were very keen that that was too controlling a view of the word function. So you need to talk to Tom Gingeras and John Stamatoyannopoulos. Theirs is the 'when we observe biochemical things happening on the genome, we can't just dismiss them because we don't understand them' perspective. And there's this other perspective, which is: lots of things can happen, and they can happen thermodynamically and stuff like that, so you can't just say that because things happen, they're important. So this debate went on and on and on, and as we closed in on the paper, we actually had to use a phrase. And my regret, so we did a phone call about this,
I got everybody to discuss it. My regret was not to have a formal vote, because there's one of these slightly alternative histories where I solely came up with the phrase biochemical function, rather than it being a consortium thing. I mean, maybe we did record those phone calls. I wonder if we did. It'd be much, much better to have... So the reproducibility of biochemical events gets mapped into Peter Bickel's statistics, runs through Anshul's pipeline, and you then discover that an awful lot of the genome fulfills this criterion. And when I started presenting this, I'd say, "Don't be surprised." Transcription, from this perspective, is absolutely there as one of these reproducible events, and not only transcription as in making RNA, but the passage of the polymerase across the DNA and the histone modifications that get deposited as that happens. And there is something weird about saying H3K36 isn't functional. There's something very odd about taking the very hardcore extreme, which is that it's got to be under evolutionary selection. Because there's a final group of people whose view is that it's got to be under evolutionary selection. So there's this complete sort of transition, from these people, "if it's not under selection, it's not useful," to "if I can measure it, I don't want it off the table." And then there was a middle. This middle group is sort of the Barbara...
This is the Chris Ponting... Obviously Dan Graur is, like, to the left of this group over here. Then there's this group, which is Barbara Wold and a bunch of people, who accept that there's probably more than what we can detect by pure selection methods, but want to have a very high bar for the use of the word function. And then you get to Tom Gingeras and John Stamatoyannopoulos, who are like, "The mistake of molecular biology over the last 30 years has been to only talk about the things that we understand now." They would always cite microRNAs and these things, and how, if we hadn't been so closed-minded, lots of things would have been discovered earlier. But yeah, the regret here, somewhat, is that in an early draft of the paper that I wrote there's more ambiguity about the levels in the abstract and in the main text. By the time we get to the end, we get less ambiguity, but with a kind of chain of definitions. And we do have a section which goes through all of this and says, "Well, if you define it like this, it's this percentage; if you define it like this, it's this percentage; if you define it like this, it's this percentage." We sort of march up the percentages, but you end up with 80%. So, out of all the things that I arranged for, or tried to make happen for, ENCODE 2, where we got the QC a lot better, the experimental design was a lot better, we did this virtual machine of all our results and stuff like that, I find it quite frustrating that this fixation on this phrase and this number is the biggest thing. And actually, I don't think it's the biggest thing scientifically. If you look at how it's being used, it just gets used routinely, fine. But if you look at the kind of next layer, why is it remembered?
Unfortunately, for better or for worse, this story is a big part of it, and that kind of pisses me off. If I had my time again, I would have done my chess moves in a slightly different way. Maybe I would have insisted on a different phrase, which might have been biochemical activity. At the very least I would have got the whole consortium to vote very explicitly, so at the very least it would be very clear that we're all in it together. Because there was a period, when Dan Graur kind of exploded, where a lot of people said, "Well, it's nothing to do with me. Ewan wrote the paper." And I've got to say, we argued about this phrase for a long time. You can't walk away from it like that. But I'm most comfortable with using the phrase biochemical activity for the broader thing; "a substantial amount of which we believe has cellular function," or something like that, would probably be my optimal phraseology. And I should find the early drafts of the paper, where one tries to transmit more ambiguity about this percentage number. But I don't have sympathy with this hardcore end of people. These people I think are nuts. So the group of people who think that, I don't know, everything that comes from a transposon is not relevant, I mean, they're a bit loopy, basically. I'm sure I'm caricaturing their position a little bit. They wouldn't say that, but I think they have this very, very strong view that it's all about evolution. And as soon as you think about other things, cancer would be a great example. There's substantial deregulation of all sorts of different things, and of course you want to know whether Myc binds here or not, even if it's not under selection, if the Myc binding gives rise to cancerous cells. You really want to know it. Now, do I call it function or not? I don't know.
Do I want to know where it is, understand it, measure it, characterize it? For sure, if we just leave aside the function. So, yeah. Have I learned? Yeah. And I found it personally very, very difficult when Dan Graur and some people really laid into me on the internet, and some people still do. I thought I had thick skin, but I didn't. Now I have thick skin, because now I look at some things that are written about me and I'm like, "Well, clearly you don't know me. You've kind of got this voodoo doll person, which is Ewan Birney, and you enjoy sticking pins into him. Fine. I'm not even going to try to debate, to persuade you. You're so locked into a mindset where I'm the Antichrist that I'm just not going to attempt to untangle it for you." And it has given me insight: when people attack other scientists passionately, or politicians, most of the time it's very unlikely that people really hold the extreme positions that they're said to hold by other people. Especially if people are clever.
I mean, they're intelligent. I think this process of creating these sort of caricatures of people is much more about how people want to frame the debate than really about understanding what is going on. But as you can see, I have a lot of regret, because I feel like I did lots of things right in it. It's my best thing, my best-run consortium, and it sort of has this really bad end-of-life taste, right at the end, because of this. And the interesting thing about Dan Graur is that Des Higgins, who used to be at the EBI, was... well, not a friend, but early on Dan was a fellow geneticist, tree builder, all of this sort of thing, which is why he's into this. And Des knows him well enough to have beers whenever they're at a conference together. So it was quite interesting talking to Des about Dan. Dan clearly holds hard positions, and I kind of understand the passion that scientists bring to intellectual purity of all sorts of different things. And so there's a part of me that thinks, you know what, if I just have a beer with Dan Graur, we'll leave friends. If he can have beers with Des Higgins, there's a kind of transitive beer-having process that says that should work out. There's this part of me that's tempted, and then I just think, oh come on, life is too short, and he has made such a thing with me on the other side that I just don't think he could come to like me. So I've kind of said to myself, "Ewan, you've just got to forget about it. I can't make everybody like me." Which is, if I have one of my failings, it's that I kind of want everybody to get along and everybody to be happy and stuff like that. And it is regrettable. But yes, I've said this about Twitter and blogs and stuff like that.
Constructive online criticism is hard, and it degenerates very, very quickly into extreme positions, and I've actually tried to make sure that I am not someone who feeds that. But it's quite interesting. It's the same problem as with email, which is, never write an angry email until you've at least slept on it. But Twitter, you know, makes it possible to go from anger to tweet very quickly. And the other thing, which perhaps my antennae were tuned in a slightly different way for, was that of course over in the U.S. the junk DNA thing is wrapped up in the creationism debate, intelligent design, all of that. And I actually think that that has made some of the people who are the kind of adamant "there's lots of junk DNA" people close their eyes to data. It's slightly weird. I've had some of these interchanges. And of course, from my perspective, creationism is class-A bonkers. It's in the non-science zone. Yes, we've got to work out how to educate these people, but it's a completely different class of discussion from this discussion, which is, "I've seen this, I've done this, how has it happened that these things have emerged through evolution?" But some people see the fact that creationists use these statements as an indication that they've got to attack... And I was actually reassured. There were two key moments, because people were kind of laying in on Twitter, blogs, and Facebook as well. So there's a whole Facebook kind of thing, and there it's a web-of-friends kind of process. And so somebody was really going at it with Mike Eisen and stuff like that. Mike was pretty... Mike was annoyed that the consortium had so much Nature publicity.
That was Mike's main beef. But other people were annoyed about this other thing. And so I joined in that Facebook thing and said, just to make sure, "This is the part of the paper," or whatever, and I said something about how this was being quite personally difficult for me, to hear so many people who don't know me really vilify me. And I give Mike quite a lot of credit that, both in that Facebook thing and on Twitter, he said, "Look, I don't like ENCODE for this, I don't like it for that, I don't like it for the other, but Ewan Birney is neither an idiot nor should he get kind of whacked around like this." And that was nice of him. Well, it's not nice of him. That shows that's the right attitude to have. And I've got a lot of time for Mike. Not everybody has time for Mike. I'm sure Francis Collins doesn't have time for Mike. But I think fundamentally he's got a strong