So I'm Christopher Donohue, an NHGRI historian, and I'm here with Dr. Jane Rogers. Dr. Rogers is the former head of sequencing at the Wellcome Trust Sanger Institute and is also a part of the International Wheat Genome Sequencing Consortium. So to begin, tell me a little bit about when and where you were born and your early life and education.

Okay, so I was born in 1954 in Southampton in the UK, and I was brought up from the age of two in a small market town in Wiltshire, so in the southwest of England. And I had what I suppose would be a traditional state school education, which fortunately for me had not been messed about with too much by the time I went through it. So it was a fairly standard grammar school, academically oriented education. And they turned me out at the other end with, you know, exam qualifications to go off to university. And I went off to Southampton University at the age of 18 to study biochemistry with physiology.

Was there any teacher or mentor early in your life who made you really interested in science, biology, genetics? Typically, when I interview scientists, there's a decisive early influence that allows them to study science in a way that is usual for the profession.

So I suppose an early influence on me came from my father. He was a pharmacist and his training included chemistry and biology. And we used to talk about that, and he used to help me with homework. And we also used to enjoy walking up on the chalk downlands in Wiltshire and talking about the specific flora and fauna that you find. And he would intersperse this with information about, you know, the medicinal compounds that had come from various plants that we saw. So that's one influence. The second influence, I suppose, was at school, with a chemistry master in particular, who encouraged me. One particular anecdote: it's a classic experiment that you do and you get prepared for, and it can be explosive. And in my case, it was explosive.
I wasn't listening properly to where I should heat the piece of cotton wool that contained the water to make the steam to interact with the magnesium. And it whooshed out of the test tube at the other end. And at that point he walked over, he could see I was mortified, actually, and shook my hand and said, "Every true scientist has at least one explosion." I thought that was just wonderful. But he was a great influence and very, very encouraging, and encouraged a group of us, actually, to go and study beyond the normal A level syllabus in chemistry. And that was helpful.

And what was your university education like? What did you study? And for your doctoral work, who was your supervisor?

So I went to Southampton University to study biochemistry with physiology. I was interested in biology and chemistry. Biochemistry seemed to combine the two. And I was also interested in how things work. So the physiology would cover that part. And I think one of the things I liked about it was not only the environment of the university; it was a very new department at the time. But it looked as though there were multiple options. You know, you could either focus later on down the biochemistry route or physiology. And they even had a brand new medical department, if that was an area that I wanted to go into later. And in my third year, we had an influx of new lecturers into the department. So that brought a lot of new blood, a lot of new research ideas. And I decided that I would stay at Southampton for my PhD. And I did a project that was co-supervised by one of the new lecturers and one of the older lecturers. And I spent three years looking at the behavior of fluorescent sterols in membranes of different types.

So how did you first meet John Sulston and sort of get into the orbit of John Sulston? And do you have one or two anecdotes about meeting him, and him as a person in those early years?

I think John was, you know, really quite established by the time I met him.
I met John in 1992. By that time, I'd worked in Cambridge for about 10 years as a postdoc researcher and, not really wanting to continue down the lecturer route, et cetera, I'd gone off to the Medical Research Council to become a scientific administrator. I had a young son at the time. I was commuting to London and the commute was hard work in those days. And I asked if there were any positions available in Cambridge. And I happened to ask at the time that John was asking for administrative help to set up the proposal, to put the proposal in to the Wellcome Trust for a large sequencing facility. So we discovered that we lived in the same village. And he invited me to come around and talk to himself and Alan Coulson, his colleague working on the worm, one evening on my way home from work. And I've regarded it ever since as a sort of bizarre interview. So there I was, sitting in the rocking chair in his front room, being asked all sorts of questions by himself and Alan about how you manage budgets. How would you set up what is now called a human resources department? And could I turn an office block into a laboratory facility? And after two glasses of sherry on an empty stomach, I could do anything, really. And he obviously thought that he could work with me. So he arranged with the MRC for me to be seconded to Cambridge to work on developing the proposal to the Wellcome Trust and that submission. And then once the Wellcome Trust said yes, he asked if I'd like to work with him on the administrative side and to get the lab up and going.

Do you happen to remember any of your answers from that interview?

I can remember, with the lab block, talking about theoretically how I would go about it. I mean, subsequently, I realized that they had an office block in mind. There was a new office complex that had been built in Cambridge quite near the airport. And we later went out and saw it. But basically it had been set up as an office block.
So there were no wet resources. So that's one of the things we had to think about. And then we had to think about computing, power supply, whether it had enough power supply. And John was always particularly concerned with whether there would be any backup power supply. We always had to have a generator. He got very anxious about that. And when we were at the Sanger, I made sure that he knew how to start the generator if it was ever necessary. So certainly the lab block answers I can vaguely remember. It was about how to set all that up. We seemed to get along. And afterwards, I did a lot of recruitment with John, so a lot of interviewing with John. And he told me later that he generally decided whether he could work with somebody or not after about five minutes. And that was absolutely apparent in his interviews too. Because if he found somebody was not interesting, then I had to take over the interview and make sure that the candidate felt that we had explored everything properly. That was my introduction to John and to Alan.

So in 92, had you really heard anything about this so-called Human Genome Project? And what did you think? What did the people around you think?

So no, I hadn't heard directly about it. Although within the MRC head office there was a lot of talk about more of the funding having to be allocated to the Molecular Sciences Division. That was the area that funded the MRC Laboratory of Molecular Biology, and their funding was ring-fenced. So John was already working on the map of the worm at that point. So whole genomes were in their sights. But no, I think there was just a general concern about how molecular work was going to go. For the MRC, the real worry was about where the money would come from, because the budgets were certainly very tight at that time.
And that proved the case later, because the MRC struggled to find the money to support the worm project, let alone do very much with the human.

So you said that budgets were pretty limited during that time. This is a related point: was there any time at which the budgets were ample or more than you expected, or was it always a kind of scarcity scenario?

Well, I mean, we put in an application to the Wellcome Trust for the Sanger for a budget of £60 million for five years, which, you know, was an enormous amount of money. And they said yes. And we realised that we were promising a lot for that. And the costs at the time certainly, you know, wouldn't allow us to achieve it. So right from the outset, we had to work on how to make sequencing as cost effective as possible. And, you know, how to maximise the use of the sequencing machines. I mean, we always looked at the budgets. The Wellcome Trust were generous. I think I can't say any more than that. We were very lucky. We never had to think about washing pipette tips, which certainly, you know, in the UK, some labs were having to do not very long before that. So that was never something that we had to worry about that much. But we had to be careful and, as I said, we had to make the money go as far as we could.

So as a related question, in sum, what was John Sulston's contribution to the worm mapping and sequencing project? I mean, could you summarise his contribution, or try to, in two or three points?

Yeah, John's contribution to the worm was, so he and Alan developed the cosmid map of the worm, and John led the sequencing. And John established the first sequencing group, which was in the LMB, and that set the model for all of the other sequencing teams, certainly initially within the Sanger.
He did the subcloning himself for the sequencing project. So he was very hands-on and he enjoyed working in the lab. And he worked through the sequencing methods. Initially on the map, he had persuaded Richard Durbin to develop a sort of visualisation tool for the bands, the digest bands, so that they could put together the mapped cosmids and take advantage of the automatic reading of the scans. But then, for the sequencing, they worked with Roger Staden on, I mean, essentially burgling the Applied Biosystems software and being able to, you know, get at the code. And from that, Roger went on to develop the BAP and GAP visualisation databases, which sequencers all used for assessing the sequence reads and the sequence assemblies. And later on in the worm project, he led the first team. We set up two teams with postdocs that were working in that original team. And then we had two further teams that John essentially supervised. So he was very much aware of what the problems were. He did problem solving. He looked at how to overcome some of these problems. So the use of the short insert libraries, which in the US were called shatter libraries: John worked that up with one of the finishers and used it for the really gnarly bits of the worm cosmids. And then he finished the final gaps. He sort of plugged away, you know, isolating the DNA and finishing the final gaps on the worm. So I think that's his contribution to the worm project.

So he did the sequence finishing, basically. Wow. It's a very laborious process.

It is. And by the time he got the last gaps filled, he had actually stepped down as director at the Sanger. And he was meticulous about negotiating bench space and access to the resources that he needed to actually get that finished. I mean, it was trivial by comparison with the huge budgets that we were spending then, but he was absolutely meticulous about negotiating to finish the worm off.
So, as a related question, how important was it for John and the Sanger that there would be a complete map and a complete sequence versus, say, just various regions of interest or just finding genes of interest? I mean, how important was it that it basically be whole genomes? And, you know, what is the significance of that?

I think the significance is that you have a good idea about what you might find and what you might do with the, you know, regions of so-called interest. But what you don't know anything about is what's in the other bits. But if you have a complete sequence, you can go back to that in the future. You have the data there, you can explore it. And, you know, there are all sorts of things that we don't know about. We still don't know about genome regulation. And quite a few of those are in some of the trickier bits to sequence and get at. And by having that whole genome, you have a template that you can work from and start to look at some of these unknown features, shall we say.

So we touched on this a little bit, but how precisely do you set up the sequencing center, just in general terms, and keep it running? I mean, how do you recruit postdocs, beta test technology, make sure that the groups work well together, issues like that?

So, we set out with this notion of replicating what John had built at the LMB. So in fact, the whole of the original building at the Sanger was based on the projection that we would have 17 sequencing teams, a nice, precise number, plus one subcloning lab. So there were 18 labs that were allocated to sequencing. And there was a structure within the team. So there was a team leader. And then there were people doing sequence preparation and sequence finishing. And originally, there was a subcloner in each of those teams as well.
So that was the sort of model that we started out with. We started out from John's single team, duplicating that and having two nematode teams led by two postdocs from the original team. They recruited staff to work with them. But at the same time as recruiting technicians, we also recruited people who were going to become team leaders of further groups. So postdocs who would spend some time working in the initial teams, learn how everything worked, and then be able to take it out and set up their own groups. We were quite successful in doing that initially. So then we had the two nematode teams that started. And we also had Bart Barrell's group from the LMB. And they were working on yeast. So Carol Churcher headed that up. And we recruited, similarly, team leaders who worked with Carol initially to learn how to do it. And then we recruited into the other positions within the teams. And for postdocs, I think we must have gone to Nature and Science for those initial adverts, and it would be New Scientist as well. And then after that, when we were advertising for technicians, we advertised locally. I mean, Cambridge has a, you know, vast scientific population. And we were very lucky in that respect. So, you know, big university, lots of scientific departments. And a lot of the time, we were also looking for people who sort of had an interest in science, but weren't scientists themselves, but were technically inclined. And the ads always had in them, you know, that we looked for manual dexterity. And often an interest in computing. So, you know, computer hackers always found favour. And we used to give, certainly the technicians, a pipetting test as part of the interviews, so that we could check two things, actually. You know, get people to pipette just water into a microtiter plate.
You could see whether their eyesight was any good, which was quite important. And you could also then see, you know, what the manual dexterity was. And I think that's where most of our recruits came from. Once we got going a little bit, and people had heard of us, then we would have applications from postdocs asking, you know, if there were any posts going. But if it was more on the technical side, then the ads would all be in the local news. Then how did we get going from there? So building the groups is step one. And initially, we had all of the sequencing work going on in the groups. But when you're working in a lab, and the LMB is, you know, very well set up, there were central resources that needed to be replicated. Because we set up the Sanger initially in what had been an old lab building on a country house estate site outside Cambridge. I mean, Hinxton is where the site is now. But we originally had to do refurbishment of an old, it was a sort of electronics lab, I think, that was there originally. So, what was I saying, yes, the central services that we had to set up: certain things that you don't really think about, that you take for granted in a lab, the washing up, the pouring of agar plates for all the, you know, bacterial culture growth and so on. And the pouring of the gel plates for the sequencers, because we started off with polyacrylamide gels poured, you know, for the initial sequencers. And that, we found, was better done as a central activity rather than the individual teams doing their own. And the other central activities, I suppose, some of the lab focused activities: we decided to set up a separate group for subcloning fairly early on as well.
Initially, the teams were going to have their own subcloning, but it worked much better with the subcloning team being very specialised. You know, postdocs were doing their own development work as we went along, as well as overseeing the routine production of subclones for the sequencing projects. So the sequencing was initially within the teams. As time went on, we built the capacity and we looked at how to automate it, with the use of pipetting robots initially. We had a colony picker that was developed in-house; in fact, it was able to pick M13 plaques as well as pick colonies. And it always struck me as ironic, I can't remember the name of the company who produced the colony pickers that were used in just about every other large sequencing centre except ours, because we actually had our own, and this company is a British company. But setting up sequencing reactions tended to be manual for quite a long time, actually, because of the problem of not wanting to waste the sequencing mix. That was so valuable, you know, putting that in trays and troughs to be allocated out. We needed at least sort of level two and level three robots working before we got to that. So, we built up in the teams, and we needed more sequencers. We initially added sequencers into the sequencing teams, but then it grew beyond that and it made sense to look at centralising various activities. And this, I think, was probably in parallel with other groups doing, you know, similar. So, moving the sequencers to a central facility and having dedicated sequencer loaders, which went with the sequencing plate production initially. And then, and this was the real scale up for the Human Genome Project, we separated the production staff from the finishing staff. So, we have production teams and we have finishing teams, because they do different things. How do we keep that all going and keep people talking?
Well, meetings, lots and lots of meetings, lots and lots of communication. Did we always get it right? Certainly not. People, you know, used to get upset and feel that they were not being taken any notice of and were being overlooked. And to be honest, it's a delicate balance, especially for postdocs, between working in an environment which, I mean, was labelled a factory environment, and thinking about their career development, as well as, you know, doing a job that was producing a product alongside others and couldn't be distinguished from it, from the point of view of their own publications and career development. So, it was a balancing act at times. But we had regular meetings, and we had a development group, which was initially quite small. John and I used to run that between us. And they would take on either projects that the teams wanted some assistance with, or other sort of side projects. When I think about things, we had a group doing CpG island sequencing fairly near the beginning of this enterprise. I mean, what a terrible thing to try and sequence. But Adrian Bird in Edinburgh asked us if we could do this and said we ought to be able to do these things. So, we had a group looking at that. It was never terribly successful because, I mean, some of those are ferocious and, you know, so tricky to sequence. And the chemistry just really wasn't right for doing that at the time. Terminator chemistry was coming along, but certainly was not hugely advanced. And the primer sequencing chemistry didn't seem to get through them very well at all. So, yeah. Does that explain it? Have I explained it enough for you?

No, perfectly. That's great. One thing I wanted to follow up on is that you've mentioned before that when you got a sequencing platform, oftentimes it didn't work quite correctly.
So, could you just describe your beta testing process, if that's the way to describe it: how you essentially adapt a sequencing machine to your group or lab so that it actually produces sequence?

Well, I suppose the earliest automated sequencers came out in 1990. So, there was a certain amount, you know, of testing and working out how to use them that had gone on for two or three years before we got going. So, we weren't at the most basic stage. But yes, I suppose to step back from that, one of the first things we had to do was to set up a tracking system that would provide a sort of centralized output of how many sequence reads we had and, you know, what the overall pass rate was for them. Because, I mean, it's like anything, there are always mitigating circumstances. If your sequencer doesn't generate sequence, it could be due not only to the sequencer but also to what you've put on it. And that, you know, ranges from what the individual groups were generating to the gels and lots and lots of other things. So, we had to have this tracking system so that we could see where the faults arose. But one of the faults that just sticks in my mind: ABI changed something on, it must have been on the 373 sequencers. It was part of the mechanism that shuttled the lamp backwards and forwards across the front of the gel. And they changed, I think it was a metal component, for something that was definitely cheaper and a bit plasticky. The lead screw, that was it, the lead screw that was needed for moving this thing backwards and forwards. And we had lead screw after lead screw breaking. So we had machines breaking down in the middle of runs.
And the reps in the UK, you know, they were talking to ABI in California about this and saying, you know, this isn't very good, and the Sanger are very cross about it. So they took me out to California and they also took David Bentley. And I put up a presentation on, you know, what we were finding. And I forget how many, but in the course of about three months, we'd had something like 50 lead screws broken. It was a tremendous number. And, you know, at ABI the room was quite packed and mouths more or less opened at the sheer volume. And at that point, I think they realized, A, we were a serious player. There was something funny going on on that island on the other side of the world. But B, also, you know, there was something that needed looking at and fixing. And, yeah, I heard a bit later that someone there was apparently heard to say, "Who is that woman?" So, yes. And then my other big beef was that you paid the price of a Rolls-Royce for these machines, so you jolly well expected them to run like a Rolls-Royce. And they had lots and lots of faults. In fact, we actually employed our own engineer for, well, it must have been for, yeah, two years, who was the first port of call in terms of coming out, looking at the machines, and deciding whether he had to call in people from our local ABI service area or whether he could actually fix the machines. But, yes, it was fun and games getting those going. And this was something that John got very excited about because, you know, he was very well aware that the sequencers were the most expensive piece of machinery, and therefore they had to be used to, you know, maximum capacity. And when we kept having these problems, it was an issue.
And that's a really illuminating series of stories, and it's something that I've always wondered about. I always ask people around the same question: how do you test out these sequencers once they get into your labs, and what happens? And there's kind of a unanimity: oh, we had to change lots of things, they didn't work very well, or we had to employ our own engineer, for example. It's fairly common. One other thing to ask you is, when did the Sanger start human sequencing specifically, what year was it? And was there a year in which there was a kind of turning point, in which all of a sudden there was a significant amount of sequence being generated, much more than the year before, where you basically said, yeah, we're really making a ton of progress, where progress before had been a bit more incremental or something like that?

So the first human projects that we took on were actually in 1993. I mean, we had some cosmids from the Huntington's region on chromosome 4. And we had some X chromosome and actually some chromosome 22 cosmids. Both the X and the 22 came from David Bentley's group. And they were really sort of pilot projects to look at whether cosmids were suitable sequencing templates. We used the same techniques on those as we were using on the worm. And it was clear they were not as easy to assemble. And in fact, it wasn't until some further work had been done, you know, when Phred and Phrap were introduced, Phrap in particular, to assemble the clones, that it began to look a bit more viable. So those early projects, I mean, they started off at the beginning, and we didn't do a huge amount. And the other thing that they did illustrate to us, though, was that cosmids were not a good template for the sequencing. The overlaps between the clones were too large.
And that reflected the fact that mapping of cosmids was not ideal for the human genome either. So David took that on board and started work with large insert clones. And we worked with the PAC libraries from Pieter de Jong.

So was there a point where things started to go more smoothly?

You'd hope so, wouldn't you? I think when we got through, you know, the initial teething problems with the sequencers, probably 95, 96, things started to take an upward turn. By the end of 96, we'd finished the yeast project. So that felt like a significant milestone. The worm was, you know, trundling on nicely; that seemed to be under control. And we were starting to make some inroads with the larger insert human clones. Yeah, I think I would say 95, 96. So that really coincides with the timing of the first Bermuda meeting as well.

And you answered the question I also should have asked you, which is: what were the challenges of, say, human sequencing versus another model? And was there anything specific to human that you needed to modify or change?

The repeat sequences were a problem. And Phrap actually did a good job at the assemblies. We were using the Staden assembler before that. And Phrap was a better thing to turn to. What else happened at that time? I think it was around that time that the focus moved from using the primer chemistry, fluorescent primers, to the fluorescent terminators. And that was a better sequencing chemistry. Yeah, I think those two things.

You've mentioned David Bentley several times. I mean, just to go into who he is and what he did at Sanger and how he came to Sanger would be really, really interesting, because I've interviewed him as well.

And David, when John was looking at the initial project, so John's interest was in the worm. But Aaron Klug, who was the head of the LMB, realized that if the UK were going to keep John here and working on genomes, and especially the worm genome, then funding had to be found.
I mentioned earlier that the MRC would not have had the money to support this. So it was Aaron that went to talk to Bridget Ogilvie, who was then head of the Wellcome Trust. And the Wellcome's view was that they wouldn't support the worm project, but they were very interested in translation of the technology to the Human Genome Project and in the UK making a substantial contribution to the human genome sequence. Because they seemed to have taken it on board already that mapping and sequencing would open up a lot of new avenues and that the human genome sequence would need to be done. And that, of course, followed on from all of the meetings in the United States. So when John thought about which projects were going to be part of the work at the Sanger, his interests were, he brought the worm, brought the yeast, and he thought of David for the human genetics element, because David and Ian Dunham had been working at Guy's, and they had talked to John about the software especially that he was using for the mapping at the time, and the ACeDB software for the analysis. So that's how John knew of David, and he knew that David was not just another human gene hunter, that he was interested in the genome side of things. So he invited David to be part of the proposal. So David's component, the mapping side of the human genome element, was included in the actual proposal with the sequencing, which really came out of John's side of things because it was the translation from the other organisms. And David joined, I think he must have come in around September 93. We got the sequencing labs open in spring of 93. And the building that we had was sort of E-shaped, and the E had a long arm in the middle, and that was the human genetics area. And that took a little longer to set up for David.
So I think it must have been in autumn 93 when he came, and he brought his group from Guy's.

So one other follow-up question is: between 92 and 93, the American leadership of the Human Genome Project changes pretty radically, with Jim Watson and Michael Gottesman and then Francis Collins coming on. How did that affect you at Sanger, and what was your perception of that transition period? Because Sanger was founded while Michael Gottesman was acting director, for example.

I think all of that was sort of above my pay grade at the time. I think the significant part that I was aware of was probably Jim going, because Jim had been instrumental in setting up the funding for the worm map and then persuading John and Bob to go forward with the sequencing. And I think it was Jim who had organized the initial funding for the first megabase, which was shared between the two labs. So there was US money going into the LMB for that. So I think there was a little bit of apprehension, because they were aware that Jim was very supportive and they didn't know what the follow-up would be.

And do you have any specific thoughts about, once Francis starts guiding the US effort, what his specific contributions to the HGP were, in terms of managerial style, or his public guidance of the program, or how he worked with the sequencing centers, and so on and so forth?

So that came quite a lot later. I don't think I was really aware of Francis until probably around the time of the first Bermuda meeting. And that was probably because David had the contact. He knew Francis especially from the Cold Spring Harbor type contact and his previous sort of human genetics existence. So it was really about the time of the Bermuda meeting that I first became conscious of Francis and of what he was doing. But for a while, it must have been for another year or so beyond that, there didn't seem to be any really concerted US effort.
The work, obviously there had been a mapping program and the initial sequencing started, I mean, as it did with us for the human sequencing, centers were picking up bits and pieces that people had mapped previously. And a lot of it in cosmids and, you know, not in the larger clones. So all of the larger clone work was starting to be developed. So it wasn't really I suppose until 1998 when the Bermuda meeting of that year made it so clear that the US was still floundering a little bit. And we didn't see what direction they were going to go in. And we decided that we needed to go back to the Wellcome Trust and up the ante on our side, as it were, to try and prompt some action from our international partners. At that point, Francis started to loom more and was obviously talking with the genome centers about what could be done. But then I think, you know, the sudden rise of Celera, it must have knocked him for six. And in some ways, I wonder whether, you know, he was thinking, oh, well, that's all right, you know, it's going to be done. And we don't have to pay for it. Because it's an expensive project. How do we organize it? Bob had been working with us, well, I mean, sort of in parallel with us, and had had funding to do very similar things to the things that we were doing. But other centers weren't as far advanced at all. So we went to the Wellcome Trust because we were so worried about it. And, you know, on the day that we made our presentation to the Wellcome Trust to increase our share of the human genome from a sixth, which is what we set out to do, to a third, that was the day that Craig was making his presentation to the groups at Cold Spring Harbor and saying, well, you know, I'm doing all this, you can go and do the mouse. So, you know, very much in parallel, but the discussions with Francis, you know, really got moving after that. And then Francis was working as the coordinator with the different groups.
And pulling a coherent project together. There were some ups and downs in that, as we mentioned before, in terms of what the strategy was actually going to be, but that there should be a public project really came into focus at that point, and that the Americans, you know, needed to really come together. Now, my perception, you know, may not reflect the truth, but I think it was just, you know, what I was recalling at the time. No, it's really interesting because in the period that you're describing from say roughly Bermuda 96 to roughly Celera, you have the sequencing pilots, and then you have the quality assessment exercises. And then you have basically the sequencing centers finally producing some sequence. But as Bob told me last week, it was something along what he had proposed, but at a much lower level than he had proposed. So that's a very interesting view, that's actually quite remarkable. So what was your perception, for example, of the quality assessment exercises, where you had basically everyone's sequence essentially quality assessed by another group. And I think the first round was in 97. Was that useful to your group? And were there any surprises? It was useful. It's always useful to have, you know, feedback. Yes, what were the surprises? Yeah, some of our clones didn't come up as well as they should have done, I thought. But, you know, we were doing things differently. And that certainly accounted for some of them. But, you know, it's a good wake up call.
And I suppose, you know, one of the things that we always reckoned to do was to smooth things out in the finishing stage. But again, having some, you know, intermediate feedback on how we were doing on that. No, it was a useful thing to take part in. Painful. Very laborious, I seem to remember. And I can't quite remember whether the first exercise involved us re-sequencing and re-assembling. I think it must have done, actually, to do that comparison. So yeah, it took quite a bit of effort to do it. But yes, no, it was useful. It was useful. And this is before I ask you about the library development, which is not a topic that gets much attention, although it really should. So were you at the 96 Bermuda meeting? Yes. Yes. And so just in two or three points, what really was the significance of that meeting? And in particular, not simply the data release, but the accuracy and contiguity standards. I mean, what was your assessment of the significance of those features as well? Well, accuracy. This is the base pair accuracy. You know, to be absolutely honest, I can't remember much discussion of that at the time, but maybe that's just because it didn't come up in the bullet points in the end. I mean, we had previously been working with Bob's group and, I think, we had the agreement that the sequence should be double-stranded, that you had to have a strand going in each direction really to be able to call the assembly accurately. The Phred quality had to be Q20 or above, which meant that, you know, a single base had a probability of, sorry, a base had no more than a one in a hundred probability of being an error. So you were saying that that was a high quality base call.
And when you put those together, then you came up with the Phrap score, but it really fell out of those scores that we were using all the time. So did it make a significant impact on me? No. Because we were essentially already doing it. Did we need a wake-up call to make sure that our sequence really was that accurate? Yes. Yes, we did. And that came with the exercises. The contiguity, again, the data release was based on what had been going on with the worm, and the assembled sequences were released onto the FTP server when they'd been assembled in contigs of more than 1 kb, because that way you got rid of, you know, sort of sequencing vector and cloning vector and managed to screen that out. So it should be the organism that you are trying to sequence, not necessarily from where you thought it was from, but certainly, you know, what you were trying to sequence. And apparently, I really had to check on this, but at the time of the first Bermuda meeting, the databases didn't take unfinished sequence. So the assembled data was going out onto individual websites and could be searched there. And it wasn't until, I think it might have been the second Bermuda meeting, when there was a discussion of having the phase one, the phase two, and the phase three quality. And it was agreed that the sequence would be put out with those labels on it. Yeah. Was that because GenBank had had specific standards that were onerous to meet before they could deposit it? That's why some of the sequence was going on individual websites? Or is that a different matter? I think it's just one of those things that nobody really talked to GenBank about putting the, you know, unfinished sequence on it. But it was undoubtedly useful for the worm. And you could say that, you know, the way that the worm sequence was released, it was picked up, it was used.
It was a useful thing for the community. And when the first Bermuda discussion took place, it seemed an appropriate time to talk about, you know, immediate release of the sequence data, and to release the unfinished data, which people, you know, sort of acknowledged. As Bob said previously, you know, how much they really took in what this was going to mean, I'm not sure. But yeah, I think it was the utility. And the fact that, you know, the human genome project was going to be a long project. It was going to be very expensive. And a lot of people weren't going to be part of the project. So making the data available so that it could be used as soon as possible, that was very much, you know, one of John's major, I think, contributions to the human genome project. Well, a better way of asking a question about the Bermuda meeting is what took up most of the discussion? I think the first part of the discussion was very much who was doing what. And that was a sort of introduction, as it were, you know, because the initial invitee list was quite large. And it was people who were taking up sequencing on the back of previous mapping projects. And some people had, you know, 10 cosmids, and that was a significant sequencing project. But it was to be inclusive initially. And then we'd look at, you know, who had funding and who was serious about scaling up. And I think probably, you know, when we and WashU in particular started talking about what we were doing, it became clear that, you know, this was a serious effort where, you know, you can walk around handpicked areas, but it was not going to be a project about handpicked areas. And then it came, you know, to sort of the discussions about, well, you know, patenting. Patenting took up a lot of time. Should the sequence be patented?
Should it not be patented? Should it be patented to protect it from being, you know, sort of patented as it were? So there should be sort of an open wrap around it, you know, sort of people's views on that. And then I can remember a lot of time being taken up with, you know, talking with the different international groups about whether sequence release was going to be viable. Because one of the initial, you know, Bermuda principles that was put up on the board was that finished sequence should be released as soon as possible. The databases took finished sequence. And that was being held back in the projects, you know, people's careers, people's interests, having to, you know, progress and have a publication. And mine it for as much as they can mine it for before it goes anywhere. So, yeah. So that's my recollection of, you know, what was being talked about. And then subsequent to that there were subsequent meetings, the idea of the whole genome, you know, should it be a whole genome sequence? And there was a group of people who thought that, you know, whole genome sequencing was feasible. And that was the way to go rather than this mapping and, you know, sequencing clone stuff. That, you know, went to the Celera side eventually. But, you know, initially, there was a, you know, very, very healthy debate about whether this would be a viable proposition. Because the other big thing, I think this was occupying Francis's mind at the time, was whether the time anywhere around there was right for scaling up. Because he had Bob on the one side saying, you know, start now, you can change the technology, incorporate more new technology as you go along, but we should get going.
And then you would have other people, and I would imagine Eric would be one of these, saying, no, no, no, no, the technology is not there at the moment. You know, it's very, very manual, you know, you can't go with this, there have to be more improvements before it's ready to go. Yeah. Did chromosome allocation come up at any point? Yes, it did. Yes, it did. And certainly at the Sanger, we had a pre-discussion about, you know, sort of what we were proposing to sequence. And I suppose we were getting in there first. So, I mean, chromosome 22 was, you know, that was on our books, that came with Ian Dunham and the map, and a great encouragement from David Weatherall, one of the Wellcome Trust governors. Yeah, it was a small chromosome, make your mark, show what you can do with it and get there first. Now, he was a really strong advocate of that. The X chromosome was an interest of David's, lots of people were interested in the X chromosome though, and really a consortium of groups interested in the X was put together to work on that, once essentially it was said that Sanger would sequence it. Chromosome 6 was an early one of ours, and that came with Stephan Beck, who was interested in the MHC, another wonderful region to be starting off with. And so, yes, those were our first ones. And, you know, those were associated with people in the UK who had been working on previous projects. Beyond that, David came up with a list. 20 was the next one that was included in the first group to get us to one-sixth of the genome. And then when we put in for the next sixth, really we were looking at which chromosomes hadn't been taken by other groups. And I can remember in that first meeting, the DOE had interests on 5, 16 and 19. They'd had cosmid mapping projects on all of those chromosomes, and they stuck with those chromosomes.
The Japanese were on chromosome 21, as well as collaborating with Ian on 22. And Bruce Roe was also interested in parts of 22. And those are the ones that stand out. And I know that David Cox, I think, in Stanford was involved in some sequencing on chromosome 4. But eventually that chromosome went to WashU. So it really, you know, the division went along previous interests for the most part, certainly to start with. France put in a bid for a chromosome, and I think they got funding for chromosome 14. I'm not sure whether there was anybody in France who was interested in that, or whether that was an untaken chromosome at the time. I'm not absolutely sure about that. So that's my recollection of how it worked. But you know, from the Sanger's point of view, we felt that, you know, we put ourselves in the driving seat, because that was the strategy that we were following. And we had the funding to move ahead with it. So a related question is, as these sequencing groups and mapping groups were sort of diminishing in number, and the Human Genome Project was concentrating, did you think there was any kind of loss in innovation? Or was it more of a process of the methods just worked better with groups that knew better what they were doing, something of that sort? I think we reached a stage where we had to, yes, you needed to work better with groups who were familiar and had put in a lot of work to refine the process. Every time you have trouble with innovation, it's a wonderful thing. But then things shoot off at a tangent. And you don't always get back to the path that you originally set out on. So there was quite a long period of, should we say, trying things out. Nobody ever stopped trying things to improve the process. But I think we consolidated on what we needed to do. And there were only limited ways in which to do that.
And then concentrating it in the big centres, it took it out of the small centres, which some people would be aggrieved at. But if you label it as a factory repetitive process, did they really want to be doing it? And did they want to be doing it, you know, to standards set by somebody else, et cetera, in the timescale set by somebody else? And I think a lot of groups didn't want to do that. They were quite happy to have it done as long as they, you know, could also have some say over what it was that was done. And lots of people used to gripe about the quality as you can imagine. And yeah, I mean, that's quite interesting because it's a very big debate among historians of science of what's the connection between sort of innovation and scientific success. And whether it's good to have lots of groups working on one problem or a constellation of problems, or whether it's better to have standardisation, centralisation of methods. And there's a big debate about innovation in the Human Genome Project. One follow-up question I had was once centres got really established and really standardised in what they were doing, did you have a good sense of, say, whether one centre was really good at production while another group was really good at finishing? Do you remember any groups that you thought were really good at, say, one part of the sequencing process versus another? Why did you think that was? Well, I think we were pretty well aware of what was going on in WashU because we had regular visits there and they regularly visited us. So I think once a year either a group of people would go from Sanger to WashU or vice versa. When the other centres became involved in the consortium, so the consortium project, then the visits widened. So there would be more people coming from the Whitehead or from Baylor and vice versa.
So people used to go, the people to keep an eye on were always WashU because they were very hot on development and they had good links with the companies producing the new reagents and the new gadgets and so on. The Broad were interesting once they got that production line up and running and they were most interesting because it seemed, when you talked to them, that it worked so well almost all of the time and we just didn't believe that. But it generated. It produced sequence and it was impressive the way that they'd set it up as the long flow pipeline that you see in the human genome. Baylor were quite consistent. They came through on the finishing and the innovation on the finishing side quite often. WashU, we talked so regularly with WashU and there was a lot of interchange of the finishing groups on both sides and the exchange of information about what we were doing and trying and so on. Yeah, I think that's my perception. JGI, they were trying new things but I didn't get the impression that their production facility was going that well. But they picked up once they moved on to the finishing which was with Rick Myers and Jeremy Schmutz. They did some nice things. What do people say about us? They'd say, oh, Sanger are the ones to watch. Oh, they didn't have much of a production setup. It did take us a long time to fully automate the production side I think. We were probably slow on that. One thing we haven't mentioned at all, which comes up also a little bit in the Bermuda meeting but becomes a significant issue in 97, 98, is the library production, Pieter de Jong's sort of facility. Prior to that library development was pretty haphazard, at least in the US. So what was sort of Pieter's contribution, how did his facility get so good at library development not only for human but for nearly everything else? Because he was interested. He was really interested in making comprehensive libraries.
Because the first libraries of his that we used were the P1 artificial chromosomes, the PACs, which had pretty good coverage I think. But they became unethical. But I think it was that he was really interested personally in getting the technology to work. It was really noticeable in the genome project that some people had green fingers and other people didn't. And Pieter did in that area. And where you have somebody like that, you want them to carry out that specialised work. So I think people were very happy with his libraries. And he was willing to talk about what he was doing and what he was trying. And it started off with what were the first ones, 120 kb. And then he started increasing the size, sort of pushing to see how far he could increase the size of the clone but also keep the coverage. What was dropping out. So I think that was it. I think he was probably also interested in how the different DNA behaved in different ways from the different organisms. We did the zebrafish, and you asked me some questions about differences in genomes. The zebrafish is a challenging genome, shall we say. It's very AT-rich. And it's very polymorphic. So a lot of polymorphism is in long stretches of A's and T's. And we started out initially with, because of the amount of DNA that was required, a polymorphic BAC library, which caused us all sorts of problems in terms of putting them together and also sequencing. But he was very happy to work with a group in Oregon on developing a BAC library from a single zebrafish. He was interested in getting down to working with those very small amounts of DNA. So I think he was just interested in doing it. So one of the things that we haven't talked about is the numbers of organisms you've overseen the sequencing of. And a good general question would be what makes a good sequencing target, what are the defining features of that organism.
And, you know, many, many organisms don't make it beyond, say, a draft sequence. So when you have a complete model, that means a significant amount of work has been put into it, obviously. So do you have any sense of why one model makes it to that stage and other models don't? So what makes a good, so I suppose a genome that's kind to sequencers has no extremes of base bias, no extreme GC content or AT content. Both of them cause problems. So not platypus. Not platypus. For me, you can take zebrafish out. I thought that was a horror genome. And, yeah, it's interesting that when we started off with yeast, the yeast groups actually moved over to pathogen sequencing at the Sanger. And the first two projects that they started out with, one was TB, high, high GC content in there. And the other one was malaria, which is at the other end of the extreme with AT content. So, I mean, very educational in terms of, you know, sort of having to manipulate the DNA to make sure that we got the right sequence coverage. But in terms of informed choice, I would think, you know, probably not the best to start out with. So yes, not any extreme bias. I think an organism that makes a lot of DNA would be a good thing. You're not struggling around for minute quantities. And then, what you were saying, the organisms that I've been involved with. The strategy was evolving all the time. And actually, which strategy you adopt does tend to follow trends. So the mapping followed by sequencing approach is a good way for getting the coverage, for knowing where you are, for providing resources that can be distributed to a community in the form of clones, if they want to do, you know, research with it. And for completing the sequence, because you know where you are. Whole genome shotgun is a way of getting coverage quickly.
If it goes together nicely, if it assembles well, you can get good coverage, you can get a very, very usable sequence for a lot of purposes. If you really want a resource covering the whole genome for a detailed research project, I think you've got to go deeper than that. You've got to be able to improve on it. And whole genome shotgun is very, well, was very difficult to finish on. Because you couldn't always target the regions that were missing. It was difficult enough, if you were trying to work in gaps between clones, you knew where the piece of DNA that you're interested in should go. But you're then going and trying to hook it out from the DNA in order to get a piece to sequence. But if you haven't got that much information, I think for a usable genome, you've got to have mapping information combined with sequencing information. And mapping has become untrendy. And I was berated on more than one occasion by Ewan Birney for being backward in the way that I was, you know, going about genomes. But if people asked me about doing a genome project, I tended to ask them first what they wanted to do with the data. If you want, you know, an overall analysis for the data, that's one project, you know, very often the whole genome shotgun would give you that. If you want to base, you know, a very detailed research project on it, you need to have, you know, improved sequence in at least the areas that you're interested in. And you have to have a way of doing that. So, yeah, we have some. Yeah, it's a fascinating debate because I had the same conversation with Harris Lewin about whether all good sequence and all good analysis needs good maps. And he was really adamant that with shotgun, you cannot get the analysis that you need for many questions of evolution and, say, for comparative questions.
And he was very leery of not doing sort of a more traditional approach of mapping and then high-density mapping and then sequencing, and he didn't think mapping was unfashionable at all. He was the opposite. Yeah, he had the opposite impression that, in fact, that was the only way you could interrogate the data. That was the only way to produce sequence of good quality and contiguity for the types of organisms that he was looking at and the types of questions that he wanted to answer. Yeah. Yeah, I mean, it really does depend on what you want to know. It may be, you know, I mean, mapping becoming untrendy plus the move to short sequence reads exacerbates the problem. But it may be that, you know, if the reads are now getting longer, you can get better assemblies from just sequence data, then, you know, it may prove, but I still think, I mean, you're still going to have a map of some sort, just to link, you know, the genetic side of things with the sequence, I think. Right. I was going to follow up and say with long read sequencing becoming the next big thing, does that abrogate the need for dense maps? It may do. It may do. Again, it depends on how good the sequence is. You know, if you've got, you know, 500 kb of rubbish, then, you know, or N, you know, this is not terribly helpful. So, yeah. And the wheat genome, I mean, the wheat genome, so I became involved in that project after I left the Sanger and I set up the sequencing centre in Norwich. And the centre is based on the John Innes site, and the group there, Mike Bevan's group, were already working as part of the wheat genome sequencing consortium.
And they had a strategy within the consortium that was already, you know, sort of worked out that they were going to use mapping and then sequencing to get at the genome. And the reason for that is, I mean, the genome of wheat is enormous. I mean, it's about 14, 15 gigabases. It's hexaploid. So you've got three diploid genomes in there, three non-identical diploid genomes that make up the whole. But you've got gene complements that are very similar on each of the sets of chromosomes. And over 85% repeat. So, you know, just a few challenges there. But actually, you asked about technology and the BAC libraries. So in the wheat, there is a lab in the Czech Republic run by Jaroslav Doležel. And what he does, he flow-sorts chromosomes. And he developed a method for flow-sorting the individual wheat chromosomes. And then making BAC libraries from them. So, you know, the ability to do that and those resources are in the plant arena, you know, similar to what Pieter de Jong was doing in the animal area. But yes, what Jaroslav was able to do with his flow-sorted chromosomes. So not all of the chromosomes can be quite cleanly prepared, but they're pretty good. And the first pilot project was run on chromosome 3B, which is the largest. And that's just a gigabase. And that was a project that was led by Catherine Feuillet in France, with Genoscope doing sequencing. And that sort of set the template for the genome. The other consortium members, and this really was an international project, committed to producing physical maps. Initially using fluorescent fingerprinting and more latterly, using pooled BACs with tag sequences to produce maps. And when the technology got to that stage, in Norwich we joined in with that. But finally, NRGene came sort of towards the end of the project, probably around 2016.
Or 2017. NRGene came up with a method for assembling these large, highly repetitive genomes from short Illumina reads. And they developed a genome sequence for the wheat. But what we were able to do was then add in all of the resources that had been generated previously. So the physical maps, BAC-end sequences, sequence tags from mapped BACs, chromosome-specific shotgun reads, shotgun data, so that we knew, you know, which chromosome, you know, sequences lined up on, put all that together. And I think the product was pretty good in the end, actually, for a genome of that type. Subsequently, they've added in some Bionano data, and some more shotgun sequence data and resolved some of the problems. But, you know, if after the first round, you're resolving with the additional Bionano mapping information, you know, 10% of problems, that's not bad. It's not bad at all. And I think it's actually proving to be quite, you know, a useful resource for research, which is what the intention was. So that's, and I was to say, when I was in Norwich, we started off the sequencing side of things and generated whole chromosome shotgun sequence data, which assembled into, you know, two to six kb contigs, really rewarding. But it was useful in terms of positioning and giving a first, I think, genome-wide glimpse of genes. So it was, you know, useful to a lot of people. And then I worked with the consortium afterwards, when I stepped out from Norwich, just helping to, you know, guide, I think, the process towards getting now, Gina. Yeah, an interesting project. So I actually think we've gone through almost all of my questions. One person we haven't mentioned at all, and I just wanted to discuss him briefly, is Michael Morgan. Yeah, so we forgot, we forgot Michael Morgan.
So I guess in just thinking about his role, his role at Sanger, his contribution, one or two anecdotes, something like that. Oh, yes. So, well, Michael was there as a sort of, I mean, both to encourage us and, in ways, as a sort of fixer, but also as a minder, I think, and to make sure that we didn't go completely off the rails. But also, I mean, he played such an important role once we were going at integrating what we were doing at the Sanger with, you know, the international project. And, you know, with John worked on keeping the project as an international project when there was, you know, a threat at one time that it wasn't going to be. Two threats, actually, but, you know, so Michael, yeah, Michael used to, what can I say? I mean, he used to come and, you know, talk with John about what we were doing. And, you know, of course, he came from the administrator side of the Wellcome Trust. So, you know, he was in, I mean, I had to go down and, you know, justify how he was spending money at various times. But also, he wanted the Sanger to be a success. He enjoyed it. And he encouraged us, and he encouraged us to break the rules. I've got two big memories of Michael. One is when we were talking about the pathogen sequencing. The Wellcome Trust used to get quite agitated at times that we didn't go over budget, you know, in any particular area. And we knew that to finish what we were doing on TB, we were going to go over budget and only a certain budget had been allocated. And I had gone down, I can't remember whether I was with John or whether I was, but I was probably with Bart actually. No, no, I was with John. And we talked about this with them saying that, you know, we cannot finish within the budget, you know, we've not been profligate, but we can't finish it.
And, you know, he basically told us, you know, the message was, this was very bad. This was a very bad thing to do. And then it was almost, you know, sort of as we walked out of the room, he said, but you better go on and do it. You can get chastised later, because it's very bad that you go over budget. But, you know, continue the project, continue the project. And then the celebrations, I think, you know, we always remember Michael for the celebrations. He enjoyed a party. And when the Sanger was a success. And, you know, when we made our contribution to the initial draft of the Human Genome Project, and then when we got it completed, you know, I just had the pictures of him on the steps of the house that the Wellcome Trust had renovated. And, you know, that was, you know, with a champagne bottle in his hand. Cheering everybody on. But he was hugely instrumental in making the international coordination work. Hugely, just knew how to do it, knew how to talk to people, you know, get people around the table. And John would, you know, talk to him about the, shall we say, the sharpest points that need to be got across, but often Michael did the planks.