Hello everyone. Thank you for joining this presentation. My name is Astrid Heikema and I work as a researcher in the field of molecular microbiology at the Erasmus Medical Center in Rotterdam, the Netherlands. This presentation is going to be about nanopore whole-genome bacterial sequencing.

So what is nanopore sequencing? In this way of sequencing, a strand of DNA is guided to a pore, a biological pore. When this strand of DNA passes through the pore, there is a change in the current. This change can be measured and translated into a DNA sequence.

So how does this work? You can see the pore here; it's actually a bacterial pore that has been genetically modified. You can pass a flow through this pore and then there will be a current. When a particle also goes through the pore, in this case an analyte, there will be a change in this current: it will be disturbed. And when the particle is bigger, there will be a bigger disturbance of the current. So this is how the sequencing works. When you pass your DNA strand through the pore, there will be differences in the current based on the T's, the A's, the G's and the C's that are passing through, which differ in size. These disturbances can be translated into a DNA sequence. And it's not only single bases: stretches of DNA are also recognized and translated into sequence.

In our lab, we use two different types of nanopore sequencers. We use the MinION, which is a very small, quite portable and easy-to-use sequencing machine. You can see it here; it's about the size of a stapler. You connect it to a computer and then run the software and the sequencing. We also use the GridION, which is a lot bigger. It can hold five flow cells, which you use for the sequencing, and you can run those five flow cells simultaneously.
This machine also contains the computer and the storage, and it's updated more regularly, so that's very convenient to use. I like to use the GridION because we have it available right now.

Like I said, I work in microbiology, actually medical microbiology, and we are interested in antimicrobial resistance. That's why we started to use nanopore technology. So what would we like to do with it? We would like to look at the presence and type of resistance, and this can translate directly to the patient. The level of resistance and the mechanisms of resistance will also help us to understand how resistance is evolving and the risk of spread, also in relation to prevention of spread in the hospital setting. The mechanisms of spread matter, again, for prevention, but also for knowledge.

To answer all this, we would like to know the location of the resistance genes. Are they present on the chromosome of the bacterium, or are they present on a plasmid? What are the surrounding genetic elements? Can resistance easily jump from one plasmid to another, for example, because of these genetic elements? We would also like to know the copy number of the resistance genes, or maybe of the plasmids, because it can say something about the level of resistance, for example the MIC.

Here's a picture of a bacterium, and it has two types of DNA in its genome. It has the chromosomal DNA, which is big and circular. And then it can have plasmids, which are smaller and also circular, and they can be present in single copy or in multiple copies. There can also be more than one type of plasmid present in the bacterium. The plasmids are particularly interesting because resistance is often found on plasmids, and this resistance also spreads more easily because plasmids can go from one bacterium to another, for example through conjugation.

So what are the benefits of nanopore sequencing?
With this method you can sequence really long reads: what you put in your sample is also what it will sequence. It can sequence up to 500,000 base pairs, but that's mostly seen in human or other eukaryotic DNA sequencing. In our case, we at least find reads of 100,000 base pairs in our samples. With these long reads it's easier to close complete genomes, and it's also easier to separate plasmids from chromosomal DNA. There's fast library preparation. There's no need for amplification, so this saves time. You can do this in real time, so you can follow your sequencing and, if you have a good run, get a resistance profile in about two hours. This will also help your patient. Furthermore, it's portable, and especially the MinION is a low-cost device, so it's available to a lot of labs.

And then there are many applications. You can do what we do, genome assembly, but you can also look at microbiota using 16S sequencing or metagenomics. You can do RNA sequencing. You can look at methylation, and also at proteins and metabolites. So this is a good platform to have in your lab and to be able to make use of.

A problem is that there is still a high error rate in the sequences. It's now about 8%. So if you want to go to SNP level, it's not the best method to use, although if you do assembly, your error rate will go down significantly. Another problem is that, although there is support from Oxford Nanopore Technologies and their software is available, you still need bioinformatic tools and support, especially for genome assembly, and that's not always available.

So here is a bit more of an example about closed genomes versus contigs. Here you have a genome, for example of E. coli. In this example it has the chromosome, which is about 5 to 6 million base pairs. And then here's also the plasmid.
But when you do normal Illumina sequencing, you don't get the circular pieces of DNA; you get contigs, pieces that can be smaller or larger. And, for example, if this is the plasmid, it's not very obvious where this piece of DNA goes, whether it's actually part of the chromosome or actually plasmid. So this makes it difficult. Here's another visualization of an assembly of Illumina data. You can see there are circular structures that may be a plasmid, but there are several options because there are repeats in this sequence. Here there's also the chromosome, and, again due to repeats, it's not very clear where all these pieces go, and you get these kinds of nice but unclear pictures.

So what we aimed for when we started to use nanopore in our hospital setting was complete assembly of the chromosome and the plasmids for our clinical isolates. We want, of course, to discriminate chromosomes from plasmids, and we also want fast and accessible data analysis. So again, this picture: this is what we had, and this is what we wanted. We wanted to have a circular chromosome and separate plasmids, and then of course information from these DNA sequences about, for example, resistance.

When you start to use nanopore sequencing, you do have to pay attention to the quality of the DNA. You want to have highly intact, high-molecular-weight DNA. Like I said before, what you put in the sequencer is also what you get out. So if you break the pieces of DNA, you will also get small pieces. So you want to be careful with preparation: you don't want to shake or vortex too much. And because the pores are biological molecules, you don't want to have any organic compounds in your sequencing run, because these will disturb and damage the pores. So no ethanol or phenol in your sample.
So what we are doing right now is this: we have used two types of DNA isolation kits, the DNeasy Blood & Tissue Kit and the Genomic-tip method. Both are from Qiagen. We use these methods for Pseudomonas species, E. coli and Klebsiella pneumoniae, which are all Gram-negative bacteria, but now we are also experimenting with Gram-positive bacteria, for example Clostridium difficile, and we also want to use Staphylococcus aureus.

This is what we use for library preparation. It's a kit that you can order from Oxford Nanopore Technologies: the rapid barcoding kit. So you can barcode your samples, and with this kit you can run up to 12 samples at once. How it works is that here you have your high-molecular-weight DNA, and the kit has a transposon that has adapters linked to it. The transposon will integrate into your DNA and leave some adapters. Then there's another step with sequencing adapters. This will help guide your DNA to the pore and will also add the barcoding to your sample. Once you have prepared your library, you can start with the nanopore sequencing.

And then you get to the data analysis. When I started nanopore sequencing, I think we were really fast in getting our DNA in good quantity and quality and doing the runs. But then we had the problem that there was no really good way to analyze the data yet, because the software that we were using for Illumina sequencing, like Geneious or CLC Bio, was not equipped to deal with the error rate that is seen in nanopore sequencing. There were bioinformatics tools available on GitHub, which is a place where bioinformaticians make their tools public. But those tools kind of look like this. It's not software with an interface; it's more of a script. And if you are not a bioinformatician, or you don't have any experience with these kinds of scripts, it's quite difficult to get into and to learn.
And this is a really easy example, but those scripts can be really, really long and complex. So for me, this wasn't really working. Fortunately, since last year Geneious and CLC Bio do have tools to analyze nanopore sequences, but these programs are still quite expensive and not available to everyone.

So when we started to get into the data analysis, we started to work with a group that was using Galaxy. This is a group of bioinformaticians in our institute, and they have made a toolkit available in Galaxy. In a tutorial after me, Willem de Koning will show you these tools and how to analyze your sequence data. This toolkit has different tools for nanopore data: it has assembly, it has QC reporting, and it has the ability to look for resistance. It also has a tool to differentiate between chromosomal and plasmid DNA. You can have all those tools separately and run them, and this is also in the tutorial. I think it's important to be able to do this, to learn about the different tools and how they work. But of course it's also nice when you don't have to start every tool separately and can use Galaxy pipelines instead, with all the tools connected, so that you can run your analysis with a couple of presses of a button. That is what we were aiming for.

So here is an assembly workflow that we use in the lab. It's a miniasm and minimap workflow, and we have also made a pipeline from it. Again, this will be in the tutorial. What it does is this: first you put your DNA sequences into the workflow. Then you do an alignment step, and it uses minimap for that. Then there is the de novo assembly with miniasm, which is an assembler. Then you do another round of mapping, where you use the reads again together with your assembly. And finally there will be a consensus of those steps, a consensus sequence or contigs, which is produced by Racon.
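For readers who prefer to see the four steps written down, here is a minimal command-line sketch of that workflow in Python. It is not the Galaxy workflow itself: the file names and the helper functions `build_commands`, `gfa_to_fasta` and `run_pipeline` are hypothetical, the tool names assume `minimap2`, `miniasm` and `racon` are installed and on the PATH, and the options shown (`-x ava-ont`, `-x map-ont`, `-f`) are just the commonly documented ones for Oxford Nanopore reads.

```python
import subprocess
from pathlib import Path


def gfa_to_fasta(gfa_text: str) -> str:
    """Convert GFA segment (S) lines, as written by miniasm, to FASTA text."""
    records = []
    for line in gfa_text.splitlines():
        fields = line.split("\t")
        # Segment lines look like: S <name> <sequence> [tags...]
        if fields[0] == "S" and len(fields) >= 3:
            records.append(f">{fields[1]}\n{fields[2]}\n")
    return "".join(records)


def build_commands(reads: str, outdir: str) -> list:
    """Return (argv, stdout file) pairs for the four command-line steps."""
    out = Path(outdir)
    overlaps = str(out / "overlaps.paf")   # hypothetical file names
    draft_fa = str(out / "draft.fasta")
    mapped = str(out / "mapped.paf")
    return [
        # 1. Alignment step: all-vs-all overlap of the raw reads.
        (["minimap2", "-x", "ava-ont", reads, reads], overlaps),
        # 2. De novo assembly; miniasm writes an assembly graph (GFA).
        (["miniasm", "-f", reads, overlaps], str(out / "draft.gfa")),
        # 3. Second round of mapping: reads against the draft assembly.
        (["minimap2", "-x", "map-ont", draft_fa, reads], mapped),
        # 4. Consensus: racon <reads> <overlaps> <target sequences>.
        (["racon", reads, mapped, draft_fa], str(out / "consensus.fasta")),
    ]


def run_pipeline(reads: str, outdir: str) -> None:
    """Run the steps in order, writing each tool's stdout to its file."""
    Path(outdir).mkdir(parents=True, exist_ok=True)
    for argv, stdout_path in build_commands(reads, outdir):
        with open(stdout_path, "w") as handle:
            subprocess.run(argv, stdout=handle, check=True)
        if argv[0] == "miniasm":
            # The mapping and consensus steps need FASTA, not GFA.
            gfa = Path(stdout_path).read_text()
            (Path(outdir) / "draft.fasta").write_text(gfa_to_fasta(gfa))
```

Calling `run_pipeline("reads.fastq", "assembly_out")` would run the steps one after another; the Galaxy pipeline does the same chaining for you without any scripting.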
For the input sequences, data statistics reports will also be generated, so you will have that. Then, from this consensus DNA, you can look at antimicrobial resistance using staramr. We have a tool that predicts whether a sequence is a plasmid or whether it's chromosomal. Then there's visualization of the assembly, and we use Bandage for that. And I mentioned the QC report already. You can do all these steps one by one, but now we also have them combined in a pipeline.

We made this pipeline some time ago, but we have also aimed to make it better. There's a new assembler, which has also been available for a little while now. It's called Flye, and it works a bit better than miniasm. So we now also have the same pipeline but with the Flye assembler instead of minimap and miniasm. Then we also have the Unicycler assembler, which repeats the assembly to kind of polish your DNA, but it can also do hybrid assemblies. When you have nanopore data and Illumina data and you want really good sequences at SNP level, you can do a hybrid assembly, also in a pipeline. We added more functions. For example, we added BLAST. We added annotation with the tool Prokka. There are more visualizations. And we also have some databases in an iReport.

I wanted to switch to the iReport just to show what the pipeline does. So here is an example of an iReport. This is an E. coli that I sequenced. First, you can click through to the report. You can see different things, but let's just go to this one, the read quality. It gives you information about your input DNA. It gives you a graph, for example here, with the read length and the number of reads. So you can see that you have an average read length of about 6,000 base pairs. And then there's also extra information here. For example, you can see the number of reads, which in this run was 65,000. The mean read length was about 8,000. The longest read here was almost 100,000. And here's also something about the quality.
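To make those read statistics concrete, here is a small Python sketch that computes the same kind of numbers (number of reads, mean and median read length, longest read) from FASTQ text. It is only an illustration, not the reporting tool used in the pipeline; the function names `read_lengths` and `summarize` are hypothetical, and it assumes the usual four lines per FASTQ record and at least one read.

```python
from statistics import mean, median


def read_lengths(fastq_text: str) -> list:
    """Return the length of every read in 4-line-per-record FASTQ text."""
    lines = fastq_text.splitlines()
    # The sequence is the second line of each four-line record.
    return [len(lines[i].strip()) for i in range(1, len(lines), 4)]


def summarize(lengths: list) -> dict:
    """Summarize read lengths; raises an error if the list is empty."""
    return {
        "reads": len(lengths),
        "mean_length": mean(lengths),
        "median_length": median(lengths),
        "longest_read": max(lengths),
    }


# Tiny made-up example with three reads of length 4, 8 and 2.
example = "@r1\nACGT\n+\n!!!!\n@r2\nACGTACGT\n+\n!!!!!!!!\n@r3\nAC\n+\n!!\n"
stats = summarize(read_lengths(example))
print(stats)
```

For a real run you would read the FASTQ file from disk instead of using the inline example string.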
So you get this information in this read report. Then there's more information here. This is the same, but there are more visualizations, and depending on what you like, you can look at this information.

Then there's a figure of the assembly, the contigs. This is a Bandage figure. Here you can see a really nice assembly of the chromosome. And here's a smaller piece that looks like a plasmid and also is a plasmid. There is some more information here about the contigs: there's contig 1 and its length, and also contig 2 and its length.

So, back to the iReport, looking at resistance. This is another visualization, with several pieces of information. In the outer ring, you can see the contigs. This is contig 1, in light blue, and this is contig 2. Then there's a ring that says something about GC content. If it's more GC, then I think it's blue, and if it's more AT, it's red. The green ring here is about the depth of your sequencing. You can see it's quite good, but you can also see some areas where the depth was not as good. I think it's a log scale: here's 10, this is 100 and this is 1,000. So we are at a depth of about 100. And what you can also see, and I think it's very nice, is that the plasmid here, which is sequence 2, has more depth. So there are more reads of this plasmid, and this also says something about the copy number. Here it's almost up to 1,000 in depth. So this would mean that there are probably about 10 copies of this plasmid within each cell.

Then there's information about the resistance. In this case, all the resistance was found on the plasmid and not on the chromosome. There's also a little table. Here you can see contig 1, which was the chromosome: nothing was found. In contig 2, you can see the resistance genes and what type of resistance is present. So then there are some annotations.
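Going back to the depth ring for a moment: the reasoning that a plasmid at a depth of about 1,000 next to a chromosome at about 100 means roughly 10 copies per cell can be written as a simple ratio. The Python sketch below is only an illustration of that arithmetic; the function name `estimate_copy_number`, the contig names and the depth values are made up, and it assumes that sequencing depth scales with copy number, which ignores any bias from DNA isolation or library preparation.

```python
from statistics import median


def estimate_copy_number(contig_depths: dict, chromosome: str) -> dict:
    """Estimate copies per cell as contig depth relative to the chromosome.

    contig_depths maps a contig name to a list of per-position depths.
    """
    baseline = median(contig_depths[chromosome])
    return {
        name: median(depths) / baseline
        for name, depths in contig_depths.items()
    }


# Made-up depths: chromosome around 100, plasmid around 1,000.
depths = {
    "contig_1": [100, 110, 90],
    "contig_2": [1000, 950, 1050],
}
copies = estimate_copy_number(depths, chromosome="contig_1")
print(copies)
```

The median is used here instead of the mean so that a few regions with unusually low or high depth do not dominate the estimate; that is a design choice of this sketch, not something taken from the pipeline.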
We built in a tool that completely annotates the contigs. I only show 10 here, but you can go and look at all of them. You can also download this and put it into some other software to look at it in more detail. And then there's also the BLAST tool. It gives you the top 10 hits for each contig. So here's contig 1 with its BLAST result, and you can see it's an E. coli. But maybe you also want to look at the plasmids a little bit more. So here's contig 2, which was the plasmid, and it has a hit with certain plasmids. So you can also kind of identify what kind of plasmid you are dealing with. Okay, so this was the iReport, and I want to switch back to my presentation.

Okay, so we are back at the presentation. Here I want to show you some results. This is from the read statistics and also the validation, where we compared DNeasy DNA isolation with the Genomic-tip method of DNA isolation. We did this because we found that with DNeasy the concentrations of DNA were quite low. We need about 50 nanograms per microliter in our sample for an optimal run, but we got concentrations that were mostly below that. That's why we started to experiment with the Genomic-tips. You can see that with this type of DNA isolation, using the same input material, we got higher concentrations, mostly above this 50 nanograms per microliter.

Then we did a small comparison of the read numbers, longest read and median read length for DNeasy versus Genomic-tips. Here you can see that the read numbers go up, which is logical because you put in more DNA. But the length of the reads is also higher when you use Genomic-tips. So there are fewer breaks in the DNA, and this will help with your assembly. The median read length also went up when we used the Genomic-tips. So again, this will help in your assembly.

So here are some examples of our assemblies. We did E. coli, Pseudomonas species and Klebsiella pneumoniae strains. With E. coli we used strains that we bought at DSMZ. These are publicly available, and they were strains that were known to contain plasmids and also resistance genes. So this was good for our validation. You can see here that we get good assembly and also nice separation of the plasmids in these strains. Here there is the odd piece where it is not clear where it should go. But overall we can differentiate between chromosome and plasmid.

Pseudomonas species were a bit more variable. Here you can see assembled circular plasmids. Here you can see one piece which was a chromosome, and again another piece which was not circular but which we could identify as a plasmid. And here again you can see chromosomal pieces where it's not clear where something should go, and another smaller piece which was again a plasmid.

Klebsiella can contain several different plasmids, and we could see that reflected. Here is a nice assembly with a complete chromosome and several different plasmids, and also still a loose piece of DNA. Here is a somewhat less nice example of an assembly, but again with full circular plasmids. And here again a nice example where everything was completely assembled, except only for this one, which was still a piece.

So I think it worked really well, but maybe you do want to have a complete assembly, or maybe you also want to go to SNP level for your sequence data, and then it's good to combine your Illumina and your nanopore results. For this you can use a workflow with the Unicycler assembler that is present in Galaxy. Here is an example where we did DNA isolation of plasmids; we used purified plasmids for this sequencing run. Here are the Illumina results. Here are the nanopore results, with already some nice complete and circular plasmids. But when we did the hybrid assembly, we found another small plasmid that we couldn't pick up with nanopore.
And then another result here: again the Illumina and the nanopore assemblies were not as good as in the previous case, but when we did the hybrid assembly we were really able to get those circular structures, discriminate the different plasmids from each other and localize the resistance genes.

So, in conclusion, I think we got really good results with whole-genome sequencing using nanopore, and of course also with the assembly. We used the DNeasy kit and the Genomic-tips for DNA isolation. We got somewhat better results with the Genomic-tips, but the DNeasy kit also gives us sufficient assemblies. For library preparation we used the rapid barcoding kit. There are other kits available that you might want to explore, but we have good experience with the rapid barcoding kit, and it's also really fast.

Then we have put bioinformatic tools into Galaxy in a nanopore toolbox, and we also created pipelines for assembly and further analysis of nanopore sequence data, especially focused on resistance. So we get good assembly of the chromosomes and the plasmids. We can discriminate between chromosomal and plasmid DNA. We can detect and localize the resistance genes. And all of this, depending a bit on the number of reads that you put in, can take from about 10 minutes to one hour, so that's quite fast.

We have also made new pipelines, and there are new tools available. Unfortunately, they are not available to everyone yet, but you can contact me if you are interested. So we have a pipeline with the Flye assembler, combined with the extra tools that I showed you in the iReport. Our newer versions of the pipelines include visualizations, annotations and BLAST functions.

Here are some publications from our group that I want to mention, from microbiology but also from bioinformatics. This is NanoGalaxy, so this is about the toolkit.
This is a publication that is actually not on assembly but on nanopore for microbiota analysis by 16S gene sequencing, and this is a paper about our experience with nanopore sequencing and the lessons that we learned when we were introduced to it. We actually also created another pipeline, which is not a Galaxy pipeline, but it has an interface and it's quite easy to use once it's installed. We have submitted a paper about it. It also compares the miniasm and minimap assembly to the Flye assembler, and shows somewhat better results with the Flye assembler. So this is submitted. And then there is a submitted paper about mcr-1-carrying plasmids in the hospital setting, which we did in collaboration with some microbiologists in our department.

Okay, then I would like to acknowledge a couple of people and our funders. From our department, Medical Microbiology and Infectious Diseases, I'd like to acknowledge Deborah, who helped set up the nanopore technology and also did several runs, and John Hays, with whom I collaborated and who also provided part of the funding. Then, from Clinical Bioinformatics, which together with us helps with the development of pipelines and the necessary tools: Helena, with whom I worked more recently on building the newer pipelines and adding the newer tools; Willem and Saskia, who developed the miniasm and minimap pipeline and added the tools to the Galaxy toolbox; and Andrew Stubbs, who is the supervisor of Clinical Bioinformatics.

And then for funding: we got some European funding for a project called TAILORED-Treatment, where we started nanopore sequencing, pretty much based on 16S sequencing. Then I got some money from the Dutch Society of Medical Microbiologists for validating the tools and the sequencing and also for building the newer pipelines, and now we are in a JPIAMR project.
This project is funded by JPIAMR and is called Seq4AMR. In it we are also involved in bringing the tools to the public and teaching about the tools, and it's all related to resistance.

So I'd like to thank you for listening to this presentation. I hope you enjoyed it. I'm available for questions if you have them. We can also make some of the newer pipelines available to you if you are interested, but I think it's good to first go to the tutorial, start to work with Galaxy and start to understand the different steps. Once you've done that, you're free to contact me, and we can maybe make the newer pipelines available to you as well. Okay, thank you very much. Bye-bye.