Genome Annotation with Prokka

Before diving into this slide deck, we recommend that you have a look at the following.

How to annotate a bacterial genome? How to visualize annotated genomic features?

Load genome into Galaxy. Annotate genome with Prokka. View annotations in JBrowse.

In these slides, we will learn what genome annotation is and which tools can be used for it. We will describe in detail a tool called Prokka.

Annotating a genome means positioning features along the sequence of a genome. Those features can be anything one can find in a genome sequence: genes, but also binding sites, for example. When a feature, such as a gene, is positioned, you can add information about its function. This operation is called functional annotation.

Before annotating a genome, you need to assemble it. If you get a high-quality assembly, it will be easier to produce a good-quality annotation. Once you have a good genome sequence, you can annotate it.

In this example, there is a gene coding for a delta toxin. There is a ribosome binding site in red, and the coding sequence of this gene is in green. For each feature annotated on a genome, you can get its position, its type, and some information about its function or how it is expressed.

You can annotate features by looking at similarities with known sequences from international databases. Some tools annotate features on a genome by seeking motifs that correspond to known structures, for example gene or exon starts and stops. Some lab experiments can help annotate specific regions of a genome, even though they are often much more expensive than automatic annotation. Lab experiments can provide certainty about function, whereas automatic annotation is more of a guess.

Prokka is a pipeline that runs several other tools to annotate prokaryotic genomes. The input is the assembly of the genome in FASTA format. Prokka runs Aragorn to annotate transfer RNAs. Ribosomal RNAs are annotated with RNAmmer.
Infernal uses the Rfam database to annotate non-coding RNAs. Finally, Prodigal annotates coding genes. Each coding sequence is then compared to the SwissProt sequence database using BLAST, and to the TIGRFAMs and Pfam motif databases using HMMER3. SignalP is also run to detect signal peptides in each predicted coding sequence. The final result of the whole Prokka pipeline is a set of GFF3, GBK, and ASN.1 files.

More information is available in the introduction to genome annotation slides. Prokka is a useful tool to annotate a bacterial genome. JBrowse can be used to inspect the annotation of a genome. Thank you for watching.
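
As a small illustration of the idea that each annotated feature carries a position, a type, and some information about its function, here is a minimal Python sketch that reads one feature line in GFF3 format, one of the formats mentioned above. The nine tab-separated columns follow the GFF3 specification. The example line itself, including the contig name, coordinates, ID, and product text, is made up for illustration and is not real Prokka output. The names `Feature` and `parse_gff3_line` are likewise hypothetical helpers, not part of Prokka or any library.

```python
from dataclasses import dataclass
from typing import Dict, Optional
from urllib.parse import unquote


@dataclass
class Feature:
    """One annotated feature: where it is, what it is, and what is known about it."""
    seqid: str
    source: str
    type: str
    start: int  # 1-based, inclusive (GFF3 convention)
    end: int    # 1-based, inclusive
    strand: str
    attributes: Dict[str, str]


def parse_gff3_line(line: str) -> Optional[Feature]:
    """Parse a single GFF3 feature line.

    Returns None for blank lines, comments, and directives (lines starting with '#').
    """
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    cols = line.split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    seqid, source, ftype, start, end, _score, strand, _phase, attrs = cols

    # Column 9 holds semicolon-separated key=value pairs, URL-escaped where needed.
    attributes: Dict[str, str] = {}
    for pair in attrs.split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attributes[unquote(key)] = unquote(value)

    return Feature(seqid, source, ftype, int(start), int(end), strand, attributes)


# Made-up example line (assumption: illustrative values only).
example = "contig_1\texample_tool\tCDS\t1200\t1334\t.\t+\t0\tID=EXAMPLE_00001;product=delta toxin"

feature = parse_gff3_line(example)
length = feature.end - feature.start + 1  # both coordinates are inclusive
print(feature.type, feature.start, feature.end, feature.attributes.get("product"), length)
```

A reader for a whole file would call this function line by line and would need to stop at a `##FASTA` directive if the file embeds sequences after the features, since those lines do not have nine columns.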