Learning objectives

To annotate a genome and understand the output files.

Exercise 1

First we need to update a program on the servers. Log in and run the following command:
brew upgrade && brew upgrade -v tbl2asn

This will take a couple of minutes.

Now we will run the prokka program, which we learned about in the lecture. We will annotate the file contigs.fasta, which contains the assembled Staphylococcus aureus contigs from yesterday’s assembly practical.

prokka --outdir s_aureus --prefix s_aureus --cpus 4 --mincontiglen 500 --locustag saureus contigs.fasta

This will take a couple of minutes. When it is complete, the output files will be in the s_aureus directory. There will be 10 output files.
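To check that the run finished and to see what was written, you can list the output directory and count the files in it. The snippet below is a minimal sketch; list_outputs is a hypothetical helper name, not part of prokka, and it assumes the output directory is s_aureus as in the command above.

```shell
# Hypothetical helper (not part of prokka): list the files in an
# output directory and print how many regular files it contains.
# Defaults to the s_aureus directory used in the prokka command above.
list_outputs() {
    outdir="${1:-s_aureus}"
    if [ ! -d "$outdir" ]; then
        echo "Directory $outdir not found; has prokka finished?" >&2
        return 1
    fi
    # Show file names and sizes
    ls -lh "$outdir"
    # Count regular files only; the count is the last line printed
    find "$outdir" -maxdepth 1 -type f | wc -l | tr -d ' '
}
```

Run it with list_outputs s_aureus and compare the last line of the output with the number of files you expect.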

Look at the .txt file, which has summary statistics about the annotation.
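Because we used --prefix s_aureus, the summary file should be s_aureus/s_aureus.txt, and you can print it with cat s_aureus/s_aureus.txt. If you would rather pull out a single value, the sketch below may help. It assumes the summary is laid out as one "name: value" pair per line; summary_value is a hypothetical helper name introduced here for illustration.

```shell
# Hypothetical helper: print the value for one entry in a
# "name: value" summary file.
# Assumption: each line of the file looks like "CDS: 1234".
summary_value() {
    key="$1"
    file="${2:-s_aureus/s_aureus.txt}"
    # Split each line on the colon (plus any following spaces) and
    # print the value when the name matches exactly
    awk -F': *' -v k="$key" '$1 == k { print $2 }' "$file"
}
```

For example, summary_value CDS prints the value on the CDS line of the default summary file, and summary_value contigs does the same for the contigs line.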

Q1. How many coding sequences were predicted?

Q2. Which file contains the protein sequences for every predicted coding sequence?

Now we will visualise the output of prokka using a tool called Artemis. From here we will take you through the Artemis visualisation with a live demo.