Now we turn to sequence submission. Sequences are submitted to databases in order to share them with the scientific community, and funding agencies and journals sometimes require submission as well. It is important to make sure that sequence files do not contain any special characters, because control characters can slip into otherwise normal sequences and break the downstream analysis.

There is also the question of how to represent ambiguous nucleotides or amino acids. At some positions we are not sure whether the base is A, C, T or G, yet we are restricted to a single letter. The International Union of Biochemistry, abbreviated IUB, has established standard codes for these ambiguous bases and amino acids. G, A, T and C are simply guanine, adenine, thymine and cytosine. R means the base can be either A or G; the letter comes from the group they belong to, the purines. Y is pyrimidine, so it can be either C or T. M stands for bases that carry an amino group, A or C. K is for bases that carry a keto group, G or T. S is for strong interactions: C and G pair through three hydrogen bonds, so S at a position means either C or G. W is for weak interactions, A or T. H follows G in the alphabet, so it means everything except G: A, C or T. The same procedure is followed for B (everything except A), V (everything except T) and D (everything except C), and N can be any base.
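To make the ambiguity codes and the special-character check concrete, here is a minimal sketch in Python. The names `IUB_CODES`, `possible_bases` and `find_problem_characters` are made up for this illustration and do not come from any submission tool; the table simply encodes the codes described above.

```python
# IUB ambiguity codes: each letter maps to the bases it may stand for.
IUB_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "M": "AC",    # amino group
    "K": "GT",    # keto group
    "S": "CG",    # strong pairing (three hydrogen bonds)
    "W": "AT",    # weak pairing (two hydrogen bonds)
    "H": "ACT",   # everything except G
    "B": "CGT",   # everything except A
    "V": "ACG",   # everything except T
    "D": "AGT",   # everything except C
    "N": "ACGT",  # any base
}


def possible_bases(code):
    """Return the set of bases that a single IUB letter can represent."""
    return set(IUB_CODES[code.upper()])


def find_problem_characters(sequence):
    """Return (position, character) pairs for anything that is not an IUB code.

    Positions are 0-based. Control characters, digits and whitespace
    all show up here because none of them are keys in IUB_CODES.
    """
    return [
        (i, ch) for i, ch in enumerate(sequence)
        if ch.upper() not in IUB_CODES
    ]
```

Running `find_problem_characters` over a sequence before submitting it would flag a stray control character together with its position, which is usually easier than tracking down the failure later in the downstream analysis.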
In the same way, amino acids have single-letter codes running from A to Z, though some letters are not used. Several amino acids have names starting with the same letter, for example G. The letter G was given to glycine, and the others take different letters: glutamic acid, for instance, is represented as E. Y stands for tyrosine, and X can be any amino acid, like N in nucleotide sequences.

NCBI has two options for sequence submission. If the sequences are simple and do not need much downstream analysis, we can submit them through a tool called BankIt. It also suits small data sets, which we can transfer over the internet. Sequin is for submitting complex data sets, that is, complex sequences together with their annotations. It is also good for preparing submissions offline, typically when the data sets are large and may later be used with other analysis tools and software. Here are web pages giving a glimpse of BankIt and Sequin.

For protein sequences, UniProt has a tool similar to the NCBI ones, called SPIN. It takes protein sequences along with their annotations, which are recorded in its knowledgebase. Here is the page for SPIN: we can register there and then submit our data.

So what we have seen in this section is that sequences are stored in specific formats, and if we want to submit our sequences we need to follow the guidelines provided by those databases.
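Returning to the single-letter amino acid codes mentioned earlier, the mapping can be sketched as a small lookup table. This is only an illustration: the names `AMINO_ACID_CODES` and `spell_out` are hypothetical, and the table covers the twenty standard amino acids plus X, not every letter a particular database may accept.

```python
# Single-letter codes for the twenty standard amino acids, plus X for "any".
AMINO_ACID_CODES = {
    "A": "alanine", "R": "arginine", "N": "asparagine", "D": "aspartic acid",
    "C": "cysteine", "Q": "glutamine", "E": "glutamic acid", "G": "glycine",
    "H": "histidine", "I": "isoleucine", "L": "leucine", "K": "lysine",
    "M": "methionine", "F": "phenylalanine", "P": "proline", "S": "serine",
    "T": "threonine", "W": "tryptophan", "Y": "tyrosine", "V": "valine",
    "X": "any amino acid",
}


def spell_out(protein):
    """Translate each letter of a protein sequence into its amino acid name.

    Letters missing from the table are reported rather than silently dropped.
    """
    return [
        AMINO_ACID_CODES.get(ch.upper(), "unrecognised: " + repr(ch))
        for ch in protein
    ]
```

For example, `spell_out("GEY")` gives glycine, glutamic acid and tyrosine, the three residues used as examples in the lecture.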