GENOMES

In this section you can select one of the genomes from the menu. If you would like other genomes to appear in the selection menu, please let us know. Alternatively, you can browse for and upload a file from your desktop containing the genome you are interested in.

The genome file must be a plain-text file and may contain one or more amino acid sequences (200 maximum), all of which must be in FASTA format. A sequence in FASTA format begins with a single-line description, followed by lines of sequence data. The description line is distinguished from the sequence data by a greater-than (">") symbol in the first column. We recommend that all lines of text be shorter than 80 characters. For the description line of each sequence, we recommend following the NCBI annotation.

Here is an example of a genome file containing five sequences in FASTA format:

>gi|GI:29836498|gb|sars3b|NP_828853.1
MMPTTLFAGTHITMTTVYHITVSQIQLSLLKVTAFQHQNSKKTTKLVVILRIGTQVLKTM
SLYMAISPKFTTSLSLHKLLQTLVLKMLHSSSLTSLLKTHRMCKYTQSTALQELLIQQWI
QFMMSRRRLLACLCKHKKVSTNLCTHSFRKKQVR
>gi|GI:29836499|gb|sars4|NP_828854.1
MYSFVSEETGTLIVNSVLLFLAFVVFLLVTLAILTALRLCAYCCNIVNVSLVKPTVYVYS
RVKNLNSSEGVPDLLV
>gi|GI:29836504|gb|sars5|NP_828855.1
MADNGTITVEELKQLLEQWNLVIGFLFLAWIMLLQFAYSNRNRFLYIIKLVFLWLLWPVT
LACFVLAAVYRINWVTGGIAIAMACIVGLMWLSYFVASFRLFARTRSMWSFNPETNILLN
VPLRGTIVTRPLMESELVIGAVIIRGHLRMAGHSLGRCDIKDLPKEITVATSRTLSYYKL
GASQRVGTDSGFAAYNRYRIGNYKLNTDHAGSNDNIALLVQ
>gi|GI:29836500|gb|sars6|NP_828856.1
MFHLVDFQVTIAEILIIIMRTFRIAIWNLDVIISSIVRQLFKPLTKKNYSELDDEEPMEL
DYP
>gi|GI:29836501|gb|sars7a|NP_828857.1
MKIILFLTLIVFTSCELYHYQECVRGTTVLLKEPCPSGTYEGNSPFHPLADNKFALTCTS
THFAFACADGTRHTYQLRARSVSPKLFIRQEEVQQELYSPLFLIVAALVFLILCFTIKRK
TE
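
Before uploading, it can help to check a file against the rules above. The following is a minimal sketch, not the server's own validation: the function names `parse_fasta` and `check_genome_file` are hypothetical, and the set of accepted residues (the 20 standard one-letter codes) is an assumption, since the server may accept other symbols.

```python
MAX_SEQUENCES = 200            # limit stated in this section
RECOMMENDED_LINE_LENGTH = 80   # lines should be shorter than this
# Assumption: only the 20 standard amino acid codes are accepted.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def parse_fasta(text):
    """Return a list of (description, sequence) pairs from FASTA text."""
    records = []
    header = None
    chunks = []
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line.strip():
            continue  # ignore blank lines
        if line.startswith(">"):  # ">" must be in the first column
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        else:
            if header is None:
                raise ValueError("sequence data before the first '>' line")
            chunks.append(line.strip().upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_genome_file(text):
    """Return a list of human-readable problems; empty if none were found."""
    try:
        records = parse_fasta(text)
    except ValueError as err:
        return [str(err)]
    problems = []
    if not records:
        problems.append("no sequences found")
    if len(records) > MAX_SEQUENCES:
        problems.append(f"{len(records)} sequences; maximum is {MAX_SEQUENCES}")
    for header, seq in records:
        if not seq:
            problems.append(f"{header}: no sequence data")
        bad = sorted(set(seq) - AMINO_ACIDS)
        if bad:
            problems.append(f"{header}: unexpected symbols {''.join(bad)}")
    for number, line in enumerate(text.splitlines(), start=1):
        if len(line) >= RECOMMENDED_LINE_LENGTH:
            problems.append(f"line {number}: {len(line)} characters long")
    return problems
```

Calling `check_genome_file(open("genome.txt").read())` on a file like the example above should return an empty list; `genome.txt` is a placeholder file name.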

Note: For pathogens such as HIV-1 that are known to be represented by numerous variants, we built multiple amino acid sequence alignments for each of the gene products and generated consensus sequences with the variable residues masked (positions with a Shannon entropy > 1 are masked). Prediction of T-cell epitopes is therefore restricted to the conserved regions.
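
To illustrate the masking described in the note, here is a minimal sketch of an entropy-masked consensus. It is not the pipeline used to build the genomes in the menu. The names `shannon_entropy` and `masked_consensus` are hypothetical, and several details are assumptions the note does not specify: entropy is computed in bits (log base 2), gap characters are counted like any other symbol, the consensus residue is the most frequent one in the column, and masked positions are written as "X".

```python
import math
from collections import Counter


def shannon_entropy(column):
    """Shannon entropy, in bits, of the symbols in one alignment column."""
    total = len(column)
    counts = Counter(column)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def masked_consensus(aligned, threshold=1.0, mask="X"):
    """Consensus of equal-length aligned sequences with variable columns masked.

    A column is masked when its entropy is strictly greater than `threshold`.
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("need at least one sequence, all of the same length")
    consensus = []
    for column in zip(*aligned):
        if shannon_entropy(column) > threshold:
            consensus.append(mask)  # variable position
        else:
            # Conserved position: keep the most frequent residue.
            consensus.append(Counter(column).most_common(1)[0][0])
    return "".join(consensus)


# Toy alignment (made up for illustration, not real HIV-1 data).
alignment = ["MKTA", "MKSA", "MRGA", "MKCA"]
print(masked_consensus(alignment))  # prints MKXA
```

In the toy alignment, the third column holds four different residues (entropy of 2 bits) and is masked, while the second column (three K, one R, about 0.81 bits) stays below the threshold and keeps its majority residue.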