Difference between revisions of "Template:GFF3FASTA"

From GMOD

Revision as of 01:02, 8 March 2011

GFF3 files can also include sequence in FASTA format at the end of the file. The FASTA sequences are preceded by a ##FASTA line. This sequence section is optional. If present, it can define the sequence for any landmark used in column 1 (the frame of reference). For example:

##gff-version 3
ctg123 . exon            1300  1500  .  +  .  ID=exon00001
ctg123 . exon            1050  1500  .  +  .  ID=exon00002
ctg123 . exon            3000  3902  .  +  .  ID=exon00003
ctg123 . exon            5000  5500  .  +  .  ID=exon00004
ctg123 . exon            7000  9000  .  +  .  ID=exon00005
##FASTA
>ctg123
cttctgggcgtacccgattctcggagaacttgccgcaccattccgccttg
tgttcattgctgcctgcatgttcattgtctacctcggctacgtgtggcta
tctttcctcggtgccctcgtgcacggagtcgagaaaccaaagaacaaaaa
aagaaattaaaatatttattttgctgtggtttttgatgtgtgttttttat
aatgatttttgatgtgaccaattgtacttttcctttaaatgaaatgtaat
cttaaatgtatttccgacgaattcgaggcctgaaaagtgtgacgccattc
...

When the GFF3 file is processed, the IDs on the header lines of the FASTA entries are matched against the landmark IDs used in column 1 of the annotation section of the file.
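
The matching described above can be illustrated with a minimal Python sketch. It is not taken from any GMOD tool: the function name split_gff3 and the variable names are hypothetical, and it assumes that the ID of a FASTA entry is the first whitespace-delimited token after the ">" character. Columns are split on any whitespace so that the space-aligned example above parses; real GFF3 files use tabs.

```python
def split_gff3(text):
    """Split GFF3 text into feature rows and a dict of FASTA sequences.

    Hypothetical helper for illustration only.
    """
    features = []      # one list of columns per annotation line
    sequences = {}     # FASTA ID -> list of sequence line fragments
    current_id = None
    in_fasta = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if in_fasta:
            if line.startswith(">"):
                # Assumption: the ID is the first token on the header line.
                tokens = line[1:].split()
                current_id = tokens[0] if tokens else None
                if current_id is not None:
                    sequences[current_id] = []
            elif current_id is not None:
                sequences[current_id].append(line)
        elif line == "##FASTA":
            # Everything after this directive is sequence data.
            in_fasta = True
        elif not line.startswith("#"):
            # Split into at most 9 columns; column 9 (attributes) stays whole.
            features.append(line.split(None, 8))
    joined = {seq_id: "".join(parts) for seq_id, parts in sequences.items()}
    return features, joined


example = """##gff-version 3
ctg123 . exon 1300 1500 . + . ID=exon00001
ctg123 . exon 1050 1500 . + . ID=exon00002
##FASTA
>ctg123
cttctgggcgtacccgattctcggagaacttgccgcaccattccgccttg
tgttcattgctgcctgcatgttcattgtctacctcggctacgtgtggcta
"""

features, sequences = split_gff3(example)

# Landmarks are the distinct values of column 1 in the annotation section.
landmarks = {row[0] for row in features}

# Landmarks with no matching FASTA entry; this is allowed, because the
# sequence section is optional and may live in a separate file.
missing = landmarks - set(sequences)
```

Here missing collects the landmarks for which the file carries no sequence, which is a reasonable place to fall back to a separate FASTA file.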

You don't have to store the FASTA sequences in the GFF3 file. You can also store them in a separate file containing only FASTA entries.