- Mature release
- Active development
- Active support
SOBA is a command line tool and web application for analyzing GFF3 annotations. GFF3 is a standard file format for genomic annotation data. SOBA gathers statistics from GFF3 files and renders them as tables and graphs.
The web version of SOBA will produce the following:
- Summary statistics of feature types and attributes used
- Histograms of feature lengths
- Graphs of Sequence Ontology terms used
- Histograms of intron density
- Suggestions to improve SO compliance for invalid terms
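As a rough illustration of the first two items in the list above, the following Python sketch tallies feature types and bins feature lengths from GFF3 text. It is not SOBA's own code. The function names `summarize_gff3` and `length_histogram`, the bin size, and the example records are all hypothetical; the sketch relies only on the standard nine-column, tab-delimited GFF3 layout with 1-based inclusive coordinates.

```python
from collections import Counter, defaultdict
import io


def summarize_gff3(handle):
    """Count features per type and collect feature lengths (hypothetical helper)."""
    type_counts = Counter()
    lengths = defaultdict(list)
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # embedded sequence follows; no more feature lines
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # this sketch silently skips malformed lines
        ftype, start, end = cols[2], int(cols[3]), int(cols[4])
        type_counts[ftype] += 1
        # GFF3 coordinates are 1-based and inclusive
        lengths[ftype].append(end - start + 1)
    return type_counts, lengths


def length_histogram(values, bin_size):
    """Bin lengths into fixed-width bins keyed by each bin's lower bound."""
    return Counter((v // bin_size) * bin_size for v in values)


# Made-up example records, for illustration only
example = io.StringIO(
    "##gff-version 3\n"
    "chr1\tdemo\tgene\t1000\t5000\t.\t+\t.\tID=gene1\n"
    "chr1\tdemo\tmRNA\t1000\t5000\t.\t+\t.\tID=mrna1;Parent=gene1\n"
    "chr1\tdemo\texon\t1000\t1200\t.\t+\t.\tParent=mrna1\n"
    "chr1\tdemo\texon\t3000\t3500\t.\t+\t.\tParent=mrna1\n"
    "chr1\tdemo\texon\t4800\t5000\t.\t+\t.\tParent=mrna1\n"
)

counts, lengths = summarize_gff3(example)
for ftype, n in counts.most_common():
    print(ftype, n, lengths[ftype])
print(dict(length_histogram(lengths["exon"], 100)))
```

To run the same functions on a real annotation, pass an open file handle in place of the in-memory example.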
In addition, the command line tool (SOBAcl) produces a much wider variety of tables, figures and graphs from the data in a GFF3 file, and can generate complex, extensible custom reports via a robust template system.
SOBA is intended for those working with genomic sequence annotation who want genome-wide summaries of their annotation files. For example, SOBA would be useful at an annotation jamboree for a newly sequenced organism and when preparing the resulting genome paper; it would help developers of annotation tools quickly evaluate updates to their tools; and it assists comparative genomics analyses by providing a high-level overview of the genomes of multiple organisms. SOBA complements genome browsers by providing a summary of all the features annotated in a genome.
Documentation for the web interface to SOBA is available on the Sequence Ontology Wiki as well as via tool-tips on the site itself.
Documentation for the command line version (SOBAcl) is available as a usage statement with the script itself:
README and INSTALL documents are also included with SOBAcl.
- The Graphviz library
- The libgd graphics library
The SOBA web interface is available at:
SOBAcl is available (via Subversion) from:
svn co svn://malachite.genetics.utah.edu/SOBA/trunk SOBA
SOBA is supported by the Sequence Ontology Developers Mailing list at: