SOBA, the Sequence Ontology Bioinformatics Analysis tool, provides a high-level overview of the features in a GFF3 sequence annotation file. Although GFF3, the standard file format for genome annotation, is simple to produce and work with, whole-genome annotation data still form a large and complex dataset. SOBA automatically calculates and displays common statistics and graphics used when working with GFF3 files:

  • Summary counts and statistics of feature types and attributes used
  • Histograms of feature lengths
  • Graphs of Sequence Ontology terms used
  • Histograms of intron density
  • Suggestions to improve SO compliance for invalid terms

Ready access to these summary details helps annotators and others working with the data quickly evaluate the quality and completeness of an annotation set and compare it with other genome annotations.
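To make the first two kinds of summary concrete, here is a minimal Python sketch of counting feature types and collecting feature lengths from GFF3 lines. It is an illustration only, not SOBA's own implementation: the function name `summarize_gff3` and the example records are hypothetical, and the sketch assumes well-formed, tab-delimited, nine-column GFF3 feature lines.

```python
from collections import Counter, defaultdict


def summarize_gff3(lines):
    """Count feature types and collect feature lengths from GFF3 lines.

    Hypothetical helper for illustration; not part of SOBA.
    """
    type_counts = Counter()
    lengths = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        # An embedded FASTA section ends the feature records.
        if line.startswith("##FASTA"):
            break
        # Skip blank lines, directives, and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # not a nine-column GFF3 feature line
        feature_type = fields[2]
        start, end = int(fields[3]), int(fields[4])
        type_counts[feature_type] += 1
        # GFF3 coordinates are 1-based and inclusive.
        lengths[feature_type].append(end - start + 1)
    return type_counts, dict(lengths)


# Made-up example records, used only to exercise the function.
example = [
    "##gff-version 3",
    "chr1\texample\tgene\t100\t500\t.\t+\t.\tID=gene1",
    "chr1\texample\tmRNA\t100\t500\t.\t+\t.\tID=mrna1;Parent=gene1",
    "chr1\texample\texon\t100\t200\t.\t+\t.\tID=exon1;Parent=mrna1",
    "chr1\texample\texon\t300\t500\t.\t+\t.\tID=exon2;Parent=mrna1",
]
counts, lengths = summarize_gff3(example)
for feature_type, n in counts.items():
    print(feature_type, n, lengths[feature_type])
```

The per-type length lists could then be binned to draw length histograms like those listed above.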

Use the file browser below to select and upload one or more GFF3 files and run an analysis. Please limit the total size of uploaded files to 1.5 GB.

Run SOBA on a new file
Upload one or more GFF3 files (max total upload size 1.5 GB):
Input URLs (separate URLs with spaces):

Documentation for SOBA can be found on the Sequence Ontology Wiki.