Standard URL

From GMOD
Revision as of 22:44, 15 September 2009 by Clements

To simplify the retrieval of common datasets, the Generic Model Organism Database (GMOD) community has recommended a series of standard URLs, also known as common download URLs. Each participating model organism database (MOD) has an index page like the ones below, describing the species and datasets that are available.

MOD Standard URL

Genome datasets available through the GMOD common URL
MOD Standard URL Description
WormBase http://www.wormbase.org/genome/ Caenorhabditis elegans and related nematodes
wFleaBase http://wfleabase.org/genome/ Daphnia pulex and related crustaceans
DroSpeGe http://insects.eugenes.org/genome/ Twelve Drosophila insect species genomes


About GMOD Standard URL

This standard specifies the following URLs (all located under http://your.org/). The top-level /genome/ URL displays an HTML-formatted index page that contains links to each of the species available through common URLs. See also Todd Harris' PowerPoint presentation given at the Spring 2005 GMOD meeting. The common URLs serve two purposes:

  • Keep it simple for scientists to guess where to find a genome, even when they are unfamiliar with the MOD website.
  • Keep it standard for programmers, so they can rely on a long-lasting, computer-parsable data URL with defined data formats and no guesswork about spelling.


Standard URL Description
/genome/Binomial_name An index page for species "Binomial_name". This will be an HTML-format page containing links to each of the genome releases.
/genome/Binomial_name/release Leads to index for the named release. It should be an HTML-format page containing links to each of the data sets described below.
/genome/Binomial_name/current Leads to an index of the most current release, symbolic link style.
/genome/Binomial_name/current/dna Returns a FASTA file containing big DNA fragments (e.g. chromosomes). MIME type is application/x-fasta.
/genome/Binomial_name/current/mrna Returns a FASTA file containing spliced mRNA transcript sequences. MIME type is application/x-fasta.
/genome/Binomial_name/current/ncrna Returns a FASTA file containing non-coding RNA sequences. MIME type is application/x-fasta.
/genome/Binomial_name/current/protein Returns a FASTA file containing all the protein sequences known to be encoded by the genome. MIME type is application/x-fasta.
/genome/Binomial_name/current/feature Returns a GFF3 file describing genome annotations. MIME type is application/x-gff3.
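
The URL layout in the table above can be sketched as a small helper. This is an illustration only, not part of the GMOD recommendation: the function name standard_url, the DATASETS mapping, and the underscore-separated spelling Caenorhabditis_elegans are assumptions made for the example, and http://your.org/ is the placeholder host used in the text.

```python
from urllib.parse import quote

# Dataset names and MIME types as listed in the table above.
DATASETS = {
    "dna": "application/x-fasta",
    "mrna": "application/x-fasta",
    "ncrna": "application/x-fasta",
    "protein": "application/x-fasta",
    "feature": "application/x-gff3",
}


def standard_url(base, binomial_name, release="current", dataset=None):
    """Build a /genome/Binomial_name/release[/dataset] URL (hypothetical helper)."""
    parts = ["genome", quote(binomial_name), quote(release)]
    if dataset is not None:
        if dataset not in DATASETS:
            raise ValueError(f"unknown dataset: {dataset!r}")
        parts.append(dataset)
    # Normalise the base so exactly one slash separates it from the path.
    return base.rstrip("/") + "/" + "/".join(parts)


# Assumed spelling of the binomial name, with an underscore between genus and species.
url = standard_url("http://your.org/", "Caenorhabditis_elegans", dataset="dna")
print(url, DATASETS["dna"])
```

A client could compare the Content-Type header of the response against DATASETS[dataset] to confirm that a site follows the recommended formats.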

Other names for this: Common download URL, Common URL, Standard URL


Note: MODs may optionally provide URLs in the short form G_species (e.g. C_elegans) as a convenience for users. The short form should be supplied in addition to, not instead of, the full Binomial_name standard.
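
As a rough sketch of how the short form relates to the full name, the hypothetical helper below abbreviates the genus to its initial. It assumes the binomial name is written with an underscore between genus and species, as in the C_elegans example; the function name short_form is not part of the recommendation.

```python
def short_form(binomial_name):
    """Abbreviate Genus_species to G_species (hypothetical helper)."""
    genus, _, species = binomial_name.partition("_")
    if not genus or not species:
        raise ValueError(f"expected Genus_species, got {binomial_name!r}")
    # Keep everything after the first underscore unchanged.
    return f"{genus[0].upper()}_{species}"


print(short_form("Caenorhabditis_elegans"))
```

Because the short form can be ambiguous across genera that share an initial, a site would still need the full Binomial_name URL as the canonical address.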

Common /genome pages

These projects provide genome data at /genome/, though not yet in the common formats described above.

Common /genome/ data pages
MOD Common URL Description
Medicago http://www.medicago.org/genome/ Medicago truncatula plant genome
MaizeGDB http://www.maizegdb.org/genome/ Maize corn genome
Neurospora http://www.neurosporagenome.org/genome/ Neurospora crassa
Vectorbase http://agambiae.vectorbase.org/Genome/, http://aaegypti.vectorbase.org/Genome/, http://iscapularis.vectorbase.org/Genome/, http://cpipiens.vectorbase.org/Genome/, http://phumanus.vectorbase.org/Genome/ Genomes of insect vectors of human disease

MOD Non-Standard URL

For genome projects that have not yet standardized their URLs, the following site lists what is available:

Reference Genomes

See also