To simplify the retrieval of common datasets, the Generic Model Organism Database (GMOD) community recommends a set of standard URLs, also called common download URLs. Each participating model organism database (MOD) has an index page like the ones below, describing the species and datasets that are available.
MOD Standard URL
|WormBase||http://www.wormbase.org/genome/||Caenorhabditis elegans and related nematodes|
|wFleaBase||http://wfleabase.org/genome/||Daphnia pulex and related crustaceans|
|DroSpeGe||http://insects.eugenes.org/genome/||Twelve Drosophila insect species genomes|
About GMOD Standard URL
This standard specifies the URLs tabulated below (all located under http://your.org/). The top-level /genome/ URL displays an HTML-formatted index page that contains links to each of the species available through common URLs. See also Todd Harris's PowerPoint presentation given at the Spring 2005 GMOD meeting. The common URLs serve two purposes:
- Keep it simple for scientists to guess where to find a genome, even when they are unfamiliar with the MOD website.
- Keep it standard so that programmers can rely on a long-lasting, computer-parsable data URL, with no guesswork on spelling and with defined data formats.
|/genome/Binomial_name||An index page for species "Binomial_name". This will be an HTML-format page containing links to each of the genome releases.|
|/genome/Binomial_name/release||Leads to index for the named release. It should be an HTML-format page containing links to each of the data sets described below.|
|/genome/Binomial_name/current||Leads to the index of the most recent release, in the manner of a symbolic link.|
|/genome/Binomial_name/current/dna||Returns a FASTA file containing big DNA fragments (e.g. chromosomes). MIME type is application/x-fasta.|
|/genome/Binomial_name/current/mrna||Returns a FASTA file containing spliced mRNA transcript sequences. MIME type is application/x-fasta.|
|/genome/Binomial_name/current/ncrna||Returns a FASTA file containing non-coding RNA sequences. MIME type is application/x-fasta.|
|/genome/Binomial_name/current/protein||Returns a FASTA file containing all the protein sequences known to be encoded by the genome. MIME type is application/x-fasta.|
|/genome/Binomial_name/current/feature||Returns a GFF3 file describing genome annotations. MIME type is application/x-gff3.|
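Because the paths in the table follow a fixed pattern, a client can assemble them without consulting each MOD's site. The following Python sketch illustrates the idea; it is not part of the standard. The function name `standard_url`, the `DATASETS` mapping, and the choice to convert spaces to underscores are assumptions made for this example.

```python
from urllib.parse import quote

# Dataset path segment -> MIME type, copied from the table above.
DATASETS = {
    "dna": "application/x-fasta",
    "mrna": "application/x-fasta",
    "ncrna": "application/x-fasta",
    "protein": "application/x-fasta",
    "feature": "application/x-gff3",
}


def standard_url(base, binomial_name, dataset=None, release="current"):
    """Build a standard URL under `base` (hypothetical helper).

    release=None yields the species index page; dataset=None yields
    the release index page; otherwise the dataset URL is returned.
    """
    # Assumption: spaces in the binomial name become underscores.
    name = binomial_name.strip().replace(" ", "_")
    parts = [base.rstrip("/"), "genome", quote(name)]
    if release is None:
        if dataset is not None:
            raise ValueError("a dataset requires a release")
        return "/".join(parts)
    parts.append(quote(release))
    if dataset is not None:
        if dataset not in DATASETS:
            raise ValueError(f"unknown dataset: {dataset!r}")
        parts.append(dataset)
    return "/".join(parts)


if __name__ == "__main__":
    url = standard_url("http://your.org/", "Caenorhabditis elegans", "dna")
    print(url, DATASETS["dna"])
```

A client could compare the `Content-Type` header of the response against the `DATASETS` entry to check that a server returns the format the table specifies.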
Other names for this: Common download URL, Common URL, Standard URL
Note: MODs may optionally provide URLs in the short form G_species (e.g., C_elegans) as a convenience for users. The short form should be supplied in addition to the full Binomial_name standard, not in place of it.
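The short form can be derived mechanically from the full name. This is a minimal sketch, assuming the short form is the first letter of the genus followed by the species epithet, as in the C_elegans example; the function name `short_form` is hypothetical.

```python
def short_form(binomial_name):
    """Convert 'Genus_species' (or 'Genus species') to 'G_species'."""
    normalized = binomial_name.strip().replace(" ", "_")
    # Split on the first underscore: genus before it, species after it.
    genus, _, species = normalized.partition("_")
    if not genus or not species:
        raise ValueError(f"not a binomial name: {binomial_name!r}")
    return f"{genus[0].upper()}_{species}"


if __name__ == "__main__":
    print(short_form("Caenorhabditis_elegans"))
```

Two species in different genera can share a short form, which is one reason the full Binomial_name URL remains the required form.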
Common /genome pages
These projects provide data at /genome/, though not yet in the common formats described above.
|Medicago||http://www.medicago.org/genome/||Medicago truncatula plant genome|
|MaizeGDB||http://www.maizegdb.org/genome/||Maize corn genome|
|Human vector insects genomes|
MOD Non-Standard URL
For genome projects that haven't yet standardized their URLs, this site lists what is available: