GMOD in the Cloud is a virtual server, available through the Amazon compute cloud, equipped with a suite of preconfigured GMOD components, including a Chado database, GBrowse2, JBrowse, Tripal, and Apollo or WebApollo. Users can clone the GMOD Amazon Machine Image (AMI) and create their own server for storing data and making it accessible to the public. Making use of "cloud computing"--hosting data and/or applications on existing networked computer systems--to give users access to preconfigured, extensible servers is an alternative to building and maintaining large computing infrastructure in-house. Potential applications of the GMOD AMI range from short-term usage, such as during annotation jamborees, to the long-term provision of access to genome data and applications to the community.
This page serves as a "clearing house" for information about GMOD components available in the cloud (typically meaning AWS, but there might be other cloud implementations as well).
GMOD in the Cloud
What's on the Cloud?
Where is the Cloud?
Current GMOD in the Cloud instance:
- Amazon AMI ID: ami-935224fa (in the US East-Virginia zone)
- Name: GMOD in the Cloud 2.0
(as of June 14, 2013)
The GMOD in the Cloud AMI has a separate data partition to make backup and updates easy.
Important note: Starting with the 2.0 release of GMOD in the Cloud, the AMI "phones home" when started as an instance; that is, it sends an email to the developers to let them know that someone is using it. For more information see below.
About instance types
While GMOD in the Cloud 2.0 will start on a micro instance, it will not run well there: GBrowse with FastCGI will struggle, and WebApollo will not work at all. The demo instance running at cloud.gmod.org is a small instance.
There is a GMOD-centric tutorial for getting started with GMOD in the Cloud.
When you log into your GMOD in the Cloud instance, you will be in the ubuntu user's home directory, /home/ubuntu. This directory is part of the machine's root partition, so anything you save there will be lost when you move to a new version of GMOD in the Cloud. To carry files over from one machine to the next, save them on your EBS partition. To make that easier, the home directory contains a link to the /data partition named "dataHome", so you can "cd /home/ubuntu/dataHome" or "cd ~/dataHome" to save files in a convenient spot relative to your home directory. The dataHome directory already contains a file called "bashrc", which is automatically included in your .bashrc when you log in, so any changes you would like to make to your shell environment can be added there.
Several other files and directories are also on the /data partition, so they will be preserved when you move to a new instance as well. These are:
|/data/etc/gbrowse||The config directory for GBrowse.|
|/data/etc/postgresql||The config directory for PostgreSQL|
|/data/opt||A good place to install any other software you want to use|
|/data/var/lib/gbrowse||Other GBrowse files that might be modified on your instance, including user session data and flat file databases.|
|/data/var/lib/postgresql||Files for the PostgreSQL database|
|/data/var/www/.htaccess||The htaccess file for the main Drupal site|
|/data/var/www/jbrowse/jbrowse_conf.json||Config file for JBrowse|
|/data/var/www/jbrowse/data||All of the data files needed for running JBrowse|
|/data/var/www/sites/default||Site-specific files and directories for Drupal; modules and themes go here (though Tripal is in /var/www/sites/all)|
In every case, the original files and directories were moved to the /data partition and replaced with symlinks to their new locations. Note that changes made to files in any other locations will be lost. If you find that you must have other files or directories saved, please send an email to email@example.com to request that symlinks be added to future releases.
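Because persistence depends on whether a path ultimately resolves to the /data partition, it can be useful to check a location before saving work there. The following is a minimal sketch, not part of the GMOD in the Cloud distribution: the function name `is_persistent` is hypothetical, and it assumes the EBS partition is mounted at /data as described above.

```python
import os

DATA_ROOT = "/data"  # assumed mount point of the persistent EBS partition


def is_persistent(path, data_root=DATA_ROOT):
    """Return True if path, after resolving symlinks, lies under data_root."""
    real = os.path.realpath(path)
    root = os.path.realpath(data_root)
    # commonpath equals root only when real is root itself or is inside it
    return os.path.commonpath([real, root]) == root


if __name__ == "__main__":
    for p in ("/home/ubuntu/dataHome", "/var/www/sites/default", "/home/ubuntu"):
        status = "kept on /data" if is_persistent(p) else "lost on upgrade"
        print(p, "->", status)
```

On an instance laid out as described, the first two paths would be expected to resolve into /data through their symlinks, while /home/ubuntu itself would not.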
Installed GMOD software
The postgres database name is "drupal", and the primary user for that database is also called drupal. See the database connection parameters in /var/www/sites/default/settings.php for more information. There is also a postgres account named ubuntu (the same as the login shell user name) that has superuser privileges. This account has its postgres "search_path" set so that it looks in the Chado schema before the public schema, and so this account should be used when using tools that are intended to interact with Chado (like GBrowse, Apollo and any command line tools from GMOD).
Version 6.28 of Drupal was obtained from http://drupal.org/ and installed in /var/www, so that when navigating with a web browser to the Apache document root (i.e., http://127.0.0.1/ or whatever IP address Amazon assigns your machine), you will get the Drupal home page.
New modules can be added at /data/var/www/sites/default/modules and new themes can be added at /data/var/www/sites/default/themes.
Tripal version 1.0 is installed at /var/www/sites/all/modules/tripal
Tripal was used to install the Chado 1.2 schema and load ontologies and a GFF file containing yeast genome annotations from SGD, downloaded from yeastgenome.org:
as well as a sample GFF contig file output from MAKER for Pythium ultimum. It was downloaded from
GBrowse2 version 2.54 is installed on this machine. The configuration files for the Chado database are in /data/etc/gbrowse2: 07.chado.conf for yeast and pythium.conf for P. ultimum.
GBrowse is configured to use fcgid, a web server "add on" that helps speed up GBrowse. To use it, use URLs that look like this: http://ec2-##-##-##-##.compute-1.amazonaws.com/fgb2/gbrowse/yeast. If there are problems with fcgid, you can still use the non-accelerated GBrowse at http://ec2-##-##-##-##.compute-1.amazonaws.com/cgi-bin/gb2/gbrowse/yeast.
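The two URL forms differ only in their path prefix, so a script that links to GBrowse can build both and fall back to the plain CGI form when the fcgid one does not answer. This is a rough sketch under stated assumptions: the helper names `gbrowse_urls`, `responds`, and `pick_gbrowse_url` are hypothetical, the host name is whatever Amazon assigns your instance, and only the "yeast" data source shown above is assumed to exist.

```python
import urllib.error
import urllib.request


def gbrowse_urls(host, source="yeast"):
    """Return the fcgid-accelerated and plain CGI GBrowse URLs for a host."""
    base = "http://" + host
    return {
        "fcgid": base + "/fgb2/gbrowse/" + source,
        "cgi": base + "/cgi-bin/gb2/gbrowse/" + source,
    }


def responds(url, timeout=10):
    """Return True if the URL answers with HTTP 200 (hypothetical health check)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def pick_gbrowse_url(host, source="yeast", probe=responds):
    """Prefer the fcgid URL; fall back to the CGI URL if the probe fails."""
    urls = gbrowse_urls(host, source)
    return urls["fcgid"] if probe(urls["fcgid"]) else urls["cgi"]
```

Passing the probe as a parameter keeps the fallback logic testable without a running instance.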
JBrowse version 1.9.4 was installed from a zip file obtained from http://jbrowse.org/ and is installed in /var/www/jbrowse, so that navigating to http://ec2-##-##-##-##.compute-1.amazonaws.com/jbrowse will display the JBrowse page. The configuration file for defining database connection parameters and created tracks is in the home directory:
The Pythium dataset was created in a way similar to the JBrowse tutorial and using the configuration file
JBrowse was configured to have multiple datasets using the jbrowse_conf.json file as described in the JBrowse_Configuration_Guide#Dataset_Selector. This file is on the data partition at
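As an illustration of what a dataset selector entry looks like, the sketch below generates a "datasets" section in the shape documented for JBrowse 1.x, where each dataset id maps to a "url" and a "name". The dataset ids, data directory names, and display names are assumptions made for the example and may not match the ones used on the AMI.

```python
import json


def dataset_selector(datasets):
    """Build the "datasets" section for jbrowse_conf.json.

    datasets maps a dataset id to a (data directory, display name) pair.
    """
    return {
        "datasets": {
            ds_id: {"url": "?data=" + data_dir, "name": name}
            for ds_id, (data_dir, name) in datasets.items()
        }
    }


# Hypothetical directory layout under the JBrowse data directory
conf = dataset_selector({
    "yeast": ("data/yeast", "Yeast"),
    "pythium": ("data/pythium", "Pythium ultimum"),
})
print(json.dumps(conf, indent=2))
```

In the JBrowse 1.x documentation, each dataset's own configuration also carries a matching "dataset_id" so the selector can highlight the active dataset; check the configuration guide for the exact placement in your version.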
While the Chado schema was installed by Tripal, the Chado software package is in the home directory, ~/sources/chado, and was used to install many utility scripts via the standard installation method for Perl modules (perl Makefile.PL; make; sudo make install). This checkout can be updated with "svn update" like the Tripal svn checkout.
WebApollo was installed essentially per the directions on the WebApollo page. In addition to creating an admin user, another user with limited permissions was also created. That user is called "guest" and has "guest" as the password.
Starting with the 1.4 release of GMOD in the Cloud, by default when an instance starts up for the first time, the instance sends an email to the GMOD developers letting them know that an instance has started, the ID of the AMI it was started from, the size of the instance, and its public IP address. This information will only be used for statistical purposes, primarily for applying for grants. The behavior can be modified in two ways:
Turning phone home off
In order to suppress sending the registration email, you can provide userdata when initializing the instance to turn it off. Put this in the userdata box:
NoCallHome : 1
Providing additional details
You can also provide additional details when the instance sends the registration email. You can provide a contact email, name of your organization and the names of the organisms that you intend to use GMOD in the Cloud for. To provide this information, set these items in the userdata box:
email : firstname.lastname@example.org
org : name of your organization
organism : name of the organism
The additional information will help us when it comes time to apply for grants.
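If you launch instances from a script, the userdata text can be assembled programmatically. The helper below is a minimal sketch: the function name `build_userdata` is hypothetical, and it assumes the "key : value" form shown above with one item per line. It only produces the text; how that text is passed to EC2 at launch is left to whichever tool you use.

```python
def build_userdata(email=None, org=None, organism=None, no_call_home=False):
    """Return userdata text for the GMOD in the Cloud registration email.

    With no_call_home=True, only the opt-out line is produced.
    Otherwise, one "key : value" line is written per detail supplied.
    """
    if no_call_home:
        return "NoCallHome : 1\n"
    fields = (("email", email), ("org", org), ("organism", organism))
    return "".join("{} : {}\n".format(key, value) for key, value in fields if value)


# Example: supply optional details (placeholder values)
print(build_userdata(email="firstname.lastname@example.org",
                     org="name of your organization",
                     organism="name of the organism"))
```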
See Getting Started with the EC2 VM for information on GBrowse2 virtual machines on the Amazon cloud.
See http://clovr.org/ for more information.