Cloud

From GMOD
Revision as of 22:28, 12 September 2012 by Girlwithglasses (Talk | contribs)

This page serves as a clearinghouse for information about GMOD projects in the cloud (typically AWS, though other cloud platforms may be covered as well). See also Scott's Genome Informatics talk (PPT).

GMOD

Current GMOD in the Cloud instance:

  • Amazon AMI ID: ami-a9d7f9c0 (in the US East-Virginia zone)
  • Name: GMOD in the Cloud 2.05

(as of December 16, 2013)

Coming soon: an AMI with a separate data partition to make backup and updates easy. Coming not quite as soon: WebApollo.

Tutorial

There is a GMOD-centric tutorial for getting started with GMOD in the Cloud.

From the README:

PostgreSQL

The postgres database name is "drupal", and the primary user for that database is also called drupal. See the database connection parameters in /var/www/sites/default/settings.php for more information. There is also a postgres account named ubuntu (the same as the login shell user name) that has superuser privileges. This account has its postgres "search_path" set so that it looks in the Chado schema before the public schema. Use this account with tools that are intended to interact with Chado (like GBrowse, Apollo and any command line tools from GMOD).
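To confirm the schema order that the ubuntu account actually uses, one could ask PostgreSQL for its search_path. This is a minimal sketch, assuming you are logged in as the ubuntu shell user, that psql is on the PATH, and that local connections to the drupal database work without a password (the README does not state these details). The function name show_search_path is illustrative only.

```shell
# Hypothetical helper: print the search_path of the connecting postgres role.
# Assumes the "ubuntu" role can connect to the "drupal" database locally.
show_search_path() {
    # -A: unaligned output, -t: rows only (no header or footer)
    psql -d drupal -A -t -c 'SHOW search_path;'
}

# Usage (on the instance):
#   show_search_path
```

If the Chado schema is listed ahead of public in the output, unqualified table names such as featureloc should resolve to the Chado tables.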

Drupal

Version 6.26 of Drupal was obtained from http://drupal.org/ and installed in /var/www, so navigating with a web browser to the Apache document root (i.e., http://127.0.0.1/ or whatever IP address Amazon assigns your machine) displays the Drupal home page.

New modules can be added at /data/var/www/sites/default/modules and new themes can be added at /data/var/www/sites/default/themes.

Tripal

Tripal version 0.3.1b is installed at

  /var/www/sites/all/modules/tripal

Tripal was used to install the Chado 1.1 schema and load ontologies and a GFF file containing yeast genome annotations from SGD, downloaded from yeastgenome.org:

 http://downloads.yeastgenome.org/curation/chromosomal_feature/saccharomyces_cerevisiae.gff

GBrowse2

The version of GBrowse2 that is installed on this machine is 2.49. There is also a git repository in the home directory (~/GBrowse). Updated versions of GBrowse2 can be installed from this directory after doing a "git pull" to freshen the directory. The configuration file for the Chado database is /data/etc/gbrowse2/07.chado.conf.

Minor bug workaround: a bug in the Tripal GFF3 loader sets featureloc.rank to 1 by default rather than 0. As a result, GBrowse is unable to see any features. To fix this, execute this query:

 UPDATE chado.featureloc SET rank = 0
    WHERE rank = 1;
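
Before and after running the fix, it can help to see how many featureloc rows carry each rank value. The following is a minimal sketch, assuming it is run on the instance as the ubuntu shell user with psql on the PATH and the Chado tables in the drupal database. The function name featureloc_rank_counts is illustrative only.

```shell
# Hypothetical helper: count chado.featureloc rows for each rank value.
# Assumes the "ubuntu" role can connect to the "drupal" database locally.
featureloc_rank_counts() {
    psql -d drupal -A -t -c \
        'SELECT rank, count(*) FROM chado.featureloc GROUP BY rank ORDER BY rank;'
}

# Usage (on the instance):
#   featureloc_rank_counts
```

Each output line pairs a rank value with its row count, so rows still at rank 1 after the update would show up here.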

JBrowse

JBrowse 1.4.1 was installed from a zip file obtained from http://jbrowse.org/ into /var/www/jbrowse, so navigating to http://ec2-##-##-##-##.compute-1.amazonaws.com/jbrowse displays the JBrowse page. The configuration file that defines the database connection parameters and tracks is in the home directory: ~/jbrowse.conf.

Chado

While the Chado schema was installed by Tripal, the Chado software package is in the home directory, ~/schema/chado, and was used to install many utility scripts via the standard installation method for Perl modules (perl Makefile.PL; make; sudo make install). This checkout can be updated with "svn update" like the Tripal svn checkout.
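
The update and reinstall steps described above can be sketched as a small shell helper. This is an illustration under stated assumptions, not a script shipped with the AMI: the names CHADO_DIR, DRY_RUN, run and update_chado are hypothetical, and perl Makefile.PL may prompt for database settings when run for real.

```shell
# Hypothetical helper: refresh the Chado checkout and reinstall its scripts.
# CHADO_DIR and DRY_RUN are illustrative names, not part of the AMI.
CHADO_DIR="${CHADO_DIR:-$HOME/schema/chado}"

run() {
    # With DRY_RUN set, print the command instead of executing it.
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# The body runs in a subshell so the caller's working directory is unchanged.
update_chado() (
    cd "$CHADO_DIR" || exit 1
    run svn update &&
    run perl Makefile.PL &&
    run make &&
    run sudo make install
)

# Usage (on the instance): preview the steps first, then run them.
#   DRY_RUN=1 update_chado
#   update_chado
```

Chaining the steps with && stops the sequence at the first failure, so a failed "svn update" does not lead to reinstalling a stale checkout.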

GBrowse2

The GBrowse2 AMI is not yet public; it will be soon.

Galaxy Cloudman

See the CloudMan page for more information about Galaxy's implementation.

CloVR (Workflow/Ergatis)

See http://clovr.org/ for more information.