Difference between revisions of "Cloud"

From GMOD

Revision as of 15:49, 21 May 2013

GMOD in the Cloud

The release of GMOD in the Cloud 2.0 is imminent, and this page is being edited in preparation.

GMOD in the Cloud is a virtual server, available through the Amazon compute cloud, equipped with a suite of preconfigured GMOD components, including a Chado database, GBrowse2, JBrowse, Tripal, and Apollo or WebApollo. Users can clone the GMOD Amazon Machine Image (AMI) and create their own server for storing data and making it accessible to the public. Cloud computing (hosting data and/or applications on existing networked computer systems) gives users access to preconfigured, extensible servers and is an alternative to building and maintaining large computing infrastructure in-house. Potential applications of the GMOD AMI range from short-term use, such as during annotation jamborees, to long-term provision of genome data and applications to the community.


This page serves as a "clearing house" for information about GMOD components available in the cloud (typically meaning AWS, but there might be other cloud implementations as well).

GMOD in the Cloud poster from Biocuration 2013

See Scott's Genome Informatics (PPT) talk as well.

What's on the Cloud?

WebApollo, Chado, GBrowse, GBrowse_syn, JBrowse, Tripal

Where is the Cloud?

Current GMOD in the Cloud instance:

  • Amazon AMI ID: ami-a9d7f9c0 (in the US East (N. Virginia) region);
  • Name: GMOD in the Cloud 2.05

(as of December 16, 2013)

The GMOD in the Cloud AMI has a separate data partition to make backup and updates easy.

Versions 1.1-1.3 of GMOD in the Cloud use Apollo; versions 2.0 onward contain WebApollo.

Important note: Starting with the 2.0 release of GMOD in the Cloud, the AMI "phones home" when started as an instance; that is, it sends an email to the developers to let them know that someone is using it. For more information see below.

Tutorial

There is a GMOD-centric tutorial for getting started with GMOD in the Cloud.

From the README:

PostgreSQL

The postgres database name is "drupal", and the primary user for that database is also called drupal. See the database connection parameters in /var/www/sites/default/settings.php for more information. There is also a postgres account named ubuntu (the same as the login shell user name) that has superuser privileges. This account has its postgres "search_path" set so that it looks in the Chado schema before the public schema, and so this account should be used when using tools that are intended to interact with Chado (like GBrowse, Apollo and any command line tools from GMOD).
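As a minimal sketch of how the ubuntu account might be used from a script, the following Python helper assembles a psql command line for the drupal database. It assumes psql is on the PATH and that the ubuntu role can connect locally without further options; the helper name build_psql_command is hypothetical and is not something shipped on the AMI.

```python
import shlex


def build_psql_command(query, user="ubuntu", dbname="drupal"):
    """Return the argument list for running one SQL statement with psql.

    The defaults mirror the account and database named in the README;
    the function itself is illustrative only.
    """
    return ["psql", "-U", user, "-d", dbname, "-c", query]


# Show the schema search order configured for the account.
show_path = build_psql_command("SHOW search_path;")

# Because the Chado schema comes before public in the search path, the
# unqualified table name "feature" should resolve to the Chado table.
count_features = build_psql_command("SELECT count(*) FROM feature;")

if __name__ == "__main__":
    # Print shell-quoted commands rather than executing them.
    for command in (show_path, count_features):
        print(shlex.join(command))
```

The commands are only printed here; they could be passed to subprocess.run on the instance itself.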

Drupal

Version 6.28 of Drupal was obtained from http://drupal.org/ and installed in /var/www, so that when navigating with a web browser to the Apache document root (i.e., http://127.0.0.1/ or whatever IP address Amazon assigns your machine), you will get the Drupal home page.

New modules can be added at /data/var/www/sites/default/modules and new themes can be added at /data/var/www/sites/default/themes.

Tripal

Tripal version 1.0 is installed at

  /var/www/sites/all/modules/tripal

Tripal was used to install the Chado 1.2 schema and load ontologies and a GFF file containing yeast genome annotations from SGD, downloaded from yeastgenome.org:

 http://downloads.yeastgenome.org/curation/chromosomal_feature/saccharomyces_cerevisiae.gff

as well as a sample GFF contig file output by MAKER for Pythium ultimum, downloaded from

 http://icebox.lbl.gov/webapollo/data/pyu_data.tgz
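Before loading a GFF file of this kind, it can help to see which feature types it contains. The sketch below counts the values in the type column of GFF3 text, relying only on the standard nine-column, tab-separated GFF3 layout. The function name count_feature_types and the sample lines are made up for illustration; they are not taken from the files above and are not part of Tripal.

```python
from collections import Counter


def count_feature_types(lines):
    """Count the values of the third (type) column in GFF3 text."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # embedded sequence follows; no more feature lines
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        columns = line.split("\t")
        if len(columns) == 9:
            counts[columns[2]] += 1
    return counts


# Hypothetical sample lines, for illustration only.
sample = [
    "##gff-version 3",
    "chrExample\tdemo\tchromosome\t1\t5000\t.\t.\t.\tID=chrExample",
    "chrExample\tdemo\tgene\t100\t900\t.\t+\t.\tID=gene0001",
    "chrExample\tdemo\tCDS\t100\t900\t.\t+\t0\tParent=gene0001",
    "##FASTA",
    ">chrExample",
]

if __name__ == "__main__":
    for feature_type, n in sorted(count_feature_types(sample).items()):
        print(feature_type, n)
```

To inspect a real file, an open file handle can be passed in place of the sample list.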

GBrowse2

The version of GBrowse2 installed on this machine is 2.54. The configuration files are in /data/etc/gbrowse2: the yeast (Chado database) configuration file is 07.chado.conf, and the P. ultimum configuration file is pythium.conf.

JBrowse

JBrowse version 1.9.4 was installed on this machine from a zip file obtained from http://jbrowse.org/. It is installed in /var/www/jbrowse, so navigating to http://ec2-##-##-##-##.compute-1.amazonaws.com/jbrowse displays the JBrowse page. The configuration file that defines database connection parameters and the created tracks is in the home directory: ~/jbrowse.conf.
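The ## placeholders in the host name stand for the numbers of the address Amazon assigns to the instance. A small sketch of building the JBrowse URL follows, assuming the four number groups are the octets of the instance's public IPv4 address (the usual EC2 naming convention in US East). The function names are hypothetical, and the address used is from a range reserved for documentation.

```python
import ipaddress


def public_dns_name(public_ip):
    """Build an ec2-style host name from a public IPv4 address."""
    ip = ipaddress.IPv4Address(public_ip)  # raises ValueError on bad input
    return "ec2-{}.compute-1.amazonaws.com".format(str(ip).replace(".", "-"))


def jbrowse_url(public_ip):
    """Return the URL where JBrowse is served on the instance."""
    return "http://{}/jbrowse".format(public_dns_name(public_ip))


if __name__ == "__main__":
    print(jbrowse_url("203.0.113.10"))
```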

Chado

While the Chado schema was installed by Tripal, the Chado software package is in the home directory, ~/schema/chado, and was used to install many utility scripts via the standard installation method for Perl modules (perl Makefile.PL; make; sudo make install). This checkout can be updated with "svn update" like the Tripal svn checkout.

Phoning home

Starting with the 2.0 release of GMOD in the Cloud, by default when an instance starts up for the first time, it sends an email to the GMOD developers letting them know that an instance has started, the ID of the AMI it was started from, the size of the instance, and its public IP address. This information will be used only for statistical purposes, primarily for grant applications. The behavior can be modified in two ways:

Turning phone home off

In order to suppress sending the registration email, you can provide userdata when initializing the instance to turn it off. Put this in the userdata box:

 NoCallHome : 1

Providing additional details

You can also provide additional details when the instance sends the registration email. You can provide a contact email, name of your organization and the names of the organisms that you intend to use GMOD in the Cloud for. To provide this information, set these items in the userdata box:

 email : your@emailaddress.com
 org : name of your organization
 organism : name of the organism

The additional information will help us when it comes time to apply for grants.
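The userdata settings described above can also be assembled in a script before launching an instance. The following is a minimal sketch that reproduces the "key : value" layout shown in this section; the function name build_userdata and the example values are hypothetical, and whether the AMI accepts other spacing or additional keys is not something this page states.

```python
def build_userdata(call_home=True, email=None, org=None, organism=None):
    """Compose text for the userdata box of a new instance.

    With call_home=False only the opt-out line is produced; otherwise
    each detail that was supplied becomes one "key : value" line.
    """
    if not call_home:
        return "NoCallHome : 1"
    fields = [("email", email), ("org", org), ("organism", organism)]
    return "\n".join(
        "{} : {}".format(key, value) for key, value in fields if value
    )


if __name__ == "__main__":
    # Example values are placeholders, not real contact details.
    print(build_userdata(email="curator@example.org",
                         org="Example Institute",
                         organism="Saccharomyces cerevisiae"))
```

The returned text can be pasted into the userdata box when the instance is initialized.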


GBrowse

See Getting Started with the EC2 VM for information on GBrowse2 virtual machines on the Amazon cloud.

Galaxy Cloudman

See the CloudMan page for more information about Galaxy's implementation.

CloVR (Workflow/Ergatis)

See http://clovr.org/ for more information.