Tripal Tutorial

Revision as of 20:52, 24 September 2009

Tripal Session

2009 GMOD Summer School - Americas
19 July 2009
Stephen Ficklin

This tutorial walks you through installing and configuring Tripal, a web front end to Chado databases. This tutorial was originally taught by Stephen Ficklin at the 2009 GMOD Summer School - Americas.

This tutorial references (and complements) the Tripal User's Guide, June 2009 edition.

1 VMware
2 Caveats
3 Background
4 Pre-Course Setup
5 Install Drupal
6 Review the features of Drupal
7 Setup Drupal Cron
8 Tripal Resources
9 Tripal Overview
10 Setup Tripal
11 Organisms
12 Features
13 Blast Module -- An Analysis Example
14 Searching
- 14.1 Advanced Searching
15 Libraries & other Analysis
16 Incorporate JBrowse
17 Customizing Content
18 Writing your own Module

VMware

This tutorial was taught using a VMware system image as a starting point. If you want to start with that same system, download and install the Starting image.

See VMware for what software you need to use a VMware system image, and for directions on how to get the image setup and running on your machine.

Download
Starting Image
Ending Image
Username: gmod
Password: gmod

Caveats

Important Note

This tutorial describes the world as it existed on the day the tutorial was given. Please be aware that things like CPAN modules, Java libraries, and Linux packages change over time, and that the instructions in the tutorial will slowly drift out of date. Newer versions of tutorials will be posted as they become available.

Background

Tripal is open source and currently in 'Beta'.
More modules and documentation are coming.
Tripal is designed to be customizable.
Success will be in large part due to community response.
Data loaded into Chado for this course was taken from the YeastDB website.

Pre-Course Setup

For reference, the following steps were performed on the VMware image before the class. You will not need to repeat them during the class.

Prepare Postgres

We need to create two PostgreSQL databases, one for the Chado tables and the other for the Drupal tables.

  sudo su - postgres

Create the user that will manage the yeast database:

  postgres@gmod:~$ createuser -P yeast_admin
  Enter password for new role: gmod
  Enter it again: gmod
  Shall the new role be a superuser? (y/n) n
  Shall the new role be allowed to create databases? (y/n) y
  Shall the new role be allowed to create more new roles? (y/n) n

Create the Chado and Drupal databases:

  postgres@gmod:~$ createdb chado_yeast -O yeast_admin
  postgres@gmod:~$ createdb drupal_yeast -O yeast_admin

Install Prereqs

XSLT

 sudo apt-get install xsltproc

Install the Postgres development tools:

  sudo apt-get install libpq-dev

Install the needed Perl modules

 sudo perl -MCPAN -e shell

 install GO::Parser
 install Template
 install XML::Simple
 install Log::Log4perl
 install XML::Parser::PerlSAX
 install DBI
 install DBD::Pg
 install DBIx::DBSchema
 install DBIx::DBStag
 install Parse::RecDescent

Needed for CUGI scripts

 install Spreadsheet::WriteExcel

BioPerl

 cd /home/gmod/tripal/packages
 wget http://bioperl.org/DIST/BioPerl-1.6.0.tar.gz
 tar -zxvf BioPerl-1.6.0.tar.gz
 cd BioPerl-1.6.0/
 perl Makefile.PL
 make
 sudo make install

Install the go-perl modules:

  cd /home/gmod/tripal/packages
  wget http://search.cpan.org/CPAN/authors/id/C/CM/CMUNGALL/go-perl-0.09.tar.gz
  tar -zxvf go-perl-0.09.tar.gz
  cd go-perl-0.09
  perl Makefile.PL
  make
  sudo make install

Chado Installation

Download and extract the gmod package:

  cd /home/gmod/tripal/packages
  wget http://internap.dl.sourceforge.net/sourceforge/gmod/gmod-1.0.tar.gz
  tar -zxvf gmod-1.0.tar.gz
  cd gmod-1.0

  export GO_ROOT=/home/gmod/tripal/lib
  export GMOD_ROOT=/home/gmod/tripal/gmod-1.0
  export CHADO_DB_NAME=chado_yeast
  export CHADO_DB_USERNAME=yeast_admin
  export CHADO_DB_PASSWORD=gmod
  export CHADO_DB_HOST=localhost
  export CHADO_DB_PORT=5432
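
Before running Makefile.PL, it can help to confirm that every variable above is actually set in the current shell. The following is only a minimal sketch: the function name check_chado_env is made up for this example and is not part of the GMOD package.

```shell
# Report any Chado build variable that is unset or empty.
# check_chado_env is an illustrative name, not a GMOD command.
check_chado_env() {
    missing=0
    for name in GO_ROOT GMOD_ROOT CHADO_DB_NAME CHADO_DB_USERNAME \
                CHADO_DB_PASSWORD CHADO_DB_HOST CHADO_DB_PORT; do
        # Look up the variable whose name is stored in $name (POSIX sh).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}
```

Called right after the exports, it prints one line per missing variable and returns non-zero if any are missing, so it can guard the next step with `check_chado_env && perl Makefile.PL ...`.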


   perl Makefile.PL PREFIX=/home/gmod/tripal/gmod-1.0

   Use the simple install (uses default database schema, which contains
   all of the modules and extensions to the schema and all of the non-trigger functions.
   This is probably what you want) [Y]
   What database server will you be using? [PostgreSQL]
   What is the Chado database name? [chado_yeast]
   What is the database username? [yeast_admin]
   What is the password for 'yeast_admin'? [gmod]
   What is the database host? [localhost]
   What is your database port? [5432]
   Where shall downloaded ontologies go? [./tmp]
   What is the default organism (common name, or "none")? []
   Do you want to make this the default chado instance? [y]

   Building with the following database options:
     GMOD_ROOT=/home/gmod/tripal/gmod-1.0
     DBDRIVER=PostgreSQL
     DBNAME=chado_yeast
     DBUSER=yeast_admin
     DBPASS=gmod
     DBHOST=localhost
     DBPORT=5432
     LOCAL_TMP=./tmp
     DBORGANISM=
     DEFAULT=y

   Extracting /home/gmod/tripal/packages/gmod-1.0/bin/../lib/Chado/AutoDBI.pm (with variable substitutions)
   Checking whether your kit is complete...
   Looks good

   Checking prerequisites...
   Looks good

   initializing load scripts...
   Deleting Build
   Removed previous script 'Build'

   Creating new 'Build' script for 'Chado' version '0.01'

   <snip>

   -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
   Makefile written.  Now you should do the following, in order:
     1. make              (creates necessary build files)
     2. sudo make install (creates $GMOD_ROOT and subdirectories)
   -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*WARNING-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
            STEP 3 WILL DELETE ANY DATA IN A DATABASE WITH THE
               DATABASE NAME YOU PROVIDED!
   -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*WARNING-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
     3. make load_schema (loads SQL schema into database)
     4. make prepdb      (loads basic data)
     5. make ontologies  (loads data for various ontologies)

   Optional Targets:
     make rm_locks     (removes ontology lock files, allowing installation
                        of ontologies on successive builds of the database
                        without removing the ontology files altogether)
     make clean        (remove build related files and ontology tmp dir)
     make instructions (at any moment display these instructions)

   -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-


   make
   sudo make install
   make load_schema
   make prepdb
   make ontologies

Web Prep

Install php

  apt-get install php5
  apt-get install php5-pgsql
  apt-get install php5-cli
  apt-get install php5-gd

Change some php settings (as root):

Changes to php.ini were not present in the VMware image used in the course. Please update both php.ini files.

  cd /etc/php5/apache2
  vi php.ini

A word on text editors such as vi.

Set memory_limit to something larger than 16M. The value should not exceed physical memory, so be conservative, but not overly so:

  memory_limit = 2048M

Now, restart the webserver:

  /etc/init.d/apache2 restart

Do the same for the command-line php.ini:

  cd /etc/php5/cli/
  vi php.ini

Set the memory limit:

  memory_limit = 2048M
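
Since two php.ini files have to agree, a quick check of both can save some confusion later. This is a rough sketch that reads the files directly rather than asking PHP; the helper name php_memory_limit is an assumption made for this example.

```shell
# Print the active memory_limit value from a php.ini file.
# Lines that start with ';' are comments in php.ini and do not match.
# php_memory_limit is an illustrative helper name.
php_memory_limit() {
    sed -n 's/^[[:space:]]*memory_limit[[:space:]]*=[[:space:]]*\([^;[:space:]]*\).*$/\1/p' "$1" | tail -n 1
}

# Report the value for the web server and the command-line configuration.
for ini in /etc/php5/apache2/php.ini /etc/php5/cli/php.ini; do
    if [ -r "$ini" ]; then
        echo "$ini: $(php_memory_limit "$ini")"
    fi
done
```

If the two values differ, jobs launched from the command line may behave differently from pages served by Apache.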

Install phpPgAdmin

  cd /home/gmod/tripal/packages
  wget -O phpPgAdmin-4.2.2.tar.gz http://downloads.sourceforge.net/phppgadmin/phpPgAdmin-4.2.2.tar.gz?download

As root:

  cd /var/www
  tar -zxvf /home/gmod/tripal/packages/phpPgAdmin-4.2.2.tar.gz
  ln -s phpPgAdmin-4.2.2/ phppgadmin

Load Yeast Data

Download the GFF file available from the yeast genome website:

  cd /home/gmod/tripal/data
  wget http://downloads.yeastgenome.org/chromosomal_feature/saccharomyces_cerevisiae.gff

Download the ORF coding sequences from the yeast genome website:

  wget http://downloads.yeastgenome.org/sequence/genomic_sequence/orf_dna/orf_coding.fasta.gz

Edit gmod_bulk_load_gff3.pl so it can find the GMOD libraries (we didn't install in a default location) by adding this line:

  use lib "/home/gmod/tripal/gmod-1.0/share/perl/5.8.8";

Load the GFF into our Chado database

  export GMOD_ROOT=/home/gmod/tripal/gmod-1.0
  ../gmod-1.0/bin/gmod_bulk_load_gff3.pl \
     --dbname chado_yeast \
     --dbuser yeast_admin \
     --dbhost localhost \
     --dbpass demo \
     --organism yeast \
     --gff saccharomyces_cerevisiae.gff \
     --recreate_cache \
     --no_target_syn
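
It can be useful to know which feature types the GFF file contains, since Tripal is later told which types to show and sync. The sketch below counts the values in column 3 of a GFF3 file. The function name gff_type_counts is made up for this example, and the check for a ##FASTA line only matters if the file has an embedded sequence section.

```shell
# Count features per type (column 3) in a GFF3 file, most frequent first.
# gff_type_counts is an illustrative name, not a GMOD tool.
gff_type_counts() {
    awk -F '\t' '
        /^##FASTA/ { exit }          # stop before any embedded sequence
        /^#/ || NF < 9 { next }      # skip comments and incomplete lines
        { count[$3]++ }
        END { for (t in count) print count[t] "\t" t }
    ' "$1" | sort -k1,1nr -k2,2
}
```

For example, `gff_type_counts saccharomyces_cerevisiae.gff` run in the data directory lists each type with its count.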

Prepare Apache

Enable the rewrite module for Apache. This is useful so that we can use Clean URLs with Drupal. Clean URLs are not required but make the page URLs easier to use:

  cd /etc/apache2/mods-enabled
  sudo ln -s ../mods-available/rewrite.load

Edit the configuration file and change the AllowOverride from 'None' to 'All'

  cd /etc/apache2/sites-enabled
  sudo vi 000-default

  <Directory /var/www/>
     Options Indexes FollowSymLinks MultiViews
     AllowOverride All
     Order allow,deny
     allow from all
  </Directory>

Restart the web server

  sudo /etc/init.d/apache2 restart
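
To confirm the edit took effect, the AllowOverride value for each Directory block can be listed. This is a small sketch that only reads the configuration file; the function name allow_override_settings is hypothetical.

```shell
# Print "directory: value" for each AllowOverride line in an Apache config.
# allow_override_settings is an illustrative name.
allow_override_settings() {
    awk '
        /<Directory / { dir = $2; sub(/>$/, "", dir) }      # remember the current block
        tolower($1) == "allowoverride" { print dir ": " $2 }
    ' "$1"
}
```

Running `allow_override_settings /etc/apache2/sites-enabled/000-default` should show `All` for /var/www/ after the change described above.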

Install Drupal

  cd /var/www
  sudo mv index.html index.old.html
  sudo tar -zxvf /home/gmod/tripal/packages/drupal-6.13.tar.gz

Now that we've unpacked Drupal, we'll move some things around.

  cd drupal-6.13
  sudo mv .htaccess * ../

Set ownership so we can easily work with the files:

  cd /var
  sudo chown -R gmod:gmod www

This will undo the permission changes we made yesterday for GBrowse. Redo them:

  sudo chmod -R 777 /var/www/gbrowse/{tmp,databases}

Be more careful in real life.

Configure Drupal

  cd /var/www/sites/default
  cp default.settings.php settings.php
  vi settings.php

Change the $db_url variable to:

  $db_url = array (
     'default' => 'pgsql://yeast_admin:demo@localhost/drupal_yeast',
     'chado' => 'pgsql://yeast_admin:demo@localhost/chado_yeast'
  );

Make the files directory writable by the webserver

  mkdir files
  sudo chgrp www-data files
  chmod g+rw files

Point your browser to the web server and follow the two easy pages for installing and then configuring the Drupal site:

  http://localhost/install.php

The username and password are:

Username: yeast_admin
Password: gmod

Review the features of Drupal

User Accounts

View History
Edit the Account

Create Content

Create a 'Home' page
Create an 'About' page

Administration - Content Management

Content Management → Content
Content Management → Taxonomy

Administration - Site Building

Site Building → Blocks
- Add the "Who's Online" block to the left sidebar.

Site Building → Menu
- Add a 'Home' link
- Add an 'About' link

Site Building → Modules
- Tripal requires the Path and Search modules. Enable those.
- Return to the About link and add an alias.
- Return to the Menu and update the about link.
- Return to the Blocks administration and add a search box to header.
- Navigate to Drupal.org and click on 'Modules'. Review the wealth of public modules.

Site Building → Themes
- Change the theme to Pushbutton.
- Configure the theme and remove the search box.

Administration - User Management

User Management → Roles
- Create role 'webmaster'
- Create role 'editor'

User Management → Users
- Add a user for yourself and set as webmaster.

User Management → Permissions
- Turn on everything for webmaster
- Set 'Access Content' for editor (and whatever else)

User Management → User Settings

Administration - Site Configuration

Site Configuration → Performance

Administration - Reports

Reports → Recent log entries
Reports → Status reports
- Run cron manually

Setup Drupal Cron

Drupal requires an entry in the crontab to function:

  crontab -e

A word on text editors such as nano.

Add this line to the crontab

  0,30 * * * * /usr/bin/wget -O - -q http://localhost/cron.php > /dev/null

The cron will launch this job every 30 minutes.
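
If the crontab is edited more than once, it is easy to end up with the same entry twice. The sketch below adds an entry only when an identical line is not already present. The function name with_cron_entry is made up for this example; it only transforms text, so nothing is installed until its output is piped to `crontab -`.

```shell
# Print the crontab text read on stdin, followed by the given entry
# unless an identical line is already present.
# with_cron_entry is an illustrative name.
with_cron_entry() {
    entry=$1
    existing=$(cat)
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
    fi
    # -F: fixed string (the entry contains '*'), -x: whole-line match
    if ! printf '%s\n' "$existing" | grep -Fxq -- "$entry"; then
        printf '%s\n' "$entry"
    fi
}

# Usage (not run here, because it changes the current user's crontab):
#   crontab -l 2>/dev/null | with_cron_entry "$DRUPAL_CRON" | crontab -
# where DRUPAL_CRON holds the crontab line shown above.
```

The same helper works for any other crontab line, since it compares whole lines literally.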

Tripal Resources

Availability

Available soon for download on GMOD Sourceforge site
Currently available on CUGI's website: http://www.genome.clemson.edu/software/tripal

Forum

Users can post questions or request help from developers and the community using an online 'Tripal Help' forum: http://www.genome.clemson.edu/forum

Mantis Bug Tracking

Navigate to http://www.genome.clemson.edu/mantis and sign up for an account. If you find any bugs or find odd behavior you can enter an issue into the tracking system.

Documentation and Helper Scripts

Documentation and helper scripts can be obtained on the CUGI website: http://www.genome.clemson.edu/software/tripal

Tripal Overview

Module Overview

Review Table 1 in User's Guide. (pg 12, June 2009 document).

Navigate to "Administer" → "Site Building" → "Modules": Dependencies are hierarchical.

Core Module

Jobs -- long running items
Taxonomy -- for advanced searching
CVterms -- for customized data
Materialized Views -- for speed
API for new modules -- for expandability

Setup Tripal

Install Tripal

Create the necessary directory structure

  cd /var/www/sites/all
  mkdir modules
  mkdir themes
  cd modules

Unpack the Tripal package into the modules directory:

  tar -zxvf /home/gmod/tripal/packages/tripal-6.x-0.1b.tar.gz

Move the themes to the themes directory:

  mv admire_gray-6.x-1.1.zip theme_tripal ../themes/
  cd ../themes
  unzip admire_gray-6.x-1.1.zip

Enable the Tripal Modules

Navigate to the Modules Administration Page ("Administer" → "Site Building" → "Modules") and click the checkboxes for each of the tripal modules. Do not enable them all at once. Enable in groups:

Group 1 (dependencies, already enabled):

  Path
  Search

Group 2:

  Tripal Core

Group 3:

  Tripal Chado Organism
  Tripal Chado Feature
  Tripal Chado Library
  Tripal Chado Analysis

Setup Tripal Cron

Tripal also requires an entry in the crontab to function:

  crontab -e

Add this line to the crontab

  0,15,30,45 * * * * (cd /var/www; php ./sites/all/modules/tripal_core/tripal_launch_jobs.php yeast_admin ) > /dev/null

This will run the Tripal cron every 15 minutes.

Test the cron job to make sure it works:

  cd /var/www; php ./sites/all/modules/tripal_core/tripal_launch_jobs.php yeast_admin

Oops. It failed:

 The program 'php' is currently not installed.  You can install it by typing:
 sudo apt-get install php5-cli
 bash: php: command not found

So let's do that (and get php5-gd which we also need).

 sudo apt-get install php5-cli
 sudo apt-get install php5-gd

Now, try again.

 cd /var/www; php ./sites/all/modules/tripal_core/tripal_launch_jobs.php yeast_admin

The Tripal job should execute and return the following:

 Tripal Job Launcher
 -------------------

Note: It is important that the final parameter be set to the name of the administrator account for Drupal. In our case this is 'yeast_admin'.
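
Because this launcher is run by hand many times in the rest of the tutorial, a small wrapper can save typing and catch the missing-php problem early. This is only a sketch: the function name run_tripal_jobs and the variables DRUPAL_ROOT and TRIPAL_ADMIN are assumptions introduced here, with defaults taken from the commands above.

```shell
# Assumed defaults, copied from the commands used in this tutorial.
DRUPAL_ROOT=${DRUPAL_ROOT:-/var/www}
TRIPAL_ADMIN=${TRIPAL_ADMIN:-yeast_admin}

# Launch pending Tripal jobs from the Drupal root.
# run_tripal_jobs is an illustrative name, not part of Tripal.
run_tripal_jobs() {
    if ! command -v php >/dev/null 2>&1; then
        echo "php not found; install php5-cli first" >&2
        return 127
    fi
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$DRUPAL_ROOT" &&
      php ./sites/all/modules/tripal_core/tripal_launch_jobs.php "$TRIPAL_ADMIN" )
}
```

After sourcing this in the shell, `run_tripal_jobs` does the same thing as the full command line.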

Organisms

Sync the Organism

Navigate to the "Administer" → "Tripal Management" → "Organisms". Click the checkbox beside "Saccharomyces cerevisiae (yeast)"

Bug Fix: before continuing we must create the images directory for the organism or the sync will fail. This is a bug in the code which will be fixed but here's a workaround:

  cd /var/www/sites/default/files/tripal/tripal_organism
  sudo mkdir images
  sudo chown gmod:www-data images
  sudo chmod g+w images

Rather than wait up to 15 minutes for the cron to launch the job, we'll run it manually now:

  cd /var/www; php ./sites/all/modules/tripal_core/tripal_launch_jobs.php yeast_admin

Check the jobs page and notice the stats.

Now the organism has a Drupal page! We can view it, but notice it has no useful content, so we need to add some. Click the 'Edit' tab at the top. For a description, add the following (taken from Wikipedia):

Saccharomyces cerevisiae is a species of budding yeast. It is perhaps the most useful yeast owing to its use since ancient times in baking and brewing. It is believed that it was originally isolated from the skins of grapes (one can see the yeast as a component of the thin white film on the skins of some dark-colored fruits such as plums; it exists among the waxes of the cuticle). It is one of the most intensively studied eukaryotic model organisms in molecular and cell biology, much like Escherichia coli as the model prokaryote. It is the microorganism behind the most common type of fermentation. Saccharomyces cerevisiae cells are round to ovoid, 5–10 micrometres in diameter. It reproduces by a division process known as budding.

But, our additional information did not show up! We need to install the Tripal theme.

Setup the Tripal Theme

The tripal theme is a sub theme so it is dependent on another theme to work. By default this is the admire-gray theme which accompanies Tripal. The admire-gray theme must be present but will not be enabled. Navigate to the "Administer" → "Site Building" → "Themes" page and select the 'Tripal Theme' radio button and checkbox, remove the check on the 'Garland' theme and click 'Save configuration'.

Open a new tab in the browser and navigate to the "Administer" → "Site Building" → "Blocks" page. The theme has now changed to the admire-gray theme, but with Tripal "extensions". However, the menu item is floating in the middle of the page. The next step is to place the blocks in their proper place.

Alter the blocks accordingly:

  Navigation → Sidebar left
  User login → Sidebar left
  Powered by Drupal → <none>
  Who's online → <none>
  Libraries → Sidebar left
  Organisms → Sidebar left
  Search form → Header

Click 'Save blocks'. This should relocate the menu as well as the Tripal library and organism blocks to the left side of the page.

Now return to the tab with the theme and click 'configure' for the 'Tripal Theme'. Adjust these parameters to your liking.

Now, return to the organism page for 'Yeast' and we should see the information we added previously.

Organism Materialized Views

The organism module is capable of showing the type and number of features for this organism. Currently that information does not appear on the Organism page. This behavior is controlled by a materialized view that is created when the module is installed but needs to be updated.

The SQL statement:

  SELECT O.organism_id, O.genus, O.species, O.common_name,
      count(F.feature_id) AS num_features,
      CVT.name AS feature_type
  FROM organism O
      INNER JOIN feature F   ON O.organism_id = F.organism_id
      INNER JOIN cvterm CVT  ON F.type_id = CVT.cvterm_id
  GROUP BY O.organism_id, O.genus, O.species, O.common_name, CVT.name

Navigate to the "Administer" → "Tripal Management" → "Materialized Views" page. Click 'update' for the view named 'organism_feature_count'. Rather than wait for the cron to launch the job, do it manually now:

  cd /var/www; php ./sites/all/modules/tripal_core/tripal_launch_jobs.php yeast_admin

Return to the Yeast page and notice that a new box appeared with a listing of the types and number of features for the organism.

Browse Features

Notice on the yeast page there is a section titled 'Browse Features' with no features listed. This display is managed by the Tripal Feature module. To show features in this box, navigate to "Administer" → "Tripal Management" → "Features". In the box titled 'Feature Types' we will list all of the feature types that we want to show in the list. Change the text in the box to the following:

  gene

Click 'Save configuration' and return to the Organism page. There is now a list of browseable features.

Features

Sync the Features

Navigate to "Administer" → "Tripal Management" → "Features". Before syncing any features we must first set the accession prefix and the feature types. Each feature on the site is assigned a unique accession number. Essentially this is the feature_id of the feature in the Chado feature table. The prefix is added to the beginning of the feature_id to form the unique accession. The default is 'ID'. Let's change this to YDB (which represents yeast database).

Next, the site will only create Drupal content for those feature types listed in the Feature Types text box. When we set up the Organism page above we placed 'gene' in this box. We'll leave this as is.

Finally, to sync the features, click the 'Sync all Features' button. Rather than wait on the job cron we'll launch this job manually:

  cd /var/www; php ./sites/all/modules/tripal_core/tripal_launch_jobs.php yeast_admin

Note: This will probably take a bit of time. Check the status of the sync by navigating to the "Administer" → "Tripal Management" → "Jobs" page. It took about 8 minutes on my laptop.

Theming

While we wait for the features to sync we want to add the picture to the organism page. Go to the Yeast page, edit and add the Dry_yeast.jpg image to the page.

But, the image doesn't show up! There is a bug in the Tripal template file. So now is a good time to fix that and discuss theming.

Important Files for Drupal Themes:

.info file
screenshot.png
logo.png
style.css
page.tpl.php
customizations

  cd /var/www/sites/all/themes/theme_tripal/
  vi node-chado_organism.tpl.php

Change the line from this:

  <img src=<?php print "sites/default/files/tripal/tripal_organism/images/".$node->genus."_".$node->species.".jpg"?>>

To this:

  <img src=<?php print "/" . file_directory_path() . "/tripal/tripal_organism/images/".$node->genus."_".$node->species.".jpg"?>>

Now, go to the organism page and refresh. The image should appear but its layout is a bit awkward. We need to adjust the CSS file. Edit the CSS file:

  cd /var/www/sites/all/themes/theme_tripal/css
  vi tripal.css

Change this stanza:

  //
  // Copyright 2009 Clemson University
  //

to this:

 /*
 Copyright 2009 Clemson University
 */

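The incorrect comments interfered with the first CSS stanza, which affects table width. CSS only recognizes /* ... */ comments, so the lines starting with // were parsed as part of the rule that follows them.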
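
The same mistake may exist in other stylesheets, so a quick search for lines that begin with // can be worthwhile. This is a minimal sketch; the function name find_slash_comments is made up for this example.

```shell
# List, with line numbers, the lines of a stylesheet that begin with '//'.
# These are not valid CSS comments.
# find_slash_comments is an illustrative name.
find_slash_comments() {
    grep -n '^[[:space:]]*//' "$1"
}
```

For example, `find_slash_comments tripal.css` prints the offending lines and returns a non-zero status when there are none.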

Now, refresh the organism page and it should look correct.

Feature Syncing -- returned

Syncing should be finished, but if not we can continue. Here's what's new:

The gene names on the organism page are now clickable
(Click on a gene). The feature page for the gene now exists.
Each feature has a URL identical to its accession

Unfortunately, the GFF file loaded did not have sequence residues for the genes, so they are not appearing. Let's add the sequence to one feature in particular. Browse for the 'YAL061W' feature on the Yeast page. Edit the feature and add the following sequence:

ATGAGAGCCTTAGCGTATTTCGGTAAAGGTAACATCAGATTCACCAACCA
TTTAAAGGAGCCACATATTGTGGCGCCCGATGAGCTTGTGATTGATATCG
AATGGTGTGGTATTTGCGGTACGGACCTGCATGAGTACACAGATGGTCCT
ATCTTTTTCCCAGAAGATGGACACACACATGAGATTAGTCATAACCCATT
GCCACAGGCGATGGGCCACGAAATGGCTGGTACCGTTTTGGAGGTGGGCC
CTGGTGTGAAAAACTTGAAAGTGGGAGACAAGGTAGTTGTCGAGCCCACA
GGTACATGCAGAGACCGGTATCGTTGGCCCCTGTCGCCAAACGTTGACAA
GGAATGGTGCGCTGCTTGCAAAAAGGGCTACTATAACATTTGTTCATATT
TGGGGCTTTGTGGTGCGGGTGTGCAGAGCGGTGGATTTGCAGAACGTGTT
GTGATGAACGAATCTCACTGCTACAAAGTACCGGACTTCGTGCCCTTAGA
CGTTGCAGCTTTGATTCAACCGTTGGCTGTGTGCTGGCATGCAATTAGAG
TCTGCGAGTTCAAAGCAGGCTCTACGGCTTTGATCATTGGTGCTGGCCCC
ATCGGACTGGGCACGATACTGGCGTTGAACGCTGCAGGTTGCAAGGACAT
CGTCGTTTCAGAGCCTGCCAAGGTAAGAAGAGAACTGGCTGAAAAAATGG
GTGCCAGGGTTTACGACCCAACTGCGCACGCTGCCAAGGAGAGCATTGAT
TATCTGAGGTCGATTGCTGATGGTGGAGACGGCTTCGATTACACATTTGA
TTGCTCCGGGTTGGAAGTCACATTGAATGCTGCTATTCAGTGTCTCACTT
TCAGAGGCACCGCAGTGAACTTGGCCATGTGGGGCCATCACAAGATACAG
TTTTCTCCGATGGACATCACATTGCATGAAAGAAAGTACACAGGGTCCAT
GTGCTACACACACCACGATTTTGAGGCAGTAATAGAAGCTTTGGAAGAAG
GCAGGATTGACATTGATAGAGCAAGACATATGATAACGGGCAGAGTCAAC
ATTGAGGACGGCCTTGATGGCGCCATCATGAAGCTGATAAACGAGAAGGA
GTCTACAATCAAGATTATTCTGACTCCAAACAATCACGGAGAGTTGAACA
GGGAAGCCGATAATGAGAAGAAAGAAATTTCCGAGCTGAGCAGTCGGAAA
GATCAAGAAAGACTACGAGAATCAATAAACGAGGCTAAACTGCGTCACAC
ATGA
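
When pasting a multi-line sequence like this into a form, a stray character is easy to miss. The following sketch reads a sequence on standard input, ignores line breaks, and prints its length if it contains only A, C, G, T or N. The function name check_dna and the choice to allow N are assumptions made for this example.

```shell
# Print the length of a nucleotide sequence read from stdin, ignoring
# whitespace and case; fail if it is empty or has other characters.
# check_dna is an illustrative name.
check_dna() {
    seq=$(tr -d '[:space:]' | tr '[:lower:]' '[:upper:]')
    case $seq in
        ''|*[!ACGTN]*)
            echo "invalid sequence" >&2
            return 1 ;;
    esac
    echo "${#seq}"
}
```

Saving the sequence to a file and running `check_dna < file` gives a quick sanity check before editing the feature.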

Note: a bug exists (which has been fixed in CVS but not here) that won't let you select 'gene' as the feature type. To get past this just select 'EST' or 'contig'.

Feature References

Notice that the page for the gene has a "References" box. There should be a reference to SGD and an accession number. We want to make the accession clickable.

Navigate to the phpPgAdmin page

  http://localhost/phpPgAdmin-4.2.2/

Log in as the 'yeast_admin' user with password 'demo'. Open the chado_yeast database schema and find the 'db' table. Locate the 'SGD' entry and edit the row adding the following to the URL and URL prefix columns:

  url:        http://www.yeastgenome.org/
  urlprefix:  http://www.yeastgenome.org/cgi-bin/locus.fpl?dbid=

Now, return to the gene page and refresh the page. The accession should now be clickable and take you to the appropriate page on the SGD website for the gene.
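
The reason the urlprefix value ends in "dbid=" is that the link is formed by appending the accession to it. The sketch below shows that concatenation under the assumption that nothing else is inserted between the two parts. The function name dbxref_url is made up, and S000000000 is a placeholder, not a real SGD identifier.

```shell
# Join a Chado db.urlprefix value and a dbxref accession into a link.
# dbxref_url is an illustrative name.
dbxref_url() {
    printf '%s%s\n' "$1" "$2"
}

# S000000000 is a placeholder accession used only for illustration.
dbxref_url 'http://www.yeastgenome.org/cgi-bin/locus.fpl?dbid=' 'S000000000'
```

The url column, by contrast, points at the database home page and does not take part in this concatenation.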

Note: In the future a Tripal module for managing these references will most likely be available.

Blast Module -- An Analysis Example

Analysis modules can be created to display data in any way desired. Tripal currently has two analysis modules that may or may not be useful to you.

Navigate to the "Administer" → "Site building" → "Modules" page and enable the Blast Analysis module.

The orf_coding.fasta file was blasted against the swissprot database. The results were saved in XML format. For this example only a single result will be used, to save space on the VM. The CUGI script split_blastxml-0.1.2 parses the XML file and generates a directory structure needed by the blast module.

We want to store results in the Blast analysis data directory:

 cd /var/www/sites/default/files/tripal
 sudo chown gmod tripal_analysis_blast

Now execute the script to generate the data format needed:

 perl /home/gmod/tripal/packages/split_blastxml-0.1.2.pl \
    -x /home/gmod/tripal/data/blast_demo.xml \
    -l tripal_analysis_blast \
    -h localhost \
    -u yeast_admin \
    -d chado_yeast \
    -p demo \
    -r 'DB:swissprot:display' \
    -e "^(.*?)\s.*$"
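
The pattern given to -e captures everything before the first whitespace character on a line. We assume here, without having checked the script, that it is used to pull the feature name out of each query description. The sketch below reproduces the capture with sed so it can be tried on sample text; the function name first_token and the sample line are made up for this example.

```shell
# Keep only the text before the first whitespace character of each line,
# which is what group 1 of ^(.*?)\s.*$ captures.
# Lines without whitespace are passed through unchanged by sed, whereas
# the original pattern would simply not match them.
# first_token is an illustrative name.
first_token() {
    sed -E 's/^([^[:space:]]*)[[:space:]].*$/\1/'
}

# The description text here is invented for the example.
printf '%s\n' 'YAL061W some description text' | first_token
```

Trying the pattern on a few real query lines from the XML file is a cheap way to check that the extracted names match the feature names loaded into Chado.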

Browse for the gene 'YAL061W' on the organism page and click. The page should show the blast results.

To disable the placement of blast results on the Feature pages, navigate to "Administer" → "Tripal Management" → "Analyses". Scroll down to the section titled "Tripal Blast" and uncheck the box. Return to the feature page and blast results should be gone.

Note: It is anticipated that in the future a bulk data load will be available and helper scripts will not be needed.

Features

Allows for storage of large blast results
Does not clutter the Chado database unnecessarily
Quick access
Ajax enabled.
Links to database

Restrictions

Currently only supports NCBI nr, SwissProt and go-seqdb.
Requires a helper script.

Searching

We have a search box at the top of the page but we would like to add a menu item for searching. Navigate to the "Administer" → "Site building" → "Menus" → "Primary Links" page and click the 'Add Item' near the top of the page.

Enter the following into the fields

  Path:  search/node
  Menu link title:  Search

A new 'Search' tab appears in the main menu.

Check the status of the search indexing by navigating to "Administer" → "Site Configuration" → "Search Settings". For searching to work correctly the site must be 100% indexed. If not let's do that now by manually running the Drupal cron:

  /usr/bin/wget -O - -q http://localhost/cron.php

However, due to timing issues, the search for the gene 'YAL061W' may or may not return a result. Additionally, the blast results just added contain the term 'Sorbitol dehydrogenase', but if we search on that enzyme we get zero results. We would like both terms to be searchable. To do this, we need to reindex the features.

How to reindex all content

Navigate to "Administer" → "Tripal Management" → "Features" and click the 'Reindex all feature nodes' button. This will add a Job to Tripal.

Rather than wait for the Tripal Job cron to launch we will do it manually:

  cd /var/www; php ./sites/all/modules/tripal_core/tripal_launch_jobs.php yeast_admin

Check the status of this job on the Jobs page.

Note: This will take a few minutes (3 minutes on my laptop)

We should now be able to search for both 'YAL061W' and 'Sorbitol dehydrogenase'.

Anytime major changes are made to the content on feature pages the feature should be re-indexed.

Advanced Searching

Drupal uses "Taxonomy" to provide filtering of search results. In the case of this demo this level of advanced searching wouldn't make sense because we only have one organism and one feature type but we'll walk through how to do it.

An example of advanced searching with multiple taxonomy terms: http://www.marinegenomics.org/search or http://www.fagaceae.org/search

First, navigate to the Search page and open the advanced options. It provides some default filtering mechanisms.

Navigate to "Administer" → "Tripal Management" → "Features". Scroll down to the "Set Taxonomy" section and check the boxes "Organism Name" and "Feature Type" and click the button "Set/Reset Taxonomy for all feature nodes". This will create a Tripal job. Let's launch the job manually rather than wait for the cron to kick in.

  cd /var/www; php ./sites/all/modules/tripal_core/tripal_launch_jobs.php yeast_admin

Check the status of the job on the Tripal Jobs page. When it is 100% complete, return to the Search page and refresh. You'll now notice a box for filtering search results by organism and by feature type. Search for the feature 'YAL061W'. Click on the result and scroll to the bottom of the page. You'll notice some "tags" have been added to the page. These are the Drupal taxonomy tags that have been assigned to the feature page.

Navigate to "Administer" → "Content Management" → "Taxonomy". You'll see the new vocabularies and terms.

Libraries & other Analysis

We do not have libraries for this example but they function in much the same way.

Subsets of features can be synced on the analysis and library administration pages when those modules are installed and features are associated in chado.

Incorporate JBrowse

Exercise:

1. Create a new page titled 'blast'

2. Add the following in the body:

  <iframe id="gbrowseFrame" width="100%" height="1500px" frameborder="0" src="/jbrowse/"></iframe>

3. Replace the src value with the URL for the JBrowse instance created earlier in the week, if it differs from /jbrowse/.

4. Set the Input format to Full HTML.

5. Save the page.

6. Create a menu item

This JBrowse won't contain the sequence in our demo database, but for purposes of demonstration this is adequate.

Customizing Content

Manipulating the Templates

Review User's Guide pg 28.

In the case of the organism template, the organism abbreviation is not present. This can easily be added by editing the node-chado_organism.tpl.php template file.

Changing Base Themes

Review User's Guide pg 28.

Make the following changes to the tripal.info file in the ./sites/all/themes/theme_tripal directory:

name = Tripal Theme
description = A Tripal specific subtheme for use with any base theme. Requires customization to .info file for inclusion of the base theme and regions.
version = 6.x-0.1b
core = 6.x
engine = phptemplate
base theme = pushbutton  // <-- changed

stylesheets[all][] = css/tripal.css
stylesheets[all][] = css/tripal_analysis_blast.css
stylesheets[all][] = css/tripal_organism.css

scripts[] = js/tripal.js

regions[header] = Header    // <-- changed
regions[left] = Left        // <-- changed
regions[right] = Right      // <-- changed
regions[footer] = Footer    // <-- changed

But there is a problem with this configuration file: if you change it, Tripal will get messed up. Restore tripal.info to its original state:

name = Tripal Theme
description = A Tripal specific subtheme for use with any base theme.  Requires customization to .info file for inclusion of the base theme and regions.
version = 6.x-0.1b
core = 6.x
engine = phptemplate
base theme = admire_gray

stylesheets[all][] = css/tripal.css
stylesheets[all][] = css/tripal_analysis_blast.css
stylesheets[all][] = css/tripal_organism.css

scripts[] = js/tripal.js

regions[header] = Header
regions[sidebar_left] = Sidebar left
regions[content_top] = Content top
regions[sidebar_right] = Sidebar right
regions[footer_block] = Footer block

Now, clear the cache. Navigate to "Administer" → "Site configuration" → "Performance". Click the "Clear cached data" button. That will tell Drupal to rebuild all pages from scratch.

Finally, we have to reset the blocks, but now we're using a new base theme and the Tripal theme is still present.

Important Drupal Modules

Views

Awesome. We won't go into detail here, but Views allows for customization of pages that aggregate data types.

CCK

CCK allows for adding new fields to existing data types, and more.

Writing your own Module

Not yet in the User's Guide

Drupal API

Concepts

The Tripal Core provides all necessary functionality to integrate with Tripal.
The Drupal API provides the rest.

Anatomy of a Drupal module

module name (directory name)
a .info file describing the module
a .install file that gets executed when the module is first installed and also when it is uninstalled.
a .module file containing the code for the module.
templates: the look-and-feel should be separated from the code if possible.

Example .info file: tripal_feature.info

  name = Tripal Chado Feature
  description = A module for interfacing the GMOD chado database with Drupal, providing viewing, inserting and editing of chado features.
  core = 6.x
  project = tripal_feature
  package = Tripal
  dependencies[] = tripal_core
  dependencies[] = tripal_organism
  dependencies[] = search
  dependencies[] = path
  version = "6.x-0.1b-m0.1"

Example .install file: tripal_feature.install

Drupal hooks: {node}_{hook_name} or {module}_{hook_name}

The important functions:

function tripal_feature_install(): actions to do on install
function tripal_feature_schema(): returns the database schema so that Drupal knows what it is
function tripal_feature_uninstall(): actions for cleanup when the module is uninstalled

See the code here: /var/www/sites/all/modules/tripal_feature
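The naming convention above is what lets Drupal find a hook: it builds a function name from a module (or node type) prefix plus the hook name and calls it if it exists. The following is a minimal standalone sketch of that idea, not Drupal's own code. The helper invoke_hook_sketch() and the example_module / example_node prefixes are hypothetical names used only for illustration.

```php
<?php
// Hypothetical helper (not a Drupal function): builds "{prefix}_{hook}"
// and calls that function if it has been defined.
function invoke_hook_sketch($prefix, $hook, array $args = array()) {
  $function = $prefix . '_' . $hook;
  if (!function_exists($function)) {
    return NULL;  // this module or node type does not implement the hook
  }
  return call_user_func_array($function, $args);
}

// A module-level hook, named {module}_{hook_name}.
function example_module_schema() {
  return array();  // no Drupal-side tables
}

// A node-level hook, named {node}_{hook_name}.
function example_node_insert($node) {
  return 'inserted ' . $node->title;
}

$node = new stdClass();
$node->title = 'demo';

$schema  = invoke_hook_sketch('example_module', 'schema');
$result  = invoke_hook_sketch('example_node', 'insert', array($node));
$missing = invoke_hook_sketch('example_module', 'uninstall');  // not defined
```

Because the lookup is purely by name, a misspelled function name simply means the hook is never called, so it is worth checking names carefully when a hook appears to do nothing.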

Example .module file: tripal_feature.module

The important module-centric functions:

function tripal_feature_admin(): generates the administration form for the module.
function tripal_feature_admin_validate($form, &$form_state): validates the admin form input.
function tripal_feature_node_info(): defines new nodes for Drupal (in this case chado_feature).
function tripal_feature_perm(): defines new permissions for the module.
function tripal_feature_menu(): defines new menu items
function tripal_feature_theme(): when theming must be handled by the module (e.g. search results).
function tripal_feature_nodeapi(&$node, $op, $teaser, $page): used to place content on other nodes pages.

The important node-centric functions:

function chado_feature_access($op, $node, $account): permission checking for the nodes
function chado_feature_insert($node): code to execute on insert of a new node
function chado_feature_delete($node): code to execute on delete of a node
function chado_feature_update($node): code to execute on update of an existing node
function chado_feature_form ($node,$param): generates the form when editing or adding a new node
function chado_feature_validate($node): validate the form input on insert or update.
function chado_feature_load($node): loads the node data object when viewed
function chado_feature_view ($node, $teaser = FALSE, $page = FALSE): sets in-module theming.

Tripal API

Tripal provides the following functions for integrating with Tripal:

tripal_create_moddir($module_name)

Every Tripal module is expected to have its own data directory. This directory gets created in the ./sites/default/data/tripal directory. This function creates that directory.

tripal_add_job ($job_name,$modulename,$callback,$arguments,$uid, $priority = 10)

This function adds a new job to Tripal. The arguments are as follows:

job_name: The name of the job as it appears to the user
modulename: The name of the Tripal module submitting the job (e.g. tripal_feature, tripal_organism, etc).
callback: The name of a function in the caller's module that should be called when the job is executed.
arguments: An array of arguments to pass to the callback function when executed.
uid: The user id of the user executing the job.
priority: Sets the priority of the job. The lower the number the higher the priority. Administrative jobs should always be 10 or less. User submitted jobs should always be above 10.

Note: This function will always pass in the job id as the first argument to the callback function. Therefore all callback functions should have a jobid argument first.

tripal_job_set_progress($job_id,$percentage)

Allows the callback function for the job to set the progress of the job. The first argument is the job id that gets passed into the callback by Tripal and the percentage is a value between 0 and 100.
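As a rough illustration of how tripal_add_job and tripal_job_set_progress fit together, here is a sketch of a callback that receives the job id first and reports progress as it works. It uses only the signatures given above. The stand-in definitions at the top, the tripal_example module name, and the example_load_items callback are assumptions made so the sketch runs outside Drupal; in particular, the return value of the stand-in tripal_add_job is not taken from Tripal.

```php
<?php
// Stand-ins so the sketch runs outside Drupal. On a Tripal site the real
// functions already exist and these definitions are skipped.
if (!function_exists('tripal_add_job')) {
  $GLOBALS['sketch_jobs'] = array();      // hypothetical in-memory job queue
  $GLOBALS['sketch_progress'] = array();  // hypothetical progress log

  function tripal_add_job($job_name, $modulename, $callback, $arguments, $uid, $priority = 10) {
    $job_id = count($GLOBALS['sketch_jobs']) + 1;
    $GLOBALS['sketch_jobs'][$job_id] = array($callback, $arguments);
    return $job_id;  // assumption: only the stand-in is known to return this
  }
  function tripal_job_set_progress($job_id, $percentage) {
    $GLOBALS['sketch_progress'][$job_id][] = $percentage;
  }
}

// The callback: the job id comes first, then the submitted arguments.
function example_load_items($job_id, $items) {
  $total = count($items);
  $done = 0;
  foreach ($items as $item) {
    // ... process $item here ...
    $done++;
    tripal_job_set_progress($job_id, (int) round(100 * $done / $total));
  }
  return $done;
}

// Submit as an administrative job (priority 10 or less) on behalf of user 1.
tripal_add_job('Load example items', 'tripal_example', 'example_load_items',
               array(array('a', 'b', 'c', 'd')), 1, 5);

// Outside Drupal only: play the role of the Tripal job runner.
foreach ($GLOBALS['sketch_jobs'] as $job_id => $job) {
  list($callback, $arguments) = $job;
  $processed = call_user_func_array($callback, array_merge(array($job_id), $arguments));
}
```

Note that the arguments array holds one entry per callback parameter after the job id, which is why the list of items is wrapped in an outer array.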

tripal_get_module_active_jobs ($modulename)

Allows a module to see if it has any active jobs currently executing. A list of jobs is returned.

tripal_add_cvterms ($name,$definition)

Adds a CVterm to the chado database. Terms added using this function are added to the 'tripal' CV and associated with the 'tripal' database in chado. This function allows modules to create cvterms behind-the-scenes to support the data management they provide. This function is particularly useful in _install hooks of .install files.
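A sketch of how an _install hook might use tripal_create_moddir and tripal_add_cvterms together is shown below. It relies only on the signatures listed above. The module name tripal_example, the term names, and the stand-in function bodies are all hypothetical and exist only so the sketch can run outside Drupal.

```php
<?php
// Stand-ins so the sketch runs outside Drupal. On a Tripal site the real
// functions already exist and these definitions are skipped.
if (!function_exists('tripal_create_moddir')) {
  $GLOBALS['sketch_calls'] = array();  // hypothetical call log
  function tripal_create_moddir($module_name) {
    $GLOBALS['sketch_calls'][] = "moddir:$module_name";
  }
  function tripal_add_cvterms($name, $definition) {
    $GLOBALS['sketch_calls'][] = "cvterm:$name";
  }
}

/**
 * Implementation of hook_install() for a hypothetical tripal_example module.
 */
function tripal_example_install() {
  // Create the module's data directory.
  tripal_create_moddir('tripal_example');

  // Register the terms the module relies on (these names are made up).
  $terms = array(
    'example_report' => 'A report produced by the example module.',
    'example_score'  => 'A numeric score attached to a report.',
  );
  foreach ($terms as $name => $definition) {
    tripal_add_cvterms($name, $definition);
  }
}

// Drupal calls the install hook itself; it is called here only for the demo.
tripal_example_install();
```

Keeping the terms in one array makes it easy to see at a glance which cvterms a module expects to find in the 'tripal' CV.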

function tripal_add_mview ($name,$modulename,$mv_table,$mv_specs,$indexed,$query,$special_index)

Programmatically adds a materialized view to Chado. This view can then be used behind-the-scenes by the module to help speed data queries. This function is particularly useful in _install hooks of .install files. The arguments are as follows:

name: The name of the materialized view.
modulename: The name of the module submitting the materialized view (e.g. 'tripal_library').
mv_table: The name of the table to add to chado. This is the table that can be queried.
mv_specs: The table definition
indexed: The columns that are to be indexed
query: The SQL query that loads the materialized view with data
special_index:

Note: These views are managed by Tripal and only the view itself is stored in Chado. This is different from the materialized views that come with Chado 1.0, although the outcome is the same. Any materialized views created with Tripal are not compatible with the Chado scripts for updating materialized views.

Adding a new analysis method

Not yet fully defined, however, the tripal_analysis module will provide a set of "core" functions for all analysis modules. Documentation and finalized analysis API coming soon.

Exercise

Setup

Step 1: Add a publication to Chado

Let's create a very simple module that will place publications on a feature page if an association exists in Chado. We won't be creating a data entry method so we'll have to manually add an example publication. Using phpPgAdmin, execute the following SQL statements (together as a block):

<sql>

  INSERT INTO cv (NAME) VALUES ('pub_demo');

  INSERT INTO db (NAME) VALUES ('pub_demo');

  INSERT INTO dbxref (db_id,accession,version,description)
  VALUES ((SELECT db_id FROM db WHERE NAME = 'pub_demo'),
          'pub0001',' ','demo publication accession');

  INSERT INTO cvterm (cv_id,name,dbxref_id)
  VALUES ((SELECT cv_id FROM cv WHERE NAME = 'pub_demo'),
          'journal',
          (SELECT dbxref_id FROM dbxref WHERE accession = 'pub0001'));

  INSERT INTO pub (title,volumetitle,volume,issue,pyear,pages,uniquename,type_id)
  VALUES ('Responses of pathogenic and nonpathogenic yeast species to steroids reveal the functioning and evolution of multidrug resistance transcriptional networks',
          'Eukaryot Cell','7','1','2008','68-77','demo_pub',
          (SELECT cvterm_id FROM cvterm CVT INNER JOIN CV on CVT.cv_id = CV.cv_id
           WHERE CV.name = 'pub_demo' AND CVT.name = 'journal'));

  INSERT INTO pubauthor (pub_id,rank,surname,givennames)
  VALUES ((SELECT pub_id FROM pub WHERE uniquename = 'demo_pub'),0,'Banerjee','et al.');

  INSERT INTO feature_pub (feature_id,pub_id)
  VALUES ((SELECT feature_id FROM feature WHERE uniquename = 'YAL061W'),
          (SELECT pub_id FROM pub WHERE uniquename = 'demo_pub'));

</sql> These statements do the following:

Adds a db and cv, along with a dbxref and a cvterm that use them. These are not real, but suit our purposes for the example.
Adds a publication and author
Associates the publication with the feature YAL061W.

Step 2: Create the module directory

Change directories to our Drupal install directory where we installed the Tripal modules:

  cd /var/www/sites/all/modules

Create a directory for our module. The directory name should be identical to our module name.

  mkdir tripal_pubs
  cd tripal_pubs

Create tripal_pubs.info file

The first step to creating a module is to define our .info file that provides information to Drupal about our module.

Design considerations:

Our module is dependent on the Tripal core (all modules are) and the Tripal feature (we want to tie our pubs to features).
We want to add this module to the Tripal package.

Create a tripal_pubs.info file with the following content:

; $Id:
name = Tripal Pubs
description = A module for displaying publications on feature pages
core = 6.x
project = tripal_pubs
package = Tripal
version = 6.x-0.1b-m0.1
dependencies[] = tripal_core
dependencies[] = tripal_feature

Create tripal_pubs.install file

We do not have any functionality that needs to be performed when the module is installed, but we'll create a shell with the proper Drupal hooks so that we can add code to them in the future if we want to. <php> <?php

  //$Id:

  /*******************************************************************************
  * Implementation of hook_install().
  */
  function tripal_pubs_install() {
  }

  /*******************************************************************************
  * Implementation of hook_uninstall().
  */
  function tripal_pubs_uninstall() {
  }

  /*******************************************************************************
  * Implementation of hook_schema().
  */
  function tripal_pubs_schema() {
     $schema = array();
     return $schema;
  }

</php>

Create the tripal_pubs.module file

Our module will be simple. It will add content to a feature page if there are publications associated with the feature. However, we'll add many important hooks for demonstration purposes.

Cut and paste the following code into a new tripal_pubs.module file. We'll discuss each function...

<php> <?php

  /*******************************************************************************
  * Implementation of hook_init()
  */
  function tripal_pubs_init() {
     // nothing to initialize yet
  }

  /*******************************************************************************
  * Implementation of hook_perm()
  */
  function tripal_pubs_perm(){
     return array(
        'access chado_publication content',
        'create chado_publication content',
        'delete chado_publication content',
        'edit chado_publication content',
     );
  }

  /*******************************************************************************
  * Implementation of hook_node_info()
  */
  function tripal_pubs_node_info() {
     $nodes = array();
     $nodes['chado_publication'] = array(
        'name' => t('Publications'),
        'module' => 'chado_publication',
        'description' => t('A publication from the chado database'),
        'has_title' => FALSE,
        'title_label' => t('Publication'),
        'has_body' => FALSE,
        'locked' => TRUE
     );
     return $nodes;
  }

  /*******************************************************************************
  * Implementation of hook_block()
  */
  function tripal_pubs_block($op = 'list', $delta = '0', $edit = array()){
     switch($op){
        case 'list':
           $blocks[0]['info'] = t('Publications');
           return $blocks;

        case 'view':
           $block = array();  // stays empty if the user lacks access
           if(user_access('access chado_publication content')){
              $items = array(t("This block is not yet set up."));
              $block['subject'] = t('Publications');
              //We theme our array of links as an unordered list
              $block['content'] = theme('item_list', $items);
           }
           return $block;
     }
  }

  /*******************************************************************************
  * Implementation of hook_menu()
  */
  function tripal_pubs_menu() {
     $items = array();

     $items['publications'] = array(
        'menu_name' => ('primary-links'), //Enable the 'Publications' primary link
        'title' => t('Publications'),
        'page callback' => 'tripal_pubs_page',
        'access arguments' => array('access chado_publication content'),
        'type' => MENU_NORMAL_ITEM
     );
     // the administrative settings menu
     $items['admin/tripal/tripal_publication'] = array(
       'title' => 'Publications',
       'description' => 'Manage integration of Chado publications including associated features',
       'page callback' => 'drupal_get_form',
       'page arguments' => array('tripal_pubs_admin'),
       'access arguments' => array('administer site configuration'),
       'type' => MENU_NORMAL_ITEM,
     );
     return $items;
  }

  /*******************************************************************************
  * Implementation of hook_admin()
  */
  function tripal_pubs_admin () {
     // provide a form for administrative settings
     $form = array();

     return system_settings_form($form);
  }

  /*******************************************************************************
  * Implementation of hook_admin_validate()
  */
  function tripal_pubs_admin_validate($form, &$form_state) {
     // validate the admin form submission
  }

  /*******************************************************************************
  * Implementation of hook_insert()
  */
  function chado_publication_insert($node){
     // add publication to chado database
  }

  /*******************************************************************************
  * Implementation of hook_delete()
  */
  function chado_publication_delete($node){
     // remove publication from chado database
  }

  /*******************************************************************************
  * Implementation of hook_update()
  */
  function chado_publication_update($node){
     // update a publication record in the chado database
  }

  /*******************************************************************************
  * Implementation of hook_access()
  */
  function chado_publication_access($op, $node, $account){
     if ($op == 'create') {
        return user_access('create chado_publication content', $account);
     }

     if ($op == 'update') {
        if (user_access('edit chado_publication content', $account)) {
           return TRUE;
        }
     }
     if ($op == 'delete') {
        if (user_access('delete chado_publication content', $account)) {
           return TRUE;
        }
     }
     if ($op == 'view') {
        if (user_access('access chado_publication content', $account)) {
           return TRUE;
        }
     }
     return FALSE;
  }

  /*******************************************************************************
  * Implementation of hook_form()
  */
  function chado_publication_form ($node, $param){
     $form = array();

     $form['title']= array(
        '#type' => 'textfield',
        '#title' => t('Publication Title'),
        '#required' => TRUE,
        '#default_value' => ' ',
        '#description' => t('Enter the title for the publication'),
        '#weight' => 1,
        '#maxlength' => 255
     );

     // add additional fields to the form that are needed to populate the
     // chado tables

     return $form;
  }

  /*******************************************************************************
  * Implementation of hook_load()
  */
  function chado_publication_load($node){
     $previous_db = db_set_active('chado');  // use chado database
     $sql = "SELECT * ".
            "FROM pub P ";
     $pubs = db_fetch_object(db_query($sql));
     db_set_active($previous_db);  // now use drupal database

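     // hook_load() must return an object; its properties are merged
     // into the node
     $additions = new stdClass();
     $additions->pubs = $pubs;

     // we would additionally want to pull out the authors for the publication
     // as well

     return $additions;
  }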

  /*******************************************************************************
  * Implementation of hook_view()
  */
  function chado_publication_view ($node, $teaser = FALSE, $page = FALSE) {
     // this function provides instructions for establishing the way the
     // publication page will look.
  }

  /*******************************************************************************
  * Callback for the 'publications' menu item
  */
  function tripal_pubs_page(){
     return 'This is where custom content would go if we needed it.';
  }

  /*******************************************************************************
  * Implementation of hook_nodeapi()
  */
  function tripal_pubs_nodeapi(&$node, $op, $teaser, $page) {
     switch ($op) {
     case 'view':
        // Abort if this node is not one of the types we should show.
        if (strcmp($node->type,'chado_feature') != 0) {
           break;
        }

        // Add pubs to the content item if it's not a teaser
        if (!$teaser && $node->feature->feature_id) {
           $node->content['tripal_pubs_form'] = array(
              '#value' => theme('tripal_pubs_feature_pub_list', $node),
              '#weight' => 4
           );
        }
     }
  }

  /*******************************************************************************
  * Implementation of hook_theme()
  */
  function tripal_pubs_theme () {
      return array(
         'tripal_pubs_feature_pub_list' => array (
            'arguments' => array('node' => NULL),
         )
      );
  }

  /*******************************************************************************
  * The theme function that sets the content to be added
  */
  function theme_tripal_pubs_feature_pub_list ($node) {
     $feature = $node->feature;

     // get the list of publications that are associated with this
     // feature
     $sql = "SELECT * ".
            "FROM pub P ".
            "   INNER JOIN feature_pub FP ON FP.pub_id = P.pub_id ".
            "WHERE FP.feature_id = %d";
     $previous_db = db_set_active('chado');  // use chado database
     $pubs = db_query($sql,$feature->feature_id);
     db_set_active($previous_db);  // now use drupal database
     // generate the content that will appear on the feature page
     // first we'll make the normal Tripal expandable box for this
     // content to go into
     $content  = "<div id=\"feature-pubs\" class=\"feature-pubs-box\">".
                 "<div class=\"tripal_expandableBox\">".
                 "<h3>Publications</h3></div>".
                 "<div class=\"tripal_expandableBoxContent\" ".
                 "id=\"pubs_$feature->feature_id\">";

     // iterate through the publications, get the authors and print
     // the references
     while($pub = db_fetch_object($pubs)){
        $sql = "SELECT * FROM pubauthor WHERE pub_id = %d";
        $previous_db = db_set_active('chado');  // use chado database
        $authors = db_query($sql,$pub->pub_id);
        db_set_active($previous_db);  // now use drupal database
        while ($author = db_fetch_object($authors)){
           $content .= "$author->surname,  $author->givennames; ";
        }
        $content .= "$pub->title. $pub->volumetitle $pub->pyear $pub->volume($pub->issue):$pub->pages<br>";
     }
     $content .= "</div><br>".   // close the expandable box content
                 "</div>";       // close the feature-pubs box

     return $content;
  }

</php>
