BioPerl

From GMOD
Revision as of 04:57, 4 February 2008 by Clements

BioPerl is a set of Perl modules that support bioinformatics programming. Several GMOD components use BioPerl extensively, so you will need to install it before you can use those components.

This is a one-page summary of BioPerl that introduces several high-level concepts and some points that are specifically important for GMOD. If you have a detailed question about BioPerl, see the BioPerl web site.

BioPerl Background

BioPerl Packages and bioperl-live

BioPerl is an enormous project. To make it more manageable, it has been divided into several packages. The most popular package, and the one most frequently used by GMOD components, is the Core package. Each GMOD component will tell you which BioPerl packages it requires.

Bioperl-live is the latest version of the Core package as it exists in Subversion. That is, bioperl-live contains the most recent updates to BioPerl, including changes that have not yet appeared in any release.


BioPerl Releases

At any point in time there are generally 3 BioPerl releases:

Release | Description | Use in GMOD?
Stable | Stable releases have gone through more testing than the other types of releases. They come out infrequently and as of January 2008, the most recent stable release was 1.4.0, released in December 2003. | No
Developer | Developer releases have gone through some testing, but less than stable releases. These come out more frequently than stable releases. As of January 2007, the latest developer release is 1.5.2, released in December 2006. | GBrowse 1.69 [1]
BioPerl-live | BioPerl-live (and its cousins) is not a release per se, but is rather a copy of what is in BioPerl's Subversion repository for the Core package on the day you get the files. This is the most up-to-date version of BioPerl you can get. | Yes [2]

  1. The developer release does work with straightforward installations of GBrowse, but if your installation gets more complicated you may need to upgrade to BioPerl-live.
  2. Getting the latest code from Subversion may sound scary, but it rarely causes problems. Revisions to BioPerl almost always result in a better package.


GMOD components rely on fixes and features that are not in either the stable or developer releases. You need to use BioPerl-live. See BioPerl's Using CVS and Getting BioPerl pages for how to get the latest copy of BioPerl-live.
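
Before installing anything, it can help to see which BioPerl, if any, Perl already finds on your system. The following is a minimal sketch that assumes perl is on your PATH; the function name bioperl_version is made up for illustration.

```shell
# Hypothetical helper: print the version of the BioPerl that Perl finds first,
# or "not installed" if Bio::Root::Version cannot be loaded.
bioperl_version() {
    perl -MBio::Root::Version -e 'print "$Bio::Root::Version::VERSION\n"' 2>/dev/null \
        || echo "not installed"
}

bioperl_version
```

If more than one copy of BioPerl is present, this reports only the copy that comes first on Perl's module search path.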

Installing BioPerl

Installing BioPerl is non-trivial. It has many dependencies both within and outside of Perl. Perl does a pretty good job of dealing with dependencies within Perl (meaning dependencies on other Perl modules). It does not do so well with dependencies outside of Perl. You should address the external dependencies before attempting to install BioPerl.

Dependencies

Outside Perl

You need to install external (i.e., non-Perl) libraries before you install BioPerl. If these are not installed, attempts to install BioPerl will generate copious error messages.

If you are on Linux, then these will be available as packages and should be installed using the appropriate package manager for your Linux distribution. These may also already be installed on your system.

Library(ies) | Description
perl-devel | Perl development library.
perl-DB_File | Berkeley DB support in Perl.
libgd, libgd-devel | Libraries for creating PNG, JPG, and other images.
expat, libexpat | An XML parser.
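
Once the libraries are installed, one rough way to confirm that Perl can reach them is to try loading a Perl module that binds to each one. The sketch below uses DB_File (Berkeley DB), GD (libgd) and XML::Parser (expat) as representative bindings. That choice is an assumption made here for illustration, not a list taken from the BioPerl documentation, and the helper name check_perl_module is hypothetical.

```shell
# Hypothetical helper: report whether Perl can load the named module.
check_perl_module() {
    if perl -M"$1" -e 1 2>/dev/null; then
        echo "ok      $1"
    else
        echo "MISSING $1"
    fi
}

# Assumed bindings for the libraries in the table above.
for module in DB_File GD XML::Parser; do
    check_perl_module "$module"
done
```

A MISSING line can mean either that the Perl module is absent or that the library underneath it is, so treat the output as a pointer for where to look rather than a diagnosis.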

Inside Perl

There are also a few Perl modules that you should install before installing BioPerl. These are listed on BioPerl's installation pages.

Install

BioPerl has an Installing BioPerl page that includes pointers to installation pages for specific platforms. That page also lists four different ways to install a BioPerl module. However, since GMOD needs BioPerl-live, which you get from BioPerl's source repository, only one of those methods applies. See BioPerl's Using CVS page for how to get the latest version and how to use it.
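
One common way to use a repository checkout without replacing a BioPerl that is already installed is to put the working copy first on Perl's module search path through the PERL5LIB environment variable. The following is a sketch only: the directory $HOME/src/bioperl-live is a hypothetical location, so substitute wherever your own copy lives.

```shell
# Hypothetical location of a bioperl-live working copy; adjust to your system.
BIOPERL_LIVE="${BIOPERL_LIVE:-$HOME/src/bioperl-live}"

# Put the working copy ahead of any installed BioPerl on Perl's module search path.
export PERL5LIB="$BIOPERL_LIVE${PERL5LIB:+:$PERL5LIB}"

# Show which file Perl now loads Bio::Root::Version from (prints nothing if it is not found).
perl -MBio::Root::Version -e 'print "$INC{q{Bio/Root/Version.pm}}\n"' 2>/dev/null || true
```

The setting lasts only for the current shell session unless you add it to your shell startup file.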

Answering Questions During the Install

The installation will typically ask you many questions. How should you answer those? The basic guideline is:

When in doubt use the default answer.

The default answer is usually shown in square brackets, e.g. [y] for 'yes'.

Errors

From BioPerl:

If you've installed everything perfectly and all the network connections are working then you may pass all the tests run in the './Build test' phase. It's also possible that you may fail some tests. Possible explanations: problems with local Perl installation, network problems, previously undetected bug in Bioperl, flawed test script, problems with CGI script used for sequence retrieval at public database, and so on. Remember that there are over 800 modules in Bioperl and the test suite is running more than 12000 individual tests, a few failed tests may not affect your usage of Bioperl.

If you decide that the failed tests will not affect how you intend to use Bioperl and you'd like to install anyway do:

cpan>force install S/SE/SENDU/bioperl-1.5.2_102.tar.gz

This is what most experienced Bioperl users would do. However, if you're concerned about a failed test and need assistance or advice then contact bioperl-l@bioperl.org.

In other words, the install will probably not pass all of its tests and will have to be forced. But how do you decide which errors are acceptable and which are not? Here's a rough guideline:

  • If you see errors related to the non-Perl prerequisites (see Outside Perl above)
    then you should check that you have that prerequisite installed correctly and then try again.
    This type of error usually occurs while building the BioPerl installation rather than while testing it.
  • If your errors occur only in the testing part of the installation, and there are few of them compared to the total number of tests,
    then you are probably safe to do a force install.
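
The second point of the guideline can be written down as a small check. This is a sketch under stated assumptions: the helper name force_install_advice is hypothetical, and the 1% cutoff is an arbitrary value chosen for illustration, not a figure recommended by BioPerl or GMOD. It looks only at the proportion of failed tests and cannot tell whether an error came from the build step or the test step.

```shell
# Hypothetical helper: apply the rule of thumb above to a test summary.
# $1 = number of failed tests, $2 = total number of tests run.
# The 1% cutoff is an assumption chosen for illustration.
force_install_advice() {
    failed=$1
    total=$2
    if [ "$total" -le 0 ]; then
        echo "no tests were run; check the prerequisites first"
    elif [ $((failed * 100)) -lt "$total" ]; then
        echo "few failures; a force install is probably safe"
    else
        echo "many failures; investigate before forcing"
    fi
}

force_install_advice 30 12000
```

Read the failure and total counts from the summary printed at the end of the test phase and pass them in by hand.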