Difference between revisions of "GBrowse"

Revision as of 20:04, 19 May 2011

Status

Mature release
Active development
Active support

The Generic Genome Browser (GBrowse) is a genome viewer and is GMOD's most popular component. For a demo of its features, see WormBase, FlyBase, the Human Genome Segmental Duplication Database, and others.

Description

Figure: GBrowse running on HapMap.org.

GBrowse is a combination of database and interactive web pages for manipulating and displaying annotations on genomes. Some of its features include:

Simultaneous bird's eye and detailed views of the genome.
Scroll, zoom, center.
Use a variety of premade glyphs or create your own.
Attach arbitrary URLs to any annotation.
Order and appearance of tracks are customizable by administrator and end-user.
Search by annotation ID, name, or comment.
Supports third party annotation using GFF formats.
Settings persist across sessions.
DNA and GFF dumps.
Connectivity to different databases, including BioSQL and Chado.
Multi-language support.
Third-party feature loading.
Customizable plug-in architecture (e.g. run BLAST, dump & import many formats, find oligonucleotides, design primers, create restriction maps, edit features).

GBrowse Versions

GBrowse 1.X (currently 1.70) is the older series that has been in use since 2002. It is recommended for applications which use a single database only and which must support legacy browsers.

GBrowse 2.0 is a rewrite of the original GBrowse that adds dynamic updating via AJAX and a smoother user experience. In addition, it lets administrators attach a different genome database to each GBrowse track, making it much easier to manage and update tracks. It also provides a distributed backend system of "slave" renderers, allowing each track to be rendered in parallel on a different machine and significantly increasing performance. GBrowse 2.0 is considered stable, but does not have full internationalization support. In addition, there may be issues with older browsers that do not support newer JavaScript features.
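
The per-track database attachment described above can be sketched as a configuration fragment. The stanza names, adaptors, and track options are borrowed from the GMOD training configuration that this revision replaced; the scratch directory and the file name example.conf are hypothetical, and the fragment is an illustration rather than a complete GBrowse data source configuration.

```shell
#!/usr/bin/env bash

# Hypothetical scratch location; a real install keeps this file in its GBrowse conf directory.
conf_dir="$(mktemp -d)"
conf_file="$conf_dir/example.conf"

# Quoted heredoc delimiter: the text is written verbatim, with no shell expansion.
cat > "$conf_file" <<'EOF'
# Two named databases, each backed by a different adaptor.
[annotation:database]
db_adaptor    = Bio::DB::Das::Chado
db_args       = -dsn dbi:Pg:dbname=chado
                -user gmod

[bam_sample:database]
db_adaptor    = Bio::DB::Sam
db_args       = -fasta /var/www/gbrowse2/databases/pythium/scf1117875582023.fasta
                -bam   /var/www/gbrowse2/databases/pythium/simulated-sorted.bam

# Tracks read from the annotation database unless they say otherwise.
[TRACK DEFAULTS]
glyph         = generic
database      = annotation

[Genes]
feature       = gene
glyph         = gene
key           = Named gene

# This track overrides the default and reads from the BAM database.
[CoverageXyplot]
feature       = coverage
glyph         = wiggle_xyplot
database      = bam_sample
key           = Coverage (xyplot)
EOF

# List the stanza headers so the database and track layout is easy to scan.
grep -n '^\[' "$conf_file"
```

The point of the layout is that a track names its database with a single `database` option, so swapping or updating one data source does not touch the other tracks.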

Installation

GBrowse is Perl-based. It can be installed using the standard Perl module build procedure, or automatically using a network-based install script. To use the net installer, you will need Perl 5.8.6 or higher and the Apache web server installed. See the following step-by-step instructions for details:

GBrowse Install HOWTO
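
The two installation routes mentioned above can be sketched as shell commands. This is a dry-run sketch: by default it only prints the commands, because a real install needs root privileges and network access. The `run` helper and the `DRY_RUN` variable are illustrative conventions, not part of GBrowse. `Bio::Graphics::Browser2` is the CPAN module name used for GBrowse 2 in the GMOD training material, and the sketch assumes gbrowse_netinstall.pl has already been downloaded into the current directory.

```shell
#!/usr/bin/env bash

# Print commands by default; set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# The net installer needs Perl 5.8.6 or higher; "use 5.008006" fails on anything older.
run perl -e 'use 5.008006; print "Perl version OK\n";'

# Route 1: let the CPAN client fetch GBrowse 2 and any missing prerequisites.
run sudo cpan Bio::Graphics::Browser2

# Route 2: run the net installer script (assumed to be in the current directory).
run sudo perl gbrowse_netinstall.pl
```

The two routes are alternatives, so when running for real keep only the one you want. The HOWTO linked above remains the reference for platform-specific steps.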

Documentation

On-line documentation

	GBrowse 1.x	GBrowse 2.0
Usage	OpenHelix	Tutorial
Install	Wiki	Wiki
Configure	Wiki / Tutorial	Wiki / Tutorial

POD documentation

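There are many useful POD documents included with the distribution. These are converted to HTML files when you install the package, and can be found in /gbrowse/docs/pod:

BIOSQL_ADAPTER_HOWTO.pod
GENBANK_HOWTO.pod
PLUGINS_HOWTO.pod
INSTALL.MacOSX.pod
DAS_HOWTO.pod
INSTALL.pod
README-chado.pod
FAQ.pod
MAKE_IMAGES_HOWTO.pod
README-gff-files.pod (see also GFF)
GBROWSE_IMG.pod
ORACLE_AND_POSTGRESQL.pod
README-lucegene.pod

Because these files are in Perl POD format, they may show formatting codes when viewed in a web browser.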

Downloads

Source Code Download (tar.gz file)

Download the source from the SourceForge download page.

Net-based Installer Script

The net installer script, gbrowse_netinstall.pl (available from the GBrowse GitHub repository), automatically downloads and installs GBrowse and its Perl libraries for you. See Installation for details on using this script.

SVN

There are often new features and bug fixes in the current development version which have not yet been released. To get the latest version, please use Subversion (SVN). The recommended branch to use is trunk, which is usually stable:

 svn co https://gmod.svn.sourceforge.net/svnroot/gmod/Generic-Genome-Browser/trunk Generic-Genome-Browser

Once you have successfully checked out the Generic-Genome-Browser distribution, fetch recent changes by executing svn update inside the Generic-Genome-Browser directory.

You can also browse the GBrowse SVN.

1.x Development Version

The checkout command above fetches the GBrowse 2 development version. To get the GBrowse 1.x development version, check out the stable branch instead:

 svn co https://gmod.svn.sourceforge.net/svnroot/gmod/Generic-Genome-Browser/branches/stable Generic-Genome-Browser

About Databases

GBrowse has a flexible adaptor system (yes, it is spelled that way, not "adapter") for running on various types of databases and other data sources. A common question is "which adaptor should I be using?" The table below attempts to answer that question.

Adaptor	Other required software	Roughly how many users	Pros	Cons
Bio::DB::SeqFeature::Store (use bp_seqfeature_load.pl)	MySQL, PostgreSQL, SQLite, BerkeleyDB	Many and growing fast.	Roughly 4X faster than Bio::DB::GFF for the same data; designed to work with GFF3	Developed for use with GFF3; about 2X slower than Bio::DB::GFF to load a database
Bio::DB::GFF (use bp_load_gff.pl, bp_bulk_load_gff.pl, bp_fast_load_gff.pl)	A relational database server: MySQL, PostgreSQL, Oracle, or BerkeleyDB	Lots! (Especially MySQL)	Quite fast; large user base; Have to use this if your data is in the (now deprecated) GFF2 format.	Does not work well with GFF3 formatted data
Bio::DB::Sam (available from CPAN)	SAMtools	Growing (particularly with GBrowse2)	Very fast access to NextGen sequencing data	Difficult to use with GBrowse 1.70
Bio::DB::BigWig and Bio::DB::BigWigSet (available from CPAN)	UCSC Formats	Growing (particularly with GBrowse2)	Very fast access to data in bigWig format	Difficult to use with GBrowse 1.70
Bio::DB::BigBed (available from CPAN)	UCSC Formats	Growing (particularly with GBrowse2)	Very fast access to data in bigBed format	Difficult to use with GBrowse 1.70
Bio::DB::Das::Chado (available from CPAN)	PostgreSQL and a Chado schema	Relatively few due to the specialized nature of Chado	Allows 'live' viewing of the features in a Chado database	Slow compared to Bio::DB::GFF
Bio::DB::Das::BioSQL (available from CPAN)	MySQL and a BioSQL schema	Relatively few due to the small number of BioSQL users	Allows 'live' viewing of the features in a BioSQL database	Slow compared to Bio::DB::GFF
Memory (i.e., flat file database using either Bio::DB::GFF or SeqFeature::Store)	None	For real servers, none	Easy for rapid development and testing	Very slow for more than a few thousand features
LuceGene	Lucene (searches indexed flat files)	Relatively few
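
An adaptor from the table is selected in a data source's configuration through a database stanza. The fragment below is a minimal sketch for the in-memory Bio::DB::SeqFeature::Store option, which the table suggests for rapid development and testing. The stanza name `scaffolds`, the scratch directory, and the file name are hypothetical. The `-adaptor memory` and `-dir` arguments are assumed to match the sample configuration shipped with GBrowse 2, so verify them against your installation.

```shell
#!/usr/bin/env bash

# Hypothetical scratch locations for the sketch.
work_dir="$(mktemp -d)"
data_dir="$work_dir/databases/example"   # would hold the GFF3 and FASTA flat files
conf_file="$work_dir/example.conf"
mkdir -p "$data_dir"

# Unquoted heredoc delimiter so that $data_dir is expanded into the config text.
cat > "$conf_file" <<EOF
[scaffolds:database]
db_adaptor    = Bio::DB::SeqFeature::Store
db_args       = -adaptor memory
                -dir     $data_dir
search options = default

[TRACK DEFAULTS]
database      = scaffolds
EOF

# Show the generated fragment.
cat "$conf_file"
```

For a production server the table points to a relational backend instead. The stanza keeps the same shape and only the `db_args` values change, which is what makes it practical to prototype with flat files and switch adaptors later.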

Email Threads

There have been some useful email threads on adaptor choices and tradeoffs.

Memory Database, 2010/06

Contacts

Please report bugs to the SourceForge Bug Tracker (select 'Category: Gbrowse').

	Mailing List Link	Description	Archive(s)
GBrowse & GBrowse_syn	gmod-gbrowse	GBrowse and GBrowse_syn users and developers.	Gmane, Nabble (2010/05+), Sourceforge
GBrowse & GBrowse_syn	gmod-gbrowse-cmts	Code updates.	Sourceforge

Logo

The GBrowse logo was created by Alex Read, a participant in the Spring 2010 Logo Program, while a design student at Linn-Benton Community College.

@@ Line 1: / Line 1: @@
-{{SessionHead}}
+{{ImageCenter|GBrowseLogo.png|GBrowse Logo|400|http://gmod.org/wiki/GBrowse#Logo}}
-{| class="tutorialheader"
+__NOTITLE__
-| {{TutorialTitleLine|GBrowse}}<br />
-[[2011 GMOD Spring Training]]<br />
--12 March 2011<br />
-[[User:Scott|Scott Cain]]
-| align="right" | {{#icon: GBrowseLogo.png|GBrowse|200|gmod:GBrowse}}
-|}
+{{ComponentBox|{{GBrowseResourcesBoxItem}}|<!--{{ComponentBoxEventsHeader}}|{{GMODAmericas2011BoxItem|2011 GMOD Spring Training|GMOD Spring Training|March 8-12}}-->|||| }}
-{{TocRight}}
+The Generic Genome Browser (GBrowse) is a genome [[Visualization|viewer]] and is [[GMOD]]'s most popular [[GMOD Components|component]]. For a demo of its features, see [http://wormbase.org/db/gb2/gbrowse/c_elegans/ WormBase], [http://flybase.org/cgi-bin/gbrowse/dmel FlyBase], or [http://projects.tcag.ca/cgi-bin/duplication/dupbrowse/human_b35 Human Genome Segmental Duplication Database] and  [[GMOD_Users|others]].
-=Prerequisites=
-Installed before using apt or cpan.
+==Description==
+[[image:gbrowse_screenshot1.gif|right|thumb|350px|GBrowse running on [http://hapmap.org/downloads/index.html HapMap.org] [[Media:gbrowse_screenshot1.gif|Click to view at full resolution]]]]
-=Install GBrowse=
+GBrowse is a combination of database and interactive web pages for manipulating and displaying annotations on genomes. Some of its features include:
-Easily installed via the cpan shell:
+* Simultaneous bird's eye and detailed views of the genome.
-  <span class="enter">sudo cpan</span>
+* Scroll, zoom, center.
-  cpan> <span class="enter">install Bio::Graphics::Browser2</span>
+* Use a variety of [[GBrowse Configuration HOWTO#Glyphs and Glyph Options|premade glyphs]] or create your own.
+* Attach arbitrary URLs to any annotation.
+* Order and appearance of tracks are customizable by administrator and end-user.
+* Search by annotation ID, name, or comment.
+* Supports third party annotation using [[GFF]] formats.
+* Settings persist across sessions.
+* DNA and [[GFF]] dumps.
+* Connectivity to different databases, including [[BioSQL]] and [[Chado]].
+* Multi-language support.
+* Third-party feature loading.
+* Customizable [[GBrowse Plugins|plug-in]] architecture (e.g. run [[wp:BLAST|BLAST]], dump & import many formats, find oligonucleotides, [[PrimerDesigner.pm|design primers]], create restriction maps, edit features)
-Which gets all of the prereqs that aren't installed on the machine.
+==GBrowse Versions ==
-=Tutorial=
+'''GBrowse 1.X''' (currently 1.70) is the older series that has been in use since 2002. It is recommended for applications which use a single database only and which must support legacy browsers.
-Go to http://localhost/gbrowse2
+'''GBrowse 2.0''' is a rewrite of the original GBrowse to add dynamic updating via AJAX and a smoother user experience. In addition, it provides administrators with the ability to attach a different genome database to each GBrowse track, making it much easier to manage and update tracks. It also provides a distributed backend system of "slave" renderers, allowing each track to be rendered in parallel on a different machine and significantly increasing performance. GBrowse 2.0 is considered stable,but does not have full internationalization support. In addition, there may be issues with older browsers that do not support newer JavaScript features.
-=Basic [[Chado]] Configuration (if we have time)=
+==Installation==
-{{CPAN|Bio::DB::Das::Chado}} was installed when we created the image.  Sample configuration files are available with GBrowse, and we'll get the sample Chado file:
+GBrowse is [[Glossary#Perl|Perl]]-based. It can be installed using the standard Perl module build procedure, or automated using a network-based install script. In order to use the net installer, you will need to have Perl 5.8.6 or higher and the Apache web server installed. See the step-by-step instructions below for detailed instructions:
-  <span class="enter">wget http://gmod.svn.sourceforge.net/viewvc/gmod/Generic-Genome-Browser/trunk/contrib/conf_files/07.chado.conf -O pythium.conf</span>
+* [[GBrowse Install HOWTO]]
+* [[GBrowse_MacOSX_HOWTO|Install on MacOSX]]
+* [[GBrowse_Windows_HOWTO|Install on Windows]]
+* [[GBrowse_Ubuntu_HOWTO|Install on Ubuntu and other Debian-based systems]]
+* [[GBrowse_RPM_HOWTO|Install on Fedora Core and other RPM-based systems]]
+* [[GBrowse_Gentoo_HOWTO|Install on Gentoo Linux system]]
+* [[GBrowse_Install_HOWTO|Source Code Install (for other Linux systems)]]
+==Documentation==
+===On-line documentation===
+{{GB doc box}}
-Some simple tweaks and additions:
-*Change description
+===POD documentation===
-*Get rid of <tt>database = main</tt>
+There are many useful POD documents included with the distribution.  These are converted to HTML files when you install the package, and can be found in /gbrowse/docs/pod:
-*Remove or change examples (yeast examples don't help anybody)
-*Add initial landmark (<tt>initial landmark = scf1117875582023</tt>)
-==DB connection info==
+* {{SF_SVN|Generic-Genome-Browser/trunk/docs/pod/BIOSQL_ADAPTER_HOWTO.pod|BIOSQL_ADAPTER_HOWTO.pod}}
+* {{SF_SVN|Generic-Genome-Browser/trunk/docs/pod/GENBANK_HOWTO.pod|GENBANK_HOWTO.pod}}
+* {{SF_SVN|Generic-Genome-Browser/trunk/docs/pod/PLUGINS_HOWTO.pod|PLUGINS_HOWTO.pod}}
+* {{SF_SVN|Generic-Genome-Browser/trunk/docs/pod/INSTALL.MacOSX.pod|INSTALL.MacOSX.pod}}
+* {{SF_SVN|Generic-Genome-Browser/trunk/docs/pod/DAS_HOWTO.pod|DAS_HOWTO.pod}}
+* {{SF_SVN|Generic-Genome-Browser/trunk/docs/pod/INSTALL.pod|INSTALL.pod}}
+* {{SF_SVN|Generic-Genome-Browser/trunk/docs/pod/README-chado.pod|README-chado.pod}}
+* {{SF_SVN|Generic-Genome-Browser/trunk/docs/pod/FAQ.pod|FAQ.pod}}
+* {{SF_SVN|Generic-Genome-Browser/trunk/docs/pod/MAKE_IMAGES_HOWTO.pod|MAKE_IMAGES_HOWTO.pod}}
+* {{SF_SVN|Generic-Genome-Browser/trunk/docs/pod/README-gff-files.pod|README-gff-files.pod}} (see also [[GFF]])
+* {{SF_SVN|Generic-Genome-Browser/trunk/docs/pod/GBROWSE_IMG.pod|GBROWSE_IMG.pod}}
+* {{SF_SVN|Generic-Genome-Browser/trunk/docs/pod/ORACLE_AND_POSTGRESQL.pod|ORACLE_AND_POSTGRESQL.pod}}
+* {{SF_SVN|Generic-Genome-Browser/trunk/docs/pod/README-lucegene.pod|README-lucegene.pod}}
- [annotation:database]
+Since these are in Perl POD format these files may contain formatting code when viewed in a Web browser.
- db_adaptor    = Bio::DB::Das::Chado
- db_args       = -dsn dbi:Pg:dbname=chado
-                 -user gmod
-                 -inferCDS 1
-                 -srcfeatureslice 1
- search options = default
-==Add a BAM data source==
+==Downloads==
- [bam_sample:database]
+=== Source Code Download (tar.gz file) ===
- db_adaptor     = Bio::DB::Sam
- db_args        = -fasta /var/www/gbrowse2/databases/pythium/scf1117875582023.fasta
-                  -bam   /var/www/gbrowse2/databases/pythium/simulated-sorted.bam
- search options = default
-==Add track defaults==
+Download the source from the [http://sourceforge.net/project/showfiles.php?group_id=27707 SourceForge download page].
- [TRACK DEFAULTS]
+=== Net-based Installer Script ===
- glyph       = generic
- database    = annotation
- height      = 8
- bgcolor     = cyan
- fgcolor     = black
- label density = 25
- bump density  = 100
-Note particularly the "database" entry--for most tracks we'll be using the annotation database, but the bam_sample data source will be available when we want it.
+The net installer script, called {{GitHub|GBrowse|master/bin/gbrowse_netinstall.pl|gbrowse_netinstall.pl at the GBrowse GitHub repository}} will automatically download and install GBrowse and its Perl libraries for you. See [[#Installation|Installation]] for details on using this script.
-==Add some tracks==
+=== SVN ===
- [Genes]
+There are often new features and bug fixes in the current development version which have not yet been released. To get the latest version, please use [[Subversion]] (SVN). The recommended branch to use is ''trunk'', which is usually stable:
- feature      = gene
- glyph        = gene
- ignore_sub_part = polypeptide
- #bgcolor      = yellow
- forwardcolor = yellow
- reversecolor = turquoise
- label        = sub { my $f = shift;
-                     my $name = $f->display_name;
-                     my @aliases = sort $f->attributes('Alias');
-                     $name .= " (@aliases)" if @aliases;
-                     $name;
-   }
- height       = 6
- description  = 0
- key          = Named gene
- [CDS]
- feature      = mRNA
- glyph        = cds
- description  = 0
- ignore_sub_part = polypeptide exon
- height       = 26
- sixframe     = 1
- label        = sub {shift->name . " reading frame"}
- key          = CDS
- citation     = This track shows CDS reading frames.
- [repeats]
- feature      = match:repeatmasker
- glyph        = generic
- bgcolor      = black
- key          = Repeats
- [ests]
- feature      = expressed_sequence_match
- glyph        = segments
- stranded     = 1
- bgcolor      = green
- key          = EST matches
- [proteins]
- feature      = protein_match
- glyph        = segments
- stranded     = 1
- bgcolor      = pink
- fgcolor      = red
- key          = protein matches
- [CoverageXyplot]
- feature        = coverage
- glyph          = wiggle_xyplot
- database       = bam_sample
- height         = 50
- fgcolor        = black
- bicolor_pivot  = 20
- pos_color      = blue
- neg_color      = red
- key            = Coverage (xyplot)
- [Reads]
- feature        = match
- glyph          = segments
- draw_target    = 1
- show_mismatch  = 1
- mismatch_color = red
- database       = bam_sample
- bgcolor        = blue
- fgcolor        = white
- height         = 5
- label density  = 50
- bump           = fast
- key            = Reads
- [Pair]
- feature       = read_pair
- glyph         = segments
- database      = bam_sample
- draw_target   = 1
- show_mismatch = 1
- bgcolor       = sub {
-                 my $f = shift;
-                 return $f->attributes('M_UNMAPPED') ? 'red' : 'green';
-                 }
- fgcolor       = green
- height        = 3
- label         = sub {shift->display_name}
- label density = 50
- bump          = fast
- connector     = dashed
- balloon hover = sub {
-                 my $f     = shift;
-                 return '' unless $f->type eq 'match';
-                 return 'Read: '.$f->display_name.' : '.$f->flag_str;
-                 }
- key           = Read Pairs
-==Add our new database to the GBrowse.conf==
+  svn co https://gmod.svn.sourceforge.net/svnroot/gmod/Generic-Genome-Browser/trunk Generic-Genome-Browser
-To let GBrowse know that there is a new database available, we have to add a few lines to GBrowse.conf.  Add this to the bottom:
+Once you have successfully checked out the Generic-Genome-Browser distribution, fetch recent changes by executing <code>svn update</code> inside the <code>Generic-Genome-Browser</code> directory.
- [pythium]
+You can also browse the {{SF_SVN|Generic-Genome-Browser|GBrowse SVN}}.
- description   = Pythium ultimum
- path          = pythium.conf
-===Updating SAMtools===
+==== 1.x Development Version ====
-The version of SAMtools may need to be updated.  Get the samtools release:
+The link above will get you to the GBrowse2 development version.  To get to the GBrowse 1.x development branch, use stable:
-   cd ~/Documents/Software
+   svn co https://gmod.svn.sourceforge.net/svnroot/gmod/Generic-Genome-Browser/branches/stable Generic-Genome-Browser
-  wget -O samtools-0.1.13.tar.bz2 http://sourceforge.net/projects/samtools/files/samtools/0.1.13/samtools-0.1.13.tar.bz2/download
-  tar jxvf samtools-0.1.13.tar.bz2
-  cd samtools-0.1.13
-  make
-Install Bio::DB::Sam:
+== About Databases ==
-  sudo cpan
+{{:GBrowse Adaptors}}
-  cpan> install Bio::DB::Sam
-when asked "Please enter the location of the bam.h and compiled libbam.a files:", answer:
+==Contacts==
-  /home/gmod/Documents/Software/samtools-0.1.13
+Please report bugs to the SourceForge [http://sourceforge.net/tracker/?func=add&group_id=27707&atid=391291 Bug Tracker] (select 'Category: Gbrowse').
-==Add semantic zooming for the BAM tracks==
+{{MailingListsFor|GBrowse}}
-Not doing this for very dense data (like BAM) is probably the number one performance killers for GBrowse; asking GBrowse to draw a track that has thousands of glyphs is time consuming (and ultimately, probably not very informative).
+== Logo ==
- [Reads:5001]
+The [[:Image:GBrowseLogo.png|GBrowse logo]] was created by [mailto:alexisnb1@yahoo.com Alex Read], a participant in the [[Spring 2010 Logo Program]], while a design student at [http://www.linn-benton.edu Linn-Benton Community College].
- feature        = coverage
- glyph          = wiggle_density
- height         = 15
- [Pair:5001]
- feature        = coverage
- glyph          = wiggle_density
- height         = 15
- bgcolor        = purple
-==Add "show summary" functionality==
+==References==
-For other tracks, when zoomed way out (100kb or 1MB), performance can similarly suffer, with a decreasing "information" content.  Newer versions of GBrowse provide the ability to automatically generate density plots when zoomed out.  This functionality is available from Chado and {{CPAN|Bio::DB::SeqFeature::Store}} data adaptors.  To prepare our Chado database to do this semantic zooming, we need to run a script that comes with Bio::DB::Das::Chado:
+==See Also==
-  cd ~/Documents/Software/gbrowse-adaptors/Chado
+{{GBrowse}}
-  svn update
-  perl bin/gmod_create_summary_statistics.pl
-and then add to the pythium.conf file, somewhere near the top (ie, not in the track definitions):
+<references/>
-  show summary = 99999
+[[Category:GBrowse]]
+[[Category:GMOD Components]]
-==Enabling full text searching==
-If we try searching for "<tt>gene 7.92</tt>", we'll get "Not Found" as a result, even though genemark-scf1117875582023-abinit-gene-7.92 does exist.  To look for partial strings, we need to enable full text searching.  To do so, we need to run another script that comes with Bio::DB::Das::Chado:
-  perl /home/gmod/Documents/Software/gbrowse-adaptors/Chado/bin/gmod_chado_fts_prep.pl
-This does several things (including poorly estimating how long it will take to finish), including creating materialized views, using a tool provided by [[gmod:Category:SGN|SOL Genomics Network (SGN)]].  In practice, it would be a good idea to read the documentation of <tt>gmod_materialized_view_tool.pl</tt> for information on keeping the view up to date.
-We also have to tell GBrowse that this Chado database can now do full text searching, by adding this to the Chado database stanza:
-  -fulltext 1
-Now we can search for "<tt>gene 7.92</tt>" and we'll find our gene (plus it's mRNA and exons) and we can click on the gene to see it in GBrowse.
-= Evaluation =
-{{Feedback}}
-{{NextSession|Apollo|Apollo}}