|
|
Line 1: |
Line 1: |
− | This [[JBrowse]] tutorial was presented by [[User:RobertBuels|Robert Buels]] at the [[2012 GMOD Summer School]] in August 2012.
| + | #REDIRECT [[JBrowse Tutorial 2013]] |
− | | + | |
− | {{Template:AMI Summer School day 2}}
| + | |
− | | + | |
− | | + | |
− | == Prerequisites ==
| + | |
− | | + | |
− | These have <b>already been set up</b> on the VM image.
| + | |
− | | + | |
− | Optional, for generating images from Wiggle files:
| + | |
− | * libpng12-0
| + | |
− | * libpng12-dev
| + | |
− | * a C++ compiler
| + | |
− | | + | |
− | Optional, for BAM files (<tt>setup.sh</tt> tries to install these for you in the JBrowse directory):
| + | |
− | * samtools, and its dependency libncurses5-dev
| + | |
− | * perl module: {{CPAN|Bio::DB::SAM}}
| + | |
− | | + | |
− | Other prerequisites are installed by JBrowse automatically.
| + | |
− | | + | |
− | <div class="dont">
| + | |
− | This is how they were installed: <b>(don't do this)</b>
| + | |
− | <pre class="dont">
| + | |
− | $ sudo apt-get install libpng12-0 libpng12-dev build-essential libncurses5-dev
| + | |
− | </pre>
| + | |
− | </div>
| + | |
− | | + | |
− | Make sure you can copy/paste from the wiki.
| + | |
− | | + | |
− | It's also very useful to know how to tab-complete in the shell.
| + | |
− | | + | |
− | == JBrowse Introduction ==
| + | |
− | | + | |
− | How and why [[JBrowse]] is different from most other web-based genome browsers, including [[GBrowse]].
| + | |
− | | + | |
− | More detail: [http://genome.cshlp.org/content/19/9/1630.full paper]
| + | |
− | | + | |
− | [[Media:JBrowse_gmod_aug2012.pdf|JBrowse presentation]]
| + | |
− | | + | |
− | == JBrowse Architecture ==
| + | |
− | [[Image:Jbrowse_arch.png|||600px]]
| + | |
− | | + | |
− | == Setting up JBrowse ==
| + | |
− | | + | |
− | === Getting JBrowse ===
| + | |
− | | + | |
− | * prepare a directory for JBrowse
| + | |
− | | + | |
− | $ <span class="enter">cd /var/www</span>
| + | |
− | $ <span class="enter">sudo mkdir jbrowse_demo</span>
| + | |
− | $ <span class="enter">sudo chown ubuntu.ubuntu jbrowse_demo</span>
| + | |
− | $ <span class="enter">cd jbrowse_demo</span>
| + | |
− | | + | |
− | * download the demo bundle from jbrowse.org and unzip it
| + | |
− | | + | |
− | $ <span class="enter">wget http://jbrowse.org/info/GMOD_Aug_2012/GMOD_Summer_School_2012_JBrowse.zip</span>
| + | |
− | $ <span class="enter">unzip GMOD_Summer_School_2012_JBrowse.zip</span>
| + | |
− | $ <span class="enter">unzip JBrowse-1.6.0-min.zip</span>
| + | |
− | $ <span class="enter">mv JBrowse-1.6.0-min jbrowse</span>
| + | |
− | | + | |
− | * run <code>setup.sh</code> to configure this copy of JBrowse
| + | |
− | | + | |
− | $ <span class="enter">cd jbrowse</span>
| + | |
− | $ <span class="enter">./setup.sh</span>
| + | |
− | | + | |
− | === Starting Point ===
| + | |
− | | + | |
− | Visit in web browser: <nowiki>http://</nowiki>{{Template:AWSurl}}/jbrowse_demo/jbrowse/
| + | |
− | | + | |
− | You should see just a blank white page.
| + | |
− | | + | |
− | === Basic Steps ===
| + | |
− | | + | |
− | Setting up a JBrowse instance with feature data takes three basic steps:
| + | |
− | | + | |
− | # Specify reference sequences
| + |
− | # Load feature data
| + | |
− | # Index feature names
| + | |
− | | + | |
− | <!--
| + | |
− | === If you didn't follow along in the chado session ===
| + | |
− | | + | |
− | We'll be using the chado database from the chado session; if you didn't follow along exactly, re-load the database like so:
| + | |
− | | + | |
− | <pre>
| + | |
− | $ dropdb chado
| + | |
− | $ createdb chado
| + | |
− | $ bzip2 -cd ~/Documents/Software/schema/chado/complete_db.bz2 | psql chado
| + | |
− | </pre>
| + | |
− | -->
| + | |
− | | + | |
− | === Data from a directory of files ===
| + | |
− | | + | |
− | Here, we'll use the {{CPAN|Bio::DB::SeqFeature::Store}} adaptor in "memory" mode to read a directory of files. There are adaptors available for use with many other databases, such as [[Chado]] and {{CPAN|Bio::DB::GFF}}.
| + | |
− | | + | |
− | Config file:
| + | |
− | <tt>pythium-1.conf</tt>
| + | |
− | <pre>
| + | |
− | {
| + | |
− | "description": "GMOD Summer School 2012 P. ultima Example",
| + | |
− | "db_adaptor": "Bio::DB::SeqFeature::Store",
| + | |
− | "db_args" : {
| + | |
− | "-adaptor" : "memory",
| + | |
− | "-dir" : ".."
| + | |
− | },
| + | |
− | ...
| + | |
− | </pre>
| + | |
− | | + | |
− | ==== Specify reference sequences ====
| + | |
− | | + | |
− | The first script to run is <tt>bin/prepare-refseqs.pl</tt>, which tells JBrowse what your reference sequences are. Running <tt>bin/prepare-refseqs.pl</tt> also sets up the "DNA" track.
| + | |
− | | + | |
− | Run this from within the <tt>jbrowse</tt> directory (you could run it elsewhere, but you'd have to explicitly specify the location of the data directory on the command line).
| + | |
− | | + | |
− | $ <span class="enter">cd /var/www/jbrowse_demo/jbrowse</span>
| + | |
− | $ <span class="enter">bin/prepare-refseqs.pl --gff ../scf1117875582023.gff</span>
| + | |
− | | + | |
− | Refresh the page in your web browser; you should now see the JBrowse UI and a sequence track, which shows the DNA base pairs if you zoom in far enough.
| + | |
− | | + | |
− | ==== Load Feature Data ====
| + | |
− | | + | |
− | Next, we'll use <tt>biodb-to-json.pl</tt> to get feature data out of the database and turn it into [[Glossary#JSON|JSON]] data that the web browser can use.
| + | |
− | | + | |
− | In this case, we have specified all of our track configurations in <code>pythium-1.conf</code>.
| + | |
− | | + | |
− | <syntaxhighlight lang="javascript">...
| + | |
− | | + | |
− | "TRACK DEFAULTS": {
| + | |
− | "class": "feature"
| + | |
− | },
| + | |
− | | + | |
− | "tracks": [
| + | |
− | {
| + | |
− | "track": "Genes",
| + | |
− | "key": "Genes",
| + | |
− | "feature": ["mRNA"],
| + | |
− | "autocomplete": "all",
| + | |
− | "class": "transcript",
| + | |
− | "subfeature_classes" : {
| + | |
− | "CDS" : "transcript-CDS",
| + | |
− | "UTR" : "transcript-UTR"
| + | |
− | },
| + | |
− | "arrowheadClass" : "arrowhead"
| + | |
− | },
| + | |
− | ...
| + | |
− | ]</syntaxhighlight>
| + | |
− | | + | |
− | <tt>track</tt> specifies the track identifier (a unique name for the track, for the software to use). This should be just letters and numbers and - and _ characters; using other characters makes things less convenient.
| + | |
− | | + | |
− | <tt>key</tt> specifies a human-friendly name for the track, which can use any characters you want.
| + | |
− | | + | |
− | <tt>feature</tt> gives a list of feature types to include in the track.
| + | |
− | | + | |
− | <tt>autocomplete</tt>, when included, makes the features in the track searchable.
| + | |
− | | + | |
− | <tt>urltemplate</tt> specifies a URL pattern that you can use to link genomic features to specific web pages.
| + | |
− | | + | |
− | <tt>class</tt> specifies the [[Glossary#CSS|CSS]] class that describes how the feature should look.
| + | |
− | | + | |
− | For this particular track, I've specified the <tt>transcript</tt> feature class.
| + | |
− | | + | |
− | Run the <tt>bin/biodb-to-json.pl</tt> script with this config file to format this track, and the others in the file:
| + | |
− | | + | |
− | $ <span class="enter">bin/biodb-to-json.pl --conf ../pythium-1.conf</span>
| + | |
− | | + | |
− | Refresh JBrowse in your web browser. You should now see a bunch of annotation tracks.
| + | |
− | | + | |
− | ==== Index feature names ====
| + | |
− | | + | |
− | When you generate JSON for a track, if you specify <tt>"autocomplete"</tt> then a listing of all of the feature names from that track (along with feature locations) will also be generated and used to provide feature searching and autocompletion.
| + | |
− | | + | |
− | The <tt>bin/generate-names.pl</tt> script collects those lists of names from all the tracks and combines them into one big tree that the client uses to search.
| + | |
− | | + | |
− | $ <span class="enter">bin/generate-names.pl -v</span>
| + | |
− | | + | |
− | Visit JBrowse in your web browser and try typing a feature name, such as '''maker-scf1117875582023-snap-gene-0.26-mRNA-1'''. Notice that JBrowse tries to auto-complete what you type.
| + | |
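The three commands used so far can be collected into one small script so that the data directory can be rebuilt in a single step. This is only a sketch: the function names <tt>run</tt> and <tt>format_pythium_demo</tt> and the <tt>DRY_RUN</tt> variable are made up for illustration, and the script assumes it is run from within the <tt>jbrowse</tt> directory of the demo bundle.

```shell
# Sketch: the three formatting steps from this tutorial, in order.
# Set DRY_RUN=1 to print the commands instead of executing them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

format_pythium_demo() {
    # 1. Specify reference sequences (this also sets up the "DNA" track).
    run bin/prepare-refseqs.pl --gff ../scf1117875582023.gff || return 1
    # 2. Load feature data for the tracks listed in the config file.
    run bin/biodb-to-json.pl --conf ../pythium-1.conf || return 1
    # 3. Index feature names for searching and autocompletion.
    run bin/generate-names.pl -v || return 1
}

# Usage, from /var/www/jbrowse_demo/jbrowse:
#   DRY_RUN=1 format_pythium_demo   # show what would be run
#   format_pythium_demo             # actually run the three steps
```

Stopping at the first failing step keeps a later step from running against incomplete data.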
− | | + | |
− | === Data from flat files ===
| + | |
− | | + | |
− | We're going to add a couple more tracks that come from <tt>repeats.gff</tt>, a different flat file.
| + | |
− | | + | |
− | ==== Features ====
| + | |
− | | + | |
− | To get feature data from flat files into JBrowse, use <tt>flatfile-to-json.pl</tt>.
| + | |
− | | + | |
− | * We'll add a RepeatMasker track:
| + | |
− | | + | |
− | $ <span class="enter">bin/flatfile-to-json.pl --trackLabel repeatmasker \
| + | |
− | --type match:repeatmasker --getType --getSubfeatures --key RepeatMasker \
| + | |
− | --arrowheadClass arrowhead --className generic_parent \
| + | |
− | --subfeatureClasses '{"match_part" : "feature"}' --gff ../repeats.gff</span>
| + | |
− | | + | |
− | * And then a RepeatRunner track:
| + | |
− | | + | |
− | $ <span class="enter">bin/flatfile-to-json.pl --trackLabel repeatrunner \
| + | |
− | --type protein_match:repeatrunner --getType --getSubfeatures \
| + | |
− | --key RepeatRunner --arrowheadClass arrowhead --className generic_parent \
| + | |
− | --subfeatureClasses '{"match_part" : "feature"}' --gff ../repeats.gff</span>
| + | |
− | | + | |
− | Visit in web browser; you should see the two new RepeatMasker and RepeatRunner tracks.
| + | |
− | | + | |
− | ==== BAM data ====
| + | |
− | | + | |
− | Now let's add some simulated short-read alignments from a BAM file. To import data from a BAM source:
| + | |
− | | + | |
− | $ <span class="enter"> bin/bam-to-json.pl \
| + | |
− | --bam ../simulated-sorted.bam \
| + | |
− | --tracklabel BAM_data --key "BAM Data"</span>
| + | |
− | | + | |
− | === Quantitative data ===
| + | |
− | | + | |
− | ==== BigWig ====
| + | |
− | | + | |
− | JBrowse can display quantitative data directly from a BigWig file on your web server. Simply place the BigWig file in a directory accessible to your web server, and add a snippet of configuration to JBrowse to add the track, similar to:
| + | |
− | | + | |
− | <syntaxhighlight lang="javascript">
| + | |
− | {
| + | |
− | "label" : "bam_coverage",
| + | |
− | "key" : "BAM coverage",
| + | |
− | "storeClass" : "BigWig",
| + | |
− | "urlTemplate" : "../../simulated-sorted.bam.coverage.bw",
| + | |
− | "type" : "Wiggle",
| + | |
− | "variance_band" : true
| + | |
− | }
| + | |
− | </syntaxhighlight>
| + | |
− | | + | |
− | This can be added by either editing the <tt>data/trackList.json</tt> file with a text editor, or by running something like this at the command line to inject the track configuration:
| + | |
− | | + | |
− | $ <span class="enter">echo ' {
| + | |
− | "label" : "bam_coverage",
| + | |
− | "key" : "BAM coverage",
| + | |
− | "storeClass" : "BigWig",
| + | |
− | "urlTemplate" : "../../simulated-sorted.bam.coverage.bw",
| + | |
− | "type" : "Wiggle",
| + | |
− | "variance_band" : true
| + | |
− | } ' | bin/add-track-json.pl data/trackList.json</span>
| + | |
− | | + | |
− | ==== Tiled Images ====
| + | |
− | | + | |
− | JBrowse also has a formatter that converts wiggle-format data to image tiles. JBrowse does this with a C++ program that <code>setup.sh</code> attempts to compile for you.
| + | |
− | | + | |
− | There isn't any Pythium wiggle example data for this class, but the command to make image tiles from a wiggle file takes the form:
| + | |
− | | + | |
− | $ <span class="enter">bin/wig-to-json.pl --wig /path/to/wiggle.wig \
| + | |
− | --tracklabel "coverage_wig" --key "Wiggle Coverage" --min 0 --max 50</span>
| + | |
− | | + | |
− | === Faceted Track Selection ===
| + | |
− | | + | |
− | JBrowse has a new, very powerful faceted track selector that can be used to search for tracks using metadata associated with them.
| + | |
− | | + | |
− | The track metadata is kept in a CSV-format file, with any number of columns, and with a "label" column whose contents must correspond to the track labels in the JBrowse configuration.
| + | |
− | | + | |
− | The demo bundle contains an example <tt>trackMetadata.csv</tt> file, which can be copied into the <tt>data</tt> directory for use with this configuration.
| + | |
− | | + | |
− | $ <span class="enter">cp trackMetadata.csv jbrowse/data</span>
| + | |
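Because the "label" column must correspond to the track labels in the JBrowse configuration, it can help to list any metadata rows that do not match a configured track. The sketch below makes several assumptions: the helper names <tt>csv_labels</tt> and <tt>unmatched_labels</tt> are hypothetical, the CSV is simple (no quoted fields containing commas), and the match against <tt>trackList.json</tt> is a crude text search for the quoted label rather than a real JSON lookup.

```shell
# Hypothetical helper: print the contents of the "label" column of a simple CSV.
csv_labels() {
    awk -F, '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "label") col = i; next }
        col     { print $col }
    ' "$1"
}

# Hypothetical helper: print every CSV label that does not appear,
# in double quotes, anywhere in the given track list file.
unmatched_labels() {
    csv_labels "$1" | while IFS= read -r label; do
        grep -qF "\"$label\"" "$2" || echo "$label"
    done
}

# Usage, from the jbrowse_demo directory:
#   unmatched_labels jbrowse/data/trackMetadata.csv jbrowse/data/trackList.json
```

No output means every label in the metadata file was found somewhere in the track list.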
− | | + | |
− | Then a simple faceted track selection configuration might look like:
| + | |
− | | + | |
− | <syntaxhighlight lang="javascript">
| + | |
− | trackSelector: {
| + | |
− | type: 'Faceted',
| + | |
− | },
| + | |
− | trackMetadata: {
| + | |
− | sources: [
| + | |
− | { type: 'csv', url: 'data/trackMetadata.csv' }
| + | |
− | ]
| + | |
− | }
| + | |
− | </syntaxhighlight>
| + | |
− | | + | |
− | The <tt>jbrowse_conf.json</tt> file in the <tt>jbrowse</tt> directory already conveniently contains this stanza, commented out. Uncomment it, refresh your browser, and you should now see the faceted track selector activated.
| + | |
− | | + | |
− | == Upgrading an Existing JBrowse ==
| + | |
− | | + | |
− | If the old JBrowse is 1.3.0 or later, simply move the data directory from the old JBrowse directory into the new JBrowse directory.
| + | |
− | | + | |
− | == Common Problems ==
| + | |
− | | + | |
− | * JSON syntax errors
| + | |
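One way to catch a JSON syntax error before refreshing the browser is to run the file through a strict parser. The sketch below assumes <tt>python3</tt> is installed and uses its standard <tt>json.tool</tt> module; the function name <tt>check_json</tt> is hypothetical. A strict parser also rejects comments, so a file that still contains commented-out stanzas would be reported as an error by this check.

```shell
# Hypothetical helper: report whether a file parses as strict JSON.
check_json() {
    # python3 -m json.tool exits non-zero when the file cannot be parsed.
    if python3 -m json.tool "$1" > /dev/null 2>&1; then
        echo "OK: $1"
    else
        echo "JSON syntax error: $1"
        return 1
    fi
}

# Usage, from within the jbrowse directory:
#   check_json data/trackList.json
```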
− | | + | |
− | | + | |
− | == Future JBrowse Plans ==
| + | |
− | | + | |
− | See the [[Media:JBrowse_gmod_aug2012.pdf|accompanying slides (PDF)]]
| + | |
− | | + | |
− | | + | |
− | == Other links ==
| + | |
− | | + | |
− | * Config file ref: http://jbrowse.org/code/jbrowse-master/docs/config.html
| + | |
− | * DIV test: http://jbrowse.org/test/boatdiv/boat.html
| + | |
| | | |
| [[Category:Tutorials]] | | [[Category:Tutorials]] |
| [[Category:JBrowse]] | | [[Category:JBrowse]] |
| [[Category:2012 Summer School]] | | [[Category:2012 Summer School]] |