User talk:RobertBuels

From GMOD
Revision as of 20:30, 12 June 2012 by RobertBuels (Talk | contribs)

(add on JBrowse)

JBrowse Configuration Guide

Anonymous Usage Statistics

JBrowse instances report usage statistics to the JBrowse developers. This data is very important to the JBrowse project, since it is used to make the case to grant agencies for continuing to fund JBrowse development. No research data is transmitted: the data collected is limited to standard Google Analytics, along with a count of how many tracks the JBrowse instance has, how many reference sequences are present, their average length, and what types of tracks (wiggle, feature, etc.) are present. Users can disable usage statistics by setting suppressUsageStatistics: true in the JBrowse configuration.
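As a minimal sketch, the opt-out might look like the following. The variable name `config` is hypothetical and stands in for wherever your installation keeps its top-level JBrowse configuration; only the suppressUsageStatistics key comes from the text above.

```javascript
// Hypothetical top-level configuration object; only the
// suppressUsageStatistics key is taken from this guide.
const config = {
    suppressUsageStatistics: true  // opt out of anonymous usage reporting
};
```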

Configuring Faceted Track Selection

Starting with version 1.4.0, JBrowse has an advanced "faceted" track selector tailored for sites with hundreds or thousands of tracks in a single JBrowse instance. This track selector allows users to interactively search for the tracks they are interested in using metadata that is associated with each track.

An example of a faceted track selector in action with about 1,800 tracks can be seen here. This is an example installation containing a snapshot of modENCODE track metadata. Note that the track data and reference sequences in this example are not real (they are all just copies of the same volvox test track); the installation exists only to demonstrate the faceted track selector.

Configuring the Track Selector

To enable the faceted track selector, set trackSelector.type to Faceted in the JBrowse configuration. The Faceted track selector takes all sources of track metadata, aggregates them, and makes the tracks searchable using this metadata. By default, tracks have only a few metadata facets, which come from the track configuration itself. After initially turning on the faceted track selector, most users will want to add their own metadata for the tracks: see Defining Track Metadata below.

There are some other configuration variables that can be used to customize operation of the track selector:

Option Value
trackSelector.displayColumns Array of which facets should be displayed as columns in the track list. Columns are displayed in the order given. If not provided, all facets will be displayed as columns, in lexical order.
trackSelector.renameFacets Object containing "display names" for some or all of the facet names. For example, setting this to { 'developmental-stage': 'Conditions' } would display "Conditions" as the name of the developmental-stage facet.

Example

 trackSelector: {
     type: 'Faceted',
     displayColumns: ['key', 'organism', 'technique', 'target', 'factor', 'developmental-stage','principal_investigator','submission' ],
     renameFacets: { 'developmental-stage': 'Conditions', submission: 'Submission ID' }
 }
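The interaction between displayColumns and renameFacets can be sketched as below. This is an illustration of the behavior described in the table, not JBrowse's actual implementation: the function name `columnHeadings` is hypothetical, and it assumes that "lexical order" means JavaScript's default string sort applied to the raw facet names rather than to their display names.

```javascript
// Illustrative only: NOT JBrowse's own code.
// facetNames: every facet name found in the aggregated metadata.
// trackSelector: the trackSelector section of the configuration.
function columnHeadings(facetNames, trackSelector) {
    // Use displayColumns in the order given; otherwise fall back to
    // all facets in lexical order (assumed: default string sort).
    const columns = trackSelector.displayColumns || facetNames.slice().sort();
    const rename = trackSelector.renameFacets || {};
    // Show the configured display name if there is one, else the raw name.
    return columns.map(name =>
        Object.prototype.hasOwnProperty.call(rename, name) ? rename[name] : name
    );
}

const headings = columnHeadings(
    ['organism', 'key', 'developmental-stage'],
    { type: 'Faceted', renameFacets: { 'developmental-stage': 'Conditions' } }
);
// headings is ['Conditions', 'key', 'organism']
```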

Defining Track Metadata

To add your own track metadata to JBrowse, add a trackMetadata section to the JBrowse configuration.

JBrowse currently supports track metadata in Excel-compatible comma-separated-value (CSV) format, but additional track metadata backends are relatively easy to add. Write to the JBrowse mailing list if you have a strong need for another format for track metadata.

Option Value
trackMetadata.sources Array of source definitions, each of which takes the form { type: 'csv', url: '/path/to/file' }. The url is interpreted as relative to the url of the page containing JBrowse (index.html in default installations). Source definitions can also contain a class to explicitly specify the JavaScript backend used to handle this source.
trackMetadata.filterFacets Array of facet names that should be the only ones made searchable. This can be used to improve the speed and memory footprint of JBrowse on the client by skipping the indexing of unused metadata facets.

Example

 trackMetadata: {
     filterFacets: [ 'category','organism','target','technique','principal_investigator',
                     'factor','developmental-stage','strain','cell-line','tissue','compound',
                     'temperature'
                   ],
     sources: [
          { type: 'csv', url:  'myTrackMetaData.csv' }
     ]
 }
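To see how a source url resolves relative to the page containing JBrowse, the standard URL API can be used as a rough check. The page address below is a hypothetical placeholder, and this sketch only demonstrates ordinary relative-URL resolution; it makes no claim about how JBrowse resolves the url internally.

```javascript
// Hypothetical page address; substitute the URL of your own JBrowse page.
const pageUrl = 'http://example.org/jbrowse/index.html';

// A bare file name resolves alongside index.html ...
const relative = new URL('myTrackMetaData.csv', pageUrl).href;
// ... while a leading slash resolves from the server root.
const rooted = new URL('/path/to/file', pageUrl).href;

console.log(relative); // http://example.org/jbrowse/myTrackMetaData.csv
console.log(rooted);   // http://example.org/path/to/file
```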

Name Searching and Autocompletion

The JBrowse search box auto-completes the names of features and reference sequences that are typed into it. After loading all feature and reference sequence data into a JBrowse instance (with prepare-refseqs.pl, flatfile-to-json.pl, etc.), generate-names.pl must be run to build the indexes used for name searching and autocompletion.

Autocompletion Configuration

Most users will not need to configure any of these variables.

Option Value
autocomplete.stopPrefixes Array of word-prefixes for which autocomplete will be disabled. For example, a value of ['foo'] will prevent autocompletion when the user has typed 'f', 'fo', or 'foo', but autocompletion will resume when the user types any additional characters.
autocomplete.resultLimit Maximum number of autocompletion results to display. Defaults to 15.
autocomplete.tooManyMatchesMessage Message displayed in the autocompletion dropdown when more than autocomplete.resultLimit matches are found. Defaults to 'too many matches to display'.
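The options above might be combined as in the following sketch. The values are illustrative, and the helper `isSuppressed` is hypothetical: it mirrors the stopPrefixes behavior described in the table (assuming a case-sensitive comparison) and is not JBrowse's own code.

```javascript
// Illustrative values; only the option names come from the table above.
const autocomplete = {
    stopPrefixes: ['foo'],
    resultLimit: 15,
    tooManyMatchesMessage: 'too many matches to display'
};

// Hypothetical helper: autocompletion is suppressed while the typed
// text is a prefix of (or equal to) one of the stop prefixes.
function isSuppressed(typed, stopPrefixes) {
    return stopPrefixes.some(prefix => prefix.startsWith(typed));
}

console.log(isSuppressed('fo', autocomplete.stopPrefixes));   // true
console.log(isSuppressed('foo', autocomplete.stopPrefixes));  // true
console.log(isSuppressed('foob', autocomplete.stopPrefixes)); // false
```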

generate-names.pl

This script builds indexes of features by label (the visible name below a feature in JBrowse) and/or by alias (a secondary name that is not visible in the web browser, but may be present in the JSON used by JBrowse).

To search for a term, type it in the autocompleting text box at the top of the JBrowse window.

Basic syntax:

bin/generate-names.pl [options]

Note that generate-names.pl does not require any arguments. However, some options are available:

Option Value
dir A path to the output directory (default is 'data/names' in the current directory).
thresh A lower bound on the Patricia trie chunk size. Specifically, the lowest possible chunk size is (thresh + 1). The default value is 200. In this context, a chunk is a group of connected Patricia trie nodes that can be visualized as a single entity, and the chunk size is the total number of genomic features contained in a chunk. The lower the value of thresh, the more chunks there will be.
verbose This setting causes information about the division of nodes into chunks to be printed to the screen.