Difference between revisions of "User talk:RSCummings"

Revision as of 01:04, 22 July 2011

More details about certain JBrowse script options

cssClass

Every built-in JBrowse track type is defined as a class in the file 'genome.css'. This file can be edited as a regular text file, and classes can be added, changed, or removed (although direct editing is not necessary; it is possible to use the clientConfig option to avoid changing genome.css). The argument to the cssClass option is the name of a genome.css class. As of JBrowse v1.2.1, these classes are available:

Class Name | Directional? | Other Details
basic | No | Useful when it is desirable for the track appearance to be defined entirely by clientConfig.
cds0 | Yes |
cds1 | Yes |
cds2 | Yes |
dblhelix | No | A black double helix outline is superimposed over the light red background.
est | No |
exon | No |
feature | Yes |
feature2 | Yes |
feature3 | Yes |
feature4 | Yes | Pacman design.
feature5 | Yes |
generic_parent | No |
generic_part_a | No |
helix | No | A single black coil outline is superimposed over a faint green background.
transcript | No |
transcript-CDS | No |
transcript-exon | No |
transcript-five_prime_UTR | No | Identical to transcript-exon.
transcript-three_prime_UTR | No | Identical to transcript-exon.
transcript-UTR | No | Identical to transcript-exon.
triangle | No |

subfeatureClasses

In order to make subfeatures appear in JBrowse, it is necessary to assign a genome.css class to them. This is done with an association list in JSON syntax, where each key is a type of subfeature (e.g. CDS, UTR, match_part, mRNA), and each value is the genome.css class that determines the appearance of that subfeature.

As an example, the '--subfeatureClasses' argument to flatfile-to-json might look something like '{ "CDS" : "transcript-CDS", "UTR" : "transcript-UTR" }'. This could be rewritten as:

'{
   "CDS" : "transcript-CDS",
   "UTR" : "transcript-UTR"  
 }'

This second format makes the JSON structure more obvious, but the first format is easier to use as a command line argument.
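
If you would rather keep the readable form in a script, one option is to store it in a shell variable and collapse it to a single line before passing it on. This is only a minimal sketch; the variable names are illustrative and are not part of JBrowse.

```shell
# Keep the readable, multi-line JSON in a variable (names are illustrative).
subfeature_classes='{
   "CDS" : "transcript-CDS",
   "UTR" : "transcript-UTR"
}'

# Remove the line breaks, then squeeze runs of spaces into single spaces.
compact=$(printf '%s' "$subfeature_classes" | tr -d '\n' | tr -s ' ')

printf '%s\n' "$compact"
```

Since JSON ignores whitespace between tokens, either "$subfeature_classes" or "$compact" (in double quotes) should be usable as the argument to --subfeatureClasses.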

urlTemplate

Note: To follow along with the next few examples, switch to the jbrowse directory, then do:

bin/prepare-refseqs.pl --fasta docs/tutorial/data_files/volvox.fa

urlTemplate can be used to link features to an external website. As a simple example, this call to flatfile-to-json.pl will link every feature in the track to google.com:

bin/flatfile-to-json.pl --urlTemplate http://www.google.com --type remark --gff docs/tutorial/data_files/volvox.gff3 --tracklabel same_URL

When used in this way, urlTemplate is not very useful, because every feature in the track links to the same url.

In order to make urlTemplate link different features to different urls, try this example:

bin/flatfile-to-json.pl --urlTemplate http://www.google.com/search?q={name} --getLabels --type remark --gff docs/tutorial/data_files/volvox.gff3 --tracklabel unique_URLs

Now, for any given feature, clicking on the link causes Google to be queried for that feature's name. With the correct website, this function could be used to link each feature to an annotation page that specifically describes it.

In order to understand how this works, it is necessary to understand a few aspects of the output JSON. Before reading any further, open the first output JSON file with a pager program such as less:

less data/tracks/ctgA/same_URL/trackData.json

Kind of ugly, no? By default, the JSON file is as compact as possible, containing the minimum amount of data necessary for the JavaScript to render the track. Locate the "headers" key toward the beginning of the text. This key should be associated with the JSON array: ["start","end","strand","id"]. These are the elements of feature data that are preserved by default. Further down, following the "featureNCList" key, you will find the actual feature data that corresponds to these columns.
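
If scrolling through the pager is tedious, the "headers" entry can probably be pulled out with grep instead. The sketch below assumes that the file is written without whitespace inside the "headers" entry; the function name show_headers and the sample string are made up for illustration and are not real JBrowse output.

```shell
# Print only the "headers" entry from compact JSON.
# Reads the files given as arguments, or standard input if there are none.
show_headers() {
    grep -o '"headers":\[[^]]*\]' "$@"
}

# A made-up stand-in for the start of a trackData.json file.
sample='{"headers":["start","end","strand","id"],"featureNCList":[]}'

printf '%s\n' "$sample" | show_headers
```

On a real track, the call would be something like: show_headers data/tracks/ctgA/same_URL/trackData.json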

Now, open 'data/tracks/ctgA/unique_URLs/trackData.json' with your pager program.

Once again, locate the "headers" array. You will notice that, in addition to the columns that were previously present, there is now an additional one called "name". This is present because of the --getLabels switch that was used in the second call to flatfile-to-json.pl.

Here's how urlTemplate works: by putting any of the headers from the output JSON files in curly braces, the values in the header's column are substituted for each individual feature. In the case of this example, the names of each feature are substituted. You could have also substituted values from the "start" column, or from any of the other columns, e.g.:

... --urlTemplate http://www.google.com/search?q={start} ...
... --urlTemplate http://www.google.com/search?q={id} ...

This can be very useful in conjunction with extraData, which makes it possible to create additional headers from data that would normally be ignored by the program.
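
The substitution described above can be imitated in a few lines of shell. This is only an illustration of the idea, not the code JBrowse actually uses; the function name expand_url_template and the feature values (1050, EDEN) are hypothetical.

```shell
# Replace each {header} placeholder in a template with a feature's value.
# Usage: expand_url_template TEMPLATE header=value [header=value ...]
# Caveat: values containing "|" or "&" would confuse this simple sed call.
expand_url_template() {
    url=$1
    shift
    for pair in "$@"; do
        key=${pair%%=*}      # text before the first "="
        value=${pair#*=}     # text after the first "="
        url=$(printf '%s' "$url" | sed "s|{$key}|$value|g")
    done
    printf '%s\n' "$url"
}

# One hypothetical feature, described by two of its header values.
expand_url_template 'http://www.google.com/search?q={name}' start=1050 name=EDEN
```

Changing {name} to {start} in the template would substitute the feature's start value instead.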

extraData

The extraData option is used to incorporate additional data from the data source into the output JSON. In particular, it is most useful when used with the urlTemplate option, because the data it extracts can be used to query an online annotation database.

The argument for the extraData option is a JSON association list, where the keys are names (strings) and the values are Perl subroutine definitions (also strings). By the way, a subroutine is just another name for a function. The Perl subroutine can be anything, and it gets evaluated for each feature that will be in the JBrowse track.

To convince yourself of this, switch to your jbrowse directory, and try the following:

$ bin/prepare-refseqs.pl --fasta docs/tutorial/data_files/volvox.fa
$ bin/flatfile-to-json.pl --gff docs/tutorial/data_files/volvox.gff3 --tracklabel ExtraData_NoTrackChanges --type mRNA --extraData '{ "empty_column" : "sub { print(\"$0 is invoking the subroutine you defined.\\n\") }"}' 

The message is printed four times, because there are four features whose type is 'mRNA'. For this simple example, the subroutine did not return anything. Normally, it will return data specific to each feature, and this will be silent (unless you specifically tell it to print that data).

Let's take a step back for a moment. The ability to extract feature data from the data structures in the underlying code suggests that we will need to understand how the data is stored in those structures. After a few minor simplifications, this is what the data structure of each feature object looks like:

{
  "attributes" => {
    # attributes are optional; the ones listed here may or may not be defined for a given feature.
    # also, there could be any number of additional attributes.
    "load_id" => [<list of strings>],
    "parent_id" => [<list of strings>],
    "Alias" => [<list of strings>],
    "Note" => [<list of strings>],
    ...
  },
  "ref" => <string>,
  "type" => <string>,
  "name" => <string>,
  "phase" => <number>,
  "score" => <number>,
  "start" => <number>,
  "stop" => <number>,
  "strand" => <number>
}

When the extraData subroutine is invoked, it is invoked with a feature object (i.e. the data structure shown above) as the only argument.

To get the type for each feature, one could do:

--extraData '{ "the_type" : "sub { return $_[0]->{\"type\"}; }" }'  

or, equivalently,

--extraData '{ "the_type" : "sub { shift->{\"type\"}; }" }'  

I will use the first syntax, since I think it is more intuitive. $_[0] is a reference to the first argument to the subroutine (the feature object), and the arrow pointing to the curly braced, escaped string ("type") gets the data associated with that string from the feature object. That data is then returned.

Are you familiar with the --getType option from flatfile-to-json.pl? This is doing almost exactly the same thing as --getType. The only difference is that I have chosen to refer to the extracted data as "the_type", and when it is done through --getType, the data is referred to simply as "type". I have only used this type extraction example for demonstration purposes; when you actually want to get the type, you should use --getType because it is more succinct and more easily understood.

Now, let's try to do something useful with --extraData that cannot be done with any other option. We are going to extract an attribute.

Here's the command to extract the load_id attribute:

--extraData '{ "load_id" : "sub { return $_[0]->{\"attributes\"}->{\"load_id\"}[0]; }" }' 

It turns out that there is a somewhat cleaner way of doing this:

--extraData '{ "load_id" : "sub { return $_[0]->attributes(\"load_id\"); }" }'

Now that we have the load_id for every feature, how can we use it? The most useful way to use it is with the --urlTemplate option.
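
As a sketch of how the two options might be combined, the snippet below puts the extraData key from above into a urlTemplate placeholder. It is a dry run: the helper print_args (a made-up name) prints each argument on its own line instead of calling flatfile-to-json.pl, so you can check that the quoting survives the shell. The track label load_id_URLs is also made up, and the sketch does not check whether your features actually carry a load_id attribute.

```shell
# Single quotes stop the shell from expanding $_[0] or touching the \" escapes.
extra_data='{ "load_id" : "sub { return $_[0]->attributes(\"load_id\"); }" }'

# {load_id} refers to the header that the extraData key above would create.
url_template='http://www.google.com/search?q={load_id}'

# Dry run: print one argument per line instead of running the real script.
print_args() {
    for arg in "$@"; do
        printf '%s\n' "$arg"
    done
}

print_args --gff docs/tutorial/data_files/volvox.gff3 \
    --tracklabel load_id_URLs \
    --extraData "$extra_data" \
    --urlTemplate "$url_template"
```

Replacing print_args with bin/flatfile-to-json.pl would run the command for real.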

I have shown the most basic case, where it is desirable to get data from each feature object and then immediately return it as is. Although this is outside the scope of this tutorial, with some knowledge of Perl, it would be straightforward to extend this case to map another subroutine over the data, or to use a lookup table to convert one type of identifier to another.

One final word of caution. When you use the extraData option, the track's data files must grow to accommodate the extra data. This can be thought of as inserting an additional column into a table: the size increase is on the order of the number of features multiplied by the average size of the data in each slot of that column. Unnecessary use of extraData will needlessly increase the amount of time needed to transfer data from the server to a client, so try not to use more space than you need.
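
To get a feel for the numbers, the estimate can be written as a one-line calculation. The function name and the figures (20000 features, 12 bytes per value) are hypothetical, and the result ignores the few extra characters of JSON punctuation around each value.

```shell
# Rough size increase in bytes: number of features times average value size.
estimate_extra_bytes() {
    feature_count=$1
    avg_value_bytes=$2
    echo $(( feature_count * avg_value_bytes ))
}

estimate_extra_bytes 20000 12   # prints 240000
```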

clientConfig

Non-quantitative tracks have a few different types of representations. There is the traditional feature representation, where each feature is visible as a horizontal bar extending from its start to end positions on the chromosome, and there is also a histogram representation, where the density of features in each region of the chromosome is shown with a histogram. Normally, when zoomed in on a chromosome, the feature representation is used, and after zooming out to a certain extent, it is replaced with the histogram representation. The clientConfig option can be used to alter the appearance of either of these representations, and can also be used to choose the zoom levels at which the transition between them occurs. It can also be used to choose at which zoom level the feature labels and subfeatures become visible.

The structure of the argument to clientConfig looks something like this:

'{
   "featureCss" : <string of css settings>,
   "histCss" : <string of css settings>,
   "histScale" : <number>,
   "labelScale" : <number>,
   "subfeatureScale" : <number>
 }'

There is also a sixth option, "featureCallback", that makes it possible to pass a callback function to the JavaScript client code. Because of its complexity, this option will not be discussed.

You can think of clientConfig as being "an option with sub-options." Any (or all) of these options can be omitted. Here is a list of them, along with descriptions:

List of clientConfig Options

Option | Description
featureCss | CSS configuration edits for the features. Overrides any configuration in genome.css.
histCss | CSS configuration edits for the histogram. Overrides any configuration in genome.css.
histScale | A number that defines the zoom levels at which the individual features are replaced with the histogram (or vice versa). For higher histScale values, the histogram representation will be used at more zoom levels (it will be necessary to zoom in more in order to view the feature representation). The default value is 4.
labelScale | A number that defines the zoom level at which the labels begin to be visible. Decreasing this value causes the feature label visibility transition to occur at a lower zoom level (when zoomed out further). The default value is 50.
subfeatureScale | A number that defines the zoom level at which the subfeatures begin to be visible. Decreasing this value causes the subfeature visibility transition to occur at a lower zoom level (when zoomed out further). The default value is 80.

CSS Options for the Backgrounds and Borders
Most of these options can be used with both featureCss and histCss.

Option | Description | Input
height | The top-to-bottom height of each feature. | Some number of pixels (e.g., 5px). Do not edit the height property for histCss. Changing it will cause each of the histogram bars to be set to the same height.
background-color | The background color of each feature or each bar of the histogram. | An RGB hexadecimal color code (e.g. #A4C or #8FA366).
background-image | An image to use as the background for the features or histogram bars. | A path to an image file, with the syntax: url(<path/to/image>).
background-repeat | Describes how repetition of a background image will occur, if at all. | One of these: repeat, repeat-x, repeat-y, no-repeat.
border-style | The type of border each feature or histogram bar has. | One of these: solid, dotted, dashed, double, groove, ridge, inset, outset.
border-color | The border color of a feature or histogram bar. | An RGB hexadecimal color code (e.g. #A4C or #8FA366).
border-width | The thickness of the border for a feature or histogram bar. | Some number of pixels (e.g., 1px).
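
A featureCss or histCss string is just a list of property:value declarations joined with semicolons. The sketch below assembles one from some of the properties listed above and wraps it in a clientConfig argument; the helper name join_css and the particular values are illustrative only.

```shell
# Join "property:value" declarations with semicolons.
join_css() {
    css=''
    for decl in "$@"; do
        if [ -z "$css" ]; then
            css=$decl
        else
            css="$css;$decl"
        fi
    done
    printf '%s\n' "$css"
}

feature_css=$(join_css 'height:12px' 'background-color:#8FA366' \
    'border-style:solid' 'border-width:1px')

# Wrap the CSS string in the JSON structure that --clientConfig expects.
client_config=$(printf '{ "featureCss": "%s" }' "$feature_css")

printf '%s\n' "$client_config"
```

The result could then be passed as --clientConfig "$client_config". A CSS value that itself contains double quotes would need escaping, which this sketch does not attempt.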

Examples

If you would like to follow along, please begin by switching to your jbrowse directory and inputting the reference sequence:

$ bin/prepare-refseqs.pl --fasta docs/tutorial/data_files/volvox.fa

Example 1: Change the colors in an existing CSS class.

$ bin/flatfile-to-json.pl --gff docs/tutorial/data_files/volvox.gff3 --tracklabel PurpleFeatures --type mRNA --cssClass exon --clientConfig '{ "featureCss": "background-color:#9A61CF;border-color:#440D78"}'

Example explained: If the color of the features hadn't changed, they would have been blue, as defined by the 'exon' class in genome.css. This example demonstrates that input to clientConfig overrides the default characteristics of a CSS class.

Example 2: Force a histogram to be displayed at low zoom levels. Also, use a background image for the features.

$ bin/flatfile-to-json.pl --gff docs/tutorial/data_files/volvox.gff3 --tracklabel AllFeatures --cssClass basic --clientConfig '{"histScale": 30, "histCss": "background-color:#11A", "featureCss": "height:12px;background-image:url(img/helix2-green.png);background-repeat:repeat-x"}'

Example explained: Increasing histScale (from the default value 4) causes the histogram to appear. The histogram is made blue by editing the background color for histCss. The image for the features is created using background-image and background-repeat for featureCss. This example demonstrates the use of several clientConfig options at once.

Example 3: Cause the subfeatures to appear at all zoom levels.

$ bin/flatfile-to-json.pl --gff docs/tutorial/data_files/volvox.gff3 --tracklabel SubfeaturesEverywhere --cssClass transcript --getSubs --subfeatureClasses '{"CDS": "transcript-CDS", "UTR": "transcript-UTR"}' --clientConfig '{"subfeatureScale": 0}'

Example explained: In order to make the subfeatures appear at all levels, the value associated with subfeatureScale must be reduced. In the extreme case that we want subfeatures visible at all zoom levels, the value associated with subfeatureScale can simply be set to zero. This idea is also applicable to labelScale (making the labels appear at every zoom level) and histScale (making the histogram never appear).

Example 4: Cause the names to appear at the two highest zoom levels only.

$ bin/flatfile-to-json.pl --gff docs/tutorial/data_files/volvox.gff3 --tracklabel FewerZoomsWithNames --type match --getLabel --clientConfig '{"labelScale": 1600}'

Example explained: By default, the names appear at the seven highest levels. This is demonstrated by counting the zoom levels with feature labels from the track generated by:

$ bin/flatfile-to-json.pl --gff docs/tutorial/data_files/volvox.gff3 --tracklabel MoreZoomsWithNames --type match --getLabel

Thus, the value of labelScale must be increased from the default value. Assuming that each zoom-in operation with the smaller magnifying glass icon approximately doubles the level of magnification, the difference in magnification between the second and seventh highest zoom levels is approximately 32-fold. So multiplying 32 by the default value (50) should yield a value for labelScale that is close to what we want, if not precisely what we want. This logic is also applicable to subfeatureScale and histScale.
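
The arithmetic can be checked with a short shell function. It relies on the same assumption as the paragraph above, namely that each zoom step roughly doubles the magnification; the function name is made up for this example.

```shell
# Double a scale value once for each zoom level that should lose its labels.
scale_for_fewer_levels() {
    scale=$1
    levels=$2
    while [ "$levels" -gt 0 ]; do
        scale=$(( scale * 2 ))
        levels=$(( levels - 1 ))
    done
    echo "$scale"
}

# Going from seven labeled zoom levels to two removes five levels.
scale_for_fewer_levels 50 5   # prints 1600
```

The same function could be given the default subfeatureScale (80) or histScale (4) as a starting value.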