Difference between revisions of "User talk:RSCummings"

From GMOD

Revision as of 01:22, 16 July 2011

More details about certain JBrowse script options

cssClass

Every built-in track appearance is defined as a class in the file 'genome.css'. This file can be edited as a regular text file, and classes can be added, changed, or removed (although direct editing is not necessary; use the clientConfig option to avoid changing genome.css). The argument to the cssClass option will be the name of a genome.css class. As of JBrowse v1.2.1, these classes are available:

{| class="wikitable"
|-
! Class Name !! Color !! Directional? !! Other Details
|-
| basic || Completely transparent; invisible. || No || Useful when it is desirable for the track appearance to be defined entirely by clientConfig.
|-
| cds0 || Cyan || Yes ||
|-
| cds1 || Blue || Yes ||
|-
| cds2 || Dark blue || Yes ||
|-
| dblhelix || Light red || No || A black double helix outline is superimposed over the light red background.
|-
| est || Light red || No ||
|-
| exon || Blue || No ||
|-
| feature || Alternating blue and grey || Yes ||
|-
| feature2 || Green || Yes ||
|-
| feature3 || Alternating yellow and dark yellow || Yes ||
|-
| feature4 || Black and yellow || Yes || Pacman design.
|-
| feature5 || Alternating blue and light blue || Yes ||
|-
| generic_parent || Grey || No ||
|-
| generic_part_a || Green || No ||
|-
| helix || Light green || No || A single black coil outline is superimposed over a faint green background.
|-
| transcript || Grey || No ||
|-
| transcript-CDS || Light red || No ||
|-
| transcript-exon || Red || No ||
|-
| transcript-five_prime_UTR || Red || No || Identical to transcript-exon.
|-
| transcript-three_prime_UTR || Red || No || Identical to transcript-exon.
|-
| transcript-UTR || Red || No || Identical to transcript-exon.
|-
| triangle || Black || No ||
|}

subfeatureClasses

In order to make subfeatures appear in JBrowse, it is necessary to assign a genome.css class to them. This is done with an association list in JSON syntax, where the key is the type of subfeature (e.g. CDS, UTR, match_part, mRNA), and the value is the genome.css class that will be used as the appearance of that subfeature.

As an example, the '--subfeatureClasses' argument to flatfile-to-json might look something like '{ "CDS" : "transcript-CDS", "UTR" : "transcript-UTR" }'. This could be rewritten as:

'{
   "CDS" : "transcript-CDS",
   "UTR" : "transcript-UTR"  
 }'

This second format makes the JSON structure more obvious, but the first format is easier to use as a command line argument.
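
If you prefer to keep the readable multi-line form in a script, one possible approach is to store it in a shell variable and collapse it to a single line before use. This is only a sketch; the variable names subfeature_classes and one_line are made up for illustration.

```shell
# Keep the readable, multi-line JSON in a variable (names are illustrative).
subfeature_classes='{
  "CDS" : "transcript-CDS",
  "UTR" : "transcript-UTR"
}'

# Turn newlines into spaces, then squeeze runs of spaces into one.
one_line=$(printf '%s' "$subfeature_classes" | tr '\n' ' ' | tr -s ' ')

# The result is the one-line form shown earlier.
echo "$one_line"
```

The collapsed string can then be passed as "$one_line" (with the double quotes) after --subfeatureClasses, so that the shell hands it over as a single argument.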

urlTemplate

Note: To follow along with the next few examples, switch to the jbrowse directory, then do:

bin/prepare-refseqs.pl --fasta docs/tutorial/data_files/volvox.fa

urlTemplate can be used to link features to an external website. As a simple example, this call to flatfile-to-json.pl will link every feature in the track to google.com:

bin/flatfile-to-json.pl --urlTemplate http://www.google.com --type remark --gff docs/tutorial/data_files/volvox.gff3 --tracklabel same_URL

When used in this way, urlTemplate is not very useful, because every feature in the track links to the same URL.

In order to make urlTemplate link different features to different URLs, try this example:

bin/flatfile-to-json.pl --urlTemplate 'http://www.google.com/search?q={name}' --getLabels --type remark --gff docs/tutorial/data_files/volvox.gff3 --tracklabel unique_URLs

Now, for any given feature, clicking on the link causes Google to be queried for that feature's name. With the correct website, this function could be used to link each feature to an annotation page that specifically describes it.

In order to understand how this works, it is necessary to understand a few aspects of the output JSON. Before reading any further, open the first output JSON file with a pager program such as less:

less data/tracks/ctgA/same_URL/trackData.json

Kind of ugly, no? By default, the JSON file is as compact as possible, containing the minimum amount of data necessary for the JavaScript to render the track. Locate the "headers" key toward the beginning of the text. This key should be associated with the JSON array: ["start","end","strand","id"]. These are the elements of feature data that are preserved by default. Further down, following the "featureNCList" key, you will find the actual feature data that corresponds to these columns.

Now, open 'data/tracks/ctgA/unique_URLs/trackData.json' with your pager program.

Once again, locate the "headers" array. You will notice that, in addition to the columns that were previously present, there is now an additional one called "name". This is present because of the --getLabels switch that was used in the second call to flatfile-to-json.pl.
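
Since the file is one long line of text, it can be easier to print just the "headers" array than to hunt for it in a pager. The sketch below runs grep against a made-up stand-in file so that it works anywhere; the sample contents (including the feature row) are invented for illustration and are not real flatfile-to-json.pl output.

```shell
# Made-up stand-in for a trackData.json file; real files are much larger
# and their exact layout may differ.
sample=$(mktemp)
cat > "$sample" <<'EOF'
{"headers":["start","end","strand","id","name"],"featureNCList":[[100,200,1,7,"hypothetical_name"]]}
EOF

# Print only the "headers" key and its array, not the whole file.
# -o (supported by GNU and BSD grep) prints just the matching text.
headers=$(grep -o '"headers" *: *\[[^]]*]' "$sample")
echo "$headers"

rm -f "$sample"
```

To try it on a real track, replace "$sample" with a path such as data/tracks/ctgA/unique_URLs/trackData.json.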

Here's how urlTemplate works: when a header from the output JSON file appears in curly braces in the template, each feature's value in that header's column is substituted in its place. In this example, each feature's name is substituted. You could also substitute values from the "start" column, or from any of the other columns, e.g.:

... --urlTemplate 'http://www.google.com/search?q={start}' ...
... --urlTemplate 'http://www.google.com/search?q={id}' ...

This can be very useful in conjunction with extraData, which makes it possible to create additional headers from data that would normally be ignored by the program.
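
To make the substitution concrete, here is a small shell imitation of it. This is not how JBrowse performs the substitution internally; expand_template is a hypothetical helper written only to show what a template turns into for one feature, and the feature values passed to it are invented.

```shell
# expand_template TEMPLATE HEADER VALUE
# Replace every {HEADER} placeholder in TEMPLATE with VALUE.
# Illustration only: VALUE must not contain "|", "&" or "\",
# because it is pasted directly into a sed command.
expand_template() {
    printf '%s\n' "$1" | sed "s|{$2}|$3|g"
}

# A feature whose "name" column holds hypothetical_name:
expand_template 'http://www.google.com/search?q={name}' name hypothetical_name

# A feature whose "start" column holds 1050:
expand_template 'http://www.google.com/search?q={start}' start 1050
```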

extraData

The extraData option is used to incorporate additional data from the data source into the output JSON. In particular, it is most useful when used with the urlTemplate option, because the data it extracts can be used to query an online annotation database.

The argument for the extraData option is a JSON association list, where the keys are names (strings) and the values are Perl subroutine definitions (also strings). By the way, a subroutine is just another name for a function. The Perl subroutine can be anything, and it gets evaluated for each feature that will be in the JBrowse track.

To convince yourself of this, switch to your jbrowse directory, and try the following:

$ bin/prepare-refseqs.pl --fasta docs/tutorial/data_files/volvox.fa
$ bin/flatfile-to-json.pl --gff docs/tutorial/data_files/volvox.gff3 --tracklabel ExtraData_NoTrackChanges --type mRNA --extraData '{ "empty_column" : "sub { print(\"$0 is invoking the subroutine you defined.\\n\") }"}' 

The message is printed four times, because there are four features whose type is 'mRNA'. For this simple example, the subroutine did not return anything. Normally, it will return data specific to each feature, and this will be silent (unless you specifically tell it to print that data).

Let's take a step back for a moment. The ability to extract feature data from the data structures in the underlying code suggests that we will need to understand how the data is stored in those structures. After a few minor simplifications, this is what the data structure of each feature object looks like:

{
  "attributes" => {
    # attributes are optional; the ones listed here may or may not be defined for a given feature.
    # also, there could be any number of additional attributes.
    "load_id" => [<list of strings>],
    "parent_id" => [<list of strings>],
    "Alias" => [<list of strings>],
    "Note" => [<list of strings>],
    ...
  },
  "ref" => <string>,
  "type" => <string>,
  "name" => <string>,
  "phase" => <number>,
  "score" => <number>,
  "start" => <number>,
  "stop" => <number>,
  "strand" => <number>
}

When the extraData subroutine is invoked, it is invoked with a feature object (i.e. the data structure shown above) as the only argument.

To get the type for each feature, one could do:

--extraData '{ "the_type" : "sub { return $_[0]->{\"type\"}; }" }'  

or, equivalently,

--extraData '{ "the_type" : "sub { shift->{\"type\"}; }" }'  

I will use the first syntax, since I think it is more intuitive. $_[0] is a reference to the first argument to the subroutine (the feature object), and the arrow pointing to the curly braced, escaped string ("type") gets the data associated with that string from the feature object. That data is then returned.

Are you familiar with the --getType option from flatfile-to-json.pl? This is doing almost exactly the same thing as --getType. The only difference is that I have chosen to refer to the extracted data as "the_type", and when it is done through --getType, the data is referred to simply as "type". I have only used this type extraction example for demonstration purposes; when you actually want to get the type, you should use --getType because it is more succinct and more easily understood.
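
The --extraData argument passes through three layers of quoting: the shell, then JSON, then Perl. If the backslashes become hard to keep track of, one possible approach is to put the JSON in a quoted here-document, where the shell leaves $_ and the backslashes untouched. This is a sketch using the same "the_type" example; the variable name extra_data is made up.

```shell
# The quoted delimiter ('EOF') tells the shell not to expand $_ or
# interpret backslashes, so the JSON is stored exactly as written.
extra_data=$(cat <<'EOF'
{ "the_type" : "sub { return $_[0]->{\"type\"}; }" }
EOF
)

# Show exactly what would be handed to --extraData.
printf '%s\n' "$extra_data"
```

The variable can then be supplied as --extraData "$extra_data"; the double quotes keep it as a single argument.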

Now, let's try to do something useful with --extraData that cannot be done with any other option. We are going to extract an attribute.

Here's the command to extract the load_id attribute:

--extraData '{ "load_id" : "sub { return $_[0]->{\"attributes\"}->{\"load_id\"}[0]; }" }' 

It turns out that there is a somewhat cleaner way of doing this:

--extraData '{ "load_id" : "sub { return $_[0]->attributes(\"load_id\"); }" }'

Now that we have the load_id for every feature, how can we use it? The most useful way to use it is with the --urlTemplate option.
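
As a rough sketch, the two options might be combined as follows. The site http://example.org/annotation and the track label load_id_links are placeholders invented for this example, and the sketch assumes that the "load_id" key given to --extraData becomes a header that urlTemplate can reference, as described in the urlTemplate section.

```shell
# Extract the load_id attribute for each feature.
extra_data='{ "load_id" : "sub { return $_[0]->attributes(\"load_id\"); }" }'

# Placeholder annotation site; {load_id} is filled in per feature.
url_template='http://example.org/annotation?id={load_id}'

# Only attempt the call when run from the jbrowse directory.
if [ -x bin/flatfile-to-json.pl ]; then
    bin/flatfile-to-json.pl --gff docs/tutorial/data_files/volvox.gff3 --type mRNA \
        --tracklabel load_id_links --extraData "$extra_data" --urlTemplate "$url_template"
else
    echo "Run this from the jbrowse directory." >&2
fi
```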

I have shown the most basic case, where it is desirable to get data from each feature object and then immediately return it as is. Although this is outside the scope of this tutorial, with some knowledge of Perl, it would be straightforward to extend this case to map another subroutine over the data, or to use a lookup table to convert one type of identifier to another.

One final word of caution. When you use the extraData option, the track's data files must grow to accommodate the extra data. This can be thought of as inserting an additional column into a table: the size increase is on the order of the number of features multiplied by the average size of the data in each slot of that column. Unnecessary use of extraData increases the time needed to transfer data from the server to a client, so try not to use more space than you need.
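
For a back-of-the-envelope estimate, multiply the two quantities. The numbers below are invented purely for illustration.

```shell
# Hypothetical track: 100000 features, about 20 bytes of extra data each.
features=100000
avg_bytes=20

# Approximate growth of the track data caused by one extraData column.
extra_bytes=$((features * avg_bytes))
echo "$extra_bytes bytes"
```

Under these assumed numbers, a single extra column adds roughly 2 MB for clients to download.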

clientConfig

Non-quantitative tracks have a few different types of representations. There is the histogram representation, where each vertical bar represents the density of features at the corresponding location in the chromosome, and then there is the representation that shows each feature as a horizontal bar extending from its start to its end location on the chromosome. The clientConfig option can be used to alter the appearance of either of these representations.

For the start-to-end feature representation, the simplest backgrounds have solid colors, but more complex backgrounds use an image.

There are a number of built-in tracks that come with JBrowse; these tracks are defined in genome.css, and some of them use images from the img/ directory. For a selected track type from genome.css, instructions to clientConfig override the definitions already present in genome.css.

The structure of the argument to clientConfig looks something like this:

'{
   "featureCss" : <string of css settings>,
   "histCss" : <string of css settings>,
   "histScale" : <number>
 }'

You can think of clientConfig as being "an option with sub-options", where the sub-options are featureCss, histCss, and histScale. Any or all of these three options can be omitted.

{| class="wikitable"
|-
! Option !! Description
|-
| featureCss || CSS configuration edits for the features.
|-
| histCss || CSS configuration edits for the histogram.
|-
| histScale || A number that defines the zoom levels at which the individual features are replaced with the histogram (or vice versa). For higher histScale values, it is necessary to be zoomed in more in order to view the features (as opposed to the histogram).
|}


CSS Options for the Backgrounds and Borders

{| class="wikitable"
|-
! Option !! Description !! Input
|-
| height* || The top to bottom height of each feature. || Some number of pixels (e.g., 5px).
|-
| background-color || The background color of each feature or each bar of the histogram. || An RGB hexadecimal color code (e.g. #A4C or #8FA366).
|-
| background-image || An image to use as the background for the features or histogram bars. || A path to an image file, with the syntax: url("path/to/image").
|-
| background-repeat || Describes how repetition of a background image will occur, if at all. || One of these: repeat, repeat-x, repeat-y, no-repeat.
|-
| border-style || The type of border each feature or histogram bar has. || One of these: solid, dotted, dashed, double, groove, ridge, inset, outset.
|-
| border-color || The border color of a feature or histogram bar. || An RGB hexadecimal color code (e.g. #A4C or #8FA366).
|-
| border-width || The thickness of the border for a feature or histogram bar. || Some number of pixels (e.g., 1px).
|}

*Note: Do not edit the height property for histCss.
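
Putting the pieces together, a clientConfig argument might be assembled like this. It is a sketch under several assumptions: that the CSS strings are written as ordinary "property: value;" pairs, that the options are spelled --cssClass and --clientConfig on the flatfile-to-json.pl command line, and that the track label custom_style and the particular colors are arbitrary choices made for illustration. histScale is left out, since any of the three sub-options can be omitted, and height is not set in histCss, as the note above advises.

```shell
# CSS for individual features, using options from the table above.
feature_css='background-color: #8FA366; height: 5px; border-style: solid; border-color: #A4C; border-width: 1px;'

# CSS for histogram bars (no height here).
hist_css='background-color: #A4C;'

# Build the JSON argument from the two CSS strings.
client_config=$(printf '{ "featureCss" : "%s", "histCss" : "%s" }' "$feature_css" "$hist_css")
echo "$client_config"

# Only attempt the call when run from the jbrowse directory.
if [ -x bin/flatfile-to-json.pl ]; then
    bin/flatfile-to-json.pl --gff docs/tutorial/data_files/volvox.gff3 --type remark \
        --cssClass basic --clientConfig "$client_config" --tracklabel custom_style
fi
```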

To get an idea of the types of things we might want to change, let's examine the 'basic' class. This class is meant to be used with clientConfig to produce completely novel track types, and when it is not used with clientConfig, it is invisible.