News/GMOD Paper Cuts, Feb 10th, 2014


GMOD Paper Cuts is a periodic selection of choice cuts from the scientific literature featuring interesting, exciting, or otherwise eye-catching GMOD-related work.

If you would like a paper to appear in GMOD Paper Cuts, please email the details to the GMOD helpdesk. Ideally the paper should be in an open-access publication so that anyone can read it.

For more GMOD and GMOD-related papers, and to contribute your own GMOD-related publications, join our Mendeley group.


Finding the missing honey bee genes: lessons learned from a genome upgrade [1]

The first generation of genome sequence assemblies and annotations have had a significant impact upon our understanding of the biology of the sequenced species, the phylogenetic relationships among species, the study of populations within and across species, and have informed the biology of humans. As only a few Metazoan genomes are approaching finished quality (human, mouse, fly and worm), there is room for improvement of most genome assemblies. The honey bee (Apis mellifera) genome, published in 2006, was noted for its bimodal GC content distribution that affected the quality of the assembly in some regions and for fewer genes in the initial gene set (OGSv1.0) compared to what would be expected based on other sequenced insect genomes.

Here, we report an improved honey bee genome assembly (Amel_4.5) with a new gene annotation set (OGSv3.2), and show that the honey bee genome contains a number of genes similar to that of other insect genomes, contrary to what was suggested in OGSv1.0. The new genome assembly is more contiguous and complete and the new gene set includes ~5000 more protein-coding genes, 50% more than previously reported. About 1/6 of the additional genes were due to improvements to the assembly, and the remaining were inferred based on new RNAseq and protein data.

Lessons learned from this genome upgrade have important implications for future genome sequencing projects. Furthermore, the improvements significantly enhance genomic resources for the honey bee, a key model for social behavior and essential to global ecology through pollination.

The new assembly of the honey bee genome yields some interesting findings, including many more genes than were found in the initial gene set. The Hymenoptera Genome Database uses numerous GMOD resources, including MAKER for automated genome annotation, JBrowse and GBrowse for sequence browsing, and WebApollo for community genome annotation.
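As a rough back-of-the-envelope check on the figures quoted in the abstract, the sketch below works through what "~5000 more protein-coding genes, 50% more than previously reported" and "about 1/6" imply. The rounded inputs come from the abstract; the derived totals are approximations rather than the paper's exact counts, and the variable names are purely illustrative.

```python
# Rounded figures taken from the abstract; every derived value is approximate.
additional_genes = 5000      # "~5000 more protein-coding genes"
relative_increase = 0.50     # "50% more than previously reported"
assembly_fraction = 1 / 6    # "About 1/6 ... due to improvements to the assembly"

# If ~5000 extra genes is a 50% increase, the earlier count was about 10,000.
previous_genes = additional_genes / relative_increase
new_total = previous_genes + additional_genes

# Split the additional genes by where the improvement came from.
from_assembly = round(additional_genes * assembly_fraction)
from_new_evidence = additional_genes - from_assembly

print(f"Implied earlier gene count: ~{previous_genes:.0f}")
print(f"Implied new gene count:     ~{new_total:.0f}")
print(f"Added via better assembly:  ~{from_assembly}")
print(f"Added via RNAseq/protein:   ~{from_new_evidence}")
```

In other words, most of the newly annotated genes are attributed to new RNAseq and protein evidence rather than to the improved assembly itself.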


Highly Specific and Efficient CRISPR/Cas9-Catalyzed Homology-Directed Repair in Drosophila [2]

We and others recently demonstrated that the readily programmable CRISPR/Cas9 system can be used to edit the Drosophila genome. However, most applications to date have relied on aberrant DNA repair to stochastically generate frame-shifting indels and adoption has been limited by a lack of tools for efficient identification of targeted events. Here we report optimized tools and techniques for expanded application of the CRISPR/Cas9 system in Drosophila through homology-directed repair (HDR) with double-stranded DNA (dsDNA) donor templates that facilitate complex genome engineering through the precise incorporation of large DNA sequences including screenable markers. Using these donors, we demonstrate the replacement of a gene with exogenous sequences and the generation of a conditional allele. To optimize efficiency and specificity, we generated transgenic flies that express Cas9 in the germline, and directly compared HDR and off-target cleavage rates of different approaches for delivering CRISPR components. We also investigated HDR efficiency in a mutant background previously demonstrated to bias DNA repair towards HDR. Finally, we developed a web-based tool that identifies CRISPR target sites and evaluates their potential for off-target cleavage using empirically rooted rules. Overall, we have found that injection of a dsDNA donor and guide RNA-encoding plasmids into vasa-Cas9 flies yields the highest efficiency HDR, and that target sites can be selected to avoid off-target mutations. Efficient and specific CRISPR/Cas9-mediated HDR opens the door to a broad array of complex genome modifications and greatly expands the utility of CRISPR technology for Drosophila research.

CRISPR is one of the most exciting technological advances of the past couple of years. This paper reports new techniques and tools for using the CRISPR/Cas9 system for complex genome engineering in Drosophila. For more information, see the flyCRISPR website.


Analysis of Global Gene Expression in Brachypodium distachyon Reveals Extensive Network Plasticity in Response to Abiotic Stress [3]

Brachypodium distachyon is a close relative of many important cereal crops. Abiotic stress tolerance has a significant impact on productivity of agriculturally important food and feedstock crops. Analysis of the transcriptome of Brachypodium after chilling, high-salinity, drought, and heat stresses revealed diverse differential expression of many transcripts. Weighted Gene Co-Expression Network Analysis revealed 22 distinct gene modules with specific profiles of expression under each stress. Promoter analysis implicated short DNA sequences directly upstream of module members in the regulation of 21 of 22 modules. Functional analysis of module members revealed enrichment in functional terms for 10 of 22 network modules. Analysis of condition-specific correlations between differentially expressed gene pairs revealed extensive plasticity in the expression relationships of gene pairs. Photosynthesis, cell cycle, and cell wall expression modules were down-regulated by all abiotic stresses. Modules which were up-regulated by each abiotic stress fell into diverse and unique gene ontology GO categories. This study provides genomics resources and improves our understanding of abiotic stress responses of Brachypodium.

Check out the JBrowse-powered Brachypodium web genome browser and other resources on the new Brachypodium website!


Analyses of Hypomethylated Oil Palm Gene Space [4]

Demand for palm oil has been increasing by an average of ~8% the past decade and currently accounts for about 59% of the world's vegetable oil market. This drives the need to increase palm oil production. Nevertheless, due to the increasing need for sustainable production, it is imperative to increase productivity rather than the area cultivated. Studies on the oil palm genome are essential to help identify genes or markers that are associated with important processes or traits, such as flowering, yield and disease resistance. To achieve this, 294,115 and 150,744 sequences from the hypomethylated or gene-rich regions of Elaeis guineensis and E. oleifera genome were sequenced and assembled into contigs. An additional 16,427 shot-gun sequences and 176 bacterial artificial chromosomes (BAC) were also generated to check the quality of libraries constructed. Comparison of these sequences revealed that although the methylation-filtered libraries were sequenced at low coverage, they still tagged at least 66% of the RefSeq supported genes in the BAC and had a filtration power of at least 2.0. A total 33,752 microsatellites and 40,820 high-quality single nucleotide polymorphism (SNP) markers were identified. These represent the most comprehensive collection of microsatellites and SNPs to date and would be an important resource for genetic mapping and association studies. The gene models predicted from the assembled contigs were mined for genes of interest, and 242, 65 and 14 oil palm transcription factors, resistance genes and miRNAs were identified respectively. Examples of the transcriptional factors tagged include those associated with floral development and tissue culture, such as homeodomain proteins, MADS, Squamosa and Apetala2. The E. guineensis and E. oleifera hypomethylated sequences provide an important resource to understand the molecular mechanisms associated with important agronomic traits in oil palm.

The newly sequenced oil palm genome was annotated using the MAKER automated annotation pipeline. The oil palm is one of a number of genomics projects taking off in Malaysia at the moment. Perfect timing for a GMOD workshop!


Production of a reference transcriptome and transcriptomic database (EdwardsiellaBase) for the lined sea anemone, Edwardsiella lineata, a parasitic cnidarian [5]

The lined sea anemone Edwardsiella lineata is an informative model system for evolutionary-developmental studies of parasitism. In this species, it is possible to compare alternate developmental pathways leading from a larva to either a free-living polyp or a vermiform parasite that inhabits the mesoglea of a ctenophore host. Additionally, E. lineata is confamilial with the model cnidarian Nematostella vectensis, providing an opportunity for comparative genomic, molecular and organismal studies.

[...]

The transcriptomic data and database described here provide a platform for studying the evolutionary developmental genomics of a derived parasitic life cycle. In addition, these data from E. lineata will aid in the interpretation of evolutionary novelties in gene sequence or structure that have been reported for the model cnidarian N. vectensis (e.g., the split NF-κB locus). Finally, we include custom computational tools to facilitate the annotation of a transcriptome based on high-throughput sequencing data obtained from a “non-model system.”

Information and resources for the newly sequenced cnidarian E. lineata. All of the data are publicly available at EdwardsiellaBase and can be searched by contig ID, gene ontology term, protein family motif (Pfam), or enzyme commission number, or queried via BLAST. The alignment of the raw reads to the contigs can also be visualized in JBrowse.


Happy reading!


  1. Finding the missing honey bee genes: lessons learned from a genome upgrade. doi:10.1186/1471-2164-15-86
  2. Highly Specific and Efficient CRISPR/Cas9-Catalyzed Homology-Directed Repair in Drosophila. doi:10.1534/genetics.113.160713
  3. Analysis of Global Gene Expression in Brachypodium distachyon Reveals Extensive Network Plasticity in Response to Abiotic Stress. doi:10.1371/journal.pone.0087499
  4. Analyses of Hypomethylated Oil Palm Gene Space. doi:10.1371/journal.pone.0086728
  5. Production of a reference transcriptome and transcriptomic database (EdwardsiellaBase) for the lined sea anemone, Edwardsiella lineata, a parasitic cnidarian. doi:10.1186/1471-2164-15-71

Disclaimer: the papers included in this feature are for your entertainment and edification only. Inclusion does not imply an endorsement of the material or any association between the authors and the GMOD project.


Posted to the GMOD News on 2014/02/10