Difference between revisions of "Chado - Getting Started"

From GMOD
m
(Chado From SVN: New svn url)
 
(102 intermediate revisions by 9 users not shown)
 +
{{ComponentBox
 +
|{{ChadoResourcesBoxItem}}
 +
| | | | | | |}}
  
Chado, properly pronounced. [[Image:speaker_0.gif]]
+
Chado is a [[Glossary#Database Schema|relational database schema]] that underlies [[GMOD_Users|many GMOD installations]]. It is capable of representing many of the general classes of data frequently encountered in modern biology such as sequence, sequence comparisons, phenotypes, genotypes, ontologies, publications, and phylogeny. It has been designed to handle complex representations of biological knowledge and should be considered one of the most sophisticated relational schemas currently available in molecular biology. The price of this capability is that the new user must spend some time becoming familiar with its fundamentals.
  
This is a stub document for describing the Chado schema at a high level.
+
==Documentation==
  
Until this is fleshed out a little more, here are a few useful links:
+
* [[Introduction to Chado]]
 +
* [http://bioinformatics.oxfordjournals.org/cgi/content/abstract/23/13/i337?ijkey=QYeUct9uLSzefgk&keytype=ref Chado paper in Bioinformatics]
 +
* [[Chado Tutorial]]
 +
* [[Chado Manual]]
 +
* [[Chado FAQ|FAQ for Chado]]
 +
* [[Chado_Tables|Chado Tables]]
 +
* [[Chado_Best_Practices|Chado Best Practices]]
 +
* [[Sample_Chado_SQL|Sample Chado SQL]]
 +
* [[PostgreSQL Performance Tips]]
  
* [[FAQ for chado]]
+
==Modules==
* [http://gmod.sourceforge.net/schema/index.shtml The old schema summary page]
+
* [http://gmod.sourceforge.net/schema/doc/ The autogenerated documentation]
+
  
These documents are also good references for newcomers to Chado:
+
Chado is a modular schema, designed in such a way as to allow the addition of new modules for new data types. The existing modules are:
  
* [[Chado_Schema_Documentation.doc]], an MS Word document giving a detailed explanation of many Chado concepts (older),
+
{{ChadoModules}}
* [[Chado Manual]], a set of HTML pages automatically generated from TeX files, some of which were themselves automatically generated from SQL files (newer).
+
* [[chado-scenarios.xml]], an XML file with Chado usage scenarios. This document has an XSLT stylesheet applied to it, so it displays properly only in a browser that supports XSLT.
+
* [http://www.gmod.org/schema-cvs/chado/doc/ChadoCSHMay03Slides.pdf ChadoCSHMay03Slides.pdf], a presentation given by Stan Letovsky at the May, 2003 GMOD meeting to introduce the basic concepts in Chado.
+
  
The GMOD schema, known as Chado, is the foundation on which GMOD applications interoperate, so consistent use of the database is crucial. It is a modular schema, designed in such a way as to allow the addition of new modules for new data types. It was designed by [http://www.flybase.org FlyBase] and [http://www.fruitfly.org BDGP]. The existing core modules are:
 
  
* sequence - for sequences/features
+
==Installation==
* cv - for controlled-vocabs/ontologies
+
* general - currently just dbxrefs
+
* organism - taxonomic data
+
* pub - publication and references
+
* companalysis - augments sequence module with computational analysis data
+
* map - non-sequence maps (PRELIMINARY SCHEMA)
+
* genetic - genetic and phenotypic data (IN DEVELOPMENT)
+
* expression - gene expression (PRELIMINARY SCHEMA)
+
  
CREATING A WORKING CHADO INSTANCE
+
First you will need database software, that is, a Relational Database Management System (RDBMS). The recommended RDBMS for Chado is currently [http://www.postgresql.org/ Postgres]. Postgres is free software, usually run on a Unix operating system such as Linux or Mac OS X.  You can also install Postgres and Chado on Windows, but most Chado installations run on some version of Unix, so you'll probably get the best support by choosing Unix.  (See [[Databases and GMOD]] for more discussion.)  Once you've installed your RDBMS you can install Chado.
  
You can create a working Chado instance in a few ways. First, if you are running the Fedora Core 2 Linux distribution, you are in luck! Allen Day has created a set of RPM files for installing Chado and all of the prerequisites. See [http://www.biopackages.net/ biopackages.net] for directions to install Chado with yum.
 
  
To install Chado from source, my best current advice is to get Chado from the schema CVS. See [[this FAQ]] for a description of how to get it and install it.
+
===Download a Stable Release of Chado===
  
Below is the documentation for the various modules.
+
See [[Downloads]]
  
 +
<!--
 +
* Go to [http://sourceforge.net/project/showfiles.php?group_id=27707 GMOD at Sourceforge]
 +
* Download the latest '''gmod''' (the Chado source code is contained within this package)
 +
* Follow the instructions in the  {{CVS|schema/chado/INSTALL.Chado}} file
 +
-->
  
 +
=== Chado From SVN ===
  
* [[1. Documentation in CVS]]
+
You can get the most up-to-date, not-yet-released version of Chado from [[Subversion]].  To get a copy of the latest Chado source, enter this at the command line:
* [[Audit Module]]
+
* [[Companalysis Module]]
+
* [[Contact Module]]
+
* [[Controlled Vocabulary Module]]
+
* [[Expression Module]]
+
* [[General Module]]
+
* [[Genetic Module]]
+
* [[Library Module]]
+
* [[Map Module]]
+
* [[Organism Module]]
+
* [[Phylogeny Module]]
+
* [[Publication Module]]
+
* [[Sequence Module]]
+
  
[[Category:To Do]]
+
svn co https://svn.code.sf.net/p/gmod/svn/schema/trunk
 +
 
 +
Once the package has been downloaded, <code>cd</code> to the <code>trunk/chado</code> directory.
 +
 
 +
Follow the instructions in the <tt>INSTALL.Chado</tt> file, including the installation of the prerequisites. Or read <tt>{{SF_SVN|schema/trunk/chado/INSTALL.Chado|INSTALL.Chado}}</tt> online.
 +
 
 +
==Loading Data==
 +
 
 +
After completing these steps, you can load your Chado schema with data in a number of ways:
 +
 
 +
* [[Load_RefSeq_Into_Chado|Load RefSeq into Chado HOWTO]]
 +
* [[Load_GFF_Into_Chado|Load GFF into Chado HOWTO]]
 +
* Using [[XORT]]
 +
 
 +
You can also use the application [[Apollo]] to curate data in Chado.
 +
 
 +
== Mailing Lists ==
 +
 
 +
{{MailingListsFor|Chado}}
 +
 
 +
==Pronunciation==
 +
 
 +
''Chado'' is usually pronounced [[Media:Chado.mp3|like this]].
 +
 
 +
[[Category:Chado]]
 +
[[Category:Database Tools]]
 +
[[Category:GMOD Components]]

Latest revision as of 18:08, 13 February 2014

Status
  • Mature release
  • Active development
  • Active support
Resources

Chado is a relational database schema that underlies many GMOD installations. It is capable of representing many of the general classes of data frequently encountered in modern biology such as sequence, sequence comparisons, phenotypes, genotypes, ontologies, publications, and phylogeny. It has been designed to handle complex representations of biological knowledge and should be considered one of the most sophisticated relational schemas currently available in molecular biology. The price of this capability is that the new user must spend some time becoming familiar with its fundamentals.

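Documentation

  • Introduction to Chado
  • Chado paper in Bioinformatics
  • Chado Tutorial
  • Chado Manual
  • FAQ for Chado
  • Chado Tables
  • Chado Best Practices
  • Sample Chado SQL
  • PostgreSQL Performance Tips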

Modules

Chado is a modular schema, designed in such a way as to allow the addition of new modules for new data types. The existing modules are:


Installation

First you will need database software, that is, a Relational Database Management System (RDBMS). The recommended RDBMS for Chado is currently Postgres. Postgres is free software, usually run on a Unix operating system such as Linux or Mac OS X. You can also install Postgres and Chado on Windows, but most Chado installations run on some version of Unix, so you'll probably get the best support by choosing Unix. (See Databases and GMOD for more discussion.) Once you've installed your RDBMS you can install Chado.
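
Before installing Chado it can help to confirm that the Postgres command-line tools are reachable. The following is a minimal sketch, not part of the Chado distribution: `have_cmd` and `check_postgres` are hypothetical helper names, and the only assumption about Postgres is that its standard `psql` and `createdb` programs should be on your PATH.

```shell
#!/bin/sh
# Sketch: report whether the standard PostgreSQL client tools are on PATH.
# have_cmd and check_postgres are hypothetical helpers, not GMOD or Chado code.

# Succeeds if the named command can be found.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Print one status line per tool.
check_postgres() {
    for tool in psql createdb; do
        if have_cmd "$tool"; then
            echo "found: $tool"
        else
            echo "missing: $tool"
        fi
    done
}

check_postgres
```

If either tool is reported as missing, install Postgres (or fix your PATH) before continuing with the Chado installation.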


Download a Stable Release of Chado

See Downloads


Chado From SVN

You can get the most up-to-date, not-yet-released version of Chado from Subversion. To get a copy of the latest Chado source, enter this at the command line:

svn co https://svn.code.sf.net/p/gmod/svn/schema/trunk

Once the package has been downloaded, cd to the trunk/chado directory.

Follow the instructions in the INSTALL.Chado file, including the installation of the prerequisites. Or read INSTALL.Chado online.
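
The checkout steps above can be collected into a small script. This is only a sketch under stated assumptions: the `run` wrapper and the `DRY_RUN` switch are hypothetical conveniences added here so the commands are printed rather than executed by default, and `CHECKOUT_DIR` simply names the `trunk` directory that the checkout creates.

```shell
#!/bin/sh
# Sketch: check out the Chado trunk and locate INSTALL.Chado.
# DRY_RUN and run() are hypothetical; by default commands are only printed.
CHADO_SVN_URL="https://svn.code.sf.net/p/gmod/svn/schema/trunk"
CHECKOUT_DIR="trunk"

# Print the command when DRY_RUN is 1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

fetch_chado() {
    run svn co "$CHADO_SVN_URL" "$CHECKOUT_DIR" || return 1
    run cd "$CHECKOUT_DIR/chado" || return 1
    run ls -l INSTALL.Chado
}

fetch_chado
```

Set DRY_RUN=0 in the environment to perform the checkout for real; the installation itself is still governed by the INSTALL.Chado file.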

Loading Data

After completing these steps, you can load your Chado schema with data in a number of ways:

  • Load RefSeq into Chado HOWTO
  • Load GFF into Chado HOWTO
  • Using XORT

You can also use the application Apollo to curate data in Chado.
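
As an illustration of the GFF loading route covered by the Load GFF into Chado HOWTO, the sketch below only assembles and prints a bulk-load command; it does not touch a database. It assumes the `gmod_bulk_load_gff3.pl` script installed with Chado and its `--organism` and `--gfffile` options as described in that HOWTO; confirm them against the script's own documentation for your release. The file name `example.gff3`, the organism `yeast`, and the `build_load_cmd` helper are hypothetical.

```shell
#!/bin/sh
# Sketch: build (but do not run) a GFF3 bulk-load command for Chado.
# The script name and flags are assumptions taken from the GMOD HOWTO;
# check them against your installed version before use.
GFF_FILE="example.gff3"   # hypothetical input file
ORGANISM="yeast"          # hypothetical; must match an organism already in Chado

# Usage: build_load_cmd ORGANISM GFF_FILE
build_load_cmd() {
    echo "gmod_bulk_load_gff3.pl --organism $1 --gfffile $2"
}

build_load_cmd "$ORGANISM" "$GFF_FILE"
```

Printing the command first makes it easy to review the organism and file names before running the load against a real Chado instance.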

Mailing Lists

Mailing lists for Chado:
  • gmod-schema: All issues regarding Chado, Chado::AutoDBI, and Bio::Chado::Schema. Archives: Gmane, Nabble (2010/05+), Sourceforge.
  • gmod-schema-cmts: Chado code updates. Archive: Sourceforge.

Pronunciation

Chado is usually pronounced like this.