Chaos-XML and Chado
Chaos-XML is based on the Chado relational model and is a subset of Chado's content. For a full explanation of the meaning of the elements in Chaos-XML, please refer to the Chado manual and in particular the sequence module documentation.
Chaos XML was created at around the same time that the main Chado software development team at FlyBase devised the official Chado XML format. Chado XML and Chaos XML are semantically very similar, but they differ in how the mapping between XML and the relational database is performed. Chado XML is also considerably more verbose than Chaos XML, because Chaos uses some denormalisations of the Chado model, explained below. In our view these two formats are complementary. Conversions between the formats should be trivial.
The Chaos Model
Elements in Chaos XML will generally have an equivalent table or column in the Chado relational schema. Thus the Chado documentation should also serve as documentation for the Chaos XML format.
The central concept in Chaos/Chado is a "feature". A feature can represent any genomic or sequence entity that is typed by the Sequence Ontology (SO).
Features are interconnected in a feature graph using the feature_relationship element. This indicates which exons and proteins belong to which transcript, and which transcripts belong to which gene.
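As an illustration of the feature graph, the following minimal Python sketch reads a small hand-written document and groups features under their parents. The element names feature, feature_relationship and type come from the text above; the root element chaos and the child elements feature_id, subject_id and object_id are assumptions borrowed from the Chado column names, and the sample data is invented. Check the DTD before relying on any of them.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hand-written sample; element names other than feature,
# feature_relationship and type are assumptions, not taken from the DTD.
SAMPLE = """
<chaos>
  <feature><feature_id>gene1</feature_id><type>gene</type></feature>
  <feature><feature_id>mrna1</feature_id><type>mRNA</type></feature>
  <feature><feature_id>exon1</feature_id><type>exon</type></feature>
  <feature_relationship>
    <subject_id>mrna1</subject_id>
    <object_id>gene1</object_id>
    <type>part_of</type>
  </feature_relationship>
  <feature_relationship>
    <subject_id>exon1</subject_id>
    <object_id>mrna1</object_id>
    <type>part_of</type>
  </feature_relationship>
</chaos>
"""


def build_feature_graph(xml_text):
    """Return (types, children) for the features in xml_text.

    types maps a feature id to its SO type name; children maps a parent
    (object) id to a list of (relationship type, child id) pairs.
    """
    root = ET.fromstring(xml_text)
    types = {f.findtext("feature_id"): f.findtext("type")
             for f in root.iter("feature")}
    children = defaultdict(list)
    for rel in root.iter("feature_relationship"):
        parent = rel.findtext("object_id")
        child = rel.findtext("subject_id")
        children[parent].append((rel.findtext("type"), child))
    return types, dict(children)


types, children = build_feature_graph(SAMPLE)
```

Here children["gene1"] holds the transcript and children["mrna1"] holds the exon, mirroring the gene, transcript, exon hierarchy described above.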
The location of a feature, relative to another feature, is described by the featureloc element. All locations are interbase: coordinates count from 0, not 1, and it is the gaps between bases that are numbered, not the bases themselves. In contrast to Chado, which uses fmin/fmax to indicate the left and right coordinates, Chaos uses nbeg/nend to indicate the five prime (natural start) and three prime (natural end) coordinates.
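The relationship between the two coordinate conventions can be sketched in a few lines of Python. The function names are hypothetical and not part of the Chaos library; inferring the strand from the order of nbeg and nend when no strand is supplied is an assumption made for this example.

```python
from typing import Optional, Tuple


def chaos_to_chado(nbeg: int, nend: int,
                   strand: Optional[int] = None) -> Tuple[int, int, int]:
    """Convert natural (nbeg, nend) coordinates to (fmin, fmax, strand).

    If strand is not given it is inferred (an assumption of this sketch):
    +1 when nbeg < nend, -1 when nbeg > nend, 0 when they are equal.
    """
    if strand is None:
        strand = (nend > nbeg) - (nend < nbeg)
    return min(nbeg, nend), max(nbeg, nend), strand


def chado_to_chaos(fmin: int, fmax: int, strand: int) -> Tuple[int, int]:
    """Convert (fmin, fmax, strand) to natural (nbeg, nend) coordinates."""
    if fmin > fmax:
        raise ValueError("fmin must not exceed fmax")
    # On the reverse strand the five prime end is the right-hand coordinate.
    return (fmax, fmin) if strand < 0 else (fmin, fmax)


def one_based_to_interbase(start: int, end: int) -> Tuple[int, int]:
    """Convert an inclusive 1-based range to interbase coordinates."""
    return start - 1, end
```

For example, a reverse-strand feature with nbeg=100 and nend=50 corresponds to fmin=50, fmax=100, strand=-1, and the first three bases of a sequence (1..3 in 1-based terms) span interbase positions 0 to 3. In interbase coordinates the length of a feature is simply fmax - fmin.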
The Chaos-XML Library consists of specifications and software for dealing with Chaos-XML files.
The DTD specification can be found in
Soon there will also be specifications as XML Schema and/or RELAX NG.
XSL transformations can be found in the
Example Chaos XML
Example Chaos-XML can be found in the
The scripts are in the chaos-xml/bin/ directory. You need to install the Perl Chaos library before running these scripts.
Chaos Perl Module
You can browse the Perl modules in the chaos-xml/lib/ directory. To install, download the chaos-xml library and follow the instructions in the
Converting between Chaos-XML and Chado-XML
There are XSLT stylesheets defined for mapping between these two similar formats, see the
If you are not familiar with XSLT, you can use these scripts, part of this distribution (see the
Differences between Chaos XML and the Chado Relational Database Schema
Chaos uses nbeg and nend as opposed to the fmin and fmax found in the feature table.
Chaos collapses the normalised Chado table dbxref into a single "dbxrefstr" PCDATA element.
Chaos uses a PCDATA "type" element in both feature and feature_relationship. In Chado, types are represented as a foreign key into the cvterm table. In Chaos the type string is implicitly mapped to the cvterm of the same name in the Sequence Ontology (SO) CV.
Chaos uses an "organismstr" PCDATA element to represent the normalised Chado table organism.
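To make these denormalisations concrete, here is a minimal Python sketch that expands the three collapsed strings back into Chado-style rows. It rests on assumptions that should be checked against the DTD and the conversion stylesheets: that dbxrefstr has the form "DB:accession", that organismstr has the form "Genus species", and that the SO CV is named "sequence" in the target database. The function names and sample values are hypothetical.

```python
def split_dbxrefstr(dbxrefstr):
    """Split an assumed 'DB:accession' string into db and accession parts.

    Only the first colon is treated as the separator, so accessions that
    themselves contain colons are preserved.
    """
    db, sep, accession = dbxrefstr.partition(":")
    if not sep or not db or not accession:
        raise ValueError("expected 'DB:accession', got %r" % dbxrefstr)
    return {"db": db, "accession": accession}


def split_organismstr(organismstr):
    """Split an assumed 'Genus species' string into genus and species."""
    genus, _, species = organismstr.strip().partition(" ")
    return {"genus": genus, "species": species.strip()}


def type_to_cvterm(type_name, cv_name="sequence"):
    """Describe the cvterm a Chaos type string implicitly refers to.

    The CV name 'sequence' is an assumption; use whatever name the
    Sequence Ontology CV has in the target Chado database.
    """
    return {"cv": cv_name, "name": type_name}


xref = split_dbxrefstr("FlyBase:FBgn0000490")
organism = split_organismstr("Drosophila melanogaster")
cvterm = type_to_cvterm("gene")
```

Each returned dictionary names the columns that a loader would need to look up or create in the db/dbxref, organism and cv/cvterm tables respectively.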