Chado Tables

From GMOD
Jump to: navigation, search

Chado Schema Version 1.11

Contents

Table: db

A database authority. Typical databases in bioinformatics are FlyBase, GO, UniProt, NCBI, MGI, etc. The authority is generally known by this shortened form, which is unique within the bioinformatics and biomedical realm. To Do - add support for URIs, URNs (e.g. LSIDs). We can do this by treating the URL as a URI - however, some applications may expect this to be resolvable - to be decided.

db Structure
FK Name Type Description
db_id serial PRIMARY KEY
name character varying(255) UNIQUE NOT NULL
description character varying(255)
urlprefix character varying(255)
url character varying(255)

Tables referencing this one via Foreign Key Constraints:



Table: dbxref

A unique, global, public, stable identifier. Not necessarily an external reference - can reference data items inside the particular chado instance being used. Typically a row in a table can be uniquely identified with a primary identifier (called dbxref_id); a table may also have secondary identifiers (in a linking table <T>_dbxref). A dbxref is generally written as <DB>:<ACCESSION> or as <DB>:<ACCESSION>:<VERSION>.

dbxref Structure
FK Name Type Description
dbxref_id serial PRIMARY KEY

db

db_id integer UNIQUE#1 NOT NULL
accession character varying(255) UNIQUE#1 NOT NULL

The local part of the identifier. Guaranteed by the db authority to be unique for that db.
version character varying(255) UNIQUE#1 NOT NULL DEFAULT ''::character varying
description text

Tables referencing this one via Foreign Key Constraints:



Table: project

project Structure
FK Name Type Description
project_id serial PRIMARY KEY
name character varying(255) UNIQUE NOT NULL
description character varying(255) NOT NULL

Tables referencing this one via Foreign Key Constraints:



Table: tableinfo

tableinfo Structure
FK Name Type Description
tableinfo_id serial PRIMARY KEY
name character varying(30) UNIQUE NOT NULL
primary_key_column character varying(30)
is_view integer NOT NULL
view_on_table_id integer
superclass_table_id integer
is_updateable integer NOT NULL DEFAULT 1
modification_date date NOT NULL DEFAULT now()

Tables referencing this one via Foreign Key Constraints:



Table: cv

A controlled vocabulary or ontology. A cv is composed of cvterms (AKA terms, classes, types, universals - relations and properties are also stored in cvterm) and the relationships between them.

cv Structure
FK Name Type Description
cv_id serial PRIMARY KEY
name character varying(255) UNIQUE NOT NULL

The name of the ontology. This corresponds to the obo-format -namespace-. cv names uniquely identify the cv. In OBO file format, the cv.name is known as the namespace.
definition text

A text description of the criteria for membership of this ontology.

Tables referencing this one via Foreign Key Constraints:



Table: cvterm

A term, class, universal or type within an ontology or controlled vocabulary. This table is also used for relations and properties. cvterms constitute nodes in the graph defined by the collection of cvterms and cvterm_relationships.

cvterm Structure
FK Name Type Description
cvterm_id serial PRIMARY KEY

cv

cv_id integer UNIQUE#1 NOT NULL

The cv or ontology or namespace to which this cvterm belongs.
name character varying(1024) UNIQUE#1 NOT NULL

A concise human-readable name or label for the cvterm. Uniquely identifies a cvterm within a cv.
definition text

A human-readable text definition.

dbxref

dbxref_id integer UNIQUE NOT NULL

Primary identifier dbxref - The unique global OBO identifier for this cvterm. Note that a cvterm may have multiple secondary dbxrefs - see also table: cvterm_dbxref.
is_obsolete integer UNIQUE#1 NOT NULL

Boolean 0=false,1=true; see GO documentation for details of obsoletion. Note that two terms with different primary dbxrefs may exist if one is obsolete. Duplicate names when obsoletes are included: Note that is_obsolete is an integer and can be incremented to fake uniqueness.
is_relationshiptype integer NOT NULL

Boolean 0=false,1=true relations or relationship types (also known as Typedefs in OBO format, or as properties or slots) form a cv/ontology in themselves. We use this flag to indicate whether this cvterm is an actual term/class/universal or a relation. Relations may be drawn from the OBO Relations ontology, but are not exclusively drawn from there.

Tables referencing this one via Foreign Key Constraints:


Table: cvterm_dbxref

In addition to the primary identifier (cvterm.dbxref_id) a cvterm can have zero or more secondary identifiers/dbxrefs, which may refer to records in external databases. The exact semantics of cvterm_dbxref are not fixed. For example: the dbxref could be a pubmed ID that is pertinent to the cvterm, or it could be an equivalent or similar term in another ontology. For example, GO cvterms are typically linked to InterPro IDs, even though the nature of the relationship between them is largely one of statistical association. The dbxref may be have data records attached in the same database instance, or it could be a "hanging" dbxref pointing to some external database. NOTE: If the desired objective is to link two cvterms together, and the nature of the relation is known and holds for all instances of the subject cvterm then consider instead using cvterm_relationship together with a well-defined relation.

cvterm_dbxref Structure
FK Name Type Description
cvterm_dbxref_id serial PRIMARY KEY

cvterm

cvterm_id integer UNIQUE#1 NOT NULL

dbxref

dbxref_id integer UNIQUE#1 NOT NULL
is_for_definition integer NOT NULL

A cvterm.definition should be supported by one or more references. If this column is true, the dbxref is not for a term in an external database - it is a dbxref for provenance information for the definition.


Table: cvterm_relationship

A relationship linking two cvterms. Each cvterm_relationship constitutes an edge in the graph defined by the collection of cvterms and cvterm_relationships. The meaning of the cvterm_relationship depends on the definition of the cvterm R refered to by type_id. However, in general the definitions are such that the statement "all SUBJs REL some OBJ" is true. The cvterm_relationship statement is about the subject, not the object. For example "insect wing part_of thorax".

cvterm_relationship Structure
FK Name Type Description
cvterm_relationship_id serial PRIMARY KEY

cvterm

type_id integer UNIQUE#1 NOT NULL

The nature of the relationship between subject and object. Note that relations are also housed in the cvterm table, typically from the OBO relationship ontology, although other relationship types are allowed.

cvterm

subject_id integer UNIQUE#1 NOT NULL

The subject of the subj-predicate-obj sentence. The cvterm_relationship is about the subject. In a graph, this typically corresponds to the child node.

cvterm

object_id integer UNIQUE#1 NOT NULL

The object of the subj-predicate-obj sentence. The cvterm_relationship refers to the object. In a graph, this typically corresponds to the parent node.


Table: cvtermpath

The reflexive transitive closure of the cvterm_relationship relation.

cvtermpath Structure
FK Name Type Description
cvtermpath_id serial PRIMARY KEY

cvterm

type_id integer UNIQUE#1

The relationship type that this is a closure over. If null, then this is a closure over ALL relationship types. If non-null, then this references a relationship cvterm - note that the closure will apply to both this relationship AND the OBO_REL:is_a (subclass) relationship.

cvterm

subject_id integer UNIQUE#1 NOT NULL

cvterm

object_id integer UNIQUE#1 NOT NULL

cv

cv_id integer NOT NULL

Closures will mostly be within one cv. If the closure of a relationship traverses a cv, then this refers to the cv of the object_id cvterm.
pathdistance integer UNIQUE#1

The number of steps required to get from the subject cvterm to the object cvterm, counting from zero (reflexive relationship).


Table: cvtermprop

Additional extensible properties can be attached to a cvterm using this table. Corresponds to -AnnotationProperty- in W3C OWL format.

cvtermprop Structure
FK Name Type Description
cvtermprop_id serial PRIMARY KEY

cvterm

cvterm_id integer UNIQUE#1 NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL

The name of the property or slot is a cvterm. The meaning of the property is defined in that cvterm.
value text UNIQUE#1 NOT NULL DEFAULT ''::text

The value of the property, represented as text. Numeric values are converted to their text representation.
rank integer UNIQUE#1 NOT NULL

Property-Value ordering. Any cvterm can have multiple values for any particular property type - these are ordered in a list using rank, counting from zero. For properties that are single-valued rather than multi-valued, the default 0 value should be used.


Table: cvtermsynonym

A cvterm actually represents a distinct class or concept. A concept can be refered to by different phrases or names. In addition to the primary name (cvterm.name) there can be a number of alternative aliases or synonyms. For example, "T cell" as a synonym for "T lymphocyte".

cvtermsynonym Structure
FK Name Type Description
cvtermsynonym_id serial PRIMARY KEY

cvterm

cvterm_id integer UNIQUE#1 NOT NULL
synonym character varying(1024) UNIQUE#1 NOT NULL

cvterm

type_id integer

A synonym can be exact, narrower, or broader than.


Table: dbxrefprop

Metadata about a dbxref. Note that this is not defined in the dbxref module, as it depends on the cvterm table. This table has a structure analagous to cvtermprop.

dbxrefprop Structure
FK Name Type Description
dbxrefprop_id serial PRIMARY KEY

dbxref

dbxref_id integer UNIQUE#1 NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL
value text NOT NULL DEFAULT ''::text
rank integer UNIQUE#1 NOT NULL


Table: wwwuser

Keep track of WWW users. This may also be useful in an audit module at some point.

wwwuser Structure
FK Name Type Description
wwwuser_id serial PRIMARY KEY
username character varying(32) UNIQUE NOT NULL
password character varying(32) NOT NULL
email character varying(128) NOT NULL
profile text

Tables referencing this one via Foreign Key Constraints:



Table: wwwuser_cvterm

Track wwwuser interest in cvterms.

wwwuser_cvterm Structure
FK Name Type Description
wwwuser_cvterm_id serial PRIMARY KEY

wwwuser

wwwuser_id integer UNIQUE#1 NOT NULL

cvterm

cvterm_id integer UNIQUE#1 NOT NULL
world_read smallint NOT NULL DEFAULT 1


Table: wwwuser_expression

Track wwwuser interest in expressions.

wwwuser_expression Structure
FK Name Type Description
wwwuser_expression_id serial PRIMARY KEY

wwwuser

wwwuser_id integer UNIQUE#1 NOT NULL

expression

expression_id integer UNIQUE#1 NOT NULL
world_read smallint NOT NULL DEFAULT 1


Table: wwwuser_feature

Track wwwuser interest in features.

wwwuser_feature Structure
FK Name Type Description
wwwuser_feature_id serial PRIMARY KEY

wwwuser

wwwuser_id integer UNIQUE#1 NOT NULL

feature

feature_id integer UNIQUE#1 NOT NULL
world_read smallint NOT NULL DEFAULT 1


Table: wwwuser_genotype

Track wwwuser interest in genotypes.

wwwuser_genotype Structure
FK Name Type Description
wwwuser_genotype_id serial PRIMARY KEY

wwwuser

wwwuser_id integer UNIQUE#1 NOT NULL

genotype

genotype_id integer UNIQUE#1 NOT NULL
world_read smallint NOT NULL DEFAULT 1


Table: wwwuser_organism

Track wwwuser interest in organisms.

wwwuser_organism Structure
FK Name Type Description
wwwuser_organism_id serial PRIMARY KEY

wwwuser

wwwuser_id integer UNIQUE#1 NOT NULL

organism

organism_id integer UNIQUE#1 NOT NULL
world_read smallint NOT NULL DEFAULT 1


Table: wwwuser_phenotype

Track wwwuser interest in phenotypes.

wwwuser_phenotype Structure
FK Name Type Description
wwwuser_phenotype_id serial PRIMARY KEY

wwwuser

wwwuser_id integer UNIQUE#1 NOT NULL

phenotype

phenotype_id integer UNIQUE#1 NOT NULL
world_read smallint NOT NULL DEFAULT 1


Table: wwwuser_project

Link wwwuser accounts to projects

wwwuser_project Structure
FK Name Type Description
wwwuser_project_id serial PRIMARY KEY

wwwuser

wwwuser_id integer UNIQUE#1 NOT NULL

project

project_id integer UNIQUE#1 NOT NULL
world_read smallint NOT NULL DEFAULT 1


Table: wwwuser_pub

Track wwwuser interest in publications.

wwwuser_pub Structure
FK Name Type Description
wwwuser_pub_id serial PRIMARY KEY

wwwuser

wwwuser_id integer UNIQUE#1 NOT NULL

pub

pub_id integer UNIQUE#1 NOT NULL
world_read smallint NOT NULL DEFAULT 1


Table: wwwuserrelationship

Track wwwuser interest in other wwwusers.

wwwuserrelationship Structure
FK Name Type Description
wwwuserrelationship_id serial PRIMARY KEY

wwwuser

objwwwuser_id integer UNIQUE#1 NOT NULL

wwwuser

subjwwwuser_id integer UNIQUE#1 NOT NULL
world_read smallint NOT NULL DEFAULT 1

Generated by PostgreSQL Autodoc

W3C HTML 4.01 Strict

Table: feature

A feature is a biological sequence or a section of a biological sequence, or a collection of such sections. Examples include genes, exons, transcripts, regulatory regions, polypeptides, protein domains, chromosome sequences, sequence variations, cross-genome match regions such as hits and HSPs and so on; see the Sequence Ontology for more.

feature Structure
FK Name Type Description
feature_id serial PRIMARY KEY

dbxref

dbxref_id integer

An optional primary public stable identifier for this feature. Secondary identifiers and external dbxrefs go in the table feature_dbxref.

organism

organism_id integer UNIQUE#1 NOT NULL

The organism to which this feature belongs. This column is mandatory.
name character varying(255)

The optional human-readable common name for a feature, for display purposes.
uniquename text UNIQUE#1 NOT NULL

The unique name for a feature; may not be necessarily be particularly human-readable, although this is preferred. This name must be unique for this type of feature within this organism.
residues text

A sequence of alphabetic characters representing biological residues (nucleic acids, amino acids). This column does not need to be manifested for all features; it is optional for features such as exons where the residues can be derived from the featureloc. It is recommended that the value for this column be manifested for features which may may non-contiguous sublocations (e.g. transcripts), since derivation at query time is non-trivial. For expressed sequence, the DNA sequence should be used rather than the RNA sequence.
seqlen integer

The length of the residue feature. See column:residues. This column is partially redundant with the residues column, and also with featureloc. This column is required because the location may be unknown and the residue sequence may not be manifested, yet it may be desirable to store and query the length of the feature. The seqlen should always be manifested where the length of the sequence is known.
md5checksum character(32)

The 32-character checksum of the sequence, calculated using the MD5 algorithm. This is practically guaranteed to be unique for any feature. This column thus acts as a unique identifier on the mathematical sequence.

cvterm

type_id integer UNIQUE#1 NOT NULL

A required reference to a table:cvterm giving the feature type. This will typically be a Sequence Ontology identifier. This column is thus used to subclass the feature table.
is_analysis boolean NOT NULL DEFAULT false

Boolean indicating whether this feature is annotated or the result of an automated analysis. Analysis results also use the companalysis module. Note that the dividing line between analysis and annotation may be fuzzy, this should be determined on a per-project basis in a consistent manner. One requirement is that there should only be one non-analysis version of each wild-type gene feature in a genome, whereas the same gene feature can be predicted multiple times in different analyses.
is_obsolete boolean NOT NULL DEFAULT false

Boolean indicating whether this feature has been obsoleted. Some chado instances may choose to simply remove the feature altogether, others may choose to keep an obsolete row in the table.
timeaccessioned timestamp without time zone NOT NULL DEFAULT ('now'::text)::timestamp(6) with time zone

For handling object accession or modification timestamps (as opposed to database auditing data, handled elsewhere). The expectation is that these fields would be available to software interacting with chado.
timelastmodified timestamp without time zone NOT NULL DEFAULT ('now'::text)::timestamp(6) with time zone

For handling object accession or modification timestamps (as opposed to database auditing data, handled elsewhere). The expectation is that these fields would be available to software interacting with chado.

Tables referencing this one via Foreign Key Constraints:



Table: feature_cvterm

Associate a term from a cv with a feature, for example, GO annotation.

feature_cvterm Structure
FK Name Type Description
feature_cvterm_id serial PRIMARY KEY

feature

feature_id integer UNIQUE#1 NOT NULL

cvterm

cvterm_id integer UNIQUE#1 NOT NULL

pub

pub_id integer UNIQUE#1 NOT NULL

Provenance for the annotation. Each annotation should have a single primary publication (which may be of the appropriate type for computational analyses) where more details can be found. Additional provenance dbxrefs can be attached using feature_cvterm_dbxref.
is_not boolean NOT NULL DEFAULT false

If this is set to true, then this annotation is interpreted as a NEGATIVE annotation - i.e. the feature does NOT have the specified function, process, component, part, etc. See GO docs for more details.

Tables referencing this one via Foreign Key Constraints:



Table: feature_cvterm_dbxref

Additional dbxrefs for an association. Rows in the feature_cvterm table may be backed up by dbxrefs. For example, a feature_cvterm association that was inferred via a protein-protein interaction may be backed by by refering to the dbxref for the alternate protein. Corresponds to the WITH column in a GO gene association file (but can also be used for other analagous associations). See http://www.geneontology.org/doc/GO.annotation.shtml#file for more details.

feature_cvterm_dbxref Structure
FK Name Type Description
feature_cvterm_dbxref_id serial PRIMARY KEY

feature_cvterm

feature_cvterm_id integer UNIQUE#1 NOT NULL

dbxref

dbxref_id integer UNIQUE#1 NOT NULL


Table: feature_cvterm_pub

Secondary pubs for an association. Each feature_cvterm association is supported by a single primary publication. Additional secondary pubs can be added using this linking table (in a GO gene association file, these corresponding to any IDs after the pipe symbol in the publications column.

feature_cvterm_pub Structure
FK Name Type Description
feature_cvterm_pub_id serial PRIMARY KEY

feature_cvterm

feature_cvterm_id integer UNIQUE#1 NOT NULL

pub

pub_id integer UNIQUE#1 NOT NULL


Table: feature_cvtermprop

Extensible properties for feature to cvterm associations. Examples: GO evidence codes; qualifiers; metadata such as the date on which the entry was curated and the source of the association. See the featureprop table for meanings of type_id, value and rank.

feature_cvtermprop Structure
FK Name Type Description
feature_cvtermprop_id serial PRIMARY KEY

feature_cvterm

feature_cvterm_id integer UNIQUE#1 NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL

The name of the property/slot is a cvterm. The meaning of the property is defined in that cvterm. cvterms may come from the OBO evidence code cv.
value text

The value of the property, represented as text. Numeric values are converted to their text representation. This is less efficient than using native database types, but is easier to query.
rank integer UNIQUE#1 NOT NULL

Property-Value ordering. Any feature_cvterm can have multiple values for any particular property type - these are ordered in a list using rank, counting from zero. For properties that are single-valued rather than multi-valued, the default 0 value should be used.


Table: feature_dbxref

Links a feature to dbxrefs. This is for secondary identifiers; primary identifiers should use feature.dbxref_id.

feature_dbxref Structure
FK Name Type Description
feature_dbxref_id serial PRIMARY KEY

feature

feature_id integer UNIQUE#1 NOT NULL

dbxref

dbxref_id integer UNIQUE#1 NOT NULL
is_current boolean NOT NULL DEFAULT true

The is_current boolean indicates whether the linked dbxref is the current -official- dbxref for the linked feature.


Table: feature_pub

Provenance. Linking table between features and publications that mention them.

feature_pub Structure
FK Name Type Description
feature_pub_id serial PRIMARY KEY

feature

feature_id integer UNIQUE#1 NOT NULL

pub

pub_id integer UNIQUE#1 NOT NULL

Tables referencing this one via Foreign Key Constraints:



Table: feature_pubprop

Property or attribute of a feature_pub link.

feature_pubprop Structure
FK Name Type Description
feature_pubprop_id serial PRIMARY KEY

feature_pub

feature_pub_id integer UNIQUE#1 NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL
value text
rank integer UNIQUE#1 NOT NULL


Table: feature_relationship

Features can be arranged in graphs, e.g. "exon part_of transcript part_of gene"; If type is thought of as a verb, the each arc or edge makes a statement [Subject Verb Object]. The object can also be thought of as parent (containing feature), and subject as child (contained feature or subfeature). We include the relationship rank/order, because even though most of the time we can order things implicitly by sequence coordinates, we can not always do this - e.g. transpliced genes. It is also useful for quickly getting implicit introns.

feature_relationship Structure
FK Name Type Description
feature_relationship_id serial PRIMARY KEY

feature

subject_id integer UNIQUE#1 NOT NULL

The subject of the subj-predicate-obj sentence. This is typically the subfeature.

feature

object_id integer UNIQUE#1 NOT NULL

The object of the subj-predicate-obj sentence. This is typically the container feature.

cvterm

type_id integer UNIQUE#1 NOT NULL

Relationship type between subject and object. This is a cvterm, typically from the OBO relationship ontology, although other relationship types are allowed. The most common relationship type is OBO_REL:part_of. Valid relationship types are constrained by the Sequence Ontology.
value text

Additional notes or comments.
rank integer UNIQUE#1 NOT NULL

The ordering of subject features with respect to the object feature may be important (for example, exon ordering on a transcript - not always derivable if you take trans spliced genes into consideration). Rank is used to order these; starts from zero.

Tables referencing this one via Foreign Key Constraints:



Table: feature_relationship_pub

Provenance. Attach optional evidence to a feature_relationship in the form of a publication.

feature_relationship_pub Structure
FK Name Type Description
feature_relationship_pub_id serial PRIMARY KEY

feature_relationship

feature_relationship_id integer UNIQUE#1 NOT NULL

pub

pub_id integer UNIQUE#1 NOT NULL


Table: feature_relationshipprop

Extensible properties for feature_relationships. Analagous structure to featureprop. This table is largely optional and not used with a high frequency. Typical scenarios may be if one wishes to attach additional data to a feature_relationship - for example to say that the feature_relationship is only true in certain contexts.

feature_relationshipprop Structure
FK Name Type Description
feature_relationshipprop_id serial PRIMARY KEY

feature_relationship

feature_relationship_id integer UNIQUE#1 NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL

The name of the property/slot is a cvterm. The meaning of the property is defined in that cvterm. Currently there is no standard ontology for feature_relationship property types.
value text

The value of the property, represented as text. Numeric values are converted to their text representation. This is less efficient than using native database types, but is easier to query.
rank integer UNIQUE#1 NOT NULL

Property-Value ordering. Any feature_relationship can have multiple values for any particular property type - these are ordered in a list using rank, counting from zero. For properties that are single-valued rather than multi-valued, the default 0 value should be used.

Tables referencing this one via Foreign Key Constraints:



Table: feature_relationshipprop_pub

Provenance for feature_relationshipprop.

feature_relationshipprop_pub Structure
FK Name Type Description
feature_relationshipprop_pub_id serial PRIMARY KEY

feature_relationshipprop

feature_relationshipprop_id integer UNIQUE#1 NOT NULL

pub

pub_id integer UNIQUE#1 NOT NULL


Table: feature_synonym

Linking table between feature and synonym.

feature_synonym Structure
FK Name Type Description
feature_synonym_id serial PRIMARY KEY

synonym

synonym_id integer UNIQUE#1 NOT NULL

feature

feature_id integer UNIQUE#1 NOT NULL

pub

pub_id integer UNIQUE#1 NOT NULL

The pub_id link is for relating the usage of a given synonym to the publication in which it was used.
is_current boolean NOT NULL DEFAULT true

The is_current boolean indicates whether the linked synonym is the current -official- symbol for the linked feature.
is_internal boolean NOT NULL DEFAULT false

Typically a synonym exists so that somebody querying the db with an obsolete name can find the object theyre looking for (under its current name. If the synonym has been used publicly and deliberately (e.g. in a paper), it may also be listed in reports as a synonym. If the synonym was not used deliberately (e.g. there was a typo which went public), then the is_internal boolean may be set to -true- so that it is known that the synonym is -internal- and should be queryable but should not be listed in reports as a valid synonym.


Table: featureloc

The location of a feature relative to another feature. Important: interbase coordinates are used. This is vital as it allows us to represent zero-length features e.g. splice sites, insertion points without an awkward fuzzy system. Features typically have exactly ONE location, but this need not be the case. Some features may not be localized (e.g. a gene that has been characterized genetically but no sequence or molecular information is available). Note on multiple locations: Each feature can have 0 or more locations. Multiple locations do NOT indicate non-contiguous locations (if a feature such as a transcript has a non-contiguous location, then the subfeatures such as exons should always be manifested). Instead, multiple featurelocs for a feature designate alternate locations or grouped locations; for instance, a feature designating a blast hit or hsp will have two locations, one on the query feature, one on the subject feature. Features representing sequence variation could have alternate locations instantiated on a feature on the mutant strain. The column:rank is used to differentiate these different locations. Reflexive locations should never be stored - this is for -proper- (i.e. non-self) locations only; nothing should be located relative to itself.

featureloc Structure
FK Name Type Description
featureloc_id serial PRIMARY KEY

feature

feature_id integer UNIQUE#1 NOT NULL

The feature that is being located. Any feature can have zero or more featurelocs.

feature

srcfeature_id integer

The source feature which this location is relative to. Every location is relative to another feature (however, this column is nullable, because the srcfeature may not be known). All locations are -proper- that is, nothing should be located relative to itself. No cycles are allowed in the featureloc graph.
fmin integer

The leftmost/minimal boundary in the linear range represented by the featureloc. Sometimes (e.g. in Bioperl) this is called -start- although this is confusing because it does not necessarily represent the 5-prime coordinate. Important: This is space-based (interbase) coordinates, counting from zero. To convert this to the leftmost position in a base-oriented system (eg GFF, Bioperl), add 1 to fmin.
is_fmin_partial boolean NOT NULL DEFAULT false

This is typically false, but may be true if the value for column:fmin is inaccurate or the leftmost part of the range is unknown/unbounded.
fmax integer

The rightmost/maximal boundary in the linear range represented by the featureloc. Sometimes (e.g. in bioperl) this is called -end- although this is confusing because it does not necessarily represent the 3-prime coordinate. Important: This is space-based (interbase) coordinates, counting from zero. No conversion is required to go from fmax to the rightmost coordinate in a base-oriented system that counts from 1 (e.g. GFF, Bioperl).
is_fmax_partial boolean NOT NULL DEFAULT false

This is typically false, but may be true if the value for column:fmax is inaccurate or the rightmost part of the range is unknown/unbounded.
strand smallint

The orientation/directionality of the location. Should be 0, -1 or +1.
phase integer

Phase of translation with respect to srcfeature_id. Values are 0, 1, 2. It may not be possible to manifest this column for some features such as exons, because the phase is dependant on the spliceform (the same exon can appear in multiple spliceforms). This column is mostly useful for predicted exons and CDSs.
residue_info text

Alternative residues, when these differ from feature.residues. For instance, a SNP feature located on a wild and mutant protein would have different alternative residues. for alignment/similarity features, the alternative residues is used to represent the alignment string (CIGAR format). Note on variation features; even if we do not want to instantiate a mutant chromosome/contig feature, we can still represent a SNP etc with 2 locations, one (rank 0) on the genome, the other (rank 1) would have most fields null, except for alternative residues.
locgroup integer UNIQUE#1 NOT NULL

This is used to manifest redundant, derivable extra locations for a feature. The default locgroup=0 is used for the DIRECT location of a feature. Important: most Chado users may never use featurelocs WITH logroup > 0. Transitively derived locations are indicated with locgroup > 0. For example, the position of an exon on a BAC and in global chromosome coordinates. This column is used to differentiate these groupings of locations. The default locgroup 0 is used for the main or primary location, from which the others can be derived via coordinate transformations. Another example of redundant locations is storing ORF coordinates relative to both transcript and genome. Redundant locations open the possibility of the database getting into inconsistent states; this schema gives us the flexibility of both warehouse instantiations with redundant locations (easier for querying) and management instantiations with no redundant locations. An example of using both locgroup and rank: imagine a feature indicating a conserved region between the chromosomes of two different species. We may want to keep redundant locations on both contigs and chromosomes. We would thus have 4 locations for the single conserved region feature - two distinct locgroups (contig level and chromosome level) and two distinct ranks (for the two species).
rank integer UNIQUE#1 NOT NULL

Used when a feature has >1 location, otherwise the default rank 0 is used. Some features (e.g. blast hits and HSPs) have two locations - one on the query and one on the subject. Rank is used to differentiate these. Rank=0 is always used for the query, Rank=1 for the subject. For multiple alignments, assignment of rank is arbitrary. Rank is also used for sequence_variant features, such as SNPs. Rank=0 indicates the wildtype (or baseline) feature, Rank=1 indicates the mutant (or compared) feature.
featureloc Constraints
Name Constraint
featureloc_c2 CHECK ((fmin <= fmax))

Tables referencing this one via Foreign Key Constraints:



Table: featureloc_pub

Provenance of featureloc. Linking table between featurelocs and publications that mention them.

featureloc_pub Structure
FK Name Type Description
featureloc_pub_id serial PRIMARY KEY

featureloc

featureloc_id integer UNIQUE#1 NOT NULL

pub

pub_id integer UNIQUE#1 NOT NULL


Table: featureprop

A feature can have any number of slot-value property tags attached to it. This is an alternative to hardcoding a list of columns in the relational schema, and is completely extensible.

featureprop Structure
FK Name Type Description
featureprop_id serial PRIMARY KEY

feature

feature_id integer UNIQUE#1 NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL

The name of the property/slot is a cvterm. The meaning of the property is defined in that cvterm. Certain property types will only apply to certain feature types (e.g. the anticodon property will only apply to tRNA features) ; the types here come from the sequence feature property ontology.
value text

The value of the property, represented as text. Numeric values are converted to their text representation. This is less efficient than using native database types, but is easier to query.
rank integer UNIQUE#1 NOT NULL

Property-Value ordering. Any feature can have multiple values for any particular property type - these are ordered in a list using rank, counting from zero. For properties that are single-valued rather than multi-valued, the default 0 value should be used

Tables referencing this one via Foreign Key Constraints:



Table: featureprop_pub

Provenance. Any featureprop assignment can optionally be supported by a publication.

featureprop_pub Structure
FK Name Type Description
featureprop_pub_id serial PRIMARY KEY

featureprop

featureprop_id integer UNIQUE#1 NOT NULL

pub

pub_id integer UNIQUE#1 NOT NULL


Table: synonym

A synonym for a feature. One feature can have multiple synonyms, and the same synonym can apply to multiple features.

synonym Structure
FK Name Type Description
synonym_id serial PRIMARY KEY
name character varying(255) UNIQUE#1 NOT NULL

The synonym itself. Should be human-readable machine-searchable ascii text.

cvterm

type_id integer UNIQUE#1 NOT NULL

Types would be symbol and fullname for now.
synonym_sgml character varying(255) NOT NULL

The fully specified synonym, with any non-ascii characters encoded in SGML.

Tables referencing this one via Foreign Key Constraints:



Table: phylonode

This is the most pervasive element in the phylogeny module, cataloging the "phylonodes" of tree graphs. Edges are implied by the parent_phylonode_id reflexive closure. For all nodes in a nested set implementation the left and right index will be *between* the parents left and right indexes.

phylonode Structure
FK Name Type Description
phylonode_id serial PRIMARY KEY

phylotree

phylotree_id integer UNIQUE#1 UNIQUE#2 NOT NULL

phylonode

parent_phylonode_id integer

Root phylonode can have null parent_phylonode_id value.
left_idx integer UNIQUE#1 NOT NULL
right_idx integer UNIQUE#2 NOT NULL

cvterm

type_id integer

Type: e.g. root, interior, leaf.

feature

feature_id integer

Phylonodes can have optional features attached to them e.g. a protein or nucleotide sequence usually attached to a leaf of the phylotree for non-leaf nodes, the feature may be a feature that is an instance of SO:match; this feature is the alignment of all leaf features beneath it.
label character varying(255)
distance double precision

Tables referencing this one via Foreign Key Constraints:



Table: phylonode_dbxref

For example, for orthology, paralogy group identifiers; could also be used for NCBI taxonomy; for sequences, refer to phylonode_feature, feature associated dbxrefs.

phylonode_dbxref Structure
FK Name Type Description
phylonode_dbxref_id serial PRIMARY KEY

phylonode

phylonode_id integer UNIQUE#1 NOT NULL

dbxref

dbxref_id integer UNIQUE#1 NOT NULL


Table: phylonode_organism

This linking table should only be used for nodes in taxonomy trees; it provides a mapping between the node and an organism. One node can have zero or one organisms, one organism can have zero or more nodes (although typically it should only have one in the standard NCBI taxonomy tree).

phylonode_organism Structure
FK Name Type Description
phylonode_organism_id serial PRIMARY KEY

phylonode

phylonode_id integer UNIQUE NOT NULL

One phylonode cannot refer to >1 organism.

organism

organism_id integer NOT NULL


Table: phylonode_pub

phylonode_pub Structure
FK Name Type Description
phylonode_pub_id serial PRIMARY KEY

phylonode

phylonode_id integer UNIQUE#1 NOT NULL

pub

pub_id integer UNIQUE#1 NOT NULL


Table: phylonode_relationship

This is for relationships that are not strictly hierarchical; for example, horizontal gene transfer. Most phylogenetic trees are strictly hierarchical, nevertheless it is here for completeness.

phylonode_relationship Structure
FK Name Type Description
phylonode_relationship_id serial PRIMARY KEY

phylonode

subject_id integer UNIQUE#1 NOT NULL

phylonode

object_id integer UNIQUE#1 NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL
rank integer

phylotree

phylotree_id integer NOT NULL


Table: phylonodeprop

phylonodeprop Structure
FK Name Type Description
phylonodeprop_id serial PRIMARY KEY

phylonode

phylonode_id integer UNIQUE#1 NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL

type_id could designate phylonode hierarchy relationships, for example: species taxonomy (kingdom, order, family, genus, species), "ortholog/paralog", "fold/superfold", etc.
value text UNIQUE#1 NOT NULL DEFAULT ''::text
rank integer UNIQUE#1 NOT NULL


Table: phylotree

Global anchor for phylogenetic tree.

phylotree Structure
FK Name Type Description
phylotree_id serial PRIMARY KEY

dbxref

dbxref_id integer NOT NULL
name character varying(255)

cvterm

type_id integer

Type: protein, nucleotide, taxonomy, for example. The type should be any SO type, or "taxonomy".

analysis

analysis_id integer
comment text

Tables referencing this one via Foreign Key Constraints:



Table: phylotree_pub

Tracks citations global to the tree e.g. multiple sequence alignment supporting tree construction.

phylotree_pub Structure
FK Name Type Description
phylotree_pub_id serial PRIMARY KEY

phylotree

phylotree_id integer UNIQUE#1 NOT NULL

pub

pub_id integer UNIQUE#1 NOT NULL


Table: library

library Structure
FK Name Type Description
library_id serial PRIMARY KEY

organism

organism_id integer UNIQUE#1 NOT NULL
name character varying(255)
uniquename text UNIQUE#1 NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL

The type_id foreign key links to a controlled vocabulary of library types. Examples of this would be: "cDNA_library" or "genomic_library"

Tables referencing this one via Foreign Key Constraints:



Table: library_cvterm

The table library_cvterm links a library to controlled vocabularies which describe the library. For instance, there might be a link to the anatomy cv for "head" or "testes" for a head or testes library.

library_cvterm Structure
FK Name Type Description
library_cvterm_id serial PRIMARY KEY

library

library_id integer UNIQUE#1 NOT NULL

cvterm

cvterm_id integer UNIQUE#1 NOT NULL

pub

pub_id integer UNIQUE#1 NOT NULL


Table: library_feature

library_feature links a library to the clones which are contained in the library. Examples of such linked features might be "cDNA_clone" or "genomic_clone".

library_feature Structure
FK Name Type Description
library_feature_id serial PRIMARY KEY

library

library_id integer UNIQUE#1 NOT NULL

feature

feature_id integer UNIQUE#1 NOT NULL


Table: library_pub

library_pub Structure
FK Name Type Description
library_pub_id serial PRIMARY KEY

library

library_id integer UNIQUE#1 NOT NULL

pub

pub_id integer UNIQUE#1 NOT NULL


Table: library_synonym

library_synonym Structure
FK Name Type Description
library_synonym_id serial PRIMARY KEY

synonym

synonym_id integer UNIQUE#1 NOT NULL

library

library_id integer UNIQUE#1 NOT NULL

pub

pub_id integer UNIQUE#1 NOT NULL

The pub_id link is for relating the usage of a given synonym to the publication in which it was used.
is_current boolean NOT NULL DEFAULT true

The is_current bit indicates whether the linked synonym is the current -official- symbol for the linked library.
is_internal boolean NOT NULL DEFAULT false

Typically a synonym exists so that somebody querying the database with an obsolete name can find the object they are looking for under its current name. If the synonym has been used publicly and deliberately (e.g. in a paper), it my also be listed in reports as a synonym. If the synonym was not used deliberately (e.g., there was a typo which went public), then the is_internal bit may be set to "true" so that it is known that the synonym is "internal" and should be queryable but should not be listed in reports as a valid synonym.


Table: libraryprop

libraryprop Structure
FK Name Type Description
libraryprop_id serial PRIMARY KEY

library

library_id integer UNIQUE#1 NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL
value text
rank integer UNIQUE#1 NOT NULL


Table: contact

Model persons, institutes, groups, organizations, etc.

contact Structure
FK Name Type Description
contact_id serial PRIMARY KEY

cvterm

type_id integer

What type of contact is this? E.g. "person", "lab".
name character varying(255) UNIQUE NOT NULL
description character varying(255)

Tables referencing this one via Foreign Key Constraints:



Table: contact_relationship

Model relationships between contacts

contact_relationship Structure
FK Name Type Description
contact_relationship_id serial PRIMARY KEY

cvterm

type_id integer UNIQUE#1 NOT NULL

Relationship type between subject and object. This is a cvterm, typically from the OBO relationship ontology, although other relationship types are allowed.

contact

subject_id integer UNIQUE#1 NOT NULL

The subject of the subj-predicate-obj sentence. In a DAG, this corresponds to the child node.

contact

object_id integer UNIQUE#1 NOT NULL

The object of the subj-predicate-obj sentence. In a DAG, this corresponds to the parent node.


Table: stock

Any stock can be globally identified by the combination of organism, uniquename and stock type. A stock is the physical entities, either living or preserved, held by collections. Stocks belong to a collection; they have IDs, type, organism, description and may have a genotype.

stock Structure
FK Name Type Description
stock_id serial PRIMARY KEY

dbxref

dbxref_id integer

The dbxref_id is an optional primary stable identifier for this stock. Secondary indentifiers and external dbxrefs go in table: stock_dbxref.

organism

organism_id integer UNIQUE#1 NOT NULL

The organism_id is the organism to which the stock belongs. This column is mandatory.
name character varying(255)

The name is a human-readable local name for a stock.
uniquename text UNIQUE#1 NOT NULL
description text

The description is the genetic description provided in the stock list.

cvterm

type_id integer UNIQUE#1 NOT NULL

The type_id foreign key links to a controlled vocabulary of stock types. The would include living stock, genomic DNA, preserved specimen. Secondary cvterms for stocks would go in stock_cvterm.
is_obsolete boolean NOT NULL DEFAULT false

Tables referencing this one via Foreign Key Constraints:



Table: stock_cvterm

stock_cvterm links a stock to cvterms. This is for secondary cvterms; primary cvterms should use stock.type_id.

stock_cvterm Structure
FK Name Type Description
stock_cvterm_id serial PRIMARY KEY

stock

stock_id integer UNIQUE#1 NOT NULL

cvterm

cvterm_id integer UNIQUE#1 NOT NULL

pub

pub_id integer UNIQUE#1 NOT NULL


Table: stock_dbxref

stock_dbxref links a stock to dbxrefs. This is for secondary identifiers; primary identifiers should use stock.dbxref_id.

stock_dbxref Structure
FK Name Type Description
stock_dbxref_id serial PRIMARY KEY

stock

stock_id integer UNIQUE#1 NOT NULL

dbxref

dbxref_id integer UNIQUE#1 NOT NULL
is_current boolean NOT NULL DEFAULT true

The is_current boolean indicates whether the linked dbxref is the current -official- dbxref for the linked stock.


Table: stock_genotype

Simple table linking a stock to a genotype. Features with genotypes can be linked to stocks thru feature_genotype -> genotype -> stock_genotype -> stock.

stock_genotype Structure
FK Name Type Description
stock_genotype_id serial PRIMARY KEY

stock

stock_id integer UNIQUE#1 NOT NULL

genotype

genotype_id integer UNIQUE#1 NOT NULL


Table: stock_pub

Provenance. Linking table between stocks and, for example, a stocklist computer file.

stock_pub Structure
FK Name Type Description
stock_pub_id serial PRIMARY KEY

stock

stock_id integer UNIQUE#1 NOT NULL

pub

pub_id integer UNIQUE#1 NOT NULL


Table: stock_relationship

stock_relationship Structure
FK Name Type Description
stock_relationship_id serial PRIMARY KEY

stock

subject_id integer UNIQUE#1 NOT NULL

stock_relationship.subject_id is the subject of the subj-predicate-obj sentence. This is typically the substock.

stock

object_id integer UNIQUE#1 NOT NULL

stock_relationship.object_id is the object of the subj-predicate-obj sentence. This is typically the container stock.

cvterm

type_id integer UNIQUE#1 NOT NULL

stock_relationship.type_id is relationship type between subject and object. This is a cvterm, typically from the OBO relationship ontology, although other relationship types are allowed.
value text

stock_relationship.value is for additional notes or comments.
rank integer UNIQUE#1 NOT NULL

stock_relationship.rank is the ordering of subject stocks with respect to the object stock may be important where rank is used to order these; starts from zero.

Tables referencing this one via Foreign Key Constraints:



Table: stock_relationship_pub

Provenance. Attach optional evidence to a stock_relationship in the form of a publication.

stock_relationship_pub Structure
FK Name Type Description
stock_relationship_pub_id serial PRIMARY KEY

stock_relationship

stock_relationship_id integer UNIQUE#1 NOT NULL

pub

pub_id integer UNIQUE#1 NOT NULL


Table: stockcollection

The lab or stock center distributing the stocks in their collection.

stockcollection Structure
FK Name Type Description
stockcollection_id serial PRIMARY KEY

cvterm

type_id integer UNIQUE#1 NOT NULL

type_id is the collection type cv.

contact

contact_id integer

contact_id links to the contact information for the collection.
name character varying(255)

name is the collection.
uniquename text UNIQUE#1 NOT NULL

uniqename is the value of the collection cv.

Tables referencing this one via Foreign Key Constraints:



Table: stockcollection_stock

stockcollection_stock links a stock collection to the stocks which are contained in the collection.

stockcollection_stock Structure
FK Name Type Description
stockcollection_stock_id serial PRIMARY KEY

stockcollection

stockcollection_id integer UNIQUE#1 NOT NULL

stock

stock_id integer UNIQUE#1 NOT NULL


Table: stockcollectionprop

The table stockcollectionprop contains the value of the stock collection such as website/email URLs; the value of the stock collection order URLs.

stockcollectionprop Structure
FK Name Type Description
stockcollectionprop_id serial PRIMARY KEY

stockcollection

stockcollection_id integer UNIQUE#1 NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL

The cv for the type_id is "stockcollection property type".
value text
rank integer UNIQUE#1 NOT NULL


Table: stockprop

A stock can have any number of slot-value property tags attached to it. This is an alternative to hardcoding a list of columns in the relational schema, and is completely extensible. There is a unique constraint, stockprop_c1, for the combination of stock_id, rank, and type_id. Multivalued property-value pairs must be differentiated by rank.

stockprop Structure
FK Name Type Description
stockprop_id serial PRIMARY KEY

stock

stock_id integer UNIQUE#1 NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL
value text
rank integer UNIQUE#1 NOT NULL

Tables referencing this one via Foreign Key Constraints:



Table: stockprop_pub

Provenance. Any stockprop assignment can optionally be supported by a publication.

stockprop_pub Structure
FK Name Type Description
stockprop_pub_id serial PRIMARY KEY

stockprop

stockprop_id integer UNIQUE#1 NOT NULL

pub

pub_id integer UNIQUE#1 NOT NULL


Table: environment

The environmental component of a phenotype description.

environment Structure
FK Name Type Description
environment_id serial PRIMARY KEY
uniquename text UNIQUE NOT NULL
description text

Tables referencing this one via Foreign Key Constraints:



Table: environment_cvterm

environment_cvterm Structure
FK Name Type Description
environment_cvterm_id serial PRIMARY KEY

environment

environment_id integer UNIQUE#1 NOT NULL

cvterm

cvterm_id integer UNIQUE#1 NOT NULL


Table: feature_genotype

feature_genotype Structure
FK Name Type Description
feature_genotype_id serial PRIMARY KEY

feature

feature_id integer UNIQUE#1 NOT NULL

genotype

genotype_id integer UNIQUE#1 NOT NULL

feature

chromosome_id integer UNIQUE#1

A feature of SO type "chromosome".
rank integer UNIQUE#1 NOT NULL

rank can be used for n-ploid organisms or to preserve order.
cgroup integer UNIQUE#1 NOT NULL

Spatially distinguishable group. group can be used for distinguishing the chromosomal groups, for example (RNAi products and so on can be treated as different groups, as they do not fall on a particular chromosome).

cvterm

cvterm_id integer UNIQUE#1 NOT NULL


Table: genotype

Genetic context. A genotype is defined by a collection of features, mutations, balancers, deficiencies, haplotype blocks, or engineered constructs.

genotype Structure
FK Name Type Description
genotype_id serial PRIMARY KEY
name text

Optional alternative name for a genotype, for display purposes.
uniquename text UNIQUE NOT NULL

The unique name for a genotype; typically derived from the features making up the genotype.
description character varying(255)

Tables referencing this one via Foreign Key Constraints:



Table: phendesc

A summary of a _set_ of phenotypic statements for any one gcontext made in any one publication.

phendesc Structure
FK Name Type Description
phendesc_id serial PRIMARY KEY

genotype

genotype_id integer UNIQUE#1 NOT NULL

environment

environment_id integer UNIQUE#1 NOT NULL
description text NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL

pub

pub_id integer UNIQUE#1 NOT NULL


Table: phenotype_comparison

Comparison of phenotypes e.g., genotype1/environment1/phenotype1 "non-suppressible" with respect to genotype2/environment2/phenotype2.

phenotype_comparison Structure
FK Name Type Description
phenotype_comparison_id serial PRIMARY KEY

genotype

genotype1_id integer UNIQUE#1 NOT NULL

environment

environment1_id integer UNIQUE#1 NOT NULL

genotype

genotype2_id integer UNIQUE#1 NOT NULL

environment

environment2_id integer UNIQUE#1 NOT NULL

phenotype

phenotype1_id integer UNIQUE#1 NOT NULL

phenotype

phenotype2_id integer

cvterm

type_id integer UNIQUE#1 NOT NULL

pub

pub_id integer UNIQUE#1 NOT NULL


Table: phenstatement

Phenotypes are things like "larval lethal". Phenstatements are things like "dpp-1 is recessive larval lethal". So essentially phenstatement is a linking table expressing the relationship between genotype, environment, and phenotype.

phenstatement Structure
FK Name Type Description
phenstatement_id serial PRIMARY KEY

genotype

genotype_id integer UNIQUE#1 NOT NULL

environment

environment_id integer UNIQUE#1 NOT NULL

phenotype

phenotype_id integer UNIQUE#1 NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL

pub

pub_id integer UNIQUE#1 NOT NULL


Table: acquisition

This represents the scanning of hybridized material. The output of this process is typically a digital image of an array.

acquisition Structure
FK Name Type Description
acquisition_id serial PRIMARY KEY

assay

assay_id integer NOT NULL

protocol

protocol_id integer

channel

channel_id integer
acquisitiondate timestamp without time zone DEFAULT ('now'::text)::timestamp(6) with time zone
name text UNIQUE
uri text

Tables referencing this one via Foreign Key Constraints:



Table: acquisition_relationship

Multiple monochrome images may be merged to form a multi-color image. Red-green images of 2-channel hybridizations are an example of this.

acquisition_relationship Structure
FK Name Type Description
acquisition_relationship_id serial PRIMARY KEY

acquisition

subject_id integer UNIQUE#1 NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL

acquisition

object_id integer UNIQUE#1 NOT NULL
value text
rank integer UNIQUE#1 NOT NULL


Table: acquisitionprop

Parameters associated with image acquisition.

acquisitionprop Structure
FK Name Type Description
acquisitionprop_id serial PRIMARY KEY

acquisition

acquisition_id integer UNIQUE#1 NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL
value text
rank integer UNIQUE#1 NOT NULL


Table: arraydesign

General properties about an array. An array is a template used to generate physical slides, etc. It contains layout information, as well as global array properties, such as material (glass, nylon) and spot dimensions (in rows/columns).

arraydesign Structure
FK Name Type Description
arraydesign_id serial PRIMARY KEY

contact

manufacturer_id integer NOT NULL

cvterm

platformtype_id integer NOT NULL

cvterm

substratetype_id integer

protocol

protocol_id integer

dbxref

dbxref_id integer
name text UNIQUE NOT NULL
version text
description text
array_dimensions text
element_dimensions text
num_of_elements integer
num_array_columns integer
num_array_rows integer
num_grid_columns integer
num_grid_rows integer
num_sub_columns integer
num_sub_rows integer

Tables referencing this one via Foreign Key Constraints:



Table: arraydesignprop

Extra array design properties that are not accounted for in arraydesign.

arraydesignprop Structure
FK Name Type Description
arraydesignprop_id serial PRIMARY KEY

arraydesign

arraydesign_id integer UNIQUE#1 NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL
value text
rank integer UNIQUE#1 NOT NULL


Table: assay

An assay consists of a physical instance of an array, combined with the conditions used to create the array (protocols, technician information). The assay can be thought of as a hybridization.

assay Structure
FK Name Type Description
assay_id serial PRIMARY KEY

arraydesign

arraydesign_id integer NOT NULL

protocol

protocol_id integer
assaydate timestamp without time zone DEFAULT ('now'::text)::timestamp(6) with time zone
arrayidentifier text
arraybatchidentifier text

contact

operator_id integer NOT NULL

dbxref

dbxref_id integer
name text UNIQUE
description text

Tables referencing this one via Foreign Key Constraints:



Table: assay_biomaterial

A biomaterial can be hybridized many times (technical replicates), or combined with other biomaterials in a single hybridization (for two-channel arrays).

assay_biomaterial Structure
FK Name Type Description
assay_biomaterial_id serial PRIMARY KEY

assay

assay_id integer UNIQUE#1 NOT NULL

biomaterial

biomaterial_id integer UNIQUE#1 NOT NULL

channel

channel_id integer UNIQUE#1
rank integer UNIQUE#1 NOT NULL


Table: assay_project

Link assays to projects.

assay_project Structure
FK Name Type Description
assay_project_id serial PRIMARY KEY

assay

assay_id integer UNIQUE#1 NOT NULL

project

project_id integer UNIQUE#1 NOT NULL


Table: assayprop

Extra assay properties that are not accounted for in assay.

assayprop Structure
FK Name Type Description
assayprop_id serial PRIMARY KEY

assay

assay_id integer UNIQUE#1 NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL
value text
rank integer UNIQUE#1 NOT NULL


Table: biomaterial

A biomaterial represents the MAGE concept of BioSource, BioSample, and LabeledExtract. It is essentially some biological material (tissue, cells, serum) that may have been processed. Processed biomaterials should be traceable back to raw biomaterials via the biomaterialrelationship table.

biomaterial Structure
FK Name Type Description
biomaterial_id serial PRIMARY KEY

organism

taxon_id integer

contact

biosourceprovider_id integer

dbxref

dbxref_id integer
name text UNIQUE
description text

Tables referencing this one via Foreign Key Constraints:



Table: biomaterial_dbxref

biomaterial_dbxref Structure
FK Name Type Description
biomaterial_dbxref_id serial PRIMARY KEY

biomaterial

biomaterial_id integer UNIQUE#1 NOT NULL

dbxref

dbxref_id integer UNIQUE#1 NOT NULL


Table: biomaterial_relationship

Relate biomaterials to one another. This is a way to track a series of treatments or material splits/merges, for instance.

biomaterial_relationship Structure
FK Name Type Description
biomaterial_relationship_id serial PRIMARY KEY

biomaterial

subject_id integer UNIQUE#1 NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL

biomaterial

object_id integer UNIQUE#1 NOT NULL


Table: biomaterial_treatment

Link biomaterials to treatments. Treatments have an order of operations (rank), and associated measurements (unittype_id, value).

biomaterial_treatment Structure
FK Name Type Description
biomaterial_treatment_id serial PRIMARY KEY

biomaterial

biomaterial_id integer UNIQUE#1 NOT NULL

treatment

treatment_id integer UNIQUE#1 NOT NULL

cvterm

unittype_id integer
value real
rank integer NOT NULL


Table: biomaterialprop

Extra biomaterial properties that are not accounted for in biomaterial.

biomaterialprop Structure
FK Name Type Description
biomaterialprop_id serial PRIMARY KEY

biomaterial

biomaterial_id integer UNIQUE#1 NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL
value text
rank integer UNIQUE#1 NOT NULL


Table: channel

Different array platforms can record signals from one or more channels (cDNA arrays typically use two CCD, but Affymetrix uses only one).

channel Structure
FK Name Type Description
channel_id serial PRIMARY KEY
name text UNIQUE NOT NULL
definition text NOT NULL

Tables referencing this one via Foreign Key Constraints:



Table: control

control Structure
FK Name Type Description
control_id serial PRIMARY KEY

cvterm

type_id integer NOT NULL

assay

assay_id integer NOT NULL

tableinfo

tableinfo_id integer NOT NULL
row_id integer NOT NULL
name text
value text
rank integer NOT NULL


Table: element

Represents a feature of the array. This is typically a region of the array coated or bound to DNA.

element Structure
FK Name Type Description
element_id serial PRIMARY KEY

feature

feature_id integer UNIQUE#1

arraydesign

arraydesign_id integer UNIQUE#1 NOT NULL

cvterm

type_id integer

dbxref

dbxref_id integer

Tables referencing this one via Foreign Key Constraints:



Table: element_relationship

Sometimes we want to combine measurements from multiple elements to get a composite value. Affymetrix combines many probes to form a probeset measurement, for instance.

element_relationship Structure
FK Name Type Description
element_relationship_id serial PRIMARY KEY

element

subject_id integer UNIQUE#1 NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL

element

object_id integer UNIQUE#1 NOT NULL
value text
rank integer UNIQUE#1 NOT NULL


Table: elementresult

An element on an array produces a measurement when hybridized to a biomaterial (traceable through quantification_id). This is the base data from which tables that actually contain data inherit.

elementresult Structure
FK Name Type Description
elementresult_id serial PRIMARY KEY

element

element_id integer UNIQUE#1 NOT NULL

quantification

quantification_id integer UNIQUE#1 NOT NULL
signal double precision NOT NULL

Tables referencing this one via Foreign Key Constraints:



Table: elementresult_relationship

Sometimes we want to combine measurements from multiple elements to get a composite value. Affymetrix combines many probes to form a probeset measurement, for instance.

elementresult_relationship Structure
FK Name Type Description
elementresult_relationship_id serial PRIMARY KEY

elementresult

subject_id integer UNIQUE#1 NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL

elementresult

object_id integer UNIQUE#1 NOT NULL
value text
rank integer UNIQUE#1 NOT NULL


Table: magedocumentation

magedocumentation Structure
FK Name Type Description
magedocumentation_id serial PRIMARY KEY

mageml

mageml_id integer NOT NULL

tableinfo

tableinfo_id integer NOT NULL
row_id integer NOT NULL
mageidentifier text NOT NULL


Table: mageml

This table is for storing extra bits of MAGEml in a denormalized form. More normalization would require many more tables.

mageml Structure
FK Name Type Description
mageml_id serial PRIMARY KEY
mage_package text NOT NULL
mage_ml text NOT NULL

Tables referencing this one via Foreign Key Constraints:



Table: protocol

Procedural notes on how data was prepared and processed.

protocol Structure
FK Name Type Description
protocol_id serial PRIMARY KEY

cvterm

type_id integer NOT NULL

pub

pub_id integer

dbxref

dbxref_id integer
name text UNIQUE NOT NULL
uri text
protocoldescription text
hardwaredescription text
softwaredescription text

Tables referencing this one via Foreign Key Constraints:



Table: protocolparam

Parameters related to a protocol. For example, if the protocol is a soak, this might include attributes of bath temperature and duration.

protocolparam Structure
FK Name Type Description
protocolparam_id serial PRIMARY KEY

protocol

protocol_id integer NOT NULL
name text NOT NULL

cvterm

datatype_id integer

cvterm

unittype_id integer
value text
rank integer NOT NULL


Table: quantification

Quantification is the transformation of an image acquisition to numeric data. This typically involves statistical procedures.

quantification Structure
FK Name Type Description
quantification_id serial PRIMARY KEY

acquisition

acquisition_id integer NOT NULL

contact

operator_id integer

protocol

protocol_id integer

analysis

analysis_id integer UNIQUE#1 NOT NULL
quantificationdate timestamp without time zone DEFAULT ('now'::text)::timestamp(6) with time zone
name text UNIQUE#1
uri text

Tables referencing this one via Foreign Key Constraints:



Table: quantification_relationship

There may be multiple rounds of quantification, this allows us to keep an audit trail of what values went where.

quantification_relationship Structure
FK Name Type Description
quantification_relationship_id serial PRIMARY KEY

quantification

subject_id integer UNIQUE#1 NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL

quantification

object_id integer UNIQUE#1 NOT NULL


Table: quantificationprop

Extra quantification properties that are not accounted for in quantification.

quantificationprop Structure
FK Name Type Description
quantificationprop_id serial PRIMARY KEY

quantification

quantification_id integer UNIQUE#1 NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL
value text
rank integer UNIQUE#1 NOT NULL


Table: study

Study represents an experiment, published or otherwise, that produced microarray data

study Structure
FK Name Type Description
study_id serial PRIMARY KEY

contact

contact_id integer NOT NULL

pub

pub_id integer

dbxref

dbxref_id integer
name text UNIQUE NOT NULL
description text

Tables referencing this one via Foreign Key Constraints:


Table: study_assay

study_assay Structure
FK Name Type Description
study_assay_id serial PRIMARY KEY

study

study_id integer UNIQUE#1 NOT NULL

assay

assay_id integer UNIQUE#1 NOT NULL


Table: studydesign

studydesign Structure
FK Name Type Description
studydesign_id serial PRIMARY KEY

study

study_id integer NOT NULL
description text

Tables referencing this one via Foreign Key Constraints:



Table: studydesignprop

studydesignprop Structure
FK Name Type Description
studydesignprop_id serial PRIMARY KEY

studydesign

studydesign_id integer UNIQUE#1 NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL
value text
rank integer UNIQUE#1 NOT NULL


Table: studyfactor

studyfactor Structure
FK Name Type Description
studyfactor_id serial PRIMARY KEY

studydesign

studydesign_id integer NOT NULL

cvterm

type_id integer
name text NOT NULL
description text

Tables referencing this one via Foreign Key Constraints:



Table: studyfactorvalue

studyfactorvalue Structure
FK Name Type Description
studyfactorvalue_id serial PRIMARY KEY

studyfactor

studyfactor_id integer NOT NULL

assay

assay_id integer NOT NULL
factorvalue text
name text
rank integer NOT NULL


Table: treatment

A biomaterial may undergo multiple treatments. Examples of treatments: apoxia, fluorophore and biotin labeling.

treatment Structure
FK Name Type Description
treatment_id serial PRIMARY KEY
rank integer NOT NULL

biomaterial

biomaterial_id integer NOT NULL

cvterm

type_id integer NOT NULL

protocol

protocol_id integer
name text

Tables referencing this one via Foreign Key Constraints:



Table: featuremap

featuremap Structure
FK Name Type Description
featuremap_id serial PRIMARY KEY
name character varying(255) UNIQUE
description text

cvterm

unittype_id integer

Tables referencing this one via Foreign Key Constraints:



Table: featuremap_pub

featuremap_pub Structure
FK Name Type Description
featuremap_pub_id serial PRIMARY KEY

featuremap

featuremap_id integer NOT NULL

pub

pub_id integer NOT NULL


Table: featurepos

featurepos Structure
FK Name Type Description
featurepos_id serial PRIMARY KEY

featuremap

featuremap_id serial NOT NULL

feature

feature_id integer NOT NULL

feature

map_feature_id integer NOT NULL

map_feature_id links to the feature (map) upon which the feature is being localized.
mappos double precision NOT NULL


Table: featurerange

In cases where the start and end of a mapped feature is a range, leftendf and rightstartf are populated. leftstartf_id, leftendf_id, rightstartf_id, rightendf_id are the ids of features with respect to which the feature is being mapped. These may be cytological bands.

featurerange Structure
FK Name Type Description
featurerange_id serial PRIMARY KEY

featuremap

featuremap_id integer NOT NULL

featuremap_id is the id of the feature being mapped.

feature

feature_id integer NOT NULL

feature

leftstartf_id integer NOT NULL

feature

leftendf_id integer

feature

rightstartf_id integer

feature

rightendf_id integer NOT NULL
rangestr character varying(255)


Table: analysis

An analysis is a particular type of a computational analysis; it may be a blast of one sequence against another, or an all by all blast, or a different kind of analysis altogether. It is a single unit of computation.

analysis Structure
FK Name Type Description
analysis_id serial PRIMARY KEY
name character varying(255)

A way of grouping analyses. This should be a handy short identifier that can help people find an analysis they want. For instance "tRNAscan", "cDNA", "FlyPep", "SwissProt", and it should not be assumed to be unique. For instance, there may be lots of separate analyses done against a cDNA database.
description text
program character varying(255) UNIQUE#1 NOT NULL

Program name, e.g. blastx, blastp, sim4, genscan.
programversion character varying(255) UNIQUE#1 NOT NULL

Version description, e.g. TBLASTX 2.0MP-WashU [09-Nov-2000].
algorithm character varying(255)

Algorithm name, e.g. blast.
sourcename character varying(255) UNIQUE#1

Source name, e.g. cDNA, SwissProt.
sourceversion character varying(255)
sourceuri text

This is an optional, permanent URL or URI for the source of the analysis. The idea is that someone could recreate the analysis directly by going to this URI and fetching the source data (e.g. the blast database, or the training model).
timeexecuted timestamp without time zone NOT NULL DEFAULT ('now'::text)::timestamp(6) with time zone

Tables referencing this one via Foreign Key Constraints:



Table: analysisfeature

Computational analyses generate features (e.g. Genscan generates transcripts and exons; sim4 alignments generate similarity/match features). analysisfeatures are stored using the feature table from the sequence module. The analysisfeature table is used to decorate these features, with analysis specific attributes. A feature is an analysisfeature if and only if there is a corresponding entry in the analysisfeature table. analysisfeatures will have two or more featureloc entries, with rank indicating query/subject

analysisfeature Structure
FK Name Type Description
analysisfeature_id serial PRIMARY KEY

feature

feature_id integer UNIQUE#1 NOT NULL

analysis

analysis_id integer UNIQUE#1 NOT NULL
rawscore double precision

This is the native score generated by the program; for example, the bitscore generated by blast, sim4 or genscan scores. One should not assume that high is necessarily better than low.
normscore double precision

This is the rawscore but semi-normalized. Complete normalization to allow comparison of features generated by different programs would be nice but too difficult. Instead the normalization should strive to enforce the following semantics: * normscores are floating point numbers >= 0, * high normscores are better than low one. For most programs, it would be sufficient to make the normscore the same as this rawscore, providing these semantics are satisfied.
significance double precision

This is some kind of expectation or probability metric, representing the probability that the analysis would appear randomly given the model. As such, any program or person querying this table can assume the following semantics: * 0 <= significance <= n, where n is a positive number, theoretically unbounded but unlikely to be more than 10 * low numbers are better than high numbers.
identity double precision

Percent identity between the locations compared. Note that these 4 metrics do not cover the full range of scores possible; it would be undesirable to list every score possible, as this should be kept extensible. instead, for non-standard scores, use the analysisprop table.


Table: analysisprop

analysisprop Structure
FK Name Type Description
analysisprop_id serial PRIMARY KEY

analysis

analysis_id integer UNIQUE#1 NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL
value text UNIQUE#1


Table: eimage

eimage Structure
FK Name Type Description
eimage_id serial PRIMARY KEY
eimage_data text

We expect images in eimage_data (e.g. JPEGs) to be uuencoded.
eimage_type character varying(255) NOT NULL

Describes the type of data in eimage_data.
image_uri character varying(255)

Tables referencing this one via Foreign Key Constraints:



Table: expression

The expression table is essentially a bridge table.

expression Structure
FK Name Type Description
expression_id serial PRIMARY KEY
uniquename text UNIQUE NOT NULL
md5checksum character(32)
description text

Tables referencing this one via Foreign Key Constraints:



Table: expression_cvterm

expression_cvterm Structure
FK Name Type Description
expression_cvterm_id serial PRIMARY KEY

expression

expression_id integer UNIQUE#1 NOT NULL

cvterm

cvterm_id integer UNIQUE#1 NOT NULL
rank integer NOT NULL

cvterm

cvterm_type_id integer UNIQUE#1 NOT NULL

Tables referencing this one via Foreign Key Constraints:



Table: expression_image

expression_image Structure
FK Name Type Description
expression_image_id serial PRIMARY KEY

expression

expression_id integer UNIQUE#1 NOT NULL

eimage

eimage_id integer UNIQUE#1 NOT NULL


Table: expression_pub

expression_pub Structure
FK Name Type Description
expression_pub_id serial PRIMARY KEY

expression

expression_id integer UNIQUE#1 NOT NULL

pub

pub_id integer UNIQUE#1 NOT NULL


Table: feature_expression

feature_expression Structure
FK Name Type Description
feature_expression_id serial PRIMARY KEY

expression

expression_id integer UNIQUE#1 NOT NULL

feature

feature_id integer UNIQUE#1 NOT NULL

pub

pub_id integer UNIQUE#1 NOT NULL

Tables referencing this one via Foreign Key Constraints:



Table: feature_phenotype

feature_phenotype Structure
FK Name Type Description
feature_phenotype_id serial PRIMARY KEY

feature

feature_id integer UNIQUE#1 NOT NULL

phenotype

phenotype_id integer UNIQUE#1 NOT NULL


Table: phenotype

A phenotypic statement, or a single atomic phenotypic observation, is a controlled sentence describing observable effects of non-wild type function. E.g. Obs=eye, attribute=color, cvalue=red.

phenotype Structure
FK Name Type Description
phenotype_id serial PRIMARY KEY
uniquename text UNIQUE NOT NULL

cvterm

observable_id integer

The entity: e.g. anatomy_part, biological_process.

cvterm

attr_id integer

Phenotypic attribute (quality, property, attribute, character) - drawn from PATO.
value text

Value of attribute - unconstrained free text. Used only if cvalue_id is not appropriate.

cvterm

cvalue_id integer

Phenotype attribute value (state).

cvterm

assay_id integer

Evidence type.

Tables referencing this one via Foreign Key Constraints:



Table: phenotype_cvterm

phenotype_cvterm Structure
FK Name Type Description
phenotype_cvterm_id serial PRIMARY KEY

phenotype

phenotype_id integer UNIQUE#1 NOT NULL

cvterm

cvterm_id integer UNIQUE#1 NOT NULL


Table: pub

A documented provenance artefact - publications, documents, personal communication.

pub Structure
FK Name Type Description
pub_id serial PRIMARY KEY
title text

Descriptive general heading.
volumetitle text

Title of part if one of a series.
volume character varying(255)
series_name character varying(255)

Full name of (journal) series.
issue character varying(255)
pyear character varying(255)
pages character varying(255)

Page number range[s], e.g. 457--459, viii + 664pp, lv--lvii.
miniref character varying(255)
uniquename text UNIQUE NOT NULL

cvterm

type_id integer NOT NULL

The type of the publication (book, journal, poem, graffiti, etc). Uses pub cv.
is_obsolete boolean DEFAULT false
publisher character varying(255)
pubplace character varying(255)

Tables referencing this one via Foreign Key Constraints:



Table: pub_dbxref

Handle links to repositories, e.g. Pubmed, Biosis, zoorec, OCLC, Medline, ISSN, coden...

pub_dbxref Structure
FK Name Type Description
pub_dbxref_id serial PRIMARY KEY

pub

pub_id integer UNIQUE#1 NOT NULL

dbxref

dbxref_id integer UNIQUE#1 NOT NULL
is_current boolean NOT NULL DEFAULT true


Table: pub_relationship

Handle relationships between publications, e.g. when one publication makes others obsolete, when one publication contains errata with respect to other publication(s), or when one publication also appears in another pub.

pub_relationship Structure
FK Name Type Description
pub_relationship_id serial PRIMARY KEY

pub

subject_id integer UNIQUE#1 NOT NULL

pub

object_id integer UNIQUE#1 NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL


Table: pubauthor

An author for a publication. Note the denormalisation (hence lack of _ in table name) - this is deliberate as it is in general too hard to assign IDs to authors.

pubauthor Structure
FK Name Type Description
pubauthor_id serial PRIMARY KEY

pub

pub_id integer UNIQUE#1 NOT NULL
rank integer UNIQUE#1 NOT NULL

Order of author in author list for this pub - order is important.
editor boolean DEFAULT false

Indicates whether the author is an editor for linked publication. Note: this is a boolean field but does not follow the normal chado convention for naming booleans.
surname character varying(100) NOT NULL
givennames character varying(100)

First name, initials
suffix character varying(100)

Jr., Sr., etc


Table: pubprop

Property-value pairs for a pub. Follows standard chado pattern.

pubprop Structure
FK Name Type Description
pubprop_id serial PRIMARY KEY

pub

pub_id integer UNIQUE#1 NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL
value text NOT NULL
rank integer UNIQUE#1


Table: organism

The organismal taxonomic classification. Note that phylogenies are represented using the phylogeny module, and taxonomies can be represented using the cvterm module or the phylogeny module.

organism Structure
FK Name Type Description
organism_id serial PRIMARY KEY
abbreviation character varying(255)
genus character varying(255) UNIQUE#1 NOT NULL
species character varying(255) UNIQUE#1 NOT NULL

A type of organism is always uniquely identified by genus and species. When mapping from the NCBI taxonomy names.dmp file, this column must be used where it is present, as the common_name column is not always unique (e.g. environmental samples). If a particular strain or subspecies is to be represented, this is appended onto the species name. Follows standard NCBI taxonomy pattern.
common_name character varying(255)
comment text

Tables referencing this one via Foreign Key Constraints:



Table: organism_dbxref

organism_dbxref Structure
FK Name Type Description
organism_dbxref_id serial PRIMARY KEY

organism

organism_id integer UNIQUE#1 NOT NULL

dbxref

dbxref_id integer UNIQUE#1 NOT NULL


Table: organismprop

Tag-value properties - follows standard chado model.

organismprop Structure
FK Name Type Description
organismprop_id serial PRIMARY KEY

organism

organism_id integer UNIQUE#1 NOT NULL

cvterm

type_id integer UNIQUE#1 NOT NULL
value text
rank integer UNIQUE#1 NOT NULL