Glossary

From GMOD
Revision as of 18:25, 14 December 2007 by Clements (Talk | contribs)

This glossary explains terms that

  • are specific to the GMOD project, or
  • are computing terms that are used in the GMOD project.

This glossary does not define biology terms.

Database

A database can be any set of organized data that is readable by a computer. It can be anything from an implementation of a database schema in a particular database management system to regular files that have a defined format.
For example, the database behind the FlyBase web site contains data on drosophilids, and uses the Chado schema and the PostgreSQL database management system.

Database Management System

Database management systems (DBMSs) are software systems that manage data. PostgreSQL, MySQL, Oracle and Sybase are all examples of DBMSs. A DBMS is a container for databases: it is the system that manages the databases, and is distinct from the data it manages.
Most DBMSs are relational, meaning that they represent data as tables of rows and columns. All DBMSs that GMOD is concerned with are relational, so GMOD uses the terms database management system and relational database management system (RDBMS) interchangeably.

Database Schema

A database schema is the design of a particular database, independent of its contents. Chado is an example of a database schema. Designs (like Chado) can be reused across multiple databases.
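
The distinction between a schema and the databases that use it can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses SQLite (from the Python standard library) rather than PostgreSQL, and the one-table `SCHEMA` and the `create_database` and `table_names` helpers are hypothetical stand-ins, not part of Chado.

```python
import sqlite3

# A toy schema: the design only, with no data in it.
# (Hypothetical; far simpler than a real schema such as Chado.)
SCHEMA = """
CREATE TABLE organism (
    organism_id INTEGER PRIMARY KEY,
    genus       TEXT NOT NULL,
    species     TEXT NOT NULL
);
"""

def create_database(schema: str) -> sqlite3.Connection:
    """Create a new, empty in-memory database that implements the schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(schema)
    return conn

def table_names(conn: sqlite3.Connection) -> list:
    """List the tables defined in a database, i.e. its design."""
    query = "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    return [row[0] for row in conn.execute(query)]

# The same schema reused for two separate databases.
fly_db = create_database(SCHEMA)
yeast_db = create_database(SCHEMA)

# Each database holds its own contents.
fly_db.execute(
    "INSERT INTO organism (genus, species) VALUES (?, ?)",
    ("Drosophila", "melanogaster"),
)
yeast_db.execute(
    "INSERT INTO organism (genus, species) VALUES (?, ?)",
    ("Saccharomyces", "cerevisiae"),
)

fly_genera = [row[0] for row in fly_db.execute("SELECT genus FROM organism")]
yeast_genera = [row[0] for row in yeast_db.execute("SELECT genus FROM organism")]
```

Both databases report the same tables, because they share a design, while the rows they contain differ.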


DBMS

See Database Management System.


Gene Finder Format

See GFF.

General Feature Format

See GFF.

GFF

If you get into the more technical side of GMOD, loading databases in particular, you will come across this term. It refers to a tab-delimited file format for storing sequence annotations. Curiously, the acronym has two expansions: Gene Finder Format and General Feature Format. Here is an example:

 test.fa      RepeatMasker    similarity      238     289     15.4    +       .       Target "Motif:(TA)n" 2 53

The line above describes a match to the sequence motif (TA)n on a sequence contained in "test.fa", where the match runs from position 238 to position 289 on the "+" strand.
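
As a rough sketch of how such a line can be read programmatically, the Python below splits the example into its nine tab-separated columns. The `GffRecord` type and `parse_gff_line` function are hypothetical names chosen for this illustration, and the field names follow common GFF column naming rather than anything defined on this page.

```python
from typing import NamedTuple, Optional

class GffRecord(NamedTuple):
    """One GFF line, split into its nine columns (illustrative names)."""
    seqid: str
    source: str
    feature_type: str
    start: int
    end: int
    score: Optional[float]
    strand: str
    phase: str
    attributes: str

def parse_gff_line(line: str) -> GffRecord:
    """Split a single tab-delimited GFF line into typed fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated fields, got {len(fields)}")
    seqid, source, feature_type, start, end, score, strand, phase, attributes = fields
    return GffRecord(
        seqid,
        source,
        feature_type,
        int(start),
        int(end),
        None if score == "." else float(score),  # "." marks a missing value
        strand,
        phase,
        attributes,
    )

# The example line from above, rebuilt with explicit tab separators.
example = "\t".join([
    "test.fa", "RepeatMasker", "similarity",
    "238", "289", "15.4", "+", ".",
    'Target "Motif:(TA)n" 2 53',
])

record = parse_gff_line(example)
print(record.seqid, record.start, record.end, record.strand)
# prints: test.fa 238 289 +
```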

GFF files turn up frequently in the GMOD world. GFF is used as an interchange format: one script or application may create GFF as output, and another may load that GFF into a database. Alternatively, the GFF file may serve as the database itself. There are ways to create databases directly from GFF files, though these work well only with smaller data sets. See GFF for more information.
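
To make the "load this GFF into a database" step concrete, here is a minimal sketch that reads GFF text into an in-memory SQLite table and then queries it. It is an assumption-laden illustration, not how any GMOD loader works: the `feature` table layout, its column names, and the `load_gff` function are all hypothetical, and SQLite stands in for a full DBMS.

```python
import sqlite3

# GFF text as another script might have produced it (one record here).
gff_text = "\t".join([
    "test.fa", "RepeatMasker", "similarity",
    "238", "289", "15.4", "+", ".",
    'Target "Motif:(TA)n" 2 53',
]) + "\n"

def load_gff(conn: sqlite3.Connection, text: str) -> int:
    """Insert every GFF record in `text` into the feature table.

    Returns the number of records loaded.
    """
    rows = []
    for line in text.splitlines():
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 9:
            raise ValueError(f"expected 9 tab-separated fields, got {len(fields)}")
        fields[3] = int(fields[3])  # start position
        fields[4] = int(fields[4])  # end position
        rows.append(fields)
    conn.executemany("INSERT INTO feature VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)

conn = sqlite3.connect(":memory:")
# Hypothetical one-table layout mirroring the nine GFF columns.
conn.execute("""
    CREATE TABLE feature (
        seqid TEXT, source TEXT, type TEXT,
        start_pos INTEGER, end_pos INTEGER,
        score TEXT, strand TEXT, phase TEXT, attributes TEXT
    )
""")

loaded = load_gff(conn, gff_text)

# Once loaded, the annotations can be queried like any other table.
plus_strand = conn.execute(
    "SELECT seqid, start_pos, end_pos FROM feature WHERE strand = '+'"
).fetchall()
```

A flat table like this is about the simplest design that works. A real schema would normally split the attributes column into structured fields instead of storing it as one string.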

RDBMS

See Database Management System.


Operating System

An operating system (OS) is the software that controls a computer and manages the sharing of resources on that computer. Example operating systems are Microsoft Windows (http://www.microsoft.com) and Linux.


OS

See Operating System.

Relational Database Management System

See Database Management System.