Difference between revisions of "Chado Audit Module"

From GMOD
Jump to: navigation, search
m
m
Line 99: Line 99:
  
  
[[Category:To Do]]
 
 
[[Category:Chado]]
 
[[Category:Chado]]

Revision as of 14:51, 5 April 2007

Introduction

The Audit module is not a module in the same sense as the other Chado modules. It is a set of recommendations on how one might implement auditing in Chado. This documentation describes an example of such an implementation which is in use at FlyBase. The SQL used at Flybase is chado/modules/audit/audit_tables.sql.


Audit Policy

We consider the implementation of auditing policy as a database administration issue and subject to the local requirements of specific database groups. We therefore have not developed a detailed audit module, providing instead this policy statement.

Database project groups who would use chado as their primary DBMS would be well advised to implement as robust and detailed an audit trail as possible, while groups who would use chado more informally, for instance primarily as a read-only resource, may have no need for auditing.

We here describe how auditing will be implemented at FlyBase, where chado will serve as the primary DBMS.

Auditing will be implemented in a separate, orthogonal database to the chado, with each chado table having a corresponding shadow audit table which will include all of the columns of the chado table, plus these additional columns: <sql> transaction_date timestamp not null transaction_type char not null </sql>

Where transaction_type value is one of:

  • i (insertion)
  • u (update)
  • d (deletion)


Thus an chado db instance chado.1 containing a table gene, for example: <sql> create table gene ( gene_id serial not null, primary key (gene_id), name varchar(255) not null, dbxref_id int, foreign key (dbxref_id) references dbxref(dbxref_id), unique(name), unique(dbxref_id) ); </sql>

Would have a corresponding audit database, chado.1_audit, containing a shadow audit table, gene_audit: <sql> create table gene_audit ( gene_id serial not null, primary key (gene_id), name varchar(255) not null, dbxref_id int,

              foreign key (dbxref_id) references dbxref(dbxref_id),
              unique(name),
              unique(dbxref_id),
              transaction_date timestamp not null,
              transaction_type char not null

); </sql>


Transaction Use Cases

Insertion of a New Record

When a record is intially inserted into an chado table, the same record is inserted into the corresponding shadow audit table, with the current timestamp and transaction_type = 'i'.

Update of an Existing Record

When a record is to be updated, before the update transaction is executed in chado, the existing chado record is copied to the corresponding shadow audit table, with the current timestamp and transaction_type = 'u'. The update transaction is then executed in chado.

Deletion of an Existing Record

When a record is to be deleted, before the delete trasaction is executed in chado, the existing chado record is copied to the corresponding shadow audit table, with the corrent timestamp and transaction_type = 'd'. The delete transaction is then executed in chado.


Implementation Issues

The benefit of this system is that it allows a certain amount of "time travel", or the ability to reconstruct the database to its state at a particular time in the past. We don't anticipate making this functionality available to the general user population, but recognize that from a database programming point of view, it is often quite useful, and sometimes vital, for a programmer to be able to track changes in the database in detail.

While the auditing policy described here fulfills all of our auditing requirements in principle, it remains to be seen how it will work in practice. The size of the shadow audit database may become an issue over time. How the handling of audit transactions will affect overall database transaction performance is also unknown. FlyBase makes a practice of reporting certain audit details to the public (e.g. date entered and date last updated of gene records) - it is not know whether implementing this will be too expensive using this shadow audit db method