David <foo.bar24 (AT) yahoo (DOT) com> writes:
Quote:
The normal way I've seen database schema versioning is to create a
database version table and use an integer as the schema version. |
A database schema is unlike an application in that it's much more
troublesome to swap out one schema and try a different one in its place.
For this reason, and others, I find a version number doesn't tell me
enough to reliably re-create the state of the schema.
Instead, I follow the discipline in “Refactoring Databases”, making
changes to the schema only via incremental patches that preserve the
semantics of the existing data. Those patches are kept in a VCS as
discrete schema patch files, with descriptive names; each patch file
makes a small, conceptually distinct, atomic change to the schema.
I track the schema's state with a “patches applied” table in a separate
schema:
=====
CREATE SCHEMA revision_control;
COMMENT ON SCHEMA revision_control IS
'Metadata for revisions to database schema.';
CREATE TABLE revision_control.entry (
revision INTEGER NOT NULL,
name VARCHAR NOT NULL,
applied TIMESTAMP WITH TIME ZONE NOT NULL,
PRIMARY KEY (revision)
);
COMMENT ON TABLE revision_control.entry IS
'Record of individual revisions applied to the running schema.';
COMMENT ON COLUMN revision_control.entry.revision IS
'Sequential revision number of the schema revision.';
COMMENT ON COLUMN revision_control.entry.name IS
'Name of the schema revision applied.';
COMMENT ON COLUMN revision_control.entry.applied IS
'Timestamp when the revision was applied to the database.';
INSERT INTO revision_control.entry
(revision, name, applied)
VALUES
(0, 'initial state', NOW());
=====
Each new entry corresponds to a patch file, written in the DDL of the
database, that has been applied to this instance of the schema. The
files are named to be briefly descriptive of the atomic change being
made by the DDL.
The author of “Refactoring Databases” advocates keeping the DDL of each
patch as another field, of type TEXT, directly in (what corresponds to)
my ‘revision_control.entry’ table above. That has the advantage of
making the changes available directly from the database.
Quote:
I've found a few instances where applications use the product version
also as the schema version. |
I find the granularity of “what changes have been made to the schema” is
much finer-grained than “what version of the product is this”. A single
increment of the application version would encompass many changes made
to the schema. Likewise, many changes made to the application don't
correspond to a change in the schema. Not a good match.
Quote:
Are there any documents, web sites..., that explain the pros and cons
for the different ways of doing schema versioning? |
The best treatment of this topic that I know is the book “Refactoring
Databases” <URL:http://www.databaserefactoring.com/>.
You can get a brief overview of the book covers by reading an essay
<URL:http://www.agiledata.org/essays/databaseRefactoring.html> by the
same author. The essay also includes links to further resources.
--
\ “[i]t is impossible for anyone to begin to learn that which he |
`\ thinks he already knows.” —Epictetus, _Discourses_ |
_o__) |
Ben Finney