dbTalk Databases Forums  

Shared or separate primary key sequence(s)?

comp.databases comp.databases


Discuss Shared or separate primary key sequence(s)? in the comp.databases forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Ian Taylor
 
Posts: n/a

Default Shared or separate primary key sequence(s)? - 02-06-2009 , 04:17 PM






Hi,

I have a question regarding sequences as primary keys. I have some
example tables each having an integer as a primary key.

create sequence vegetable_id_seq ;
CREATE TABLE vegetable (
id INTEGER PRIMARY KEY default nextval ('vegetable_id_seq'),
name varchar (255) not null
);

create sequence fruit_id_seq ;
CREATE TABLE fruit (
id INTEGER PRIMARY KEY default nextval ('fruit_id_seq'),
name varchar (255) not null
);

create sequence meat_id_seq ;
CREATE TABLE meat (
id INTEGER PRIMARY KEY default nextval ('meat_id_seq'),
name varchar (255) not null
);

My question is: Is it better to use separate sequence counters (as in
the example) or is it better to share a sequence counter among all (or
some) tables? The advantage of sharing a sequence counter is that it
stores redundancy information in the ID number itself which can aid in
debugging code. For example, if you have a function which compares
two 'meat' objects generated from the information in meat table, one
can verify by looking at the ID values that both objects are really
genuine 'meat' objects as both values can only occur in the meat
table. When using separate sequence counters this is not possible as
ID numbers are not unique and it is much harder to find code that
accidentally mixes up ID values. I personally would tend to using a
shared sequence but I don't have enough experience with databases to
make a good judgement about this.

I'm using PostgreSQL 8.1.15 (Debian Etch) if that is of interest to
anyone.

TIA,
Ian Taylor.


Reply With Quote
  #2  
Old   
Robert Klemme
 
Posts: n/a

Default Re: Shared or separate primary key sequence(s)? - 02-07-2009 , 09:03 AM






On 06.02.2009 23:17, Ian Taylor wrote:
Quote:
Hi,

I have a question regarding sequences as primary keys. I have some
example tables each having an integer as a primary key.

create sequence vegetable_id_seq ;
CREATE TABLE vegetable (
id INTEGER PRIMARY KEY default nextval ('vegetable_id_seq'),
name varchar (255) not null
);

create sequence fruit_id_seq ;
CREATE TABLE fruit (
id INTEGER PRIMARY KEY default nextval ('fruit_id_seq'),
name varchar (255) not null
);

create sequence meat_id_seq ;
CREATE TABLE meat (
id INTEGER PRIMARY KEY default nextval ('meat_id_seq'),
name varchar (255) not null
);

My question is: Is it better to use separate sequence counters (as in
the example) or is it better to share a sequence counter among all (or
some) tables? The advantage of sharing a sequence counter is that it
stores redundancy information in the ID number itself which can aid in
debugging code.
I'm not sure I understand this: a number is a number. If you want to
"store" additional information in a number you need to use a particular
numbering scheme (such as "all odd numbers mean X"). Choosing between a
single and multiple sequences won't help you there.

Quote:
For example, if you have a function which compares
two 'meat' objects generated from the information in meat table, one
can verify by looking at the ID values that both objects are really
genuine 'meat' objects as both values can only occur in the meat
table.
Again, I don't follow this reasoning: if you have two meat objects and
want to verify you have to _query the table_ with the number as filter
no matter what. Looking at the number alone won't help you unless you
have said naming scheme in place (see above).

Quote:
When using separate sequence counters this is not possible as
ID numbers are not unique and it is much harder to find code that
accidentally mixes up ID values. I personally would tend to using a
shared sequence but I don't have enough experience with databases to
make a good judgement about this.
It seems the code would not be mixing up id values but rather tables,
wouldn't it?

Here's what I think about the pros and cons:

Shared sequence pros:
- less number loss with Oracle's CACHE setting (no idea whether
PostgreSQL has something similar)
- you *might* get an idea of the overall creation order

Cons
- higher risk of running out of numbers
- potential bottleneck (synchronization on one vs. multiple locks)
- no per table control over numbering, i.e. you might want to wrap
numbering for one table but not for another

Cheers

robert


Reply With Quote
  #3  
Old   
Ian Taylor
 
Posts: n/a

Default Re: Shared or separate primary key sequence(s)? - 02-07-2009 , 12:35 PM



Robert Klemme wrote:

Quote:
I'm not sure I understand this: a number is a number. If you want to
"store" additional information in a number you need to use a
particular
numbering scheme (such as "all odd numbers mean X"). Choosing between
a single and multiple sequences won't help you there.
The problem that I fear is suddenly seeing an ID number and not knowing
exactly where it came from or what it refers to. When I share a seq
counter among tables all I have to do is search for the number in all
the tables' PK columns and then I know exactly where it was stored as
it is unique, so then I know which code queried it and that might help
me in finding buggy code.

I agree: the advantage this method provides is not that big but I just
wanted to know if anyone had any experience with this or if there were
serious arguments against this practice.

But then again, if it turns out to be slow or otherwise not practical
the effort required to switch back to separate counters is not that
big.

Quote:
- higher risk of running out of numbers
PostgreSQL supports integers in the range from -9223372036854775808 to
9223372036854775807, so I don't know if it really matters, but I will
take this into consideration.

Quote:
- potential bottleneck (synchronization on one vs. multiple locks)
Good point, haven't thought about this. This could really become a
problem.

Quote:
- no per table control over numbering, i.e. you might want to wrap
numbering for one table but not for another
I'd only share a sequence among tables with the same numbering
convention.

Thanks Robert,
Ian.



Reply With Quote
  #4  
Old   
--CELKO--
 
Posts: n/a

Default Re: Shared or separate primary key sequence(s)? - 02-08-2009 , 01:37 PM



you have some classic design errors here.
1) Tables have collective or plural names unless there really only one
of that kind of entity.
2) There is no such thing as a magival, universal, "Elixir of Life"
identifier called "id"; to be is to be something in particular.
3) There is no such thing as vague, generic attrivutre called "name";
it has to be a particular kind of name.
4) There is no such thing as an "_id_seq"; an attribute can an
identifier or a sequence, but not both.

CREATE TABLE Vegetables
(vegetable_id INTEGER PRIMARY KEY ,
vegetable_name VARCHAR (255) NOT NULL);

CREATE TABLE Fruits
(fruit_id INTEGER PRIMARY KEY,
fruit_name VARCHAR (255) NOT NULL);

CREATE TABLE Meats
(meat_id INTEGER PRIMARY KEY ,
meat_name VARCHAR (255) NOT NULL);

This is still a stinking mess. Can you name a vegetable with a 255
letter name? Don't worry; since you allowed it, someone will
eventually fill the table with dirty data.

Quote:
My question is: Is it better to use separate sequence counters (as in the example) or is it better to share a sequence counter among all (or some) tables?
NEITHER! When looking for a key, you proceed in this order:
1) Find an industry standard. In your case, that looks like UPC in my
grocery store.
2) Find a company standard. This usually means going to the
accounting department.
3) Use an external company standard. This means talking to suppliers,
competitors, etc
4) When you cannot find an industry or company standard (< 5% of the
time), then **carefully design** your own encoding scheme

The order of the physical insertions is not a logical key. No
validation, no verification, not portability, etc. It is also non-
deterministic! And you cannot claim that this shows any care in
design. What do you think is right about this?

What you are doing, but don't now it, is faking a 1970's network DB
pointer chain in SQL and throwing out RDBMS completely.

Quote:
I personally would tend to using a shared sequence but I don't have enough experience with databases to make a good judgment about this.
Get a copy of SQL PROGRAMMING STYLE for some help on the design of
encoding schemes. I am the only guy who teaches this along with scales
and measurements in my database classes.

My guess is that you will have a UPC code for all the grocery items,
then design a hierarchical (think Dewey Decimal) "house category"
encoding scheme for reporting purposes. There will be a table that
ties the two encodings together.



Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.