dbTalk Databases Forums  

What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000?

comp.databases comp.databases


Discuss What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000? in the comp.databases forum.



Reply
 
Thread Tools Display Modes
  #41  
Old   
Bob Badour
 
Posts: n/a

Default Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000? - 12-03-2003 , 08:16 AM






"Trevor Best" <bouncer@localhost> wrote

Quote:
On Tue, 2 Dec 2003 09:04:35 -0500 in comp.databases, "Bob Badour"
bbadour (AT) golden (DOT) net> wrote:


"Louis Davidson" <dr_dontspamme_sql (AT) hotmail (DOT) com> wrote in message
news:#rIbinJuDHA.3436 (AT) tk2msftngp13 (DOT) phx.gbl...
While it is true that the chosen primary key cannot contain any
optional
values, it is more the praticioner (sp?) that disagrees with this
stance.
Compound keys are unwieldy and bad for performance, but the theorist in
me
says "who cares?" It is all about what is right/best, not what is
fastest/easiest.

Performance is determined by the physical structure and not by the
logical
interface. Legitimate theorists have written ad nauseum on the severe
logical problems caused by using compound keys for references when data
may
be missing.

Are you talking about data being missing from a foreign key side of a
relationship? I'm pro identity column myself but I don't see how that
would help in this instance.
The specific issue is compound keys and missing information. A simple key
does not exhibit the same problems regardless whether it is an identity
column.

Consider a compound key with attributes A and B. What happens when the user
inserts a referencing row with a known A and an unknown B? Should the dbms
allow the insert? When should it allow the insert? Should the dbms verify
the A exists at least once in the referenced table? If the A value exists
only once in the referenced table, should the dbms substitute the only
corresponding B value that could be correct in the inserted row? Suppose the
user deletes all the rows from the referenced table that contain a specific
A value. What happens when the referencing table contains corresponding rows
with a known A and an unknown B?




Reply With Quote
  #42  
Old   
Bob Badour
 
Posts: n/a

Default Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000? - 12-03-2003 , 08:17 AM






"Trevor Best" <bouncer@localhost> wrote

Quote:
On Tue, 2 Dec 2003 17:53:44 -0000 in comp.databases, "David Portas"
REMOVE_BEFORE_REPLYING_dportas (AT) acm (DOT) org> wrote:

living people than VendorID values. I don't think an identity VendorID
value is any more artificial than any other unique way of identifying a
vendor, so long as it is assigned the first time a vendor enters the
system and is never changed.

But the point is that a "natural" key is verifiable outside of the
system.

When I see a NG post from "skass[at]drew.edu" I don't care whether that's
based on your "real" name or even whether S.Kass is the same name as on
your
passport or driver's licence. What's important to me is that it's
determined
by a consistent method outside of the system which gives me some
acceptable
degree of confidence that you're the same person who posted here as
"skass[at]drew.edu" yesterday. Of course that validity is destroyed if
you
change your email address or if someone spoofs your address. But it's
still
intrinsically better than an arbitrary ID allocated by the server.

If Steve leaves that educational facility he's at now, his email
address will surely change. If he goes into a witness protection
scheme his name, address and SS number will change, he or someone else
could change quiet a bit about him but if he's on a database with an
identity column as his PK then it's more likely that it will *never*
change.
With all due respect, the whole point of the witness protection programme is
to prevent people from associating the individual with their previous
identity.




Reply With Quote
  #43  
Old   
Bob Badour
 
Posts: n/a

Default Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000? - 12-03-2003 , 08:20 AM



"Stijn Verrept" <sverrept (AT) nospan (DOT) vub.ac.be> wrote

Quote:
I've read through this thread but I don't understand it. I always use an
int or smallint as primary key, with identity. I believe it would be a
mess
otherwise.
Your belief does not alter the correct criteria for choosing a key:
simplicity, familiarity and stability.


Quote:
Example: I have a table with people, last name, first name, address, ...
So
suppose you would make a natural key then you need at least the last name
and the first name. I have >25 other tables that reference that table.
If
I get this right I will need to use the name and firstname field in all
the
other tables as well to reference. Isn't that just a lot of data waste?
If
I'm missing something, please tell me what because this seems a bit silly.
You have constructed a straw man. One anecdote does not demonstrate or
justify a general principle or rule.




Reply With Quote
  #44  
Old   
Bob Badour
 
Posts: n/a

Default Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000? - 12-03-2003 , 08:23 AM



I see why you were elected a most vociferous person. That's a very long way
of stating something as simple as the criteria for choosing a primary key:
simplicity, familiarity and stability.

"Aaron Bertrand [MVP]" <aaron (AT) TRASHaspfaq (DOT) com> wrote

Quote:
You have definitely hit a couple of the strong points in IDENTITY's favor.
But there are cases where a "natural" kind of key can be small as well.
Consider when eBay bought PayPal (and other than that, the rest of this
paragraph is completely fictional), they likely had to merge some data...
perhaps eBay used an IDENTITY to generate customer numbers, but they want
to
align those primary keys with the new data in the PayPal tables. So, the
keys in the PayPal data become INTs, but not IDENTITY. They are kind of
"natural" because they came to the PayPal from an external source, so to
speak, rather than generated arbitrarily from within.

Of course, completely fictional. But surely you can see that not all
natural keys are going to be larger than an IDENTITY, or less efficient.
There are other examples, too. In a small stats system, a SMALLDATETIME
could be the primary key (perhaps several subrelated tables are organized
by
day). In fact, part of http://www.aspfaq.com/stats.asp (and plenty more
that you can't see) is derived on a set of tables where SMALLDATETIME is
the
only key of relevance. Okay, so that's still 4 bytes, but you save 4 if
your other alternative is to store an IDENTITY along with the
SMALLDATETIME
value. Consider:

CREATE TABLE calendar
(
dateValue SMALLDATETIME PRIMARY KEY
)

vs.

CREATE TABLE calendar
(
dateID INT IDENTITY PRIMARY KEY,
dateValue SMALLDATETIME NOT NULL
)

Never mind my goofy naming scheme. :-)

Now, Kass could probably show me some cool dateadd tricks that would allow
me to store just an INT (or maybe even a SMALLINT, depending on the date
range required), and determine what the date value is at runtime. Not
that
I think that's what his argument would be, but rather just to show that it
is still possible to choose either route. I think the usability of the
date
value representing what it is, rather than having to derive its value from
some formula, is a good thing.

In cases like e-mail address and SSN (and in fact most cases), I still
prefer your route, where there is a surrogate key (IDENTITY) that prevents
me from having to cascade changes all over the place, and store larger
foreign keys.

Firstname + lastname is obviously a bad choice for a key of any kind,
because I know more than one Aaron Bertrand. So then you bring middle
name
into the key, and it can still be repeated. Other things like getting
adopted, re-married, legally changing their name, and other reasons why
this
"key" would change are minor; changes to the key can be dealt with in the
database using DRI/CASCADE or, worst case scenario, through rigorous
update
code; it will be tougher to re-train users to look up all the tables
containing "Carmen Bertrand" instead of "Carmen Electra." :-) However, I
think the possibility of two people having the same key is a far more
compelling argument for bypassing the natural key and placing some
meaningless identifier, like IDENTITY, that the user doesn't care about
and
would never have to change.

Now, you might think, "why not bring SSN into the FirstName + MiddleName +
LastName key? That would make it unique." Yes, and hideously large. If
SSN is unique, then why not just use SSN as the key? Again, it's large
even
on its own (CHAR(9) and surely to become CHAR(10) in our lifetimes), so I
fail to see the benefit of repeating the value in every related table, DRI
or not.

Sorry about the earful, sometimes I get a little typographical diarrhea.
Hopefully that was at least marginally intelligible.

--
Aaron Bertrand
SQL Server MVP
http://www.aspfaq.com/




"Stijn Verrept" <sverrept (AT) nospan (DOT) vub.ac.be> wrote in message
news:#G5HW$TuDHA.1512 (AT) TK2MSFTNGP10 (DOT) phx.gbl...
I've read through this thread but I don't understand it. I always use
an
int or smallint as primary key, with identity. I believe it would be a
mess
otherwise.

Example: I have a table with people, last name, first name, address, ...
So
suppose you would make a natural key then you need at least the last
name
and the first name. I have >25 other tables that reference that table.
If
I get this right I will need to use the name and firstname field in all
the
other tables as well to reference. Isn't that just a lot of data waste?
If
I'm missing something, please tell me what because this seems a bit
silly.


Stijn Verrept.







Reply With Quote
  #45  
Old   
Stijn Verrept
 
Posts: n/a

Default Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000? - 12-03-2003 , 08:43 AM



"Bob Badour" <bbadour (AT) golden (DOT) net> wrote

Quote:
Your belief does not alter the correct criteria for choosing a key:
simplicity, familiarity and stability.
Well non natural keys do meet the simplicity and stability criteria. Even
more than natural keys I believe.

Quote:
You have constructed a straw man. One anecdote does not demonstrate or
justify a general principle or rule.
Hmmm one anecdote? I have a database full of tables like this. List of
doctors, departments, users, contacts, medication, ... everywhere I used a
non natural key. The use of natural keys will only happen sometimes when
it's really appropriate, otherwise not.


Stijn Verrept.




Reply With Quote
  #46  
Old   
David Portas
 
Posts: n/a

Default Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000? - 12-03-2003 , 09:05 AM



Do you really allow the same Doctor, Department, etc to appear twice in its
table with different keys? If you don't declare unique natural keys then
that's the kind of problem you have. An IDENTITY isn't a *surrogate* key at
all unless the table also has a natural key - it's just a physical row
identifier.

--
David Portas
------------
Please reply only to the newsgroup
--



Reply With Quote
  #47  
Old   
Bruce Lewis
 
Posts: n/a

Default Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000? - 12-03-2003 , 09:21 AM



"Trey Walpole" <treyNOpole (AT) SPcomcastAM (DOT) net> writes:

Quote:
I am a firm believer that natural keys should only be used to logically
design/normalize the data. When it comes to the real reason for keys, data
integrity, more often than not I have seen that natural keys are
intrinsically not good physical primary keys.
This differs from my experience.

Quote:
1. Natural keys are, being natural and therefore user entered [i.e.,
provided to the database by external means], fungible. If a user enters
data, they must also be able to modify it. If data can be modified, then its
value as a systemic primary key is gone. Yes, you can cascade updates to
these, but why do it when it can be avoided to start with.
Why do it? To avoid duplicates of course. Why not do it? You don't
seem to be making any sort of case here.

Quote:
2. Natural keys are typically a composite of atomic attributes. If using a
composite, these must be propagated to referencing tables as foreign keys.
Your normalization drops below par, by having these [potentially] massively
duplicated columns.
I agree that natural keys should be avoided because they're too large.
However, most business reports I see typically have columns that consist
of abbreviations chosen to make the report less wide. These make great
natural keys.

Quote:
Attributes that are single, [supposedly] unique attributes (e.g., SSN),
usually represent some official, governmentally recognized ID, and therefore
have legal issues with being propagated throughout a system.
Yes, SSNs are problematic because too many organizations use them for
authentication, i.e. "You put Trey Walpole's SSN on this form, so you must
be Trey Walpole". Even in the absence of such stupid organizations,
privacy advocates oppose national IDs for a very good reason: such IDs
make it easy to create good databases that include people. However, I
assume the original poster had the opposite goal: make it easy to create
good databases.

Quote:
Also, for amateurs and many professionals, natural keys are very often
chosen incorrectly. e.g., I believe some combination of Name and other info
has been used by my ISP as their primary key. My last name was entered into
their system incorrectly, but they cannot fix it because their system will
not allow it. Preposterous and poor design.
It sounds like this "preposterous and poor design" could have been fixed
with a simple REFERENCES ... ON UPDATE CASCADE. Perhaps you should
offer them your services. I can't tell from your story whether or not
they used a poor choice of natural key.

Yes, people can make bad choices as to natural primary keys, but I think
this one additional opportunity to do bad database design is well worth
the risk, given the problems that arise from redundant or duplicate
data.

Quote:
Surrogate keys generated by using the identity property are ideal for data
integrity, because
1. They are static values [i.e., once entered, it does not change] and the
DBA has control over allowing values in identity columns to be modified.
This is only meaningful if there's something wrong with ON UPDATE
CASCADE, which I think there isn't.

Quote:
2. They are singleton row ids.
And thus the problem. Earlier this year an e-mail alert system I wrote
was sending two copies when it should just send one. Looking into it,
the employee table had been doubled. I switch to a select distinct to
work around the problem, and someone deleted the duplicates. If we had
a natural primary key for the employee table, I doubt those duplicates
would have gone in.

Primary key constraint errors are your friends.


Reply With Quote
  #48  
Old   
Stijn Verrept
 
Posts: n/a

Default Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000? - 12-03-2003 , 10:31 AM



"David Portas" <REMOVE_BEFORE_REPLYING_dportas (AT) acm (DOT) org> wrote

Quote:
Do you really allow the same Doctor, Department, etc to appear twice in
its
table with different keys? If you don't declare unique natural keys then
that's the kind of problem you have. An IDENTITY isn't a *surrogate* key
at
all unless the table also has a natural key - it's just a physical row
identifier.
I never said I allow them to appear twice in the column, you have Unique
Constraint for that. I could use that as a natural key, but I prefer using
an int or smallint. I don't want to note Name, Firstname, ... in another
table as foreign key! Also in the application I don't see me writing:
select SN_Active from seniors where (SN_Name = :SNName) and (SN_FirstName =
:FirstName) and (SN_BirthDate = :SNBirthDate).


Stijn Verrept.




Reply With Quote
  #49  
Old   
Steve Kass
 
Posts: n/a

Default Re: What are cons and pros for using IDENTITY property as PK in SQLSERVER 2000? - 12-03-2003 , 10:49 AM





Stijn Verrept wrote:

Quote:
"David Portas" <REMOVE_BEFORE_REPLYING_dportas (AT) acm (DOT) org> wrote in message
news:NvCdnazr-9_dZlCiRVn-vA (AT) giganews (DOT) com...


Do you really allow the same Doctor, Department, etc to appear twice in


its


table with different keys? If you don't declare unique natural keys then
that's the kind of problem you have. An IDENTITY isn't a *surrogate* key


at


all unless the table also has a natural key - it's just a physical row
identifier.



I never said I allow them to appear twice in the column, you have Unique
Constraint for that. I could use that as a natural key, but I prefer using
an int or smallint. I don't want to note Name, Firstname, ... in another
table as foreign key! Also in the application I don't see me writing:
select SN_Active from seniors where (SN_Name = :SNName) and (SN_FirstName =
:FirstName) and (SN_BirthDate = :SNBirthDate).


Stijn Verrept.



How about putting a UNIQUE NOT NULL CLUSTERED constraint on the identity
column and putting the PRIMARY KEY NONCLUSTERED constraint on the
multi-column primary key?

That might confuse the anti-identity fanatics enough so they'll stop
complaining. You will have a natural primary key, so they won't think
the world is coming to an end, but you will go on as you always have,
using the identity column for its convenience in queries, FK
constraints, etc.

SK



Reply With Quote
  #50  
Old   
Aaron Bertrand - MVP
 
Posts: n/a

Default Re: What are cons and pros for using IDENTITY property as PK in SQL SERVER 2000? - 12-03-2003 , 11:04 AM



Quote:
How about putting a UNIQUE NOT NULL CLUSTERED constraint on the identity
column and putting the PRIMARY KEY NONCLUSTERED constraint on the
multi-column primary key?
Certainly, I suppose you could...

CREATE TABLE splunge
(
splungeID INT IDENTITY(1,1) NOT NULL UNIQUE,
email VARCHAR(128) PRIMARY KEY CLUSTERED
)
GO

CREATE TABLE blat
(
splungeID INT NOT NULL
FOREIGN KEY REFERENCES splunge(splungeID)
)
GO

--
Aaron Bertrand
SQL Server MVP
http://www.aspfaq.com/




Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.