![]() | |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
When I learned databases, about a quarter of a century ago, the preferred practice was to use natural keys to refer to individual table rows, unless there's some real good reason to go with an artificial key. I've run into a few cases where I chose to use an artificial key, but most of the time I've used a natural key. (I should clarify: if a university's registration office assigns each student a StudentID, before entering that student into the database, I'm treating that key as "natural" in the context of the database, even though one might argue that it's "artificial" in some other context.) |
|
Every time the issue of natural keys comes up in Stackoverflow.com, the prevailing view seems to be that the best primary keys are artificial and opaque. And responses that take this view get the votes. They emphasize efficiency (mainly efficiency in the index that you usually get automatically when you declare a PRIMARY KEY). But data integrity seems to be forgotten, here. |
|
The same is not generally tru when the conversation turns to referential integrity. Slowly but surely the programming community seems to heve been dragged, kicking and screaming, into turning over referential integrity enforcement to the DBMS. |
#3
| |||
| |||
|
#4
| |||
| |||
|
|
Tony Toews [MVP] wrote: ... Basically, it's just a phony issue foisted by db semi-literates, usually product-oriented to the extreme, like the kind who get into arguments about whether table names should have an 's' at the end or who grew up with 1980's 4GL's and their puerile successor products that couldn't support composites. A table/relation/relvar should have exactly the attributes and exactly the keys that the app needs. It is just self-flagellation to ignore a dbms feature that will generate some of them (assuming the dbms doesn't insist on them). The so-called 'natural' key for all the airwaybills in a ULD on a typical airline flight segment is about seven or eight attributes. When the cargo master, at the last minute, offloads a container or two onto the tarmac, you don't want a ponderous logic to un-assign it and put it on the plane at the next gate. You need a form of indirection, for that matter the same is needed in any robotic application such as a freight warehouse. Somebody in this thread suggested, if I took the meaning right, that a design could actually profit from using a single attribute to stand in for a composite key, where both are part of the design. I've seen such systems and considered that to be perfectly reasonable, although one of the big names in the field told me I was confusing physical with logical. But I say the big names aren't right all the time. I think there is such a thing as logical efficiency, not just physical efficiency and sometimes it is a requirement that a 'time-critical' massive update be coded as simply as possible, leaving the slow-poke work-a-day requirements to handle the ponderous side. I admit that doesn't happen very often now with all the fast machinery around. The decision for me would be based entirely on whether the application required it, and nothing to do with any dogma. |
#5
| |||
| |||
|
|
"Walter Mitty" <wamitty (AT) verizon (DOT) net> wrote: When I learned databases, about a quarter of a century ago, the preferred practice was to use natural keys to refer to individual table rows, unless there's some real good reason to go with an artificial key. I've run into a few cases where I chose to use an artificial key, but most of the time I've used a natural key. (I should clarify: if a university's registration office assigns each student a StudentID, before entering that student into the database, I'm treating that key as "natural" in the context of the database, even though one might argue that it's "artificial" in some other context.) That's a decent natural key. But life is seldom so clear. Consider names. My brother has the same birth date, first, middle and last name as a career criminal in Canada. And we have a somewhat rare last name. It's a pain for him to cross the border into the US. He allows himself an extra hour. Although they must now have photo's in the US Customs computers as lately the delay as been quite a bit shorter. Consider business name. Duplicate business names can exist in adjacent provinces, states or countries. Now what happens when you get to the child tables? For example student/course table. What do you use as a primary key? Student ID, Course Number? What if a student takes the course a second time? Now the primary key needs three fields and some logic to ensure the third field is unique. What about child tables of that child table? Now it's a table with four fields, or more in the primary key. Every time the issue of natural keys comes up in Stackoverflow.com, the prevailing view seems to be that the best primary keys are artificial and opaque. And responses that take this view get the votes. They emphasize efficiency (mainly efficiency in the index that you usually get automatically when you declare a PRIMARY KEY). But data integrity seems to be forgotten, here. How does this affect data integrity? When there is a possibility of duplicate data we should be presenting the users with the current data that is a close match so the user can decide if the person/company/whatever is already in the system or not. For example the first two letters of the first and last name give surprisingly few matches in a database I have of 10,000 names. Even Jo Sm for Joan Smythe. The same is not generally tru when the conversation turns to referential integrity. Slowly but surely the programming community seems to heve been dragged, kicking and screaming, into turning over referential integrity enforcement to the DBMS. I don't quite understand that the problem is with turning referential integrity over to the DBMS. I'm quite happy with an artificial primary key (autonumber in Access) and unique indexes in data fields as appropriate. That said Access forms and reports don't handle multifield artificial keys very well. The wizards, for example, don't handle them at all. So you're first to labouriously enter the field names by hande in the various properties such as a subforms Link Child Fields and Link Master Fields properties. I should point out that 1) I build simple and moderately complex applications in MS Access so that clearly colours my viewpoints. A product many readers of my postings in this newsgroup hold in derision. 2) I look at this from both the DBAs viewpoint and the developers viewpoint although again many folks in this newsgroup would not consider me a DBA of any sort. 3) I've managed to irritate a number of the folks in this newsgroup with my postings in the past. As a result I've been plonked by a few. Oh well. 3a) As a result if one or two of those read replies to me they will be complaining that they were forced to read my opinion. <shrug Tony -- Tony Toews, Microsoft Access MVP Tony's Main MS Access pages - http://www.granite.ab.ca/accsmstr.htm Tony's Microsoft Access Blog - http://msmvps.com/blogs/access/ Granite Fleet Manager http://www.granitefleet.com/ |
#6
| |||
| |||
|
|
My big bugaboo with the ID field is that if two rows have different IDs but otherwise identical data, there tends to be a disconect between developers and analysts as to whether this is harmful duplication or not. |
#7
| |||
| |||
|
|
Consider business name. Duplicate business names can exist in adjacent provinces, states or countries. |
|
I'm quite happy with an artificial primary key (autonumber in Access) and unique indexes in data fields [sic: columns are not fields] as appropriate. |

#8
| |||
| |||
|
|
Consider business name. Duplicate business names can exist in adjacent provinces, states or countries. Ever look up the DUNS? It is great for International use. Again, do your Google Diligence, do your Wikipedia Diligence. |
|
I'm quite happy with an artificial primary key (autonumber in Access) and unique indexes in data fields [sic: columns are not fields] as appropriate. The Department of Redundancies Department, just like Monty Python ![]() Multi-column keys are quite easy to use with a text editor. |
|
Assuming that you don't name them all "id" in violation of basic data modeling rules. |
#9
| |||
| |||
|
|
The so-called 'natural' key for all the airwaybills in a ULD on a typical airline flight segment is about seven or eight attributes. When the cargo master, at the last minute, offloads a container or two onto the tarmac, you don't want a ponderous logic to un-assign it and put it on the plane at the next gate. You need a form of indirection, for that matter the same is needed in any robotic application such as a freight warehouse. |
#10
| |||
| |||
|
|
In the case of stackoverflow, I don't feel quite comforatble dismissing those who hold opinions that differe from mine as "semi literates". The rest of what they write suggest that they have been building databases for some time. At least, that's true for some of them. ... |
![]() |
| Thread Tools | |
| Display Modes | |
| |