![]() | |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
In this thread, I want to further explore the use of ID fields, and approach the question of whether using ID fields is relevant to the discussion of whether an UPDATE is somehow different from a DELETE followed by an INSERT. |
#3
| |||
| |||
|
|
Walter Mitty wrote: In this thread, I want to further explore the use of ID fields, and approach the question of whether using ID fields is relevant to the discussion of whether an UPDATE is somehow different from a DELETE followed by an INSERT. So we're explicitly talking about SQL here, is that right? |
#4
| |||
| |||
|
|
In message <xWQTl.1481$9L2.793 (AT) nwrddc02 (DOT) gnilink.net>, Walter Mitty wamitty (AT) verizon (DOT) net> writes Let's say that the design has followed the practice of including an auto-generated key called ID in every table, and declaring the ID as the PRIMARY KEY. Let's further assume a practice that says that you never update the ID field of any tuple, once it's been assigned. Now let's say you have a table that records values that look like (ID, value). In an actual case, value would probably ityself be a tuple, so our tagged tuples might look like, for example, (ID, LastName, FirstName, PhoneNumber). Now, if we update the LastName, the FirstName, and the Phone number of an entry, it's still the same entry, because the ID field retains the same value. This appears to be an update anomaly but it's not as simple as that. If we think of the logical data structure the ID isn't in the same entity as the data we want to manage. It's in a separate entity with only one attribute. There is supposed to be a 1:1 relationship between the ID entity and the data entity, but this is not usually maintained by the database. The cardinality of that relationship is managed either programmatically or by external business processes. |
#5
| |||
| |||
|
|
Walter Mitty wrote: ... So, even if the relational model doesn't forbid updates on any attribute, the neophyte learning on MS Access is apt to learn that autonumber fields can't be updated. ... The neophyte learning MS Access is learning MS Access, not relational theory. Probably the same can said of whoever designed MS Access. It's true of other fields too that the average practitioner knows very little theory but what is strange to me about this one is that compared to some fields, there is very little relational theory to learn! The question of how to implement 'updates', if asked with relational theory in mind, should be put as how to replace one relation with another. There are lots of ways to program that, how too pick one is a different question. In fact, such comments aren't about relational theory, they would be about 'table theory' if there were such a thing. |
#6
| |||
| |||
|
|
In message <xWQTl.1481$9L2.793 (AT) nwrddc02 (DOT) gnilink.net>, Walter Mitty wamitty (AT) verizon (DOT) net> writes Let's say that the design has followed the practice of including an auto-generated key called ID in every table, and declaring the ID as the PRIMARY KEY. Let's further assume a practice that says that you never update the ID field of any tuple, once it's been assigned. Now let's say you have a table that records values that look like (ID, value). In an actual case, value would probably ityself be a tuple, so our tagged tuples might look like, for example, (ID, LastName, FirstName, PhoneNumber). Now, if we update the LastName, the FirstName, and the Phone number of an entry, it's still the same entry, because the ID field retains the same value. This appears to be an update anomaly but it's not as simple as that. If we think of the logical data structure the ID isn't in the same entity as the data we want to manage. It's in a separate entity with only one attribute. There is supposed to be a 1:1 relationship between the ID entity and the data entity, but this is not usually maintained by the database. The cardinality of that relationship is managed either programmatically or by external business processes. Updating the data without making a simultaneous update to the ID field is perfectly permissible within the relational model. They are after all in separate entities with only a tenuous connection. It's an update anomaly and it's caused by failure to fully normalise the data structure. |
#7
| |||
| |||
|
|
Just a couple of comments about a post that is unusually vague and fuzzy coming from you, it's tedious to dismantle every sentence, so I mention only a couple: |
#8
| |||
| |||
|
|
Walter Mitty wrote: ... I think the process by which 714 comes to be associated with Joe Friday is actually somewhat mysterious. Googling "Badge 714" confirmed this opinion. BTW, I an NOT using "mysterious" as a code word for "mystical". The one thing I can say is that it's not a consequence of the data. It's either part of the data as given, or it's part of some fairly arbitrary process. ... There will always be 'mysteries' that machines don't deal with. I'd say requirements should always trump any notion of some data being innately different from other data. Requirements are arbitrary, to say they aren't is to say any system will suit any purpose. For me if it's a requirement that a system generate badge numbers, that's pretty much the same thing as a requirement that it generate invoice numbers. If it's a requirement that a clerk generate the numbers instead, so be it. Whether the numbers are calculated or sequential or reversible might be another requirement. I agree, in general. But I still think it might be worthwhile to |
#9
| |||
| |||
|
|
"paul c" <toledobythesea (AT) oohay (DOT) ac> wrote in message news:vdUTl.29814$PH1.1299 (AT) edtnps82 (DOT) .. Just a couple of comments about a post that is unusually vague and fuzzy coming from you, it's tedious to dismantle every sentence, so I mention only a couple: I apologize for how vague and fuzzy this topic is. If my wording is vague and fuzzy, my thinking is even more vague and fuzzy. The first thing is whether or not I'm talking about a tuple within a tuple. This was a digression. As far as I'm concerned, the difference between the following two is not fundamental: (714, Joe, Friday, 234-5678) (714, (Joe, Friday,234-5678)) At least not fundamental for the purposes of figuring out how "714" comes to be associated with Joe Friday and not with some other policeman. Next, the question arises whether the association is managed by the database or whether its managed programatically or by some external function, manual or automatic. Some responder addressed precisely that question. I think it was Bernard. In my experience discussing this topic with fans of the ID field concept, I've found that, contrary to Bernard's opinion, they turn over management of the association to the database. I smell a rat at this point, because I think they are asking a database to perform a function that is not really a database function. I think that assigning 714 to Joe Friday is inventing data, and it's not the database system's job to invent data. It's the database system's job to manage the data it's given. That's not to say that some popular tool like MS Access might not do it anyway. And it's not clear to me whether MS Access is really a DBMS. I think it's an application management system. I think the process by which 714 comes to be associated with Joe Friday is actually somewhat mysterious. Googling "Badge 714" confirmed this opinion. BTW, I an NOT using "mysterious" as a code word for "mystical". The one thing I can say is that it's not a consequence of the data. It's either part of the data as given, or it's part of some fairly arbitrary process. Next, my statement of the issue is fogged by my own ignorance of relational theory. My introduction to relational theory was only as a backdrop to my introduction to database design. At the time, I was learning how to design SQL databases, which all of us called "relational databases". I don't recall if I heard Ed Codd's name, but I know I never read anything he wrote until much later. This informal and spotty education notwithstanding, I continue to view my own interpretation of relational theory to be largely orthodox, when compared to some of the crazy things that have been accepted as "best practices" in database design. Anyway, how would relational theory decribe the operation that I called "UPDATE" (using the SQL keyword intentionally.)? would it be something like the following? relvar R gets assigned relvar R minus relation(tuple T) plus relation(tuple U). How does this differ from a DELETE followed by an INSERT? I don't think this post clears up any of the fuzziness. But maybe it lays the groundwork for clearingf it up. |
ivorced},
ivorced},
ivorced}}
ivorced}}
ivorced}}
ivorced}}
ivorced}}
#10
| |||
| |||
|
|
In another thread, I suggested that prefixing every tuple with an auto-generated field (column) called ID amounted to reference by location rather than reference by content. Some people correctly point out to me that there is a level of indirection gained by using ID fields instead of pointers. That means the records (rows) "pointed to" by references to the ID are not pinned, and can still be shuffled. That's good, but it's trivial, IMO, when compared to the advantages claimed by Codd and refined by Date for the Relational model over alternative models. As far as I can see, a database where every foreign key refers to an auto-generated ID column is just a graph database disguised to look like a relational database. In this thread, I want to further explore the use of ID fields, and approach the question of whether using ID fields is relevant to the discussion of whether an UPDATE is somehow different from a DELETE followed by an INSERT. I declined to engage Brian in this question when he posted a response in the earlier thread. I'm willing to dialogue about that here. Let's say that the design has followed the practice of including an auto-generated key called ID in every table, and declaring the ID as the PRIMARY KEY. Let's further assume a practice that says that you never update the ID field of any tuple, once it's been assigned. Now let's say you have a table that records values that look like (ID, value). In an actual case, value would probably ityself be a tuple, so our tagged tuples might look like, for example, (ID, LastName, FirstName, PhoneNumber). Now, if we update the LastName, the FirstName, and the Phone number of an entry, it's still the same entry, because the ID field retains the same value. Now what happens if instead of updating the three dependent fields, we delete the row, and insert a new one. Here, I'm making an assumption about how auto-generate works. I'm assuming that, once used, and ID will never be auto-generated again, even if it's no longer in use. So, when we delete a row and insert a new row, we get a brand new ID. All of the references to the deleted row remain as orphans, unless they got deleted by a cascaded delete. So, Brian's claim that deleting a value and inserting a new one is somehow really different from updating an existing value works in this model, if you regard ID as somehow "special", and not "part of the value recorded in the table". |
![]() |
| Thread Tools | |
| Display Modes | |
| |