![]() | |
#21
| |||
| |||
|
|
On Dec 27, 3:31 pm, Hugo Kornelis h... (AT) perFact (DOT) REMOVETHIS.info.INVALID> wrote: On Wed, 26 Dec 2007 23:17:52 -0400, Bob Badour wrote: Attempting to store both the age and the reason why an age is unknown in a single column violates first normal form. Attempting to store both the age and whether the age is known in a single column etc. etc. |
#22
| |||
| |||
|
|
See my reply to JOG's message. |
#23
| |||
| |||
|
|
On Thu, 27 Dec 2007 16:41:51 -0800 (PST), JOG wrote: On Dec 27, 11:31 pm, Hugo Kornelis h... (AT) perFact (DOT) REMOVETHIS.info.INVALID> wrote: On Wed, 26 Dec 2007 23:17:52 -0400, Bob Badour wrote: Codd pointed out that a single NULL marker did not suffice and suggested 2 markers. Date pointed out that one can apply the same argument to 2 markers leading to an infinite progression once one heads down that path, which suggests the path was never a productive one to head down in the first place. Hi Bob, Unfortunately, both Codd and Date forgot that the NULL marker in (for instance) the age column should represent only that the age is not on file and not try to represent a reason for this as well. Codd's suggestions to use two markers (for "not applicable" and "unknown", IIRC) assumes that we want to store both the age of a person (if on file), and the reason why an age is not on file (if it isn't). That can of course be necessary - but in that case, we have two attributes that should be stored in two seperate columns. Attempting to store both the age and the reason why an age is unknown in a single column violates first normal form. Is the Zaniolo approach you are favouring not doing exactly the same thing? Surely it is attempting to store both an age and the fact that an age is not on file in a single column, which also violates first normal form. Hi J, That would be the case if I were intentionally storing the fact that an age is not on file, and using (or rather: abusing) the age column for that. But I'm not. I'm just storing ages I have on file. Yes, if there is no value (as represented by the NULL marker) in this column in a specific row, then you can infer from this that I have no age on file for the person described by that row - but this is an unintended though unavoidable by-effect. Just as the fact that you can infer "Jack is an adult" from "Jack is 43 years old" - surely you wouldn't use that to maintain that storing an age violated 1NF? It still appears to be a hack imo, and one still ends up with 3VL. 3VL is not a result of using NULL to represent missing information, but a result of allowing missing information. IMO, there are only two options: either you deal with missing information, and with the 3VL that results from it -- or you somehow alter reality so that information is never missing again, for any reason. There has to be a more elegant way....Regards, J. I doubt it. Given this information: * "Employee Jack is male" * "Employee Mary is female" * "Employee Jack is 43 years old" * "Employee JJ is 32 years old" how would you answer the below questions: * "List all employees aged 40 and above". * "What is the average age of our employees?" * "For each employee, how many years left until retirement (assume a country with laws for retirement age of 65)" * "List all employees that are female, under 35 years old, or both" * "Is JJ older than Mary?" * etc Try to describe how you would answer those questions, in an elegant way and without using anything even remotely resembling 3VL. If you succeed at that, a database without 3VL is just around the corner - because the hardest part is not implementing, but finding out what to implement. I can give answers to all questions above, but I have to use 3VL in all cases. For me, that signifies that 3VL is part of reality and hence can't be left out of a database that attempts to model (aspects of) reality. |
#24
| ||||||
| ||||||
|
|
On Thu, 27 Dec 2007 16:41:51 -0800 (PST), JOG wrote: On Dec 27, 11:31 pm, Hugo Kornelis h... (AT) perFact (DOT) REMOVETHIS.info.INVALID> wrote: On Wed, 26 Dec 2007 23:17:52 -0400, Bob Badour wrote: Codd pointed out that a single NULL marker did not suffice and suggested 2 markers. Date pointed out that one can apply the same argument to 2 markers leading to an infinite progression once one heads down that path, which suggests the path was never a productive one to head down in the first place. Hi Bob, Unfortunately, both Codd and Date forgot that the NULL marker in (for instance) the age column should represent only that the age is not on file and not try to represent a reason for this as well. Codd's suggestions to use two markers (for "not applicable" and "unknown", IIRC) assumes that we want to store both the age of a person (if on file), and the reason why an age is not on file (if it isn't). That can of course be necessary - but in that case, we have two attributes that should be stored in two seperate columns. Attempting to store both the age and the reason why an age is unknown in a single column violates first normal form. Is the Zaniolo approach you are favouring not doing exactly the same thing? Surely it is attempting to store both an age and the fact that an age is not on file in a single column, which also violates first normal form. Hi J, That would be the case if I were intentionally storing the fact that an age is not on file, and using (or rather: abusing) the age column for that. But I'm not. I'm just storing ages I have on file. |
|
Yes, if there is no value (as represented by the NULL marker) in this column in a specific row, then you can infer from this that I have no age on file for the person described by that row - but this is an unintended though unavoidable by-effect. Just as the fact that you can infer "Jack is an adult" from "Jack is 43 years old" - surely you wouldn't use that to maintain that storing an age violated 1NF? |
|
It still appears to be a hack imo, and one still ends up with 3VL. 3VL is not a result of using NULL to represent missing information, but a result of allowing missing information. IMO, there are only two options: either you deal with missing information, and with the 3VL that results from it -- or you somehow alter reality so that information is never missing again, for any reason. |
|
There has to be a more elegant way....Regards, J. I doubt it. Given this information: * "Employee Jack is male" * "Employee Mary is female" * "Employee Jack is 43 years old" * "Employee JJ is 32 years old" how would you answer the below questions: * "List all employees aged 40 and above". * "What is the average age of our employees?" * "For each employee, how many years left until retirement (assume a country with laws for retirement age of 65)" * "List all employees that are female, under 35 years old, or both" * "Is JJ older than Mary?" * etc |
|
Try to describe how you would answer those questions, in an elegant way and without using anything even remotely resembling 3VL. If you succeed at that, a database without 3VL is just around the corner - because the hardest part is not implementing, but finding out what to implement. |

|
I can give answers to all questions above, but I have to use 3VL in all cases. For me, that signifies that 3VL is part of reality and hence can't be left out of a database that attempts to model (aspects of) reality. Best, Hugo |
#25
| ||||
| ||||
|
|
3VL is not a result of using NULL to represent missing information, but a result of allowing missing information. IMO, there are only two options: either you deal with missing information, and with the 3VL that results from it -- or you somehow alter reality so that information is never missing again, for any reason. |
|
There has to be a more elegant way....Regards, J. I doubt it. Given this information: * "Employee Jack is male" * "Employee Mary is female" * "Employee Jack is 43 years old" * "Employee JJ is 32 years old" how would you answer the below questions: * "List all employees aged 40 and above". * "What is the average age of our employees?" * "For each employee, how many years left until retirement (assume a country with laws for retirement age of 65)" * "List all employees that are female, under 35 years old, or both" * "Is JJ older than Mary?" * etc |
|
[...] I can give answers to all questions above, but I have to use 3VL in all cases. |
|
For me, that signifies that 3VL is part of reality and hence can't be left out of a database that attempts to model (aspects of) reality. |
#26
| |||
| |||
|
#27
| |||
| |||
|
|
I believe myself, david and marshall have all just said the same thing in different ways... |
#28
| |||
| |||
|
|
If everyone is clear NULLS shouldn't be used, why the debate as to what to do about them ? Because SQL allows NULL and even promotes the idea that NULL solves some problem instead of introducing many. So everyone is agreed that NULLs shouldn't appear anywhere, and its just a matter of time before NULLS become a legacy problem and a relational language supercedes SQL? |
|
Are the two solutions I suggested before the widely accepted as resolutions to the two problems NULLs were introduced to eradicate? Missing information? Then surely there should just be no tuple? |
|
To indicate that there can be no value? Why not a seperate table with a boolean value in the non-key column? |
#29
| |||
| |||
|
|
"stevedtrm" <steved... (AT) hotmail (DOT) com> wrote in message news:a3716365-7bc4-4a75-ab58-ee7ac957c299 (AT) n20g2000hsh (DOT) googlegroups.com... If everyone is clear NULLS shouldn't be used, why the debate as to what to do about them ? Because SQL allows NULL and even promotes the idea that NULL solves some problem instead of introducing many. So everyone is agreed that NULLs shouldn't appear anywhere, and its just a matter of time before NULLS become a legacy problem and a relational language supercedes SQL? No, Steve, not everyone is agreed. Are the two solutions I suggested before the widely accepted as resolutions to the two problems NULLs were introduced to eradicate? Missing information? Then surely there should just be no tuple? A row may contain a lot more information that would otherwise be useful. Would you tell a potential customer that you won't serve them because you don't know everything about them? I think not! Haven't you been asked for your e-mail address at a retailer? What would you think if the cashier told you to get lost for refusing to supply it? To indicate that there can be no value? Why not a seperate table with a boolean value in the non-key column? In my opinion, null should only ever be used to indicate "There should be a value here, but it hasn't been supplied." That there should be a value or that there must be a value are structural issues, and therefore should not be subject to interpretation. "No value here" invites interpretation, because it begs the question, "Should there have even been a value?" Structural information is lost by separating a nullable column into its own table: the fact that there should always be a value, even if it hasn't been supplied is lost. |
|
That's not to say that it isn't possible to eliminate nulls altoghether, it's just that you need several separate tables instead of one: one table for the values that must always be present, one table for each nullable column (or set of columns in those cases where whenever one value is supplied, another value must also be supplied) that indicates whether or not there should be a value (or set of values), and one table with a column (or set of columns) for those values that just happen to have been supplied. Note also that the spectre of 3VL disappears when null indicates only "there should be a value here, but it hasn't been supplied." Since there should be a value, and only one value, a null can simply be replaced by an exclusive disjunction that encompasses all of the values in the attribute's domain. So there is no need for 3VL, 2VL suffices. (For infinite domains, a quantifier could be used that means instead of "there is...," "there is exactly one....") |
#30
| |||
| |||
|
|
"stevedtrm" <stevedtrm (AT) hotmail (DOT) com> wrote in message news:a3716365-7bc4-4a75-ab58-ee7ac957c299 (AT) n20g2000hsh (DOT) googlegroups.com... If everyone is clear NULLS shouldn't be used, why the debate as to what to do about them ? Because SQL allows NULL and even promotes the idea that NULL solves some problem instead of introducing many. So everyone is agreed that NULLs shouldn't appear anywhere, and its just a matter of time before NULLS become a legacy problem and a relational language supercedes SQL? No, Steve, not everyone is agreed. Are the two solutions I suggested before the widely accepted as resolutions to the two problems NULLs were introduced to eradicate? Missing information? Then surely there should just be no tuple? A row may contain a lot more information that would otherwise be useful. Would you tell a potential customer that you won't serve them because you don't know everything about them? I think not! Haven't you been asked for your e-mail address at a retailer? What would you think if the cashier told you to get lost for refusing to supply it? To indicate that there can be no value? Why not a seperate table with a boolean value in the non-key column? In my opinion, null should only ever be used to indicate "There should be a value here, but it hasn't been supplied." |
![]() |
| Thread Tools | |
| Display Modes | |
| |