![]() | |
#21
| |||
| |||
|
|
paul c wrote: Bob Badour wrote: ... You asked about semantic joinability. Semantically, the join works if like-named attributes have an equality comparison defined and not if they don't. ... In RT, how could they not define equality? What reason would one have to define an equality comparison to compare employee ids with department ids or with names? Or charge with current or potential? Or age with price? Hi paul, |
#22
| |||
| |||
|
|
What reason would one have to define an equality comparison to compare employee ids with department ids or with names? Or charge with current or potential? Or age with price? |
#23
| |||
| |||
|
|
Bob Badour wrote: paul c wrote: Bob Badour wrote: ... You asked about semantic joinability. Semantically, the join works if like-named attributes have an equality comparison defined and not if they don't. ... In RT, how could they not define equality? What reason would one have to define an equality comparison to compare employee ids with department ids or with names? Or charge with current or potential? Or age with price? None in theory, I thought you were talking about attributes in the same domain. |
#24
| |||
| |||
|
|
On Oct 24, 8:56 pm, Bob Badour <bbad... (AT) pei (DOT) sympatico.ca> wrote: What reason would one have to define an equality comparison to compare employee ids with department ids or with names? Or charge with current or potential? Or age with price? shoot your age and you are worth your weight in gold. |
#25
| |||
| |||
|
|
paul c wrote: Bob Badour wrote: paul c wrote: Bob Badour wrote: ... You asked about semantic joinability. Semantically, the join works if like-named attributes have an equality comparison defined and not if they don't. ... In RT, how could they not define equality? What reason would one have to define an equality comparison to compare employee ids with department ids or with names? Or charge with current or potential? Or age with price? None in theory, I thought you were talking about attributes in the same domain. Nope. Natural join doesn't care about attributes in the same domain per se. It compares like named attributes regardless of domain. TTM explains when and how the most specific types can differ and still have an equality comparison return true: they must share some common supertype other than the universal supertype. In this case, the MST of the join will be the most specific common supertype. If the MSTs of the original relations are mutually exclusive, the body will be empty. If not, the join body will consist of values drawn from subtypes of both MSTs and that appear in both relations. Otherwise, the body of the join will be empty and the type of the like named attribute will be the universal supertype. As a type with a set of values and a set of operations, the universal supertype's set of values is the union of all values, and its set of operations is empty. I expect most implementations would treat an attribute with the universal supertype as an error, or at least complain loudly with a warning. |
#26
| |||
| |||
|
|
Bob Badour wrote: paul c wrote: Bob Badour wrote: paul c wrote: Bob Badour wrote: ... You asked about semantic joinability. Semantically, the join works if like-named attributes have an equality comparison defined and not if they don't. ... In RT, how could they not define equality? What reason would one have to define an equality comparison to compare employee ids with department ids or with names? Or charge with current or potential? Or age with price? None in theory, I thought you were talking about attributes in the same domain. Nope. Natural join doesn't care about attributes in the same domain per se. It compares like named attributes regardless of domain. TTM explains when and how the most specific types can differ and still have an equality comparison return true: they must share some common supertype other than the universal supertype. In this case, the MST of the join will be the most specific common supertype. If the MSTs of the original relations are mutually exclusive, the body will be empty. If not, the join body will consist of values drawn from subtypes of both MSTs and that appear in both relations. Otherwise, the body of the join will be empty and the type of the like named attribute will be the universal supertype. As a type with a set of values and a set of operations, the universal supertype's set of values is the union of all values, and its set of operations is empty. I expect most implementations would treat an attribute with the universal supertype as an error, or at least complain loudly with a warning. I can't make much comment on type theory, which I think is optional, not essential to RT. Appendix A, the relational underpinning of TTM, doesn't depend on MST's or inheritance, just set membership except that set equality crops up for unions of headings, as far as I can tell. It does say, eg., in the <AND> definition: "It is required that if <A,T1 'is member of' Hr1 and <A,T2> 'is member of' Hr2, then T1 = T2" (edited quote). I presume those qualified attributes are the "like named" ones you mean. While I wouldn't argue when you say "it compares like named attributes regardless of domain", TTM does require that the "state is set" beforehand, Appendix A uses the qualification to ensure that the type for both attributes is the same. |
#27
| |||
| |||
|
|
Anyhow, the question here is not one of our imagination but rather simply this: if it makes sense for the RM to support constraints on relational /values/ (taken on by variables) why does it not make sense to support constraints on relational /expressions/? That is a question of general principle not specific design. |
#28
| |||
| |||
|
|
Domains, however, define what is or is not joinable because natural join applies the equality comparison to attributes with like names, and domains define equality comparisons. |
#29
| |||
| |||
|
|
Hmm. Usually we talk about constraints when they're syntactic, because that's what makes it possible to enforce them automatically, simply by manipulating the symbols in the database. So what you actually mean is, what syntactic construct is appropriate for enforcing the semantics you describe. I'd argue that that's what domains were invented for. I agree that it is possible to enforce such join semantics with domains but I'm not sure it is entirely appropriate. I say that because to the greatest extent the RM is orthogonal to domains. To quote CJ Date "Databases In Depth" Chapter 2 Summary: "It's a very common misconception that the relational model deals only with rather simple types: numbers, strings, perhaps dates and times, and not much else. In this chapter, I've tried to show that this is indeed a misconception. Rather, relations can have attributes of /any type whatsoever/ -- the relational model nowhere predicates what those types must be, and in fact they can be as complex as we like ... In other words, the question as to what types are supported is orthogonal to the question of support for the relational model itself. Or (less precisely but more catchily): /types are orthogonal to tables/." To put this another way, the RM should be just as complete and effective for a universe with a single domain as it is for a universe with a rich domain set (such as you propose to handle my GenericID join constraint example). And placing constraints on /relational expressions/ seems as fundamental to the RM as placing constraints on /relational values/. By pushing such expression constraints off to domains, we are admitting that the RM itself has no support for constraining relational expressions and I'm not prepared to admit that limitation. Especially since it does provide support for constraining the values of relational variables. I will also note that both Codd and Date proposed RM mechanisms to operate the other way around ie to allow designers to force cross-domain operations. Codd with "Domain Check Overrides" and date with the THE_ operators that provide type coercion. Domains, however, define what is or is not joinable because natural join applies the equality comparison to attributes with like names, and domains define equality comparisons. Domains also define a set of permissible values for attributes. However, that doesn't stop RM from adding additional constraints to limit those values further. You asked about semantic joinability. Semantically, the join works if like-named attributes have an equality comparison defined and not if they don't. Why should relational operations be any different in this respect from relational variables? Ie why shouldn't the RM allow us to further constrain what domains permit for both values AND operations? Because the RM doesn't have anything much to say about domains. Domains are a separate--albeit vitally important--concern. |
#30
| |||||||||||||
| |||||||||||||
|
|
I agree that it is possible to enforce such join semantics with domains but I'm not sure it is entirely appropriate. |
|
To put this another way, the RM should be just as complete and effective for a universe with a single domain as it is for a universe with a rich domain set (such as you propose to handle my GenericID join constraint example). |
|
By pushing such expression constraints off to domains, we are admitting that the RM itself has no support for constraining relational expressions and I'm not prepared to admit that limitation. Especially since it does provide support for constraining the values of relational variables. |
|
I will also note that both Codd and Date proposed RM mechanisms to operate the other way around ie to allow designers to force cross-domain operations. Codd with "Domain Check Overrides" and date with the THE_ operators that provide type coercion. |
|
I think there are other views, see above. |
|
In short, we should not let our lack of imagination guide our design principles. |
|
It's not for us to wonder such whys because our imagination is usually quite limited. For example, GenericID seems an entirely reasonable thing to me if I labelled every item I sent out (even the coupons) with a barcode and have other relations expressing facts about those barcodes apart from what they label. |

|
Anyhow, the question here is not one of our imagination but rather simply this: if it makes sense for the RM to support constraints on relational /values/ (taken on by variables) why does it not make sense to support constraints on relational /expressions/? |
|
If you have the means of keeping such ID's private -- this would necessitate making the ID's fully opaque from an outside user's point of view, which cannot be done in any DBMS I know of -- you might just be able to justify their existence as surrogates. But this is still stuff that requires tremendous discipline and insight into the entity integrity issues that need to be addressed in addition to the referential integrity between the surrogates. I wouldn't recommend going down this slippery slope unless you're *absolutely* sure you know what you're doing -- I haven't seen a single wider scale deployment which got this right and didn't suffer integrity issues in the long run. Frankly I don't understand the point of the above paragraph at all. |
|
Also I don't see how entity integrity enters into this at all; I'm assuming we follow the sage advice of having no NULLs to begin with; so entity integrity is non-issue. |
|
I don't think I'm talking about "physical level" types (posreps) at all. |
|
Consider Date's point that RM is orthogonal to type support. Then what argument from principle do you have for prohibiting the RM from defining constraints on relational expressions by pushing that responsibility off to domain support? |
|
How would we express the constraints in a universe with one domain? |
![]() |
| Thread Tools | |
| Display Modes | |
| |