![]() | |
#11
| |||
| |||
|
|
On Fri, 30 Jan 2004 12:46:56 -0500, "Bob Badour" <bbadour (AT) golden (DOT) net wrote: Try thinking about it this way . . . At the conceptual level, a domain is just a data type, and a type is, among other things, a set of all possible values. One way to handle a set of values is to store them in a table. It's the "among other things" that kills you. A variable is not a time invariant set of values and their associated operations. It doesn't kill *me*. SQL databases don't support domains in the relational sense of "domains". That's why it's often helpful to try thinking in other ways. In designing a SQL database based on a logical model, you have to map types, relvars, and assignments (among other things) from the logical model to features supported by the target platform. This TYPE USER_ID POSSREP ( INTEGER ) CONSTRAINT THE_USER_ID IN {1, 112, 314, ... 9997}; (where the enumeration of values is unavoidable) *might* map to something like this for a primitive data type CREATE DOMAIN User_ID AS INTEGER; CONSTRAINT user_id_values CHECK (VALUE IN (1, 112, 314, ... 9997)); But that probably won't work. (And it's not even worth trying for any type more complex than a primitive data type.) Most SQL systems won't let you have more than a few hundred values in that CHECK constraint. Instead, you might map the relational TYPE...CONSTRAINT to a SQL CREATE DOMAIN statement, a CREATE TABLE statement, one INSERT statement for each value, one or more CHECK constraints, a bucketful of REVOKE statements, a FOREIGN KEY constraint from each table in which User_ID is (cough) used, and some administrative procedures to make sure the table is always used. *If* the designer is careful. Do you know a better way to map that TYPE...CONSTRAINT to a SQL dbms like, say, Mimer SQL? (I mention Mimer because you can download it from www.mimer.com.) Let's say you have six thousand non-consecutive integers to enumerate. |
#12
| ||||||
| ||||||
|
|
Mike Sherrill <MSherrillnonono (AT) compuserve (DOT) com> wrote On Fri, 30 Jan 2004 12:46:56 -0500, "Bob Badour" <bbadour (AT) golden (DOT) net wrote: Try thinking about it this way . . . At the conceptual level, a domain is just a data type, and a type is, among other things, a set of all possible values. One way to handle a set of values is to store them in a table. snip |
|
In designing a SQL database based on a logical model, you have to map types, relvars, and assignments (among other things) from the logical model to features supported by the target platform. This TYPE USER_ID POSSREP ( INTEGER ) CONSTRAINT THE_USER_ID IN {1, 112, 314, ... 9997}; (where the enumeration of values is unavoidable) *might* map to something like this for a primitive data type CREATE DOMAIN User_ID AS INTEGER; CONSTRAINT user_id_values CHECK (VALUE IN (1, 112, 314, ... 9997)); But that probably won't work. (And it's not even worth trying for any type more complex than a primitive data type.) Most SQL systems won't let you have more than a few hundred values in that CHECK constraint. Instead, you might map the relational TYPE...CONSTRAINT to a SQL CREATE DOMAIN statement, a CREATE TABLE statement, one INSERT statement for each value, one or more CHECK constraints, a bucketful of REVOKE statements, a FOREIGN KEY constraint from each table in which User_ID is (cough) used, and some administrative procedures to make sure the table is always used. *If* the designer is careful. Do you know a better way to map that TYPE...CONSTRAINT to a SQL dbms like, say, Mimer SQL? (I mention Mimer because you can download it from www.mimer.com.) Let's say you have six thousand non-consecutive integers to enumerate. The only reason you are trying to enumerate "six thousand non-consecutive integers" is because you are under the mistaken impression that the set of users who actually use a system is a domain, when it isn't. |
|
may leave etc. |
|
all time) set of all possible values from which users may be given a user_id. Of course, it is conceivable that an organisation might decide in advance that all user_id values must be taken from the enumerated set {1, 112, 314, ... 9997}, but if so then of course they pay the penalty in the complexity of enforcing the constraint. |
|
more typical would be "between 5 and 10 alhphanumeric characters" or "an integer between 1000 and 9999". I don't understand the criteria upon which you decide that one degree of |
|
I agree that SQL database don't support domains very well, but the example being discussed here is NOT a domain. |
#13
| |||
| |||
|
|
"Tony" <andrewst (AT) onetel (DOT) net.uk> wrote in message news:c0e3f26e.0402090330.1ba1c048 (AT) posting (DOT) google.com... Of course, it is conceivable that an organisation might decide in advance that all user_id values must be taken from the enumerated set {1, 112, 314, ... 9997}, but if so then of course they pay the penalty in the complexity of enforcing the constraint. |
#14
| |||
| |||
|
|
"Tony" <andrewst (AT) onetel (DOT) net.uk> wrote in message news:c0e3f26e.0402090330.1ba1c048 (AT) posting (DOT) google.com... Mike Sherrill <MSherrillnonono (AT) compuserve (DOT) com> wrote in message news:<pp7c20dospovtg3vv925vhtrtikgdvm190 (AT) 4ax (DOT) com>... On Fri, 30 Jan 2004 12:46:56 -0500, "Bob Badour" <bbadour (AT) golden (DOT) net wrote: Try thinking about it this way . . . At the conceptual level, a domain is just a data type, and a type is, among other things, a set of all possible values. One way to handle a set of values is to store them in a table. snip In designing a SQL database based on a logical model, you have to map types, relvars, and assignments (among other things) from the logical model to features supported by the target platform. This TYPE USER_ID POSSREP ( INTEGER ) CONSTRAINT THE_USER_ID IN {1, 112, 314, ... 9997}; (where the enumeration of values is unavoidable) *might* map to something like this for a primitive data type CREATE DOMAIN User_ID AS INTEGER; CONSTRAINT user_id_values CHECK (VALUE IN (1, 112, 314, ... 9997)); But that probably won't work. (And it's not even worth trying for any type more complex than a primitive data type.) Most SQL systems won't let you have more than a few hundred values in that CHECK constraint. Instead, you might map the relational TYPE...CONSTRAINT to a SQL CREATE DOMAIN statement, a CREATE TABLE statement, one INSERT statement for each value, one or more CHECK constraints, a bucketful of REVOKE statements, a FOREIGN KEY constraint from each table in which User_ID is (cough) used, and some administrative procedures to make sure the table is always used. *If* the designer is careful. Do you know a better way to map that TYPE...CONSTRAINT to a SQL dbms like, say, Mimer SQL? (I mention Mimer because you can download it from www.mimer.com.) Let's say you have six thousand non-consecutive integers to enumerate. The only reason you are trying to enumerate "six thousand non-consecutive integers" is because you are under the mistaken impression that the set of users who actually use a system is a domain, when it isn't. This is quite a presumption. While it could be one reason, it's not the only possible reason for defining a "complex" enumerated domain. Quite the contrary, if a system is "modeled" to represent something in the real world upon which we want to make decisions or have the system execute upon, then modeled domains "by constraint" at the logical level are entirely valid. I further submit that having the database management system enforce domains and therefore its consistency is not only desireable, but necessary. Tomorrow, a new user may be enrolled, another may leave etc. Then you change the implementation (the implemented constraint) to reflect a change in the logical model as dictated by the user or business. It's a fact of life that systems in the real world evolve, but it doesn't change the fact that logic should to be deterministic to reflect the real world and provide some modicum of behavior that is both expected and that conforms to the model. By contrast, the domain "user_id" is the (fixed for all time) set of all possible values from which users may be given a user_id. Of course, it is conceivable that an organisation might decide in advance that all user_id values must be taken from the enumerated set {1, 112, 314, ... 9997}, but if so then of course they pay the penalty in the complexity of enforcing the constraint. I think everyone agrees that there is a penalty involved with enforcing certain constraints to ensure quality of the system, but that is not what is at issue. People argued, and still argue, that association by value as a performance liability for the same reasons - to justify their rational for not accepting something. Much more typical would be "between 5 and 10 alhphanumeric characters" or "an integer between 1000 and 9999". I don't understand the criteria upon which you decide that one degree of specificity in defining a domain of representations in the form of strings or numbers is necessarily better than another. For example, if I devised a logical model for "processes" where I defned states of a process at a logical level as only being started and completed, I'd want to ensure that 1) the system only allowed for two state values for any given process; 2) being forced to use SQL enforcement at a lower representation level, I chose two representations and ensured that each representation identified the correct state; and 3) that the DBMS could enforce state transitions if it was so desired. I could choose any two strings or set of two numbers to represent those states, but declaring and defining column variables as VARCHAR(7) or INTEGER(1) will do nothing to enforce the integrity of my model from a data management perspective. If I chose to represent my domain as the strings, 'STARTED' and 'COMPLET', or alternatively as 1 and 2, then the VARCHAR(7) domain definition will not enforce the two state domain in that it allows for 562,949,953,421,302 states other than the two I want - assuming an alphabet that allows for 2^7 symbols, which is equivalent to 28 characters and the disallowance of null. In the case of INTEGER(1), the DBMS would allow for 8 additional states that are not part of my logical domain. Since we are dealing in large part with string and number representations, we might consider a defined language (sequences of symbols composed of alphabets upon which rules of syntax and grammer are applied) a proxy for a defined domain. If you want to argue the point by saying that you have to generalize data and domain definition to allow for evolution and incremental change to what one today defines as the domain, then who is to say the VARCHAR(7) or NUMBER(1), or alternatively, alphanumeric consisting of between 5 and 10 characters, is sufficient? Why not just call everying attributeN and defined everying as alphanumeric 100 wide character strings? Even more importantly, what assurances can the DBMS give me that my universe or model of a shared real world is properly being represented at the current time? I agree that SQL database don't support domains very well, but the example being discussed here is NOT a domain. Yes it is. It's just a rather inconvenient domain. Best regards, Dan |
#15
| ||||
| ||||
|
|
(where the enumeration of values is unavoidable) *might* map to something like this for a primitive data type CREATE DOMAIN User_ID AS INTEGER; CONSTRAINT user_id_values CHECK (VALUE IN (1, 112, 314, ... 9997)); |
|
The only reason you are trying to enumerate "six thousand non-consecutive integers" is because you are under the mistaken impression that the set of users who actually use a system is a domain, when it isn't. |
|
I agree that SQL database don't support domains very well, but the example being discussed here is NOT a domain. |
|
"D Guntermann" <guntermann (AT) hotmail (DOT) com> wrote in message news:Hstwz4.59I (AT) news (DOT) boeing.com... "Tony" <andrewst (AT) onetel (DOT) net.uk> wrote in message news:c0e3f26e.0402090330.1ba1c048 (AT) posting (DOT) google.com... Of course, it is conceivable that an organisation might decide in advance that all user_id values must be taken from the enumerated set {1, 112, 314, ... 9997}, but if so then of course they pay the penalty in the complexity of enforcing the constraint. Dan, Why does it seem your entire response to Tony ignores the above statement? Are you trying to talk past him? |
#16
| ||||||
| ||||||
|
|
The only reason you are trying to enumerate "six thousand non-consecutive integers" is because you are under the mistaken impression that the set of users who actually use a system is a domain, when it isn't. This is quite a presumption. While it could be one reason, it's not the only possible reason for defining a "complex" enumerated domain. Quite the contrary, if a system is "modeled" to represent something in the real world upon which we want to make decisions or have the system execute upon, then modeled domains "by constraint" at the logical level are entirely valid. I further submit that having the database management system enforce domains and therefore its consistency is not only desireable, but necessary. Tomorrow, a new user may be enrolled, another may leave etc. Then you change the implementation (the implemented constraint) to reflect a change in the logical model as dictated by the user or business. It's a fact of life that systems in the real world evolve, but it doesn't change the fact that logic should to be deterministic to reflect the real world and provide some modicum of behavior that is both expected and that conforms to the model. By contrast, the domain "user_id" is the (fixed for all time) set of all possible values from which users may be given a user_id. Of course, it is conceivable that an organisation might decide in advance that all user_id values must be taken from the enumerated set {1, 112, 314, ... 9997}, but if so then of course they pay the penalty in the complexity of enforcing the constraint. I think everyone agrees that there is a penalty involved with enforcing certain constraints to ensure quality of the system, but that is not what is at issue. People argued, and still argue, that association by value as a performance liability for the same reasons - to justify their rational for not accepting something. Much more typical would be "between 5 and 10 alhphanumeric characters" or "an integer between 1000 and 9999". I don't understand the criteria upon which you decide that one degree of specificity in defining a domain of representations in the form of strings or numbers is necessarily better than another. For example, if I devised a logical model for "processes" where I defned states of a process at a logical level as only being started and completed, I'd want to ensure that 1) the system only allowed for two state values for any given process; 2) being forced to use SQL enforcement at a lower representation level, I chose two representations and ensured that each representation identified the correct state; and 3) that the DBMS could enforce state transitions if it was so desired. I could choose any two strings or set of two numbers to represent those states, but declaring and defining column variables as VARCHAR(7) or INTEGER(1) will do nothing to enforce the integrity of my model from a data management perspective. If I chose to represent my domain as the strings, 'STARTED' and 'COMPLET', or alternatively as 1 and 2, then the VARCHAR(7) domain definition will not enforce the two state domain in that it allows for 562,949,953,421,302 states other than the two I want - assuming an alphabet that allows for 2^7 symbols, which is equivalent to 28 characters and the disallowance of null. In the case of INTEGER(1), the DBMS would allow for 8 additional states that are not part of my logical domain. Since we are dealing in large part with string and number representations, we might consider a defined language (sequences of symbols composed of alphabets upon which rules of syntax and grammer are applied) a proxy for a defined domain. If you want to argue the point by saying that you have to generalize data and domain definition to allow for evolution and incremental change to what one today defines as the domain, then who is to say the VARCHAR(7) or NUMBER(1), or alternatively, alphanumeric consisting of between 5 and 10 characters, is sufficient? Why not just call everying attributeN and defined everying as alphanumeric 100 wide character strings? Even more importantly, what assurances can the DBMS give me that my universe or model of a shared real world is properly being represented at the current time? I agree that SQL database don't support domains very well, but the example being discussed here is NOT a domain. Yes it is. It's just a rather inconvenient domain. Best regards, Dan What can I say? I really thought this was quite a simple concept, and yet you have completely failed to comprehend me. |
|
saying that if I run a business with 5 employees whom I assign the user_id's 5, 7, 42, 101 and 123 then any tables that have a user_id column are in the DOMAIN "user_id" that is restricted to the values (5, 7, 42, 101 and 123). |
|
to redfine the "user_id" DOMAIN to include another value? See leading comments.... |
|
Sorry, but you are talking nonsense. |
|
USERS table with a user_id taken from the DOMAIN of user_id's, which may be (for my system) a number between 1 and 999, or a character string of up to 10 characters. I am NOT restricted to the list id user_id's that have currently been allocated to existing employees. Whoever said you were restricted in such a way? I did not intend to give |
|
My point is not that an enumerated domain is impossible, but that if one is defined it should be complete. It does not consist solely of the subset of values that I happen to have used to date. |
#17
| ||||||
| ||||||
|
|
[snip] The only reason you are trying to enumerate "six thousand non-consecutive integers" is because you are under the mistaken impression that the set of users who actually use a system is a domain, when it isn't. This is quite a presumption. While it could be one reason, it's not the only possible reason for defining a "complex" enumerated domain. Quite the contrary, if a system is "modeled" to represent something in the real world upon which we want to make decisions or have the system execute upon, then modeled domains "by constraint" at the logical level are entirely valid. I further submit that having the database management system enforce domains and therefore its consistency is not only desireable, but necessary. Tomorrow, a new user may be enrolled, another may leave etc. Then you change the implementation (the implemented constraint) to reflect a change in the logical model as dictated by the user or business. It's a fact of life that systems in the real world evolve, but it doesn't change the fact that logic should to be deterministic to reflect the real world and provide some modicum of behavior that is both expected and that conforms to the model. By contrast, the domain "user_id" is the (fixed for all time) set of all possible values from which users may be given a user_id. Of course, it is conceivable that an organisation might decide in advance that all user_id values must be taken from the enumerated set {1, 112, 314, ... 9997}, but if so then of course they pay the penalty in the complexity of enforcing the constraint. I think everyone agrees that there is a penalty involved with enforcing certain constraints to ensure quality of the system, but that is not what is at issue. People argued, and still argue, that association by value as a performance liability for the same reasons - to justify their rational for not accepting something. Much more typical would be "between 5 and 10 alhphanumeric characters" or "an integer between 1000 and 9999". I don't understand the criteria upon which you decide that one degree of specificity in defining a domain of representations in the form of strings or numbers is necessarily better than another. For example, if I devised a logical model for "processes" where I defned states of a process at a logical level as only being started and completed, I'd want to ensure that 1) the system only allowed for two state values for any given process; 2) being forced to use SQL enforcement at a lower representation level, I chose two representations and ensured that each representation identified the correct state; and 3) that the DBMS could enforce state transitions if it was so desired. I could choose any two strings or set of two numbers to represent those states, but declaring and defining column variables as VARCHAR(7) or INTEGER(1) will do nothing to enforce the integrity of my model from a data management perspective. If I chose to represent my domain as the strings, 'STARTED' and 'COMPLET', or alternatively as 1 and 2, then the VARCHAR(7) domain definition will not enforce the two state domain in that it allows for 562,949,953,421,302 states other than the two I want - assuming an alphabet that allows for 2^7 symbols, which is equivalent to 28 characters and the disallowance of null. In the case of INTEGER(1), the DBMS would allow for 8 additional states that are not part of my logical domain. Since we are dealing in large part with string and number representations, we might consider a defined language (sequences of symbols composed of alphabets upon which rules of syntax and grammer are applied) a proxy for a defined domain. If you want to argue the point by saying that you have to generalize data and domain definition to allow for evolution and incremental change to what one today defines as the domain, then who is to say the VARCHAR(7) or NUMBER(1), or alternatively, alphanumeric consisting of between 5 and 10 characters, is sufficient? Why not just call everying attributeN and defined everying as alphanumeric 100 wide character strings? Even more importantly, what assurances can the DBMS give me that my universe or model of a shared real world is properly being represented at the current time? I agree that SQL database don't support domains very well, but the example being discussed here is NOT a domain. Yes it is. It's just a rather inconvenient domain. Best regards, Dan What can I say? I really thought this was quite a simple concept, and yet you have completely failed to comprehend me. Well, I don't think I did. But I can see where the disconnect is. You presume that the enumerated domain under discussion represents actual users at time *now* as part of a set of propositions that meet some predicate. I did not see where that was clearly stated in Mr. Sherrill's example and one of your statements allowed for the domain being discussed to be an enumerated domain of "allowable" values, which, given the contradiction, probably confused me. |
|
Another part of the disconnect is that we are discussing this at two different levels - your point, if I understand you correctly, is that you want to arbitrarily define some pool of numbers that works as an identity function, generating user ids for users, now and forever, which I admit is entirely valid. My argument was from a more generalized point of view and did not work under the assumption that the domain under discussion reflected actual current users, in contrast with the pool of allowable user_id's; but rather reflected the viewpoint that the domain was an enumerated domain, representative of any domain that might be finite under some model of logic. |
|
Examples of this might be social security numbers or phone numbers. |
|
Are you seriously saying that if I run a business with 5 employees whom I assign the user_id's 5, 7, 42, 101 and 123 then any tables that have a user_id column are in the DOMAIN "user_id" that is restricted to the values (5, 7, 42, 101 and 123). I won't belabor the point. I think Bob is right in that we are talking past one another, but nonetheless |
|
, is there some criteria that states that a domain constraint is ok in one instance but ridiculous in this particular case? Where is that line? I suppose its where the user draws it. |
|
So if tomorrow I hire a 6th employee I have to redfine the "user_id" DOMAIN to include another value? See leading comments.... Sorry, but you are talking nonsense. I'm sorry you feel that way. Hopefully we can put it behind us. I still stand behind my comments though, given my own set of assumptions. |
#18
| |||
| |||
|
|
[snip] I agree with your sentiments to a point, though I still think there are exceptions, generally speaking. I'll concede this particular case. Sorry for any misunderstanding. Regards, Dan |
|
CREATE TABLE user ( id INT NOT NULL ) <--------------------- | CREATE TABLE user_history ( | user_id INT NOT NULL, oo----------------| valid_from DATETIME NOT NULL, | name CHAR(100) NOT NULL, | ... ) | | CREATE TABLE log ( | id INT NOT NULL, | tstamp DATETIME NOT NULL, | user_id INT NOT NULL, oo---------------- description CHAR(100) NOT NULL, ... ) There are other related tables like log here, where I'm really not interested in historical data but just the identity of the user, and those where exact user properties in point of time matter. The one-field user table smells like a design problem. Merging user and user_history tables introduces ugly problems with relational integrity. Are these one-field tables normal or how am I supposed to tackle this? Please share your thoughts and links. |
|
Smells like a domain. At the conceptual level, every domain is just a column (or several columns) of values. Think of it as a domain of user ID numbers. (And call it "user_id" if you can.) |
![]() |
| Thread Tools | |
| Display Modes | |
| |