dbTalk Databases Forums  

Re: Table design question

comp.databases.theory comp.databases.theory


Discuss Re: Table design question in the comp.databases.theory forum.



Reply
 
Thread Tools Display Modes
  #11  
Old   
Tony
 
Posts: n/a

Default Re: Table design question - 02-09-2004 , 05:30 AM






Mike Sherrill <MSherrillnonono (AT) compuserve (DOT) com> wrote

Quote:
On Fri, 30 Jan 2004 12:46:56 -0500, "Bob Badour" <bbadour (AT) golden (DOT) net
wrote:

Try thinking about it this way . . .

At the conceptual level, a domain is just a data type, and a type is,
among other things, a set of all possible values. One way to handle a
set of values is to store them in a table.

It's the "among other things" that kills you. A variable is not a time
invariant set of values and their associated operations.

It doesn't kill *me*. SQL databases don't support domains in the
relational sense of "domains". That's why it's often helpful to try
thinking in other ways.

In designing a SQL database based on a logical model, you have to map
types, relvars, and assignments (among other things) from the logical
model to features supported by the target platform.

This

TYPE USER_ID POSSREP ( INTEGER )
CONSTRAINT THE_USER_ID IN {1, 112, 314, ... 9997};

(where the enumeration of values is unavoidable) *might* map to
something like this for a primitive data type

CREATE DOMAIN User_ID AS INTEGER;
CONSTRAINT user_id_values
CHECK (VALUE IN (1, 112, 314, ... 9997));

But that probably won't work. (And it's not even worth trying for any
type more complex than a primitive data type.) Most SQL systems won't
let you have more than a few hundred values in that CHECK constraint.

Instead, you might map the relational TYPE...CONSTRAINT to a SQL
CREATE DOMAIN statement, a CREATE TABLE statement, one INSERT
statement for each value, one or more CHECK constraints, a bucketful
of REVOKE statements, a FOREIGN KEY constraint from each table in
which User_ID is (cough) used, and some administrative procedures to
make sure the table is always used. *If* the designer is careful.

Do you know a better way to map that TYPE...CONSTRAINT to a SQL dbms
like, say, Mimer SQL? (I mention Mimer because you can download it
from www.mimer.com.) Let's say you have six thousand non-consecutive
integers to enumerate.
The only reason you are trying to enumerate "six thousand
non-consecutive integers" is because you are under the mistaken
impression that the set of users who actually use a system is a
domain, when it isn't. Tomorrow, a new user may be enrolled, another
may leave etc. By contrast, the domain "user_id" is the (fixed for
all time) set of all possible values from which users may be given a
user_id. Of course, it is conceivable that an organisation might
decide in advance that all user_id values must be taken from the
enumerated set {1, 112, 314, ... 9997}, but if so then of course they
pay the penalty in the complexity of enforcing the constraint. Much
more typical would be "between 5 and 10 alhphanumeric characters" or
"an integer between 1000 and 9999".

I agree that SQL database don't support domains very well, but the
example being discussed here is NOT a domain.


Reply With Quote
  #12  
Old   
D Guntermann
 
Posts: n/a

Default Re: Table design question - 02-09-2004 , 12:06 PM






"Tony" <andrewst (AT) onetel (DOT) net.uk> wrote

Quote:
Mike Sherrill <MSherrillnonono (AT) compuserve (DOT) com> wrote

On Fri, 30 Jan 2004 12:46:56 -0500, "Bob Badour" <bbadour (AT) golden (DOT) net
wrote:

Try thinking about it this way . . .

At the conceptual level, a domain is just a data type, and a type is,
among other things, a set of all possible values. One way to handle
a
set of values is to store them in a table.

snip

Quote:
In designing a SQL database based on a logical model, you have to map
types, relvars, and assignments (among other things) from the logical
model to features supported by the target platform.

This

TYPE USER_ID POSSREP ( INTEGER )
CONSTRAINT THE_USER_ID IN {1, 112, 314, ... 9997};

(where the enumeration of values is unavoidable) *might* map to
something like this for a primitive data type

CREATE DOMAIN User_ID AS INTEGER;
CONSTRAINT user_id_values
CHECK (VALUE IN (1, 112, 314, ... 9997));

But that probably won't work. (And it's not even worth trying for any
type more complex than a primitive data type.) Most SQL systems won't
let you have more than a few hundred values in that CHECK constraint.

Instead, you might map the relational TYPE...CONSTRAINT to a SQL
CREATE DOMAIN statement, a CREATE TABLE statement, one INSERT
statement for each value, one or more CHECK constraints, a bucketful
of REVOKE statements, a FOREIGN KEY constraint from each table in
which User_ID is (cough) used, and some administrative procedures to
make sure the table is always used. *If* the designer is careful.

Do you know a better way to map that TYPE...CONSTRAINT to a SQL dbms
like, say, Mimer SQL? (I mention Mimer because you can download it
from www.mimer.com.) Let's say you have six thousand non-consecutive
integers to enumerate.

The only reason you are trying to enumerate "six thousand
non-consecutive integers" is because you are under the mistaken
impression that the set of users who actually use a system is a
domain, when it isn't.
This is quite a presumption. While it could be one reason, it's not the
only possible reason for defining a "complex" enumerated domain.

Quite the contrary, if a system is "modeled" to represent something in the
real world upon which we want to make decisions or have the system execute
upon, then modeled domains "by constraint" at the logical level are entirely
valid. I further submit that having the database management system enforce
domains and therefore its consistency is not only desireable, but necessary.

Tomorrow, a new user may be enrolled, another
Quote:
may leave etc.
Then you change the implementation (the implemented constraint) to reflect a
change in the logical model as dictated by the user or business. It's a
fact of life that systems in the real world evolve, but it doesn't change
the fact that logic should to be deterministic to reflect the real world and
provide some modicum of behavior that is both expected and that conforms to
the model.

By contrast, the domain "user_id" is the (fixed for
Quote:
all time) set of all possible values from which users may be given a
user_id. Of course, it is conceivable that an organisation might
decide in advance that all user_id values must be taken from the
enumerated set {1, 112, 314, ... 9997}, but if so then of course they
pay the penalty in the complexity of enforcing the constraint.
I think everyone agrees that there is a penalty involved with enforcing
certain constraints to ensure quality of the system, but that is not what is
at issue. People argued, and still argue, that association by value as a
performance liability for the same reasons - to justify their rational for
not accepting something.

Much
Quote:
more typical would be "between 5 and 10 alhphanumeric characters" or
"an integer between 1000 and 9999".

I don't understand the criteria upon which you decide that one degree of
specificity in defining a domain of representations in the form of strings
or numbers is necessarily better than another.

For example, if I devised a logical model for "processes" where I defned
states of a process at a logical level as only being started and completed,
I'd want to ensure that 1) the system only allowed for two state values for
any given process; 2) being forced to use SQL enforcement at a lower
representation level, I chose two representations and ensured that each
representation identified the correct state; and 3) that the DBMS could
enforce state transitions if it was so desired.

I could choose any two strings or set of two numbers to represent those
states, but declaring and defining column variables as VARCHAR(7) or
INTEGER(1) will do nothing to enforce the integrity of my model from a data
management perspective. If I chose to represent my domain as the strings,
'STARTED' and 'COMPLET', or alternatively as 1 and 2, then the VARCHAR(7)
domain definition will not enforce the two state domain in that it allows
for 562,949,953,421,302 states other than the two I want - assuming an
alphabet that allows for 2^7 symbols, which is equivalent to 28 characters
and the disallowance of null. In the case of INTEGER(1), the DBMS would
allow for 8 additional states that are not part of my logical domain.

Since we are dealing in large part with string and number representations,
we might consider a defined language (sequences of symbols composed of
alphabets upon which rules of syntax and grammer are applied) a proxy for a
defined domain.

If you want to argue the point by saying that you have to generalize data
and domain definition to allow for evolution and incremental change to what
one today defines as the domain, then who is to say the VARCHAR(7) or
NUMBER(1), or alternatively, alphanumeric consisting of between 5 and 10
characters, is sufficient? Why not just call everying attributeN and
defined everying as alphanumeric 100 wide character strings?

Even more importantly, what assurances can the DBMS give me that my universe
or model of a shared real world is properly being represented at the current
time?

Quote:
I agree that SQL database don't support domains very well, but the
example being discussed here is NOT a domain.
Yes it is. It's just a rather inconvenient domain.

Best regards,

Dan




Reply With Quote
  #13  
Old   
Bob Badour
 
Posts: n/a

Default Re: Table design question - 02-09-2004 , 03:53 PM



"D Guntermann" <guntermann (AT) hotmail (DOT) com> wrote

Quote:
"Tony" <andrewst (AT) onetel (DOT) net.uk> wrote in message
news:c0e3f26e.0402090330.1ba1c048 (AT) posting (DOT) google.com...

Of course, it is conceivable that an organisation might
decide in advance that all user_id values must be taken from the
enumerated set {1, 112, 314, ... 9997}, but if so then of course they
pay the penalty in the complexity of enforcing the constraint.
Dan,

Why does it seem your entire response to Tony ignores the above statement?
Are you trying to talk past him?




Reply With Quote
  #14  
Old   
Tony
 
Posts: n/a

Default Re: Table design question - 02-09-2004 , 05:13 PM



"D Guntermann" <guntermann (AT) hotmail (DOT) com> wrote

Quote:
"Tony" <andrewst (AT) onetel (DOT) net.uk> wrote in message
news:c0e3f26e.0402090330.1ba1c048 (AT) posting (DOT) google.com...
Mike Sherrill <MSherrillnonono (AT) compuserve (DOT) com> wrote in message
news:<pp7c20dospovtg3vv925vhtrtikgdvm190 (AT) 4ax (DOT) com>...
On Fri, 30 Jan 2004 12:46:56 -0500, "Bob Badour" <bbadour (AT) golden (DOT) net
wrote:

Try thinking about it this way . . .

At the conceptual level, a domain is just a data type, and a type is,
among other things, a set of all possible values. One way to handle
a
set of values is to store them in a table.

snip

In designing a SQL database based on a logical model, you have to map
types, relvars, and assignments (among other things) from the logical
model to features supported by the target platform.

This

TYPE USER_ID POSSREP ( INTEGER )
CONSTRAINT THE_USER_ID IN {1, 112, 314, ... 9997};

(where the enumeration of values is unavoidable) *might* map to
something like this for a primitive data type

CREATE DOMAIN User_ID AS INTEGER;
CONSTRAINT user_id_values
CHECK (VALUE IN (1, 112, 314, ... 9997));

But that probably won't work. (And it's not even worth trying for any
type more complex than a primitive data type.) Most SQL systems won't
let you have more than a few hundred values in that CHECK constraint.

Instead, you might map the relational TYPE...CONSTRAINT to a SQL
CREATE DOMAIN statement, a CREATE TABLE statement, one INSERT
statement for each value, one or more CHECK constraints, a bucketful
of REVOKE statements, a FOREIGN KEY constraint from each table in
which User_ID is (cough) used, and some administrative procedures to
make sure the table is always used. *If* the designer is careful.

Do you know a better way to map that TYPE...CONSTRAINT to a SQL dbms
like, say, Mimer SQL? (I mention Mimer because you can download it
from www.mimer.com.) Let's say you have six thousand non-consecutive
integers to enumerate.

The only reason you are trying to enumerate "six thousand
non-consecutive integers" is because you are under the mistaken
impression that the set of users who actually use a system is a
domain, when it isn't.

This is quite a presumption. While it could be one reason, it's not the
only possible reason for defining a "complex" enumerated domain.

Quite the contrary, if a system is "modeled" to represent something in the
real world upon which we want to make decisions or have the system execute
upon, then modeled domains "by constraint" at the logical level are entirely
valid. I further submit that having the database management system enforce
domains and therefore its consistency is not only desireable, but necessary.

Tomorrow, a new user may be enrolled, another
may leave etc.

Then you change the implementation (the implemented constraint) to reflect a
change in the logical model as dictated by the user or business. It's a
fact of life that systems in the real world evolve, but it doesn't change
the fact that logic should to be deterministic to reflect the real world and
provide some modicum of behavior that is both expected and that conforms to
the model.

By contrast, the domain "user_id" is the (fixed for
all time) set of all possible values from which users may be given a
user_id. Of course, it is conceivable that an organisation might
decide in advance that all user_id values must be taken from the
enumerated set {1, 112, 314, ... 9997}, but if so then of course they
pay the penalty in the complexity of enforcing the constraint.

I think everyone agrees that there is a penalty involved with enforcing
certain constraints to ensure quality of the system, but that is not what is
at issue. People argued, and still argue, that association by value as a
performance liability for the same reasons - to justify their rational for
not accepting something.

Much
more typical would be "between 5 and 10 alhphanumeric characters" or
"an integer between 1000 and 9999".

I don't understand the criteria upon which you decide that one degree of
specificity in defining a domain of representations in the form of strings
or numbers is necessarily better than another.

For example, if I devised a logical model for "processes" where I defned
states of a process at a logical level as only being started and completed,
I'd want to ensure that 1) the system only allowed for two state values for
any given process; 2) being forced to use SQL enforcement at a lower
representation level, I chose two representations and ensured that each
representation identified the correct state; and 3) that the DBMS could
enforce state transitions if it was so desired.

I could choose any two strings or set of two numbers to represent those
states, but declaring and defining column variables as VARCHAR(7) or
INTEGER(1) will do nothing to enforce the integrity of my model from a data
management perspective. If I chose to represent my domain as the strings,
'STARTED' and 'COMPLET', or alternatively as 1 and 2, then the VARCHAR(7)
domain definition will not enforce the two state domain in that it allows
for 562,949,953,421,302 states other than the two I want - assuming an
alphabet that allows for 2^7 symbols, which is equivalent to 28 characters
and the disallowance of null. In the case of INTEGER(1), the DBMS would
allow for 8 additional states that are not part of my logical domain.

Since we are dealing in large part with string and number representations,
we might consider a defined language (sequences of symbols composed of
alphabets upon which rules of syntax and grammer are applied) a proxy for a
defined domain.

If you want to argue the point by saying that you have to generalize data
and domain definition to allow for evolution and incremental change to what
one today defines as the domain, then who is to say the VARCHAR(7) or
NUMBER(1), or alternatively, alphanumeric consisting of between 5 and 10
characters, is sufficient? Why not just call everying attributeN and
defined everying as alphanumeric 100 wide character strings?

Even more importantly, what assurances can the DBMS give me that my universe
or model of a shared real world is properly being represented at the current
time?

I agree that SQL database don't support domains very well, but the
example being discussed here is NOT a domain.

Yes it is. It's just a rather inconvenient domain.

Best regards,

Dan
What can I say? I really thought this was quite a simple concept, and
yet you have completely failed to comprehend me. Are you seriously
saying that if I run a business with 5 employees whom I assign the
user_id's 5, 7, 42, 101 and 123 then any tables that have a user_id
column are in the DOMAIN "user_id" that is restricted to the values
(5, 7, 42, 101 and 123). So if tomorrow I hire a 6th employee I have
to redfine the "user_id" DOMAIN to include another value?

Sorry, but you are talking nonsense. I will add a new user to the
USERS table with a user_id taken from the DOMAIN of user_id's, which
may be (for my system) a number between 1 and 999, or a character
string of up to 10 characters. I am NOT restricted to the list id
user_id's that have currently been allocated to existing employees.

My point is not that an enumerated domain is impossible, but that if
one is defined it should be complete. It does not consist solely of
the subset of values that I happen to have used to date.


Reply With Quote
  #15  
Old   
D Guntermann
 
Posts: n/a

Default Re: Table design question - 02-09-2004 , 06:43 PM



Bob,

I'm sorry if I appear to be talking past people. That is not my intent. In
fact, I thought I did address the specific statement you reference when I
wrote:

"I think everyone agrees that there is a penalty involved with enforcing
certain constraints to ensure quality of the system, but that is not what is
at issue. People argued, and still argue, that association by value as a
performance liability for the same reasons - to justify their rational for
not accepting something."

The remainder of my end of the discussion was primarily in response to the
following major elements of the thread.

Mike Sherrill wrote the following:
Quote:
(where the enumeration of values is unavoidable) *might* map to
something like this for a primitive data type
CREATE DOMAIN User_ID AS INTEGER;
CONSTRAINT user_id_values
CHECK (VALUE IN (1, 112, 314, ... 9997));
Tony/Andrewst, for whom I have the greatest respect for, responded with the
following:
Quote:
The only reason you are trying to enumerate "six thousand
non-consecutive integers" is because you are under the mistaken
impression that the set of users who actually use a system is a
domain, when it isn't.
- and -

Quote:
I agree that SQL database don't support domains very well, but the
example being discussed here is NOT a domain.
Nonetheless, I probably was stating a point or points that didn't have to be
made. Moreover, my math was atrocious! No one called me on it though.

Regards,

Dan

"Bob Badour" <bbadour (AT) golden (DOT) net> wrote

Quote:
"D Guntermann" <guntermann (AT) hotmail (DOT) com> wrote in message
news:Hstwz4.59I (AT) news (DOT) boeing.com...
"Tony" <andrewst (AT) onetel (DOT) net.uk> wrote in message
news:c0e3f26e.0402090330.1ba1c048 (AT) posting (DOT) google.com...

Of course, it is conceivable that an organisation might
decide in advance that all user_id values must be taken from the
enumerated set {1, 112, 314, ... 9997}, but if so then of course they
pay the penalty in the complexity of enforcing the constraint.

Dan,

Why does it seem your entire response to Tony ignores the above statement?
Are you trying to talk past him?





Reply With Quote
  #16  
Old   
D Guntermann
 
Posts: n/a

Default Re: Table design question - 02-09-2004 , 08:46 PM



[snip]

Quote:
The only reason you are trying to enumerate "six thousand
non-consecutive integers" is because you are under the mistaken
impression that the set of users who actually use a system is a
domain, when it isn't.

This is quite a presumption. While it could be one reason, it's not the
only possible reason for defining a "complex" enumerated domain.

Quite the contrary, if a system is "modeled" to represent something in
the
real world upon which we want to make decisions or have the system
execute
upon, then modeled domains "by constraint" at the logical level are
entirely
valid. I further submit that having the database management system
enforce
domains and therefore its consistency is not only desireable, but
necessary.

Tomorrow, a new user may be enrolled, another
may leave etc.

Then you change the implementation (the implemented constraint) to
reflect a
change in the logical model as dictated by the user or business. It's a
fact of life that systems in the real world evolve, but it doesn't
change
the fact that logic should to be deterministic to reflect the real world
and
provide some modicum of behavior that is both expected and that conforms
to
the model.

By contrast, the domain "user_id" is the (fixed for
all time) set of all possible values from which users may be given a
user_id. Of course, it is conceivable that an organisation might
decide in advance that all user_id values must be taken from the
enumerated set {1, 112, 314, ... 9997}, but if so then of course they
pay the penalty in the complexity of enforcing the constraint.

I think everyone agrees that there is a penalty involved with enforcing
certain constraints to ensure quality of the system, but that is not
what is
at issue. People argued, and still argue, that association by value as
a
performance liability for the same reasons - to justify their rational
for
not accepting something.

Much
more typical would be "between 5 and 10 alhphanumeric characters" or
"an integer between 1000 and 9999".

I don't understand the criteria upon which you decide that one degree of
specificity in defining a domain of representations in the form of
strings
or numbers is necessarily better than another.

For example, if I devised a logical model for "processes" where I defned
states of a process at a logical level as only being started and
completed,
I'd want to ensure that 1) the system only allowed for two state values
for
any given process; 2) being forced to use SQL enforcement at a lower
representation level, I chose two representations and ensured that each
representation identified the correct state; and 3) that the DBMS could
enforce state transitions if it was so desired.

I could choose any two strings or set of two numbers to represent those
states, but declaring and defining column variables as VARCHAR(7) or
INTEGER(1) will do nothing to enforce the integrity of my model from a
data
management perspective. If I chose to represent my domain as the
strings,
'STARTED' and 'COMPLET', or alternatively as 1 and 2, then the
VARCHAR(7)
domain definition will not enforce the two state domain in that it
allows
for 562,949,953,421,302 states other than the two I want - assuming an
alphabet that allows for 2^7 symbols, which is equivalent to 28
characters
and the disallowance of null. In the case of INTEGER(1), the DBMS
would
allow for 8 additional states that are not part of my logical domain.

Since we are dealing in large part with string and number
representations,
we might consider a defined language (sequences of symbols composed of
alphabets upon which rules of syntax and grammer are applied) a proxy
for a
defined domain.

If you want to argue the point by saying that you have to generalize
data
and domain definition to allow for evolution and incremental change to
what
one today defines as the domain, then who is to say the VARCHAR(7) or
NUMBER(1), or alternatively, alphanumeric consisting of between 5 and 10
characters, is sufficient? Why not just call everying attributeN and
defined everying as alphanumeric 100 wide character strings?

Even more importantly, what assurances can the DBMS give me that my
universe
or model of a shared real world is properly being represented at the
current
time?

I agree that SQL database don't support domains very well, but the
example being discussed here is NOT a domain.

Yes it is. It's just a rather inconvenient domain.

Best regards,

Dan

What can I say? I really thought this was quite a simple concept, and
yet you have completely failed to comprehend me.
Well, I don't think I did. But I can see where the disconnect is. You
presume that the enumerated domain under discussion represents actual users
at time *now* as part of a set of propositions that meet some predicate. I
did not see where that was clearly stated in Mr. Sherrill's example and one
of your statements allowed for the domain being discussed to be an
enumerated domain of "allowable" values, which, given the contradiction,
probably confused me.

Another part of the disconnect is that we are discussing this at two
different levels - your point, if I understand you correctly, is that you
want to arbitrarily define some pool of numbers that works as an identity
function, generating user ids for users, now and forever, which I admit is
entirely valid.

My argument was from a more generalized point of view and did not work under
the assumption that the domain under discussion reflected actual current
users, in contrast with the pool of allowable user_id's; but rather
reflected the viewpoint that the domain was an enumerated domain,
representative of any domain that might be finite under some model of logic.
Examples of this might be social security numbers or phone numbers.

Are you seriously
Quote:
saying that if I run a business with 5 employees whom I assign the
user_id's 5, 7, 42, 101 and 123 then any tables that have a user_id
column are in the DOMAIN "user_id" that is restricted to the values
(5, 7, 42, 101 and 123).
I won't belabor the point.

I think Bob is right in that we are talking past one another, but
nonetheless, is there some criteria that states that a domain constraint is
ok in one instance but ridiculous in this particular case? Where is that
line? I suppose its where the user draws it.

So if tomorrow I hire a 6th employee I have
Quote:
to redfine the "user_id" DOMAIN to include another value?

See leading comments....

Quote:
Sorry, but you are talking nonsense.
I'm sorry you feel that way. Hopefully we can put it behind us. I still
stand behind my comments though, given my own set of assumptions.

I will add a new user to the
Quote:
USERS table with a user_id taken from the DOMAIN of user_id's, which
may be (for my system) a number between 1 and 999, or a character
string of up to 10 characters. I am NOT restricted to the list id
user_id's that have currently been allocated to existing employees.

Whoever said you were restricted in such a way? I did not intend to give
the impression I was supporting that particular viewpoint at all.

Quote:
My point is not that an enumerated domain is impossible, but that if
one is defined it should be complete. It does not consist solely of
the subset of values that I happen to have used to date.
I agree with your sentiments to a point, though I still think there are
exceptions, generally speaking. I'll concede this particular case. Sorry
for any misunderstanding.

Regards,

Dan




Reply With Quote
  #17  
Old   
Bob Badour
 
Posts: n/a

Default Re: Table design question - 02-09-2004 , 10:31 PM



"D Guntermann" <guntermann (AT) hotmail (DOT) com> wrote

Quote:
[snip]


The only reason you are trying to enumerate "six thousand
non-consecutive integers" is because you are under the mistaken
impression that the set of users who actually use a system is a
domain, when it isn't.

This is quite a presumption. While it could be one reason, it's not
the
only possible reason for defining a "complex" enumerated domain.

Quite the contrary, if a system is "modeled" to represent something in
the
real world upon which we want to make decisions or have the system
execute
upon, then modeled domains "by constraint" at the logical level are
entirely
valid. I further submit that having the database management system
enforce
domains and therefore its consistency is not only desireable, but
necessary.

Tomorrow, a new user may be enrolled, another
may leave etc.

Then you change the implementation (the implemented constraint) to
reflect a
change in the logical model as dictated by the user or business. It's
a
fact of life that systems in the real world evolve, but it doesn't
change
the fact that logic should to be deterministic to reflect the real
world
and
provide some modicum of behavior that is both expected and that
conforms
to
the model.

By contrast, the domain "user_id" is the (fixed for
all time) set of all possible values from which users may be given a
user_id. Of course, it is conceivable that an organisation might
decide in advance that all user_id values must be taken from the
enumerated set {1, 112, 314, ... 9997}, but if so then of course
they
pay the penalty in the complexity of enforcing the constraint.

I think everyone agrees that there is a penalty involved with
enforcing
certain constraints to ensure quality of the system, but that is not
what is
at issue. People argued, and still argue, that association by value
as
a
performance liability for the same reasons - to justify their rational
for
not accepting something.

Much
more typical would be "between 5 and 10 alhphanumeric characters" or
"an integer between 1000 and 9999".

I don't understand the criteria upon which you decide that one degree
of
specificity in defining a domain of representations in the form of
strings
or numbers is necessarily better than another.

For example, if I devised a logical model for "processes" where I
defned
states of a process at a logical level as only being started and
completed,
I'd want to ensure that 1) the system only allowed for two state
values
for
any given process; 2) being forced to use SQL enforcement at a lower
representation level, I chose two representations and ensured that
each
representation identified the correct state; and 3) that the DBMS
could
enforce state transitions if it was so desired.

I could choose any two strings or set of two numbers to represent
those
states, but declaring and defining column variables as VARCHAR(7) or
INTEGER(1) will do nothing to enforce the integrity of my model from a
data
management perspective. If I chose to represent my domain as the
strings,
'STARTED' and 'COMPLET', or alternatively as 1 and 2, then the
VARCHAR(7)
domain definition will not enforce the two state domain in that it
allows
for 562,949,953,421,302 states other than the two I want - assuming an
alphabet that allows for 2^7 symbols, which is equivalent to 28
characters
and the disallowance of null. In the case of INTEGER(1), the DBMS
would
allow for 8 additional states that are not part of my logical domain.

Since we are dealing in large part with string and number
representations,
we might consider a defined language (sequences of symbols composed of
alphabets upon which rules of syntax and grammer are applied) a proxy
for a
defined domain.

If you want to argue the point by saying that you have to generalize
data
and domain definition to allow for evolution and incremental change to
what
one today defines as the domain, then who is to say the VARCHAR(7) or
NUMBER(1), or alternatively, alphanumeric consisting of between 5 and
10
characters, is sufficient? Why not just call everying attributeN and
defined everying as alphanumeric 100 wide character strings?

Even more importantly, what assurances can the DBMS give me that my
universe
or model of a shared real world is properly being represented at the
current
time?

I agree that SQL database don't support domains very well, but the
example being discussed here is NOT a domain.

Yes it is. It's just a rather inconvenient domain.

Best regards,

Dan

What can I say? I really thought this was quite a simple concept, and
yet you have completely failed to comprehend me.

Well, I don't think I did. But I can see where the disconnect is. You
presume that the enumerated domain under discussion represents actual
users
at time *now* as part of a set of propositions that meet some predicate.
I
did not see where that was clearly stated in Mr. Sherrill's example and
one
of your statements allowed for the domain being discussed to be an
enumerated domain of "allowable" values, which, given the contradiction,
probably confused me.
This thread has a history going back to a question posted Jan 23, and it
always seemed clear enough to me that the unary table under discussion was a
table of past and current users. Well, at least as clear as any business
rule can be in a newsgroup exchange--a strong reason to avoid giving
specific design advice in newsgroups.

Tony had already provided a concise, accurate and useful answer to the
question on Jan 24 when Mike piped up on Jan 25 with his "Smells like teen
spir..." er...um... "Smells like a domain" post. At best, Mike's "Smells
like a domain" post was trivially true and uninformative. At worst,
uninformed.


Quote:
Another part of the disconnect is that we are discussing this at two
different levels - your point, if I understand you correctly, is that you
want to arbitrarily define some pool of numbers that works as an identity
function, generating user ids for users, now and forever, which I admit is
entirely valid.

My argument was from a more generalized point of view and did not work
under
the assumption that the domain under discussion reflected actual current
users, in contrast with the pool of allowable user_id's; but rather
reflected the viewpoint that the domain was an enumerated domain,
representative of any domain that might be finite under some model of
logic.

The claim that an explicitly and singly enumerated domain would be required
was a smokescreen or straw man thrown up by Mike to defend the indefensible.
That's not to say such a domain cannot exist -- just that it was not a
necessity nor even apparently desirable in this specific case.


Quote:
Examples of this might be social security numbers or phone numbers.
A domain or data type might consist of the set of values { blue, 3, 1, cat }
and one of the operations might include the ordinal position of the value in
the ordered list: blue, 4, the, 1, cat, green, for, 3

Tony conceded the general case that a domain can have any set of arbitrary
unrelated values (well, unrelated except that they are values of the domain
in question that is) noting that the unrelatedness necessarily complicates
the specification, which is why I wondered whether you intended to talk past
him. I would agree his concession could have been worded more explicitly as
conceding the general case.


Quote:
Are you seriously
saying that if I run a business with 5 employees whom I assign the
user_id's 5, 7, 42, 101 and 123 then any tables that have a user_id
column are in the DOMAIN "user_id" that is restricted to the values
(5, 7, 42, 101 and 123).

I won't belabor the point.

I think Bob is right in that we are talking past one another, but
nonetheless
At least, it was unintentional and seems to be all cleared up now.


Quote:
, is there some criteria that states that a domain constraint is
ok in one instance but ridiculous in this particular case? Where is that
line? I suppose its where the user draws it.
Tony was not objecting to domain constraints; he objected to the
characterization that a unary relation is necessarily a domain. Unary
relations are quite acceptable as are nullary relations (ie. degree = 0);
albeit, SQL arbitrarily prohibits nullary tables.


Quote:
So if tomorrow I hire a 6th employee I have
to redfine the "user_id" DOMAIN to include another value?

See leading comments....

Sorry, but you are talking nonsense.

I'm sorry you feel that way. Hopefully we can put it behind us. I still
stand behind my comments though, given my own set of assumptions.
I think you and Tony vehemently agree on that score.




Reply With Quote
  #18  
Old   
Tony
 
Posts: n/a

Default Re: Table design question - 02-10-2004 , 04:36 AM



"D Guntermann" <guntermann (AT) hotmail (DOT) com> wrote

Quote:
[snip]
I agree with your sentiments to a point, though I still think there are
exceptions, generally speaking. I'll concede this particular case. Sorry
for any misunderstanding.

Regards,

Dan
Sorry the tone of my post last night was rather strong - it was late
;o)

The issue I am dealing with started with this from Karel Miklav:

Quote:
CREATE TABLE user (
id INT NOT NULL ) <---------------------
|
CREATE TABLE user_history ( |
user_id INT NOT NULL, oo----------------|
valid_from DATETIME NOT NULL, |
name CHAR(100) NOT NULL, |
... ) |
|
CREATE TABLE log ( |
id INT NOT NULL, |
tstamp DATETIME NOT NULL, |
user_id INT NOT NULL, oo----------------
description CHAR(100) NOT NULL,
... )

There are other related tables like log here, where I'm really not
interested in historical data but just the identity of the user, and
those where exact user properties in point of time matter.

The one-field user table smells like a design problem. Merging user
and
user_history tables introduces ugly problems with relational
integrity.
Are these one-field tables normal or how am I supposed to tackle
this?
Please share your thoughts and links.
To which Mike Sherill responded:
Quote:
Smells like a domain.

At the conceptual level, every domain is just a column (or several
columns) of values. Think of it as a domain of user ID numbers.
(And
call it "user_id" if you can.)
Now, I am saying that *in this case under discussion* the USER table
is (intended to be) a perfectly valid 1-column table of users, users
who actually exist. It is not a DOMAIN containing all the possible
user_id values that could be used, it is the subset that are currently
in use. Of course the column USER.ID is IN a domain (INT), and could
have been in a more specific domain "USER_ID" with some more
constrained set of possible values such as 4-digit numbers, prime
numbers or whatever.


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2013, Jelsoft Enterprises Ltd.