dbTalk Databases Forums  

projection with nulls

comp.databases.theory comp.databases.theory


Discuss projection with nulls in the comp.databases.theory forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
paul c
 
Posts: n/a

Default projection with nulls - 03-16-2008 , 11:34 AM






(Just my stab at moving from the recent topics towards more basic
questions, sorry if there's not much newspeak in it):


Say you have R with (S)upplier, (P)art, (Q)uantity:

R:
S P
- -
a 2
b _


with predicate: Supplier S supplies Part P. The underscore in the
second 'row' stands for 'NULL', ie., SQL's null.


For argument's sake, say the first tuple means that 'if Supplier a
carries any parts, it carries Part 2' and the second tuple might mean
that 'if Supplier b carries any part, we do not know what that part's
number is'. Not everybody might agree that this meaning is essentially
the same as the predicate given but as long as the meaning or predicate
is the same for all tuples, picking one or the other shouldn't matter to
the dbms.


Maybe not a realistic example, it's just for argument's sake (I don't
think it is illegal as far as the SQL standard is concerned).


It's always seemed to me that the projection R{S}, without any other
operands, ought to return only {{S, a}}, ie., 'what suppliers supply
followedsome part', just one tuple in this case which is the same as
saying 'what suppliers supply some part if they supply any parts'. That
is, if one allows nulls (which I'm not advocating).


I gather that most or maybe all SQL products return {{S a}, {S b}}, ie.,
two tuples. I don't see how Supplier b should be included because it's
not known (in this example) whether Supplier b supplies any parts.
Maybe somebody wants to answer what should happen if the projection is
R{P} instead?


Secondly, it seems that traditional logic demands the
closed-world-assumption (CWA) in order to support negation. If the
domain of part numbers were the set {1,2}, the complement of R above
would include the tuples {S b, P 1} and {S b, P 2}. In other words,
there aren't any parts that Supplier b supplies. Does this mean that
SQL gives wrong answers if the CWA is assumed?


I guess one could argue that this is a kind of trick predicate but my
attitude towards that is so what? We don't expect a logic engine to be
aware of predicates.


As far as conventional projection is concerned, it seems that R is
overloaded in the sense that when it allows nulls it also allows tuples
that aren't opaque/blunt enough to project against, ie., it's not very
clear whether the two tuples always share the same predicate. Makes me
wonder if a P attribute that is set-valued would be a less-intricate
way to dispense with nulls. Sometimes I even wonder whether
conventional projection doesn't have enough operands!



Reply With Quote
  #2  
Old   
David Cressey
 
Posts: n/a

Default Re: projection with nulls - 03-16-2008 , 03:11 PM







"paul c" <toledobysea (AT) ac (DOT) ooyah> wrote

Quote:
(Just my stab at moving from the recent topics towards more basic
questions, sorry if there's not much newspeak in it):


Say you have R with (S)upplier, (P)art, (Q)uantity:

R:
S P
- -
a 2
b _


with predicate: Supplier S supplies Part P. The underscore in the
second 'row' stands for 'NULL', ie., SQL's null.


For argument's sake, say the first tuple means that 'if Supplier a
carries any parts, it carries Part 2' and the second tuple might mean
that 'if Supplier b carries any part, we do not know what that part's
number is'. Not everybody might agree that this meaning is essentially
the same as the predicate given but as long as the meaning or predicate
is the same for all tuples, picking one or the other shouldn't matter to
the dbms.

First, it shouldn't matter to the DBMS, but it should matter to the
database designer, the data consumers, and the data suppliers. This whole
business of assigning meaning to data that isn't there is fraught with
pitfalls at the semantic layer. The best you are going to hope for is
agreement among all the stakeholders. And that agreement has to be made
explicit somewhere, usually in documentation. Without an explicit agreement
you're back to everyone taking a guess at it, and that's puts you back
where we were with files and records, even if you are using a dbms.

The meaning you have assigned to the first tuple is strange enough to me.
If I had been asked to report on this data, in the absence of your
explanation, I would have assumed the first tuple means that "supplier a
carries part 2". The whole business of "if supplier a carries any parts" is
a novelty to me.

As far as newspeak goes, I had thought that tuples, by their nature, did
not cary SQL nulls in them. Maybe you and I mean different things by the
word "tuple".


Quote:
Maybe not a realistic example, it's just for argument's sake (I don't
think it is illegal as far as the SQL standard is concerned).


It's always seemed to me that the projection R{S}, without any other
operands, ought to return only {{S, a}}, ie., 'what suppliers supply
followedsome part', just one tuple in this case which is the same as
saying 'what suppliers supply some part if they supply any parts'. That
is, if one allows nulls (which I'm not advocating).


I gather that most or maybe all SQL products return {{S a}, {S b}}, ie.,
two tuples. I don't see how Supplier b should be included because it's
not known (in this example) whether Supplier b supplies any parts.
Maybe somebody wants to answer what should happen if the projection is
R{P} instead?


I'm not sure exactly what you mean by "a projection" in the context of and
SQL product.

I have always treated GROUP BY and SELECT DISTINCT as providing the
functionality that's closest to projection.

Some dialects of SQL will indeed eliminate the second tuple if GROUP BY or
SELECT DISTINCT are used.

I'm not trying to create confusion here. I'm just trying to clarify your
orignal post.

PS: See archives in Google groups for many discussions of SQL NULLS.


Quote:
Secondly, it seems that traditional logic demands the
closed-world-assumption (CWA) in order to support negation. If the
domain of part numbers were the set {1,2}, the complement of R above
would include the tuples {S b, P 1} and {S b, P 2}. In other words,
there aren't any parts that Supplier b supplies. Does this mean that
SQL gives wrong answers if the CWA is assumed?

The dbms can always perform negation (if all the domains are finite). what
the negation might mean is open to interpretation. Suppose we were to take
a set of tuples that contain email addresses, along with other attributes.
Now suppose we were to negate this set of tuples.
Are we going to get a list of all possible email addresses, except the ones
that were in the original set? Or are we going to somehow limit the
universe of discourse to all email addresses that actually exist somewhere,
but not in the set we negated?

For reasons illustrated by the above, there are a great many systems that
do not support negation as such, but that do support the MINUS operator.
The MINUS operator limits you to values that are in one manageable set but
not in some other manageable set. This is a whole lot less open ended than
negation.


Quote:
I guess one could argue that this is a kind of trick predicate but my
attitude towards that is so what? We don't expect a logic engine to be
aware of predicates.


As far as conventional projection is concerned, it seems that R is
overloaded in the sense that when it allows nulls it also allows tuples
that aren't opaque/blunt enough to project against, ie., it's not very
clear whether the two tuples always share the same predicate. Makes me
wonder if a P attribute that is set-valued would be a less-intricate
way to dispense with nulls. Sometimes I even wonder whether
conventional projection doesn't have enough operands!





Reply With Quote
  #3  
Old   
David Cressey
 
Posts: n/a

Default Re: projection with nulls - 03-16-2008 , 03:11 PM




"paul c" <toledobysea (AT) ac (DOT) ooyah> wrote

Quote:
(Just my stab at moving from the recent topics towards more basic
questions, sorry if there's not much newspeak in it):


Say you have R with (S)upplier, (P)art, (Q)uantity:

R:
S P
- -
a 2
b _


with predicate: Supplier S supplies Part P. The underscore in the
second 'row' stands for 'NULL', ie., SQL's null.


For argument's sake, say the first tuple means that 'if Supplier a
carries any parts, it carries Part 2' and the second tuple might mean
that 'if Supplier b carries any part, we do not know what that part's
number is'. Not everybody might agree that this meaning is essentially
the same as the predicate given but as long as the meaning or predicate
is the same for all tuples, picking one or the other shouldn't matter to
the dbms.

First, it shouldn't matter to the DBMS, but it should matter to the
database designer, the data consumers, and the data suppliers. This whole
business of assigning meaning to data that isn't there is fraught with
pitfalls at the semantic layer. The best you are going to hope for is
agreement among all the stakeholders. And that agreement has to be made
explicit somewhere, usually in documentation. Without an explicit agreement
you're back to everyone taking a guess at it, and that's puts you back
where we were with files and records, even if you are using a dbms.

The meaning you have assigned to the first tuple is strange enough to me.
If I had been asked to report on this data, in the absence of your
explanation, I would have assumed the first tuple means that "supplier a
carries part 2". The whole business of "if supplier a carries any parts" is
a novelty to me.

As far as newspeak goes, I had thought that tuples, by their nature, did
not cary SQL nulls in them. Maybe you and I mean different things by the
word "tuple".


Quote:
Maybe not a realistic example, it's just for argument's sake (I don't
think it is illegal as far as the SQL standard is concerned).


It's always seemed to me that the projection R{S}, without any other
operands, ought to return only {{S, a}}, ie., 'what suppliers supply
followedsome part', just one tuple in this case which is the same as
saying 'what suppliers supply some part if they supply any parts'. That
is, if one allows nulls (which I'm not advocating).


I gather that most or maybe all SQL products return {{S a}, {S b}}, ie.,
two tuples. I don't see how Supplier b should be included because it's
not known (in this example) whether Supplier b supplies any parts.
Maybe somebody wants to answer what should happen if the projection is
R{P} instead?


I'm not sure exactly what you mean by "a projection" in the context of and
SQL product.

I have always treated GROUP BY and SELECT DISTINCT as providing the
functionality that's closest to projection.

Some dialects of SQL will indeed eliminate the second tuple if GROUP BY or
SELECT DISTINCT are used.

I'm not trying to create confusion here. I'm just trying to clarify your
orignal post.

PS: See archives in Google groups for many discussions of SQL NULLS.


Quote:
Secondly, it seems that traditional logic demands the
closed-world-assumption (CWA) in order to support negation. If the
domain of part numbers were the set {1,2}, the complement of R above
would include the tuples {S b, P 1} and {S b, P 2}. In other words,
there aren't any parts that Supplier b supplies. Does this mean that
SQL gives wrong answers if the CWA is assumed?

The dbms can always perform negation (if all the domains are finite). what
the negation might mean is open to interpretation. Suppose we were to take
a set of tuples that contain email addresses, along with other attributes.
Now suppose we were to negate this set of tuples.
Are we going to get a list of all possible email addresses, except the ones
that were in the original set? Or are we going to somehow limit the
universe of discourse to all email addresses that actually exist somewhere,
but not in the set we negated?

For reasons illustrated by the above, there are a great many systems that
do not support negation as such, but that do support the MINUS operator.
The MINUS operator limits you to values that are in one manageable set but
not in some other manageable set. This is a whole lot less open ended than
negation.


Quote:
I guess one could argue that this is a kind of trick predicate but my
attitude towards that is so what? We don't expect a logic engine to be
aware of predicates.


As far as conventional projection is concerned, it seems that R is
overloaded in the sense that when it allows nulls it also allows tuples
that aren't opaque/blunt enough to project against, ie., it's not very
clear whether the two tuples always share the same predicate. Makes me
wonder if a P attribute that is set-valued would be a less-intricate
way to dispense with nulls. Sometimes I even wonder whether
conventional projection doesn't have enough operands!





Reply With Quote
  #4  
Old   
David Cressey
 
Posts: n/a

Default Re: projection with nulls - 03-16-2008 , 03:11 PM




"paul c" <toledobysea (AT) ac (DOT) ooyah> wrote

Quote:
(Just my stab at moving from the recent topics towards more basic
questions, sorry if there's not much newspeak in it):


Say you have R with (S)upplier, (P)art, (Q)uantity:

R:
S P
- -
a 2
b _


with predicate: Supplier S supplies Part P. The underscore in the
second 'row' stands for 'NULL', ie., SQL's null.


For argument's sake, say the first tuple means that 'if Supplier a
carries any parts, it carries Part 2' and the second tuple might mean
that 'if Supplier b carries any part, we do not know what that part's
number is'. Not everybody might agree that this meaning is essentially
the same as the predicate given but as long as the meaning or predicate
is the same for all tuples, picking one or the other shouldn't matter to
the dbms.

First, it shouldn't matter to the DBMS, but it should matter to the
database designer, the data consumers, and the data suppliers. This whole
business of assigning meaning to data that isn't there is fraught with
pitfalls at the semantic layer. The best you are going to hope for is
agreement among all the stakeholders. And that agreement has to be made
explicit somewhere, usually in documentation. Without an explicit agreement
you're back to everyone taking a guess at it, and that's puts you back
where we were with files and records, even if you are using a dbms.

The meaning you have assigned to the first tuple is strange enough to me.
If I had been asked to report on this data, in the absence of your
explanation, I would have assumed the first tuple means that "supplier a
carries part 2". The whole business of "if supplier a carries any parts" is
a novelty to me.

As far as newspeak goes, I had thought that tuples, by their nature, did
not cary SQL nulls in them. Maybe you and I mean different things by the
word "tuple".


Quote:
Maybe not a realistic example, it's just for argument's sake (I don't
think it is illegal as far as the SQL standard is concerned).


It's always seemed to me that the projection R{S}, without any other
operands, ought to return only {{S, a}}, ie., 'what suppliers supply
followedsome part', just one tuple in this case which is the same as
saying 'what suppliers supply some part if they supply any parts'. That
is, if one allows nulls (which I'm not advocating).


I gather that most or maybe all SQL products return {{S a}, {S b}}, ie.,
two tuples. I don't see how Supplier b should be included because it's
not known (in this example) whether Supplier b supplies any parts.
Maybe somebody wants to answer what should happen if the projection is
R{P} instead?


I'm not sure exactly what you mean by "a projection" in the context of and
SQL product.

I have always treated GROUP BY and SELECT DISTINCT as providing the
functionality that's closest to projection.

Some dialects of SQL will indeed eliminate the second tuple if GROUP BY or
SELECT DISTINCT are used.

I'm not trying to create confusion here. I'm just trying to clarify your
orignal post.

PS: See archives in Google groups for many discussions of SQL NULLS.


Quote:
Secondly, it seems that traditional logic demands the
closed-world-assumption (CWA) in order to support negation. If the
domain of part numbers were the set {1,2}, the complement of R above
would include the tuples {S b, P 1} and {S b, P 2}. In other words,
there aren't any parts that Supplier b supplies. Does this mean that
SQL gives wrong answers if the CWA is assumed?

The dbms can always perform negation (if all the domains are finite). what
the negation might mean is open to interpretation. Suppose we were to take
a set of tuples that contain email addresses, along with other attributes.
Now suppose we were to negate this set of tuples.
Are we going to get a list of all possible email addresses, except the ones
that were in the original set? Or are we going to somehow limit the
universe of discourse to all email addresses that actually exist somewhere,
but not in the set we negated?

For reasons illustrated by the above, there are a great many systems that
do not support negation as such, but that do support the MINUS operator.
The MINUS operator limits you to values that are in one manageable set but
not in some other manageable set. This is a whole lot less open ended than
negation.


Quote:
I guess one could argue that this is a kind of trick predicate but my
attitude towards that is so what? We don't expect a logic engine to be
aware of predicates.


As far as conventional projection is concerned, it seems that R is
overloaded in the sense that when it allows nulls it also allows tuples
that aren't opaque/blunt enough to project against, ie., it's not very
clear whether the two tuples always share the same predicate. Makes me
wonder if a P attribute that is set-valued would be a less-intricate
way to dispense with nulls. Sometimes I even wonder whether
conventional projection doesn't have enough operands!





Reply With Quote
  #5  
Old   
David Cressey
 
Posts: n/a

Default Re: projection with nulls - 03-16-2008 , 03:11 PM




"paul c" <toledobysea (AT) ac (DOT) ooyah> wrote

Quote:
(Just my stab at moving from the recent topics towards more basic
questions, sorry if there's not much newspeak in it):


Say you have R with (S)upplier, (P)art, (Q)uantity:

R:
S P
- -
a 2
b _


with predicate: Supplier S supplies Part P. The underscore in the
second 'row' stands for 'NULL', ie., SQL's null.


For argument's sake, say the first tuple means that 'if Supplier a
carries any parts, it carries Part 2' and the second tuple might mean
that 'if Supplier b carries any part, we do not know what that part's
number is'. Not everybody might agree that this meaning is essentially
the same as the predicate given but as long as the meaning or predicate
is the same for all tuples, picking one or the other shouldn't matter to
the dbms.

First, it shouldn't matter to the DBMS, but it should matter to the
database designer, the data consumers, and the data suppliers. This whole
business of assigning meaning to data that isn't there is fraught with
pitfalls at the semantic layer. The best you are going to hope for is
agreement among all the stakeholders. And that agreement has to be made
explicit somewhere, usually in documentation. Without an explicit agreement
you're back to everyone taking a guess at it, and that's puts you back
where we were with files and records, even if you are using a dbms.

The meaning you have assigned to the first tuple is strange enough to me.
If I had been asked to report on this data, in the absence of your
explanation, I would have assumed the first tuple means that "supplier a
carries part 2". The whole business of "if supplier a carries any parts" is
a novelty to me.

As far as newspeak goes, I had thought that tuples, by their nature, did
not cary SQL nulls in them. Maybe you and I mean different things by the
word "tuple".


Quote:
Maybe not a realistic example, it's just for argument's sake (I don't
think it is illegal as far as the SQL standard is concerned).


It's always seemed to me that the projection R{S}, without any other
operands, ought to return only {{S, a}}, ie., 'what suppliers supply
followedsome part', just one tuple in this case which is the same as
saying 'what suppliers supply some part if they supply any parts'. That
is, if one allows nulls (which I'm not advocating).


I gather that most or maybe all SQL products return {{S a}, {S b}}, ie.,
two tuples. I don't see how Supplier b should be included because it's
not known (in this example) whether Supplier b supplies any parts.
Maybe somebody wants to answer what should happen if the projection is
R{P} instead?


I'm not sure exactly what you mean by "a projection" in the context of and
SQL product.

I have always treated GROUP BY and SELECT DISTINCT as providing the
functionality that's closest to projection.

Some dialects of SQL will indeed eliminate the second tuple if GROUP BY or
SELECT DISTINCT are used.

I'm not trying to create confusion here. I'm just trying to clarify your
orignal post.

PS: See archives in Google groups for many discussions of SQL NULLS.


Quote:
Secondly, it seems that traditional logic demands the
closed-world-assumption (CWA) in order to support negation. If the
domain of part numbers were the set {1,2}, the complement of R above
would include the tuples {S b, P 1} and {S b, P 2}. In other words,
there aren't any parts that Supplier b supplies. Does this mean that
SQL gives wrong answers if the CWA is assumed?

The dbms can always perform negation (if all the domains are finite). what
the negation might mean is open to interpretation. Suppose we were to take
a set of tuples that contain email addresses, along with other attributes.
Now suppose we were to negate this set of tuples.
Are we going to get a list of all possible email addresses, except the ones
that were in the original set? Or are we going to somehow limit the
universe of discourse to all email addresses that actually exist somewhere,
but not in the set we negated?

For reasons illustrated by the above, there are a great many systems that
do not support negation as such, but that do support the MINUS operator.
The MINUS operator limits you to values that are in one manageable set but
not in some other manageable set. This is a whole lot less open ended than
negation.


Quote:
I guess one could argue that this is a kind of trick predicate but my
attitude towards that is so what? We don't expect a logic engine to be
aware of predicates.


As far as conventional projection is concerned, it seems that R is
overloaded in the sense that when it allows nulls it also allows tuples
that aren't opaque/blunt enough to project against, ie., it's not very
clear whether the two tuples always share the same predicate. Makes me
wonder if a P attribute that is set-valued would be a less-intricate
way to dispense with nulls. Sometimes I even wonder whether
conventional projection doesn't have enough operands!





Reply With Quote
  #6  
Old   
David Cressey
 
Posts: n/a

Default Re: projection with nulls - 03-16-2008 , 03:11 PM




"paul c" <toledobysea (AT) ac (DOT) ooyah> wrote

Quote:
(Just my stab at moving from the recent topics towards more basic
questions, sorry if there's not much newspeak in it):


Say you have R with (S)upplier, (P)art, (Q)uantity:

R:
S P
- -
a 2
b _


with predicate: Supplier S supplies Part P. The underscore in the
second 'row' stands for 'NULL', ie., SQL's null.


For argument's sake, say the first tuple means that 'if Supplier a
carries any parts, it carries Part 2' and the second tuple might mean
that 'if Supplier b carries any part, we do not know what that part's
number is'. Not everybody might agree that this meaning is essentially
the same as the predicate given but as long as the meaning or predicate
is the same for all tuples, picking one or the other shouldn't matter to
the dbms.

First, it shouldn't matter to the DBMS, but it should matter to the
database designer, the data consumers, and the data suppliers. This whole
business of assigning meaning to data that isn't there is fraught with
pitfalls at the semantic layer. The best you are going to hope for is
agreement among all the stakeholders. And that agreement has to be made
explicit somewhere, usually in documentation. Without an explicit agreement
you're back to everyone taking a guess at it, and that's puts you back
where we were with files and records, even if you are using a dbms.

The meaning you have assigned to the first tuple is strange enough to me.
If I had been asked to report on this data, in the absence of your
explanation, I would have assumed the first tuple means that "supplier a
carries part 2". The whole business of "if supplier a carries any parts" is
a novelty to me.

As far as newspeak goes, I had thought that tuples, by their nature, did
not cary SQL nulls in them. Maybe you and I mean different things by the
word "tuple".


Quote:
Maybe not a realistic example, it's just for argument's sake (I don't
think it is illegal as far as the SQL standard is concerned).


It's always seemed to me that the projection R{S}, without any other
operands, ought to return only {{S, a}}, ie., 'what suppliers supply
followedsome part', just one tuple in this case which is the same as
saying 'what suppliers supply some part if they supply any parts'. That
is, if one allows nulls (which I'm not advocating).


I gather that most or maybe all SQL products return {{S a}, {S b}}, ie.,
two tuples. I don't see how Supplier b should be included because it's
not known (in this example) whether Supplier b supplies any parts.
Maybe somebody wants to answer what should happen if the projection is
R{P} instead?


I'm not sure exactly what you mean by "a projection" in the context of and
SQL product.

I have always treated GROUP BY and SELECT DISTINCT as providing the
functionality that's closest to projection.

Some dialects of SQL will indeed eliminate the second tuple if GROUP BY or
SELECT DISTINCT are used.

I'm not trying to create confusion here. I'm just trying to clarify your
orignal post.

PS: See archives in Google groups for many discussions of SQL NULLS.


Quote:
Secondly, it seems that traditional logic demands the
closed-world-assumption (CWA) in order to support negation. If the
domain of part numbers were the set {1,2}, the complement of R above
would include the tuples {S b, P 1} and {S b, P 2}. In other words,
there aren't any parts that Supplier b supplies. Does this mean that
SQL gives wrong answers if the CWA is assumed?

The dbms can always perform negation (if all the domains are finite). what
the negation might mean is open to interpretation. Suppose we were to take
a set of tuples that contain email addresses, along with other attributes.
Now suppose we were to negate this set of tuples.
Are we going to get a list of all possible email addresses, except the ones
that were in the original set? Or are we going to somehow limit the
universe of discourse to all email addresses that actually exist somewhere,
but not in the set we negated?

For reasons illustrated by the above, there are a great many systems that
do not support negation as such, but that do support the MINUS operator.
The MINUS operator limits you to values that are in one manageable set but
not in some other manageable set. This is a whole lot less open ended than
negation.


Quote:
I guess one could argue that this is a kind of trick predicate but my
attitude towards that is so what? We don't expect a logic engine to be
aware of predicates.


As far as conventional projection is concerned, it seems that R is
overloaded in the sense that when it allows nulls it also allows tuples
that aren't opaque/blunt enough to project against, ie., it's not very
clear whether the two tuples always share the same predicate. Makes me
wonder if a P attribute that is set-valued would be a less-intricate
way to dispense with nulls. Sometimes I even wonder whether
conventional projection doesn't have enough operands!





Reply With Quote
  #7  
Old   
David Cressey
 
Posts: n/a

Default Re: projection with nulls - 03-16-2008 , 03:11 PM




"paul c" <toledobysea (AT) ac (DOT) ooyah> wrote

Quote:
(Just my stab at moving from the recent topics towards more basic
questions, sorry if there's not much newspeak in it):


Say you have R with (S)upplier, (P)art, (Q)uantity:

R:
S P
- -
a 2
b _


with predicate: Supplier S supplies Part P. The underscore in the
second 'row' stands for 'NULL', ie., SQL's null.


For argument's sake, say the first tuple means that 'if Supplier a
carries any parts, it carries Part 2' and the second tuple might mean
that 'if Supplier b carries any part, we do not know what that part's
number is'. Not everybody might agree that this meaning is essentially
the same as the predicate given but as long as the meaning or predicate
is the same for all tuples, picking one or the other shouldn't matter to
the dbms.

First, it shouldn't matter to the DBMS, but it should matter to the
database designer, the data consumers, and the data suppliers. This whole
business of assigning meaning to data that isn't there is fraught with
pitfalls at the semantic layer. The best you are going to hope for is
agreement among all the stakeholders. And that agreement has to be made
explicit somewhere, usually in documentation. Without an explicit agreement
you're back to everyone taking a guess at it, and that's puts you back
where we were with files and records, even if you are using a dbms.

The meaning you have assigned to the first tuple is strange enough to me.
If I had been asked to report on this data, in the absence of your
explanation, I would have assumed the first tuple means that "supplier a
carries part 2". The whole business of "if supplier a carries any parts" is
a novelty to me.

As far as newspeak goes, I had thought that tuples, by their nature, did
not cary SQL nulls in them. Maybe you and I mean different things by the
word "tuple".


Quote:
Maybe not a realistic example, it's just for argument's sake (I don't
think it is illegal as far as the SQL standard is concerned).


It's always seemed to me that the projection R{S}, without any other
operands, ought to return only {{S, a}}, ie., 'what suppliers supply
followedsome part', just one tuple in this case which is the same as
saying 'what suppliers supply some part if they supply any parts'. That
is, if one allows nulls (which I'm not advocating).


I gather that most or maybe all SQL products return {{S a}, {S b}}, ie.,
two tuples. I don't see how Supplier b should be included because it's
not known (in this example) whether Supplier b supplies any parts.
Maybe somebody wants to answer what should happen if the projection is
R{P} instead?


I'm not sure exactly what you mean by "a projection" in the context of and
SQL product.

I have always treated GROUP BY and SELECT DISTINCT as providing the
functionality that's closest to projection.

Some dialects of SQL will indeed eliminate the second tuple if GROUP BY or
SELECT DISTINCT are used.

I'm not trying to create confusion here. I'm just trying to clarify your
orignal post.

PS: See archives in Google groups for many discussions of SQL NULLS.


Quote:
Secondly, it seems that traditional logic demands the
closed-world-assumption (CWA) in order to support negation. If the
domain of part numbers were the set {1,2}, the complement of R above
would include the tuples {S b, P 1} and {S b, P 2}. In other words,
there aren't any parts that Supplier b supplies. Does this mean that
SQL gives wrong answers if the CWA is assumed?

The dbms can always perform negation (if all the domains are finite). what
the negation might mean is open to interpretation. Suppose we were to take
a set of tuples that contain email addresses, along with other attributes.
Now suppose we were to negate this set of tuples.
Are we going to get a list of all possible email addresses, except the ones
that were in the original set? Or are we going to somehow limit the
universe of discourse to all email addresses that actually exist somewhere,
but not in the set we negated?

For reasons illustrated by the above, there are a great many systems that
do not support negation as such, but that do support the MINUS operator.
The MINUS operator limits you to values that are in one manageable set but
not in some other manageable set. This is a whole lot less open ended than
negation.


Quote:
I guess one could argue that this is a kind of trick predicate but my
attitude towards that is so what? We don't expect a logic engine to be
aware of predicates.


As far as conventional projection is concerned, it seems that R is
overloaded in the sense that when it allows nulls it also allows tuples
that aren't opaque/blunt enough to project against, ie., it's not very
clear whether the two tuples always share the same predicate. Makes me
wonder if a P attribute that is set-valued would be a less-intricate
way to dispense with nulls. Sometimes I even wonder whether
conventional projection doesn't have enough operands!





Reply With Quote
  #8  
Old   
David Cressey
 
Posts: n/a

Default Re: projection with nulls - 03-16-2008 , 03:11 PM




"paul c" <toledobysea (AT) ac (DOT) ooyah> wrote

Quote:
(Just my stab at moving from the recent topics towards more basic
questions, sorry if there's not much newspeak in it):


Say you have R with (S)upplier, (P)art, (Q)uantity:

R:
S P
- -
a 2
b _


with predicate: Supplier S supplies Part P. The underscore in the
second 'row' stands for 'NULL', ie., SQL's null.


For argument's sake, say the first tuple means that 'if Supplier a
carries any parts, it carries Part 2' and the second tuple might mean
that 'if Supplier b carries any part, we do not know what that part's
number is'. Not everybody might agree that this meaning is essentially
the same as the predicate given but as long as the meaning or predicate
is the same for all tuples, picking one or the other shouldn't matter to
the dbms.

First, it shouldn't matter to the DBMS, but it should matter to the
database designer, the data consumers, and the data suppliers. This whole
business of assigning meaning to data that isn't there is fraught with
pitfalls at the semantic layer. The best you are going to hope for is
agreement among all the stakeholders. And that agreement has to be made
explicit somewhere, usually in documentation. Without an explicit agreement
you're back to everyone taking a guess at it, and that's puts you back
where we were with files and records, even if you are using a dbms.

The meaning you have assigned to the first tuple is strange enough to me.
If I had been asked to report on this data, in the absence of your
explanation, I would have assumed the first tuple means that "supplier a
carries part 2". The whole business of "if supplier a carries any parts" is
a novelty to me.

As far as newspeak goes, I had thought that tuples, by their nature, did
not cary SQL nulls in them. Maybe you and I mean different things by the
word "tuple".


Quote:
Maybe not a realistic example, it's just for argument's sake (I don't
think it is illegal as far as the SQL standard is concerned).


It's always seemed to me that the projection R{S}, without any other
operands, ought to return only {{S, a}}, ie., 'what suppliers supply
followedsome part', just one tuple in this case which is the same as
saying 'what suppliers supply some part if they supply any parts'. That
is, if one allows nulls (which I'm not advocating).


I gather that most or maybe all SQL products return {{S a}, {S b}}, ie.,
two tuples. I don't see how Supplier b should be included because it's
not known (in this example) whether Supplier b supplies any parts.
Maybe somebody wants to answer what should happen if the projection is
R{P} instead?


I'm not sure exactly what you mean by "a projection" in the context of and
SQL product.

I have always treated GROUP BY and SELECT DISTINCT as providing the
functionality that's closest to projection.

Some dialects of SQL will indeed eliminate the second tuple if GROUP BY or
SELECT DISTINCT are used.

I'm not trying to create confusion here. I'm just trying to clarify your
orignal post.

PS: See archives in Google groups for many discussions of SQL NULLS.


Quote:
Secondly, it seems that traditional logic demands the
closed-world-assumption (CWA) in order to support negation. If the
domain of part numbers were the set {1,2}, the complement of R above
would include the tuples {S b, P 1} and {S b, P 2}. In other words,
there aren't any parts that Supplier b supplies. Does this mean that
SQL gives wrong answers if the CWA is assumed?

The dbms can always perform negation (if all the domains are finite). what
the negation might mean is open to interpretation. Suppose we were to take
a set of tuples that contain email addresses, along with other attributes.
Now suppose we were to negate this set of tuples.
Are we going to get a list of all possible email addresses, except the ones
that were in the original set? Or are we going to somehow limit the
universe of discourse to all email addresses that actually exist somewhere,
but not in the set we negated?

For reasons illustrated by the above, there are a great many systems that
do not support negation as such, but that do support the MINUS operator.
The MINUS operator limits you to values that are in one manageable set but
not in some other manageable set. This is a whole lot less open ended than
negation.


Quote:
I guess one could argue that this is a kind of trick predicate but my
attitude towards that is so what? We don't expect a logic engine to be
aware of predicates.


As far as conventional projection is concerned, it seems that R is
overloaded in the sense that when it allows nulls it also allows tuples
that aren't opaque/blunt enough to project against, ie., it's not very
clear whether the two tuples always share the same predicate. Makes me
wonder if a P attribute that is set-valued would be a less-intricate
way to dispense with nulls. Sometimes I even wonder whether
conventional projection doesn't have enough operands!





Reply With Quote
  #9  
Old   
David Cressey
 
Posts: n/a

Default Re: projection with nulls - 03-16-2008 , 03:11 PM




"paul c" <toledobysea (AT) ac (DOT) ooyah> wrote

Quote:
(Just my stab at moving from the recent topics towards more basic
questions, sorry if there's not much newspeak in it):


Say you have R with (S)upplier, (P)art, (Q)uantity:

R:
S P
- -
a 2
b _


with predicate: Supplier S supplies Part P. The underscore in the
second 'row' stands for 'NULL', ie., SQL's null.


For argument's sake, say the first tuple means that 'if Supplier a
carries any parts, it carries Part 2' and the second tuple might mean
that 'if Supplier b carries any part, we do not know what that part's
number is'. Not everybody might agree that this meaning is essentially
the same as the predicate given but as long as the meaning or predicate
is the same for all tuples, picking one or the other shouldn't matter to
the dbms.

First, it shouldn't matter to the DBMS, but it should matter to the
database designer, the data consumers, and the data suppliers. This whole
business of assigning meaning to data that isn't there is fraught with
pitfalls at the semantic layer. The best you are going to hope for is
agreement among all the stakeholders. And that agreement has to be made
explicit somewhere, usually in documentation. Without an explicit agreement
you're back to everyone taking a guess at it, and that's puts you back
where we were with files and records, even if you are using a dbms.

The meaning you have assigned to the first tuple is strange enough to me.
If I had been asked to report on this data, in the absence of your
explanation, I would have assumed the first tuple means that "supplier a
carries part 2". The whole business of "if supplier a carries any parts" is
a novelty to me.

As far as newspeak goes, I had thought that tuples, by their nature, did
not cary SQL nulls in them. Maybe you and I mean different things by the
word "tuple".


Quote:
Maybe not a realistic example, it's just for argument's sake (I don't
think it is illegal as far as the SQL standard is concerned).


It's always seemed to me that the projection R{S}, without any other
operands, ought to return only {{S, a}}, ie., 'what suppliers supply
followedsome part', just one tuple in this case which is the same as
saying 'what suppliers supply some part if they supply any parts'. That
is, if one allows nulls (which I'm not advocating).


I gather that most or maybe all SQL products return {{S a}, {S b}}, ie.,
two tuples. I don't see how Supplier b should be included because it's
not known (in this example) whether Supplier b supplies any parts.
Maybe somebody wants to answer what should happen if the projection is
R{P} instead?


I'm not sure exactly what you mean by "a projection" in the context of and
SQL product.

I have always treated GROUP BY and SELECT DISTINCT as providing the
functionality that's closest to projection.

Some dialects of SQL will indeed eliminate the second tuple if GROUP BY or
SELECT DISTINCT are used.

I'm not trying to create confusion here. I'm just trying to clarify your
orignal post.

PS: See archives in Google groups for many discussions of SQL NULLS.


Quote:
Secondly, it seems that traditional logic demands the
closed-world-assumption (CWA) in order to support negation. If the
domain of part numbers were the set {1,2}, the complement of R above
would include the tuples {S b, P 1} and {S b, P 2}. In other words,
there aren't any parts that Supplier b supplies. Does this mean that
SQL gives wrong answers if the CWA is assumed?

The dbms can always perform negation (if all the domains are finite). what
the negation might mean is open to interpretation. Suppose we were to take
a set of tuples that contain email addresses, along with other attributes.
Now suppose we were to negate this set of tuples.
Are we going to get a list of all possible email addresses, except the ones
that were in the original set? Or are we going to somehow limit the
universe of discourse to all email addresses that actually exist somewhere,
but not in the set we negated?

For reasons illustrated by the above, there are a great many systems that
do not support negation as such, but that do support the MINUS operator.
The MINUS operator limits you to values that are in one manageable set but
not in some other manageable set. This is a whole lot less open ended than
negation.


Quote:
I guess one could argue that this is a kind of trick predicate but my
attitude towards that is so what? We don't expect a logic engine to be
aware of predicates.


As far as conventional projection is concerned, it seems that R is
overloaded in the sense that when it allows nulls it also allows tuples
that aren't opaque/blunt enough to project against, ie., it's not very
clear whether the two tuples always share the same predicate. Makes me
wonder if a P attribute that is set-valued would be a less-intricate
way to dispense with nulls. Sometimes I even wonder whether
conventional projection doesn't have enough operands!





Reply With Quote
  #10  
Old   
David Cressey
 
Posts: n/a

Default Re: projection with nulls - 03-16-2008 , 03:11 PM




"paul c" <toledobysea (AT) ac (DOT) ooyah> wrote

Quote:
(Just my stab at moving from the recent topics towards more basic
questions, sorry if there's not much newspeak in it):


Say you have R with (S)upplier, (P)art, (Q)uantity:

R:
S P
- -
a 2
b _


with predicate: Supplier S supplies Part P. The underscore in the
second 'row' stands for 'NULL', ie., SQL's null.


For argument's sake, say the first tuple means that 'if Supplier a
carries any parts, it carries Part 2' and the second tuple might mean
that 'if Supplier b carries any part, we do not know what that part's
number is'. Not everybody might agree that this meaning is essentially
the same as the predicate given but as long as the meaning or predicate
is the same for all tuples, picking one or the other shouldn't matter to
the dbms.

First, it shouldn't matter to the DBMS, but it should matter to the
database designer, the data consumers, and the data suppliers. This whole
business of assigning meaning to data that isn't there is fraught with
pitfalls at the semantic layer. The best you are going to hope for is
agreement among all the stakeholders. And that agreement has to be made
explicit somewhere, usually in documentation. Without an explicit agreement
you're back to everyone taking a guess at it, and that's puts you back
where we were with files and records, even if you are using a dbms.

The meaning you have assigned to the first tuple is strange enough to me.
If I had been asked to report on this data, in the absence of your
explanation, I would have assumed the first tuple means that "supplier a
carries part 2". The whole business of "if supplier a carries any parts" is
a novelty to me.

As far as newspeak goes, I had thought that tuples, by their nature, did
not cary SQL nulls in them. Maybe you and I mean different things by the
word "tuple".


Quote:
Maybe not a realistic example, it's just for argument's sake (I don't
think it is illegal as far as the SQL standard is concerned).


It's always seemed to me that the projection R{S}, without any other
operands, ought to return only {{S, a}}, ie., 'what suppliers supply
followedsome part', just one tuple in this case which is the same as
saying 'what suppliers supply some part if they supply any parts'. That
is, if one allows nulls (which I'm not advocating).


I gather that most or maybe all SQL products return {{S a}, {S b}}, ie.,
two tuples. I don't see how Supplier b should be included because it's
not known (in this example) whether Supplier b supplies any parts.
Maybe somebody wants to answer what should happen if the projection is
R{P} instead?


I'm not sure exactly what you mean by "a projection" in the context of and
SQL product.

I have always treated GROUP BY and SELECT DISTINCT as providing the
functionality that's closest to projection.

Some dialects of SQL will indeed eliminate the second tuple if GROUP BY or
SELECT DISTINCT are used.

I'm not trying to create confusion here. I'm just trying to clarify your
orignal post.

PS: See archives in Google groups for many discussions of SQL NULLS.


Quote:
Secondly, it seems that traditional logic demands the
closed-world-assumption (CWA) in order to support negation. If the
domain of part numbers were the set {1,2}, the complement of R above
would include the tuples {S b, P 1} and {S b, P 2}. In other words,
there aren't any parts that Supplier b supplies. Does this mean that
SQL gives wrong answers if the CWA is assumed?

The dbms can always perform negation (if all the domains are finite). what
the negation might mean is open to interpretation. Suppose we were to take
a set of tuples that contain email addresses, along with other attributes.
Now suppose we were to negate this set of tuples.
Are we going to get a list of all possible email addresses, except the ones
that were in the original set? Or are we going to somehow limit the
universe of discourse to all email addresses that actually exist somewhere,
but not in the set we negated?

For reasons illustrated by the above, there are a great many systems that
do not support negation as such, but that do support the MINUS operator.
The MINUS operator limits you to values that are in one manageable set but
not in some other manageable set. This is a whole lot less open ended than
negation.


Quote:
I guess one could argue that this is a kind of trick predicate but my
attitude towards that is so what? We don't expect a logic engine to be
aware of predicates.


As far as conventional projection is concerned, it seems that R is
overloaded in the sense that when it allows nulls it also allows tuples
that aren't opaque/blunt enough to project against, ie., it's not very
clear whether the two tuples always share the same predicate. Makes me
wonder if a P attribute that is set-valued would be a less-intricate
way to dispense with nulls. Sometimes I even wonder whether
conventional projection doesn't have enough operands!





Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.