dbTalk Databases Forums  

Re: no names allowed, we serve types only

comp.databases.theory comp.databases.theory


Discuss Re: no names allowed, we serve types only in the comp.databases.theory forum.



Reply
 
Thread Tools Display Modes
  #41  
Old   
David BL
 
Posts: n/a

Default Re: no names allowed, we serve types only - 02-21-2010 , 08:40 PM






On Feb 21, 7:05 pm, Jan Hidders <hidd... (AT) gmail (DOT) com> wrote:
Quote:
On 20 feb, 03:40, David BL <davi... (AT) iinet (DOT) net.au> wrote:

I agree this provides some motivation, but it doesn't seem sufficient
justification to "conflate" concerns (as Bob says). The crucial point
is this: The attribute domains are part of the relation-type, whereas
the attribute names are part of the relation-value. What happens with
your suggestion?
I got that wrong. I should have said:

The crucial point is this: Whereas both the declared attribute names
and attribute domains are part of the relation-type, only the
attribute names are part of the relation-value.


Quote:
It's very important to keep a clear separation between a value and its
declared type, in order to understand equality. Your suggestion ends
up conflating relation-type and relation-value.

How do you support sub-typing of relation-types according to sub-
typing of the attribute domains, and still allow for addressing the
attributes by type name?

That's actually no problem at all. The formal definition of a relation
type / relation header in Keith's approach would be a set of types
(*). For those types you can define a subtyping relation as follows: a
relation header H1 is said to be a subtype of H2 if for every type in
R2 there is a subtype in R1. It's even possible to have a matching
semantics with that that follows the subtype=subset principle. So no
notion of coercion is really needed here.

(*) Note there would / should be an additional requirement that a
header can only contain incomparable types, i.e., it cannot contain t1
and t2 such that t1 is a proper subtype of t2.
There is a problem.

Let's write t1 <= t2 if t1 is a subtype of t2 (let this include the
case t1 = t2).

Definition: header H is valid if for all t1,t2 in H,

t1 <= t2 implies t1 = t2.

(this is your "additional requirement")

Definition: H1 <= H2 if
exists bijection f from H1 to H2
such that for each t in H1, t <= f(t).

Unfortunately even assuming H1,H2 are valid doesn't imply that a
bijection is unique (if it exists).

Here is a counter example: H1 = {t1,t2}, H2 = {t3,t4} where
t1,t2,t3,t4 are all distinct types where:

not t1 <= t2
not t2 <= t1
not t3 <= t4
not t4 <= t3
t1 <= t3
t1 <= t4
t2 <= t3
t2 <= t4

In this example, H1, H2 are both valid and there are two distinct
bijections available. Ambiguity in the bijection is incompatible with
the subtype = subset principle. So multiple inheritance throws a
spanner in the works. The only way to salvage the situation is to
require the bijection be unique (so in the above example it is deemed
that H1 is not a subtype of H2). However I find this gratuitous at
best.

Reply With Quote
  #42  
Old   
Keith H Duggar
 
Posts: n/a

Default Re: no names allowed, we serve types only - 02-21-2010 , 09:41 PM






On Feb 21, 8:10 pm, paul c <toledobythe... (AT) oohay (DOT) ac> wrote:
Quote:
As for having just types, I still think that Codd introduced his
attribute names because of relations that can have two 'columns' of the
same type. I think Keith D is arguing against this and if I'm right
about that, I'd like him to deal with that case, not to say it can't be
done, just that I'd like to see whether his "copy" in a language is a
better tool.
How about we do some concrete exercises then? Please provide your
favorite example of some relations with attributes having the same
type along with some associated code operating on those relations.
Then I will convert the {(name,type)} system example into a {type}
only system so we can compare.

KHD

Reply With Quote
  #43  
Old   
paul c
 
Posts: n/a

Default Re: no names allowed, we serve types only - 02-21-2010 , 10:30 PM



Keith H Duggar wrote:
Quote:
On Feb 21, 8:10 pm, paul c <toledobythe... (AT) oohay (DOT) ac> wrote:
As for having just types, I still think that Codd introduced his
attribute names because of relations that can have two 'columns' of the
same type. I think Keith D is arguing against this and if I'm right
about that, I'd like him to deal with that case, not to say it can't be
done, just that I'd like to see whether his "copy" in a language is a
better tool.

How about we do some concrete exercises then? Please provide your
favorite example of some relations with attributes having the same
type along with some associated code operating on those relations.
Then I will convert the {(name,type)} system example into a {type}
only system so we can compare.

KHD
Below is a cut and paste, as best as I can manage, of what Codd said in
1970. As some here probably recognize, it has bugged me for years that
arcane ops such as 'tclose' are required in the D&D way of thinking.
Maybe I'm unfair to use such an adjective as arcane but my reason is
that the explanations I've seen for it are intricate and elaborate. In
Codd's approach, relations with two 'columns' having the same 'domain'
don't seem to be able to be closed into relations that transform the
domains, eg., from parent and child to ancestor and decendant. D&D seem
to allow it by use of <RENAME>. One way I'd compare alternative
approaches is to ask 1) how many new operators are needed to substitute
for the above? and 2) how easy are they to 'optimize' for minimum
machine time/cost?. It seems to me that for practical purposes, one
must admit relations that may have two 'columns' having the same 'type'.

(I read something by Bertrand Russell where he talked of domains and I
got the impression that his relations could define the different between
'parent' and 'ancestor', but he wrote that long before people had to
write computer programs and I'm guessing what he wrote is circular in
today's context.)

quote (I hope):
.... As the example in Figure 2 shows, two columns
may have identical headings (indicating identical
domains) but possess distinct meanings with respect to the
relation. The relation depicted is called component. It is a
ternary relation, whose first two domains are called part
and third domain is called quantity. The meaning of component
(2, y, z) is that part x is an immediate component
(or subassembly) of part y, and z units of part 5 are needed
to assemble one unit of part y. It is a relation which plays
a critical role in the parts explosion problem.

component
(part part quantity)
1 5 9
2 5 7
3 5 2
2 6 12
3 6 3
4 7 1
6 7 1
FIG. 2.

A relation with-two identical domains

Reply With Quote
  #44  
Old   
David BL
 
Posts: n/a

Default Re: no names allowed, we serve types only - 02-21-2010 , 11:20 PM



On Feb 22, 9:10 am, paul c <toledobythe... (AT) oohay (DOT) ac> wrote:

Quote:
While I
sympathize with Keith D's wish to eliminate the 'name,type' double, I
think that practical issues require not only so-called 'possreps' but
also what I think of as a 'name,type,use' triples. What a mess!
Regarding possreps, I think they can be understood very clearly in a
typeless setting.

Consider the following from Darwen

(see http://www.dcs.warwick.ac.uk/~hugh/T...i.slides.pdf):

TYPE ELLIPSE
IS PLANE_FIGURE
POSSREP { A LENGTH, B LENGTH, CTR POINT
CONSTRAINT A >= B AND B > LENGTH ( 0.0 ) } ;

TYPE CIRCLE
IS ELLIPSE
CONSTRAINT THE_A ( ELLIPSE ) = THE_B ( ELLIPSE )
POSSREP { R = THE_A ( ELLIPSE ) ,
CTR = THE_CTR ( ELLIPSE ) } ;


In the following R means the reals.

Let Tellipse be a set of tuples as follows:

Tellipse =
{ (a,b,ctr) in RxRx(RxR) |
a >= b and b > 0 }

Let Pellipse be a function that maps each tuple t in Tellipse to a
real living ,breathing, ellipse value (i.e. an infinite set of points
in RxR):

for all t in Tellipse,
Pellipse(t) =
{ (x,y) in RxR |
((x-t.ctr.x)/t.a)^2 + ((y-t.ctr.y)/t.b)^2 = 1
}

Similarly let

Tcircle = { (r,ctr) in Rx(RxR) | r > 0 }

for all t in Tcircle,
Pcircle(t) =
{ (x,y) in RxR |
(x-t.ctr.x)^2 + (y-t.ctr.y)^2 = t.r^2
}


Note that range(Pcircle) is a subset of range(Pellipse). This
embodies D&D's assertion that type CIRCLE is a subtype of type
ELLIPSE.

Database systems don't tend to formalise ellipses and circles.
Instead they simply parameterise their values using elements of
Tellipse and Tcircle and leave the mappings Pellipse, Pcircle
unspecified.

In my opinion, we should regard a possrep as formally this
(unspecified) mapping such as Pcircle on a given domain Tcircle.
Usually we would require such a mapping to be onto (so that all
possible circles can be represented) and 1-1 (so that when two tuples
are distinct the DBMS can deduce that the circles that they represent
are distinct as well).

Note that Tcircle is uncountably infinite so we cannot even define an
encoding that can represent each element of Tcircle on a computer. In
practise it means we simply deal with some countable subset of Tcircle
(and therefore limit ourselves to a corresponding subset of all
circles.

The relationship between the two possreps is embodied in the
following:

CtoE = { (t1,t2) in Tcircle x Tellipse |
Pcircle(t1) = Pellipse(t2) }

Since every circle is an ellipse, CtoE is the graph of a function that
maps every tuple in Tcircle to a corresponding tuple in Tellipse that
represents the same circle/ellipse value.

Since Pcircle, Pellipse aren't specified to the DBMS, we instead need
to express CtoE directly. Of course that is very easy

(ctr,r) |---> (ctr,r,r)

Expressing this using D&D type selectors we could express this as:

CIRCLE(ctr,r) is-a ELLIPSE(ctr,r,r)

IMO this is cleaner (and indeed more practical) than the D&D syntax
which seems to get it arse about. D&D syntax specifies how to map a
subset of the ellipse values to matching circle values, by specifying
a constraint and by expressing the possrep of CIRCLE in terms of the
possrep of ELLIPSE.

Reply With Quote
  #45  
Old   
Jan Hidders
 
Posts: n/a

Default Re: no names allowed, we serve types only - 02-22-2010 , 08:39 AM



On 22 feb, 03:40, David BL <davi... (AT) iinet (DOT) net.au> wrote:
Quote:
On Feb 21, 7:05 pm, Jan Hidders <hidd... (AT) gmail (DOT) com> wrote:

On 20 feb, 03:40, David BL <davi... (AT) iinet (DOT) net.au> wrote:

I agree this provides some motivation, but it doesn't seem sufficient
justification to "conflate" concerns (as Bob says). *The crucial point
is this: *The attribute domains are part of the relation-type, whereas
the attribute names are part of the relation-value. *What happens with
your suggestion?

I got that wrong. *I should have said:

The crucial point is this: *Whereas both the declared attribute names
and attribute domains are part of the relation-type, only the
attribute names are part of the relation-value.





It's very important to keep a clear separation between a value and its
declared type, in order to understand equality. *Your suggestion ends
up conflating relation-type and relation-value.

How do you support sub-typing of relation-types according to sub-
typing of the attribute domains, and still allow for addressing the
attributes by type name?

That's actually no problem at all. The formal definition of a relation
type / relation header in Keith's approach would be a set of types
(*). For those types you can define a subtyping relation as follows: a
relation header H1 is said to be a subtype of H2 if for every type in
R2 there is a subtype in R1. It's even possible to have a matching
semantics with that that follows the subtype=subset principle. So no
notion of coercion is really needed here.

(*) Note there would / should be an additional requirement that a
header can only contain incomparable types, i.e., it cannot contain t1
and t2 such that t1 is a proper subtype of t2.

There is a problem.

Let's write t1 <= t2 if t1 is a subtype of t2 (let this include the
case t1 = t2).

Definition: header H is valid if for all t1,t2 in H,

* t1 <= t2 * *implies * t1 = t2.

(this is your "additional requirement")

Definition: *H1 <= H2 if
* exists bijection f from H1 to H2
* * such that for each t in H1, t <= f(t).
That was not my definition. It would be:

Definition: *H1 <= H2 if
* exists a function f : H2 -> H1
* * such that for each t in H2, f(t) <= t.

Note the direction of f. But also this function is not unique, and
yes, I do think that this is a problem.

Quote:
[...] Ambiguity in the bijection is incompatible with
the subtype = subset principle. *So multiple inheritance throws a
spanner in the works. *The only way to salvage the situation is to
require the bijection be unique (so in the above example it is deemed
that H1 is not a subtype of H2). *However I find this gratuitous at
best.
There is indeed a problem with the subtype = subset principle, or at
least the one that I had in mind:

- if t1 <= t2 then [[t1]] is a subset of [[t2]]

where [[t]] denotes the set of all values that are of type t. Sorry
for being unclear about that.

The restriction that f in the definition should be uniquely defined is
implied by this principle, which IMNSHO makes it not ad-hoc or
gratuitous but rather well-founded. The proof is left to the reader as
an exercise. ;-)

-- Jan Hidders

Reply With Quote
  #46  
Old   
Bob Badour
 
Posts: n/a

Default Re: no names allowed, we serve types only - 02-22-2010 , 09:09 AM



Keith H Duggar wrote:

Quote:
On Feb 21, 8:10 pm, paul c <toledobythe... (AT) oohay (DOT) ac> wrote:

As for having just types, I still think that Codd introduced his
attribute names because of relations that can have two 'columns' of the
same type. I think Keith D is arguing against this and if I'm right
about that, I'd like him to deal with that case, not to say it can't be
done, just that I'd like to see whether his "copy" in a language is a
better tool.

How about we do some concrete exercises then? Please provide your
favorite example of some relations with attributes having the same
type along with some associated code operating on those relations.
With all due respect, when proposing an innovation the onus lies on you
to establish a need and/or benefit. How does discarding names benefit
predicate calculus? What flaw does it address?

Reply With Quote
  #47  
Old   
Jan Hidders
 
Posts: n/a

Default Re: no names allowed, we serve types only - 02-22-2010 , 10:49 AM



On 22 feb, 15:39, Jan Hidders <hidd... (AT) gmail (DOT) com> wrote:
Quote:
On 22 feb, 03:40, David BL <davi... (AT) iinet (DOT) net.au> wrote:





On Feb 21, 7:05 pm, Jan Hidders <hidd... (AT) gmail (DOT) com> wrote:

On 20 feb, 03:40, David BL <davi... (AT) iinet (DOT) net.au> wrote:

I agree this provides some motivation, but it doesn't seem sufficient
justification to "conflate" concerns (as Bob says). *The crucial point
is this: *The attribute domains are part of the relation-type, whereas
the attribute names are part of the relation-value. *What happenswith
your suggestion?

I got that wrong. *I should have said:

The crucial point is this: *Whereas both the declared attribute names
and attribute domains are part of the relation-type, only the
attribute names are part of the relation-value.

It's very important to keep a clear separation between a value and its
declared type, in order to understand equality. *Your suggestion ends
up conflating relation-type and relation-value.

How do you support sub-typing of relation-types according to sub-
typing of the attribute domains, and still allow for addressing the
attributes by type name?

That's actually no problem at all. The formal definition of a relation
type / relation header in Keith's approach would be a set of types
(*). For those types you can define a subtyping relation as follows: a
relation header H1 is said to be a subtype of H2 if for every type in
R2 there is a subtype in R1. It's even possible to have a matching
semantics with that that follows the subtype=subset principle. So no
notion of coercion is really needed here.

(*) Note there would / should be an additional requirement that a
header can only contain incomparable types, i.e., it cannot contain t1
and t2 such that t1 is a proper subtype of t2.

There is a problem.

Let's write t1 <= t2 if t1 is a subtype of t2 (let this include the
case t1 = t2).

Definition: header H is valid if for all t1,t2 in H,

* t1 <= t2 * *implies * t1 = t2.

(this is your "additional requirement")

Definition: *H1 <= H2 if
* exists bijection f from H1 to H2
* * such that for each t in H1, t <= f(t).

That was not my definition. It would be:

*Definition: *H1 <= H2 if
** exists a function f : H2 -> H1
** * such that for each t in H2, f(t) <= t.

Note the direction of f. But also this function is not unique, and
yes, I do think that this is a problem.

[...] Ambiguity in the bijection is incompatible with
the subtype = subset principle. *So multiple inheritance throws a
spanner in the works. *The only way to salvage the situation is to
require the bijection be unique (so in the above example it is deemed
that H1 is not a subtype of H2). *However I find this gratuitous at
best.

There is indeed a problem with the subtype = subset principle, or at
least the one that I had in mind:

- if t1 <= t2 then [[t1]] is a subset of [[t2]]

where [[t]] denotes the set of all values that are of type t. Sorry
for being unclear about that.

The restriction that f in the definition should be uniquely defined is
implied by this principle, which IMNSHO makes it not ad-hoc or
gratuitous but rather well-founded. The proof is left to the reader as
an exercise. ;-)
PS. Note that the corrected definition can be compactly formulated:

Definition: *H1 <= H2 if
for each type t2 in H2
there is exactly one type t1 in H1
such that t1 <= t2.

-- Jan Hidders

Reply With Quote
  #48  
Old   
David BL
 
Posts: n/a

Default Re: no names allowed, we serve types only - 02-22-2010 , 06:33 PM



On Feb 23, 12:49 am, Jan Hidders <hidd... (AT) gmail (DOT) com> wrote:
Quote:
On 22 feb, 15:39, Jan Hidders <hidd... (AT) gmail (DOT) com> wrote:

There is indeed a problem with the subtype = subset principle, or at
least the one that I had in mind:

- if t1 <= t2 then [[t1]] is a subset of [[t2]]

where [[t]] denotes the set of all values that are of type t. Sorry
for being unclear about that.

The restriction that f in the definition should be uniquely defined is
implied by this principle, which IMNSHO makes it not ad-hoc or
gratuitous but rather well-founded. The proof is left to the reader as
an exercise. ;-)
My problem is that there may be existing types t1,t2,t3,t4 and an
existing declared relation-type with H2 = {t3,t4}. Someone wants to
create a new relation-type with H1 = {t1,t2} that subtypes H2, perhaps
where it is required that t1,t3 represent one role and t2,t4 represent
another. However it happens that both t1 and t2 subtype both t3 and
t4 so therefore it cannot be done because of ambiguity. That seems
unreasonable.

Quote:
PS. Note that the corrected definition can be compactly formulated:

Definition: H1 <= H2 if
for each type t2 in H2
there is exactly one type t1 in H1
such that t1 <= t2.
That doesn't seem right to me - by that definition H1 might have more
attributes than H2 and yet be considered a subtype.

Reply With Quote
  #49  
Old   
Jan Hidders
 
Posts: n/a

Default Re: no names allowed, we serve types only - 02-23-2010 , 03:28 AM



On 23 feb, 01:33, David BL <davi... (AT) iinet (DOT) net.au> wrote:
Quote:
On Feb 23, 12:49 am, Jan Hidders <hidd... (AT) gmail (DOT) com> wrote:

On 22 feb, 15:39, Jan Hidders <hidd... (AT) gmail (DOT) com> wrote:
There is indeed a problem with the subtype = subset principle, or at
least the one that I had in mind:

- if t1 <= t2 then [[t1]] is a subset of [[t2]]

where [[t]] denotes the set of all values that are of type t. Sorry
for being unclear about that.

The restriction that f in the definition should be uniquely defined is
implied by this principle, which IMNSHO makes it not ad-hoc or
gratuitous but rather well-founded. The proof is left to the reader as
an exercise. ;-)

My problem is that there may be existing types t1,t2,t3,t4 and an
existing declared relation-type with H2 = {t3,t4}. *Someone wants to
create a new relation-type with H1 = {t1,t2} that subtypes H2, perhaps
where it is required that t1,t3 represent one role and t2,t4 represent
another. *However it happens that both t1 and t2 subtype both t3 and
t4 so therefore it cannot be done because of ambiguity. *That seems
unreasonable.
Mathematics can sometimes be very unreasonable. :-) But I think this
makes sense. If you give me a tuple (row/record) of H2 and would ask
for the t1-value in that tuple, then this is ambiguous, since both the
t3 and t4 value are also t1 values. It seems reasonable to assume that
for tuples of type H1 we require that they contain a single value of
type t1 and a single value of type t2. But the tuple of type H2 would
actually in some sense contain two values of type t1 (and two values
of type t2). So semantically speaking it is indeed not really a
subtype.

You could try to fix this by taking another semantics of the types,
but I don't see a reasonable alternative at the moment. In fact, my
current intuition is that this is a symptom of the fact that we are
trying to shoehorn the atrribute-type relationship into the subtype
relationship, where it doesn't really fit comfortably.

Quote:
PS. Note that the corrected definition can be compactly formulated:

Definition: *H1 <= H2 if
* *for each type t2 in H2
* *there is exactly one type t1 in H1
* *such that t1 <= t2.

That doesn't seem right to me - by that definition H1 might have more
attributes than H2 and yet be considered a subtype.
Of course. For the usual record types it holds that <a:t1, b:t2> is a
supertype of <a:t1, b:t2, c:t3>. You can use values of the latter
where values of the first are expected, not the other way around.

-- Jan HIdders

Reply With Quote
  #50  
Old   
David BL
 
Posts: n/a

Default Re: no names allowed, we serve types only - 02-23-2010 , 11:08 AM



On Feb 23, 5:28 pm, Jan Hidders <hidd... (AT) gmail (DOT) com> wrote:
Quote:
On 23 feb, 01:33, David BL <davi... (AT) iinet (DOT) net.au> wrote:
On Feb 23, 12:49 am, Jan Hidders <hidd... (AT) gmail (DOT) com> wrote:

On 22 feb, 15:39, Jan Hidders <hidd... (AT) gmail (DOT) com> wrote:
There is indeed a problem with the subtype = subset principle, or at
least the one that I had in mind:

- if t1 <= t2 then [[t1]] is a subset of [[t2]]

where [[t]] denotes the set of all values that are of type t. Sorry
for being unclear about that.

The restriction that f in the definition should be uniquely defined is
implied by this principle, which IMNSHO makes it not ad-hoc or
gratuitous but rather well-founded. The proof is left to the reader as
an exercise. ;-)

My problem is that there may be existing types t1,t2,t3,t4 and an
existing declared relation-type with H2 = {t3,t4}. Someone wants to
create a new relation-type with H1 = {t1,t2} that subtypes H2, perhaps
where it is required that t1,t3 represent one role and t2,t4 represent
another. However it happens that both t1 and t2 subtype both t3 and
t4 so therefore it cannot be done because of ambiguity. That seems
unreasonable.

Mathematics can sometimes be very unreasonable. :-) But I think this
makes sense. If you give me a tuple (row/record) of H2 and would ask
for the t1-value in that tuple, then this is ambiguous, since both the
t3 and t4 value are also t1 values. It seems reasonable to assume that
for tuples of type H1 we require that they contain a single value of
type t1 and a single value of type t2. But the tuple of type H2 would
actually in some sense contain two values of type t1 (and two values
of type t2). So semantically speaking it is indeed not really a
subtype.

You could try to fix this by taking another semantics of the types,
but I don't see a reasonable alternative at the moment. In fact, my
current intuition is that this is a symptom of the fact that we are
trying to shoehorn the atrribute-type relationship into the subtype
relationship, where it doesn't really fit comfortably.
I agree with all that, and add that it is noteworthy that the
ambiguity problem disappears with named attributes.

Anyway, I don't believe our analysis is complete, because our
definitions don't specify what happens exactly with Keith's "copy"
types (which must be used where there is a subtype relationship
between two of the attributes in a header).


Quote:
PS. Note that the corrected definition can be compactly formulated:

Definition: H1 <= H2 if
for each type t2 in H2
there is exactly one type t1 in H1
such that t1 <= t2.

That doesn't seem right to me - by that definition H1 might have more
attributes than H2 and yet be considered a subtype.

Of course. For the usual record types it holds that <a:t1, b:t2> is a
supertype of <a:t1, b:t2, c:t3>. You can use values of the latter
where values of the first are expected, not the other way around.
That view of subtyping concerns a possible interpretation of
substitutability (of values), but I believe it is better to adhere to
the subtype = subset principle for data types. It cannot be said that
tuples of the form <a:t1, b:t2, c:t3> represent a subset of tuples of
the form <a:t1, b:t2>.

C.Date presents this argument very well in section 20.9 of an
Introduction to Database Systems where he claims that a coloured
circle is not a subtype of circle (or vice versa).

I suggest implicit conversions must only be an artefact of the type
system (i.e. implicit conversions cannot actually change a value by
adding or removing information), and in fact no concept of implicit
conversions is required in a typeless formalism.

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2013, Jelsoft Enterprises Ltd.