dbTalk Databases Forums  

Re: WWW/Internet 2009: 2nd CFP until 21 September

comp.databases.theory


  #1  
Walter Mitty
Re: WWW/Internet 2009: 2nd CFP until 21 September - 08-10-2009, 08:15 AM






<natty2006 (AT) gmail (DOT) com> wrote


quote:
* Topics related to WWW/Internet are of interest. These include, but
are not limited to the following areas:

Web 2.0
- Collaborative Systems
- Social Networks
- Folksonomies
- Enterprise Wikis and Blogging
- Mashups and Web Programming
- Tagging and User Rating Systems
- Citizen Journalism


Semantic Web and XML
- Semantic Web Architectures
- Semantic Web Middleware
- Semantic Web Services
- Semantic Web Agents
- Ontologies
- Applications of Semantic Web
- Semantic Web Data Management
- Information Retrieval in Semantic Web

unquote.

I'm interested in the idea of XML and the semantic web. In particular, I'm
interested in comparing this with the following idea, namely that the
relational data model is a useful one for viewing data in transit between
two systems connected by a network like the internet. Codd briefly
mentioned this topic in a single paragraph in the 1970 paper. I do not know
what Codd, Date and others have written on the subject since.

Anyway, I'm interested in whether XML falls under the topic of machine
representation of the data, and is therefore neither compatible nor
incompatible with a relational view of data, or whether XML is an
alternative to the relational view of data, and therefore one that should
be compared with the relational view of data with regard to benefits and
drawbacks.

The relational view of data as regards data in transit over a network
extends the scope of discussion of the relational model beyond the scope
contemplated in 1970. The discussion in 1970 and for many years afterwards
focussed on the application of the relational model to the organization of
data banks for large scale sharing of data. Large scale sharing of data is
increasingly being carried out by shipping data over the network from one
system to another. Any databases involved are in the background.

If the relational view of data were to achieve the dominance in data
transfer that it has achieved in databases, there would then be only one
area left to tackle, namely the application of the relational view of data
to data encapsulated within an object or a system.

  #2  
paul c
Re: WWW/Internet 2009: 2nd CFP until 21 September - 08-10-2009, 09:55 AM






Walter Mitty wrote:
Quote:
...
I'm interested in the idea of XML and the semantic web. In particular, I'm
interested in comparing this with the following idea, namely that the
relational data model is a useful one for viewing data in transit between
two systems connected by a network like the internet. Codd briefly
mentioned this topic in a single paragraph in the 1970 paper. I do not know
what Codd, Date and others have written on the subject since.
...

Do you remember what page that mention is on?

  #3  
paul c
Re: WWW/Internet 2009: 2nd CFP until 21 September - 08-10-2009, 11:11 AM



Walter Mitty wrote:
Quote:
...
Anyway, I'm interested in whether XML falls under the topic of machine
representation of the data and is therefore neither compatible nor
incompatible with a relational view of data.
Or whether XML is an alternative to the relational view of data, and
therefore one that should be compared with the relational view of data with
regard to benefits and drawbacks.
...

Here's an excerpt from an article two of the xml originators wrote
(http://www.scientificamerican.com/ar...econd-genera):

"The nesting rule automatically forces a certain simplicity on every XML
document, which takes on the structure known in computer science as a
tree. As with a genealogical tree, each graphic and bit of text in the
document represents a parent, child or sibling of some other element;
relationships are unambiguous. Trees cannot represent every kind of
information, but they can represent most kinds that we need computers to
understand. Trees, moreover, are extraordinarily convenient for
programmers. If your bank statement is in the form of a tree, it is a
simple matter to write a bit of software that will reorder the
transactions or display just the cleared checks."

Note the third sentence! Basically, they admit that trees are not
universally useful.

Another of many amusing sentences in the article is the very first one:

"Give people a few hints, and they can figure out the rest."

My own experience says otherwise!

Codd (1970) offered formal logic as the basis for designing a data language:

"1.5. SOME LINGUISTIC ASPECTS
The adoption of a relational model of data, as described
above, permits the development of a universal data sublanguage
based on an applied predicate calculus. A first-order
predicate calculus suffices if the collection of relations
is in normal form. Such a language would provide a yardstick
of linguistic power for all other proposed data languages,
and would itself be a strong candidate for embedding
(with appropriate syntactic modification) in a variety
of host languages (programming, command- or problem-oriented)."

From what I've seen of XML it has no equivalent language. How can
people figure out meaning without language?

I think anybody who reads this article critically can easily conclude
that nearly every justification it offers in fact damns the lack of
universality in XML. It seems the whole XML exercise was prompted by
the fact that HTML expresses nothing but punctuation. When it comes to
a theory for organizing and manipulating data, XML compared to pretty
much any of the RMs, even the mangled SQL attempt, amounts to nothing
but punctuation, big deal! It is a very naive and fractional solution for
very narrow problems encountered in HTML. The only comparison I can
stretch out of that article is that tags might be compared with relation
names, except that no coherent data theory is offered for manipulating
them, nor any method for coherent organization, e.g., avoiding redundancy
and contradictions and dependence on physical aspects such as position
and ordering. That is left for applications to invent in an ad hoc way,
which is what Codd was arguing against nearly forty years ago.
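
To make the contrast concrete, here is a minimal sketch (mine, not the article's) of the bank-statement example done relationally. It uses Python's built-in sqlite3 module; the table name txn, its columns and the sample rows are all made up for illustration.

```python
import sqlite3

# Hypothetical bank-statement data; the table and column names are
# illustrative and do not come from the article.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE txn ("
    "id INTEGER PRIMARY KEY, kind TEXT, amount_cents INTEGER, cleared INTEGER)"
)
conn.executemany(
    "INSERT INTO txn (id, kind, amount_cents, cleared) VALUES (?, ?, ?, ?)",
    [
        (1, "check", 12000, 1),
        (2, "deposit", 50000, 1),
        (3, "check", 4550, 0),
        (4, "check", 8025, 1),
    ],
)

# "Display just the cleared checks": a restriction stated as a predicate.
cleared_checks = conn.execute(
    "SELECT id, amount_cents FROM txn "
    "WHERE kind = 'check' AND cleared = 1 ORDER BY id"
).fetchall()

# "Reorder the transactions": the ordering is requested at query time,
# it is not a property of the stored data.
by_amount = conn.execute(
    "SELECT id FROM txn ORDER BY amount_cents DESC"
).fetchall()

print(cleared_checks)
print(by_amount)
```

Neither request needs the data to be arranged as a tree, and neither depends on the position of a row. Both are stated in a language, which is the part I can't find an equivalent for in the article.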

  #4  
paul c
Re: WWW/Internet 2009: 2nd CFP until 21 September - 08-10-2009, 11:20 AM



Walter Mitty wrote:
Quote:
...
The relational view of data as regards data in transit over a network
extends the scope of discussion of the relational model beyond the scope
contemplated in 1970. The discussion in 1970 and for many years afterwards
focussed on the application of the relational model to the organization of
data banks for large scale sharing of data. Large scale sharing of data is
increasingly being carried out by shipping data over the network from one
system to another. Any databases involved are in the background.
...

In the 1960's, let alone the 1970's, "large scale sharing of data" by
people was already a given requirement, no matter whether the vehicle
was hierarchies or graph designs. The more urgent problem, recognized
even by the Codasyl people, was sharing of data by applications.

  #5  
Walter Mitty
Re: WWW/Internet 2009: 2nd CFP until 21 September - 08-10-2009, 12:13 PM



"paul c" <toledobythesea (AT) oohay (DOT) ac> wrote

Quote:
Walter Mitty wrote:
...
I'm interested in the idea of XML and the semantic web. In particular,
I'm interested in comparing this with the following idea, namely that the
relational data model is a useful one for viewing data in transit between
two systems connected by a network like the internet. Codd briefly
mentioned this topic in a single paragraph in the 1970 paper. I do not
know what Codd, Date and others have written on the subject since.
...

Do you remember what page that mention is on?

I can't pin down the page. Here's the relevant quote.

quote:

The simplicity of the array representation which becomes feasible when all
relations are cast in normal form is not only an advantage for storage
purposes but also for communication of bulk data between systems which use
widely different representations of the data. The communication form would
be a suitably compressed version of the array representation and would have
the following advantages:

(1) It would be devoid of pointers (address-valued or displacement-valued).

(2) It would avoid all dependence on hash addressing schemes.

(3) It would contain no indices or ordering lists.

unquote:
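
As a rough sketch of what such a communication form might look like, and not anything Codd himself specified, the Python below ships a relation as a heading line followed by one line per tuple. The function names encode and decode, the CSV layout and the sample data are my own assumptions; all values travel as strings.

```python
import csv
import io


def encode(heading, rows):
    """Write a relation as plain text: one heading line, then one line per tuple."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(heading)
    # Sorting only makes the output deterministic; the receiver must not
    # attach any meaning to the order of the lines.
    writer.writerows(sorted(rows))
    return buf.getvalue()


def decode(text):
    """Rebuild (heading, set of tuples) from the text form."""
    reader = csv.reader(io.StringIO(text))
    heading = tuple(next(reader))
    return heading, {tuple(row) for row in reader}


# Hypothetical sample relation: which supplier supplies which part.
heading = ("part", "supplier")
sent_rows = {("p2", "s1"), ("p1", "s1"), ("p1", "s2")}

text = encode(heading, sent_rows)
received_heading, received_rows = decode(text)

print(text)
print(received_rows == sent_rows)
```

The text carries no pointers, no hash addresses and no indices. The receiver rebuilds a set of tuples and is free to store it however it likes.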



It is, of course, my interpretation that the above anticipates the kind of
data in transit that I alluded to in my OP. I think it's a reasonable
interpretation. I may hear other opinions in the course of this discussion.

  #6  
paul c
Re: WWW/Internet 2009: 2nd CFP until 21 September - 08-10-2009, 12:38 PM



Walter Mitty wrote:
Quote:
"paul c" <toledobythesea (AT) oohay (DOT) ac> wrote in message
,,,

I can't pin down the page. Here's the relevant quote.

quote:

The simplicity of the array representation which becomes feasible when all
relations are cast in normal form is not only an advantage for storage
purposes but also for communication of bulk data between systems which use
widely different representations of the data. The communication form would
be a suitably compressed version of the array representation and would have
the following advantages:

(1) It would be devoid of pointers (address-valued or displacement-valued).

(2) It would avoid all dependence on hash addressing schemes.

(3) It would contain no indices or ordering lists.

unquote:



It is, of course, my interpretation that the above anticipates the kind of
data in transit that I alluded to in my OP. I think it's a reasonable
interpretation. I may hear other opinions in the course of this discussion.

Thanks. Earlier on he says this:

"An array which represents an n-ary relation R has the following
properties:
(1) Each row represents an n-tuple of R.
(2) The ordering of rows is immaterial.
(3) All rows are distinct.
(4) The ordering of columns is significant - it corresponds
to the ordering S1, S2, ..., Sn of the domains
on which R is defined (see, however, remarks
below on domain-ordered and domain-unordered
relations).
(5) The significance of each column is partially conveyed
by labeling it with the name of the corresponding
domain."
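
A minimal sketch of those five properties in Python, under my own naming (Relation, make_relation and the sample data are hypothetical, not Codd's):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    heading: tuple   # column labels in a fixed order (properties 4 and 5)
    body: frozenset  # a set of n-tuples (properties 1, 2 and 3)


def make_relation(heading, rows):
    """Build a relation, rejecting rows whose arity differs from the heading."""
    heading = tuple(heading)
    body = frozenset(tuple(row) for row in rows)
    for row in body:
        if len(row) != len(heading):
            raise ValueError("row arity does not match heading")
    return Relation(heading, body)


# The same tuples supplied in a different order, once with a duplicate row.
a = make_relation(("part", "qty"), [("p1", 5), ("p2", 7), ("p1", 5)])
b = make_relation(("part", "qty"), [("p2", 7), ("p1", 5)])

print(a == b)       # row order and duplicates make no difference
print(len(a.body))  # the duplicate row was absorbed by the set
```

Using a set for the body is what makes row order immaterial and rows distinct. Column order stays significant because the heading is an ordered tuple.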

Although he doesn't use the term "named relation" in this quote, it does
seem he was talking about a communication arrangement ("representation")
as well as a storage arrangement; in a sense they are basically the same
thing. Given that users agree on a "predicate", the meaning of such an
array would be immediately recognizable. In 1999 the XML proponents
claimed to write about a seemingly "new" distinction between format and
meaning! It seems that Codd presaged them by about thirty years! In
his paper, he fairly easily dispensed with the "nesting" that the XML
people think is essential for any arrangement of data.

  #7  
Walter Mitty
Re: WWW/Internet 2009: 2nd CFP until 21 September - 08-10-2009, 12:49 PM



"paul c" <toledobythesea (AT) oohay (DOT) ac> wrote

Quote:
Walter Mitty wrote:
...
The relational view of data as regards data in transit over a network
extends the scope of discussion of the relational model beyond the scope
contemplated in 1970. The discussion in 1970 and for many years
afterwards focussed on the application of the relational model to the
organization of data banks for large scale sharing of data. Large scale
sharing of data is increasingly being carried out by shipping data over
the network from one system to another. Any databases involved are in
the background.
...

In the 1960's, let alone the 1970's, "large scale sharing of data" by
people was already a given requirement, no matter whether the vehicle was
hierarchies or graph designs. The more urgent problem, recognized even by
the Codasyl people, was sharing of data by applications.

First, I've always taken the word "user" in the 1970 paper to apply to an
application program, or to a fairly transparent command and print utility,
or to a human using the database as mediated by either an application or a
utility program. My reading might not have been careful enough.

Second, as to whether sharing was a given requirement or not, I'd have to
say that it depended on who you talked to. A large part of my career from
1985 through 1999 consisted not only in enabling people to share data, but
in convincing people of the merits of doing so. In almost every client
company there was a large faction that stood to lose, or thought so, if data
sharing prevailed. Even today, I'd say that over half of the databases
being built in SQL server are planned for use only by a single application
inside which the database is to be embedded. Any use of the data by other
applications, or even by general purpose report generators or OLAP
environments is to be done through the app's API.

The fact that such a view is monstrously naive doesn't prevent it from being
the majority view. Add to that the marketing plan that says that the client
has no choice but to return to us for access to their own data, and you
have the road to hell, well paved.

  #8  
Walter Mitty
Re: WWW/Internet 2009: 2nd CFP until 21 September - 08-10-2009, 01:03 PM



"paul c" <toledobythesea (AT) oohay (DOT) ac> wrote

Quote:
Walter Mitty wrote:
...
Anyway, I'm interested in whether XML falls under the topic of machine
representation of the data and is therefore neither compatible nor
incompatible with a relational view of data.
Or whether XML is an alternative to the relational view of data, and
therefore one that should be compared with the relational view of data
with regard to benefits and drawbacks.
...


Here's an excerpt from an article two of the xml originators wrote
(http://www.scientificamerican.com/ar...econd-genera):

"The nesting rule automatically forces a certain simplicity on every XML
document, which takes on the structure known in computer science as a
tree. As with a genealogical tree, each graphic and bit of text in the
document represents a parent, child or sibling of some other element;
relationships are unambiguous. Trees cannot represent every kind of
information, but they can represent most kinds that we need computers to
understand. Trees, moreover, are extraordinarily convenient for
programmers. If your bank statement is in the form of a tree, it is a
simple matter to write a bit of software that will reorder the
transactions or display just the cleared checks."

Note the third sentence! Basically, they admit that trees are not
universally useful.

The phrase I'm going to quote is this: "to make information
self-describing". This very same phrase was at the heart of the motivation
for databases back when I got my introduction to them in 1984. There was a
progression of how data definitions were managed that made sense back then,
as an explanation of how we got to the threshold of databases.

It went something like this: In FORTRAN, data definitions were scattered
all over the program, in FORMAT statements. In COBOL, the definitions
were at least gathered at the front of the program, in the data division.
Soon afterwards record definition libraries began to be accepted in the
COBOL world. (BTW, I was never a COBOL guy). This enabled lots of
programs to share record definitions. Finally, databases that contained
their own schema allowed data to be self-describing.
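
As a small illustration of that last step, here is a sketch using Python's built-in sqlite3 module. The employee table and its columns are made up for the example. The point is only that the definitions are read back from the database itself, not from any program's source.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the names are illustrative only.
conn.execute(
    "CREATE TABLE employee ("
    "emp_no INTEGER PRIMARY KEY, name TEXT NOT NULL, hired TEXT)"
)

# Ask the database which tables it holds.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# Ask the database for the column names and declared types of one table.
# PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")]

print(tables)
print(columns)
```

Any program that can open the database can discover the record layout this way, which is what a shared record definition library was approximating.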

I'm wondering if the people who invented XML didn't know that this work had
been done before, or if they regarded the work on databases as worthy of
being ignored.
In any event, they seem to have reinvented the hierarchical model of data.
This keeps happening. Next thing you know, we'll have somebody in this
forum telling us that Nelson Pick got everything right. We've already
been down that road, but it can happen again.

  #9  
paul c
Re: WWW/Internet 2009: 2nd CFP until 21 September - 08-10-2009, 01:33 PM



Walter Mitty wrote:
Quote:
...
I'm wondering if the people who invented XML didn't know that this work had
been done before, or if they regarded the work on databases as worthy of
being ignored.
In any event, they seem to have reinvented the hierarchical model of data.
This keeps happening. Next thing you know, we'll have somebody in this
forum telling us that Nelson Pick got everything right. We've already
been down that road, but it can happen again.

It takes guts to go after the big problems. Going after small ones has
fewer risks; one of them is that the result will be labelled as the
product of small minds. But it really is irresponsible to sell the
small solution as if it solves a big problem. At most, XML is a
programming technique that needs to be buttressed with a great deal of
ad hoc infrastructure, like OO, not a semantic innovation, let alone some
kind of fundamental discovery.

  #10  
Walter Mitty
Re: WWW/Internet 2009: 2nd CFP until 21 September - 08-10-2009, 02:10 PM



"paul c" <toledobythesea (AT) oohay (DOT) ac> wrote

Quote:
...
It takes guts to go after the big problems. Going after small ones has
fewer risks; one of them is that the result will be labelled as the
product of small minds. But it really is irresponsible to sell the small
solution as if it solves a big problem. At most, XML is a programming
technique that needs to be buttressed with a great deal of ad hoc
infrastructure, like OO, not a semantic innovation, let alone some kind of
fundamental discovery.

I have scanned the first few paragraphs of the article you cited. I admit
that I haven't read it all, or any of it carefully.

From the little I did read, it seems clear to me that the originators of XML
THINK that they have come up with a view of data that covers all of the same
ground that the relational view of data covers. Or it's just possible that
they never heard of the relational view of data. But just because they
think that doesn't mean that you and I should think that.

So I return to my original question. Is XML simply a machine representation
of data, or is it an alternative to the relational view of data? Another
related question is, can you represent relational data in an XML document?
Is anything gained or lost by doing so?
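
One way to start probing the last two questions is to try it. The sketch below uses Python's standard xml.etree.ElementTree module. The element names relation and tuple, the function names and the sample data are my own assumptions and not any standard mapping. Every value is carried as text.

```python
import xml.etree.ElementTree as ET


def relation_to_xml(name, heading, rows):
    """Serialize a set of tuples as <relation><tuple>...</tuple></relation>."""
    root = ET.Element("relation", {"name": name})
    # An XML document must list the tuples in some order; sorting just
    # makes that order deterministic.
    for row in sorted(rows):
        tuple_el = ET.SubElement(root, "tuple")
        for column, value in zip(heading, row):
            ET.SubElement(tuple_el, column).text = value
    return ET.tostring(root, encoding="unicode")


def xml_to_relation(text, heading):
    """Read the tuples back into a set, discarding document order."""
    root = ET.fromstring(text)
    return {
        tuple(tuple_el.findtext(column) for column in heading)
        for tuple_el in root.findall("tuple")
    }


# Hypothetical sample relation.
heading = ("part", "supplier")
rows = {("p1", "s1"), ("p2", "s1"), ("p1", "s2")}

document = relation_to_xml("supplies", heading, rows)
round_trip = xml_to_relation(document, heading)

print(document)
print(round_trip == rows)
```

In this sketch the relation survives the round trip, which suggests XML can at least serve as a machine representation. What the document adds is an ordering of tuples that the receiver has to ignore, and what it drops is everything except text values: the sketch carries no types, keys or constraints.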
