dbTalk Databases Forums  

RM and abstract syntax trees

comp.databases.theory comp.databases.theory


Discuss RM and abstract syntax trees in the comp.databases.theory forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
David BL
 
Posts: n/a

Default RM and abstract syntax trees - 10-29-2007 , 09:06 PM






In the following I compare different techniques for representing an
Abstract Syntax Tree (AST), concluding that RM is poorly suited.

In LISP, S-expressions (parenthesized lists) are well suited to
representing ASTs because S-expressions naturally nest, allow for
native representation of data types like 32 bit integers, and
symbolic names allow for AST nodes to be tagged with their type.

S-expressions are implemented using pointers. The interesting thing
is that the use of pointers is hidden from the programmer. For
example, it is impossible to print the address of a cons cell.

In Prolog, the data type is called a term, and for the same reasons
just mentioned, terms are well suited to representing ASTs.

In C or C++ we have a lower level means to represent ASTs using heap
allocated structs or classes wired up using pointers. Various
techniques can be used to avoid low level details such as memory
management. As an example, a function can be written to test for
value equivalence of two AST node variables, and this will check for
isomorphism in the structure, but otherwise treat pointer values as
insignificant.

The RM seems poorly suited to the representation of ASTs. The
fundamental problem is that it is necessary for the user to assign
every node of the AST a unique identifier. Let's call a spade a
spade : this identifier represents a pointer value. So RM is forced
to expose the equivalent of pointers directly in the representation.
Furthermore, the RM has no mechanism for hiding these pointers or
giving the user an interface that promotes the idea that a node
logically represents a value.

This is rather paradoxical, because one of the benefits of RM over OO
is meant to be the avoidance of pointers (and not their promotion)!

The inappropriateness of the RM for ASTs is highlighted when we
interpret tuples as propositions or facts. Consider the following
expression

(x + 1) * 3

In order to state the fact that this expression consists of the
product of subexpressions '(x+1)' and '3', one must clearly assign
names to the things about which we want to state facts. No wonder the
RM forces us to go on a naming spree.

By contrast, in LISP we may directly represent the above expression
using the following S-expression

(* (+ x 1) 3)

There is an important philosophical difference : In LISP, Prolog or C
we *directly* represent the AST as an entity that is *part of the
abstract computational machine*. By contrast, in the RM one is only
permitted to state facts about the AST in order to represent it,
almost as though the AST is some external entity that is not a part of
the abstract computational machine.

Consider the following analogy: Imagine how inefficient the RM would
become, if it wasn't even permitted to directly represent an integer
value, but instead was forced to represent the number indirectly.
For example, given the binary representation 0101 of an integer value
such as 5, one could label the value with the name 'x' and represent
its value indirectly by stating a fact for each '1' in the binary
representation, using a separate relation for each bit position (ie
power of 2). Clearly hideous, but an interesting illustration of
how RM can be taken to a normal form where the only domain required
allows for symbolic names.

It seems to me that the rule of thumb should be this : Do not use RM
to state facts about entities that can be directly represented within
the abstract computational machine.

I anticipate that this rule of thumb provides a useful insight on that
rather vague notion of "semi-structured data". ie it explains exactly
when and why there is data that is not suitable for direct
representation in RM.


Reply With Quote
  #2  
Old   
Roy Hann
 
Posts: n/a

Default Re: RM and abstract syntax trees - 10-30-2007 , 03:29 AM






"David BL" <davidbl (AT) iinet (DOT) net.au> wrote

Quote:
In the following I compare different techniques for representing an
Abstract Syntax Tree (AST), concluding that RM is poorly suited.
[snip]

Quote:
I anticipate that this rule of thumb provides a useful insight on that
rather vague notion of "semi-structured data". ie it explains exactly
when and why there is data that is not suitable for direct
representation in RM.
Education triumphs over learning once again.

Roy




Reply With Quote
  #3  
Old   
David BL
 
Posts: n/a

Default Re: RM and abstract syntax trees - 10-30-2007 , 05:29 AM



On Oct 30, 6:29 pm, "Roy Hann" <specia... (AT) processed (DOT) almost.meat>
wrote:
Quote:
"David BL" <davi... (AT) iinet (DOT) net.au> wrote in message

news:1193713604.283167.146850 (AT) e34g2000pro (DOT) googlegroups.com...

In the following I compare different techniques for representing an
Abstract Syntax Tree (AST), concluding that RM is poorly suited.

[snip]

I anticipate that this rule of thumb provides a useful insight on that
rather vague notion of "semi-structured data". ie it explains exactly
when and why there is data that is not suitable for direct
representation in RM.

Education triumphs over learning once again.

Roy
Please say what you disagree with. I can take it.





Reply With Quote
  #4  
Old   
Bob Badour
 
Posts: n/a

Default Re: RM and abstract syntax trees - 10-30-2007 , 12:13 PM



paul c wrote:

Quote:
David BL wrote:

On Oct 30, 6:29 pm, "Roy Hann" <specia... (AT) processed (DOT) almost.meat
wrote:

"David BL" <davi... (AT) iinet (DOT) net.au> wrote in message

news:1193713604.283167.146850 (AT) e34g2000pro (DOT) googlegroups.com...

In the following I compare different techniques for representing an
Abstract Syntax Tree (AST), concluding that RM is poorly suited.

[snip]

I anticipate that this rule of thumb provides a useful insight on that
rather vague notion of "semi-structured data". ie it explains exactly
when and why there is data that is not suitable for direct
representation in RM.

Education triumphs over learning once again.

Roy

Please say what you disagree with. I can take it.

Okay, from your original post:

"So RM is forced
to expose the equivalent of pointers directly in the representation.
Furthermore, the RM has no mechanism for hiding these pointers or
giving the user an interface that promotes the idea that a node
logically represents a value."

Where does RM ever mention pointers? Eg., What are the pointer
operations that RM supports?

(ps: I don't agree that RM can't represent nested lists but I would
agree that it's not much fun to manipulate them, I wish Codd had said
more about nested relations as I have a feeling he spent some time
considering them.)
It seems he considered them unecessary in the sense one can always
normalize the data to obviate the need for them.


Reply With Quote
  #5  
Old   
Bob Badour
 
Posts: n/a

Default Re: RM and abstract syntax trees - 10-30-2007 , 12:22 PM



paul c wrote:

Quote:
paul c wrote:
...

(ps: I don't agree that RM can't represent nested lists but I would
agree that it's not much fun to manipulate them, I wish Codd had said
more about nested relations as I have a feeling he spent some time
considering them.)


Here's my favourite nested relation, although I admit it's probably
useless in practice. It's a recursive one. Sorry I don't have much
mastery of conventional syntax, what I mean here is something like R:
attribute list> where <attribute list> is a set of attribute name,
attribute type pairs and typeof is swiped from C-language:

R: (A typeof R)

I don't know how to display a value for R but I guess it could have
either no tuples or one tuple.
It could have any number of tuples. See formalism under "philosophy of
mathematics".

Example values are:
zero tuples:
{}

one tuple:
{{}}
{{{}}}
{{{{}}}}
{{{},{{}}}}
....

two tuples:
{{},{{}}}
{{{}},{{{}}}}
{{},{{{}}}}
....

three tuples:
{{},{{}},{{},{{}}}}
{{},{{}},{{{}}}}
....

four tuples:
{{},{{}},{{},{{}}},{{},{{}},{{},{{}}}}}
etc.


Quote:
Also guessing that R <OR> (<NOT> R) has one tuple and R <AND> (<NOT> R)
has no tuples (where <OR>, <AND>, <NOT> come from D&D syntax).
I suspect you guess incorrectly for at least one of them.


Reply With Quote
  #6  
Old   
Bob Badour
 
Posts: n/a

Default Re: RM and abstract syntax trees - 10-30-2007 , 12:39 PM



paul c wrote:

Quote:
Bob Badour wrote:
...

It seems he considered them unecessary in the sense one can always
normalize the data to obviate the need for them.

It seems that way to me too but I'd add that I think he presumed that
one has applicable "data" in the first place, ie., one has in mind
enough attributes that have values so as to allow tuples to stand for
what what has in mind to express, eg., one must be able to distinguish
different facts by tuple values, otherwise one hasn't determined the
system's requirements in the first place and we could never agree on
what the system is supposed to be talking about!
If one lacks data, one hardly needs data management.

I would like to see more heavy thinkers thinking about 6NF.


Reply With Quote
  #7  
Old   
Tegiri Nenashi
 
Posts: n/a

Default Re: RM and abstract syntax trees - 10-30-2007 , 02:55 PM



On Oct 29, 7:06 pm, David BL <davi... (AT) iinet (DOT) net.au> wrote:
Quote:
In the following I compare different techniques for representing an
Abstract Syntax Tree (AST), concluding that RM is poorly suited.
The connection between parsing and logic has been explored in
Pereira&Warren works, e.g.
http://citeseer.ist.psu.edu/576360.html



Reply With Quote
  #8  
Old   
Marshall
 
Posts: n/a

Default Re: RM and abstract syntax trees - 10-30-2007 , 06:26 PM



On Oct 30, 10:39 am, Bob Badour <bbad... (AT) pei (DOT) sympatico.ca> wrote:
Quote:
I would like to see more heavy thinkers thinking about 6NF.
You've hinted at this idea before. I make no claims about
being a heavy thinker, but I've generally found your ideas
to be worth pursuing. Do you have any suggestions for
what I should be reading, before I start doing any
thinking? :-)

I have a vague, vague thought about there being
a correction between 6NF and relations-as-logic.


Marshall



Reply With Quote
  #9  
Old   
David BL
 
Posts: n/a

Default Re: RM and abstract syntax trees - 10-30-2007 , 06:34 PM



On Oct 31, 2:46 am, paul c <toledobythe... (AT) ooyah (DOT) ac> wrote:
Quote:
Okay, from your original post:

"So RM is forced
to expose the equivalent of pointers directly in the representation.
Furthermore, the RM has no mechanism for hiding these pointers or
giving the user an interface that promotes the idea that a node
logically represents a value."

Where does RM ever mention pointers? Eg., What are the pointer
operations that RM supports?
I'm associating a "pointer" with the idea to give a thing (like a node
of an AST) some meaningless identifier, and using that identifier
elsewhere as a means to uniquely reference that thing. With that
*analogy*, RM performs a pointer dereference when performing a natural
join.

Consider a representation of (x+1) * 3 using the following relations

var(n,s) :- node n is a variable with name s
number(n,i) :- node n is a number with value i
plus(n,n1,n2) :- node n represents the addition of n1,n2
prod(n,n1,n2) :- node n represents the product of n1,n2

Algorithms that process the AST will use joins on node attributes,
basically emulating pointer dereference.

Whether you accept the analogy with pointers or not, the result is
hideous all the same. For example, consider the integrity
constraints needed - eg that a node can't represent a number and a
variable at the same time.

Assuming there isn't some decent way to represent AST's using RM that
I'm not aware of, my conclusion that RM is ill suited seems valid.

Quote:
(ps: I don't agree that RM can't represent nested lists but I would
agree that it's not much fun to manipulate them, I wish Codd had said
more about nested relations as I have a feeling he spent some time
considering them.)
I never said that RM cannot represent nested lists.





Reply With Quote
  #10  
Old   
Bob Badour
 
Posts: n/a

Default Re: RM and abstract syntax trees - 10-30-2007 , 06:45 PM



paul c wrote:

Quote:
Bob Badour wrote:

paul c wrote:

...

Here's my favourite nested relation, although I admit it's probably
useless in practice. It's a recursive one. Sorry I don't have much
mastery of conventional syntax, what I mean here is something like R:
attribute list> where <attribute list> is a set of attribute name,
attribute type pairs and typeof is swiped from C-language:

R: (A typeof R)

I don't know how to display a value for R but I guess it could have
either no tuples or one tuple.


It could have any number of tuples. See formalism under "philosophy of
mathematics".

Example values are:
zero tuples:
{}

one tuple:
{{}}
{{{}}}
{{{{}}}}
{{{},{{}}}}
...

two tuples:
{{},{{}}}
{{{}},{{{}}}}
{{},{{{}}}}
...

three tuples:
{{},{{}},{{},{{}}}}
{{},{{}},{{{}}}}
...

four tuples:
{{},{{}},{{},{{}}},{{},{{}},{{},{{}}}}}
etc.

...


Thanks for the formalism suggestions, will try to absorb.

I know you've tried to explain various flavours of this to me before,
but an example that prompts this for me, only with two attributes, is:

R2: {A integer, B typeof R2}

In R2, I can see that a value for B isn't necessarily empty, but if it
is, that must be the end of the "descent", for I don't see how an empty
recursive relation can recurse "any further" as it were. (I think this
would be the case even if R2 defined several "levels" of conventional
non-recursive RVA's.)

Now, I admit I'm slow (but not heavy!) but I can see only one possible
value for R.A which is an empty relation of type R and two possible
values for R, one where R is empty and the other where R has one tuple
where R.A is empty. I don't see how a relation defined recursively can
"descend" from an empty value!

So far, it looks like a peculiar kind of constraint to me. As somebody
else say, go ahead and attack it, I can take it!
{ 0, {} }
{ 1, {} }
{ 2, {} }
{ 3, { -1, { 3, {} } } }


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.