dbTalk Databases Forums  

RM and abstract syntax trees

comp.databases.theory comp.databases.theory


Discuss RM and abstract syntax trees in the comp.databases.theory forum.



Reply
 
Thread Tools Display Modes
  #41  
Old   
David BL
 
Posts: n/a

Default Re: RM and abstract syntax trees - 11-01-2007 , 04:19 AM






On Nov 1, 1:31 pm, Jonathan Leffler <jleff... (AT) earthlink (DOT) net> wrote:
Quote:
Marshall wrote:
On Oct 31, 7:50 pm, David BL <davi... (AT) iinet (DOT) net.au> wrote:
Isn't it helpful to see the analogy with a pointer dereference?

I'll leave it up to you as to whether you dislike the analogy between
node identifiers and pointer values, and the idea that a join can be
compared to a pointer dereference. Perhaps you are right and the
analogy creates confusion.

I think if we are clear about it being an analogy we are on solid
ground. But as soon as we start thinking pointers and references
are the *same* thing we are in trouble, because now we can't
see the differences anymore.

I think Date actually nails this issue. He says (roughly) that
pointers add complexity but don't add any expressive power.

Isn't the other 'point' that 'pointers point somewhere' but values
stored in a relation don't - that relational database bases work on
associative addressing. In particular, even in a foreign key, the value
doesn't point to the referenced primary key; it merely contains the same
value as some entry in the referenced table. It may also contain the
same value as a large number of other places in the database.
That's quite right, but note in the special case of using RM to
represent an AST, a foreign key node identifier ends up uniquely
referencing precisely one tuple in one relation elsewhere in the DB.




Reply With Quote
  #42  
Old   
David Cressey
 
Posts: n/a

Default Re: RM and abstract syntax trees - 11-01-2007 , 05:28 AM







"Gene Wirchenko" <genew (AT) ocis (DOT) net> wrote

Quote:
I am reminded of pAssembler that I used years ago at university.
It was called "NIP" which stood for "nothing in particular". My
compsci instructor said that several times.
Or GNU, which stands for GNU (is) Not Unix.





Reply With Quote
  #43  
Old   
David Cressey
 
Posts: n/a

Default Re: RM and abstract syntax trees - 11-01-2007 , 05:37 AM




"Marshall" <marshall.spight (AT) gmail (DOT) com> wrote

Quote:
Actually, Paul had a pretty good take on atomic just now:
it means the relational operators can't decompose them.
As has been said a bunch of times, the only operation
the RM requires on attribute domains is the equality test.
We've had this conversation before. But here goes.

The equality test is most often implemented on "simple" data types by
testing on equality of the representations, not on equality of the values
thus represented. This works, provided the representation scheme does not
allow synonyms, i.e. two distinct representations for the same thing. If
synonyms are present, the equality test on representations can yield false
negatives, if what is really desired is an equality test on values.

Most of the representation schemes for compound values, like RVAs, permit
synonyms. The ones that don't permit synonyms are more costly.




Reply With Quote
  #44  
Old   
David Cressey
 
Posts: n/a

Default Re: RM and abstract syntax trees - 11-01-2007 , 05:44 AM




"David BL" <davidbl (AT) iinet (DOT) net.au> wrote

Quote:
I guess I equate pointers with edges in a directed graph (where by
directed graph I'm referring to the formalised notion of a set of
nodes plus a set of directed edges between nodes), and naturally I
regard an AST as an acyclic directed graph.
This is a good working definition. My claim that pointers are really
addresses can be recast as a special case of the above.

But the objections concerning data models based on directed graphs turn out
to be the same as the objections concerning implementations that rely on
(exposed) pointers used as addresses.

Those objections don't mean that the RM is superior to directed graphs in
all instances. Reference what Marshall and others said. It does mean that
caution is needed when proposing a directed graph model to replace the RM as
the most general model.




Reply With Quote
  #45  
Old   
David Cressey
 
Posts: n/a

Default Re: RM and abstract syntax trees - 11-01-2007 , 05:50 AM




"David BL" <davidbl (AT) iinet (DOT) net.au> wrote


Quote:
represent an AST, a foreign key node identifier ends up uniquely
referencing precisely one tuple in one relation elsewhere in the DB.
Does the foreign key node identifier reference one tuple, or does it
reference a node that is catalogued by one tuple, that contains the node
identifier as a value?





Reply With Quote
  #46  
Old   
David BL
 
Posts: n/a

Default Re: RM and abstract syntax trees - 11-01-2007 , 06:03 AM



On Nov 1, 1:59 pm, Marshall <marshall.spi... (AT) gmail (DOT) com> wrote:

Quote:
I mentioned previously that pointers are only meaningful or
functional within the context of a specific, typically non-portable
address space. Without the address space, pointers might as
well be random numbers. So if you are going to pack up some
data and move it from, say, one machine to another, you
have to go through all sorts of contortions to preserve some
semblance of pointer semantics. And as anyone who's dug
through the code for Java serialization, this is quite ugly
and error prone. Whereas if you want to copy some
relations from one place to another no transformations of
any kind are necessary.
Most of the time but not always. Meaningless identifiers (such as
used for identifiers of an AST) can pose problems with merging
relational data between DB's even though they have compatible schema.
A solution is to introduce local namespaces for surrogate id
generation, and that parallels the common solution used for merging
(distributed) persistent object stores, which is to qualify local
address spaces so they are distinct. This leads to the technique of
pointer swizzling which is a fancy name for mapping persistent OIDs to
virtual memory pointers as objects are faulted into memory.

I don't find it at all surprising that we see this parallel,
considering the analogy of meaningless attribute values with pointer
values.

To avoid the problems with clashing identifiers/pointers when sending
something like source code between computers then it seems better to
send the original text rather than to parse it into an AST and then
try to send the AST. This seems to be true regardless of how you
represent the AST. ie RM doesn't magically solve that problem.

I hate <ugly>XML</ugly>) but it can be seen as an interchange format
for nested expressions that avoids transmission of node identifiers.
But wouldn't it be so much nicer to send text representations of S-
expressions, or better still use a more efficient binary format. I
rather like the idea of binary grammars.

Quote:
Which rather highlights how the
relational form is a logical form, and not a physical form.
Interesting observation.




Reply With Quote
  #47  
Old   
Tegiri Nenashi
 
Posts: n/a

Default Re: RM and abstract syntax trees - 11-01-2007 , 09:05 AM



On Oct 31, 4:29 pm, David BL <davi... (AT) iinet (DOT) net.au> wrote:
Quote:
On Nov 1, 7:23 am, Tegiri Nenashi <TegiriNena... (AT) gmail (DOT) com> wrote:



On Oct 29, 7:06 pm, David BL <davi... (AT) iinet (DOT) net.au> wrote:

The inappropriateness of the RM for ASTs is highlighted when we
interpret tuples as propositions or facts. Consider the following
expression

(x + 1) * 3

In order to state the fact that this expression consists of the
product of subexpressions '(x+1)' and '3', one must clearly assign
names to the things about which we want to state facts. No wonder the
RM forces us to go on a naming spree.

And when you consider the above expression in the language theory you
don't assign grammar symbols to each subexpression? In your example
they clearly are:

number : 1 | 3;
expr : number | expr + expr | expr * expr | ( expr );

I don't understand your point. Naming grammar symbols is quite
different from naming actual instances of subexpressions. Imagine
how many names one would need in a large source code program.
The actual instances of subexpressions are binary predicates. In your
example

(x + 1) * 3

let's enumerate all the tokens:

0 -> (
1 -> x
2 -> +
3 -> 1
4 -> )
5 -> *
6 -> 3

Then by parsing this fragment we establish that, for example, the
predicate "expr"(x,y) evaluates to true for the arguments x=0, y=5.
(By convention a semiopen interval of tokens [x,y) includes all the
integers between x, included, and y, excluded). Essentially you have a
structure of nested intervals that encodes the parse tree:

expr[0,7)
....expr[0,5)
........"(" [0,1)
........expr[1,4)
........")" [4,5)
...."+"[5,6)
....expr[6,7)




Reply With Quote
  #48  
Old   
Marshall
 
Posts: n/a

Default Re: RM and abstract syntax trees - 11-01-2007 , 10:53 AM



On Nov 1, 8:04 am, paul c <toledobythe... (AT) ooyah (DOT) ac> wrote:
Quote:
David BL wrote:
In the following I compare different techniques for representing an
Abstract Syntax Tree (AST), concluding that RM is poorly suited.
...

I could just as well say that your choice of application is poorly suited.

From Wikepedia:

"In computing, it is used in a parser as an intermediate between a parse
tree and a data structure"

To me, trying this is reminiscent of EAV attempts, ie., using the RM to
give an "intermediate" implementation. Apart from "jobs for the boys",
why would one want to? Surely the application here is the manipulation
of the parse tree, not the AST.
Well, I'd have to disagree. Syntax is largely to help the human.
Once your compiler reads in the program text, it will really
"prefer" (if I may anthropomorphize a bit) to work on the
minimal data structure it can; this is the AST.

The are applications where one wants to manipulate the
parse tree directly, such as when making source-to-source
transformations. Like when your IDE has the ability to
rename a variable.

But much of what a compiler does, in analysis and optimization
and code generation, is done against the AST.

My long term hypothesis is that there actually exists an
idiom to work with ASTs in the RM that is about as
good as any other way, but we just haven't identified
it yet. I have this problem on my queue of problems
that I wish to look in to in detail. Whether I'll ever
get there is an open question.


Marshall



Reply With Quote
  #49  
Old   
Tegiri Nenashi
 
Posts: n/a

Default Re: RM and abstract syntax trees - 11-01-2007 , 12:24 PM



On Nov 1, 8:04 am, paul c <toledobythe... (AT) ooyah (DOT) ac> wrote:
Quote:
Surely the application here is the manipulation
of the parse tree, not the AST.
Parse tree indeed is the king, not AST. Given the parse tree one can
easily derive AST by selecting only "interesting" nodes. The aguments
for AST -- that it is more human readable and more compact -- are both
meningless from the theory perspective.




Reply With Quote
  #50  
Old   
David BL
 
Posts: n/a

Default Re: RM and abstract syntax trees - 11-01-2007 , 08:29 PM



On Nov 2, 9:10 am, paul c <toledobythe... (AT) ooyah (DOT) ac> wrote:
Quote:
Marshall wrote:

...

Your conception of the RM is too narrow. There is nothing in
the RM that precludes nested structures, nor union types.

By "union types", I presume you mean "unions of types that a db may use"
and also that the RM doesn't preclude *nor define* unions of types
except for unions of relation types as long as what some people call an
identity function (IIRC) and others call an equality operator is
provided, somehow, for the union type.
I like the way Date associates a union type with a supertype over the
subtypes of which it is taking the union.

This seems an elegant way to define discriminated unions. I've
wondered whether a language could support retrospective addition of
supertypes of a given type. This could make sense if one drops the
conventional OO idea of bundling methods with classes.




Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.