dbTalk Databases Forums  

Object-relational impedence

comp.databases.theory comp.databases.theory


Discuss Object-relational impedence in the comp.databases.theory forum.



Reply
 
Thread Tools Display Modes
  #3901  
Old   
topmind
 
Posts: n/a

Default Re: EAV (Re: Object-relational impedence) - 04-03-2008 , 12:06 AM








Brian Selzer wrote:
Quote:
"topmind" <topmind (AT) technologist (DOT) com> wrote in message
news:a5fa390d-025e-4708-93e0-588d573faa2d (AT) u10g2000prn (DOT) googlegroups.com...


Brian Selzer wrote:
"topmind" <topmind (AT) technologist (DOT) com> wrote in message
news:a742d466-e6a9-4d13-a1ba-953567fcb3dc (AT) u10g2000prn (DOT) googlegroups.com...
[big snip]

It's a gut feeling for everyone here because it's never been tried in
practice. We cannot really know with near certainty all the possible
usage scenarios (query patterns) and which are more frequent. It's an
educated WAG at best.


Perhaps some testing is in order.

There's another suggestion that nobody's mentioned here that I know
of: Add each new column to the Widgets table as needed, creating a
"sparse" table. If we are going to allow add/change to schemas, then
why not just add the new frippen column to the Widgets table and be
done with it? Sparse tables are not really space hogs on most modern
RDBMS like they were in the past. The overhead for unused (null)
columns is small.


It's not about space. Separate tables can be indexed separately,
permitting
the selection of better query plans. A lot of nulls, such as you would
find
in a sparse table, skews the statistics for columns, making some indexes
that would otherwise appear optimal less attractive, leading to poor
performance.

I doubt queries against sparse columns would be common. Generally
there would first be a query involving the "widget type" and/or
attribute type, which would greatly narrow what is being searched.
And, depending on how the optimizer and index system is built, you
don't need to store Nulls *in* the indexes. Think about it. There are
a few operations where indexing nulls makes sense, but they are not
common enough to bother in my opinion. There are other ways to get the
same info. But, a given vendor may have seen it different. In short,
it depends on the DB engine.


You're entitled to your opinion, even if it's wrong. There may not even be
a need for a "widget type" attribute.
Perhaps "type" is not quite the right description. It would identify a
specific widget implementation, say the "button from Foo Inc.". That
can be used to look up the dictionary(s) of available attributes and
features.

Quote:
EAV's don't need to store nulls. You just don't include that entry. If
a given index engine is not null-friendly, then perhaps an EAV is
better than a sparse table as far as performance.


No. EAVs don't store nulls. But their structure makes creating usable
indexes extremely difficult if not impossible. That makes me question
whether performance would be better with an EAV.
I don't see why this would be the case. Being a "skinny" table, you
can easily index every column. It is hard to out-index that.

Quote:
The idea of a sparse table is ugly because it involves nulls.
Nulls--especially those that indicate 'there shouldn't be a value
here'--should be avoided whenever possible. If you're interested, there
have been many discussions on cdt regarding nulls. If you can wade
through
the flames, you might even find some useful information.

Data has an inherent structure.

Sometimes its "inherent structure" does NOT fit static tables well.
The real world is not always friendly to a given abstraction. And,
although I am against "nulls" for strings, the concept of some kind of
"empty" cell (zero length) is not necessarily a bad thing. (In some
RDBMS, zero length and null are the same thing, in others they are
not.)


I don't think you understand "inherent structure."
I suspect you have a personal pet concept that you gave that label.
But you're free to prove me wrong. "Structure" is usually a human-
built abstract model to approximate or describe some limited aspect of
the real world, and there are many possible ways to model the same
given thing.

Quote:
Stuffing everything into one table, whether
it be an EAV table or a sparse table imposes an alien structure on the
data,
introducing redundancy and complexity and usually reducing performance.
A
solution that has one table per widget type also imposes a structure on
the
data, one that inevitably introduces redundancy. The 'table-happy'
design
that I suggested earlier does not introduce redundancy and does not
require
nulls.

Your solution does NOT reduce redundancy because the foreign key has
to be repeated over and over again. And, empty cells don't
necessarily reduce performance, per above. And, filling up table-space
is ugly and not debugging-friendly. Fat table-spaces slow me down. I
cannot speak for everybody's psychology and hand-eye-mouse-brain
coordination, but it slows down MINE.


By most, foreign key references do not constitute redundancy.
I disagree. For one, it fills up the schemas with duplication
regardless of what is actually in the tables. Human eyes have to sift
over all that.

Quote:
In direct
image systems, foreign key references are problematic because updates must
often physically cascade. But not all systems work that way. Some systems
internally use pointers instead of duplicating what is referenced so that
physical cascades are unnecessary. And don't kid yourself: empty cells
necessarily reduce performance, they just may not reduce it noticably.
I would bet that joins slow things down more than empty cells. But we
can agree to treat it as a wash for this debate.

Quote:
I will respect your choice because the tradeoffs are all sticky either
way. But, I do not agree with it, nor do I agree with your performance
criticisms and redundancy claims. They appear incorrect.

-T-




Reply With Quote
  #3902  
Old   
topmind
 
Posts: n/a

Default Re: EAV (Re: Object-relational impedence) - 04-03-2008 , 12:06 AM








Brian Selzer wrote:
Quote:
"topmind" <topmind (AT) technologist (DOT) com> wrote in message
news:a5fa390d-025e-4708-93e0-588d573faa2d (AT) u10g2000prn (DOT) googlegroups.com...


Brian Selzer wrote:
"topmind" <topmind (AT) technologist (DOT) com> wrote in message
news:a742d466-e6a9-4d13-a1ba-953567fcb3dc (AT) u10g2000prn (DOT) googlegroups.com...
[big snip]

It's a gut feeling for everyone here because it's never been tried in
practice. We cannot really know with near certainty all the possible
usage scenarios (query patterns) and which are more frequent. It's an
educated WAG at best.


Perhaps some testing is in order.

There's another suggestion that nobody's mentioned here that I know
of: Add each new column to the Widgets table as needed, creating a
"sparse" table. If we are going to allow add/change to schemas, then
why not just add the new frippen column to the Widgets table and be
done with it? Sparse tables are not really space hogs on most modern
RDBMS like they were in the past. The overhead for unused (null)
columns is small.


It's not about space. Separate tables can be indexed separately,
permitting
the selection of better query plans. A lot of nulls, such as you would
find
in a sparse table, skews the statistics for columns, making some indexes
that would otherwise appear optimal less attractive, leading to poor
performance.

I doubt queries against sparse columns would be common. Generally
there would first be a query involving the "widget type" and/or
attribute type, which would greatly narrow what is being searched.
And, depending on how the optimizer and index system is built, you
don't need to store Nulls *in* the indexes. Think about it. There are
a few operations where indexing nulls makes sense, but they are not
common enough to bother in my opinion. There are other ways to get the
same info. But, a given vendor may have seen it different. In short,
it depends on the DB engine.


You're entitled to your opinion, even if it's wrong. There may not even be
a need for a "widget type" attribute.
Perhaps "type" is not quite the right description. It would identify a
specific widget implementation, say the "button from Foo Inc.". That
can be used to look up the dictionary(s) of available attributes and
features.

Quote:
EAV's don't need to store nulls. You just don't include that entry. If
a given index engine is not null-friendly, then perhaps an EAV is
better than a sparse table as far as performance.


No. EAVs don't store nulls. But their structure makes creating usable
indexes extremely difficult if not impossible. That makes me question
whether performance would be better with an EAV.
I don't see why this would be the case. Being a "skinny" table, you
can easily index every column. It is hard to out-index that.

Quote:
The idea of a sparse table is ugly because it involves nulls.
Nulls--especially those that indicate 'there shouldn't be a value
here'--should be avoided whenever possible. If you're interested, there
have been many discussions on cdt regarding nulls. If you can wade
through
the flames, you might even find some useful information.

Data has an inherent structure.

Sometimes its "inherent structure" does NOT fit static tables well.
The real world is not always friendly to a given abstraction. And,
although I am against "nulls" for strings, the concept of some kind of
"empty" cell (zero length) is not necessarily a bad thing. (In some
RDBMS, zero length and null are the same thing, in others they are
not.)


I don't think you understand "inherent structure."
I suspect you have a personal pet concept that you gave that label.
But you're free to prove me wrong. "Structure" is usually a human-
built abstract model to approximate or describe some limited aspect of
the real world, and there are many possible ways to model the same
given thing.

Quote:
Stuffing everything into one table, whether
it be an EAV table or a sparse table imposes an alien structure on the
data,
introducing redundancy and complexity and usually reducing performance.
A
solution that has one table per widget type also imposes a structure on
the
data, one that inevitably introduces redundancy. The 'table-happy'
design
that I suggested earlier does not introduce redundancy and does not
require
nulls.

Your solution does NOT reduce redundancy because the foreign key has
to be repeated over and over again. And, empty cells don't
necessarily reduce performance, per above. And, filling up table-space
is ugly and not debugging-friendly. Fat table-spaces slow me down. I
cannot speak for everybody's psychology and hand-eye-mouse-brain
coordination, but it slows down MINE.


By most, foreign key references do not constitute redundancy.
I disagree. For one, it fills up the schemas with duplication
regardless of what is actually in the tables. Human eyes have to sift
over all that.

Quote:
In direct
image systems, foreign key references are problematic because updates must
often physically cascade. But not all systems work that way. Some systems
internally use pointers instead of duplicating what is referenced so that
physical cascades are unnecessary. And don't kid yourself: empty cells
necessarily reduce performance, they just may not reduce it noticably.
I would bet that joins slow things down more than empty cells. But we
can agree to treat it as a wash for this debate.

Quote:
I will respect your choice because the tradeoffs are all sticky either
way. But, I do not agree with it, nor do I agree with your performance
criticisms and redundancy claims. They appear incorrect.

-T-




Reply With Quote
  #3903  
Old   
topmind
 
Posts: n/a

Default Re: EAV (Re: Object-relational impedence) - 04-03-2008 , 12:06 AM





Brian Selzer wrote:
Quote:
"topmind" <topmind (AT) technologist (DOT) com> wrote in message
news:a5fa390d-025e-4708-93e0-588d573faa2d (AT) u10g2000prn (DOT) googlegroups.com...


Brian Selzer wrote:
"topmind" <topmind (AT) technologist (DOT) com> wrote in message
news:a742d466-e6a9-4d13-a1ba-953567fcb3dc (AT) u10g2000prn (DOT) googlegroups.com...
[big snip]

It's a gut feeling for everyone here because it's never been tried in
practice. We cannot really know with near certainty all the possible
usage scenarios (query patterns) and which are more frequent. It's an
educated WAG at best.


Perhaps some testing is in order.

There's another suggestion that nobody's mentioned here that I know
of: Add each new column to the Widgets table as needed, creating a
"sparse" table. If we are going to allow add/change to schemas, then
why not just add the new frippen column to the Widgets table and be
done with it? Sparse tables are not really space hogs on most modern
RDBMS like they were in the past. The overhead for unused (null)
columns is small.


It's not about space. Separate tables can be indexed separately,
permitting
the selection of better query plans. A lot of nulls, such as you would
find
in a sparse table, skews the statistics for columns, making some indexes
that would otherwise appear optimal less attractive, leading to poor
performance.

I doubt queries against sparse columns would be common. Generally
there would first be a query involving the "widget type" and/or
attribute type, which would greatly narrow what is being searched.
And, depending on how the optimizer and index system is built, you
don't need to store Nulls *in* the indexes. Think about it. There are
a few operations where indexing nulls makes sense, but they are not
common enough to bother in my opinion. There are other ways to get the
same info. But, a given vendor may have seen it different. In short,
it depends on the DB engine.


You're entitled to your opinion, even if it's wrong. There may not even be
a need for a "widget type" attribute.
Perhaps "type" is not quite the right description. It would identify a
specific widget implementation, say the "button from Foo Inc.". That
can be used to look up the dictionary(s) of available attributes and
features.

Quote:
EAV's don't need to store nulls. You just don't include that entry. If
a given index engine is not null-friendly, then perhaps an EAV is
better than a sparse table as far as performance.


No. EAVs don't store nulls. But their structure makes creating usable
indexes extremely difficult if not impossible. That makes me question
whether performance would be better with an EAV.
I don't see why this would be the case. Being a "skinny" table, you
can easily index every column. It is hard to out-index that.

Quote:
The idea of a sparse table is ugly because it involves nulls.
Nulls--especially those that indicate 'there shouldn't be a value
here'--should be avoided whenever possible. If you're interested, there
have been many discussions on cdt regarding nulls. If you can wade
through
the flames, you might even find some useful information.

Data has an inherent structure.

Sometimes its "inherent structure" does NOT fit static tables well.
The real world is not always friendly to a given abstraction. And,
although I am against "nulls" for strings, the concept of some kind of
"empty" cell (zero length) is not necessarily a bad thing. (In some
RDBMS, zero length and null are the same thing, in others they are
not.)


I don't think you understand "inherent structure."
I suspect you have a personal pet concept that you gave that label.
But you're free to prove me wrong. "Structure" is usually a human-
built abstract model to approximate or describe some limited aspect of
the real world, and there are many possible ways to model the same
given thing.

Quote:
Stuffing everything into one table, whether
it be an EAV table or a sparse table imposes an alien structure on the
data,
introducing redundancy and complexity and usually reducing performance.
A
solution that has one table per widget type also imposes a structure on
the
data, one that inevitably introduces redundancy. The 'table-happy'
design
that I suggested earlier does not introduce redundancy and does not
require
nulls.

Your solution does NOT reduce redundancy because the foreign key has
to be repeated over and over again. And, empty cells don't
necessarily reduce performance, per above. And, filling up table-space
is ugly and not debugging-friendly. Fat table-spaces slow me down. I
cannot speak for everybody's psychology and hand-eye-mouse-brain
coordination, but it slows down MINE.


By most, foreign key references do not constitute redundancy.
I disagree. For one, it fills up the schemas with duplication
regardless of what is actually in the tables. Human eyes have to sift
over all that.

Quote:
In direct
image systems, foreign key references are problematic because updates must
often physically cascade. But not all systems work that way. Some systems
internally use pointers instead of duplicating what is referenced so that
physical cascades are unnecessary. And don't kid yourself: empty cells
necessarily reduce performance, they just may not reduce it noticably.
I would bet that joins slow things down more than empty cells. But we
can agree to treat it as a wash for this debate.

Quote:
I will respect your choice because the tradeoffs are all sticky either
way. But, I do not agree with it, nor do I agree with your performance
criticisms and redundancy claims. They appear incorrect.

-T-




Reply With Quote
  #3904  
Old   
topmind
 
Posts: n/a

Default Re: EAV (Re: Object-relational impedence) - 04-03-2008 , 12:06 AM





Brian Selzer wrote:
Quote:
"topmind" <topmind (AT) technologist (DOT) com> wrote in message
news:a5fa390d-025e-4708-93e0-588d573faa2d (AT) u10g2000prn (DOT) googlegroups.com...


Brian Selzer wrote:
"topmind" <topmind (AT) technologist (DOT) com> wrote in message
news:a742d466-e6a9-4d13-a1ba-953567fcb3dc (AT) u10g2000prn (DOT) googlegroups.com...
[big snip]

It's a gut feeling for everyone here because it's never been tried in
practice. We cannot really know with near certainty all the possible
usage scenarios (query patterns) and which are more frequent. It's an
educated WAG at best.


Perhaps some testing is in order.

There's another suggestion that nobody's mentioned here that I know
of: Add each new column to the Widgets table as needed, creating a
"sparse" table. If we are going to allow add/change to schemas, then
why not just add the new frippen column to the Widgets table and be
done with it? Sparse tables are not really space hogs on most modern
RDBMS like they were in the past. The overhead for unused (null)
columns is small.


It's not about space. Separate tables can be indexed separately,
permitting
the selection of better query plans. A lot of nulls, such as you would
find
in a sparse table, skews the statistics for columns, making some indexes
that would otherwise appear optimal less attractive, leading to poor
performance.

I doubt queries against sparse columns would be common. Generally
there would first be a query involving the "widget type" and/or
attribute type, which would greatly narrow what is being searched.
And, depending on how the optimizer and index system is built, you
don't need to store Nulls *in* the indexes. Think about it. There are
a few operations where indexing nulls makes sense, but they are not
common enough to bother in my opinion. There are other ways to get the
same info. But, a given vendor may have seen it different. In short,
it depends on the DB engine.


You're entitled to your opinion, even if it's wrong. There may not even be
a need for a "widget type" attribute.
Perhaps "type" is not quite the right description. It would identify a
specific widget implementation, say the "button from Foo Inc.". That
can be used to look up the dictionary(s) of available attributes and
features.

Quote:
EAV's don't need to store nulls. You just don't include that entry. If
a given index engine is not null-friendly, then perhaps an EAV is
better than a sparse table as far as performance.


No. EAVs don't store nulls. But their structure makes creating usable
indexes extremely difficult if not impossible. That makes me question
whether performance would be better with an EAV.
I don't see why this would be the case. Being a "skinny" table, you
can easily index every column. It is hard to out-index that.

Quote:
The idea of a sparse table is ugly because it involves nulls.
Nulls--especially those that indicate 'there shouldn't be a value
here'--should be avoided whenever possible. If you're interested, there
have been many discussions on cdt regarding nulls. If you can wade
through
the flames, you might even find some useful information.

Data has an inherent structure.

Sometimes its "inherent structure" does NOT fit static tables well.
The real world is not always friendly to a given abstraction. And,
although I am against "nulls" for strings, the concept of some kind of
"empty" cell (zero length) is not necessarily a bad thing. (In some
RDBMS, zero length and null are the same thing, in others they are
not.)


I don't think you understand "inherent structure."
I suspect you have a personal pet concept that you gave that label.
But you're free to prove me wrong. "Structure" is usually a human-
built abstract model to approximate or describe some limited aspect of
the real world, and there are many possible ways to model the same
given thing.

Quote:
Stuffing everything into one table, whether
it be an EAV table or a sparse table imposes an alien structure on the
data,
introducing redundancy and complexity and usually reducing performance.
A
solution that has one table per widget type also imposes a structure on
the
data, one that inevitably introduces redundancy. The 'table-happy'
design
that I suggested earlier does not introduce redundancy and does not
require
nulls.

Your solution does NOT reduce redundancy because the foreign key has
to be repeated over and over again. And, empty cells don't
necessarily reduce performance, per above. And, filling up table-space
is ugly and not debugging-friendly. Fat table-spaces slow me down. I
cannot speak for everybody's psychology and hand-eye-mouse-brain
coordination, but it slows down MINE.


By most, foreign key references do not constitute redundancy.
I disagree. For one, it fills up the schemas with duplication
regardless of what is actually in the tables. Human eyes have to sift
over all that.

Quote:
In direct
image systems, foreign key references are problematic because updates must
often physically cascade. But not all systems work that way. Some systems
internally use pointers instead of duplicating what is referenced so that
physical cascades are unnecessary. And don't kid yourself: empty cells
necessarily reduce performance, they just may not reduce it noticably.
I would bet that joins slow things down more than empty cells. But we
can agree to treat it as a wash for this debate.

Quote:
I will respect your choice because the tradeoffs are all sticky either
way. But, I do not agree with it, nor do I agree with your performance
criticisms and redundancy claims. They appear incorrect.

-T-




Reply With Quote
  #3905  
Old   
topmind
 
Posts: n/a

Default Re: EAV (Re: Object-relational impedence) - 04-03-2008 , 12:06 AM





Brian Selzer wrote:
Quote:
"topmind" <topmind (AT) technologist (DOT) com> wrote in message
news:a5fa390d-025e-4708-93e0-588d573faa2d (AT) u10g2000prn (DOT) googlegroups.com...


Brian Selzer wrote:
"topmind" <topmind (AT) technologist (DOT) com> wrote in message
news:a742d466-e6a9-4d13-a1ba-953567fcb3dc (AT) u10g2000prn (DOT) googlegroups.com...
[big snip]

It's a gut feeling for everyone here because it's never been tried in
practice. We cannot really know with near certainty all the possible
usage scenarios (query patterns) and which are more frequent. It's an
educated WAG at best.


Perhaps some testing is in order.

There's another suggestion that nobody's mentioned here that I know
of: Add each new column to the Widgets table as needed, creating a
"sparse" table. If we are going to allow add/change to schemas, then
why not just add the new frippen column to the Widgets table and be
done with it? Sparse tables are not really space hogs on most modern
RDBMS like they were in the past. The overhead for unused (null)
columns is small.


It's not about space. Separate tables can be indexed separately,
permitting
the selection of better query plans. A lot of nulls, such as you would
find
in a sparse table, skews the statistics for columns, making some indexes
that would otherwise appear optimal less attractive, leading to poor
performance.

I doubt queries against sparse columns would be common. Generally
there would first be a query involving the "widget type" and/or
attribute type, which would greatly narrow what is being searched.
And, depending on how the optimizer and index system is built, you
don't need to store Nulls *in* the indexes. Think about it. There are
a few operations where indexing nulls makes sense, but they are not
common enough to bother in my opinion. There are other ways to get the
same info. But, a given vendor may have seen it different. In short,
it depends on the DB engine.


You're entitled to your opinion, even if it's wrong. There may not even be
a need for a "widget type" attribute.
Perhaps "type" is not quite the right description. It would identify a
specific widget implementation, say the "button from Foo Inc.". That
can be used to look up the dictionary(s) of available attributes and
features.

Quote:
EAV's don't need to store nulls. You just don't include that entry. If
a given index engine is not null-friendly, then perhaps an EAV is
better than a sparse table as far as performance.


No. EAVs don't store nulls. But their structure makes creating usable
indexes extremely difficult if not impossible. That makes me question
whether performance would be better with an EAV.
I don't see why this would be the case. Being a "skinny" table, you
can easily index every column. It is hard to out-index that.

Quote:
The idea of a sparse table is ugly because it involves nulls.
Nulls--especially those that indicate 'there shouldn't be a value
here'--should be avoided whenever possible. If you're interested, there
have been many discussions on cdt regarding nulls. If you can wade
through
the flames, you might even find some useful information.

Data has an inherent structure.

Sometimes its "inherent structure" does NOT fit static tables well.
The real world is not always friendly to a given abstraction. And,
although I am against "nulls" for strings, the concept of some kind of
"empty" cell (zero length) is not necessarily a bad thing. (In some
RDBMS, zero length and null are the same thing, in others they are
not.)


I don't think you understand "inherent structure."
I suspect you have a personal pet concept that you gave that label.
But you're free to prove me wrong. "Structure" is usually a human-
built abstract model to approximate or describe some limited aspect of
the real world, and there are many possible ways to model the same
given thing.

Quote:
Stuffing everything into one table, whether
it be an EAV table or a sparse table imposes an alien structure on the
data,
introducing redundancy and complexity and usually reducing performance.
A
solution that has one table per widget type also imposes a structure on
the
data, one that inevitably introduces redundancy. The 'table-happy'
design
that I suggested earlier does not introduce redundancy and does not
require
nulls.

Your solution does NOT reduce redundancy because the foreign key has
to be repeated over and over again. And, empty cells don't
necessarily reduce performance, per above. And, filling up table-space
is ugly and not debugging-friendly. Fat table-spaces slow me down. I
cannot speak for everybody's psychology and hand-eye-mouse-brain
coordination, but it slows down MINE.


By most, foreign key references do not constitute redundancy.
I disagree. For one, it fills up the schemas with duplication
regardless of what is actually in the tables. Human eyes have to sift
over all that.

Quote:
In direct
image systems, foreign key references are problematic because updates must
often physically cascade. But not all systems work that way. Some systems
internally use pointers instead of duplicating what is referenced so that
physical cascades are unnecessary. And don't kid yourself: empty cells
necessarily reduce performance, they just may not reduce it noticably.
I would bet that joins slow things down more than empty cells. But we
can agree to treat it as a wash for this debate.

Quote:
I will respect your choice because the tradeoffs are all sticky either
way. But, I do not agree with it, nor do I agree with your performance
criticisms and redundancy claims. They appear incorrect.

-T-




Reply With Quote
  #3906  
Old   
topmind
 
Posts: n/a

Default Re: EAV (Re: Object-relational impedence) - 04-03-2008 , 12:06 AM





Brian Selzer wrote:
Quote:
"topmind" <topmind (AT) technologist (DOT) com> wrote in message
news:a5fa390d-025e-4708-93e0-588d573faa2d (AT) u10g2000prn (DOT) googlegroups.com...


Brian Selzer wrote:
"topmind" <topmind (AT) technologist (DOT) com> wrote in message
news:a742d466-e6a9-4d13-a1ba-953567fcb3dc (AT) u10g2000prn (DOT) googlegroups.com...
[big snip]

It's a gut feeling for everyone here because it's never been tried in
practice. We cannot really know with near certainty all the possible
usage scenarios (query patterns) and which are more frequent. It's an
educated WAG at best.


Perhaps some testing is in order.

There's another suggestion that nobody's mentioned here that I know
of: Add each new column to the Widgets table as needed, creating a
"sparse" table. If we are going to allow add/change to schemas, then
why not just add the new frippen column to the Widgets table and be
done with it? Sparse tables are not really space hogs on most modern
RDBMS like they were in the past. The overhead for unused (null)
columns is small.


It's not about space. Separate tables can be indexed separately,
permitting
the selection of better query plans. A lot of nulls, such as you would
find
in a sparse table, skews the statistics for columns, making some indexes
that would otherwise appear optimal less attractive, leading to poor
performance.

I doubt queries against sparse columns would be common. Generally
there would first be a query involving the "widget type" and/or
attribute type, which would greatly narrow what is being searched.
And, depending on how the optimizer and index system is built, you
don't need to store Nulls *in* the indexes. Think about it. There are
a few operations where indexing nulls makes sense, but they are not
common enough to bother in my opinion. There are other ways to get the
same info. But, a given vendor may have seen it different. In short,
it depends on the DB engine.


You're entitled to your opinion, even if it's wrong. There may not even be
a need for a "widget type" attribute.
Perhaps "type" is not quite the right description. It would identify a
specific widget implementation, say the "button from Foo Inc.". That
can be used to look up the dictionary(s) of available attributes and
features.

Quote:
EAV's don't need to store nulls. You just don't include that entry. If
a given index engine is not null-friendly, then perhaps an EAV is
better than a sparse table as far as performance.


No. EAVs don't store nulls. But their structure makes creating usable
indexes extremely difficult if not impossible. That makes me question
whether performance would be better with an EAV.
I don't see why this would be the case. Being a "skinny" table, you
can easily index every column. It is hard to out-index that.

Quote:
The idea of a sparse table is ugly because it involves nulls.
Nulls--especially those that indicate 'there shouldn't be a value
here'--should be avoided whenever possible. If you're interested, there
have been many discussions on cdt regarding nulls. If you can wade
through
the flames, you might even find some useful information.

Data has an inherent structure.

Sometimes its "inherent structure" does NOT fit static tables well.
The real world is not always friendly to a given abstraction. And,
although I am against "nulls" for strings, the concept of some kind of
"empty" cell (zero length) is not necessarily a bad thing. (In some
RDBMS, zero length and null are the same thing, in others they are
not.)


I don't think you understand "inherent structure."
I suspect you have a personal pet concept that you gave that label.
But you're free to prove me wrong. "Structure" is usually a human-
built abstract model to approximate or describe some limited aspect of
the real world, and there are many possible ways to model the same
given thing.

Quote:
Stuffing everything into one table, whether
it be an EAV table or a sparse table imposes an alien structure on the
data,
introducing redundancy and complexity and usually reducing performance.
A
solution that has one table per widget type also imposes a structure on
the
data, one that inevitably introduces redundancy. The 'table-happy'
design
that I suggested earlier does not introduce redundancy and does not
require
nulls.

Your solution does NOT reduce redundancy because the foreign key has
to be repeated over and over again. And, empty cells don't
necessarily reduce performance, per above. And, filling up table-space
is ugly and not debugging-friendly. Fat table-spaces slow me down. I
cannot speak for everybody's psychology and hand-eye-mouse-brain
coordination, but it slows down MINE.


By most, foreign key references do not constitute redundancy.
I disagree. For one, it fills up the schemas with duplication
regardless of what is actually in the tables. Human eyes have to sift
over all that.

Quote:
In direct
image systems, foreign key references are problematic because updates must
often physically cascade. But not all systems work that way. Some systems
internally use pointers instead of duplicating what is referenced so that
physical cascades are unnecessary. And don't kid yourself: empty cells
necessarily reduce performance, they just may not reduce it noticably.
I would bet that joins slow things down more than empty cells. But we
can agree to treat it as a wash for this debate.

Quote:
I will respect your choice because the tradeoffs are all sticky either
way. But, I do not agree with it, nor do I agree with your performance
criticisms and redundancy claims. They appear incorrect.

-T-




Reply With Quote
  #3907  
Old   
topmind
 
Posts: n/a

Default Re: EAV (Re: Object-relational impedence) - 04-03-2008 , 12:06 AM





Brian Selzer wrote:
Quote:
"topmind" <topmind (AT) technologist (DOT) com> wrote in message
news:a5fa390d-025e-4708-93e0-588d573faa2d (AT) u10g2000prn (DOT) googlegroups.com...


Brian Selzer wrote:
"topmind" <topmind (AT) technologist (DOT) com> wrote in message
news:a742d466-e6a9-4d13-a1ba-953567fcb3dc (AT) u10g2000prn (DOT) googlegroups.com...
[big snip]

It's a gut feeling for everyone here because it's never been tried in
practice. We cannot really know with near certainty all the possible
usage scenarios (query patterns) and which are more frequent. It's an
educated WAG at best.


Perhaps some testing is in order.

There's another suggestion that nobody's mentioned here that I know
of: Add each new column to the Widgets table as needed, creating a
"sparse" table. If we are going to allow add/change to schemas, then
why not just add the new frippen column to the Widgets table and be
done with it? Sparse tables are not really space hogs on most modern
RDBMS like they were in the past. The overhead for unused (null)
columns is small.


It's not about space. Separate tables can be indexed separately,
permitting
the selection of better query plans. A lot of nulls, such as you would
find
in a sparse table, skews the statistics for columns, making some indexes
that would otherwise appear optimal less attractive, leading to poor
performance.

I doubt queries against sparse columns would be common. Generally
there would first be a query involving the "widget type" and/or
attribute type, which would greatly narrow what is being searched.
And, depending on how the optimizer and index system is built, you
don't need to store Nulls *in* the indexes. Think about it. There are
a few operations where indexing nulls makes sense, but they are not
common enough to bother in my opinion. There are other ways to get the
same info. But, a given vendor may have seen it different. In short,
it depends on the DB engine.


You're entitled to your opinion, even if it's wrong. There may not even be
a need for a "widget type" attribute.
Perhaps "type" is not quite the right description. It would identify a
specific widget implementation, say the "button from Foo Inc.". That
can be used to look up the dictionary(s) of available attributes and
features.

Quote:
EAV's don't need to store nulls. You just don't include that entry. If
a given index engine is not null-friendly, then perhaps an EAV is
better than a sparse table as far as performance.


No. EAVs don't store nulls. But their structure makes creating usable
indexes extremely difficult if not impossible. That makes me question
whether performance would be better with an EAV.
I don't see why this would be the case. Being a "skinny" table, you
can easily index every column. It is hard to out-index that.

Quote:
The idea of a sparse table is ugly because it involves nulls.
Nulls--especially those that indicate 'there shouldn't be a value
here'--should be avoided whenever possible. If you're interested, there
have been many discussions on cdt regarding nulls. If you can wade
through
the flames, you might even find some useful information.

Data has an inherent structure.

Sometimes its "inherent structure" does NOT fit static tables well.
The real world is not always friendly to a given abstraction. And,
although I am against "nulls" for strings, the concept of some kind of
"empty" cell (zero length) is not necessarily a bad thing. (In some
RDBMS, zero length and null are the same thing, in others they are
not.)


I don't think you understand "inherent structure."
I suspect you have a personal pet concept that you gave that label.
But you're free to prove me wrong. "Structure" is usually a human-
built abstract model to approximate or describe some limited aspect of
the real world, and there are many possible ways to model the same
given thing.

Quote:
Stuffing everything into one table, whether
it be an EAV table or a sparse table imposes an alien structure on the
data,
introducing redundancy and complexity and usually reducing performance.
A
solution that has one table per widget type also imposes a structure on
the
data, one that inevitably introduces redundancy. The 'table-happy'
design
that I suggested earlier does not introduce redundancy and does not
require
nulls.

Your solution does NOT reduce redundancy because the foreign key has
to be repeated over and over again. And, empty cells don't
necessarily reduce performance, per above. And, filling up table-space
is ugly and not debugging-friendly. Fat table-spaces slow me down. I
cannot speak for everybody's psychology and hand-eye-mouse-brain
coordination, but it slows down MINE.


By most, foreign key references do not constitute redundancy.
I disagree. For one, it fills up the schemas with duplication
regardless of what is actually in the tables. Human eyes have to sift
over all that.

Quote:
In direct
image systems, foreign key references are problematic because updates must
often physically cascade. But not all systems work that way. Some systems
internally use pointers instead of duplicating what is referenced so that
physical cascades are unnecessary. And don't kid yourself: empty cells
necessarily reduce performance, they just may not reduce it noticably.
I would bet that joins slow things down more than empty cells. But we
can agree to treat it as a wash for this debate.

Quote:
I will respect your choice because the tradeoffs are all sticky either
way. But, I do not agree with it, nor do I agree with your performance
criticisms and redundancy claims. They appear incorrect.

-T-




Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.