dbTalk Databases Forums  

Design doubt: better extra columns or extra queries?

comp.databases comp.databases


Discuss Design doubt: better extra columns or extra queries? in the comp.databases forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
lookmarge@gmail.com
 
Posts: n/a

Default Design doubt: better extra columns or extra queries? - 10-01-2006 , 12:22 AM






Imagine the following scenario:

Article
- aid
- title
- body
- user_id

Votes
- vid
- rate
- user_id
- article_id

Votes keeps a list of tuples composed by the rate, the article being
voted and the user who is casting his/her vote. Assume that a vote is a
number between 0 and 10. I need this table, in order to make sure that
a user votes only once and also for statistics purposes. In my web
application, every time I display an article, I need to show what's the
average vote for that article.

Quote:
From a database design viewpoint, what's the best approach from the
followings?

1) Add the columns "sum_of_votes" and "number_of_votes" to the
'Article' table, in this way I won't have to query 'Votes' everytime I
need to just display an average value. I will just retrieve
sum_of_votes/number_of_votes to get the average. Everytime a vote is
cast, beside the insert into 'Votes' I will also perform an insert into
'Article' to update the values of "sum_of_votes" and "number_of_votes".

2) Keep the database structure as it is, and perform a query on 'Votes'
to retrieve the average of all the votes for the given article. 'Votes'
I assume, can become a pretty big table.

3) Other solution...

I often encounter a similar situation for average, sums, or similar
data that needs to be retrieved, and I wonder what's the best approach.
Please help me understand and learn from your expertise.

Thank-you.



Reply With Quote
  #2  
Old   
Stefan Nobis
 
Posts: n/a

Default Re: Design doubt: better extra columns or extra queries? - 10-01-2006 , 03:33 AM






lookmarge (AT) gmail (DOT) com writes:

Quote:
Imagine the following scenario:

Article
- aid
- title
- body
- user_id

Votes
- vid
- rate
- user_id
- article_id
Hmmm... Why not just this:

Votes
- user_id [unique]
- article_id
- rate

with (user_id, article_id) as primary key?

Quote:
1) Add the columns "sum_of_votes" and "number_of_votes" to the
'Article' table, in this way I won't have to query 'Votes' everytime
This would be a performance hack and you should only do it, if it's
really, really necessary (and even then think twice about it). To
abstract this away (so in the really unlikely case you need this
terrible performance hack, you can implement it without changing the
application code; btw you should use database triggers to implement
this hack, don't do it via application code) you could use a view (or
maybe two) that give you all the needed data.

Quote:
2) Keep the database structure as it is, and perform a query on 'Votes'
to retrieve the average of all the votes for the given article. 'Votes'
I assume, can become a pretty big table.
How big may that be? Such a simple table with about some dozens of
million entries should be no problem to query even on average PC
hardware and mediocre DBMS.

Quote:
I often encounter a similar situation for average, sums, or similar
data that needs to be retrieved, and I wonder what's the best
approach.
Let the DBMS do the job -- that's (part of) what they are designed
for. Why do you think are aggregate functions like AVG, SUM,
COUNT,... are part oft SQL? Only use performance hacks if really
everything else (optimize query, use of indices, performance tuning of
the DBMS and/or OS, faster disks (dedicated database disk, RAID-10,
controller cache,...) or hardware and the like) fails or is just not
affordable. Data integrety is very valuable -- don't throw it away
easily.

--
Stefan.


Reply With Quote
  #3  
Old   
Stefan Nobis
 
Posts: n/a

Default Re: Design doubt: better extra columns or extra queries? - 10-01-2006 , 03:43 AM



lookmarge (AT) gmail (DOT) com writes:

Quote:
1) Add the columns "sum_of_votes" and "number_of_votes" to the
'Article' table, in this way I won't have to query 'Votes' everytime
BTW: premature optimization is the root of all evil!

--
Stefan.


Reply With Quote
  #4  
Old   
--CELKO--
 
Posts: n/a

Default Re: Design doubt: better extra columns or extra queries? - 10-02-2006 , 02:05 PM



Please post DDL, so that people do not have to guess what the keys,
constraints, Declarative Referential Integrity, data types, etc. in
your schema are. Could you also follow ISO-11179 rules for data
element names, too?

The first problem in your schema is that a user is not an attribute of
an Article. A user is a separatye entity.

CREATE TABLE Articles -- ther is more than one?
(article_id INTERGER NOT NULL PRIMARY KEY, --issn code?
article_title VARCHAR (50) NOT NULL,
article_body VARCHAR (50000) NOT NULL);

CREATE TABLE Users -- there is more than one?
(user_id CHAR(9)NOT NULL PRIMARY KEY, -- ssn?
user_name VARCHAR (35) NOT NULL,
etc. );

voting is a relationship, so it is in its own table:

CREATE TABLE Votes
(article_id INTERGER NOT NULL
REFERENCES Articles (article_id)
user_id CHAR(9) NOT NULL
REFERENCES Users(user_id),
article_score INTEGER DEFAULT 0 NOT NULL
CHECK (article_score BETWEEN 0 AND 10),
PRIMARY KEY ((article_id, user_id));

Quote:
I need to show what's the average vote for that article.
Use a VIEW which is always current instead of hard-wiring it into a
column, as if you were still using a 1950's file system.

CREATE VIEW AvgVotes (article_id, score_avg, voter_tally)
AS
SELECT article_id, AVG(article_score), COUNT(*)
FROM Votes
GROUP BY article_id;



Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.