dbTalk Databases Forums  

How to write a report?

comp.databases.postgresql comp.databases.postgresql


Discuss How to write a report? in the comp.databases.postgresql forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Perry Smith
 
Posts: n/a

Default How to write a report? - 12-11-2006 , 11:48 AM






I have a database with the log entries from my Apache log. Right now I
have only one table which is the log entry split into fields. The two
intersting columns is a date and a path. Each access (every file for
every user for every web hit) creates a log entry.

Lets suppose I create a second table of "interesting paths". This
table would be a path (this time the path would be a unique key) and a
boolean of true or false -- lets assume the default would be false. I
would make entries into this table automatically as I add rows to the
first table (if a row needs to be added) and let the boolean default to
false (not interesting). Later, I would look at this table and flip
the boolean to true for the paths that I see as interesting. Over
time, I may be able to do this with a script using regular
expressions... I'm not sure -- but thats not where I'm needing help.

Now, I'd like to have a report produced. It would take a date range.
For each interesting file it would print out a line for each day with
the number of hits for that file on that day. At the end of the list
for that particular file would be some statistics like total hits over
the date range and average hits over the date range.

Can someone get me started? I've never used the count and average type
facilities in any DB. I think all this could be done with one happy
mega select but I get lost with the grouping and other directives. I'm
happy to add in whatever I need to such views or internal procedures to
get this to work halfway effeciently.

A few things that would help in particular:

1) The date is a full date/time stamp with time zone. How is a good
way to select all the entries for a particular day?

2) I'd like to understand how do you "think" when you try to write a
non-trivial select using aggregate functions to produce a "report".

3) Do you try and do it in one select or do you break it down and have
some language drive multiple selects? (at what point do you stop
trying to do everything with one mega-select?)

Thank you for your help,
--
Perry Smith (*pedz (AT) easesoftware (DOT) com )
Ease Software, Inc. (*http://www.easesoftware.com )

Low cost SATA Disk Systems for IBMs p5, pSeries, and RS/6000 AIX systems


Reply With Quote
  #2  
Old   
Laurenz Albe
 
Posts: n/a

Default Re: How to write a report? - 12-12-2006 , 02:44 AM






Perry Smith <pedz (AT) easesoftware (DOT) com> wrote:
Quote:
I have a database with the log entries from my Apache log. Right now I
have only one table which is the log entry split into fields. The two
intersting columns is a date and a path. Each access (every file for
every user for every web hit) creates a log entry.

Lets suppose I create a second table of "interesting paths". This
table would be a path (this time the path would be a unique key) and a
Just a note of caution:
- an index on a long column will be large - if your path names become
too long ( > 2713 bytes), you will not be able to index them at all.

Quote:
boolean of true or false -- lets assume the default would be false. I
would make entries into this table automatically as I add rows to the
first table (if a row needs to be added) and let the boolean default to
false (not interesting). Later, I would look at this table and flip
the boolean to true for the paths that I see as interesting. Over
time, I may be able to do this with a script using regular
expressions... I'm not sure -- but thats not where I'm needing help.

Now, I'd like to have a report produced. It would take a date range.
For each interesting file it would print out a line for each day with
the number of hits for that file on that day. At the end of the list
for that particular file would be some statistics like total hits over
the date range and average hits over the date range.

Can someone get me started? I've never used the count and average type
facilities in any DB. I think all this could be done with one happy
mega select but I get lost with the grouping and other directives. I'm
happy to add in whatever I need to such views or internal procedures to
get this to work halfway effeciently.

A few things that would help in particular:

1) The date is a full date/time stamp with time zone. How is a good
way to select all the entries for a particular day?
WHERE date_trunc('day', timestamp_field) = '2006-12-12'

Note that if you want to take advantage of an index with this query,
you'll have to create an index based on an expression:

CREATE INDEX iname ON tabname (date_trunc('day', timestamp_field));

Quote:
2) I'd like to understand how do you "think" when you try to write a
non-trivial select using aggregate functions to produce a "report".
This question cannot be answered in general, but for your case think
like this:
- Identify all the tables from which you need information.
- Join them via the foreign keys (make sure there are indices on the
foreign keys).
- Add conditions so you get only those rows that are interesting to you.
- Use a GROUP BY condition that groups together all the rows that
you want to appear as a single line in the output. All sums will be
calculated per group.

Quote:
3) Do you try and do it in one select or do you break it down and have
some language drive multiple selects? (at what point do you stop
trying to do everything with one mega-select?)
I try to do it in one select because it is usually more efficient.
I resort to something else if I have good reasons for it.

In your case, I'd use a combination:

- Write an SQL query that returns the results, grouped by file and day,
counting the number of hits per file and day.
- While you process the result set in the programming language of your
choice, sum up the individual counts for a grand total line that you
output in the end.
This way you can save yourself running a second query for the grand total,
which would take as long as the first query!

I hope this gives you some hints.

Yours,
Laurenz Albe


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.