dbTalk Databases Forums  

table configuration tweak for performance gain.

comp.databases.postgresql.general comp.databases.postgresql.general


Discuss table configuration tweak for performance gain. in the comp.databases.postgresql.general forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Harvey, Allan AC
 
Posts: n/a

Default table configuration tweak for performance gain. - 11-14-2004 , 11:25 PM






Hi all,
I was hoping someone might be able to set me straight
so that I can get some select performance improvements.

The table in question has several thousand rows currently,
planning several million, it has a date time stamp column.
All selects will be on ranges of the date time column.

I created an index on the dt column, ran ANALYSE then,
EXPLAIN for some queries.
The returned plan was always sequential search.
The SELECT time also indicates sequential search.

Is there anything I can do to improve SELECT performance
on what is essentially an ordered table?

Thanks in advance.
Allan

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings


Reply With Quote
  #2  
Old   
Michael Fuhr
 
Posts: n/a

Default Re: table configuration tweak for performance gain. - 11-14-2004 , 11:59 PM






On Mon, Nov 15, 2004 at 04:25:59PM +1100, Harvey, Allan AC wrote:

Quote:
The table in question has several thousand rows currently,
planning several million, it has a date time stamp column.
All selects will be on ranges of the date time column.

I created an index on the dt column, ran ANALYSE then,
EXPLAIN for some queries.
The returned plan was always sequential search.
The SELECT time also indicates sequential search.

Is there anything I can do to improve SELECT performance
on what is essentially an ordered table?
Could you post a query and the EXPLAIN ANALYZE output? We could
probably give better advice if we could see what's happening.

Have you experimented with lowering random_page_cost?

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html



Reply With Quote
  #3  
Old   
Tom Lane
 
Posts: n/a

Default Re: table configuration tweak for performance gain. - 11-15-2004 , 09:26 AM



Michael Fuhr <mike (AT) fuhr (DOT) org> writes:
Quote:
On Mon, Nov 15, 2004 at 04:25:59PM +1100, Harvey, Allan AC wrote:
I created an index on the dt column, ran ANALYSE then,
EXPLAIN for some queries.
The returned plan was always sequential search.

Could you post a query and the EXPLAIN ANALYZE output? We could
probably give better advice if we could see what's happening.
Also, let's see EXPLAIN ANALYZE results after setting enable_seqscan to
OFF. If that doesn't force it into an indexscan, then you have got
more fundamental issues (perhaps a datatype mismatch). Note that I
don't recommend turning off enable_seqscan as a production solution;
but it's a useful tool for debugging.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo (AT) postgresql (DOT) org



Reply With Quote
  #4  
Old   
Harvey, Allan AC
 
Posts: n/a

Default Re: table configuration tweak for performance gain. - 11-15-2004 , 05:39 PM



Tom, Michael,

Thanks for your interests. My original post was scant
on detail as I was unsure if I had found the right place.
It appears I have, so...

Version is 7.4.5

Table size these tests were carried out on:-
mill2=> select count(*) from history\g
count
--------
258606
(1 row)

Before index:-
mill2=> \d history
Table "public.history"
Column | Type | Modifiers
-----------+-----------------------------+-----------
pointname | character varying(32) | not null
parameter | character varying(8) | not null
value | double precision | not null
dt | timestamp without time zone | not null

snip.....
WARNING: skipping "pg_conversion" --- only table or database owner can analyze it
WARNING: skipping "pg_depend" --- only table or database owner can analyzeit
ANALYZE
mill2=> explain select value from history where pointname = 'MILL2-SPEED' and dt < now() - interval '5 minutes'\g
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Seq Scan on history (cost=0.00..8276.82 rows=8982 width=8)
Filter: (((pointname)::text = 'MILL2-SPEED'::text) AND ((dt)::timestamp with time zone < (now() - '00:05:00'::interval)))
(2 rows)

After index:-
mill2=> create index dtindex on history( dt )\g
CREATE INDEX
mill2=> \d history
Table "public.history"
Column | Type | Modifiers
-----------+-----------------------------+-----------
pointname | character varying(32) | not null
parameter | character varying(8) | not null
value | double precision | not null
dt | timestamp without time zone | not null
Indexes:
"dtindex" btree (dt)

snip....
WARNING: skipping "pg_conversion" --- only table or database owner can analyze it
WARNING: skipping "pg_depend" --- only table or database owner can analyzeit
ANALYZE
mill2=> explain select value from history where pointname = 'MILL2-SPEED' and dt < now() - interval '5 minutes'\g
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Seq Scan on history (cost=0.00..8263.19 rows=9342 width=8)
Filter: (((pointname)::text = 'MILL2-SPEED'::text) AND ((dt)::timestamp with time zone < (now() - '00:05:00'::interval)))
(2 rows)

Quote:
don't recommend turning off enable_seqscan as a production solution
On your advise I did not go there.

On using BETWEEN:-
mill2=> select value from history where pointname = 'MILL2-SPEED' and dt between now() and now() - interval '5 minutes'\g
value
-------
(0 rows)

mill2=> select value from history where pointname = 'MILL2-SPEED' and dt < now() - interval '5 minutes'\g
value
---------
85606.9
85606.9
85606.9
85606.9
85606.9
85606.9
etc.....

I have obviously used it wrong but cannot see how/why.

Thanks again.

Allan




Quote:
-----Original Message-----
From: Tom Lane [mailto:tgl (AT) sss (DOT) pgh.pa.us]
Sent: Tuesday, 16 November 2004 2:26
To: Michael Fuhr
Cc: Harvey, Allan AC; pgsql-general (AT) postgresql (DOT) org
Subject: Re: [GENERAL] table configuration tweak for
performance gain.


Michael Fuhr <mike (AT) fuhr (DOT) org> writes:
On Mon, Nov 15, 2004 at 04:25:59PM +1100, Harvey, Allan AC wrote:
I created an index on the dt column, ran ANALYSE then,
EXPLAIN for some queries.
The returned plan was always sequential search.

Could you post a query and the EXPLAIN ANALYZE output? We could
probably give better advice if we could see what's happening.

Also, let's see EXPLAIN ANALYZE results after setting
enable_seqscan to
OFF. If that doesn't force it into an indexscan, then you have got
more fundamental issues (perhaps a datatype mismatch). Note that I
don't recommend turning off enable_seqscan as a production solution;
but it's a useful tool for debugging.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster



Reply With Quote
  #5  
Old   
Michael Fuhr
 
Posts: n/a

Default Re: table configuration tweak for performance gain. - 11-16-2004 , 01:29 AM



On Tue, Nov 16, 2004 at 10:39:10AM +1100, Harvey, Allan AC wrote:

Quote:
mill2=> explain select value from history where pointname = 'MILL2-SPEED' and dt < now() - interval '5 minutes'\g
We were looking for the output from "explain analyze select ...."
With EXPLAIN ANALYZE we can see how realistic the planner's estimates
were.

Quote:
Seq Scan on history (cost=0.00..8263.19 rows=9342 width=8)
Filter: (((pointname)::text = 'MILL2-SPEED'::text) AND ((dt)::timestamp with time zone < (now() - '00:05:00'::interval)))
You declared dt to be TIMESTAMP WITHOUT TIME ZONE, so its index
won't be used because the filter's type is TIMESTAMP WITH TIME ZONE.
Try casting the filter to dt's type by using now()::TIMESTAMP (this
won't be necessary in 8.0).

Aside from the type issue, the planner estimates that the query
will return 9342 rows, so even if it could use an index it might
think a sequential scan will be faster. The output from EXPLAIN
ANALYZE would tell us if that guess is correct.

Please show us the output of EXPLAIN ANALYZE after you've modified
the query to use now()::TIMESTAMP. If the query still does a
sequential scan then execute "SET enable_seqscan TO off", run EXPLAIN
ANALYZE again, and show us that output as well.

Quote:
On using BETWEEN:-
mill2=> select value from history where pointname = 'MILL2-SPEED' and dt between now() and now() - interval '5 minutes'\g
value
-------
(0 rows)
"a BETWEEN x AND y" is equivalent to "a >= x AND a <= y", so
if x and y aren't chosen correctly then the expression will
always evaluate to false:

SELECT 5 BETWEEN 1 AND 10;
?column?
----------
t

SELECT 5 BETWEEN 10 AND 1;
?column?
----------
f

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo (AT) postgresql (DOT) org so that your
message can get through to the mailing list cleanly



Reply With Quote
  #6  
Old   
Harvey, Allan AC
 
Posts: n/a

Default Re: table configuration tweak for performance gain. - 11-16-2004 , 07:41 PM



Michael,

I don't mind at all.

3.244 ms compared to 15706.179 ms.
A sizeable difference.

Screen scraps follow.

Allan

mill2=> set enable_seqscan=off\g
SET
mill2=> explain analyse select count(*) from history where pointname = 'MILL2-SPEED' and dt > (now() - interval '5 minutes')::timestamp\g
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=34815.05..34815.05 rows=1 width=0) (actual time=2..267..2.275 rows=1 loops=1)
-> Index Scan using dtindex on history (cost=0.00..34783.32 rows=12690 width=0) (actual time=1.931..1.931 rows=0 loops=1)
Index Cond: (((pointname)::text = 'MILL2-SPEED'::text) AND (dt >((now() - '00:05:00'::interval))::timestamp without time zone))
Total runtime: 3.244 ms
(4 rows)


mill2=> set enable_seqscan=on\g
SET
mill2=> select count(*) from history where pointname = 'MILL2-SPEED' and dt > (now() - interval '5 minutes')::timestamp\g
count
-------
0
(1 row)

mill2=> explain analyse select count(*) from history where pointname = 'MILL2-SPEED' and dt > (now() - interval '5 minutes')::timestamp\g
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=10163.50..10163.50 rows=1 width=0) (actual time=15705.388..15705.395 rows=1 loops=1)
-> Seq Scan on history (cost=0.00..10131.77 rows=12690 width=0) (actual time=15705.286..15705.286 rows=0 loops=1)
Filter: (((pointname)::text = 'MILL2-SPEED'::text) AND (dt > ((now() - '00:05:00'::interval))::timestamp without time zone))
Total runtime: 15706.179 ms
(4 rows)




Quote:
-----Original Message-----
From: Michael Fuhr [mailto:mike (AT) fuhr (DOT) org]
Sent: Wednesday, 17 November 2004 12:19
To: Harvey, Allan AC
Cc: Tom Lane
Subject: Re: [GENERAL] table configuration tweak for performance gain.


On Wed, Nov 17, 2004 at 09:39:11AM +1100, Harvey, Allan AC wrote:

The solution then was:-
an index of the right columns.
explicit, exact type casting ( I'm a casual ingres user, type
casting is something I never need or think you can do)

PostgreSQL 8.0 will allow cross-type index usage, making the explicit
cast unnecessary.

turning off enable_seqscan for specific queries seemed
to help a bit too.

Performance in general might improve if you address the planner's
reasons for chosing an inefficient plan. Even though you're satisifed
with performance now, would you mind posting the output of "EXPLAIN
ANALYZE select ..." with enable_seqscan on and then with it off?
If nothing else, an analysis might be educational for others.

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match



Reply With Quote
  #7  
Old   
Michael Fuhr
 
Posts: n/a

Default Re: table configuration tweak for performance gain. - 11-16-2004 , 08:45 PM



On Wed, Nov 17, 2004 at 12:41:20PM +1100, Harvey, Allan AC wrote:

Quote:
mill2=> explain analyse select count(*) from history where pointname = 'MILL2-SPEED' and dt > (now() - interval '5 minutes')::timestamp\g
How typical is this query? How many distinct values does pointname
have? Have you considered adding an index on pointname or a multicolumn
index on pointname and dt?

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match



Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.