dbTalk Databases Forums  

[BUGS] pg_autovacuum: short, wide tables

mailing.database.pgsql-bugs mailing.database.pgsql-bugs


Discuss [BUGS] pg_autovacuum: short, wide tables in the mailing.database.pgsql-bugs forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
mark reid
 
Posts: n/a

Default [BUGS] pg_autovacuum: short, wide tables - 07-07-2005 , 02:26 PM






Hi,

I've been using pg_autovacuum for a while, and for the most part it's
been great. There's one case in my system where it won't run on a
particular type of table, even though the table apparently needs it.

I have a table called "properties" that has key->value pairs. Usually
there are only a handful of rows, some of which are updated relatively
frequently compared to the number of rows (hundreds or thousands of
times per day). The problem is that some of the rows have long strings
for their value (on the order of a few hundred kilobytes), so if I
understand correctly, the bulk of the data gets offloaded to a toast
table.

What I believe is happening is that the main table doesn't meet the
minimum activity level for pg_autovacuum based on size / update
frequency, but the toast table would, though it isn't specifically
checked by pg_autovacuum. The result is that the toast table grows
really big before triggering autovacuum (or until I manually vacuum the
"properties" table). Not the end of the world, obviously, but might be
a "gotcha" for some people with similar situations.

Below is a snippet of output from a run of vacuumdb --full --analyze
--verbose that should illustrate the problem.

-Mark.

Table Def:
Table "schema_name.properties"
Column | Type | Modifiers
--------+-------------------+-----------
name | character varying |
value | character varying |
Indexes:
"properties_name_key" unique, btree (name)


Vacuum verbose output:

INFO: vacuuming "schema_name.properties"
INFO: "properties": found 1361 removable, 8 nonremovable row versions
in 172 pages
DETAIL: 0 dead row versions cannot be removed yet.
Nonremovable row versions range from 63 to 1705 bytes long.
There were 4827 unused item pointers.
Total free space (including removable row versions) is 1376288 bytes.
164 pages are or will become empty, including 0 at the end of the table.
172 pages containing 1376288 free bytes are potential move destinations.
CPU 0.00s/0.00u sec elapsed 0.06 sec.
INFO: index "properties_name_key" now contains 8 row versions in 15 pages
DETAIL: 1361 index row versions were removed.
8 index pages have been deleted, 8 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO: "properties": moved 8 row versions, truncated 172 to 1 pages
DETAIL: CPU 0.00s/0.00u sec elapsed 0.02 sec.
INFO: index "properties_name_key" now contains 8 row versions in 15 pages
DETAIL: 8 index row versions were removed.
8 index pages have been deleted, 8 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: vacuuming "pg_toast.pg_toast_10043014"
INFO: "pg_toast_10043014": found 21052 removable, 24 nonremovable row
versions in 21100 pages
DETAIL: 0 dead row versions cannot be removed yet.
Nonremovable row versions range from 1117 to 2030 bytes long.
There were 63278 unused item pointers.
Total free space (including removable row versions) is 172044376 bytes.
21093 pages are or will become empty, including 0 at the end of the table.
21096 pages containing 172044264 free bytes are potential move destinations.
CPU 0.41s/0.06u sec elapsed 3.63 sec.
INFO: index "pg_toast_10043014_index" now contains 24 row versions in
321 pages
DETAIL: 21052 index row versions were removed.
317 index pages have been deleted, 317 are currently reusable.
CPU 0.00s/0.01u sec elapsed 0.40 sec.
INFO: "pg_toast_10043014": moved 24 row versions, truncated 21100 to 6
pages
DETAIL: CPU 0.32s/1.04u sec elapsed 5.27 sec.
INFO: index "pg_toast_10043014_index" now contains 24 row versions in
321 pages
DETAIL: 24 index row versions were removed.
317 index pages have been deleted, 317 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: analyzing "schema_name.properties"
INFO: "properties": 1 pages, 8 rows sampled, 8 estimated total rows

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo (AT) postgresql (DOT) org

Reply With Quote
  #2  
Old   
Matthew T. O'Connor
 
Posts: n/a

Default Re: [BUGS] pg_autovacuum: short, wide tables - 07-08-2005 , 09:10 AM






mark reid wrote:

Quote:
I've been using pg_autovacuum for a while, and for the most part it's
been great. There's one case in my system where it won't run on a
particular type of table, even though the table apparently needs it.

I have a table called "properties" that has key->value pairs. Usually
there are only a handful of rows, some of which are updated relatively
frequently compared to the number of rows (hundreds or thousands of
times per day). The problem is that some of the rows have long
strings for their value (on the order of a few hundred kilobytes), so
if I understand correctly, the bulk of the data gets offloaded to a
toast table.
What I believe is happening is that the main table doesn't meet the
minimum activity level for pg_autovacuum based on size / update
frequency, but the toast table would, though it isn't specifically
checked by pg_autovacuum. The result is that the toast table grows
really big before triggering autovacuum (or until I manually vacuum
the "properties" table). Not the end of the world, obviously, but
might be a "gotcha" for some people with similar situations.

I don't think the problem has to do with toast, or pg_autovacuum missing
the fact that the toast table has been updated. Rather I think the
problem is that autovacuum believes that all updates are created equal.
That is 1 update is 1 update even though a single update may effect 1
page or thousands of pages. This is where FSM data needs to be
integrated into pg_autovacuum. This isn't going to happen soon (not for
8.1) but it is definatley planned.

With the inclusion of autovacuum into the backend for 8.1, you will at
least be able to set per table thresholds and set a more aggressive
threshold for this table.

Matt



---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org


Reply With Quote
  #3  
Old   
Tom Lane
 
Posts: n/a

Default Re: [BUGS] pg_autovacuum: short, wide tables - 07-08-2005 , 10:24 AM



"Matthew T. O'Connor" <matthew (AT) tocr (DOT) com> writes:
Quote:
mark reid wrote:
What I believe is happening is that the main table doesn't meet the
minimum activity level for pg_autovacuum based on size / update
frequency, but the toast table would, though it isn't specifically
checked by pg_autovacuum.

I don't think the problem has to do with toast, or pg_autovacuum missing
the fact that the toast table has been updated. Rather I think the
problem is that autovacuum believes that all updates are created equal.
I think Mark is probably on to something. The activity in the toast
table will show as deletes *in the toast table* ... and that activity
fails to show at all in the pg_stat_activity view, because it shows
only plain relations! So unless autovacuum is ignoring the stats views
and going directly to the underlying stats functions, it cannot see
at all that there is excessive activity in the toast table.

It strikes me that this is a definitional bug in the stats views.
Either we should change the filter to be "regular and toast tables",
or we should add columns to show activity in a table's toast table,
or we should just add the activity in the toast table to the parent
table's activity columns.

The first of these would be easiest but it seems quite likely to break
applications (eg, if unmodified, autovacuum would probably try to issue
vacuums against toast tables). And the last seems to be confusing.
So I think I favor adding columns.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend


Reply With Quote
  #4  
Old   
Matthew T. O'Connor
 
Posts: n/a

Default Re: [BUGS] pg_autovacuum: short, wide tables - 07-08-2005 , 11:31 AM



Tom Lane wrote:

Quote:
I think Mark is probably on to something. The activity in the toast
table will show as deletes *in the toast table* ... and that activity
fails to show at all in the pg_stat_activity view, because it shows
only plain relations! So unless autovacuum is ignoring the stats views
and going directly to the underlying stats functions, it cannot see
at all that there is excessive activity in the toast table.


I think I'm missing something here. If I have a table t1 with a long
text column, and I do an update on that text column, doesn't that show
up as an update on table t1? And when there are enough upd/del
autovacuum will issue a VACUUM against t1, which will clean up the
associated toast table, right? So I think I must be missing something.
Could you please explain the problem in a little more detail.

Quote:
It strikes me that this is a definitional bug in the stats views.
Either we should change the filter to be "regular and toast tables",
or we should add columns to show activity in a table's toast table,
or we should just add the activity in the toast table to the parent
table's activity columns.


Ok, maybe I get it now, are you saying that if I do:
update t1 set "long text column" = "some very long text value"
and the update doesn't touch any non-toast columns that the stats system
will not show that update against t1? If so, that is a problem.

Quote:
The first of these would be easiest but it seems quite likely to break
applications (eg, if unmodified, autovacuum would probably try to issue
vacuums against toast tables). And the last seems to be confusing.
So I think I favor adding columns.


Shouldn't the update to the toast table just be considered an update to
table t1? The fact that there is an underlying toast table is an
implementation detail that I don't think should show up in the stats system.

Matt


---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings


Reply With Quote
  #5  
Old   
Alvaro Herrera
 
Posts: n/a

Default Re: [BUGS] pg_autovacuum: short, wide tables - 07-08-2005 , 11:56 AM



Quote:
Tom Lane wrote:

The first of these would be easiest but it seems quite likely to break
applications (eg, if unmodified, autovacuum would probably try to issue
vacuums against toast tables). And the last seems to be confusing.
So I think I favor adding columns.
Hmm. With integrated autovacuum, we could set something up to issue
vacuums separately to TOAST tables and the main table. It'd probably be
a tad easier if the toast stats are separate from the main table; and an
autovac of the main table not necessarily would invoke vacuuming the
toast table.

I'm not proposing it for 8.1 though ...

--
Alvaro Herrera (<alvherre[a]alvh.no-ip.org>)
"Las cosas son buenas o malas segun las hace nuestra opinión" (Lisias)

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq


Reply With Quote
  #6  
Old   
Tom Lane
 
Posts: n/a

Default Re: [BUGS] pg_autovacuum: short, wide tables - 07-08-2005 , 11:58 AM



Mark Reid <mail (AT) markreid (DOT) org> writes:
Quote:
I think the issue is that a single update to the main table causes a
whole bunch of updates to the toast table. So in my case (with the
vacuum output attached previously), a thousand updates to the main table
entails tens of thousands of updates to the toast table.
Exactly. If autovac were looking at the properties of the toast table
it would think a vacuum pass was warranted sooner than it thinks from
just looking at the main table.

Admittedly this doesn't come into play unless you have a fairly large
number of toast chunks per main-table row, so the rows in question have
to be really wide (dozens of KB even after compression) before it gets
to be a big deal.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq


Reply With Quote
  #7  
Old   
Tom Lane
 
Posts: n/a

Default Re: [BUGS] pg_autovacuum: short, wide tables - 07-08-2005 , 12:02 PM



"Matthew T. O'Connor" <matthew (AT) zeut (DOT) net> writes:
Quote:
Shouldn't the update to the toast table just be considered an update to
table t1? The fact that there is an underlying toast table is an
implementation detail that I don't think should show up in the stats system.
At the level of the stats system, though, you are interested in
"implementation details". The fact that there is such a concept as an
index is an implementation detail according to the SQL standard --- but
if we hid that we wouldn't be able to show things that people want to
know.

In particular, I think people would like to be able to use the stats
views to see how much toast-related I/O is going on, and not have that
smushed together with main-table I/O.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match


Reply With Quote
  #8  
Old   
Tom Lane
 
Posts: n/a

Default Re: [BUGS] pg_autovacuum: short, wide tables - 07-08-2005 , 12:13 PM



Alvaro Herrera <alvherre (AT) alvh (DOT) no-ip.org> writes:
Quote:
Tom Lane wrote:
The first of these would be easiest but it seems quite likely to break
applications (eg, if unmodified, autovacuum would probably try to issue
vacuums against toast tables). And the last seems to be confusing.
So I think I favor adding columns.

Hmm. With integrated autovacuum, we could set something up to issue
vacuums separately to TOAST tables and the main table. It'd probably be
a tad easier if the toast stats are separate from the main table; and an
autovac of the main table not necessarily would invoke vacuuming the
toast table.

I'm not proposing it for 8.1 though ...
Well, why not? Arguably Mark's problem is a bug, and it's not too late
to address bugs.

I checked what actually happens if you try to vacuum a toast table:

regression=# vacuum pg_toast.pg_toast_169901;
WARNING: skipping "pg_toast_169901" --- cannot vacuum indexes, views, or special system tables
VACUUM
regression=#

So it's not as bad as I thought. Maybe just weakening the filter in the
stats tables views isn't a bad idea after all. Furthermore, we could
allow VACUUM on a toast table to go through (is there any good reason
to disallow it?) and then autovacuum would do more or less the right
things with no further changes.

I'm not sure about the idea of not vacuuming the toast table when we
decide to vacuum the main table.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match


Reply With Quote
  #9  
Old   
Matthew T. O'Connor
 
Posts: n/a

Default Re: [BUGS] pg_autovacuum: short, wide tables - 07-08-2005 , 12:24 PM



Tom Lane wrote:

Quote:
"Matthew T. O'Connor" <matthew (AT) zeut (DOT) net> writes:


Shouldn't the update to the toast table just be considered an update to
table t1? The fact that there is an underlying toast table is an
implementation detail that I don't think should show up in the stats system.



At the level of the stats system, though, you are interested in
"implementation details". The fact that there is such a concept as an
index is an implementation detail according to the SQL standard --- but
if we hid that we wouldn't be able to show things that people want to
know.

In particular, I think people would like to be able to use the stats
views to see how much toast-related I/O is going on, and not have that
smushed together with main-table I/O.

Fair enough, but how are you planning to display the data, if the stat
system just reports that there was an update to a corresponding toast
table, that still isn't going to tell us how many pages that updated
effected, and then we are back to the all updates are not created equal
problem. Currently autovac doesn't look at the block level stats, maybe
it should for this reason.

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo (AT) postgresql (DOT) org so that your
message can get through to the mailing list cleanly


Reply With Quote
  #10  
Old   
Tom Lane
 
Posts: n/a

Default Re: [BUGS] pg_autovacuum: short, wide tables - 07-08-2005 , 12:30 PM



"Matthew T. O'Connor" <matthew (AT) zeut (DOT) net> writes:
Quote:
Tom Lane wrote:
In particular, I think people would like to be able to use the stats
views to see how much toast-related I/O is going on, and not have that
smushed together with main-table I/O.

Fair enough, but how are you planning to display the data,
At the moment I'm thinking "just like a regular table" --- see my
last message to Alvaro. The existing backend code will count each
toast-chunk insert or delete just like a normal row insert or delete,
so AFAICS this will produce sane-looking stats that autovac could
use the same way as for a plain table.

The main bit of additional logic that might be needed is an awareness
that firing a VACUUM on a main table will implicitly fire one on its
toast table, and so you'd not want to go and issue the toast table
VACUUM separately.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.