Re: [BUGS] concurrent drop table with fkeys corrupt pg_trigger

mailing.database.pgsql-bugs

#1
Qingqing Zhou
05-26-2005, 09:59 AM

"Qingqing Zhou" <zhouqq (AT) cs (DOT) toronto.edu> writes
> If we concurrently perform drop/create table (with foreign keys) commands
> several times, we could corrupt the pg_trigger system table.

Anybody reproduced it?

Regards,
Qingqing



---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster


#2
Brandon Black
11-01-2005, 08:14 AM

> Subject: Re: concurrent drop table with fkeys corrupt pg_trigger
> Date: Thu, 26 May 2005 09:47:25 +0800
> "Qingqing Zhou" <zhouqq ( at ) cs ( dot ) toronto ( dot ) edu> writes:
>> If we concurrently perform drop/create table (with foreign keys) commands
>> several times, we could corrupt the pg_trigger system table.
>
> Anybody reproduced it?
>
> Regards,
> Qingqing

There might be something to this. I'm currently running roughly 200
write transactions per second, 24/7, on PostgreSQL 8.1 beta4, and
seeing some related symptoms.

There is a table "important_table", whose primary key is referenced by
foreign keys from many, many other tables in the database. This table has no triggers
on it. Every morning at roughly 7am, a cronjob kicks off and does a
reasonably large number of "CREATE TABLE", "CREATE TRIGGER" (on the
new table), and "DROP TABLE" statements (they're part of an
inheritance-based table partitioning scheme based on timestamps - it's
dropping outdated tables and making new tables for the upcoming
timeframes). There are no transactions actually directly using the
tables being created or dropped at the time (since they're outside the
reasonable range of possible current timestamps). All of the
created/dropped tables of course reference the primary key in
"important_table".

I keep a log of the (very few) failed transactions we get, and every
morning at the same time that cron job runs, we get a handful of
client transactions failing out during a SELECT statement, with the
error:

ERROR: too many trigger records found for relation "important_table"

But then it all goes back to normal until it happens again the next
morning. Remember, "important_table" has no triggers that I know of.
I suspect that when tables are in the process of being created or
dropped which have fkeys in "important_table", some kind of internal
temporary trigger is created on "important_table", and that there's a
bug in there somewhere?

#3
Tom Lane
11-01-2005, 08:32 AM

Brandon Black <blblack (AT) gmail (DOT) com> writes:
> ERROR: too many trigger records found for relation "important_table"
>
> But then it all goes back to normal until it happens again the next
> morning. Remember, "important_table" has no triggers that I know of.

... except all the foreign-key triggers.

I think what's happening here is that a backend reads the pg_class entry
for "important_table", sees it has some triggers (because reltriggers is
nonzero), and then goes to scan pg_trigger to find them. By the time
it manages to do the scan, somebody else has committed an addition of a
trigger. Since we use SnapshotNow for reading system catalogs, the
added row is visible immediately, and so you get the complaint that the
contents of pg_trigger don't match up with what we saw in
pg_class.reltriggers.
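
The sequence described above can be sketched as a toy model. This is not
PostgreSQL code: the Catalog class, load_triggers(), and the between_reads
hook are hypothetical names made up for illustration, and the dictionary
and list only stand in for pg_class.reltriggers and pg_trigger. The hook
marks the point where another backend's commit becomes visible between the
two catalog reads.

```python
# Toy model of the race; all names here are hypothetical, not PostgreSQL APIs.

class Catalog:
    def __init__(self):
        # Stand-in for pg_class.reltriggers (a per-relation trigger count).
        self.reltriggers = {"important_table": 2}
        # Stand-in for pg_trigger rows: (relation name, trigger name).
        self.pg_trigger = [
            ("important_table", "fk_a"),
            ("important_table", "fk_b"),
        ]

    def add_trigger(self, rel, name):
        # A committed writer: the new row is visible to readers immediately,
        # which is the assumption made about SnapshotNow-style catalog reads.
        self.pg_trigger.append((rel, name))
        self.reltriggers[rel] += 1


def load_triggers(catalog, rel, between_reads=None):
    # Step 1: read the expected trigger count.
    expected = catalog.reltriggers[rel]
    # Another backend may commit here, after the count has been read.
    if between_reads is not None:
        between_reads()
    # Step 2: scan the trigger rows for this relation.
    found = [name for r, name in catalog.pg_trigger if r == rel]
    if len(found) > expected:
        raise RuntimeError(
            f'too many trigger records found for relation "{rel}"'
        )
    return found
```

Calling load_triggers() with a hook that adds a trigger raises the error;
calling it again afterwards succeeds, since the count and the rows agree
once more. That matches the transient, self-healing symptom in the report.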

What's not immediately clear though is why this scenario isn't prevented
by high-level relation locking. We require the addition of the trigger
to take exclusive lock on the table, so how come the reader isn't
blocked until that finishes?

[ checks old notes... ] Hm, it seems this has already come up:
http://archives.postgresql.org/pgsql...0/msg01413.php
When loading a relcache entry, we really ought to obtain some lock on
the relation *before* reading the catalogs. My recollection is that
this would have been pretty painful back in 2002, but maybe with
subsequent restructuring it wouldn't be so bad now.
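
As a rough illustration of the lock-before-read ordering, here is a toy
sketch. It is not PostgreSQL code: Relation, try_add_trigger(), and
load_triggers_locked() are hypothetical names, and a single
threading.Lock stands in for relation-level locking, ignoring the
distinction between shared and exclusive lock modes. The writer here gives
up instead of waiting, purely so the example stays deterministic.

```python
import threading

# Toy model only; all names are hypothetical, not PostgreSQL APIs.

class Relation:
    def __init__(self, name, triggers):
        self.name = name
        self.lock = threading.Lock()          # stand-in for a relation lock
        self.reltriggers = len(triggers)      # stand-in for pg_class.reltriggers
        self.trigger_rows = list(triggers)    # stand-in for pg_trigger rows


def try_add_trigger(rel, trigger_name):
    # The writer must hold the lock to change the trigger set.
    # Returns False instead of blocking when the lock is already held.
    if not rel.lock.acquire(blocking=False):
        return False
    try:
        rel.trigger_rows.append(trigger_name)
        rel.reltriggers += 1
        return True
    finally:
        rel.lock.release()


def load_triggers_locked(rel, between_reads=None):
    # Take the lock *before* reading either piece of catalog state.
    with rel.lock:
        expected = rel.reltriggers
        if between_reads is not None:
            between_reads()   # a concurrent writer would attempt its change here
        found = list(rel.trigger_rows)
    if len(found) != expected:
        raise RuntimeError(
            f'trigger count mismatch for relation "{rel.name}"'
        )
    return found
```

With the lock held across both reads, a writer attempting to add a trigger
in between is refused (in a real system it would wait), so the reader sees
a consistent count and row set.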

regards, tom lane
