dbTalk Databases Forums  

[BUGS] Autovacuum deadlock - bug or not?

mailing.database.pgsql-bugs mailing.database.pgsql-bugs


Discuss [BUGS] Autovacuum deadlock - bug or not? in the mailing.database.pgsql-bugs forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Mikael Carneholm
 
Posts: n/a

Default [BUGS] Autovacuum deadlock - bug or not? - 11-17-2005 , 08:13 AM






Don't know if this is a bug or just undocumented, but it seems as you shoul=
d turn off autovacuum before you run CLUSTER, otherwise you might run into =
a deadlock:

NOTICE: Clustering idx_vehicle_unit_data_200407_person_information__i d on =
vehicle_unit_data_200407
ERROR: deadlock detected
DETAIL: Process 29022 waits for AccessExclusiveLock on relation 68950 of d=
atabase 16390; blocked by process 15865.
Process 15865 waits for AccessShareLock on relation 68122 of database 16390=
; blocked by process 29022.

/Mikael

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo (AT) postgresql (DOT) org so that your
message can get through to the mailing list cleanly

Reply With Quote
  #2  
Old   
Mikael Carneholm
 
Posts: n/a

Default Re: [BUGS] Autovacuum deadlock - bug or not? - 11-17-2005 , 07:23 PM






Forgot to mention:=20

dfol=3D> select version();
version=20=20=20=20=20=20=20=20=20=20=20=20=20=
=20=20=20=20=20=20=20=20=20=20=20=20=20=20
-------------------------------------------------------------
PostgreSQL 8.1.0 on i686-pc-linux-gnu, compiled by GCC 2.96
(1 row)


-----Original Message-----
From: Tom Lane [mailto:tgl (AT) sss (DOT) pgh.pa.us]
Sent: den 17 november 2005 16:04
To: Mikael Carneholm
Cc: pgsql-bugs (AT) postgresql (DOT) org
Subject: Re: [BUGS] Autovacuum deadlock - bug or not?=20


"Mikael Carneholm" <Mikael.Carneholm (AT) WirelessCar (DOT) com> writes:
Quote:
variant: CLUSTER indexname ON tablename
Hmph. Looking at the code, that should always lock the table first,
so I don't see where the problem is. Would you look up the numbers
for us --- exactly which relations were involved in the deadlock,
and (if you can tell) which process was which?

Also, what PG version is this exactly?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings


Reply With Quote
  #3  
Old   
Tom Lane
 
Posts: n/a

Default Re: [BUGS] Autovacuum deadlock - bug or not? - 11-17-2005 , 07:24 PM



"Mikael Carneholm" <Mikael.Carneholm (AT) WirelessCar (DOT) com> writes:
Quote:
dfol=> select pgc.oid, pgc.relname from pg_class pgc where pgc.oid in (68950, 68122);
oid | relname
-------+--------------------------
68950 | vehicle_unit_data_200407
68122 | vehicle_unit_data_200301

NOTICE: Clustering idx_vehicle_unit_data_200407_person_information__i d on vehicle_unit_data_200407
ERROR: deadlock detected
DETAIL: Process 29022 waits for AccessExclusiveLock on relation 68950 of database 16390; blocked by process 15865.
Process 15865 waits for AccessShareLock on relation 68122 of database 16390; blocked by process 29022.

So it seems that it was the clustering of idx_vehicle_unit_data_200407_person_information__i d on vehicle_unit_data_200407 that caused the deadlock.
Hmm, the CLUSTER on vehicle_unit_data_200407 wouldn't have taken any
lock on vehicle_unit_data_200301. Were you perhaps issuing a series
of CLUSTERs inside a transaction block? That would pile up exclusive
locks on all the tables involved, which is certainly deadlock-prone.

I'm also wondering where that NOTICE "Clustering ..." came from, because
there is no such message anywhere in the 8.1 PG sources. You *sure*
this is 8.1?

There's something funny about 15865 too; you said that was an autovacuum
process but I don't think so. VACUUM doesn't take AccessShareLock;
there's a different lock type that that tries to acquire. And it
doesn't take any locks at all on more than one user table at a time.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match


Reply With Quote
  #4  
Old   
Mikael Carneholm
 
Posts: n/a

Default Re: [BUGS] Autovacuum deadlock - bug or not? - 11-17-2005 , 07:25 PM



variant: CLUSTER indexname ON tablename

Maybe there should be something about this in the docs, so that users don't=
get surprised when this happens and start sending stupid emails to the pgs=
ql-bugs list

/Mikael



-----Original Message-----
From: Tom Lane [mailto:tgl (AT) sss (DOT) pgh.pa.us]
Sent: den 17 november 2005 15:51
To: Mikael Carneholm
Cc: pgsql-bugs (AT) postgresql (DOT) org
Subject: Re: [BUGS] Autovacuum deadlock - bug or not?=20


"Mikael Carneholm" <Mikael.Carneholm (AT) WirelessCar (DOT) com> writes:
Quote:
Don't know if this is a bug or just undocumented, but it seems as you sho=
uld turn off autovacuum before you run CLUSTER, otherwise you might run int=
o a deadlock:

Which variant of CLUSTER were you using? ISTR that some of them lock
the index before the table, which is prone to deadlock against nearly
all other operations on the table (not just vacuum).

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings


Reply With Quote
  #5  
Old   
Tom Lane
 
Posts: n/a

Default Re: [BUGS] Autovacuum deadlock - bug or not? - 11-17-2005 , 07:25 PM



"Mikael Carneholm" <Mikael.Carneholm (AT) WirelessCar (DOT) com> writes:
Quote:
variant: CLUSTER indexname ON tablename
Hmph. Looking at the code, that should always lock the table first,
so I don't see where the problem is. Would you look up the numbers
for us --- exactly which relations were involved in the deadlock,
and (if you can tell) which process was which?

Also, what PG version is this exactly?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo (AT) postgresql (DOT) org so that your
message can get through to the mailing list cleanly


Reply With Quote
  #6  
Old   
Tom Lane
 
Posts: n/a

Default Re: [BUGS] Autovacuum deadlock - bug or not? - 11-17-2005 , 07:28 PM



"Mikael Carneholm" <Mikael.Carneholm (AT) WirelessCar (DOT) com> writes:
Quote:
Unfortunately, relfilenodes 68950 and 68122 don't exist anymore,
You should be looking at pg_class.oid, not relfilenode.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster


Reply With Quote
  #7  
Old   
Mikael Carneholm
 
Posts: n/a

Default Re: [BUGS] Autovacuum deadlock - bug or not? - 11-17-2005 , 07:31 PM



oops, you're right.

dfol=3D> select pgc.oid, pgc.relname from pg_class pgc where pgc.oid in (68=
950, 68122);
oid | relname=20=20=20=20=20=20=20=20=20=20
-------+--------------------------
68950 | vehicle_unit_data_200407
68122 | vehicle_unit_data_200301

NOTICE: Clustering idx_vehicle_unit_data_200407_person_information__i d on =
vehicle_unit_data_200407
ERROR: deadlock detected
DETAIL: Process 29022 waits for AccessExclusiveLock on relation 68950 of d=
atabase 16390; blocked by process 15865.
Process 15865 waits for AccessShareLock on relation 68122 of database 16390=
; blocked by process 29022.

So it seems that it was the clustering of idx_vehicle_unit_data_200407_pers=
on_information__id on vehicle_unit_data_200407 that caused the deadlock.


-----Original Message-----
From: Tom Lane [mailto:tgl (AT) sss (DOT) pgh.pa.us]
Sent: den 17 november 2005 16:53
To: Mikael Carneholm
Cc: pgsql-bugs (AT) postgresql (DOT) org
Subject: Re: [BUGS] Autovacuum deadlock - bug or not?=20


"Mikael Carneholm" <Mikael.Carneholm (AT) WirelessCar (DOT) com> writes:
Quote:
Unfortunately, relfilenodes 68950 and 68122 don't exist anymore,
You should be looking at pg_class.oid, not relfilenode.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings


Reply With Quote
  #8  
Old   
Tom Lane
 
Posts: n/a

Default Re: [BUGS] Autovacuum deadlock - bug or not? - 11-17-2005 , 07:32 PM



"Mikael Carneholm" <Mikael.Carneholm (AT) WirelessCar (DOT) com> writes:
Quote:
Don't know if this is a bug or just undocumented, but it seems as you should turn off autovacuum before you run CLUSTER, otherwise you might run into a deadlock:
Which variant of CLUSTER were you using? ISTR that some of them lock
the index before the table, which is prone to deadlock against nearly
all other operations on the table (not just vacuum).

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match


Reply With Quote
  #9  
Old   
Mikael Carneholm
 
Posts: n/a

Default Re: [BUGS] Autovacuum deadlock - bug or not? - 11-17-2005 , 07:33 PM



Unfortunately, relfilenodes 68950 and 68122 don't exist anymore, could be t=
hat they were temporary copies of tables or indexes. I know that process 15=
865 was the autovacuum pid, I looked that up when it happened (pg was resta=
rted with autovacuum=3Doff afterwards, so that process is also gone)

Could it be that the deadlock was caused by autovacuum trying to vacuum one=
of the temp copies?

-----Original Message-----
From: Tom Lane [mailto:tgl (AT) sss (DOT) pgh.pa.us]
Sent: den 17 november 2005 16:04
To: Mikael Carneholm
Cc: pgsql-bugs (AT) postgresql (DOT) org
Subject: Re: [BUGS] Autovacuum deadlock - bug or not?=20


"Mikael Carneholm" <Mikael.Carneholm (AT) WirelessCar (DOT) com> writes:
Quote:
variant: CLUSTER indexname ON tablename
Hmph. Looking at the code, that should always lock the table first,
so I don't see where the problem is. Would you look up the numbers
for us --- exactly which relations were involved in the deadlock,
and (if you can tell) which process was which?

Also, what PG version is this exactly?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings


Reply With Quote
  #10  
Old   
Mikael Carneholm
 
Posts: n/a

Default Re: [BUGS] Autovacuum deadlock - bug or not? - 11-17-2005 , 07:39 PM



Quote:
Hmm, the CLUSTER on vehicle_unit_data_200407 wouldn't have taken any
lock on vehicle_unit_data_200301. Were you perhaps issuing a series
of CLUSTERs inside a transaction block? That would pile up exclusive
locks on all the tables involved, which is certainly deadlock-prone.
Ah, that explains it...the clustering is done from a function that's cluste=
ring all (child) tables of a parent, and as I've come to understand, all fu=
nctions have a built-in transaction (the "Clustering idx_foo on bar" notice=
message also originates from that function)

I'm pretty sure though that 15865 was the autovacuum process, but I'll have=
to run the function with autovacuum turned on before I can verify that thi=
s is (was) the case.

/Mikael


-----Original Message-----
From: Tom Lane [mailto:tgl (AT) sss (DOT) pgh.pa.us]
Sent: den 17 november 2005 17:42
To: Mikael Carneholm
Cc: pgsql-bugs (AT) postgresql (DOT) org
Subject: Re: [BUGS] Autovacuum deadlock - bug or not?=20


"Mikael Carneholm" <Mikael.Carneholm (AT) WirelessCar (DOT) com> writes:
Quote:
dfol=3D> select pgc.oid, pgc.relname from pg_class pgc where pgc.oid in (=
68950, 68122);
oid | relname=20=20=20=20=20=20=20=20=20=20
-------+--------------------------
68950 | vehicle_unit_data_200407
68122 | vehicle_unit_data_200301

NOTICE: Clustering idx_vehicle_unit_data_200407_person_information__i d o=
n vehicle_unit_data_200407
ERROR: deadlock detected
DETAIL: Process 29022 waits for AccessExclusiveLock on relation 68950 of=
database 16390; blocked by process 15865.
Process 15865 waits for AccessShareLock on relation 68122 of database 163=
90; blocked by process 29022.

Quote:
So it seems that it was the clustering of idx_vehicle_unit_data_200407_pe=
rson_information__id on vehicle_unit_data_200407 that caused the deadlock.

Hmm, the CLUSTER on vehicle_unit_data_200407 wouldn't have taken any
lock on vehicle_unit_data_200301. Were you perhaps issuing a series
of CLUSTERs inside a transaction block? That would pile up exclusive
locks on all the tables involved, which is certainly deadlock-prone.

I'm also wondering where that NOTICE "Clustering ..." came from, because
there is no such message anywhere in the 8.1 PG sources. You *sure*
this is 8.1?

There's something funny about 15865 too; you said that was an autovacuum
process but I don't think so. VACUUM doesn't take AccessShareLock;
there's a different lock type that that tries to acquire. And it
doesn't take any locks at all on more than one user table at a time.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.