dbTalk Databases Forums  

Re: [BUGS] right sibling is not next child

mailing.database.pgsql-bugs mailing.database.pgsql-bugs


Discuss Re: [BUGS] right sibling is not next child in the mailing.database.pgsql-bugs forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Peter Brant
 
Posts: n/a

Default Re: [BUGS] right sibling is not next child - 04-11-2006 , 02:02 PM






Sorry about the delay in responding. We had a bit of difficulty with
the test machine. Kevin is also on vacation this week.

The problem is repeatable with a VACUUM. I've found the offending
block. A (partial) pg_filedump of that block is pasted in below. I'm a
little lost as to what the next step is though. Any pointers would be
appreciated.

Pete

"Kevin Grittner" <Kevin ( dot ) Grittner ( at ) wicourts ( dot ) gov>
writes:
Quote:
[2006-04-06 02:19:57.460 ] 3848 <postgres bigbird 127.0.0.1(3944)
PANIC: right sibling is not next child in "Panel_pkey"

This should be repeatable by re-attempting a VACUUM, right? Please
find
out which page exactly it's unhappy about (either gdb the crash or add
a
printout of the "parent" variable to the elog call in nbtpage.c), then
pg_filedump the index and look to see what the index contains.

regards, tom lane

PANIC: right sibling is not next child in "Panel_pkey", parent is 271

************************************************** *****************
* PostgreSQL File/Block Formatted Dump Utility - Version 8.1.1
*
* File: 180571
* Options used: -R 271 -i
*
* Dump created on: Tue Apr 11 13:38:44 2006
************************************************** *****************

Block 271 ************************************************** ******
<Header> -----
Block Offset: 0x0021e000 Offsets: Lower 468 (0x01d4)
Block: Size 8192 Version 3 Upper 1952 (0x07a0)
LSN: logid 246 recoff 0x265d47c0 Special 8176 (0x1ff0)
Items: 112 Free Space: 1484
Length (including item array): 472

<Data> ------
Item 1 -- Length: 56 Offset: 8120 (0x1fb8) Flags: USED
Block Id: 132 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 2 -- Length: 8 Offset: 8112 (0x1fb0) Flags: USED
Block Id: 201 linp Index: 1 Size: 8
Has Nulls: 0 Has Varwidths: 0

Item 3 -- Length: 56 Offset: 8056 (0x1f78) Flags: USED
Block Id: 257 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

.... snip ...

Item 109 -- Length: 56 Offset: 4920 (0x1338) Flags: USED
Block Id: 121 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 110 -- Length: 56 Offset: 4864 (0x1300) Flags: USED
Block Id: 133 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 111 -- Length: 56 Offset: 4808 (0x12c8) Flags: USED
Block Id: 134 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 112 -- Length: 56 Offset: 4752 (0x1290) Flags: USED
Block Id: 137 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384


<Special Section> -----
BTree Index Section:
Flags: 0x0000 ()
Blocks: Previous (167) Next (455) Level (1)


*** End of Requested Range Encountered. Last Block Read: 271 ***



---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org


Reply With Quote
  #2  
Old   
Tom Lane
 
Posts: n/a

Default Re: [BUGS] right sibling is not next child - 04-11-2006 , 02:47 PM






"Peter Brant" <Peter.Brant (AT) wicourts (DOT) gov> writes:
Quote:
PANIC: right sibling is not next child in "Panel_pkey", parent is 271
Hmm ... that's not actually enough info to tell us where to look,
is it :-(. Please add the following variables to the elog message, or
gdb for them if you can:
target
rightsib
nextoffset

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend


Reply With Quote
  #3  
Old   
Peter Brant
 
Posts: n/a

Default Re: [BUGS] right sibling is not next child - 04-12-2006 , 08:25 AM



It is repeatable. A reindex doesn't work.

Pete

bigbird=# vacuum analyze "MaintCode";
ERROR: duplicate key violates unique constraint
"pg_statistic_relid_att_index"


bigbird=# vacuum analyze verbose "MaintCode";
INFO: vacuuming "public.MaintCode"
INFO: index "MaintCode_pkey" now contains 19 row versions in 2 pages
DETAIL: 0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "MaintCode": found 0 removable, 19 nonremovable row versions in
1 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: vacuuming "pg_toast.pg_toast_90472"
INFO: index "pg_toast_90472_index" now contains 0 row versions in 1
pages
DETAIL: 0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_90472": found 0 removable, 0 nonremovable row versions
in 0 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: analyzing "public.MaintCode"
INFO: "MaintCode": scanned 1 of 1 pages, containing 19 live rows and 0
dead rows; 19 rows in sample, 19 estimated total rows
ERROR: duplicate key violates unique constraint
"pg_statistic_relid_att_index"


bigbird=# reindex table pg_statistic;
ERROR: could not create unique index
DETAIL: Table contains duplicated values.


bigbird=# select * from "MaintCode";
maintCode | colorAttrb | descr | isActive |
maintPriority | stopsTheClock | videoAttrb | wcisSpecStatus
-----------+------------+---------------------------+----------+---------------+---------------+------------+----------------
AA | R | Annual accounts | t |
75 | t | N |
AP | R | Appeal | t |
30 | f | N |
AW | R | Awaiting Papers | t |
80 | f | N |
BA | R | Bankruptcy | t |
20 | t | N | S05
CI | R | Case Inactive | t |
60 | f | N | S03
CS | R | Consent decree | t |
90 | t | N |
DP | R | Deferred judgt./prosecute | t |
40 | t | N | S07
EF | R | Electronic filing | t |
33 | f | N |
ES | R | Extension | t |
55 | t | N |
EX | R | Expungement | t |
5 | f | N |
FX | R | Future expungement | t |
93 | f | N |
IN | R | Interpreter | t |
53 | f | N |
JR | R | Judicial review | t |
75 | t | N |
RO | R | Reopen | t |
35 | f | N |
SL | R | Sealed Record | t |
85 | f | N |
SM | R | Sentence modification | t |
43 | f | N |
SU | R | Case suspended | t |
45 | t | N |
UA | R | Under Advisement | t |
50 | f | N | S02
WA | R | Warrant issued | t |
10 | t | F | S04
(19 rows)

bigbird=#



Quote:
Tom Lane <tgl (AT) sss (DOT) pgh.pa.us> 04/12/06 5:00 am
"Peter Brant" <Peter.Brant (AT) wicourts (DOT) gov> writes:
Also, when I tried to run a database-wide VACUUM ANALYZE VERBOSE it
actually doesn't even get to Panel and errors out with:

ERROR: duplicate key violates unique constraint
"pg_statistic_relid_att_index"
Hm, my eyebrows just disappeared over the back of my head somewhere.
Is that repeatable? Does a REINDEX on pg_statistic make it go away?

regards, tom lane

---------------------------(end of
broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq



---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster


Reply With Quote
  #4  
Old   
Tom Lane
 
Posts: n/a

Default Re: [BUGS] right sibling is not next child - 04-12-2006 , 09:38 AM



"Peter Brant" <Peter.Brant (AT) wicourts (DOT) gov> writes:
Quote:
bigbird=# vacuum analyze "MaintCode";
ERROR: duplicate key violates unique constraint
"pg_statistic_relid_att_index"
Hm, can you see any rows in pg_statistic with duplicate values of
(starelid, staattnum)? If so it'd be useful to look at their
ctid/xmin/xmax/cmin/cmax values.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match


Reply With Quote
  #5  
Old   
Tom Lane
 
Posts: n/a

Default Re: [BUGS] right sibling is not next child - 04-12-2006 , 10:30 AM



"Peter Brant" <Peter.Brant (AT) wicourts (DOT) gov> writes:
Quote:
Item 85 -- Length: 56 Offset: 2120 (0x0848) Flags: USED
Block Id: 640 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 86 -- Length: 56 Offset: 2176 (0x0880) Flags: USED
Block Id: 635 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 87 -- Length: 56 Offset: 2232 (0x08b8) Flags: USED
Block Id: 636 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 88 -- Length: 56 Offset: 2288 (0x08f0) Flags: USED
Block Id: 635 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 89 -- Length: 56 Offset: 2400 (0x0960) Flags: USED
Block Id: 629 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Item 90 -- Length: 56 Offset: 5704 (0x1648) Flags: USED
Block Id: 166 linp Index: 1 Size: 56
Has Nulls: 0 Has Varwidths: 16384

Well, that's pretty dang interesting. How did block 635 get to be
listed twice? The sibling links say that the correct sequence is
640, 636, 635, 629, 166 ... so something screwed up the parent level's
keys.

What would be most useful at this point is to look at the keys in
these entries, and compare them to the "high keys" (item 1) of the
individual leaf pages. I'm wondering what key is in that extra
entry for 635 ... Did you get permission to show us the keys?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo (AT) postgresql (DOT) org so that your
message can get through to the mailing list cleanly


Reply With Quote
  #6  
Old   
Peter Brant
 
Posts: n/a

Default Re: [BUGS] right sibling is not next child - 04-12-2006 , 10:51 AM



I can't find any duplicates?!?

The query

select starelid, staattnum, ctid, xmin, xmax, cmin, cmax
from pg_statistic p1
where (select count(*) from pg_statistic p2 where
p1.starelid = p2.starelid and p1.staattnum = p2.staattnum) > 1

doesn't turn up anything. Nor does dumping

select starelid, staattnum from pg_statistic

to a file and using sort | uniq -c

Pete

Quote:
Tom Lane <tgl (AT) sss (DOT) pgh.pa.us> 04/12/06 4:37 pm
Hm, can you see any rows in pg_statistic with duplicate values of
(starelid, staattnum)? If so it'd be useful to look at their
ctid/xmin/xmax/cmin/cmax values.

regards, tom lane


---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo (AT) postgresql (DOT) org so that your
message can get through to the mailing list cleanly


Reply With Quote
  #7  
Old   
Tom Lane
 
Posts: n/a

Default Re: [BUGS] right sibling is not next child - 04-12-2006 , 11:04 AM



"Peter Brant" <Peter.Brant (AT) wicourts (DOT) gov> writes:
Quote:
I can't find any duplicates?!?
Weirder and weirder. Maybe the table is OK but the index is corrupt?

Could it be another symptom of the same problem we're seeing in the
Panel_pkey index? I'm currently theorizing that that index might've
been corrupted during WAL replay (specifically, if the replayer somehow
failed to recognize a split completion, btree_xlog_cleanup would've
inserted an extra entry) and one could certainly imagine multiple
indexes getting corrupted in the same fashion at the same time. You
did say that this problem showed up shortly after a database crash,
right?

Could you send me a copy of pg_statistic_relid_att_index, off-list
(the actual disk file, not a pg_filedump)? There's nothing but table
OIDs and attribute numbers in it, so I can't see any reason anyone
would consider it sensitive data. I've got some really crude code
laying about for scanning an index and looking for inconsistencies ...
it's too ugly to give out, but I'd like to see if it can find anything
wrong with that index. (I'll probably ask for a copy of Panel_pkey
for the same purpose, if you get permission to show us its keys.)

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo (AT) postgresql (DOT) org so that your
message can get through to the mailing list cleanly


Reply With Quote
  #8  
Old   
Peter Brant
 
Posts: n/a

Default Re: [BUGS] right sibling is not next child - 04-12-2006 , 11:41 AM



Per the DBAs, there hadn't been any recent crashes before last Thursday.
A "vacuum analyze verbose" discovered the problem early Thursday
morning. After the PANIC, the database never came back up (the
heap_clean_redo: no block / full_page_writes = off problem).

One thing that seems strange to me is that the original crash on
Thursday failed on Panel_pkey, but my "vacuum analyze verbose" on a copy
of the crashed database failed on MaintCode /
pg_statistic_relid_att_index.

I'll send over pg_statistic_relid_att_index and Panel_pkey. Showing
the keys to Panel_pkey is no problem.

Pete

Quote:
Tom Lane <tgl (AT) sss (DOT) pgh.pa.us> 04/12/06 6:03 pm
You did say that this problem showed up shortly after a database
crash,
right?

Could you send me a copy of pg_statistic_relid_att_index, off-list
(the actual disk file, not a pg_filedump)? There's nothing but table
OIDs and attribute numbers in it, so I can't see any reason anyone
would consider it sensitive data. I've got some really crude code
laying about for scanning an index and looking for inconsistencies ...
it's too ugly to give out, but I'd like to see if it can find anything
wrong with that index. (I'll probably ask for a copy of Panel_pkey
for the same purpose, if you get permission to show us its keys.)

regards, tom lane


---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org


Reply With Quote
  #9  
Old   
Tom Lane
 
Posts: n/a

Default Re: [BUGS] right sibling is not next child - 04-12-2006 , 03:29 PM



"Peter Brant" <Peter.Brant (AT) wicourts (DOT) gov> writes:
Quote:
One thing that seems strange to me is that the original crash on
Thursday failed on Panel_pkey, but my "vacuum analyze verbose" on a copy
of the crashed database failed on MaintCode /
pg_statistic_relid_att_index.
I can't find anything particularly wrong with
pg_statistic_relid_att_index. There are two index entries for many of
the key values, but that's unsurprising if a database-wide vacuum
analyze had been done or in progress --- just-obsoleted pg_statistic
rows wouldn't have been vacuumed away yet. I'm inclined to think that
this problem is a side-effect of the fact that you had to use
pg_resetxlog; there was probably an update to pg_statistic that got
lost. If you want to continue experimenting with this database,
I'd suggest TRUNCATE'ing pg_statistic and rebuilding it via another
VACUUM ANALYZE run.

Quote:
I'll send over pg_statistic_relid_att_index and Panel_pkey. Showing
the keys to Panel_pkey is no problem.
My little index printing tool shows these entries in Panel_pkey at btree
level zero:

....
40 2006 RES032706 CH0327RES high key on page 606
40 2006 RES032706 RES032706 383 11
40 2006 RES032706 RES032706 high key on page 608
40 2006 RES040306 CC0403RES 507 14
40 2006 RES040306 CCC0403RES 551 1
40 2006 RES040306 CCC0403RES high key on page 601
40 2006 RES040306 RES040306 500 1
40 2006 RES040306 RES040306 high key on page 640
40 2006 RES040306 RES040306 high key on page 636
40 2006 RES040306 RES040306 high key on page 635
41 0001 2000POOL 0001 159 3
41 0001 2000POOL 0002 159 4
41 0001 2000POOL 0003 159 5
....

(The first four columns are the key values of this index; the last two
are the pointed-to heap tuple's page/line location. High keys are
printed after any data keys on the page.) The down-links at level one
look like:

....
40 2006 RES032706 CCC0327RES 606 1
40 2006 RES032706 CH0327RES 608 1
40 2006 RES032706 RES032706 601 1
40 2006 RES040306 CCC0403RES 640 1
40 2006 RES040306 RES040306 635 1
40 2006 RES040306 RES040306 636 1
40 2006 RES040306 RES040306 635 1
40 2006 RES040306 RES040306 629 1
41 0405 0405 0105 166 1
....

This is fairly interesting because we've got three pages with the same
boundary key. The bogus entry for page 635 has been inserted where
you'd expect it to get inserted if the insertion were being done on the
basis of key comparison. (We usually insert a new entry in front of any
ones with the same key.) But we never do insertions on non-leaf pages
by key comparison! Upper-level entries are only made during page
splits, by _bt_insert_parent, and that always works by locating the
down-link to the page just split and adding the new entry just after it.

One thing I realize after seeing this is that the index corruption might
be of long standing: the way that the btree search algorithms work, no
search would ever have descended directly to 635 or 636, but instead to
640 and then traverse right from there. So you'd not have noticed any
malfunction, until the time came to try to delete the page. That means
we shouldn't assume that the problem was created recently.

Now that we know the problem is associated with keys '40 2006 RES040306
RES040306', can you check your logs and see if anything interesting
happened around the time those keys were being inserted? (Note: if
Panel_pkey really is a primary key, ie unique, it might seem odd for
there to be several pages worth of identical entries --- this would
imply that the same record had been updated several hundred times
between two vacuums.)

I still kinda like the theory that the extra key got in there because
btree_xlog_cleanup fired inappropriately ... mainly because I can't
think of any other theories ... but I'm still baffled about the details.
Anyone have any ideas?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq


Reply With Quote
  #10  
Old   
Peter Brant
 
Posts: n/a

Default Re: [BUGS] right sibling is not next child - 04-13-2006 , 11:11 AM



Sounds good.

There is nothing sensitive in DbTranImageStatus_pkey so if you decide
you want it after all, it's there for the asking.

Pete

Quote:
Tom Lane <tgl (AT) sss (DOT) pgh.pa.us> 04/13/06 3:30 am
Oh, never mind ... I've sussed it.


---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo (AT) postgresql (DOT) org so that your
message can get through to the mailing list cleanly


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.