dbTalk Databases Forums  

[BUGS] Primary key duplicates

mailing.database.pgsql-bugs mailing.database.pgsql-bugs


Discuss [BUGS] Primary key duplicates in the mailing.database.pgsql-bugs forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Zeki Mokhtarzada
 
Posts: n/a

Default [BUGS] Primary key duplicates - 08-13-2004 , 12:52 PM






I have a very strange bug with postgres 7.4.3. I have a table with about
15 million rows and recently, duplicate rows have started appearing.

For simplicity, let's say my table is as follows:

files
-------
fileid integer -- Primary key generated by a sequence
userid integer -- id of user that owns the file
filename character varying(255) -- name of file

Indexes: "files_pkey" primary key, bree (fileid)
"files_userid" hash (userid)


When I do:

select fileid, userid from files where userid = 1898598 order by fileid;

I get:

fileid | userid
---------+---------
3787433 | 1898598
3787433 | 1898598
3787563 | 1898598
9554275 | 1898598


Notice that 3787433 is duplicated. How could this have happened if that
column is flagged as the primary key. Even more interesting:

select oid, fileid, userid from files where userid = 1898598 order by
fileid;


oid | fileid | userid
----------+---------+---------
1573737 | 3787433 | 1898598
1573737 | 3787433 | 1898598
1573820 | 3787563 | 1898598
18612041 | 9554275 | 1898598

The rows have the same OID! So my question is how do I delete the
duplicate row. If I execute

select fileid, userid from files where fileid = 1573737;

I get:

fileid | userid
---------+---------
1573737 | 1008628


Similarly, if I try to delete both of the rows, only one of them gets
deleted, then when I select by userid, I get the other remaining one
listed. But if I select by fileid I get no rows returned.

I suspect a corrupt index is at fault here. If that's the case, a reindex
will take quite some time and will lock the table causing a long period of
downtime. Is that my only option? Any other ideas?

-Zeki



---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Reply With Quote
  #2  
Old   
Tom Lane
 
Posts: n/a

Default Re: [BUGS] Primary key duplicates - 08-13-2004 , 03:55 PM






Zeki Mokhtarzada <zeki (AT) freewebz (DOT) com> writes:
Quote:
Notice that 3787433 is duplicated. How could this have happened if that
column is flagged as the primary key. Even more interesting:

select oid, fileid, userid from files where userid = 1898598 order by
fileid;
oid | fileid | userid
----------+---------+---------
1573737 | 3787433 | 1898598
1573737 | 3787433 | 1898598
1573820 | 3787563 | 1898598
18612041 | 9554275 | 1898598
It would be interesting to see the ctid, xmin, cmin, and xmax fields
as well.

Quote:
I suspect a corrupt index is at fault here.
Possibly, but the duplicate OID entries suggest that a row was updated
and then for some reason the old version appears to still be valid.
(Or an attempted update was rolled back, but the new version got marked
valid anyway.) Have you had any crashes or other strange behavior lately?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend


Reply With Quote
  #3  
Old   
Tom Lane
 
Posts: n/a

Default Re: [BUGS] Primary key duplicates - 08-13-2004 , 09:24 PM



Zeki Mokhtarzada <zeki (AT) freewebz (DOT) com> writes:
Quote:
The system is running on a Dell PowerEdge 2650 running RedHat 8. We had a
kernel halt about two weeks ago that was caused by one of our disk mirrors
failing. It could be that these problems were caused at that point in
time and are just being noticed now.
Not sure. xmin=2 (FrozenXid) implies that those rows have been like
that for a good long time (~ 1 billion transactions).

Quote:
But you never stated how to delete the duplicate rows. Any suggestions?
You can select 'em by ctid, viz WHERE ctid = '(53101,30)'

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html


Reply With Quote
  #4  
Old   
Tom Lane
 
Posts: n/a

Default Re: [BUGS] Primary key duplicates - 08-13-2004 , 09:51 PM



Zeki Mokhtarzada <zeki (AT) freewebz (DOT) com> writes:
Quote:
It appears that I can select the rows, but not delete or update them.
That's *very* odd, because it works for me. Is the same true of both
those duplicate rows, or did you only try the one with nonzero xmax?

I think there must be something we don't know about the states of those
rows. Could you grab pg_filedump from http://sources.redhat.com/rhdb/
(note: right at the moment that server seems to be down, but I suppose
it'll be up soon) and dump out page 53101 of this table? Usually
I find "-i -f" to be the most informative formatting option.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster


Reply With Quote
  #5  
Old   
Zeki Mokhtarzada
 
Posts: n/a

Default Re: [BUGS] Primary key duplicates - 08-15-2004 , 01:05 PM



On Fri, 13 Aug 2004, Tom Lane wrote:

Quote:
It would be interesting to see the ctid, xmin, cmin, and xmax fields
as well.
ctid | xmin | cmin | xmax | oid | fileid | userid
--------------+------------+------------+------------+----------+---------+---------
(53101,30) | 2 | 1478674026 | 1478674026 | 1573737 | 3787433 | 1898598
(53101,39) | 2 | 0 | 0 | 1573737 | 3787433 | 1898598

Quote:
Possibly, but the duplicate OID entries suggest that a row was updated
and then for some reason the old version appears to still be valid.
(Or an attempted update was rolled back, but the new version got marked
I've run the query with set enable_indexscan =off so that rules out an
index problem.


Quote:
valid anyway.) Have you had any crashes or other strange behavior lately?
The system is running on a Dell PowerEdge 2650 running RedHat 8. We had a
kernel halt about two weeks ago that was caused by one of our disk mirrors
failing. It could be that these problems were caused at that point in
time and are just being noticed now.

It looks like you've seen this problem before:

http://archives.postgresql.org/pgsql...2/msg00174.php

But you never stated how to delete the duplicate rows. Any suggestions?
Also, where can I find documentation on the purpose and values of the
ctid, oid, xmin, xmax, cmin, cmax columns?

Thanks!

-Zeki


---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo (AT) postgresql (DOT) org so that your
message can get through to the mailing list cleanly


Reply With Quote
  #6  
Old   
Zeki Mokhtarzada
 
Posts: n/a

Default Re: [BUGS] Primary key duplicates - 08-15-2004 , 01:06 PM



It appears that I can select the rows, but not delete or update them.

# select ctid from files where ctid = '(53101,30)';
ctid
------------
(53101,30)
(1 row)

# delete from files where ctid = '(53101,30)';
DELETE 0
# update files set fileid = 1000 where ctid = '(53101,30)';
UPDATE 0
#

-Zeki

On Fri, 13 Aug 2004, Tom Lane wrote:

Quote:
Zeki Mokhtarzada <zeki (AT) freewebz (DOT) com> writes:
The system is running on a Dell PowerEdge 2650 running RedHat 8. We had a
kernel halt about two weeks ago that was caused by one of our disk mirrors
failing. It could be that these problems were caused at that point in
time and are just being noticed now.

Not sure. xmin=2 (FrozenXid) implies that those rows have been like
that for a good long time (~ 1 billion transactions).

But you never stated how to delete the duplicate rows. Any suggestions?

You can select 'em by ctid, viz WHERE ctid = '(53101,30)'

regards, tom lane


---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo (AT) postgresql (DOT) org


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.