
[BUGS] Possibly corrupted shared memory, PostgreSQL 8.1 beta2, Windows 2000

Forum: mailing.database.pgsql-bugs

#1  Jean-Pierre Pelletier - 10-04-2005, 11:27 AM






Hi,

I installed PostgreSQL 8.1 beta2 five days ago and it has crashed 3 times
since then.
Here is what was logged for the last crash:

2005-10-04 11:00:19 FATAL: could not read block 121 of relation 1663/16384/2608: Invalid argument
2005-10-04 11:00:20 LOG: server process (PID 2592) was terminated by signal 1
2005-10-04 11:00:20 LOG: terminating any other active server processes

Then, for each connection, the log has:
2005-10-04 11:00:20 WARNING: terminating connection because of crash of another server process
2005-10-04 11:00:20 DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2005-10-04 11:00:20 HINT: In a moment you should be able to reconnect to the database and repeat your command.

With this at the end:
2005-10-04 11:00:20 LOG: all server processes terminated; reinitializing
2005-10-04 11:00:21 LOG: database system was interrupted at 2005-10-04 10:59:43 Eastern Daylight Time

relation 2608 is pg_depend
----------------------------------------------------------------------------------
The crash before that was on relation pg_type; the first line logged was:
2005-10-03 10:51:06 FATAL: could not read block 38 of relation 1663/16384/1247: Invalid argument
----------------------------------------------------------------------------------
The first crash was also on relation pg_depend, but with open instead of
read:
2005-09-30 18:38:53 FATAL: could not open relation 1663/16384/2608: Invalid argument
----------------------------------------------------------------------------------
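
For readers decoding the log lines above: the three numbers in a path such as 1663/16384/2608 are, in order, the tablespace, the database, and the relation file node, the same spcNode, dbNode and relNode fields that the diagnostic patch later in this thread prints. The following is a minimal sketch in C, assuming only the path format shown in the log. RelPath, parse_relpath and relnode_name are hypothetical names for illustration, not PostgreSQL APIs, and the name table covers only the two catalogs identified in this thread.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper type: the three numbers from "spc/db/rel" in the log. */
typedef struct RelPath
{
    unsigned int spcNode;   /* tablespace */
    unsigned int dbNode;    /* database */
    unsigned int relNode;   /* relation file node */
} RelPath;

/*
 * Parse a string such as "1663/16384/2608".
 * Returns 1 on success, 0 if the input is malformed or has trailing text.
 */
int
parse_relpath(const char *text, RelPath *out)
{
    char trailing;

    assert(text != NULL && out != NULL);

    /* The extra %c detects trailing garbage: exactly 3 conversions must succeed. */
    return sscanf(text, "%u/%u/%u%c",
                  &out->spcNode, &out->dbNode, &out->relNode,
                  &trailing) == 3;
}

/* Catalog names mentioned in this thread; anything else is reported as unknown. */
const char *
relnode_name(unsigned int relNode)
{
    switch (relNode)
    {
        case 1247:
            return "pg_type";
        case 2608:
            return "pg_depend";
        default:
            return "(unknown)";
    }
}
```

Calling parse_relpath("1663/16384/2608", &p) fills p.relNode with 2608, which relnode_name reports as pg_depend, matching the identification given in the post.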

There were between 14 and 17 connections when these crashes happened.

The database was not reloaded from a backup but created from
.sql scripts for DDL, and the data for user tables was reloaded
from files with "copy from".

We are using Windows 2000 Server, Service Pack 4.

Thanks,
Jean-Pierre Pelletier
e-djuster


---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

#2  Qingqing Zhou - 10-04-2005, 11:11 PM

This problem has been reported several times before, though not necessarily
on system tables. Is any anti-virus software installed on the same machine?
Is the database under intensive I/O pressure?

Regards,
Qingqing





#3  Jean-Pierre Pelletier - 10-05-2005, 08:29 AM




Yes, there is antivirus software on the machine. A reboot is needed to
turn it off; I'll be allowed to reboot tonight, or I'll do it sooner if
the server crashes before that.

There are around 15 connections to PostgreSQL when it crashes, but most are
idle. There may be a few inserts but no bulk inserts; the biggest load would
come from select statements.

Jean-Pierre Pelletier



#4  Qingqing Zhou - 10-05-2005, 01:22 PM

We haven't yet established whether the failed reads/writes are caused by
anti-virus software or by intensive I/O. If you can compile the source, could
you patch smgrread()/smgrwrite() like this to capture the native Windows error:

void
smgrwrite(SMgrRelation reln, BlockNumber blocknum, char *buffer, bool isTemp)
{
    if (!(*(smgrsw[reln->smgr_which].smgr_write)) (reln, blocknum, buffer, isTemp))
        ereport(ERROR,
                (errcode_for_file_access(),
                 /* Diagnostic only: print the native Win32 error code after the relation path. */
                 errmsg("could not write block %u of relation %u/%u/%u:%lu: %m",
                        blocknum,
                        reln->smgr_rnode.spcNode,
                        reln->smgr_rnode.dbNode,
                        reln->smgr_rnode.relNode,
                        /* GetLastError() returns a DWORD, so cast it to match %lu. */
                        (unsigned long) GetLastError())));
}
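
Some background on why the patch above adds GetLastError(): the %m in the message prints the text for the C errno value, and on Windows that value is typically derived from a native error code. Several unrelated native errors can end up as the same errno, so "Invalid argument" alone says little about what actually failed. The sketch below is a toy illustration of that information loss and is not PostgreSQL's real mapping table. map_native_error, demo_information_loss and the tiny table are hypothetical, and the fallback to EINVAL for unlisted codes is an assumption made for the example. The numeric constants are the standard Win32 values for ERROR_FILE_NOT_FOUND (2), ERROR_ACCESS_DENIED (5), ERROR_SHARING_VIOLATION (32) and ERROR_INVALID_PARAMETER (87), redefined so the code builds on any platform.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Standard Win32 numeric values, redefined here so the sketch builds anywhere. */
#define NATIVE_FILE_NOT_FOUND      2UL
#define NATIVE_ACCESS_DENIED       5UL
#define NATIVE_SHARING_VIOLATION  32UL
#define NATIVE_INVALID_PARAMETER  87UL

typedef struct ErrMap
{
    unsigned long native;   /* native (Win32-style) error code */
    int           posix;    /* errno value it is translated to */
} ErrMap;

/* Deliberately tiny, hypothetical table: NOT PostgreSQL's real mapping. */
static const ErrMap toy_map[] = {
    {NATIVE_FILE_NOT_FOUND, ENOENT},
    {NATIVE_ACCESS_DENIED, EACCES},
};

/* Translate a native code to errno; codes not in the table become EINVAL (assumption). */
int
map_native_error(unsigned long native)
{
    size_t i;

    for (i = 0; i < sizeof(toy_map) / sizeof(toy_map[0]); i++)
    {
        if (toy_map[i].native == native)
            return toy_map[i].posix;
    }
    return EINVAL;
}

/* Two unrelated native failures become indistinguishable after mapping. */
void
demo_information_loss(void)
{
    assert(map_native_error(NATIVE_SHARING_VIOLATION) == EINVAL);
    assert(map_native_error(NATIVE_INVALID_PARAMETER) == EINVAL);
    assert(map_native_error(NATIVE_ACCESS_DENIED) == EACCES);
}
```

In this toy model a sharing violation and an invalid parameter both surface as EINVAL, which is why logging the native code next to %m helps tell such cases apart.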

Regards,
Qingqing





#5  Jean-Pierre Pelletier - 10-05-2005, 02:03 PM



Recompiling with the trace is no problem;
I'll install the patched build tonight.

After your last email, I excluded the PostgreSQL
directory from the antivirus, because I could do that without
rebooting.

I was also sometimes getting read/write or open
errors ("Invalid argument") without the server crashing.
If I haven't seen any of these error messages after
two days, there is a very high chance that the problem
has been fixed by turning off the antivirus.

Jean-Pierre Pelletier



#6  Jean-Pierre Pelletier - 10-07-2005, 10:30 AM



Turning off the antivirus fixed the problem.
We haven't had any read/write/open errors in more
than two days.

Thank you very much for your help and keep up the good work.

Our only remaining PostgreSQL problem is that pg_stat_activity
is unreliable and the statistics collector is restarted many times
every day.

Any idea what might be causing that?

Jean-Pierre Pelletier



#7  Alvaro Herrera - 10-07-2005, 10:30 AM



On Fri, Oct 07, 2005 at 11:19:25AM -0400, Jean-Pierre Pelletier wrote:

Quote:
Our only remaining PostgreSQL problem is with pg_stat_activity
being unreliable and the statistics collector being restarted many times
every day.

The stats collector (which maintains pg_stat_activity among other things)
uses a UDP socket to receive info from the backends, so if UDP
communication is crippled, it's going to be unreliable. Maybe there are
too many lost packets. I don't know what could cause it to die, though;
certainly not lost packets. (The postmaster restarts it
automatically if it detects it's not running.)

--
Alvaro Herrera http://www.advogato.org/person/alvherre
"Everybody understands Mickey Mouse. Few understand Hermann Hesse.
Hardly anybody understands Einstein. And nobody understands Emperor Norton."



#8  Qingqing Zhou - 10-07-2005, 11:21 AM

"Jean-Pierre Pelletier" <pelletier_32 (AT) sympatico (DOT) ca> wrote:
Quote:
Turning off the antivirus fixed the problem.
We haven't had any read/write/open errors in more
than two days.

Thank you very much for your help and keep up the good work.

You are welcome :-) But I still doubt whether this really solves the
problem. By the way, may I know which anti-virus software you are using? And,
if possible, could you please turn the anti-virus software on again and
check the GetLastError() value?

A more detailed "guess" of the problem is here:
http://archives.postgresql.org/pgsql...7/msg00489.php

Thanks a lot,
Qingqing





#9  Jean-Pierre Pelletier - 10-11-2005, 07:41 AM



The antivirus is CA eTrust EZ v 7.0.6.7.

I cannot re-enable the antivirus on that server
because it is now in production.

Jean-Pierre Pelletier


