Re: Permission denied on fsync / Win32 (was [BUGS] right

mailing.database.pgsql-bugs


Peter Brant
 
Re: Permission denied on fsync / Win32 (was [BUGS] right - 04-13-2006, 03:18 PM

The culprit is CLUSTER. There is a batch file which runs CLUSTER
against six relatively small (60k rows between them) tables at 7am,
1pm, and 9pm. Following is the list of dates and hours when the
"Permission denied" errors showed up. They match up to a tee (although
the error apparently sometimes persists for a while).

The machine is clean (basically just Windows + Postgres [no AV,
firewall, etc. software]).

Pete

2006-03-20 21
2006-03-21 07
2006-03-22 21
2006-03-23 21
2006-03-23 22
2006-03-24 13
2006-03-24 21
2006-03-24 22
2006-03-26 13
2006-03-27 13
2006-03-27 21
2006-03-27 22
2006-03-28 13
2006-03-28 21
2006-03-29 13
2006-03-29 21
2006-03-30 13
2006-03-30 14
2006-03-30 15
2006-03-30 21
2006-03-30 22
2006-03-31 07
2006-03-31 08
2006-03-31 09
2006-03-31 10
2006-03-31 11
2006-03-31 12
2006-03-31 13
2006-04-03 21
2006-04-04 07
2006-04-05 07
2006-04-05 21
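
The "match up to a tee" claim can be checked mechanically. The following is a minimal sketch, not something from the original post: it assumes each line above is a date followed by the hour of day, and it treats an error hour as explained if it falls on a CLUSTER hour or directly follows another error hour on the same date (the "persists for a while" case). The function name is made up for illustration, and an episode that ran past midnight would not be recognised.

```python
from collections import defaultdict

# Hours at which the batch file runs CLUSTER (7am, 1pm, 9pm, per the post).
CLUSTER_HOURS = {7, 13, 21}

# Date/hour pairs copied from the list above.
ERROR_HOURS = """
2006-03-20 21
2006-03-21 07
2006-03-22 21
2006-03-23 21
2006-03-23 22
2006-03-24 13
2006-03-24 21
2006-03-24 22
2006-03-26 13
2006-03-27 13
2006-03-27 21
2006-03-27 22
2006-03-28 13
2006-03-28 21
2006-03-29 13
2006-03-29 21
2006-03-30 13
2006-03-30 14
2006-03-30 15
2006-03-30 21
2006-03-30 22
2006-03-31 07
2006-03-31 08
2006-03-31 09
2006-03-31 10
2006-03-31 11
2006-03-31 12
2006-03-31 13
2006-04-03 21
2006-04-04 07
2006-04-05 07
2006-04-05 21
"""


def unexplained_hours(log, scheduled=CLUSTER_HOURS):
    """Return (date, hour) pairs that neither fall on a CLUSTER hour nor
    directly follow another error hour on the same date."""
    hours_by_date = defaultdict(set)
    for line in log.splitlines():
        if line.strip():
            date, hour = line.split()
            hours_by_date[date].add(int(hour))

    leftovers = []
    for date in sorted(hours_by_date):
        hours = hours_by_date[date]
        for hour in sorted(hours):
            on_schedule = hour in scheduled
            continues_episode = (hour - 1) in hours
            if not (on_schedule or continues_episode):
                leftovers.append((date, hour))
    return leftovers


print(unexplained_hours(ERROR_HOURS))  # prints []
```

An empty result means every listed hour either coincides with a scheduled CLUSTER run or continues an episode that started at one.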


Quote:
Tom Lane <tgl (AT) sss (DOT) pgh.pa.us> 04/13/06 8:30 pm
The interesting thing is that _none_ of the referenced relfilenode
numbers actually appear in the file system.
Could they have been temporary tables? Alternatively, if you routinely
use TRUNCATE, CLUSTER, or REINDEX (all of which assign new relfilenode
numbers), then maybe they were older versions of tables that still
exist.


---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend


Tom Lane
 
Re: Permission denied on fsync / Win32 (was [BUGS] right - 04-13-2006, 07:41 PM

"Peter Brant" <Peter.Brant (AT) wicourts (DOT) gov> writes:
Quote:
The culprit is CLUSTER. There is a batch file which runs CLUSTER
against six relatively small (60k rows between them) tables at 7am,
1pm, and 9pm. Following is the list of dates and hours when the
"Permission denied" errors showed up. They match up to a tee (although
the error apparently sometimes persists for a while).

OK ... but what's still unclear is whether the failures are occurring
against the old relfilenode (the one just removed by the CLUSTER) or the
new one just added by CLUSTER. If you note the relfilenodes assigned to
these tables just before and just after the next cycle of CLUSTERs, it
should be easy to tell what the complaints refer to.

regards, tom lane
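
The before/after comparison suggested above could be sketched roughly as follows. This is an illustration only: it assumes the two snapshots were captured as table name to relfilenode mappings (for example from the `relname` and `relfilenode` columns of `pg_class`), and the function name, table names, and relfilenode values are all made up.

```python
def classify_relfilenode(failed_node, before, after):
    """Say whether a relfilenode named in an error message is a table's old
    file (replaced by CLUSTER), its new file, or neither.

    before / after: dicts mapping table name -> relfilenode, snapshotted
    just before and just after the CLUSTER run.
    """
    for table, old_node in before.items():
        new_node = after.get(table)
        if old_node == new_node:
            continue  # table was not rewritten between the snapshots
        if failed_node == old_node:
            return ("old", table)
        if failed_node == new_node:
            return ("new", table)
    return ("unknown", None)


# Made-up table names and relfilenode values, for illustration only.
before = {"orders": 16401, "customers": 16407}
after = {"orders": 16512, "customers": 16518}

print(classify_relfilenode(16401, before, after))  # ('old', 'orders')
print(classify_relfilenode(16518, before, after))  # ('new', 'customers')
```

A result of "unknown" would point at a relfilenode from some other cycle, or at a file that was never one of the clustered tables.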

Peter Brant
 
Re: Permission denied on fsync / Win32 (was [BUGS] right - 04-14-2006, 02:48 PM

Apparently we got lucky on all four servers with the latest cycle, so
nothing to report. Load (both reading and writing) is quite light today
so perhaps the bug is only triggered under a higher load. It seems the
problem typically doesn't show up on weekends either (when load is also
much lighter for us).

In any case, we're logging the relfilenodes before and after now, so
I'll post again when the problem crops up again next week.

Pete

Quote:
Tom Lane <tgl (AT) sss (DOT) pgh.pa.us> 04/14/06 2:41 am
OK ... but what's still unclear is whether the failures are occurring
against the old relfilenode (the one just removed by the CLUSTER) or the
new one just added by CLUSTER. If you note the relfilenodes assigned to
these tables just before and just after the next cycle of CLUSTERs, it
should be easy to tell what the complaints refer to.

regards, tom lane

Tom Lane
 
Re: Permission denied on fsync / Win32 (was [BUGS] right - 04-18-2006, 12:05 PM

"Peter Brant" <Peter.Brant (AT) wicourts (DOT) gov> writes:
Quote:
The error messages refer to the old relfilenode (in 3 out of 3
occurrences today).

So it'd seem the problem is with fsync on recently-deleted files.
Is it possible that we are getting EACCES (ERROR_SHARING_VIOLATION
maybe) in the situation where we try to fsync a file that's been
unlinked but isn't fully gone yet due to open handles?

regards, tom lane

Magnus Hagander
 
Re: Permission denied on fsync / Win32 (was [BUGS] right - 04-18-2006, 12:25 PM

Quote:
The error messages refer to the old relfilenode (in 3 out of 3
occurrences today).

So it'd seem the problem is with fsync on recently-deleted files.
Is it possible that we are getting EACCES (ERROR_SHARING_VIOLATION
maybe) in the situation where we try to fsync a file that's
been unlinked but isn't fully gone yet due to open handles?

I think that sounds "reasonable". Not as in a reasonable thing to do,
but as a reasonable thing to expect from the win32 api.

//Magnus

Tom Lane
 
Re: Permission denied on fsync / Win32 (was [BUGS] right - 04-18-2006, 12:36 PM

"Magnus Hagander" <mha (AT) sollentuna (DOT) net> writes:
Quote:
Is it possible that we are getting EACCES (ERROR_SHARING_VIOLATION
maybe) in the situation where we try to fsync a file that's
been unlinked but isn't fully gone yet due to open handles?

I think that sounds "reasonable". Not as in a reasonable thing to do,
but as a reasonable thing to expect from the win32 api.

Probably be good if someone can experimentally confirm that (and confirm
exactly which underlying Win32 error code it is) before we think about
how to fix it.

regards, tom lane

Magnus Hagander
 
Re: Permission denied on fsync / Win32 (was [BUGS] right - 04-18-2006, 01:46 PM

Quote:
Is it possible that we are getting EACCES (ERROR_SHARING_VIOLATION
maybe) in the situation where we try to fsync a file that's been
unlinked but isn't fully gone yet due to open handles?

I think that sounds "reasonable". Not as in a reasonable thing to do,
but as a reasonable thing to expect from the win32 api.

Probably be good if someone can experimentally confirm that
(and confirm exactly which underlying Win32 error code it is)
before we think about how to fix it.

Hmm. A very simple test that's basically:

A: open file (using win32_open copied from src/port)
A: fsync file (success)
B: delete file
A: fsync file again (success)

So it must be a different scenario that causes it.

Per the MSDN docs
(http://msdn.microsoft.com/library/de...y/en-us/vclib/html/_crt__commit.asp),
the result code should be EBADF and nothing else, which is clearly not
what's happening :-(

Because we are talking about checking the output from _commit(), right?
(being fsync() redefined)

//Magnus

Peter Brant
 
Re: Permission denied on fsync / Win32 (was [BUGS] right - 04-18-2006, 01:51 PM

It happens often enough and the episodes last long enough that grabbing
a handle dump while this is going on should be easily done.

Regarding the Win32 error code, backend/storage/file/fd.c calls
_commit() (http://msdn2.microsoft.com/en-us/lib...85(VS.80).aspx). It looks
like it is already using errno to report errors. Will GetLastError()
return something useful there?

Pete

Quote:
Tom Lane <tgl (AT) sss (DOT) pgh.pa.us> 04/18/06 7:35 pm
Probably be good if someone can experimentally confirm that (and confirm
exactly which underlying Win32 error code it is) before we think about
how to fix it.


Tom Lane
 
Re: Permission denied on fsync / Win32 (was [BUGS] right - 04-18-2006, 01:54 PM

"Magnus Hagander" <mha (AT) sollentuna (DOT) net> writes:
Quote:
Because we are talking about checking the output from _commit(), right?
(being fsync() redefined)

The failure could be coming from that, or from a preceding open() if the
bgwriter didn't already have the file open --- basically, the message
Peter is quoting indicates a failure return from FileSync() in fd.c.
The fact that it doesn't happen for him every time is pretty good
evidence that only one of those two cases fails.

regards, tom lane

Magnus Hagander
 
Re: Permission denied on fsync / Win32 (was [BUGS] right - 04-18-2006, 02:01 PM

Quote:
It happens often enough and the episodes last long enough
that grabbing a handle dump while this is going on should be
easily done.

Regarding the Win32 error code, backend/storage/file/fd.c
calls _commit().
http://msdn2.microsoft.com/en-us/lib...VS.80).aspx
It looks like it is already using errno to report errors. Will
GetLastError() return something useful there?

Good point.
Ran a quick test. If I open the file read-only and then fsync, I get
errno=9 (EBADF) and GetLastError()=5. Which explains the fact that we
got the wrong error-code. The *underlying API call* in _commit() returns
access denied...

Looking at the source to _commit(), if the call to FlushFileBuffers()
returns an error, it will set _doserrno to that value, and then return
with errno=EBADF.

So, this basically means that FlushFileBuffers() returns ACCESS
DENIED.
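
The error-code path described above can be modelled in a few lines. This is only a toy model of the behaviour as described in this post, not the CRT source and not a call into any Windows API. The function name and its return tuple are made up for illustration; the only values taken from the post are errno=9 (EBADF) and the Win32 code 5.

```python
import errno

ERROR_ACCESS_DENIED = 5  # Win32 error code; the GetLastError() value seen above


def model_commit(flush_error):
    """Toy model (not the real CRT) of _commit() as described in the post.

    flush_error: Win32 error code from FlushFileBuffers(), 0 for success.
    Returns (return_value, errno_value, doserrno_value).
    """
    if flush_error == 0:
        return (0, 0, 0)
    # Whatever the underlying failure was, the caller sees EBADF in errno;
    # the original Win32 code survives only in _doserrno.
    return (-1, errno.EBADF, flush_error)


rc, err, doserr = model_commit(ERROR_ACCESS_DENIED)
print(rc, errno.errorcode[err], doserr)  # -1 EBADF 5
```

The point of the model is that errno alone cannot distinguish an access-denied failure from any other FlushFileBuffers() failure, so a caller would need the underlying code to tell them apart.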

//Magnus
