#1
Tom Lane <tgl (AT) sss (DOT) pgh.pa.us> 04/13/06 8:30 pm
The interesting thing is that _none_ of the referenced relfilenode numbers actually appear in the file system.
#2
The culprit is CLUSTER. A batch file runs CLUSTER against six relatively small tables (60k rows between them) at 7am, 1pm, and 9pm. Following is the list of dates and hours when the "Permission denied" errors showed up. They match the schedule to a T (although the error apparently sometimes persists for a while).
#3
Tom Lane <tgl (AT) sss (DOT) pgh.pa.us> 04/14/06 2:41 am
OK ... but what's still unclear is whether the failures are occurring
#4
The error messages refer to the old relfilenode (in 3 out of 3 occurrences today).
#5
The error messages refer to the old relfilenode (in 3 out of 3 occurrences today).

So it'd seem the problem is with fsync on recently-deleted files. Is it possible that we are getting EACCES (ERROR_SHARING_VIOLATION maybe) in the situation where we try to fsync a file that's been unlinked but isn't fully gone yet due to open handles?
#6
Is it possible that we are getting EACCES (ERROR_SHARING_VIOLATION maybe) in the situation where we try to fsync a file that's been unlinked but isn't fully gone yet due to open handles?

I think that sounds "reasonable". Not as in a reasonable thing to do, but as a reasonable thing to expect from the win32 api.
#7
Is it possible that we are getting EACCES (ERROR_SHARING_VIOLATION maybe) in the situation where we try to fsync a file that's been unlinked but isn't fully gone yet due to open handles?

I think that sounds "reasonable". Not as in a reasonable thing to do, but as a reasonable thing to expect from the win32 api.

Probably be good if someone can experimentally confirm that (and confirm exactly which underlying Win32 error code it is) before we think about how to fix it.
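For anyone wanting to try the experiment, here is a minimal sketch of a standalone probe. It is not backend code: `sync_after_unlink` is a made-up name, and the Windows branch assumes the file is opened with FILE_SHARE_DELETE, since otherwise the delete is refused while a handle is open. It creates a file, deletes it while the descriptor is still open, syncs, and returns whatever errno the sync produced. It only covers the single-process case; a handle held open by a second process would need a separate test.

```c
#if !defined(_WIN32) && !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <stdint.h>
#include <fcntl.h>

#ifdef _WIN32
#include <windows.h>
#include <io.h>
#else
#include <unistd.h>
#endif

/*
 * Hypothetical probe (not from the PostgreSQL tree).
 * Returns 0 if the sync after delete succeeded, the errno value if the
 * sync failed, or -1 if the file could not be set up or deleted.
 */
static int
sync_after_unlink(const char *path)
{
    int fd;
    int err = 0;

#ifdef _WIN32
    /* FILE_SHARE_DELETE lets DeleteFileA succeed while the handle is open. */
    HANDLE h = CreateFileA(path, GENERIC_READ | GENERIC_WRITE,
                           FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                           NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);

    if (h == INVALID_HANDLE_VALUE)
        return -1;
    fd = _open_osfhandle((intptr_t) h, _O_RDWR);
    if (fd < 0)
    {
        CloseHandle(h);
        return -1;
    }
    if (_write(fd, "x", 1) != 1 || !DeleteFileA(path))
    {
        _close(fd);
        return -1;
    }
    if (_commit(fd) != 0)       /* what fsync() maps to on this platform */
        err = errno;
    _close(fd);
#else
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (write(fd, "x", 1) != 1 || unlink(path) != 0)
    {
        close(fd);
        return -1;
    }
    if (fsync(fd) != 0)
        err = errno;
    close(fd);
#endif
    return err;
}
```

On a POSIX system the call returns 0, because an unlinked file stays fully usable through an open descriptor. What the Windows branch returns is the thing to be measured.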
#8
Tom Lane <tgl (AT) sss (DOT) pgh.pa.us> 04/18/06 7:35 pm
Probably be good if someone can experimentally confirm that (and
#9
Because we are talking about checking the result of _commit(), right? (That being what fsync() is redefined to.)
#10
It happens often enough and the episodes last long enough that grabbing a handle dump while this is going on should be easily done.

Regarding the Win32 error code, backend/storage/file/fd.c calls _commit(). http://msdn2.microsoft.com/en-us/lib...VS.80).aspx

It looks like it is already using errno to report errors. Will GetLastError() return something useful there?
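One way to answer the GetLastError() question empirically is to capture both values immediately after the sync call and log them side by side. The following is a rough sketch only. `sync_and_capture`, `report_sync_result` and `struct sync_result` are hypothetical names, not anything in fd.c, and whether the CRT leaves a meaningful last-error value behind after _commit() returns is exactly the thing being tested, not something the code assumes.

```c
#if !defined(_WIN32) && !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <stdio.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#include <io.h>
#define sync_fd(fd) _commit(fd)
#else
#include <unistd.h>
#define sync_fd(fd) fsync(fd)
#endif

/* Hypothetical record of one sync attempt. */
struct sync_result
{
    int           rc;           /* return value of the sync call */
    int           saved_errno;  /* errno right after a failure, else 0 */
    unsigned long win32_error;  /* GetLastError() after a failure; 0 elsewhere */
};

static struct sync_result
sync_and_capture(int fd)
{
    struct sync_result r = {0, 0, 0};

    errno = 0;
#ifdef _WIN32
    SetLastError(0);            /* so a stale value is not mistaken for ours */
#endif
    r.rc = sync_fd(fd);
    if (r.rc != 0)
    {
#ifdef _WIN32
        /* Read the OS code first, before any other call can overwrite it. */
        r.win32_error = GetLastError();
#endif
        r.saved_errno = errno;
    }
    return r;
}

static void
report_sync_result(const char *label, struct sync_result r)
{
    if (r.rc == 0)
        fprintf(stderr, "%s: sync ok\n", label);
    else
        fprintf(stderr, "%s: sync failed: errno=%d (%s), win32 error=%lu\n",
                label, r.saved_errno, strerror(r.saved_errno), r.win32_error);
}
```

Calling `report_sync_result("relfilenode sync", sync_and_capture(fd))` at the point of failure would show whether the Win32 code is ERROR_SHARING_VIOLATION, ERROR_ACCESS_DENIED, or nothing useful at all.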