![]() | |
![]() |
| | Thread Tools | Display Modes |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
The following bug has been logged on the website: Bug reference: 6661 Logged by: Valentine Gogichashvili Email address: valgog (AT) gmail (DOT) com PostgreSQL version: 9.0.7 Operating system: Linux version 2.6.32-5-amd64 (Debian 2.6.32-41sque Description: Hello, today when trying to restore a replication database I got the following error: ... 2012-05-22 21:20:24.823 CEST,,,23804,,4fbbe69e.5cfc,124,,2012-05-22 21:18:54 CEST,1/0,0,LOG,00000,"restored log file ""0000000200001C9C0000001E"" from archive",,,,,,,,,"" 2012-05-22 21:20:24.954 CEST,,,23804,,4fbbe69e.5cfc,125,,2012-05-22 21:18:54 CEST,1/0,0,LOG,00000,"restored log file ""0000000200001C9C0000001F"" from archive",,,,,,,,,"" 2012-05-22 21:20:26.374 CEST,,,23804,,4fbbe69e.5cfc,126,,2012-05-22 21:18:54 CEST,1/0,0,LOG,00000,"restored log file ""0000000200001C9C00000020"" from archive",,,,,,,,,"" 2012-05-22 21:20:27.238 CEST,,,23804,,4fbbe69e.5cfc,127,,2012-05-22 21:18:54 CEST,1/0,0,LOG,00000,"restored log file ""0000000200001C9C00000021"" from archive",,,,,,,,,"" 2012-05-22 21:20:27.443 CEST,,,23804,,4fbbe69e.5cfc,128,,2012-05-22 21:18:54 CEST,1/0,0,LOG,00000,"consistent recovery state reached at 1C9C/21318C20",,,,,,,,,"" 2012-05-22 21:20:27.868 CEST,,,23804,,4fbbe69e.5cfc,129,,2012-05-22 21:18:54 CEST,1/0,0,LOG,00000,"4 KnownAssignedXids (num=4 tail=0 head=916) [0]=3674726497 [1]=3674727041 [2]=3674727042 [3]=3674727128 ",,,,,"xlog redo running xacts: nextXid 3674728633 latestCompletedXid 3674728632 oldestRunningXid 3674726497; 4 xacts: 3674727041 3674727042 3674727128 3674726497",,,,"" 2012-05-22 21:20:27.868 CEST,,,23804,,4fbbe69e.5cfc,130,,2012-05-22 21:18:54 CEST,1/0,0,FATAL,XX000,"out-of-order XID insertion in KnownAssignedXids",,,,,"xlog redo running xacts: nextXid 3674728633 latestCompletedXid 3674728632 oldestRunningXid 3674726497; 4 xacts: 3674727041 3674727042 3674727128 3674726497",,,,"" 2012-05-22 21:20:27.939 CEST,,,23793,,4fbbe69e.5cf1,1,,2012-05-22 21:18:54 CEST,,0,LOG,00000,"startup process (PID 23804) exited with exit code 1",,,,,,,,,"" 2012-05-22 21:20:27.939 CEST,,,23793,,4fbbe69e.5cf1,2,,2012-05-22 21:18:54 CEST,,0,LOG,00000,"terminating any other active server processes",,,,,,,,,"" |
#3
| |||
| |||
|
|
2012-05-22 21:20:27.868 CEST,,,23804,,4fbbe69e.5cfc,129,,2012-05-22 21:18:54 CEST,1/0,0,LOG,00000,"4 KnownAssignedXids (num=4 tail=0 head=916) [0]=3674726497 [1]=3674727041 [2]=3674727042 [3]=3674727128 ",,,,,"xlog redo running xacts: nextXid 3674728633 latestCompletedXid 3674728632 oldestRunningXid 3674726497; 4 xacts: 3674727041 3674727042 3674727128 3674726497",,,,"" 2012-05-22 21:20:27.868 CEST,,,23804,,4fbbe69e.5cfc,130,,2012-05-22 21:18:54 CEST,1/0,0,FATAL,XX000,"out-of-order XID insertion in KnownAssignedXids",,,,,"xlog redo running xacts: nextXid 3674728633 latestCompletedXid 3674728632 oldestRunningXid 3674726497; 4 xacts: 3674727041 3674727042 3674727128 3674726497",,,,"" 2012-05-22 21:20:27.939 CEST,,,23793,,4fbbe69e.5cf1,1,,2012-05-22 21:18:54 CEST,,0,LOG,00000,"startup process (PID 23804) exited with exit code 1",,,,,,,,,"" 2012-05-22 21:20:27.939 CEST,,,23793,,4fbbe69e.5cf1,2,,2012-05-22 21:18:54 CEST,,0,LOG,00000,"terminating any other active server processes",,,,,,,,,"" Could you provide a log with log_min_messages=DEBUG4? This will be quite long |
#4
| |||
| |||
|
|
2012-05-22 21:20:27.868 CEST,,,23804,,4fbbe69e.5cfc,130,,2012-05-22 21:18:54 CEST,1/0,0,FATAL,XX000,"out-of-order XID insertion in KnownAssignedXids",,,,,"xlog redo running xacts: nextXid 3674728633 latestCompletedXid 3674728632 oldestRunningXid 3674726497; 4 xacts: 3674727041 3674727042 3674727128 3674726497",,,,"" 2012-05-22 21:20:27.939 CEST,,,23793,,4fbbe69e.5cf1,1,,2012-05-22 21:18:54 CEST,,0,LOG,00000,"startup process (PID 23804) exited with exit code 1",,,,,,,,,"" 2012-05-22 21:20:27.939 CEST,,,23793,,4fbbe69e.5cf1,2,,2012-05-22 21:18:54 CEST,,0,LOG,00000,"terminating any other active server processes",,,,,,,,,"" Could you provide a log with log_min_messages=DEBUG4? This will be quite long though... Andres |

#5
| |||
| |||
|
|
CEST,,0,LOG,00000,"startup process (PID 23804) exited with exit code 1",,,,,,,,,"" 2012-05-22 21:20:27.939 CEST,,,23793,,4fbbe69e.5cf1,2,,2012-05-22 21:18:54 CEST,,0,LOG,00000,"terminating any other active server processes",,,,,,,,,"" Could you provide a log with log_min_messages=DEBUG4? This will be quite long though... Hm... I have the data directory, but not the WAL files any more. Forgot to back them up ![]() Too bad, I think without more information we won't be able to fix this. I feel |

#6
| |||
| |||
|
|
On Wednesday, May 23, 2012 12:30:31 PM Valentine Gogichashvili wrote: CEST,,0,LOG,00000,"startup process (PID 23804) exited with exit code 1",,,,,,,,,"" 2012-05-22 21:20:27.939 CEST,,,23793,,4fbbe69e.5cf1,2,,2012-05-22 21:18:54 CEST,,0,LOG,00000,"terminating any other active server processes",,,,,,,,,"" Could you provide a log with log_min_messages=DEBUG4? This will be quite long though... Hm... I have the data directory, but not the WAL files any more. Forgot to back them up ![]() Too bad, I think without more information we won't be able to fix this. I feel bad for it but I actually you run into it again ![]() Here is what I found strange, should somebody ever come back to the issue: - the standby was in a consistent state - according to the error annotation we were processing a XLOG_RUNNING_XACTS - ProcArrayApplyRecoveryInfo with STANDBY_SNAPSHOT_READY only does ExpireOldKnownAssignedTransactionIds(running->oldestRunningXid); StandbyReleaseOldLocks(running->xcnt, running->xids); before short-circuiting. How comes we call KnownAssignedXidsAdd in that case? Looking over the code again reachedConsistency doesn't imply |
#7
| |||
| |||
|
|
2012-05-22 21:20:27.868 CEST,,,23804,,4fbbe69e.5cfc,130,,2012-05-22 21:18:54 CEST,1/0,0,FATAL,XX000,"out-of-order XID insertion in KnownAssignedXids",,,,,"xlog redo running xacts: nextXid 3674728633 latestCompletedXid 3674728632 oldestRunningXid 3674726497; 4 xacts: 3674727041 3674727042 3674727128 3674726497",,,,"" 2012-05-22 21:20:27.939 CEST,,,23793,,4fbbe69e.5cf1,1,,2012-05-22 21:18:54 CEST,,0,LOG,00000,"startup process (PID 23804) exited with exit code 1",,,,,,,,,"" 2012-05-22 21:20:27.939 CEST,,,23793,,4fbbe69e.5cf1,2,,2012-05-22 21:18:54 CEST,,0,LOG,00000,"terminating any other active server processes",,,,,,,,,"" Could you provide a log with log_min_messages=DEBUG4? This will be quite long though... Andres Hm... I have the data directory, but not the WAL files any more. Forgot to back them up ![]() Do you have both, longrunning transactions and transactions with many |
#8
| |||
| |||
|
|
subtransactions? Yes long running plane SQL queries as well as many short SQL queries running one after another in one long spanning transaction (we are working |
#9
| |||
| |||
|
|
But no subtransactions with writes (plpgsql + EXCEPTION also counts if does DML)? On the master, yes. Practically all the stored procedures there are catching exceptions in their own bodies and returning result codes back. Regards, -- Valentine |
#10
| |||
| |||
|
|
I have the situation again, one of 3 slaves was slow to play all the WAL files and being about 10GB late it crashed with the same error again. I collected DEBUG4 output in this time: https://docs.google.com/open?id=0B2N...jNDbU0xQ3lvWms Ok, I stared at this some time and I think I see what the problem is. Some log |
|
numKnownAssignedXids != 0. |
![]() |
| Thread Tools | |
| Display Modes | |
| |