#1
#2
Over the last few weeks I discovered an engine behavior that was new to me. I believe others will be surprised too...

IFX 11.50 FC9

When you restore an archive together with logical logs, you may need the files/tapes of logical logs that were backed up before the level-0 archive. The reason is open transactions.

I have a situation here where, for some reason, a session kept its transaction open for up to 4 hours (probably an application failure in its transaction control). When the level-0 archive was executed, this session had been idle (with the TX open) for 3 hours. The "lag" between the logical log in which this TX was opened and the log that was current when the archive was taken is at least 100 logical logs (30 MB each = 3 GB of logs).

So, during the restore, the engine requests all logical logs since this open TX began. The engine includes these logical logs in the archive and shows their numbers when the level-0 archive finishes. Even so, you still need the tapes/files created before this archive in order to restore the logical logs backed up after this archive.

For example, when the archive below started running, the current log was 129634, and log 129537 had been backed up 5 hours earlier. If I try to restore this archive with logical logs, the engine will request the files/tapes starting from log 129537.

    Please enter the level of archive to be performed (0, 1, or 2) 0
    Please mount tape 1 on /dev/rmt0 and press Return to continue ....
    10 percent done.
    20 percent done.
    30 percent done.
    40 percent done.
    50 percent done.
    60 percent done.
    70 percent done.
    80 percent done.
    100 percent done.
    Read/Write End Of Medium enabled: blocks = 307955
    Please label this tape as number 1 in the arc tape sequence.
    This tape contains the following logical logs:
     129537 - 129634
    Program over.

A PMR was opened about this behavior and was answered as "expected behavior". An APAR was created against the documentation (this behavior wasn't documented before).

So, if you have any policy of discarding logical log backups taken before your archive... beware.

Regards
Cesar
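The log arithmetic in the example above can be sketched in a few lines of Python. This is only an illustration of the rule described in the post, not Informix tooling: the names `OpenTx`, `oldest_log_needed` and `safe_to_discard` are hypothetical, and the 30 MB log size is simply the figure quoted above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OpenTx:
    """A transaction still open at the level-0 archive checkpoint (hypothetical model)."""
    tx_id: str
    begin_log: int  # number of the logical log holding its BEGIN WORK record


def oldest_log_needed(archive_log: int, open_txs: List[OpenTx]) -> int:
    """Oldest logical log a restore with logical logs would ask for.

    It is the log of the oldest open transaction's BEGIN WORK, or the log
    current at archive time if no transaction was open.
    """
    return min([archive_log] + [tx.begin_log for tx in open_txs])


def safe_to_discard(log_backup: int, archive_log: int, open_txs: List[OpenTx]) -> bool:
    """A log backup is only safe to discard if it is older than every log the restore needs."""
    return log_backup < oldest_log_needed(archive_log, open_txs)


if __name__ == "__main__":
    LOG_SIZE_MB = 30  # assumption taken from the post
    archive_log = 129634
    txs = [OpenTx("idle-session", 129537)]

    first = oldest_log_needed(archive_log, txs)
    count = archive_log - first + 1
    print(first)                                   # 129537
    print(count, "logs,", count * LOG_SIZE_MB, "MB")  # 98 logs, 2940 MB
    print(safe_to_discard(129537, archive_log, txs))  # False
```

With the numbers from the archive output, the restore starts at log 129537, so a retention policy that only looks at the archive's own log number (129634) would throw away 97 log backups that are still needed.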
_______________________________________________
Informix-list mailing list
Informix-list (AT) iiug (DOT) org
http://www.iiug.org/mailman/listinfo/informix-list
#4
Hi,

sorry that it took me quite a while to pick up on this topic ... but here's an attempt to answer some of the questions:

- Cesar Inacio described the existing behavior correctly. It really is the case that, upon logical log restore, the oldest log file to be restored is the one that contains the BEGIN WORK log record of the oldest transaction that is (still) open at the time of the level-0 archive checkpoint. As such, this log file can be much older than the level-0 archive itself.

- This is not new behavior. It has been like that for a long time (some say since 1988, and that's most probably correct). At least I confirmed that a 7.31.UD8 from 2004 had exactly this behavior.

- For a level-0 archive to be restorable without a logical log restore, the logical log records needed to make the restore logically consistent are contained within the level-0 archive. This is especially needed for the cases where logical logs are never backed up but only discarded (think of LTAPEDEV in the onconfig file set to /dev/null).
  These needed logical log records are applied after the physical restore (which is the copying of pages from tape to chunks on disk) when transactions were open at the time of the level-0 archive checkpoint. Applying these logical log records rolls back those open transactions and thus makes the restore logically consistent.

- If the physical restore of a level-0 archive is to be followed by a logical restore, the space where the logical log files reside in the chunks on disk is cleared before doing the logical log restore. This step is done to ensure that no obsolete log record data or otherwise corrupt data is lingering in the log space (in chunks on disk). After this clearing, all the needed logical log files must be restored, including those that contain log records of an open transaction that started before the level-0 archive checkpoint.
  [ It has already been discussed in this email thread that these old log files may be needed to roll back the open transactions if they are not committed during the log roll forward. ]

- The remaining question now is: why is this clearing of the log space so important before starting the log restore?
  The reason is the way the logical roll forward is implemented (and this again is an implementation that has always been like this and has not changed recently). The logical roll forward processes the log records sequentially, and after each processed log record it looks on disk to see whether there are more log records to be processed. The end of the logical roll forward is reached when the next logical log file page on disk is all zero. And here is the catch: without the clearing of the logical log space before the restore of log files, some "garbage data" may still be left over on disk (data that was also not overwritten by the physical restore phase). Such "garbage data" may trick the logical roll forward into finding and applying obsolete "stuff" after the last proper log record.
  There are mainly three scenarios that can ensue when the logical roll forward encounters "garbage data" on disk after the last valid and correct log record:
  - The log roll forward recognizes the garbage as garbage and correctly ends the log roll forward. This is the good case.
  - The log roll forward thinks that there is a valid log record, but when attempting to apply it, it gets completely corrupted and cannot recover from this. In this case the log roll forward would just stop and leave the system in a hopelessly inconsistent state, making the whole restore a failure.
  - The log roll forward applies obsolete log records successfully. This would probably be a rare but possible case, especially when there were old log files on that part of the disk before the restore. Even though the restore may well finish, this would probably be the worst case, as data has been corrupted without any notice.

- As all this is not new behavior but was 'always' like this, there is not much motivation to change it, at least not for the time being. The current implementation is rather stable and reliable (and has been for quite a few years), an important attribute for the backup and restore functionality. Improvements to avoid the need to restore those old log files would require a medium-level redesign and reimplementation of otherwise reliably working code, adding a real risk of introducing new problems and making the code unstable (at least for some time), a prospect that many customers probably would not be happy with ...

- Some years ago we also discussed the possibility of making the validity check for a log page found on disk much stronger than it currently is. This too would require redesign and reimplementation, but would additionally diminish performance during normal operation. The logging itself would have to create more redundant data, adding it to the log records or the log pages, so that the later checks can give more assurance. More redundant data also means bigger log records and hence more I/O when writing them (to disk). At the time of these discussions, the costs involved were clearly deemed to far outweigh the possible benefits.

Long response, I know ... :-) But I hope it can clarify some things and give some background info for better understanding ... which hopefully can help with accepting the current implementation. At first sight it may look quite dumb, but after all there is some thinking behind it.
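The end-of-log detection described above, and the third ("worst case") scenario, can be illustrated with a toy model in Python. This is a sketch under loud assumptions and not the engine's implementation: the page size, the page contents and the function names `make_page`, `restore_logs` and `roll_forward` are all invented for the illustration. The only behavior taken from the explanation is that the roll forward reads pages sequentially and stops at the first all-zero page.

```python
from typing import List

PAGE_SIZE = 16               # toy page size, not the real log page size
ZERO_PAGE = bytes(PAGE_SIZE)  # an all-zero page marks the end of the log


def make_page(text: str) -> bytes:
    """Pad a toy log record to one fixed-size page (hypothetical format)."""
    return text.encode("ascii").ljust(PAGE_SIZE, b"\0")


def restore_logs(log_space: List[bytes], restored: List[bytes], clear_first: bool) -> None:
    """Copy restored log pages to the start of the log space.

    With clear_first=True the whole log space is zeroed beforehand,
    mirroring the clearing step described in the post.
    """
    if clear_first:
        for i in range(len(log_space)):
            log_space[i] = ZERO_PAGE
    log_space[:len(restored)] = restored


def roll_forward(log_space: List[bytes]) -> List[str]:
    """Apply pages sequentially until the next page is all zero."""
    applied = []
    for page in log_space:
        if page == ZERO_PAGE:
            break
        applied.append(page.rstrip(b"\0").decode("ascii"))
    return applied


if __name__ == "__main__":
    # Old log pages still lying around in the chunk before the restore.
    stale = [make_page(t) for t in ("old A", "old B", "old C", "old D")]
    restored = [make_page("new 1"), make_page("new 2")]

    dirty = list(stale)
    restore_logs(dirty, restored, clear_first=False)
    print(roll_forward(dirty))   # ['new 1', 'new 2', 'old C', 'old D']

    clean = list(stale)
    restore_logs(clean, restored, clear_first=True)
    print(roll_forward(clean))   # ['new 1', 'new 2']
```

Without the clearing step, the toy roll forward runs past the last restored page and "applies" the stale pages `old C` and `old D`. With the clearing step it stops right after the restored pages, but the price is that every needed log, including the old ones, has to be brought back from backup.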
Regards,
Martin
--
Martin Fuerderer
IBM Informix Development
Munich, Germany
Information Management
Read about the Informix Warehouse Accelerator: http://tinyurl.com/the-iwa-blog
IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Martina Koederitz
Board of Management: Dirk Wittkopp
Corporate Seat: Boeblingen, Germany
Reg.-Gericht: Amtsgericht Stuttgart, HRB 243294

informix-list-boun... (AT) iiug (DOT) org wrote on 01/20/2012 12:52:01 PM:
> [...]
#5
Hello Martin,

thanks for the info. I had not bumped into this issue until this message; one is never too old to learn, I guess. And yes, I started up my old Indy to check this against an old V7 engine: that one cleared the logs serially; if I recall correctly, V93 and higher clear them in parallel.

Anyway, I am a bit surprised here, since the info needed is on disk. I know the logs need to be cleared, but the clearing could be done around the logs that are in the level-0 archive.

The answer is probably "do not fix what is not broken", a much-hated comment on things that are not perfect.

See you,
Superboer

On 24 jan, 13:23, Martin Fuerderer <MARTI... (AT) de (DOT) ibm.com> wrote:
> [...]