#1
#2
I'm looking for assistance with Online version 5.0.5. I need to know if there is a way to detect problems at the system level of an Online 5 instance besides using tbcheck.

Informix version: Online version 5.0.5 UC1
OS: SCO Unix System 5 Ver. 3.2

The whole story:

One of my clients provides software and servers to a number of their clients. Their application is running on Online 5.05.UC1 on SCO boxes. Upgrading is not an option; they are locked into this version due to regulatory constraints. The application database never exceeds about 500 MB; they provide a process to remove old records as the database grows so that their clients never have to be concerned with adding disk space or other administration tasks.

The problem is this:

They've run into an issue with several of their biggest clients where informix tbtape archives stop working properly. Rather than taking an hour as expected, the archive takes just 5-10 minutes, then when my client tries to restore their archive to his server, the restore completes in just 5-10 minutes but doesn't really restore anything and the database is "broken" (tbchecks against the resulting database report errors). Tbtape reports no errors; the normal archive started and completed messages appear in the online log. No errors appear in the online log of the restored instance, either.

I've had him run tbcheck -cr, -cc, and -ce on the originating instance, and all come back with no errors. There are no tbcheck data or index errors in the databases of the originating instances and the application works without error. The only symptom (and problem) is that an archive won't work.

Because the problem is apparently at the system level of the instance, a workaround is to dbexport the data, reinitialize the instance, then reload the data. This does correct the problem. However, it would be preferable to have a way of recognizing that the problem exists without having to rely on spotting a bad archive.

In some cases, tbstat -d and tbcheck -pe return negative page counts, which makes it obvious. But in most cases there is nothing at all to indicate that something is wrong.

So, back to my original question: Is there a way to detect that there is something wrong at the instance level besides using tbcheck? Since this is version 5, there is no sysmaster database to query.

Thank you in advance for any assistance you can give me,

Barrie Shaw
Xtivia, Inc.
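For the cases where tbstat -d or tbcheck -pe do show negative page counts, the check could be automated. The following is only a minimal sketch: it makes no assumption about the real column layout of either report and simply flags any line of captured text that contains a standalone negative integer. The function name `find_negative_counts` and the idea of feeding it captured command output are illustrative assumptions, not part of any Informix tooling.

```python
import re

# Matches a standalone negative integer such as "-3000", but not a
# command-line flag like "-d" or the tail of a range like "10-20".
NEGATIVE_INT = re.compile(r"(?<![\w.-])-\d+\b")


def find_negative_counts(report_text):
    """Return (line_number, stripped_line) pairs for lines holding a negative integer.

    report_text is assumed to be the captured text output of a report
    command; no particular column layout is assumed.
    """
    hits = []
    for number, line in enumerate(report_text.splitlines(), start=1):
        if NEGATIVE_INT.search(line):
            hits.append((number, line.strip()))
    return hits
```

A cron job could capture the report output to a file, pass the contents to `find_negative_counts`, and raise an alert on any hit. As the post notes, a clean result proves nothing, since most affected instances show no negative counts at all.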
#3
Tbtape in OL5.xx DID NOT WORK PROPERLY in any release prior to 5.07! It would miss backing up pages when the server was busy.

In addition, I bet you're running into something else that was also fixed at about the same time. The page timestamps have wrapped past 2^31 and gone negative. These older versions of tbtape did not properly handle the condition and got confused.

Oh the flashbacks - I was the poor sod that found that bug.
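To make the "wrapped past 2^31 and gone negative" remark concrete, here is a small sketch of what happens to a counter stored in a signed 32-bit field. It is a general illustration of two's-complement wraparound, not a reconstruction of how tbtape actually compares page timestamps (the thread does not say). The helper names and the wrap-safe comparison, which follows the usual serial-number-arithmetic idea, are assumptions added for illustration.

```python
def as_signed32(value):
    """Interpret the low 32 bits of value as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value


def naive_is_newer(a, b):
    """Plain signed comparison: gives the wrong answer once the counter has wrapped."""
    return as_signed32(a) > as_signed32(b)


def wrap_safe_is_newer(a, b):
    """Compare via the signed 32-bit difference, which survives the wrap."""
    return as_signed32(a - b) > 0


last_positive = 2**31 - 1   # largest stamp before the wrap
first_wrapped = 2**31       # the very next stamp, one tick later

print(as_signed32(last_positive))                         # 2147483647
print(as_signed32(first_wrapped))                         # -2147483648
print(naive_is_newer(first_wrapped, last_positive))       # False
print(wrap_safe_is_newer(first_wrapped, last_positive))   # True
```

Any code that assumes a later stamp is always numerically larger would treat freshly stamped pages as older than everything else once the sign flips. That is one plausible way an archive could end up skipping pages.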
#4
Art S. Kagel wrote:

Tbtape in OL5.xx DID NOT WORK PROPERLY in any release prior to 5.07! It would miss backing up pages when the server was busy. In addition, I bet you're running into something else that was also fixed at about the same time. The page timestamps have wrapped past 2^31 and gone negative. These older versions of tbtape did not properly handle the condition and got confused.

Oh the flashbacks - I was the poor sod that found that bug.

IIRC it was actually a bug in the SCO icc 'enhanced C compiler'.

Advice from tech support at the time: 'Take your DB offline every night and dd the chunks to tape by hand until we can ship you a fix.'

Our db was much too big to dbexport/import, as disk was expensive then. Took about 4 days as I recall...

--
Clive
#5
Upgrading is not an option; they are locked into this version due to regulatory constraints.
#6
#7
From: barries <barriesh20 (AT) hotmail (DOT) com>

Interestingly, because of all of my hard work in trying to restore the data, I ended up with a bonus, a company cell phone, and some long-needed assistance. If the restore had worked, no one would have noticed and life would have gone on as usual. Go figure.

Barrie
#8
You and I must have hit it at about the same time then. We'd been doing test restores to the same test machine using the same disks over and over, so it always tbchecked out fine, because any pages missing from a particular archive were restored from the original test archive or one of the previous restore tests. Sigh.

One day we tried to restore to a different machine and there were holes in the data, literally pages missing in the middle of an extent. Reported it and they said, oh, yeah, that's a bug and we'll have a patch for your 5.07 release in a few days. It's scheduled to be fixed in 5.08.

Right, 5.08, not 5.07! Darn, I'm getting old.