#41
I agree with John that you'd be better off just checking the system event log. Cluster.log is not very readable because it's not meant for troubleshooting by users. The main reason is that it logs far more than you would care about, and it records most activity using the GUIDs of the cluster objects instead of their more readable names. So unless you first process cluster.log to distill it and replace all the GUIDs with their corresponding names, the return on your time may not be great.

The following is an example of a cluster.log that shows only the section relevant to a failover (with a huge number of lines removed):

2008/04/18-03:03:48.520 INFO [FM] FmpMoveGroup: Moving group 3771af1f-c5a8-42a3-91f2-73c6277d3931 to node 1 (1)
2008/04/18-03:03:48.520 INFO [FM] FmpNotifyGroupStateChangeReason: Notifying group Group 1 [3771af1f-c5a8-42a3-91f2-73c6277d3931] of state change reason 1...
2008/04/18-03:03:48.520 INFO [FM] FmpOfflineResource: SQL Server (SQL1) depends on Disk L:. Shut down first.
2008/04/18-03:03:48.520 INFO [FM] FmpOfflineResource: SQL Server Agent (SQL1) depends on SQL Server (SQL1). Shut down first.
2008/04/18-03:03:50.629 INFO [FM] FmpRmOfflineResource: d189fa11-7c3f-4120-8042-e554da0b8fd9 is now offline
2008/04/18-03:03:50.629 INFO [CP] CppResourceNotify for resource SQL Server Agent (SQL1)
2008/04/18-03:03:51.082 INFO SQL Server <SQL Server (SQL1)>: [sqsrvres] OnlineThread: asked to terminate while waiting for QP.
2008/04/18-03:03:51.395 INFO [CP] CppResourceNotify for resource SQL Server (SQL1)
2008/04/18-03:03:51.395 INFO [FM] FmpOfflineResource: SQL Server Fulltext (SQL1) depends on Disk L:. Shut down first.
2008/04/18-03:03:51.395 INFO Network Name <SQL Network Name (NYSQL1)>: Taking resource offline...
2008/04/18-03:03:51.395 INFO Network Name <SQL Network Name (NYSQL1)>: Offline of resource continuing...
2008/04/18-03:03:51.395 INFO [FM] FmpRmOfflineResource: RmOffline() for 039175c9-52f2-420e-809d-9f64e8e6a304 returned error 997
2008/04/18-03:03:51.395 INFO [FM] FmpOfflineResource: SQL Network Name (NYSQL1) depends on SQL IP Address 1 (NYSQL1). Shut down first.
2008/04/18-03:03:51.395 INFO [FM] FmpOfflineResource for SQL IP Address 1 (NYSQL1) marked as waiting.
2008/04/18-03:03:51.395 INFO Network Name <SQL Network Name (NYSQL1)>: Deleted server name NYSQL1 from all transports.
2008/04/18-03:03:51.395 INFO [FM] FmpOfflineResource: Offline resource <SQL Server Fulltext (SQL1)> returned pending
2008/04/18-03:03:51.395 INFO [FM] FmpMoveGroup: Exit group <Group 1>, status = 997
2008/04/18-03:03:51.395 INFO [FM] FmpDoMoveGroup: Exit, status = 997
2008/04/18-03:03:51.395 INFO Network Name <SQL Network Name (NYSQL1)>: Resource is now offline
2008/04/18-03:03:51.395 INFO IP Address <SQL IP Address 1 (NYSQL1)>: Taking resource offline...
2008/04/18-03:03:51.395 INFO IP Address <SQL IP Address 1 (NYSQL1)>: Deleting IP interface 4.
2008/04/18-03:03:51.410 INFO IP Address <SQL IP Address 1 (NYSQL1)>: Address 30.5.182.181 on adapter DH Team #1 offline.
2008/04/18-03:03:51.410 INFO [FM] OfflineWaitingResourceTree: Exit, status=0 for <SQL IP Address 1 (NYSQL1)>.
2008/04/18-03:03:51.410 INFO [FM] OfflineWaitingResourceTree: Exit, status=0 for <SQL Network Name (NYSQL1)>.
2008/04/18-03:03:51.410 INFO [FM] FmpCompleteMoveGroup: Completing the move for group Group 1 to node 1 (1)
2008/04/18-03:03:51.410 INFO [FM] FmpOfflineResource: Offline resource <Disk L:> returned pending
2008/04/18-03:03:51.410 INFO [FM] FmpOfflineResource: Offline resource <SQL Server Fulltext (SQL1)> returned pending
2008/04/18-03:03:52.395 INFO Generic Service <SQL Server Fulltext (SQL1)>: Service died or not active any more; status = 1062.
2008/04/18-03:03:52.395 INFO Generic Service <SQL Server Fulltext (SQL1)>: Service is now offline.
2008/04/18-03:03:52.395 INFO [FM] FmpOfflineResource: SQL Server (SQL1) depends on Disk L:. Shut down first.
2008/04/18-03:03:52.395 INFO [FM] FmpOfflineResource: SQL Server Fulltext (SQL1) depends on Disk L:. Shut down first.
2008/04/18-03:03:52.395 INFO Physical Disk <Disk L:>: Offline, FlushFileBuffers for \Device\Harddisk4\Partition1.
2008/04/18-03:03:52.410 INFO Physical Disk <Disk L:>: Offline, Locking volume for \Device\Harddisk4\Partition1.
2008/04/18-03:03:53.067 INFO Physical Disk <Disk L:>: Offline, Dismounting volume \Device\Harddisk4\Partition1.
2008/04/18-03:03:53.067 INFO Physical Disk <Disk L:>: Offline, Dismount complete, volume \Device\Harddisk4\Partition1.
2008/04/18-03:03:53.067 INFO Physical Disk <Disk L:>: DiskCleanup started.
2008/04/18-03:03:53.067 INFO Physical Disk <Disk L:>: [DiskArb] StopPersistentReservations is called.
2008/04/18-03:03:53.067 INFO Physical Disk <Disk L:>: [DiskArb] Stopping reservation thread.
2008/04/18-03:03:53.067 INFO Physical Disk <Disk L:>: [DiskArb] CompletionRoutine, status 0.
2008/04/18-03:03:53.067 INFO Physical Disk <Disk L:>: [ArbCleanup] Verifying sector size.
2008/04/18-03:03:53.067 INFO Physical Disk <Disk L:>: [ArbCleanup] Reading arbitration block.
2008/04/18-03:03:53.067 INFO Physical Disk <Disk L:>: [DiskArb] Successful read (sector 12) [ELABVHOST162:17380] (0,e30653ee:01c8a0c3).
2008/04/18-03:03:53.067 INFO Physical Disk <Disk L:>: [ArbCleanup] Writing arbitration block.
2008/04/18-03:03:53.067 INFO Physical Disk <Disk L:>: [DiskArb] Successful write (sector 12) [:0] (0,00000000:00000000).
2008/04/18-03:03:53.067 INFO Physical Disk <Disk L:>: [ArbCleanup] Returning status 0.
2008/04/18-03:03:53.067 INFO Physical Disk <Disk L:>: [DiskArb] StopPersistentReservations is complete.
2008/04/18-03:03:53.067 INFO Physical Disk <Disk L:>: DisksDismountDrives: letter mask is 00000800.
2008/04/18-03:03:53.082 INFO Physical Disk <Disk L:>: DiskCleanup returning final error 0
2008/04/18-03:03:53.082 INFO Physical Disk <Disk L:>: Offline, Returning final error 0.

Linchi

"Mike" wrote:
Could you please tell me how I would "spot" it in the Cluster.log file? Thank you again for your help.
--
Mike

"John Toner [MVP]" wrote:
Yes, failover is logged in the cluster log on the node where the group is going offline, and on the node where it is coming online. It's not really easy to spot if this is your first attempt to review the cluster.log. I would recommend instead looking thru the System Event log. You can monitor the system event log for these event IDs:
1069 - Generic error indicating a cluster resource has failed
1205 - Cluster failed to bring a group completely online or offline
1204 - Cluster brought a Group offline
1203 - Cluster is attempting to offline a group
1201 - Cluster successfully brought a group online
1200 - Cluster is attempting to online the group
Hope this helps.
Regards,
John
Visit my blog: http://msmvps.com/blogs/jtoner

"Mike" <Mike (AT) discussions (DOT) microsoft.com> wrote in message news:1E087C0B-F8B8-46B5-A66B-335F5802AA56 (AT) microsoft (DOT) com...
Is a failover logged in Cluster.log? If so, what text would be recorded? Thank you.
--
Mike
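
For anyone who does want to go the cluster.log route, the "distill it and replace all the GUIDs" step mentioned above could be sketched roughly like this in Python. This is only an illustration, not a supported tool: the function name `distill`, the `KEYWORDS` list, and the `guid_names` dictionary are all made up for this example, and the line pattern is inferred purely from the excerpt posted in this thread. You would have to build the GUID-to-name table yourself.

```python
import re

# One cluster.log entry, in the shape seen in the excerpt in this thread:
#   2008/04/18-03:03:48.520 INFO [FM] FmpMoveGroup: Moving group ...
# (Assumption: other builds of the cluster service may format lines differently.)
ENTRY_RE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+(?P<msg>.*)$"
)

# Standard 8-4-4-4-12 hexadecimal GUID text.
GUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
    r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)

# Hypothetical filter list, picked from the strings visible in the excerpt.
KEYWORDS = (
    "FmpMoveGroup",
    "FmpDoMoveGroup",
    "FmpCompleteMoveGroup",
    "FmpOfflineResource",
    "FmpRmOfflineResource",
    "is now offline",
)


def distill(lines, guid_names, keywords=KEYWORDS):
    """Keep only entries that mention a keyword and swap known GUIDs for names.

    guid_names maps lower-case GUID strings to friendly names; GUIDs that are
    not in the map are left untouched.
    """
    kept = []
    for line in lines:
        match = ENTRY_RE.match(line.strip())
        if match is None:
            continue  # not a log entry (blank line, wrapped text, etc.)
        msg = match.group("msg")
        if not any(word in msg for word in keywords):
            continue
        msg = GUID_RE.sub(
            lambda g: guid_names.get(g.group(0).lower(), g.group(0)), msg
        )
        kept.append(f"{match.group('ts')} {match.group('level')} {msg}")
    return kept


if __name__ == "__main__":
    sample = [
        "2008/04/18-03:03:48.520 INFO [FM] FmpMoveGroup: Moving group "
        "3771af1f-c5a8-42a3-91f2-73c6277d3931 to node 1 (1)",
        "2008/04/18-03:03:53.067 INFO Physical Disk <Disk L:>: DiskCleanup started.",
    ]
    # The excerpt itself pairs this GUID with "Group 1".
    names = {"3771af1f-c5a8-42a3-91f2-73c6277d3931": "Group 1"}
    for entry in distill(sample, names):
        print(entry)
```

The keyword filter is deliberately crude; in practice you would tune it to whatever you are chasing, and the event IDs John listed are still the quicker way to confirm that a failover happened.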