#2
We've been struggling with a problem for a while now. If anyone has had a similar issue, I'd appreciate it if you could share it here, as it may lead me to a solution.

The cluster houses SQL and IIS (bad, I know, but it shouldn't cause the problems we see). We have the following cluster hardware:

2 IBM x345 servers
1 IBM ServeRAID 4MX (RAID 5)

Basically, whenever both machines are connected to the cluster, at some point (sometimes after days, sometimes after weeks) a failure occurs in which the clustered drives (Data and Quorum) become defunct. Bringing them back online and restarting works fine, but it takes a while and always carries risk.

I've been working with IBM for months to troubleshoot this, but nothing has made this a "highly available" environment.
#3
From your description, it looks like this is a SCSI cluster. Can you give a complete description of the SCSI device, as well as the physical (RAID) and logical (LUN) disk layouts for the cluster? I have an idea where your problem might be, but I need more information to be sure.

Geoff N. Hiten
Microsoft SQL Server MVP
Senior Database Administrator
#4
Thanks Geoff,

Here's what I think you are looking for:

1.) Both servers are equipped with 2x18GB drives, mirrored (Array A). They are connected to the internal channel of the IBM 4MX SCSI controller. These are the logical C: and D: drives. External Channel 1 of the RAID controller connects to the shared SCSI storage.
2.) Array B = 2x18GB mirrored = Q: drive (Quorum), slots 13-14. (Physical device is a shared IBM SCSI storage array.)
3.) Array C = 5x18GB RAID 5 = S: drive (Shared), slots 0-4. (Physical device is a shared IBM SCSI storage array.)

Summarized:
LUNs Q: and S: = storage array (Arrays B and C)
C: and D: = internal to each server (Array A)
#5
Looks like there is a problem sharing a controller between the clustered resource and the local disk resources. Make the vendor show you where this is a certified cluster solution. I don't think shared controllers are supported.

Any way you slice it, you will get very poor performance from a SCSI storage array in a clustered environment using RAID5 containers. Clustering requires that the controllers operate in direct-write mode (no write cache), so RAID5 is extremely slow.

Geoff N. Hiten
Microsoft SQL Server MVP
Senior Database Administrator
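
For a rough sense of the RAID5 point above, here is a back-of-the-envelope sketch. It assumes the commonly cited write penalties for small random writes with no write cache (4 disk I/Os per RAID5 write: read data, read parity, write data, write parity; 2 per mirrored write). The per-disk IOPS figure and the 4-disk mirrored alternative are made-up illustrations, not measurements from the hardware in this thread.

```python
# Rough estimate of sustainable random-write IOPS with write cache disabled.
# All numbers below are illustrative assumptions, not measured values.

RAID1_WRITE_PENALTY = 2  # each logical write lands on both mirror members
RAID5_WRITE_PENALTY = 4  # read data + read parity + write data + write parity

ASSUMED_IOPS_PER_DISK = 150  # hypothetical figure for a single SCSI disk


def effective_write_iops(disks, iops_per_disk, write_penalty):
    """Approximate logical random-write IOPS for an array without write cache."""
    return disks * iops_per_disk / write_penalty


# The 5-disk RAID5 array described for the S: drive
raid5 = effective_write_iops(5, ASSUMED_IOPS_PER_DISK, RAID5_WRITE_PENALTY)

# A hypothetical 4-disk mirrored (RAID 10) layout, for comparison only
raid10 = effective_write_iops(4, ASSUMED_IOPS_PER_DISK, RAID1_WRITE_PENALTY)

print(f"5-disk RAID5:   ~{raid5:.1f} write IOPS")
print(f"4-disk RAID 10: ~{raid10:.1f} write IOPS")
```

Under these assumptions the mirrored layout sustains more random writes with one fewer disk, which is the usual argument against RAID5 for write-heavy database volumes when the controller cannot cache writes.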