
Help with clustering SQL2K8 SP1 on iSCSI?

microsoft.public.sqlserver.clustering



#1
Robert Hindla
Help with clustering SQL2K8 SP1 on iSCSI? - 01-20-2010, 10:31 AM






I'm seeing some issues on a two-node cluster (Node and Disk Majority quorum).
Failover works GORGEOUSLY. No problems. The groups go back and forth like a
volleyball game. But when I shut one of the nodes down, the node that's gone
down won't give up its disk reservations to the remaining node. As a result,
the group moves over, but SQL can't come up on the remaining node because that
node can't bring the failed instance's drives online.

There's something about shutting down . . .

The first time this happened it drove us N.U.T.S. It led us to evict the
node that couldn't get the drives back, which in turn led to a total cluster
rebuild over NYEve weekend.

As we learned the second time it happened, all we would have needed to do
the first time was reboot the node that had locked up the drives.

The second time, we managed to check through all the iSCSI details and
reached that conclusion about the reboot.

BOTH TIMES the shutdown was supposedly orderly. The first time, I shut one
node down so I could set Terminal Services to disabled and boot without it.
The instance failed over properly to the remaining node, but it couldn't come
back online. The second time, I just shut a node down instead of logging off.
(Hey, don't tell me you've never done this once in your life.)

My understanding is that, whether the shutdown is deliberate or
fault-related, the group should fail over on shutdown.

Have you ever heard of anything like this? The first time this happened,
the cluster was running on Fibre Channel, but we had just moved the Witness
Disk to the iSCSI SAN we were migrating to. We had the data on FC, and the
MSDTC and Witness disks on iSCSI. That first time, a run of the validation
wizard showed the Witness disk was the problem child that couldn't be
brought online. The simultaneous availability check on W: failed after my
reboot. Sadly, I don't remember exactly which node owned it when I shut
down.

The second time, the problem wasn't with the witness disk. The Data, Temp,
and Master volumes of one of our SQL instances couldn't come online.

BTW, the nodes are running W2K8 SP2 and SQL2K8 SP1, and are connected to an
EqualLogic iSCSI SAN.

ALSO, MAYBE Important: All our data locations are drives mounted on small
parent volumes.

#2
Geoff N. Hiten
Re: Help with clustering SQL2K8 SP1 on iSCSI? - 01-20-2010, 12:44 PM






Sounds like a glitch in the drivers or in the SAN firmware. Are you sure
you have the SAN set to allow multiple connections to the target LUNs?

As far as using "Anchor" LUNs and mountpoints, I do that as a matter of
course. Make sure the mount point volumes are dependent on the Anchor LUN
and that SQL is dependent on all LUNs (anchor and mountpoint) to make sure
everything comes online in the correct sequence.
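
To illustrate why the dependencies matter for sequencing, here is a minimal Python sketch (not cluster tooling, and not how the Cluster service is implemented). It models the dependency layout described above as a graph and derives one valid bring-online order. The resource names are hypothetical placeholders, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Hypothetical resource names. Each key maps to the resources it depends on,
# i.e. the resources that must be online before it can come online.
dependencies = {
    "anchor_disk": [],
    "data_mountpoint": ["anchor_disk"],
    "temp_mountpoint": ["anchor_disk"],
    "master_mountpoint": ["anchor_disk"],
    "network_name": [],
    "sql_server": [
        "anchor_disk",
        "data_mountpoint",
        "temp_mountpoint",
        "master_mountpoint",
        "network_name",
    ],
}


def online_order(deps):
    """Return one valid order in which every resource follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = online_order(dependencies)
for step, name in enumerate(order, start=1):
    print(step, name)
```

In this model the anchor disk always precedes its mount points, and SQL Server comes last because it depends on everything else. If a mount point were left out of SQL Server's list, nothing in the graph would force it to be online before SQL starts.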

You are correct, the group should fail over on a host node shutdown whether
the shutdown is controlled or not.

A couple of questions.

Are the iSCSI connections on dedicated NICs? If so, are these iSCSI NICs or
ordinary network NICs?

If iSCSI is on separate NICs, are all the other protocols disabled on those
NICs?

--
Geoff N. Hiten
Principal SQL Infrastructure Consultant
Microsoft SQL Server MVP


#3
Robert Hindla
Re: Help with clustering SQL2K8 SP1 on iSCSI? - 01-21-2010, 09:05 AM



Geoff, thanks for your response. Please see my responses inline. I look
forward to testing out different configurations to combat this problem as
downtime allows. It's scarce at all times, but maybe if we come up with a
plan we can test out a change.

On 1/20/10 1:44 PM, in article OTAaoCgmKHA.4312 (AT) TK2MSFTNGP05 (DOT) phx.gbl, "Geoff
N. Hiten" <SQLCraftsman (AT) gmail (DOT) com> wrote:

Quote:
Sounds like a glitch in the drivers or in the SAN firmware. Are you sure
you have the SAN set to allow multiple connections to the target LUNs?

Yes, that was the first thing I thought of.

Quote:
As far as using "Anchor" LUNs and mountpoints, I do that as a matter of
course. Make sure the mount point volumes are dependent on the Anchor LUN
and that SQL is dependent on all LUNs (anchor and mountpoint) to make sure
everything comes online in the correct sequence.

I read somewhere that this wasn't strictly necessary with W2K8 and SQL2K8. In
the previous generation, with W2K3 and SQL2K5, I had made sure that SQL
depended on every single drive. Was it in Books Online that I was advised
that one needn't obsess about SQL process dependencies? I think I read
something to the effect of "Make sure SQL depends on the name and the
parent disk, and then Windows clustering will do the rest." If this is not
true, adding the dependencies may be a quick fix.

Quote:
You are correct, the group should fail over on a host node shutdown whether
the shutdown is controlled or not.

A couple of questions.

Are the iSCSI connections on dedicated NICs? If so, are these iSCSI NICs or
ordinary network NICs?

Yes, they are on dedicated NICs. They are not strictly iSCSI NICs, but their
drivers advertise an iSCSI capability, not merely a TOE capability.
They are HP NC360Ts.

Quote:
If iSCSI is on separate NICs, are all the other protocols disabled on those
NICs?

Hmmm. What do you mean exactly? We're standardized on IPv4. I could
unbind IPv6. But I think I need to keep the HP Network Configuration Utility
installed to set the cards to use Jumbo Frames and Flow Control.

Thanks for getting back so quickly.

#4
Geoff N. Hiten
Re: Help with clustering SQL2K8 SP1 on iSCSI? - 01-21-2010, 01:09 PM



Definitely set the dependencies. Clustering handles the rest IF you have
the dependencies set.
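
As a rough illustration of what "set the dependencies" means in practice, the following Python sketch compares the disks in a resource group against the list SQL Server currently depends on and reports the gaps. It is only a model for reasoning about the configuration, not a cluster API. All names are hypothetical, and the example input mirrors the "name and parent disk only" setup mentioned earlier in the thread.

```python
def missing_disk_dependencies(group_disks, sql_dependencies):
    """Return the disks in the group that SQL Server does not depend on, sorted by name."""
    return sorted(set(group_disks) - set(sql_dependencies))


# Hypothetical inventory: one anchor disk plus three mount point volumes.
group_disks = [
    "anchor_disk",
    "data_mountpoint",
    "temp_mountpoint",
    "master_mountpoint",
]

# Hypothetical current state: SQL depends only on its network name and the anchor disk.
sql_depends_on = ["network_name", "anchor_disk"]

missing = missing_disk_dependencies(group_disks, sql_depends_on)
print("Disks SQL Server should also depend on:", missing)
```

An empty result would mean SQL Server already depends on every disk in the group. Anything listed is a volume that could still be offline when SQL tries to start.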

Leave IPv4 and IPv6 bound to the iSCSI NICs, but unbind all the Microsoft
protocols (such as Client for Microsoft Networks and File and Printer
Sharing for Microsoft Networks).


--
Geoff N. Hiten
Principal SQL Infrastructure Consultant
Microsoft SQL Server MVP
