dbTalk Databases Forums  

Secondary Node fails to start

microsoft.public.sqlserver.clustering





#1
Johnnie Scott
 
Secondary Node fails to start - 01-21-2004, 12:59 PM






I have a two-node Active/Passive SQL Server cluster on
Win2k AS and SQL2000. When I attempt to manually fail over
to node 2 of the cluster, all of the resources go over
except SQL Server. The cluster log reports Error
435: 'SQL Server failed to start'. I've just noticed
another error that shows up in the event log when the
manual failover occurs.

Event Type: Error
Event Source: MSSQLSERVER
Event Category: (2)
Event ID: 17055
Date: 1/15/2004
Time: 12:37:14 PM
User: N/A
Computer: DPOTSQL00 (virtual name)
Description:
17050 :
initerrlog: Could not open error log
file 'f:\data\MSSQL\log\ERRORLOG'. Operating system error
= 3(The system cannot find the path specified.).
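
When comparing several of these events, it can help to pull the path and the operating system error code out of the description text. The following is only a sketch: `parse_initerrlog` and the small `WIN32_ERRORS` table are hypothetical helpers written for this post, not part of SQL Server or Windows. Error 3 corresponds to ERROR_PATH_NOT_FOUND, which points at a missing directory in the path rather than a missing file (that would be error 2).

```python
import re

# Hypothetical helper: pulls the file path and OS error code out of an
# "initerrlog" message like the one in the event description above.
INITERRLOG_RE = re.compile(
    r"Could not open error log\s+file\s+'(?P<path>[^']+)'\.\s*"
    r"Operating system error\s*=\s*(?P<code>\d+)",
    re.IGNORECASE,
)

# A few Win32 error codes that matter when a file cannot be opened.
WIN32_ERRORS = {
    2: "ERROR_FILE_NOT_FOUND",
    3: "ERROR_PATH_NOT_FOUND",
    5: "ERROR_ACCESS_DENIED",
}


def parse_initerrlog(description):
    """Return (path, code, symbolic_name), or None if the text does not match."""
    # Collapse line wraps so the pattern also works on wrapped event text.
    flat = " ".join(description.split())
    match = INITERRLOG_RE.search(flat)
    if match is None:
        return None
    code = int(match.group("code"))
    return match.group("path"), code, WIN32_ERRORS.get(code, "UNKNOWN")


# Event description copied from the post (backslashes escaped for Python).
event_text = """17050 :
initerrlog: Could not open error log
file 'f:\\data\\MSSQL\\log\\ERRORLOG'. Operating system error
= 3(The system cannot find the path specified.)."""

print(parse_initerrlog(event_text))
```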

The F: path is on a SAN disk where the primary node of the
cluster functions correctly. It seems like the second
node fails to find the F drive altogether but I can't
figure out why since the node successfully imports the
disk group right before this failure.
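
One way to narrow this down is to check, on node 2 while it owns the disk, which part of the path is actually missing: the drive letter itself, or one of the folders beneath it. A minimal sketch, assuming Python is available on the node; `first_missing_component` is a hypothetical helper name.

```python
from pathlib import Path


def first_missing_component(path):
    """Return (deepest_existing_path, first_missing_path).

    first_missing_path is None when the whole path exists.
    deepest_existing_path is None when not even the root/drive exists.
    """
    target = Path(path)
    # Walk from the root (or drive) down towards the target.
    chain = list(reversed(target.parents)) + [target]
    existing = None
    for candidate in chain:
        if candidate.exists():
            existing = candidate
        else:
            return existing, candidate
    return existing, None


if __name__ == "__main__":
    # Run on node 2 while it owns the F: disk resource.
    print(first_missing_component(r"f:\data\MSSQL\log\ERRORLOG"))
```

If the first missing component is the drive root, the disk is not mounted under that letter on node 2; if it is a folder further down, the volume is visible but its contents differ from what SQL Server expects.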

This server was working for about 1 year, then stopped
failing over to the secondary node. It is possible that a
couple of environmental changes affected these boxes, but we
haven't been able to prove the changes as the cause, nor
have I been able to regain functionality after trying to
reverse those changes.

One thing that we know changed around the time that
the node quit working is that Group Policies were changed
for the SQL service account. Someone in our network
security department mistakenly removed the "Log on as a
service" and "Replace a process level token" rights from
Group Policies. These rights were restored at both the
group and local levels. Adding the policies back to the
accounts didn't work. Changing the account also didn't
work.
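
To confirm that the two rights really are back on node 2, the local user-rights assignment can be exported with `secedit /export /cfg <file> /areas USER_RIGHTS` and inspected. The sketch below parses the `[Privilege Rights]` section of such an export. It assumes the usual INF layout of that file and uses a made-up account name (`EXAMPLE\sqlsvc`) and made-up sample data; `missing_rights` is a hypothetical helper. Note that it only sees direct assignments: a right granted through group membership, or listed by SID rather than by name, will not match the account name.

```python
# Constant names of the two rights mentioned above.
REQUIRED_RIGHTS = {
    "SeServiceLogonRight": "Log on as a service",
    "SeAssignPrimaryTokenPrivilege": "Replace a process level token",
}


def parse_privilege_rights(inf_text):
    """Map each right in the [Privilege Rights] section to a set of holders."""
    rights = {}
    in_section = False
    for raw in inf_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            in_section = line.lower() == "[privilege rights]"
            continue
        if in_section and "=" in line:
            name, _, holders = line.partition("=")
            rights[name.strip()] = {
                h.strip().lstrip("*").lower()
                for h in holders.split(",")
                if h.strip()
            }
    return rights


def missing_rights(inf_text, account):
    """Return the friendly names of required rights not directly held by account."""
    rights = parse_privilege_rights(inf_text)
    wanted = account.lower()
    return [
        label
        for const, label in REQUIRED_RIGHTS.items()
        if wanted not in rights.get(const, set())
    ]


# Made-up sample data for illustration only, not taken from the real nodes.
SAMPLE_EXPORT = """[Unicode]
Unicode=yes
[Privilege Rights]
SeServiceLogonRight = EXAMPLE\\sqlsvc,*S-1-5-20
SeAssignPrimaryTokenPrivilege = *S-1-5-19
[Version]
Revision=1
"""

print(missing_rights(SAMPLE_EXPORT, "EXAMPLE\\sqlsvc"))
```

Running the same check against exports from both nodes and comparing the results would show whether node 2 differs from the working node.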

I also attempted to remove the failed node and add it
back. The interesting thing here is that I received an
error message at the end of this process stating "Setup
failed to perform required operations on the cluster
nodes", but the node was no longer a part of the cluster
after rebooting both machines. I added the node back to
the cluster without any errors, but SQL Server still
wouldn't start on it during failover.

Dependencies are set up as follows:

- F: disk has no dependencies and can't be changed (as far
  as I can tell) to be dependent on SQL. Currently comes
  across fine.
- IP Address has no dependencies and can only be changed to
  have a dependency on F:. Currently comes across fine.
- SQL Network Name has a dependency on IP Address. F: can
  be added as a dependency, but the resource currently
  comes across fine.
- SQL Server has a dependency on F: and Network Name. Fails
  to come across. (At this point FAILBACK occurs.)
- SQL Server Agent has a dependency on SQL Server only.
  Doesn't try to come across.
- SQL Server FULLTEXT has a dependency on SQL Server.
  Doesn't try to come across.
- No MS DTC resource defined.
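
The dependency layout above can be written down as a small graph to see the order in which resources must come online and why Agent and Fulltext never attempt to start once SQL Server fails. This is only an illustrative model in Python (3.9+ for `graphlib`), not a query against the cluster; the resource names are taken from the list and `blocked_by` is a hypothetical helper.

```python
from graphlib import TopologicalSorter

# Resource -> resources it depends on, as listed in the post.
DEPENDENCIES = {
    "Disk F:": set(),
    "IP Address": set(),
    "SQL Network Name": {"IP Address"},
    "SQL Server": {"Disk F:", "SQL Network Name"},
    "SQL Server Agent": {"SQL Server"},
    "SQL Server Fulltext": {"SQL Server"},
}


def online_order(deps):
    """One valid order in which the resources can be brought online."""
    return list(TopologicalSorter(deps).static_order())


def blocked_by(deps, failed):
    """Resources that cannot come online because `failed` did not start."""
    blocked = set()
    changed = True
    while changed:
        changed = False
        for resource, needs in deps.items():
            if resource == failed or resource in blocked:
                continue
            # Blocked if it needs the failed resource or anything already blocked.
            if needs & (blocked | {failed}):
                blocked.add(resource)
                changed = True
    return blocked


print(online_order(DEPENDENCIES))
print(sorted(blocked_by(DEPENDENCIES, "SQL Server")))
```

In this model a failure of SQL Server blocks only Agent and Fulltext, which matches the behaviour described: the disk, IP address, and network name have already come online by the time SQL Server is attempted.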

Any ideas on what I should do to get node2 to start
properly?



#2
Allan Hirt
 
Secondary Node fails to start - 01-23-2004, 07:11 AM






Could be a few things, including a bad disk controller, or
the account running SQL Server or the server cluster
itself having the wrong permissions, etc. Even though you've
added the permissions back in, it doesn't mean that
others are not missing.

MS DTC should be defined, but it wouldn't be in the SQL
group. Drive F: should not have any dependencies; SQL
Server should be dependent upon it.

Without seeing your logs or having access to your node,
these are the places I would look first.

If the problem persists, I'd open a case with PSS.


