A SQL Server 2000 database on a two-node Windows Server 2003 cluster lost its public connection over the weekend. Both VIPs were unpingable. On Monday I was able to connect to each node directly over Remote Desktop and restart the cluster service. Once this was done, the VIPs were reachable again.

I'm trying to understand what happened here. Messages in the event log show that Node1 lost its public connection and then tried to fail over to Node2. Node2 could not communicate on the public connection either, so it tried to fail over to Node1. This flip-flopping happened a few more times before another message appeared indicating that the cluster could once again communicate on the public connection. However, none of the apps were able to connect to the database.

After a certain number of tries, did both nodes eventually give up and remove themselves from the cluster, thus requiring the restart of the cluster service?

-- MG