Showing posts with label AG. Show all posts
Showing posts with label AG. Show all posts

Format-Volume Bug or Feature?

Format-Volume Bug or Feature?

Problem

We just recently added an additional node to our existing Availability Group cluster. 
Our team uses a PowerShell script to easily format multiple drives in a specific allocation unit size.

The issue happened when we formatted the drive on the new node. It is supposed to only format the drive on the server where it was ran. But for some reason, it also formatted the drive on the existing nodes.

This wiped out the drives on the existing nodes. The existing system went down and we have to restore everything from our backup.

We have to do a RCA (Root Cause Analysis) because it affected our production system.

Root Cause Analysis

We have to build a similar environment to simulate the issue.

To cut the story short, we were able to pin point the issue on Format-Volume command in PowerShell.

This particular issue only affects servers that are part of Windows cluster and have same drive letters on each node.

For SQL Server Always On:
  • This does NOT affect Failover Cluster Instance (FCI) since it is using shared storage.
  • This only affects the Availability Group configuration because each node have it's own storage and they can have same drive letters.
As a test we ran the Format-Volume to format the Z drive on node1 (of a 3 node cluster).

Format-Volume









It formatted the Z drive on of all the nodes instead of just formatting the drive on node1.

Solution

Open a case with Microsoft to check if this is a bug or if there is a different method that we can do to run the Format-Volume only on the node where it is executed. (At the time of this writing, we have not heard back from Microsoft.)

As a precautionary measure for our team, we only use this feature on the following situation:
  • Standalone servers
  • Always On Failover Cluster Instance servers
  • New server builds of Always On Availability Group

Get SQL Server Windows Cluster Information

Get SQL Server Windows Cluster Information

Problem

I have already identified that there is a SQL Server installed on a server. Using PowerShell, I wanted to know if the server is using windows clustered WITHOUT connecting the database instance. 

This is an option that system administrators since they have admin rights on the Operating System but don't have access to the SQL Server instance.

I need to know the following:
1. I need to know if the server is clustered or not.
2. If the server is clustered, I need to know why type of clustering is configured on the server. 
    SQL Server have 2 types of Windows Clustering:
  • Failover Cluster Instance
  • Availability Group
    You can also combine Failover Cluster Instance with Availability Group.
    I also wanted to know if this type of configuration is configured on the server.
3. If the server is clustered, I also need to know if the server is active or passive node

Solution

One of the requirements to be able to get windows clustering information is to ensure that you have FailoverClusters PowerShell module installed on the machine that you are using to run the PowerShell commands that are specified below.

Let us step through the requirements:
1. How to check if the server is clustered or not?
    Using HKEY_LOCAL_MACHINE check if there is a "Cluster" entry in the registry
PowerShell

Get-ItemProperty -Path Registry::HKEY_LOCAL_MACHINE\Cluster}
    To check a server remotely you need to use Invoke-Command
PowerShell

Invoke-Command -Computer $servername -ScriptBlock {Get-ItemProperty -Path Registry::HKEY_LOCAL_MACHINE\Cluster}
    If no value is returned, the SQL Server is standalone.
    If the command returns a value, then SQL Server is clustered using windows clustering.

2. How to check what type of clustering is configured on the server?
    This is where the FailoverClusters module will come-in handy.
    We need to get/check the following:
    a. Check WMIObject MSCluster_ResourceGroup for root\mscluster namespace
    b. Get-Cluster information of the server
    c. Get-ClusterResource information of the server

    Check root\mscluster namespace
PowerShell

Get-WMIObject -Class MSCluster_ResourceGroup -Namespace root\mscluster}
    To check a server remotely you need to use Invoke-Command
PowerShell

Invoke-Command -Computer $servername -ScriptBlock {Get-WMIObject -Class MSCluster_ResourceGroup -Namespace root\mscluster}
    Get the cluster and cluster resource information of the server
PowerShell

Get-Cluster -Name $servername | Get-ClusterResource
    If both root\mscluster namespace and cluster resource does not return a value then the server is configured as a "single node" cluster.

    If both root\mscluster namespace and cluster resource returns a value, we need to figure out what type of SQL Server windows clustering is configured.

3. How to check if "Availability Group" is configured?
    Using the cluster resource information, check if there is a resource type called "SQL Server Availability Group".
PowerShell

Get-Cluster -Name $servername | Get-ClusterResource | Where-Object ResourceType -eq 'SQL Server Availability Group'
    If no results were returned then, the server is configured with "Failover Cluster Instance".
    If one result is returned, the server is configured with "Availability Group/Listener".
    If more than one result is returned, the server is configured with multiple "Availability Group/Listener".

4. How to check if "FCI with Availability Group" is configured?
    First you need to make sure that the server is configured with "Availability Group". 
    Using the cluster resource information, check if there is a resource type called "SQL Server Availability Group" or resource type called "Network Name" and resource name that starts with "SQL Network Name".
PowerShell

Get-Cluster -Name $servername | Get-ClusterResource | Where-Object { (($_.ResourceType -eq 'Network Name') -and ($_.Name -Like 'SQL Network Name *')) -or ($_.ResourceType -eq 'SQL Server Availability Group')  }
    If 2 different resource types are returned, the server is configured with "FCI with Availability Group".
    NOTE: Ensure that you filter the uniqueness of the returned resource type value.

5. How to check if the server is the active or passive node?
    Using the cluster resource information, get the OwnerNode for resource type called "SQL Server".
PowerShell

Get-Cluster -Name $servername | Get-ClusterResource | Where-Object ResourceType -eq 'SQL Server' | Select-Object OwnerNode
    If the returned OwnerNode value is the same as the $servername, then it is the "active" node, otherwise it is the passive node.

Now you can combine all of the commands above to create a function that will return the SQL Server Windows Cluster Information.

You can now download the Get-SQLServer-WindowsClusterInfo function from github.