Showing posts with label Failover. Show all posts
Showing posts with label Failover. Show all posts

Format-Volume Bug or Feature?

Format-Volume Bug or Feature?

Problem

We just recently added an additional node to our existing Availability Group cluster. 
Our team uses a PowerShell script to easily format multiple drives in a specific allocation unit size.

The issue happened when we formatted the drive on the new node. It is supposed to only format the drive on the server where it was ran. But for some reason, it also formatted the drive on the existing nodes.

This wiped out the drives on the existing nodes. The existing system went down and we have to restore everything from our backup.

We have to do a RCA (Root Cause Analysis) because it affected our production system.

Root Cause Analysis

We have to build a similar environment to simulate the issue.

To cut the story short, we were able to pin point the issue on Format-Volume command in PowerShell.

This particular issue only affects servers that are part of Windows cluster and have same drive letters on each node.

For SQL Server Always On:
  • This does NOT affect Failover Cluster Instance (FCI) since it is using shared storage.
  • This only affects the Availability Group configuration because each node have it's own storage and they can have same drive letters.
As a test we ran the Format-Volume to format the Z drive on node1 (of a 3 node cluster).

Format-Volume









It formatted the Z drive on of all the nodes instead of just formatting the drive on node1.

Solution

Open a case with Microsoft to check if this is a bug or if there is a different method that we can do to run the Format-Volume only on the node where it is executed. (At the time of this writing, we have not heard back from Microsoft.)

As a precautionary measure for our team, we only use this feature on the following situation:
  • Standalone servers
  • Always On Failover Cluster Instance servers
  • New server builds of Always On Availability Group

Get SQL Server Windows Cluster Information

Get SQL Server Windows Cluster Information

Problem

I have already identified that there is a SQL Server installed on a server. Using PowerShell, I wanted to know if the server is using windows clustered WITHOUT connecting the database instance. 

This is an option that system administrators since they have admin rights on the Operating System but don't have access to the SQL Server instance.

I need to know the following:
1. I need to know if the server is clustered or not.
2. If the server is clustered, I need to know why type of clustering is configured on the server. 
    SQL Server have 2 types of Windows Clustering:
  • Failover Cluster Instance
  • Availability Group
    You can also combine Failover Cluster Instance with Availability Group.
    I also wanted to know if this type of configuration is configured on the server.
3. If the server is clustered, I also need to know if the server is active or passive node

Solution

One of the requirements to be able to get windows clustering information is to ensure that you have FailoverClusters PowerShell module installed on the machine that you are using to run the PowerShell commands that are specified below.

Let us step through the requirements:
1. How to check if the server is clustered or not?
    Using HKEY_LOCAL_MACHINE check if there is a "Cluster" entry in the registry
PowerShell

Get-ItemProperty -Path Registry::HKEY_LOCAL_MACHINE\Cluster}
    To check a server remotely you need to use Invoke-Command
PowerShell

Invoke-Command -Computer $servername -ScriptBlock {Get-ItemProperty -Path Registry::HKEY_LOCAL_MACHINE\Cluster}
    If no value is returned, the SQL Server is standalone.
    If the command returns a value, then SQL Server is clustered using windows clustering.

2. How to check what type of clustering is configured on the server?
    This is where the FailoverClusters module will come-in handy.
    We need to get/check the following:
    a. Check WMIObject MSCluster_ResourceGroup for root\mscluster namespace
    b. Get-Cluster information of the server
    c. Get-ClusterResource information of the server

    Check root\mscluster namespace
PowerShell

Get-WMIObject -Class MSCluster_ResourceGroup -Namespace root\mscluster}
    To check a server remotely you need to use Invoke-Command
PowerShell

Invoke-Command -Computer $servername -ScriptBlock {Get-WMIObject -Class MSCluster_ResourceGroup -Namespace root\mscluster}
    Get the cluster and cluster resource information of the server
PowerShell

Get-Cluster -Name $servername | Get-ClusterResource
    If both root\mscluster namespace and cluster resource does not return a value then the server is configured as a "single node" cluster.

    If both root\mscluster namespace and cluster resource returns a value, we need to figure out what type of SQL Server windows clustering is configured.

3. How to check if "Availability Group" is configured?
    Using the cluster resource information, check if there is a resource type called "SQL Server Availability Group".
PowerShell

Get-Cluster -Name $servername | Get-ClusterResource | Where-Object ResourceType -eq 'SQL Server Availability Group'
    If no results were returned then, the server is configured with "Failover Cluster Instance".
    If one result is returned, the server is configured with "Availability Group/Listener".
    If more than one result is returned, the server is configured with multiple "Availability Group/Listener".

4. How to check if "FCI with Availability Group" is configured?
    First you need to make sure that the server is configured with "Availability Group". 
    Using the cluster resource information, check if there is a resource type called "SQL Server Availability Group" or resource type called "Network Name" and resource name that starts with "SQL Network Name".
PowerShell

Get-Cluster -Name $servername | Get-ClusterResource | Where-Object { (($_.ResourceType -eq 'Network Name') -and ($_.Name -Like 'SQL Network Name *')) -or ($_.ResourceType -eq 'SQL Server Availability Group')  }
    If 2 different resource types are returned, the server is configured with "FCI with Availability Group".
    NOTE: Ensure that you filter the uniqueness of the returned resource type value.

5. How to check if the server is the active or passive node?
    Using the cluster resource information, get the OwnerNode for resource type called "SQL Server".
PowerShell

Get-Cluster -Name $servername | Get-ClusterResource | Where-Object ResourceType -eq 'SQL Server' | Select-Object OwnerNode
    If the returned OwnerNode value is the same as the $servername, then it is the "active" node, otherwise it is the passive node.

Now you can combine all of the commands above to create a function that will return the SQL Server Windows Cluster Information.

You can now download the Get-SQLServer-WindowsClusterInfo function from github.

Get SQL Server Cluster Information - Stored Procedure

Get SQL Server Cluster Information - Stored Procedure

Problem

I always need to get cluster information of my server whenever I am investigating a database server. 

I tried searching in internet as someone might have already posted something similar but could not find one. 

I decided to create a stored procedure so that I can deploy it on all of my servers and be able to retrieve the cluster information quicker.

Below are cluster information that I need:
1. Instance Name?
2. Server (Cluster Node) Name/s
3. Is it Standalone or Clustered?
4. If Clustered, is it using Always On Availability Group (AG), Failover Cluster Instance (FCI) or combination of FCI and AG?
5. If Clustered, which node is Active/Primary and Passive/Secondary
6. If using Availability Group, what is the Availability Group Name and Listener Name being used?
7. What is the SQL Server version and patch level? are the patch level the same on all nodes?
8. Lastly, I need to make sure it works on older versions of SQL Server. I have tested this in SQL Server 2008 R2 and higher (SQL Server 2019 is the latest version at the time of this writing).

Solution

I will try to explain a few sections of the script that I used to return some of the requirements above.

Check SQL Server version:

Below is the query I used to check the SQL Server version, if value is less than 11, the version is below SQL Server 2012:

SELECT @SQLServerMajorBuildVersion = LEFT(CAST(SERVERPROPERTY('ProductVersion') AS VARCHAR), CHARINDEX('.',CAST(SERVERPROPERTY('ProductVersion') AS VARCHAR))-1)

The sys.dm_os_cluster_nodes table for SQL Server 2008 R2 does not contain the is_current_owner column. The query to determine the primary node is different.

Check if the Database Instance is Clustered:

Below is the query I used to check if the the database server is clustered:

To check if failover cluster instance is being used, check this server property if the value is 1:

SELECT SERVERPROPERTY('IsClustered')

To check if availability group is being used, check this server property if the value is 1:

SELECT SERVERPROPERTY('IsHadrEnabled')

If both property values are 1, the database server is configure to use both failover cluster instance and availability group.

SELECT SERVERPROPERTY('IsHadrEnabled')) = 1 AND (SELECT SERVERPROPERTY('IsClustered')) = 1

Check Primary Node for Failover Cluster Instance:

I queried the sys.dm_os_cluster_nodes to get the nodes that are included in the cluster.
For SQL Server 2008 R2, I used this case statement:

CASE UPPER(NodeName)
    WHEN UPPER(CAST(SERVERPROPERTY('ComputerNamePhysicalNetBIOS') AS VARCHAR)) THEN 'PRIMARY'
    ELSE 'SECONDARY'
END AS HighAvailabilityRoleDesc

For SQL Server 2012 and higher, I used this case statement:

CASE is_current_owner 
    WHEN 1 THEN 'PRIMARY'
    ELSE 'SECONDARY'
END AS HighAvailabilityRoleDesc

Check Primary Node for Availability Group:

To get the primary node for Availability Group, you need to use the sys.dm_hadr_availability_group_states table and the primary_replica column. I have to join this with the sys.availability_replicas table and use a case statement to hardcode the values as PRIMARY or SECONDARY if I wanted it to run on any nodes.

The sys.dm_hadr_availability_replica_states table only returns multiple values when being queried in the primary server but only returns one row when querying the secondary node. Microsoft have stated this as an important information in their documentation.

To obtain information about every replica in a given availability group, query sys.dm_hadr_availability_replica_states on the server instance that is hosting the primary replica. When queried on a server instance that is hosting a secondary replica of an availability group, this dynamic management view returns only local information for the availability group.

SELECT UPPER(CAST(SERVERPROPERTY('ServerName') AS VARCHAR)) AS InstanceName
,UPPER(ar.replica_server_name) AS ServerName
,SERVERPROPERTY('IsClustered') AS IsClustered
,SERVERPROPERTY ('IsHadrEnabled') AS IsAvailabilityGroupEnabled
,ag.name AS AvailabilityGroupName
,agl.dns_name AS AvailabilityGroupListenerName
,CASE ar.replica_server_name WHEN hags.primary_replica THEN 'PRIMARY'
ELSE 'SECONDARY'
END AS HighAvailabilityRoleDesc
,'AG' AS SQLServerClusteringMethod
,@@VERSION AS SQLServerVersion
FROM sys.availability_groups AS ag
LEFT OUTER JOIN sys.availability_replicas AS ar ON ag.group_id = ar.group_id
LEFT OUTER JOIN sys.availability_group_listeners AS agl ON ag.group_id = agl.group_id
LEFT OUTER JOIN sys.dm_hadr_availability_group_states as hags ON ag.group_id = hags.group_id
ORDER BY 5,2; 


Below is the entire code of the stored procedure:

SET ANSI_NULLS ON;
SET ANSI_PADDING ON;
SET ANSI_WARNINGS ON;
SET ARITHABORT ON;
SET CONCAT_NULL_YIELDS_NULL ON;
SET QUOTED_IDENTIFIER ON;
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
GO

IF OBJECT_ID('dbo.sp_ClusterInfo') IS NULL
  EXEC ('CREATE PROCEDURE dbo.sp_ClusterInfo AS RETURN 0;');
GO

ALTER PROCEDURE dbo.sp_ClusterInfo
AS
SET NOCOUNT ON;
SET STATISTICS XML OFF;
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

DECLARE @SQLServerMajorBuildVersion AS INT 
SELECT @SQLServerMajorBuildVersion = LEFT(CAST(SERVERPROPERTY('ProductVersion') AS VARCHAR), CHARINDEX('.',CAST(SERVERPROPERTY('ProductVersion') AS VARCHAR))-1)

IF @SQLServerMajorBuildVersion < 11
BEGIN
       IF (SELECT SERVERPROPERTY('IsClustered')) = 1
       BEGIN
              SELECT UPPER(CAST(SERVERPROPERTY('ServerName') AS VARCHAR)) AS InstanceName  
              ,UPPER(NodeName) AS ServerName
              ,SERVERPROPERTY('IsClustered') AS IsClustered
              ,SERVERPROPERTY ('IsHadrEnabled') AS IsAvailabilityGroupEnabled
              ,NULL AS AvailabilityGroupName
              ,NULL AS AvailabilityGroupListenerName
              ,CASE UPPER(NodeName)
                     WHEN UPPER(CAST(SERVERPROPERTY('ComputerNamePhysicalNetBIOS') AS VARCHAR)) THEN 'PRIMARY'
                     ELSE 'SECONDARY'
              END AS HighAvailabilityRoleDesc
              ,'FCI' AS SQLServerClusteringMethod
              ,@@VERSION AS SQLServerVersion  
              FROM sys.dm_os_cluster_nodes
              ORDER BY NodeName;  
       END
       ELSE
       BEGIN
              SELECT UPPER(CAST(SERVERPROPERTY('ServerName') AS VARCHAR)) AS InstanceName  
              ,UPPER(CAST(SERVERPROPERTY('ComputerNamePhysicalNetBIOS') AS VARCHAR)) AS ServerName
              ,SERVERPROPERTY('IsClustered') AS IsClustered
              ,SERVERPROPERTY ('IsHadrEnabled') AS IsAvailabilityGroupEnabled
              ,NULL AS AvailabilityGroupName
              ,NULL AS AvailabilityGroupListenerName
              ,NULL AS HighAvailabilityRoleDesc
              ,'Standalone' AS SQLServerClusteringMethod  
              ,@@VERSION AS SQLServerVersion  
       END
END
ELSE
BEGIN
       IF (SELECT SERVERPROPERTY('IsHadrEnabled')) = 1 AND (SELECT SERVERPROPERTY('IsClustered')) = 1
       BEGIN
              SELECT UPPER(CAST(SERVERPROPERTY('ServerName') AS VARCHAR)) AS InstanceName
              ,UPPER(ar.replica_server_name) AS ServerName
              ,SERVERPROPERTY('IsClustered') AS IsClustered
              ,SERVERPROPERTY ('IsHadrEnabled') AS IsAvailabilityGroupEnabled
              ,ag.name AS AvailabilityGroupName
              ,agl.dns_name AS AvailabilityGroupListenerName
                       ,CASE ar.replica_server_name WHEN dhags.primary_replica THEN 'PRIMARY'
                       ELSE 'SECONDARY'
              END AS HighAvailabilityRoleDesc
              ,'FCI/AG' AS SQLServerClusteringMethod
              ,@@VERSION AS SQLServerVersion  
              FROM sys.availability_groups AS ag
              LEFT OUTER JOIN sys.availability_replicas AS ar ON ag.group_id = ar.group_id
              LEFT OUTER JOIN sys.availability_group_listeners AS agl ON ag.group_id = agl.group_id
                       LEFT OUTER JOIN sys.dm_hadr_availability_group_states as dhags ON ag.group_id = dhags.group_id
              WHERE UPPER(CAST(SERVERPROPERTY('ServerName') AS VARCHAR)) <> ar.replica_server_name
              UNION ALL
              SELECT UPPER(CAST(SERVERPROPERTY('ServerName') AS VARCHAR)) AS InstanceName  
              ,UPPER(NodeName) AS ServerName
              ,SERVERPROPERTY('IsClustered') AS IsClustered
              ,SERVERPROPERTY ('IsHadrEnabled') AS IsAvailabilityGroupEnabled
              ,NULL AS AvailabilityGroupName
              ,NULL AS AvailabilityGroupListenerName
              ,CASE is_current_owner 
                     WHEN 1 THEN 'PRIMARY'
                     ELSE 'SECONDARY'
              END AS HighAvailabilityRoleDesc
              ,'FCI/AG' AS SQLServerClusteringMethod
              ,@@VERSION AS SQLServerVersion  
              FROM sys.dm_os_cluster_nodes
              ORDER BY 5,2;  
       END
       ELSE
       BEGIN
              IF (SELECT SERVERPROPERTY('IsHadrEnabled')) = 1 
              BEGIN
   SELECT UPPER(CAST(SERVERPROPERTY('ServerName') AS VARCHAR)) AS InstanceName
   ,UPPER(ar.replica_server_name) AS ServerName
   ,SERVERPROPERTY('IsClustered') AS IsClustered
   ,SERVERPROPERTY ('IsHadrEnabled') AS IsAvailabilityGroupEnabled
   ,ag.name AS AvailabilityGroupName
   ,agl.dns_name AS AvailabilityGroupListenerName
   ,CASE ar.replica_server_name WHEN dhags.primary_replica THEN 'PRIMARY'
   ELSE 'SECONDARY'
   END AS HighAvailabilityRoleDesc
   ,'AG' AS SQLServerClusteringMethod
   ,@@VERSION AS SQLServerVersion
   FROM sys.availability_groups AS ag
   LEFT OUTER JOIN sys.availability_replicas AS ar ON ag.group_id = ar.group_id
   LEFT OUTER JOIN sys.availability_group_listeners AS agl ON ag.group_id = agl.group_id
   LEFT OUTER JOIN sys.dm_hadr_availability_group_states as dhags ON ag.group_id = dhags.group_id
   ORDER BY 5,2;  
              END
              ELSE 
              BEGIN
                     IF (SELECT SERVERPROPERTY('IsClustered')) = 1
                     BEGIN
                           SELECT UPPER(CAST(SERVERPROPERTY('ServerName') AS VARCHAR)) AS InstanceName  
                           ,UPPER(NodeName) AS ServerName
                           ,SERVERPROPERTY('IsClustered') AS IsClustered
                           ,SERVERPROPERTY ('IsHadrEnabled') AS IsAvailabilityGroupEnabled
                           ,NULL AS AvailabilityGroupName
                           ,NULL AS AvailabilityGroupListenerName
                           ,CASE is_current_owner 
                                  WHEN 1 THEN 'PRIMARY'
                                  ELSE 'SECONDARY'
                           END AS HighAvailabilityRoleDesc
                           ,'FCI' AS SQLServerClusteringMethod
                           ,@@VERSION AS SQLServerVersion  
                           FROM sys.dm_os_cluster_nodes
                           ORDER BY NodeName;  
                     END
                     ELSE
                     BEGIN
                           SELECT UPPER(CAST(SERVERPROPERTY('ServerName') AS VARCHAR)) AS InstanceName  
                           ,UPPER(CAST(SERVERPROPERTY('ComputerNamePhysicalNetBIOS') AS VARCHAR)) AS ServerName
                           ,SERVERPROPERTY('IsClustered') AS IsClustered
                           ,SERVERPROPERTY ('IsHadrEnabled') AS IsAvailabilityGroupEnabled
                           ,NULL AS AvailabilityGroupName
                           ,NULL AS AvailabilityGroupListenerName
                           ,NULL AS HighAvailabilityRoleDesc
                           ,'Standalone' AS SQLServerClusteringMethod
                           ,@@VERSION AS SQLServerVersion;  
                     END
              END
       END
END;


Update: You can now download the latest version from github.