How to set up Basic Always-On Availability Groups in SQL Server Standard Edition

Set up Basic Always-on availability & disaster recovery

In SQL Server Standard Edition

Index

Contents

Index 1

Introduction 2

List of acronyms used in this document: 2

Explanation 2

A brief explanation of what BADR is and why we use it: 2

Limitations 3

Prerequisites for setting up BADR 4

Now we need to do the actual Basic Availability Setup 4

Step 1: Enable Always On in all participating instances of SQL Server 4

Step 2: Restart SQL Server instance Service 6

Step 3: Configure AG per database per node 6

Step 4: Adding more databases to AG 14

The beauty of BADR: 16

Introduction

List of acronyms used in this document:

Acronym Description
WSFC Windows Server Failover Cluster (A group of connected and interdependent servers used for reliability and availability of an environment)
HA High Availability
AG Availability Groups
BAG Basic Availability Groups
HADR High Always-On Availability and Disaster Recovery
FCI Failover Cluster Instances
BADR Basic Always-on Availability and Disaster Recovery

Explanation

In this walkthrough, we will be setting up BADR in SQL Server Standard, as a base for HA.

A brief explanation of what BADR is and why we use it:

This is a group of SQL servers that work together to ensure high availability of applications and services.

Any HA platform in SQL Server runs in WSFC (albeit for FCI or AG).

To begin, let us provide a succent breakdown of the various components, and compare between SQL Server Standard and Enterprise editions:

Component Standard Enterprise
Always On failover cluster instances Yes Yes
High Always On availability groups No Yes
Basic Always-On availability groups Yes No
Contained availability groups No Yes
Distributed availability groups No Yes
Number of Availability Groups per Database Server Limited to number of databases in Primary Limited to system resources
Number of Databases per availability group 1 Number of databases in Primary
Maximums:
Nodes 2 9
Of which:
  • Primary
1 1
  • Read-only Secondary Replicas
0 2
  • Synchronised Secondary Replicas with Automatic Failover
1 3
  • Synchronised Secondary Replicas with Manual Failover
1 5
  • Asynchronous Secondary Replicas with Manual / Forced Failover
1 5

This functionality therefore ensures that a copy of the current primary is always available, with the following caveats:

  • For Synchronised Sencondary replicas, the more replicas in the group, the longer it will take to finalise each entry (incrementally higher latency), i.e. all secondaries in the chain need to have hardened their logs before the primary records the transaction as successful. Therefore:

    • Location is important, the closer to the primary the better.
    • Connectivity speeds between members of the WSFC makes a difference (only use in a high-speed intranet environment).
  • For Asynchronous Secondary replicas, commit occurs on the Primary and responds as such, before sending the logs to the secondary replicas. Therefore:

    • Location and network speed do not matter. However, stability does.
    • However, the probability of data loss in case of forced failover is greater than with Synchronised Secondary Replicas.

Limitations

Basic availability groups use a subset of features compared to advanced availability groups on SQL Server Enterprise Edition. Basic availability groups include the following limitations:

  • Limit of two replicas (primary and secondary). Basic Availability Groups for SQL Server on Linux support an additional configuration only replica.
  • No read access on secondary replica.
  • No backups on secondary replica.
  • No integrity checks on secondary replicas.
  • No support for replicas hosted on servers running a version of SQL Server prior to SQL Server 2016 Community Technology Preview 3 (CTP3).
  • Support for one availability database.
  • Basic availability groups cannot be upgraded to advanced availability groups. The group must be dropped and re-added to a group that contains servers running only SQL Server Enterprise Edition.
  • Basic availability groups are only supported for Standard Edition servers.
  • Basic availability groups cannot be part of a distributed availability group.
  • You may have multiple Basic availability groups connected to a single instance of SQL Server.

Prerequisites for setting up BADR

  • Ensure both nodes are on the same domain.
  • Both nodes have the same version and edition of SQL server installed, with the same level of patching.
  • The hard drive locations are the same on both replicas.
  • WSFC has been set up and is correctly configured, and both nodes are members thereof.

Now we need to do the actual Basic Availability Setup

To showcase this, I have the following setup in my environment:

  • Host: Windows Server 2022 Standard with Hyper-V

    • Virtual Servers:

      • Active Directory Services. Domain Controller and DNS Services (In the real world, these would be separate servers.)
      • DB Server 1, SQL Server 2022 with 3 databases
      • DB Server 2, SQL Server 2022.
      • File Server (this will be the tiebreaker instance)

A screenshot of a computer

Description automatically generated

Step 1: Enable Always On in all participating instances of SQL Server

  • Open SQL Server Configuration Manager (as Administrator).
  • Select SQL Server Services.
  • Right Click on the instance of SQL server and select Properties.
  • In the properties form, select Always On Availability Groups tab.
  • Tick Enable Always On Availability Groups.

A screenshot of a computer

Description automatically generated

  • A popup notification will appear to state that the changes will be applied but will only take effect after the service is restarted. Click on OK.

Step 2: Restart SQL Server instance Service

Right Click on the SQL Server instance and select Restart. The Stopping and starting pane will briefly appear

A screenshot of a computer error

Description automatically generated
A screenshot of a computer error

Description automatically generated

Step 3: Configure AG per database per node

To create a basic availability group, use the CREATE AVAILABILITY GROUP Transact-SQL command and specify the WITH BASIC option (the default is ADVANCED). You can also create the basic availability group using the UI in SQL Server Management Studio.

For the HR database, we opt for Asynchronous Secondary:

Endpoints:

A screenshot of a computer

Description automatically generated

Backup Preference:

Listener:

Once that is done, we proceed to Data Synchronisation:

We are going with Join only, as a backup and restore was done on the secondary. This will take us to the validation page:

Summary:

Pressing Finish will execute setup at the various locations.

The full detail of what was done is reflected.

Now, looking at the primary DB server in SSMS, we can see:

A screenshot of a computer

Description automatically generated

The Secondary Replica reflects:

We can check that the Virtual Network Node for the listener has been created in the domain:

Step 4: Adding more databases to AG

Adding the other databases to availability groups reflects as follows:

  • On the primary replica:

    • In SSMS Connected database list:

A screenshot of a computer

Description automatically generated

    • AG Dashboard:

  • On the secondary replica:

    • In SSMS Connected database list:

A screenshot of a computer

Description automatically generated

    • AD Dashboard:

A screenshot of a computer

Description automatically generated

  • On Active Directory:

    • Computers:

A group of black text

Description automatically generated

The beauty of BADR:

  • Connections to the database can now be made using the listener, and not directly to the primary and/or secondary replica’s (even though an instance of SQL server had not been installed on the listener (which is, in and of itself, a virtual network node, with no OS or applications installed)). Thus, in the case of a failover, it is not necessary to change the connection string in the application.

A screenshot of a computer

Description automatically generated

  • The Always-On Dashboard gives a birds-eye view of the synchronisation status of participating databases

A screenshot of a computer

Description automatically generated

  • Manual Failover can take place with the click of a button.

Here is the entire script applied, at each server as highlighted, to activate BADR:

— YOU MUST EXECUTE THE FOLLOWING SCRIPT IN SQLCMD MODE.

:Connect DBSERV01

IF (SELECT state FROM sys.endpoints WHERE name = N’Hadr_endpoint’) <> 0

BEGIN

ALTER ENDPOINT [Hadr_endpoint] STATE = STARTED

END

GO

use [master]

GO

GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [ACSCS\SQLService]

GO

:Connect DBSERV01

IF EXISTS(SELECT * FROM sys.server_event_sessions WHERE name=’AlwaysOn_health’)

BEGIN

ALTER EVENT SESSION [AlwaysOn_health] ON SERVER WITH (STARTUP_STATE=ON);

END

IF NOT EXISTS(SELECT * FROM sys.dm_xe_sessions WHERE name=’AlwaysOn_health’)

BEGIN

ALTER EVENT SESSION [AlwaysOn_health] ON SERVER STATE=START;

END

GO

:Connect DBSERV02

IF (SELECT state FROM sys.endpoints WHERE name = N’Hadr_endpoint’) <> 0

BEGIN

ALTER ENDPOINT [Hadr_endpoint] STATE = STARTED

END

GO

use [master]

GO

GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [ACSCS\SQLService]

GO

:Connect DBSERV02

IF EXISTS(SELECT * FROM sys.server_event_sessions WHERE name=’AlwaysOn_health’)

BEGIN

ALTER EVENT SESSION [AlwaysOn_health] ON SERVER WITH (STARTUP_STATE=ON);

END

IF NOT EXISTS(SELECT * FROM sys.dm_xe_sessions WHERE name=’AlwaysOn_health’)

BEGIN

ALTER EVENT SESSION [AlwaysOn_health] ON SERVER STATE=START;

END

GO

:Connect DBSERV01

USE [master]

GO

CREATE AVAILABILITY GROUP [HR_AG]

WITH (AUTOMATED_BACKUP_PREFERENCE = PRIMARY,

BASIC,

DB_FAILOVER = ON,

DTC_SUPPORT = NONE,

REQUIRED_SYNCHRONIZED_SECONDARIES_TO_COMMIT = 0)

FOR DATABASE [HR]

REPLICA ON N’DBServ01′ WITH (ENDPOINT_URL = N’TCP://DBServ01.ACSCS.arturicast.co.za:5022′, FAILOVER_MODE = MANUAL, AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT, BACKUP_PRIORITY = 50, SEEDING_MODE = MANUAL, SECONDARY_ROLE(ALLOW_CONNECTIONS = NO)),

N’DBServ02′ WITH (ENDPOINT_URL = N’TCP://DBServ02.ACSCS.arturicast.co.za:5022′, FAILOVER_MODE = MANUAL, AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT, BACKUP_PRIORITY = 50, SEEDING_MODE = MANUAL, SECONDARY_ROLE(ALLOW_CONNECTIONS = NO));

GO

:Connect DBSERV01

USE [master]

GO

ALTER AVAILABILITY GROUP [HR_AG]

ADD LISTENER N’Listener_HR_AG’ (

WITH IP

((N’172.26.92.14′, N’255.255.0.0′)

)

, PORT=1433);

GO

:Connect DBSERV02

ALTER AVAILABILITY GROUP [HR_AG] JOIN;

GO

:Connect DBSERV01

BACKUP DATABASE [HR] TO DISK = N’\\DBSERV01\UserDB\HR\HR.bak’ WITH COPY_ONLY, FORMAT, INIT, SKIP, REWIND, NOUNLOAD, COMPRESSION, STATS = 5

GO

:Connect DBSERV02

RESTORE DATABASE [HR] FROM DISK = N’\\DBSERV01\UserDB\HR\HR.bak’ WITH NORECOVERY, NOUNLOAD, STATS = 5

GO

:Connect DBSERV01

BACKUP LOG [HR] TO DISK = N’\\DBSERV01\UserDB\HR\HR.trn’ WITH NOFORMAT, INIT, NOSKIP, REWIND, NOUNLOAD, COMPRESSION, STATS = 5

GO

:Connect DBSERV02

RESTORE LOG [HR] FROM DISK = N’\\DBSERV01\UserDB\HR\HR.trn’ WITH NORECOVERY, NOUNLOAD, STATS = 5

GO

:Connect DBSERV02

— Wait for the replica to start communicating

begin try

declare @conn bit

declare @count int

declare @replica_id uniqueidentifier

declare @group_id uniqueidentifier

set @conn = 0

set @count = 30 — wait for 5 minutes

if (serverproperty(‘IsHadrEnabled’) = 1)

and (isnull((select member_state from master.sys.dm_hadr_cluster_members where upper(member_name COLLATE Latin1_General_CI_AS) = upper(cast(serverproperty(‘ComputerNamePhysicalNetBIOS’) as nvarchar(256)) COLLATE Latin1_General_CI_AS)), 0) <> 0)

and (isnull((select state from master.sys.database_mirroring_endpoints), 1) = 0)

begin

select @group_id = ags.group_id from master.sys.availability_groups as ags where name = N’HR_AG’

select @replica_id = replicas.replica_id from master.sys.availability_replicas as replicas where upper(replicas.replica_server_name COLLATE Latin1_General_CI_AS) = upper(@@SERVERNAME COLLATE Latin1_General_CI_AS) and group_id = @group_id

while @conn <> 1 and @count > 0

begin

set @conn = isnull((select connected_state from master.sys.dm_hadr_availability_replica_states as states where states.replica_id = @replica_id), 1)

if @conn = 1

begin

— exit loop when the replica is connected, or if the query cannot find the replica status

break

end

waitfor delay ’00:00:10′

set @count = @count – 1

end

end

end try

begin catch

— If the wait loop fails, do not stop execution of the alter database statement

end catch

ALTER DATABASE [HR] SET HADR AVAILABILITY GROUP = [HR_AG];

GO

GO

Posted in Database, Programming | Tagged , , , | Comments Off on How to set up Basic Always-On Availability Groups in SQL Server Standard Edition

How to Set up Windows Server Failover Clusters

Set up windows server failover clusters

As a precursor for HA in SQL Server Standard Edition

Index

Contents

Set up windows server failover clusters 0

Index 1

Introduction 2

List of acronyms used in this document: 2

Explanation 2

A brief explanation of what WSFC is and why we use it: 2

Prerequisites for setting up WSFC 3

Additionally, verify the following account requirements: 3

Now we need to do the actual Clustering work 4

Step 1: Install Failover Cluster feature on all servers that will form part of the cluster 4

Step 2: Validate the configuration 5

Step 3: Run cluster validation tests 6

Step 4: Create the failover cluster 9

Step 5: Create Witness 14

Final Outcomes 17

One last thing – Setting up Cluster-Aware Updating 18

Feature description 18

Practical applications 19

Important functionality 20

Configure the nodes for remote management 21

Enable a firewall rule to allow automatic restarts 22

Note 22

Enable Windows Management Instrumentation (WMI) 23

Enable Windows PowerShell and Windows PowerShell remoting 23

Install .NET Framework 4.6 or 4.5 23

Best practices recommendations for using Cluster-Aware Updating 24

Recommendations for applying Microsoft updates 24

Introduction

List of acronyms used in this document:

Acronym

Description

WSFC

Windows Server Failover Cluster (A group of connected and interdependent servers used for reliability and availability of an environment)

HA

High Availability

AG

Availability Groups

BAG

Basic Availability Groups

HADR

High Availability and Disaster Recovery

FCI

Failover Cluster Instances

BADR

Basic Availability and Disaster Recovery

AD

Active Directory

DS

Domain Services

OU

Organisational Unit

CAU

Cluster-Aware Updating

Explanation

In this walkthrough, we will be setting up WSFC in Windows Server.

A brief explanation of what WSFC is and why we use it:

This is a group of Windows servers that work together to ensure high availability of applications and services.

Any WSFC (albeit for FCI or AG) comprises the following components:

Component

Definition

Node

Any server that participates (preferable to have an odd number of nodes, as one will perform the role of tiebreaker).

Cluster Resource

A physical or logical entity that can be owned by a node, brought online and taken offline, moved between nodes, and managed as a cluster object. A cluster resource can be owned by only a single node at any point in time.

Role

A collection of cluster resources managed as a single cluster object to provide specific functionality

Network name resource

A logical server name that is managed as a cluster resource. A network name resource must be used with an IP address resource. These entries may require objects in Active Directory Domain Services and/or DNS.

Resource dependency

A resource on which another resource depends. If resource A depends on resource B, then B is a dependency of A. Resource A will not be able to start without resource B.

Preferred owner

A node on which a resource group prefers to run. Each resource group is associated with a list of preferred owners sorted in order of preference. During automatic failover, the resource group is moved to the next preferred node in the preferred owner list.

Possible owner

A secondary node on which a resource can run. Each resource group is associated with a list of possible owners. Roles can fail over only to nodes that are listed as possible owners.

Quorum mode

The quorum configuration in a failover cluster that determines the number of node failures that the cluster can sustain.

Force quorum

The process to start the cluster even though only a minority of the elements that are required for quorum are in communication

It provides infrastructure features that support the high-availability and disaster recovery scenarios of hosted server applications.

The nodes in a WSFC work together to collectively provide these types of capabilities:

  • Distributed metadata and notifications
  • Resource management
  • Health monitoring
  • Failover coordination

Prerequisites for setting up WSFC

  • Make sure that all servers that you want to add as cluster nodes are running the same version of Windows Server.
  • Review the hardware requirements to make sure that your configuration is supported.
  • Make sure that all servers that you want to add as cluster nodes are joined to the same Active Directory domain.
  • Optionally, create an organizational unit (OU) and move the computer accounts for the servers that you want to add as cluster nodes into the OU. As a best practice, we recommend that you place failover clusters in their own OU in AD DS. This can help you better control which Group Policy settings or security template settings affect the cluster nodes. By isolating clusters in their own OU, it also helps prevent against accidental deletion of cluster computer objects.

Additionally, verify the following account requirements:

  • Make sure that the account you want to use to create the cluster is a domain user who has administrator rights on all servers that you want to add as cluster nodes.
  • Make sure that either of the following is true:
    • The user who creates the cluster has the Create Computer objects permission to the OU or the container where the servers that will form the cluster reside.
    • If the user does not have the Create Computer objects permission, ask a domain administrator to prestage a cluster computer object for the cluster.

Now we need to do the actual Clustering work

To showcase this, I have the following setup in my environment:

  • Host: Windows Server 2022 Standard with Hyper-V
    • Virtuals:
      • Active Directory Services. Domain Controller and DNS Services (In the real world, this would be separate servers.)
      • DB Server 1
      • DB Server 2
      • File Server (this will be the tiebreaker instance)

A screenshot of a computer

Description automatically generated

Step 1: Install Failover Cluster feature on all servers that will form part of the cluster

A screenshot of a computer

Description automatically generated

Step 2: Validate the configuration

Before you create the failover cluster, we strongly recommend that you validate the configuration to make sure that the hardware and hardware settings are compatible with failover clustering. Microsoft supports a cluster solution only if the complete configuration passes all validation tests and if all hardware is certified for the version of Windows Server that the cluster nodes are running.

The above demo environments do meet this, so we continue.

Step 3: Run cluster validation tests

  1. On a computer that has the Failover Cluster Management Tools installed from the Remote Server Administration Tools or on a server where you installed the Failover Clustering feature, start Failover Cluster Manager. To do this on a server, start Server Manager and then on the Tools menu, select Failover Cluster Manager.

      1. In the Failover Cluster Manager pane under Management, select Validate Configuration.

A screenshot of a computer error

Description automatically generated

      1. On the Before You Begin page, select Next.
      2. On the Select Servers or a Cluster page in the Enter name box, enter the NetBIOS name or the fully qualified domain name of a server that you plan to add as a failover cluster node and then select Add. Repeat this step for each server that you want to add. To add multiple servers at the same time, separate the names by a comma or by a semicolon. For example, enter the names in the format server1.contoso.com, server2.contoso.com. When you are finished, select Next

  1. On the Testing Options page, select Run all tests (recommended) and then select Next.

  1. On the Confirmation page, select Next. The Validating page displays the status of the running tests.

A screenshot of a software update

Description automatically generated

      1. On the Summary page, do either of the following:
        • If the results indicate that the tests completed successfully and the configuration is suited for clustering and you want to create the cluster immediately, make sure that the Create the cluster now using the validated nodes check box is selected and then select Finish. Then, continue to step 4 of the Create the failover cluster procedure.

        • If the results indicate that there were warnings or failures, select View Report to view the details and determine which issues must be corrected. Realize that a warning for a particular validation test indicates that this aspect of the failover cluster can be supported but might not meet the recommended best practices.

Step 4: Create the failover cluster

To complete this step, make sure that the user account that you log on as meets the requirements that are outlined in the Verify the prerequisites section of this topic.

  1. Start Server Manager.
  2. On the Tools menu, select Failover Cluster Manager.
  3. In the Failover Cluster Manager pane, under Management, select Create Cluster.

The Create Cluster Wizard opens.

  1. On the Before You Begin page, select Next.
  2. If the Select Servers page appears, in the Enter name box enter the NetBIOS name or the fully qualified domain name of a server that you plan to add as a failover cluster node and then select Add. Repeat this step for each server that you want to add. To add multiple servers at the same time, separate the names by a comma or a semicolon. For example, enter the names in the format server1.contoso.com; server2.contoso.com. When you are finished, select Next.

 Note

If you chose to create the cluster immediately after running validation in the configuration validating procedure, you will not see the Select Servers page. The nodes that were validated are automatically added to the Create Cluster Wizard so that you do not have to enter them again.

  1. If you skipped validation earlier, the Validation Warning page appears. We strongly recommend that you run cluster validation. Only clusters that pass all validation tests are supported by Microsoft. To run the validation tests, select Yes, and then select Next. Complete the Validate a Configuration Wizard as described in Validate the configuration.
  2. On the Access Point for Administering the Cluster page, do the following:
    1. In the Cluster Name box, enter the name that you want to use to administer the cluster. Before you do, review the following information:
      1. During cluster creation, this name is registered as the cluster computer object (also known as the cluster name object or CNO) in AD DS. If you specify a NetBIOS name for the cluster, the CNO is created in the same location where the computer objects for the cluster nodes reside. This can be either the default Computers container or an OU.
      2. To specify a different location for the CNO, you can enter the distinguished name of an OU in the Cluster Name box. For example: CN=ClusterName, OU=Clusters, DC=Contoso, DC=com.
      3. If a domain administrator has prestaged the CNO in a different OU than where the cluster nodes reside, specify the distinguished name that the domain administrator provides.
    2. If the server does not have a network adapter that is configured to use DHCP, you must configure one or more static IP addresses for the failover cluster. Select the check box next to each network that you want to use for cluster management. Select the Address field next to a selected network and then enter the IP address that you want to assign to the cluster. This IP address (or addresses) will be associated with the cluster name in Domain Name System (DNS).

    1. When you are finished, select Next.
    2. On the Confirmation page, review the settings. By default, the Add all eligible storage to the cluster check box is selected. Clear this check box if you want to do either of the following:
    • You want to configure storage later.
    • You plan to create clustered storage spaces through Failover Cluster Manager or through the Failover Clustering Windows PowerShell cmdlets and have not yet created storage spaces in File and Storage Services. For more information, see Deploy Clustered Storage Spaces.

A screenshot of a computer

Description automatically generated

  1. Select Next to create the failover cluster.

  1. On the Summary page, confirm that the failover cluster was successfully created. If there were any warnings or errors, view the summary output or select View Report to view the full report. Select Finish.
  2. To confirm that the cluster was created, verify that the cluster name is listed under Failover Cluster Manager in the navigation tree. You can expand the cluster name and then select items under NodesStorage or Networks to view the associated resources.

Realize that it may take some time for the cluster name to successfully replicate in DNS. After successful DNS registration and replication, if you select All Servers in Server Manager, the cluster name should be listed as a server with a Manageability status of Online.

After the cluster is created, you can do things such as verify cluster quorum configuration, and optionally, create Cluster Shared Volumes (CSV)

A screenshot of a computer

Description automatically generated

A screenshot of a computer

Description automatically generated

Step 5: Create Witness

What is needed next is to create a witness.

    1. In the Failover Cluster Manager, locate the new cluster resource, right click, More Actions, Configure Cluster Quorum Settings.

    1. Choose “Select the quorum witness”

A screenshot of a computer

Description automatically generated

    1. Here are various options. Choose what is right for your setup. In my lab I chose Configure file share witness

    1. Configure share witness

    1. Next the confirmation screen

    1. If all in order, press next

If all works according to plan, the cluster quorum witness will be created and added to the list of objects.

Final Outcomes

The local server properties now reflect:

A screenshot of a computer

Description automatically generated

A close-up of a computer screen

Description automatically generated

Now we have a fully witnessed WSFC set up, any additional setup (e.g. FCI, HADR, BADR) is much simpler.

One last thing – Setting up Cluster-Aware Updating

Now that WSFC has been set up, we need to carefully consider how to apply updates.

Do not just update haphazardly. Care needs to be exercised. To this end, once the Clustering Management tool has been installed, Cluster-Aware Updating should be applied.

Feature description

Cluster-Aware Updating is an automated feature that enables you to update servers in a failover cluster with little or no loss in availability during the update process. During an Updating Run, Cluster-Aware Updating transparently performs the following tasks:

  1. Puts each node of the cluster into node maintenance mode.
  2. Moves the clustered roles off the node.
  3. Installs the updates and any dependent updates.
  4. Performs a restart if necessary.
  5. Brings the node out of maintenance mode.
  6. Restores the clustered roles on the node.
  7. Moves to update the next node.

For many clustered roles in the cluster, the automatic update process triggers a planned failover. This can cause a transient service interruption for connected clients. However, in the case of continuously available workloads, such as Hyper-V with live migration or file server with SMB Transparent Failover, Cluster-Aware Updating can coordinate cluster updates with no impact to the service availability.

Practical applications

  • CAU reduces service outages in clustered services, reduces the need for manual updating workarounds, and makes the end-to-end cluster updating process more reliable for the administrator. When the CAU feature is used in conjunction with continuously available cluster workloads, such as continuously available file servers (file server workload with SMB Transparent Failover) or Hyper-V, the cluster updates can be performed with zero impact to service availability for clients.
  • CAU facilitates the adoption of consistent IT processes across the enterprise. Updating Run Profiles can be created for different classes of failover clusters and then managed centrally on a file share to ensure that CAU deployments throughout the IT organization apply updates consistently, even if the clusters are managed by different lines-of-business or administrators.
  • CAU can schedule Updating Runs on regular daily, weekly, or monthly intervals to help coordinate cluster updates with other IT management processes.
  • CAU provides an extensible architecture to update the cluster software inventory in a cluster-aware fashion. This can be used by publishers to coordinate the installation of software updates that are not published to Windows Update or Microsoft Update or that are not available from Microsoft, for example, updates for non-Microsoft device drivers.
  • CAU self-updating mode enables a “cluster in a box” appliance (a set of clustered physical machines, typically packaged in one chassis) to update itself. Typically, such appliances are deployed in branch offices with minimal local IT support to manage the clusters. Self-updating mode offers great value in these deployment scenarios.

Important functionality

The following is a description of important Cluster-Aware Updating functionality:

  • A user interface (UI) – the Cluster Aware Updating window – and a set of cmdlets that you can use to preview, apply, monitor, and report on the updates
  • An end-to-end automation of the cluster-updating operation (an Updating Run), orchestrated by one or more Update Coordinator computers
  • A default plug-in that integrates with the existing Windows Update Agent (WUA) and Windows Server Update Services (WSUS) infrastructure in Windows Server to apply important Microsoft updates
  • A second plug-in that can be used to apply Microsoft hotfixes, and that can be customized to apply non-Microsoft updates
  • Updating Run Profiles that you configure with settings for Updating Run options, such as the maximum number of times that the update will be retried per node. Updating Run Profiles enable you to rapidly reuse the same settings across Updating Runs and easily share the update settings with other failover clusters.
  • An extensible architecture that supports new plug-in development to coordinate other node-updating tools across the cluster, such as custom software installers, BIOS updating tools, and network adapter or host bus adapter (HBA) updating tools.

Cluster-Aware Updating can coordinate the complete cluster updating operation in two modes:

  • Self-updating mode For this mode, the CAU clustered role is configured as a workload on the failover cluster that is to be updated, and an associated update schedule is defined. The cluster updates itself at scheduled times by using a default or custom Updating Run profile. During the Updating Run, the CAU Update Coordinator process starts on the node that currently owns the CAU clustered role, and the process sequentially performs updates on each cluster node. To update the current cluster node, the CAU clustered role fails over to another cluster node, and a new Update Coordinator process on that node assumes control of the Updating Run. In self-updating mode, CAU can update the failover cluster by using a fully automated, end-to-end updating process. An administrator can also trigger updates on-demand in this mode, or simply use the remote-updating approach if desired. In self-updating mode, an administrator can get summary information about an Updating Run in progress by connecting to the cluster and running the Get-CauRun Windows PowerShell cmdlet.
  • Remote-updating mode For this mode, a remote computer, which is called an Update Coordinator, is configured with the CAU tools. The Update Coordinator is not a member of the cluster that is updated during the Updating Run. From the remote computer, the administrator triggers an on-demand Updating Run by using a default or custom Updating Run profile. Remote-updating mode is useful for monitoring real-time progress during the Updating Run, and for clusters that are running on Server Core installations.

Configure the nodes for remote management

To use Cluster-Aware Updating, all nodes of the cluster must be configured for remote management. By default, the only task you must perform to configure the nodes for remote management is to Enable a firewall rule to allow automatic restarts.

The following table lists the complete remote management requirements, in case your environment diverges from the defaults.

These requirements are in addition to the installation requirements for the Install the Failover Clustering feature and the Failover Clustering Tools and the general clustering requirements that are described in previous sections in this topic.

Requirement

Default state

Self-updating mode

Remote-updating mode

Enable a firewall rule to allow automatic restarts

Disabled

Required on all cluster nodes if a firewall is in use

Required on all cluster nodes if a firewall is in use

Enable Windows Management Instrumentation

Enabled

Required on all cluster nodes

Required on all cluster nodes

Enable Windows PowerShell 3.0 or 4.0 and Windows PowerShell remoting

Enabled

Required on all cluster nodes

Required on all cluster nodes to run the following:

– The Save-CauDebugTrace cmdlet
– PowerShell pre-update and post-update scripts during an Updating Run
– Tests of cluster updating readiness using the Cluster-Aware Updating window or the Test-CauSetup Windows PowerShell cmdlet

Install .NET Framework 4.6 or 4.5

Enabled

Required on all cluster nodes

Required on all cluster nodes to run the following:

– The Save-CauDebugTrace cmdlet
– PowerShell pre-update and post-update scripts during an Updating Run
– Tests of cluster updating readiness using the Cluster-Aware Updating window or the Test-CauSetup Windows PowerShell cmdlet

Enable a firewall rule to allow automatic restarts

To allow automatic restarts after updates are applied (if the installation of an update requires a restart), if Windows Firewall or a non-Microsoft firewall is in use on the cluster nodes, a firewall rule must be enabled on each node that allows the following traffic:

Protocol: TCP

Direction: inbound

Program: wininit.exe

Ports: RPC Dynamic Ports

Profile: Domain

If Windows Firewall is used on the cluster nodes, you can do this by enabling the Remote Shutdown Windows Firewall rule group on each cluster node. When you use the Cluster-Aware Updating window to apply updates and to configure self-updating options, the Remote Shutdown Windows Firewall rule group is automatically enabled on each cluster node.

Note

The Remote Shutdown Windows Firewall rule group cannot be enabled when it will conflict with Group Policy settings that are configured for Windows Firewall.

The Remote Shutdown firewall rule group is also enabled by specifying the –EnableFirewallRules parameter when running the following CAU cmdlets: Add-CauClusterRole, Invoke-CauRun, and SetCauClusterRole.

The following PowerShell example shows an additional method to enable automatic restarts on a cluster node.

Set-NetFirewallRule -Group “@firewallapi.dll,-36751” -Profile Domain -Enabled true

Enable Windows Management Instrumentation (WMI)

All cluster nodes must be configured for remote management using Windows Management Instrumentation (WMI). This is enabled by default.

To manually enable remote management, do the following:

In the Services console, start the Windows Remote Management service and set the startup type to Automatic.

Run the Set-WSManQuickConfig cmdlet, or run the following command from an elevated command prompt:

winrm quickconfig -q

To support WMI remoting, if Windows Firewall is in use on the cluster nodes, the inbound firewall rule for Windows Remote Management (HTTP-In) must be enabled on each node. By default, this rule is enabled.

Enable Windows PowerShell and Windows PowerShell remoting

To enable self-updating mode and certain CAU features in remote-updating mode, PowerShell must be installed and enabled to run remote commands on all cluster nodes. By default, PowerShell is installed and enabled for remoting.

To enable PowerShell remoting, use one of the following methods:

Run the Enable-PSRemoting cmdlet.

Configure a domain-level Group Policy setting for Windows Remote Management (WinRM).

For more information about enabling PowerShell remoting, see About Remote Requirements.

Install .NET Framework 4.6 or 4.5

To enable self-updating mode and certain CAU features in remote-updating mode,.NET Framework 4.6, or .NET Framework 4.5 (on Windows Server 2012 R2) must be installed on all cluster nodes. By default, NET Framework is installed.

To install .NET Framework 4.6 (or 4.5) using PowerShell if it’s not already installed, use the following command:

Install-WindowsFeature -Name NET-Framework-45-Core

Best practices recommendations for using Cluster-Aware Updating

Recommendations for applying Microsoft updates

We recommend that when you begin to use CAU to apply updates with the default Microsoft.WindowsUpdatePlugin plug-in on a cluster, you stop using other methods to install software updates from Microsoft on the cluster nodes.

Caution

Combining CAU with methods that update individual nodes automatically (on a fixed time schedule) can cause unpredictable results, including interruptions in service and unplanned downtime.

We recommend that you follow these guidelines:

For optimal results, we recommend that you disable settings on the cluster nodes for automatic updating, for example, through the Automatic Updates settings in Control Panel, or in settings that are configured using Group Policy.

Caution

Automatic installation of updates on the cluster nodes can interfere with installation of updates by CAU and can cause CAU failures.

If they are needed, the following Automatic Updates settings are compatible with CAU, because the administrator can control the timing of update installation:

  • Settings to notify before downloading updates and to notify before installation
  • Settings to automatically download updates and to notify before installation

However, if Automatic Updates is downloading updates at the same time as a CAU Updating Run, the Updating Run might take longer to complete.

Do not configure an update system such as Windows Server Update Services (WSUS) to apply updates automatically (on a fixed time schedule) to cluster nodes.

All cluster nodes should be uniformly configured to use the same update source, for example, a WSUS server, Windows Update, or Microsoft Update.

If you use a configuration management system to apply software updates to computers on the network, exclude cluster nodes from all required or automatic updates. Examples of configuration management systems include Microsoft Endpoint Configuration Manager and Microsoft System Center Virtual Machine Manager 2008.

If internal software distribution servers (for example, WSUS servers) are used to contain and deploy the updates, ensure that those servers correctly identify the approved updates for the cluster nodes.

Posted in Failover, Windows Server | Tagged , , | Comments Off on How to Set up Windows Server Failover Clusters

Data recovery services: What is it and when will you make use of it?

Have you ever had a PC that just would not start up? Or a hard drive that is just not read by any machine?

Did an external hard drive stop responding, become unsearchable and/or start making funny clicking sounds?

Have you ever formatted a drive (albeit hard drive or flash drive (memory stick)), only to realise you needed the data on that drive and didn’t make a backup thereof before clicking on the OK button?

These are some of the scenarios wherein you could conceivably make use of data recovery services.

But what can data recovery services do for you, you may ask.

First off, please always bear on mind that while data recovery firms will always do everything in their power to recover your precious data, recovery is never 100% guaranteed, as there are circumstances that may make this impossible, e.g.

> drive struck by lightning or subject to continuous power surges;
> subjected to very strong magnetic fields;
> machine dropped or knocked over while the drives are working hard, where the disks are so badly damaged (surface scraped, disks bent or broken, spindle bent or snapped, read/write head penetrates the disks) that recovery effectively becomes impossible;

Data recovery is the process whereby data is recovered from a damaged or otherwise inaccessible drive.

there are two methods employed, viz.

> soft recovery, where recovery is possible by means of computer software;
> when soft recovery fails, one can perform hard recovery, where the hard drive is dismantled (in a Class 100 or less Clean Room) and, depending on the damage encountered, either replacing components within the hard drive where possible, or, as a last resort, the disks removed and placed in a specialised piece of equipment (a disk reader), which is then used to get the data from said disks (however, this is a labour intensive operation and is, as such, the more expensive option, with limited chance of success)

But what is the biggest causes of failure in hard drives?

> Severe viral infections, which may damage the boot sector, effectively rendering the drive inaccessible;
> Not performing regular cleanup and maintenance of hard drives (including, but not limited to, defragmentation (which can be scheduled on most operating systems), disk cleanup (removal of temporary files or unused / unusable files, to save space), or filling a drive beyond the recommended maximum of 75-80% of capacity (this additional space is what gets used for maintenance of said drive by the Operating System)
> Abuse of the drive, e.g. using a desktop / laptop drive as a server (with extremely high read and write activity, pushing the drive beyond its actual designed capabilities);
> moving the machine while the drives are spinning (refer below for an analogy of the workings of a hard drive and possible causes of damage)

Now, the question can be asked about how to avoid said issues.
To be honest, drive failure is inevitable, it happens to everyone at one time or another.
Yet there is a way to minimise the impact of the failure:

> Ensure you have an Uninteruptable Power Supply (with Surge Protection) connected between the power socket and your PC;
> Keep backups of your precious data, in a reliable storage (e.g. an External Hard Drive), not the drive on which your data is currently stored (remember to regularly check the quality of the backed up data). There are several modestly priced yet reliable backup tools available on the market;
> Do not manouver, move or shake your PC or Laptop while the hard drives are spinning (i.e. while it is on) – that includes walking around with the laptop on (if you need to do this, obtain a machine with Solid State Drives)

The promised analogy:

Consider the read / write heads of the hard drive as a Jumbo 747 Super.

Consider the palettes of the drive as the ground.

Now imagine the jumbo flying at Mach 5 three inches (approximately 75 mm) above the ground

On this scale,

> a wave of white light is approx 10 inches (~250 mm);

> a particle of smoke is a 1/2 metre diameter boulder;

> a fingerprint is a metre high wall;

> a dust particle is the size of a house;

impact with any of these is sure to cause damage.

However, the most frequest cause of hardware damage (i.e. physical damage to the disks and read / write heads) is movement. While the disks are spinning at full speed (between roughly 5400 rpm and 7200 rpm (or even more (up to approx 11.000 rpm) on the newer hard drives), depending on the make of the drive), any movement will cause the pallettes of the disk to wobble. On the scale used above, the wobble can be anything between 4 inches (~100 mm) and 8 inches (~200 mm), i.e. the palette will hit the read / write head, leaving bad sectors. Enough of these impacts will render the drive (and the read / write head) totally useless, with all data effectively lost.

Posted in Data Recovery | Tagged | Comments Off on Data recovery services: What is it and when will you make use of it?

Our current projects

We have several projects in the pipeline, covering a wide audience

These include:

> My Secret Secret, a simple encryption / decryption tool, making use of a complex encryption algorithm, with a dynamic key length, determined by the user.

> Biblios Personae, a library application, making it possible to manage who you lend what to and how long it took to return it as well as the condidtion it was in when it was returned. Using this, you can manage if you want to lend another item to them (albeit book, cd, dvd or any other lendable item )

> My Secret Secret Maxi, based on My Secret Secret, with one major difference: it can encrypt or decrypt any readable and writeable document, in place (e.g. a word document being send to a specific person that you do not want to fall into the wrong hands (e.g. competitor))

Posted in New developments | Comments Off on Our current projects

Future…

Keep watching this space. Our products will soon be available for purchase and download.

Posted in Company News and Events | Comments Off on Future…

Launched!

Our web page is launched. Now we are ready to face the world.

Posted in Company News and Events | Comments Off on Launched!

Why Normalise Database Tables?

Many GUI and web developers I have spoken to, do not fully understand the value to be gained from normalising their database designs. It is, after all, easier to build the tables to look exactly like the form or web page they are building.

The question that usually arises when we as database “specialists” question them is: but what is wrong with that design? my code works, doesn’t it?

The obvious answer to this would be “yes, your code works”, but does it really?

Designing tables in this manner will eventually have serious drawbacks, including, but not limited to, redundancy, wasted space and loss of data integrity.

As an example, we will use an extract of one table, from a failed company (the reason for it’s eventual failure will become apparent in due course).

Definition of Order Table

Order Table Definition

Now the referenced programmers defend their design, stating that it is normalised, as it does have a primary key, so the rows are uniquely identifyable.

OK, let us look at that statement.

Yes, there is a primary key. Now, does having a primary key make a table Normalised?

The anser is yes, provided it also complies to First Normal Form (1NF) and Second Normal Form (2NF) rules. This table does not comply to 1NF rules, which state that:

> the table is a faithful representation of a relation
> it is free of repeating groups

(see here for a definition of 1NF)

So, How do we go about normalising this data set?

> First, let us eliminate the repeating columns, by placing all these columns in their own table, which will look as follows once complete:

Remove repeating columns

Remove repeating columns

Now, looking at the resulting tables, we can see they comply to 1NF.

Now, let us check that they comply to 2NF, which states:

> Identify a candidate key;
> Ensure those records are supportive of said candidate key, i.e. those that are dependent on the candidate key, with no partial dependencies.

for Orders, a good candidate key would be OrderNo (which we will make into a Primary Key, as it must, of needs, remain unique) and for PartsPerOrder, OrderNo and Part make a good choice for candidate key (we will create a surrogate key (viz. PartPerOrderCode) as the primary key, with a non clustered index covering the candidate key(s) (along with the respective foreign keys)

Thus the structure now complies to 2NF .

(see here for a definition of 2NF)

Now we need to go further. Are there any fields in the tables that do not belong (i.e. do not describe the record in the table)?

The PartPerOrder table still has a field for the Sales Person’s mobile phone. If the salesperson changes their Mobile number, this will need to be updated in each and every order that person was involved in. That could (and quite possibly will) lead to data anomolies.

How do we now eliminate this data?
By applying the rules for Third Normal Form (3NF).

We can ask if this data may already exist in another table. Shouldn’t the salesperson contact information therefore be stored in the Staff table?

Checking the Staff table, we do indeed find a column labelled “MobilePhone”, which should, presuming it has been completed correctly, contain the person’s correct mobile number.

Removing this and using the column in the Staff table gives us the following:

Now there are no redundant data columns in the tables, which gives a much cleaner design (see here for a definition of 3NF).

Is it possible to improve the data?
Yes, we can proceed to the next level of Normalisation.
This next level of normalisation is termed Boyce-Codd Normal Form (BCNF) (also called 3.5 Normal Form (3.5NF) by some theorists).

The aim here is to ask if a column really describes the table and, if not, move it to another table (and, in so doing, probably eliminate existing nullable columns).

In the Staff table, there are columns that do not explicitly describe the staff member as a person, viz. their contact details (Extention and MobilePhone).

We therefore create a new table to carry the staff member’s contact information, and remove them from the Staff table. This could be called StaffContactDetails and will appear as follows:

Boyce-Codd Normal Form

Now there are really no redundant data columns visible, and the number of nullable columns is greatly reduced (see here for a definition of BCNF).

The nullable columns that are left add value where they are, as they do describe the tables and will, eventually, be filled in (e.g. when an order is finalised and paid).

We could, however, normalise this design further, but we need to ask ourselves if this will add value in terms of both storage space and performance (for details of the other levels of normalisation, refer to Fourth Normal Form (4NF), Fifth Normal Form (5NF) and Sixth Normal Form (6NF)).

The programmer would now scream and shout, stating that his code will not work, he cannot represent this “shatterred” data in his front-end, he will have to start all over again, etc., etc. ad nausium.

Whereupon we, as the database “guys” will simply answer: “Haven’t you ever heard of a view or stored procedure?”.

In any event, using our example, why did the company fail?
> what happens when a client has a large order, wherein they request numerous different parts (substantially more that 5)?
– There will be one order generated and paid for for each 5 parts required. If the client required 100 different parts, there will be 20 orders completed, each one have to be settled independently.
– This is expensive and time-consuming, not to mention frustrating.

– Most people will rather go to a competitor, who has a simpler and more streamlined process (with a well designed database), who can proces all parts in one order, with one payment.

When the developer was asked why it was done this way, he replied that management assured him no-one would order more that 5 parts at a time – this points back to:
> not understanding the business nor the environment in which the client operates, and
> poor requirement gathering skills.

Requirement elicitation will be handled in a future post to this blog…..

 

Posted in Database | Tagged , | Comments Off on Why Normalise Database Tables?

File system storage

I am constantly amazed at how many people believe the only way to change the storage option of a hard drive from FAT or FAT32 to NTFS is by means of formatting the hard drive.

the following  command line (DOS)  should eliminate all those issues (execute (command shell) on any drive other than the drive being changed):

CONVERT <volume> /FS:NTFS [/V] [/CvtArea:filename] [/NoSecurity] [/X]
volume Specifies the drive letter (followed by a colon), mount point, or volume name.
/FS:NTFS    Specifies that the volume will be converted to NTFS.
/V          Specifies that Convert will be run in verbose mode.
/CvtArea:filename Specifies a contiguous file in the root directory
that will be the place holder for NTFS system files.
/NoSecurity Specifies that the security settings on the converted
files and directories allow access by all users.
/X          Forces the volume to dismount first if necessary. All open handles to the volume will not be valid.

Posted in Operating System | Tagged , , , | Comments Off on File system storage

Table Partitioning

Recently presented a discussion at the SA SQL User Group in Bryanston, on the topic of Logical Patitioning Database Tables in MSSQL 2005 & 2008

presentation: TablePartitioning

Posted in Database | Tagged , | 1 Comment