Showing posts with label Hyper-V Replica. Show all posts

Wednesday, November 19, 2014

Windows Azure Pack with DR add-on (ASR)


One of the good things with Windows Azure Pack is that it is an extensible solution where we are able to customize, extend and integrate WAP to meet our desired configuration.

I have already covered the majority of the APIs we have available, both from an admin perspective and from a tenant perspective.

These blog posts can be found here:




The intention of this blog post is to drive awareness of the solution that Microsoft now has made available.

Offering managed DR for IaaS workload with ASR and Windows Azure Pack


Many people have requested that Windows Azure Pack should have an integration with Hyper-V Replica, or Azure Site Recovery.
If you are not familiar with Azure Site Recovery as a concept, you can think of it as the umbrella for all the DR capabilities that Microsoft provides, including storage replication that will be available in the Update Rollup 5 for SCVMM (currently in preview). Azure Site Recovery lets you use Hyper-V Replica through SCVMM on-premises to either replicate to a secondary datacenter (on-premises) or use Microsoft Azure as your DR target.
No matter what and where you go, the experience will be the same and provide you with consistency.

I will not cover the setup or the actual workflow of the DR integration with WAP, since it is explained in great detail at the URL above.
Instead, I will point out the high-level design of this solution and what you really need to think about.

After you have installed Update Rollup 4 for Windows Azure Pack, you will see some small changes in the UI when you drill into the Plan in WAP and explore the VM Cloud services.



This is where we will enable DR as an add-on, meaning that tenants are able to associate that add-on to an existing subscription they have.

The DR add-on will consist of several SMA runbooks that you will have to import into your WAP environment in the Admin Portal.

Once this is done from the tenant side, this will effectively trigger the SMA runbooks that will replicate all the virtual machines running in that subscription to the target environment.
The subscription ID itself will be replicated with all the mapping down towards each and every tenant VM.
However, virtual networks (if using NVGRE) are not replicated. This means the tenant will have to recreate the networking artifacts in the secondary environment, and you, the service provider, must perform the initial network mapping in ASR.

The SMA runbooks can be scheduled so that once a new VM is deployed into that particular subscription, the VM will be scheduled for initial replication and be protected.

Now, over to the delicate explanation of the initial design in order to implement this.

In Azure Site Recovery, when using DR between on-premises sites, we do the mapping at the VMM cloud level. The cloud in VMM should contain Hyper-V hosts/clusters within one or more host groups that will be the foundation of the virtual machines and the replica.

As you may be aware, when you create hosting plans in Windows Azure Pack, the hosting plans that contain VM services will be bound to a VMM cloud and a VMM server.
In other words, we are not able to replicate with ASR using a single cloud, although we could have two different host groups (primary and replica) within that cloud.

So since we have to have two clouds, we also need two plans. Hence we have an isolation issue to deal with in order to provide DR with a good tenant experience.

The subscription each tenant creates will be unique in the environment, and we are not able to use the same subscription twice within an environment. But if we had two subscriptions, the tenant would have to know which one to use, which could easily lead to mistakes.

So in order to keep the subscription ID and its resources, we need to have another Azure Pack environment.
And since we need to have another Azure Pack environment, we also need another instance of Service Provider Foundation (SPF).

So from a tenant perspective during a failover process, they will be redirected to the WAP environment which is currently online, sign in with their credentials and get access to their resources. The only thing that has changed is the URL to the tenant portal itself.
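
To make the reasoning above a bit more tangible, here is a minimal Python sketch of the two-stamp design. It is only a mental model: the class names, the example URLs and the subscription ID are hypothetical and are not part of WAP, SPF or ASR. It shows one subscription ID that exists in both stamps, where the only thing a failover changes for the tenant is the portal URL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stamp:
    """One management stamp: a WAP environment with its own SPF, bound to one VMM cloud."""
    name: str
    tenant_portal_url: str  # hypothetical example URL
    vmm_cloud: str


@dataclass
class TenantSubscription:
    """A subscription whose ID is identical in the primary and the recovery stamp."""
    subscription_id: str
    primary: Stamp
    recovery: Stamp
    failed_over: bool = False

    def active_stamp(self) -> Stamp:
        # On failover the subscription ID stays the same; only the stamp
        # (and therefore the tenant portal URL) changes.
        return self.recovery if self.failed_over else self.primary


primary = Stamp("primary", "https://portal.primary.example", "Primary Cloud")
recovery = Stamp("recovery", "https://portal.recovery.example", "Recovery Cloud")
subscription = TenantSubscription("hypothetical-subscription-id", primary, recovery)
```

Setting `subscription.failed_over = True` makes `active_stamp()` return the recovery stamp, while `subscription_id` is untouched, which is exactly the property the second WAP environment is there to preserve.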

I know it can be hard to absorb this information at first, especially if you are not familiar with the concept of a stamp and the actual architecture of the multi-tenant IaaS cloud platform we are dealing with. So I have created some graphics to show each layer and the purpose of each layer.

High-level overview of management stamps with Windows Azure Pack and Azure Site Recovery



Overview of the different layers for the VM Cloud Resource provider in the context of WAP with DR add-on:



Hopefully this makes sense and gives you a better understanding of the design of Windows Azure Pack with the DR add-on.


Please note that this is a managed DR solution, where the service provider has a very clear responsibility.
They need to perform the initial setup, perform all the processes and ensure that testing and planning are compliant with the actual SLA they provide for this solution.



Sunday, June 22, 2014

Microsoft Azure Site Recovery

In January, we had a new and interesting service available in Microsoft Azure, called “Hyper-V Recovery Manager”. I blogged about it and explained how to configure this on-premises using a single VMM management server. For details you can read this blog post: http://kristiannese.blogspot.no/2013/12/how-to-setup-hyper-v-recovery-manager.html

Hyper-V Recovery Manager provided organizations using Hyper-V and System Center with automated protection and orchestration of accurate recovery of virtualized workloads between private clouds, leveraging the asynchronous replication engine in Hyper-V: Hyper-V Replica.

In other words, no data was sent to Azure except the metadata from the VMM clouds.
This has now changed, and the service has been renamed to Microsoft Azure Site Recovery, which finally lets you replicate between private clouds and public clouds (Microsoft Azure).

This means that we are still able to utilize the automatic protection of workloads that we are familiar with through the service, but now we can use Azure as the target in addition to private clouds.
This also opens the door to migration scenarios, where organizations considering moving VMs to the cloud can easily do this with almost no downtime using Azure Site Recovery.

Topology

In our environment, we will use a dedicated Hyper-V cluster with Hyper-V Replica. This means we have added the Hyper-V Replica Broker role to the cluster. This cluster is located in its own host group in VMM, which is the only host group we have added to a cloud called “E2A”. Microsoft Azure Site Recovery requires System Center Virtual Machine Manager, which will be responsible for the communication and aggregation of the desired instructions made by the administrator in the Azure portal.


Pre-reqs

-          You must have an Azure account and add Recovery Services to your subscription
-          Certificate (.cer) that you upload to the management portal and register to the vault. Each vault has a single .cer certificate associated with it and it’s used when registering VMM servers in the vault.
-          Certificate (.pfx) that you import on each VMM server. When you install the Azure Site Recovery Provider on the VMM server, you must use this .pfx certificate.
-          Azure Storage account, where you will store the replicas replicated to Azure. The storage account needs geo-replication enabled and should be in the same region as the Azure Site Recovery service and associated with the same subscription
-          VMM Cloud(s). A cloud must be created in VMM that contains Hyper-V hosts in a host group enabled with Hyper-V Replica

-          Azure Site Recovery Provider must be installed on the VMM management server(s)
In our case, we had already implemented “Hyper-V Recovery Manager”, so we were able to do an in-place upgrade of the ASR Provider.
-          Azure Recovery Services agent must be installed on every Hyper-V host that will replicate to Microsoft Azure. Make sure you install this agent on all hosts located in the host group that you are using in your VMM cloud.

Once we had enabled all of this in our environment, we were ready to proceed to the configuration of our site recovery setup.

Configuration


Log in to the Azure management portal and navigate to Recovery Services to get the details about your vault and see the instructions on how to get started.

We will jump to “Configure cloud for protection” as the fabric in VMM is already configured and ready to go.
The provider installed on the VMM management server is exposing the details of our VMM clouds to Azure, so we can easily pick “E2A” – which is the dedicated cloud for this setup. This is where we will configure our site recovery to target Microsoft Azure.



Click on the cloud and configure protection settings.



On target, select Microsoft Azure. Also note that you are able to set up protection and recovery using another VMM cloud or VMM management server.



For the configuration part, we are able to specify some options when Azure is the target.

Target: Azure. We are now replicating from our private cloud to Microsoft Azure’s public cloud.
Storage Account: If none is present, you need to create a storage account before you are able to proceed. If you have several storage accounts, choose one that is in the same region as your recovery vault.
Encrypt stored data: This is set to “on” by default and cannot be changed in the preview.
Copy frequency: Since we are using Hyper-V 2012 R2 in our fabric, which introduced additional copy frequencies, we can select 30 seconds, 5 minutes or 15 minutes. We will use the default of 5 minutes in this setup.
Retain recovery points: Hyper-V Replica is able to create additional recovery points (crash-consistent snapshots) so that you have more flexible recovery options for your virtual workload. We don’t need any additional recovery points for our workloads, so we will leave this at 0.
Frequency of application-consistent snapshots: If you want app-consistent snapshots (ideal for SQL servers, and created using VSS), you can enable this and specify the frequency here.
Replication settings: This is set to “immediately”, which means that every time a new VM is deployed to our “E2A” cloud in VMM with protection enabled, it will automatically start the initial replication from on-premises to Microsoft Azure. For large deployments, we would normally recommend scheduling this.
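
If it helps to see the options above as data, here is a small Python sketch that models them and validates the values we picked. This is purely illustrative: the class, the field names and the storage account name are my own assumptions and do not correspond to any Azure or VMM API.

```python
from dataclasses import dataclass

# Copy frequencies offered when the fabric runs Hyper-V 2012 R2, in seconds.
ALLOWED_COPY_FREQUENCIES = (30, 300, 900)


@dataclass(frozen=True)
class ProtectionSettings:
    """Hypothetical model of the protection settings for one VMM cloud."""
    target: str = "Microsoft Azure"
    storage_account: str = "hypotheticalstorage"  # placeholder name
    encrypt_stored_data: bool = True
    copy_frequency_seconds: int = 300             # the 5 minute default
    additional_recovery_points: int = 0
    app_consistent_snapshots: bool = False
    start_replication: str = "immediately"

    def __post_init__(self) -> None:
        if self.copy_frequency_seconds not in ALLOWED_COPY_FREQUENCIES:
            raise ValueError("copy frequency must be 30, 300 or 900 seconds")
        if self.additional_recovery_points < 0:
            raise ValueError("additional recovery points cannot be negative")
        if not self.encrypt_stored_data:
            raise ValueError("encryption cannot be turned off in the preview")


# The values used for the "E2A" cloud in this walkthrough.
e2a_settings = ProtectionSettings()
```

Asking for an unsupported interval, for example `ProtectionSettings(copy_frequency_seconds=60)`, raises a `ValueError`, which mirrors the fact that the portal only offers the three frequencies.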

Once you are happy with the configuration, you can click ‘save’.



Now, Azure Site Recovery will configure this for your VMM cloud. This means that, through the provider, the hosts/clusters will be configured with these settings automatically from Azure.
-          Firewall rules used by Azure Site Recovery are configured so that ports for replication traffic are opened
-          Certificates required for replication are installed
-          Hyper-V Replica Settings are configured
 Cool!

You will have a job view in Azure that shows every step during the actions you perform. We can see that protection has been successfully enabled for our VMM Cloud.




If we look at the cloud in VMM, we also see that protection is enabled and Microsoft Azure is the target.



Configuring resources

In Azure, you have had the option to create virtualized networks for many years now. We can of course use them in this context, to map with our VM networks present in VMM.
To ensure business continuity it is important that the VMs that failover to Azure are able to be reached over the network – and that RDP is enabled within the guest. We are mapping our management VM network to a corresponding network in Azure.



VM Deployment

Important things to note:
In preview, there are some requirements for using Site Recovery with your virtual machines in the private cloud.

Only support for Gen1 virtual machines!
This means that the virtual machines must have their OS partition attached to an IDE controller. The disk can be VHD or VHDX, and you can even attach data disks that you want to replicate. Please note that Microsoft Azure does not support the VHDX format (introduced in Hyper-V 2012), but will convert the VHDX to VHD during initial replication in Azure. In other words, virtual machines using VHDX on-premises will run on VHDs when you fail over to Azure. If you fail back to on-premises, VHDX will be used as expected.
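
The preview requirements above can be expressed as a simple pre-flight check. The following Python sketch is a hypothetical helper, not something ASR or VMM provides; the `VmDescription` fields are assumptions I made for illustration, and you would have to populate them from your own inventory.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VmDescription:
    """Hypothetical, hand-filled description of an on-premises VM."""
    name: str
    generation: int          # 1 or 2
    os_disk_controller: str  # "IDE" or "SCSI"
    os_disk_format: str      # "vhd" or "vhdx"


def preview_blockers(vm: VmDescription) -> List[str]:
    """Return the reasons why a VM does not meet the preview requirements."""
    problems = []
    if vm.generation != 1:
        problems.append("only generation 1 virtual machines are supported")
    if vm.os_disk_controller.upper() != "IDE":
        problems.append("the OS disk must be attached to an IDE controller")
    if vm.os_disk_format.lower() not in ("vhd", "vhdx"):
        problems.append("the OS disk must be a VHD or VHDX")
    return problems


supported_vm = VmDescription("Azure01", 1, "IDE", "vhdx")
unsupported_vm = VmDescription("Gen2VM", 2, "SCSI", "vhdx")
```

An empty list from `preview_blockers` means the VM passes these particular checks; it says nothing about other limits the service may have.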

Next, we will deploy a new VM in VMM. When we enable protection on the hardware profile and want to deploy to a Cloud, intelligent placement will kick in and find the appropriate cloud that contains Hyper-V hosts/clusters that meet the requirements for replica.



After the deployment, the virtual machine should immediately start with an initial replication to Microsoft Azure, as we configured this in the protection settings for our cloud in Azure. We can see the details of the job in the portal and monitor the process. Once it is done, we can see, at a lower level, that we are actually replicating to Microsoft Azure directly at the VM level.




After a while (depending on available bandwidth), we have finally replicated to Azure and the VM is protected.





Enabling protection on already existing VMs in the VMM cloud

Also note that you can enable this directly from Azure. If you have a virtual machine running in the VMM cloud enabled for protection, but the VM itself is not enabled in VMM, then Azure can pick this up and configure it directly from the portal.



If you prefer to do this in VMM, simply open the properties of the VM and enable protection.




One last option is to use the VMM PowerShell module to enable this on many VMs at once.

Set-SCVirtualMachine -VM (Get-SCVirtualMachine -Name "VMName") -DRProtectionRequired $true -RecoveryPointObjective 300

Test Failover

One of the best things with Hyper-V Replica is that complex workflows, such as test failovers, planned failovers and unplanned failovers are integrated into the solution. This is also exposed and made available in the Azure portal, so that you easily can perform a test failover on your workloads. Once a VM is protected – meaning that the VM has successfully completed the initial replication to Azure, we can perform a test failover. This will create a copy based on the recovery point you select and boot that virtual machine in Microsoft Azure.







Once you are satisfied with the test, you can complete the test failover from the portal.
This will power off the test virtual machine and delete it from Azure. Please note that this process will not interfere with the ongoing replication from private cloud to Azure.



Planned failover

You can use planned failover in Azure Site Recovery for more than just failover. Consider a migration scenario where you actually want to move your existing on-premises workload to Azure: planned failover will be the preferred option. This will ensure minimal downtime during the process and start up the virtual machine in Azure afterwards.
In our case, we wanted to simulate planned maintenance in our private cloud, and therefore perform a planned failover to Azure.



Click on the virtual machine you want to failover, and click planned failover in the portal.
Note that if you have not yet performed a test failover for the virtual machine, we recommend doing so before an actual failover.
Since this is a test, we are ready to proceed with the planned failover.



When the job has started, we are drilling down to the lowest level again, Hyper-V Replica, to see what’s going on. We can see that the VM is preparing for planned failover where Azure is the target.



In the management portal, we can see the details for the planned failover job.



Once done, we have a running virtual machine in Microsoft Azure, that appears in the Virtual Machine list.



If we go back to the protected clouds in Azure, we see that our virtual machine “Azure01” has “Microsoft Azure” as its active location.



If we click on the VMs and drill into the details, we can see that we are able to change the name and the size of the virtual machine in Azure.



We have now successfully performed a planned failover from our private cloud to Microsoft Azure!

Failback from Microsoft Azure

When we were done with our planned maintenance in our fabric, it was time to failback the running virtual machine in Azure to our VMM Cloud.
Click on the virtual machine that is running in Azure that is protected, and click planned failover.
We have two options for the data synchronization. The first, “Synchronize data before failover”, performs something similar to “re-initializing replication” to our private cloud. This means synchronization will be performed without shutting down the virtual machine, leading to minimal downtime during the process.
The other option, “Synchronize data during failover only”, will minimize synchronization data but cause more downtime, as the shutdown will begin immediately. Synchronization will start after shutdown to complete the failover.
We are aiming for minimal downtime, so option 1 is preferred.



When the job is started, you can monitor the process in Azure portal.



Once the sync is complete, we must complete the failover from the portal so that this will go ahead and start the VM in our private cloud.



Checking Hyper-V Replica again, we can see that the state is set to “failback in progress” and that we currently have no primary server.



The job has now completed all the required steps in Azure.



Moving back to Hyper-V Replica, we can see that the VM is again replicating to Microsoft Azure, and that the primary server is one of our Hyper-V nodes.



In VMM, our virtual machine “Azure01” is running again in the “E2A” cloud



In the Azure management portal in the virtual machines list, our VM is still present but stopped.

Thanks for joining us on this guided tour on how to work with Azure Site Recovery.
Next time we will explore the scenarios we can achieve by using recovery plans in Azure Site Recovery, to streamline failover of multi-tier applications, LOB applications and much more.

Monday, May 12, 2014

Site Recovery in Microsoft Azure

Announced during the keynote of TechEd, Microsoft will let you use their scalable datacenters across the world as a secondary site.
To put this simply, you will be able to use Microsoft Azure as a site for disaster recovery.

This is brilliant news, and it has been a frequently requested capability since the release of Hyper-V Recovery Manager.

Hyper-V Recovery Manager was introduced as an “orchestrator” for your DR solution, based on Hyper-V in Windows Server (which includes Hyper-V Replica – the key feature here) and System Center – Virtual Machine Manager.

Here’s a recap of how it works.

You sign up for Azure and download a provider that will connect your VMM management stamp to your subscription and your Recovery Vault in Microsoft Azure.
This will also add some changes to VMM, reflected in the GUI.
After configuring a DR enabled Cloud (a cloud created in VMM), the metadata will be sent to Azure where the recovery manager serves as a control panel and orchestrator for your DR solution.
Hyper-V Recovery Manager (HRM) will configure your stand-alone hosts and clusters (through VMM), enabling Hyper-V Replica on stand-alone servers and Hyper-V Replica Broker on clusters.

The takeaway is that you can manage all of your DR-enabled virtual machines running in these clouds as an entity. Also, you can create recovery plans for a subset (if not all) of the virtual machines and ensure that the failover will include every server, application and service that needs to run in case of a failover of an entire business application.

For a complete guide on the setup, please see a blog post I wrote earlier, available here!

Here’s a high-level overview of a quite common environment, where you have a tier 1 production site and a tier 2 recovery site, both on-premises.
In addition, Microsoft Azure (at the top) can be a secondary site and Hyper-V Recovery Manager will orchestrate the recovery plans for your environment.


This is a big step in the right direction when it comes to our hybrid cloud story.
You can not only use the service itself from Azure, but also extend into Microsoft Azure and use its capacity to run your infrastructure.

I will follow up this blog post later with technical details around architecture, tips & tricks, and how to implement this in an existing DR solution based on Hyper-V, VMM and HRM.



Monday, April 14, 2014

Co-existence of WAP and HRM - Fixed!

Co-existence of WAP and HRM is now fixed!


Microsoft has released an Update Rollup (1) for Hyper-V Recovery Manager.
This update eliminates the need for the “Hyper-V Cloud Capability Profile” associated with the HW profiles on the VMs, which previously also had to be present in the cloud in VMM.

Why do we get an Update Rollup that specifically removes this requirement?

When testing HRM and Windows Azure Pack together, we saw that the combination of this capability profile together with Gallery Items (VM Roles with SPF/VMM as resource provider in a VM Cloud) didn’t work very well together.
If you are familiar with VM roles, then you know that they are not associated with any hardware profiles, but use only resdefpkg (imported in WAP) and resextpkg (imported in VMM). These artifacts will use their own settings and only bind to physical objects in the VMM library like VHDs, scripts etc.

In other words: the deployment of a VM Role to a cloud with any cloud capability profile associated, would fail. Therefore, you could not have a cloud configured in VMM that could be used by both WAP and HRM.

This is now fixed and you must install the update on your VMM management server (where the HRM provider is installed) and perform a restart.


Tuesday, December 31, 2013

How to Setup Hyper-V Recovery Manager with a Single VMM server topology

Hyper-V Recovery Manager with single VMM server topology

Recently, Microsoft announced that a single VMM server will be sufficient in order to take advantage of Hyper-V Recovery Manager (HVR), a software-as-a-service offering from Windows Azure that will orchestrate DR workflows in your on-premises cloud infrastructures, managed by System Center 2012 R2 Virtual Machine Manager.
This is a huge step in the right direction to ensure HVR adoption among customers and partners.
The requirement of having two VMM infrastructures would not only be an additional cost, but also lead to administrative overhead and complexity, since a Hyper-V host can only be managed by a single VMM management server at a time.


This blog post will focus on:

·         Setup of the HVR agent on the VMM Management server
·         Creation of DR Cloud within VMM
·         Configuration of DR in HVR
·         Orchestration with HVR and VMM

Setup of the HVR agent on the VMM Management server

Before we can go ahead and deploy HVR into our environment, the following requirements must be met.

Hyper-V Recovery Manager prerequisites:

·         Windows Azure account. You will need an Azure account with the recovery services feature enabled.
·         .CER certificate that must be uploaded as a management certificate containing the public key to the Hyper-V Recovery vault, so that the VMM server can be registered with this vault. Each vault has a single .cer certificate that complies with the certificate prerequisites.
·         .PFX file. The .cer certificate must be exported as a .PFX file (with the private key), and you will import it on each VMM server that contains virtual machines that you want to protect. This blog post will only use a single VMM server.

VMM server prerequisites:

·         At least one VMM server running on System Center 2012 SP1 or System Center 2012 R2 (this blog post will demonstrate 2012 R2)
·         If you are running one VMM server, it will need two clouds configured (where the DR will occur between the clouds). If you have two or more VMM servers, at least one cloud should be configured on the source VMM server you want to protect, and one cloud on the destination VMM server that you will use for recovery. The primary cloud you want to protect must contain the following:
o   One or more VMM host groups
o   One or more Hyper-V host servers in each host group
o   One or more Hyper-V virtual machines on each Hyper-V host
·         If you want virtual machines to be connected to a VM network after failover, you configure network mapping in Hyper-V Recovery Manager.
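
As a rough illustration of the requirements on the primary cloud listed above, here is a short Python sketch that checks a hand-written description of a cloud. Nothing here talks to VMM; the nested dictionary layout and all the names in it are assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical layout: host group name -> Hyper-V host name -> VM names.
CloudLayout = Dict[str, Dict[str, List[str]]]


def primary_cloud_problems(layout: CloudLayout) -> List[str]:
    """List which of the primary cloud requirements are not met."""
    problems = []
    if not layout:
        problems.append("the cloud contains no host groups")
    for group, hosts in layout.items():
        if not hosts:
            problems.append(f"host group '{group}' contains no Hyper-V hosts")
        for host, vms in hosts.items():
            if not vms:
                problems.append(f"host '{host}' runs no virtual machines")
    return problems


production_cloud: CloudLayout = {
    "Production": {"HV01": ["vm-a", "vm-b"], "HV02": ["vm-c"]},
}
incomplete_cloud: CloudLayout = {
    "Production": {"HV01": []},
    "Empty group": {},
}
```

For `production_cloud` the function returns an empty list, while `incomplete_cloud` yields one message per unmet requirement.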

Once the certificate is uploaded to HVR, you can download the latest provider that you should install on your VMM management server



The installation process will require that you stop the System Center Virtual Machine Manager service prior to installing, as there will be changes made to the GUI as well as extra functionality on the server.

During the installation, you must point to the .pfx file of your .cer certificate and map it with the vault created in Windows Azure Hyper-V Recovery Manager.




Specify the VMM server name, and enable ‘Synchronize cloud data with the vault’. For your information, only metadata is shipped from VMM to Windows Azure.

Once the setup has completed, the setup can start the VMM server service again, and you can open the VMM console.


The next thing we will do is create clouds in VMM.

Creation of DR Cloud within VMM

A cloud is an abstraction of your physical fabric resources, like virtualization hosts (host groups), networks, storage, library resources, port classifications, load balancers and, finally, the user actions that you permit.

Create at least two clouds (one for production and one for DR) where you enable DR on both of them. This option is available when you assign a cloud name and a description



Also, please note that the capability profile that contains ‘Hyper-V’ should be selected as part of the cloud. This is a requirement so that only virtual machines tagged for Hyper-V can participate in the DR workflows, which depend solely on Hyper-V as the hypervisor.


Now, if we look at the HVR service in Windows Azure again, under protected items, we should see both of our clouds listed


Note that there are currently no virtual machines enabled for protection, although there could be virtual machines running in these clouds.
If we check the clouds in VMM, we can see that status for protection shows ‘disabled’


Configuration of DR in HVR

To complete the configuration of the HVR service, we must continue to work in the Windows Azure Portal.
Click on your cloud under protected items, that should be seen as the primary cloud (running the primary workload).


In order to complete the configuration, click configure protection settings.





This will let you configure the replication location and frequency.
If you are familiar with Hyper-V Replica, you will recognize the options here.

Target location: this will be your VMM server

Target cloud: this will be the DR cloud you created in VMM, that will receive replication from the primary cloud, running the primary workload.

Copy frequency: Choose between 30 seconds, 5 minutes (default) and 15 minutes. The 30-second and 15-minute options were introduced with Hyper-V in Windows Server 2012 R2.

Additional recovery points: Default is zero, but you can have up to 15 recovery points in total.

Frequency of application-consistent snapshots: Hyper-V Replica also supports app-consistent snapshots in addition to crash-consistent snapshots. This is ideal for SQL servers and other critical applications enabled for DR with HVR.

Data transfer compression: default is ON, so that the data is compressed during replication.

Authentication: Certificate and Kerberos are the options. HVR lets you use certificates so that you can replicate between different domains without any trust.

Port: 8084 is the default port, and a firewall rule will be enabled on the Hyper-V hosts in primary and recovery clouds to allow access to this port

Replication Method: Over the network is default – and recommended, but offline is also an option.

Replication start time: Immediately, which is good when you have the bandwidth. An initial replication will copy and replicate the entire virtual machine (with its virtual hard disks) to the recovery site. A good idea might be to schedule this to happen at night, for example.
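
To decide whether "immediately" is realistic or whether the initial replication should be scheduled, a back-of-the-envelope estimate is often enough. The Python sketch below is my own simplification: it assumes the whole link is available, uses decimal units (1 GB = 8000 megabits) and ignores compression and protocol overhead, so treat the result as a rough figure rather than a prediction.

```python
def initial_replication_hours(disk_size_gb: float, link_mbit_per_s: float) -> float:
    """Rough time in hours to copy the virtual hard disks over the network."""
    if disk_size_gb < 0 or link_mbit_per_s <= 0:
        raise ValueError("size must be >= 0 and bandwidth must be > 0")
    megabits_to_send = disk_size_gb * 8 * 1000  # GB -> megabits (decimal units)
    seconds = megabits_to_send / link_mbit_per_s
    return seconds / 3600


# Example: a VM with 100 GB of disks over a dedicated 100 Mbit/s link.
example_hours = initial_replication_hours(100, 100)
```

With these example numbers the estimate is a little over two hours per VM, which quickly adds up for a cloud with many virtual machines and is a good argument for scheduling the initial replication outside business hours.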

Once you have completed the configuration, click ‘Save’.

This will initiate a job in your VMM and Hyper-V infrastructure that will pair clouds, prepare the VMM server(s) and clouds for protection configuration, and configure the settings for the clouds to start protecting virtual machines.

Once the job has completed, go back to protected items in the Azure portal and verify that DR is enabled for your clouds.


We must also map some resources in order to streamline the potential failovers between our clouds.
If you have worked with Hyper-V Replica, you may remember that after you have enabled initial replication on a new virtual machine, the wizard will send you to the virtual NIC interface on the hardware profile, so that you can configure an alternative IP configuration for the VM.
This setting in HVR lets us do this at scale, so that network A on the primary cloud could always be mapped to network A2 on the DR cloud, for instance.

Click on ‘resources’ in the portal, and map your networks.
It is important that these networks are available in the cloud configuration in VMM in order to show up here.
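
Conceptually, the network mapping described above is just a lookup table from a network in the primary cloud to a network in the DR cloud. The Python sketch below illustrates the idea; the network names are made up, and the functions are hypothetical helpers rather than anything exposed by HVR or VMM.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical mapping: VM network in the primary cloud -> VM network in the DR cloud.
network_mapping: Dict[str, str] = {
    "Network A": "Network A2",
    "Management": "Management DR",
}


def recovery_network(primary_network: str, mapping: Dict[str, str]) -> Optional[str]:
    """Network a VM is connected to after failover, or None if unmapped."""
    return mapping.get(primary_network)


def unmapped_networks(vm_networks: Iterable[str], mapping: Dict[str, str]) -> List[str]:
    """Networks used by VMs that would be left disconnected after a failover."""
    return sorted(name for name in set(vm_networks) if name not in mapping)
```

Running `unmapped_networks` over the networks your protected VMs use is a quick way to spot VMs that would come up without connectivity on the DR side.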





Next, let us enable DR on our virtual machines running in the primary cloud.
In VMM, we will notice a new option under ‘Advanced’ on the hardware tab on the virtual machines.
The screenshot below shows a virtual machine running in my ‘Service Provider Cloud’ which is the primary cloud, where I enable DR.



Once this has completed, the virtual machine’s metadata should be exposed in HVR and ready to use in a recovery plan.

Note: if DR should be considered mandatory in your environment, a good tip is to tag the hardware profiles on your templates with the Hyper-V capability profile and enable DR under Advanced. Then all newly created virtual machines based on your templates will be available in the recovery plans in HVR. Also note that if Hyper-V Replica Broker is in use (in a Hyper-V cluster), you can’t enable protection on VMs that are not configured as highly available, that is, VMs running locally on one of the nodes.

Back in the portal, we must create a recovery plan.

Creating Recovery Plans in HVR

Now that we have a VM enabled for protection, it is time to create one or several recovery plans.
A recovery plan gathers virtual machines into groups and specifies the order in which the groups fail over. Virtual machines you select will be added to the default group (Group 1). After you create the recovery plan, you can customize it and add additional groups.
This is very useful if you have distributed applications (everyone has these!) or specific workloads you would like to group. The power of HVR is the ability to orchestrate and facilitate the failovers.
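
The grouping idea behind a recovery plan can be sketched in a few lines of Python. This is only a model of the concept (ordered groups, with Group 1 as the default), using class and VM names I invented for the example; it is not how HVR implements recovery plans.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class RecoveryPlan:
    """Hypothetical model: groups fail over in order, starting with Group 1."""
    name: str
    groups: List[List[str]] = field(default_factory=lambda: [[]])

    def add_vm(self, vm_name: str, group: int = 1) -> None:
        # Group numbers are 1-based; VMs land in the default group (Group 1)
        # unless another group is requested. Missing groups are created.
        if group < 1:
            raise ValueError("group numbers start at 1")
        while len(self.groups) < group:
            self.groups.append([])
        self.groups[group - 1].append(vm_name)

    def failover_order(self) -> Iterator[Tuple[int, str]]:
        # Yield (group number, VM name) in the order the groups fail over.
        for number, members in enumerate(self.groups, start=1):
            for vm_name in members:
                yield number, vm_name


plan = RecoveryPlan("Three-tier application")
plan.add_vm("sql01")            # default group: the database comes up first
plan.add_vm("app01", group=2)
plan.add_vm("web01", group=3)
```

Iterating over `plan.failover_order()` walks the database tier first, then the application tier, then the web tier, which is the kind of ordering you would want for a distributed application.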

Click on recovery plans in the portal, and start the wizard to create a new one.
First, you must select source and target. In my example, since I am using only a single VMM server, I can use the same server as both source and target. Specify a name and continue.



Select virtual machines that should participate in the recovery plan. We can see the VM I enabled previously at this stage.


Once the job has completed, you should have successfully enabled a recovery plan for the virtual machine(s) and be able to perform workflows like failover (planned, unplanned) and test failover.



Thanks for reading – and in the next blog post or so, we will look closer at DR operations at scale and how to use groups together with recovery plans to meet critical business requirements.

Happy new year!