vCenter Server Feature Design

In addition to the physical and logical designs described for vCenter Server, a vSphere deployment provides many additional capabilities. The following section describes vCenter Server feature design in the vSphere environment.

vSphere HA

The following section discusses the design options for vSphere HA in the environment.

vSphere HA Feature Details

vSphere HA provides protection against a host failure, restarting the virtual machines on another host in the cluster if a failure is detected.

vSphere HA is simple to configure and protects a wide range of workloads. Services that are not currently protected are often chosen for protection in the new virtual infrastructure. When vSphere HA is configured, networking and storage must also be configured to support it.

VMware recommends protecting all workloads with vSphere HA if they are important to daily business operations. However, if there are workloads that are already running application-level high availability, vSphere HA might not be needed.

vSphere HA monitors each host for failures through a heartbeat mechanism in a master–slave relationship. During configuration of the cluster, an election takes place between the hosts, and a master host is elected. The master host communicates with vCenter Server and monitors the virtual machines and slave hosts in the cluster.
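The election described above can be sketched as follows. This is a simplification under stated assumptions: the real election favors the host with access to the greatest number of datastores, with ties broken by host identifier; the host names and datastore counts here are hypothetical.

```python
# Simplified sketch of the vSphere HA master election.
# Not the exact algorithm -- an illustration of the selection criteria.

def elect_master(hosts):
    """Pick the master host: the one mounting the most datastores,
    breaking ties by the lexically greatest host identifier."""
    return max(hosts, key=lambda h: (h["datastores"], h["id"]))

cluster = [
    {"id": "esxi-01", "datastores": 4},
    {"id": "esxi-02", "datastores": 6},
    {"id": "esxi-03", "datastores": 6},
]
master = elect_master(cluster)           # esxi-03 wins the tie-break
slaves = [h for h in cluster if h is not master]
```

After the election, the master communicates with vCenter Server and monitors heartbeats from the remaining (slave) hosts, re-running the election if the master itself fails.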

For more in-depth detail on vSphere HA features, see the “Creating and Using vSphere HA” section of the VMware vSphere Availability guide (https://pubs.vmware.com/vsphere-60/index.jsp#com.vmware.vsphere.avail.doc/GUID-53F6938C-96E5-4F67-9A6E-479F5A894571.html).

vSphere HA Admission Control Policy

The vSphere HA admission control policy allows an administrator to configure how the cluster reserves capacity for failover. In a smaller vSphere HA cluster, a larger proportion of the cluster resources must be reserved to accommodate host failures, depending on the policy chosen. The following policies are available:

  • Define failover capacity by static number of hosts – With this admission control policy, vSphere HA ensures that a specified number of hosts can fail while sufficient resources remain in the cluster to fail over all the virtual machines from those hosts.
  • Define failover capacity by reserving a percentage of the cluster resources – With this admission control policy, vSphere HA reserves a specified percentage of the aggregate CPU and memory resources for failover.
  • Use dedicated failover hosts – With this admission control policy, when a host fails, vSphere HA attempts to restart its virtual machines on any of the specified failover hosts. If this is not possible, for example, if the failover hosts have failed or have insufficient resources, then vSphere HA attempts to restart those virtual machines on other hosts in the cluster.
  • Do not reserve failover capacity – With this policy, no resources are reserved at all and virtual machines are allowed to power on even if they violate availability constraints.
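The percentage-based policy above can be illustrated with a small sketch. The aggregate-capacity model and the numbers are illustrative assumptions, not vSphere's exact admission algorithm: a power-on is admitted only if the remaining free capacity stays at or above the configured failover percentage.

```python
# Hedged sketch of the percentage-based admission control check.
# Units are MB of memory; the same check applies to CPU capacity.

def admits(total_mb, reserved_mb, vm_reservation_mb, failover_pct):
    """Return True if powering on a VM keeps enough spare capacity."""
    free_after = total_mb - reserved_mb - vm_reservation_mb
    return free_after / total_mb * 100 >= failover_pct

# A cluster with 256 GB aggregate memory and 50% reserved for failover:
assert admits(262144, 100000, 20000, 50) is True   # ~54% still free
assert admits(262144, 120000, 20000, 50) is False  # ~47% free, power-on denied
```

With the "do not reserve failover capacity" policy, this check is skipped entirely and every power-on is admitted.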

VMware recommends that you always protect workloads. Even with a larger portion of resources reserved, an organization still benefits and can obtain significant cost savings due to server consolidation. The return on investment might take slightly longer to be realized, but all workloads are protected.

For more details on selecting policies, see “vSphere HA Admission Control” in the VMware vSphere Availability guide (https://pubs.vmware.com/vsphere-60/index.jsp#com.vmware.vsphere.avail.doc/GUID-53F6938C-96E5-4F67-9A6E-479F5A894571.html).

VM Monitoring

When VM monitoring is enabled, the VM monitoring service (using VMware Tools™) evaluates whether each virtual machine in the cluster is running. The VM monitoring service checks for regular heartbeats and I/O activity from the VMware Tools process running on guests. If no heartbeats or I/O activity are received, it is likely that the guest operating system has failed or VMware Tools is not being allocated time to complete its tasks. In this case, the VM monitoring service determines that the virtual machine has failed and the virtual machine is rebooted to restore service.
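The failure test described above can be sketched as a simple predicate. The 30-second default interval is an assumption for illustration (the actual interval depends on the configured monitoring sensitivity); the key point is that both heartbeats and I/O activity must be absent before the VM is declared failed.

```python
# Hedged sketch of the VM monitoring failure test: a VM is reset only
# when VMware Tools heartbeats AND guest I/O have both been absent for
# the failure interval, to avoid false positives from a busy guest.

def vm_needs_reset(seconds_since_heartbeat, seconds_since_io,
                   failure_interval=30):
    """Declare a VM failed only when heartbeats and I/O are both absent."""
    return (seconds_since_heartbeat >= failure_interval
            and seconds_since_io >= failure_interval)

# Heartbeats absent but disk I/O still observed: no reset is triggered.
print(vm_needs_reset(45, 5))   # False
print(vm_needs_reset(45, 60))  # True
```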

VM monitoring should be enabled to restart a failed virtual machine. To be effective, the application or service running on the virtual machine must be capable of restarting successfully after a reboot.

vSphere HA Design Decisions

The following table lists the vSphere HA design decisions made for this architecture design.

Table 24. vSphere HA Design Decisions

For this design, <Customer> has made the decisions listed in this table.

Design decision: vSphere HA will be used to protect the management, compute, and edge cluster infrastructure servers against ESXi host failure.
Design justification: No RPO or RTO has been defined by <Customer>. Servers that provide a dependency for other services will be prioritized by using the restart priority option within vSphere HA.
Design implication: Admission control must be configured to reserve failover capacity in order to guarantee a restart of the VMs.

Design decision: Isolation response will be enabled and set to “Leave Powered On” for the management cluster. For the compute and edge clusters, the isolation response will be set to “Power off”.
Design justification: VMs will be restarted on another host in the cluster within a defined timeframe, for example after a switch failure with an undefined recovery time. “Power off” is recommended in an IP storage scenario to prevent a split-brain condition.
Design implication: With the isolation response set to “Leave Powered On”, VMs could become unavailable if a network issue occurs.

Design decision: Datastore heartbeating will be used for the edge and compute clusters, with the automatic datastore selection policy.
Design justification: Datastore heartbeating helps vSphere HA identify which event (failure, partition, or isolation) has occurred on a host.
Design implication: Datastore heartbeating cannot be used on the management cluster because Virtual SAN does not currently support it.

Design decision: VM Component Protection against APD and PDL events will be enabled, with VM failover configured to power off and restart VMs, allowing recovery from storage loss.
Design justification: Prevents an APD or PDL storage loss from causing extended periods of downtime.

Design decision: Host monitoring will be enabled.
Design justification: Host monitoring is required to determine host status in the cluster and to provide HA handling of host issues. This capability is a requirement for VMware Integrated OpenStack.

Table 25. vSphere HA Admission Control Design Decisions

For this design, <Customer> has made the decisions listed in this table.

Design decision: The management cluster will have admission control set to percentage-based, with 50% reserved for CPU and 50% reserved for RAM. An average HA slot size of 4 vCPU / 12 GB RAM will be used.
Design justification: Refer to DD013.

Design decision: The edge clusters will have admission control set to a reserved failover capacity of one host, with a slot size of 2 vCPU / 1 GB RAM.
Design justification: Refer to DD018.

Design decision: For the compute clusters, no failover capacity will be reserved by admission control.
Design justification: Refer to DD022.
Design implication: On an HA event, there is no guarantee that all impacted VMs can be restarted. Restarted VMs can also impact the performance of other VMs on the hosts where they are restarted.

Table 26. vSphere HA Monitor Virtual Machines Design Decisions

For this design, <Customer> has made the decision listed in this table.

Design decision: VM monitoring will be disabled.
Design justification: VM monitoring can lead to false-positive alarms.

VMware vSphere Fault Tolerance

Table 27. vSphere FT Design Decisions

For this design, <Customer> has made the decision listed in this table.

Design decision: vSphere FT is not required.
Design justification: No identified use cases exist.
Design implication: vSphere FT will not be available for either management or customer workloads.

vSphere Distributed Resource Scheduler

The following section discusses the design choices made for vSphere Distributed Resource Scheduler in the environment.

vSphere DRS Feature Details

vSphere DRS provides load balancing of a cluster by migrating workloads with VMware vSphere vMotion® from heavily loaded hosts to less utilized hosts in the cluster. It can be set up to operate in three modes:

  • Manual – Recommendations are made but not acted upon unless an administrator initiates the migration.
  • Partially automated – Initial power-on placements are made automatically, but migration recommendations are not acted upon unless an administrator initiates them.
  • Fully automated – Both initial power-on placement and all migration recommendations are acted upon automatically by the system.

The aggressiveness of vSphere DRS depends on the migration threshold, which ranges from conservative to aggressive across five levels. At the lower settings, a migration occurs only when certain criteria are met, such as a host entering maintenance mode, or when a significant performance gain is projected. At the highest setting, any migration recommendation generated by the system is applied, regardless of how small the performance gain might be. The third level provides the best compromise between load balancing and reducing the number of vSphere vMotion events.
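The threshold behavior can be modeled with a short sketch. DRS assigns each recommendation a priority from 1 (mandatory, for example a host entering maintenance mode) to 5 (marginal benefit), and a threshold level N applies recommendations with priority N or lower. This mapping is a sketch of the documented behavior, not the internal algorithm, and the VM names are hypothetical.

```python
# Simplified model of the DRS migration threshold: filter generated
# recommendations by priority against the configured level (1-5).

def applied(recommendations, threshold_level):
    """Return the recommendations DRS would act on at this threshold."""
    return [r for r in recommendations if r["priority"] <= threshold_level]

recs = [
    {"vm": "vm-a", "priority": 1},  # mandatory: host maintenance mode
    {"vm": "vm-b", "priority": 3},  # moderate projected gain
    {"vm": "vm-c", "priority": 5},  # marginal gain
]
# Conservative (level 1) applies only mandatory moves; level 5 applies all.
print([r["vm"] for r in applied(recs, 1)])  # ['vm-a']
print(len(applied(recs, 5)))                # 3
```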

VMware vSphere Distributed Power Management

Table 28. DPM Design Decisions

For this design, <Customer> has made the decision listed in this table.

Design decision: DPM is not required.
Design justification: No identified use cases exist.
Design implication: DPM will not be available for either management or customer clusters.

vSphere DRS Design Decisions

The following table lists the vSphere DRS design decisions made for this architecture design.

Table 29. vSphere DRS Design Decisions

For this design, <Customer> has made the decisions listed in this table.

Design decision: vSphere DRS will be enabled on all clusters with default settings.
Design justification: vSphere DRS optimizes load balancing across the hosts in the cluster and is required for the VMware Integrated OpenStack deployment.
Design implication: The appliance cannot use the vSphere DRS or HA features if direct-attached storage is used.

Design decision: DRS anti-affinity rules ("keep VMs on separate hosts") will be set for the NSX Controller instances and the Platform Services Controller instances.
Design justification: The DRS rule places the NSX Controller instances and the Platform Services Controller instances on different hosts, so a host failure impacts only a subset of these VMs.

vSphere Enhanced vMotion Compatibility

The following section discusses VMware vSphere Enhanced vMotion Compatibility (EVC) design choices in the environment.

vSphere EVC Feature Details

vSphere vMotion requires that the CPUs in each host be similar so that live migrations can occur successfully. With EVC, all hosts in a cluster present the same CPU feature set to virtual machines, even if the actual CPUs on the hosts differ. Using EVC prevents migrations performed with vSphere vMotion from failing because of incompatible CPUs.

By setting EVC on the cluster in the initial design, hosts with newer CPUs can be added at a later date, without disruption. It can also be used to perform a rolling upgrade of all hardware with zero downtime.

EVC should be set to the highest level possible with the current CPUs in use.
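Choosing that highest common level amounts to taking the oldest CPU generation present in the cluster as the baseline, which the following sketch illustrates. The simplified generation list and the assumption that each host supports every mode up to its CPU generation are illustrative; actual EVC baselines are vendor-specific (separate Intel and AMD mode series).

```python
# Hedged sketch: the EVC baseline for a cluster is the highest mode
# that every host's CPU generation can present.

EVC_MODES = ["merom", "penryn", "nehalem", "westmere",
             "sandybridge", "ivybridge", "haswell"]  # oldest -> newest

def highest_common_evc(host_max_modes):
    """Return the highest mode no newer than any host's maximum."""
    indices = [EVC_MODES.index(m) for m in host_max_modes]
    return EVC_MODES[min(indices)]

# Mixed cluster: the oldest CPU generation caps the baseline.
print(highest_common_evc(["haswell", "sandybridge", "ivybridge"]))  # sandybridge
```

Because the baseline only ever needs to rise when the oldest hosts are retired, setting it at initial design time is what allows newer hosts to join later without disruption.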

vSphere EVC Design Decisions

The following table lists the vSphere EVC design decisions for this architecture design.

Table 30. EVC Design Decisions

For this design, <Customer> has made the decision listed in this table.

Design decision: EVC will be enabled for the management, edge, and compute clusters.
Design justification: Different server models will be used and mixed within the management and compute clusters.

Resource Pools

The following section discusses resource pool design choices in the environment.

Resource Pools Design Decisions

The following table lists the resource pool design decisions made for this architecture design.

Table 31. Resource Pools Design Decisions

For this design, <Customer> has made the decision listed in this table.

Design decision: Resource pools will not be created.
Design justification: Resource pools are not supported in the current version of the OpenStack API.

vSphere Update Manager Deployment Model

As part of a VMware vSphere Update Manager™ deployment, there are two models for ensuring that patches can be downloaded from VMware:

  • Internet-connected model – The vSphere Update Manager server is connected to the VMware patch repository to download patches for ESXi 5.x hosts, ESXi 6.0 hosts, and virtual appliances. No additional configuration is required, other than scanning and remediating the hosts as appropriate.
  • Air-gap model – vSphere Update Manager has no connection to the Internet and cannot download patch metadata. In this model, you have to install and configure the vSphere Update Manager download service to download and store patch metadata and patch binaries in a shared repository. vSphere Update Manager must be configured to use the shared repository as a patch datastore prior to being able to remediate the ESXi hosts.

vSphere Update Manager Resource Requirements

For minimum requirements and other details not discussed here, see the VMware vSphere Update Manager 6.0 documentation (http://pubs.vmware.com/vsphere-60/index.jsp#com.vmware.update_manager.doc/GUID-B5FB88E4-5341-45D4-ADC3-173922247466.html).

The following table lists the specifications for the vSphere Update Manager server.

Table 32. vSphere Update Manager Server Specifications

vSphere Update Manager server version: 6.0
Physical or virtual system: Virtual
Number of CPUs: 2 or more
Memory: 4 GB RAM
Number of NICs and ports: 1/1
Number of disks and disk sizes: Varies. See the vSphere Update Manager Sizing Estimator for disk space utilization requirements for the environment. Two disks are recommended: one for the server OS and one for patches.
Operating system and SP level: Windows Server 2012

vSphere Update Manager Database Design

VMware best practice is to use an existing database server, if one is available. VMware recommends installing any database used for vSphere Update Manager in a virtual machine to gain the benefits that virtual machines provide.

vSphere Update Manager comes with an embedded SQL Server 2008 R2 Express database for small deployments. For environments larger than 5 hosts and 50 virtual machines, VMware recommends that you use either a Microsoft SQL or Oracle external database. For larger environments, it is also recommended that you place the vSphere Update Manager database on a different computer. See the database vendor documentation for hardware requirements and recommendations.
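The sizing guidance above reduces to a simple threshold check, sketched below. The function name and the reading of "larger than 5 hosts and 50 virtual machines" as "either limit exceeded" are assumptions for illustration.

```python
# Hedged sketch of the database sizing guidance: the embedded Express
# database suits small environments; beyond 5 hosts or 50 VMs an
# external Microsoft SQL or Oracle database is recommended.

def recommended_database(hosts, vms):
    """Map environment size to the recommended database deployment."""
    if hosts <= 5 and vms <= 50:
        return "embedded Express"
    return "external (Microsoft SQL or Oracle)"

print(recommended_database(4, 30))    # embedded Express
print(recommended_database(12, 400))  # external (Microsoft SQL or Oracle)
```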

The following table lists the specifications for the VMware vSphere Update Manager database.

Table 33. vSphere Update Manager Database Specifications

Vendor and version: SQL Server 2012 Express
Authentication method: Active Directory
Recovery model: Simple
Database auto growth: Enabled, in 1 MB increments
Transaction log auto growth: In 10% increments, restricted to a 2 GB maximum size
Estimated vSphere Update Manager database size: 3 GB

All configuration specifics for the database design can be found in the Virtualization Configuration Workbook document.

vSphere Update Manager Physical Design Decisions

The following table lists the vSphere Update Manager physical design decisions made for this architecture design.

Table 34. Update Manager Physical Design Decision

For this design, <Customer> has made the decision listed in this table.

Design decision: vSphere Update Manager will be used. vCenter Server and vSphere Update Manager will run on different hosts, and the embedded SQL Server Express database will be used.
Design justification: The vCenter Server Appliance supports only an external vSphere Update Manager.
Design implication: An additional license for the Windows OS will be required.
