This document is provided "as-is". Information and views expressed in this document, including URL and other Internet Web site references, may change without notice. This document does not provide you with any legal rights to any intellectual property in any Microsoft product or product name. You may copy and use this document for your internal, reference purposes. Terms of Use (http://technet.microsoft.com/cc300389.aspx) | Trademarks (http://www.microsoft.com/library/toolbar/3.0/trademarks/en-us.mspx)
Table of Contents
Chapter 1
- Managing High Availability and Site Resilience: Exchange 2013 Help
- Managing Database Availability Groups: Exchange 2013 Help
- Managing Mailbox Database Copies: Exchange 2013 Help
- Monitoring Database Availability Groups: Exchange 2013 Help
Chapter 1
Proactive monitoring
Making sure that your servers are operating reliably and that your database copies are healthy is a key objective for daily messaging operations. Exchange 2013 includes a number of features that can be used to perform a variety of health monitoring tasks for DAGs and mailbox database copies, including:
- Get-MailboxDatabaseCopyStatus
- Test-ReplicationHealth
- Crimson channel event logging

In addition to monitoring health and status, it's also critical to monitor for situations that can compromise availability. For example, we recommend that you monitor the redundancy of your replicated databases. It's critical to avoid situations where you're down to a single copy of a database. This scenario should be treated with the highest priority and resolved as soon as possible. For more detailed information about monitoring the health and status of DAGs and mailbox database copies, see Monitoring Database Availability Groups.
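As a sketch, the two cmdlets above can be run against a server or an individual database copy in the Exchange Management Shell (the server name MBX1 is a placeholder):

```powershell
# Check the health and status of all database copies hosted on a Mailbox server
Get-MailboxDatabaseCopyStatus -Server MBX1 | Format-Table Name,Status,CopyQueueLength,ReplayQueueLength,ContentIndexState

# Check all aspects of replication and replay health for a DAG member
Test-ReplicationHealth -Identity MBX1
```

Watching CopyQueueLength and ReplayQueueLength over time is a simple way to spot a database that is falling behind before redundancy is lost.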
Creating DAGs
A DAG can be created using the New Database Availability Group wizard in the Exchange Administration Center (EAC), or by running the New-DatabaseAvailabilityGroup cmdlet in the Exchange Management Shell. When creating a DAG, you provide a name for the DAG, and optional witness server and witness directory settings. In addition, one or more IP addresses are assigned to the DAG, either by using static IP addresses or by allowing the DAG to be automatically assigned the necessary IP addresses using Dynamic Host Configuration Protocol (DHCP). You can manually assign IP addresses to the DAG by using the DatabaseAvailabilityGroupIpAddresses parameter. If you omit this parameter, the DAG attempts to obtain an IP address from a DHCP server on your network. For detailed steps about how to create a DAG, see Create a Database Availability Group.

When you create a DAG, an empty object representing the DAG with the name you specified and an object class of msExchMDBAvailabilityGroup is created in Active Directory. DAGs use a subset of Windows failover clustering technologies, such as the cluster heartbeat, cluster networks, and cluster database (for storing data that changes or can change quickly, such as database state changes from active to passive or the reverse, or from mounted to dismounted or the reverse). Because DAGs rely on Windows failover clustering, they can be created only on Exchange 2013 Mailbox servers running the Windows Server 2008 R2 Enterprise or Datacenter operating system, or the Windows Server 2012 Standard or Datacenter operating system.

Note: The failover cluster created and used by the DAG must be dedicated to the DAG. The cluster can't be used for any other high availability solution or for any other purpose. For example, the failover cluster can't be used to cluster other applications or services. Using a DAG's underlying failover cluster for purposes other than the DAG isn't supported.
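A minimal sketch of creating a DAG in the Shell, with and without a static IP address (the DAG, witness server, and address values are placeholders):

```powershell
# Create a DAG with an explicit witness server, witness directory, and static IP address
New-DatabaseAvailabilityGroup -Name DAG1 -WitnessServer FS1 -WitnessDirectory C:\DAG1 `
    -DatabaseAvailabilityGroupIpAddresses 192.168.1.8

# Omitting DatabaseAvailabilityGroupIpAddresses lets the DAG obtain an address via DHCP
New-DatabaseAvailabilityGroup -Name DAG2 -WitnessServer FS1
```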
number of users, the datacenter you choose to host the witness server is considered to be the primary datacenter from the solution's perspective. If the witness server is in the datacenter with the majority of the client population, the majority of clients retain access after a failure. If the datacenter is remote to large user populations, this may affect your decision. You would then need to determine if there's a requirement for the primary datacenter to remain healthy and active if there's a loss of wide area network (WAN) connectivity to the other two datacenters. In that event, the witness server should also be in the primary datacenter. Although it's supported to use a witness server in a third datacenter, we don't recommend this scenario. From an Exchange perspective, this configuration doesn't provide you with greater availability. It's important that you examine the critical path factors if you use a witness server in a third datacenter. For example, if the WAN connection between the primary datacenter and the second and third datacenters fails, the solution in the primary datacenter becomes unavailable.
The task was unable to create the default witness directory on server <ServerName>. Please manually specify a witness directory.

If you specify a witness server and witness directory, you receive the following warning message:

Unable to access file shares on witness server 'ServerName'. Until this problem is corrected, the database availability group may be more vulnerable to failures. You can use the Set-DatabaseAvailabilityGroup cmdlet to try the operation again. Error: The network path was not found.

If Windows Firewall is enabled on the witness server after the DAG is created but before servers are added, it may block the addition or removal of DAG members. If Windows Firewall is enabled on the witness server and there are no firewall exceptions configured for WMI, the Add-DatabaseAvailabilityGroupServer cmdlet displays the following warning message:

Failed to create file share witness directory 'C:\DAGFileShareWitnesses\DAG_FQDN' on witness server 'ServerName'. Until this problem is corrected, the database availability group may be more vulnerable to failures. You can use the Set-DatabaseAvailabilityGroup cmdlet to try the operation again. Error: WMI exception occurred on server 'ServerName': The RPC server is unavailable. (Exception from HRESULT: 0x800706BA)

To resolve the preceding error and warnings, do one of the following:
- Manually create the witness directory and share on the witness server, and assign the CNO for the DAG full control for the directory and share.
- Enable the WMI exception in Windows Firewall.
- Disable Windows Firewall.
DAG membership
After a DAG has been created, you can add servers to or remove servers from the DAG using the Manage Database Availability Group wizard in the EAC, or using the Add-DatabaseAvailabilityGroupServer or Remove-DatabaseAvailabilityGroupServer cmdlets in the Shell. For detailed steps about how to manage DAG membership, see Manage Database Availability Group Membership.

Note: Each Mailbox server that's a member of a DAG is also a node in the underlying cluster used by the DAG. As a result, at any one time, a Mailbox server can be a member of only one DAG.

If the Mailbox server being added to a DAG doesn't have the failover clustering component installed, the method used to add the server (for example, the Add-DatabaseAvailabilityGroupServer cmdlet or the Manage Database Availability Group wizard) installs the failover clustering feature.

When the first Mailbox server is added to a DAG, the following occurs:
- The Windows failover clustering component is installed, if it isn't already installed.
- A failover cluster is created using the name of the DAG. This failover cluster is used exclusively by the DAG, and the cluster must be dedicated to the DAG. Use of the cluster for any other purpose isn't supported.
- A CNO is created in the default computers container.
- The name and IP address of the DAG is registered as a Host (A) record in Domain Name System (DNS).
- The server is added to the DAG object in Active Directory.
- The cluster database is updated with information on the databases mounted on the added server.
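A minimal sketch of adding a member and confirming the result (the DAG and server names are placeholders):

```powershell
# Add a Mailbox server to the DAG; this also installs the failover clustering
# feature on the server if it isn't already present
Add-DatabaseAvailabilityGroupServer -Identity DAG1 -MailboxServer MBX1

# Confirm the server now appears in the DAG's membership
Get-DatabaseAvailabilityGroup -Identity DAG1 | Format-List Name,Servers
```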
In a large or multiple-site environment, especially one in which the DAG is extended to multiple Active Directory sites, you must wait for Active Directory replication of the DAG object containing the first DAG member to complete. If this Active Directory object isn't replicated throughout your environment, adding the second server may cause a new cluster (and new CNO) to be created for the DAG. This is because the DAG object appears empty from the perspective of the second member being added, thereby causing the Add-DatabaseAvailabilityGroupServer cmdlet to create a cluster and CNO for the DAG, even though these objects already exist. To verify that the DAG object containing the first DAG server has been replicated, use the Get-DatabaseAvailabilityGroup cmdlet on the second server being added to verify that the first server you added is listed as a member of the DAG.

When the second and subsequent servers are added to the DAG, the following occurs:
- The server is joined to the Windows failover cluster for the DAG.
- The quorum model is automatically adjusted: a Node Majority quorum model is used for DAGs with an odd number of members, and a Node and File Share Majority quorum model is used for DAGs with an even number of members. The witness directory and share are automatically created by Exchange when needed.
- The server is added to the DAG object in Active Directory.
- The cluster database is updated with information about mounted databases.

Note: The quorum model change should happen automatically. However, if the quorum model doesn't automatically change to the proper model, you can run the Set-DatabaseAvailabilityGroup cmdlet with only the Identity parameter to correct the quorum settings for the DAG.
Warning: If your DAG members are running Windows Server 2012, you must pre-stage the CNO prior to adding the first server to the DAG.

In environments where computer account creation is restricted, or where computer accounts are created in a container other than the default computers container, you can pre-stage and provision the CNO. You create and disable a computer account for the CNO, and then either:
- Assign full control of the computer account to the computer account of the first Mailbox server you're adding to the DAG. This ensures that the LOCAL SYSTEM security context will be able to manage the pre-staged computer account.
- Assign full control of the computer account to the Exchange Trusted Subsystem USG. This works because the Exchange Trusted Subsystem USG contains the machine accounts of all Exchange servers in the domain.

For detailed steps about how to pre-stage and provision the CNO for a DAG, see Pre-Stage the Cluster Name Object for a Database Availability Group.
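The documented steps use Active Directory Users and Computers; as an unofficial equivalent sketch, the same pre-staging can be scripted (the DAG name DAG1, the contoso.com domain, and the OU path are placeholder assumptions; requires the ActiveDirectory module and rights to create computer objects):

```powershell
# Create a disabled computer account to serve as the pre-staged CNO
Import-Module ActiveDirectory
New-ADComputer -Name DAG1 -Path "OU=Exchange,DC=contoso,DC=com" -Enabled $false

# Grant the Exchange Trusted Subsystem USG full control of the account
# (dsacls ":GA" = generic all, i.e. full control)
dsacls "CN=DAG1,OU=Exchange,DC=contoso,DC=com" /G "CONTOSO\Exchange Trusted Subsystem:GA"
```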
Disabled: Network encryption isn't used.
Enabled: Network encryption is used on all DAG networks for replication and seeding.
InterSubnetOnly: Network encryption is used on DAG networks when replicating across different subnets. This is the default setting.
SeedOnly: Network encryption is used on all DAG networks for seeding only.
DAG networks
A DAG network is a collection of one or more subnets used for either replication traffic or MAPI traffic. Each DAG contains a maximum of one MAPI network and zero or more replication networks. In a single network adapter configuration, the network is used for both MAPI and replication traffic. Although a single network adapter and path is supported, we recommend that each DAG have a minimum of two DAG networks. In a two-network configuration, one network is typically dedicated to replication traffic, and the other network is used primarily for MAPI traffic. You can also add network adapters to each DAG member and configure additional DAG networks as replication networks.

Note: When using multiple replication networks, there's no way to specify an order of precedence for network use. Exchange randomly selects a replication network from the group of replication networks to use for log shipping.

In Exchange 2010, manual configuration of DAG networks was necessary in many scenarios. By default in Exchange 2013, DAG networks are automatically configured by the system. Before you can create or modify DAG networks, you must first enable manual DAG network control by running the following command:
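Assuming a DAG named DAG1, the command takes the following form:

```powershell
# Enable manual DAG network configuration for the DAG
Set-DatabaseAvailabilityGroup -Identity DAG1 -ManualDagNetworkConfiguration $true
```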
After you've enabled manual DAG network configuration, you can use the New-DatabaseAvailabilityGroupNetwork cmdlet in the Shell to create a DAG network. For detailed steps about how to create a DAG network, see Create a Database Availability Group Network. You can use the Set-DatabaseAvailabilityGroupNetwork cmdlet in the Shell to configure DAG network properties. For detailed steps about how to configure DAG network properties, see Configure Database Availability Group Network Properties. Each DAG network has required and optional parameters to configure:

- Network name: A unique name for the DAG network of up to 128 characters.
- Network description: An optional description for the DAG network of up to 256 characters.
- Network subnets: One or more subnets entered using a format of IPAddress/Bitmask (for example, 192.168.1.0/24 for Internet Protocol version 4 (IPv4) subnets; 2001:DB8:0:C000::/64 for Internet Protocol version 6 (IPv6) subnets).
- Enable replication: In the EAC, select the check box to dedicate the DAG network to replication traffic and block MAPI traffic. Clear the check box to prevent replication from using the DAG network and to enable MAPI traffic. In the Shell, use the ReplicationEnabled parameter in the Set-DatabaseAvailabilityGroupNetwork cmdlet to enable and disable replication.

Note: Disabling replication for the MAPI network doesn't guarantee that the system won't use the MAPI network for replication. When all configured replication networks are offline, failed, or otherwise unavailable, and only the MAPI network remains (which is configured as disabled for replication), the system uses the MAPI network for replication.

The initial DAG networks (for example, MapiDagNetwork and ReplicationDagNetwork01) created by the system are based on the subnets enumerated by the Cluster service.
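A sketch of creating and adjusting DAG networks with these parameters (the DAG name, network names, and subnet are placeholders):

```powershell
# Create an additional replication network on the 10.0.2.0/24 subnet
New-DatabaseAvailabilityGroupNetwork -DatabaseAvailabilityGroup DAG1 `
    -Name ReplicationDagNetwork02 -Description "Second replication network" `
    -Subnets 10.0.2.0/24 -ReplicationEnabled:$true

# Block replication on the MAPI network
Set-DatabaseAvailabilityGroupNetwork -Identity DAG1\MapiDagNetwork -ReplicationEnabled:$false
```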
Each DAG member must have the same number of network adapters, and each network adapter must have an IPv4 address (and optionally, an IPv6 address as well) on a unique subnet. Multiple DAG members can have IPv4 addresses on the same subnet, but each network adapter and IP address pair in a specific DAG member must be on a unique subnet. In addition, only the adapter used for the MAPI network should be configured with a default gateway. Replication networks shouldn't be configured with a default gateway. For example, consider DAG1, a two-member DAG where each member has two network adapters (one dedicated for the MAPI network and the other for a replication network). Example IP address configuration settings are shown in the following table.
In the following configuration, there are two subnets configured in the DAG: 192.168.1.0 and 10.0.0.0. When EX1 and EX2 are added to the DAG, two subnets will be
enumerated and two DAG networks will be created: MapiDagNetwork (192.168.1.0) and ReplicationDagNetwork01 (10.0.0.0). These networks will be configured as shown in the following table.
After replication is disabled for MapiDagNetwork, the Microsoft Exchange Replication service uses ReplicationDagNetwork01 for continuous replication. If ReplicationDagNetwork01 experiences a failure, the Microsoft Exchange Replication service reverts to using MapiDagNetwork for continuous replication. This is done intentionally by the system to maintain high availability.
In the following configuration, there are four subnets configured in the DAG: 192.168.0.0, 192.168.1.0, 10.0.0.0, and 10.0.1.0. When EX1 and EX2 are added to the DAG, four subnets will be enumerated, but only two DAG networks will be created: MapiDagNetwork (192.168.0.0, 192.168.1.0) and ReplicationDagNetwork01 (10.0.0.0, 10.0.1.0). These networks will be configured as shown in the following table.
This command will also disable the network for use by the cluster. Although the iSCSI networks will continue to appear as DAG networks, they won't be used for MAPI or replication traffic after running the above command.
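For a dedicated iSCSI subnet, the command referred to above would be of this form (the DAG and network names are placeholder assumptions):

```powershell
# Exclude an iSCSI network from DAG use: disable replication on it and
# have the cluster ignore the network entirely
Set-DatabaseAvailabilityGroupNetwork -Identity DAG1\DAGNetwork02 `
    -ReplicationEnabled:$false -IgnoreNetwork:$true
```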
database doesn't automatically mount. When the copy queue length is less than or equal to 12, Exchange attempts to replicate the remaining logs to the passive copy and mounts the database.

GoodAvailability: If you specify this value, the database automatically mounts immediately after a failover if the copy queue length is less than or equal to six. The copy queue length is the number of logs recognized by the passive copy that need to be replicated. If the copy queue length is more than six, the database doesn't automatically mount. When the copy queue length is less than or equal to six, Exchange attempts to replicate the remaining logs to the passive copy and mounts the database.

Lossless: If you specify this value, the database doesn't automatically mount until all logs generated on the active copy have been copied to the passive copy. This setting also causes the Active Manager best copy selection algorithm to sort potential candidates for activation based on the database copy's activation preference value and not its copy queue length.

The default value is GoodAvailability. If you specify either BestAvailability or GoodAvailability, and all the logs from the active copy can't be copied to the passive copy being activated, you may lose some mailbox data. However, the Safety Net feature (which is enabled by default) helps protect against most data loss by resubmitting messages that are in the Safety Net queue.

In addition to the preceding values, you can also configure the AutoDatabaseMountDial parameter with a custom value by using ADSI Edit or Ldp.exe to modify the attribute directly in Active Directory. The AutoDatabaseMountDial parameter is represented by the msExchDataLossForAutoDatabaseMount attribute of the Mailbox server object. The whole-number value of this attribute represents the maximum number of transaction log files you're willing to lose in order to mount a database without human intervention.
If you configure the AutoDatabaseMountDial parameter with a custom value greater than 12, we recommend that you also increase the duration of the Safety Net retention period to enable increased protection against a greater number of lost logs.
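A sketch of adjusting both settings together (the server name and the two-day retention period are placeholder choices):

```powershell
# Set the automatic database mount dial for a DAG member
Set-MailboxServer -Identity MBX1 -AutoDatabaseMountDial GoodAvailability

# Increase the organization-wide Safety Net hold time to two days
Set-TransportConfig -SafetyNetHoldTime 2.00:00:00
```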
You can download the latest update rollup for Exchange 2013 from the Microsoft Download Center.
Seeding process
When you initiate a seeding process by using the Add-MailboxDatabaseCopy or Update-MailboxDatabaseCopy cmdlets, the following tasks are performed:

1. Database properties from Active Directory are read to validate the specified database and servers, and to verify that the source and target servers are running Exchange 2013, that they are both members of the same DAG, and that the specified database isn't a recovery database. The database file paths are also read.
2. Preparations occur for reseed checks from the Microsoft Exchange Replication service on the target server.
3. The Microsoft Exchange Replication service on the target server checks for the presence of database and transaction log files in the file directories read by the Active Directory checks in step 1.
4. The Microsoft Exchange Replication service returns the status information from the target server to the administrative interface from where the cmdlet was run.
5. If all preliminary checks have passed, you're prompted to confirm the operation before continuing. If you confirm the operation, the process continues. If an error is encountered during the preliminary checks, the error is reported and the operation fails.
6. The seed operation is started from the Microsoft Exchange Replication service on the target server.
7. The Microsoft Exchange Replication service suspends database replication for the active database copy.
8. The state information for the database is updated by the Microsoft Exchange Replication service to reflect a status of Seeding.
9. If the target server doesn't already have the directories for the target database and log files, they are created.
10. A request to seed the database is passed from the Microsoft Exchange Replication service on the target server to the Microsoft Exchange Replication service on the source server using TCP. This request and the subsequent communications for seeding the database occur on a DAG network that has been configured as a replication network.
11. The Microsoft Exchange Replication service on the source server initiates an Extensible Storage Engine (ESE) streaming backup via the Microsoft Exchange Information Store service interface.
12. The Microsoft Exchange Information Store service streams the database data to the Microsoft Exchange Replication service.
13. The database data is moved from the source server's Microsoft Exchange Replication service to the target server's Microsoft Exchange Replication service.
14. The Microsoft Exchange Replication service on the target server writes the database copy to a temporary directory, located in the main database directory, called temp-seeding.
15. The streaming backup operation on the source server ends when the end of the database is reached.
16. The write operation on the target server completes, and the database is moved from the temp-seeding directory to the final location. The temp-seeding directory is deleted.
17. On the target server, the Microsoft Exchange Replication service proxies a request to the Microsoft Exchange Search service to mount the content index catalog for the database copy, if it exists. If there are existing out-of-date catalog files from a previous instance of the database copy, the mount operation fails, which triggers the need to replicate the catalog from the source server. Likewise, if the catalog doesn't exist on a new instance of the database copy on the target server, a copy of the catalog is required. The Microsoft Exchange Replication service directs the Microsoft Exchange Search service to suspend indexing for the database copy while a new catalog is copied from the source.
18. The Microsoft Exchange Replication service on the target server sends a seed catalog request to the Microsoft Exchange Replication service on the source server.
19. On the source server, the Microsoft Exchange Replication service requests the directory information from the Microsoft Exchange Search service and requests that indexing be suspended.
20. The Microsoft Exchange Search service on the source server returns the search catalog directory information to the Microsoft Exchange Replication service.
21. The Microsoft Exchange Replication service on the source server reads the catalog files from the directory.
22. The Microsoft Exchange Replication service on the source server moves the catalog data to the Microsoft Exchange Replication service on the target server using a connection across the replication network. After the read is complete, the Microsoft Exchange Replication service sends a request to the Microsoft Exchange Search service to resume indexing of the source database.
23. If there are any existing catalog files on the target server in the directory, the Microsoft Exchange Replication service on the target server deletes them.
24. The Microsoft Exchange Replication service on the target server writes the catalog data to a temporary directory called CiSeed.Temp until the data is completely transferred.
25. The Microsoft Exchange Replication service moves the complete catalog data to the final location.
26. The Microsoft Exchange Replication service on the target server resumes search indexing on the target database.
27. The Microsoft Exchange Replication service on the target server returns a completion status.
28. The final result of the operation is passed to the administrative interface from which the cmdlet was called.
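The process above is kicked off with a single command; as a sketch (the database\server identity is a placeholder):

```powershell
# Reseed the copy of DB1 on MBX2, removing any stale database and log files first
Update-MailboxDatabaseCopy -Identity DB1\MBX2 -DeleteExistingFiles

# Or seed only the content index catalog, skipping the database files
Update-MailboxDatabaseCopy -Identity DB1\MBX2 -CatalogOnly
```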
The greater the replay lag time set, the longer the database recovery process. Depending on the number of log files that need to be replayed during recovery, and the speed at which your hardware can replay them, it may take several hours or more to recover a database. We recommend that you determine whether lagged copies are critical for your overall disaster recovery strategy. If using them is critical to your strategy, we recommend using multiple lagged copies, or using a redundant array of independent disks (RAID) to protect a single lagged copy if you don't have multiple lagged copies. If you lose a disk or if corruption occurs, you don't lose your lagged point in time.

Lagged copies aren't patchable with the ESE single page restore feature. If a lagged copy encounters database page corruption (for example, a -1018 error), it will have to be reseeded (which will lose the lagged aspect of the copy).

Activating and recovering a lagged mailbox database copy is an easy process if you want the database to replay all log files and make the database copy current. If you want to replay log files up to a specific point in time, it's a more difficult operation because you must manually manipulate log files and run Exchange Server Database Utilities (Eseutil.exe). For detailed steps about how to activate a lagged mailbox database copy, see Activate a Lagged Mailbox Database Copy.
If the DataMoveReplicationConstraint parameter is set to one of the following values, the corresponding conditions must be met:

SecondCopy: At least one passive database copy for a replicated database must meet the conditions below.
SecondDatacenter: At least one passive database copy in another Active Directory site must meet the conditions below.
AllDatacenters: The active copy must be mounted, and a passive copy in each Active Directory site must meet the conditions below.
AllCopies: The active copy must be mounted, and all passive database copies must meet the conditions below.

In each case, the passive database copy must:
- Be healthy.
- Have a replay queue within 10 minutes of the replay lag time.
- Have a copy queue length less than 10 logs.
- Have an average copy queue length less than 10 logs. The average copy queue length is computed based on the number of times the application has queried the database status.
Check Replication Flush: The Data Guarantee API can also be used to validate that a prerequisite number of database copies have replayed the required transaction logs. This is verified by comparing the last log replayed timestamp with the calling service's commit timestamp (in most cases, this is the timestamp of the last log file that contains required data), plus an additional five seconds (to deal with system clock skew or drift). If the replay timestamp is greater than the commit timestamp, the DataMoveReplicationConstraint parameter is satisfied. If the replay timestamp is less than the commit timestamp, the DataMoveReplicationConstraint isn't satisfied. Before moving large numbers of mailboxes to or from replicated databases within a DAG, we recommend that you configure the DataMoveReplicationConstraint parameter on each mailbox database according to the following:
- Mailbox databases that don't have any database copies: None
- A DAG within a single Active Directory site: SecondCopy
- A DAG in multiple datacenters using a stretched Active Directory site: SecondCopy
- A DAG that spans two Active Directory sites, where you will have highly available database copies in each site: SecondDatacenter
- A DAG that spans two Active Directory sites, where you will have only lagged database copies in the second site: SecondCopy. This is because the Data Guarantee API won't guarantee data being committed until the log file is replayed into the database copy, and due to the nature of the database copy being lagged, this constraint will fail the move request unless the lagged database copy's ReplayLagTime value is less than 30 minutes.
- A DAG that spans three or more Active Directory sites, where each site will contain highly available database copies: AllDatacenters
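The constraint is a per-database property; a sketch of setting it (the database name is a placeholder):

```powershell
# Require at least one healthy passive copy to have the data
# before a mailbox move to or from DB1 is allowed to complete
Set-MailboxDatabase -Identity DB1 -DataMoveReplicationConstraint SecondCopy
```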
In the preceding example, there are four copies of each database, and therefore only four possible values for activation preference (1, 2, 3, or 4). The Preference count list column shows the count of the number of databases with each of these values. For example, on EX3, there are 13 database copies with an activation preference of 1, two copies with an activation preference of 2, one copy with an activation preference of 3, and no copies with an activation preference of 4. As you can see, this DAG isn't balanced in terms of the number of active databases hosted by each DAG member, the number of passive databases hosted by each DAG member, or the activation preference count of the hosted databases.

You can use the RedistributeActiveDatabases.ps1 script to balance the active mailbox database copies across a DAG. This script moves databases between their copies in an attempt to have an equal number of mounted databases on each server in the DAG. If required, the script also attempts to balance active databases across sites. The script provides two options for balancing active database copies within a DAG:

- BalanceDbsByActivationPreference: When this option is specified, the script attempts to move databases to their most preferred copy (based on activation preference) without regard to the Active Directory site.
- BalanceDbsBySiteAndActivationPreference: When this option is specified, the script attempts to move active databases to their most preferred copy, while also trying to balance active databases within each Active Directory site.

After running the script with the first option, the preceding unbalanced DAG becomes balanced, as shown in the following table.
As shown in the preceding table, this DAG is now balanced in terms of number of active and passive databases on each server and activation preference across the servers. The following table lists the available parameters for the RedistributeActiveDatabases.ps1 script.
- ShowDatabaseDistributionByServer
- RunOnlyOnPAM
RedistributeActiveDatabases.ps1 examples
This example shows the current database distribution for a DAG, including preference count list.
This example redistributes and balances the active mailbox database copies in a DAG using activation preference without prompting for input.
This example redistributes and balances the active mailbox database copies in a DAG using activation preference, and produces a summary of the distribution.
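Assuming a DAG named DAG1, invocations matching the three example descriptions above would look like this (parameter names follow the script's documented options):

```powershell
# Show the current database distribution for the DAG, including the preference count list
.\RedistributeActiveDatabases.ps1 -DagName DAG1 -ShowDatabaseCurrentActives -ShowDatabaseDistributionByServer | Format-Table

# Redistribute and balance active copies by activation preference without prompting for input
.\RedistributeActiveDatabases.ps1 -DagName DAG1 -BalanceDbsByActivationPreference -Confirm:$false

# Redistribute by activation preference and produce a summary of the distribution
.\RedistributeActiveDatabases.ps1 -DagName DAG1 -BalanceDbsByActivationPreference -ShowFinalDatabaseDistribution
```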
Database switchovers
The Mailbox server that hosts the active copy of a database is referred to as the mailbox database master. The process of activating a passive database copy changes the mailbox database master for the database and turns the passive copy into the new active copy. This process is called a database switchover. In a database switchover, the active copy of a database is dismounted on one Mailbox server, and a passive copy of that database is mounted as the new active mailbox database on another Mailbox server. When performing a switchover, you can optionally override the database mount dial setting on the new mailbox database master. You can quickly identify which Mailbox server is the current mailbox database master by reviewing the right-hand column under the Database Copies tab in the EAC. You can perform a switchover by using the Activate link in the EAC, or by using the Move-ActiveMailboxDatabase cmdlet in the Shell. Several internal checks are performed before a passive copy is activated:

- The status of the database copy is checked. If the database copy is in a failed state, the switchover is blocked. You can override this behavior and bypass the health check by using the SkipHealthChecks parameter of the Move-ActiveMailboxDatabase cmdlet. This parameter allows you to move the active copy to a database copy in a failed state.
- The active database copy is checked to see if it's currently a seeding source for any passive copies of the database. If the active copy is currently being used as a source for seeding, the switchover is blocked. You can override this behavior and bypass the seeding source check by using the SkipActiveCopyChecks parameter of the Move-ActiveMailboxDatabase cmdlet. This parameter allows you to move an active copy that's being used as a seeding source. Using this parameter causes the seeding operation to be cancelled and considered failed.
- The copy queue and replay queue lengths for the database copy are checked to ensure their values are within the configured criteria. The database copy is also verified to ensure that it isn't currently in use as a source for seeding. If the values for the queue lengths are outside the configured criteria, or if the database is currently being used as a source for seeding, the switchover is blocked. You can override this behavior and bypass these checks by using the SkipLagChecks parameter of the Move-ActiveMailboxDatabase cmdlet. This parameter allows a copy to be activated that has replay and copy queues outside of the configured criteria.
- The state of the search catalog (content index) for the database copy is checked. If the search catalog isn't up to date, is in an unhealthy state, or is corrupt, the switchover is blocked. You can override this behavior and bypass the search catalog check by using the SkipClientExperienceChecks parameter of the Move-ActiveMailboxDatabase cmdlet. If the search catalog for the database copy you're activating is in an unhealthy or unusable state and you use this parameter to skip the catalog health check and activate the database copy, you will need to either crawl or seed the search catalog again.

When performing a database switchover, you also have the option of overriding the mount dial settings configured for the server that hosts the passive database copy being activated. Using the MountDialOverride parameter of the Move-ActiveMailboxDatabase cmdlet instructs the target server to override its own mount dial settings and use those specified by the MountDialOverride parameter. For detailed steps about how to perform a switchover of a database copy, see Activate a Mailbox Database Copy.
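The checks and overrides described above can be illustrated with a hedged Move-ActiveMailboxDatabase sketch. The database name DB1, the server name MBX2, and the chosen mount dial value are assumed placeholders, not values from the original documentation:

```powershell
# Switch over DB1 to its passive copy on MBX2, overriding the target
# server's own mount dial setting. Add -SkipLagChecks or
# -SkipClientExperienceChecks only when you accept the risks those
# bypasses carry (stale copy, catalog re-crawl).
Move-ActiveMailboxDatabase DB1 -ActivateOnServer MBX2 -MountDialOverride:GoodAvailability
```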
Managed availability
Managed availability is the integration of built-in monitoring and recovery actions with the Exchange built-in high availability platform. It's designed to detect and recover from problems as soon as they occur and are discovered by the system. Unlike previous external monitoring solutions for Exchange, managed availability doesn't try to identify or communicate the root cause of an issue. Instead, it focuses on recovery aspects that address three key areas of the user experience:

- Availability: Can users access the service?
- Latency: How is the experience for users?
- Errors: Are users able to accomplish what they want?

The new architecture in Exchange 2013 makes each Exchange server an island, where the services on a server operate only on the active databases located on that server. These architectural changes require a new approach to the availability model used by Exchange. The Mailbox and Client Access server architecture implies that any Mailbox server with an active database is in production for all services, including all protocol services. As a result, this fundamentally changes the model used to manage the protocol services. Managed availability was conceived to address this change and to provide a native health monitoring and recovery solution. The integration of the building block architecture into a unified framework provides a powerful capability to detect failures and recover from them. Managed availability moves away from monitoring individual, separate slices of the system toward monitoring the end-to-end user experience, and it protects the end user's experience through recovery-oriented computing. In Exchange 2013, client access protocols for a specific mailbox are always served from the protocol instance that's local to the active database copy. As a result, it's important that managed availability's monitoring and recovery actions take into account more than just the health of the database.
Managed availability is an internal process that runs on every Exchange 2013 server. It's implemented in the form of two services:

- Exchange Health Manager Service (MSExchangeHMHost.exe): A controller process used to manage worker processes. It's used to build, execute, and start and stop the worker process, as needed. It's also used to recover the worker process if that process fails, so that the worker process doesn't become a single point of failure.
- Exchange Health Manager Worker process (MSExchangeHMWorker.exe): The worker process responsible for performing the run-time tasks.

Managed availability uses persistent storage to perform its functions:

- XML configuration files are used to initialize the work item definitions during startup of the worker process.
- The Windows registry is used to store run-time data, such as bookmarks.
- The Windows crimson channel event log infrastructure is used to store the work item results.

As illustrated in the following drawing, managed availability includes three main asynchronous components that are constantly doing work.

[Drawing: Managed availability]
The first component is the probe engine, which is responsible for taking measurements on the server and collecting data. The results of those measurements flow into the second component, the monitor. The monitor contains the business logic that defines what the system considers healthy, based on the collected data. Similar to a pattern recognition engine, the monitor looks for patterns in the collected measurements and then decides whether something is healthy. Finally, there is the responder engine, which is responsible for recovery actions. When something is unhealthy, the first action is to attempt to recover that component. This can include multi-stage recovery actions; for example, the first attempt may be to restart the application pool, the second may be to restart the service, the third may be to restart the server, and a subsequent attempt may be to take the server offline so that it no longer accepts traffic. If the recovery actions are unsuccessful, the system escalates the issue to a human through event log notifications.
The probe engine contains probes, checks, and notification logic:

- Probes are synthetic transactions performed by the system to test the end-to-end user experience.
- Checks are the infrastructure that collects performance data, including user traffic, and measures the collected data against thresholds that are set to determine spikes in user failures. This enables the checks infrastructure to become aware when users are experiencing issues.
- The notification logic enables the system to take action immediately based on a critical event, without having to wait for the results of the data collected by a probe. These are typically exceptions or conditions that can be detected and recognized without a large sample set.

Monitors query the data collected by probes to determine whether action needs to be taken based on a predefined rule set. Depending on the rule or the nature of the issue, a monitor can either initiate a responder or escalate the issue to a human via an event log entry. Monitors also define how long after a failure a responder is executed, as well as the workflow of the recovery action. Monitors have various states. From a system state perspective, monitors have two states:

- Healthy: The monitor is operating properly, and all collected metrics are within normal operating parameters.
- Unhealthy: The monitor isn't healthy, and it has either initiated recovery through a responder or notified an administrator through escalation.

From an administrative perspective, monitors have additional states that appear in the Shell:

- Degraded: The monitor has been in an unhealthy state for 0 through 60 seconds. If a monitor is unhealthy for more than 60 seconds, it's considered Unhealthy.
- Disabled: The monitor has been explicitly disabled by an administrator.
- Unavailable: The Microsoft Exchange Health service periodically queries each monitor for its state. If it doesn't get a response to the query, the monitor state becomes Unavailable.
- Repairing: An administrator sets the Repairing state to indicate to the system that corrective action by a human is in progress, which allows the system and humans to differentiate between other failures that may occur at the same time corrective action is being taken (such as a database copy reseed operation).

Return to top
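The monitor states described above can be inspected in the Shell with the Get-ServerHealth cmdlet. A minimal sketch, assuming a Mailbox server named MBX1 (the server name is a placeholder):

```powershell
# List every monitor on MBX1; the AlertValue column reflects the monitor
# state (Healthy, Degraded, Unhealthy, Disabled, Unavailable, Repairing).
Get-ServerHealth -Identity MBX1

# Narrow the output to monitors that aren't currently healthy.
Get-ServerHealth -Identity MBX1 | Where-Object { $_.AlertValue -ne "Healthy" }
```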
Get-MailboxDatabaseCopyStatus cmdlet
You can use the Get-MailboxDatabaseCopyStatus cmdlet to view status information about mailbox database copies. This cmdlet enables you to view information about all copies of a particular database, information about a specific copy of a database on a specific server, or information about all database copies on a server. The following table describes possible values for the copy status of a mailbox database copy.
Mounted
Dismounting
DisconnectedAndHealthy
DisconnectedAndResynchronizing
The Get-MailboxDatabaseCopyStatus cmdlet also includes a parameter called ConnectionStatus, which returns details about the in-use replication networks. If you use this parameter, two additional output fields, IncomingLogCopyingNetwork and SeedingNetwork, will be populated in the task's output.
Get-MailboxDatabaseCopyStatus examples
The following examples use the Get-MailboxDatabaseCopyStatus cmdlet. Each example pipes the results to the Format-List cmdlet to display the output in list format. This example returns status information for all copies of the database DB2.
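The original command isn't preserved in the text; a hedged reconstruction for this example (DB2 is the database named in the text):

```powershell
# Status information for all copies of database DB2, in list format.
Get-MailboxDatabaseCopyStatus -Identity DB2 | Format-List
```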
This example returns the status for all database copies on the Mailbox server MBX2.
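A hedged reconstruction of this example's command (MBX2 is the server named in the text):

```powershell
# Status for all database copies hosted on the Mailbox server MBX2.
Get-MailboxDatabaseCopyStatus -Server MBX2 | Format-List
```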
This example returns the status for all database copies on the local Mailbox server.
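A hedged reconstruction of this example's command:

```powershell
# Status for all database copies on the local Mailbox server.
Get-MailboxDatabaseCopyStatus -Local | Format-List
```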
This example returns status, log shipping, and seeding network information for the database DB3 on the Mailbox server MBX1.
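A hedged reconstruction of this example's command (DB3 and MBX1 are the names from the text; the ConnectionStatus parameter populates the network fields described earlier):

```powershell
# Status plus log shipping and seeding network details for the copy
# of DB3 hosted on MBX1.
Get-MailboxDatabaseCopyStatus -Identity DB3\MBX1 -ConnectionStatus | Format-List
```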
For more information about using the Get-MailboxDatabaseCopyStatus cmdlet, see Get-MailboxDatabaseCopyStatus. Return to top
Test-ReplicationHealth cmdlet
You can use the Test-ReplicationHealth cmdlet to view continuous replication status information about mailbox database copies. This cmdlet can be used to check all aspects of the replication and replay status to provide a complete overview of a specific Mailbox server in a DAG. The Test-ReplicationHealth cmdlet is designed for the proactive monitoring of continuous replication and the continuous replication pipeline, the availability of Active Manager, and the health and status of the underlying cluster service, quorum, and network components. It can be run locally on or remotely against any Mailbox server in a DAG. The Test-ReplicationHealth cmdlet performs the tests listed in the following table.
DBCopyFailed
DBDisconnected
DBLogCopyKeepingUp
DBLogReplayKeepingUp
Test-ReplicationHealth example
This example uses the Test-ReplicationHealth cmdlet to test the health of replication for the Mailbox server MBX1.
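The original command isn't preserved in the text; a hedged reconstruction (MBX1 is the server named in the text):

```powershell
# Run all replication health tests against the Mailbox server MBX1.
Test-ReplicationHealth -Identity MBX1
```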
Return to top
CollectOverMetrics.ps1 script
Exchange 2013 includes a script called CollectOverMetrics.ps1, which can be found in the Scripts folder. CollectOverMetrics.ps1 reads DAG member event logs to gather information about database operations (such as database mounts, moves, and failovers) over a specific time period. For each operation, the script records the following information:

- Identity of the database
- Time at which the operation began and ended
- Servers on which the database was mounted at the start and finish of the operation
- Reason for the operation
- Whether the operation was successful, and if the operation failed, the error details

The script writes this information to .csv files, with one operation per row. It writes a separate .csv file for each DAG. The script supports parameters that allow you to customize the script's behavior and output. For example, the results can be restricted to a specified subset by using the Database or ReportFilter parameters. Only the operations that match these filters will be included in the summary HTML report. The available parameters are listed in the following table.
StartTime
ActionType
RawOutput: Specifies that the script writes the results that would have been written to .csv files directly to the output stream, as would happen with Write-Output. This information can then be piped to other commands.
IncludedExtendedEvents: Specifies that the script collects the events that provide diagnostic details of the time spent mounting databases. This can be a time-consuming stage if the Application event log on the servers is large.
MergeCSVFiles: Specifies that the script takes all the .csv files containing data about each operation and merges them into a single .csv file.
ReportFilter: Specifies that a filter should be applied to the operations using the fields as they appear in the .csv files. This parameter uses the same format as a Where operation, with each element set to $_ and returning a Boolean value. For example, { $_.DatabaseName -notlike "Mailbox Database*" } can be used to exclude the default databases from the report.
CollectOverMetrics.ps1 examples
The following example collects metrics for all databases that match DB* (which includes a wildcard character) in the DAG DAG1. After the metrics are collected, an HTML report is generated and displayed.
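The original command isn't preserved in the text; a hedged reconstruction (DAG1 and the DB* pattern come from the text):

```powershell
# Collect metrics for all databases matching DB* in DAG1, then generate
# and display an HTML summary report.
.\CollectOverMetrics.ps1 -DatabaseAvailabilityGroup DAG1 -Database:"DB*" -GenerateHTMLReport -ShowHTMLReport
```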
The following examples demonstrate ways that the summary HTML report may be filtered. The first uses the Database parameter, which takes a list of database names; the summary report then contains data only about those databases. The next two examples use the ReportFilter option: the second example filters out the default databases, and the last reports only operations in which the database moved away from ServerXYZ.
CollectOverMetrics.ps1 -SummariseCsvFiles (dir *.csv) -Database MailboxDatabase123,MailboxDatabase456
CollectOverMetrics.ps1 -SummariseCsvFiles (dir *.csv) -ReportFilter { $_.DatabaseName -notlike "Mailbox Database*" }
CollectOverMetrics.ps1 -SummariseCsvFiles (dir *.csv) -ReportFilter { ($_.ActiveOnStart -like "ServerXYZ*") -and ($_.ActiveOnEnd -notlike "ServerXYZ*") }
Return to top
CollectReplicationMetrics.ps1 script
CollectReplicationMetrics.ps1 is another health metric script included in Exchange 2013. This script provides an active form of monitoring because it collects metrics in real time, while the script is running. CollectReplicationMetrics.ps1 collects data from performance counters related to database replication. The script gathers counter data from multiple Mailbox servers, writes each server's data to a .csv file, and then reports various statistics across all of this data (for example, the amount of time each copy was failed or suspended, the average copy or replay queue length, or the amount of time that copies were outside of their failover criteria).

You can either specify the servers individually, or you can specify entire DAGs. You can run the script to first collect the data and then generate the report, or you can run it to just gather the data or to only report on data that's already been collected. You can specify the frequency at which data should be sampled and the total duration over which to gather data.

The data collected from each server is written to a file named CounterData.<ServerName>.<TimeStamp>.csv. The summary report is written to a file named HaReplPerfReport.<DAGName>.<TimeStamp>.csv, or HaReplPerfReport.<TimeStamp>.csv if you didn't run the script with the DagName parameter.

The script starts Windows PowerShell jobs to collect the data from each server. These jobs run for the full period in which data is being collected. If you specify a large number of servers, this process can use a considerable amount of memory. The final stage of the process, when data is processed into a summary report, can also be quite time consuming for large amounts of data. It's possible to run the collection stage on one computer, and then copy the data elsewhere for processing.

The CollectReplicationMetrics.ps1 script supports parameters that allow you to customize the script's behavior and output. The available parameters are listed in the following table.
ReportPath
Duration
Servers
SummariseFiles
MoveFilestoArchive: Specifies that the script should move the files to a compressed folder after processing.
LoadExchangeSnapin: Specifies that the script should load the Shell commands. This parameter is useful when the script needs to run from outside the Shell, such as in a scheduled task.
CollectReplicationMetrics.ps1 example
The following example gathers one hour's worth of data from all the servers in the DAG DAG1, sampled at one minute intervals, and then generates a summary report. In addition, the ReportPath parameter is used, which causes the script to place all the files in the current directory.
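The original command isn't preserved in the text; a hedged reconstruction (DAG1, the one-hour duration, the one-minute frequency, and the current-directory report path all come from the text):

```powershell
# Gather one hour of data from all servers in DAG1 at one-minute
# intervals, writing all output files to the current directory.
.\CollectReplicationMetrics.ps1 -DagName DAG1 -Duration "01:00:00" -Frequency "00:01:00" -ReportPath .
```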
The following example reads the data from all the files matching CounterData* and then generates a summary report.
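The original command isn't preserved in the text; a hedged reconstruction (the CounterData* pattern comes from the text):

```powershell
# Generate a summary report from previously collected CounterData files.
.\CollectReplicationMetrics.ps1 -SummariseFiles (dir CounterData*) -ReportPath .
```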
Return to top