Exchange 2010 High Availability Misconceptions Addressed
I've recently returned from TechEd North America 2011 in Atlanta, Georgia, where I had a wonderful time seeing old friends and new, and talking with customers and partners about my favorite subject: high availability in Exchange Server 2010. In case you missed TechEd, or were there and missed some sessions, you can download slide decks and watch presentations on Channel 9.
While at TechEd, I noticed several misconceptions that were being repeated about certain aspects of Exchange HA. I thought a blog post might help clear up these misconceptions.
Term | How Configured
Activation | Move-ActiveMailboxDatabase cmdlet
Activation Suspended or Blocked | Suspend-MailboxDatabaseCopy or Set-MailboxServer cmdlets
Active Manager | N/A
Alternate Witness Directory | Set-DatabaseAvailabilityGroup cmdlet or Exchange Management Console (EMC)
Alternate Witness Server | Set-DatabaseAvailabilityGroup cmdlet or EMC
AutoDatabaseMountDial | Set-MailboxServer cmdlet
Datacenter | N/A
Datacenter Activation Coordination (DAC) mode | Set-DatabaseAvailabilityGroup cmdlet
Datacenter Switchover |
Failover (the automatic process performed by the system in response to a failure affecting one or more active mailbox database copies) | N/A; happens automatically by the system, although you can activation-block or suspend individual mailbox database copies or an entire DAG member
Incremental Resync | N/A
Lossy Failover | N/A
Quorum | N/A
Site | N/A
Split Brain | N/A
StartedMailboxServers |
StoppedMailboxServers |
Switchover | Move-ActiveMailboxDatabase cmdlet
Targetless Switchover | Move-ActiveMailboxDatabase cmdlet (specifically without using the TargetServer parameter)
Transport Dumpster | Set-TransportConfig cmdlet or EMC
Witness Directory | Set-DatabaseAvailabilityGroup cmdlet or EMC
Witness Server | Set-DatabaseAvailabilityGroup cmdlet or EMC
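Several of these operations map to a single cmdlet. A minimal sketch of the two switchover variants, assuming a database named DB1 and a DAG member named EX2 (both names hypothetical):

# Switchover: manually activate the copy of DB1 hosted on EX2
Move-ActiveMailboxDatabase DB1 -ActivateOnServer EX2

# Targetless switchover: omit the target and let Active Manager's
# best copy selection pick the copy to activate
Move-ActiveMailboxDatabase DB1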
Figure 1: When extending a two-member DAG across two datacenters, locate the Witness Server in the primary
datacenter
In this example, Portland is the primary datacenter because it contains the majority of the user population. As illustrated below, in the event of a WAN outage (which will always result in the loss of communication between some DAG members when a DAG is extended across a WAN), the DAG member in the Portland datacenter will maintain quorum and continue servicing the local user population, while the DAG member in the Redmond datacenter will lose quorum and will require manual intervention to restore it to service after WAN connectivity is restored.
Figure 2: In the event of a WAN outage, the DAG member in the primary datacenter will maintain quorum and
continue servicing local users
The reason for this behavior has to do with the core rules around quorum and DAGs, specifically:

All DAGs and DAG members require quorum to operate. If you don't have quorum, you don't have an operational DAG. When quorum is lost, databases are dismounted, connectivity is unavailable and replication is stopped.

Quorum requires a majority of voters to achieve a consensus. Thus, when you have an even number of members in a DAG, you need an external component to provide a weighted vote for one of the actual quorum voters to prevent ties from occurring.

In a Windows Failover Cluster, only members of the cluster are quorum voters. When the cluster is one vote away from losing quorum and the Witness Server is needed to maintain quorum, one of the DAG members that can communicate with the Witness Server places a Server Message Block (SMB) lock on a file called witness.log that is located in the Witness Directory. The DAG member that places the SMB lock on this file is referred to as the locking node. Once an SMB lock is placed on the file, no other DAG member can lock the file.

The locking node then acquires a weighted vote; that is, instead of its vote counting for 1, it counts for 2 (itself and the Witness Server).

If the number of members that can communicate with the locking node constitutes a majority, then the members in communication with the locking node will maintain quorum and continue servicing clients. DAG members that cannot communicate with the locking node are in the minority; they lose quorum and terminate cluster and DAG operations.

The majority needed to maintain quorum is determined using the formula V/2 + 1: the number of voters (V) divided by 2, with the result rounded down to a whole number, plus 1 (for tie-breaking purposes).
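To make the majority arithmetic concrete, here is a minimal sketch of that formula in plain PowerShell (illustrative only; this is not an Exchange cmdlet):

function Get-QuorumMajority {
    param([int]$Voters)
    # Majority needed to maintain quorum: floor(V/2) + 1
    [math]::Floor($Voters / 2) + 1
}
Get-QuorumMajority -Voters 2   # 2: a two-member DAG needs both votes (hence the witness)
Get-QuorumMajority -Voters 3   # 2: a three-member DAG can survive the loss of one voter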
Going back to our example, consider the placement of the Witness Server in a third datacenter, which
would look like the following:
Figure 3: Locating the Witness Server in a third datacenter does not provide you with any different behavior
The above configuration does not provide you with any different behavior. In the event WAN
connectivity is lost between Portland and Redmond, one DAG member will retain quorum and one DAG
member will lose quorum, as illustrated below:
Figure 4: In the event of a WAN outage between the two datacenters, one DAG member will retain quorum
Here we have two DAG members, and thus two voters. Using the formula V/2 + 1, we need at least 2 votes to maintain quorum. When the WAN connection between Portland and Redmond is lost, it causes the DAG's underlying cluster to verify that it still has quorum.

In this example, the DAG member in Portland is able to place an SMB lock on the witness.log file on the Witness Server in Olympia. Because the DAG member in Portland is the locking node, it gets the weighted vote, and therefore holds the two votes necessary to retain quorum and keep its cluster and DAG functions operating.

Although the DAG member in Redmond can communicate with the Witness Server in Olympia, it cannot place an SMB lock on the witness.log file because one already exists. And because it cannot communicate with the locking node, the Redmond DAG member is in the minority; it loses quorum, and it terminates its cluster and DAG functions. Remember, it doesn't matter whether the other DAG members can communicate with the Witness Server; they need to be able to communicate with the locking node in order to participate in quorum and remain functional.
As documented in Managing Database Availability Groups on TechNet, if you have a DAG extended
across two sites, we recommend that you place the Witness Server in the datacenter that you consider
to be your primary datacenter based on the location of your user population. If you have multiple
datacenters with active user populations, we recommend using two DAGs (also as documented in
Database Availability Group Design Examples on TechNet).
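The witness placement itself is a simple setting; a sketch, assuming a DAG named DAG1 and a Hub Transport server named HUB-PDX in the primary datacenter (both names hypothetical):

Set-DatabaseAvailabilityGroup DAG1 -WitnessServer HUB-PDX -WitnessDirectory C:\DAGWitness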
Datacenter Activation Coordination (DAC) mode is a property of the DAG that, when enabled, forces starting DAG members to acquire permission from other DAG members in order to mount mailbox databases. DAC mode was created to handle the following basic scenario:
You have a DAG extended to two datacenters.
You lose the power to your primary datacenter, which also takes out WAN connectivity between
your primary and secondary datacenters.
Because primary datacenter power will be down for a while, you decide to activate your
secondary datacenter and you perform a datacenter switchover.
Eventually, power is restored to your primary datacenter, but WAN connectivity between the two
datacenters is not yet functional.
The DAG members starting up in the primary datacenter cannot communicate with any of the
running DAG members in the secondary datacenter.
In this scenario, the starting DAG members in the primary datacenter have no idea that a datacenter switchover has occurred. They still believe they are responsible for hosting active copies of databases, and without DAC mode, if they have a sufficient number of votes to establish quorum, they would try to mount their active databases. This would result in a bad condition called split brain, which would occur at the database level. In this condition, multiple DAG members that cannot communicate with each other each host an active copy of the same mailbox database. This is a very unfortunate condition that increases the chances of data loss, and makes data recovery challenging and lengthy (albeit possible, but definitely not a situation we would want any customer to be in).
The way databases are mounted in Exchange 2010 has changed. Yes, the Information Store still
performs the mount, but it will only do so if Active Manager asks it to. Even when an administrator
right-clicks a mailbox database in the EMC and selects Mount Database, it is Active Manager that
provides the administrative interface for that task, and performs the RPC request into the Information
Store to perform the mount operation (even on Mailbox servers that are not members of a DAG).
Thus, when every DAG member starts, it is Active Manager that decides whether or not to send a mount
request for a mailbox database to the Information Store. When a DAG is enabled for DAC mode, this
startup and decision-making process by Active Manager is altered. Specifically, in DAC mode, a starting
DAG member must ask for permission from other DAG members before it can mount any databases.
DAC mode works by using a bit stored in memory by Active Manager called the Datacenter Activation Coordination Protocol (DACP). That's a very fancy name for something that is simply a bit in memory set to either a 1 or a 0. A value of 1 means Active Manager can issue mount requests, and a value of 0 means it cannot.
The starting bit is always 0, and because the bit is held in memory, any time the Microsoft Exchange
Replication service (MSExchangeRepl.exe) is stopped and restarted, the bit reverts to 0. In order to
change its DACP bit to 1 and be able to mount databases, a starting DAG member needs to either:
Be able to communicate with any other DAG member that has a DACP bit set to 1; or
Be able to communicate with all DAG members that are listed on the StartedMailboxServers list.
If either condition is true, Active Manager on a starting DAG member will issue mount requests for the active database copies it hosts. If neither condition is true, Active Manager will not issue any mount requests.
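You can inspect the list referenced by the second condition on a running DAG; a sketch, assuming a DAG named DAG1 (a hypothetical name):

Get-DatabaseAvailabilityGroup DAG1 -Status |
    Format-List Name,DatacenterActivationMode,StartedMailboxServers,StoppedMailboxServers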
Returning to the intended DAC mode scenario, when power is restored to the primary datacenter without WAN connectivity, the DAG members starting up in that datacenter can communicate only with each other. And because they are starting up from a power loss, their DACP bit will be set to 0. As a result, none of the starting DAG members in the primary datacenter is able to meet either of the conditions above, and they are therefore unable to change their DACP bit to 1 and issue mount requests.
So that's how DAC mode prevents split brain at the database level. It has nothing whatsoever to do with failovers, and therefore leaving DAC mode disabled will not enable automatic datacenter failovers.
By the way, as documented in Understanding Datacenter Activation Coordination Mode on TechNet,
a nice side benefit of DAC mode is that it also provides you with the ability to use the built-in Exchange
site resilience tasks.
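A sketch of both pieces, assuming a DAG named DAG1 stretched across the Portland and Redmond sites used in the earlier figures:

# Enable DAC mode on the DAG
Set-DatabaseAvailabilityGroup DAG1 -DatacenterActivationMode DagOnly

# The built-in site resilience tasks that DAC mode makes available,
# used here to switch over from Portland to Redmond
Stop-DatabaseAvailabilityGroup DAG1 -ActiveDirectorySite Portland -ConfigurationOnly
Restore-DatabaseAvailabilityGroup DAG1 -ActiveDirectorySite Redmond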
As documented in Understanding Datacenter Activation Coordination Mode on TechNet, this dial setting is the administrator's way of telling a DAG member the maximum number of log files that can be missing while still allowing its database copies to be mounted. The default setting is GoodAvailability, which translates to 6 or fewer logs missing. This means if 6 or fewer log files never made it from the active copy to this passive copy, it is still OK for the server to mount this database copy as the new active copy. This scenario is referred to as a lossy failover, and it is Exchange doing what it was designed to do. Other settings include BestAvailability (12 or fewer logs missing) and Lossless (0 logs missing).
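The dial is set per Mailbox server; a sketch, assuming a server named EX2 (a hypothetical name):

# Inspect the current dial setting
Get-MailboxServer EX2 | Format-List Name,AutoDatabaseMountDial

# Allow activation with up to 12 missing log files
Set-MailboxServer EX2 -AutoDatabaseMountDial BestAvailability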
After a passive copy has been activated in a lossy failover, it will create log files continuing the log generation sequence based on the last log file it received from the active copy (either through normal replication, or as a result of successful copying during the ACLL (Attempt Copy Last Logs) process). To illustrate this, let's look at the scenario in detail, starting before a failure occurs.
We have two copies of DB1; the active copy is hosted on EX1 and the passive copy is hosted on EX2.
The current settings and mailbox database copy status at the time of failure are as follows:
AutoDatabaseMountDial: BestAvailability
Copy Queue Length: 4
Replay Queue Length: 0
Last log generated by DB1\EX1: E0000000010
Last Log Received by DB1\EX2: E0000000006
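These values can be read from a live copy; a sketch, using the DB1\EX2 copy from this example:

Get-MailboxDatabaseCopyStatus DB1\EX2 |
    Format-List Status,CopyQueueLength,ReplayQueueLength,LastLogGenerated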
At this point, someone accidentally powers off EX1, and we have a lossy failover in which DB1\EX2 is
mounted as the new active copy of the database. Because E0000000006 is the last log file DB1\EX2 has,
it continues the generation stream, creating log files E0000000007, E0000000008, E0000000009,
E0000000010, and so forth.
An administrator notices that EX1 is turned off and restarts it. EX1 boots up and, among other things, the Microsoft Exchange Replication service starts. The Active Manager component, which runs inside this service, detects that:
DB1\EX2 was activated as part of a lossy failover
DB1\EX2 is now the active copy
DB1\EX1 is now a diverged passive copy
Any time a lossy failover occurs where the original active copy may be viable for use, there is always divergence in the log stream that the system must deal with. This state causes DB1\EX1 to automatically invoke a process called Incremental Resync, which is designed to deal with divergence in the log stream after a lossy failover has occurred. Its purpose is to resynchronize database copies so that, when certain failure conditions occur, you don't have to perform a full reseed of a database copy.
In this example, divergence occurred with log generation E0000000007, as illustrated below:
DB1\EX2 received generations 1 through 6 from DB1\EX1 when DB1\EX1 was the active copy. But a
failover occurred, and logs 7 through 10 were never copied from EX1 to EX2. Thus, when DB1\EX2
became the active copy, it continued the log generation sequence from the last log that it had, log 6. As
a result, DB1\EX2 generated its own logs 7-10 that now contain data that is different from the data
contained in logs 7-10 that were generated by DB1\EX1.
To detect (and resolve) this divergence, the Incremental Resync feature starts with the latest log
generation on each database copy (in this example, log file 10), and it compares the two different log
files, working back in the sequence until it finds a matching pair. In this example, log generation 6 is the
last log file that is the same on both systems. Because DB1\EX1 is now a passive copy, and because its
logs 7 through 10 are diverged from logs 7 through 10 on DB1\EX2, which is now the active copy, these
log files will be thrown away by the system. Of course, this does not represent lost messages because
the messages themselves are recoverable through the Transport Dumpster mechanism.
Then, logs 7 through 10 on DB1\EX2 will be replicated to DB1\EX1, and DB1\EX1 will be a healthy, up-to-date copy of DB1\EX2, as illustrated below:
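Conceptually, the divergence check walks backward through the log generations until the two copies agree. A simplified sketch of the idea only (the paths are assumptions, and this is not how Exchange actually implements the comparison):

# Walk back from the newest generation until both copies hold an identical log file
$gen = 10
while ($gen -gt 0) {
    $active  = Get-FileHash ("\\EX2\DB1Logs\E{0:D10}.log" -f $gen)
    $passive = Get-FileHash ("\\EX1\DB1Logs\E{0:D10}.log" -f $gen)
    if ($active.Hash -eq $passive.Hash) { break }   # generations 1..$gen match
    $gen--
}
"Divergence begins at generation $($gen + 1)"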
I should point out that I am oversimplifying the complete Incremental Resync process, and that it is
more complicated than what I have described here; however, for purposes of this discussion only a
basic understanding is needed.
As we saw in this example, even though DB1\EX2 was missing four log files, it was still able to mount as the new active database copy because the number of missing log files was within EX2's configured value for AutoDatabaseMountDial. And we also saw that, in order to correct divergence in the log stream after a lossy failover, the Incremental Resync function threw away four log files.

But the fact that both operations dealt with four log files does not make them related, nor does it mean that the system is throwing away log files based on the AutoDatabaseMountDial setting.
To help understand why these are really not related functions, and why AutoDatabaseMountDial does not throw away log files, consider the failure scenario itself. AutoDatabaseMountDial simply determines whether a database copy will mount during activation based on the number of missing log files. The key here is the word missing. We're talking about log files that have not been replicated to this activated copy. If they have not been replicated, they don't exist on this copy, and therefore, they cannot be thrown away. You can't throw away something you don't have.
It is also important to understand that the Incremental Resync process can only work if the previous active copy is still viable. In our example, someone accidentally shut down the server, and typically, that act should not adversely affect the mailbox database or its log stream. Thus, it left the original active copy intact and viable, making it a great candidate for Incremental Resync.

But let's say instead that the failure was actually a storage failure, and that we've lost DB1\EX1 altogether. Without a viable database, Incremental Resync can't help here, and all you can do to recover is to perform a reseed operation.
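A reseed is a full recopy of the database from the active copy; a sketch, using the DB1\EX1 copy from this example:

# Suspend the failed copy, then reseed it, overwriting any files left behind
Suspend-MailboxDatabaseCopy DB1\EX1 -Confirm:$false
Update-MailboxDatabaseCopy DB1\EX1 -DeleteExistingFiles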
So, as you can see:
AutoDatabaseMountDial does not control how many log files the system throws away
AutoDatabaseMountDial is a completely separate process that does not require Incremental
Resync to be available or successful
Incremental Resync throws away log files as part of its divergence correction mechanism, but
does not lose messages as a result of doing so
"A Hub Transport server with 16 GB of memory runs twice as slow as a Hub Transport server with 8 GB of memory, and the Exchange 2010 server roles were optimized to run with only 4 or 8 GB of memory."
This misconception isn't directly related to high availability, per se, but because scalability and cost all factor into any Exchange high availability solution, it's important to discuss this, as well, so that you can be confident that your servers are sized appropriately and that you have the proper server role ratio.

It is also important to address this misconception because it's blatantly wrong. You can read our recommendations for memory and processors for all server roles and multi-role servers in TechNet. At no time have we ever said to limit memory to 8 GB or less on a Hub Transport or Client Access server. In fact, examining our published guidance will show you that the exact opposite is true.

Consider the recommended maximum number of processor cores we state that you should have for a Client Access or Hub Transport server. It's 12. Now consider that our memory guidance for Client Access servers is 2 GB per core and for Hub Transport it is 1 GB per core. Thus, if you have a 12-core
Client Access server, you'd install 24 GB of memory, and if you had a 12-core Hub Transport server, you would install 12 GB of memory.
Exchange 2010 is a high-performance, highly-scalable, resource-efficient, enterprise-class application. In
this 64-bit world of ever-increasing socket and core count and memory slots, of course Exchange 2010
is designed to handle much more than 4-8 GB of memory.
Microsoft's internal IT department (MSIT) knows first-hand how well Exchange 2010 scales beyond 8 GB. As detailed in the white paper Exchange Server 2010 Design and Architecture at Microsoft: How Microsoft IT Deployed Exchange Server 2010, MSIT deployed single-role Hub Transport and Client Access servers with 16 GB of memory.
It has been suggested that a possible basis for this misconception is a statement we have in
Understanding Memory Configurations and Exchange Performance on TechNet that reads as
follows:
Blogcasts

High Availability in Exchange Server 2010, Part 1
High Availability in Exchange Server 2010, Part 2
High Availability in Exchange Server 2010, Part 3
High Availability in Exchange Server 2010, Part 4
TechNet Documentation
Understanding High Availability and Site Resilience
Understanding Database Availability Groups
Understanding Mailbox Database Copies
TechEd Presentations

EXL312 Designing Microsoft Exchange 2010 Mailbox High Availability for Failure Domains (TechEd North America 2011)
EXL327 Real-World Site Resilience Design in Microsoft Exchange Server 2010 (TechEd North America 2011)
EXL401 Exchange Server 2010 High Availability Management and Operations (TechEd North America 2011)
UNC401 Microsoft Exchange Server 2010: High Availability Deep Dive (including changes introduced by SP1) (TechEd Europe 2010)
UNC302 Exchange Server 2010 SP1 High Availability Design Considerations (TechEd New Zealand 2010)
Scott Schnoll
Comments
Andrew, thanks for the kind words. Please do re-read the part where I state:
- AutoDatabaseMountDial is a completely separate process that does not require Incremental Resync
to be available or successful
- Incremental Resync throws away log files as part of its divergence correction mechanism, but does
not lose messages as a result of doing so
Remember, Active Manager consults the value for ADMD before issuing a mount request during Best Copy Selection. Once that operation is complete, the value for ADMD is not used by the copy being activated, nor is it used by the Incremental Resync process. Again, the Incremental Resync process will only occur if the previous active copy is still viable and can be resynchronized. And it is only in that case where logs will be discarded.
You may be confusing the value of the ADMD setting with the actual number of logs lost. Consider
the above example where ADMD is set to BestAvailability, which translates to 12 or fewer log files.
When Incremental Resync runs, that does not mean it can discard up to 12 log files. It will only
discard diverged log files, and in our example, there were only 4 that were diverged. Thus, only 4
are discarded, even though ADMD is configured with a value of up to 12.
To illustrate further, consider the same example scenario I used in the blog post, but instead of
losing 4 log files, the copy queue length was actually 20 log files. Since ADMD is set to
BestAvailability, the passive copy will not automatically mount. So, the administrator forces a mount
of the database to restore client access. When the server hosting the previous active copy is
restarted, Incremental Resync will perform its divergence check and discard 20 log files, several more
than what ADMD was set to.
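The forced mount described above can be performed by overriding the dial at activation time; a sketch, using the DB1 and EX2 names from the post (the specific override value is an assumption about what an administrator might choose):

# Activate the copy even though the copy queue exceeds the configured dial
Move-ActiveMailboxDatabase DB1 -ActivateOnServer EX2 -MountDialOverride BestEffort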
Bruce, I would be very interested to hear all of the details of your configuration, the failure scenario
and your test cases. If you would like, please email the information directly to me. My email address
is in the above blog post.
Paul, Rajiv, Peter, Tim and Michel, thanks for the kind words. BTW, Tim, I did not deliver the HA failure domain session. That was delivered by Ross Smith IV, and it's an excellent session!

Bill, thanks for the comment. As Bruce implied in his note, and as I later confirmed when he sent me more information offline, if you have multiple datacenters, but you have sufficient connectivity where they are treated as a single datacenter from an AD perspective and from a namespace perspective, then that scenario is analogous to a DAG that exists in a single datacenter. In that case, the datacenter switchover scenario does not apply, regardless of the number of well-connected datacenters. In the context of this blog, datacenter equals AD site.
As for the PFE content, feel free to send me a copy of it via email and I can have it corrected. I'm not
familiar with the specific training you mention, but I would like to determine the source of the content
and have it corrected.
ajm88, thanks for the kind words. There are a couple of bugs in Exchange 2010 SP1 IPD that we are
working to get corrected. I don't have an ETA for the updated version, but we will announce it on
this blog when it's published. Sorry for any confusion the IPD doc bugs may have caused.
The links to the HA blogcasts should be working now. Sorry for the inconvenience.