Contents

Data Protection and Disaster Recovery Tips

Chapter 1: Disaster Preparedness and You
by Paul Robichaux
  6 Common Backup and Restore Mistakes
    Using the Wrong Backup Method
    Not Verifying Backups
    Mismanaging the Transaction Logs
    Not Allowing Enough Time
    Forgetting the Small Stuff
    Not Practicing
  Spend Time, Not Money
  Sidebar: Setting Up a Secure Offsite Backup

Chapter 5: Putting Together Your High-Availability Puzzle
Improve system, database, and data availability with SQL Server 2005
by Kalen Delaney and Ron Talmage
  Failover Clustering
  Server vs. Data Redundancy
  Database Mirroring
  Mirroring Restrictions
  Log Shipping
  Replication
    Merge Replication
    Transactional Replication
    Peer-to-Peer Transactional Replication
  Availability in a Highly Concurrent Environment
    Snapshot Isolation
    Online Index Creation
  Faster Restoring
  Database Snapshots
  Final Words
Chapter 1:
I’m writing this column from Nice, France, which is a beautiful city. And, so far, no one has laughed
at my attempts to dust off my rusty college French. However, what should have been a perfect trip has
been haunted by the ghost of disasters, both past and future.
First, the past. Not too long ago, the 100th anniversary of the great San Francisco earthquake of
1906 rolled around. San Francisco and its surrounding area were uniquely vulnerable to this earthquake
because of a variety of factors, including prevailing construction methods, soil composition, and the lack
of effective firefighting capability. As you probably know, the fault systems that underlie the Bay Area
(and their companion faults in the Puget Sound area) are overdue for a major earthquake, and that’s
worrisome.
Second, I’ve been reading a scary book, “Fifty Degrees Below,” in which science fiction
author Kim Stanley Robinson describes some of the possible outcomes of abrupt climate change.
Those outcomes include destructive weather events that are practically Biblical in scale, along with
desperate efforts to mitigate the climate change and retool the economy. Whether or not you agree that
global warming is real, the historical record of abrupt climate change—and the lasting aftereffects—is
abundantly clear.
These two things have little in common except this: Both point out the need for effective disaster
recovery for your Exchange organization, and “effective” in this context implies thorough and accurate
preparation. As hurricane season approaches, there are lots of nervous folks along the Gulf Coast, in
Florida, and along the eastern seaboard of the United States, but they’re already preparing. What about
your own organization?
I don’t have space here to list every step you might conceivably take to protect your Exchange
organization, but I can point out a few high-value things that you should be sure to include in your
planning:
1. Have a bug-out plan. If a disaster hit your business, how would you get away from the area?
How would you decide when it was time to go? How would you tell your employees not to come
to work? In fact, how would you make the decision to shut down or relocate operations?
2. Keep communicating. How would management and employees communicate until your
email service could be reestablished? Who’s in charge of establishing and maintaining disaster
communications?
3. Grab your gear and go. One of my customers implemented its disaster recovery plan for
Hurricane Katrina by shutting down the Exchange server, pulling all the disks from the storage
enclosure, and taking them by car to Houston. This was an ingenious and effective solution,
given the circumstances. What would you do under similar circumstances?
4. Now is always better than later. It’s better to have a fair solution now than a perfect solution
later. Of course, this doesn’t mean that you should rush out and slap together a disaster-
preparedness strategy out of whatever random products and technologies you can find. It
does, however, mean that you should push disaster recovery and preparedness planning to the
forefront of your list of operational concerns.
You can’t anticipate every possible disaster, but you don’t have to. The responses to many
disasters will be the same; you can make plans based on the expected duration of recovery, the impact
of the disaster on your facilities and the surrounding area, and other factors. Even if you don’t live in a
disaster-prone area (I don’t; the biggest threat in northwest Ohio is apparently highway construction),
you should still be prepared for things such as structure fires, major traffic accidents (what if a gasoline
tanker blew up nearby? That happened at my wedding!), and so on.
The Boy Scouts say “Be prepared,” but I like the US Coast Guard’s motto better: “Semper Paratus,”
which is Latin for “always ready.”
Nothing compares with the sinking feeling you experience when you need to restore data from a
backup but can’t for some reason. Most computer users have this experience eventually; the pain is
even more acute and frequent for administrators, who are responsible for large amounts of important
business data. Although backup and restore technologies have advanced in the past few years, you
probably still use them only as last-ditch safety mechanisms. When all else fails, you try to restore from
backup. For this alternative to be viable, you must have a degree of confidence that your data will be
available and readable when you need it. However, Exchange administrators make several common
mistakes that prevent their backup and recovery operations from running smoothly.
Although generating offline backups is more time consuming than generating online backups, many
administrators prefer the extra safety of having a periodic offline backup in addition to routine
production backups.
If one of these three checks fails, you should be able to determine the cause of the backup failure
and therefore fix the problem. For example, during an online backup, Exchange computes a checksum
for each page and compares it with that page’s checksum on disk. If the checksums don’t match, you
receive a 1018 error and the backup stops. Checking your backups regularly would alert you to such an
error and give you a chance to fix the problem before you actually need the backup.
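The page-verification idea is simple to sketch. The following Python fragment is an illustration only: the XOR-over-words checksum and the 4KB page size are assumptions chosen for demonstration, not ESE’s actual checksum algorithm.

```python
def page_checksum(page_data: bytes) -> int:
    """Illustrative XOR checksum over 4-byte words
    (ESE's real algorithm differs)."""
    checksum = 0
    for i in range(0, len(page_data), 4):
        checksum ^= int.from_bytes(page_data[i:i + 4], "little")
    return checksum

def verify_page(page_data: bytes, stored_checksum: int) -> bool:
    """True if the computed checksum matches the stored one; a
    mismatch is what surfaces as a 1018 error during an online backup."""
    return page_checksum(page_data) == stored_checksum

page = bytes(range(16)) * 256          # dummy 4KB "page"
stored = page_checksum(page)
print(verify_page(page, stored))                 # -> True
print(verify_page(b"\xff" + page[1:], stored))   # -> False
```

The point of the sketch is only that a single flipped byte changes the computed value, so the comparison against the stored checksum catches the corruption and stops the backup.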
Even if your backups are working now, don’t get complacent. Changing your environment, backup
software, Windows configuration, or Exchange configuration might make your backups fail in the
future. Check your backups regularly for the best protection. The fastest and simplest way to check
your backups to be sure they work is to check the Application event log and the report that your
backup program generates. Check the Application event log to ensure that Exchange didn’t generate any
errors during the backup period. Check the backup program report to verify that the backup program
didn’t skip any files and that no errors occurred.
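That report check is easy to automate. Here’s a minimal Python sketch; the report format and the keyword list are assumptions, since every backup product writes its own log format.

```python
def check_backup_report(report_text):
    """Return report lines that suggest trouble: skipped files,
    errors, warnings, or failures (keyword list is an assumption)."""
    suspect = ("skipped", "error", "warning", "failed")
    return [line.strip()
            for line in report_text.splitlines()
            if any(kw in line.lower() for kw in suspect)]

sample = """Backup of \\\\MAILSRV\\D$ started 02:00
Processed 41,322 files
WARNING: file priv1.stm skipped - in use
Backup completed 03:12"""

for line in check_backup_report(sample):
    print(line)   # -> WARNING: file priv1.stm skipped - in use
```

A nightly pass like this over the report, with the results mailed to an administrator, turns “check the backup report” from a chore into a habit you can’t forget.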
Online backups automatically include the log files as part of the backup data set. During normal
operation, Exchange continues to create new log files as transactions occur. These log files remain on
disk until you perform a full or an incremental online backup, at which point the Exchange IS process
truncates or removes the files. Don’t remove log files yourself. In some circumstances, you might
need to copy the log files to a separate directory for safekeeping. In “Offline Backup and Restoration
Procedures for Exchange” (http://support.microsoft.com/?kbid=296788), Microsoft recommends saving
copies of the transaction logs in a separate location before attempting to recover data from an offline
backup.
When you use NTBackup to perform a restore, the logs don’t play back unless you select the Last
restore set check box (or the equivalent check box in another backup program). The database you restore
isn’t mountable unless you select this option, or unless you use the Eseutil /r command to manually start
a log playback.
If your transaction logs are missing or any of your log files are damaged, Microsoft’s free Exchange
Server Disaster Recovery Analyzer (ExDRA) might be helpful. This tool can analyze a dismounted
database, tell you which log files are present and which are missing, and give you options for fixing any
problems it finds. ExDRA can be valuable if you experience an unexpected restore failure, although
it’s no substitute for understanding the disaster-recovery process and consulting Microsoft Customer
Service and Support (CSS) or other experts when necessary.
This list isn’t trivial; if a problem occurs at any stage in the process, your recovery operation won’t
proceed through the successive steps. The more restores you perform, the more smoothly they’ll go.
You’ll be able to accurately estimate how long a restore will take, and you’ll become familiar with and
learn how to solve the types of problems that are common in your environment.
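Once practice restores give you a measured throughput figure, the estimate itself is simple arithmetic. Here’s a sketch; the throughput and overhead numbers are placeholders you’d replace with values from your own test restores, not figures from this chapter.

```python
def estimate_restore_minutes(database_gb, measured_mb_per_sec,
                             overhead_minutes=15.0):
    """Estimate restore duration from throughput measured in test restores.

    overhead_minutes is a placeholder for tape load, log replay, and
    mounting time; refine it from your own practice runs.
    """
    transfer_minutes = database_gb * 1024 / measured_mb_per_sec / 60
    return transfer_minutes + overhead_minutes

# Example: a 60GB store restored at a measured 40MB/sec
print(round(estimate_restore_minutes(60, 40)))  # -> 41
```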
To avoid the mistakes I’ve discussed, practice backup and recovery in your environment, and continually
monitor your processes. Then, when you experience a failure, you’ll be ready to put your skills to work.
What’s the simplest way to set up a secure, automatic, offsite backup process for files on a server?
The simplest way would be to use an Internet-based backup service such as NetMass. Internet-based backup services
use a local agent to compress and encrypt your files, then transmit them to a data center. I’ve used NetMass, and it
was a lifesaver. However, such services can be costly for companies with many gigabytes of data, and some compa-
nies are unwilling to put their data into someone else’s hands.
The next-simplest option would be to implement Microsoft System Center Data Protection Manager (DPM) 2006,
which automatically maintains multiple versions of files and lets users restore files themselves without involving the
administrator. But DPM can also be costly, and it requires a SQL Server license.
I had a client who wanted secure offsite backups for about 300GB of data but couldn’t afford DPM and SQL Server. I
fulfilled that client’s needs with one additional PC and a Windows Server 2003 Release 2 (R2) license. I set up the new
Windows server to serve as the backup server. After connecting the backup server to the company’s domain, I set up
DFS to replicate data from the company’s main servers to the backup server.
After the backup server completed the initial replication, we moved it to an offsite location. Next, I configured the
backup server to automatically establish an L2TP VPN connection to a server at the company’s main office by using
RRAS on both servers. Over the persistent VPN connection, DFS keeps the files on the backup server up-to-date with
changes on the main servers, usually within seconds.
To preserve the ability to restore a version of a file from several days earlier, I advised the client to run a full backup of
the files on the backup server to an archive disk drive once a week. Each of the other nights of the week, the backup
server performs an incremental backup to the backup drive. This arrangement lets users restore any version of a file
that’s up to seven days old. Periodically, at the client’s request, I copy the files from the archive disk drive to a USB
drive for long-term archiving.
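The rotation described above reduces to a simple schedule rule. As a sketch (the choice of which night gets the full backup is an assumption; the sidebar doesn’t name the day):

```python
def backup_type_for(weekday, full_backup_day=5):
    """Return the backup type for a weekday (0=Monday .. 6=Sunday).

    One full backup a week to the archive drive plus incrementals the
    other six nights lets users reach any version of a file up to seven
    days old. Saturday (5) as the full-backup night is an assumption.
    """
    return "full" if weekday == full_backup_day else "incremental"

week = [backup_type_for(d) for d in range(7)]
print(week.count("full"), week.count("incremental"))  # -> 1 6
```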
If you’re going to use DFS for remote backups, you’ll find the DFS enhancements in Windows 2003 R2 to be worth
the investment. DFS on Windows 2003 R2 is more stable and efficient than on Windows 2003 and is easy to manage
and troubleshoot.
Chapter 2:
I recently received a call from a client saying that a remote server in San Francisco had Microsoft
Exchange Server databases that wouldn’t mount. This particular server had lately become unstable,
freezing every few weeks. The server had frozen again and the administrator rebooted it. Although the
server came up, the Exchange private and public databases refused to mount. Usually when both stores
refuse to mount on a server, the problem is server related and not necessarily related to the Exchange
databases themselves.
But, because this was a remote server and I wanted to get it up and running as quickly as possible,
I tried to run Eseutil against both the private and public databases. Unfortunately, the databases didn’t
mount after running Eseutil. This problem occurred when I was speaking at the WinConnections
conference in San Diego. I discussed the situation with the client and the company decided to order a
new server to replace the server that was unstable. The server hardware was going to take a few days to
arrive, so I scheduled a trip to San Francisco immediately following the conference.
Fortunately, this client has several Exchange servers on its WAN. All of the mailboxes were deleted
on the San Francisco server by using the Microsoft Management Console (MMC) Active Directory Users
and Computers snap-in, and the users’ mailboxes were recreated on a local server in Los Angeles. This
setup allowed the remote users to at least send and receive new mail until the new server could be
installed. Users took a performance hit because their mail resided on a remote server, but slow mail is
better than no mail.
When I arrived, the San Francisco server was installed with Windows Server 2003 and Exchange
Server 2003. After running a few tests to verify the server was functioning properly, I used Active
Directory Users and Computers to move the San Francisco users’ mailboxes from the Los Angeles server
to the new server. Fortunately these mailboxes were relatively small because the users had been using
the new mailboxes on the Los Angeles server for only a week, so even over the WAN the mailbox move
took only about 1 hour. Now that the mail was located on the correct local server, I needed to recover
all the messages prior to the Exchange server crash. Originally I had planned to use the Recovery
Storage Group feature in Exchange 2003 (the original server was running Exchange 2000, which the
feature supports), but because the mailboxes had been deleted and recreated on the Los Angeles server,
the new mailboxes had new Globally Unique Identifiers (GUIDs), and the original GUIDs assigned to the
San Francisco mailboxes prior to the Exchange crash were gone.
As you might know, the mailbox GUIDs must be consistent when using a Recovery Storage Group
or you’ll receive an error when you try to run ExMerge to export the mailbox information to a .pst file.
At this point, I had a couple of options: I could try to get the original Exchange 2000 server up and
running again, restore the original database, then merge the information; I could purchase an Exchange
recovery tool from a third-party vendor; or I could try to somehow get the information out of the old
store to merge it with the new mailbox information. After discussing the situation with the client, we
decided on the last option.
Knowing that I needed to merge the new and old mailbox information, I used Exmerge to export
all the new mailbox information to .pst files. I then created a Recovery Storage Group on the new server
and copied the information store databases from the old server, then mounted and dismounted the store
(this process proved that the old database files were OK). This ensured that I had a database that was in
a consistent state. Using Exchange System Manager (ESM), I opened the properties of the First Storage
Group, Private Information Store Database properties and selected the “Database could be overwritten
with a Restore” checkbox and dismounted the store. I renamed the original store databases and copied
over the old private information store database files (priv1.edb and priv1.stm) from the Recovery
Storage Group directory to the live mdbdata directory on the new server. I made sure that the old store
database file names were the same as the new database file names (priv1.edb and priv1.stm). Then I
mounted the store databases from the old server on the new server and had all the San Francisco users
access their mailboxes.
At this point, the mailboxes looked empty because they still had the new GUIDs assigned to their
mailboxes in Active Directory (AD). I took this approach to create “dummy” mailboxes on the server
so I could delete the dummy mailboxes and reconnect the old mailboxes to the corresponding user
in AD via ESM. After all the San Francisco users accessed their mailboxes, I went into the ESM and
deleted the empty mailboxes, reconnected the users to the old mailboxes, and modified the rights on
the mailbox to ensure that the user and mail administrator had full rights to their old mailboxes. Now
users had restored mailboxes in a state just before the server crashed. I ran Exmerge and merged all
the new mailbox information from the .pst files I had previously exported into the original mailboxes.
Now each user had a complete set of information in their mailboxes, both pre- and post-crash. After
AD replication completed, I had to refresh the mail profile on all the workstations by deleting the
profile and recreating it. After these steps, the users were able to access their mailboxes with a complete
information set. I restored the Public folder database from the old server so users could access the
Public folders that previously resided on the old server.
Fortunately, no additional Public folders were created after the crash, so I didn’t need to worry
about merging new Public folder information on the new server. I did run into some Public folder
rights concerns, and I had to reassign rights to certain Public folders. Consider using the PFDavAdmin
utility to reassign Public folder rights if you have problems assigning rights via the ESM.
The above process allowed me to restore all the users’ mailbox information. Fortunately this
remote office had a relatively small number of users (20), so it wasn’t too much work to recover
the information. The users were happy to get back all their old mail and restore the original mail
performance now that they were accessing their mail from a local server.
TIP: With Exchange Server 2003 Service Pack 2 (SP2) and Exchange 2003 Standard Edition, you can
now have a mail store of 75GB, up from 16GB in earlier versions.
As I write this, most of the local waterways are well over flood stage (as much as 11 feet above crest). In addition,
the Federal Emergency Management Agency (FEMA) has yet to show up with debit cards and trailers, and many
local businesses now find themselves completely flooded out. Some locations along the river are seeing water hit the
second floor of some low-lying buildings. As my office sits between two major flood areas, the current weather pat-
terns give me reason to think about my disaster recovery plans despite no immediate danger.
Although I’m completely comfortable with my data backups, I realize that I lack any sort of business continuity/disaster
recovery plan. What happens if my office gets flooded or simply loses power for a long period of time due to flooding
elsewhere? Fortunately, floods are rarely a surprise, and if necessary, I know that I could pack up my office and move it
elsewhere—which would be a major aggravation given the amount of hardware involved—but it could be done.
Because an ISP hosts my Web and email services, I don’t have to concern myself with customers that have issues
with those services while my office location is being moved or temporarily out of service. However, like many small
businesses, I don’t maintain an offsite backup of my critical data (though, in the past, I’ve covered services that offer
real-time backup of local servers to remote sites). The ongoing weather situation here has forced me to rethink this
practice. Even without a major disaster that destroys my upper-floor office, I could easily be in a situation in which
flooding prevents me from accessing my office and the data contained therein.
One of my habits, however, would help me alleviate some of the potential problems. I keep current copies of my in-
progress projects online, stored in password-protected archives in password-protected directories on one of my Web
servers. I update these archives as the projects progress and could continue with any of the projects without access
to my office as long as I have access to a computer that has Microsoft Office and Internet access. But I wouldn’t have
access to the relatively huge amount of historical data I retain, nor any of the more specialized applications that I run.
For this reason, I’ve given serious thought to one of two solutions: tape backup or a rack-mountable hard disk-based
backup appliance. Tape backup lets me do the traditional tape rotation with offsite storage, and the costs are fairly low.
However, tape backup also means a change in workflow, and my work style doesn’t lend itself to the workflow that
would allow me to use tape as my crisis solution.
With a hard disk-based appliance, I can automate the backup process so that a mirror of my office data is always
available in a single device that, although not exactly portable, is small enough to pick up and move if necessary. This
would provide minimal interference with my existing workflow and let me get up and running at an alternate location
with minimal trouble.
I still need to develop a business continuity plan in case a disaster destroys my office and its contents, but the added
security of maintaining a movable image of my office is a good place to start.
Chapter 3:
Microsoft SharePoint technologies support information sharing among groups of people within or across
companies. (In this article, we refer to Microsoft SharePoint Portal Server 2003 as Portal Server and
Windows SharePoint Services 2.0 as SharePoint Services. When referring to both, we use SharePoint.)
SharePoint provides document-management support through two types of file repositories: document
libraries and lists. Files are stored in a SharePoint content database, not in the Windows file system.
One glaring omission in SharePoint’s document-management capabilities is that you can’t easily
restore files accidentally deleted. In addition, although versioning is available, Portal Server and
SharePoint Services remove all version history after you delete an item. Besides these file-restore
problems, SharePoint has a relatively weak backup engine, which doesn’t give you the option of
performing single-file restores from a backup.
Despite SharePoint’s backup and restore gaps, workable solutions do exist. We’ll show you how
to automate the SharePoint backup process by using a script we’ve provided, then we’ll discuss some
approaches for supporting file restores.
Tool Options
If maintaining a second SharePoint environment containing a previous version of the production
environment is viable for your organization, you can perform full backup and restore operations. For a
Portal Server implementation, you can use the SharePoint Portal Server Data Backup and Restore utility
(spsbackup.exe). For a pure SharePoint Services implementation that doesn’t contain Portal Server,
you can use the SharePoint administration utility (stsadm.exe) or Microsoft SQL Server tools such as
OSQL (osql.exe) to perform backup and restore. For more information about backing up and restoring
SharePoint Services, see the Microsoft article “How to back up and restore installations of Windows
SharePoint Services that use Microsoft SQL Server 2000 Desktop Engine (Windows)”
(http://support.microsoft.com/?kbid=833797).
Although you can use Stsadm to back up and restore Portal Server, using the tool in this way has
limitations. For information about those limitations, see the Microsoft article “Supported scenarios for
using the Stsadm.exe command-line tool to back up and to restore Windows SharePoint Services Web
sites and personal sites in SharePoint Portal Server 2003” (http://support.microsoft.com/?kbid=889236).
Backup and restore functionality is also available in various object models. Developers (Microsoft or otherwise) can write cmdlets to automate
SharePoint administrative tasks by leveraging either the Portal Server or SharePoint Services .NET
object models.
After you’ve customized your script, you can add it as a scheduled task to your SharePoint server,
where you’ll execute the script by using the AT command-line utility or the Scheduled Task Wizard,
which you can launch from the Scheduled Tasks icon below the System Tools program group. The
wizard will walk you through the scheduling process. Alternatively, you can type AT /? at the command
line for help in using the AT command scheduler.
After you create the backups, you can do a full restore of a Portal Server backup by using the
Spsbackup utility either from the utility’s graphical interface or at the command line. You can see the
graphical interface for Spsbackup if you open Spsbackup from the Portal Server program group or if
you run Spsbackup from the command line without specifying any switches.
Restore Strategies
Now that you have a backup procedure, you might want to look at three choices for building your
restore strategy: the snapshot, ad hoc, and hybrid methods. In the snapshot method, you create a restore
environment that’s an earlier image of the production portal. You run periodic backup and restore
operations to get a snapshot of the portal at an earlier time. An obvious downside to this method is that
you can’t restore files prior to the date of the restored environment. This deficiency is solved in the ad
hoc method, where you can regularly create and archive backups and conduct restore operations as the
need arises. If you require a file that dates back three months, all you need is the backup file from that
time period to prepare an ad hoc restore and retrieve the file. However, depending upon the frequency
of restore requests, the ad hoc method might be a poor solution because restores can take a long time to
finish.
In the hybrid method, you combine the first two methods by maintaining a snapshot with the
option of restoring an ad hoc environment upon request. You can overwrite the mirrored environment
with the ad hoc restore or maintain a third server just for on-demand restores. The size of your
organization, administration team, and infrastructure and the number of restore requests can influence
whether any of these approaches will work for you.
Additional Approaches
Other approaches you can take within a second portal environment don’t require doing a complete
restore of the portal databases to find deleted data. If you can narrow critical document libraries and
lists to those found within a few site collections, sites, or subsites, you can back up and restore smaller
sections of the portal more frequently, on a periodic or an ad hoc basis. In this partial file-restore
approach, you isolate backups and restores at the SharePoint Services level by using the smigrate.exe
command-line migration tool or the stsadm.exe command-line administrative tool to mirror smaller site
structures.
Running stsadm.exe with the -o backup and -o restore options lets you back up and restore site
collections from one environment to another; smigrate.exe provides similar functionality for individual
sites. You can find more information about these tools in the Microsoft Office SharePoint Portal Server
Solutions Snapshot
PROBLEM:
SharePoint provides no easy way to recover deleted files.
SOLUTION:
Recover files by running an automated backup procedure and using any of several file-restore strategies.
WHAT YOU NEED:
SharePoint Portal Server and Windows SharePoint Services; SharePoint secondary environment (physical or virtual);
automated-backup script
DIFFICULTY:
3 out of 5
SOLUTION STEPS:
Create a secondary portal environment (physical or virtual).
Create a script to automate backups.
Pick a restore strategy.
Restore files.
' Get today's date.
strDate = Replace(Date(), "/", "_")

' Provide a destination folder for the backup that points to a network
' share, and append the name with today's date.
strDestinationFolder = CreateFolder("\\usrds005\SPBackups\" & strDate)

' Build the spsbackup parameter string.
strParam = "/all /file " & strDestinationFolder

' Build the spsbackup command string.
' Note: spsbackup is located on the C drive in a default installation.
strExec = "C:\Progra~1\ShareP~1\Bin\spsbackup.exe"

' Use the Exec method to run spsbackup.
Set WshShell = CreateObject("WScript.Shell")
Set objExec = WshShell.Exec(strExec & " " & strParam)

' Create a folder and return its path.
Function CreateFolder(folderName)
  Set fso = CreateObject("Scripting.FileSystemObject")
  Set f = fso.CreateFolder(folderName)
  CreateFolder = f.Path
End Function
Chapter 4:
You know that someday disaster could strike at your Exchange environment—probably at the worst
possible time. Regardless of whether your Exchange organization is large or small, losing mail services
has a big impact on your business. These six tips will help you in designing, planning, testing, and
implementing an Exchange-specific disaster recovery plan.
Also, you should regularly extract AD user information, such as email addresses, by using a utility
such as LDIFDE or CSVDE and add this information to the kit. For example, you’d use the following
command to export directory objects, including mail addresses:
ldifde -f C:\export.ldf -v
Tip 4: Include AD in Your Recovery Plan
In many cases, recovering Exchange also means recovering Active Directory (AD). Small companies
often have only one server for both Exchange and AD, and even in very large environments, a minor
mistake in AD can have consequences for the complete Exchange and AD configuration. Since
Exchange Server 2003 and Exchange 2000 Server rely heavily on AD, make sure you frequently back
up your domain controller's (DC's) system state, which includes AD, the registry, boot files, certificate
services, Microsoft IIS, COM+, and Sysvol information. Perform system-state backups at least as often as
you back up Exchange.
Thoroughly check and test your system-state backup and restore capabilities and make sure that
the NTDS and Sysvol volumes have enough space to perform a complete system-state restore. I’ve
seen restores of Global Catalogs (GCs) larger than 2GB fail on disks with more than 2GB of free space.
Make sure that your recovery plan includes procedures to restore AD both authoritatively and non-
authoritatively. For instance, deleting or changing important directory objects in AD in a multiple-DC
environment will require you to perform an authoritative AD restore, whereas you’d want to use the
non-authoritative restore to recover a DC that failed completely because of hardware errors. For more
information about AD backup and restore procedures, see the Microsoft TechNet “Active Directory
Operations Guide,” http://technet2.microsoft.com/windowsserver/en/library/9c6e4dd4-3877-4100-a8e2-
5c60c5e19bb01033.mspx.
which checks the higher-level IS database-table–structure integrity (replace ServerName with the name
of your Exchange server), and
eseutil /g
which checks the physical database pages. ExDRA will run similar commands for you to check IS integrity
and database consistency, then give suggestions and examples for fixing the problems.
At the outset of your database-mounting problems, if you don’t suspect Windows system problems,
full disks, or viruses as the cause, restart the Information Store service or the complete Exchange
server. During the Information Store service startup, selecting the soft recovery option—which checks
database consistency and replays uncommitted transaction logs into the database—could fix the
problems automatically. The Microsoft article “How to identify logical corruption problems in Exchange
Server,” http://support.microsoft.com/?kbid=828068, provides more information about troubleshooting
Exchange database-corruption problems and is a useful addition to your disaster recovery kit.
Chapter 5: Putting Together Your High-Availability Puzzle
With every release of SQL Server, Microsoft has emphasized one area of technology. For SQL Server
7.0, that area was scalability; for SQL Server 2000, it was security. For SQL Server 2005, the emphasis is
system and database availability. Microsoft has not only added one completely new technology, database
mirroring, to achieve higher availability, but also substantially improved existing availability features.
SQL Server 2005 provides four high-availability technologies: failover clustering and database
mirroring, both with supported automatic failover; and log shipping and replication, with either manual
or custom-coded failover. Because Microsoft supports automatic failover for both failover clustering
and database mirroring, they’re clearly the technologies of choice to maximize uptime. If you don’t need
automatic failover or you’re willing to custom-code your automatic failover processes, log shipping and
replication might provide the availability you need.
These four availability solutions address system and database failures. However, Microsoft has
also addressed another aspect of availability in SQL Server 2005: the availability of data in a highly
concurrent system. If you can’t access the data you need because another process has it locked, you have
an availability problem. Microsoft has added several new features to support data availability in highly
concurrent environments, including snapshot isolation and online index building.
In addition, some enhancements to the database restore process can make your data available
more quickly. Although you probably think first about restoring a database as part of recovery from a
failure, keep in mind that you might perform a database restore for other reasons, such as when you
move to new hardware or create a test system with data from an earlier backup. Two new features
that make your data available more quickly during a restore are online recovery and fast recovery (see
“Faster Restoring” in this article). Let’s look at what you can expect from these new and improved high-
availability features.
Failover Clustering
Of SQL Server’s high-availability solutions, failover clustering remains the technological leader. A
failover cluster consists of a set of redundant servers (called nodes) that share an external disk system.
Clustering requires special Windows software. In addition, to be eligible for Microsoft support,
your entire cluster configuration must be certified by Microsoft and listed in the Windows Catalog
in the cluster solution category. During a cluster failover, a virtual SQL Server instance moves from one
node to another.
As a result, a cluster failover appears to external applications as if the virtual SQL Server instance
is briefly unavailable (usually for less than a minute), then available again. The instance seemingly just
stops and restarts. Behind the scenes, an orderly process takes place quickly. One SQL Server instance
located on one physical server becomes unavailable. Windows closes the database data files that the
instance had open on a commonly shared disk space. Then, another SQL Server instance starts on
another physical server, opens the same data files, and takes over the virtual server name and virtual IP
address of the failed instance.
Database Mirroring
The most exciting new SQL Server 2005 high-availability feature is database mirroring. As discussed,
failover clustering, which provides server redundancy, doesn’t provide data-file redundancy. Although
database mirroring doesn’t provide server redundancy, it provides both database redundancy and data-
file redundancy.
When you set up database mirroring, you use two servers with a database that will be mirrored
from one to the other. The source server is called the principal server, and the database that you want
to protect is called the principal database. The other server, which receives mirrored data from the
source, is called the mirror server, and the copy of the principal database on it is called the mirror
database. When mirroring is up and running, the principal SQL Server 2005 instance transmits copies of
the principal database's transaction log activity to the mirror SQL Server 2005 instance. The copy of the
transaction log activity is written to the mirror database's log, then those transactions are executed on
the mirror database. The result is that the mirror database executes the same transaction log activity as
the principal, but slightly behind in time. It mirrors the principal's activity.
To enable automatic failover, you must specify that the transmission will be synchronous (with
SAFETY set to ON) and also specify a third observer SQL Server instance, called a witness. In
synchronous mode, the principal will wait for acknowledgment from the mirror that it has written the
mirrored log activity to disk before the principal moves ahead with the transaction. In the meantime,
the principal, mirror, and witness all communicate periodically, indicating their online status to each
other.
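As a sketch, the T-SQL to set up a synchronous mirroring session with a witness might look like the following. The database name, hostnames, and port number are illustrative, and each instance needs its own mirroring endpoint:

```sql
-- On each instance (principal, mirror, and witness), create a
-- database mirroring endpoint. Port 5022 is a common convention.
CREATE ENDPOINT Mirroring
    STATE = STARTED
    AS TCP (LISTENER_PORT = 5022)
    FOR DATABASE_MIRRORING (ROLE = ALL);

-- On the mirror instance: point the restored database at the principal.
ALTER DATABASE Sales SET PARTNER = 'TCP://principal.example.com:5022';

-- On the principal instance: name the mirror, require synchronous
-- operation (SAFETY FULL), and name the witness to enable
-- automatic failover.
ALTER DATABASE Sales SET PARTNER = 'TCP://mirror.example.com:5022';
ALTER DATABASE Sales SET PARTNER SAFETY FULL;
ALTER DATABASE Sales SET WITNESS = 'TCP://witness.example.com:5022';
```

In a production setup you'd also grant CONNECT permission on each endpoint to the service accounts of the partner instances.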
If the principal server suddenly fails, leaving both the mirror and witness servers still functional,
an automatic failover will occur. After the mirror server detects that the principal is no longer available,
the mirror server queries the witness to discover whether it detects the principal. If the witness also
can’t detect the principal, the mirror promotes itself to the principal role and brings its database online
as the new principal. The witness then records the presence of a new principal in the configuration.
If the old principal is then brought back online, the former principal finds that the old mirror is
now the new principal, and that it has been “outvoted.” The new principal and the witness agree that
the old principal is no longer the principal server. The old principal then takes on the mirror role and
starts receiving the new principal’s transaction log data. A database mirroring database failover can
occur in just a few seconds.
You can also enable the client to automatically redirect its connections if a failover occurs. If your
application connects to a principal database using ADO.NET or the Microsoft SQL Server Native Client
(SQL Native Client), the driver will automatically redirect connections when a database mirroring
failover occurs. You just specify the initial principal server and database in the connection string (and
optionally the failover partner server). If a mirroring failover occurs and your application attempts to
connect, the driver will detect the failover and redirect the connection to the former mirror server,
which is now the principal.
Mirroring Restrictions
When you set up database mirroring, the principal database must be in the Full recovery model
and the mirror database must be restored with NORECOVERY. Therefore, you can’t read from the
mirror database, although you can make a database snapshot of it on the mirror server. The principal,
mirror, and witness must all be distinct SQL Server instances: you can't mirror a database on a single
SQL Server instance. Related to that restriction, the principal and mirror databases must have the
same name, and you can mirror only from one principal database to one mirror database. (However, a
server that’s a principal for one database can be a mirror in a different mirroring session for a different
database.)
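The restriction that the mirror must be restored WITH NORECOVERY, and the workaround of snapshotting it for read access, can be sketched as follows (backup paths and the logical file name are illustrative):

```sql
-- On the mirror server: restore the principal's backups WITH NORECOVERY
-- so the database can continue accepting mirrored log records.
RESTORE DATABASE Sales
    FROM DISK = '\\backupshare\Sales_full.bak'
    WITH NORECOVERY;
RESTORE LOG Sales
    FROM DISK = '\\backupshare\Sales_log.trn'
    WITH NORECOVERY;

-- You can't query the mirror directly, but you can create a database
-- snapshot of it and query the snapshot instead.
CREATE DATABASE Sales_Report ON
    (NAME = Sales_Data, FILENAME = 'C:\Snapshots\Sales_Report.ss')
    AS SNAPSHOT OF Sales;
```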
Database mirroring requires either Enterprise Edition or Standard Edition for the principal and
mirror servers. The witness server, which is only an observer in a mirroring session, can be any edition
of SQL Server—including SQL Server 2005 Express Edition. The Standard Edition supports mirroring
only in synchronous mode (with SAFETY set to ON), whereas the Enterprise Edition also supports
mirroring in asynchronous mode.
What’s exciting about database mirroring is that it can provide very high availability, in most
scenarios failing over from one server to another in just a few seconds. This failover is automatic, just
like clustering, but much faster. And, unlike failover clustering, database mirroring doesn’t require
additional expensive and proprietary hardware for support. Database mirroring is supported on
commodity hardware and is easy to manage and monitor. As a result, in some cases, it can provide
higher availability than clustering at a significantly lower cost.
Of course, database mirroring provides redundancy only at the database level. Therefore, unlike
failover clustering, when you have a database mirroring failover, you must ensure that the mirror server
has all the proper logins, SQL Agent jobs, SQL Server Integration Services (SSIS) packages, and other
supporting components and configurations.
In addition, if you have a SQL Server instance with many interdependent databases, enabling
mirroring with automatic failover might not be appropriate. If only one database fails over, you could
end up with one database online on one server and all the other databases online on another server.
Then, the dependencies among the databases would break. As of this release, you don’t have a way to
bind a set of mirrored databases so that they all fail over together (although that’s a natural next step in
the evolution of database mirroring).
Log Shipping
You can think about log shipping as the opposite of failover clustering, at least from a technology
standpoint. It’s the low-tech, low-cost way to provide database redundancy, but without any automatic
failover. You might be tempted to view log shipping as simply a slow method of database mirroring,
but the underlying technologies are completely different. In log shipping, you automate the SQL Server
process of backing up transaction logs from a primary server and restoring them to a secondary server.
(Database mirroring uses a special endpoint transmission technology, and no intermediate files are
involved.)
In SQL Server 2005, you’ll find several important changes in log shipping. First, the supported
version of log shipping is now available in all editions of SQL Server that support SQL Server Agent,
which means in all editions except SQL Server Express. Additionally, SQL Server 2005 log shipping is
based exclusively on stored procedures and SQL Server Agent and doesn't use database maintenance plans.
Finally, although a monitor server was required for SQL Server 2000 log shipping, that server is optional
in SQL Server 2005.
All of these changes are clearly improvements, but they come at a cost. SQL Server 2000 log
shipping can’t be directly upgraded to SQL Server 2005, because maintenance plans are no longer used.
Instead, you must manually reestablish log shipping on an upgraded set of servers.
SQL Server 2005 log shipping doesn’t support automatic failover. If the primary log shipping server
fails, you must recover the secondary server yourself, either manually or based on your own custom-
coded failure detection. You can set up a system to make role reversals easy, so that controlled failover
and failback, although still manual, involve only a few steps.
Like database mirroring, log shipping provides database redundancy only, not server redundancy.
So just as with database mirroring, you must ensure that the secondary server is kept in sync with the
primary for such matters as logins, permissions, and SQL Server Agent jobs. On the other hand, unlike
database mirroring, you can ship logs to multiple secondary servers.
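Stripped of the SQL Server Agent jobs that automate it, the log shipping cycle reduces to a backup/restore loop plus a manual failover step. A minimal sketch, with illustrative names and paths:

```sql
-- On the primary: back up the transaction log to a network share
-- (normally a scheduled SQL Server Agent job does this).
BACKUP LOG Sales TO DISK = '\\share\logship\Sales_0001.trn';

-- On the secondary: restore each shipped log backup, in sequence,
-- WITH NORECOVERY (or WITH STANDBY for read-only access between loads).
RESTORE LOG Sales
    FROM DISK = '\\share\logship\Sales_0001.trn'
    WITH NORECOVERY;

-- Manual failover: after restoring the last available log backup
-- (including the tail of the log, if you can capture it), bring the
-- secondary database online.
RESTORE DATABASE Sales WITH RECOVERY;
```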
Replication
Replication, which has been available since SQL Server 6.0, is one of the oldest high-availability features
in SQL Server. Although providing high availability isn’t replication’s primary purpose, in many cases, it
does so successfully.
Merge Replication
Microsoft designed merge replication for use by occasionally connected computers (e.g., laptops), but
you can use it between database servers to support high availability. On systems with low to moderate
activity, merge replication can provide redundant databases—although not with automatic failover.
Merge replication offers two key benefits: It lets you update the same data on both the publisher and
a subscriber, and it lets you manage any conflicts automatically. Also, merge replication offers the
unique capability of automatic synchronization: When either a publisher or subscriber goes offline or
is disconnected, each can work autonomously. When they’re reconnected or brought back online, they
automatically synchronize with each other. Merge replication can’t, however, guarantee transactional
consistency when multisite updates of the same data are involved.
Transactional Replication
You often see transactional replication used for high availability because its performance can be much
better than that of merge replication and because it can guarantee transactional consistency between
the publisher and subscribers. Perhaps the most common high availability scenario for transactional
replication occurs when you copy data from one database, the publisher, to one or more subscribers
through a distribution server. The subscribers are treated as read-only, and updates occur only on the
publisher. If the publisher fails, one of the subscribers can become a read/write server and accept data
updates—and even become a publisher to the other subscribers.
Snapshot Isolation
You can enable snapshot isolation as a database setting in all editions of SQL Server 2005. Snapshot
isolation lets SQL Server keep track of previous versions of all modified data. Therefore, even though
the data is still locked while it’s being modified, other transactions can access a previous committed
version of the locked data. Data is more available. However, as always, you pay a price.
The older versions of changed rows are stored in the tempdb database, and for systems that have a
large amount of modified data, tempdb space requirements can grow dramatically. On any system that
employs snapshot isolation, a DBA must carefully monitor the amount of row versioning that occurs
and watch the size limits for the tempdb database. You see another cost of using row versioning when
many changes are made to the same rows. SQL Server will maintain all changes to any row in a linked
list as long as any open transaction or running statement might need the older versions.
Additional changes to the same row will cause a new row version to be linked to the front of the
list. A query that needs to select older versions of data might need to traverse an increasingly longer
version chain, which means that a SELECT statement can take a long time to execute, even though
the data is technically available. The data modification operations will also be slower because previous
versions of the rows must be added to the linked list.
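As a sketch of the feature described above, with an assumed database and table name, enabling and using snapshot isolation looks like this:

```sql
-- Enable snapshot isolation at the database level; SQL Server then
-- begins versioning modified rows in tempdb.
ALTER DATABASE Sales SET ALLOW_SNAPSHOT_ISOLATION ON;

-- A reader running under SNAPSHOT isolation sees the last committed
-- version of a row, even if another transaction holds it locked.
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRAN;
SELECT Quantity
FROM dbo.Orders
WHERE OrderID = 1;   -- not blocked by a concurrent writer
COMMIT;
```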
Online index creation uses row versioning to keep the original index rows available even while
changes are being made to the base table. Anyone selecting from the table sees the values as they were
before the rebuild began. As with snapshot isolation, with online index building, you pay a price for
the greater data availability. And again, part of that price is the space required in the tempdb database,
which can be considerable if you’re rebuilding the clustered index on a huge table. (Every row must be
versioned as you build the next index, but you also need space to version any rows modified during the
index-building process.) In addition, the actual building of the index might take more time than if the
building were occurring offline.
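The online option is a single clause on the index DDL. A sketch with illustrative index and table names (ONLINE = ON requires Enterprise Edition):

```sql
-- Rebuild a clustered index while the table remains available
-- for both queries and updates.
ALTER INDEX PK_Orders ON dbo.Orders
    REBUILD WITH (ONLINE = ON);

-- The same option applies when building a new index.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
    ON dbo.Orders (CustomerID)
    WITH (ONLINE = ON);
```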
Faster Restoring
You might need to restore a database as part of disaster recovery, but you might also perform this
operation when you move a database to a new drive or copy it to a new machine. Restoring from a
backup is also a way to revert a test database to an earlier point in time so you can resume testing from
a known earlier state. To restore a database, SQL Server first copies the data and the log records from
the backup media, then goes through a process called recovery.
Usually, recovery applies to all files and filegroups and involves two phases. In the first phase,
called redo, all transactions marked in the transaction log as committed are verified in the data files
and redone, or rolled forward, if necessary. In the second phase, called undo, SQL Server checks to
see whether any uncommitted transactions have made changes to data files; those transactions will be
undone, or rolled back. In SQL Server versions before SQL Server 2005, the database wasn’t available
for any use until both the redo and the undo phases were finished.
Fast Restore
A new restore feature available only in SQL Server 2005 Enterprise Edition is fast restore. Fast restore
makes the database available as soon as the redo phase is finished. The data involved in any transactions
that were uncommitted when the backup was made are locked and unavailable in case an undo must be
performed, but the rest of the data in the database is fully available. You needn’t do anything to enable
this feature other than use SQL Server 2005 Enterprise Edition.
Online Restore
Another new restore feature available in SQL Server 2005’s Enterprise and Developer editions is online
restore. Online restore lets you restore damaged files or pages while the rest of the database remains
fully available. For a database to be online, its primary filegroup must be online. Therefore, if any files in
the primary filegroup are damaged, online restore isn’t available. However, some or all of the secondary
filegroups can be offline. You can restore the damaged files from backup while the rest of the database
is online. Only the file and filegroup being restored are offline. In addition, if your SQL Server 2005
database is running under the Full recovery model, you can also restore one or more individual pages
from a file. Only the filegroup containing those pages is offline; the rest of the database is online.
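A page-level online restore might look like the following sketch; the file and page IDs (such as those reported by error 824 or found in the suspect_pages table in msdb) and the backup paths are illustrative:

```sql
-- Restore two damaged pages (file 1, pages 57 and 202) while the
-- rest of the database stays online.
RESTORE DATABASE Sales PAGE = '1:57, 1:202'
    FROM DISK = '\\backupshare\Sales_full.bak'
    WITH NORECOVERY;

-- Apply the subsequent log backups, finishing with the tail of the log.
RESTORE LOG Sales FROM DISK = '\\backupshare\Sales_log.trn'
    WITH NORECOVERY;
RESTORE LOG Sales FROM DISK = '\\backupshare\Sales_tail.trn'
    WITH RECOVERY;
```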
Piecemeal Restore
A final restore enhancement is piecemeal restore, which is new in all editions of SQL Server 2005 and
enhances the SQL Server 2000 partial restore. A partial restore in either SQL Server 2005 or SQL Server
2000 lets you restore only selected filegroups within a database. After the initial partial restore of the
primary filegroup and perhaps some of the secondary filegroups, piecemeal restore lets you restore
additional filegroups. Filegroups that aren’t restored are marked as offline and aren’t accessible until
they’re restored. In SQL Server 2000, you can perform a partial restore from a full database backup only,
but that’s no longer a requirement for SQL Server 2005.
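A piecemeal restore sequence can be sketched as follows, with illustrative filegroup names and paths:

```sql
-- Stage 1: bring the primary filegroup (and any urgently needed
-- secondary filegroups) online first.
RESTORE DATABASE Sales FILEGROUP = 'PRIMARY'
    FROM DISK = '\\backupshare\Sales_primary.bak'
    WITH PARTIAL, NORECOVERY;
RESTORE LOG Sales FROM DISK = '\\backupshare\Sales_log.trn'
    WITH RECOVERY;

-- Stage 2 (later): restore a remaining filegroup; until this runs,
-- that filegroup is marked offline and its data is inaccessible.
RESTORE DATABASE Sales FILEGROUP = 'Archive2004'
    FROM DISK = '\\backupshare\Sales_archive.bak'
    WITH RECOVERY;
```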
Database Snapshots
One more new SQL Server 2005 feature that many people mention when they discuss high availability
is database snapshots. However, by themselves, database snapshots aren’t strictly an availability feature.
Although it’s beyond the scope of this article to go into any detail about database snapshots, be aware
that making a snapshot of a database has some availability benefits. First, if you’re running tests and
want to revert to an earlier point in time, the database is unavailable while restoring from a backup.
If you revert to a snapshot instead, the period of unavailability is drastically reduced. Second, you can
use snapshots in conjunction with database mirroring to provide a copy of the database for reporting
purposes. If you don’t use snapshot isolation, locking in the source database can make data unavailable
for short periods of time, but a read-only reporting database ameliorates some of that unavailability.
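The snapshot-and-revert pattern for testing reduces to two statements; the database, snapshot, and logical file names below are illustrative:

```sql
-- Before the test run: capture the database's current state.
CREATE DATABASE Sales_Before ON
    (NAME = Sales_Data, FILENAME = 'C:\Snapshots\Sales_Before.ss')
    AS SNAPSHOT OF Sales;

-- After the test run: revert to that state, typically much faster
-- than restoring the database from a backup.
RESTORE DATABASE Sales
    FROM DATABASE_SNAPSHOT = 'Sales_Before';
```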
Final Words
The availability of your system, your databases, and your data is crucial to good performance in your
environment. SQL Server 2005 has added new features at every level to improve availability and has
enhanced many existing features to provide increased availability with more ease than ever before. This
discussion of new high-availability features and enhancements to existing features should help you see
which features will best support the availability of your systems, databases, and data.