Sunday, August 30, 2009

Interview Question for active directory and exchange



Document Version 3.00

Updates posted on http://www.microsoft.com/exchange/evalgd.htm

Joseph Pagano

Microsoft Consulting Services (MCS), New Jersey

With Contributing Reviewers:

Mahesh Nasta, Microsoft Product Support Services (PSS), Bill Skilton (MCS), Nitin Bhatia (MCS), and Todd Luttinen (MCS)

Product Group Review By:

Rob Sanfilippo and Cheen Liao

And:

The MCS NJ Messaging Team

Additional Contributors:

Brian Valentine, Bill Skilton, Nitin Bhatia, Todd Luttinen, Kristin Kinan, Jim Reitz, Max Benson, Mark Adcock, Sheri Spencer, Barbara Moatz, Craig Johnson, Joe Palermo, Lou Chaney, Daniel Chu, Tom DeFeo, Zev Yanovich, Stuart Schifter, and Deb Waldal.

Author’s Note

I just wanted to thank everyone who contributed to this paper and knowledge base by sending tons of feedback. Our goal in this paper is to minimize and eliminate downtime of critical messaging systems and to assist with increasing the rate of Exchange deployment within your company. Based on feedback to date and e-mail from all over the world, we have assisted many of you with this effort! By the way, the pictures below were meant to enhance the flow of information at trade shows and conferences. Some people tend to remember faces instead of names. I encourage other authors to do the same so that people can grab you in the hallways yelling “hey you - yeah the exchange backup . . . ” or “hey you – yeah the c++ . . . ”! This worked well at the EDC, where I was approached by a lot of folks - including those from a famous beverage company who said “we luv ya man! - you guys saved us in the middle of the night a few weeks ago.”

Special thanks to the Microsoft Exchange product group - Brian “Persistence” Valentine (Sunday midnight mailer), Elaine Sharp, Cheen Liao, Rob Sanfilippo, Jim Reitz -- discussing Microsoft Exchange with them is like having Einstein help you with your physics homework! Thanks also to PSS, who have been stellar, and to Craig Johnson for the inquiring questions and ideas. And thanks to the MCSNJ Messaging team (Bill Skilton - taught me how to present material among a lot of other stuff, Nitin Bhatia, Todd Luttinen, Michael Bird, Kristin Kinan), of which I am a member, who have redefined the term “messaging architecture”! Also thanks to MCS, Microsoft Exchange which allowed me and Mahesh to do this without ever meeting in person, Kevin McGowan, Linda DeGruccio, Nick Robinson, Marty Thall, Ashish Kumar (“Ashman USA”), and my wife Tracey Findlay-Pagano who have provided encouragement and made this possible.

-Joseph Pagano

CHANGES SINCE LAST DOCUMENT

Due to the popularity of the topic, we have added a bit of material since the previous document version. Listed below is a summary of changes and additions to the material.

Formed two documents. Second document contains supplemental material such as FAQ, Additional Resources, Error Codes, and command line syntax.

Added example of tape rotation cycle and mixing backup job types.

Added information regarding Key Management Server backup.

Added an additional FAQ section.

Corrected batch file command order.

Added additional information on the Microsoft Exchange database architecture regarding recoverability.

Clarified procedure for restoring full server in a multiple server site.

Added a few key Knowledge Base articles.

SUMMARY

This paper outlines thoughts, observations, and techniques that can be applied to Microsoft Exchange Disaster Recovery planning and should be used as a supplement to existing on-line and hard copy documentation. It is expected that you have read chapter 15 of the Microsoft Exchange Administrator Guide. Although Microsoft Exchange is a very robust and stable enterprise messaging platform, it is essential that you have a working plan for restoring Microsoft Exchange-based servers and data in a timely fashion in the event that an outage occurs. The objective is to help minimize downtime for your enterprise mail environment and to provide the quickest possible data recovery in the event of a system crash or other disaster. As those of you who have worked through data disasters know, such a time is not conducive to learning; it is a time for executing tried and trusted procedures and techniques. Third party utilities used for backing up Microsoft Exchange data are not covered. Do not take this information for granted; test, formulate, and certify your own disaster recovery plans.

GENERAL NOTES

With the release of Microsoft Exchange, messaging disaster recovery is more critical than ever. Microsoft Exchange is a business critical application that corporations depend on daily to run their businesses. Microsoft Exchange can handle more users per server and a larger data set than previous shared file system (SFS) based messaging systems and therefore, from a disaster recovery standpoint, each server is more critical to an organization. Users have come to expect 7 x 24 availability of their messaging system; however, many organizations have inadequate maintenance and/or disaster recovery capabilities.

Since Microsoft Exchange uses Windows NT® security for authentication, Windows NT operating system backup and restore must be taken into consideration as well as Microsoft Exchange backup and restore. Because of this relationship, Microsoft Exchange disaster recovery cannot be considered independently from Windows NT disaster recovery.

An enhanced version of the Windows NT NTBACKUP.EXE program ships with Microsoft Exchange Server. One of the great benefits of Microsoft Exchange and the new NTBACKUP.EXE is that they provide for live backup of the Microsoft Exchange Information Store and Directory without interruption to the messaging system. NTBACKUP.EXE also provides file based backup services and will back up the Windows NT registry.

File          Size (bytes)  Date      Time    Description
NTBACKUP.EXE  716,560       3-8-96    4:00a   Includes support for Microsoft Exchange On-Line Backup; ships with Microsoft Exchange Server 4.0
NTBACKUP.EXE  716,560       7-15-96   3:30a   Includes support for Microsoft Exchange On-Line Backup; ships with Microsoft Exchange Server 4.0a (Microsoft Exchange SP2)
NTBACKUP.EXE  675,488       3-8-96    4:00a   No Microsoft Exchange Extensions; ships with Windows NT Service Pack 4
NTBACKUP.EXE  329,777       8-02-96   11:00p  No Microsoft Exchange Extensions; ships with Windows NT Server 4.0
NTBACKUP.EXE  675,504       9-23-95   10:57a  No Microsoft Exchange Extensions; ships with Windows NT Server 3.51

Table 1.0 Versions of the NTBACKUP.EXE program.

Figure 1.0 The updated NTBACKUP.EXE program with Microsoft Exchange extensions.

Microsoft Exchange Server was designed so that it does not need to be taken off-line to perform backup. The entire Information Store, Directory, MTA, and System Attendant remain in service during on-line backup. While the Information Store and Directory can be backed up on-line, files in directories being accessed by other Microsoft Exchange for Windows NT services such as the DX or PCMTA services should be backed up when the respective service is not running. This can be automated and scheduled using the WINAT.EXE GUI scheduler (see Windows NT 3.51 resource kit). An example of a batch file that will shut down and restart Microsoft Exchange services has been included in this document and can be used for other purposes as well.

Where Is The Data?

There are two types of data to be backed up: user data and configuration data. Microsoft Exchange user data is stored in the Information Store (PUB.EDB, PRIV.EDB), PSTs, OSTs, PABs, and transaction logs. Microsoft Exchange configuration data is stored in the Microsoft Exchange Directory (DIR.EDB), the Windows NT Registry, and in various subdirectories under the Microsoft Exchange Server installation path (and potentially paths created after running the Microsoft Exchange Performance Optimizer program). Depending on the backup and restore scenario, these data points need to be considered in your procedures.

The actual Microsoft Exchange database files are located in the following directories. Note that the default path of \exchsrvr is used in Table 1.1 but this is user selectable at time of installation. The transaction logs can be placed on a separate physical disk from the Information Store and Directory files by running the Microsoft Exchange Performance Optimizer program. You can also reconfigure the paths for all the database files using the Database Paths page on the server object.

Component                              Location
Information Store (Private)            \exchsrvr\mdbdata\PRIV.EDB
Information Store (Public)             \exchsrvr\mdbdata\PUB.EDB
Directory                              \exchsrvr\dsadata\DIR.EDB
Transaction Logs (Information Store)   \exchsrvr\mdbdata\*.LOG
Transaction Logs (Directory)           \exchsrvr\dsadata\*.LOG

Table 1.1 Microsoft Exchange Database File Locations.

PST, OST, and PAB

PST (Personal Message Store)

If users store PSTs on local drives, and local drives are not being backed up, you are out of luck in the event that the PST gets trashed. If PSTs are stored on file servers (e.g., home directories), be sure to include them in your backup routines. Recovery is as simple as restoring the PST and adding the PST to an existing user profile. If a user has password protected his or her PST and then forgets the password, there is no way to recover the password and therefore the data in the PST file. This is a user education issue. Make sure users are aware of this. Note that a damaged PST file can be repaired by running the SCANPST program.

OST (Off-Line Message Store)

OST data is at risk when changes have been made to the local OST and have not yet been replicated up to the server based store. If a user machine crashes, a new OST can be created on the replacement machine and all server based information can be sent down to the OST file via synchronization. Note that a damaged OST file can be repaired by running the SCANPST program.

PAB (Personal Address Book)

Personal Address Book files can be stored on a server directory or locally. Since most servers are backed up regularly at night, any server based PAB files are also backed up. The risk of PAB loss arises when users store the PAB file locally and do not arrange for a backup. Replacing the PAB entries can cost an employee many hours of work and lost productivity.

Microsoft Outlook - Archive and AutoArchive

Microsoft Outlook™ provides the ability to archive data into PST files automatically. This feature may require administrators to include these archive data files in backup strategies.

Your Outlook mailbox grows as items are created in the same way that papers pile up on your desk. In the paper-based world, you can occasionally shuffle through all your documents and put in storage those that are important but not frequently used. Documents that are less important, such as newspapers and magazines, you discard based on their age.

In Outlook, you can manually transfer old items to a storage file by clicking Archive on the File menu, or you can have old items automatically transferred by using AutoArchive. Items are considered old when they reach the age you specify. With AutoArchive, you can either delete or move old items. Outlook can archive all types of items, but it can archive a file, such as an attached Microsoft Excel spreadsheet or Word document, only if the file is stored in a mail folder. A file that is not stored in a mail folder cannot be archived.

AutoArchive is a two-step process. First, you turn on AutoArchive on the AutoArchive tab in the Options dialog box (Tools menu). Second, you set the AutoArchive properties for each folder that you want archived. What and when items are AutoArchived is determined at the folder level. You can automatically archive individual folders, groups of folders, or all Outlook folders. The process runs automatically whenever you start Outlook. The AutoArchive properties of each folder are checked by date, and old items are moved to your archive file. Items in the Deleted Items folder are not moved to another folder. They are deleted.

Several Outlook folders are installed with AutoArchive turned on. These folders and their default aging periods are Calendar (6 months), Tasks (6 months), Journal (6 months), Sent Items (2 months), and Deleted Items (2 months). Inbox, Notes, and Contacts do not have AutoArchive activated automatically.

There is a difference between exporting and archiving. When you archive, the original items are copied to the archive file and then removed from the current folder. When you export, the original items are copied to the export file but are not removed from the current folder. In addition, you can only archive to one file type, a personal folder file, but you can export to many file types such as text.

When you archive, your existing folder structure is maintained in your new archive file. If there is a parent folder above the folder you choose to archive, the parent folder is created in the archive file, but items within the parent folder are not archived in the archive file. In this way, an identical folder structure exists between the archive file and your mailbox. Folders are left in place after being archived, even if they are empty.

Consider a backup strategy for archived Outlook data.

Backup Types - Review

On-Line vs. Off-Line Backup

On-Line

Microsoft Exchange “On-Line” backup requires that the respective service (Information Store, Directory) is running. This backup is performed without disrupting messaging on the Microsoft Exchange-based server. Note that you can back up the DS without the IS running. You can include the Windows NT registry in the backup job.

Off-Line

This is a file based backup. All Microsoft Exchange services must be stopped. You simply run the NTBACKUP program to back up all files on the desired drives. You also have the option to include the Windows NT registry in the backup job.

On-Line Backup Types

Normal (Full)

This will back up the entire Information Store and/or Directory databases. Transaction logs will be backed up and then purged. Incremental and Differential backups are given context as a result of the transaction logs being purged.

Copy

This is similar to the Full backup except there is no context marking. The context for the Incremental and Differential backups does not get changed. Log files are not deleted. For example, a Copy is like taking a snap-shot of the databases at a given point in time without impacting other backup routines. You may wish to use “copy” for reproducing a scenario in a test environment.

Incremental

This backs up a subset of the Information Store and/or Directory. Only changes since the last Full backup or last Incremental backup (whichever was most recent) are written to tape. In fact, only the .LOG files are written to tape. The .LOG files are then purged from disk. The purging of transaction logs sets context for the next backup job. For a typical Incremental restore, a tape of the last Full backup is required as well as tapes for each Incremental up to the point at which the system experienced an outage. For example, perhaps a Full backup is performed on Sunday evening and Incremental backups are performed Mon. - Fri. If an outage occurs on Friday morning, a Full restore would be performed and then each Incremental would be restored through Thursday. Services should not be started until the final incremental tape has been restored. Note that Incremental backup is disabled when circular logging is enabled. More on this later.

Differential

This backs up the changes in the Information Store and/or Directory since the last Full (Normal) or Incremental backup; however, most administrators do not mix Differential and Incremental backups in a series. Only .LOG files are backed up, but they are not purged from disk. For example, in the event that a transaction log and database restore is required, only two tapes are required for restoring using the Differential backup: the latest Full and the latest Differential. Note that Differential backup is disabled when circular logging is enabled. More on this later. Also note that if the transaction logs are intact since the last Full backup, then only the last Full backup tape is required, since the restore process will play back all logs from the point of the last Full backup through the current edb.log file, thus restoring all transactions to date. Be sure NOT to select “erase existing data” when restoring in this case so that the log files to date are not erased.
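The tape-selection rules for Incremental and Differential series can be sketched in a few lines of Python. This is only an illustration of the reasoning above, not a tool that ships with Microsoft Exchange; the function name `tapes_needed` and the tape labels are hypothetical, and mixed Incremental/Differential series are deliberately rejected rather than modeled.

```python
from typing import List, Tuple

# (tape label, backup kind); kind is "full", "incremental", or "differential".
# Labels and this data layout are illustrative assumptions.
Backup = Tuple[str, str]


def tapes_needed(history: List[Backup]) -> List[str]:
    """Return the tape labels to restore, oldest first.

    `history` lists the backups taken so far in chronological order.
    """
    full_positions = [i for i, (_, kind) in enumerate(history) if kind == "full"]
    if not full_positions:
        raise ValueError("no Full backup in history; nothing to restore from")

    last_full = full_positions[-1]
    later = history[last_full + 1:]
    incrementals = [label for label, kind in later if kind == "incremental"]
    differentials = [label for label, kind in later if kind == "differential"]
    if incrementals and differentials:
        raise ValueError("mixed Incremental/Differential series is not modeled here")

    needed = [history[last_full][0]]    # always start from the latest Full
    needed.extend(incrementals)         # every Incremental since then, in order
    needed.extend(differentials[-1:])   # only the latest Differential, if any
    return needed


if __name__ == "__main__":
    week = [("Sun", "full"), ("Mon", "incremental"), ("Tue", "incremental"),
            ("Wed", "incremental"), ("Thu", "incremental")]
    print(tapes_needed(week))
```

For the Sunday Full plus Monday through Thursday Incremental schedule described earlier, the sketch returns all five tapes; with Differentials in place of Incrementals it returns only the Full and the most recent Differential.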

LOG FILES AND CIRCULAR LOGGING

Logging Explained

Although transparent to the end-user, Microsoft Exchange Server maintains several database files or “stores.” The Information Store consists of two databases. These are referred to as the Private and Public Information Stores. The Private Store is located in PRIV.EDB and the Public Store is located in PUB.EDB. The Microsoft Exchange Directory is stored in the file DIR.EDB. The respective Microsoft Exchange Server services use transaction log files for each of these databases. Note that the Information Store service manages both the Public and Private store databases (there are not two distinct server services).

Microsoft Exchange database technology implements log files to accept, track, and maintain data. All message transactions are written first to log files and memory, and then to the respective database files. This is done for performance and recoverability. Since log files are written to sequentially and Microsoft Exchange Server writes message transactions to log files immediately, Microsoft Exchange clients experience a high level of performance. Log files are always appended to at the end of the file; Microsoft Exchange database files (PUB.EDB, PRIV.EDB, DIR.EDB), however, are written to randomly. For recoverability, log files can be used to recover message transaction data in the event that a hardware failure corrupts the Information Store or Directory database files, provided that you have either backed up the logs or the logs are intact. Log files are typically kept on a separate physical disk drive from the actual Information Store and Directory database files. If the database files are damaged, a backup can be restored and any data that has not been backed up, but that has been recorded in the transaction logs, can be “played” back. These transactions are entered into the restored database file to bring the database up-to-date.

The Directory and Information Store services use the following logs and files: Transaction Logs, Previous Logs, Checkpoint Files, Reserved Logs, and Patch Files.

Transaction Logs

Transaction logs can be kept on a separate physical drive from that of their respective EDB files. By default, Information Store logs are kept in \exchsrvr\mdbdata and Directory Service logs are kept in \exchsrvr\dsadata. Each subdirectory contains an EDB.LOG file which is the current transaction log file for the respective service. Both the Information Store subdirectory and the Directory Service subdirectory maintain a separate EDB.LOG file. Log files should always be 5,242,880 bytes in size. If log files do not reflect this exact size, they are most likely damaged. Since transactions are first written to the EDB.LOG files and then later written to the database, the current actual or effective database is a combination of the uncommitted transactions in the transaction log file, which also reside in memory, and the actual .EDB database file. When the EDB.LOG files are filled with transaction data, they are renamed and a new EDB.LOG file is created. Note that LOG files are always 5MB in size regardless of how many transactions have been recorded in them.
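The size rule above lends itself to a quick automated check. The following Python sketch lists any .LOG file in a directory whose size is not exactly 5,242,880 bytes. The function name `suspect_logs` is hypothetical, and the script only reads file sizes; it is offered as a rough aid under those assumptions, not as a supported diagnostic.

```python
import os
from typing import List

# 5 MB expressed in bytes, matching the 5,242,880 figure given in the text.
EXPECTED_LOG_SIZE = 5 * 1024 * 1024


def suspect_logs(log_dir: str) -> List[str]:
    """Return the names of .LOG files in log_dir whose size is not 5,242,880 bytes."""
    suspects = []
    for entry in os.scandir(log_dir):
        if entry.is_file() and entry.name.lower().endswith(".log"):
            if entry.stat().st_size != EXPECTED_LOG_SIZE:
                suspects.append(entry.name)
    return sorted(suspects)


if __name__ == "__main__":
    # The path below is the default location named in Table 1.1; adjust it
    # if the logs were moved by the Performance Optimizer.
    for name in suspect_logs(r"\exchsrvr\mdbdata"):
        print("unexpected size:", name)
```

An empty result means every log in the directory has the expected size; any name that is printed deserves a closer look before it is relied on for recovery.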

Previous Logs

When EDB.LOG is renamed, the renamed log files are stored in the same subdirectory as the EDB.LOG file. The log files are named in a sequential numbering order using hexadecimal (e.g., EDB00014.LOG, EDB00015.LOG, etc.). Previously committed log files are purged during an on-line Normal (Full) Backup or an on-line Incremental backup using the NTBACKUP.EXE program. Note that not all previous log files are purged. After every 5MB of transactions are written, a new log is created, but not necessarily committed. There may be several previous logs that aren’t committed that will not be purged. When Circular Logging is enabled, a history of Previous Logs is not maintained and therefore is not purged by backup operations. In fact, Incremental and Differential on-line backups are not permitted when Circular Logging is enabled. Note that transactions in log files are committed to the respective EDB file when the service is shut down normally. For example, when the Information Store service experiences a normal shutdown (service shuts down with no errors), any transactions that existed in log files and not in the PRIV.EDB and/or PUB.EDB files are committed to the EDB files. Log files should not be manually purged while services are running. In general, it is best to purge logs via the backup process.
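Because the sequence numbers are hexadecimal, the log after EDB00019.LOG is EDB0001A.LOG, not EDB00020.LOG. A small Python sketch makes the convention concrete. The five-digit width is inferred from the file names quoted in this paper, and both function names are hypothetical.

```python
def log_file_name(generation: int) -> str:
    """Build a previous-log file name from its sequence number.

    Assumes the five-hex-digit pattern seen in names such as EDB00014.LOG.
    """
    if not 0 <= generation <= 0xFFFFF:
        raise ValueError("generation does not fit in five hexadecimal digits")
    return f"EDB{generation:05X}.LOG"


def log_generation(name: str) -> int:
    """Parse the sequence number back out of a previous-log file name."""
    upper = name.upper()
    if not (upper.startswith("EDB") and upper.endswith(".LOG") and len(upper) == 12):
        raise ValueError(f"not a numbered transaction log: {name}")
    return int(upper[3:8], 16)


if __name__ == "__main__":
    current = log_generation("EDB00019.LOG")
    print(log_file_name(current + 1))
```

Sorting file names alphabetically in a directory listing gives the same order as sorting by sequence number, as long as all names use the same letter case.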

Checkpoint Files And The "Checkpoint"

Checkpoint files are used for recovering (“playing”) data from transaction logs into .EDB files. The “checkpoint” is the place marker within the EDB.CHK file that indicates which transactions have been committed. Separate EDB.CHK files are maintained by the Information Store and Directory services. Whenever data is written to a .EDB file from the transaction log, the EDB.CHK file is updated with information specifying that the transaction was successfully committed to the respective .EDB file. During recovery, Microsoft Exchange determines which transactions have not yet been committed to the respective .EDB file by reading the EDB.CHK file or, because EDB.CHK is not strictly required, by reading the transaction log files directly. The Information Store and Directory services each read their EDB.CHK file during startup, and any transactions that have not been committed are played into the .EDB files from the transaction logs. For example, if a Microsoft Exchange-based server experiences an outage, and transactions have been recorded into the transaction log but not yet to the actual database file, Microsoft Exchange attempts recovery on startup by recording transactions from the logs to the respective database files automatically.

Reserved Logs

Both the Directory and Information Store services independently maintain two reserve files, RES1.LOG and RES2.LOG. These are stored in MDBDATA and DSADATA. If the Directory and/or Information Store service is in the process of renaming the EDB.LOG file and creating a new EDB.LOG, and there is not enough disk space to create a new EDB.LOG file, then the reserve log files are used. This is a fail-safe mechanism that is only used in the event of an emergency. When this occurs, JET will send an error to the respective service. The service will flush any transactions in memory that have not yet been written to a transaction log into the RES1.LOG and, if necessary, the RES2.LOG. When this is completed, the service will shut down and record an event in the Windows NT event log. Note that RES transaction log files are always 5MB in size, as are all transaction log files.

Patch Files

The Patch file mechanism was designed for the case where transactions are written to a database during the backup process. A convenient feature of Microsoft Exchange is the ability to back up databases without interrupting service to end-users. During the backup operation, data is read from the .EDB files. If a transaction is made to a part of the .EDB file that has already been backed up, it is recorded in a .PAT (patch) file. If a transaction is made to a part of the .EDB file that has not yet been backed up, it is simply processed and does not need to be written to the patch file. A separate .PAT file is used for each database -- PRIV.PAT, PUB.PAT, and DIR.PAT. Note that these .PAT files will only be seen during the backup process. During an on-line backup operation, the following takes place:

.PAT file is created for the current database

The backup operation for the current .EDB file begins

Transactions that must be written to parts of the .EDB file that have already been backed up are recorded both to the .EDB file and to the .PAT file

.PAT file is written to the backup tape

.PAT file is deleted from \MDBDATA or \DSADATA
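The patch-file idea in the steps above can be illustrated with a toy simulation. The Python sketch below treats a database as a list of pages copied to tape in order, and it records in a patch list only those writes that land on pages already copied. The page-level granularity, the function names, and the data layout are all assumptions made for illustration; the real .PAT format is not described in this paper.

```python
from typing import Dict, List, Tuple

Write = Tuple[int, str]  # (page index, new page content)


def simulate_online_backup(pages: List[str],
                           writes: Dict[int, List[Write]]
                           ) -> Tuple[List[str], List[str], List[Write]]:
    """Copy pages to 'tape' in order while writes arrive.

    writes maps "number of pages already copied" to the writes that arrive
    at that moment. Returns (live database, tape copy, patch entries).
    """
    live = list(pages)
    tape: List[str] = []
    patch: List[Write] = []
    for copied in range(len(live) + 1):
        for page_index, content in writes.get(copied, []):
            live[page_index] = content            # always applied to the live file
            if page_index < copied:               # page is already on tape
                patch.append((page_index, content))
        if copied < len(live):
            tape.append(live[copied])             # back up the next page
    return live, tape, patch


def apply_patch(tape: List[str], patch: List[Write]) -> List[str]:
    """Bring a restored tape copy up to date using the patch entries."""
    restored = list(tape)
    for page_index, content in patch:
        restored[page_index] = content
    return restored


if __name__ == "__main__":
    live, tape, patch = simulate_online_backup(
        ["a0", "b0", "c0"], {1: [(0, "a1")], 2: [(2, "c1")]})
    print(tape, patch, apply_patch(tape, patch) == live)
```

In this run the write to page 0 arrives after that page was copied, so it goes into the patch list, while the write to page 2 arrives before that page is copied and needs no patch entry. Applying the patch to the tape copy reproduces the live database.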

TEMP.EDB

The file TEMP.EDB is used to store transactions that are in progress. TEMP.EDB is also used for some transient storage during online compaction.

How Backup Purges Log Files

When Circular Logging is not enabled, log files will accumulate on the transaction log disk drive until an on-line Normal (Full) or Incremental backup is performed. During an on-line backup operation the following takes place:

The backup process copies the specified database files.

Patch files are created as required (patch files maintain transactions written during a backup operation to the portion of an EDB file that has already been backed up).

Log files created during the backup process are copied to tape.

Patch files are written to tape.

Log files older than the checkpoint at the start of the backup operation are purged. These are not required since the transactions have already been committed to the EDB files and the EDB files have been written to tape.
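The final purge step can be expressed as a simple filter. The Python sketch below selects the numbered logs whose sequence number is lower than the checkpoint's. It assumes the checkpoint can be represented as a log sequence number, which is a simplification for illustration, and the function name `purgeable_logs` is hypothetical. The current EDB.LOG and the reserve logs are never selected.

```python
from typing import Iterable, List


def purgeable_logs(log_names: Iterable[str], checkpoint_generation: int) -> List[str]:
    """Return numbered logs older than the checkpoint, in sequence order.

    Assumes numbered logs are named EDB<hex digits>.LOG as shown in this paper.
    """
    purge = []
    for name in log_names:
        upper = name.upper()
        if upper == "EDB.LOG":
            continue                      # current log is still being written
        if not (upper.startswith("EDB") and upper.endswith(".LOG")):
            continue                      # e.g. RES1.LOG, RES2.LOG
        generation = int(upper[3:-4], 16) # hexadecimal sequence number
        if generation < checkpoint_generation:
            purge.append(name)
    return sorted(purge, key=str.upper)


if __name__ == "__main__":
    names = ["edb00012.log", "edb00013.log", "edb00014.log", "edb.log", "res1.log"]
    print(purgeable_logs(names, 0x14))
```

With the checkpoint in log 14 (hexadecimal), only logs 12 and 13 are selected; log 14 and EDB.LOG may still hold uncommitted transactions and are left alone.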

Database Circular Logging

Database Circular Logging is a mechanism that uses transaction log technology but does not maintain previous transaction log files. Instead, a window of a few log files is maintained, and older files are purged as new log files are created. When transactions in transaction log files have been committed to the database, the existing log files are removed and previous transactions are discarded by this process. Circular logging is enabled by default. Although circular logging helps to manage disk space and prevents the build up of transaction log files, Differential and Incremental backups cannot be performed because these rely on past transaction log files.

When Database Circular Logging is enabled, you may see multiple EDBXXXXX.LOG files in the \MDBDATA or \DSADATA subdirectory. This is normal as Microsoft Exchange will use several log files before setting the circular window (wrapping around). For example, logs edb00010.log, edb00011.log, edb00012.log, edb00013.log would become edb00011.log, edb00012.log, edb00013.log, edb00014.log. Note that the numbers increment in hexadecimal.

Microsoft Exchange attempts to maintain a window of 4 log files for circular logging; however, if the server I/O load is large, more than 4 log files will be used. Log files in excess of 4 will not be purged until the respective service (IS and/or DS) is stopped and restarted.
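The sliding window can be pictured with a short Python sketch that uses a fixed-length queue. It models only the normal case of a four-file window; the heavy-load case, where more than four files accumulate until the service restarts, is not modeled. The function name is hypothetical.

```python
from collections import deque
from typing import List


def circular_window(first_generation: int, logs_created: int, window: int = 4) -> List[str]:
    """Return the log files still on disk after logs_created logs have been
    written, starting at first_generation, with a fixed-size circular window."""
    on_disk = deque(maxlen=window)        # oldest entry drops off automatically
    for generation in range(first_generation, first_generation + logs_created):
        on_disk.append(f"edb{generation:05x}.log")
    return list(on_disk)


if __name__ == "__main__":
    print(circular_window(0x10, 4))
    print(circular_window(0x10, 5))
```

Starting at edb00010.log, four logs fill the window; creating a fifth drops edb00010.log and leaves edb00011.log through edb00014.log, matching the example in the text.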

How To Check Whether Circular Logging Is Enabled

To review Database Circular Logging settings perform the following:

Run the Microsoft Exchange Server Administrator Program

Highlight the desired server under the Site, Configuration, Servers object

Select FILE.PROPERTIES from the pulldown menu

Select the Advanced Tab. Note that circular logging can be set separately for the Information Store and Directory.

Circular logging settings can be changed on-the-fly from the Microsoft Exchange administrator program. However, Microsoft Exchange will stop the corresponding service and restart it after making the change.

Recovery Example - Transaction Logs

The circumstances are as follows. Circular logging is not enabled and transaction logs are stored on a disk separate from the database files. Imagine that the last Full (Normal) backup took place two days ago. Due to a hardware failure (e.g., a bad hard disk), the Information Store databases become damaged but the transaction log drive remains intact. Does this mean that the best you can do is go back two days and lose two days of production data? The answer is no. Since the transaction logs are complete, they contain all transactions from the point of the Full backup. In this case, after hardware is restored, perform a full restore. Do not tell the backup program to remove existing log files (don’t select “Erase All Existing Data”). The full restore writes the database files and the log files that were backed up with the Full backup. Restored log files would be log files up to the first log file on the current transaction log drive. For example, say the full backup copied edb00012.log through edb00014.log. The log files on the transaction log drive would be edb00015.log and up. The full restore will copy out logs edb00012.log - edb00014.log and the Information Store database files that are part of the backup set. When the Information Store is started, it will replay transactions from edb00012.log through the last numbered log file (e.g., edb00019.log) and then replay EDB.LOG, the most recent log file. After this is complete, the service will start and the database will be up-to-date. The log files contain signatures to make sure log files are part of the sequence to be replayed.
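The replay sequence in this example can be worked through in Python. The sketch below merges the sequence numbers of the restored logs with those still on the log drive, checks that there is no gap, and lists the files in replay order with EDB.LOG last. The gap check is a simplified stand-in for the signature checking mentioned above, and the function name is hypothetical.

```python
from typing import Iterable, List


def replay_order(restored: Iterable[int], on_disk: Iterable[int]) -> List[str]:
    """Return log file names in the order they would be replayed.

    restored: sequence numbers of logs copied back from the Full backup tape.
    on_disk:  sequence numbers of numbered logs still on the log drive.
    """
    generations = sorted(set(restored) | set(on_disk))
    for previous, current in zip(generations, generations[1:]):
        if current != previous + 1:
            raise ValueError(
                f"gap between edb{previous:05x}.log and edb{current:05x}.log")
    # Numbered logs first, then the current log file.
    return [f"edb{g:05x}.log" for g in generations] + ["EDB.LOG"]


if __name__ == "__main__":
    # Tape holds logs 12-14 (hex); the log drive holds 15-19 (hex) plus EDB.LOG.
    for name in replay_order(range(0x12, 0x15), range(0x15, 0x1A)):
        print(name)
```

If a log in the middle of the range were missing, the sketch raises an error instead of returning a partial sequence, which mirrors the point that replay depends on an unbroken chain of logs from the Full backup to the present.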

MORE ON DATABASE ARCHITECTURE

The following section is an excerpt from the TechNet article titled “Microsoft Exchange Server 4.0 vs. Lotus Notes 4.1”. This provides additional insight into the Microsoft Exchange database architecture and is a good discussion point for your friends who run Lotus Notes.

Reliable data store with transaction logs

Borrowing an idea from production relational databases like Microsoft SQL Server™, the Microsoft Exchange Server information store and directory service use separate transaction log files to improve both performance and data integrity. All changes are quickly recorded in sequential transaction logs, then committed to the actual underlying database file. In the event of power loss or unexpected server shutdown this ensures that your data will remain intact and recoverable – right up to the last complete transaction. The architecture prevents the data from being left in an inconsistent or “corrupted” state. Although Lotus calls their store a database, it has no transaction/recovery logging facilities. All database changes are written directly into the main database file, making it more likely for data to be left in a “corrupted” state in the event of abnormal server shutdown. All major systems facilities in Lotus Notes are built on top of this database architecture – the Directory, Mail routing, Groupware applications, Replication, Systems Management, etc. – and all are subject to its architectural limitations.

The principles behind database transaction integrity have been well-understood since the 70’s. Readers are likely familiar with the so-called “ACID” test of database transaction integrity. The database underlying the Microsoft Exchange Server information store supports all of the ACID properties:

Atomicity -- Results of the transaction's execution are either all committed or all rolled back. In Microsoft Exchange Server, atomic operations are achieved through the use of transaction logs. As described above, transactions in the log that haven’t yet been committed to the main database file are either rolled forward & committed, or rolled back if incomplete. This process happens quickly and automatically upon system re-start.

Consistency -- A shared resource (such as a database) is always transformed from one valid state to another valid state. All operations on the Microsoft Exchange Server information store are atomic, and ensure that the data is always in a consistent state. Updating a transaction log to indicate that a transaction has been completely “committed” back to the main database file is an atomic operation.

Isolation – Transactions are “serializable” – in a system handling multiple simultaneous transactions, the results of any transaction are the same as if it were the only transaction running on the system. This essentially means safe, concurrent access to the data by multiple simultaneous users. Simultaneous user operations cannot interfere with each other in such a way as to render the database invalid. The isolation property is enforced by the database underlying the Microsoft Exchange Server information store.

Durability -- The results of a transaction are permanent and survive future system and media failures. Microsoft Exchange Server transaction logs implement the principle of durability. If a portion of a log file is corrupt or unreadable (due to physical drive damage, etc.), then those transactions are simply “rolled back”. Even in case of media damage, the physical format of transaction logs is carefully designed to reduce the impact of any media failure, through a combination of sequential writes, the creation of new log files every 5MB, and low-level techniques such as “ping-pong” logging, which help maximize the durability of transactions even within a partially corrupted log file.
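The roll-forward and roll-back behavior described in the ACID properties above can be illustrated with a toy model. This is a minimal sketch in Python, not the Microsoft Exchange log format: the record layout (transaction id, action, key, value) and the function name recover are assumptions made up for illustration.

```python
def recover(database, log):
    """Replay a write-ahead log after an abnormal shutdown.

    `log` is a list of (txn_id, action, key, value) records, where action is
    "set" or "commit". Changes from transactions that reached "commit" are
    rolled forward; changes from incomplete transactions are ignored, which
    is the roll-back case.
    """
    committed = {txn for txn, action, _, _ in log if action == "commit"}
    recovered = dict(database)  # leave the on-disk state untouched
    for txn, action, key, value in log:
        if action == "set" and txn in committed:
            recovered[key] = value  # roll forward a committed change
    return recovered


# The main database file still holds the old state when the server stops.
database = {"inbox_count": 10}
log = [
    (1, "set", "inbox_count", 11),
    (1, "commit", None, None),
    (2, "set", "inbox_count", 12),  # transaction 2 never committed
]
state = recover(database, log)
print(state)
```

Transaction 1 is applied in full and transaction 2 leaves no trace, so the recovered state is either "all committed or all rolled back" for each transaction.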

Some people have expressed concern that using transaction logs may incur significant overhead, because the data has to be written more than once (first to the log, then later committed to the main database file). In fact, proper use of transaction logs actually improves overall system throughput, for a number of reasons. When transaction log files are kept on a separate disk, they are written sequentially, rather than random-access. Because the disk drive head doesn’t have to seek randomly, this is at least an order of magnitude faster than random-access writes to the main database file, even with today’s very fast hard drive subsystems. The transactions are then “lazily” committed back to the main database file – and this can be done very efficiently, because 1) it’s done asynchronously, when the server has idle cycles, and 2) the NTFS and FAT disk cache systems in Windows NT Server will automatically order the writes in the most efficient manner, using classic techniques such as “elevator seeking”, again to minimize physical head seeks.

Additionally, Microsoft Exchange Server recovery techniques work as well for large records as for small ones. Its transaction logs are smart enough to write only the part of the data that has actually changed. So if a user changes a few bytes in a 2MB document stored in Microsoft Exchange, only the data pages that actually changed get written to the log.
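To make the saving concrete, the following Python sketch compares two versions of a 2MB document page by page and counts how much would need to be logged. The 4096-byte page size and the helper name changed_pages are assumptions for illustration only; the text above does not state the page size the information store uses.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


def changed_pages(old: bytes, new: bytes, page_size: int = PAGE_SIZE):
    """Return the indexes of fixed-size pages whose contents differ."""
    page_count = (max(len(old), len(new)) + page_size - 1) // page_size
    return [
        i
        for i in range(page_count)
        if old[i * page_size:(i + 1) * page_size]
        != new[i * page_size:(i + 1) * page_size]
    ]


document = bytes(2 * 1024 * 1024)   # a 2MB document of zero bytes
edited = bytearray(document)
edited[5000:5004] = b"EDIT"         # change four bytes in place
dirty = changed_pages(document, bytes(edited))
logged_bytes = len(dirty) * PAGE_SIZE
print(dirty, logged_bytes)
```

Under these assumptions a four-byte edit touches one page, so one page is logged instead of the whole 2MB document.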

The Lotus Notes database does not implement all of the ACID properties – since it is not transaction-based. Also, Lotus Notes cannot take advantage of the performance advantages of transaction logs – in order to maintain even minimal database consistency, all Notes database transactions must be written-through to the main database file, causing the database to wait for random-access writes to complete, and also bypassing the file system’s disk caching.

Fast automatic recovery using transaction rollback

When a Microsoft Exchange Server information store or directory is started after an abnormal server shutdown, the transaction log file is scanned to see if there were any incomplete transactions. If there were, these transactions are “rolled back” automatically to the state before they took place. This automatic recovery operation is relatively quick, since only the most recent transactions in the log have to be checked.

When a Lotus Notes server is restarted after an abnormal shutdown a version of their “FIXUP” program is run first against the system error log database (LOG.NSF) and then every database file on the server is scanned for evidence of corruption. If a database file was open or being updated at the time of the server crash – the database is very likely marked as corrupt. If corruption is detected in a database, then FIXUP must individually verify every field in every document -- a lengthy process that can take hours for large databases. FIXUP also automatically deletes any “damaged documents” that it finds. Besides long recovery times, this raises the distinct possibility of critical data loss - an incomplete update to an existing document could result in the original document’s record being damaged, and therefore deleted entirely – even if a user was simply modifying a single field in an existing document.

Microsoft Exchange Server also provides utilities similar to the Notes FIXUP program (ISINTEG and EDBUTIL) – but these are tools of “last resort”. They are used when the normal, automatic roll-back recovery mechanisms in Microsoft Exchange Server can’t be used – for example, if the log files are lost entirely. Lotus Notes’ primary recovery mechanism is the “last resort” FIXUP utility.

For details on Lotus Notes’ use of the FIXUP utility – see the Lotus Notes 4.0 Administrator’s Guide, p312 -- “About Corrupted Databases”.

The differences here in recoverability are analogous to those between production DBMS servers like Microsoft SQL Server™ or Oracle, and end-user databases like Microsoft Access or Lotus Approach.

Single-instance storage with automatic referential integrity

Single-instance storage is a key requirement from customers who wish to store users’ mail centrally, on the server. With single-instance storage, if 100 users on the same server all receive the same message – only a single copy of the message is stored on the server, and 100 “pointers” to the message are placed in the users’ mailboxes. Significant space savings and server performance gains can be realized with single-instance storage.

Notes 4.0 introduces a rudimentary single-instance storage mechanism called “Shared Mail.” Notes Mail users all have their own separate Notes database files (.NSF files) on the server for their mailbox. Normally, if multiple users receive the same message it is stored multiple times, once for each recipient. When the optional Shared Mail feature is enabled, users’ mail becomes fragmented – part is still stored in their personal mail file, and part is now stored in the shared mail file. Messages with a single recipient are stored entirely in the personal file. Messages with multiple recipients are split up – the message header and properties are stored multiple times in each personal mail file. The message body and attachments are stored once in the shared mail file. The Notes application creates links between the shared mail database file, and each of the users’ personal mail database files.

When a user opens a message that has multiple recipients, the Notes application hides this complexity from the user. It opens both their personal and the shared mail database files, and resolves the links to create the “illusion” that the user has opened a personal copy of the message. The problem is the links between these separate database files. Because these are separate MS-DOS® files, the links between them are fragile and must be maintained by Notes’ application-level code. They are not intrinsic to the Notes’ database engine – there is no automatic referential integrity to ensure their consistency. If a user’s mailbox file is deleted, for example, then the pointers from the user’s mailbox to the shared mail database are never updated. The shared mail database can easily be left with orphaned messages - taking up physical space, but not actually existing in anyone’s mailbox. Administrators have to do a lot of manual work, using command-line LOAD OBJECT utilities, in order to avoid and correct situations like this.

Microsoft Exchange Server has built single-instance storage into its information store design from the ground up, as opposed to simply stringing together existing databases. Single-instance storage is always in effect, requires no special configuration or administration, and most importantly is intrinsic to the information store. When a message or user mailbox is deleted, the right thing always happens; messages cannot be orphaned or lost. Pointers cannot get out of sync between files – because everything is stored in a single file, and referential integrity is handled internally by the database engine.
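The idea of keeping one copy of a message plus per-mailbox pointers, with the store itself responsible for cleaning up, can be sketched with simple reference counting. This Python model is an illustration of the concept only; the class and method names are hypothetical and it does not describe the internal layout of the information store.

```python
class SingleInstanceStore:
    """Toy store: one body per message, pointers held by each mailbox."""

    def __init__(self):
        self.bodies = {}      # message_id -> body, stored once
        self.refcount = {}    # message_id -> number of mailboxes pointing at it
        self.mailboxes = {}   # user -> set of message_ids

    def deliver(self, message_id, body, recipients):
        for user in recipients:
            box = self.mailboxes.setdefault(user, set())
            if message_id not in box:
                self.bodies.setdefault(message_id, body)  # store body once
                box.add(message_id)
                self.refcount[message_id] = self.refcount.get(message_id, 0) + 1

    def delete_mailbox(self, user):
        # Releasing pointers is part of the store itself, so a deleted
        # mailbox cannot leave an orphaned body behind.
        for message_id in self.mailboxes.pop(user, set()):
            self.refcount[message_id] -= 1
            if self.refcount[message_id] == 0:
                del self.refcount[message_id]
                del self.bodies[message_id]


store = SingleInstanceStore()
users = [f"user{i}" for i in range(100)]
store.deliver("msg-1", "Quarterly report", users)
copies_stored = len(store.bodies)

for user in users:
    store.delete_mailbox(user)
orphans = len(store.bodies)
print(copies_stored, orphans)
```

In this sketch 100 recipients share one stored body, and once the last mailbox pointing at it is deleted the body is removed as well.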

The net result is that Microsoft Exchange Server is optimized for efficient, reliable storage of messages on the server. Lotus Notes Shared Mail may meet the minimum bar to get a “single-instance storage” checkbox – but its complexity and fragility will almost certainly increase the administrator’s workload.

Single-instance storage with per-user storage limits

A more obvious problem with Notes Shared Mail is that administrators must forgo the ability to limit individual user storage. Notes can only limit the overall size of a database file – there is no way to limit an individual user’s storage quota within a database. One user could easily hog all of the available shared mail storage.

Years of working with large Microsoft Mail customers impressed upon us the importance of giving control of storage to the administrator. Research shows that one of the most common reasons for mail system outages is simply the inability to limit user storage, which eventually causes servers to fill up and cease working.

Therefore, within its shared mail information store, Microsoft Exchange Server allows administrators to set and enforce disk quotas – either an overall default, or individual user limits. Users can be given a warning limit as well as a “hard” limit. The hard limit is enforced by prohibiting the offending user from sending any new email until they clean up their mailbox. In this way, this user won’t miss any critical incoming email messages, and other innocent users won’t be penalized by receiving “non-delivery” notices from the offender’s mailbox.
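The warning limit and hard limit described above can be modeled in a few lines. This is a minimal sketch in Python; the Mailbox class, the kilobyte units, and the choice to trigger a limit when the size reaches it (rather than exceeds it) are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Mailbox:
    size_kb: int
    warning_limit_kb: int
    hard_limit_kb: int

    def status(self):
        if self.size_kb >= self.hard_limit_kb:
            return "send prohibited"
        if self.size_kb >= self.warning_limit_kb:
            return "warning"
        return "ok"

    def can_send(self):
        return self.status() != "send prohibited"

    def can_receive(self):
        # Incoming mail is still accepted past the hard limit, so senders
        # do not get non-delivery notices from an over-quota mailbox.
        return True


small = Mailbox(size_kb=500, warning_limit_kb=800, hard_limit_kb=1000)
nearly_full = Mailbox(size_kb=900, warning_limit_kb=800, hard_limit_kb=1000)
over = Mailbox(size_kb=1200, warning_limit_kb=800, hard_limit_kb=1000)
print(small.status(), nearly_full.status(), over.status())
```

The point of the design is visible in the last mailbox: the over-quota user loses the ability to send, but nobody else is affected.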

Live online backup to tape for 7x24 operation

Microsoft Exchange Server has built-in support for on-line backups directly to tape media. The server does not have to be shut down, nor do users have to be logged out. Furthermore, Microsoft Exchange Server backup is integrated with Windows NT Server backup, allowing administrators to back up both Microsoft Exchange Servers and file servers from the same location. Administrators can perform full, incremental, or differential backups directly to a wide variety of tape devices, from ¼ inch cartridges to high-capacity DAT systems.

DATA RECOVERY SCENARIOS

Three recovery scenarios are discussed below. These are Single Mailbox, Full Server, and PST/OST/PAB.

Single Mailbox

CAUTION: Please note that this procedure should not be performed on a server that is in production. As noted below, this procedure calls for restoring data to a server that is NOT part of your production Microsoft Exchange site. The dedicated recovery server is installed using the same Site and Organization name as the production site, but it is installed by selecting “CREATE NEW SITE” rather than joining the existing site.

Requirements

The following are required:

A dedicated server with enough capacity to restore the entire Private Information Store database.

A backup of the Information Store Private database.

Microsoft Exchange Client and Microsoft Exchange Server Installation Code.

Windows NT and Windows NT Service Pack Installation Code.

In the event that a single mailbox needs to be recovered you can perform the following procedure. This might be necessary due to an accidental mailbox deletion or mailbox data deletion. In a centrally supported organization, affiliate offices may mail tapes to an internal “recovery center”. This procedure will provide single mailbox recovery for any server in your organization, regardless of the server name.

Note that at present, you must restore the entire Information Store and then retrieve data from the desired mailbox. In short, prepare a server running Windows NT Server and install Microsoft Exchange with the same Site and Organization name as the server on which the mailbox to be restored resided. Then restore the Information Store from a backup tape, log on with Microsoft Exchange administrative privileges, and assign the Windows NT administrator ID access to the desired mailbox. Copy the mailbox data to a PST file and attach the PST to the desired user profile.

Prepare Recovery Machine

Prepare a non-production recovery server. For quickest recovery, this machine should be up and running and available for recovery at all times. This machine can be installed as a Windows NT PDC, BDC or member server. The server should have the respective Windows NT service pack installed. Make sure that there is enough disk space for restoring the entire information store from your backup tape. This machine should also be equipped with a tape drive compatible with tape drives deployed on production servers. The tape drive should be tested and known to be working at all times.

CREATE NEW SITE (DO NOT JOIN SITE). During the installation of Microsoft Exchange (next step), be sure NOT to join the site. This recovery machine should be a “stand alone” machine and should not be joined to your existing production site.

Log on to Windows NT as administrator and install Microsoft Exchange (Complete Install) using the same Site and Organization name that was used on the server from which you are restoring the mailbox. DO NOT JOIN SITE. Note that the server name of the restore machine does not matter for the single mailbox restore procedure. This is because we are only restoring the IS and not the directory. If you have a dedicated recovery server per location, you can have Microsoft Exchange installed prior to going into recovery mode. If the recovery server will be shared among sites, it is best to keep a copy of the Microsoft Exchange installation code on the hard disk so that you can quickly install Microsoft Exchange with the required Site and Organization names. The paths for this Microsoft Exchange install do not need to match the paths of the production Microsoft Exchange install being recovered.

Install the Microsoft Exchange Client on the Recovery Server.

Restore the Information Store From Tape

This procedure assumes a tape from an on-line backup is used for the restore. If an off-line tape is used, do not select the option to start the services following the restore. After the restore, start the DS, execute the command “isinteg -patch” (the DS must be running before you run it), start the IS, and then perform the DS/IS Consistency Adjustment.
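For the off-line tape case, the ordering constraint (DS running before “isinteg -patch”, IS started afterwards) can be written down as an ordered command list. This Python sketch only builds and prints the list; it does not run anything. The service names MSExchangeDS and MSExchangeIS and the use of “net start” are assumptions about how the services would be started from a command prompt, and the DS/IS Consistency Adjustment is still performed afterwards in the Microsoft Exchange Administrator program.

```python
def offline_restore_commands():
    """Ordered commands to run after restoring from an off-line tape."""
    return [
        "net start MSExchangeDS",  # assumed service name: directory service first
        "isinteg -patch",          # patch the restored information store
        "net start MSExchangeIS",  # assumed service name: information store last
    ]


commands = offline_restore_commands()
for step, command in enumerate(commands, start=1):
    print(step, command)
```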

Insert the backup tape in the drive.

Log on to the recovery domain as administrator.

From the Administrative Tools group run BACKUP.

From the pulldown menu select OPERATIONS, MICROSOFT EXCHANGE.

Select the tapes icon and double click on the tape name. A catalog status box will be displayed stating “loading . . . .”.

Select “ORG\SITE\SERVER”\Information Store in the right hand side of the Tapes Window

Select the RESTORE BUTTON from the upper part of the Backup main screen.

On the RESTORE INFORMATION screen enter the name of the destination server in the destination server field (e.g. HOTSPARE).

Select ERASE ALL EXISTING DATA, PRIVATE, PUBLIC, VERIFY AFTER RESTORE, START SERVICE AFTER RESTORE. Select the OK button.

Select OK on the Restore Message (“You are about to restore Microsoft Exchange components. The Microsoft Exchange services on the destination server will be stopped”).

Select OK on the VERIFY STATUS screen.

Run Control Panel, Services and verify that the Microsoft Exchange services are running.

Recover User Mailbox

Log onto the recovery server using the Windows NT Administrator ID.

Run Microsoft Exchange Administrator program.

Run the DS/IS Consistency Adjustment. Highlight your server name, select FILE, PROPERTIES, select the Advanced tab, select All inconsistencies, and then select the Adjust button.

Select the recipients container and double click on the desired user’s mailbox name.

From the GENERAL tab, select the button PRIMARY WINDOWS NT ACCOUNT.

From the “Primary Windows NT Account” dialogue box select “SELECT AN EXISTING WINDOWS NT ACCOUNT” and then select OK.

From the ADD USER OR GROUP screen select ADMINISTRATOR, select the ADD button and then select OK.

Select OK on the User Property screen.

From the Microsoft Exchange Client program group run MICROSOFT EXCHANGE SERVICES.

Configure a profile for the desired user.

Add a Personal folder file to the profile.

Run the Microsoft Exchange client.

Highlight the “Mailbox - USERNAME” on the left panel.

Select the first folder or item in the list on the right panel.

From the pulldown menu select EDIT, SELECT ALL.

From the pulldown menu select FILE, COPY.

In the Copy screen highlight the PERSONAL FOLDER and then select OK. All data will be copied to this PST file.

Copy the PST to the destination location. This can be done via tape backup and restore if necessary.

Add this PST to the user’s profile on the production server and/or send the PST to the end user with instructions. You may need to send this on a tape. If you have network access, you might copy this recovered PST to the desired server.

Figure 1.1 Microsoft Exchange Single Mailbox Recovery.

The Single Mailbox Recovery server can be maintained on-line with production servers because the server name does not need to be the same as the production server running Microsoft Exchange. This recovery server, however, should not be performing DS replication with the production servers.

Figure 1.2 Microsoft Exchange Single Mailbox Recovery - Example Topology

This is an example topology for maintaining a spare server for single mailbox recovery. Note that the spare server “Sabc” is NOT joined to the production site; however, the server was installed using the same Site and Organization name as the production site.

Full Server - Restoring To A Different Server Machine / Moving Microsoft Exchange Server To A New Machine

This section discusses issues with restoring a Microsoft Exchange Server to a different physical machine. Note that this is a special case since Windows NT is reinstalled and a new registry is created. This requires that a new Windows NT S.I.D. (security identifier) be created for the recovery machine in the domain as outlined below. This information is also useful for moving a Microsoft Exchange server installation to a more powerful server for a hardware upgrade. Keep in mind that a Windows NT registry can be restored to the same physical machine (i.e. a hard disk is replaced on the same machine) and if this takes place, the machine will maintain its unique S.I.D. and a new S.I.D. should not be created as outlined below.

There are a number of situations in which it is necessary to do a full restore of the Exchange server databases (IS/DS). Depending on your particular environment it may also be necessary to restore the Windows NT SAM database. Microsoft Exchange automatically adds two accounts upon initial installation: the Windows NT service account and the Windows NT account that was logged on during the initial installation of the software. While both of these accounts receive special privileges during installation, only the S.I.D. of the Windows NT account that was originally used during the installation is required to restore the Exchange DS. The Exchange DS will not be accessible unless this S.I.D. exists in the Windows NT environment. If for any reason there are no Domain Controllers (DC) of the original domain available, it is necessary to restore the Windows NT Primary Domain Controller (PDC) SAM.

Requirements

The following are required:

A full backup of the Information Store and Directory.

Replacement PC with same hardware capacity as production server.

Access to the original Windows NT SAM.

Production Windows NT Server configuration sheet.

Microsoft Exchange Installation Code.

Windows NT Server and Windows NT Service Pack Installation Code.

Microsoft Exchange Production Server configuration sheet.

A full server recovery is a bit more complex than single mailbox recovery. A full server recovery is defined as the ability to restore an original production Microsoft Exchange-based server such that all Windows NT security and configuration information as well as Microsoft Exchange configuration and data is recovered. This will allow users to log into their mailboxes upon deployment of a recovery server using their current passwords.

Where the single mailbox restore requires that only the IS be restored, a full server recovery requires that both the IS and DS be restored. Microsoft Exchange relies on Windows NT security for providing access to mailbox data. Microsoft Exchange uses Windows NT account S.I.D. information in object properties within the Microsoft Exchange Directory. For a successful DS restore, there are two key conditions:

1. The DS must be restored to a Windows NT-based machine that has the same Site, Organization, AND Server Name as the production server.

2. The recovery server must have access to the SAM from the domain in which Microsoft Exchange Server was originally installed.

A full server disaster recovery involves three machines. Two of these will be in production, one is a non-production or non-essential machine (could be in production doing some other task but used at any time for recovery). One machine is a PDC and the other, usually one (or more) of the Microsoft Exchange-based servers, is configured as a BDC. The third machine is designated as a recovery server.

The reason for requiring a PDC, BDC, and recovery server configuration is due to the way Microsoft Exchange uses the Windows NT Security Accounts Manager (SAM) database to provide authentication to directory objects. A full server Microsoft Exchange restore including the Information Store and Directory requires access to the SAM from the domain in which the Microsoft Exchange-based server was first installed.

For example, let’s say that there is one Microsoft Exchange-based server in a site and this server also acts as a PDC. We have a recovery server off-line. The Microsoft Exchange Information Store (IS) and Directory are backed up nightly. If this server crashes, a hot backup PDC with Microsoft Exchange can be built from scratch and the information store can be restored (as is the case for single mailbox restore). When the Microsoft Exchange Directory is restored, it expects the security properties of all Directory objects to match the Windows NT SAM for the respective accounts.

Since this machine was rebuilt as a PDC, a new Windows NT SAM is created. The restored Microsoft Exchange Directory objects will not match up with the SAM objects. The administrator will NOT be able to log on to the Microsoft Exchange administrator program and the Microsoft Exchange services will not be able to start. All restored data will be inaccessible if the Microsoft Exchange Directory is restored in this scenario. You can then restore the IS without the Directory for access to the data of each mailbox but this does not meet our definition of full server recovery and will only provide administrator access to mailbox data. The original Microsoft Exchange directory information will have been lost from the production server.
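The mismatch described above comes down to identifiers: the restored Directory still refers to identifiers issued by the original SAM, and a rebuilt PDC issues new ones even for accounts with the same names. The following Python sketch models that with made-up identifiers. The functions build_sam and unresolved and the “SID-1000” style strings are hypothetical and do not reflect the real S.I.D. format.

```python
import itertools

_next_id = itertools.count(1000)  # stand-in for a source of unique identifiers


def build_sam(account_names):
    """Hypothetical SAM: every account gets a fresh, never-reused identifier."""
    return {name: f"SID-{next(_next_id)}" for name in account_names}


def unresolved(directory, sam):
    """Directory objects whose stored identifier is unknown to the SAM."""
    known = set(sam.values())
    return sorted(obj for obj, sid in directory.items() if sid not in known)


accounts = ["administrator", "jsmith"]
original_sam = build_sam(accounts)
# The directory backup records the identifiers that existed at backup time.
directory = {name: original_sam[name] for name in accounts}

rebuilt_sam = build_sam(accounts)  # same account names, new identifiers

print(unresolved(directory, original_sam))
print(unresolved(directory, rebuilt_sam))
```

Against the original SAM every directory object resolves; against the rebuilt one none do, which is why the scenario needs a copy of the original domain's SAM rather than a freshly built PDC.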

Now let’s say there is a dedicated PDC and the production Microsoft Exchange-based server acts as a BDC. We also have a recovery server. The production Microsoft Exchange-based server crashes. We simply build a Windows NT domain controller from the recovery server with the same computer name as the crashed Microsoft Exchange-based server. We connect this to the domain as a BDC which provides us with a copy of the SAM from the domain in which the production Microsoft Exchange-based server resided (using Server Manager we first delete the original computer name – BDC definition -- from the PDC and then add it again during the BDC install - this is because each computer name gets a unique security identifier (S.I.D.) when it is added to the domain and we will need a new one for the recovery machine). We then install Microsoft Exchange using the same Site and Organization name and by default, the same Server name will be used because Microsoft Exchange uses the computer name to create the Microsoft Exchange server name. (Note that if you are recovering a server and joining an existing site during this reinstallation, refer to the Microsoft Exchange Administrator guide for more details – “Install Microsoft Exchange Server on the new or repaired server, but do not replicate it with the existing organization. Give the server its original organization and site name. Run Setup /R.”)

After this we restore the IS and Directory from the last Microsoft Exchange-based production server. Note that we could have added the recovery server to the production domain as a member server instead of a BDC.

The following is an example of a server recovery using a backup tape of an IS and DS from a production server. The backup type was set to “normal” which performs a full on-line backup of the Information Store and Directory. Note that during the Microsoft Exchange software installation, you will NOT join the site. Instead you select CREATE NEW SITE. This is because you have a backup copy of the Microsoft Exchange directory database and even though you selected to create a new site, upon restart, the server will synchronize with other servers in the site automatically because the knowledge of being in a site is stored in the directory database.

Prepare Recovery Machine

You will need to install Windows NT with the same computer name as the crashed Microsoft Exchange-based server. If the production Microsoft Exchange-based server is a BDC, add the recovery server to the production domain as a BDC. This will involve first deleting and re-adding the computer name on the PDC to create a new S.I.D. for the recovery machine. The recovery server should have the respective Windows NT service pack installed matching the configuration of the production server. Make sure that there is enough disk space for restoring the entire Information Store and Directory from your backup tape. This machine should also be equipped with a tape drive compatible with tape drives deployed on production servers. To expedite the restore process, you should keep a copy of the Windows NT installation code and service pack on the hard disk of the recovery machine. Be sure to refer to your production Windows NT Server configuration sheet for pertinent settings including protocol addresses, partitioning information, protocols, options, tuning, etc.

CREATE NEW SITE (DO NOT JOIN EXISTING SITE).

Log on to Windows NT as administrator and install Microsoft Exchange using the same Site and Organization name that was used on the production server you are restoring. Use the “Setup /R” command, which allows for recovery of an existing Microsoft Exchange-based server to new hardware. Note that the server name of the recovery server matches that of the production machine. Make sure that you select the same service account that was used for the production server.

Run the Microsoft Exchange Performance Optimizer utility to optimize Microsoft Exchange for the same configuration that was used for the production server. Again, refer to your production server configuration documentation.

Install Microsoft Exchange Client on the recovery server.

Run The Restore

This procedure assumes an On-line backup tape is used for the restore (backup performed while Microsoft Exchange services were running). If an Off-line tape is used, do not select to start the services following the restore. After the restore, execute the command “isinteg -patch” (DS must be started), and start the IS service.

Insert the restore tape.

From the Administrative Tools group run the BACKUP icon.

Double click on the TAPES icon.

Double click on the words “Full Backup Tape . . .”. This will result in a Catalog Status screen: “ . . . Catalog Status . . . Loading set list from tape.”

On the right panel select the Directory AND Information Store.

Select the Restore button from the top of the Backup window.

On the Restore Information Screen select “Erase all existing data”, Verify After Restore, Start Services After Restore. Select OK. (If the DS and IS were backed up using separate backup jobs, be sure not to start services until both have been restored).

Select OK on the prompt re: adding a destination server.

On the Restore Information screen enter the name of the destination Server (e.g. “SALES1”). Select “Erase all existing data”. Note that if your public information store is on a separate machine, do not select to erase the public store. Contact Microsoft PSS for assistance if this is the case. Verify that Private, Public, and Start Services After Restore are all selected. Also check Verify after restore. Select OK.

Select OK from the restore prompt. This will result in a Restore Status screen.

After the restore is completed select OK from the Restore Status screen.

Close the Backup program.

Review Mailboxes For Windows NT Account Association

Highlight the Recipients container under the site

Double click on any user

Review the “Primary Windows NT Account” field to see if the Windows NT account matches the mailbox.

Repeat this procedure for several users.

Test User Log On From Client Workstations

Run the Microsoft Exchange client

Validate that the user password works

Repeat this from several workstations.

Restore Microsoft Exchange Customization

From your production server configuration sheet, create connectors and other services that you had configured. Also check the circular logging and advanced diagnostic settings as these are also stored in the Microsoft Windows NT registry.

Key Information Store Articles & Notes

Note You should ALWAYS work with PSS on these procedures. These have been provided in this document for reference. PSS is constantly gathering new information and may save you a lot of effort by checking with them FIRST to see what the alternatives are for Information Store recovery. Your circumstances may not require the procedures outlined in this section. These articles are available from the online Microsoft Knowledge Base. For the most up-to-date information, connect to http://www.microsoft.com/kb

XADM: Troubleshooting Information Store Start Up Problems

Article ID: Q147244

Revision Date: 05-SEP-1996

The information in this article applies to:

- Microsoft Exchange Server, version 4.0

SUMMARY

In Microsoft Exchange Server 4.0 it is possible that the Information Store could become corrupt and fail to start. The corruption could be caused by such things as sudden loss of power to a running Microsoft Exchange Server or faulty hardware that has written information to disk incorrectly. This article outlines the steps to recover from an Information Store that will not start.

The steps outlined in this article will be most successful if Circular logging is turned OFF and the customer has some type of regular backup procedures in place. If Circular logging is turned ON (setup default) then steps 1, 2, and 6-9 are valid (circular logging automatically writes over transaction logs files after the data they contain has been committed to the database). If a backup is NOT available then steps 1, 2, 8, and 9 are valid. For Information and strategies on Backup & Restore procedures please see Chapter 15 of the Microsoft Exchange Administrator's Guide.
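The rules in the paragraph above for which steps apply can be restated as a small lookup. This Python sketch assumes the procedure has nine steps in total (the text refers to steps 8 and 9 as the last ones) and that when no backup is available the shorter list applies regardless of the circular logging setting; the function name valid_steps is made up for illustration.

```python
def valid_steps(circular_logging: bool, backup_available: bool):
    """Return the step numbers of the recovery procedure that apply."""
    if not backup_available:
        return [1, 2, 8, 9]           # no backup: nothing to restore from
    if circular_logging:
        return [1, 2, 6, 7, 8, 9]     # logs were overwritten: no roll-forward
    return list(range(1, 10))         # full logs plus backups: all steps


print(valid_steps(circular_logging=False, backup_available=True))
print(valid_steps(circular_logging=True, backup_available=True))
print(valid_steps(circular_logging=False, backup_available=False))
```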

The following steps should be followed in strict order to recover an Information Store that does not start. These steps followed in order will attempt to preserve, in descending order, as much data as possible.

1. Check the Windows NT Event Viewer Application Log for EDB, MSExchangeIS, MSExchangePriv, MSExchangePub messages. These error messages may give a clear reason for the problems with the Information Store. Two of the most common error messages reported to the Application log will be out of disk space or an error stating that Isinteg -patch needs to be run.

For additional information, please see the following articles in the Microsoft Knowledge Base:

ARTICLE-ID: Q128325

TITLE : XSRV IS: Reclaiming Disk Space for the Information Store

ARTICLE-ID: Q149238

TITLE : XSRV IS: Information Store Fails to Start with -1011 Error

2. Shut down all Microsoft Exchange Services, and reboot the Microsoft Exchange server. When the Information Store restarts it will automatically try to recover and return the database to a consistent state.

3. Make a full off-line backup (stop all Microsoft Exchange Services) of the Information Store. This should include all EDB and LOG files. (Important: the EDB and LOG files can be stored on different physical drives. To determine where they are located, look in the registry under the HKEY_LOCAL_MACHINE subtree under the following subkey:

\SYSTEM\CurrentControlSet\Services\MSExchangeIS\ParametersSystem

and look at the DB Log Path parameter.) This backup is taken as a precautionary measure to capture the existing state of the Microsoft Exchange Server before proceeding with the following steps. This step is necessary if/when you reach steps 8 or 9 of this procedure.

4. Restore the last FULL on-line backup. Be sure NOT to check the Start Services after Restore checkbox. Next, restore any incremental on-line backups of the Information Store (from the time of the last FULL on-line backup up to the day before the crash). Check the Start Services after Restore box only when the LAST incremental backup is being restored.

Do NOT check the box to erase all existing data.

When the Information Store starts it will roll forward through all the existing database logs and place the data into the restored Information Store. This will bring the Information Store back to the point of the crash. If successful there will be no loss of data at all.

NOTE FROM STEP 5 ON THERE WILL BE LOSS OF DATA!

5. If step 4 still will not start the Information Store, then go into the Event Viewer Application Log and review the logged messages for the source EDB; there will be one message per log file it replayed during the restore in step 4. If one of these EDB messages reports a problem replaying a particular log file, then go into the Mdbdata directory and remove the corrupted log file and all log files greater in number. Once these log files are removed, try restarting the Information Store. For example, if the Application Log says that Edb00012.log could not be processed or was corrupt, and in the Mdbdata directory the log files range in number from Edb00001.log to Edb00025.log, you should remove Edb00012.log through Edb00025.log and try restarting the Information Store. If successful, this will result in the loss only of the data stored in the removed log files.

6. If step 5 fails, restore the last full on-line backup of the Information Store. Check the box to Start Service After Restore. Do check the box to Erase all existing data. This will restore the Information Store to the point in time that the on-line backup was taken. Then go into the Microsoft Exchange Administrator and run the DS/IS consistency tool on the Advanced tab of the Server object properties page.

7. If step 6 fails, repeat step 6 with the next most recent version of either a full off-line or full on-line backup.

8. If step 7 fails, delete all .EDB and .LOG files from the Mdbdata directory and restore a copy of the Priv.edb and Pub.edb from the backup of the Database when the problem started (Step 3). Next go into the Exchsrvr\bin directory and run Edbutil /d /r /ispriv followed by Edbutil /d /r /ispub. This utility will defragment the private and public information stores and try to fix any database errors it encounters. Once Edbutil.exe has finished successfully, try restarting the Information Store. If the Information Store starts, Microsoft highly recommends that Isinteg -fix be run against both the private and public information stores to clean up any inconsistencies that may have arisen as a result of Edbutil. Running “edbutil /d /r” can delete data. Running this should be a last resort and you should always work with Microsoft PSS regarding using this command.

For additional information about the Edbutil and Isinteg utilities, please refer to the Troubleshooting section in Volume 2 of the Microsoft Exchange Administrator's Guide or see the following article in the Microsoft Knowledge Base:

ARTICLE-ID: Q143233

TITLE : XSRV Adm: Command Line Parameters for Edbutil.exe

9. If all of the above steps fail, then as a last resort the Information Store can be wiped. To determine if the problem exists in either the PUBLIC or PRIVATE Information Store, you must wipe them one at a time starting with the Public Store. This process irrevocably deletes all user mail messages, all user folders (wiping the Private Store, Priv.edb) and all Public Folders (wiping the Public Store, Pub.edb).

To Wipe the Public Information Store do the following:

1. Ensure that you have completed step 3 (full backup) or copy the Exchsrvr\Mdbdata directory to another location on the hard drive.

2. In the Exchsrvr\Mdbdata directory delete all Edb*.log files, all Res*.log files, Edb.chk, and Pub.edb.

3. Now restart the Information Store service. If it starts, you have lost all Public Folder information (a new Pub.edb, Res*.log and Edb.chk will be created automatically) and all information in the log files. However, you will retain all the user mail messages and folders that were stored in the Private Information Store (Priv.edb).

If wiping the Public Information Store fails then do the following:

1. Remove all information that exists in the Mdbdata directory.

2. Bring back a copy of the Pub.edb from tape or alternate location.

3. Try restarting the Information Store. If the service starts, Public Folder information will be intact; however, all user mail messages, folders and information in the logs will be lost. The users' mailboxes will be recreated the next time they log in.

If all of the above fails, remove all information from the Exchsrvr\Mdbdata directory and then restart the Information Store service. This will return it to installation defaults.

The DS/IS consistency checker (Advanced tab of the Server object) can be used to clear up any Directory/Information Store inconsistencies.
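The log selection in step 5 above (remove the corrupted log file and every log file greater in number) can be sketched as follows. This is a minimal illustration, not a Microsoft tool: the function name is hypothetical, and it assumes numbered logs are named Edb plus five hexadecimal digits, so the comparison is done on the hexadecimal value rather than on the text.

```python
import re

# Assumption: numbered logs look like Edb00012.log (five hex digits).
LOG_NAME = re.compile(r"^edb([0-9a-f]{5})\.log$", re.IGNORECASE)


def logs_to_remove(log_names, corrupt_name):
    """Return the corrupt log and every higher-numbered log, in ascending order."""
    match = LOG_NAME.match(corrupt_name)
    if match is None:
        raise ValueError(f"not a numbered transaction log: {corrupt_name}")
    threshold = int(match.group(1), 16)
    selected = []
    for name in log_names:
        m = LOG_NAME.match(name)
        # Names such as Edb.log, Res1.log, Edb.chk and *.edb do not match and are skipped.
        if m and int(m.group(1), 16) >= threshold:
            selected.append((int(m.group(1), 16), name))
    return [name for _, name in sorted(selected)]
```

The function only builds the list; moving the files to a holding directory, rather than deleting them, keeps the option of putting them back.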

MORE INFORMATION

The Microsoft Exchange Server Information Store was designed to be a recoverable system. It relies on a daily backup procedure and transaction logs to ensure this recoverability. It is HIGHLY recommended that a DAILY backup procedure is in place and that these backups are verified regularly.

XADM: Err Msg: Error -550 Has Occurred

Article ID: Q143235

Revision Date: 25-JUL-1996

The information in this article applies to:

- Microsoft Exchange Server, version 4.0

SYMPTOMS

If the Exchange Server computer hangs or was not shut down gracefully after stopping all the services properly, the following error may be displayed on screen and in the event logs:

Error -550 has occurred

This message may also appear in the Directory or Information Store database in case of a power failure.

RESOLUTION

This error usually means that the database is in an inconsistent state and cannot start. Confirm that the state of the database is inconsistent, and then try a defragmentation repair. Be sure to stop all services and back up all files before you run the EDBUTIL.EXE program.

To check the state of the database, use EDBUTIL.EXE with the "MH" option on the problem database and dump the output to a text file:

EDBUTIL /MH c:\exchsrvr\dsadata\dir.edb >c:\edbdump.txt

-OR-

EDBUTIL /MH c:\exchsrvr\mdbdata\priv.edb >c:\edbdump.txt

-OR-

EDBUTIL /MH c:\exchsrvr\mdbdata\pub.edb >c:\edbdump.txt

Open the EDBDUMP.TXT file and confirm that the state of the database is Inconsistent. If the database is in an Inconsistent state, run the EDBUTIL.EXE utility against the database. You must complete a full backup of the Exchange Server's Directory and Information Stores before you use the EDBUTIL utility. To repair the database, use the following EDBUTIL syntax:

EDBUTIL /D /DS /R

where D=Defragmentation and R=Repair database while defragmenting. Running “edbutil /d /r” can delete data. Running this should be a last resort and you should always work with Microsoft PSS regarding using this command.

Use /ISPRIV or /ISPUB instead of /DS for repairing the private or public Information Stores. There is a difference between the Repair (/d [database] /r) database while defragmenting option and the Recovery (/r) option of EDBUTIL.
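The manual check of EDBDUMP.TXT described above could be scripted along these lines. This is a sketch under a stated assumption: that the dump contains a line of the form "State: Inconsistent" or "State: Consistent". The exact layout of the EDBUTIL /MH output is not given in this article, so verify it against a real dump before relying on the pattern. Both function names are hypothetical.

```python
import re


def database_state(dump_text):
    """Return the value of the first 'State:' line in the dump, or None if absent."""
    for line in dump_text.splitlines():
        # Assumed format: optional indentation, 'State', a colon, then the value.
        match = re.match(r"\s*State\s*:\s*(.+?)\s*$", line, re.IGNORECASE)
        if match:
            return match.group(1)
    return None


def is_inconsistent(dump_text):
    """True only when a state line was found and it reports an inconsistent database."""
    state = database_state(dump_text)
    return state is not None and state.lower().startswith("inconsistent")
```

Returning None when no state line is found keeps a format mismatch from being mistaken for a consistent database.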

After repairing the database, move the EDB.CHK file out of the c:\exchsrvr\dsadata directory (in case of Directory database repair) or from the c:\exchsrvr\mdbdata directory if you are repairing the Information Store. Now try to restart the services; this will create a new EDB.CHK file.

If the above does not work (the service does not start), then the LOG files may have a problem. When the service tries to start up after it was not stopped gracefully, it reads the LOG files. Try moving all files from the c:\exchsrvr\dsadata directory except for DIR.EDB, or from the c:\exchsrvr\mdbdata directory except for PRIV.EDB and PUB.EDB, to a TEMP directory. Now try to restart the services; the new log files will be re-created upon startup.

The EDB-HexId.log files that were moved to the TEMP directory are required for incremental backups. If you are able to start the service successfully, move these back to the original database directory.
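The move described above (everything except the database files goes to a TEMP directory, and can be moved back later) can be sketched as follows. The function name and the default keep list are illustrative assumptions; for the Directory database the keep list would be DIR.EDB instead. Stop all services before running anything like this.

```python
import shutil
from pathlib import Path


def move_aside(db_dir, temp_dir, keep=("priv.edb", "pub.edb")):
    """Move every file in db_dir except the named databases into temp_dir."""
    db_dir, temp_dir = Path(db_dir), Path(temp_dir)
    temp_dir.mkdir(parents=True, exist_ok=True)
    keep_lower = {name.lower() for name in keep}
    moved = []
    for entry in sorted(db_dir.iterdir()):
        # Compare names case-insensitively, as Windows file systems do.
        if entry.is_file() and entry.name.lower() not in keep_lower:
            shutil.move(str(entry), str(temp_dir / entry.name))
            moved.append(entry.name)
    return moved
```

The returned list records what was moved, which makes it straightforward to move the same files back once the service has started.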

If you are having further problems and the service still returns the error above, please contact your primary support provider.

Additional reference words: 4.00 winnt XSRVDS

How To Remove The First Exchange Server In A Site

Article ID: Q152959

Revision Date: 26-SEP-1996

The information in this article applies to:

- Microsoft Exchange Server, version 4.0

- Microsoft Exchange Server, version 4.0a

SUMMARY

This article outlines the steps necessary to remove the first Microsoft Exchange Server installed in a Microsoft Exchange site.

In addition to any mailboxes and public folders, by default the first Server in a Site will contain and be responsible for the Site Folders. Site Folders consist of the Off-line Address Book folder (OAB), the Schedule+ Free Busy Information folder, and the Organizational Forms folder, if one exists. Other Servers installed in the Site rely, by default, on the first Server for this information. For example, in order for the third Server in the Site to generate the OAB, it must make a connection to the first Server. Removing the first Server in the Site without performing the steps in this article might lead to the data contained in these Site Folders becoming inaccessible or lost.

The first Server in the Site is also, by default, the Routing Calculation Server. The Routing Calculation Server is responsible for updating the Site's Gateway Routing Table (GWART). This responsibility also must be reassigned before removing the first Server from the Site.

If you have removed the first Server in the Site before reading this article, please see the following article in the Microsoft Knowledge Base:

ARTICLE-ID: Q152960

TITLE : Rebuilding the Site Folders in a Site

MORE INFORMATION

Before removing the first Microsoft Exchange Server in the Site, follow these steps to avoid problems:

Important If there are any users or Public Folders (non-Site) homed on this Microsoft Exchange Server, the mailboxes must be moved to another Server in the Site, and any Public Folders must be replicated to other Servers in the Site, to ensure no loss of data. Please refer to the Microsoft Exchange Administrator Guide, chapters 12 and 15, for more information.

Off-line Address Book:

1. Pick a Server in the Site to contain the OAB.

2. Using the Microsoft Exchange Administrator program, select the Configuration container and open the properties of the DS Site Configuration object.

3. In the DS Site Configuration object's properties page, select the Offline Address Book tab.

4. In the Offline Address Book Server drop down list, select the Server that was chosen in Step 1.

5. Click the Generate Offline Address Book Now button.

6. Click OK.

Schedule+ Free Busy Information and Organizational Forms:

1. Pick a Server in the Site to contain the Schedule+ information and the Organizational Forms.

2. Using the Microsoft Exchange Administrator Program, select the Configuration container, the Servers container, and the Server chosen in step 1.

3. Double-click the Public Information Store object.

4. Click the Instances tab.

5. In the Public Folders list box, select the Schedule+ Free Busy Information and Organization Forms folders and click Add. Note that these folders should have the name of the first Server in the Site after the dash, in other words, Schedule+ Free Busy Information - firstserver. This process creates a replica of these folders on the Server chosen in step 1.

6. Click OK.

Routing Calculation Server:

1. Pick a Server in the Site to be the new Routing Calculation Server.

2. Using the Microsoft Exchange Administrator program, select the Configuration container and double-click the Site Addressing object.

3. Click the General tab.

4. In the Routing Calculation Server drop down list box, select the Server chosen in Step 1.

5. Click the Routing tab and click the Recalculate Routing button.

6. Click OK.

Note We recommend that the first Server in the Site be powered off or unplugged from the network temporarily after the above steps are performed to verify that this procedure has completed successfully. Once the changes have been verified to work, turn the first Server back on, or plug it back into the network, and then perform the following steps to permanently remove this Server from the Site:

1. Using the Microsoft Exchange Administrator program, select the Configuration container and then the Servers container.

2. Highlight the first Server in the Site.

3. Click Delete from the Edit menu or hit the delete key on the keyboard.

Authoritative Restore

If you perform a restore and then find that directory information on the restored server changes or is automatically purged, you may be experiencing an undesired backfill state: changes that previously replicated out from the restored server are replicating back from another server, because that server holds a change record that is more up-to-date than the one reflected in the restored database.

The Authoritative Restore tool (Authrest.exe) available on the Microsoft Exchange server CD allows you to force a restored directory database to replicate to other servers after restoring from a backup. You can receive assistance using this tool from Microsoft Product Support Services.

Normally, a restored database is assumed to be more out-of-date than the collective information held on all the other directory replicas in the organization. A restored directory would normally replace its own information with the more recent data held by other servers. This functionality is correct when the reason for the restore is that a database or server was destroyed, but it is not correct in all cases. For example, if an administrative error deleted thousands of mailboxes or vital configuration information, the goal of restoring from backup is not to restore one server to functionality, but to move the entire system back to before the undesired changes were made.

Without Authoritative Restore, you would need to restore every server in the organization from a backup that predates the error or restore every server in the site, and then force all bridgeheads in other sites to resynchronize from scratch. If only one server were restored, or if servers were restored one at a time, the restored server would quickly overwrite its restored data with the more recent (incorrect) information held by all other servers in the site.

Using the Authoritative Restore tool, object versions and USNs can be advanced on all writable objects held by that directory so that the data held on the backup appears to be more recent than any copy held by other servers. Normal replication then causes the restored information to spread to all servers throughout the organization. This tool allows you to restore one server (presumably the one server with the most recent pre-mistake backup) rather than all servers.
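The idea in the paragraphs above can be illustrated with a toy model. This is not how Authrest.exe is implemented; it is a simplified sketch in which each directory object carries a single version number, replication keeps the higher version, and a deletion is represented by a value of None. The function names and the size of the version bump are hypothetical.

```python
def replicate(local, remote):
    """Merge two replicas: for each object, the copy with the higher version wins."""
    merged = dict(local)
    for name, (version, value) in remote.items():
        if name not in merged or version > merged[name][0]:
            merged[name] = (version, value)
    return merged


def authoritative_restore(restored, bump=100000):
    """Advance every object's version so the restored copies outrank all others."""
    return {name: (version + bump, value) for name, (version, value) in restored.items()}


# A backup taken before a mailbox was deleted by mistake.
backup = {"alice": (3, "mailbox")}
# Another server already holds the (more recent) deletion.
other_server = {"alice": (4, None)}

# Plain restore: the deletion replicates back and overwrites the restored object.
plain = replicate(backup, other_server)

# Authoritative restore: the restored object now wins in both directions.
forced = authoritative_restore(backup)
after = replicate(other_server, forced)
```

In this model `plain["alice"]` still holds the deletion, while `after["alice"]` holds the restored mailbox, which mirrors the difference between a normal restore and an authoritative one.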

The authrest.exe executable file can be found on the Microsoft Exchange Server CD in the directory “\Support\Utils\<platform>”.

Restoring Service Packs

When restoring databases, it is important that the restored databases are run under the same Microsoft Exchange version that they had previously run under. Therefore, you should not start services until all of the code is up-to-date. For example, if you are running at Microsoft Exchange service pack 2 but have the original server CD and SP2 code, you should have the SP2 code loaded before running Microsoft Exchange with your restored databases from an SP2 level server.

To accomplish this, you can use the SETUP /R and UPDATE /R switches for the original server code and service pack installations. This tells the setup program NOT to start services. The “/R” switch also assumes that you will be providing the database files from a restore. You can also run SETUP and UPDATE without the “/R” switches, and when at the correct service pack level, perform a restore of your databases replacing the new databases that were installed by SETUP. Be sure to follow the appropriate restore procedures.

Note that if you run SETUP /R, it will NOT create the DIR, PUB, and PRIV.EDB files. Normally these files are created as per the Org and Site name given during setup. SETUP /R simply copies the DIR.EDB exactly the way it is from the Microsoft Exchange Server CD. You will not be able to start the DS service with this default DIR.EDB after running SETUP /R.

Also note that when running SETUP /R you must restore ALL of the database files (DIR, PUB, and PRIV.EDB). If you plan to restore only the Information Store and not the Directory Store, then DO NOT run SETUP /R.

Restoring From An OST After Mailbox Deletion

OSTs are "slave" replicas of server-based folders. If you delete the master, the slave is orphaned.

If the original Exchange profile was not modified, then you should still be able to start up offline with the old OST and recover the data by copying it to a PST file. However, if the old profile was deleted or modified (by using it to log on to the new mailbox), then the data is lost.

This is because of how security is enforced on OSTs -- obviously we can't enforce NT authentication while you're offline. Instead, you have to "prove" that you're allowed to log onto the server-based master before the OST file will give you local access. Exchange does this by creating an encrypted "cookie" from your mailbox's unique entry ID, while you're successfully logged into the server. This cookie is securely stored in your Exchange profile -- essentially your profile stores the "key" for the OST. Every time you try to access the OST file, it checks your profile for the existence of this "key".

So OST data can be un-recoverable if two things happen: 1) you delete the master server mailbox, and 2) you also delete or modify the profile containing the "key" to the OST.

Using SCANPST.EXE To Repair PST and OST Files

The SCANPST program, also known as the Inbox Repair Tool, will repair both PST and OST files. This tool is similar to the MMF check capability in Microsoft® Mail and is installed in the Microsoft Exchange client subdirectory by default. SCANPST will perform 8 checks on the selected file. Note that during repair, you have the option to back up the existing file prior to making the repair. This does, however, require free disk space of up to the size of the PST or OST file.

Figure 1.3 SCANPST screen.

S.I.D.s (Security Ids), Secret Objects, and Windows NT-based Machine Accounts

Figure 1.4 Example of Windows NT Secure Channel During Normal Production.

Note that the Windows NT SID for EXS1 is “xyz”. Each Windows NT-based machine has a unique SID which is used for domain authentication. Note that “xyz” was used for the example and is not the actual SID format. In order to connect to the domain, the Windows NT BDC or Member Server must have a matching SID and LSA (local security authority) password in order for authentication to take place.

Figure 1.5 Secure Channel Failure.

This is an example of what occurs if you do not first delete and re-add the Microsoft Exchange-based server machine account before installing a recovery server. If a recovery server is rebuilt installing Windows NT from scratch, and the same machine name is used, NETLOGON will fail since the old machine account and SID remain in the domain SAM and can only be reset from within the Server Manager program by deleting and re-adding the machine account.

Figure 1.6 A Re-established Machine Account.

This is an example of a re-established machine account. When the old machine account is deleted and re-added to the domain SAM, the SAM entry is first set to an initialized state. When the new server is added, a local LSA secret object is created along with a SID, thus synchronizing the LSA secret object (stored locally on the BDC or Member Server) with the SAM object for the respective machine. Additionally, a password is generated that is used whenever the BDC or Member Server machine logs on to the domain. This process creates a secure channel between the BDC or Member Server and the PDC. This secure channel password is changed automatically by NETLOGON to prevent the password from being discovered.

Note that the LSA secret object is created by setup during the initial installation or when a server joins a domain. The SAM machine account however, is created by the Server Manager program. For more information refer to PSS article Q102476.

GENERAL PRACTICES

Create and Verify Daily Backups

This is a critical step in disaster recovery. It sounds simplistic, but you can only recover data if you have a valid backup. It is often “assumed” that backup tapes are being swapped and that data is being properly backed up. It should be a daily routine to review all backup logs and to follow up on any errors or inconsistencies. Furthermore, Full (Normal) backups reset and remove transaction logs. This results in free disk space (this is less of an issue if circular logging is enabled). If circular logging is not enabled and daily Full backups are failing, transaction logs will not be purged and can fill up the entire transaction log disk drive. Failure to verify backups is one of the most common mistakes made.
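To make the risk of failed backups concrete, the time until the transaction log drive fills can be estimated with simple arithmetic. This is a rough sketch: the function name and the figures in the usage note are hypothetical, and real log growth varies with mail volume.

```python
def days_until_log_drive_full(free_mb, log_mb_per_day):
    """Estimate how many days of unpurged transaction logs the drive can absorb."""
    if log_mb_per_day <= 0:
        raise ValueError("log_mb_per_day must be positive")
    return free_mb / log_mb_per_day
```

For example, with 2,000 MB free and roughly 250 MB of new logs per day, `days_until_log_drive_full(2000, 250)` gives 8.0, meaning about eight days of failed Full backups would exhaust the drive.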

Perform Periodic File Based Backup

In order to capture all configuration data, it is best to perform a full file based backup periodically. Services should be shut down so that open files can be backed up. This will ensure that you have backed up all possible Microsoft Exchange related files. This might be performed during the scheduled maintenance window. Note that file based backup is not required for backing up the Information Store and Directory databases. On-line backups are recommended for backing up the Information Store and Directory.

Standardize Tape Backup Formats

Recovery equipment must be compatible with production tape equipment. If you deploy a new type of tape drive, make sure that you equip recovery equipment with a compatible model. You should also test reading and restoring production tape backups on the tape drive used for recovery.

Deploy a UPS and Test It Periodically

Don’t take the approach that if the Microsoft Exchange-based server “goes” due to a power outage, all other servers will go too. Make sure that you are UPS protected. Many computer rooms are supposedly UPS protected. Even though this may be the case, it is very possible that not all outlets are UPS protected. If you do not have a dedicated UPS, make sure that you speak with the local electricians or operations personnel and perform a test. It is wise not to make assumptions here because users will hold you accountable and not the person or paperwork that said the outlets were covered by UPS protection. Also note that server class UPS system batteries can wear out every 3 years or so and require replacement.

Perform A Periodic Fire Drill

This is the way to measure your ability to recover from a disaster and to certify your disaster recovery plans. Conduct this in a test environment and simply attempt a complete recovery. Be sure to use data from production backups. During this process it is best to record the time it takes to recover. This information will assist you in determining time to recovery in a real disaster recovery situation. From personal experience, up to 1/3 of the recovery time can be spent in preparing and getting the correct tools in place to complete the job. For maximum effect, provide no notice to your staff that you are performing a drill. This will be the most valuable experience that you will have in your disaster recovery planning.

Review the Environment When Placing Production Servers

Inspect the area when deploying servers. Make sure that the environment will be receptive. For example, make sure that there is enough power and if possible, dedicate power lines for your equipment. Review existing amperage and new amperage requirements. Make sure the servers are not placed under fire sprinklers. Also be sure to locate servers in a physically secure location and ensure that the room temperature is acceptable. The robustness of Microsoft Exchange can be compromised by failure to perform basic preventive maintenance routines when deploying servers.

Check Windows NT Event Logs Daily

It is best to take a proactive approach and review logs regularly. This can help you identify problems before they have an impact. Extensive logging is available in Microsoft Exchange and this should be leveraged. Logging tools are available on the Microsoft Exchange Server Technical Resource CD-ROM.

Create a Disaster Kit

Planning ahead will reduce the time to recovery. It is critical to build a kit that includes items such as operating system configuration sheet, hard drive partition configuration sheet, RAID configuration, hardware configuration sheet, EISA/MCA configuration disks, Microsoft Exchange configuration sheet, Windows NT emergency repair diskette, Microsoft Exchange Performance Optimizer settings sheet, etc. The goal is to minimize the time to recovery. I have found that a significant portion of recovery time in tests was spent trying to locate information or disks that we needed to configure our recovery system.

Publish A Microsoft Exchange Maintenance Window

A Microsoft Exchange-based server is no different than a car that requires oil changes and check-ups. Unlike mainframes, servers often get overlooked when it comes to scheduling downtime for maintenance. It is a simple formula: planned maintenance generally reduces unplanned downtime. It is important, though, to set user expectation levels by publishing a maintenance window, especially when users expect 7x24 service. Maintenance is inevitable since the nature of the data processing business includes service pack updates, software upgrades, and hardware upgrades. Although rare, it might be necessary to take down the Information Store service in order to reduce the size of store files using EDBUTIL.

Determine Downtime Cost

This is useful when justifying the purchase of recovery equipment. There are different models for calculating the per hour downtime cost and this varies per business. Some calculations include Lost Orders Per Hour, Delayed Financial Transactions, and the cost of Delayed Time Sensitive Market Decisions. See the National Computer Security Association News (July 1996) for detailed articles on justifying disaster recovery expenditures.
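One possible per-hour model, built from the components named above, is sketched below. It is only an illustration: the function names, the way the components are combined, and all figures in the usage note are hypothetical, and each business will weight these factors differently.

```python
def downtime_cost_per_hour(lost_orders_per_hour, average_order_value,
                           delayed_transaction_cost=0.0, delayed_decision_cost=0.0):
    """Sum the hourly cost components: lost orders plus the two delay costs."""
    lost_order_cost = lost_orders_per_hour * average_order_value
    return lost_order_cost + delayed_transaction_cost + delayed_decision_cost


def outage_cost(hours, cost_per_hour):
    """Total cost of an outage lasting the given number of hours."""
    return hours * cost_per_hour
```

For instance, 40 lost orders per hour at an average value of 250, plus 1,500 in delayed financial transactions and 500 in delayed market decisions, gives 12,000 per hour; a three-hour outage would then cost 36,000. A figure like this can be set against the price of dedicated recovery equipment.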

Consider Maintaining Off-Site Tapes and Equipment

Due to legal and/or security issues, certain companies opt not to send backup tapes to a third party off-site location. An alternative to this is to send tapes to an off-site location within the same company.

Dedicate Recovery Equipment And Build A Recovery Lab

It is important to dedicate hardware. Don’t fall into the trap of allowing test equipment to become production equipment without replacement. Make sure that the recovery equipment is always in working order and available at a moment's notice. What tends to happen is that companies purchase recovery equipment, install some “test only” software and then become dependent on this equipment for production use. In short, keep recovery equipment in a dedicated mode. Another reason to build a lab is for recovery purposes. Note that up to 2X the disk space of the largest production server Information Store database is required for recovery and database defragmenting using the EDBUTIL utility. It is more cost effective for an organization to maintain one recovery server with sufficient disk space.

Keep Solid Records Of All Configuration Done To The Production Server

This will be necessary when configuring the recovery server. Records include Windows NT tuning settings, path information, protocol addresses, Microsoft Exchange connector configuration, etc. These records should be part of the disaster recovery kit discussed above.

Take A Proactive Approach To Monitoring The Information Store

Monitor the growth of the Information Store and server performance, and be prepared with a plan to remedy any issues. Windows NT disk space alerts can also be set up to monitor remaining disk space. Performance Monitor objects exist for the Information Store and should be used.

Devise An Archiving Plan

An archiving plan will allow end users to move server based messages into local store files. This will help reduce the size of the server based Information Store. Have users store PSTs on local drives or on a separate disk or server from that of the Information Store. Dedicate a file server for PST archiving if required. Otherwise, data will be reduced in the Store but added to another area of the same disk or logical drive. The hit will be greater since PST storage maintains messages in both RTF and ASCII format. Note also that disk space limits cannot be set on PST files. Be sure to include all sensitive data in backup strategies, including end user PST files. Use encryption when creating .OST and .PST files.

Microsoft Exchange Configuration Considerations

Consider Microsoft Exchange Server Roles

You should always avoid making the Microsoft Exchange server a PDC. If this machine becomes unavailable, an alternate domain controller must be promoted to become the primary domain controller. If the Microsoft Exchange-based server is not the PDC, you don't need to worry about promotions and demotions of Domain Controllers in a recovery situation.

Some companies prefer to place the Microsoft Exchange Server on a BDC in the accounts domain so that a second machine is not required for Windows NT authentication in remote offices. This can save in the cost of purchasing another machine, however, be sure to account for additional RAM overhead for the Windows NT SAM in addition to Microsoft Exchange server memory requirements. In general, Windows NT domain controllers require RAM equal to 2.5x the size of the SAM. See the Microsoft Windows NT domain planning white paper for details.

If the Microsoft Exchange-based server is a member server and not a PDC or BDC, additional memory overhead for the domain SAM will not be required, however, for remote offices, companies can save money by having the local Microsoft Exchange-based server provide authentication (BDC) and messaging services. Note that for a proper DS restore, access to the original SAM is required. Never install a Microsoft Exchange-based server in a domain that does not have a BDC.

An alternative is to place the Microsoft Exchange servers in a large resource domain that trusts each accounts domain. In this case, the Microsoft Exchange servers can be placed on BDCs without incurring significant memory overhead since the SAM for the Microsoft Exchange resource domain will be relatively small in size.

Locate Transaction Log Files On Separate Dedicated Physical Disk

This is the single most important aspect of Microsoft Exchange-based server performance; however, there are recovery implications as well. Transaction logs provide an additional mechanism for recovery.

Locate IS On RAID5 Stripe Set or Mirrored Set

Since the IS uses random access this provides excellent performance. Furthermore, these provide an added level of recoverability.

Disable SCSI Controller Write Cache

To avoid the potential for data loss, disable SCSI controller write cache. At a programming level, if the write through flag is set, Windows NT will not use buffers and therefore, when a program receives a write complete signal from Windows NT, it is guaranteed that the write was completed to disk. This is critical to the Microsoft Exchange transaction logging process. So if write cache is enabled, Windows NT will think that a write has made it to disk and will inform the calling application of this “false” information. This could result in data corruption if a crash is experienced before this lazy write operation makes it to disk.
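The write-through idea can be illustrated with a small logging routine that does not report success until the operating system confirms the data has been flushed. This is an analogy in Python, not Exchange's own code: the function name is hypothetical, and `os.fsync` stands in for the write-through behavior described above.

```python
import os


def append_durably(path, record: bytes):
    """Append a record and return only after the OS reports it flushed to the device."""
    with open(path, "ab") as log:
        log.write(record)
        log.flush()             # push Python's user-space buffer to the operating system
        os.fsync(log.fileno())  # ask the operating system to flush its cache to the device
```

Even with a call like this, a controller write cache can acknowledge the flush while the data is still in its own memory, which is exactly why the recommendation above is to disable it.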

Mirror Or RAID5 The OS Partition

This provides redundancy for the underlying operating system.

Use Hardware RAID And MIRRORING When Possible

Use hardware RAID5 wherever possible so that a disk drive failure can be remedied in real time by plugging in a replacement drive. Software RAID requires reconfiguration to add a new drive when bringing the system back to its original configuration following a failure. System partitions should be mirrored or RAID5 for redundancy.

Disable Circular Logging If Possible

While circular logging can help conserve disk space, its drawbacks are that Incremental and Differential backups are disabled and that transaction log history is cyclical and cannot be “played back”. Furthermore, if a solid backup strategy is in place, transaction log files are purged on a regular basis, thus freeing up disk space.

Place Limits On Information Store Attributes Early To Set User Expectations And To Properly Size Servers

Configure mailbox storage limits and maximum age of server based messages. Also limit MTA message sizes and the size of messages that users can send.

Configure The MTAs Accordingly

Configure the MTA frequency such that queues are cleared quickly. This prevents queued messages from accumulating in the Information Store. Also, design a redundant MTA path so that messages keep flowing in the event of a link outage. It is important that MTAs are able to keep up with the traffic that flows through them to reduce messages in the store and for timely message delivery.

Equip Servers With Sufficient Disk Space

Off-line maintenance and repair routines performed with the EDBUTIL utility require up to 2x the disk space of the database file being administered.

Figure 1.7 Sample configuration outlining distribution of Microsoft Exchange Information and Local Store data.

For optimal performance and recoverability, the Operating System drive should be mirrored (or RAID5); transaction logs should be placed on a dedicated physical drive (this too can be mirrored); Windows NT Swap file and the Information Store should be placed on a RAID5 stripe set.

Backup Type Strategies

This section deals with backup type strategies. For example, is it better to perform full backups every night? What are the tradeoffs? Depending on business requirements, backup strategies may vary.

Time Required For Backing Up

Figure 1.8 Time required for backup depends on backup type.

The time required for backing up depends on the backup type. As Figure 1.8 shows, performing a Full backup daily requires the most time. For smaller databases this is not much of an issue; however, once the data reaches the gigabyte range, daily Full backups are not always desirable. A Full backup in combination with Incremental and/or Differential backups may be more applicable to your situation.

Example A - Full Daily Backup

Schedule: SU:F, M:F, T:F, W:F, TH:F, FR:F, S:F

Advantages:

Always remove transaction log files

Only requires one tape restore

Simple schedule

Allows circular logging

Disadvantages:

Impacts server performance longest

Requires the most tape space

Daily tape swaps usually required

Example B - Full Plus One Incremental

Schedule: SU:F, M:I, T:F, W:I, TH:F, FR:I, S:F

Advantages:

Always remove transaction log files

Multiple full backups on separate tapes

Incremental has much less performance impact

Tape rotations are less frequent

At most, two tapes required for restore

Disadvantages:

Requires two tapes to restore

Must have knowledge of backup cycle

Circular logging must be disabled

Example C - Full Plus Two Incrementals

Schedule: SU:F, M:I, T:I, W:F, TH:I, FR:I, S:F

Advantages:

Always remove transaction log files

Full backups relatively frequent

Little performance impact on server

Incremental requires minimal tape space

Disadvantages:

Restore requires full plus each incremental - in this case up to three tapes

Must have knowledge of backup cycle

Circular logging must be disabled

Example D - Full Plus Two Differentials

Schedule: SU:F, M:D, T:D, W:F, TH:D, FR:D, S:F

Advantages:

Full backups fairly frequent

At most, two tapes required for restore

Little performance impact on server

Differentials require minimal tape space

Disadvantages:

Differential backups do not remove log files

Circular logging must be disabled

Summary

The strategy that you choose must fit your business requirements. A general rule, however, is to use Full daily backups for small data sets. For large data sets, use a combination of Full, Incremental, and/or Differential backups (Examples B, C, and D). This minimizes the performance impact and also minimizes the required tape space.
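The restore-side trade-off between Examples A through D can be checked with a short sketch. It assumes one tape per backup job and a week that starts with a Full backup, as in the schedules above; the names `SCHEDULES`, `restore_set`, and `worst_case_tapes` are hypothetical and not part of any Microsoft tool.

```python
# Weekly schedules from Examples A-D, Sunday through Saturday.
# F = Full, I = Incremental, D = Differential.
SCHEDULES = {
    "A": ["F", "F", "F", "F", "F", "F", "F"],
    "B": ["F", "I", "F", "I", "F", "I", "F"],
    "C": ["F", "I", "I", "F", "I", "I", "F"],
    "D": ["F", "D", "D", "F", "D", "D", "F"],
}


def restore_set(schedule, day):
    """Return the day indexes (oldest first) whose backups are needed to
    restore to the backup taken on `day`."""
    needed = []
    for i in range(day, -1, -1):
        kind = schedule[i]
        if kind == "F":
            needed.append(i)          # the most recent Full ends the search
            return needed[::-1]
        if kind == "I":
            needed.append(i)          # every Incremental since the Full
        elif kind == "D":
            if not any(schedule[j] == "D" for j in needed):
                needed.append(i)      # only the latest Differential
        else:
            raise ValueError(f"unknown backup type: {kind!r}")
    raise ValueError("no Full backup on or before the requested day")


def worst_case_tapes(schedule):
    """Largest number of backups any single-day restore would need."""
    return max(len(restore_set(schedule, d)) for d in range(len(schedule)))


if __name__ == "__main__":
    for name, schedule in SCHEDULES.items():
        print(name, worst_case_tapes(schedule))
```

Under these assumptions the worst cases match the tables above: one tape for Example A, two for B and D, and three for C.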

"HOT SPARE" Question

A common question is whether it is possible to maintain a live “hot spare” server running at all times for Microsoft Exchange recovery. The answer depends on how you define “hot spare”.

Configuring a recovery server with the same computer name is necessary for a complete server restore (which includes restoring the Directory and the Information Store). For this reason, a secondary recovery server cannot remain on-line, since duplicate NetBIOS names would exist. Also, two machines with the same name cannot exist within a Windows NT domain. Without restoring the Microsoft Exchange Directory, you can get back individual mailboxes quite easily; however, all objects will need their Windows NT security reconfigured. This can be a complex operation since it is manual.

You can, however, prepare recovery equipment with copies of all required production code. For Single Mailbox Recovery, a Windows NT Server-based machine can be kept on-line since recovery does not include restoring the Microsoft Exchange directory. The Microsoft Exchange requirement for single mailbox restore is that the same Site and Org names are used on the recovery server. If the recovery server is only servicing one Site, then Microsoft Exchange can be up and running but not connected to the production site.

In the case of full server recovery, keep installation code on the recovery server (e.g., \ntinstall\i386, \patches\sp4, \exchinst\i386). The recovery server hardware should be equipped with the same capacity as the production server. In fact, the recovery server can run under a different machine name so that it can do non-critical work, such as serving as an additional Remote Access Service server or Microsoft Mail MMTA. If the machine is required for Full Server Recovery, it can be renamed and/or Windows NT can be reinstalled.

Note that as cluster technology becomes available, the BackOffice™ family of products such as Microsoft Exchange and Internet Information Server can capitalize on the high availability that clusters provide. For more information, see http://www.microsoft.com/ntserver/clustrmb.htm.

ON-LINE BACKUP AUTOMATION EXAMPLE: IS/DS

Install the WINAT.EXE program from the Windows NT 3.51 Resource kit into c:\winnt35 on the desired machine.

Create a Windows NT Common Group called Microsoft Exchange Backup.

Create an icon for the c:\winnt35\backup.log file. This will provide quick access to review the backup log.

Copy the NTBACKUP.EXE icon from the Administrative Tools group to the Microsoft Exchange Backup Group.

Create an icon for WINAT.EXE in the Microsoft Exchange Backup group.

From control panel, services, highlight the SCHEDULE service and click on the STARTUP button. Configure for automatic startup and assign an ID that is a member of the Windows NT Backup Operators group. Be sure to enter the correct password. If the administrator ID password changes, note that you must change the password for the SCHEDULE service. When done, start the SCHEDULE service.

Create the backup batch file. Name this file BACK.BAT and save it in the c:\winnt35 subdirectory. See below for an example.

Run the WINAT.EXE program and schedule the BACK.BAT file. Note that you do not need to have a logon session on the machine in which WINAT is running since the SCHEDULE service will log on to perform the operation under the defined security context. Be sure to set the batch job for “interactive” mode.

Sample Batch File For On-line Backup

rem ** 3/7/96 Backup Written by Joseph Pagano

rem ** This will backup the IS and DS on both WNTEXS1 and WNTEXD1.

ntbackup backup DS \\WNTEXS1 IS \\WNTEXS1 /v /d "WNTEXS1 IS-DS" /b /t Normal /l c:\winnt35\backup.log /e

ntbackup backup DS \\WNTEXD1 IS \\WNTEXD1 /a /v /d "WNTEXD1 IS-DS" /b /t Normal /l c:\winnt35\backup.log /e

exit

Sample Batch File For Off-line Backup: Example 1

Note that you may need to experiment with the order in which you stop the services so that you are not prompted when another service depends on the one you are stopping.

rem ** stop Microsoft Exchange Services

rem ** you can stop Microsoft Exchange services and restart them automatically to back up

rem ** files that a particular service may hold open

REM // stop all services

echo Stopping Services...

net stop MSExchangeMSMI

net stop MSExchangePCMTA

net stop MSExchangeFB

net stop MSExchangeDX

net stop MSExchangeIMC

net stop MSExchangeMTA

net stop MSExchangeIS

net stop MSExchangeDS

net stop MSExchangeSA

ntbackup backup c:\ d:\ /a /v /d "Full File Based Backup" /b /l c:\winnt35\backup.log /e

REM edbutil OPTIONS

net start MSExchangeSA

net start MSExchangeDS

net start MSExchangeIS

net start MSExchangeMTA

net start MSExchangeIMC

net start MSExchangeDX

net start MSExchangeFB

net start MSExchangePCMTA

net start MSExchangeMSMI

Sample Batch File For Off-line Backup: Example 2

Note that you are able to start and stop PCMTA services by enclosing the service name in quotes. You can determine the service names from the Microsoft Exchange administrator program, the Windows NT control panel, or by looking into the Windows NT registry. If using the Windows NT registry, select HKEY_LOCAL_MACHINE, SYSTEM, CurrentControlSet, Services. All services will be listed in alphabetical order.

rem Batch File To Stop and Restart Microsoft Exchange Services

rem For File Based Backup

echo Stopping Services ...

net stop MSExchangeMSMI

net stop MSExchangePCMTA

net stop MSExchangeFB

net stop MSExchangeDX

net stop MSExchangeMTA

net stop MSExchangeIMC

net stop MSExchangeIS

net stop MSExchangeDS

net stop "PC MTA - HUB"

net stop MSExchangeSA

ntbackup BACKUP d:\exchsrvr\mdbdata /v /d "File Based Backup" /b /l c:\winnt35\backup.log /e

net start MSExchangeSA

net start MSExchangeDS

net start MSExchangeIS

net start MSExchangeMTA

net start MSExchangeIMC

net start MSExchangeDX

net start MSExchangeFB

net start MSExchangePCMTA

net start MSExchangeMSMI

net start "PC MTA - HUB"
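In Example 1 the services are started in exactly the reverse of the order in which they were stopped. A small generator can keep the two halves of such a batch file in step. This is a sketch under that assumption; `build_batch` and `quote` are hypothetical helper names, and the correct order for a particular server still has to be confirmed by experiment as noted earlier.

```python
# Stop order taken from Example 1.
STOP_ORDER = [
    "MSExchangeMSMI",
    "MSExchangePCMTA",
    "MSExchangeFB",
    "MSExchangeDX",
    "MSExchangeIMC",
    "MSExchangeMTA",
    "MSExchangeIS",
    "MSExchangeDS",
    "MSExchangeSA",
]


def quote(service_name: str) -> str:
    """Wrap names that contain spaces (such as PC MTA services) in quotes."""
    return f'"{service_name}"' if " " in service_name else service_name


def build_batch(stop_order, backup_command):
    """Return batch-file lines: stop services, run the backup, restart in reverse."""
    lines = ["echo Stopping Services..."]
    lines += [f"net stop {quote(name)}" for name in stop_order]
    lines.append(backup_command)
    lines += [f"net start {quote(name)}" for name in reversed(stop_order)]
    return lines


if __name__ == "__main__":
    backup = (r'ntbackup backup d:\exchsrvr\mdbdata /v /d "File Based Backup" '
              r'/b /l c:\winnt35\backup.log /e')
    print("\n".join(build_batch(STOP_ORDER, backup)))
```

The printed lines can be saved as the batch file and scheduled with WINAT in the same way as BACK.BAT.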

WINAT Scheduler and The Windows NT Schedule Service

Figure 1.9 Windows® AT command scheduler.

Make sure that the back.bat jobs are set for interactive. This is required by the NTBACKUP.EXE program.

When using the WINAT scheduler program, the Windows NT Schedule service actually runs all jobs that have been scheduled. Since batch jobs are run in the context of the schedule service, Windows NT security must be taken into account. When configuring the Schedule service, configure the account to be a member of the Windows NT Backup Operators Group. This will allow for a full backup of the Information Store and Directory to take place.

Figure 1.10 Windows NT Schedule service configuration dialogue box.

A Real World Information Store Recovery Example

This procedure was used as a last-resort operation. This was a case where there was no recent backup available and Circular Logging was enabled, preventing the playback of past transactions. The IS PRIV.EDB was repaired using EDBUTIL. This is an example and should only be implemented with the assistance of Microsoft Product Support Services (PSS). Although extremely rare, an IS can become corrupted due to hardware failure or device driver failure.

A symptom that your IS is in an inconsistent state is when on-line backup of the IS fails. This is why it is critical to review backup logs and the Windows NT event log on a regular basis. In a real-world situation, the following errors appeared in the Windows NT System event log:

Event ID: 23; Source: EDB; Type: Error; Category: Database Page Cache; Description: MSExchangeIS ((458) Direct read found corrupted page error -1018 ((-1:550144) (0-589866), 486912 1162627398 3480849804). Please restore the database from a previous backup.

Event ID: 8010; Source: NTBACKUP; Type: Error; Category: None; Description: Microsoft Exchange services returned ‘c80003fa’ from a call to ‘BackupRead()’ additional data.

The NTBACKUP log revealed that the on-line backup was failing 2.2GB into a 3.4GB IS backup. Until the problem could be resolved, an off-line backup strategy had to be executed. We used the batch file in Example 2 of the sample batch file section with the Windows NT AT scheduler to shut down Microsoft Exchange services, perform an off-line backup, and restart Microsoft Exchange. Note that in this case, the IS appeared fine from a user perspective and remained functional during business hours.

Listed below is the procedure that was used to repair the IS. Note that the procedure was performed strictly in a test lab prior to implementing in production. It is strongly advised that you pursue the same strategy and perform all repair testing in a lab environment prior to implementing in a production environment. This was executed on an IBM 720 server with 8GB disk capacity, RAID 5 drives. The PRIV.EDB file was 3.4GB.

IMPORTANT NOTE When running EDBUTIL with the /d (defragment) option, you will need available disk space of up to 1X the size of the EDB file(s) you are running it against (or a disk 2X the size of the EDB). For example, if your PRIV.EDB file is 3.4 GB and there is no empty space in the file, you will need 3.4 GB of available space for the defragment operation, not including any swap file growth that may temporarily occur. If there is 0.4 GB of empty space in the PRIV.EDB file, you will need 3 GB of available space. By default, the file TEMPDFRG.EDB is used to rebuild the EDB file. You can use the /t option when running EDBUTIL to redirect the TEMPDFRG.EDB file to a location other than the current directory and to give it a different name. You cannot, however, redirect this temporary file to a LAN-connected drive. WARNING: EDBUTIL /D /R will DELETE messages. Use only as a last resort. EDBUTIL /D does not delete messages and is used for off-line defragmenting and reducing EDB file size.
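The free-space arithmetic in the note can be written down as a quick pre-flight check. This is only a sketch of the rule stated above (space needed equals the EDB size minus the empty space inside it); the function names are hypothetical, and swap file growth is deliberately left out, as in the note.

```python
def defrag_space_needed_gb(edb_size_gb: float, empty_space_gb: float = 0.0) -> float:
    """Free disk space (GB) the temporary defragment file may need."""
    if not 0 <= empty_space_gb <= edb_size_gb:
        raise ValueError("empty space must be between 0 and the EDB size")
    return edb_size_gb - empty_space_gb


def can_defragment(free_disk_gb: float, edb_size_gb: float,
                   empty_space_gb: float = 0.0) -> bool:
    """True if the target drive has room for the temporary rebuild file."""
    return free_disk_gb >= defrag_space_needed_gb(edb_size_gb, empty_space_gb)


if __name__ == "__main__":
    # Figures from the note: a 3.4 GB PRIV.EDB with 0.4 GB of empty space.
    print(round(defrag_space_needed_gb(3.4, 0.4), 1))
    print(can_defragment(free_disk_gb=2.5, edb_size_gb=3.4, empty_space_gb=0.4))
```

If the check fails for the current directory, the /t option described above is the way to point the temporary file at a local drive with more room.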

Action | Time | Notes

NTBACKUP.EXE | ~2.5 hours | We stopped all Microsoft Exchange services and performed a file-based restore of PUB.EDB and PRIV.EDB from a recent production backup. Our approach was to reproduce the production problem in a test environment. We built a server with the same ORG and SITE name as production but used a different server name. This machine was also set up in a separate domain. This allowed us to restore an IS from a different machine (the production machine). Org = XYZ, Site = NJ1, Server = SERVER2.

isinteg -patch | ~5 minutes | After the restore, ran the isinteg -patch command to patch the GUIDs, which is required when restoring an off-line IS backup. We needed to start the DS and System Attendant prior to running this command.

Start Microsoft Exchange Services | ~3 minutes |

Ran Microsoft Exchange Admin; Highlighted Server; Selected File, Properties; Advanced Tab, All Inconsistencies, Adjust | ~5 minutes | This was to synchronize the IS with the DS on the test server, since we only restored the IS files and not the directory.

Associate NT account to test mailbox | ~5 minutes | After running the DS/IS consistency adjustment, we needed to use the administrator program to assign a valid Windows NT ID to a recovered mailbox.

Test Messaging | ~10 minutes | From the Microsoft Exchange Client, sent and received several messages to test basic message functionality.

NTBACKUP.EXE | ~1.5 hours | Ran an on-line ntbackup.exe job to test whether it would fail as it did in production. At 1.5 hours, the IS backup failed 2.2 GB into the job with an event ID 8010 error. This verified that we were testing with a failing IS, which was our objective. In essence, we reproduced the production problem in a test environment with a copy of the production IS.

Stop Microsoft Exchange Services | ~3 minutes |

edbutil /ispriv /d /r /n | ~2 hrs 30 minutes | From the directory d:\exchsrvr\mdbdata\, ran this to determine whether we could repair the corrupt IS.

isinteg -pri -fix -verbose -l isinteg.log | ~1 hr 10 min | Ran this against PRIV and PUB .EDB files.

Start Microsoft Exchange Services | ~3 minutes |

Test Messaging | ~10 minutes |

NTBACKUP.EXE | ~2.5 hours | Performed another on-line backup to see if the problem still remained. The on-line backup of the IS and DS completed successfully!
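For planning a lab rehearsal or an outage window, the approximate step times listed above can simply be added up. The sketch below uses the times from the table as given (all of them approximate, and specific to the 3.4 GB PRIV.EDB and hardware described); the step labels are shortened paraphrases and the helper names are hypothetical.

```python
# (step, approximate minutes) taken from the table above, in order.
STEP_MINUTES = [
    ("File-based restore with NTBACKUP.EXE", 150),
    ("isinteg -patch", 5),
    ("Start Exchange services", 3),
    ("DS/IS consistency adjustment", 5),
    ("Associate NT account to test mailbox", 5),
    ("Test messaging", 10),
    ("On-line backup that reproduced the failure", 90),
    ("Stop Exchange services", 3),
    ("edbutil /ispriv /d /r /n", 150),
    ("isinteg -pri -fix", 70),
    ("Start Exchange services", 3),
    ("Test messaging", 10),
    ("Final on-line backup", 150),
]


def total_minutes(steps) -> int:
    """Sum the approximate duration of every step."""
    return sum(minutes for _, minutes in steps)


def format_duration(minutes: int) -> str:
    """Render a minute count as hours and minutes."""
    hours, rest = divmod(minutes, 60)
    return f"{hours} h {rest:02d} min"


if __name__ == "__main__":
    print(format_duration(total_minutes(STEP_MINUTES)))
```

The total comes to roughly eleven hours of elapsed time, most of it spent in the three NTBACKUP runs and the EDBUTIL repair.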

Backing Up A Key Management Server

It is recommended that you back up the KM data files (i.e., C:\SECURITY\MGRENT) separately from other data, and keep these backup tapes more secure than your everyday backups. All of the actual keys in these files are 64-bit CAST encrypted, so the data is extremely secure, but remember that this database contains the private encryption keys of every user in your whole enterprise.

The problem with tape cartridges is that they are offline. If someone got hold of one, they could restore the files to their own server and then take all the time in the world to try to crack the key used for the database, with no fear of detection through online logins and the like.

With today's technology it is infeasible to crack a 64-bit key (an estimated 12+ years with $300K worth of dedicated crypto hardware, and much longer with PC technology**), but technology improves every year. You should get in the habit of treating the KM databases as among the most closely guarded assets in your entire information system.

** see http://www.bsa.org/policy/encryption/cryptographers.html for a discussion of key lengths, estimated time to crack, etc.

From the Microsoft Exchange Administrator Guide:

Advanced security data is stored in the \MGRENT subdirectory where you installed the KM server. You can back up the advanced security data using the Windows NT Backup utility to back up all files and subdirectories in this directory. Stop the Microsoft Key Management Server service before backing up.

Caution You should periodically back up the advanced security data. If a user’s security file is corrupted, or a security logon password is forgotten, encrypted messages, including all previously encrypted messages, cannot be opened by the user. The key must be recovered using “Recovering Advanced Security Keys” [see admin guide] and the advanced security data must be current for this operation to succeed.

The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication.

This document is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS DOCUMENT.

© 1996 Microsoft Corporation. All rights reserved.

Microsoft, Windows, Windows NT, and BackOffice are registered trademarks of Microsoft Corporation.

SCSI is a registered trademark of Security Control Systems, Inc.




