SQL SERVER ARCHITECTURE

TOPICS
Pages and Extents Architecture
Files and Filegroups Architecture
Transaction Log Architecture

PAGES AND EXTENTS ARCHITECTURE

The page is the fundamental unit of data storage in SQL Server.


An extent is a collection of eight physically contiguous pages. Extents are used to manage pages efficiently.

TOPICS TO BE COVERED

Understanding Pages and Extents
Managing Extent Allocations and Free Space
Tracking Modified Extents

UNDERSTANDING PAGES AND EXTENTS

PAGES
In SQL Server, the page size is 8 KB. This means SQL Server databases have 128 pages per megabyte. Each page begins with a 96-byte header that is used to store system information about the page. This information includes the page number, the page type, the amount of free space on the page, and the allocation unit ID of the object that owns the page.

EXTENTS
Extents are the basic unit in which space is managed. An extent is eight physically contiguous pages, or 64 KB. This means SQL Server databases have 16 extents per megabyte. To make its space allocation efficient, SQL Server does not allocate whole extents to tables with small amounts of data. SQL Server has two types of extents:
Uniform extents are owned by a single object; all eight pages in the extent can only be used by the owning object.
Mixed extents are shared by up to eight objects. Each of the eight pages in the extent can be owned by a different object.
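The figures quoted for pages and extents follow from simple arithmetic. The sketch below only re-derives them from the 8 KB page size, the 96-byte header, and the eight-page extent; the constant and function names are illustrative and are not SQL Server identifiers.

```python
PAGE_SIZE_BYTES = 8 * 1024   # 8 KB page
PAGE_HEADER_BYTES = 96       # header at the start of every page
PAGES_PER_EXTENT = 8         # eight contiguous pages per extent
BYTES_PER_MB = 1024 * 1024

extent_size_bytes = PAGE_SIZE_BYTES * PAGES_PER_EXTENT      # 65,536 bytes = 64 KB
pages_per_mb = BYTES_PER_MB // PAGE_SIZE_BYTES              # 128
extents_per_mb = BYTES_PER_MB // extent_size_bytes          # 16
bytes_after_header = PAGE_SIZE_BYTES - PAGE_HEADER_BYTES    # 8,096


def extent_of_page(page_id: int) -> int:
    """Index of the extent that contains the given zero-based page number."""
    if page_id < 0:
        raise ValueError("page_id must be non-negative")
    return page_id // PAGES_PER_EXTENT


if __name__ == "__main__":
    print(extent_size_bytes // 1024, pages_per_mb, extents_per_mb, bytes_after_header)
```

Pages 0 through 7 fall in extent 0 and pages 8 through 15 in extent 1, which is why an extent-level bitmap needs only one bit per eight pages.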

MANAGING EXTENT ALLOCATIONS AND FREE SPACE

GLOBAL ALLOCATION MAP (GAM)


GAM pages record what extents have been allocated. Each GAM covers 64,000 extents, or almost 4 GB of data. The GAM has one bit for each extent in the interval it covers. If the bit is 1, the extent is free; if the bit is 0, the extent is allocated.
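The "almost 4 GB" figure can be checked by hand. The sketch below uses the rounded 64,000-extent figure from the text and also shows that a bitmap with one bit per extent fits in a single 8 KB page; the variable names are illustrative.

```python
EXTENTS_PER_GAM = 64_000     # rounded figure quoted in the text
EXTENT_SIZE_KB = 64
PAGE_SIZE_BYTES = 8 * 1024
PAGE_HEADER_BYTES = 96

covered_kb = EXTENTS_PER_GAM * EXTENT_SIZE_KB    # 4,096,000 KB
covered_gb = covered_kb / (1024 * 1024)          # 3.90625 GB, i.e. "almost 4 GB"

bitmap_bytes = EXTENTS_PER_GAM // 8              # one bit per extent: 8,000 bytes
fits_in_one_page = bitmap_bytes <= PAGE_SIZE_BYTES - PAGE_HEADER_BYTES

if __name__ == "__main__":
    print(covered_kb, covered_gb, bitmap_bytes, fits_in_one_page)
```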

SHARED GLOBAL ALLOCATION MAP (SGAM)

SGAM pages record which extents are currently being used as mixed extents and also have at least one unused page. Each SGAM covers 64,000 extents, or almost 4 GB of data. The SGAM has one bit for each extent in the interval it covers. If the bit is 1, the extent is being used as a mixed extent and has a free page. If the bit is 0, the extent is not used as a mixed extent, or it is a mixed extent and all its pages are being used.
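Taken together, the GAM bit and the SGAM bit for an extent describe its current use. The following is a minimal sketch of that decoding, based only on the bit meanings given above; the function name and the returned strings are illustrative, and the combination GAM = 1 with SGAM = 1 is treated as inconsistent because a free extent cannot also be in use as a mixed extent.

```python
def extent_state(gam_bit: int, sgam_bit: int) -> str:
    """Decode the use of an extent from its GAM and SGAM bits."""
    if gam_bit not in (0, 1) or sgam_bit not in (0, 1):
        raise ValueError("bits must be 0 or 1")
    if gam_bit == 1 and sgam_bit == 0:
        return "free"
    if gam_bit == 0 and sgam_bit == 1:
        return "mixed extent with at least one free page"
    if gam_bit == 0 and sgam_bit == 0:
        return "uniform extent or full mixed extent"
    # GAM says free, SGAM says in use as a mixed extent: contradictory.
    raise ValueError("inconsistent combination: GAM=1 and SGAM=1")


if __name__ == "__main__":
    for gam, sgam in [(1, 0), (0, 1), (0, 0)]:
        print(gam, sgam, extent_state(gam, sgam))
```

To allocate a uniform extent, the engine looks for a GAM bit of 1; to place a single page in a mixed extent, it looks for an SGAM bit of 1.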

TRACKING FREE SPACE


Page Free Space (PFS) pages record the allocation status of each page and the amount of free space on it. A PFS page is the first page after the file header page in a data file (page number 1). It is followed by a GAM page (page number 2) and then an SGAM page (page number 3). Another PFS page appears approximately 8,000 pages after the first PFS page. There is another GAM page 64,000 extents after the first GAM page on page 2, and another SGAM page 64,000 extents after the first SGAM page on page 3. The following illustration shows the sequence of pages used by the Database Engine to allocate and manage extents.
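As a rough illustration of this repeating layout, the sketch below estimates where GAM and SGAM pages would fall in a data file. It assumes the rounded 64,000-extent interval quoted in the text, so the results are estimates only and should not be used to locate real allocation pages; all names are hypothetical.

```python
PAGES_PER_EXTENT = 8
EXTENTS_PER_INTERVAL = 64_000                               # rounded figure from the text
INTERVAL_PAGES = EXTENTS_PER_INTERVAL * PAGES_PER_EXTENT    # 512,000 pages
FIRST_GAM_PAGE = 2
FIRST_SGAM_PAGE = 3


def approx_allocation_pages(first_page: int, file_pages: int) -> list:
    """Estimated page numbers of a repeating allocation page in a file."""
    return list(range(first_page, file_pages, INTERVAL_PAGES))


gam_pages = approx_allocation_pages(FIRST_GAM_PAGE, 1_500_000)
sgam_pages = approx_allocation_pages(FIRST_SGAM_PAGE, 1_500_000)

if __name__ == "__main__":
    print(gam_pages)
    print(sgam_pages)
```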

TRACKING MODIFIED EXTENTS

SQL Server uses two internal data structures to track extents modified by bulk copy operations and extents modified since the last full backup.

Differential Changed Map (DCM)
Bulk Changed Map (BCM)

DIFFERENTIAL CHANGED MAP (DCM)


This tracks the extents that have changed since the last BACKUP DATABASE statement. If the bit for an extent is 1, the extent has been modified since the last BACKUP DATABASE statement. If the bit is 0, the extent has not been modified. Differential backups read just the DCM pages to determine which extents have been modified. The length of time that a differential backup runs is proportional to the number of extents modified since the last BACKUP DATABASE statement and not the overall size of the database.
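The behaviour described for the DCM can be modelled as a simple bitmap. The sketch below is a toy model, not SQL Server's on-disk format: the class and method names are hypothetical, and resetting the bits after a full backup is inferred from the statement that the map tracks changes since the last BACKUP DATABASE.

```python
PAGES_PER_EXTENT = 8


class DifferentialChangedMap:
    """Toy bitmap with one bit per extent (1 = modified since last full backup)."""

    def __init__(self, extent_count: int) -> None:
        self.bits = [0] * extent_count

    def mark_page_modified(self, page_id: int) -> None:
        # Any change to a page marks the whole extent that contains it.
        self.bits[page_id // PAGES_PER_EXTENT] = 1

    def extents_for_differential_backup(self) -> list:
        # A differential backup only needs the extents whose bit is 1.
        return [i for i, bit in enumerate(self.bits) if bit == 1]

    def reset_after_full_backup(self) -> None:
        self.bits = [0] * len(self.bits)


dcm = DifferentialChangedMap(extent_count=4)
for page in (3, 9, 10):
    dcm.mark_page_modified(page)
changed = dcm.extents_for_differential_backup()   # extents 0 and 1
dcm.reset_after_full_backup()
changed_after_full = dcm.extents_for_differential_backup()
```

The work done scales with the number of bits set, which is why the running time of a differential backup depends on how much has changed rather than on the size of the database.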

BULK CHANGED MAP (BCM)

This tracks the extents that have been modified by bulk-logged operations since the last BACKUP LOG statement. If the bit for an extent is 1, the extent has been modified by a bulk-logged operation after the last BACKUP LOG statement. If the bit is 0, the extent has not been modified by bulk-logged operations. BCM pages are relevant only when the database is using the bulk-logged recovery model. In this recovery model, when a BACKUP LOG is performed, the backup process scans the BCMs for extents that have been modified and includes those extents in the log backup. BCM pages are not relevant in a database that is using the simple recovery model, because no bulk-logged operations are logged. They are not relevant in a database that is using the full recovery model, because that recovery model treats bulk-logged operations as fully logged operations.

The interval between DCM pages and BCM pages is the same as the interval between GAM and SGAM pages: 64,000 extents. The DCM and BCM pages are located behind the GAM and SGAM pages in a physical file.

PAGES AND EXTENTS ARCHITECTURE END

QUESTIONS

FILES AND FILEGROUPS ARCHITECTURE

DATABASE FILES

Primary data files: The primary data file is the starting point of the database and points to the other files in the database. The recommended file name extension for primary data files is .mdf.
Secondary data files: Secondary data files make up all the data files other than the primary data file. The recommended file name extension for secondary data files is .ndf.
Log files: Log files hold all the log information that is used to recover the database. The recommended file name extension for log files is .ldf.

FILES AND FILEGROUPS ARCHITECTURE END

QUESTIONS

TRANSACTION LOG ARCHITECTURE

TOPICS TO BE COVERED

Transaction Log Logical Architecture
Transaction Log Physical Architecture
Checkpoints and the Active Portion of the Log
Write-Ahead Transaction Log

TRANSACTION LOG LOGICAL ARCHITECTURE

Each log record is identified by a log sequence number (LSN).
Log records are stored in a serial sequence as they are created.
Each log record contains the ID of the transaction that it belongs to.
Log records for data modifications record either the logical operation performed or the before and after images of the modified data.
Rollback operations are also logged.
Each transaction reserves space in the transaction log to make sure that enough log space exists to support a rollback, whether it is caused by an explicit rollback statement or by an error.
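The logical structure described above can be pictured as an append-only list of records. The sketch below is a toy model under stated assumptions: LSNs are plain increasing integers (real LSNs have more structure), and the class, field, and operation names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LogRecord:
    lsn: int               # position of the record in the serial sequence
    transaction_id: int    # transaction that the record belongs to
    operation: str         # description of what was done


class TransactionLog:
    """Append-only log that hands out increasing LSNs."""

    def __init__(self) -> None:
        self.records: List[LogRecord] = []
        self._next_lsn = 1

    def append(self, transaction_id: int, operation: str) -> LogRecord:
        record = LogRecord(self._next_lsn, transaction_id, operation)
        self._next_lsn += 1
        self.records.append(record)
        return record


log = TransactionLog()
log.append(10, "BEGIN")
log.append(10, "UPDATE row 1")
log.append(11, "BEGIN")
log.append(10, "ROLLBACK")     # rollbacks are logged like any other operation
lsns = [record.lsn for record in log.records]
```

Records from different transactions interleave in the sequence; the transaction ID on each record is what lets recovery group them again.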

TRANSACTION LOG PHYSICAL ARCHITECTURE

The transaction log is used to guarantee the data integrity of the database and for data recovery. The transaction log in a database maps over one or more physical files. Physically, the sequence of log records is stored efficiently in the set of physical files that implement the transaction log. The SQL Server Database Engine divides each physical log file internally into a number of virtual log files. Virtual log files affect system performance only when the log files are defined with small size and growth_increment values, because many small growth increments produce a large number of virtual log files. The transaction log is a wrap-around file: when the end of the log file is reached, new log records wrap around to the start of the file, provided that space there has been freed by truncation.

CHECKPOINTS AND THE ACTIVE PORTION OF THE LOG

Checkpoints flush dirty data pages from the buffer cache of the current database to disk. This minimizes the active portion of the log that must be processed during a full recovery of a database. During a full recovery, the following types of actions are performed:

The log records of modifications not flushed to disk before the system stopped are rolled forward.
All modifications associated with incomplete transactions, such as transactions for which there is no COMMIT or ROLLBACK log record, are rolled back.
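The two recovery actions can be sketched as a single pass over the log. This is a simplified illustration, not the Database Engine's recovery algorithm: it assumes a single "last flushed" LSN for all data pages, uses tuples of (LSN, transaction ID, operation) with hypothetical operation names, and only plans the work instead of applying it.

```python
def plan_recovery(records, last_flushed_lsn):
    """Return (LSNs to roll forward, transaction IDs to roll back)."""
    # Transactions with a COMMIT or ROLLBACK record are complete.
    finished = {txid for (_, txid, op) in records if op in ("COMMIT", "ROLLBACK")}
    # Modifications logged after the last flush never reached the data files.
    redo = [lsn for (lsn, _, op) in records
            if op == "UPDATE" and lsn > last_flushed_lsn]
    # Every other transaction seen in the log is incomplete.
    undo = sorted({txid for (_, txid, _) in records} - finished)
    return redo, undo


records = [
    (1, 10, "BEGIN"),
    (2, 10, "UPDATE"),
    (3, 11, "BEGIN"),
    (4, 11, "UPDATE"),
    (5, 10, "COMMIT"),
    (6, 11, "UPDATE"),
]
redo_lsns, undo_transactions = plan_recovery(records, last_flushed_lsn=2)
```

Here the updates at LSN 4 and 6 are rolled forward, and transaction 11, which has no COMMIT or ROLLBACK record, is then rolled back.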

ACTIVE LOG
The section of the log file from the MinLSN (minimum recovery LSN) to the last-written log record is called the active portion of the log, or the active log. This is the section of the log required to do a full recovery of the database. No part of the active log can ever be truncated; only log records in the parts of the log before the MinLSN can be truncated.

LONG-RUNNING TRANSACTIONS
The active log must include every part of all uncommitted transactions. An application that starts a transaction and does not commit it or roll it back prevents the Database Engine from advancing the MinLSN. This can cause two types of problems:

If the system is shut down after the transaction has performed many uncommitted modifications, the recovery phase of the subsequent restart can take much longer than the time specified in the recovery interval option.
The log might grow very large, because the log cannot be truncated past the MinLSN. This occurs even if the database is using the simple recovery model, in which the transaction log is generally truncated on each automatic checkpoint.

REPLICATION TRANSACTIONS
The Log Reader Agent monitors the transaction log of each database configured for transactional replication, and it copies the transactions marked for replication from the transaction log into the distribution database. The active log must contain all transactions that are marked for replication, but that have not yet been delivered to the distribution database. If these transactions are not replicated in a timely manner, they can prevent the truncation of the log.
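The way long-running and undelivered replication transactions hold back truncation can be expressed as a minimum over a few LSNs. The sketch below is a simplification with hypothetical function and parameter names: it treats the MinLSN as the smallest of the LSN at the start of the last checkpoint, the start of the oldest active transaction, and the start of the oldest replication transaction not yet delivered to the distribution database.

```python
from typing import Optional


def min_recovery_lsn(checkpoint_start_lsn: int,
                     oldest_active_txn_lsn: Optional[int] = None,
                     oldest_undelivered_repl_lsn: Optional[int] = None) -> int:
    """Smallest LSN that must stay in the active log."""
    candidates = [checkpoint_start_lsn]
    for lsn in (oldest_active_txn_lsn, oldest_undelivered_repl_lsn):
        if lsn is not None:
            candidates.append(lsn)
    return min(candidates)


def can_truncate(record_lsn: int, min_lsn: int) -> bool:
    """Only records strictly before the MinLSN are outside the active log."""
    return record_lsn < min_lsn


no_blockers = min_recovery_lsn(100)                  # checkpoint alone decides
long_running = min_recovery_lsn(100, 80)             # open transaction holds MinLSN at 80
replication_lag = min_recovery_lsn(100, 120, 60)     # undelivered replication holds it at 60
```

As long as the old transaction stays open, later checkpoints raise the first argument but not the result, so the truncatable part of the log does not grow.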

WRITE-AHEAD TRANSACTION LOG

SQL Server uses a write-ahead log (WAL), which guarantees that no data modifications are written to disk before the associated log record is written to disk. This maintains the ACID properties for a transaction.

To understand how the write-ahead log works, it is important for you to know how modified data is written to disk. SQL Server maintains a buffer cache into which it reads data pages when data must be retrieved. Data modifications are not made directly to disk, but are made to the copy of the page in the buffer cache. The modification is not written to disk until a checkpoint occurs in the database, or the modification must be written to disk so the buffer can be used to hold a new page. Writing a modified data page from the buffer cache to disk is called flushing the page. A page modified in the cache, but not yet written to disk, is called a dirty page.

At the time a modification is made to a page in the buffer, a log record is built in the log cache that records the modification. This log record must be written to disk before the associated dirty page is flushed from the buffer cache to disk. If the dirty page is flushed before the log record is written, the dirty page creates a modification on the disk that cannot be rolled back if the server fails before the log record is written to disk. SQL Server has logic that prevents a dirty page from being flushed before the associated log record is written. Log records are written to disk when the transactions are committed.
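The write-ahead rule can be illustrated with a toy buffer manager. The sketch below is not SQL Server's implementation: the class, its in-memory "disks", and the integer LSNs are all hypothetical, and it shows only the ordering guarantee that a dirty page is never written before the log records that describe it.

```python
class BufferManager:
    """Toy model: log cache, log on disk, dirty pages, and data on disk."""

    def __init__(self) -> None:
        self.log_cache = []      # (lsn, page_id, value) records not yet on disk
        self.log_disk = []       # log records already written to disk
        self.dirty_pages = {}    # page_id -> (value, lsn of last modification)
        self.data_disk = {}      # page_id -> value as stored on disk
        self._next_lsn = 1

    def modify(self, page_id: int, value: str) -> int:
        # The change goes to the cached page and a log record is built.
        lsn = self._next_lsn
        self._next_lsn += 1
        self.log_cache.append((lsn, page_id, value))
        self.dirty_pages[page_id] = (value, lsn)
        return lsn

    def flushed_lsn(self) -> int:
        return self.log_disk[-1][0] if self.log_disk else 0

    def flush_log(self, up_to_lsn: int) -> None:
        self.log_disk.extend(r for r in self.log_cache if r[0] <= up_to_lsn)
        self.log_cache = [r for r in self.log_cache if r[0] > up_to_lsn]

    def flush_page(self, page_id: int) -> None:
        value, lsn = self.dirty_pages.pop(page_id)
        if lsn > self.flushed_lsn():
            self.flush_log(lsn)          # write-ahead rule: log first
        self.data_disk[page_id] = value  # only then write the data page

    def commit(self) -> None:
        # Committing forces all pending log records to disk.
        self.flush_log(self._next_lsn - 1)


bm = BufferManager()
bm.modify(1, "a")
bm.modify(2, "b")
bm.flush_page(2)     # forces log records 1 and 2 to disk before page 2
logged_lsns = [record[0] for record in bm.log_disk]
```

After `flush_page(2)`, both log records are on disk while page 1 is still dirty in the cache. Its change is not yet in the data file, but the log already holds what is needed to redo or undo it.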

TRANSACTION LOG ARCHITECTURE END

QUESTIONS

THE END