
Transaction & Recovery

1. Recovery
Recovery means recovering the database itself; that is, restoring the database to a state that is known to be correct (consistent) after some failure has rendered the current state inconsistent, or at least suspect. It is based on the principle of redundancy: the only way to make sure that a database is recoverable is to make sure that any piece of information it contains can be reconstructed from other information stored somewhere else in the system. Recovery, like transaction processing in general, is independent of the underlying system, i.e. of whether it is relational or something else.

2. Transactions
A transaction is a logical unit of work. For example, consider the SUPPLIER relvar S and the following sequence of operations to be performed on it.

    BEGIN TRANSACTION;
        INSERT INTO S RELATION {TUPLE {S# S#(S3), SNAME (SMITH), STATUS (30), CITY (MUMBAI)}};
        IF ANY ERROR OCCURRED THEN GOTO UNDO; END IF;
        UPDATE S SET STATUS := 10 WHERE S# = S#(S3);
        IF ANY ERROR OCCURRED THEN GOTO UNDO; END IF;
        COMMIT;
        GOTO FINISH;
    UNDO:
        ROLLBACK;
    FINISH:
        RETURN;
    END TRANSACTION;

The constraint is that a supplier from MUMBAI must have status > 20. So, in the course of the INSERT and UPDATE operations, the database may pass through a state that is not consistent and violates the constraint.
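The same unit of work can be sketched in Python with the standard sqlite3 module. This is only an illustration under stated assumptions: the table layout, the column names (sno, sname, status, city) and the helper add_supplier are hypothetical, and the MUMBAI rule is expressed as a CHECK constraint. SQLite checks such a constraint at the end of each statement, so here the UPDATE fails and the whole unit of work is rolled back.

```python
import sqlite3

def add_supplier(conn):
    """Insert supplier S3 and then update its status as one unit of work."""
    try:
        conn.execute("BEGIN")
        conn.execute(
            "INSERT INTO S (sno, sname, status, city) VALUES (?, ?, ?, ?)",
            ("S3", "SMITH", 30, "MUMBAI"),
        )
        # Violates the assumed CHECK constraint, so sqlite3 raises IntegrityError.
        conn.execute("UPDATE S SET status = ? WHERE sno = ?", (10, "S3"))
        conn.execute("COMMIT")
        return True
    except sqlite3.Error:
        conn.execute("ROLLBACK")  # undo the INSERT as well
        return False

# isolation_level=None lets the code issue BEGIN/COMMIT/ROLLBACK itself.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE S (sno TEXT PRIMARY KEY, sname TEXT, status INTEGER, city TEXT,"
    " CHECK (city <> 'MUMBAI' OR status > 20))"
)
ok = add_supplier(conn)
rows = conn.execute("SELECT * FROM S").fetchall()
```

After the call, ok is False and the table is empty: neither the INSERT nor the UPDATE has left any trace.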

CITC

Vishwas Raval

Note that a logical unit of work does not necessarily mean a single database operation; it can be a sequence of several such operations. A transaction transforms a consistent state of the database into another consistent state, without necessarily preserving consistency at all intermediate points. So, if the transaction executes some updates and a failure then occurs before its planned termination, those updates will be undone. This means a transaction either executes in its entirety or not at all.

3. Transaction Manager
The system component that provides this atomicity is known as the transaction manager (TM) or transaction processing monitor (TP monitor). The COMMIT and ROLLBACK operations are the key to the way it works.

COMMIT:
It signals the successful end of a transaction. The database is again in a consistent state after the complete transaction, and all of the updates can now be COMMITTED, i.e. made permanent.

ROLLBACK:
It signals the unsuccessful end of a transaction. The database might be in an inconsistent state because something has gone wrong during the transaction, and all of the updates made by the logical unit of work must be ROLLED BACK, or undone. Whenever a transaction executes, the system maintains a log or journal on tape or disk, in which details of all updates, in particular the before and after images of the updated objects, are recorded. So, if it becomes necessary to undo some particular update, the system can use the corresponding log entry to restore the updated object to its previous value.
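The role of the before images can be sketched as follows. This is a toy model, not how any particular DBMS lays out its log: the database is a Python dict, the log is a list, and the names update and rollback are hypothetical. A before image of None is assumed to mean "the object did not exist".

```python
def update(db, log, txn, key, new_value):
    """Record the before and after images in the log, then apply the update."""
    log.append({"txn": txn, "key": key, "before": db.get(key), "after": new_value})
    db[key] = new_value

def rollback(db, log, txn):
    """Undo the updates of txn by restoring before images, newest first."""
    for entry in reversed(log):
        if entry["txn"] != txn:
            continue
        if entry["before"] is None:
            db.pop(entry["key"], None)          # the object did not exist before
        else:
            db[entry["key"]] = entry["before"]  # restore the previous value

db = {"S1": 20}
log = []
update(db, log, "T1", "S1", 10)   # change an existing object
update(db, log, "T1", "S3", 30)   # create a new object
rollback(db, log, "T1")
```

After the rollback, db is back to {"S1": 20}. Processing the log entries in reverse order matters when one transaction updates the same object more than once.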

4. Transaction Recovery
Any transaction begins with the successful execution of a BEGIN TRANSACTION statement and ends with the successful execution of either a COMMIT or a ROLLBACK statement. A COMMIT point is also called a synchpoint; it corresponds to the end of a logical unit of work, and hence to a point at which the database is in a consistent state.

ROLLBACK rolls the database back to the state it was in at BEGIN TRANSACTION, which means back to the previous COMMIT point. Here, the term database means only the portion of the database being accessed by that particular transaction. Other transactions might be executing in parallel and making changes to their own portions of the database.

When a COMMIT point is established:
1. All updates made by the executing program since the previous COMMIT point are committed, i.e. made permanent. Prior to the COMMIT point, all such updates should be regarded as tentative only, in the sense that they might subsequently be undone. Once committed, an update is guaranteed to be permanent.
2. All database positioning is lost and all tuple locks are released. Database positioning refers to the addressability to certain tuples that a program holds by means of a CURSOR.

Point 2 also applies to ROLLBACK; point 1 does not.

COMMIT and ROLLBACK terminate the transaction, not the program. A single program execution will, in general, consist of a sequence of several transactions running one after another:

    BEGIN TRANSACTION   (T1)   COMMIT
    BEGIN TRANSACTION   (T2)   ROLLBACK
    BEGIN TRANSACTION   (T3)   COMMIT
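A single program running three transactions in sequence, the second of which is rolled back, might look like the sketch below. It uses Python's standard sqlite3 module; the table S and its two columns are assumptions made only for this illustration.

```python
import sqlite3

# isolation_level=None: the program issues BEGIN/COMMIT/ROLLBACK explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE S (sno TEXT PRIMARY KEY, status INTEGER)")

# T1: ends with COMMIT, so its insert becomes permanent.
conn.execute("BEGIN")
conn.execute("INSERT INTO S VALUES ('S1', 20)")
conn.execute("COMMIT")

# T2: ends with ROLLBACK, so the database returns to the previous COMMIT point.
conn.execute("BEGIN")
conn.execute("INSERT INTO S VALUES ('S2', 10)")
conn.execute("ROLLBACK")

# T3: the program carries on with a new transaction after the rollback.
conn.execute("BEGIN")
conn.execute("INSERT INTO S VALUES ('S3', 30)")
conn.execute("COMMIT")

suppliers = [row[0] for row in conn.execute("SELECT sno FROM S ORDER BY sno")]
```

Only S1 and S3 remain in the table; the rollback of T2 ended that transaction but not the program.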

A transaction is also the unit of recovery. If a transaction successfully commits, the system guarantees that its updates will be permanently installed in the database, even if the system crashes the very next moment. It is possible that the system crashes after the COMMIT has been honored but before the updates have been physically written to the database; they might still be waiting in a main memory buffer and so be lost at the time of the crash. Even if this happens, the system's restart procedure will still install those updates in the database, by examining the relevant entries in the log.

The Write-Ahead Log Rule: The log entries must be physically written before COMMIT processing completes; this stage is also known as partial commit. The restart procedure can therefore recover any transaction that completed successfully but did not manage to get its updates physically written prior to the crash. In this way, transactions are indeed the unit of recovery.
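The effect of the rule can be sketched with a toy model in which the log and the disk survive a crash but the buffer does not. The class TinyStore and its methods are hypothetical names invented for this illustration; a real system would force log pages to stable storage rather than append to a list.

```python
class TinyStore:
    def __init__(self):
        self.disk = {}     # stable storage: the physical database
        self.log = []      # stable storage: the physical log
        self.buffer = {}   # volatile main-memory buffer, lost in a crash

    def write(self, txn, key, value):
        # Write-ahead: the log entry is recorded before the data is changed.
        self.log.append(("UPDATE", txn, key, value))
        self.buffer[key] = value

    def commit(self, txn):
        # The COMMIT entry is on the log before commit processing completes.
        self.log.append(("COMMIT", txn))

    def crash(self):
        self.buffer.clear()   # buffered updates never reached the disk

    def restart(self):
        # Reinstall the updates of every transaction that has a COMMIT entry.
        committed = {rec[1] for rec in self.log if rec[0] == "COMMIT"}
        for rec in self.log:
            if rec[0] == "UPDATE" and rec[1] in committed:
                _, _, key, value = rec
                self.disk[key] = value

store = TinyStore()
store.write("T1", "S3", 30)
store.commit("T1")
store.write("T2", "S4", 40)   # T2 never commits
store.crash()
store.restart()
```

After restart, store.disk holds {"S3": 30}: the committed update of T1 is recovered from the log, while the uncommitted update of T2 is not installed.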

The ACID properties:

A transaction has four main properties:
1. Atomicity: Transactions are atomic, i.e. a transaction is either completed fully or not done at all.
2. Consistency: Transactions preserve consistency, i.e. a transaction transforms a consistent state of the database into another consistent state, without necessarily preserving consistency at all intermediate points.
3. Isolation: Isolation means separation. Transactions running simultaneously are isolated from each other, i.e. an update made by one transaction is not visible to another transaction until the first transaction commits.
4. Durability: Once a transaction is committed, its updates are guaranteed to remain in the database even if there is a subsequent crash.

5. System Recovery
1. Local failure: The occurrence of an overflow condition within an individual transaction is called a local failure. For example, if a transaction is inserting data into the database and cannot complete the insertion because the database has run out of space, that is a local failure. A local failure affects only the transaction in which the failure actually occurred.

2. Global failure: A power failure is an example. A global failure affects all the transactions in progress at the time of the failure. Global failures fall into two broad categories:

I) System failure and recovery:

It affects all the transactions currently in progress but does not physically damage the database. It is also known as a soft crash. The contents of main memory are lost, i.e. the database buffers are lost. The precise state of any transaction that was in progress at the time of the failure is no longer known; such a transaction can therefore never be completed successfully and so must be undone at restart time. It may also be necessary to redo, at restart time, transactions that completed successfully before the failure but whose updates had not been transferred to the physical database when the failure occurred.

So how does the system decide which transactions to undo and which to redo? For this purpose it uses the concept of a checkpoint. At certain prescribed time intervals, or whenever some prescribed number of entries has been written to the log, the system automatically takes a checkpoint. Taking a checkpoint involves:
1. writing the contents of the database buffers to the physical database;
2. writing a special checkpoint record to the physical log.
The checkpoint record gives a list of all transactions that were in progress at the time the checkpoint was taken.

    Time ----->          Tc                     Tf
                         |                      |
    T1   |--------|      |                      |
    T2         |---------|-------|              |
    T3            |------|----------------------|
    T4                   |   |---------|        |
    T5                   |          |-----------|
                         |                      |
                    Checkpoint            System failure

A transaction such as T1 does not enter into the restart process at all, because it completed prior to Tc and its updates were forced to the physical database as part of the checkpoint process. Transactions that were rolled back before Tf also do not enter the restart process, because their changes have already been undone and so are not in the database.
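The two checkpoint steps can be sketched as below, reusing the toy picture of buffers, disk and log as plain Python containers. The function name take_checkpoint and the record layout are assumptions made for this illustration only.

```python
def take_checkpoint(buffers, disk, log, active_txns):
    """Flush the buffers, then log which transactions are still in progress."""
    disk.update(buffers)                              # step 1: buffers -> physical database
    log.append(("CHECKPOINT", sorted(active_txns)))   # step 2: checkpoint record

buffers = {"S1": 25, "S3": 30}   # updated values still held in main memory
disk = {"S1": 20}                # physical database before the checkpoint
log = [("BEGIN", "T2"), ("BEGIN", "T3")]
take_checkpoint(buffers, disk, log, {"T2", "T3"})
```

After the call, disk holds the buffered values and the last log entry lists T2 and T3 as the transactions in progress at checkpoint time.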

The following procedure identifies the transactions to be processed at restart time.
1. Start with two lists, the UNDO list and the REDO list.
2. Set the UNDO list equal to the list of all transactions given in the most recent checkpoint record. Set the REDO list to empty.
3. Search forward through the log, starting from the checkpoint record.
4. If a BEGIN TRANSACTION entry is found, add that transaction to the UNDO list.
5. If a COMMIT entry is found, move that transaction from the UNDO list to the REDO list.
6. When the end of the log is reached, the two lists identify the transactions to be undone and redone. In the example above, the UNDO list contains T3 and T5, whereas the REDO list contains T2 and T4.
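The list-building procedure is short enough to sketch directly. The function name build_restart_lists and the tuple layout of the log entries are hypothetical, and the order of the log entries below is only one ordering consistent with the T1 to T5 example. A committed transaction is moved out of the UNDO list into the REDO list, which is what produces the final lists quoted above.

```python
def build_restart_lists(active_at_checkpoint, log_after_checkpoint):
    """Return (undo, redo) from the checkpoint record and the log that follows it."""
    undo = list(active_at_checkpoint)   # transactions named in the checkpoint record
    redo = []
    for kind, txn in log_after_checkpoint:   # search forward through the log
        if kind == "BEGIN":
            undo.append(txn)
        elif kind == "COMMIT":
            undo.remove(txn)            # move from the UNDO list ...
            redo.append(txn)            # ... to the REDO list
    return undo, redo

# T2 and T3 were in progress at the checkpoint; T1 had already completed.
active = ["T2", "T3"]
log = [("BEGIN", "T4"), ("COMMIT", "T2"), ("BEGIN", "T5"), ("COMMIT", "T4")]
undo, redo = build_restart_lists(active, log)
```

With this input, undo is ["T3", "T5"] and redo is ["T2", "T4"]. T1 never appears, because it is neither in the checkpoint record nor in the log after it.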

The system then works backward through the log, undoing the transactions in the UNDO list, and then works forward again, redoing the transactions in the REDO list. Note that redoing a transaction means that the whole transaction is redone from the beginning. These undo and redo processes are known as backward and forward recovery respectively.

II) Media failure and recovery:
It causes damage to the database, or to some portion of it, and affects at least those transactions currently using that portion. It is also known as a hard crash. Examples are a disk head crash or a disk controller failure in which some portion of the database is physically destroyed. Recovery from such a failure involves reloading (restoring) the database from a backup copy (dump) and then using the log, both its active and archive portions, to redo all transactions that completed since that backup copy was taken. There is no need to undo transactions that were still in progress at the time of the failure, since by definition all updates of such transactions have been undone (actually lost) anyway. Media recovery therefore requires a dump/restore utility. The dump portion of the utility is used to take backup copies of the database on demand; these can be kept on tape or other archival storage media. After a failure, the restore portion of the utility is used to recreate the database from a specified backup copy.
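Media recovery can be sketched in the same toy style: restore the dump, then make one forward pass over the log to redo the committed transactions. The function media_recover and the log layout are assumptions made for illustration; the log here stands for the active and archive portions taken together.

```python
def media_recover(dump, log):
    """Rebuild the database from a backup copy plus the log written since the dump."""
    db = dict(dump)   # restore portion: reload the backup copy
    committed = {txn for kind, txn, *_ in log if kind == "COMMIT"}
    for kind, txn, *rest in log:   # forward pass: redo committed work only
        if kind == "UPDATE" and txn in committed:
            key, value = rest
            db[key] = value
    return db

dump = {"S1": 20}   # backup copy taken before the failure
log = [
    ("UPDATE", "T1", "S1", 25),
    ("COMMIT", "T1"),
    ("UPDATE", "T2", "S2", 10),   # T2 was still in progress at the failure
]
recovered = media_recover(dump, log)
```

The result is {"S1": 25}. Nothing has to be undone for T2, because its update was never part of the restored copy.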

6. Two-phase commit protocol


Two-phase commit is important whenever a given transaction can interact with several independent resource managers, each managing its own set of recoverable resources and maintaining its own recovery log.

For example, consider a transaction running on an IBM mainframe that updates both an IMS database and a DB2 database. If the transaction completes successfully, all of the updates made to both the IMS data and the DB2 data must be committed; in case of failure, all of them must be rolled back. In order to maintain consistency across both databases, there must be some mechanism for a system-wide commit or rollback. That mechanism is called the two-phase commit protocol.

Coordinator: The global commit or rollback is handled by a system component called the coordinator, whose task is to guarantee that both resource managers either commit or roll back, even if the system fails in the middle of the process.

How does two-phase commit work? Assume that a transaction has completed its database processing successfully and a COMMIT is issued to the coordinator. On receiving the COMMIT, the coordinator goes through the following process.

i) Phase I: The coordinator instructs all resource managers to get ready to go either way on the transaction. That means each participant in the process (i.e. each resource manager) must force all log entries for its local resources to its own physical log. If the forced write is successful, the resource manager replies OK to the coordinator; otherwise it replies NOT OK.

ii) Phase II: When the coordinator has received replies from all participants, it forces an entry to its own physical log, recording its decision about the transaction. The decision is COMMIT if all replies were OK; if any reply was NOT OK, the decision is ROLLBACK. The coordinator then informs every participant of its decision, and each participant commits or rolls back the transaction locally accordingly. Each participant must do what the coordinator tells it to do in phase II.
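The two phases can be sketched as follows. This is a single-process toy model with no real messaging or failure handling; the class ResourceManager, its can_commit flag and the function two_phase_commit are hypothetical names chosen for the illustration.

```python
class ResourceManager:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit   # assumed outcome of forcing the local log
        self.log = []
        self.state = "ACTIVE"

    def prepare(self):
        # Phase I: force local log entries, then vote.
        if self.can_commit:
            self.log.append("PREPARED")
            return "OK"
        return "NOT OK"

    def finish(self, decision):
        # Phase II: do what the coordinator decided.
        self.log.append(decision)
        self.state = "COMMITTED" if decision == "COMMIT" else "ROLLED BACK"

def two_phase_commit(participants, coordinator_log):
    votes = [rm.prepare() for rm in participants]   # phase I
    decision = "COMMIT" if all(v == "OK" for v in votes) else "ROLLBACK"
    coordinator_log.append(decision)   # decision record, written before informing anyone
    for rm in participants:            # phase II
        rm.finish(decision)
    return decision

log_ok = []
d1 = two_phase_commit([ResourceManager("IMS"), ResourceManager("DB2")], log_ok)

log_bad = []
rms = [ResourceManager("IMS"), ResourceManager("DB2", can_commit=False)]
d2 = two_phase_commit(rms, log_bad)
```

In the first run both participants vote OK and the decision is COMMIT. In the second run one NOT OK vote is enough to make the decision ROLLBACK for every participant, including the one that voted OK.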

So, if a failure occurs at some point during the overall process, the restart procedure will look for the decision record in the coordinator's log and, based on that decision, UNDO or REDO will take place. Note that the data communications manager (DC manager) can also act as a resource manager. The DC manager is an autonomous system in its own right, but it must work in harmony with the DBMS. In the case of a distributed DBMS, where the user's workstation and the DBMS are physically remote, requests from the end user to the DBMS and responses from the DBMS back to the user are transmitted in the form of messages. All such message transmission takes place under the control of the DC manager.
