Transaction Management

Last updated Nov 8, 2022 Edit Source

# Transaction Management

Transactions are the basic unit of change in a DBMS. It is essential for data recovery and concurrency control.

[!Idea:]
Take a database as an input
Perform an action
Generate new version of database

# SQL Transactions

A new transaction starts with BEGIN
Transactions are stopped with either a COMMIT or ABORT
COMMIT: changes are saved
ABORT: changes are undone

# ACID Properties

# Atomicity

A transaction is either performed in its entirety (can be in steps) or not performed at all.

# Logging

DBMS logs all actions so that it can undo the actions of aborted transactions

# Shadow paging

Make copies of pages, perform changes on these copies. Only when transaction commits, the page is made visible to others.

# Consistency

# Database consistency

A database is in consistent state if it obeys all of the consistency (integrity) constraints defined over it. A database may be inconsistent in between states.

# Transaction consistency

Database is in consistent state even if there are a number of concurrent transactions

# Isolation

A transaction should appear as though it is executed in isolation from other transactions. An executing transaction cannot reveal its results to other concurrent transactions before its commitment.

# Concurrency control protocol

Pessimistic: Do not let problems arise in the first place (prevention)
Optimistic: Assume conflicts are rare and deal with them when they happen (detection and recovery)

# Durability

Changes applied to the database by a committed transaction must persist in the database.

# Failures

# Types of Failures

# Transaction failures

Logical Errors: Transaction cannot complete due to some internal error condition (e.g., integrity constraint violation).
Internal State Errors: DBMS must terminate an active transaction due to an error condition (e.g., deadlock).

# System failures

Software Failure: Problem with the DBMS implementation (e.g., uncaught divide-by-zero exception).
Hardware Failure: The computer hosting the DBMS crashes (e.g., power plug gets pulled, disk crash). They can be recoverable or non-recoverable

# Recovery Algorithms

Recovery must kick in to ensure the database satisfies the ACID properties in the case of failure. They must carry in 2 parts:

Actions during normal transaction processing to ensure that the DBMS can recover from a failure.
Actions after a failure to recover to a state that ensures atomicity, consistency and durability

Ensuring atomicity and durability – Logging: The database stores additional files called log files. Logs record every action of each transaction and are append only.

# Undo Logging

Idea: undo the effects of transactions that may not have completed before failure.

With undo logging, there are a set of rules the DBMS must follow. Order of writing to disk:
Log records indicating changed database elements
Changed database elements themselves
COMMIT log record

# Recovery without checkpoints

All undo commands are idempotent, if failure occurs during recovery, we can simply restart.

Use periodic checkpoints to prevent having to read the entirety of the log file for recovery. Any transactions executed before the checkpoint will have finished and there will be no need to undo them. Problem

The database is frozen while performing checkpointing. Active transactions may take a long time to commit or abort and will result in variant performance of the DBMS.

# Non-quiescent Checkpointing

Start checkpointing at any time using the current incomplete transactions.

# Recovery with checkpoints

Scan backwards identifying transactions which did not commit.
If we reach END CKPT, we know that the only transactions which may not have committed must be those after the START CKPT. Thus, we can stop at START CKPT
If we reach START CKPT first, we failed during checkpointing and need to search up till the earliest START T of those in the checkpoint, because we know those are the transactions active at the start of the checkpoint
Undo uncommitted transactions and ignore those that have committed.

# Limitations

Cannot commit a transaction without first writing all its changed data to disk. This means we will need many disk I/O for each transaction.

# Redo Logging

Do not write all changed data to disk before committing. Write to the main memory log file all the changes to the DB, and commit before any changes are written to disk. Perform recovery by redoing effects of committed transactions before the crash.

If a transaction modifies X, then both <T,x,v> and COMMIT T must be written to disk before OUTPUT(X). Order of writing to disk:
Log records indicating the changed elements
COMMIT T log record
Changes to the elements themselves

# Recovery without checkpoints

# Checkpointing

Start checkpointing for all active transactions. When we see an START CHECKPOINT, we can be sure that all prior transactions have been recorded to the disk and there is no need to redo them. Example where we do not need to redo the actions for T1:

# Recovery with checkpoints

If we see a END CHKPT in the log, we can be sure that changes made by transactions that have committed before the START CHKPT is already in disk, and we can ignore them. If no END CKPT in logs, we need to search back to next to last START CKPT.
From START CKPT (T1, ... Tk) we cannot be sure that any transaction that is among the $T_i$ or those started after this checkpoint log have been written to disk. We must search back until the earliest $T_i$ and redo.
However, if $T_i$ ’s commit message is not found, we write ABORT Ti as it must not be redone.

# Limitations

It requires all modified blocks to be kept in buffers until the transaction commits and the log records have been flushed. This increases the average number of buffers needed by transactions.

# Undo/Redo Logging

Increase flexibility by maintaining more information on the log. <T,X,v,w> now stores both old and new values in the log.

Before modifying any DB element X, the log record must be written to the disk.

We can commit at any time after the <T,B,8,6> record has been written to the disk. Recovery process:

Redo all committed transactions top-down
Undo all uncommitted transactions bottom-up This is because we may have committed transactions with some of the changes not on disk as well as uncommitted transactions with some of the changes on disk.

# Checkpointing

The END CKPT can appear anywhere after the START CKPT

# Recovery with checkpoints

We are sure we do not need to check any further than the START TRANSACTION of active transactions in the checkpoint as the corresponding END CHECKPOINT ensures that all changes prior to the START have been flushed to the disk.

If crash occurs at the end, T2 and T3 are identified as committed and we redo actions starting from START CKPT
If crash occurs before COMMIT T3, T2 is committed but. T3 is not. Redo T2 from START CKPT and undo T3 until START T3.
If crash occurs before END CKPT, any changes made by previous transactions may not have been written to disk. We need to search back to the next to last START CKPT, redo committed transactions and undo incomplete ones up till their corresponding START T logs.

# Practice Problems

a. We know that at an END CHKPNT, all dirty buffers from the corresponding START CKPT are written to disk A: 21 B: 40, 41 C: 30,31,32,33 D: 50,51,52 b. All uncommitted transactions. T1, T6 c. All committed transactions. T3, T4, T5 d. A: 21 B: 41. T4 redone C: 31. T1 undone, T3 redone D: 52. T5 redone.

Brendan Ang

Transaction Management

# Transaction Management

# SQL Transactions

# ACID Properties

# Atomicity

# Logging

# Shadow paging

# Consistency

# Database consistency

# Transaction consistency

# Isolation

# Concurrency control protocol

# Durability

# Primitive Operations

# Failures

# Types of Failures

# Transaction failures

# System failures

# Recovery Algorithms

# Undo Logging

# Recovery without checkpoints

# Checkpointing

# Non-quiescent Checkpointing

# Recovery with checkpoints

# Limitations

# Redo Logging

# Recovery without checkpoints

# Checkpointing

# Recovery with checkpoints

# Limitations

# Undo/Redo Logging

# Checkpointing

# Recovery with checkpoints

# Practice Problems

Backlinks

Interactive Graph