Brendan Ang

Search

Search IconIcon to open search

Concurrency Control

Last updated Nov 8, 2022 Edit Source

# Concurrency Control

The DBMS needs to ensure consistency during concurrent execution of transactions, as concurrency can result in the database being in an inconsistent state despite preserving the correctness of transactions and without encountering a failure.

# Scheduler

A schedule is a sequence of interleaved actions from all transactions.

# Serial schedule

A schedule is serial if its actions consists of all the actions of one transaction, then all the actions of another transaction, and so on. 500

# Serializable schedule

Result is equivalent to a serial schedule, but actions of one transaction do not have to occur before the actions of another. 500

# Conflict Serializable Schedule

If a pair of actions conflict, if their order is interchanged, the behavior of at least one of the transactions can change. From this we can draw 2 conclusions about when a pair can be swapped.

  1. Actions involve the same element
  2. At least one is a write

# Precedence Graph

We can use a precedence graph to determine if a set of transactions are conflict serializable. An edge from one node to another represents a constraint on the order of the transactions. i.e. Actions in a transaction t1 cannot be swapped with another transaction t2. If a cycle occurs, the order of transactions become contradictory and no serial schedule can exist.

# Recoverable Schedule

A schedule is recoverable if transactions commit only after all transactions whose changes they read have committed. Else, the DBMS is unable to guarantee that transactions read data that will be the same as before the crash and after the crash.

# Locks

Ensure that data items that are shared by conflicting operations are accessed one operation at a time. Same as with Process Synchronization.

# Two Phase Locking (2PL)

Arbitrary assignment of locks do not lead to a serializable schedule. Two transactions can operate on elements in a different order resulting in different results. We can solve this by ensuring that transactions take up all lock actions before all unlock actions.

# Why 2PL works?

Intuitively, each two-phase-locked transaction may be thought to execute in its entirety at the instant it issues its first unlock request. i.e. a legal schedule can only be such that the transaction completes fully, because otherwise, this means that another transaction is attempting to take a held lock, causing a deadlock. Hence, there is at least one conflict-serialisable schedule: the one in which the transactions appear in the same order as first unlocks. Suppose the schedule starts with T1 locking and reading A. If T2 locks B before T1 reaches its unlocking phase, then there is a deadlock, and the schedule cannot complete. Thus, if T1 performs an action first, it must perform all its actions before T2 performs any. Likewise, if T2 starts first, it must complete before T1 starts, or there is a deadlock. Thus, only the two serial schedules of these transactions are legal.

# Lock mechanisms

# Shared and Exclusive locks

# Update locks

Deadlocks can occur when transactions are unable to upgrade their shared locks to exclusive ones, since there are already shared locks taken. A separate lock type that may be later upgraded to an exclusive lock is needed.

# Compatibility matrix

# Workings of Scheduler

  1. Part 1: Inserts appropriate lock actions ahead of all DB access operations and release the locks held by the Transaction when it aborts/commits
  2. Part 2: maintains a waiting list of transactions that need to acquire locks

# Deadlock Detection & Prevention

# Timeout

Place a limit on how long a transaction may be active, if it exceeds this time, it is forced to release its locks and other resources and roll back.

# Waits-For Graph

Utilises the cyclic properties of deadlocks to detect them.

# Timestamps

Assign each transaction with a timestamp. This timestamp never changes for the transaction even if it is rolled back.

# Wait-Die

# Wound-Wait

# Comparison

# Timestamp Ordering

An optimistic approach. Use the timestamps of transactions to determine the serialisability of transactions.

# Rules

# Thomas Write Rule

When a transaction wants to write, and TS(T) < WT(X), ignore the write and allow the transaction to continue without aborting.

# Comparisons

# Multi Version Concurrency Control

A misnomer as it is not actually a concurrency control protocol. DBMS maintains multiple physical versions of a single object in the database:

# Concurrency Protocol

Able to use any concurrency protocol to underly its implementation. MVCC is simply another layer on top of these protocols to provide transactional memory.

# Version Storage

DBMS creates a version chain per tuple, which allows it to find version that is visible to a particular transaction at runtime. There are indexes which point to the head of the chain. Version ordering:

# Append only

# Time travel storage

# Delta storage

# Garbage Collection

# Practice Problems

graph LR; T3 --> T2; T1 --> T2; T3 --> T1;
TimeT1T2T3
t1Read(A)
t2Read(B)
t3Read(A)
t4Read(C)
t5Write(A)
t6Read(C)
t7Read(B)
t8Write(C)
t9Write(B)
In 2PL, all locks must be acquired by the transaction, operations are done and then all locks are released at once.
This is schedule is not consistent with 2PL.
T1 takes ul(B), xl(B), ul(D), xl(D). T2 reads and writes item B at step 6, this is not possible if T1 still has the exclusive lock on B. Hence, T1 must have released all locks by step 6. However, T1 still takes read and write actions on D at step 13,14.
The minimal set of actions to remove: