Query Processing
# Query Processing
Given a query, we need to devise an algorithm to obtain the desired result
# Example
Select B, D From R, S Where R.A = ācā AND S.E = 2 AND R.C=S.C
Some possible solutions:
- Cartesian product -> Select tuple -> Project
- Select tuple -> Natural join ->Project
- Index to select R tuples -> Use S.C index to select found R.C values -> Remove S.E != 2 -> Join matching
# Operator Analysis
Notations:
- R: a relation/table
- T(R): number of tuples in R
- B(R): number of blocks needed to hold tuples of R
- V(R, a): number of distinct values of attribute a in R Assumptions for analysis:
- Main memory is organized as input buffers (for holding data for computations) and output buffers (for storing results) each being the size of 1 block. Let M represent the number of input buffers available.
- Relation R is clustered if its tuples are packed into as few blocks as possible. If T(R) = 1,000 and each block can hold 1000 tuples, B(R) = 100
- We assume R is clustered. If not clustered, it may take T(R) disk I/Os rather than B(R) to read all the tuples
- Ignore the final I/O cost for writing result in output buffer back to disk. Irrelevant to our calculations.