Indexing¶
约 1016 个字 -1 行代码 14 张图片 预计阅读时间 3 分钟
Outline:
- Basic Concepts
- Ordered indices
- B+ Tree Index
- B+ Tree File Organization
- B Tree Index Files
- Indices on Multiple Keys
- Indexing on a Flash
- Indexing in Main Memory
- Write Optimized Indices
- Log Structured Merge(LSM) Tree
- Buffer Tree
- Bitmap indices
Basic Concepts¶
Search Key - attribute to set of attributes used to look up records in a file.
An index file consists of records(called index entries) of the form
Search-key | pointer |
---|---|
Two basic kinds of indices:
- Ordered indices: search keys are stored in sorted order
- Hash indices: search keys are distributed uniformly across "buckets" using a "hash function".
Index Evaluation Metrics¶
- Access types supported efficiently.
- Point query: records with a specified value in the attribute
- Range query: or records with an attribute value falling in a specified range of values
支持点查以及范围查询。
- Access time
- Insertion time
- Deletion time
- Space overhead
Ordered Indices¶
- Primary index(主索引): in a sequentially ordered file, the index whose search key specifies the sequential order of the file.
- Also called clustering index(聚集索引)
- The search key of a primary index is usually but not necessarily the primary key.
- Secondary index(辅助索引): an index whose search key specifies an order different from the sequential order of the file. Also called non-clustering index
- Index-sequential file(索引顺序文件): ordered sequential file with a primary index.
主索引只能有一个,其他都是辅助索引。
Example
Dense Index Files¶
Dense index(稠密索引) - index record appears for every search-key value in the file.
每一个 search-key 都要出现在文件里。


图一是以 ID 为主索引来搜索,图二是以 dept_name
为主索引来搜索。
Sparse Index Files¶
Sparse index(稀疏索引) - contains index records for only some search-key values.

Good tradeoff: sparse index with an index entry for every block in file, corresponding to least search-key value in the block.

Multilevel Index(多级索引)¶
对索引再次建立索引,类似 B+ 树。
B+ Tree Index Files¶
- All paths from root to the leaf are of the same length
- Inner node(not a root or a leaf) between \(\lceil n/2 \rceil\) and \(n\) children
- Leaf node between \(\lceil (n-1)/2 \rceil\) and \(n-1\) values
- Special cases:
- If the root is not a leaf: at least 2 children
- If the root is a leaf: between \(0\) and \(n-1\) values

数据库中B+ 树一般一个叶子结点就是一个块的大小,即 4KB.
Observations about B+ Trees¶
- Since the inter-node connections are done by pointers, "logically" close blocks need not be "physically" close.
- The non-leaf levels of the B+ tree form a hierarchy of sparse indices.
- The B+ tree contains a relatively small number of levels
- Level below root has at least \(2 * \lceil n/2 \rceil\) values
- Next level has at least \(2 * \lceil n/2 \rceil * \lceil n/2 \rceil\) values
- ..etc.
如果文件中有 \(K\) 个索引项,那么树的高度不会超过 \(\lceil \log_{\ceil n/2 \rceil}(K/2) \rceil + 1\)
B+ 树的插入,删除等操作这里就不展示了,可以在这个网站查看
B+ tree: height and size estimation¶


B+ tree File Organization¶
- B+ tree File Organization:
- Leaf nodes in a B+ tree file organization store records, instead of pointers.叶子节点不放指针,而是放记录本身。
- Helps keep data records clustered even when there are insertions/deletions/updates.
- Leaf nodes are still required to be half full
- Insertion and deletion are handled in the same way as insertion and deletion of entries in a B+ tree index.

为了提高空间利用率,我们可以改变“半满”的条件,比如将“半满”定义为 \(\lfloor 2n/3 \rfloor\)
Other Issues in Indexing¶
- Record relocation and secondary indices
- If a record moves, all secondary indices that store record pointers have to be updated.
- Node splits in B+ tree file organizations become very expensive.
- Solution: use primary-index search key instead of record pointer in secondary index
- Variable length strings as keys
- Variable fanout
- Prefix compression
- Key values at internal nodes can be perfixes of full key
- Keys in leaf node can be compressed by sharing common prefixes.
Bulk Loading and Bottom-Up Build¶
- Efficient alternative 1: Insert in sorted order
- Sort entries first(using efficient external-memory sort algorithms)
- insert in sorted order
- Efficient alternative 2: Bottom-up B+ tree construction
- As before sort entries
- And then create tree layer-by-layer, starting with leaf level
- Implemented as part of bulk-utility by most database systems

Bulk insert index entries¶

Indices on Multiple Keys¶
- Composite search keys are search keys containing more that one attribute.
- Lexicographic ordering: \((a_1, a_2) < (b_1, b_2)\) if either \(a_1 < b_1\) or \(a_1 = b_1 \mathrm{and} a_2 < b_2\)
Indexing in Main Memory¶
- Random access in memory
- Much cheaper than on disk/flash, but still expensive compared to cache read
- Binary search for a key value within a large B+ tree node results in many cache misses
- Data structures that make best use of cache perferable - cache conscious
- B+ trees with small nodes that fit in cache line are preferable to reduce cache misses
- Key idea:
- use large node size to optimize disk access
- but structure data within a node using a tree with small node size, instead of using an array, to optimize cache access
Indexing on Flash¶
- Random I/O cost much lower on flash
- Writes are not in-place, and (eventually) require a more expensive erase
- Optimum page size therefore much smaller
- Bulk-loading still useful since it minimizes page erases
- Write-optimized tree structures (i.e., LSM-tree) have been adapted to minimize page writes for flash-optimized search trees
Write Optimized Indices¶
- Two approaches to reducing cost of writes
- Log-structured merge tree (LSM-tree)
- Buffer tree
Log Structured Merge Tree(LSM-tree)¶

- Records inserted first into in-memory tree(\(L_0\) tree)
- When in-memory tree is full, records moved to disk(\(L_1\) tree).如果内存中的树满了,就将内存中的树移动到磁盘上。
- B+ tree constructed using bottom-up build by merging existing \(L_1\) tree with records from \(L_0\) tree
- When \(L_1\) tree exceeds some threshold, merge into \(L_2\) tree
- And so on for more levels
- Sizethreshold for \(L_{i+1}\) tree is \(K\) times size threshold for \(L_i\) tree

Stepped Merge Index¶
- Stepped-merge index: variant of LSM tree with k trees at each level on disk
- When all \(k\) indices exist at a level, merge them into one index of next level.
- Reduces write cost compared to LSM tree
- But queries are even more expensive since many trees need to be queries
- Optimization for point lookups
- Compute Bloom filter for each tree and store in-memory
- Query a tree only if Bloom filter returns a positive result