Efficiently traversing InnoDB B+Trees with the page director

Efficientlytraversing InnoDB B+Trees with the page directory1、the purpose of the page directory

As described in the posts mentioned above,all records in INDEX pages are linked together in a singly-linked list inascending order. However, list traversal through a page with potentiallyseveral hundred records in it is very expensive: every record’s key must becompared, and this needs to be done at each level of the B+Tree until therecord sought is found on a leaf page.

Index page页中的所有记录都以单链表递增的形式串联。但是在一页中以链表的形式检索记录代价很大:每一个记录的key必须比较,这个动作需要在所有高度的B+树上进行,知道在叶子节点找到记录。

The page directory greatly optimizes thissearch by providing a fixed-width data structure with direct pointers to 1 ofevery 4-8 records, in order. Thus, it can be used for a traditional binarysearch of the records in each page, starting at the mid-point of the directoryand progressively pruning the directory by half until only a single entryremains, and then linear-scanning from there. Since the directory iseffectively an array, it can be traversed in either ascending or descendingorder, despite the records being linked in only ascending order.

Page directory通过提供一个固定大小的数据结构(这个结构指向4-8个记录中的一个)优化查询。因此能够在每个页中使用二叉查找的方法。根据slot折半查找,,知道只剩下一个条目,然后从这个条目开水线性扫描。由于directory是一个高效的数组,可以以递增或者递减的顺序进行扫描,即使记录只是以递增的顺序链接。

2、The physical structure of the pagedirectory

The structure is actually very simple. Thenumber of slots (the page directory length) is specified in the first field ofthe INDEX header of the page. The page directory always contains an entry forthe infimum and supremum system records (so the minimum size is 2 entries), andmay contain 0 or more additional entries, one for each 4-8 system records. Arecord is said to “own” another record if it represents it in the pagedirectory. Each entry in the page directory “owns” the records between theprevious entry in the directory, up to and including itself. The count ofrecords “owned” by each record is stored in the record header that precedeseach record.

Slots的个数在该页的index header部分的第一域指定。Page directory至少包含infimum和supremum的slot。因此directory最少有2个slot。一个记录如果own其他记录,表示在这个slot里。每个slot管理本身和上一个slot中的记录之间的记录。记录owned的个数存在每个记录的record header部分。

The page-directory-summary mode of innodb_spacecan be used to view the page directory contents, in this case for a completelyempty table (with the same schema as the 1 million row table used in A quickintroduction to innodb_ruby), showing the minimum possible page directory:

$ innodb_space -f t_page_directory.ibd -p 3page-directory-summary

slot offset typeowned key

0 99infimum 1

1112 supremum 1

If we insert a single record, we can seethat it gets owned by the record with a greater key than itself that has anentry in the page directory. In this case, supremum will own the record (aspreviously discussed, supremum represents a record higher than any possible keyin the page):

$ innodb_space -f t_page_directory.ibd -p 3page-directory-summary

slotoffset type owned key

099 infimum 1

1112 supremum 2

The infimum record always owns only itself,since no record can have a lower key. The supremum record always owns itself,but has no minimum record ownership. Each additional entry in the pagedirectory should own a minimum of 4 records (itself plus 3 others) and amaximum of 8 records (itself plus 7 others).

Infimum记录总是只own自己,因为是最小记录。Supremum记录总是own自己。除了infimum和supremum的slot,每个slot都会至少管理4个记录(itself+3others),最多管理8个。

To illustrate, each record with an entry inthe page directory (bolded) owns the records immediately prior to it in thesingly-linked list (K = Key, O = Number of Records Owned):

快乐要有悲伤作陪,雨过应该就有天晴。

Efficiently traversing InnoDB B+Trees with the page director

相关文章:

你感兴趣的文章:

标签云: