About Using filesort in EXPLAIN output

Published: 2020-07-23 19:55:47 Source: Web Views: 7279 Author: coveringindex Category: MySQL Database

When you inspect the execution plan of a SQL statement, you will sometimes see Using filesort, as in the following example.

mysql> explain select * from tb1 where col1 = 4 order by col2\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: tb1
         type: ref
possible_keys: idx_col1
          key: idx_col1
      key_len: 4
          ref: const
         rows: 1
        Extra: Using where; Using filesort
1 row in set (0.00 sec)

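This filesort means that MySQL has to perform one extra sorting pass, because the rows cannot be read in the requested order directly. More precisely, the in-memory part of that pass is a quicksort.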

First, a brief look at the idea behind Quicksort (from Wikipedia).

Quicksort is a divide and conquer algorithm. Quicksort first divides a large array into two smaller sub-arrays: the low elements and the high elements. Quicksort can then recursively sort the sub-arrays.

The steps are:

1. Pick an element, called a pivot, from the array.

2. Partitioning: reorder the array so that all elements with values less than the pivot come before the pivot, while all elements with values greater than the pivot come after it (equal values can go either way). After this partitioning, the pivot is in its final position. This is called the partition operation.

3. Recursively apply the above steps to the sub-array of elements with smaller values and separately to the sub-array of elements with greater values.

Next, here is one implementation of it in Python.

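#!/usr/bin/env python
# -*- coding: utf-8 -*-

from __future__ import print_function


def quicksort(array):
    # Arrays of length 0 or 1 are already sorted.
    if len(array) < 2:
        return array
    else:
        # Use the first element as the pivot.
        pivot = array[0]
        # Elements no greater than the pivot go to the left ...
        less = [i for i in array[1:] if i <= pivot]
        # ... and strictly greater elements go to the right.
        greater = [i for i in array[1:] if i > pivot]
        return quicksort(less) + [pivot] + quicksort(greater)


print(quicksort([10, 5, 2, 3]))  # prints [2, 3, 5, 10]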

Back to filesort. MySQL has three implementations of it: the Original, the Modified, and the In-Memory filesort algorithm.

The Original filesort Algorithm

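1. Scan the table, or use the WHERE condition, to fetch all matching rows.

2. For each row, put its sort key and row ID, i.e. <sort_key, rowid>, into the sort buffer. When the sort buffer is full, run a quicksort on it in memory, write the sorted <sort_key, rowid> pairs to a temporary file, and record a pointer to the sorted block. Repeat until all rows have been read.

3. Perform several multi-merge passes, writing all row IDs to a result file.

4. Fetch the rows again by row ID.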
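The four steps above can be pictured with a toy Python model. This is only a sketch under assumptions, not MySQL's actual code: the function name, the dict-based table, and the `buffer_size` parameter are all hypothetical, in-memory lists stand in for temporary files, and Python's built-in `sorted` (Timsort) stands in for the quicksort.

```python
import heapq


def original_filesort(table, predicate, sort_key, buffer_size):
    """Toy model: sort <sort_key, rowid> pairs, then fetch the rows again.

    `table` maps row ID -> row (a dict). All names here are hypothetical.
    """
    runs = []    # each sorted run stands in for one chunk in a temporary file
    buffer = []  # stands in for the sort buffer
    for rowid, row in table.items():           # step 1: scan and apply WHERE
        if not predicate(row):
            continue
        buffer.append((sort_key(row), rowid))  # step 2: keep <sort_key, rowid>
        if len(buffer) == buffer_size:         # buffer full: sort and spill
            runs.append(sorted(buffer))
            buffer = []
    if buffer:                                 # spill whatever is left over
        runs.append(sorted(buffer))
    merged = heapq.merge(*runs)                # step 3: multi-way merge
    return [table[rowid] for _, rowid in merged]  # step 4: second read by row ID


table = {
    1: {"col1": 4, "col2": 30},
    2: {"col1": 4, "col2": 10},
    3: {"col1": 7, "col2": 5},
    4: {"col1": 4, "col2": 20},
}
rows = original_filesort(
    table, lambda r: r["col1"] == 4, lambda r: r["col2"], buffer_size=2
)
print([r["col2"] for r in rows])  # prints [10, 20, 30]
```

The last line of the function is the second lookup into `table`, which corresponds to the repeated read in step 4.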

It is easy to see that steps 1 and 4 above read the rows twice in total, which is what motivated the improved implementation below.

The Modified filesort Algorithm

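The change from the Original algorithm is in step 2: it records the sort key together with the other columns the query needs, i.e. <sort_key, additional_fields>, instead of the row ID. Once step 3 completes, the result is available directly, with no second read.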

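In this algorithm, <sort_key, additional_fields> takes more space than <sort_key, rowid>, so when the volume of data to sort is large, temporary files are written frequently. To avoid that, the max_length_for_sort_data parameter was introduced: when the extra columns in the sort tuple exceed this size, the Original algorithm is used instead.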
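The difference can again be sketched in Python. As before, this is an illustration under assumptions rather than MySQL's implementation: `modified_filesort` and `choose_algorithm` are hypothetical names, the spill-to-disk part is left out, and the size check is a simplified reading of how max_length_for_sort_data acts as a cutoff.

```python
def modified_filesort(rows, predicate, sort_key, columns):
    """Toy model: sort <sort_key, additional_fields> tuples in a single read."""
    entries = [
        # Carry the needed column values along with the sort key.
        (sort_key(row), tuple(row[c] for c in columns))
        for row in rows
        if predicate(row)
    ]
    entries.sort(key=lambda e: e[0])  # order by the sort key only
    # The result is built from the tuples themselves; no lookup by row ID.
    return [dict(zip(columns, fields)) for _, fields in entries]


def choose_algorithm(extra_columns_length, max_length_for_sort_data):
    """Hypothetical helper: wide tuples fall back to the original algorithm."""
    if extra_columns_length <= max_length_for_sort_data:
        return "modified"
    return "original"


rows = [
    {"col1": 4, "col2": 30},
    {"col1": 4, "col2": 10},
    {"col1": 7, "col2": 5},
]
result = modified_filesort(
    rows, lambda r: r["col1"] == 4, lambda r: r["col2"], ["col1", "col2"]
)
print(result)
print(choose_algorithm(100, 1024), choose_algorithm(4096, 1024))
```

The trade-off is visible in the tuples: each entry is wider than a <sort_key, rowid> pair, so fewer of them fit in a buffer of the same size.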

The In-Memory filesort Algorithm

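What about the case where the data to sort is small, small enough that the sort can complete entirely within the sort buffer? For that case there is the In-Memory filesort. Here MySQL uses the sort buffer as a priority queue and avoids temporary files altogether.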
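A priority queue is most useful when only the first few rows of the sorted result are wanted, so the sketch below assumes a query with a LIMIT. It is a minimal illustration using Python's `heapq`, not MySQL's code: the function name is hypothetical, and sort keys are assumed to be numeric so they can be negated to turn the min-heap into a max-heap.

```python
import heapq


def top_n_with_priority_queue(rows, sort_key, limit):
    """Keep only the `limit` smallest rows in a bounded heap (numeric keys)."""
    if limit <= 0:
        return []
    heap = []  # entries are (-key, seq, row); heap[0] has the largest key kept
    for seq, row in enumerate(rows):
        entry = (-sort_key(row), seq, row)  # seq breaks ties before rows compare
        if len(heap) < limit:
            heapq.heappush(heap, entry)
        elif entry[0] > heap[0][0]:         # smaller key than the current worst
            heapq.heapreplace(heap, entry)  # evict the worst, keep the new row
    # Emit in ascending key order, earlier rows first among equal keys.
    ordered = sorted(heap, key=lambda e: (-e[0], e[1]))
    return [row for _, _, row in ordered]


rows = [{"col2": v} for v in [30, 10, 50, 20, 40]]
print([r["col2"] for r in top_n_with_priority_queue(rows, lambda r: r["col2"], 3)])
# prints [10, 20, 30]
```

The heap never holds more than `limit` entries, which is why nothing needs to be spilled to a temporary file.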

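As the above shows, MySQL already works hard to optimize sorting, which also indirectly shows that it would rather not sort at all. For the SQL at the beginning, creating a composite index on (col1, col2) avoids the sort. The reason for that goes back to how B+ tree indexes work...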
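One way to picture why a (col1, col2) index removes the sort is to model the index as a list of entries kept in (col1, col2) order. This is a simplified sketch and not a B+ tree: the sorted list, the sample data, and the `range_scan` helper are assumptions made for illustration, and col1 is assumed to be an integer.

```python
import bisect

# A composite index on (col1, col2), modelled as sorted (col1, col2, rowid) tuples.
index = sorted([
    (4, 30, 1),
    (4, 10, 2),
    (7, 5, 3),
    (4, 20, 4),
    (2, 99, 5),
])


def range_scan(index, col1_value):
    """Return the entries with col1 == col1_value, in index order."""
    lo = bisect.bisect_left(index, (col1_value,))      # first entry with this col1
    hi = bisect.bisect_left(index, (col1_value + 1,))  # first entry past it
    return index[lo:hi]


entries = range_scan(index, 4)
print([col2 for _, col2, _ in entries])  # prints [10, 20, 30]
```

All entries with col1 = 4 are adjacent, and within that range they are already ordered by col2, so reading them in index order satisfies ORDER BY col2 without an extra sort.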

If you are interested, you can follow the subscription account "數據庫最佳實踐" (DBBestPractice).
