溫馨提示×

您好,登錄后才能下訂單哦!

密碼登錄×
登錄注冊(cè)×
其他方式登錄
點(diǎn)擊 登錄注冊(cè) 即表示同意《億速云用戶服務(wù)條款》

MySQL數(shù)據(jù)庫中怎么對(duì)in與range查詢進(jìn)行優(yōu)化

發(fā)布時(shí)間:2020-12-04 15:20:45 來源:億速云 閱讀:225 作者:Leah 欄目:開發(fā)技術(shù)

今天就跟大家聊聊有關(guān)MySQL數(shù)據(jù)庫中怎么對(duì)in與range查詢進(jìn)行優(yōu)化,可能很多人都不太了解,為了讓大家更加了解,小編給大家總結(jié)了以下內(nèi)容,希望大家根據(jù)這篇文章可以有所收獲。

首先我們來說下in()這種方式的查詢。在《高性能MySQL》里面提及用in這種方式可以有效的替代一定的range查詢,提升查詢效率,因?yàn)樵谝粭l索引里面,range字段后面的部分是不生效的。使用in這種方式其實(shí)MySQL優(yōu)化器是轉(zhuǎn)化成了n*m種組合方式來進(jìn)行查詢,最終將返回值合并,有點(diǎn)類似union但是更高效。同時(shí)它存在這一些問題:

老版本的MySQL在IN()組合條件過多的時(shí)候會(huì)發(fā)生很多問題。查詢優(yōu)化可能需要花很多時(shí)間,并消耗大量?jī)?nèi)存。新版本MySQL在組合數(shù)超過一定的數(shù)量就不進(jìn)行計(jì)劃評(píng)估了,這可能導(dǎo)致MySQL不能很好的利用索引。

這里的“一定數(shù)量”在MySQL5.6.5以及以后的版本中是由eq_range_index_dive_limit這個(gè)參數(shù)控制(感謝@葉金榮同學(xué)的指點(diǎn))。默認(rèn)設(shè)置是10,一直到5.7以后的版本默認(rèn)會(huì)修改成200,當(dāng)然我們是可以手動(dòng)設(shè)置的。我們看下5.6手冊(cè)中的說明:

The eq_range_index_dive_limit system variable enables you to configure the number of values at which the optimizer switches from one row estimation strategy to the other. To disable use of statistics and always use index dives, set eq_range_index_dive_limit to 0. To permit use of index dives for comparisons of up to N equality ranges, set eq_range_index_dive_limit to N + 1.
eq_range_index_dive_limit is available as of MySQL 5.6.5. Before 5.6.5, the optimizer uses index dives, which is equivalent to eq_range_index_dive_limit=0.

也就是說:

1. eq_range_index_dive_limit = 0 只能使用index dive
2. 0 < eq_range_index_dive_limit <= N 使用index statistics
3. eq_range_index_dive_limit > N 只能使用index dive

index dive與index statistics是MySQL優(yōu)化器對(duì)開銷代價(jià)的估算方法,前者統(tǒng)計(jì)速度慢但是能得到精準(zhǔn)的值,后者統(tǒng)計(jì)速度快但是數(shù)據(jù)未必精準(zhǔn)。

the optimizer can estimate the row count for each range using dives into the index or index statistics.

在MySQL5.7版本中將默認(rèn)值從10修改成200目的是為了盡可能的保證范圍等值運(yùn)算(IN())執(zhí)行計(jì)劃盡量精準(zhǔn),因?yàn)镮N()list的數(shù)量很多時(shí)候都是超過10的。

說在前面

今天文章的主題有兩個(gè):

  1. range查詢與索引使用
  2. eq_range_index_dive_limit的說明
     

range查詢與索引使用

SQL如下:

SELECT * FROM pre_forum_post WHERE tid=7932552 AND `invisible` IN('0','-2') 
ORDER BY dateline DESC LIMIT 10;

索引如下:

+----------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table     | Non_unique | Key_name   | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| pre_forum_post |     0 | PRIMARY   |      1 | tid     | A     |    NULL |   NULL | NULL  |   | BTREE   |     |        | 
| pre_forum_post |     0 | PRIMARY   |      2 | position  | A     |  25521392 |   NULL | NULL  |   | BTREE   |     |        | 
| pre_forum_post |     0 | pid     |      1 | pid     | A     |  25521392 |   NULL | NULL  |   | BTREE   |     |        | 
| pre_forum_post |     1 | fid     |      1 | fid     | A     |    1490 |   NULL | NULL  |   | BTREE   |     |        | 
| pre_forum_post |     1 | displayorder |      1 | tid     | A     |   880048 |   NULL | NULL  |   | BTREE   |     |        | 
| pre_forum_post |     1 | displayorder |      2 | invisible  | A     |   945236 |   NULL | NULL  |   | BTREE   |     |        | 
| pre_forum_post |     1 | displayorder |      3 | dateline  | A     |  25521392 |   NULL | NULL  |   | BTREE   |     |        | 
| pre_forum_post |     1 | first    |      1 | tid     | A     |   880048 |   NULL | NULL  |   | BTREE   |     |        | 
| pre_forum_post |     1 | first    |      2 | first    | A     |   1215304 |   NULL | NULL  |   | BTREE   |     |        | 
| pre_forum_post |     1 | new_auth   |      1 | authorid  | A     |   1963184 |   NULL | NULL  |   | BTREE   |     |        | 
| pre_forum_post |     1 | new_auth   |      2 | invisible  | A     |   1963184 |   NULL | NULL  |   | BTREE   |     |        | 
| pre_forum_post |     1 | new_auth   |      3 | tid     | A     |  12760696 |   NULL | NULL  |   | BTREE   |     |        | 
| pre_forum_post |     1 | idx_dt    |      1 | dateline  | A     |  25521392 |   NULL | NULL  |   | BTREE   |     |        | 
| pre_forum_post |     1 | mul_test   |      1 | tid     | A     |   880048 |   NULL | NULL  |   | BTREE   |     |        | 
| pre_forum_post |     1 | mul_test   |      2 | invisible  | A     |   945236 |   NULL | NULL  |   | BTREE   |     |        | 
| pre_forum_post |     1 | mul_test   |      3 | dateline  | A     |  25521392 |   NULL | NULL  |   | BTREE   |     |        | 
| pre_forum_post |     1 | mul_test   |      4 | pid     | A     |  25521392 |   NULL | NULL  |   | BTREE   |     |        | 
+----------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

看下執(zhí)行計(jì)劃:

root@localhost 16:08:27 [ultrax]> explain SELECT * FROM pre_forum_post WHERE tid=7932552 AND `invisible` IN('0','-2') 
  -> ORDER BY dateline DESC LIMIT 10;
+----+-------------+----------------+-------+-------------------------------------------+--------------+---------+------+------+---------------------------------------+
| id | select_type | table     | type | possible_keys               | key     | key_len | ref | rows | Extra                 |
+----+-------------+----------------+-------+-------------------------------------------+--------------+---------+------+------+---------------------------------------+
| 1 | SIMPLE   | pre_forum_post | range | PRIMARY,displayorder,first,mul_test,idx_1 | displayorder | 4    | NULL |  54 | Using index condition; Using filesort | 
+----+-------------+----------------+-------+-------------------------------------------+--------------+---------+------+------+---------------------------------------+
1 row in set (0.00 sec)

MySQL優(yōu)化器認(rèn)為這是一個(gè)range查詢,那么(tid,invisible,dateline)這條索引中,dateline字段肯定用不上了,也就是說這個(gè)SQL最后的排序肯定會(huì)生成一個(gè)臨時(shí)結(jié)果集,然后再結(jié)果集里面完成排序,而不是直接在索引中直接完成排序動(dòng)作,于是我們嘗試增加了一條索引。

root@localhost 16:09:06 [ultrax]> alter table pre_forum_post add index idx_1 (tid,dateline);  
Query OK, 20374596 rows affected, 0 warning (600.23 sec)
Records: 0 Duplicates: 0 Warnings: 0
root@localhost 16:20:22 [ultrax]> explain SELECT * FROM pre_forum_post force index (idx_1) WHERE tid=7932552 AND `invisible` IN('0','-2') ORDER BY dateline DESC LIMIT 10;
+----+-------------+----------------+------+---------------+-------+---------+-------+--------+-------------+
| id | select_type | table     | type | possible_keys | key  | key_len | ref  | rows  | Extra    |
+----+-------------+----------------+------+---------------+-------+---------+-------+--------+-------------+
| 1 | SIMPLE   | pre_forum_post | ref | idx_1     | idx_1 | 3    | const | 120646 | Using where | 
+----+-------------+----------------+------+---------------+-------+---------+-------+--------+-------------+
1 row in set (0.00 sec)
root@localhost 16:22:06 [ultrax]> SELECT sql_no_cache * FROM pre_forum_post WHERE tid=7932552 AND `invisible` IN('0','-2') ORDER BY dateline DESC LIMIT 10;
...
10 rows in set (0.40 sec)
root@localhost 16:23:55 [ultrax]> SELECT sql_no_cache * FROM pre_forum_post force index (idx_1) WHERE tid=7932552 AND `invisible` IN('0','-2') ORDER BY dateline DESC LIMIT 10;
...
10 rows in set (0.00 sec)

實(shí)驗(yàn)證明效果是極好的,其實(shí)不難理解,上面我們就說了in()在MySQL優(yōu)化器里面是以多種組合方式來檢索數(shù)據(jù)的,如果加了一個(gè)排序或者分組那勢(shì)必只能在臨時(shí)結(jié)果集上操作,也就是說索引里面即使包含了排序或者分組的字段依然是沒用的。唯一不滿的是MySQL優(yōu)化器的選擇依然不夠靠譜。

總結(jié)下:在MySQL查詢里面使用in(),除了要注意in()list的數(shù)量以及eq_range_index_dive_limit的值以外(具體見下),還要注意如果SQL包含排序/分組/去重等等就需要注意索引的使用。

eq_range_index_dive_limit的說明

還是上面的案例,為什么idx_1無法直接使用?需要使用hint強(qiáng)制只用這個(gè)索引呢?這里我們首先看下eq_range_index_dive_limit的值。

root@localhost 22:38:05 [ultrax]> show variables like 'eq_range_index_dive_limit';
+---------------------------+-------+
| Variable_name       | Value |
+---------------------------+-------+
| eq_range_index_dive_limit | 2   | 
+---------------------------+-------+
1 row in set (0.00 sec)

根據(jù)我們上面說的這種情況0 < eq_range_index_dive_limit <= N使用index statistics,那么接下來我們用OPTIMIZER_TRACE來一看究竟。

{
 "index": "displayorder",
 "ranges": [
  "7932552 <= tid <= 7932552 AND -2 <= invisible <= -2",
  "7932552 <= tid <= 7932552 AND 0 <= invisible <= 0"
 ],
 "index_dives_for_eq_ranges": false,
 "rowid_ordered": false,
 "using_mrr": false,
 "index_only": false,
 "rows": 54,
 "cost": 66.81,
 "chosen": true
}
// index dive為false,最終chosen是true
...
{
 "index": "idx_1",
 "ranges": [
  "7932552 <= tid <= 7932552"
 ],
 "index_dives_for_eq_ranges": true,
 "rowid_ordered": false,
 "using_mrr": false,
 "index_only": false,
 "rows": 120646,
 "cost": 144776,
 "chosen": false,
 "cause": "cost"
}

我們可以看到displayorder索引的cost是66.81,而idx_1的cost是120646,而最終MySQL優(yōu)化器選擇了displayorder這條索引。那么如果我們把eq_range_index_dive_limit設(shè)置>N是不是應(yīng)該就會(huì)使用index dive計(jì)算方式,得到更準(zhǔn)確的執(zhí)行計(jì)劃呢?

root@localhost 22:52:52 [ultrax]> set eq_range_index_dive_limit = 3;
Query OK, 0 rows affected (0.00 sec)
root@localhost 22:55:38 [ultrax]> explain SELECT * FROM pre_forum_post WHERE tid=7932552 AND `invisible` IN('0','-2') ORDER BY dateline DESC LIMIT 10;
+----+-------------+----------------+------+-------------------------------------------+-------+---------+-------+--------+-------------+
| id | select_type | table     | type | possible_keys               | key  | key_len | ref  | rows  | Extra    |
+----+-------------+----------------+------+-------------------------------------------+-------+---------+-------+--------+-------------+
| 1 | SIMPLE   | pre_forum_post | ref | PRIMARY,displayorder,first,mul_test,idx_1 | idx_1 | 3    | const | 120646 | Using where | 
+----+-------------+----------------+------+-------------------------------------------+-------+---------+-------+--------+-------------+
1 row in set (0.00 sec)

optimize_trace結(jié)果如下

{
 "index": "displayorder",
 "ranges": [
  "7932552 <= tid <= 7932552 AND -2 <= invisible <= -2",
  "7932552 <= tid <= 7932552 AND 0 <= invisible <= 0"
 ],
 "index_dives_for_eq_ranges": true,
 "rowid_ordered": false,
 "using_mrr": false,
 "index_only": false,
 "rows": 188193,
 "cost": 225834,
 "chosen": true
}
...
{
 "index": "idx_1",
 "ranges": [
  "7932552 <= tid <= 7932552"
 ],
 "index_dives_for_eq_ranges": true,
 "rowid_ordered": false,
 "using_mrr": false,
 "index_only": false,
 "rows": 120646,
 "cost": 144776,
 "chosen": true
}
...
 "cost_for_plan": 144775,
 "rows_for_plan": 120646,
 "chosen": true
// 在備選索引選擇中兩條索引都被選擇,在最后的邏輯優(yōu)化中選在了代價(jià)最小的索引也就是idx_1

以上就是在等值范圍查詢中eq_range_index_dive_limit的值怎么影響MySQL優(yōu)化器計(jì)算開銷,從而影響索引的選擇。另外我們可以通過profiling來看看優(yōu)化器的統(tǒng)計(jì)耗時(shí):

index dive

+----------------------+----------+
| Status        | Duration |
+----------------------+----------+
| starting       | 0.000048 | 
| checking permissions | 0.000004 | 
| Opening tables    | 0.000015 | 
| init         | 0.000044 | 
| System lock     | 0.000009 | 
| optimizing      | 0.000014 | 
| statistics      | 0.032089 | 
| preparing      | 0.000022 | 
| Sorting result    | 0.000003 | 
| executing      | 0.000003 | 
| Sending data     | 0.000101 | 
| end         | 0.000004 | 
| query end      | 0.000002 | 
| closing tables    | 0.000009 | 
| freeing items    | 0.000013 | 
| cleaning up     | 0.000012 | 
+----------------------+----------+

index statistics

+----------------------+----------+
| Status        | Duration |
+----------------------+----------+
| starting       | 0.000045 | 
| checking permissions | 0.000003 | 
| Opening tables    | 0.000014 | 
| init         | 0.000040 | 
| System lock     | 0.000008 | 
| optimizing      | 0.000014 | 
| statistics      | 0.000086 | 
| preparing      | 0.000016 | 
| Sorting result    | 0.000002 | 
| executing      | 0.000002 | 
| Sending data     | 0.000016 | 
| Creating sort index | 0.412123 | 
| end         | 0.000012 | 
| query end      | 0.000004 | 
| closing tables    | 0.000013 | 
| freeing items    | 0.000023 | 
| cleaning up     | 0.000015 | 
+----------------------+----------+

可以看到當(dāng)eq_range_index_dive_limit加大使用index dive時(shí),優(yōu)化器統(tǒng)計(jì)耗時(shí)明顯比ndex statistics方式來的長(zhǎng),但最終它使用了作出了更合理的執(zhí)行計(jì)劃。統(tǒng)計(jì)耗時(shí)0.032089s vs .000086s,但是SQL執(zhí)行耗時(shí)卻是約0.03s vs 0.41s。

附:如何使用optimize_trace

set optimizer_trace='enabled=on'; 
select * from information_schema.optimizer_trace\G
// 注:optimizer_trace建議只在session模式下開啟調(diào)試即可

看完上述內(nèi)容,你們對(duì)MySQL數(shù)據(jù)庫中怎么對(duì)in與range查詢進(jìn)行優(yōu)化有進(jìn)一步的了解嗎?如果還想了解更多知識(shí)或者相關(guān)內(nèi)容,請(qǐng)關(guān)注億速云行業(yè)資訊頻道,感謝大家的支持。

向AI問一下細(xì)節(jié)

免責(zé)聲明:本站發(fā)布的內(nèi)容(圖片、視頻和文字)以原創(chuàng)、轉(zhuǎn)載和分享為主,文章觀點(diǎn)不代表本網(wǎng)站立場(chǎng),如果涉及侵權(quán)請(qǐng)聯(lián)系站長(zhǎng)郵箱:is@yisu.com進(jìn)行舉報(bào),并提供相關(guān)證據(jù),一經(jīng)查實(shí),將立刻刪除涉嫌侵權(quán)內(nèi)容。

AI