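I had never studied InnoDB undo properly. With some spare time recently, I went through it and summarize below what I learned from the Alibaba kernel monthly report and from reading the source code. Environment for this article: the percona-server-locks-detail-5.7.22 source tree (the path that appears in the stack trace below) with innodb_undo_tablespaces=4 (n_conf_tablespaces=4 in the same trace).
The rest of the article assumes these settings.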
Undo tablespaces are created by the function srv_undo_tablespaces_init. The stack frames are as follows:
    #0  srv_undo_tablespaces_init (create_new_db=true, n_conf_tablespaces=4, n_opened=0x2ef55b0) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/srv/srv0start.cc:824
    #1  0x0000000001bbd7e0 in innobase_start_or_create_for_mysql () at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/srv/srv0start.cc:2188
    #2  0x00000000019ca74e in innobase_init (p=0x2f2a420) at /root/mysqlc/percona-server-locks-detail-5.7.22/storage/innobase/handler/ha_innodb.cc:4409
    #3  0x0000000000f7ec2a in ha_initialize_handlerton (plugin=0x2fca110) at /root/mysqlc/percona-server-locks-detail-5.7.22/sql/handler.cc:871
    #4  0x00000000015f9edf in plugin_initialize (plugin=0x2fca110) at /root/mysqlc/percona-server-locks-detail-5.7.22/sql/sql_plugin.cc:1252
The process consists of the following main steps:
    for (i = 0; create_new_db && i < n_conf_tablespaces; ++i)  // n_conf_tablespaces is the configured value of innodb_undo_tablespaces

    /** Default undo tablespace size in UNIV_PAGEs count (10MB). */
    const ulint SRV_UNDO_TABLESPACE_SIZE_IN_PAGES =
        ((1024 * 1024) * 10) / UNIV_PAGE_SIZE_DEF;
    ...
    err = srv_undo_tablespace_create(
        name, SRV_UNDO_TABLESPACE_SIZE_IN_PAGES);  // create the undo file
    ...
This step carries the following comment in the source:
    /* Create the undo spaces only if we are creating a new
    instance. We don't allow creating of new undo tablespaces
    in an existing instance (yet).  This restriction exists because
    we check in several places for SYSTEM tablespaces to be less than
    the min of user defined tablespace ids. Once we implement saving
    the location of the undo tablespaces and their space ids this
    restriction will/should be lifted. */
Put simply, undo tablespaces can only be created when the instance is initialized, because their space ids are fixed at that point.
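The size constant used in this step is easy to check by hand. The sketch below redoes the arithmetic, assuming the default 16KB page size for UNIV_PAGE_SIZE_DEF. The names kPageSizeDef, kUndoTablespaceSizeInPages and undo_file_bytes are made up for illustration and are not InnoDB identifiers.

```cpp
#include <cstdint>

// Assumption: default InnoDB page size of 16KB (UNIV_PAGE_SIZE_DEF).
constexpr std::uint64_t kPageSizeDef = 16 * 1024;

// Same expression as SRV_UNDO_TABLESPACE_SIZE_IN_PAGES: 10MB counted in pages.
constexpr std::uint64_t kUndoTablespaceSizeInPages =
    ((1024 * 1024) * 10) / kPageSizeDef;

// Hypothetical helper: initial size in bytes of one undo tablespace file.
constexpr std::uint64_t undo_file_bytes(std::uint64_t pages,
                                        std::uint64_t page_size) {
  return pages * page_size;
}

// 10MB / 16KB = 640 pages per undo tablespace file.
static_assert(kUndoTablespaceSizeInPages == 640, "unexpected page count");
```

With innodb_undo_tablespaces=4 this would give four files of 640 pages each at creation time.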
    for (i = 0; i < n_undo_tablespaces; ++i) {
        ....
        err = srv_undo_tablespace_open(name, undo_tablespace_ids[i]);  // open the undo file and create its file node
        ...
    }
    for (i = 0; i < n_undo_tablespaces; ++i) {
        fsp_header_init(  // initialize the fsp header; the space id is clearly written here
            undo_tablespace_ids[i],
            SRV_UNDO_TABLESPACE_SIZE_IN_PAGES, &mtr);  // SRV_UNDO_TABLESPACE_SIZE_IN_PAGES is the default undo size, 10MB
    }
Part of the code of fsp_header_init is as follows:
    mlog_write_ulint(header + FSP_SPACE_ID, space_id, MLOG_4BYTES, mtr);
    mlog_write_ulint(header + FSP_NOT_USED, 0, MLOG_4BYTES, mtr);
    mlog_write_ulint(header + FSP_SIZE, size, MLOG_4BYTES, mtr);
    mlog_write_ulint(header + FSP_FREE_LIMIT, 0, MLOG_4BYTES, mtr);
    mlog_write_ulint(header + FSP_SPACE_FLAGS, space->flags, MLOG_4BYTES, mtr);
    mlog_write_ulint(header + FSP_FRAG_N_USED, 0, MLOG_4BYTES, mtr);

    flst_init(header + FSP_FREE, mtr);
    flst_init(header + FSP_FREE_FRAG, mtr);
    flst_init(header + FSP_FULL_FRAG, mtr);
    flst_init(header + FSP_SEG_INODES_FULL, mtr);
    flst_init(header + FSP_SEG_INODES_FREE, mtr);
All of these fields belong to the file space (fsp) layer.
After this step, four undo tablespace files of 10MB each exist and have been registered in the InnoDB file system, but they do not hold any content yet.
This step is performed by trx_sys_create_sys_pages->trx_sysf_create. Besides creating the transaction system segment, it also initializes that segment's header (ibdata page no 5), as follows:
    /* Create the trx sys file block in a new allocated file segment */
    block = fseg_create(TRX_SYS_SPACE, 0, TRX_SYS + TRX_SYS_FSEG_HEADER,
                        mtr);  // create the segment
    buf_block_dbg_add_level(block, SYNC_TRX_SYS_HEADER);

    ut_a(block->page.id.page_no() == TRX_SYS_PAGE_NO);

    page = buf_block_get_frame(block);  // get the in-memory frame

    mlog_write_ulint(page + FIL_PAGE_TYPE, FIL_PAGE_TYPE_TRX_SYS,  // write the block type
                     MLOG_2BYTES, mtr);
    ...
    /* Start counting transaction ids from number 1 up */
    mach_write_to_8(sys_header + TRX_SYS_TRX_ID_STORE, 1);  // initialize TRX_SYS_TRX_ID_STORE

    /* Reset the rollback segment slots.  Old versions of InnoDB
    define TRX_SYS_N_RSEGS as 256 (TRX_SYS_OLD_N_RSEGS) and expect
    that the whole array is initialized. */
    ptr = TRX_SYS_RSEGS + sys_header;
    len = ut_max(TRX_SYS_OLD_N_RSEGS, TRX_SYS_N_RSEGS)
          * TRX_SYS_RSEG_SLOT_SIZE;  // TRX_SYS_OLD_N_RSEGS is 256
    memset(ptr, 0xff, len);  // initialize every slot to 0xff
    ptr += len;
    ut_a(ptr <= page + (UNIV_PAGE_SIZE - FIL_PAGE_DATA_END));

    /* Initialize all of the page.  This part used to be uninitialized. */
    memset(ptr, 0, UNIV_PAGE_SIZE - FIL_PAGE_DATA_END + page - ptr);  // zero the remaining space

    mlog_log_string(sys_header, UNIV_PAGE_SIZE - FIL_PAGE_DATA_END
                    + page - sys_header, mtr);

    /* Create the first rollback segment in the SYSTEM tablespace */
    slot_no = trx_sysf_rseg_find_free(mtr, false, 0);
    page_no = trx_rseg_header_create(TRX_SYS_SPACE, univ_page_size,
                                     ULINT_MAX, slot_no, mtr);  // the first slot is pinned in ibdata
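Once this step is done, block 5 of ibdata is fully initialized, and so are all the rollback segment slots. The source initializes 256 of them, but at most 128 are ever used, and slot 0 is always located in ibdata. Note that each slot has the size TRX_SYS_RSEG_SLOT_SIZE, which is 8 bytes: a 4-byte space id followed by a 4-byte page no. Together they point to the location of a rollback segment header.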
    /** Transaction system header */
    /*------------------------------------------------------------- @{ */
    #define TRX_SYS_TRX_ID_STORE    0   /*!< the maximum trx id or trx
                                        number modulo
                                        TRX_SYS_TRX_ID_UPDATE_MARGIN
                                        written to a file page by any
                                        transaction; the assignment of
                                        transaction ids continues from
                                        this number rounded up by
                                        TRX_SYS_TRX_ID_UPDATE_MARGIN
                                        plus
                                        TRX_SYS_TRX_ID_UPDATE_MARGIN
                                        when the database is
                                        started */
    // the maximum transaction ID; at the next instance startup, TRX_SYS_TRX_ID_UPDATE_MARGIN is added to it
    #define TRX_SYS_FSEG_HEADER     8   /*!< segment header for the
                                        tablespace segment the trx
                                        system is created into */
    #define TRX_SYS_RSEGS           (8 + FSEG_HEADER_SIZE)
                                        /*!< the start of the array of
                                        rollback segment specification
                                        slots */
    // the slots that point to the rollback segment headers
    /*------------------------------------------------------------- @} */
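To make the slot layout concrete, here is a minimal sketch of how one rollback segment slot could be located and decoded from a raw copy of ibdata page 5. It is not InnoDB code. The numeric values are assumptions based on my reading of the 5.7 headers (TRX_SYS starting at FIL_PAGE_DATA = 38, FSEG_HEADER_SIZE = 10, integers stored big-endian), and RsegSlot, slot_offset, read_slot and slot_is_unused are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed values; check them against your own source tree.
constexpr std::size_t kTrxSys = 38;                        // TRX_SYS == FIL_PAGE_DATA
constexpr std::size_t kFsegHeaderSize = 10;                // FSEG_HEADER_SIZE
constexpr std::size_t kTrxSysRsegs = 8 + kFsegHeaderSize;  // TRX_SYS_RSEGS
constexpr std::size_t kRsegSlotSize = 8;                   // TRX_SYS_RSEG_SLOT_SIZE
constexpr std::uint32_t kFilNull = 0xFFFFFFFFu;            // FIL_NULL

// One slot: 4-byte space id followed by 4-byte page no.
struct RsegSlot {
  std::uint32_t space_id;
  std::uint32_t page_no;
};

// Reads a 4-byte big-endian integer, the on-disk byte order assumed here.
inline std::uint32_t read_be32(const std::uint8_t* p) {
  return (std::uint32_t{p[0]} << 24) | (std::uint32_t{p[1]} << 16) |
         (std::uint32_t{p[2]} << 8) | std::uint32_t{p[3]};
}

// Byte offset of slot slot_no from the start of the page.
inline std::size_t slot_offset(std::size_t slot_no) {
  return kTrxSys + kTrxSysRsegs + slot_no * kRsegSlotSize;
}

// Decodes one slot; the caller must pass a buffer holding the whole page.
inline RsegSlot read_slot(const std::vector<std::uint8_t>& page,
                          std::size_t slot_no) {
  const std::uint8_t* p = page.data() + slot_offset(slot_no);
  return RsegSlot{read_be32(p), read_be32(p + 4)};
}

// A slot still filled with 0xff by the memset above has page_no == FIL_NULL.
inline bool slot_is_unused(const RsegSlot& slot) {
  return slot.page_no == kFilNull;
}
```

Under these assumptions slot 0 starts at byte 56 of the page, and a freshly initialized slot decodes to FIL_NULL for both fields.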
This is done by calling trx_sys_create_rsegs:
According to the comments and the code, innodb_undo_logs is a deprecated parameter and innodb_rollback_segments should be used instead.
Both parameters default to TRX_SYS_N_RSEGS, that is 128, so there is normally no need to set them. This article also assumes 128.
Parameter innodb_rollback_segments:
    static MYSQL_SYSVAR_ULONG(rollback_segments, srv_rollback_segments,
      PLUGIN_VAR_OPCMDARG,
      "Number of rollback segments to use for storing undo logs.",
      NULL, NULL,
      TRX_SYS_N_RSEGS,      /* Default setting */
      1,                    /* Minimum value */
      TRX_SYS_N_RSEGS, 0);  /* Maximum value */
Parameter innodb_undo_logs:
    static MYSQL_SYSVAR_ULONG(undo_logs, srv_undo_logs,
      PLUGIN_VAR_OPCMDARG,
      "Number of rollback segments to use for storing undo logs. (deprecated)",
      NULL, innodb_undo_logs_update,
      TRX_SYS_N_RSEGS,      /* Default setting */
      1,                    /* Minimum value */
      TRX_SYS_N_RSEGS, 0);  /* Maximum value */
TRX_SYS_N_RSEGS is 128.
The relevant comment and code are as follows:
    /* Deprecate innodb_undo_logs.  But still use it if it is set to
    non-default and innodb_rollback_segments is default. */
    if (srv_undo_logs < TRX_SYS_N_RSEGS) {
        ib::warn() << deprecated_undo_logs;
        if (srv_rollback_segments == TRX_SYS_N_RSEGS) {
            srv_rollback_segments = srv_undo_logs;
        }
    }
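The interplay of the two parameters can be restated as a small pure function. This is only a sketch of the rule in the excerpt above, with the deprecation warning left out. effective_rollback_segments is a hypothetical name, and 128 is the TRX_SYS_N_RSEGS value quoted in this article.

```cpp
// TRX_SYS_N_RSEGS as quoted in the article.
constexpr unsigned long kTrxSysNRsegs = 128;

// Hypothetical helper: the number of rollback segments that ends up being used.
// innodb_undo_logs only takes effect when it is set below the default and
// innodb_rollback_segments is still at its default.
constexpr unsigned long effective_rollback_segments(
    unsigned long undo_logs, unsigned long rollback_segments) {
  return (undo_logs < kTrxSysNRsegs && rollback_segments == kTrxSysNRsegs)
             ? undo_logs
             : rollback_segments;
}
```

For example, setting only innodb_undo_logs=64 would yield 64, while setting both innodb_undo_logs=64 and innodb_rollback_segments=100 would yield 100.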
    n_noredo_created = trx_sys_create_noredo_rsegs(n_tmp_rsegs);  // create the 32 temporary rollback segments
Temporary rollback segments are not covered in this article.
    ulint new_rsegs = n_rsegs - n_used;  // e.g. 128 - 33 = 95

    for (i = 0; i < new_rsegs; ++i) {  // initialize each rollback segment
        ulint space_id;
        space_id = (n_spaces == 0) ? 0
            : (srv_undo_space_id_start + i % n_spaces);  // pick the undo space_id by modulo, cycling through 1 2 3 4

        ut_ad(n_spaces == 0 || srv_is_undo_tablespace(space_id));

        if (trx_rseg_create(space_id, 0) != NULL)
Note the modulo expression i % n_spaces, where n_spaces is the value we configured for innodb_undo_tablespaces. Each rollback segment is therefore assigned round-robin to one of the 4 undo tablespaces.
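The round-robin rule is just the expression from the loop above. A minimal sketch follows, where undo_space_for_rseg is a hypothetical helper name and the starting space id of 1 in the usage note is an assumption taken from the "1 2 3 4" comment in the excerpt.

```cpp
// Hypothetical helper mirroring the space_id expression in the loop above.
// i is the loop index, n_spaces the number of undo tablespaces, and
// space_id_start plays the role of srv_undo_space_id_start.
constexpr unsigned long undo_space_for_rseg(unsigned long i,
                                            unsigned long n_spaces,
                                            unsigned long space_id_start) {
  // With no separate undo tablespaces everything goes to space 0 (ibdata).
  return n_spaces == 0 ? 0 : space_id_start + i % n_spaces;
}
```

With 4 undo tablespaces starting at space id 1, loop indexes 0, 1, 2, 3, 4 map to space ids 1, 2, 3, 4, 1.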
The work above is done by trx_rseg_create, which calls trx_rseg_header_create. The steps are roughly as follows:
1. Create the rollback segment
    block = fseg_create(space, 0, TRX_RSEG + TRX_RSEG_FSEG_HEADER, mtr);  // create a rollback segment and return the block holding the segment header
2. Initialize the TRX_RSEG_MAX_SIZE and TRX_RSEG_HISTORY_SIZE fields
    /* Initialize max size field */
    mlog_write_ulint(rsegf + TRX_RSEG_MAX_SIZE, max_size,
                     MLOG_4BYTES, mtr);

    /* Initialize the history list */
    mlog_write_ulint(rsegf + TRX_RSEG_HISTORY_SIZE, 0, MLOG_4BYTES, mtr);
    flst_init(rsegf + TRX_RSEG_HISTORY, mtr);
3. Initialize the page no stored for each undo segment header
    for (i = 0; i < TRX_RSEG_N_SLOTS; i++) {  // TRX_RSEG_N_SLOTS is 1024; each slot is a 4-byte page no of an undo segment header
        trx_rsegf_set_nth_undo(rsegf, i, FIL_NULL, mtr);
    }
Right after initialization every slot holds FIL_NULL as its page no, which means that no actual undo segment has been allocated yet.
4. Once the whole rollback segment is initialized, write its space id and page no back into the transaction system segment header.
    sys_header = trx_sysf_get(mtr);  // get a pointer into block 5, past the FIL_PAGE_DATA (38U) bytes
    trx_sysf_rseg_set_space(sys_header, rseg_slot_no, space, mtr);  // set the space id
    trx_sysf_rseg_set_page_no(sys_header, rseg_slot_no, page_no, mtr);  // set the page no
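Step 3 above can be pictured with a small in-memory model of the slot array. This is only an illustration and not how InnoDB accesses the page. UndoSlots, make_empty_slots and find_free_slot are hypothetical names, and the slot count of 1024 and 4-byte slot width are the values quoted in this article.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kTrxRsegNSlots = 1024;     // TRX_RSEG_N_SLOTS as quoted above
constexpr std::uint32_t kFilNull = 0xFFFFFFFFu;  // FIL_NULL: "points nowhere"

// In-memory stand-in for the undo segment slot array of one rollback segment.
using UndoSlots = std::array<std::uint32_t, kTrxRsegNSlots>;

// Mirrors the initialization loop: every slot starts out as FIL_NULL.
inline UndoSlots make_empty_slots() {
  UndoSlots slots;
  slots.fill(kFilNull);
  return slots;
}

// Returns the index of the first unused slot, or kTrxRsegNSlots if all are taken.
inline std::size_t find_free_slot(const UndoSlots& slots) {
  for (std::size_t i = 0; i < slots.size(); ++i) {
    if (slots[i] == kFilNull) {
      return i;
    }
  }
  return kTrxRsegNSlots;
}
```

In this model, allocating an undo segment amounts to storing the page no of its header in the first free slot, after which find_free_slot moves on to the next index.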
    /* Transaction rollback segment header */
    /*-------------------------------------------------------------*/
    #define TRX_RSEG_MAX_SIZE       0   /* Maximum allowed size for rollback
                                        segment in pages */
    #define TRX_RSEG_HISTORY_SIZE   4   /* Number of file pages occupied
                                        by the logs in the history list */
    // size of the history list
    #define TRX_RSEG_HISTORY        8   /* The update undo logs for committed
                                        transactions */
    // list base node; it is usually modified through the functions in include/fut0lst.ic
    #define TRX_RSEG_FSEG_HEADER    (8 + FLST_BASE_NODE_SIZE)
                                        /* Header for the file segment where
                                        this page is placed */
    #define TRX_RSEG_UNDO_SLOTS     (8 + FLST_BASE_NODE_SIZE + FSEG_HEADER_SIZE)
                                        /* Undo log segment slots */
    /*-------------------------------------------------------------*/
TRX_RSEG_HISTORY is a list base node, whose layout is defined as follows:
    /* We define the field offsets of a base node for the list */
    #define FLST_LEN    0   /* 32-bit list length field */
    #define FLST_FIRST  4   /* 6-byte address of the first element
                            of the list; undefined if empty list */
    #define FLST_LAST   (4 + FIL_ADDR_SIZE) /* 6-byte address of the
                            last element of the list; undefined
                            if empty list */

    #define FIL_ADDR_PAGE   0   /* first in address is the page offset */
    #define FIL_ADDR_BYTE   4   /* then comes 2-byte byte offset within page*/
    #endif /* !UNIV_INNOCHECKSUM */
    #define FIL_ADDR_SIZE   6   /* address size is 6 bytes */
In other words, on top of the first and last addresses the base node carries an extra length field (FLST_LEN).
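Putting the two sets of definitions together gives the byte layout of a rollback segment header. The sketch below only redoes that arithmetic. FLST_BASE_NODE_SIZE follows from the base node fields above (4 + 6 + 6), while FSEG_HEADER_SIZE = 10, a 4-byte undo slot, and the header starting at FIL_PAGE_DATA = 38 are assumptions from my reading of the headers. All k-prefixed names and undo_slot_page_offset are made up for illustration.

```cpp
#include <cstddef>

constexpr std::size_t kFilAddrSize = 6;                          // FIL_ADDR_SIZE
constexpr std::size_t kFlstBaseNodeSize = 4 + 2 * kFilAddrSize;  // FLST_LEN + FLST_FIRST + FLST_LAST
constexpr std::size_t kFsegHeaderSize = 10;                      // assumed FSEG_HEADER_SIZE
constexpr std::size_t kTrxRseg = 38;                             // assumed start of the header (FIL_PAGE_DATA)
constexpr std::size_t kTrxRsegSlotSize = 4;                      // assumed size of one undo slot

// Offsets inside the rollback segment header, following the #defines above.
constexpr std::size_t kTrxRsegMaxSize = 0;
constexpr std::size_t kTrxRsegHistorySize = 4;
constexpr std::size_t kTrxRsegHistory = 8;
constexpr std::size_t kTrxRsegFsegHeader = 8 + kFlstBaseNodeSize;
constexpr std::size_t kTrxRsegUndoSlots = 8 + kFlstBaseNodeSize + kFsegHeaderSize;

// Hypothetical helper: byte offset of undo slot n from the start of the page.
constexpr std::size_t undo_slot_page_offset(std::size_t n) {
  return kTrxRseg + kTrxRsegUndoSlots + n * kTrxRsegSlotSize;
}

static_assert(kFlstBaseNodeSize == 16, "length field plus two 6-byte addresses");
```

Under these assumptions the 1024 slots occupy bytes 72 through 4167 of the page, which fits comfortably in a 16KB page.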
At this point all 128 rollback segments have been initialized, and each of them contains 1024 undo segment slots.
To keep the diagram clean and easy to follow, I drew it for innodb_undo_tablespaces=2, that is, with only 2 undo tablespaces. The same reasoning applies with 4, because the rollback segment slots are assigned to the tablespaces round-robin.
In the end we see that after initialization all undo segment slots hold FIL_NULL, that is, they point nowhere. When undo segments are actually allocated, these slots will point to the undo segment headers.
We can also check which block types an undo tablespace actually contains. Reading one with a small home-made tool gives:
    ./myblock undo001 -d|more
    current read blocks is : 0 --This Block is file space header blocks!
    current read blocks is : 1 --This Block is insert buffer bitmap blocks!
    current read blocks is : 2 --This Block is inode blocks!
    current read blocks is : 3 --This Block is system blocks!
    current read blocks is : 4 --This Block is system blocks!
    current read blocks is : 5 --This Block is system blocks!
    current read blocks is : 6 --This Block is system blocks!
    current read blocks is : 7 --This Block is system blocks!
    current read blocks is : 8 --This Block is system blocks!
    current read blocks is : 9 --This Block is system blocks!
    current read blocks is : 10 --This Block is system blocks!
    current read blocks is : 11 --This Block is system blocks!
    current read blocks is : 12 --This Block is system blocks!
    current read blocks is : 13 --This Block is system blocks!
    current read blocks is : 14 --This Block is system blocks!
    current read blocks is : 15 --This Block is system blocks!
    current read blocks is : 16 --This Block is system blocks!
    current read blocks is : 17 --This Block is system blocks!
    current read blocks is : 18 --This Block is system blocks!
    current read blocks is : 19 --This Block is system blocks!
    current read blocks is : 20 --This Block is system blocks!
    current read blocks is : 21 --This Block is system blocks!
    current read blocks is : 22 --This Block is system blocks!
    current read blocks is : 23 --This Block is system blocks!
    current read blocks is : 24 --This Block is system blocks!
    current read blocks is : 25 --This Block is system blocks!
    current read blocks is : 26 --This Block is system blocks!
    current read blocks is : 27 --This Block is undo blocks!
    current read blocks is : 28 --This Block is undo blocks!
    current read blocks is : 29 --This Block is undo blocks!
    current read blocks is : 30 --This Block is undo blocks!
    current read blocks is : 31 --This Block is undo blocks!
    current read blocks is : 32 --This Block is undo blocks!
    current read blocks is : 33 --This Block is undo blocks!
    current read blocks is : 34 --This Block is undo blocks!
    current read blocks is : 35 --This Block is undo blocks!
    current read blocks is : 36 --This Block is undo blocks!
    current read blocks is : 37 --This Block is undo blocks!
    current read blocks is : 38 --This Block is new allocate blocks!
    current read blocks is : 39 --This Block is new allocate blocks!
    current read blocks is : 40 --This Block is new allocate blocks!
    current read blocks is : 41 --This Block is new allocate blocks!
    current read blocks is : 42 --This Block is new allocate blocks!
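Blocks 3 through 26 here are our rollback segment header blocks. This is of course the setup with 4 undo tablespaces, and the file examined is undo tablespace 1. That makes 24 blocks, which is what the round-robin assignment predicts: 95 regular rollback segments spread over 4 tablespaces leaves 24 in the first one. So the result checks out and the analysis is correct.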
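The count of 24 header blocks can be reproduced by replaying the modulo assignment. A minimal sketch, assuming the 95 regular rollback segments and 4 undo tablespaces used in this article; rsegs_per_undo_space is a hypothetical helper, not an InnoDB function.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: how many rollback segments land in each undo tablespace
// when segment i goes to tablespace index i % n_spaces. Index 0 of the result
// corresponds to the first undo tablespace.
inline std::vector<std::size_t> rsegs_per_undo_space(std::size_t new_rsegs,
                                                     std::size_t n_spaces) {
  std::vector<std::size_t> counts(n_spaces, 0);
  for (std::size_t i = 0; n_spaces != 0 && i < new_rsegs; ++i) {
    ++counts[i % n_spaces];
  }
  return counts;
}
```

For 95 segments over 4 tablespaces the first three tablespaces receive 24 segments each and the last one 23, which agrees with the 24 blocks (3 through 26) seen in the first undo tablespace.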
Ordinary undo segments are linked together as follows: the system segment header in block 5 of ibdata uses rollback segment slots 33-127 to point, round-robin, at the rollback segment headers in the different undo tablespaces. Each rollback segment header in turn has 1024 slots that point to the actual undo segment headers. The actual undo blocks are attached to the list under their undo segment header.
The number of undo tablespaces can only be changed by re-initializing the instance, because the space ids are fixed, so choose it carefully.
innodb_undo_tablespaces is the number of undo tablespaces, while innodb_rollback_segments is the number of rollback segments. The parameter innodb_undo_logs is deprecated; it does the same job as innodb_rollback_segments, and both default to 128.
Rollback segment slot 0 is always located in ibdata, slots 1-32 are the temporary rollback segments, and slots 33-127 are the rollback segments used by ordinary transactions (1 + 32 + 95 = 128 slots in total).
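The slot ranges in this summary can be written down as a tiny classifier. It is a sketch of the layout described in this article for the default of 128 rollback segments, not InnoDB code; RsegKind and classify_rseg_slot are hypothetical names.

```cpp
// Hypothetical classification of a rollback segment slot number (0-based).
enum class RsegKind { kSystemTablespace, kTemporary, kRegular, kInvalid };

constexpr RsegKind classify_rseg_slot(unsigned slot_no) {
  return slot_no == 0    ? RsegKind::kSystemTablespace  // slot 0 lives in ibdata
         : slot_no <= 32 ? RsegKind::kTemporary         // slots 1-32
         : slot_no < 128 ? RsegKind::kRegular           // slots 33-127
                         : RsegKind::kInvalid;          // beyond TRX_SYS_N_RSEGS
}
```

The regular range holds 127 - 33 + 1 = 95 slots, the same number as new_rsegs in the creation loop.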
References:
http://mysql.taobao.org/monthly/2015/04/01/
Alibaba kernel monthly report
Author's WeChat: gp_22389860