
PostgreSQL Source Code Walkthrough (132) - MVCC #16 (vacuum process: the lazy_vacuum_index function, #1)

Published: 2020-08-08 00:05:24 Source: ITPUB Blog Views: 168 Author: husthxd Category: Relational Databases

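This section briefly describes what happens when VACUUM is run manually in PostgreSQL. It focuses on the implementation of lazy_vacuum_index, reached through the call chain ExecVacuum->vacuum->vacuum_rel->heap_vacuum_rel->lazy_scan_heap->lazy_vacuum_index; this function vacuums an index relation.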

I. Data Structures

Enum definition
Options for the VACUUM and ANALYZE commands


/* ----------------------
 *      Vacuum and Analyze Statements
 *
 * Even though these are nominally two statements, it's convenient to use
 * just one node type for both.  Note that at least one of VACOPT_VACUUM
 * and VACOPT_ANALYZE must be set in options.
 * ----------------------
 */
typedef enum VacuumOption
{
    VACOPT_VACUUM = 1 << 0,     /* do VACUUM */
    VACOPT_ANALYZE = 1 << 1,    /* do ANALYZE */
    VACOPT_VERBOSE = 1 << 2,    /* print progress info */
    VACOPT_FREEZE = 1 << 3,     /* FREEZE option */
    VACOPT_FULL = 1 << 4,       /* FULL (non-concurrent) vacuum */
    VACOPT_SKIP_LOCKED = 1 << 5,    /* skip if cannot get lock */
    VACOPT_SKIPTOAST = 1 << 6,  /* don't process the TOAST table, if any */
    VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7   /* don't skip any pages */
} VacuumOption;
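
Because each VacuumOption value occupies its own bit, a set of options can be combined with bitwise OR and tested with bitwise AND. The following is a minimal standalone sketch of that idiom. The enum is copied from the listing above; `vacopt_has` and `vacopt_is_valid` are hypothetical helper names introduced only for illustration and are not PostgreSQL functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Copied from the listing above: each option occupies one bit. */
typedef enum VacuumOption
{
    VACOPT_VACUUM = 1 << 0,
    VACOPT_ANALYZE = 1 << 1,
    VACOPT_VERBOSE = 1 << 2,
    VACOPT_FREEZE = 1 << 3,
    VACOPT_FULL = 1 << 4,
    VACOPT_SKIP_LOCKED = 1 << 5,
    VACOPT_SKIPTOAST = 1 << 6,
    VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7
} VacuumOption;

/* Hypothetical helper: true if the given flag is set in options. */
static bool
vacopt_has(int options, VacuumOption flag)
{
    assert(flag != 0);          /* a flag must name at least one bit */
    return (options & flag) != 0;
}

/* Hypothetical helper: at least one of VACUUM / ANALYZE must be set. */
static bool
vacopt_is_valid(int options)
{
    return (options & (VACOPT_VACUUM | VACOPT_ANALYZE)) != 0;
}
```

For example, `VACOPT_VACUUM | VACOPT_VERBOSE` passes `vacopt_is_valid`, while `VACOPT_VERBOSE` alone does not, which mirrors the rule stated in the comment above the enum.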

IndexVacuumInfo
Struct for the input arguments passed to ambulkdelete and amvacuumcleanup


/*
 * Struct for input arguments passed to ambulkdelete and amvacuumcleanup
 *
 * num_heap_tuples is accurate only when estimated_count is false;
 * otherwise it's just an estimate (currently, the estimate is the
 * prior value of the relation's pg_class.reltuples field).  It will
 * always just be an estimate during ambulkdelete.
 */
typedef struct IndexVacuumInfo
{
    Relation    index;          /* the index being vacuumed */
    bool        analyze_only;   /* ANALYZE (without any actual vacuum) */
    bool        estimated_count;    /* num_heap_tuples is an estimate */
    int         message_level;  /* ereport level for progress messages */
    double      num_heap_tuples;    /* tuples remaining in heap */
    BufferAccessStrategy strategy;  /* access strategy for reads */
} IndexVacuumInfo;

IndexBulkDeleteResult
Struct for the statistics returned by ambulkdelete and amvacuumcleanup


/*
 * Struct for statistics returned by ambulkdelete and amvacuumcleanup
 *
 * This struct is normally allocated by the first ambulkdelete call and then
 * passed along through subsequent ones until amvacuumcleanup; however,
 * amvacuumcleanup must be prepared to allocate it in the case where no
 * ambulkdelete calls were made (because no tuples needed deletion).
 * Note that an index AM could choose to return a larger struct
 * of which this is just the first field; this provides a way for ambulkdelete
 * to communicate additional private data to amvacuumcleanup.
 *
 * Note: pages_removed is the amount by which the index physically shrank,
 * if any (ie the change in its total size on disk).  pages_deleted and
 * pages_free refer to free space within the index file.  Some index AMs
 * may compute num_index_tuples by reference to num_heap_tuples, in which
 * case they should copy the estimated_count field from IndexVacuumInfo.
 */
typedef struct IndexBulkDeleteResult
{
    BlockNumber num_pages;      /* pages remaining in index */
    BlockNumber pages_removed;  /* # removed during vacuum operation */
    bool        estimated_count;    /* num_index_tuples is an estimate */
    double      num_index_tuples;   /* tuples remaining */
    double      tuples_removed; /* # removed during vacuum operation */
    BlockNumber pages_deleted;  /* # unused pages in index */
    BlockNumber pages_free;     /* # pages available for reuse */
} IndexBulkDeleteResult;
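
The struct comment mentions two conventions: the result is allocated on the first ambulkdelete call and reused afterwards, and an index AM may return a larger struct whose first field is this one. The sketch below illustrates both under simplifying assumptions: `BlockNumber` is replaced by a plain `uint32_t`, `calloc` stands in for `palloc0`, and `MyAmBulkDeleteResult`, `my_bulkdelete` and `my_vacuumcleanup_passes` are hypothetical names that do not exist in PostgreSQL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t BlockNumber;   /* stand-in for PostgreSQL's BlockNumber */

/* Same fields as the listing above. */
typedef struct IndexBulkDeleteResult
{
    BlockNumber num_pages;
    BlockNumber pages_removed;
    bool        estimated_count;
    double      num_index_tuples;
    double      tuples_removed;
    BlockNumber pages_deleted;
    BlockNumber pages_free;
} IndexBulkDeleteResult;

/* Hypothetical AM-private result: the public struct must come first. */
typedef struct MyAmBulkDeleteResult
{
    IndexBulkDeleteResult pub;  /* the part the caller sees */
    int         private_passes; /* extra state kept for the cleanup phase */
} MyAmBulkDeleteResult;

/* Hypothetical bulk-delete step: allocates on first call, then reuses. */
static IndexBulkDeleteResult *
my_bulkdelete(IndexBulkDeleteResult *stats, double removed_now)
{
    MyAmBulkDeleteResult *mine = (MyAmBulkDeleteResult *) stats;

    if (mine == NULL)
    {
        mine = calloc(1, sizeof(MyAmBulkDeleteResult));
        assert(mine != NULL);
    }
    mine->pub.tuples_removed += removed_now;
    mine->private_passes++;
    return &mine->pub;
}

/* Hypothetical cleanup step: recovers the private part by casting back. */
static int
my_vacuumcleanup_passes(IndexBulkDeleteResult *stats)
{
    return ((MyAmBulkDeleteResult *) stats)->private_passes;
}
```

The cast back is valid C because a pointer to a struct and a pointer to its first member refer to the same address; the caller only ever handles `IndexBulkDeleteResult *` and never needs to know about the private tail.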

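II. Source Code Walkthrough

lazy_vacuum_index
lazy_vacuum_index vacuums one index relation: it deletes the index entries that point to tuples listed in vacrelstats->dead_tuples and updates the running statistics.
The main logic is:
1. Initialize an IndexVacuumInfo struct variable
2. Call index_bulk_delete
3. Report progress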

/*
 *  lazy_vacuum_index() -- vacuum one index relation.
 *
 *      Delete all the index entries pointing to tuples listed in
 *      vacrelstats->dead_tuples, and update running statistics.
 */
static void
lazy_vacuum_index(Relation indrel,
                  IndexBulkDeleteResult **stats,
                  LVRelStats *vacrelstats)
{
    IndexVacuumInfo ivinfo;
    PGRUsage    ru0;

    pg_rusage_init(&ru0);

    ivinfo.index = indrel;
    ivinfo.analyze_only = false;
    ivinfo.estimated_count = true;
    ivinfo.message_level = elevel;
    /* We can only provide an approximate value of num_heap_tuples here */
    ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
    ivinfo.strategy = vac_strategy;

    /* Do bulk deletion */
    *stats = index_bulk_delete(&ivinfo, *stats,
                               lazy_tid_reaped, (void *) vacrelstats);
    ereport(elevel,
            (errmsg("scanned index \"%s\" to remove %d row versions",
                    RelationGetRelationName(indrel),
                    vacrelstats->num_dead_tuples),
             errdetail_internal("%s", pg_rusage_show(&ru0))));
}
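
The callback passed to index_bulk_delete, lazy_tid_reaped, is not listed in this section. As a rough model of what such a callback has to do, the sketch below answers "is this heap tuple in the dead list?" with a binary search over a sorted array. Everything here is a simplification for illustration: `SimpleTid`, `DeadTupleList`, `tid_cmp` and `tid_is_dead` are hypothetical names, the TID is reduced to a (block, offset) pair, and the assumption that the dead-tuple array is kept sorted is stated here rather than taken from the listing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for a heap tuple identifier (block, offset). */
typedef struct SimpleTid
{
    uint32_t    block;
    uint16_t    offset;
} SimpleTid;

/* Simplified stand-in for the dead-tuple part of LVRelStats. */
typedef struct DeadTupleList
{
    int         num_dead_tuples;
    SimpleTid  *dead_tuples;    /* assumed sorted by (block, offset) */
} DeadTupleList;

/* Orders TIDs by block number first, then by offset. */
static int
tid_cmp(const void *a, const void *b)
{
    const SimpleTid *l = a;
    const SimpleTid *r = b;

    if (l->block != r->block)
        return (l->block < r->block) ? -1 : 1;
    if (l->offset != r->offset)
        return (l->offset < r->offset) ? -1 : 1;
    return 0;
}

/* Callback shape: (tuple id, opaque state) -> should it be deleted? */
static bool
tid_is_dead(const SimpleTid *tid, void *state)
{
    DeadTupleList *list = state;

    assert(list != NULL);
    return bsearch(tid, list->dead_tuples, (size_t) list->num_dead_tuples,
                   sizeof(SimpleTid), tid_cmp) != NULL;
}
```

The opaque `void *state` parameter plays the same role as `callback_state` in the listing: the index access method passes it through untouched, and only the callback knows its real type.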

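lazy_vacuum_index->index_bulk_delete
index_bulk_delete deletes index entries in bulk. A callback routine tells whether a given main-heap tuple is to be deleted, and the return value is an optional palloc'd struct of statistics.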


/* ----------------
 *      index_bulk_delete - do mass deletion of index entries
 *
 *      callback routine tells whether a given main-heap tuple is
 *      to be deleted
 *
 *      return value is an optional palloc'd struct of statistics
 * ----------------
 */
IndexBulkDeleteResult *
index_bulk_delete(IndexVacuumInfo *info,
                  IndexBulkDeleteResult *stats,
                  IndexBulkDeleteCallback callback,
                  void *callback_state)
{
    Relation    indexRelation = info->index;

    RELATION_CHECKS;
    CHECK_REL_PROCEDURE(ambulkdelete);

    /* for a B-tree index, ambulkdelete points to btbulkdelete */
    return indexRelation->rd_indam->ambulkdelete(info, stats,
                                                 callback, callback_state);
}
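
index_bulk_delete is a thin dispatcher: it checks that the access method supplies an ambulkdelete routine and then calls it through a function pointer. The sketch below shows that dispatch pattern in isolation. All names (`MiniAmRoutine`, `MiniIndexRel`, `mini_btbulkdelete`, `mini_index_bulk_delete`) are hypothetical, the routine table is reduced to a single entry, and `assert` stands in for CHECK_REL_PROCEDURE, which in PostgreSQL reports an error rather than asserting.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced access-method routine table. */
typedef struct MiniAmRoutine
{
    int         (*ambulkdelete) (int n_dead);
} MiniAmRoutine;

/* Hypothetical, heavily reduced index relation. */
typedef struct MiniIndexRel
{
    const MiniAmRoutine *am;
} MiniIndexRel;

static int  mini_bt_calls = 0;  /* counts calls into the B-tree routine */

/* Hypothetical B-tree implementation: pretends to remove every dead entry. */
static int
mini_btbulkdelete(int n_dead)
{
    mini_bt_calls++;
    return n_dead;
}

static const MiniAmRoutine mini_btree_routine = {mini_btbulkdelete};

/* Generic entry point: verify the routine exists, then dispatch to it. */
static int
mini_index_bulk_delete(const MiniIndexRel *rel, int n_dead)
{
    assert(rel != NULL && rel->am != NULL);
    assert(rel->am->ambulkdelete != NULL);  /* role of CHECK_REL_PROCEDURE */
    return rel->am->ambulkdelete(n_dead);
}
```

The caller never names the B-tree routine directly; which implementation runs is decided entirely by the table attached to the relation.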

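lazy_vacuum_index->index_bulk_delete->…btbulkdelete
For a B-tree index, the ambulkdelete entry in the index relation's access-method routine table is the btbulkdelete function. The field holding that table appears as rd_indam in the index_bulk_delete listing and as rd_amroutine in the gdb session in the trace section; the name differs between source versions.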


/*
 * Bulk deletion of all index entries pointing to a set of heap tuples.
 * The set of target tuples is specified via a callback routine that tells
 * whether any given heap tuple (identified by ItemPointer) is being deleted.
 *
 * Result: a palloc'd struct containing statistical info for VACUUM displays.
 */
IndexBulkDeleteResult *
btbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
             IndexBulkDeleteCallback callback, void *callback_state)
{
    Relation    rel = info->index;
    BTCycleId   cycleid;

    /* allocate stats if first time through, else re-use existing struct */
    if (stats == NULL)
        stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));

    /* Establish the vacuum cycle ID to use for this scan */
    /* The ENSURE stuff ensures we clean up shared memory on failure */
    PG_ENSURE_ERROR_CLEANUP(_bt_end_vacuum_callback, PointerGetDatum(rel));
    {
        TransactionId oldestBtpoXact;

        cycleid = _bt_start_vacuum(rel);

        /* perform the B-tree vacuum scan */
        btvacuumscan(info, stats, callback, callback_state, cycleid,
                     &oldestBtpoXact);

        /*
         * Update cleanup-related information in metapage. This information is
         * used only for cleanup but keeping them up to date can avoid
         * unnecessary cleanup even after bulkdelete.
         */
        _bt_update_meta_cleanup_info(info->index, oldestBtpoXact,
                                     info->num_heap_tuples);
    }
    PG_END_ENSURE_ERROR_CLEANUP(_bt_end_vacuum_callback, PointerGetDatum(rel));
    _bt_end_vacuum(rel);

    return stats;
}
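
The PG_ENSURE_ERROR_CLEANUP / PG_END_ENSURE_ERROR_CLEANUP pair guarantees that the vacuum-cycle entry is released even if the scan fails, while the normal path releases it with the explicit _bt_end_vacuum call. The sketch below only models that observable behavior with `setjmp`/`longjmp`; it is not how the PostgreSQL macros are implemented. `start_vacuum`, `end_vacuum`, `scan_index` and `bulk_delete_with_cleanup` are hypothetical names, and the counter `active_vacuums` is an assumed stand-in for the shared-memory state.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

static jmp_buf error_jump;          /* stand-in for the error-handling stack */
static int  active_vacuums = 0;     /* stand-in for the shared-memory entry */

static void
start_vacuum(void)
{
    active_vacuums++;
}

static void
end_vacuum(void)
{
    assert(active_vacuums > 0);
    active_vacuums--;
}

/* Hypothetical scan that may fail part-way through. */
static void
scan_index(bool fail)
{
    if (fail)
        longjmp(error_jump, 1);     /* simulates raising an error */
}

/* Returns true on success; the vacuum entry is released on both paths. */
static bool
bulk_delete_with_cleanup(bool fail)
{
    start_vacuum();
    if (setjmp(error_jump) != 0)
    {
        end_vacuum();               /* error path: the cleanup callback runs */
        return false;               /* real code would re-throw the error */
    }
    scan_index(fail);
    end_vacuum();                   /* normal path: explicit release */
    return true;
}
```

After either `bulk_delete_with_cleanup(false)` or `bulk_delete_with_cleanup(true)`, `active_vacuums` is back to zero, which is the property the ENSURE block exists to protect.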

III. Trace Analysis

Test script: delete some rows, then run vacuum


10:08:46 (xdb@[local]:5432)testdb=# delete from t1 where id < 1200;
DELETE 100
11:26:03 (xdb@[local]:5432)testdb=# checkpoint;
CHECKPOINT
11:26:04 (xdb@[local]:5432)testdb=# 
11:25:55 (xdb@[local]:5432)testdb=# vacuum t1;

Start gdb and set a breakpoint


(gdb) b lazy_vacuum_index
Breakpoint 1 at 0x6bea40: file vacuumlazy.c, line 1689.
...
Breakpoint 1, lazy_vacuum_index (indrel=0x7f7334825050, stats=0x2aaffb8, vacrelstats=0x2aaf958) at vacuumlazy.c:1689
1689        pg_rusage_init(&ru0);
(gdb)

Input parameters


(gdb) p *indrel
$6 = {rd_node = {spcNode = 1663, dbNode = 16402, relNode = 50823}, rd_smgr = 0x0, rd_refcnt = 1, rd_backend = -1, 
  rd_islocaltemp = false, rd_isnailed = false, rd_isvalid = true, rd_indexvalid = 0 '\000', rd_statvalid = false, 
  rd_createSubid = 0, rd_newRelfilenodeSubid = 0, rd_rel = 0x7f733491ad20, rd_att = 0x7f733491a9b8, rd_id = 50823, 
  rd_lockInfo = {lockRelId = {relId = 50823, dbId = 16402}}, rd_rules = 0x0, rd_rulescxt = 0x0, trigdesc = 0x0, 
  rd_rsdesc = 0x0, rd_fkeylist = 0x0, rd_fkeyvalid = false, rd_partkeycxt = 0x0, rd_partkey = 0x0, rd_pdcxt = 0x0, 
  rd_partdesc = 0x0, rd_partcheck = 0x0, rd_indexlist = 0x0, rd_oidindex = 0, rd_pkindex = 0, rd_replidindex = 0, 
  rd_statlist = 0x0, rd_indexattr = 0x0, rd_projindexattr = 0x0, rd_keyattr = 0x0, rd_pkattr = 0x0, rd_idattr = 0x0, 
  rd_projidx = 0x0, rd_pubactions = 0x0, rd_options = 0x0, rd_index = 0x7f733491a8d8, rd_indextuple = 0x7f733491a8a0, 
  rd_amhandler = 330, rd_indexcxt = 0x2a05340, rd_amroutine = 0x2a05480, rd_opfamily = 0x2a05598, rd_opcintype = 0x2a055b8, 
  rd_support = 0x2a055d8, rd_supportinfo = 0x2a05600, rd_indoption = 0x2a05738, rd_indexprs = 0x0, rd_indpred = 0x0, 
  rd_exclops = 0x0, rd_exclprocs = 0x0, rd_exclstrats = 0x0, rd_amcache = 0x0, rd_indcollation = 0x2a05718, 
  rd_fdwroutine = 0x0, rd_toastoid = 0, pgstat_info = 0x2a5e198}
(gdb) p *indrel->rd_rel
$9 = {relname = {data = "idx_t1_id", '\000' <repeats 54 times>}, relnamespace = 2200, reltype = 0, reloftype = 0, 
  relowner = 10, relam = 403, relfilenode = 50823, reltablespace = 0, relpages = 60, reltuples = 8901, relallvisible = 0, 
  reltoastrelid = 0, relhasindex = false, relisshared = false, relpersistence = 112 'p', relkind = 105 'i', relnatts = 1, 
  relchecks = 0, relhasoids = false, relhasrules = false, relhastriggers = false, relhassubclass = false, 
  relrowsecurity = false, relforcerowsecurity = false, relispopulated = true, relreplident = 110 'n', 
  relispartition = false, relrewrite = 0, relfrozenxid = 0, relminmxid = 0}
(gdb) p *stats
$7 = (IndexBulkDeleteResult *) 0x0
(gdb) p *vacrelstats
$8 = {hasindex = true, old_rel_pages = 124, rel_pages = 124, scanned_pages = 59, pinskipped_pages = 0, 
  frozenskipped_pages = 1, tupcount_pages = 59, old_live_tuples = 12686, new_rel_tuples = 14444, new_live_tuples = 14444, 
  new_dead_tuples = 0, pages_removed = 0, tuples_deleted = 100, nonempty_pages = 124, num_dead_tuples = 100, 
  max_dead_tuples = 36084, dead_tuples = 0x2ab8820, num_index_scans = 0, latestRemovedXid = 397076, 
  lock_waiter_detected = false}
(gdb)

Initialize the IndexVacuumInfo struct


(gdb) n
1691        ivinfo.index = indrel;
(gdb) 
1692        ivinfo.analyze_only = false;
(gdb) 
1693        ivinfo.estimated_count = true;
(gdb) 
1694        ivinfo.message_level = elevel;
(gdb) 
1696        ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
(gdb) 
1697        ivinfo.strategy = vac_strategy;
(gdb)

Call index_bulk_delete and step into it


1700        *stats = index_bulk_delete(&ivinfo, *stats,
(gdb) step
index_bulk_delete (info=0x7fff39c5d620, stats=0x0, callback=0x6bf507 <lazy_tid_reaped>, callback_state=0x2aaf958)
    at indexam.c:748
748     Relation    indexRelation = info->index;
(gdb)

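Input parameters
info -> the IndexVacuumInfo struct
stats is NULL
The callback function is lazy_tid_reaped
callback_state is the callback's state struct; it is a void * holding the same address as vacrelstats (0x2aaf958), so gdb cannot dereference it without a cast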


(gdb) p *info
$10 = {index = 0x7f7334825050, analyze_only = false, estimated_count = true, message_level = 13, num_heap_tuples = 12686, 
  strategy = 0x2a9d478}
(gdb) 
(gdb) p *callback_state
Attempt to dereference a generic pointer.
(gdb) 
(gdb) p *info->strategy
$11 = {btype = BAS_VACUUM, ring_size = 32, current = 4, current_was_in_ring = false, buffers = 0x2a9d488}
(gdb)

Call indexRelation->rd_amroutine->ambulkdelete, which actually points to btbulkdelete


(gdb) n
750     RELATION_CHECKS;
(gdb) 
751     CHECK_REL_PROCEDURE(ambulkdelete);
(gdb) 
753     return indexRelation->rd_amroutine->ambulkdelete(info, stats,
(gdb) p indexRelation->rd_amroutine
$12 = (struct IndexAmRoutine *) 0x2a05480
(gdb) p *indexRelation->rd_amroutine
$13 = {type = T_IndexAmRoutine, amstrategies = 5, amsupport = 3, amcanorder = true, amcanorderbyop = false, 
  amcanbackward = true, amcanunique = true, amcanmulticol = true, amoptionalkey = true, amsearcharray = true, 
  amsearchnulls = true, amstorage = false, amclusterable = true, ampredlocks = true, amcanparallel = true, 
  amcaninclude = true, amkeytype = 0, ambuild = 0x5123f0 <btbuild>, ambuildempty = 0x507e6b <btbuildempty>, 
  aminsert = 0x507f11 <btinsert>, ambulkdelete = 0x5096b6 <btbulkdelete>, amvacuumcleanup = 0x509845 <btvacuumcleanup>, 
  amcanreturn = 0x50a21f <btcanreturn>, amcostestimate = 0x9c5356 <btcostestimate>, amoptions = 0x511cd4 <btoptions>, 
  amproperty = 0x511cfe <btproperty>, amvalidate = 0x51522b <btvalidate>, ambeginscan = 0x5082f7 <btbeginscan>, 
  amrescan = 0x508492 <btrescan>, amgettuple = 0x507f90 <btgettuple>, amgetbitmap = 0x50819e <btgetbitmap>, 
  amendscan = 0x508838 <btendscan>, ammarkpos = 0x508b28 <btmarkpos>, amrestrpos = 0x508d20 <btrestrpos>, 
  amestimateparallelscan = 0x5090e6 <btestimateparallelscan>, aminitparallelscan = 0x5090f1 <btinitparallelscan>, 
  amparallelrescan = 0x50913f <btparallelrescan>}

Step into btbulkdelete


(gdb) step
btbulkdelete (info=0x7fff39c5d620, stats=0x0, callback=0x6bf507 <lazy_tid_reaped>, callback_state=0x2aaf958) at nbtree.c:857
857     Relation    rel = info->index;
(gdb)

The input parameters are the same as for the previous function
Get the relation and initialize the statistics struct


857     Relation    rel = info->index;
(gdb) n
861     if (stats == NULL)
(gdb) 
862         stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
(gdb) 
866     PG_ENSURE_ERROR_CLEANUP(_bt_end_vacuum_callback, PointerGetDatum(rel));
(gdb)

Obtain the cycleid


(gdb) n
870         cycleid = _bt_start_vacuum(rel);
(gdb) 
872         btvacuumscan(info, stats, callback, callback_state, cycleid,
(gdb) p cycleid
$14 = 1702
(gdb)

Call btvacuumscan and return the statistics; tuples_removed = 100 matches the 100 rows deleted by the test script


(gdb) n
880         _bt_update_meta_cleanup_info(info->index, oldestBtpoXact,
(gdb) 
883     PG_END_ENSURE_ERROR_CLEANUP(_bt_end_vacuum_callback, PointerGetDatum(rel));
(gdb) 
884     _bt_end_vacuum(rel);
(gdb) 
886     return stats;
(gdb) p *stats
$15 = {num_pages = 60, pages_removed = 0, estimated_count = false, num_index_tuples = 8801, tuples_removed = 100, 
  pages_deleted = 6, pages_free = 6}
(gdb)

DONE!

btvacuumscan will be covered in the next section.

IV. References

PG Source Code
