
PostgreSQL Source Code Walkthrough (118) - MVCC#3 (Tuple Visibility Checks)

Published: 2020-08-09 10:48:03  Source: ITPUB blog  Author: husthxd  Category: Relational databases

This section describes how PostgreSQL decides whether a tuple is visible, focusing on the logic of the HeapTupleSatisfiesMVCC function.

1. Data Structures

Macro definitions


//Transaction ID: an unsigned 32-bit integer
typedef uint32 TransactionId;
/* ----------------
 *      Special transaction ID values (reserved by the system)
 *
 * BootstrapTransactionId is the XID for "bootstrap" operations, and
 * FrozenTransactionId is used for very old tuples.  Both should
 * always be considered valid.
 *
 * FirstNormalTransactionId is the first "normal" transaction id.
 * Note: if you need to change it, you must change pg_class.h as well.
 * ----------------
 */
#define InvalidTransactionId        ((TransactionId) 0)
#define BootstrapTransactionId      ((TransactionId) 1)
#define FrozenTransactionId         ((TransactionId) 2)
#define FirstNormalTransactionId    ((TransactionId) 3)
#define MaxTransactionId            ((TransactionId) 0xFFFFFFFF)
/* ----------------
 *      transaction ID manipulation macros
 * ----------------
 */
#define TransactionIdIsValid(xid)       ((xid) != InvalidTransactionId)
#define TransactionIdIsNormal(xid)      ((xid) >= FirstNormalTransactionId)
#define TransactionIdEquals(id1, id2)   ((id1) == (id2))
#define TransactionIdStore(xid, dest)   (*(dest) = (xid))
#define StoreInvalidTransactionId(dest) (*(dest) = InvalidTransactionId)
/* advance a transaction ID variable, handling wraparound correctly */
#define TransactionIdAdvance(dest)  \
    do { \
        (dest)++; \
        if ((dest) < FirstNormalTransactionId) \
            (dest) = FirstNormalTransactionId; \
    } while(0)
/* back up a transaction ID variable, handling wraparound correctly */
#define TransactionIdRetreat(dest)  \
    do { \
        (dest)--; \
    } while ((dest) < FirstNormalTransactionId)
/* compare two XIDs already known to be normal; this is a macro for speed */
#define NormalTransactionIdPrecedes(id1, id2) \
    (AssertMacro(TransactionIdIsNormal(id1) && TransactionIdIsNormal(id2)), \
    (int32) ((id1) - (id2)) < 0)
/* compare two XIDs already known to be normal; this is a macro for speed */
#define NormalTransactionIdFollows(id1, id2) \
    (AssertMacro(TransactionIdIsNormal(id1) && TransactionIdIsNormal(id2)), \
    (int32) ((id1) - (id2)) > 0)
/*
 * HeapTupleHeaderGetRawXmin returns the "raw" xmin field, which is the xid
 * originally used to insert the tuple.  However, the tuple might actually
 * be frozen (via HeapTupleHeaderSetXminFrozen) in which case the tuple's xmin
 * is visible to every snapshot.  Prior to PostgreSQL 9.4, we actually changed
 * the xmin to FrozenTransactionId, and that value may still be encountered
 * on disk.
 */
#define HeapTupleHeaderGetRawXmin(tup) \
( \
    (tup)->t_choice.t_heap.t_xmin \
)
#define HeapTupleHeaderXminCommitted(tup) \
( \
    ((tup)->t_infomask & HEAP_XMIN_COMMITTED) != 0 \
)
/*
 * HeapTupleHeaderGetRawXmax gets you the raw Xmax field.  To find out the Xid
 * that updated a tuple, you might need to resolve the MultiXactId if certain
 * bits are set.  HeapTupleHeaderGetUpdateXid checks those bits and takes care
 * to resolve the MultiXactId if necessary.  This might involve multixact I/O,
 * so it should only be used if absolutely necessary.
 */
#define HeapTupleHeaderGetUpdateXid(tup) \
( \
    (!((tup)->t_infomask & HEAP_XMAX_INVALID) && \
     ((tup)->t_infomask & HEAP_XMAX_IS_MULTI) && \
     !((tup)->t_infomask & HEAP_XMAX_LOCK_ONLY)) ? \
        HeapTupleGetUpdateXid(tup) \
    : \
        HeapTupleHeaderGetRawXmax(tup) \
)
#define HeapTupleHeaderGetRawXmax(tup) \
( \
    (tup)->t_choice.t_heap.t_xmax \
)
/*
 * A tuple is only locked (i.e. not updated by its Xmax) if the
 * HEAP_XMAX_LOCK_ONLY bit is set; or, for pg_upgrade's sake, if the Xmax is
 * not a multi and the EXCL_LOCK bit is set.
 *
 * See also HeapTupleHeaderIsOnlyLocked, which also checks for a possible
 * aborted updater transaction.
 *
 * Beware of multiple evaluations of the argument.
 */
#define HEAP_XMAX_IS_LOCKED_ONLY(infomask) \
    (((infomask) & HEAP_XMAX_LOCK_ONLY) || \
     (((infomask) & (HEAP_XMAX_IS_MULTI | HEAP_LOCK_MASK)) == HEAP_XMAX_EXCL_LOCK))
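
The comparison macros above rely on modulo-2^32 arithmetic: the difference of two XIDs is reinterpreted as a signed 32-bit value, so an XID "precedes" another when it is less than 2^31 steps behind it, even across wraparound. The following is a minimal sketch of that idea as plain functions; `xid_precedes` and `xid_advance` are illustrative names, not PostgreSQL symbols, and the sketch assumes the usual two's-complement conversion to `int32_t`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define FirstNormalTransactionId ((TransactionId) 3)

/*
 * Same test as NormalTransactionIdPrecedes: subtract modulo 2^32 and
 * look at the sign of the result. Both inputs must be normal XIDs.
 */
static bool
xid_precedes(TransactionId id1, TransactionId id2)
{
    assert(id1 >= FirstNormalTransactionId && id2 >= FirstNormalTransactionId);
    return (int32_t) (id1 - id2) < 0;
}

/*
 * Same behavior as TransactionIdAdvance: when the counter wraps past
 * 0xFFFFFFFF, skip the reserved values 0, 1 and 2.
 */
static TransactionId
xid_advance(TransactionId xid)
{
    xid++;
    if (xid < FirstNormalTransactionId)
        xid = FirstNormalTransactionId;
    return xid;
}
```

With these definitions, `xid_precedes(0xFFFFFFF0, 10)` is true: an XID allocated just before wraparound is still considered older than a small XID allocated just after it.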

2. Source Code Walkthrough

The satisfies function pointer of the global snapshot variable is set to HeapTupleSatisfiesMVCC when that variable is initialized.


static SnapshotData CurrentSnapshotData = {HeapTupleSatisfiesMVCC};
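
The initializer above works because `satisfies` is the first member of the snapshot struct, so a positional initializer sets it. The sketch below shows the same dispatch pattern on a deliberately reduced struct. `DispatchSnapshot`, `toy_satisfies_mvcc`, `toy_satisfies_any` and `toy_tuple_visible` are hypothetical names, and the callback signature is simplified (the real callback takes a HeapTuple, a Snapshot and a Buffer).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct DispatchSnapshot;

/* Hypothetical, reduced callback type: the tuple is represented by its xmin only. */
typedef bool (*ToySatisfiesFn) (unsigned tuple_xmin,
                                const struct DispatchSnapshot *snap);

typedef struct DispatchSnapshot
{
    ToySatisfiesFn satisfies;   /* first member, so {fn} initializes it */
    unsigned    xmax;           /* XIDs >= xmax are treated as not visible */
} DispatchSnapshot;

/* Toy MVCC rule: visible only if inserted by an XID below the snapshot's xmax. */
static bool
toy_satisfies_mvcc(unsigned tuple_xmin, const DispatchSnapshot *snap)
{
    return tuple_xmin < snap->xmax;
}

/* Toy "see everything" rule, to show that the caller does not change. */
static bool
toy_satisfies_any(unsigned tuple_xmin, const DispatchSnapshot *snap)
{
    (void) tuple_xmin;
    (void) snap;
    return true;
}

/* The caller only knows the snapshot; the rule is chosen through the pointer. */
static bool
toy_tuple_visible(unsigned tuple_xmin, const DispatchSnapshot *snap)
{
    assert(snap != NULL && snap->satisfies != NULL);
    return snap->satisfies(tuple_xmin, snap);
}
```

Swapping the function pointer changes the visibility rule without touching the scan code, which is why the gdb output later in this article shows `satisfies = <HeapTupleSatisfiesMVCC>` inside the snapshot itself.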

HeapTupleSatisfiesMVCC: returns true if the tuple is visible to the given MVCC snapshot.


/*
 * HeapTupleSatisfiesMVCC
 *      True iff heap tuple is valid for the given MVCC snapshot.
 *
 *  Here, we consider the effects of:
 *      all transactions committed as of the time of the given snapshot
 *      previous commands of this transaction
 *
 *  Does _not_ include:
 *      transactions shown as in-progress by the snapshot
 *      transactions started after the snapshot was taken
 *      changes made by the current command
 *
 * Notice that here, we will not update the tuple status hint bits if the
 * inserting/deleting transaction is still running according to our snapshot,
 * even if in reality it's committed or aborted by now.  This is intentional.
 * Checking the true transaction state would require access to high-traffic
 * shared data structures, creating contention we'd rather do without, and it
 * would not change the result of our visibility check anyway.  The hint bits
 * will be updated by the first visitor that has a snapshot new enough to see
 * the inserting/deleting transaction as done.  In the meantime, the cost of
 * leaving the hint bits unset is basically that each HeapTupleSatisfiesMVCC
 * call will need to run TransactionIdIsCurrentTransactionId in addition to
 * XidInMVCCSnapshot (but it would have to do the latter anyway).  In the old
 * coding where we tried to set the hint bits as soon as possible, we instead
 * did TransactionIdIsInProgress in each call --- to no avail, as long as the
 * inserting/deleting transaction was still running --- which was more cycles
 * and more contention on the PGXACT array.
 */
bool
HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
                       Buffer buffer)
{
    HeapTupleHeader tuple = htup->t_data;   //the tuple header
    Assert(ItemPointerIsValid(&htup->t_self));
    Assert(htup->t_tableOid != InvalidOid);
    if (!HeapTupleHeaderXminCommitted(tuple))
    {
        //A. The HEAP_XMIN_COMMITTED hint bit is not set: xmin is not yet known to have committed
        if (HeapTupleHeaderXminInvalid(tuple))
            //the HEAP_XMIN_INVALID hint bit is set (xmin aborted): the tuple is invisible
            return false;
        /* Used by pre-9.0 binary upgrades */
        //only relevant after a binary upgrade from a pre-9.0 release; HEAP_MOVED_OFF and HEAP_MOVED_IN are no longer set by current code
        if (tuple->t_infomask & HEAP_MOVED_OFF)
        {
            TransactionId xvac = HeapTupleHeaderGetXvac(tuple);
            if (TransactionIdIsCurrentTransactionId(xvac))
                return false;
            if (!XidInMVCCSnapshot(xvac, snapshot))
            {
                if (TransactionIdDidCommit(xvac))
                {
                    SetHintBits(tuple, buffer, HEAP_XMIN_INVALID,
                                InvalidTransactionId);
                    return false;
                }
                SetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
                            InvalidTransactionId);
            }
        }
        /* Used by pre-9.0 binary upgrades */
        //same as above
        else if (tuple->t_infomask & HEAP_MOVED_IN)
        {
            TransactionId xvac = HeapTupleHeaderGetXvac(tuple);
            if (!TransactionIdIsCurrentTransactionId(xvac))
            {
                if (XidInMVCCSnapshot(xvac, snapshot))
                    return false;
                if (TransactionIdDidCommit(xvac))
                    SetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
                                InvalidTransactionId);
                else
                {
                    SetHintBits(tuple, buffer, HEAP_XMIN_INVALID,
                                InvalidTransactionId);
                    return false;
                }
            }
        }
        else if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetRawXmin(tuple)))
        {
            //A1. xmin is the current transaction, which is still in progress
            if (HeapTupleHeaderGetCmin(tuple) >= snapshot->curcid)
                //A1-1. tuple.cmin >= the snapshot's curcid: inserted after the scan started, return false
                return false;   /* inserted after scan started */
            if (tuple->t_infomask & HEAP_XMAX_INVALID)  /* xid invalid */
                //A1-2. xmax is invalid, return true
                return true;
            if (HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask))    /* not deleter */
                //A1-3. xmax only locked the tuple, return true
                return true;
            if (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
            {
                //xmax is a MultiXactId
                TransactionId xmax;
                //fetch the XID of the updater
                xmax = HeapTupleGetUpdateXid(tuple);
                /* not LOCKED_ONLY, so it has to have an xmax */
                //check that xmax is valid
                Assert(TransactionIdIsValid(xmax));
                /* updating subtransaction must have aborted */
                //why: the tuple was inserted by our own in-progress transaction, so no other
                //transaction could have seen it; an updater that is not "current" can only be
                //one of our own subtransactions that has already aborted
                if (!TransactionIdIsCurrentTransactionId(xmax))
                    //the updater is not the current transaction, return true
                    return true;
                else if (HeapTupleHeaderGetCmax(tuple) >= snapshot->curcid)
                    //updated by this transaction after the scan started, return true
                    return true;    /* updated after scan started */
                else
                    //updated by this transaction before the scan started, return false
                    return false;   /* updated before scan started */
            }
            if (!TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetRawXmax(tuple)))
            {
                //A1-4. tuple.xmax is not the current transaction
                /* deleting subtransaction must have aborted */
                //same reasoning as above: the deleter can only be an aborted subtransaction of ours
                SetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
                            InvalidTransactionId);
                return true;
            }
            if (HeapTupleHeaderGetCmax(tuple) >= snapshot->curcid)
                //A1-5. deleted after the scan started, return true
                return true;    /* deleted after scan started */
            else
                //A1-6. deleted before the scan started, return false
                return false;   /* deleted before scan started */
        }
        else if (XidInMVCCSnapshot(HeapTupleHeaderGetRawXmin(tuple), snapshot))
            //A2. xmin (not the current transaction) is still in progress according to the snapshot: invisible
            return false;
        else if (TransactionIdDidCommit(HeapTupleHeaderGetRawXmin(tuple)))
            //A3. the clog lookup shows that xmin committed: set the hint bit;
            //the xmax checks further down decide visibility
            SetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
                        HeapTupleHeaderGetRawXmin(tuple));
        else
        {
            /* it must have aborted or crashed */
            //A4. none of the above: xmin aborted, or the backend crashed mid-transaction
            //set the hint bit and return false
            SetHintBits(tuple, buffer, HEAP_XMIN_INVALID,
                        InvalidTransactionId);
            return false;
        }
    }
    else
    {
        /* xmin is committed, but maybe not according to our snapshot */
        //B. HEAP_XMIN_COMMITTED is set, but the snapshot may still regard xmin as in progress
        if (!HeapTupleHeaderXminFrozen(tuple) &&
            XidInMVCCSnapshot(HeapTupleHeaderGetRawXmin(tuple), snapshot))
            return false;       /* treat as still in progress */
    }
    /* by here, the inserting transaction has committed */
    //C. xmin has committed and is visible to the snapshot; now examine xmax
    if (tuple->t_infomask & HEAP_XMAX_INVALID)  /* xid invalid or aborted */
        //C1. xmax is invalid (HEAP_XMAX_INVALID), return true
        return true;
    if (HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask))
        //C2. xmax only locked the tuple, return true
        return true;
    if (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
    {
        //xmax is a MultiXactId
        TransactionId xmax;
        /* already checked above */
        //assert what was already checked above
        Assert(!HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask));
        //fetch the XID of the updater
        xmax = HeapTupleGetUpdateXid(tuple);
        /* not LOCKED_ONLY, so it has to have an xmax */
        Assert(TransactionIdIsValid(xmax));
        if (TransactionIdIsCurrentTransactionId(xmax))
        {
            //xmax is the current transaction
            if (HeapTupleHeaderGetCmax(tuple) >= snapshot->curcid)
                //deleted after the scan started, return true
                return true;    /* deleted after scan started */
            else
                //deleted before the scan started, return false
                return false;   /* deleted before scan started */
        }
        if (XidInMVCCSnapshot(xmax, snapshot))
            //xmax is still in progress according to the snapshot, return true
            return true;
        if (TransactionIdDidCommit(xmax))
            //the clog lookup confirms that xmax committed, return false
            return false;       /* updating transaction committed */
        /* it must have aborted or crashed */
        //xmax aborted or crashed, return true
        return true;
    }
    }
    if (!(tuple->t_infomask & HEAP_XMAX_COMMITTED))
    {
        //C3. the HEAP_XMAX_COMMITTED hint bit is not set
        if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetRawXmax(tuple)))
        {
            //C3-1. xmax is the current transaction
            if (HeapTupleHeaderGetCmax(tuple) >= snapshot->curcid)
                return true;    /* deleted after scan started */
            else
                return false;   /* deleted before scan started */
        }
        if (XidInMVCCSnapshot(HeapTupleHeaderGetRawXmax(tuple), snapshot))
            //C3-2. xmax is not the current transaction
            //and is still in progress according to the snapshot
            return true;
        if (!TransactionIdDidCommit(HeapTupleHeaderGetRawXmax(tuple)))
        {
            /* it must have aborted or crashed */
            //C3-3. the clog lookup shows that xmax did not commit (aborted or crashed): set the hint bit
            SetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
                        InvalidTransactionId);
            //visible
            return true;
        }
        /* xmax transaction committed */
        //none of the checks above applied, so xmax has committed: set the hint bit
        SetHintBits(tuple, buffer, HEAP_XMAX_COMMITTED,
                    HeapTupleHeaderGetRawXmax(tuple));
    }
    else
    {
        /* xmax is committed, but maybe not according to our snapshot */
        //C4. xmax has committed, but the snapshot may still regard it as in progress
        if (XidInMVCCSnapshot(HeapTupleHeaderGetRawXmax(tuple), snapshot))
            //still in progress for this snapshot, return true
            return true;        /* treat as still in progress */
    }
    /* xmax transaction committed */
    //C5. xmax has committed and is visible to the snapshot, return false
    return false;
}
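
Several branches above hinge on XidInMVCCSnapshot, whose body is not listed in this article. As a rough mental model only: an XID below the snapshot's xmin is finished, an XID at or above the snapshot's xmax is treated as still running, and anything in between is looked up in the snapshot's list of running XIDs. The sketch below encodes just that rule. `MvccSnapshotSketch` and `sketch_xid_in_snapshot` are illustrative names, and subtransactions (subxip, suboverflowed) and recovery-time snapshots are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Reduced snapshot: only the fields needed for the in-progress test. */
typedef struct MvccSnapshotSketch
{
    TransactionId xmin;         /* every XID < xmin had finished */
    TransactionId xmax;         /* every XID >= xmax counts as running */
    const TransactionId *xip;   /* XIDs in [xmin, xmax) that were running */
    size_t      xcnt;           /* number of entries in xip */
} MvccSnapshotSketch;

/* Returns true if the snapshot regards xid as still in progress. */
static bool
sketch_xid_in_snapshot(TransactionId xid, const MvccSnapshotSketch *snap)
{
    assert(snap != NULL);

    /* Wraparound-aware "xid < xmin": finished before the snapshot. */
    if ((int32_t) (xid - snap->xmin) < 0)
        return false;

    /* Wraparound-aware "xid >= xmax": started after the snapshot. */
    if ((int32_t) (xid - snap->xmax) >= 0)
        return true;

    /* In between: running only if listed in xip. */
    for (size_t i = 0; i < snap->xcnt; i++)
    {
        if (snap->xip[i] == xid)
            return true;
    }
    return false;
}
```

Under this model, a snapshot with xmin = xmax = 2363 and an empty xip array treats XID 2363 as in progress and XID 2362 as finished, which is the situation in the trace in the next section.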

3. Trace Analysis

Create a table and insert two rows; delete one row and commit; then update the other row without committing; finally, run a query from a second session.
Session 1


11:55:00 (xdb@[local]:5432)testdb=# drop table t_mvcc2;
ERROR:  table "t_mvcc2" does not exist
11:55:01 (xdb@[local]:5432)testdb=# create table t_mvcc2(c1 int not null,c2  varchar(40),c3 varchar(40));
CREATE TABLE
11:55:02 (xdb@[local]:5432)testdb=# 
11:55:02 (xdb@[local]:5432)testdb=# insert into t_mvcc2(c1,c2,c3) values(1,'C2-1','C3-1');
INSERT 0 1
11:55:02 (xdb@[local]:5432)testdb=# insert into t_mvcc2(c1,c2,c3) values(2,'C2-2','C3-2');
INSERT 0 1
11:55:03 (xdb@[local]:5432)testdb=# 
11:55:38 (xdb@[local]:5432)testdb=# begin;
BEGIN
11:55:38 (xdb@[local]:5432)testdb=#* 
11:55:38 (xdb@[local]:5432)testdb=#* delete from t_mvcc2 where c1 = 1;
DELETE 1
11:55:38 (xdb@[local]:5432)testdb=#* 
11:55:38 (xdb@[local]:5432)testdb=#* commit;
COMMIT
11:55:38 (xdb@[local]:5432)testdb=# 
11:55:38 (xdb@[local]:5432)testdb=# begin;
BEGIN
11:55:38 (xdb@[local]:5432)testdb=#* 
11:55:38 (xdb@[local]:5432)testdb=#* update t_mvcc2 set c2 = 'C2#'||substr(c2,4,40) where c1 = 2;
UPDATE 1
11:55:38 (xdb@[local]:5432)testdb=#* 
12:19:17 (xdb@[local]:5432)testdb=#* select txid_current();
 txid_current 
--------------
         2363
(1 row)

Start session 2 and query table t_mvcc2


11:58:45 (xdb@[local]:5432)testdb=# select lp,lp_off,lp_flags,t_xmin,t_xmax,t_field3 as t_cid,t_ctid,t_infomask2,t_infomask from heap_page_items(get_raw_page('t_mvcc2',0));
 lp | lp_off | lp_flags | t_xmin | t_xmax | t_cid | t_ctid | t_infomask2 | t_infomask 
----+--------+----------+--------+--------+-------+--------+-------------+------------
  1 |   8152 |        1 |   2360 |   2362 |     0 | (0,1)  |        8195 |       1282
  2 |   8112 |        1 |   2361 |   2363 |     0 | (0,3)  |       16387 |        258
  3 |   8072 |        1 |   2363 |      0 |     0 | (0,3)  |       32771 |      10242
(3 rows)
11:59:38 (xdb@[local]:5432)testdb=# select * from t_mvcc2;

Start gdb and set a breakpoint


(gdb) b HeapTupleSatisfiesMVCC
Breakpoint 1 at 0xa93125: file tqual.c, line 966.
(gdb) c
Continuing.
Breakpoint 1, HeapTupleSatisfiesMVCC (htup=0x1985d18, snapshot=0x1a06358, buffer=69) at tqual.c:966
966     HeapTupleHeader tuple = htup->t_data;
(gdb)

Inspect the call stack


(gdb) bt
#0  HeapTupleSatisfiesMVCC (htup=0x1985d18, snapshot=0x1a06358, buffer=69) at tqual.c:966
#1  0x00000000004de959 in heap_hot_search_buffer (tid=0x1985d1c, relation=0x7f002f73e2b0, buffer=69, snapshot=0x1a06358, 
    heapTuple=0x1985d18, all_dead=0x7ffc56ed8862, first_call=true) at heapam.c:2127
#2  0x00000000004fd12e in index_fetch_heap (scan=0x1985cb8) at indexam.c:608
#3  0x00000000004fd35d in index_getnext (scan=0x1985cb8, direction=ForwardScanDirection) at indexam.c:691
#4  0x00000000004fb910 in systable_getnext (sysscan=0x19868f8) at genam.c:425
#5  0x0000000000a1cb63 in SearchCatCacheMiss (cache=0x19de480, nkeys=1, hashValue=1281076841, hashIndex=105, v1=42610, 
    v2=0, v3=0, v4=0) at catcache.c:1386
#6  0x0000000000a1ca10 in SearchCatCacheInternal (cache=0x19de480, nkeys=1, v1=42610, v2=0, v3=0, v4=0) at catcache.c:1317
#7  0x0000000000a1c6fe in SearchCatCache1 (cache=0x19de480, v1=42610) at catcache.c:1185
#8  0x0000000000a37543 in SearchSysCache1 (cacheId=50, key1=42610) at syscache.c:1119
#9  0x0000000000a6ae60 in check_enable_rls (relid=42610, checkAsUser=0, noError=false) at rls.c:66
#10 0x000000000087b286 in get_row_security_policies (root=0x1985dd0, rte=0x1985ee8, rt_index=1, 
    securityQuals=0x7ffc56ed8cc0, withCheckOptions=0x7ffc56ed8cb8, hasRowSecurity=0x7ffc56ed8cb7, 
    hasSubLinks=0x7ffc56ed8cb6) at rowsecurity.c:133
#11 0x0000000000875e1c in fireRIRrules (parsetree=0x1985dd0, activeRIRs=0x0) at rewriteHandler.c:1904
#12 0x0000000000878b23 in QueryRewrite (parsetree=0x1985dd0) at rewriteHandler.c:3712
#13 0x00000000008c64d7 in pg_rewrite_query (query=0x1985dd0) at postgres.c:782
#14 0x00000000008c634e in pg_analyze_and_rewrite (parsetree=0x1985c20, query_string=0x1984ec8 "select * from t_mvcc2;", 
    paramTypes=0x0, numParams=0, queryEnv=0x0) at postgres.c:698
#15 0x00000000008c6993 in exec_simple_query (query_string=0x1984ec8 "select * from t_mvcc2;") at postgres.c:1070
#16 0x00000000008cae70 in PostgresMain (argc=1, argv=0x19b0dc8, dbname=0x19b0c30 "testdb", username=0x1981ba8 "xdb")
    at postgres.c:4182
#17 0x000000000082642b in BackendRun (port=0x19a6c00) at postmaster.c:4361
#18 0x0000000000825b8f in BackendStartup (port=0x19a6c00) at postmaster.c:4033
#19 0x0000000000821f1c in ServerLoop () at postmaster.c:1706
#20 0x00000000008217b4 in PostmasterMain (argc=1, argv=0x197fb60) at postmaster.c:1379
---Type <return> to continue, or q <return> to quit---
#21 0x00000000007488ef in main (argc=1, argv=0x197fb60) at main.c:228
(gdb)

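First tuple
The input arguments are mainly tuple/snapshot/buffer. Note that, per the call stack above, the first breakpoint hits come from system catalog lookups (SearchSysCache1) rather than from t_mvcc2 itself; the t_mvcc2 tuples appear from the seventh hit onward.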


(gdb) p *snapshot --> snapshot information
$3 = {satisfies = 0xa9310d <HeapTupleSatisfiesMVCC>, xmin = 2363, xmax = 2363, xip = 0x0, xcnt = 0, subxip = 0x0, 
  subxcnt = 0, suboverflowed = false, takenDuringRecovery = false, copied = true, curcid = 0, speculativeToken = 0, 
  active_count = 0, regd_count = 1, ph_node = {first_child = 0x0, next_sibling = 0x0, 
    prev_or_parent = 0xf9bfa0 <CatalogSnapshotData+64>}, whenTaken = 0, lsn = 0}

Fetch the tuple
t_infomask = 2313, i.e. 0x0909 in hexadecimal, which is HEAP_XMAX_INVALID | HEAP_XMIN_COMMITTED | HEAP_HASOID | HEAP_HASNULL


(gdb) n
968     Assert(ItemPointerIsValid(&htup->t_self));
(gdb) p tuple
$1 = (HeapTupleHeader) 0x7f0002da7a60
(gdb) p *tuple
$2 = {t_choice = {t_heap = {t_xmin = 2359, t_xmax = 0, t_field3 = {t_cid = 0, t_xvac = 0}}, t_datum = {datum_len_ = 2359, 
      datum_typmod = 0, datum_typeid = 0}}, t_ctid = {ip_blkid = {bi_hi = 0, bi_lo = 12}, ip_posid = 6}, t_infomask2 = 33, 
  t_infomask = 2313, t_hoff = 32 ' ', t_bits = 0x7f0002da7a77 "\377\377\377?"}
(gdb)
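
Decoding t_infomask by hand gets tedious over the nine tuples in this trace, so a small helper can do it. The sketch below maps each bit to its flag name. The bit values are believed to match htup_details.h for the release traced here (where 0x0008 is still HEAP_HASOID), so check them against your own source tree. `describe_infomask` and `infomask_flags` are illustrative names, not PostgreSQL symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct InfomaskFlag
{
    uint16_t    bit;
    const char *name;
} InfomaskFlag;

/* t_infomask bits, in ascending order (values assumed from htup_details.h). */
static const InfomaskFlag infomask_flags[] = {
    {0x0001, "HEAP_HASNULL"},
    {0x0002, "HEAP_HASVARWIDTH"},
    {0x0004, "HEAP_HASEXTERNAL"},
    {0x0008, "HEAP_HASOID"},
    {0x0010, "HEAP_XMAX_KEYSHR_LOCK"},
    {0x0020, "HEAP_COMBOCID"},
    {0x0040, "HEAP_XMAX_EXCL_LOCK"},
    {0x0080, "HEAP_XMAX_LOCK_ONLY"},
    {0x0100, "HEAP_XMIN_COMMITTED"},
    {0x0200, "HEAP_XMIN_INVALID"},
    {0x0400, "HEAP_XMAX_COMMITTED"},
    {0x0800, "HEAP_XMAX_INVALID"},
    {0x1000, "HEAP_XMAX_IS_MULTI"},
    {0x2000, "HEAP_UPDATED"},
    {0x4000, "HEAP_MOVED_OFF"},
    {0x8000, "HEAP_MOVED_IN"},
};

/* Writes the set flags into out as "A | B | C", lowest bit first. */
static void
describe_infomask(uint16_t infomask, char *out, size_t outlen)
{
    size_t      nflags = sizeof(infomask_flags) / sizeof(infomask_flags[0]);

    assert(out != NULL && outlen > 0);
    out[0] = '\0';
    for (size_t i = 0; i < nflags; i++)
    {
        if ((infomask & infomask_flags[i].bit) == 0)
            continue;
        if (out[0] != '\0')
            strncat(out, " | ", outlen - strlen(out) - 1);
        strncat(out, infomask_flags[i].name, outlen - strlen(out) - 1);
    }
}
```

Calling `describe_infomask(2313, buf, sizeof buf)` yields the same four flags listed above, only ordered from the lowest bit upward.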

Check whether tuple.xmin has committed. According to the t_infomask shown in the previous step it has, so execution takes the corresponding branch.


(gdb) n
969     Assert(htup->t_tableOid != InvalidOid);
(gdb) 
971     if (!HeapTupleHeaderXminCommitted(tuple))
(gdb)

Check B: xmin has committed, but the snapshot may still regard it as in progress.
The condition does not hold, so execution continues.


(gdb) n
1074                XidInMVCCSnapshot(HeapTupleHeaderGetRawXmin(tuple), snapshot))
(gdb) 
1073            if (!HeapTupleHeaderXminFrozen(tuple) &&
(gdb) 
1080        if (tuple->t_infomask & HEAP_XMAX_INVALID)  /* xid invalid or aborted */
(gdb)

Check HEAP_XMAX_INVALID: according to t_infomask the flag is set, so the function returns true.


(gdb) 
1080        if (tuple->t_infomask & HEAP_XMAX_INVALID)  /* xid invalid or aborted */
(gdb) n
1081            return true;
(gdb) 
1148    }
(gdb)

Second tuple
Continue to the next breakpoint hit, the second tuple.


(gdb) c
Continuing.
Breakpoint 1, HeapTupleSatisfiesMVCC (htup=0x1985d18, snapshot=0x1a06358, buffer=155) at tqual.c:966
966     HeapTupleHeader tuple = htup->t_data;
(gdb)

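Fetch and inspect the tuple (t_tableOid = 1247 is pg_type, so this is still a catalog lookup)
t_infomask = 2313, i.e. 0x0909 in hexadecimal, which is HEAP_XMAX_INVALID | HEAP_XMIN_COMMITTED | HEAP_HASOID | HEAP_HASNULL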


(gdb) p *htup
$4 = {t_len = 172, t_self = {ip_blkid = {bi_hi = 0, bi_lo = 1}, ip_posid = 40}, t_tableOid = 1247, t_data = 0x7f0002e52800}
(gdb) n
968     Assert(ItemPointerIsValid(&htup->t_self));
(gdb) p tuple
$5 = (HeapTupleHeader) 0x7f0002e52800
(gdb) p *tuple
$6 = {t_choice = {t_heap = {t_xmin = 1, t_xmax = 0, t_field3 = {t_cid = 0, t_xvac = 0}}, t_datum = {datum_len_ = 1, 
      datum_typmod = 0, datum_typeid = 0}}, t_ctid = {ip_blkid = {bi_hi = 0, bi_lo = 1}, ip_posid = 40}, t_infomask2 = 30, 
  t_infomask = 2313, t_hoff = 32 ' ', t_bits = 0x7f0002e52817 "\377\377\377\a"}
(gdb)

This case is handled like the first tuple. Next, the third tuple.
Third tuple
Fetch and inspect the tuple
t_infomask = 10505, i.e. 0x2909 in hexadecimal, which is HEAP_UPDATED | HEAP_XMAX_INVALID | HEAP_XMIN_COMMITTED | HEAP_HASOID | HEAP_HASNULL


(gdb) c
Continuing.
Breakpoint 1, HeapTupleSatisfiesMVCC (htup=0x1985d18, snapshot=0xf9bf60 <CatalogSnapshotData>, buffer=4) at tqual.c:966
966     HeapTupleHeader tuple = htup->t_data;
(gdb) n
968     Assert(ItemPointerIsValid(&htup->t_self));
(gdb) 
969     Assert(htup->t_tableOid != InvalidOid);
(gdb) p *tuple
$7 = {t_choice = {t_heap = {t_xmin = 2117, t_xmax = 0, t_field3 = {t_cid = 7, t_xvac = 7}}, t_datum = {datum_len_ = 2117, 
      datum_typmod = 0, datum_typeid = 7}}, t_ctid = {ip_blkid = {bi_hi = 0, bi_lo = 1}, ip_posid = 58}, t_infomask2 = 33, 
  t_infomask = 10505, t_hoff = 32 ' ', t_bits = 0x7f0002d25aa7 "\377\377\377?"}
(gdb)

As with the first and second tuples, the function returns true for the third tuple.


(gdb) n
971     if (!HeapTupleHeaderXminCommitted(tuple))
(gdb) 
1073            if (!HeapTupleHeaderXminFrozen(tuple) &&
(gdb) 
1074                XidInMVCCSnapshot(HeapTupleHeaderGetRawXmin(tuple), snapshot))
(gdb) 
1073            if (!HeapTupleHeaderXminFrozen(tuple) &&
(gdb) 
1080        if (tuple->t_infomask & HEAP_XMAX_INVALID)  /* xid invalid or aborted */
(gdb) 
1081            return true;
(gdb) 
1148    }

Fourth, fifth and sixth tuples
All three have HEAP_XMAX_INVALID set, so they are not traced in detail.


(gdb) c
Continuing.
Breakpoint 1, HeapTupleSatisfiesMVCC (htup=0x1985d18, snapshot=0x1a06358, buffer=44) at tqual.c:966
966     HeapTupleHeader tuple = htup->t_data;
(gdb) n
968     Assert(ItemPointerIsValid(&htup->t_self));
(gdb) 
969     Assert(htup->t_tableOid != InvalidOid);
(gdb) p *tuple
$8 = {t_choice = {t_heap = {t_xmin = 1, t_xmax = 0, t_field3 = {t_cid = 75, t_xvac = 75}}, t_datum = {datum_len_ = 1, 
      datum_typmod = 0, datum_typeid = 75}}, t_ctid = {ip_blkid = {bi_hi = 0, bi_lo = 21}, ip_posid = 1}, t_infomask2 = 24, 
  t_infomask = 2305, t_hoff = 32 ' ', t_bits = 0x7f0002d76307 "\377\377\017"}
(gdb) 
...
(gdb) p *tuple
$9 = {t_choice = {t_heap = {t_xmin = 1, t_xmax = 0, t_field3 = {t_cid = 75, t_xvac = 75}}, t_datum = {datum_len_ = 1, 
      datum_typmod = 0, datum_typeid = 75}}, t_ctid = {ip_blkid = {bi_hi = 0, bi_lo = 1}, ip_posid = 19}, t_infomask2 = 20, 
  t_infomask = 2307, t_hoff = 32 ' ', t_bits = 0x7f0002d7368f "\377\377\003"}
...
(gdb) p *tuple
$10 = {t_choice = {t_heap = {t_xmin = 1, t_xmax = 0, t_field3 = {t_cid = 0, t_xvac = 0}}, t_datum = {datum_len_ = 1, 
      datum_typmod = 0, datum_typeid = 0}}, t_ctid = {ip_blkid = {bi_hi = 0, bi_lo = 0}, ip_posid = 1}, t_infomask2 = 4, 
  t_infomask = 2313, t_hoff = 32 ' ', t_bits = 0x7f0002ee632f "\003"}

Seventh tuple (the first tuple of t_mvcc2, lp = 1 in the page dump above)
t_infomask = 1282, i.e. 0x0502, which is HEAP_XMAX_COMMITTED | HEAP_XMIN_COMMITTED | HEAP_HASVARWIDTH


(gdb) c
Continuing.
Breakpoint 1, HeapTupleSatisfiesMVCC (htup=0x7ffc56ed8920, snapshot=0x1a062c0, buffer=206) at tqual.c:966
966     HeapTupleHeader tuple = htup->t_data;
(gdb) n
968     Assert(ItemPointerIsValid(&htup->t_self));
(gdb) 
969     Assert(htup->t_tableOid != InvalidOid);
(gdb) p *tuple
$11 = {t_choice = {t_heap = {t_xmin = 2360, t_xmax = 2362, t_field3 = {t_cid = 0, t_xvac = 0}}, t_datum = {
      datum_len_ = 2360, datum_typmod = 2362, datum_typeid = 0}}, t_ctid = {ip_blkid = {bi_hi = 0, bi_lo = 0}, 
    ip_posid = 1}, t_infomask2 = 8195, t_infomask = 1282, t_hoff = 24 '\030', t_bits = 0x7f0002eba36f ""}
(gdb)

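This is the deleted tuple. The deleting transaction 2362 has committed and is not in progress for this snapshot, so the tuple is invisible.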


1086        if (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
(gdb) 
1113        if (!(tuple->t_infomask & HEAP_XMAX_COMMITTED))
(gdb) 
1141            if (XidInMVCCSnapshot(HeapTupleHeaderGetRawXmax(tuple), snapshot))
(gdb) n
1147        return false;
(gdb)

Eighth tuple
t_infomask = 258, i.e. 0x0102, which is HEAP_XMIN_COMMITTED | HEAP_HASVARWIDTH


(gdb) p *tuple
$12 = {t_choice = {t_heap = {t_xmin = 2361, t_xmax = 2363, t_field3 = {t_cid = 0, t_xvac = 0}}, t_datum = {
      datum_len_ = 2361, datum_typmod = 2363, datum_typeid = 0}}, t_ctid = {ip_blkid = {bi_hi = 0, bi_lo = 0}, 
    ip_posid = 3}, t_infomask2 = 16387, t_infomask = 258, t_hoff = 24 '\030', t_bits = 0x7f0002eba347 ""}
(gdb)

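This is the old version of the row being updated by the still-uncommitted transaction 2363, so it should remain visible.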


1080        if (tuple->t_infomask & HEAP_XMAX_INVALID)  /* xid invalid or aborted */
(gdb) 
1083        if (HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask))
(gdb) 
1086        if (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
(gdb) 
1113        if (!(tuple->t_infomask & HEAP_XMAX_COMMITTED))
(gdb) 
1115            if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetRawXmax(tuple)))  --> not the current transaction
(gdb) 
1123            if (XidInMVCCSnapshot(HeapTupleHeaderGetRawXmax(tuple), snapshot)) --> the snapshot marks this transaction as in progress
(gdb) 
1124                return true; --> visible
(gdb)

Ninth tuple
t_infomask = 10242, i.e. 0x2802, which is HEAP_UPDATED | HEAP_XMAX_INVALID | HEAP_HASVARWIDTH


(gdb) p *tuple
$13 = {t_choice = {t_heap = {t_xmin = 2363, t_xmax = 0, t_field3 = {t_cid = 0, t_xvac = 0}}, t_datum = {datum_len_ = 2363, 
      datum_typmod = 0, datum_typeid = 0}}, t_ctid = {ip_blkid = {bi_hi = 0, bi_lo = 0}, ip_posid = 3}, 
  t_infomask2 = 32771, t_infomask = 10242, t_hoff = 24 '\030', t_bits = 0x7f0002eba31f ""}
(gdb)

This is the new tuple version created by the update. Its inserting transaction 2363 has not committed, so it is invisible.


(gdb) n
971     if (!HeapTupleHeaderXminCommitted(tuple)) --> xmin not known to be committed
(gdb) 
973         if (HeapTupleHeaderXminInvalid(tuple))
(gdb) 
977         if (tuple->t_infomask & HEAP_MOVED_OFF)
(gdb) 
996         else if (tuple->t_infomask & HEAP_MOVED_IN)
(gdb) 
1015            else if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetRawXmin(tuple))) --> not the current transaction
(gdb) 
1057            else if (XidInMVCCSnapshot(HeapTupleHeaderGetRawXmin(tuple), snapshot)) --> the xmin transaction is still active
(gdb) 
1058                return false; --> invisible
(gdb)

Query result


11:59:38 (xdb@[local]:5432)testdb=# select * from t_mvcc2;
 c1 |  c2  |  c3  
----+------+------
  2 | C2-2 | C3-2
(1 row)

DONE!

4. References

PG Source Code
