This article explains how GetSnapshotData works in PostgreSQL. It first introduces the data structures involved, then reads through the function's source, and finally traces one call with gdb.
Global/static variables
/*
 * Currently registered Snapshots.  Ordered in a heap by xmin, so that we can
 * quickly find the one with lowest xmin, to advance our MyPgXact->xmin.
 */
static int xmin_cmp(const pairingheap_node *a, const pairingheap_node *b,
                    void *arg);

static pairingheap RegisteredSnapshots = {&xmin_cmp, NULL, NULL};

/* first GetTransactionSnapshot call in a transaction? */
bool        FirstSnapshotSet = false;

/*
 * Remember the serializable transaction snapshot, if any.  We cannot trust
 * FirstSnapshotSet in combination with IsolationUsesXactSnapshot(), because
 * GUC may be reset before us, changing the value of IsolationUsesXactSnapshot.
 */
static Snapshot FirstXactSnapshot = NULL;

/*
 * CurrentSnapshot points to the only snapshot taken in transaction-snapshot
 * mode, and to the latest one taken in a read-committed transaction.
 * SecondarySnapshot is a snapshot that's always up-to-date as of the current
 * instant, even in transaction-snapshot mode.  It should only be used for
 * special-purpose code (say, RI checking.)  CatalogSnapshot points to an
 * MVCC snapshot intended to be used for catalog scans; we must invalidate it
 * whenever a system catalog change occurs.
 *
 * These SnapshotData structs are static to simplify memory allocation
 * (see the hack in GetSnapshotData to avoid repeated malloc/free).
 */
static SnapshotData CurrentSnapshotData = {HeapTupleSatisfiesMVCC};
static SnapshotData SecondarySnapshotData = {HeapTupleSatisfiesMVCC};
SnapshotData CatalogSnapshotData = {HeapTupleSatisfiesMVCC};

/* Pointers to valid snapshots */
static Snapshot CurrentSnapshot = NULL;
static Snapshot SecondarySnapshot = NULL;
static Snapshot CatalogSnapshot = NULL;
static Snapshot HistoricSnapshot = NULL;

/*
 * These are updated by GetSnapshotData.  We initialize them this way
 * for the convenience of TransactionIdIsInProgress: even in bootstrap
 * mode, we don't want it to say that BootstrapTransactionId is in progress.
 *
 * RecentGlobalXmin and RecentGlobalDataXmin are initialized to
 * InvalidTransactionId, to ensure that no one tries to use a stale
 * value.  Readers should ensure that it has been set to something else
 * before using it.
 */
TransactionId TransactionXmin = FirstNormalTransactionId;
TransactionId RecentXmin = FirstNormalTransactionId;
TransactionId RecentGlobalXmin = InvalidTransactionId;
TransactionId RecentGlobalDataXmin = InvalidTransactionId;

/* (table, ctid) => (cmin, cmax) mapping during timetravel */
static HTAB *tuplecid_data = NULL;
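The comment about the static SnapshotData structs refers to a lazy-allocation pattern: the XID array is allocated on first use and then reused on every later call. A minimal sketch of that idea follows. `ToySnapshot`, `toy_get_snapshot` and `TOY_MAX_XIDS` are hypothetical names invented for illustration, and the fixed bound only stands in for GetMaxSnapshotXidCount(); this is not the PostgreSQL implementation.

```c
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t TransactionId;

/* Toy stand-in for SnapshotData: only the fields needed here. */
typedef struct ToySnapshot
{
    TransactionId *xip;         /* allocated lazily, then reused */
    uint32_t    xcnt;           /* number of valid entries in xip[] */
} ToySnapshot;

/* Hypothetical fixed upper bound (assumption, not a PostgreSQL constant). */
enum { TOY_MAX_XIDS = 112 };

/* Static struct, so the xip pointer survives between calls. */
static ToySnapshot toy_current = {NULL, 0};

/* Returns the static snapshot, or NULL if the first allocation fails. */
ToySnapshot *
toy_get_snapshot(void)
{
    if (toy_current.xip == NULL)
    {
        /* First call: allocate once, sized for the worst case. */
        toy_current.xip = malloc(TOY_MAX_XIDS * sizeof(TransactionId));
        if (toy_current.xip == NULL)
            return NULL;
    }
    toy_current.xcnt = 0;       /* contents are rebuilt on every call */
    return &toy_current;
}
```

Calling `toy_get_snapshot()` twice returns the same struct with the same `xip` buffer, which is why the pattern only works when callers pass statically allocated structs.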
MyPgXact
Transaction information for the current backend.
/*
 * Flags for PGXACT->vacuumFlags
 *
 * Note: If you modify these flags, you need to modify PROCARRAY_XXX flags
 * in src/include/storage/procarray.h.
 *
 * PROC_RESERVED may later be assigned for use in vacuumFlags, but its value is
 * used for PROCARRAY_SLOTS_XMIN in procarray.h, so GetOldestXmin won't be able
 * to match and ignore processes with this flag set.
 */
#define PROC_IS_AUTOVACUUM          0x01    /* is it an autovac worker? */
#define PROC_IN_VACUUM              0x02    /* currently running lazy vacuum */
#define PROC_IN_ANALYZE             0x04    /* currently running analyze */
#define PROC_VACUUM_FOR_WRAPAROUND  0x08    /* set by autovac only */
#define PROC_IN_LOGICAL_DECODING    0x10    /* currently doing logical
                                             * decoding outside xact */
#define PROC_RESERVED               0x20    /* reserved for procarray */

/* flags reset at EOXact */
#define PROC_VACUUM_STATE_MASK \
    (PROC_IN_VACUUM | PROC_IN_ANALYZE | PROC_VACUUM_FOR_WRAPAROUND)

/*
 * Prior to PostgreSQL 9.2, the fields below were stored as part of the
 * PGPROC.  However, benchmarking revealed that packing these particular
 * members into a separate array as tightly as possible sped up GetSnapshotData
 * considerably on systems with many CPU cores, by reducing the number of
 * cache lines needing to be fetched.  Thus, think very carefully before adding
 * anything else here.
 */
typedef struct PGXACT
{
    /*
     * Annotation: as an optimization, a read-only transaction is not
     * assigned an XID, so xid stays 0 (InvalidTransactionId).
     */
    TransactionId xid;          /* id of top-level transaction currently being
                                 * executed by this proc, if running and XID
                                 * is assigned; else InvalidTransactionId */

    TransactionId xmin;         /* minimal running XID as it was when we were
                                 * starting our xact, excluding LAZY VACUUM:
                                 * vacuum must not remove tuples deleted by
                                 * xid >= xmin ! */

    uint8       vacuumFlags;    /* vacuum-related flags, see above */
    bool        overflowed;
    bool        delayChkpt;     /* true if this proc delays checkpoint start;
                                 * previously called InCommit */

    uint8       nxids;
} PGXACT;

extern PGDLLIMPORT struct PGXACT *MyPgXact;
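To make the role of vacuumFlags concrete, here is a small sketch of the two bit tests that matter for this article: the check GetSnapshotData uses to skip a backend, and the mask applied at end of transaction. The flag values are copied from the definitions above; the helper names `toy_skipped_by_snapshot` and `toy_reset_at_eoxact` are hypothetical and exist only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values copied from the PGXACT->vacuumFlags definitions. */
#define PROC_IS_AUTOVACUUM          0x01
#define PROC_IN_VACUUM              0x02
#define PROC_IN_ANALYZE             0x04
#define PROC_VACUUM_FOR_WRAPAROUND  0x08
#define PROC_IN_LOGICAL_DECODING    0x10

#define PROC_VACUUM_STATE_MASK \
    (PROC_IN_VACUUM | PROC_IN_ANALYZE | PROC_VACUUM_FOR_WRAPAROUND)

/* Hypothetical helper: would the snapshot loop skip a backend with these flags? */
bool
toy_skipped_by_snapshot(uint8_t vacuumFlags)
{
    return (vacuumFlags & (PROC_IN_LOGICAL_DECODING | PROC_IN_VACUUM)) != 0;
}

/* Hypothetical helper: clear the per-transaction vacuum state bits. */
uint8_t
toy_reset_at_eoxact(uint8_t vacuumFlags)
{
    return (uint8_t) (vacuumFlags & ~PROC_VACUUM_STATE_MASK);
}
```

An autovacuum worker running ANALYZE (PROC_IS_AUTOVACUUM | PROC_IN_ANALYZE) is still counted by the snapshot loop, while one running lazy VACUUM is skipped.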
Snapshot
Snapshot is a pointer to a SnapshotData struct. SnapshotData can represent every kind of snapshot.
There are several different kinds of snapshots:
1. Normal MVCC snapshots
2. MVCC snapshots taken during recovery (in Hot-Standby mode)
3. Historic MVCC snapshots used during logical decoding
4. Snapshots passed to HeapTupleSatisfiesDirty()
5. Snapshots passed to HeapTupleSatisfiesNonVacuumable()
6. Snapshots used for SatisfiesAny, Toast and Self, where no members are accessed
typedef struct SnapshotData *Snapshot;

#define InvalidSnapshot     ((Snapshot) NULL)

/*
 * We use SnapshotData structures to represent both "regular" (MVCC)
 * snapshots and "special" snapshots that have non-MVCC semantics.
 * The specific semantics of a snapshot are encoded by the "satisfies"
 * function.
 */
typedef bool (*SnapshotSatisfiesFunc) (HeapTuple htup,
                                       Snapshot snapshot, Buffer buffer);

/*
 * Annotation: commonly seen test functions
 *   HeapTupleSatisfiesMVCC:         is the tuple visible to the given MVCC snapshot?
 *   HeapTupleSatisfiesUpdate:       can the tuple be updated (concurrent updates of one tuple)?
 *   HeapTupleSatisfiesDirty:        like Self, but also sees uncommitted (dirty) changes
 *   HeapTupleSatisfiesSelf:         is the tuple visible given everything done so far, including by ourselves?
 *   HeapTupleSatisfiesToast:        used for TOAST tables
 *   HeapTupleSatisfiesVacuum:       can the tuple be removed by VACUUM?
 *   HeapTupleSatisfiesAny:          every tuple is visible
 *   HeapTupleSatisfiesHistoricMVCC: used for catalog tables (historic snapshots)
 */

/*
 * Struct representing all kind of possible snapshots.
 *
 * There are several different kinds of snapshots:
 * * Normal MVCC snapshots
 * * MVCC snapshots taken during recovery (in Hot-Standby mode)
 * * Historic MVCC snapshots used during logical decoding
 * * snapshots passed to HeapTupleSatisfiesDirty()
 * * snapshots passed to HeapTupleSatisfiesNonVacuumable()
 * * snapshots used for SatisfiesAny, Toast, Self where no members are
 *   accessed.
 *
 * TODO: It's probably a good idea to split this struct using a NodeTag
 * similar to how parser and executor nodes are handled, with one type for
 * each different kind of snapshot to avoid overloading the meaning of
 * individual fields.
 */
typedef struct SnapshotData
{
    SnapshotSatisfiesFunc satisfies;    /* tuple test function */

    /*
     * The remaining fields are used only for MVCC snapshots, and are normally
     * just zeroes in special snapshots.  (But xmin and xmax are used
     * specially by HeapTupleSatisfiesDirty, and xmin is used specially by
     * HeapTupleSatisfiesNonVacuumable.)
     *
     * An MVCC snapshot can never see the effects of XIDs >= xmax.  It can see
     * the effects of all older XIDs except those listed in the snapshot.  xmin
     * is stored as an optimization to avoid needing to search the XID arrays
     * for most tuples.
     */
    TransactionId xmin;         /* all XID < xmin are visible to me */
    TransactionId xmax;         /* all XID >= xmax are invisible to me */

    /*
     * For normal MVCC snapshot this contains the all xact IDs that are in
     * progress, unless the snapshot was taken during recovery in which case
     * it's empty.  For historic MVCC snapshots, the meaning is inverted, i.e.
     * it contains *committed* transactions between xmin and xmax.
     *
     * note: all ids in xip[] satisfy xmin <= xip[i] < xmax
     */
    TransactionId *xip;
    uint32      xcnt;           /* # of xact ids in xip[] */

    /*
     * For non-historic MVCC snapshots, this contains subxact IDs that are in
     * progress (and other transactions that are in progress if taken during
     * recovery).  For historic snapshot it contains *all* xids assigned to the
     * replayed transaction, including the toplevel xid.
     *
     * note: all ids in subxip[] are >= xmin, but we don't bother filtering
     * out any that are >= xmax
     */
    TransactionId *subxip;
    int32       subxcnt;        /* # of xact ids in subxip[] */
    bool        suboverflowed;  /* has the subxip array overflowed? */

    bool        takenDuringRecovery;    /* recovery-shaped snapshot? */
    bool        copied;         /* false if it's a static snapshot */

    CommandId   curcid;         /* in my xact, CID < curcid are visible */

    /*
     * An extra return value for HeapTupleSatisfiesDirty, not used in MVCC
     * snapshots.
     */
    uint32      speculativeToken;

    /*
     * Book-keeping information, used by the snapshot manager
     */
    uint32      active_count;   /* refcount on ActiveSnapshot stack */
    uint32      regd_count;     /* refcount on RegisteredSnapshots */
    pairingheap_node ph_node;   /* link in the RegisteredSnapshots heap */

    TimestampTz whenTaken;      /* timestamp when snapshot was taken */
    XLogRecPtr  lsn;            /* position in the WAL stream when taken */
} SnapshotData;
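The xmin/xmax/xip comments describe a three-way test: XIDs below xmin are finished, XIDs at or above xmax are treated as running, and anything in between is looked up in xip. The sketch below shows that test under simplifying assumptions: plain integer comparison (no XID wraparound), no subxip handling, and no recovery or historic snapshots. `ToySnapshot` and `toy_xid_in_snapshot` are hypothetical names, not the real XidInMVCCSnapshot().

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Toy subset of SnapshotData. */
typedef struct ToySnapshot
{
    TransactionId xmin;         /* XIDs below this are finished */
    TransactionId xmax;         /* XIDs at or above this count as running */
    const TransactionId *xip;   /* in-progress XIDs in [xmin, xmax) */
    uint32_t    xcnt;
} ToySnapshot;

/*
 * Returns true if the snapshot must treat xid as still in progress.
 * Assumption: no wraparound, so plain < and >= are used.
 */
bool
toy_xid_in_snapshot(TransactionId xid, const ToySnapshot *snap)
{
    if (xid < snap->xmin)
        return false;           /* finished before the snapshot was taken */
    if (xid >= snap->xmax)
        return true;            /* started after the snapshot was taken */

    for (uint32_t i = 0; i < snap->xcnt; i++)
    {
        if (snap->xip[i] == xid)
            return true;        /* listed as running */
    }
    return false;               /* in range but not listed: finished */
}
```

With the values from the trace in this article (xmin = 2354, xmax = 2358, xip = {2354}), XID 2353 is finished, 2354 is running, 2356 is finished, and 2358 counts as running.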
ShmemVariableCache
VariableCache is a data structure in shared memory that tracks OID and XID assignment state.
ShmemVariableCache is a pointer to the VariableCacheData struct.
/*
 * VariableCache is a data structure in shared memory that is used to track
 * OID and XID assignment state.  For largely historical reasons, there is
 * just one struct with different fields that are protected by different
 * LWLocks.
 *
 * Note: xidWrapLimit and oldestXidDB are not "active" values, but are
 * used just to generate useful messages when xidWarnLimit or xidStopLimit
 * are exceeded.
 */
typedef struct VariableCacheData
{
    /*
     * These fields are protected by OidGenLock.
     */
    Oid         nextOid;        /* next OID to assign */
    uint32      oidCount;       /* OIDs available before must do XLOG work */

    /*
     * These fields are protected by XidGenLock.
     */
    TransactionId nextXid;      /* next XID to assign */

    TransactionId oldestXid;    /* cluster-wide minimum datfrozenxid */
    TransactionId xidVacLimit;  /* start forcing autovacuums here */
    TransactionId xidWarnLimit; /* start complaining here */
    TransactionId xidStopLimit; /* refuse to advance nextXid beyond here */
    TransactionId xidWrapLimit; /* where the world ends */
    Oid         oldestXidDB;    /* database with minimum datfrozenxid */

    /*
     * These fields are protected by CommitTsLock
     */
    TransactionId oldestCommitTsXid;
    TransactionId newestCommitTsXid;

    /*
     * These fields are protected by ProcArrayLock.
     */
    TransactionId latestCompletedXid;   /* newest XID that has committed or
                                         * aborted */

    /*
     * These fields are protected by CLogTruncationLock
     */
    TransactionId oldestClogXid;    /* oldest it's safe to look up in clog */
} VariableCacheData;

typedef VariableCacheData *VariableCache;

/* pointer to "variable cache" in shared memory (set up by shmem.c) */
VariableCache ShmemVariableCache = NULL;
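The wrap-related limits exist because XIDs are 32-bit counters that are compared circularly rather than as plain integers. The sketch below illustrates that comparison: two normal XIDs are compared through the sign of their 32-bit difference, while the special XIDs below 3 are compared as plain integers. `toy_xid_precedes` is a hypothetical name; the logic follows the well-known modulo-2^32 scheme and is meant as an illustration, not as a copy of the PostgreSQL source.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define FirstNormalTransactionId ((TransactionId) 3)

/* True if a is logically older than b. */
bool
toy_xid_precedes(TransactionId a, TransactionId b)
{
    /* Special XIDs (0, 1, 2) are always older than any normal XID. */
    if (a < FirstNormalTransactionId || b < FirstNormalTransactionId)
        return a < b;

    /* Normal XIDs: compare modulo 2^32 via the sign of the difference. */
    return (int32_t) (a - b) < 0;
}
```

Under this scheme each XID has roughly 2^31 XIDs "before" it and 2^31 "after" it. That matches the gdb output later in this article, where xidWrapLimit (2147484208) equals oldestXid (561) plus 2^31 - 1.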
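The GetSnapshotData function returns snapshot information.
Its main job is to build xmin : xmax : xip_list. The logic can be summarized as follows:
1. Compute xmax = ShmemVariableCache->latestCompletedXid + 1.
2. Walk the global procArray and build the snapshot:
2.1 Fetch the backend's transaction information (pgxact).
2.2 Read the backend's transaction ID (pgxact->xid). Entries without a normal XID (for example 0) or with an XID >= xmax are skipped; the smallest remaining XID becomes xmin.
2.3 Append the XID to snapshot->xip, except for the XID of the calling backend's own transaction.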
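The steps above can be sketched as a small standalone function. This is a toy model under stated assumptions: it ignores locking, subtransactions, vacuum flags, recovery and XID wraparound, and the names `ToyProc`, `ToySnapshot`, `toy_build_snapshot` and `TOY_MAX_XIDS` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define FirstNormalTransactionId ((TransactionId) 3)
enum { TOY_MAX_XIDS = 16 };     /* hypothetical capacity of xip[] */

typedef struct ToyProc
{
    TransactionId xid;          /* 0 means no XID assigned */
} ToyProc;

typedef struct ToySnapshot
{
    TransactionId xmin;
    TransactionId xmax;
    TransactionId xip[TOY_MAX_XIDS];
    uint32_t    xcnt;
} ToySnapshot;

/* self is the index of the calling backend's own entry in procs[]. */
void
toy_build_snapshot(const ToyProc *procs, size_t nprocs, size_t self,
                   TransactionId latest_completed, ToySnapshot *snap)
{
    TransactionId xmax = latest_completed + 1;  /* step 1 */
    TransactionId xmin = xmax;                  /* start from xmax */
    uint32_t    count = 0;

    for (size_t i = 0; i < nprocs; i++)         /* step 2 */
    {
        TransactionId xid = procs[i].xid;       /* steps 2.1 and 2.2 */

        /* Skip entries without a normal XID, or with XID >= xmax. */
        if (xid < FirstNormalTransactionId || xid >= xmax)
            continue;

        if (xid < xmin)
            xmin = xid;         /* our own XID still counts toward xmin */

        if (i == self)
            continue;           /* step 2.3: own XID is not listed */

        if (count < TOY_MAX_XIDS)
            snap->xip[count++] = xid;
    }

    snap->xmin = xmin;
    snap->xmax = xmax;
    snap->xcnt = count;
}
```

Fed with the state seen in the gdb trace later in this article (latestCompletedXid = 2357, the caller holding XID 2355, another backend holding 2354, and, as an assumption, no XID on the remaining backends), the sketch yields xmin = 2354, xmax = 2358 and a single xip entry, 2354.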
/*
 * GetSnapshotData -- returns information about running transactions.
 *
 * The returned snapshot includes xmin (lowest still-running xact ID),
 * xmax (highest completed xact ID + 1), and a list of running xact IDs
 * in the range xmin <= xid < xmax.  It is used as follows:
 *      All xact IDs < xmin are considered finished.
 *      All xact IDs >= xmax are considered still running.
 *      For an xact ID xmin <= xid < xmax, consult list to see whether
 *      it is considered running or not.
 * This ensures that the set of transactions seen as "running" by the
 * current xact will not change after it takes the snapshot.
 *
 * All running top-level XIDs are included in the snapshot, except for lazy
 * VACUUM processes.  We also try to include running subtransaction XIDs,
 * but since PGPROC has only a limited cache area for subxact XIDs, full
 * information may not be available.  If we find any overflowed subxid arrays,
 * we have to mark the snapshot's subxid data as overflowed, and extra work
 * *may* need to be done to determine what's running (see XidInMVCCSnapshot()
 * in tqual.c).
 *
 * We also update the following backend-global variables:
 *      TransactionXmin: the oldest xmin of any snapshot in use in the
 *          current transaction (this is the same as MyPgXact->xmin).
 *      RecentXmin: the xmin computed for the most recent snapshot.  XIDs
 *          older than this are known not running any more.
 *      RecentGlobalXmin: the global xmin (oldest TransactionXmin across all
 *          running transactions, except those running LAZY VACUUM).  This is
 *          the same computation done by
 *          GetOldestXmin(NULL, PROCARRAY_FLAGS_VACUUM).
 *      RecentGlobalDataXmin: the global xmin for non-catalog tables
 *          >= RecentGlobalXmin
 *
 * Note: this function should probably not be called with an argument that's
 * not statically allocated (see xip allocation below).
 */
Snapshot
GetSnapshotData(Snapshot snapshot)
{
    ProcArrayStruct *arrayP = procArray;    /* the process array */
    TransactionId xmin;
    TransactionId xmax;
    TransactionId globalxmin;               /* global xmin */
    int         index;
    int         count = 0;
    int         subcount = 0;
    bool        suboverflowed = false;
    TransactionId replication_slot_xmin = InvalidTransactionId;
    TransactionId replication_slot_catalog_xmin = InvalidTransactionId;

    Assert(snapshot != NULL);

    /*
     * Allocating space for maxProcs xids is usually overkill; numProcs would
     * be sufficient.  But it seems better to do the malloc while not holding
     * the lock, so we can't look at numProcs.  Likewise, we allocate much
     * more subxip storage than is probably needed.
     *
     * This does open a possibility for avoiding repeated malloc/free: since
     * maxProcs does not change at runtime, we can simply reuse the previous
     * xip arrays if any.  (This relies on the fact that all callers pass
     * static SnapshotData structs.)
     */
    if (snapshot->xip == NULL)
    {
        /*
         * First call for this snapshot. Snapshot is same size whether or not
         * we are in recovery, see later comments.
         */
        snapshot->xip = (TransactionId *)
            malloc(GetMaxSnapshotXidCount() * sizeof(TransactionId));
        if (snapshot->xip == NULL)
            ereport(ERROR,
                    (errcode(ERRCODE_OUT_OF_MEMORY),
                     errmsg("out of memory")));
        Assert(snapshot->subxip == NULL);
        snapshot->subxip = (TransactionId *)
            malloc(GetMaxSnapshotSubxidCount() * sizeof(TransactionId));
        if (snapshot->subxip == NULL)
            ereport(ERROR,
                    (errcode(ERRCODE_OUT_OF_MEMORY),
                     errmsg("out of memory")));
    }

    /*
     * It is sufficient to get shared lock on ProcArrayLock, even if we are
     * going to set MyPgXact->xmin.
     */
    LWLockAcquire(ProcArrayLock, LW_SHARED);

    /* xmax is always latestCompletedXid + 1 */
    xmax = ShmemVariableCache->latestCompletedXid;
    Assert(TransactionIdIsNormal(xmax));
    TransactionIdAdvance(xmax);

    /* initialize xmin calculation with xmax */
    globalxmin = xmin = xmax;

    snapshot->takenDuringRecovery = RecoveryInProgress();

    if (!snapshot->takenDuringRecovery)
    {
        /* normal operation, not in recovery */
        int        *pgprocnos = arrayP->pgprocnos;  /* indexes into allPgXact[] */
        int         numProcs;

        /*
         * Spin over procArray checking xid, xmin, and subxids.  The goal is
         * to gather all active xids, find the lowest xmin, and try to record
         * subxids.
         */
        numProcs = arrayP->numProcs;
        for (index = 0; index < numProcs; index++)
        {
            int         pgprocno = pgprocnos[index];
            PGXACT     *pgxact = &allPgXact[pgprocno];
            TransactionId xid;

            /*
             * Skip over backends doing logical decoding which manages xmin
             * separately (check below) and ones running LAZY VACUUM.
             */
            if (pgxact->vacuumFlags &
                (PROC_IN_LOGICAL_DECODING | PROC_IN_VACUUM))
                continue;

            /* Update globalxmin to be the smallest valid xmin */
            xid = UINT32_ACCESS_ONCE(pgxact->xmin);
            if (TransactionIdIsNormal(xid) &&
                NormalTransactionIdPrecedes(xid, globalxmin))
                globalxmin = xid;

            /* Fetch xid just once - see GetNewTransactionId */
            xid = UINT32_ACCESS_ONCE(pgxact->xid);

            /*
             * If the transaction has no XID assigned, we can skip it; it
             * won't have sub-XIDs either.  If the XID is >= xmax, we can also
             * skip it; such transactions will be treated as running anyway
             * (and any sub-XIDs will also be >= xmax).
             */
            if (!TransactionIdIsNormal(xid)
                || !NormalTransactionIdPrecedes(xid, xmax))
                continue;

            /*
             * We don't include our own XIDs (if any) in the snapshot, but we
             * must include them in xmin.
             */
            if (NormalTransactionIdPrecedes(xid, xmin))
                xmin = xid;
            if (pgxact == MyPgXact)
                continue;

            /* Add XID to snapshot. */
            snapshot->xip[count++] = xid;

            /*
             * Save subtransaction XIDs if possible (if we've already
             * overflowed, there's no point).  Note that the subxact XIDs must
             * be later than their parent, so no need to check them against
             * xmin.  We could filter against xmax, but it seems better not to
             * do that much work while holding the ProcArrayLock.
             *
             * The other backend can add more subxids concurrently, but cannot
             * remove any.  Hence it's important to fetch nxids just once.
             * Should be safe to use memcpy, though.  (We needn't worry about
             * missing any xids added concurrently, because they must postdate
             * xmax.)
             *
             * Again, our own XIDs are not included in the snapshot.
             */
            if (!suboverflowed)
            {
                if (pgxact->overflowed)
                    suboverflowed = true;
                else
                {
                    int         nxids = pgxact->nxids;

                    if (nxids > 0)
                    {
                        PGPROC     *proc = &allProcs[pgprocno];

                        pg_read_barrier();  /* pairs with GetNewTransactionId */

                        memcpy(snapshot->subxip + subcount,
                               (void *) proc->subxids.xids,
                               nxids * sizeof(TransactionId));
                        subcount += nxids;
                    }
                }
            }
        }
    }
    else
    {
        /*
         * We're in hot standby, so get XIDs from KnownAssignedXids.
         *
         * We store all xids directly into subxip[]. Here's why:
         *
         * In recovery we don't know which xids are top-level and which are
         * subxacts, a design choice that greatly simplifies xid processing.
         *
         * It seems like we would want to try to put xids into xip[] only, but
         * that is fairly small. We would either need to make that bigger or
         * to increase the rate at which we WAL-log xid assignment; neither is
         * an appealing choice.
         *
         * We could try to store xids into xip[] first and then into subxip[]
         * if there are too many xids. That only works if the snapshot doesn't
         * overflow because we do not search subxip[] in that case. A simpler
         * way is to just store all xids in the subxact array because this is
         * by far the bigger array. We just leave the xip array empty.
         *
         * Either way we need to change the way XidInMVCCSnapshot() works
         * depending upon when the snapshot was taken, or change normal
         * snapshot processing so it matches.
         *
         * Note: It is possible for recovery to end before we finish taking
         * the snapshot, and for newly assigned transaction ids to be added to
         * the ProcArray.  xmax cannot change while we hold ProcArrayLock, so
         * those newly added transaction ids would be filtered away, so we
         * need not be concerned about them.
         */
        subcount = KnownAssignedXidsGetAndSetXmin(snapshot->subxip, &xmin,
                                                  xmax);

        if (TransactionIdPrecedesOrEquals(xmin, procArray->lastOverflowedXid))
            suboverflowed = true;
    }

    /*
     * Fetch into local variable while ProcArrayLock is held - the
     * LWLockRelease below is a barrier, ensuring this happens inside the
     * lock.
     */
    replication_slot_xmin = procArray->replication_slot_xmin;
    replication_slot_catalog_xmin = procArray->replication_slot_catalog_xmin;

    if (!TransactionIdIsValid(MyPgXact->xmin))
        MyPgXact->xmin = TransactionXmin = xmin;

    LWLockRelease(ProcArrayLock);

    /*
     * Update globalxmin to include actual process xids.  This is a slightly
     * different way of computing it than GetOldestXmin uses, but should give
     * the same result.
     */
    if (TransactionIdPrecedes(xmin, globalxmin))
        globalxmin = xmin;

    /* Update global variables too */
    RecentGlobalXmin = globalxmin - vacuum_defer_cleanup_age;
    if (!TransactionIdIsNormal(RecentGlobalXmin))
        RecentGlobalXmin = FirstNormalTransactionId;

    /* Check whether there's a replication slot requiring an older xmin. */
    if (TransactionIdIsValid(replication_slot_xmin) &&
        NormalTransactionIdPrecedes(replication_slot_xmin, RecentGlobalXmin))
        RecentGlobalXmin = replication_slot_xmin;

    /* Non-catalog tables can be vacuumed if older than this xid */
    RecentGlobalDataXmin = RecentGlobalXmin;

    /*
     * Check whether there's a replication slot requiring an older catalog
     * xmin.
     */
    if (TransactionIdIsNormal(replication_slot_catalog_xmin) &&
        NormalTransactionIdPrecedes(replication_slot_catalog_xmin, RecentGlobalXmin))
        RecentGlobalXmin = replication_slot_catalog_xmin;

    RecentXmin = xmin;

    snapshot->xmin = xmin;
    snapshot->xmax = xmax;
    snapshot->xcnt = count;
    snapshot->subxcnt = subcount;
    snapshot->suboverflowed = suboverflowed;

    /* current command id */
    snapshot->curcid = GetCurrentCommandId(false);

    /*
     * This is a new snapshot, so set both refcounts are zero, and mark it as
     * not copied in persistent memory.
     */
    snapshot->active_count = 0;
    snapshot->regd_count = 0;
    snapshot->copied = false;

    if (old_snapshot_threshold < 0)
    {
        /*
         * If not using "snapshot too old" feature, fill related fields with
         * dummy values that don't require any locking.
         */
        snapshot->lsn = InvalidXLogRecPtr;
        snapshot->whenTaken = 0;
    }
    else
    {
        /*
         * Capture the current time and WAL stream location in case this
         * snapshot becomes old enough to need to fall back on the special
         * "old snapshot" logic.
         */
        snapshot->lsn = GetXLogInsertRecPtr();
        snapshot->whenTaken = GetSnapshotCurrentTimestamp();
        MaintainOldSnapshotTimeMapping(snapshot->whenTaken, xmin);
    }

    return snapshot;
}
Running a simple query is enough to trigger the snapshot logic.
16:35:08 (xdb@[local]:5432)testdb=# begin;
BEGIN
16:35:13 (xdb@[local]:5432)testdb=#* select 1;
Start gdb and set a breakpoint.
(gdb) b GetSnapshotData
Breakpoint 1 at 0x89aef3: file procarray.c, line 1519.
(gdb) c
Continuing.

Breakpoint 1, GetSnapshotData (snapshot=0xf9be60 <CurrentSnapshotData>) at procarray.c:1519
1519        ProcArrayStruct *arrayP = procArray;
(gdb)
The input argument snapshot is in fact the static variable CurrentSnapshotData. Its xip pointer is already non-NULL, so the struct has been filled by an earlier call and its buffers will be reused.
(gdb) p *snapshot
$1 = {satisfies = 0xa9310d <HeapTupleSatisfiesMVCC>, xmin = 2354, xmax = 2358, xip = 0x24c7e40,
  xcnt = 1, subxip = 0x251dfa0, subxcnt = 0, suboverflowed = false, takenDuringRecovery = false,
  copied = false, curcid = 0, speculativeToken = 0, active_count = 0, regd_count = 0,
  ph_node = {first_child = 0x0, next_sibling = 0x0, prev_or_parent = 0x0}, whenTaken = 0, lsn = 0}
Inspect the shared-memory state (ShmemVariableCache).
nextXid = 2358, so the next transaction ID to be assigned is 2358. latestCompletedXid = 2357, which is the value xmax is derived from.
(gdb) p *ShmemVariableCache
$2 = {nextOid = 42605, oidCount = 8183, nextXid = 2358, oldestXid = 561, xidVacLimit = 200000561,
  xidWarnLimit = 2136484208, xidStopLimit = 2146484208, xidWrapLimit = 2147484208,
  oldestXidDB = 16400, oldestCommitTsXid = 0, newestCommitTsXid = 0, latestCompletedXid = 2357,
  oldestClogXid = 561}
(gdb)
Fetch the global process array procArray and assign it to arrayP.
Initialize the local variables.
(gdb) n
1524        int         count = 0;
(gdb) n
1525        int         subcount = 0;
(gdb)
1526        bool        suboverflowed = false;
(gdb)
1527        volatile TransactionId replication_slot_xmin = InvalidTransactionId;
(gdb)
1528        volatile TransactionId replication_slot_catalog_xmin = InvalidTransactionId;
(gdb)
1530        Assert(snapshot != NULL);
(gdb)
1543        if (snapshot->xip == NULL)
(gdb)
Inspect the process array and the indexes into allPgXact[] (the arrayP->pgprocnos array).
allPgXact is declared as: static PGXACT *allPgXact;
(gdb) p *arrayP
$3 = {numProcs = 5, maxProcs = 112, maxKnownAssignedXids = 7280, numKnownAssignedXids = 0,
  tailKnownAssignedXids = 0, headKnownAssignedXids = 0, known_assigned_xids_lck = 0 '\000',
  lastOverflowedXid = 0, replication_slot_xmin = 0, replication_slot_catalog_xmin = 0,
  pgprocnos = 0x7f8765d9a3a8}
(gdb) p arrayP->pgprocnos[0]
$4 = 97
(gdb) p arrayP->pgprocnos[1]
$5 = 98
(gdb) p arrayP->pgprocnos[2]
$6 = 99
(gdb) p arrayP->pgprocnos[3]
$7 = 103
(gdb) p arrayP->pgprocnos[4]
$9 = 111
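The output shows the indirection at work: only numProcs = 5 of the maxProcs = 112 slots are live, and pgprocnos lists which slots those are. A small sketch of that layout follows, with hypothetical names (`ToyPgXact`, `toy_count_assigned`); it only illustrates why the loop walks pgprocnos instead of scanning every slot.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Toy subset of PGXACT. */
typedef struct ToyPgXact
{
    TransactionId xid;          /* 0 means no XID assigned */
} ToyPgXact;

/*
 * Counts the live backends that currently have an XID assigned.
 * all[] has one slot per possible backend; pgprocnos[] holds the
 * indexes of the numProcs slots that are actually in use.
 */
size_t
toy_count_assigned(const ToyPgXact *all, const int *pgprocnos, int numProcs)
{
    size_t      n = 0;

    for (int index = 0; index < numProcs; index++)
    {
        const ToyPgXact *pgxact = &all[pgprocnos[index]];

        if (pgxact->xid != 0)
            n++;
    }
    return n;
}
```

Slots that are not listed in pgprocnos are never touched, so stale contents in unused slots cannot leak into a snapshot.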
Acquire the lock before reading or modifying the shared state.
(gdb)
1568        LWLockAcquire(ProcArrayLock, LW_SHARED);
Compute xmax.
(gdb) n
1571        xmax = ShmemVariableCache->latestCompletedXid;
(gdb)
1572        Assert(TransactionIdIsNormal(xmax));
(gdb) p xmax
$10 = 2357
(gdb) n
1573        TransactionIdAdvance(xmax);
(gdb)
1576        globalxmin = xmin = xmax;
(gdb)
1578        snapshot->takenDuringRecovery = RecoveryInProgress();
(gdb) p xmax
$11 = 2358
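TransactionIdAdvance is more than a plain increment: when the counter wraps around, it skips the reserved XIDs below FirstNormalTransactionId (3). The function below is a sketch of that behavior written as a plain C function; `toy_xid_advance` is a hypothetical name, and the real TransactionIdAdvance is a macro that modifies its argument in place.

```c
#include <stdint.h>

typedef uint32_t TransactionId;

#define FirstNormalTransactionId ((TransactionId) 3)

/* Returns the XID that follows xid, skipping the special XIDs 0, 1 and 2. */
TransactionId
toy_xid_advance(TransactionId xid)
{
    xid++;                      /* unsigned arithmetic wraps to 0 at 2^32 */
    if (xid < FirstNormalTransactionId)
        xid = FirstNormalTransactionId;
    return xid;
}
```

In the trace this turns latestCompletedXid = 2357 into xmax = 2358; at the very top of the range, 4294967295 advances to 3 rather than 0.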
Check whether the server is in recovery. It is not, so execution enters the normal-operation branch.
(gdb) n
1580        if (!snapshot->takenDuringRecovery)
(gdb) p snapshot->takenDuringRecovery
$13 = false
(gdb) n
1582            int        *pgprocnos = arrayP->pgprocnos;
(gdb)
Fetch the number of processes and the PGXACT index array, ready for the loop.
(gdb) n
1590            numProcs = arrayP->numProcs;
(gdb)
1591            for (index = 0; index < numProcs; index++)
(gdb)
(gdb) p *pgprocnos
$14 = 97
(gdb) p numProcs
$15 = 5
(gdb)
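Fetch the pgxact entry. The source lines shown by gdb (volatile PGXACT *, two separate flag checks, plain reads of xmin and xid) differ slightly from the listing above, so the debugged binary was presumably built from a slightly different version of procarray.c; the logic is the same.
(gdb) n
1593                int         pgprocno = pgprocnos[index];
(gdb)
1594                volatile PGXACT *pgxact = &allPgXact[pgprocno];
(gdb)
1601                if (pgxact->vacuumFlags & PROC_IN_LOGICAL_DECODING)
(gdb)
1605                if (pgxact->vacuumFlags & PROC_IN_VACUUM)
(gdb)
1609                xid = pgxact->xmin; /* fetch just once */
(gdb) p *pgxact
$16 = {xid = 0, xmin = 0, vacuumFlags = 0 '\000', overflowed = false, delayChkpt = false, nxids = 0 '\000'}
(gdb)
The xid is 0, which is not a normal XID, so move on to the next pgxact.
(gdb) n
1610                if (TransactionIdIsNormal(xid) &&
(gdb)
1615                xid = pgxact->xid;
(gdb)
1623                if (!TransactionIdIsNormal(xid)
(gdb) p xid
$17 = 0
(gdb) n
1625                    continue;
(gdb)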
The next entry has xid = 2355, a normal transaction ID.
(gdb)
1591            for (index = 0; index < numProcs; index++)
(gdb)
1593                int         pgprocno = pgprocnos[index];
(gdb)
1594                volatile PGXACT *pgxact = &allPgXact[pgprocno];
(gdb)
1601                if (pgxact->vacuumFlags & PROC_IN_LOGICAL_DECODING)
(gdb) p *pgxact
$18 = {xid = 2355, xmin = 0, vacuumFlags = 0 '\000', overflowed = false, delayChkpt = false, nxids = 0 '\000'}
(gdb)
Process it. Since 2355 precedes the current xmin (2358), xmin is lowered to 2355.
(gdb) n
1605                if (pgxact->vacuumFlags & PROC_IN_VACUUM)
(gdb)
1609                xid = pgxact->xmin; /* fetch just once */
(gdb)
1610                if (TransactionIdIsNormal(xid) &&
(gdb)
1615                xid = pgxact->xid;
(gdb)
1623                if (!TransactionIdIsNormal(xid)
(gdb)
1624                    || !NormalTransactionIdPrecedes(xid, xmax))
(gdb)
1631                if (NormalTransactionIdPrecedes(xid, xmin))
(gdb) p xid
$19 = 2355
(gdb) p xmin
$20 = 2358
(gdb) n
1632                    xmin = xid;
(gdb)
1633                if (pgxact == MyPgXact)
(gdb)
This entry is the calling backend itself (pgxact == MyPgXact), so its XID is not added to xip; continue with the next entry.
(gdb)
1633                if (pgxact == MyPgXact)
(gdb) p pgxact
$21 = (volatile PGXACT *) 0x7f8765d9a218
(gdb) p MyPgXact
$22 = (struct PGXACT *) 0x7f8765d9a218
(gdb) n
1634                    continue;
(gdb)
The next entry has xid = 2354.
...
(gdb) p *pgxact
$23 = {xid = 2354, xmin = 0, vacuumFlags = 0 '\000', overflowed = false, delayChkpt = false, nxids = 0 '\000'}
(gdb)
xmin is lowered to 2354.
1631                if (NormalTransactionIdPrecedes(xid, xmin))
(gdb)
1632                    xmin = xid;
(gdb)
1633                if (pgxact == MyPgXact)
(gdb) p xmin
$24 = 2354
(gdb)
The XID is written to xip_list.
1637                snapshot->xip[count++] = xid;
(gdb)
1654                if (!suboverflowed)
(gdb)
(gdb) p count
$25 = 1
Continue looping until all 5 pgxact entries have been visited.
1591            for (index = 0; index < numProcs; index++)
(gdb)
1715        replication_slot_xmin = procArray->replication_slot_xmin;
(gdb)
There are no replication slots holding back xmin.
(gdb)
1715        replication_slot_xmin = procArray->replication_slot_xmin;
(gdb) p procArray->replication_slot_xmin
$28 = 0
(gdb) n
1716        replication_slot_catalog_xmin = procArray->replication_slot_catalog_xmin;
(gdb)
1718        if (!TransactionIdIsValid(MyPgXact->xmin))
Update this backend's own transaction information. MyPgXact->xmin is still 0 (invalid), so it is set to xmin.
(gdb) n
1719            MyPgXact->xmin = TransactionXmin = xmin;
(gdb) p MyPgXact->xmin
$29 = 0
(gdb) n
Release the lock.
1721        LWLockRelease(ProcArrayLock);
(gdb)
1728        if (TransactionIdPrecedes(xmin, globalxmin))
(gdb)
Adjust the global xmin.
(gdb) p xmin
$30 = 2354
(gdb) p globalxmin
$31 = 2358
(gdb) n
1729            globalxmin = xmin;
(gdb)
Update the remaining global variables.
(gdb)
1732        RecentGlobalXmin = globalxmin - vacuum_defer_cleanup_age;
(gdb) p RecentGlobalXmin
$32 = 2354
(gdb) p vacuum_defer_cleanup_age
$33 = 0
(gdb) n
1733        if (!TransactionIdIsNormal(RecentGlobalXmin))
(gdb)
1737        if (TransactionIdIsValid(replication_slot_xmin) &&
(gdb)
1742        RecentGlobalDataXmin = RecentGlobalXmin;
(gdb) p RecentGlobalXmin
$34 = 2354
(gdb) n
1748        if (TransactionIdIsNormal(replication_slot_catalog_xmin) &&
(gdb)
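The horizon update traced here can be condensed into one small function, which makes the order of the adjustments easier to see. This is a sketch under assumptions: `ToyHorizons`, `toy_normal_xid_precedes` and `toy_compute_horizons` are hypothetical names, and the inputs are passed as parameters instead of being read from globals and shared memory.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define InvalidTransactionId     ((TransactionId) 0)
#define FirstNormalTransactionId ((TransactionId) 3)

typedef struct ToyHorizons
{
    TransactionId recent_global_xmin;       /* also honors catalog slots */
    TransactionId recent_global_data_xmin;  /* for non-catalog tables */
} ToyHorizons;

/* Circular "a is older than b", valid for two normal XIDs. */
bool
toy_normal_xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

ToyHorizons
toy_compute_horizons(TransactionId globalxmin, uint32_t defer_cleanup_age,
                     TransactionId slot_xmin, TransactionId slot_catalog_xmin)
{
    ToyHorizons h;
    TransactionId g = globalxmin - defer_cleanup_age;

    /* Never report a special XID as the horizon. */
    if (g < FirstNormalTransactionId)
        g = FirstNormalTransactionId;

    /* A replication slot may require an older xmin. */
    if (slot_xmin != InvalidTransactionId &&
        toy_normal_xid_precedes(slot_xmin, g))
        g = slot_xmin;

    /* The data horizon is fixed before catalog slots are considered. */
    h.recent_global_data_xmin = g;

    /* A slot's catalog xmin only holds back the catalog horizon. */
    if (slot_catalog_xmin >= FirstNormalTransactionId &&
        toy_normal_xid_precedes(slot_catalog_xmin, g))
        g = slot_catalog_xmin;

    h.recent_global_xmin = g;
    return h;
}
```

With the traced values (globalxmin = 2354, vacuum_defer_cleanup_age = 0, no slots) both horizons come out as 2354, matching the gdb output. The ordering also explains why RecentGlobalDataXmin is always >= RecentGlobalXmin.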
Fill in the snapshot fields.
(gdb)
1752        RecentXmin = xmin;
(gdb)
1754        snapshot->xmin = xmin;
(gdb)
1755        snapshot->xmax = xmax;
(gdb)
1756        snapshot->xcnt = count;
(gdb)
1757        snapshot->subxcnt = subcount;
(gdb)
1758        snapshot->suboverflowed = suboverflowed;
(gdb)
1760        snapshot->curcid = GetCurrentCommandId(false);
(gdb)
1766        snapshot->active_count = 0;
(gdb)
1767        snapshot->regd_count = 0;
(gdb)
1768        snapshot->copied = false;
(gdb)
1770        if (old_snapshot_threshold < 0)
(gdb)
1776            snapshot->lsn = InvalidXLogRecPtr;
(gdb)
1777            snapshot->whenTaken = 0;
(gdb)
1791        return snapshot;
(gdb)
Return the snapshot.
(gdb) p snapshot
$35 = (Snapshot) 0xf9be60 <CurrentSnapshotData>
(gdb) p *snapshot
$36 = {satisfies = 0xa9310d <HeapTupleSatisfiesMVCC>, xmin = 2354, xmax = 2358, xip = 0x24c7e40,
  xcnt = 1, subxip = 0x251dfa0, subxcnt = 0, suboverflowed = false, takenDuringRecovery = false,
  copied = false, curcid = 0, speculativeToken = 0, active_count = 0, regd_count = 0,
  ph_node = {first_child = 0x0, next_sibling = 0x0, prev_or_parent = 0x0}, whenTaken = 0, lsn = 0}
(gdb)
Note: snapshot->satisfies was already set to HeapTupleSatisfiesMVCC when the static variable was initialized.