This article explains what the StrategyGetBuffer function does in PostgreSQL. It reviews the data structures the function relies on, walks through its source code, and traces one call in gdb.
BufferDesc
Shared descriptor (state) data for a single shared buffer.
/*
 * Flags for buffer descriptors
 *
 * Note: TAG_VALID essentially means that there is a buffer hashtable
 * entry associated with the buffer's tag.
 */
#define BM_LOCKED               (1U << 22)  /* buffer header is locked */
#define BM_DIRTY                (1U << 23)  /* data needs writing */
#define BM_VALID                (1U << 24)  /* data is valid */
#define BM_TAG_VALID            (1U << 25)  /* tag is assigned */
#define BM_IO_IN_PROGRESS       (1U << 26)  /* read or write in progress */
#define BM_IO_ERROR             (1U << 27)  /* previous I/O failed */
#define BM_JUST_DIRTIED         (1U << 28)  /* dirtied since write started */
#define BM_PIN_COUNT_WAITER     (1U << 29)  /* have waiter for sole pin */
#define BM_CHECKPOINT_NEEDED    (1U << 30)  /* must write for checkpoint */
#define BM_PERMANENT            (1U << 31)  /* permanent buffer (not unlogged,
                                             * or init fork) */

/*
 * BufferDesc -- shared descriptor/state data for a single shared buffer.
 *
 * Note: Buffer header lock (BM_LOCKED flag) must be held to examine or change
 * the tag, state or wait_backend_pid fields.  In general, buffer header lock
 * is a spinlock which is combined with flags, refcount and usagecount into
 * single atomic variable.  This layout allow us to do some operations in a
 * single atomic operation, without actually acquiring and releasing spinlock;
 * for instance, increase or decrease refcount.  buf_id field never changes
 * after initialization, so does not need locking.  freeNext is protected by
 * the buffer_strategy_lock not buffer header lock.  The LWLock can take care
 * of itself.  The buffer header lock is *not* used to control access to the
 * data in the buffer!
 *
 * It's assumed that nobody changes the state field while buffer header lock
 * is held.  Thus buffer header lock holder can do complex updates of the
 * state variable in single write, simultaneously with lock release (cleaning
 * BM_LOCKED flag).  On the other hand, updating of state without holding
 * buffer header lock is restricted to CAS, which insure that BM_LOCKED flag
 * is not set.  Atomic increment/decrement, OR/AND etc. are not allowed.
 *
 * An exception is that if we have the buffer pinned, its tag can't change
 * underneath us, so we can examine the tag without locking the buffer header.
 * Also, in places we do one-time reads of the flags without bothering to
 * lock the buffer header; this is generally for situations where we don't
 * expect the flag bit being tested to be changing.
 *
 * We can't physically remove items from a disk page if another backend has
 * the buffer pinned.  Hence, a backend may need to wait for all other pins
 * to go away.  This is signaled by storing its own PID into
 * wait_backend_pid and setting flag bit BM_PIN_COUNT_WAITER.  At present,
 * there can be only one such waiter per buffer.
 *
 * We use this same struct for local buffer headers, but the locks are not
 * used and not all of the flag bits are useful either.  To avoid unnecessary
 * overhead, manipulations of the state field should be done without actual
 * atomic operations (i.e. only pg_atomic_read_u32() and
 * pg_atomic_unlocked_write_u32()).
 *
 * Be careful to avoid increasing the size of the struct when adding or
 * reordering members.  Keeping it below 64 bytes (the most common CPU
 * cache line size) is fairly important for performance.
 */
typedef struct BufferDesc
{
    BufferTag   tag;            /* ID of page contained in buffer */
    int         buf_id;         /* buffer's index number (from 0) */

    /* state of the tag, containing flags, refcount and usagecount */
    pg_atomic_uint32 state;

    int         wait_backend_pid;   /* backend PID of pin-count waiter */
    int         freeNext;       /* link in freelist chain */

    LWLock      content_lock;   /* to lock access to buffer contents */
} BufferDesc;
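The rule that lock-free updates of the state word must go through CAS can be illustrated with a small sketch. It is not PostgreSQL code: it uses C11 atomics instead of the pg_atomic API, the names demo_try_pin, DEMO_BM_LOCKED and DEMO_REFCOUNT_ONE are hypothetical, and it assumes the refcount sits in the low bits of the word so that adding 1 increments it. Where the real code waits for the header lock to be released, this sketch simply gives up.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BM_LOCKED    (1U << 22)  /* same bit as BM_LOCKED in the listing */
#define DEMO_REFCOUNT_ONE 1U          /* assumption: refcount is in the low bits */

/*
 * Try to increment the refcount without taking the header lock.
 * Returns false if the header is currently locked (a sketch-only
 * simplification), true once the CAS has succeeded.
 */
bool demo_try_pin(_Atomic uint32_t *state)
{
    assert(state != NULL);

    uint32_t old = atomic_load(state);

    for (;;)
    {
        /* Never modify the word while someone holds the header lock. */
        if (old & DEMO_BM_LOCKED)
            return false;

        /* The CAS only succeeds if the word is still exactly 'old',
         * which in particular means BM_LOCKED is still clear. */
        if (atomic_compare_exchange_weak(state, &old, old + DEMO_REFCOUNT_ONE))
            return true;

        /* On failure 'old' was refreshed with the current value; retry. */
    }
}
```

A plain atomic fetch-add would not be safe here, because it could change the word while another process holds the header lock and is about to overwrite the whole word on release.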
BufferTag
A buffer tag identifies which disk block the buffer contains.
/*
 * Buffer tag identifies which disk block the buffer contains.
 *
 * Note: the BufferTag data must be sufficient to determine where to write the
 * block, without reference to pg_class or pg_tablespace entries.  It's
 * possible that the backend flushing the buffer doesn't even believe the
 * relation is visible yet (its xact may have started before the xact that
 * created the rel).  The storage manager must be able to cope anyway.
 *
 * Note: if there's any pad bytes in the struct, INIT_BUFFERTAG will have
 * to be fixed to zero them, since this struct is used as a hash key.
 */
typedef struct buftag
{
    RelFileNode rnode;          /* physical relation identifier */
    ForkNumber  forkNum;
    BlockNumber blockNum;       /* blknum relative to begin of reln */
} BufferTag;
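The remark about pad bytes matters because a hash key is compared and hashed as raw bytes. The sketch below shows the idea with a hypothetical DemoTag struct that, unlike BufferTag, deliberately contains padding on typical ABIs; demo_tag_init and demo_tag_equal are illustrative names, not PostgreSQL functions. It relies on the usual compiler behavior that storing to a member leaves the padding bytes as the memset wrote them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key type; the int8_t member is normally followed by padding. */
typedef struct DemoTag
{
    uint32_t rel;
    int8_t   fork;
    uint32_t block;
} DemoTag;

/* Zero the whole struct first so padding bytes have a known value. */
void demo_tag_init(DemoTag *tag, uint32_t rel, int8_t fork, uint32_t block)
{
    assert(tag != NULL);
    memset(tag, 0, sizeof(*tag));
    tag->rel = rel;
    tag->fork = fork;
    tag->block = block;
}

/* Byte-wise comparison, as a hash table keyed on raw bytes would do. */
bool demo_tag_equal(const DemoTag *a, const DemoTag *b)
{
    return memcmp(a, b, sizeof(DemoTag)) == 0;
}
```

Without the memset, two tags with identical fields could still differ in their padding bytes and land in different hash buckets.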
SMgrRelation
smgr.c maintains a hash table of SMgrRelation objects, which are essentially cached file handles.
/*
 * smgr.c maintains a table of SMgrRelation objects, which are essentially
 * cached file handles.  An SMgrRelation is created (if not already present)
 * by smgropen(), and destroyed by smgrclose().  Note that neither of these
 * operations imply I/O, they just create or destroy a hashtable entry.
 * (But smgrclose() may release associated resources, such as OS-level file
 * descriptors.)
 *
 * An SMgrRelation may have an "owner", which is just a pointer to it from
 * somewhere else; smgr.c will clear this pointer if the SMgrRelation is
 * closed.  We use this to avoid dangling pointers from relcache to smgr
 * without having to make the smgr explicitly aware of relcache.  There
 * can't be more than one "owner" pointer per SMgrRelation, but that's
 * all we need.
 *
 * SMgrRelations that do not have an "owner" are considered to be transient,
 * and are deleted at end of transaction.
 */
typedef struct SMgrRelationData
{
    /* rnode is the hashtable lookup key, so it must be first! */
    RelFileNodeBackend smgr_rnode;  /* relation physical identifier */

    /* pointer to owning pointer, or NULL if none */
    struct SMgrRelationData **smgr_owner;

    /*
     * These next three fields are not actually used or manipulated by smgr,
     * except that they are reset to InvalidBlockNumber upon a cache flush
     * event (in particular, upon truncation of the relation).  Higher levels
     * store cached state here so that it will be reset when truncation
     * happens.  In all three cases, InvalidBlockNumber means "unknown".
     */
    BlockNumber smgr_targblock;     /* current insertion target block */
    BlockNumber smgr_fsm_nblocks;   /* last known size of fsm fork */
    BlockNumber smgr_vm_nblocks;    /* last known size of vm fork */

    /* additional public fields may someday exist here */

    /*
     * Fields below here are intended to be private to smgr.c and its
     * submodules.  Do not touch them from elsewhere.
     */
    int         smgr_which;     /* storage manager selector */

    /*
     * for md.c; per-fork arrays of the number of open segments
     * (md_num_open_segs) and the segments themselves (md_seg_fds).
     */
    int         md_num_open_segs[MAX_FORKNUM + 1];
    struct _MdfdVec *md_seg_fds[MAX_FORKNUM + 1];

    /* if unowned, list link in list of all unowned SMgrRelations */
    struct SMgrRelationData *next_unowned_reln;
} SMgrRelationData;

typedef SMgrRelationData *SMgrRelation;
RelFileNodeBackend
Combining a relfilenode with a backend ID provides all the information needed to locate the physical storage.
/*
 * Augmenting a relfilenode with the backend ID provides all the information
 * we need to locate the physical storage.  The backend ID is InvalidBackendId
 * for regular relations (those accessible to more than one backend), or the
 * owning backend's ID for backend-local relations.  Backend-local relations
 * are always transient and removed in case of a database crash; they are
 * never WAL-logged or fsync'd.
 */
typedef struct RelFileNodeBackend
{
    RelFileNode node;           // relation file node
    BackendId   backend;        // backend ID
} RelFileNodeBackend;
StrategyControl
Shared freelist control information.
/*
 * The shared freelist control information.
 */
typedef struct
{
    /* Spinlock: protects the values below */
    slock_t     buffer_strategy_lock;

    /*
     * Clock sweep hand: index of next buffer to consider grabbing. Note that
     * this isn't a concrete buffer - we only ever increase the value. So, to
     * get an actual buffer, it needs to be used modulo NBuffers.
     */
    pg_atomic_uint32 nextVictimBuffer;

    int         firstFreeBuffer;    /* Head of list of unused buffers */
    int         lastFreeBuffer;     /* Tail of list of unused buffers */

    /*
     * NOTE: lastFreeBuffer is undefined when firstFreeBuffer is -1 (that is,
     * when the list is empty)
     */

    /*
     * Statistics.  These counters should be wide enough that they can't
     * overflow during a single bgwriter cycle.
     */
    uint32      completePasses;     /* Complete cycles of the clock sweep */
    pg_atomic_uint32 numBufferAllocs;   /* Buffers allocated since last reset */

    /*
     * Bgworker process to be notified upon activity or -1 if none. See
     * StrategyNotifyBgWriter.
     */
    int         bgwprocno;
} BufferStrategyControl;

/* Pointers to shared state */
static BufferStrategyControl *StrategyControl = NULL;
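The comment on nextVictimBuffer says the hand only ever increases and must be taken modulo NBuffers. A minimal sketch of that idea follows, using C11 atomics rather than the pg_atomic API. The name demo_clock_tick is hypothetical, and the sketch leaves out the wraparound handling and the completePasses bookkeeping that a real implementation needs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Advance the shared clock hand by one and return the index of the buffer
 * it pointed at. The counter itself is never reduced; only the returned
 * index is wrapped into the range [0, nbuffers).
 */
uint32_t demo_clock_tick(atomic_uint *hand, uint32_t nbuffers)
{
    assert(nbuffers > 0);

    /* fetch-add returns the value before the increment */
    unsigned int victim = atomic_fetch_add(hand, 1);

    return (uint32_t) (victim % nbuffers);
}
```

Because the increment is a single atomic operation, several backends can advance the hand concurrently without taking buffer_strategy_lock, and each one gets a distinct counter value.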
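StrategyGetBuffer is called by bufmgr from BufferAlloc() to get the next candidate buffer.
Its main logic is as follows:
1. Initialize local variables.
2. If a strategy object is supplied, try to take a buffer from its ring; on success, return that buffer.
3. If requested, wake the bgwriter: read bgwprocno from shared memory exactly once, then set the latch based on that value.
4. Count the buffer allocation request so that the bgwriter can estimate the rate of buffer consumption.
5. Check whether the freelist contains a buffer.
5.1 If it does, pop buffers from the freelist until one is unpinned and has a zero usage count, and return that buffer.
5.2 If it does not (or no freelist buffer was usable):
5.2.1 Run the clock sweep algorithm: skip pinned buffers, decrement nonzero usage counts, and return the first unpinned buffer whose usage count is zero.
5.2.2 If trycounter (initially NBuffers) consecutive buffers are pinned with no usage count decremented in between, raise an error.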
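The clock sweep in step 5.2 can be sketched in isolation. The following is a single-threaded simulation under simplifying assumptions: no header locks, no atomics, and a plain array of hypothetical DemoBuf entries instead of BufferDesc. The name demo_clock_sweep and the -1 return value (standing in for the error) are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the refcount/usagecount part of a buffer header. */
typedef struct DemoBuf
{
    int refcount;    /* number of pins */
    int usagecount;  /* recent-use counter */
} DemoBuf;

/*
 * Return the index of the first unpinned buffer with usagecount 0,
 * decrementing nonzero usage counts along the way. Returns -1 if
 * nbuffers consecutive buffers were pinned with no state change.
 */
int demo_clock_sweep(DemoBuf *bufs, int nbuffers, unsigned int *hand)
{
    assert(bufs != NULL && hand != NULL && nbuffers > 0);

    int trycounter = nbuffers;

    for (;;)
    {
        /* take the buffer under the hand, then advance the hand */
        DemoBuf *buf = &bufs[(*hand)++ % (unsigned int) nbuffers];

        if (buf->refcount == 0)
        {
            if (buf->usagecount != 0)
            {
                /* recently used: give it another lap, and note progress */
                buf->usagecount--;
                trycounter = nbuffers;
            }
            else
                return (int) (buf - bufs);  /* usable victim */
        }
        else if (--trycounter == 0)
            return -1;  /* everything looked pinned */
    }
}
```

Resetting trycounter on every decrement guarantees termination: each lap either lowers some usage count, which can happen only finitely often, or finds every buffer pinned and gives up.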
/*
 * StrategyGetBuffer
 *
 *  Called by the bufmgr to get the next candidate buffer to use in
 *  BufferAlloc(). The only hard requirement BufferAlloc() has is that
 *  the selected buffer must not currently be pinned by anyone.
 *
 *  strategy is a BufferAccessStrategy object, or NULL for default strategy.
 *
 *  To ensure that no one else can pin the buffer before we do, we must
 *  return the buffer with the buffer header spinlock still held.
 */
BufferDesc *
StrategyGetBuffer(BufferAccessStrategy strategy, uint32 *buf_state)
{
    BufferDesc *buf;            // buffer descriptor
    int         bgwprocno;
    int         trycounter;     // remaining attempts
    uint32      local_buf_state;    /* to avoid repeated (de-)referencing */

    /*
     * If given a strategy object, see whether it can select a buffer. We
     * assume strategy objects don't need buffer_strategy_lock.
     */
    if (strategy != NULL)
    {
        // take a buffer from the ring; return it if that succeeds
        buf = GetBufferFromRing(strategy, buf_state);
        if (buf != NULL)
            return buf;
    }

    /*
     * If asked, we need to waken the bgwriter. Since we don't want to rely on
     * a spinlock for this we force a read from shared memory once, and then
     * set the latch based on that value. We need to go through that length
     * because otherwise bgprocno might be reset while/after we check because
     * the compiler might just reread from memory.
     *
     * This can possibly set the latch of the wrong process if the bgwriter
     * dies in the wrong moment. But since PGPROC->procLatch is never
     * deallocated the worst consequence of that is that we set the latch of
     * some arbitrary process.
     */
    bgwprocno = INT_ACCESS_ONCE(StrategyControl->bgwprocno);
    if (bgwprocno != -1)
    {
        /* reset bgwprocno first, before setting the latch */
        StrategyControl->bgwprocno = -1;

        /*
         * Not acquiring ProcArrayLock here which is slightly icky. It's
         * actually fine because procLatch isn't ever freed, so we just can
         * potentially set the wrong process' (or no process') latch.
         */
        SetLatch(&ProcGlobal->allProcs[bgwprocno].procLatch);
    }

    /*
     * We count buffer allocation requests so that the bgwriter can estimate
     * the rate of buffer consumption.  Note that buffers recycled by a
     * strategy object are intentionally not counted here.
     */
    pg_atomic_fetch_add_u32(&StrategyControl->numBufferAllocs, 1);

    /*
     * First check, without acquiring the lock, whether there's buffers in the
     * freelist. Since we otherwise don't require the spinlock in every
     * StrategyGetBuffer() invocation, it'd be sad to acquire it here -
     * uselessly in most cases. That obviously leaves a race where a buffer is
     * put on the freelist but we don't see the store yet - but that's pretty
     * harmless, it'll just get used during the next buffer acquisition.
     *
     * If there's buffers on the freelist, acquire the spinlock to pop one
     * buffer of the freelist. Then check whether that buffer is usable and
     * repeat if not.
     *
     * Note that the freeNext fields are considered to be protected by the
     * buffer_strategy_lock not the individual buffer spinlocks, so it's OK to
     * manipulate them without holding the spinlock.
     */
    if (StrategyControl->firstFreeBuffer >= 0)
    {
        while (true)
        {
            /* Acquire the spinlock to remove element from the freelist */
            SpinLockAcquire(&StrategyControl->buffer_strategy_lock);

            if (StrategyControl->firstFreeBuffer < 0)
            {
                // freelist is empty: leave the loop
                SpinLockRelease(&StrategyControl->buffer_strategy_lock);
                break;
            }

            buf = GetBufferDescriptor(StrategyControl->firstFreeBuffer);
            Assert(buf->freeNext != FREENEXT_NOT_IN_LIST);

            /* Unconditionally remove buffer from freelist */
            StrategyControl->firstFreeBuffer = buf->freeNext;
            buf->freeNext = FREENEXT_NOT_IN_LIST;

            /*
             * Release the lock so someone else can access the freelist while
             * we check out this buffer.
             */
            SpinLockRelease(&StrategyControl->buffer_strategy_lock);

            /*
             * If the buffer is pinned or has a nonzero usage_count, we cannot
             * use it; discard it and retry.  (This can only happen if VACUUM
             * put a valid buffer in the freelist and then someone else used
             * it before we got to it.  It's probably impossible altogether as
             * of 8.3, but we'd better check anyway.)
             */
            local_buf_state = LockBufHdr(buf);
            if (BUF_STATE_GET_REFCOUNT(local_buf_state) == 0
                && BUF_STATE_GET_USAGECOUNT(local_buf_state) == 0)
            {
                // non-default strategy: remember the buffer in the ring
                if (strategy != NULL)
                    AddBufferToRing(strategy, buf);
                // set the output parameter; header lock is still held
                *buf_state = local_buf_state;
                return buf;
            }
            // not usable: unlock the buffer header and try the next one
            UnlockBufHdr(buf, local_buf_state);
        }
    }

    /* Nothing on the freelist, so run the "clock sweep" algorithm */
    trycounter = NBuffers;
    for (;;)
    {
        buf = GetBufferDescriptor(ClockSweepTick());

        /*
         * If the buffer is pinned or has a nonzero usage_count, we cannot use
         * it; decrement the usage_count (unless pinned) and keep scanning.
         */
        local_buf_state = LockBufHdr(buf);

        if (BUF_STATE_GET_REFCOUNT(local_buf_state) == 0)
        {
            if (BUF_STATE_GET_USAGECOUNT(local_buf_state) != 0)
            {
                // usage_count > 0: decrement it and reset the attempt counter
                local_buf_state -= BUF_USAGECOUNT_ONE;

                trycounter = NBuffers;
            }
            else
            {
                /* Found a usable buffer */
                if (strategy != NULL)
                    AddBufferToRing(strategy, buf);
                *buf_state = local_buf_state;
                return buf;
            }
        }
        else if (--trycounter == 0)
        {
            /*
             * We've scanned all the buffers without making any state changes,
             * so all the buffers are pinned (or were when we looked at them).
             * We could hope that someone will free one eventually, but it's
             * probably better to fail than to risk getting stuck in an
             * infinite loop.
             */
            UnlockBufHdr(buf, local_buf_state);
            elog(ERROR, "no unpinned buffers available");
        }
        // unlock the buffer header (this also writes back the new state)
        UnlockBufHdr(buf, local_buf_state);
    }
}
Test script: query a table.
10:01:54 (xdb@[local]:5432)testdb=# select * from t1 limit 10;
Start gdb and set a breakpoint.
(gdb) Continuing.
Breakpoint 1, StrategyGetBuffer (strategy=0x0, buf_state=0x7ffcc97fb4ec) at freelist.c:212
212     if (strategy != NULL)
(gdb)
Input parameters
strategy=NULL: no strategy object is passed, so the default strategy is used.
(gdb) p *buf_state
$1 = 0
1. Initialize local variables.
2. If a strategy object is supplied, try to take a buffer from its ring and return it on success (skipped here because strategy is NULL).
3. If requested, wake the bgwriter: read bgwprocno from shared memory exactly once, then set the latch based on that value.
(gdb) n
231     bgwprocno = INT_ACCESS_ONCE(StrategyControl->bgwprocno);
(gdb)
232     if (bgwprocno != -1)
(gdb)
235         StrategyControl->bgwprocno = -1;
(gdb) p bgwprocno
$2 = 112
(gdb) p StrategyControl
$3 = (BufferStrategyControl *) 0x7f8607b21700
(gdb) p *StrategyControl
$4 = {buffer_strategy_lock = 0 '\000', nextVictimBuffer = {value = 0}, firstFreeBuffer = 134, lastFreeBuffer = 65535, completePasses = 0, numBufferAllocs = {value = 0}, bgwprocno = 112}
(gdb) n
242         SetLatch(&ProcGlobal->allProcs[bgwprocno].procLatch);
(gdb)
4. Count the buffer allocation request so that the bgwriter can estimate the rate of buffer consumption.
(gdb)
250     pg_atomic_fetch_add_u32(&StrategyControl->numBufferAllocs, 1);
5. Check whether the freelist contains a buffer.
(gdb)
268     if (StrategyControl->firstFreeBuffer >= 0)
5.1 The freelist is not empty, so pop a buffer from it, check that it is unpinned with a zero usage count, and return it.
(gdb) n
273             SpinLockAcquire(&StrategyControl->buffer_strategy_lock);
(gdb)
275             if (StrategyControl->firstFreeBuffer < 0)
(gdb)
281             buf = GetBufferDescriptor(StrategyControl->firstFreeBuffer);
(gdb)
282             Assert(buf->freeNext != FREENEXT_NOT_IN_LIST);
(gdb) p *buf
$5 = {tag = {rnode = {spcNode = 0, dbNode = 0, relNode = 0}, forkNum = InvalidForkNumber, blockNum = 4294967295}, buf_id = 134, state = {value = 0}, wait_backend_pid = 0, freeNext = 135, content_lock = {tranche = 54, state = {value = 536870912}, waiters = {head = 2147483647, tail = 2147483647}}}
(gdb) n
285             StrategyControl->firstFreeBuffer = buf->freeNext;
(gdb)
286             buf->freeNext = FREENEXT_NOT_IN_LIST;
(gdb)
292             SpinLockRelease(&StrategyControl->buffer_strategy_lock);
(gdb)
301             local_buf_state = LockBufHdr(buf);
(gdb)
302             if (BUF_STATE_GET_REFCOUNT(local_buf_state) == 0
(gdb)
303                 && BUF_STATE_GET_USAGECOUNT(local_buf_state) == 0)
(gdb)
305                 if (strategy != NULL)
(gdb)
307                 *buf_state = local_buf_state;
(gdb)
308                 return buf;
(gdb) p *buf_state
$6 = 4194304
(gdb) p *buf
$7 = {tag = {rnode = {spcNode = 0, dbNode = 0, relNode = 0}, forkNum = InvalidForkNumber, blockNum = 4294967295}, buf_id = 134, state = {value = 4194304}, wait_backend_pid = 0, freeNext = -2, content_lock = {tranche = 54, state = {value = 536870912}, waiters = {head = 2147483647, tail = 2147483647}}}
(gdb)
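The value 4194304 printed for *buf_state equals 1 << 22, which is BM_LOCKED: the buffer comes back with its header spinlock still held, as the function's header comment requires. The sketch below decodes such a state word. It assumes a layout of 18 refcount bits, 4 usage count bits and 10 flag bits, which is consistent with the flags starting at bit 22 but is not shown in the listings above; the DEMO_ names and demo_decode_state are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: bits 0-17 refcount, bits 18-21 usage count, bits 22-31 flags. */
#define DEMO_REFCOUNT_MASK    ((1U << 18) - 1)
#define DEMO_USAGECOUNT_MASK  0x003C0000U
#define DEMO_USAGECOUNT_SHIFT 18
#define DEMO_FLAG_MASK        0xFFC00000U
#define DEMO_BM_LOCKED        (1U << 22)

typedef struct DemoBufState
{
    uint32_t refcount;
    uint32_t usagecount;
    uint32_t flags;
} DemoBufState;

/* Split a packed state word into its three logical fields. */
DemoBufState demo_decode_state(uint32_t state)
{
    /* sanity check: the three masks together cover the whole word */
    assert((DEMO_REFCOUNT_MASK | DEMO_USAGECOUNT_MASK | DEMO_FLAG_MASK)
           == 0xFFFFFFFFU);

    DemoBufState out;

    out.refcount = state & DEMO_REFCOUNT_MASK;
    out.usagecount = (state & DEMO_USAGECOUNT_MASK) >> DEMO_USAGECOUNT_SHIFT;
    out.flags = state & DEMO_FLAG_MASK;
    return out;
}
```

Under that assumed layout, decoding 4194304 gives refcount 0, usage count 0 and only the locked flag set, which matches the two conditions checked at lines 302 and 303 of the trace.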
The function returns and control goes back to BufferAlloc.
(gdb) n
358 }
(gdb)
BufferAlloc (smgr=0x22a38a0, relpersistence=112 'p', forkNum=MAIN_FORKNUM, blockNum=0, strategy=0x0, foundPtr=0x7ffcc97fb5c3) at bufmgr.c:1073
1073        Assert(BUF_STATE_GET_REFCOUNT(buf_state) == 0);
(gdb)