PostgreSQL中的Tuplesortstate數(shù)據(jù)結(jié)構(gòu)是怎樣的

發(fā)布時間：2021-11-09 11:48:37 來源：億速云閱讀：163 作者：iii 欄目：關(guān)系型數(shù)據(jù)庫
本篇內(nèi)容主要講解“PostgreSQL中的Tuplesortstate數(shù)據(jù)結(jié)構(gòu)是怎樣的”，感興趣的朋友不妨來看看。本文介紹的方法操作簡單快捷，實用性強。下面就讓小編來帶大家學(xué)習(xí)“PostgreSQL中的Tuplesortstate數(shù)據(jù)結(jié)構(gòu)是怎樣的”吧!
Tuplesortstate
Tuplesort操作的私有狀態(tài).
/*
 * Possible states of a Tuplesort object.  These denote the states that
 * persist between calls of Tuplesort routines.
 * Tuplesort對象可能的狀態(tài).
 * 這些標(biāo)示在Tuplesort例程調(diào)用之間會持久存在的狀態(tài).
 */
typedef enum
{
    //裝載元組,在內(nèi)存限制之內(nèi)
    TSS_INITIAL,                /* Loading tuples; still within memory limit */
    //裝載元組到有界堆中
    TSS_BOUNDED,                /* Loading tuples into bounded-size heap */
    //裝載元組,寫入到tape中
    TSS_BUILDRUNS,              /* Loading tuples; writing to tape */
    //完全在內(nèi)存中完成排序
    TSS_SORTEDINMEM,            /* Sort completed entirely in memory */
    //完成排序,最后在tape上執(zhí)行排序
    TSS_SORTEDONTAPE,           /* Sort completed, final run is on tape */
    //不落地執(zhí)行最后的歸并
    TSS_FINALMERGE              /* Performing final merge on-the-fly */
} TupSortStatus;
/*
 * Parameters for calculation of number of tapes to use --- see inittapes()
 * and tuplesort_merge_order().
 * 用于計算需要使用多少個tapes的參數(shù).--- 詳細參見inittapes()和tuplesort_merge_order().
 *
 * In this calculation we assume that each tape will cost us about 1 blocks
 * worth of buffer space.  This ignores the overhead of all the other data
 * structures needed for each tape, but it's probably close enough.
 * 在這個計算中,我們假定每一個tape會大概消耗緩存空間的一個block.
 * 雖然已經(jīng)忽略了所有每個tape依賴的其他數(shù)據(jù)結(jié)構(gòu),但已經(jīng)非常接近了.
 *
 * MERGE_BUFFER_SIZE is how much data we'd like to read from each input
 * tape during a preread cycle (see discussion at top of file).
 * MERGE_BUFFER_SIZE表示在每一輪讀周期中我們將要從每個輸入taple中讀取的數(shù)據(jù)大小
 */
#define MINORDER        6       /* minimum merge order */
#define MAXORDER        500     /* maximum merge order */
#define TAPE_BUFFER_OVERHEAD        BLCKSZ
#define MERGE_BUFFER_SIZE           (BLCKSZ * 32)
typedef int (*SortTupleComparator) (const SortTuple *a, const SortTuple *b,
                                    Tuplesortstate *state);
/*
 * Private state of a Tuplesort operation.
 * Tuplesort操作的私有狀態(tài).
 */
struct Tuplesortstate
{
    //狀態(tài) : 枚舉值詳見上面的信息
    TupSortStatus status;       /* enumerated value as shown above */
    //sort key中的列數(shù)
    int         nKeys;          /* number of columns in sort key */
    //調(diào)用者需要隨機訪問?
    bool        randomAccess;   /* did caller request random access? */
    //調(diào)用者是否指定了最大返回的元組的數(shù)目?
    bool        bounded;        /* did caller specify a maximum number of
                                 * tuples to return? */
    //使用有界堆,則返回T
    bool        boundUsed;      /* true if we made use of a bounded heap */
    //如為有界堆,這里存儲最大的元組個數(shù)
    int         bound;          /* if bounded, the maximum number of tuples */
    //SortTuple.tuple是否可以設(shè)置?
    bool        tuples;         /* Can SortTuple.tuple ever be set? */
    //剩余可用內(nèi)存大小(單位:字節(jié))
    int64       availMem;       /* remaining memory available, in bytes */
    //允許的內(nèi)存總大小(單位:字節(jié))
    int64       allowedMem;     /* total memory allowed, in bytes */
    //tapes個數(shù)
    int         maxTapes;       /* number of tapes (Knuth's T) */
    //tapes個數(shù) - 1
    int         tapeRange;      /* maxTapes-1 (Knuth's P) */
    //主要用于排序數(shù)據(jù)的內(nèi)存上下文
    MemoryContext sortcontext;  /* memory context holding most sort data */
    //用于元組數(shù)據(jù)的sortcontext的子上下文
    MemoryContext tuplecontext; /* sub-context of sortcontext for tuple data */
    //臨時文件中tapes的logtape.c對象
    LogicalTapeSet *tapeset;    /* logtape.c object for tapes in a temp file */
    /*
     * These function pointers decouple the routines that must know what kind
     * of tuple we are sorting from the routines that don't need to know it.
     * They are set up by the tuplesort_begin_xxx routines.
     * 這些函數(shù)指針將必須知道排序的哪種元組的例程與不需要知道它的例程解耦.
     *
     * Function to compare two tuples; result is per qsort() convention, ie:
     * <0, 0, >0 according as a<b, a=b, a>b.  The API must match
     * qsort_arg_comparator.
     * 比較兩個元組的函數(shù),結(jié)果由每個qsort()約定,比如:
     *   < 0, 0, >0代表a<b,a=b,a>b.API必須匹配qsort_arg_comparator.
     */
    SortTupleComparator comparetup;
    /*
     * Function to copy a supplied input tuple into palloc'd space and set up
     * its SortTuple representation (ie, set tuple/datum1/isnull1).  Also,
     * state->availMem must be decreased by the amount of space used for the
     * tuple copy (note the SortTuple struct itself is not counted).
     * 該函數(shù)用于拷貝一個輸入的元組到由palloc分配的內(nèi)存空間中,
     *   同時設(shè)置SortTuple數(shù)據(jù)結(jié)構(gòu)(比如設(shè)置tuple/datum1/isnull1等).
     * 同時,state->availMem必須減去用于元組拷貝的空間大小(注意:SortTuple結(jié)構(gòu)體不計算在內(nèi)).
     */
    void        (*copytup) (Tuplesortstate *state, SortTuple *stup, void *tup);
    /*
     * Function to write a stored tuple onto tape.  The representation of the
     * tuple on tape need not be the same as it is in memory; requirements on
     * the tape representation are given below.  Unless the slab allocator is
     * used, after writing the tuple, pfree() the out-of-line data (not the
     * SortTuple struct!), and increase state->availMem by the amount of
     * memory space thereby released.
     * 用于寫入元組到taple的函數(shù).
     * tape中的元組聲明不需要與內(nèi)存中的一致,tape中的聲明要求詳見下面說明.
     * 除非使用slab分配器,在寫入元組后,pfree() out-of-line的數(shù)據(jù)(不是SortTuple結(jié)構(gòu)體),
     *   同時把剛才釋放的內(nèi)存空間加到state->availMem中.
     */
    void        (*writetup) (Tuplesortstate *state, int tapenum,
                             SortTuple *stup);
    /*
     * Function to read a stored tuple from tape back into memory. 'len' is
     * the already-read length of the stored tuple.  The tuple is allocated
     * from the slab memory arena, or is palloc'd, see readtup_alloc().
     * 從tape中讀取元組到內(nèi)存中的函數(shù).
     * 'len'是已讀取的存儲元組的長度.元組在slab內(nèi)存空間/palloc中分配,詳細參考readtup_alloc()函數(shù)
     */
    void        (*readtup) (Tuplesortstate *state, SortTuple *stup,
                            int tapenum, unsigned int len);
    /*
     * This array holds the tuples now in sort memory.  If we are in state
     * INITIAL, the tuples are in no particular order; if we are in state
     * SORTEDINMEM, the tuples are in final sorted order; in states BUILDRUNS
     * and FINALMERGE, the tuples are organized in "heap" order per Algorithm
     * H.  In state SORTEDONTAPE, the array is not used.
     * 該數(shù)組保存排序內(nèi)存中的元組.當(dāng)前狀態(tài)為
     * INITIAL:元組沒有特定的順序;
     * SORTEDINMEM:元組處于最終已排序的狀態(tài);
     * BUILDRUNS/FINALMERGE:元組按算法H的'堆'順序組織.
     * SORTEDONTAPE:數(shù)組未使用.
     */
    //SortTuple結(jié)構(gòu)體數(shù)組
    SortTuple  *memtuples;      /* array of SortTuple structs */
    //當(dāng)前存在的元組數(shù)
    int         memtupcount;    /* number of tuples currently present */
    //memtuples數(shù)組的已分配的大小
    int         memtupsize;     /* allocated length of memtuples array */
    //memtuples的增長仍在進行中?
    bool        growmemtuples;  /* memtuples' growth still underway? */
    /*
     * Memory for tuples is sometimes allocated using a simple slab allocator,
     * rather than with palloc().  Currently, we switch to slab allocation
     * when we start merging.  Merging only needs to keep a small, fixed
     * number of tuples in memory at any time, so we can avoid the
     * palloc/pfree overhead by recycling a fixed number of fixed-size slots
     * to hold the tuples.
     * 有時候元組的內(nèi)存分配使用簡單的slab分配器實現(xiàn)而不是palloc().
     * 在開始歸并時,同步切換至slab分配器.歸并只需要在內(nèi)存中保持簡單/固定的元組數(shù)目,
     *   因此可以避免頻繁回收固定數(shù)目固定大小的slots(用于保存元組)而導(dǎo)致的palloc/pfree過載.
     *
     * For the slab, we use one large allocation, divided into SLAB_SLOT_SIZE
     * slots.  The allocation is sized to have one slot per tape, plus one
     * additional slot.  We need that many slots to hold all the tuples kept
     * in the heap during merge, plus the one we have last returned from the
     * sort, with tuplesort_gettuple.
     * 對于slab,使用大型分配器,拆分為SLAB_SLOT_SIZE個大小的slots.
     * 分配器的大小為每個tape一個slot,外加一個為的slot.
     * 我們需要這么多slots是因為在歸并期間需要保存所有在堆中的元組,外加tuplesort_gettuple從排序中最終返回的元組
     *
     * Initially, all the slots are kept in a linked list of free slots.  When
     * a tuple is read from a tape, it is put to the next available slot, if
     * it fits.  If the tuple is larger than SLAB_SLOT_SIZE, it is palloc'd
     * instead.
     * 一開始,所有的slots在空閑slots中以鏈表的方式保存.
     * 從tape中讀取元組時,如合適的話,元組會放到下一個可用slot中.
     * 如果元組比SLAB_SLOT_SIZE要大,改用palloc分配內(nèi)存空間.
     *
     * When we're done processing a tuple, we return the slot back to the free
     * list, or pfree() if it was palloc'd.  We know that a tuple was
     * allocated from the slab, if its pointer value is between
     * slabMemoryBegin and -End.
     * 如果已完成元組的處理,返回slot到空閑鏈表中,如果使用palloc分配則使用pfree回收空間.
     * 如果元組指針值在slabMemoryBegin和slabMemoryEnd之間,那么我們可以知道元組是從slab中分配的.
     *
     * When the slab allocator is used, the USEMEM/LACKMEM mechanism of
     * tracking memory usage is not used.
     * 如使用了slab分配器,不會使用USEMEM/LACKMEM機制跟蹤內(nèi)存使用.
     */
    bool        slabAllocatorUsed;
    //slab內(nèi)存空間的起始位置
    char       *slabMemoryBegin;    /* beginning of slab memory arena */
    //slab內(nèi)存空間的結(jié)束位置
    char       *slabMemoryEnd;  /* end of slab memory arena */
    //鏈表頭
    SlabSlot   *slabFreeHead;   /* head of free list */
    /* Buffer size to use for reading input tapes, during merge. */
    //在歸并期間用于讀取輸入tapes的緩存大小
    size_t      read_buffer_size;
    /*
     * When we return a tuple to the caller in tuplesort_gettuple_XXX, that
     * came from a tape (that is, in TSS_SORTEDONTAPE or TSS_FINALMERGE
     * modes), we remember the tuple in 'lastReturnedTuple', so that we can
     * recycle the memory on next gettuple call.
     * 通過tuplesort_gettuple_XXX方法調(diào)用返回元組給調(diào)用者時,元組從tape中獲取
     * (也就是說,TSS_SORTEDONTAPE/TSS_FINALMERGE模式),這時候會把元組放在'lastReturnedTuple'中,
     *   因此可在下次gettuple調(diào)用中回收內(nèi)存.
     */
    void       *lastReturnedTuple;
    /*
     * While building initial runs, this is the current output run number.
     * Afterwards, it is the number of initial runs we made.
     * 在構(gòu)建initial運行時,這是當(dāng)前輸出的run編號.
     * 后續(xù)這是我們構(gòu)建好的initial runs的編號.
     */
    int         currentRun;
    /*
     * Unless otherwise noted, all pointer variables below are pointers to
     * arrays of length maxTapes, holding per-tape data.
     * 除非特別注意,下面所有的指針變量是指向長度為maxTapes的數(shù)組,保存per-tape數(shù)據(jù).
     */
    /*
     * This variable is only used during merge passes.  mergeactive[i] is true
     * if we are reading an input run from (actual) tape number i and have not
     * yet exhausted that run.
     * 該變量在歸并過程中使用.
     * mergeactive[i]為T如果我們從編號為i的tape中讀取數(shù)據(jù),仍未在該run中消耗完畢.
     */
    //活躍的輸入run源?
    bool       *mergeactive;    /* active input run source? */
    /*
     * Variables for Algorithm D.  Note that destTape is a "logical" tape
     * number, ie, an index into the tp_xxx[] arrays.  Be careful to keep
     * "logical" and "actual" tape numbers straight!
     * 用于算法D的變量.
     * 注意destTape是一個邏輯tape編號,例如,是指向tp_xxx[]數(shù)組的索引.
     * 注意保持"邏輯"和"實際"tape編號的連續(xù)性.
     */
    //Knuth's l
    int         Level;          /* Knuth's l */
    //當(dāng)前輸出tape(Knuth's j)
    int         destTape;       /* current output tape (Knuth's j, less 1) */
    //目標(biāo)斐波拉契run計數(shù)(A[])
    int        *tp_fib;         /* Target Fibonacci run counts (A[]) */
    //每一個tape上真正runs的編號
    int        *tp_runs;        /* # of real runs on each tape */
    //每一個tape(D[])上虛擬runs的編號
    int        *tp_dummy;       /* # of dummy runs for each tape (D[]) */
    //實際的tape編號(TAPE[])
    int        *tp_tapenum;     /* Actual tape numbers (TAPE[]) */
    //歸并輪中的活動輸入tapes編號
    int         activeTapes;    /* # of active input tapes in merge pass */
    /*
     * These variables are used after completion of sorting to keep track of
     * the next tuple to return.  (In the tape case, the tape's current read
     * position is also critical state.)
     * 這些變量用于在排序完成后保持下一個返回元組時的跟蹤.
     * (在tape情況下,tape的當(dāng)前讀取位置也是重要的狀態(tài))
     */
    //已完成輸出的實際tape編號
    int         result_tape;    /* actual tape number of finished output */
    //數(shù)組編號(僅用于SORTEDINMEM)
    int         current;        /* array index (only used if SORTEDINMEM) */
    //是否到達EOF(用于游標(biāo))
    bool        eof_reached;    /* reached EOF (needed for cursors) */
    /* markpos_xxx holds marked position for mark and restore */
    //markpos_xxx保持已標(biāo)記的位置,用于標(biāo)記和存儲
    //tape block編號(只用于SORTEDONTAPE)
    long        markpos_block;  /* tape block# (only used if SORTEDONTAPE) */
    //存儲的"current",或者tape塊中的偏移
    int         markpos_offset; /* saved "current", or offset in tape block */
    //存儲的eof_reached
    bool        markpos_eof;    /* saved "eof_reached" */
    /*
     * These variables are used during parallel sorting.
     * 這些變量用于并行排序.
     *
     * worker is our worker identifier.  Follows the general convention that
     * -1 value relates to a leader tuplesort, and values >= 0 worker
     * tuplesorts. (-1 can also be a serial tuplesort.)
     * worker是worker標(biāo)識符ID.
     * 遵循一般約定，-1值與leader tuplesort相關(guān)，并且值>= 0表示worker tuplesorts。
     * (在串行tuplesort時,-1也可以表示這種情況)
     *
     * shared is mutable shared memory state, which is used to coordinate
     * parallel sorts.
     * shared是可變的共享內(nèi)存狀態(tài),用于協(xié)調(diào)并行排序.
     *
     * nParticipants is the number of worker Tuplesortstates known by the
     * leader to have actually been launched, which implies that they must
     * finish a run leader can merge.  Typically includes a worker state held
     * by the leader process itself.  Set in the leader Tuplesortstate only.
     * nParticipants是已知的worker Tuplesortstates的數(shù)目,這些worker由leader感知,是實際啟動的worker數(shù),
     *   這也意味著在leader可以歸并前這些worker必須完成.
     * 典型的,leader進行自身包含至少一個worker狀態(tài).
     * 只在leader的Tuplesortstate中設(shè)置.
     */
    int         worker;
    Sharedsort *shared;
    int         nParticipants;
    /*
     * The sortKeys variable is used by every case other than the hash index
     * case; it is set by tuplesort_begin_xxx.  tupDesc is only used by the
     * MinimalTuple and CLUSTER routines, though.
     * sortKeys變量用于every而不是hash index,通過tuplesort_begin_xxx設(shè)置.
     * tupDesc只由MinimalTuple和CLUSTER例程使用.
     */
    TupleDesc   tupDesc;
    //長度nKeys數(shù)組
    SortSupport sortKeys;       /* array of length nKeys */
    /*
     * This variable is shared by the single-key MinimalTuple case and the
     * Datum case (which both use qsort_ssup()).  Otherwise it's NULL.
     * 該變量在單鍵MinimalTuple和Datum情況下(使用qsort_ssup()函數(shù))共享使用,否則的話值為NULL.
     */
    SortSupport onlyKey;
    /*
     * Additional state for managing "abbreviated key" sortsupport routines
     * (which currently may be used by all cases except the hash index case).
     * Tracks the intervals at which the optimization's effectiveness is
     * tested.
     * 管理"縮寫鍵"sortsupport過程的額外狀態(tài).
     * (除了hash index外會被其他情況使用)
     * 跟蹤在優(yōu)化器有效性測試時時間間隔. 
     */
    int64       abbrevNext;     /* Tuple # at which to next check
                                 * applicability */
    /*
     * These variables are specific to the CLUSTER case; they are set by
     * tuplesort_begin_cluster.
     * 這些變量僅在CLUSTER時生效,通過tuplesort_begin_cluster設(shè)置.
     */
    //將用于依賴的索引信息
    IndexInfo  *indexInfo;      /* info about index being used for reference */
    //解析索引表達式的運行期狀態(tài)
    EState     *estate;         /* for evaluating index expressions */
    /*
     * These variables are specific to the IndexTuple case; they are set by
     * tuplesort_begin_index_xxx and used only by the IndexTuple routines.
     * 這些變量僅用于IndexTuple.
     * 通過tuplesort_begin_index_xxx設(shè)置,僅用于IndexTuple例程.
     */
    //數(shù)據(jù)表
    Relation    heapRel;        /* table the index is being built on */
    //正在創(chuàng)建的index
    Relation    indexRel;       /* index being built */
    /* These are specific to the index_btree subcase: */
    //這些僅在index_btree下使用
    //如發(fā)現(xiàn)重復(fù)元組,則提示
    bool        enforceUnique;  /* complain if we find duplicate tuples */
    /* These are specific to the index_hash subcase: */
    //index_hash情況
    uint32      high_mask;      /* masks for sortable part of hash code */
    uint32      low_mask;
    uint32      max_buckets;
    /*
     * These variables are specific to the Datum case; they are set by
     * tuplesort_begin_datum and used only by the DatumTuple routines.
     * 這些變量用于Datum,通過tuplesort_begin_datum設(shè)置,僅用于DatumTuple例程.
     */
    Oid         datumType;
    /* we need typelen in order to know how to copy the Datums. */
    //需要typelen用于知道如何拷貝Datums.
    int         datumTypeLen;
    /*
     * Resource snapshot for time of sort start.
     * 在排序開始時的資源快照
     */
#ifdef TRACE_SORT
    PGRUsage    ru_start;
#endif
};
到此，相信大家對“PostgreSQL中的Tuplesortstate數(shù)據(jù)結(jié)構(gòu)是怎樣的”有了更深的了解，不妨來實際操作一番吧！這里是億速云網(wǎng)站，更多相關(guān)內(nèi)容可以進入相關(guān)頻道進行查詢，關(guān)注我們，繼續(xù)學(xué)習(xí)！
向AI問一下細節(jié)
PostgreSQL中的Tuplesortstate數(shù)據(jù)結(jié)構(gòu)是怎樣的

猜你喜歡

最新資訊

相關(guān)推薦

相關(guān)標(biāo)簽