PostgreSQL Source Code Walkthrough (98) - Partitioned Tables #4 (Query Routing #1: "Expanding" a Partitioned Table)

Published: 2020-08-17 13:10:46  Source: ITPUB Blog  Views: 258  Author: husthxd  Category: Relational Databases

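When a partitioned table is queried, how does PG determine which partitions to scan, and what mechanism does it rely on? The next few chapters cover this step by step; this section is the first part.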

0. Implementation Mechanism

Start with the following example: two ordinary tables, t_normal_1 and t_normal_2, combined with UNION ALL:

drop table if exists t_normal_1;
drop table if exists t_normal_2;
create table t_normal_1 (c1 int not null,c2  varchar(40),c3 varchar(40));
create table t_normal_2 (c1 int not null,c2  varchar(40),c3 varchar(40));

insert into t_normal_1(c1,c2,c3) VALUES(0,'HASH0','HAHS0');
insert into t_normal_2(c1,c2,c3) VALUES(0,'HASH0','HAHS0');

testdb=# explain verbose select * from t_normal_1 where c1 = 0
testdb-# union all
testdb-# select * from t_normal_2 where c1 <> 0;
                                 QUERY PLAN                                 
----------------------------------------------------------------------------
 Append  (cost=0.00..34.00 rows=350 width=200)
   ->  Seq Scan on public.t_normal_1  (cost=0.00..14.38 rows=2 width=200)
         Output: t_normal_1.c1, t_normal_1.c2, t_normal_1.c3
         Filter: (t_normal_1.c1 = 0)
   ->  Seq Scan on public.t_normal_2  (cost=0.00..14.38 rows=348 width=200)
         Output: t_normal_2.c1, t_normal_2.c2, t_normal_2.c3
         Filter: (t_normal_2.c1 <> 0)
(7 rows)

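For a UNION ALL over two ordinary tables, PG uses the Append operator to concatenate the result set of the sequential scan on t_normal_1 with the result set of the sequential scan on t_normal_2, and outputs the combined rows as the final result.

Queries on partitioned tables use a similar mechanism: the result sets of the individual partitions are appended together and output as the final result, as the following example shows: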

testdb=# explain verbose select * from t_hash_partition where c1 = 1 OR c1 = 2;
                                     QUERY PLAN                                      
-------------------------------------------------------------------------------------
 Append  (cost=0.00..30.53 rows=6 width=200)
   ->  Seq Scan on public.t_hash_partition_1  (cost=0.00..15.25 rows=3 width=200)
         Output: t_hash_partition_1.c1, t_hash_partition_1.c2, t_hash_partition_1.c3
         Filter: ((t_hash_partition_1.c1 = 1) OR (t_hash_partition_1.c1 = 2))
   ->  Seq Scan on public.t_hash_partition_3  (cost=0.00..15.25 rows=3 width=200)
         Output: t_hash_partition_3.c1, t_hash_partition_3.c2, t_hash_partition_3.c3
         Filter: ((t_hash_partition_3.c1 = 1) OR (t_hash_partition_3.c1 = 2))
(7 rows)

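The query on the partitioned table t_hash_partition uses the condition c1 = 1 OR c1 = 2. The execution plan shows that the result set of the sequential scan on t_hash_partition_1 and the result set of the sequential scan on t_hash_partition_3 are appended together and output as the final result.

Several problems have to be solved here:
1. Identify the partitioned table and find all of its child partitions;
2. Use the constraint conditions to identify which partitions need to be queried, for performance reasons;
3. Append the result sets and output them as the final result.
This section describes how PG identifies a partitioned table and finds all of its child partitions; the function that implements this is expand_inherited_tables.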

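1. Data Structures

AppendRelInfo
Append-relation information.
When an inheritable table (such as a partitioned table) or a UNION ALL subquery is expanded into an "append relation" (essentially a list of child RTEs), an AppendRelInfo is built for each child RTE.
The list of AppendRelInfos indicates which child RTEs must be included when expanding the parent, and each node carries all the information needed to translate Vars that reference the parent into Vars that reference that child.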

/*
 * Append-relation info.
 * 
 * When we expand an inheritable table or a UNION-ALL subselect into an
 * "append relation" (essentially, a list of child RTEs), we build an
 * AppendRelInfo for each child RTE.  The list of AppendRelInfos indicates
 * which child RTEs must be included when expanding the parent, and each node
 * carries information needed to translate Vars referencing the parent into
 * Vars referencing that child.
 * 
 * These structs are kept in the PlannerInfo node's append_rel_list.
 * Note that we just throw all the structs into one list, and scan the
 * whole list when desiring to expand any one parent.  We could have used
 * a more complex data structure (eg, one list per parent), but this would
 * be harder to update during operations such as pulling up subqueries,
 * and not really any easier to scan.  Considering that typical queries
 * will not have many different append parents, it doesn't seem worthwhile
 * to complicate things.
 * 
 * Note: after completion of the planner prep phase, any given RTE is an
 * append parent having entries in append_rel_list if and only if its
 * "inh" flag is set.  We clear "inh" for plain tables that turn out not
 * to have inheritance children, and (in an abuse of the original meaning
 * of the flag) we set "inh" for subquery RTEs that turn out to be
 * flattenable UNION ALL queries.  This lets us avoid useless searches
 * of append_rel_list.
 * 
 * Note: the data structure assumes that append-rel members are single
 * baserels.  This is OK for inheritance, but it prevents us from pulling
 * up a UNION ALL member subquery if it contains a join.  While that could
 * be fixed with a more complex data structure, at present there's not much
 * point because no improvement in the plan could result.
 */

typedef struct AppendRelInfo
{
    NodeTag     type;

    /*
     * These fields uniquely identify this append relationship.  There can be
     * (in fact, always should be) multiple AppendRelInfos for the same
     * parent_relid, but never more than one per child_relid, since a given
     * RTE cannot be a child of more than one append parent.
     */
    Index       parent_relid;   /* RT index of append parent rel */
    Index       child_relid;    /* RT index of append child rel */

    /*
     * For an inheritance appendrel, the parent and child are both regular
     * relations, and we store their rowtype OIDs here for use in translating
     * whole-row Vars.  For a UNION-ALL appendrel, the parent and child are
     * both subqueries with no named rowtype, and we store InvalidOid here.
     */
    Oid         parent_reltype; /* OID of parent's composite type */
    Oid         child_reltype;  /* OID of child's composite type */

    /*
     * The N'th element of this list is a Var or expression representing the
     * child column corresponding to the N'th column of the parent. This is
     * used to translate Vars referencing the parent rel into references to
     * the child.  A list element is NULL if it corresponds to a dropped
     * column of the parent (this is only possible for inheritance cases, not
     * UNION ALL).  The list elements are always simple Vars for inheritance
     * cases, but can be arbitrary expressions in UNION ALL cases.
     *
     * Notice we only store entries for user columns (attno > 0).  Whole-row
     * Vars are special-cased, and system columns (attno < 0) need no special
     * translation since their attnos are the same for all tables.
     *
     * Caution: the Vars have varlevelsup = 0.  Be careful to adjust as needed
     * when copying into a subquery.
     */
    List       *translated_vars;    /* Expressions in the child's Vars */

    /*
     * We store the parent table's OID here for inheritance, or InvalidOid for
     * UNION ALL.  This is only needed to help in generating error messages if
     * an attempt is made to reference a dropped parent column.
     */
    Oid         parent_reloid;  /* OID of parent relation */
} AppendRelInfo;
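
To make the role of translated_vars more concrete, here is a minimal sketch of the parent-to-child column translation. It is not PostgreSQL code: ToyAppendRel and toy_translate_attno are hypothetical names, plain attribute numbers stand in for Var nodes, and 0 stands in for the NULL list entry of a dropped parent column.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for AppendRelInfo: ints replace Var nodes. */
typedef struct ToyAppendRel
{
    unsigned    parent_relid;       /* RT index of the append parent */
    unsigned    child_relid;        /* RT index of the append child */
    size_t      nattrs;             /* number of parent user columns */
    const int  *translated_attnos;  /* entry N-1: child attno for parent
                                     * attno N; 0 means dropped column */
} ToyAppendRel;

/*
 * Translate a parent attribute number into the child's attribute number.
 * System columns (attno < 0) pass through unchanged; a dropped parent
 * column yields 0.  Whole-row references (attno == 0) are not handled,
 * since the real structure special-cases them as well.
 */
static int
toy_translate_attno(const ToyAppendRel *appinfo, int parent_attno)
{
    if (parent_attno < 0)
        return parent_attno;
    assert(parent_attno >= 1 && (size_t) parent_attno <= appinfo->nattrs);
    return appinfo->translated_attnos[parent_attno - 1];
}
```

For a child whose second parent column was dropped, a map such as {1, 0, 2} sends parent column 3 to child column 2 and reports column 2 as dropped.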

PlannerInfo
This data structure stores the information associated with a query while it is being planned/optimized.

/*----------
 * PlannerInfo
 *      Per-query information for planning/optimization
 * 
 * This struct is conventionally called "root" in all the planner routines.
 * It holds links to all of the planner's working state, in addition to the
 * original Query.  Note that at present the planner extensively modifies
 * the passed-in Query data structure; someday that should stop.
 *----------
 */
struct AppendRelInfo;

typedef struct PlannerInfo
{
    NodeTag     type;
    Query      *parse;          /* the Query being planned */
    PlannerGlobal *glob;        /* global info for current planner run */
    Index       query_level;    /* 1 at the outermost Query */
    struct PlannerInfo *parent_root;    /* NULL at outermost Query */

    /*
     * plan_params contains the expressions that this query level needs to
     * make available to a lower query level that is currently being planned.
     * outer_params contains the paramIds of PARAM_EXEC Params that outer
     * query levels will make available to this query level.
     */
    List       *plan_params;    /* list of PlannerParamItems, see below */
    Bitmapset  *outer_params;

    /*
     * simple_rel_array holds pointers to "base rels" and "other rels" (see
     * comments for RelOptInfo for more info).  It is indexed by rangetable
     * index (so entry 0 is always wasted).  Entries can be NULL when an RTE
     * does not correspond to a base relation, such as a join RTE or an
     * unreferenced view RTE; or if the RelOptInfo hasn't been made yet.
     */
    struct RelOptInfo **simple_rel_array;   /* All 1-rel RelOptInfos */
    int         simple_rel_array_size;  /* allocated size of array */

    /*
     * simple_rte_array is the same length as simple_rel_array and holds
     * pointers to the associated rangetable entries.  This lets us avoid
     * rt_fetch(), which can be a bit slow once large inheritance sets have
     * been expanded.
     */
    RangeTblEntry **simple_rte_array;   /* rangetable as an array */

    /*
     * append_rel_array is the same length as the above arrays, and holds
     * pointers to the corresponding AppendRelInfo entry indexed by
     * child_relid, or NULL if none.  The array itself is not allocated if
     * append_rel_list is empty.
     */
    /* used when handling set operations such as UNION ALL and partitioned tables */
    struct AppendRelInfo **append_rel_array;

    /*
     * all_baserels is a Relids set of all base relids (but not "other"
     * relids) in the query; that is, the Relids identifier of the final join
     * we need to form.  This is computed in make_one_rel, just before we
     * start making Paths.
     */
    Relids      all_baserels;

    /*
     * nullable_baserels is a Relids set of base relids that are nullable by
     * some outer join in the jointree; these are rels that are potentially
     * nullable below the WHERE clause, SELECT targetlist, etc.  This is
     * computed in deconstruct_jointree.
     */
    Relids      nullable_baserels;

    /*
     * join_rel_list is a list of all join-relation RelOptInfos we have
     * considered in this planning run.  For small problems we just scan the
     * list to do lookups, but when there are many join relations we build a
     * hash table for faster lookups.  The hash table is present and valid
     * when join_rel_hash is not NULL.  Note that we still maintain the list
     * even when using the hash table for lookups; this simplifies life for
     * GEQO.
     */
    List       *join_rel_list;  /* list of join-relation RelOptInfos */
    struct HTAB *join_rel_hash; /* optional hashtable for join relations */

    /*
     * When doing a dynamic-programming-style join search, join_rel_level[k]
     * is a list of all join-relation RelOptInfos of level k, and
     * join_cur_level is the current level.  New join-relation RelOptInfos are
     * automatically added to the join_rel_level[join_cur_level] list.
     * join_rel_level is NULL if not in use.
     */
    List      **join_rel_level; /* lists of join-relation RelOptInfos */
    int         join_cur_level; /* index of list being extended */
    List       *init_plans;     /* init SubPlans for query */
    List       *cte_plan_ids;   /* per-CTE-item list of subplan IDs */
    List       *multiexpr_params;   /* List of Lists of Params for MULTIEXPR
                                     * subquery outputs */
    List       *eq_classes;     /* list of active EquivalenceClasses */
    List       *canon_pathkeys; /* list of "canonical" PathKeys */
    List       *left_join_clauses;  /* list of RestrictInfos for mergejoinable
                                     * outer join clauses w/nonnullable var on
                                     * left */
    List       *right_join_clauses; /* list of RestrictInfos for mergejoinable
                                     * outer join clauses w/nonnullable var on
                                     * right */
    List       *full_join_clauses;  /* list of RestrictInfos for mergejoinable
                                     * full join clauses */
    List       *join_info_list; /* list of SpecialJoinInfos */
    List       *append_rel_list;    /* list of AppendRelInfos */
    List       *rowMarks;       /* list of PlanRowMarks */
    List       *placeholder_list;   /* list of PlaceHolderInfos */
    List       *fkey_list;      /* list of ForeignKeyOptInfos */
    List       *query_pathkeys; /* desired pathkeys for query_planner() */
    List       *group_pathkeys; /* groupClause pathkeys, if any */
    List       *window_pathkeys;    /* pathkeys of bottom window, if any */
    List       *distinct_pathkeys;  /* distinctClause pathkeys, if any */
    List       *sort_pathkeys;  /* sortClause pathkeys, if any */
    List       *part_schemes;   /* Canonicalised partition schemes used in the
                                 * query. */
    List       *initial_rels;   /* RelOptInfos we are now trying to join */

    /* Use fetch_upper_rel() to get any particular upper rel */
    List       *upper_rels[UPPERREL_FINAL + 1]; /*  upper-rel RelOptInfos */

    /* Result tlists chosen by grouping_planner for upper-stage processing */
    struct PathTarget *upper_targets[UPPERREL_FINAL + 1];

    /*
     * grouping_planner passes back its final processed targetlist here, for
     * use in relabeling the topmost tlist of the finished Plan.
     */
    List       *processed_tlist;

    /* Fields filled during create_plan() for use in setrefs.c */
    AttrNumber *grouping_map;   /* for GroupingFunc fixup */
    List       *minmax_aggs;    /* List of MinMaxAggInfos */
    MemoryContext planner_cxt;  /* context holding PlannerInfo */
    double      total_table_pages;  /* # of pages in all tables of query */
    double      tuple_fraction; /* tuple_fraction passed to query_planner */
    double      limit_tuples;   /* limit_tuples passed to query_planner */
    Index       qual_security_level;    /* minimum security_level for quals */
    /* Note: qual_security_level is zero if there are no securityQuals */

    InheritanceKind inhTargetKind;  /* indicates if the target relation is an
                                     * inheritance child or partition or a
                                     * partitioned table */
    bool        hasJoinRTEs;    /* true if any RTEs are RTE_JOIN kind */
    bool        hasLateralRTEs; /* true if any RTEs are marked LATERAL */
    bool        hasDeletedRTEs; /* true if any RTE was deleted from jointree */
    bool        hasHavingQual;  /* true if havingQual was non-null */
    bool        hasPseudoConstantQuals; /* true if any RestrictInfo has
                                         * pseudoconstant = true */
    bool        hasRecursion;   /* true if planning a recursive WITH item */

    /* These fields are used only when hasRecursion is true: */
    int         wt_param_id;    /* PARAM_EXEC ID for the work table */
    struct Path *non_recursive_path;    /* a path for non-recursive term */

    /* These fields are workspace for createplan.c */
    Relids      curOuterRels;   /* outer rels above current node */
    List       *curOuterParams; /* not-yet-assigned NestLoopParams */

    /* optional private data for join_search_hook, e.g., GEQO */
    void       *join_search_private;

    /* Does this query modify any partition key columns? */
    bool        partColsUpdated;
} PlannerInfo;
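
The two access patterns described in the comments (scanning the flat append_rel_list for a parent's children, and indexing append_rel_array by child_relid) can be sketched as follows. This is an illustration under simplifying assumptions, not planner code: ToyAppInfo, toy_children_of and toy_build_child_index are hypothetical names, and plain C arrays replace the planner's List type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy record: only the two identifying fields of AppendRelInfo. */
typedef struct ToyAppInfo
{
    unsigned    parent_relid;
    unsigned    child_relid;
} ToyAppInfo;

/*
 * Collect the children of one parent by scanning the whole flat list,
 * in the same spirit as a scan over append_rel_list.  Returns the number
 * of children written to "out" (at most "outcap").
 */
static size_t
toy_children_of(const ToyAppInfo *list, size_t n, unsigned parent,
                unsigned *out, size_t outcap)
{
    size_t      found = 0;

    for (size_t i = 0; i < n; i++)
        if (list[i].parent_relid == parent && found < outcap)
            out[found++] = list[i].child_relid;
    return found;
}

/*
 * Build a lookup array indexed by child_relid (slot 0 is unused because
 * RT indexes start at 1).  "array" must have room for max_rti + 1
 * pointers; slots with no matching entry stay NULL.
 */
static void
toy_build_child_index(const ToyAppInfo *list, size_t n,
                      const ToyAppInfo **array, unsigned max_rti)
{
    for (unsigned rti = 0; rti <= max_rti; rti++)
        array[rti] = NULL;
    for (size_t i = 0; i < n; i++)
    {
        assert(list[i].child_relid >= 1 && list[i].child_relid <= max_rti);
        array[list[i].child_relid] = &list[i];
    }
}
```

The list scan answers "which children belong to this parent", while the array answers "which parent does this child belong to" in constant time; a child can appear only once because an RTE has at most one append parent.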

2. Source Code Walkthrough

The expand_inherited_tables function expands each range table entry that represents an inheritance set into an "append relation".

/*
 * expand_inherited_tables
 *      Expand each rangetable entry that represents an inheritance set
 *      into an "append relation".  At the conclusion of this process,
 *      the "inh" flag is set in all and only those RTEs that are append
 *      relation parents.
 */
void
expand_inherited_tables(PlannerInfo *root)
{
    Index       nrtes;
    Index       rti;
    ListCell   *rl;

    /*
     * expand_inherited_rtentry may add RTEs to parse->rtable. The function is
     * expected to recursively handle any RTEs that it creates with inh=true.
     * So just scan as far as the original end of the rtable list.
     */
    nrtes = list_length(root->parse->rtable);
    rl = list_head(root->parse->rtable);
    for (rti = 1; rti <= nrtes; rti++)
    {
        RangeTblEntry *rte = (RangeTblEntry *) lfirst(rl);

        expand_inherited_rtentry(root, rte, rti);
        rl = lnext(rl);
    }
}
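
The loop above captures the length of the range table once, so entries appended during expansion are not visited again by the outer loop. A minimal sketch of that pattern, assuming a hypothetical fixed-size toy range table (ToyRangeTable, toy_expand_entry and toy_expand_all are illustrative names, not PostgreSQL functions):

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_RTES 16

/* Hypothetical toy range table: inh[i] is the inh flag of RT index i + 1. */
typedef struct ToyRangeTable
{
    int         inh[TOY_MAX_RTES];
    size_t      len;
} ToyRangeTable;

/* Pretend expansion: an entry with inh set appends two children with inh = 0. */
static void
toy_expand_entry(ToyRangeTable *rt, size_t rti)
{
    if (!rt->inh[rti - 1])
        return;
    assert(rt->len + 2 <= TOY_MAX_RTES);
    rt->inh[rt->len++] = 0;
    rt->inh[rt->len++] = 0;
}

/*
 * Visit only the entries that existed before expansion started.
 * Returns how many entries were visited; entries appended while the
 * loop runs are deliberately skipped.
 */
static size_t
toy_expand_all(ToyRangeTable *rt)
{
    size_t      nrtes = rt->len;    /* captured once, before the loop */
    size_t      visited = 0;

    for (size_t rti = 1; rti <= nrtes; rti++)
    {
        toy_expand_entry(rt, rti);
        visited++;
    }
    return visited;
}
```

Starting from three entries of which two have inh set, the loop visits exactly three entries while the table grows to seven.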

/*
 * expand_inherited_rtentry
 *      Check whether a rangetable entry represents an inheritance set.
 *      If so, add entries for all the child tables to the query's
 *      rangetable, and build AppendRelInfo nodes for all the child tables
 *      and add them to root->append_rel_list.  If not, clear the entry's
 *      "inh" flag to prevent later code from looking for AppendRelInfos.
 *
 * Note that the original RTE is considered to represent the whole
 * inheritance set.  The first of the generated RTEs is an RTE for the same
 * table, but with inh = false, to represent the parent table in its role
 * as a simple member of the inheritance set.
 *
 * A childless table is never considered to be an inheritance set. For
 * regular inheritance, a parent RTE must always have at least two associated
 * AppendRelInfos: one corresponding to the parent table as a simple member of
 * inheritance set and one or more corresponding to the actual children.
 * Since a partitioned table is not scanned, it might have only one associated
 * AppendRelInfo.
 */
static void
expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti)
{
    Oid         parentOID;
    PlanRowMark *oldrc;
    Relation    oldrelation;
    LOCKMODE    lockmode;
    List       *inhOIDs;
    ListCell   *l;

    /* Does RT entry allow inheritance? */
    if (!rte->inh)
        return;
    /* Ignore any already-expanded UNION ALL nodes */
    if (rte->rtekind != RTE_RELATION)
    {
        Assert(rte->rtekind == RTE_SUBQUERY);
        return;
    }
    /* Fast path for common case of childless table */
    parentOID = rte->relid;
    if (!has_subclass(parentOID))
    {
        /* Clear flag before returning */
        rte->inh = false;
        return;
    }

    /*
     * The rewriter should already have obtained an appropriate lock on each
     * relation named in the query.  However, for each child relation we add
     * to the query, we must obtain an appropriate lock, because this will be
     * the first use of those relations in the parse/rewrite/plan pipeline.
     * Child rels should use the same lockmode as their parent.
     */
    lockmode = rte->rellockmode;

    /* Scan for all members of inheritance set, acquire needed locks */
    inhOIDs = find_all_inheritors(parentOID, lockmode, NULL);

    /*
     * Check that there's at least one descendant, else treat as no-child
     * case.  This could happen despite above has_subclass() check, if table
     * once had a child but no longer does.
     */
    if (list_length(inhOIDs) < 2)
    {
        /* Clear flag before returning */
        rte->inh = false;
        return;
    }

    /*
     * If parent relation is selected FOR UPDATE/SHARE, we need to mark its
     * PlanRowMark as isParent = true, and generate a new PlanRowMark for each
     * child.
     */
    oldrc = get_plan_rowmark(root->rowMarks, rti);
    if (oldrc)
        oldrc->isParent = true;

    /*
     * Must open the parent relation to examine its tupdesc.  We need not lock
     * it; we assume the rewriter already did.
     */
    oldrelation = heap_open(parentOID, NoLock);

    /* Scan the inheritance set and expand it */
    if (RelationGetPartitionDesc(oldrelation) != NULL)
    {
        Assert(rte->relkind == RELKIND_PARTITIONED_TABLE);

        /*
         * If this table has partitions, recursively expand them in the order
         * in which they appear in the PartitionDesc.  While at it, also
         * extract the partition key columns of all the partitioned tables.
         */
        expand_partitioned_rtentry(root, rte, rti, oldrelation, oldrc,
                                   lockmode, &root->append_rel_list);
    }
    else
    {
        /* no partition descriptor was found, so this is plain inheritance */
        List       *appinfos = NIL;
        RangeTblEntry *childrte;
        Index       childRTindex;

        /*
         * This table has no partitions.  Expand any plain inheritance
         * children in the order the OIDs were returned by
         * find_all_inheritors.
         */
        foreach(l, inhOIDs)
        {
            Oid         childOID = lfirst_oid(l);
            Relation    newrelation;

            /* Open rel if needed; we already have required locks */
            if (childOID != parentOID)
                newrelation = heap_open(childOID, NoLock);
            else
                newrelation = oldrelation;

            /*
             * It is possible that the parent table has children that are temp
             * tables of other backends.  We cannot safely access such tables
             * (because of buffering issues), and the best thing to do seems
             * to be to silently ignore them.
             */
            if (childOID != parentOID && RELATION_IS_OTHER_TEMP(newrelation))
            {
                heap_close(newrelation, lockmode);
                continue;
            }

            expand_single_inheritance_child(root, rte, rti, oldrelation, oldrc,
                                            newrelation,
                                            &appinfos, &childrte,
                                            &childRTindex);

            /* Close child relations, but keep locks */
            if (childOID != parentOID)
                heap_close(newrelation, NoLock);
        }

        /*
         * If all the children were temp tables, pretend it's a
         * non-inheritance situation; we don't need Append node in that case.
         * The duplicate RTE we added for the parent table is harmless, so we
         * don't bother to get rid of it; ditto for the useless PlanRowMark
         * node.
         */
        if (list_length(appinfos) < 2)
            rte->inh = false;
        else
            root->append_rel_list = list_concat(root->append_rel_list,
                                                appinfos);

    }

    heap_close(oldrelation, NoLock);
}
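
For the plain-inheritance branch, the rule for keeping or clearing the inh flag can be summarized in a small sketch. It assumes that the first member of the list is the parent itself (the original code compares OIDs instead of relying on position), and ToyMember and toy_keeps_inh are hypothetical names rather than PostgreSQL APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy member of an inheritance set. */
typedef struct ToyMember
{
    unsigned    oid;
    bool        other_session_temp;     /* temp table of another backend? */
} ToyMember;

/*
 * Decide whether the parent keeps its inh flag for plain inheritance.
 * Assumption of this sketch: members[0] is the parent itself and the
 * remaining entries are its descendants.
 */
static bool
toy_keeps_inh(const ToyMember *members, size_t n)
{
    size_t      appinfos = 0;

    if (n < 2)
        return false;           /* no descendants at all */

    for (size_t i = 0; i < n; i++)
    {
        /* children that are temp tables of other sessions are skipped */
        if (i > 0 && members[i].other_session_temp)
            continue;
        appinfos++;
    }
    return appinfos >= 2;       /* parent-as-member plus at least one child */
}
```

A parent whose only child is another session's temp table therefore ends up treated like a childless table, which matches the final list_length(appinfos) < 2 check.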

/*
 * expand_partitioned_rtentry
 *      Recursively expand an RTE for a partitioned table.
 */
static void
expand_partitioned_rtentry(PlannerInfo *root, RangeTblEntry *parentrte,
                           Index parentRTindex, Relation parentrel,
                           PlanRowMark *top_parentrc, LOCKMODE lockmode,
                           List **appinfos)
{
    int         i;
    RangeTblEntry *childrte;
    Index       childRTindex;
    PartitionDesc partdesc = RelationGetPartitionDesc(parentrel);

    check_stack_depth();

    /* A partitioned table should always have a partition descriptor. */
    Assert(partdesc);

    Assert(parentrte->inh);

    /*
     * Note down whether any partition key cols are being updated. Though it's
     * the root partitioned table's updatedCols we are interested in, we
     * instead use parentrte to get the updatedCols. This is convenient
     * because parentrte already has the root partrel's updatedCols translated
     * to match the attribute ordering of parentrel.
     */
    if (!root->partColsUpdated)
        root->partColsUpdated =
            has_partition_attrs(parentrel, parentrte->updatedCols, NULL);

    /* First expand the partitioned table itself. */
    expand_single_inheritance_child(root, parentrte, parentRTindex, parentrel,
                                    top_parentrc, parentrel,
                                    appinfos, &childrte, &childRTindex);

    /*
     * If the partitioned table has no partitions, treat this as the
     * non-inheritance case.
     */
    if (partdesc->nparts == 0)
    {
        parentrte->inh = false;
        return;
    }

    for (i = 0; i < partdesc->nparts; i++)
    {
        Oid         childOID = partdesc->oids[i];
        Relation    childrel;

        /* Open rel; we already have required locks */
        childrel = heap_open(childOID, NoLock);

        /*
         * Temporary partitions belonging to other sessions should have been
         * disallowed at definition, but for paranoia's sake, let's double
         * check.
         */
        if (RELATION_IS_OTHER_TEMP(childrel))
            elog(ERROR, "temporary relation from another session found as partition");
        expand_single_inheritance_child(root, parentrte, parentRTindex,
                                        parentrel, top_parentrc, childrel,
                                        appinfos, &childrte, &childRTindex);

        /* If this child is itself partitioned, recurse */
        if (childrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
            expand_partitioned_rtentry(root, childrte, childRTindex,
                                       childrel, top_parentrc, lockmode,
                                       appinfos);

        /* Close child relation, but keep locks */
        heap_close(childrel, NoLock);
    }
}


/*
 * expand_single_inheritance_child
 *      Build a RangeTblEntry and an AppendRelInfo, if appropriate, plus
 *      maybe a PlanRowMark.
 *
 * We now expand the partition hierarchy level by level, creating a
 * corresponding hierarchy of AppendRelInfos and RelOptInfos, where each
 * partitioned descendant acts as a parent of its immediate partitions.
 * (This is a difference from what older versions of PostgreSQL did and what
 * is still done in the case of table inheritance for unpartitioned tables,
 * where the hierarchy is flattened during RTE expansion.)
 *
 * PlanRowMarks still carry the top-parent's RTI, and the top-parent's
 * allMarkTypes field still accumulates values from all descendents.
 *
 * "parentrte" and "parentRTindex" are immediate parent's RTE and
 * RTI. "top_parentrc" is top parent's PlanRowMark.
 *
 * The child RangeTblEntry and its RTI are returned in "childrte_p" and
 * "childRTindex_p" resp.
 */
static void
expand_single_inheritance_child(PlannerInfo *root, RangeTblEntry *parentrte,
                                Index parentRTindex, Relation parentrel,
                                PlanRowMark *top_parentrc, Relation childrel,
                                List **appinfos, RangeTblEntry **childrte_p,
                                Index *childRTindex_p)
{
    Query      *parse = root->parse;
    Oid         parentOID = RelationGetRelid(parentrel);    // parent relation OID
    Oid         childOID = RelationGetRelid(childrel);      // child relation OID
    RangeTblEntry *childrte;
    Index       childRTindex;
    AppendRelInfo *appinfo;

    /*
     * Build an RTE for the child, and attach to query's rangetable list. We
     * copy most fields of the parent's RTE, but replace relation OID and
     * relkind, and set inh = false.  Also, set requiredPerms to zero since
     * all required permissions checks are done on the original RTE. Likewise,
     * set the child's securityQuals to empty, because we only want to apply
     * the parent's RLS conditions regardless of what RLS properties
     * individual children may have.  (This is an intentional choice to make
     * inherited RLS work like regular permissions checks.) The parent
     * securityQuals will be propagated to children along with other base
     * restriction clauses, so we don't need to do it here.
     */
    childrte = copyObject(parentrte);
    *childrte_p = childrte;
    childrte->relid = childOID;
    childrte->relkind = childrel->rd_rel->relkind;
    /* A partitioned child will need to be expanded further. */
    if (childOID != parentOID &&
        childrte->relkind == RELKIND_PARTITIONED_TABLE)
        childrte->inh = true;
    else
        childrte->inh = false;
    childrte->requiredPerms = 0;
    childrte->securityQuals = NIL;
    parse->rtable = lappend(parse->rtable, childrte);
    childRTindex = list_length(parse->rtable);
    *childRTindex_p = childRTindex;

    /*
     * We need an AppendRelInfo if paths will be built for the child RTE. If
     * childrte->inh is true, then we'll always need to generate append paths
     * for it.  If childrte->inh is false, we must scan it if it's not a
     * partitioned table; but if it is a partitioned table, then it never has
     * any data of its own and need not be scanned.
     */
    if (childrte->relkind != RELKIND_PARTITIONED_TABLE || childrte->inh)
    {
        appinfo = makeNode(AppendRelInfo);
        appinfo->parent_relid = parentRTindex;
        appinfo->child_relid = childRTindex;
        appinfo->parent_reltype = parentrel->rd_rel->reltype;
        appinfo->child_reltype = childrel->rd_rel->reltype;
        make_inh_translation_list(parentrel, childrel, childRTindex,
                                  &appinfo->translated_vars);
        appinfo->parent_reloid = parentOID;
        *appinfos = lappend(*appinfos, appinfo);

        /*
         * Translate the column permissions bitmaps to the child's attnums (we
         * have to build the translated_vars list before we can do this). But
         * if this is the parent table, leave copyObject's result alone.
         *
         * Note: we need to do this even though the executor won't run any
         * permissions checks on the child RTE.  The insertedCols/updatedCols
         * bitmaps may be examined for trigger-firing purposes.
         */
        if (childOID != parentOID)
        {
            childrte->selectedCols = translate_col_privs(parentrte->selectedCols,
                                                         appinfo->translated_vars);
            childrte->insertedCols = translate_col_privs(parentrte->insertedCols,
                                                         appinfo->translated_vars);
            childrte->updatedCols = translate_col_privs(parentrte->updatedCols,
                                                        appinfo->translated_vars);
        }
    }

    /*
     * Build a PlanRowMark if parent is marked FOR UPDATE/SHARE.
     */
    if (top_parentrc)
    {
        PlanRowMark *childrc = makeNode(PlanRowMark);

        childrc->rti = childRTindex;
        childrc->prti = top_parentrc->rti;
        childrc->rowmarkId = top_parentrc->rowmarkId;
        /* Reselect rowmark type, because relkind might not match parent */
        childrc->markType = select_rowmark_type(childrte,
                                                top_parentrc->strength);
        childrc->allMarkTypes = (1 << childrc->markType);
        childrc->strength = top_parentrc->strength;
        childrc->waitPolicy = top_parentrc->waitPolicy;

        /*
         * We mark RowMarks for partitioned child tables as parent RowMarks so
         * that the executor ignores them (except their existence means that
         * the child tables be locked using appropriate mode).
         */
        childrc->isParent = (childrte->relkind == RELKIND_PARTITIONED_TABLE);

        /* Include child's rowmark type in top parent's allMarkTypes */
        top_parentrc->allMarkTypes |= childrc->allMarkTypes;

        root->rowMarks = lappend(root->rowMarks, childrc);
    }
}
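
The interplay of expand_partitioned_rtentry and expand_single_inheritance_child can be sketched as a toy model. This is only an illustration, not PostgreSQL code: ToyRel, ToyRTE, ToyAppInfo, ToyPlanner and the toy_* functions are hypothetical stand-ins, and locking, row marks and column translation are left out. It keeps just three rules from the source above: the partitioned table itself is expanded first, a child gets inh = true only if it is a partitioned table other than its parent, and an AppendRelInfo is built unless the entry is a partitioned table with inh = false.

```c
#include <assert.h>
#include <string.h>

#define RELKIND_RELATION          'r'   /* ordinary table */
#define RELKIND_PARTITIONED_TABLE 'p'   /* partitioned table */
#define TOY_MAX_RTE 64

/* Hypothetical, heavily simplified stand-in for Relation. */
typedef struct ToyRel {
    unsigned        oid;
    char            relkind;
    int             nparts;
    struct ToyRel **parts;
} ToyRel;

typedef struct { unsigned relid; char relkind; int inh; } ToyRTE;
typedef struct { int parent_relid; int child_relid; } ToyAppInfo;

typedef struct {
    ToyRTE     rtable[TOY_MAX_RTE];     /* RT index i is stored in rtable[i - 1] */
    int        nrtes;
    ToyAppInfo appinfos[TOY_MAX_RTE];
    int        nappinfos;
} ToyPlanner;

/* Append one RTE and, if a path will be built for it, one AppendRelInfo. */
static int
toy_expand_single(ToyPlanner *p, const ToyRel *parent, int parent_rti,
                  const ToyRel *child)
{
    ToyRTE *rte;
    int     child_rti;

    assert(p->nrtes < TOY_MAX_RTE);
    rte = &p->rtable[p->nrtes++];
    child_rti = p->nrtes;               /* the new index is the list length */
    rte->relid = child->oid;
    rte->relkind = child->relkind;
    /* Only a partitioned child that is not the parent itself expands further. */
    rte->inh = (child->oid != parent->oid &&
                child->relkind == RELKIND_PARTITIONED_TABLE);

    if (rte->relkind != RELKIND_PARTITIONED_TABLE || rte->inh)
    {
        ToyAppInfo *ai = &p->appinfos[p->nappinfos++];

        ai->parent_relid = parent_rti;
        ai->child_relid = child_rti;
    }
    return child_rti;
}

/* Expand the table itself first, then every partition, recursing as needed. */
static void
toy_expand_partitioned(ToyPlanner *p, const ToyRel *parent, int parent_rti)
{
    int i;

    toy_expand_single(p, parent, parent_rti, parent);
    if (parent->nparts == 0)
    {
        p->rtable[parent_rti - 1].inh = 0;  /* treat as non-inheritance case */
        return;
    }
    for (i = 0; i < parent->nparts; i++)
    {
        const ToyRel *child = parent->parts[i];
        int           child_rti = toy_expand_single(p, parent, parent_rti, child);

        if (child->relkind == RELKIND_PARTITIONED_TABLE)
            toy_expand_partitioned(p, child, child_rti);
    }
}

/* Root table with two leaf partitions; the root RTE already sits at index 1. */
static void
toy_demo(ToyPlanner *p)
{
    static ToyRel  part1 = {16989, RELKIND_RELATION, 0, NULL};
    static ToyRel  part2 = {16992, RELKIND_RELATION, 0, NULL};  /* made-up OID */
    static ToyRel *parts[] = {&part1, &part2};
    static ToyRel  root = {16986, RELKIND_PARTITIONED_TABLE, 2, parts};

    memset(p, 0, sizeof(*p));
    p->rtable[0].relid = root.oid;
    p->rtable[0].relkind = root.relkind;
    p->rtable[0].inh = 1;
    p->nrtes = 1;
    toy_expand_partitioned(p, &root, 1);
}
```

After toy_demo, the range table holds four entries: the original root at index 1, its inh = false copy at index 2 (no AppendRelInfo), and the two partitions at indexes 3 and 4. The first AppendRelInfo therefore has parent_relid = 1 and child_relid = 3, the same numbering that appears in the gdb session in the trace section.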

3. Trace Analysis

The test query is as follows:

testdb=# explain verbose select * from t_hash_partition where c1 = 1 OR c1 = 2;
                                     QUERY PLAN                                      
-------------------------------------------------------------------------------------
 Append  (cost=0.00..30.53 rows=6 width=200)
   ->  Seq Scan on public.t_hash_partition_1  (cost=0.00..15.25 rows=3 width=200)
         Output: t_hash_partition_1.c1, t_hash_partition_1.c2, t_hash_partition_1.c3
         Filter: ((t_hash_partition_1.c1 = 1) OR (t_hash_partition_1.c1 = 2))
   ->  Seq Scan on public.t_hash_partition_3  (cost=0.00..15.25 rows=3 width=200)
         Output: t_hash_partition_3.c1, t_hash_partition_3.c2, t_hash_partition_3.c3
         Filter: ((t_hash_partition_3.c1 = 1) OR (t_hash_partition_3.c1 = 2))
(7 rows)

Start gdb and set a breakpoint:

(gdb) b expand_inherited_tables
Breakpoint 1 at 0x7e53ba: file prepunion.c, line 1483.
(gdb) c
Continuing.

Breakpoint 1, expand_inherited_tables (root=0x28fcdc8) at prepunion.c:1483
1483        nrtes = list_length(root->parse->rtable);

Get the number of RTEs and the head of the range table list:

(gdb) n
1484        rl = list_head(root->parse->rtable);
(gdb) 
1485        for (rti = 1; rti <= nrtes; rti++)
(gdb) p nrtes
$1 = 1
(gdb) p *rl
$2 = {data = {ptr_value = 0x28d83d0, int_value = 42828752, oid_value = 42828752}, next = 0x0}
(gdb) 

Loop over the RTEs:

(gdb) n
1487            RangeTblEntry *rte = (RangeTblEntry *) lfirst(rl);
(gdb) 
1489            expand_inherited_rtentry(root, rte, rti);
(gdb) p *rte
$3 = {type = T_RangeTblEntry, rtekind = RTE_RELATION, relid = 16986, relkind = 112 'p', tablesample = 0x0, subquery = 0x0, 
  security_barrier = false, jointype = JOIN_INNER, joinaliasvars = 0x0, functions = 0x0, funcordinality = false, 
  tablefunc = 0x0, values_lists = 0x0, ctename = 0x0, ctelevelsup = 0, self_reference = false, coltypes = 0x0, 
  coltypmods = 0x0, colcollations = 0x0, enrname = 0x0, enrtuples = 0, alias = 0x0, eref = 0x28d84e8, lateral = false, 
  inh = true, inFromCl = true, requiredPerms = 2, checkAsUser = 0, selectedCols = 0x28d8c40, insertedCols = 0x0, 
  updatedCols = 0x0, securityQuals = 0x0}
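
gdb prints relkind as a char, showing both the numeric value and the character. The value 112 is ASCII 'p', which PostgreSQL's catalog/pg_class.h defines as RELKIND_PARTITIONED_TABLE, while 114 is 'r' (RELKIND_RELATION, an ordinary table). A minimal sketch for decoding the two values seen in this session; relkind_name is a hypothetical helper, not a PostgreSQL function:

```c
#include <assert.h>
#include <string.h>

/* Values as defined in PostgreSQL's catalog/pg_class.h */
#define RELKIND_RELATION          'r'   /* ordinary table */
#define RELKIND_PARTITIONED_TABLE 'p'   /* partitioned table */

/* Hypothetical helper: describe the relkind values that show up in the trace. */
static const char *
relkind_name(char relkind)
{
    switch (relkind)
    {
        case RELKIND_RELATION:
            return "ordinary table";
        case RELKIND_PARTITIONED_TABLE:
            return "partitioned table";
        default:
            return "other";
    }
}
```

Calling relkind_name(112) yields "partitioned table", so the RTE printed above refers to the partitioned parent t_hash_partition.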

Step into expand_inherited_rtentry:

(gdb) step
expand_inherited_rtentry (root=0x28fcdc8, rte=0x28d83d0, rti=1) at prepunion.c:1517
1517        Query      *parse = root->parse;

expand_inherited_rtentry -> inh is true for the partitioned table

1526        if (!rte->inh)
(gdb) p rte->inh
$4 = true

expand_inherited_rtentry -> run the preliminary checks

(gdb) n
1529        if (rte->rtekind != RTE_RELATION)
(gdb) p rte->rtekind
$5 = RTE_RELATION
(gdb) n
1535        parentOID = rte->relid;
(gdb) 
1536        if (!has_subclass(parentOID))
(gdb) p parentOID
$6 = 16986
(gdb) n
1556        oldrc = get_plan_rowmark(root->rowMarks, rti);
(gdb) 
1557        if (rti == parse->resultRelation)
(gdb) p *oldrc
Cannot access memory at address 0x0

expand_inherited_rtentry -> scan all members of the inheritance set, acquire the needed locks, and build a list of OIDs

(gdb) n
1559        else if (oldrc && RowMarkRequiresRowShareLock(oldrc->markType))
(gdb) 
1562            lockmode = AccessShareLock;
(gdb) 
1565        inhOIDs = find_all_inheritors(parentOID, lockmode, NULL);
(gdb) 
1572        if (list_length(inhOIDs) < 2)
(gdb) p inhOIDs
$7 = (List *) 0x28fd208
(gdb) p *inhOIDs
$8 = {type = T_OidList, length = 7, head = 0x28fd1e0, tail = 0x28fd778}
(gdb) 
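
The list length of 7 fits one parent plus the six partitions that partdesc->nparts reports later in this session, because find_all_inheritors returns the given relation together with everything that inherits from it. The idea can be sketched as a breadth-first walk over parent/child pairs. This is a toy model under stated assumptions: ToyInherits is a hypothetical flat array standing in for the pg_inherits catalog, toy_find_all_inheritors is not the real function, and the sketch takes no locks and does not de-duplicate relations reachable through several parents.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one pg_inherits row: child and its direct parent. */
typedef struct { unsigned inhrelid; unsigned inhparent; } ToyInherits;

/*
 * Collect the parent plus all direct and indirect children, breadth-first.
 * Returns the number of OIDs written to out (at most cap).
 */
static size_t
toy_find_all_inheritors(const ToyInherits *inh, size_t ninh,
                        unsigned parent, unsigned *out, size_t cap)
{
    size_t n = 0;       /* OIDs collected so far */
    size_t next = 0;    /* next collected OID whose children we look up */
    size_t i;

    if (cap == 0)
        return 0;
    out[n++] = parent;  /* the result always starts with the parent itself */
    while (next < n)
    {
        unsigned cur = out[next++];

        for (i = 0; i < ninh; i++)
            if (inh[i].inhparent == cur && n < cap)
                out[n++] = inh[i].inhrelid;
    }
    return n;
}
```

For a parent with six direct children the function returns 7, and for a relation without children it returns 1. That second case is what the list_length(inhOIDs) < 2 check at line 1572 detects.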

expand_inherited_rtentry -> open the relation

(gdb) n
1584        if (oldrc)
(gdb) 
1591        oldrelation = heap_open(parentOID, NoLock);

expand_inherited_rtentry -> the partition descriptor is found, so expand_partitioned_rtentry is called

(gdb) 
1594        if (RelationGetPartitionDesc(oldrelation) != NULL)
(gdb) 
1596            Assert(rte->relkind == RELKIND_PARTITIONED_TABLE);
(gdb) 
1603            expand_partitioned_rtentry(root, rte, rti, oldrelation, oldrc,
(gdb) 

expand_inherited_rtentry -> step into expand_partitioned_rtentry

(gdb) step
expand_partitioned_rtentry (root=0x28fcdc8, parentrte=0x28d83d0, parentRTindex=1, parentrel=0x7f4e66827980, 
    top_parentrc=0x0, lockmode=1, appinfos=0x28fce98) at prepunion.c:1684
1684        PartitionDesc partdesc = RelationGetPartitionDesc(parentrel);

expand_partitioned_rtentry -> get the partition descriptor

1684        PartitionDesc partdesc = RelationGetPartitionDesc(parentrel);
(gdb) n
1686        check_stack_depth();
(gdb) p *partdesc
$9 = {nparts = 6, oids = 0x298e4f8, boundinfo = 0x298e530}

expand_partitioned_rtentry -> run the sanity checks

(gdb) n
1689        Assert(partdesc);
(gdb) 
1691        Assert(parentrte->inh);
(gdb) 
1700        if (!root->partColsUpdated)
(gdb) 
1702                has_partition_attrs(parentrel, parentrte->updatedCols, NULL);
(gdb) 
1701            root->partColsUpdated =
(gdb) 
1705        expand_single_inheritance_child(root, parentrte, parentRTindex, parentrel,

expand_partitioned_rtentry -> first expand the partitioned table itself; step into expand_single_inheritance_child

(gdb) step
expand_single_inheritance_child (root=0x28fcdc8, parentrte=0x28d83d0, parentRTindex=1, parentrel=0x7f4e66827980, 
    top_parentrc=0x0, childrel=0x7f4e66827980, appinfos=0x28fce98, childrte_p=0x7ffd1928d2f8, childRTindex_p=0x7ffd1928d2f4)
    at prepunion.c:1778
1778        Query      *parse = root->parse;

expand_single_inheritance_child -> initialize childrte

(gdb) n
1779        Oid         parentOID = RelationGetRelid(parentrel);
(gdb) 
1780        Oid         childOID = RelationGetRelid(childrel);
(gdb) 
1797        childrte = copyObject(parentrte);
(gdb) p parentOID
$10 = 16986
(gdb) p childOID
$11 = 16986
(gdb) n
1798        *childrte_p = childrte;
(gdb) 
1799        childrte->relid = childOID;
(gdb) 
1800        childrte->relkind = childrel->rd_rel->relkind;
(gdb) 
1802        if (childOID != parentOID &&
(gdb) 
1806            childrte->inh = false;
(gdb) 
1807        childrte->requiredPerms = 0;
(gdb) 
1808        childrte->securityQuals = NIL;
(gdb) 
1809        parse->rtable = lappend(parse->rtable, childrte);
(gdb) 
1810        childRTindex = list_length(parse->rtable);
(gdb) 
1811        *childRTindex_p = childRTindex;
(gdb) p *childrte -->relid = 16986, still the partitioned table itself
$12 = {type = T_RangeTblEntry, rtekind = RTE_RELATION, relid = 16986, relkind = 112 'p', tablesample = 0x0, subquery = 0x0, 
  security_barrier = false, jointype = JOIN_INNER, joinaliasvars = 0x0, functions = 0x0, funcordinality = false, 
  tablefunc = 0x0, values_lists = 0x0, ctename = 0x0, ctelevelsup = 0, self_reference = false, coltypes = 0x0, 
  coltypmods = 0x0, colcollations = 0x0, enrname = 0x0, enrtuples = 0, alias = 0x0, eref = 0x28fd268, lateral = false, 
  inh = false, inFromCl = true, requiredPerms = 0, checkAsUser = 0, selectedCols = 0x28fd898, insertedCols = 0x0, 
  updatedCols = 0x0, securityQuals = 0x0}
(gdb) p *childRTindex_p
$13 = 0

expand_single_inheritance_child -> expansion of the partitioned table itself is done; return to expand_partitioned_rtentry

(gdb) n
1820        if (childrte->relkind != RELKIND_PARTITIONED_TABLE || childrte->inh)
(gdb) 
1855        if (top_parentrc)
(gdb) 
1881    }
(gdb) 
expand_partitioned_rtentry (root=0x28fcdc8, parentrte=0x28d83d0, parentRTindex=1, parentrel=0x7f4e66827980, 
    top_parentrc=0x0, lockmode=1, appinfos=0x28fce98) at prepunion.c:1713
1713        if (partdesc->nparts == 0)

expand_partitioned_rtentry -> start iterating over the partitions in the partition descriptor

1713        if (partdesc->nparts == 0)
(gdb) n
1719        for (i = 0; i < partdesc->nparts; i++)
(gdb) 
1721            Oid         childOID = partdesc->oids[i];
(gdb) 
1725            childrel = heap_open(childOID, NoLock);
(gdb) 
1732            if (RELATION_IS_OTHER_TEMP(childrel))
(gdb) 
1735            expand_single_inheritance_child(root, parentrte, parentRTindex,
(gdb) p childOID
$14 = 16989 
----------------------------------------
testdb=# select relname from pg_class where oid=16989;
      relname       
--------------------
 t_hash_partition_1
(1 row)
----------------------------------------

expand_partitioned_rtentry -> step into expand_single_inheritance_child again, this time for the first partition

(gdb) step
expand_single_inheritance_child (root=0x28fcdc8, parentrte=0x28d83d0, parentRTindex=1, parentrel=0x7f4e66827980, 
    top_parentrc=0x0, childrel=0x7f4e668306a0, appinfos=0x28fce98, childrte_p=0x7ffd1928d2f8, childRTindex_p=0x7ffd1928d2f4)
    at prepunion.c:1778
1778        Query      *parse = root->parse;

expand_single_inheritance_child -> start building the AppendRelInfo

...
1820        if (childrte->relkind != RELKIND_PARTITIONED_TABLE || childrte->inh)
(gdb) 
1822            appinfo = makeNode(AppendRelInfo);
(gdb) p *childrte
$17 = {type = T_RangeTblEntry, rtekind = RTE_RELATION, relid = 16989, relkind = 114 'r', tablesample = 0x0, subquery = 0x0, 
  security_barrier = false, jointype = JOIN_INNER, joinaliasvars = 0x0, functions = 0x0, funcordinality = false, 
  tablefunc = 0x0, values_lists = 0x0, ctename = 0x0, ctelevelsup = 0, self_reference = false, coltypes = 0x0, 
  coltypmods = 0x0, colcollations = 0x0, enrname = 0x0, enrtuples = 0, alias = 0x0, eref = 0x28fd9d0, lateral = false, 
  inh = false, inFromCl = true, requiredPerms = 0, checkAsUser = 0, selectedCols = 0x28fdbc8, insertedCols = 0x0, 
  updatedCols = 0x0, securityQuals = 0x0}
(gdb) p *childrte->relkind
Cannot access memory at address 0x72
(gdb) p childrte->relkind
$18 = 114 'r'
(gdb) p childrte->inh
$19 = false

expand_single_inheritance_child -> construction finished; inspect the AppendRelInfo struct

(gdb) n
1823            appinfo->parent_relid = parentRTindex;
(gdb) 
1824            appinfo->child_relid = childRTindex;
(gdb) 
1825            appinfo->parent_reltype = parentrel->rd_rel->reltype;
(gdb) 
1826            appinfo->child_reltype = childrel->rd_rel->reltype;
(gdb) 
1827            make_inh_translation_list(parentrel, childrel, childRTindex,
(gdb) 
1829            appinfo->parent_reloid = parentOID;
(gdb) 
1830            *appinfos = lappend(*appinfos, appinfo);
(gdb) 
1841            if (childOID != parentOID)
(gdb) 
1843                childrte->selectedCols = translate_col_privs(parentrte->selectedCols,
(gdb) 
1845                childrte->insertedCols = translate_col_privs(parentrte->insertedCols,
(gdb) 
1847                childrte->updatedCols = translate_col_privs(parentrte->updatedCols,
(gdb) 
1855        if (top_parentrc)
(gdb) p *appinfo
$20 = {type = T_AppendRelInfo, parent_relid = 1, child_relid = 3, parent_reltype = 16988, child_reltype = 16991, 
  translated_vars = 0x28fdc90, parent_reloid = 16986}
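
The three translate_col_privs calls stepped over above remap the parent's column bitmaps to the child's attribute numbers, which matters when a partition stores its columns in a different physical order than the parent. A minimal sketch of that remapping, with loud simplifications: the column set is a plain 64-bit mask indexed by attribute number, the mapping is a plain int array instead of the translated_vars list, and system columns and whole-row references are ignored. toy_translate_col_privs is a hypothetical name, not the real function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * child_attno[i] is the child's attribute number for parent attribute i + 1;
 * 0 marks a parent column with no counterpart (for example a dropped column).
 * Attribute numbers are assumed to be smaller than 64 in this sketch.
 */
static uint64_t
toy_translate_col_privs(uint64_t parent_cols, const int *child_attno, int natts)
{
    uint64_t child_cols = 0;
    int      i;

    for (i = 0; i < natts; i++)
    {
        int parent_attno = i + 1;

        if (child_attno[i] == 0)
            continue;           /* nothing to translate for this column */
        if (parent_cols & (UINT64_C(1) << parent_attno))
            child_cols |= UINT64_C(1) << child_attno[i];
    }
    return child_cols;
}
```

With an identity mapping {1, 2, 3} the mask comes back unchanged. With a mapping of {3, 1, 2}, a parent mask selecting columns 1 and 3 becomes a child mask selecting columns 3 and 2.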

expand_single_inheritance_child -> call complete, return

(gdb) n
1881    }
(gdb) 
expand_partitioned_rtentry (root=0x28fcdc8, parentrte=0x28d83d0, parentRTindex=1, parentrel=0x7f4e66827980, 
    top_parentrc=0x0, lockmode=1, appinfos=0x28fce98) at prepunion.c:1740
1740            if (childrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)

expand_partitioned_rtentry -> run the rest of the call with finish (the remaining partitions are not stepped through) and return to expand_inherited_rtentry

(gdb) finish
Run till exit from #0  expand_partitioned_rtentry (root=0x28fcdc8, parentrte=0x28d83d0, parentRTindex=1, 
    parentrel=0x7f4e66827980, top_parentrc=0x0, lockmode=1, appinfos=0x28fce98) at prepunion.c:1740
0x00000000007e55e3 in expand_inherited_rtentry (root=0x28fcdc8, rte=0x28d83d0, rti=1) at prepunion.c:1603
1603            expand_partitioned_rtentry(root, rte, rti, oldrelation, oldrc,
(gdb) 

expand_inherited_rtentry -> call complete; return to expand_inherited_tables

(gdb) n
1665        heap_close(oldrelation, NoLock);
(gdb) 
1666    }
(gdb) 
expand_inherited_tables (root=0x28fcdc8) at prepunion.c:1490
1490            rl = lnext(rl);
(gdb) 

expand_inherited_tables -> call complete; return to subquery_planner

(gdb) n
1485        for (rti = 1; rti <= nrtes; rti++)
(gdb) 
1492    }
(gdb) 
subquery_planner (glob=0x28fcd30, parse=0x28d82b8, parent_root=0x0, hasRecursion=false, tuple_fraction=0) at planner.c:719
719     root->hasHavingQual = (parse->havingQual != NULL);
(gdb) 

DONE!
