您好,登錄后才能下訂單哦!
本篇內(nèi)容介紹了“PostgreSQL中create_plan函數(shù)連接計(jì)劃的實(shí)現(xiàn)過(guò)程是什么”的有關(guān)知識(shí),在實(shí)際案例的操作過(guò)程中,不少人都會(huì)遇到這樣的困境,接下來(lái)就讓小編帶領(lǐng)大家學(xué)習(xí)一下如何處理這些情況吧!希望大家仔細(xì)閱讀,能夠?qū)W有所成!
Plan
所有計(jì)劃節(jié)點(diǎn)通過(guò)將Plan結(jié)構(gòu)作為第一個(gè)字段從Plan結(jié)構(gòu)“派生”。這確保了在將節(jié)點(diǎn)轉(zhuǎn)換為計(jì)劃節(jié)點(diǎn)時(shí)能正常工作。(在執(zhí)行器中以通用方式傳遞時(shí),節(jié)點(diǎn)指針經(jīng)常被轉(zhuǎn)換為Plan *)
/* ---------------- * Plan node * * All plan nodes "derive" from the Plan structure by having the * Plan structure as the first field. This ensures that everything works * when nodes are cast to Plan's. (node pointers are frequently cast to Plan* * when passed around generically in the executor) * 所有計(jì)劃節(jié)點(diǎn)通過(guò)將Plan結(jié)構(gòu)作為第一個(gè)字段從Plan結(jié)構(gòu)“派生”。 * 這確保了在將節(jié)點(diǎn)轉(zhuǎn)換為計(jì)劃節(jié)點(diǎn)時(shí),一切都能正常工作。 * (在執(zhí)行器中以通用方式傳遞時(shí),節(jié)點(diǎn)指針經(jīng)常被轉(zhuǎn)換為Plan *) * * We never actually instantiate any Plan nodes; this is just the common * abstract superclass for all Plan-type nodes. * 從未實(shí)例化任何Plan節(jié)點(diǎn);這只是所有Plan-type節(jié)點(diǎn)的通用抽象超類。 * ---------------- */ typedef struct Plan { NodeTag type;//節(jié)點(diǎn)類型 /* * 成本估算信息;estimated execution costs for plan (see costsize.c for more info) */ Cost startup_cost; /* 啟動(dòng)成本;cost expended before fetching any tuples */ Cost total_cost; /* 總成本;total cost (assuming all tuples fetched) */ /* * 優(yōu)化器估算信息;planner's estimate of result size of this plan step */ double plan_rows; /* 行數(shù);number of rows plan is expected to emit */ int plan_width; /* 平均行大小(Byte為單位);average row width in bytes */ /* * 并行執(zhí)行相關(guān)的信息;information needed for parallel query */ bool parallel_aware; /* 是否參與并行執(zhí)行邏輯?engage parallel-aware logic? */ bool parallel_safe; /* 是否并行安全;OK to use as part of parallel plan? */ /* * Plan類型節(jié)點(diǎn)通用的信息.Common structural data for all Plan types. */ int plan_node_id; /* unique across entire final plan tree */ List *targetlist; /* target list to be computed at this node */ List *qual; /* implicitly-ANDed qual conditions */ struct Plan *lefttree; /* input plan tree(s) */ struct Plan *righttree; List *initPlan; /* Init Plan nodes (un-correlated expr * subselects) */ /* * Information for management of parameter-change-driven rescanning * parameter-change-driven重掃描的管理信息. * * extParam includes the paramIDs of all external PARAM_EXEC params * affecting this plan node or its children. setParam params from the * node's initPlans are not included, but their extParams are. * * allParam includes all the extParam paramIDs, plus the IDs of local * params that affect the node (i.e., the setParams of its initplans). * These are _all_ the PARAM_EXEC params that affect this node. */ Bitmapset *extParam; Bitmapset *allParam; } Plan;
create_join_plan函數(shù)創(chuàng)建Join Plan節(jié)點(diǎn).Join可以分為Merge Join/Hash Join/NestLoop Join三種,相應(yīng)的實(shí)現(xiàn)函數(shù)是create_nestloop_plan/create_mergejoin_plan/create_hashjoin_plan.
//------------------------------------------------------------------ create_join_plan /* * create_join_plan * Create a join plan for 'best_path' and (recursively) plans for its * inner and outer paths. * 創(chuàng)建連接計(jì)劃Plan節(jié)點(diǎn). */ static Plan * create_join_plan(PlannerInfo *root, JoinPath *best_path) { Plan *plan; List *gating_clauses; switch (best_path->path.pathtype) { case T_MergeJoin://Merge Join plan = (Plan *) create_mergejoin_plan(root, (MergePath *) best_path); break; case T_HashJoin://Hash Join plan = (Plan *) create_hashjoin_plan(root, (HashPath *) best_path); break; case T_NestLoop://NestLoop Join plan = (Plan *) create_nestloop_plan(root, (NestPath *) best_path); break; default://目前僅支持上述三種 elog(ERROR, "unrecognized node type: %d", (int) best_path->path.pathtype); plan = NULL; /* keep compiler quiet */ break; } /* * If there are any pseudoconstant clauses attached to this node, insert a * gating Result node that evaluates the pseudoconstants as one-time * quals. * 如果這個(gè)節(jié)點(diǎn)上附加了偽常量子句,插入一個(gè)gating Result節(jié)點(diǎn),該節(jié)點(diǎn)將偽常量計(jì)算為一次性條件quals。 */ gating_clauses = get_gating_quals(root, best_path->joinrestrictinfo); if (gating_clauses) plan = create_gating_plan(root, (Path *) best_path, plan, gating_clauses); #ifdef NOT_USED /* * * Expensive function pullups may have pulled local predicates * into * this path node. Put them in the qpqual of the plan node. * JMH, * 6/15/92 * pullups函數(shù)可能已經(jīng)把本地謂詞上拉到該訪問(wèn)路徑節(jié)點(diǎn)中,把這些信息放在Plan節(jié)點(diǎn)的qpqual中 */ if (get_loc_restrictinfo(best_path) != NIL) set_qpqual((Plan) plan, list_concat(get_qpqual((Plan) plan), get_actual_clauses(get_loc_restrictinfo(best_path)))); #endif return plan; } //------------------------------------------ create_nestloop_plan static NestLoop * create_nestloop_plan(PlannerInfo *root, NestPath *best_path) { NestLoop *join_plan; Plan *outer_plan; Plan *inner_plan; List *tlist = build_path_tlist(root, &best_path->path); List *joinrestrictclauses = best_path->joinrestrictinfo; List *joinclauses; List *otherclauses; Relids outerrelids; List *nestParams; Relids saveOuterRels = root->curOuterRels; ListCell *cell; ListCell *prev; ListCell *next; /* NestLoop can project, so no need to be picky about child tlists */ //NestLoop可以執(zhí)行投影操作,所以不需要關(guān)心子計(jì)劃的tlists //遞歸調(diào)用生成外表計(jì)劃 outer_plan = create_plan_recurse(root, best_path->outerjoinpath, 0); /* For a nestloop, include outer relids in curOuterRels for inner side */ //對(duì)于nestloop,對(duì)應(yīng)內(nèi)側(cè)的curOuterRels中需要包含外表的relids root->curOuterRels = bms_union(root->curOuterRels, best_path->outerjoinpath->parent->relids); //遞歸調(diào)用生成內(nèi)表計(jì)劃 inner_plan = create_plan_recurse(root, best_path->innerjoinpath, 0); /* Restore curOuterRels */ //恢復(fù)curOuterRels bms_free(root->curOuterRels); root->curOuterRels = saveOuterRels; /* Sort join qual clauses into best execution order */ //排序連接條件 joinrestrictclauses = order_qual_clauses(root, joinrestrictclauses); /* Get the join qual clauses (in plain expression form) */ /* Any pseudoconstant clauses are ignored here */ //獲取連接條件子句,在這里,會(huì)忽略偽常量 if (IS_OUTER_JOIN(best_path->jointype)) { extract_actual_join_clauses(joinrestrictclauses, best_path->path.parent->relids, &joinclauses, &otherclauses);//外連接 } else { /* We can treat all clauses alike for an inner join */ //內(nèi)連接 joinclauses = extract_actual_clauses(joinrestrictclauses, false); otherclauses = NIL; } /* Replace any outer-relation variables with nestloop params */ //使用nestloop參數(shù)替代外表變量 if (best_path->path.param_info) { joinclauses = (List *) replace_nestloop_params(root, (Node *) joinclauses); otherclauses = (List *) replace_nestloop_params(root, (Node *) otherclauses); } /* * Identify any nestloop parameters that should be supplied by this join * node, and move them from root->curOuterParams to the nestParams list. * 確定這個(gè)連接節(jié)點(diǎn)應(yīng)該提供的所有nestloop連接參數(shù), * 并將它們從root->curOuterParams移動(dòng)到nestParams鏈表中。 */ outerrelids = best_path->outerjoinpath->parent->relids; nestParams = NIL; prev = NULL; for (cell = list_head(root->curOuterParams); cell; cell = next)//遍歷curOuterParams { NestLoopParam *nlp = (NestLoopParam *) lfirst(cell);//獲取參數(shù) next = lnext(cell); if (IsA(nlp->paramval, Var) && bms_is_member(nlp->paramval->varno, outerrelids))//Var變量,而且是外層的relids { root->curOuterParams = list_delete_cell(root->curOuterParams, cell, prev); nestParams = lappend(nestParams, nlp); } else if (IsA(nlp->paramval, PlaceHolderVar) &&//PHV bms_overlap(((PlaceHolderVar *) nlp->paramval)->phrels, outerrelids) && bms_is_subset(find_placeholder_info(root, (PlaceHolderVar *) nlp->paramval, false)->ph_eval_at, outerrelids)) { root->curOuterParams = list_delete_cell(root->curOuterParams, cell, prev); nestParams = lappend(nestParams, nlp); } else prev = cell;//直接賦值 } join_plan = make_nestloop(tlist, joinclauses, otherclauses, nestParams, outer_plan, inner_plan, best_path->jointype, best_path->inner_unique);//構(gòu)造nestloop訪問(wèn)節(jié)點(diǎn) copy_generic_path_info(&join_plan->join.plan, &best_path->path); return join_plan; } //------------------------------------------ create_mergejoin_plan static MergeJoin * create_mergejoin_plan(PlannerInfo *root, MergePath *best_path) { MergeJoin *join_plan; Plan *outer_plan; Plan *inner_plan; List *tlist = build_path_tlist(root, &best_path->jpath.path); List *joinclauses; List *otherclauses; List *mergeclauses; List *outerpathkeys; List *innerpathkeys; int nClauses; Oid *mergefamilies; Oid *mergecollations; int *mergestrategies; bool *mergenullsfirst; PathKey *opathkey; EquivalenceClass *opeclass; int i; ListCell *lc; ListCell *lop; ListCell *lip; Path *outer_path = best_path->jpath.outerjoinpath; Path *inner_path = best_path->jpath.innerjoinpath; /* * MergeJoin can project, so we don't have to demand exact tlists from the * inputs. However, if we're intending to sort an input's result, it's * best to request a small tlist so we aren't sorting more data than * necessary. * MergeJoin可以進(jìn)行投影運(yùn)算,因此不必從輸入中要求精確的tlist。 * 然而,如果打算對(duì)輸入的結(jié)果進(jìn)行排序,最好是請(qǐng)求一個(gè)小的tlist,這樣就不會(huì)對(duì)多余的數(shù)據(jù)進(jìn)行排序。 */ //對(duì)外表生成計(jì)劃Plan outer_plan = create_plan_recurse(root, best_path->jpath.outerjoinpath, (best_path->outersortkeys != NIL) ? CP_SMALL_TLIST : 0); //對(duì)內(nèi)部生成計(jì)劃Plan inner_plan = create_plan_recurse(root, best_path->jpath.innerjoinpath, (best_path->innersortkeys != NIL) ? CP_SMALL_TLIST : 0); /* Sort join qual clauses into best execution order */ /* NB: do NOT reorder the mergeclauses */ //排序連接條件 joinclauses = order_qual_clauses(root, best_path->jpath.joinrestrictinfo); /* Get the join qual clauses (in plain expression form) */ /* Any pseudoconstant clauses are ignored here */ //獲取連接約束條件子句(以扁平化的形式) if (IS_OUTER_JOIN(best_path->jpath.jointype)) { extract_actual_join_clauses(joinclauses, best_path->jpath.path.parent->relids, &joinclauses, &otherclauses); } else { /* We can treat all clauses alike for an inner join */ //以內(nèi)連接的方式處理所有條件子句 joinclauses = extract_actual_clauses(joinclauses, false); otherclauses = NIL; } /* * Remove the mergeclauses from the list of join qual clauses, leaving the * list of quals that must be checked as qpquals. * 從join qual子句鏈表中刪除mergeclauses,將必須檢查為qpquals的quals鏈表保留下來(lái)。 */ mergeclauses = get_actual_clauses(best_path->path_mergeclauses); joinclauses = list_difference(joinclauses, mergeclauses); /* * Replace any outer-relation variables with nestloop params. There * should not be any in the mergeclauses. * 使用nestloop參數(shù)替代外表變量.這些變量不應(yīng)在mergeclauses中出現(xiàn). */ if (best_path->jpath.path.param_info) { joinclauses = (List *) replace_nestloop_params(root, (Node *) joinclauses);//連接條件 otherclauses = (List *) replace_nestloop_params(root, (Node *) otherclauses);//其他條件 } /* * Rearrange mergeclauses, if needed, so that the outer variable is always * on the left; mark the mergeclause restrictinfos with correct * outer_is_left status. * 如果需要,重新安排mergeclauses,使外部變量總是在左邊; * 用正確的outer_is_left狀態(tài)標(biāo)記mergeclause restrictinfos。 */ mergeclauses = get_switched_clauses(best_path->path_mergeclauses, best_path->jpath.outerjoinpath->parent->relids); /* * Create explicit sort nodes for the outer and inner paths if necessary. * 如需要?jiǎng)?chuàng)建顯式的Sort節(jié)點(diǎn) */ if (best_path->outersortkeys) { Relids outer_relids = outer_path->parent->relids; Sort *sort = make_sort_from_pathkeys(outer_plan, best_path->outersortkeys, outer_relids); label_sort_with_costsize(root, sort, -1.0); outer_plan = (Plan *) sort; outerpathkeys = best_path->outersortkeys; } else outerpathkeys = best_path->jpath.outerjoinpath->pathkeys; if (best_path->innersortkeys) { Relids inner_relids = inner_path->parent->relids; Sort *sort = make_sort_from_pathkeys(inner_plan, best_path->innersortkeys, inner_relids); label_sort_with_costsize(root, sort, -1.0); inner_plan = (Plan *) sort; innerpathkeys = best_path->innersortkeys; } else innerpathkeys = best_path->jpath.innerjoinpath->pathkeys; /* * If specified, add a materialize node to shield the inner plan from the * need to handle mark/restore. * 如指定物化,則添加物化節(jié)點(diǎn) */ if (best_path->materialize_inner) { Plan *matplan = (Plan *) make_material(inner_plan); /* * We assume the materialize will not spill to disk, and therefore * charge just cpu_operator_cost per tuple. (Keep this estimate in * sync with final_cost_mergejoin.) * 假設(shè)materialize不會(huì)溢出到磁盤,因此每個(gè)元組的成本為cpu_operator_cost。 * (讓這個(gè)估計(jì)與final_cost_mergejoin保持同步。) */ copy_plan_costsize(matplan, inner_plan); matplan->total_cost += cpu_operator_cost * matplan->plan_rows; inner_plan = matplan; } /* * Compute the opfamily/collation/strategy/nullsfirst arrays needed by the * executor. The information is in the pathkeys for the two inputs, but * we need to be careful about the possibility of mergeclauses sharing a * pathkey, as well as the possibility that the inner pathkeys are not in * an order matching the mergeclauses. * 計(jì)算執(zhí)行器需要的opfamily/collation/strategy/nullsfirst數(shù)組。 * 信息在這兩個(gè)輸入的pathkeys中,但是需要注意mergeclauses共享一個(gè)pathkey的可能性, * 以及內(nèi)表路徑鍵不符合mergeclauses順序的可能性。 */ nClauses = list_length(mergeclauses); Assert(nClauses == list_length(best_path->path_mergeclauses)); mergefamilies = (Oid *) palloc(nClauses * sizeof(Oid));//申請(qǐng)內(nèi)存 mergecollations = (Oid *) palloc(nClauses * sizeof(Oid)); mergestrategies = (int *) palloc(nClauses * sizeof(int)); mergenullsfirst = (bool *) palloc(nClauses * sizeof(bool)); opathkey = NULL; opeclass = NULL; lop = list_head(outerpathkeys); lip = list_head(innerpathkeys); i = 0; foreach(lc, best_path->path_mergeclauses)//遍歷條件 { RestrictInfo *rinfo = lfirst_node(RestrictInfo, lc); EquivalenceClass *oeclass; EquivalenceClass *ieclass; PathKey *ipathkey = NULL; EquivalenceClass *ipeclass = NULL; bool first_inner_match = false; /* fetch outer/inner eclass from mergeclause */ //從mergeclause中獲取outer/inner等價(jià)類 if (rinfo->outer_is_left) { oeclass = rinfo->left_ec; ieclass = rinfo->right_ec; } else { oeclass = rinfo->right_ec; ieclass = rinfo->left_ec; } Assert(oeclass != NULL); Assert(ieclass != NULL); /* * We must identify the pathkey elements associated with this clause * by matching the eclasses (which should give a unique match, since * the pathkey lists should be canonical). In typical cases the merge * clauses are one-to-one with the pathkeys, but when dealing with * partially redundant query conditions, things are more complicated. * 必須通過(guò)匹配等價(jià)類eclasses來(lái)標(biāo)識(shí)與此子句關(guān)聯(lián)的pathkey元素 * (它應(yīng)該提供唯一的匹配,因?yàn)閜athkey鏈表應(yīng)該是規(guī)范的)。 * 在典型的情況下,merge子句與pathkey是一對(duì)一的,但是在處理部分冗余查詢條件時(shí),事情就有些復(fù)雜了。 * * lop and lip reference the first as-yet-unmatched pathkey elements. * If they're NULL then all pathkey elements have been matched. * lop和lip引用第一個(gè)尚未匹配的pathkey元素。如果它們?yōu)榭眨敲此械膒athkey元素都已匹配。 * * The ordering of the outer pathkeys should match the mergeclauses, * by construction (see find_mergeclauses_for_outer_pathkeys()). There * could be more than one mergeclause for the same outer pathkey, but * no pathkey may be entirely skipped over. * 通過(guò)處理,外表pathkey順序應(yīng)該與mergeclauses匹配(參見find_mergeclauses_for_outer_pathkeys()函數(shù))。 * 同一個(gè)外表pathkey可以有多個(gè)mergeclause,但是不能完全跳過(guò)所有pathkey。 */ if (oeclass != opeclass) /* multiple matches are not interesting */ { /* doesn't match the current opathkey, so must match the next */ //與當(dāng)前的opathkey不匹配,那么必須與接下來(lái)的匹配 if (lop == NULL) elog(ERROR, "outer pathkeys do not match mergeclauses"); opathkey = (PathKey *) lfirst(lop); opeclass = opathkey->pk_eclass; lop = lnext(lop); if (oeclass != opeclass) elog(ERROR, "outer pathkeys do not match mergeclauses"); } /* * The inner pathkeys likewise should not have skipped-over keys, but * it's possible for a mergeclause to reference some earlier inner * pathkey if we had redundant pathkeys. For example we might have * mergeclauses like "o.a = i.x AND o.b = i.y AND o.c = i.x". The * implied inner ordering is then "ORDER BY x, y, x", but the pathkey * mechanism drops the second sort by x as redundant, and this code * must cope. * 同樣,內(nèi)表pathkey也不應(yīng)該有skipped-over keys,但是如果我們有冗余的路徑鍵, * mergeclause可以引用一些早期的內(nèi)部路徑鍵。 * 例如,我們可能存在下面的mergeclauses,比如"o.a = i.x AND o.b = i.y AND o.c = i.x"。 * 隱含的內(nèi)部排序是“x, y, x的排序”,但是pathkey機(jī)制將按x排序視為多余并刪除,在這里必須處理這種情況。 * * It's also possible for the implied inner-rel ordering to be like * "ORDER BY x, y, x DESC". We still drop the second instance of x as * redundant; but this means that the sort ordering of a redundant * inner pathkey should not be considered significant. So we must * detect whether this is the first clause matching an inner pathkey. * 對(duì)于隱含的內(nèi)表排序,也有可能是“ORDER BY x, y, x DESC”。 * 仍然將x的第二個(gè)實(shí)例視為冗余并刪除; * 但是這意味著冗余的內(nèi)表pathkey的排序順序不應(yīng)該被認(rèn)為是重要的。 * 因此,我們必須檢測(cè)這是否是與內(nèi)表pathkey匹配的第一個(gè)子句。 * */ if (lip) { ipathkey = (PathKey *) lfirst(lip); ipeclass = ipathkey->pk_eclass; if (ieclass == ipeclass) { /* successful first match to this inner pathkey */ //成功匹配 lip = lnext(lip); first_inner_match = true; } } if (!first_inner_match) { /* redundant clause ... must match something before lip */ //多余的條件子句,必須在lip前匹配某些pathkey ListCell *l2; foreach(l2, innerpathkeys) { if (l2 == lip) break; ipathkey = (PathKey *) lfirst(l2); ipeclass = ipathkey->pk_eclass; if (ieclass == ipeclass) break; } if (ieclass != ipeclass) elog(ERROR, "inner pathkeys do not match mergeclauses"); } /* * The pathkeys should always match each other as to opfamily and * collation (which affect equality), but if we're considering a * redundant inner pathkey, its sort ordering might not match. In * such cases we may ignore the inner pathkey's sort ordering and use * the outer's. (In effect, we're lying to the executor about the * sort direction of this inner column, but it does not matter since * the run-time row comparisons would only reach this column when * there's equality for the earlier column containing the same eclass. * There could be only one value in this column for the range of inner * rows having a given value in the earlier column, so it does not * matter which way we imagine this column to be ordered.) But a * non-redundant inner pathkey had better match outer's ordering too. * 對(duì)于opfamily和collation(這會(huì)影響等式),pathkey應(yīng)該總是匹配的, * 但是如果我們考慮一個(gè)冗余的內(nèi)表pathkey,它的排序順序可能不匹配。 * 在這種情況下,我們可以忽略內(nèi)表pathkey的排序順序,而使用外表訪問(wèn)路徑。 * (實(shí)際上,是在內(nèi)表列的排序方向上欺騙執(zhí)行器,但這無(wú)關(guān)緊要, * 因?yàn)檫\(yùn)行時(shí)行比較只在包含相同eclass的前一列相等時(shí)才會(huì)到達(dá)這一列。 * 對(duì)于在前一列中具有給定值的內(nèi)部行范圍,在此列中只能有一個(gè)值, * 因此我們認(rèn)為該列的順序如何并不重要。) * 而一個(gè)非冗余的內(nèi)部路徑也最好與外部的排序匹配。 */ if (opathkey->pk_opfamily != ipathkey->pk_opfamily || opathkey->pk_eclass->ec_collation != ipathkey->pk_eclass->ec_collation) elog(ERROR, "left and right pathkeys do not match in mergejoin"); if (first_inner_match && (opathkey->pk_strategy != ipathkey->pk_strategy || opathkey->pk_nulls_first != ipathkey->pk_nulls_first)) elog(ERROR, "left and right pathkeys do not match in mergejoin"); /* OK, save info for executor */ mergefamilies[i] = opathkey->pk_opfamily; mergecollations[i] = opathkey->pk_eclass->ec_collation; mergestrategies[i] = opathkey->pk_strategy; mergenullsfirst[i] = opathkey->pk_nulls_first; i++; } /* * Note: it is not an error if we have additional pathkey elements (i.e., * lop or lip isn't NULL here). The input paths might be better-sorted * than we need for the current mergejoin. * 注意:如果有額外的pathkey元素(例如, lop或lip在這里不是空的)。 * 輸入路徑可能比當(dāng)前合并連接所需的排序更好。 */ /* * Now we can build the mergejoin node. * 創(chuàng)建mergejoin節(jié)點(diǎn) */ join_plan = make_mergejoin(tlist, joinclauses, otherclauses, mergeclauses, mergefamilies, mergecollations, mergestrategies, mergenullsfirst, outer_plan, inner_plan, best_path->jpath.jointype, best_path->jpath.inner_unique, best_path->skip_mark_restore); /* Costs of sort and material steps are included in path cost already */ //排序和物化步驟一包含在訪問(wèn)路徑的成本中 copy_generic_path_info(&join_plan->join.plan, &best_path->jpath.path); return join_plan; } //------------------------------------------ create_hashjoin_plan static HashJoin * create_hashjoin_plan(PlannerInfo *root, HashPath *best_path) { HashJoin *join_plan; Hash *hash_plan; Plan *outer_plan; Plan *inner_plan; List *tlist = build_path_tlist(root, &best_path->jpath.path); List *joinclauses; List *otherclauses; List *hashclauses; Oid skewTable = InvalidOid; AttrNumber skewColumn = InvalidAttrNumber; bool skewInherit = false; /* * HashJoin can project, so we don't have to demand exact tlists from the * inputs. However, it's best to request a small tlist from the inner * side, so that we aren't storing more data than necessary. Likewise, if * we anticipate batching, request a small tlist from the outer side so * that we don't put extra data in the outer batch files. * HashJoin可以進(jìn)行投影運(yùn)算,因此我們不必從輸入中要求精確的tlist。 * 但是,最好從內(nèi)部請(qǐng)求一個(gè)小tlist,這樣就不需要存儲(chǔ)過(guò)多的數(shù)據(jù)。 * 同樣,如果我們進(jìn)行預(yù)批處理,從外部請(qǐng)求一個(gè)小tlist,這樣就不會(huì)在外部批處理文件中添加額外的數(shù)據(jù)。 */ outer_plan = create_plan_recurse(root, best_path->jpath.outerjoinpath, (best_path->num_batches > 1) ? CP_SMALL_TLIST : 0); inner_plan = create_plan_recurse(root, best_path->jpath.innerjoinpath, CP_SMALL_TLIST); /* Sort join qual clauses into best execution order */ joinclauses = order_qual_clauses(root, best_path->jpath.joinrestrictinfo); /* There's no point in sorting the hash clauses ... */ /* Get the join qual clauses (in plain expression form) */ /* Any pseudoconstant clauses are ignored here */ if (IS_OUTER_JOIN(best_path->jpath.jointype)) { extract_actual_join_clauses(joinclauses, best_path->jpath.path.parent->relids, &joinclauses, &otherclauses); } else { /* We can treat all clauses alike for an inner join */ joinclauses = extract_actual_clauses(joinclauses, false); otherclauses = NIL; } /* * Remove the hashclauses from the list of join qual clauses, leaving the * list of quals that must be checked as qpquals. * 從join qual子句列表中刪除hashclause,將必須檢查為qpquals的quals列表保留下來(lái)。 */ hashclauses = get_actual_clauses(best_path->path_hashclauses); joinclauses = list_difference(joinclauses, hashclauses); /* * Replace any outer-relation variables with nestloop params. There * should not be any in the hashclauses. * 用nestloop參數(shù)替換任何外部關(guān)系變量。而且不應(yīng)在hashclauses中出現(xiàn)。 */ if (best_path->jpath.path.param_info) { joinclauses = (List *) replace_nestloop_params(root, (Node *) joinclauses); otherclauses = (List *) replace_nestloop_params(root, (Node *) otherclauses); } /* * Rearrange hashclauses, if needed, so that the outer variable is always * on the left. * 重新安排hashclausees,以便外表的Var出現(xiàn)在左側(cè) */ hashclauses = get_switched_clauses(best_path->path_hashclauses, best_path->jpath.outerjoinpath->parent->relids); /* * If there is a single join clause and we can identify the outer variable * as a simple column reference, supply its identity for possible use in * skew optimization. (Note: in principle we could do skew optimization * with multiple join clauses, but we'd have to be able to determine the * most common combinations of outer values, which we don't currently have * enough stats for.) * 如果有一個(gè)連接條件子句,并且可以將外表變量標(biāo)識(shí)為一個(gè)簡(jiǎn)單的列引用, * 那么可以通過(guò)提供它的標(biāo)識(shí)以表在列數(shù)據(jù)傾斜優(yōu)化中使用。 * (注意:原則上可以使用多個(gè)連接子句進(jìn)行傾斜優(yōu)化, * 但我們必須能夠確定最常見的外部值組合,目前我們還沒(méi)有足夠的統(tǒng)計(jì)數(shù)據(jù)。) */ if (list_length(hashclauses) == 1) { OpExpr *clause = (OpExpr *) linitial(hashclauses); Node *node; Assert(is_opclause(clause)); node = (Node *) linitial(clause->args); if (IsA(node, RelabelType)) node = (Node *) ((RelabelType *) node)->arg; if (IsA(node, Var)) { Var *var = (Var *) node; RangeTblEntry *rte; rte = root->simple_rte_array[var->varno]; if (rte->rtekind == RTE_RELATION) { skewTable = rte->relid; skewColumn = var->varattno; skewInherit = rte->inh; } } } /* * Build the hash node and hash join node. * 創(chuàng)建hash節(jié)點(diǎn)和hash join節(jié)點(diǎn) */ hash_plan = make_hash(inner_plan, skewTable, skewColumn, skewInherit);//為內(nèi)表創(chuàng)建hash表 /* * Set Hash node's startup & total costs equal to total cost of input * plan; this only affects EXPLAIN display not decisions. * 設(shè)置哈希節(jié)點(diǎn)的啟動(dòng)和總成本等于輸入的計(jì)劃總成本; * 這只影響解釋顯示而不是決策。 */ copy_plan_costsize(&hash_plan->plan, inner_plan); hash_plan->plan.startup_cost = hash_plan->plan.total_cost; /* * If parallel-aware, the executor will also need an estimate of the total * number of rows expected from all participants so that it can size the * shared hash table. * 如果需要并行,執(zhí)行器還需要估計(jì)所有參與者預(yù)期的行數(shù),以便對(duì)共享哈希表進(jìn)行大小計(jì)算。 */ if (best_path->jpath.path.parallel_aware) { hash_plan->plan.parallel_aware = true; hash_plan->rows_total = best_path->inner_rows_total; } join_plan = make_hashjoin(tlist, joinclauses, otherclauses, hashclauses, outer_plan, (Plan *) hash_plan, best_path->jpath.jointype, best_path->jpath.inner_unique);//創(chuàng)建hash join節(jié)點(diǎn) copy_generic_path_info(&join_plan->join.plan, &best_path->jpath.path); return join_plan; }
測(cè)試腳本如下
testdb=# explain select dw.*,grjf.grbh,grjf.xm,grjf.ny,grjf.je testdb-# from t_dwxx dw,lateral (select gr.grbh,gr.xm,jf.ny,jf.je testdb(# from t_grxx gr inner join t_jfxx jf testdb(# on gr.dwbh = dw.dwbh testdb(# and gr.grbh = jf.grbh) grjf testdb-# where dw.dwbh in ('1001','1002') testdb-# order by dw.dwbh; QUERY PLAN -------------------------------------------------------------------------------------------------- Sort (cost=2010.12..2010.17 rows=20 width=47) Sort Key: dw.dwbh -> Nested Loop (cost=14.24..2009.69 rows=20 width=47) -> Hash Join (cost=13.95..2002.56 rows=20 width=32) Hash Cond: ((gr.dwbh)::text = (dw.dwbh)::text) -> Seq Scan on t_grxx gr (cost=0.00..1726.00 rows=100000 width=16) -> Hash (cost=13.92..13.92 rows=2 width=20) -> Index Scan using t_dwxx_pkey on t_dwxx dw (cost=0.29..13.92 rows=2 width=20) Index Cond: ((dwbh)::text = ANY ('{1001,1002}'::text[])) -> Index Scan using idx_t_jfxx_grbh on t_jfxx jf (cost=0.29..0.35 rows=1 width=20) Index Cond: ((grbh)::text = (gr.grbh)::text)
啟動(dòng)gdb,設(shè)置斷點(diǎn),進(jìn)入create_join_plan函數(shù)
(gdb) b create_join_plan Breakpoint 1 at 0x7b8426: file createplan.c, line 973. (gdb) c Continuing. Breakpoint 1, create_join_plan (root=0x2ef8a00, best_path=0x2f5ad40) at createplan.c:973 973 switch (best_path->path.pathtype)
查看輸入?yún)?shù),pathtype為T_NestLoop
(gdb) p *best_path $3 = {path = {type = T_NestPath, pathtype = T_NestLoop, parent = 0x2f5a570, pathtarget = 0x2f5a788, param_info = 0x0, parallel_aware = false, parallel_safe = true, parallel_workers = 0, rows = 20, startup_cost = 14.241722117799656, total_cost = 2009.6908721177995, pathkeys = 0x0}, jointype = JOIN_INNER, inner_unique = false, outerjoinpath = 0x2f58bb0, innerjoinpath = 0x2f56080, joinrestrictinfo = 0x0}
進(jìn)入create_nestloop_plan
973 switch (best_path->path.pathtype) (gdb) n 984 plan = (Plan *) create_nestloop_plan(root, (gdb) step create_nestloop_plan (root=0x2f49180, best_path=0x2f5ad40) at createplan.c:3678 3678 List *tlist = build_path_tlist(root, &best_path->path);
nestloop join->創(chuàng)建tlist,獲取連接條件等
3678 List *tlist = build_path_tlist(root, &best_path->path); (gdb) n 3679 List *joinrestrictclauses = best_path->joinrestrictinfo; (gdb) 3684 Relids saveOuterRels = root->curOuterRels; (gdb) p root->curOuterRels $1 = (Relids) 0x0
nestloop join->調(diào)用create_plan_recurse創(chuàng)建outer_plan
(gdb) n 3690 outer_plan = create_plan_recurse(root, best_path->outerjoinpath, 0); (gdb) Breakpoint 1, create_join_plan (root=0x2f49180, best_path=0x2f58bb0) at createplan.c:973 973 switch (best_path->path.pathtype)
nestloop join->外表對(duì)應(yīng)的outer_plan為T_HashJoin
(gdb) p *best_path $2 = {path = {type = T_HashPath, pathtype = T_HashJoin, parent = 0x2f572d0, pathtarget = 0x2f57508, param_info = 0x0, parallel_aware = false, parallel_safe = true, parallel_workers = 0, rows = 20, startup_cost = 13.949222117799655, total_cost = 2002.5604721177997, pathkeys = 0x0}, jointype = JOIN_INNER, inner_unique = true, outerjoinpath = 0x2f512f0, innerjoinpath = 0x2f51e98, joinrestrictinfo = 0x2f577a8} (gdb)
nestloop join->進(jìn)入create_hashjoin_plan
(gdb) n 980 plan = (Plan *) create_hashjoin_plan(root, (gdb) step create_hashjoin_plan (root=0x2f49180, best_path=0x2f58bb0) at createplan.c:4093 4093 List *tlist = build_path_tlist(root, &best_path->jpath.path);
hash join->創(chuàng)建outer plan
(gdb) 4108 outer_plan = create_plan_recurse(root, best_path->jpath.outerjoinpath, (gdb) p *best_path->jpath.outerjoinpath $4 = {type = T_Path, pathtype = T_SeqScan, parent = 0x2f06090, pathtarget = 0x2f062c8, param_info = 0x0, parallel_aware = false, parallel_safe = true, parallel_workers = 0, rows = 100000, startup_cost = 0, total_cost = 1726, pathkeys = 0x0}
hash join->創(chuàng)建inner plan
(gdb) p *best_path->jpath.innerjoinpath $5 = {type = T_IndexPath, pathtype = T_IndexScan, parent = 0x2f04b60, pathtarget = 0x2f04d98, param_info = 0x0, parallel_aware = false, parallel_safe = true, parallel_workers = 0, rows = 2, startup_cost = 0.28500000000000003, total_cost = 13.924222117799655, pathkeys = 0x2f51e20}
hash join->獲取連接條件
(gdb) n 4115 joinclauses = order_qual_clauses(root, best_path->jpath.joinrestrictinfo); (gdb) 4120 if (IS_OUTER_JOIN(best_path->jpath.jointype)) (gdb) p *joinclauses $6 = {type = T_List, length = 1, head = 0x2f57780, tail = 0x2f57780}
hash join->處理連接條件&hash條件
(gdb) n 4137 hashclauses = get_actual_clauses(best_path->path_hashclauses); (gdb) 4138 joinclauses = list_difference(joinclauses, hashclauses); (gdb) 4144 if (best_path->jpath.path.param_info) (gdb) p *hashclauses $8 = {type = T_List, length = 1, head = 0x2f5d690, tail = 0x2f5d690} (gdb) p *joinclauses Cannot access memory at address 0x0
hash join->變換位置,把外表的Var放在左側(cè)
(gdb) n 4156 hashclauses = get_switched_clauses(best_path->path_hashclauses, (gdb)
hash join->Hash連接條件只有一個(gè),進(jìn)行數(shù)據(jù)傾斜優(yōu)化
(gdb) 4167 if (list_length(hashclauses) == 1) (gdb) n 4169 OpExpr *clause = (OpExpr *) linitial(hashclauses); (gdb) n 4172 Assert(is_opclause(clause)); (gdb) 4173 node = (Node *) linitial(clause->args); (gdb) 4174 if (IsA(node, RelabelType)) (gdb) 4175 node = (Node *) ((RelabelType *) node)->arg; (gdb) 4176 if (IsA(node, Var)) (gdb) 4178 Var *var = (Var *) node; (gdb) 4181 rte = root->simple_rte_array[var->varno]; (gdb) p *node $9 = {type = T_Var} (gdb) p *(Var *)node $10 = {xpr = {type = T_Var}, varno = 3, varattno = 1, vartype = 1043, vartypmod = 14, varcollid = 100, varlevelsup = 0, varnoold = 3, varoattno = 1, location = 208} (gdb) n 4182 if (rte->rtekind == RTE_RELATION) (gdb) 4184 skewTable = rte->relid; (gdb) 4185 skewColumn = var->varattno; (gdb) 4186 skewInherit = rte->inh; (gdb)
hash join->開始創(chuàng)建創(chuàng)建hash節(jié)點(diǎn)和hash join節(jié)點(diǎn)
創(chuàng)建hash節(jié)點(diǎn)(構(gòu)建Hash表)
4194 hash_plan = make_hash(inner_plan, (gdb) n 4203 copy_plan_costsize(&hash_plan->plan, inner_plan); (gdb) 4204 hash_plan->plan.startup_cost = hash_plan->plan.total_cost; (gdb) p *hash_plan $11 = {plan = {type = T_Hash, startup_cost = 0.28500000000000003, total_cost = 13.924222117799655, plan_rows = 2, plan_width = 20, parallel_aware = false, parallel_safe = true, plan_node_id = 0, targetlist = 0x2f5d250, qual = 0x0, lefttree = 0x2f58428, righttree = 0x0, initPlan = 0x0, extParam = 0x0, allParam = 0x0}, skewTable = 16742, skewColumn = 1, skewInherit = false, rows_total = 0}
hash join->創(chuàng)建hash join節(jié)點(diǎn)
(gdb) n 4217 join_plan = make_hashjoin(tlist, (gdb) 4226 copy_generic_path_info(&join_plan->join.plan, &best_path->jpath.path); (gdb) 4228 return join_plan; (gdb) p *join_plan $13 = {join = {plan = {type = T_HashJoin, startup_cost = 13.949222117799655, total_cost = 2002.5604721177997, plan_rows = 20, plan_width = 32, parallel_aware = false, parallel_safe = true, plan_node_id = 0, targetlist = 0x2f5cb28, qual = 0x0, lefttree = 0x2f5ae98, righttree = 0x2f5d830, initPlan = 0x0, extParam = 0x0, allParam = 0x0}, jointype = JOIN_INNER, inner_unique = true, joinqual = 0x0}, hashclauses = 0x2f5d7f8}
hash join->回到create_nestloop_plan
(gdb) n create_nestloop_plan (root=0x2f49180, best_path=0x2f5ad40) at createplan.c:3694 3694 best_path->outerjoinpath->parent->relids); (gdb) n 3693 root->curOuterRels = bms_union(root->curOuterRels,
nestloop join->創(chuàng)建內(nèi)表Plan
(gdb) n 3696 inner_plan = create_plan_recurse(root, best_path->innerjoinpath, 0); (gdb) p *best_path->innerjoinpath $16 = {type = T_IndexPath, pathtype = T_IndexScan, parent = 0x2f06858, pathtarget = 0x2f06a70, param_info = 0x2f56910, parallel_aware = false, parallel_safe = true, parallel_workers = 0, rows = 1, startup_cost = 0.29249999999999998, total_cost = 0.34651999999999999, pathkeys = 0x2f56608}
nestloop join->獲取連接條件子句
(gdb) n 3699 bms_free(root->curOuterRels); (gdb) 3700 root->curOuterRels = saveOuterRels; (gdb) 3703 joinrestrictclauses = order_qual_clauses(root, joinrestrictclauses); (gdb) 3707 if (IS_OUTER_JOIN(best_path->jointype)) (gdb) p *joinrestrictclauses Cannot access memory at address 0x0
nestloop join->獲取連接條件&參數(shù)化處理(相關(guān)值為NULL)
(gdb) n 3716 joinclauses = extract_actual_clauses(joinrestrictclauses, false); (gdb) 3717 otherclauses = NIL; (gdb) 3721 if (best_path->path.param_info) (gdb) p *joinclauses Cannot access memory at address 0x0 (gdb) p *best_path->path.param_info Cannot access memory at address 0x0
nestloop join->獲取外表的relids(外表為1和3號(hào)RTE的連接)
(gdb) n 3733 outerrelids = best_path->outerjoinpath->parent->relids; (gdb) 3734 nestParams = NIL; (gdb) p *outerrelids $17 = {nwords = 1, words = 0x2f574ec} (gdb) p *outerrelids->words $18 = 10
nestloop join->遍歷當(dāng)前的外表參數(shù)鏈表
(gdb) n 3735 prev = NULL; (gdb) 3736 for (cell = list_head(root->curOuterParams); cell; cell = next) (gdb) p *root->curOuterParams $19 = {type = T_List, length = 1, head = 0x2f5df98, tail = 0x2f5df98}
nestloop join->查看該參數(shù)信息,3號(hào)RTE編號(hào)為2的字段(即grbh)
(gdb) n 3738 NestLoopParam *nlp = (NestLoopParam *) lfirst(cell); (gdb) 3740 next = lnext(cell); (gdb) p *(NestLoopParam *)nlp $21 = {type = T_NestLoopParam, paramno = 0, paramval = 0x2f54e50} (gdb) p *nlp->paramval $22 = {xpr = {type = T_Var}, varno = 3, varattno = 2, vartype = 1043, vartypmod = 14, varcollid = 100, varlevelsup = 0, varnoold = 3, varoattno = 2, location = 273}
nestloop join->把條件從root->curOuterParams移動(dòng)到nestParams鏈表中
(gdb) n 3741 if (IsA(nlp->paramval, Var) && (gdb) n 3742 bms_is_member(nlp->paramval->varno, outerrelids)) (gdb) 3741 if (IsA(nlp->paramval, Var) && (gdb) 3744 root->curOuterParams = list_delete_cell(root->curOuterParams, (gdb) 3746 nestParams = lappend(nestParams, nlp); (gdb) 3736 for (cell = list_head(root->curOuterParams); cell; cell = next) (gdb) p *nestParams $23 = {type = T_List, length = 1, head = 0x2f5df98, tail = 0x2f5df98} (gdb) p *(Node *)nestParams->head->data.ptr_value $24 = {type = T_NestLoopParam} (gdb) p *(NestLoopParam *)nestParams->head->data.ptr_value $25 = {type = T_NestLoopParam, paramno = 0, paramval = 0x2f54e50} (gdb) set $nlp=(NestLoopParam *)nestParams->head->data.ptr_value (gdb) p $nlp->paramval $26 = (Var *) 0x2f54e50 (gdb) p *$nlp->paramval $27 = {xpr = {type = T_Var}, varno = 3, varattno = 2, vartype = 1043, vartypmod = 14, varcollid = 100, varlevelsup = 0, varnoold = 3, varoattno = 2, location = 273} (gdb)
nestloop join->創(chuàng)建nestloop join節(jié)點(diǎn)
(gdb) n 3771 best_path->inner_unique); (gdb) 3764 join_plan = make_nestloop(tlist, (gdb) 3773 copy_generic_path_info(&join_plan->join.plan, &best_path->path); (gdb) 3775 return join_plan; (gdb) p *join_plan $28 = {join = {plan = {type = T_NestLoop, startup_cost = 14.241722117799656, total_cost = 2009.6908721177995, plan_rows = 20, plan_width = 47, parallel_aware = false, parallel_safe = true, plan_node_id = 0, targetlist = 0x2f5c770, qual = 0x0, lefttree = 0x2f5d8c8, righttree = 0x2f59ed0, initPlan = 0x0, extParam = 0x0, allParam = 0x0}, jointype = JOIN_INNER, inner_unique = false, joinqual = 0x0}, nestParams = 0x2f5dfc0} (gdb)
“PostgreSQL中create_plan函數(shù)連接計(jì)劃的實(shí)現(xiàn)過(guò)程是什么”的內(nèi)容就介紹到這里了,感謝大家的閱讀。如果想了解更多行業(yè)相關(guān)的知識(shí)可以關(guān)注億速云網(wǎng)站,小編將為大家輸出更多高質(zhì)量的實(shí)用文章!
免責(zé)聲明:本站發(fā)布的內(nèi)容(圖片、視頻和文字)以原創(chuàng)、轉(zhuǎn)載和分享為主,文章觀點(diǎn)不代表本網(wǎng)站立場(chǎng),如果涉及侵權(quán)請(qǐng)聯(lián)系站長(zhǎng)郵箱:is@yisu.com進(jìn)行舉報(bào),并提供相關(guān)證據(jù),一經(jīng)查實(shí),將立刻刪除涉嫌侵權(quán)內(nèi)容。