make_join_rel->populate_joinrel_with_paths->add_paths_to_j..."/>
您好,登錄后才能下訂單哦!
本節(jié)大體介紹了動(dòng)態(tài)規(guī)劃算法實(shí)現(xiàn)(standard_join_search)中的join_search_one_level->make_join_rel->populate_joinrel_with_paths->add_paths_to_joinrel->match_unsorted_outer中的initial_cost_nestloop和final_cost_nestloop函數(shù),這些函數(shù)用于計(jì)算nestloop join的Cost。
Cost相關(guān)
注意:實(shí)際使用的參數(shù)值通過(guò)系統(tǒng)配置文件定義,而不是這里的常量定義!
typedef double Cost; /* execution cost (in page-access units) */
/* defaults for costsize.c's Cost parameters */
/* NB: cost-estimation code should use the variables, not these constants! */
/* 注意:實(shí)際值通過(guò)系統(tǒng)配置文件定義,而不是這里的常量定義! */
/* If you change these, update backend/utils/misc/postgresql.sample.conf */
#define DEFAULT_SEQ_PAGE_COST 1.0 //順序掃描page的成本
#define DEFAULT_RANDOM_PAGE_COST 4.0 //隨機(jī)掃描page的成本
#define DEFAULT_CPU_TUPLE_COST 0.01 //處理一個(gè)元組的CPU成本
#define DEFAULT_CPU_INDEX_TUPLE_COST 0.005 //處理一個(gè)索引元組的CPU成本
#define DEFAULT_CPU_OPERATOR_COST 0.0025 //執(zhí)行一次操作或函數(shù)的CPU成本
#define DEFAULT_PARALLEL_TUPLE_COST 0.1 //并行執(zhí)行,從一個(gè)worker傳輸一個(gè)元組到另一個(gè)worker的成本
#define DEFAULT_PARALLEL_SETUP_COST 1000.0 //構(gòu)建并行執(zhí)行環(huán)境的成本
#define DEFAULT_EFFECTIVE_CACHE_SIZE 524288 /*先前已有介紹, measured in pages */
double seq_page_cost = DEFAULT_SEQ_PAGE_COST;
double random_page_cost = DEFAULT_RANDOM_PAGE_COST;
double cpu_tuple_cost = DEFAULT_CPU_TUPLE_COST;
double cpu_index_tuple_cost = DEFAULT_CPU_INDEX_TUPLE_COST;
double cpu_operator_cost = DEFAULT_CPU_OPERATOR_COST;
double parallel_tuple_cost = DEFAULT_PARALLEL_TUPLE_COST;
double parallel_setup_cost = DEFAULT_PARALLEL_SETUP_COST;
int effective_cache_size = DEFAULT_EFFECTIVE_CACHE_SIZE;
Cost disable_cost = 1.0e10;//1后面10個(gè)0,通過(guò)設(shè)置一個(gè)巨大的成本,讓優(yōu)化器自動(dòng)放棄此路徑
int max_parallel_workers_per_gather = 2;//每次gather使用的worker數(shù)
nested loop join的算法實(shí)現(xiàn)偽代碼如下:
FOR row#1 IN (select * from dataset#1) LOOP
FOR row#2 IN (select * from dataset#2 where row#1 is matched) LOOP
output values from row#1 and row#2
END LOOP
END LOOP
initial_cost_nestloop
該函數(shù)預(yù)估nestloop join訪問(wèn)路徑的成本
//---------------------------------------------------------------------- initial_cost_nestloop
/*
* initial_cost_nestloop
* Preliminary estimate of the cost of a nestloop join path.
* 預(yù)估nestloop join訪問(wèn)路徑的成本
*
* This must quickly produce lower-bound estimates of the path's startup and
* total costs. If we are unable to eliminate the proposed path from
* consideration using the lower bounds, final_cost_nestloop will be called
* to obtain the final estimates.
*
* The exact division of labor between this function and final_cost_nestloop
* is private to them, and represents a tradeoff between speed of the initial
* estimate and getting a tight lower bound. We choose to not examine the
* join quals here, since that's by far the most expensive part of the
* calculations. The end result is that CPU-cost considerations must be
* left for the second phase; and for SEMI/ANTI joins, we must also postpone
* incorporation of the inner path's run cost.
*
* 'workspace' is to be filled with startup_cost, total_cost, and perhaps
* other data to be used by final_cost_nestloop
* 'jointype' is the type of join to be performed
* 'outer_path' is the outer input to the join
* 'inner_path' is the inner input to the join
* 'extra' contains miscellaneous information about the join
*/
void
initial_cost_nestloop(PlannerInfo *root, JoinCostWorkspace *workspace,
JoinType jointype,
Path *outer_path, Path *inner_path,
JoinPathExtraData *extra)
{
Cost startup_cost = 0;
Cost run_cost = 0;
double outer_path_rows = outer_path->rows;
Cost inner_rescan_start_cost;
Cost inner_rescan_total_cost;
Cost inner_run_cost;
Cost inner_rescan_run_cost;
/* 估算重新掃描內(nèi)表的成本.estimate costs to rescan the inner relation */
cost_rescan(root, inner_path,
&inner_rescan_start_cost,
&inner_rescan_total_cost);
/* cost of source data */
/*
* NOTE: clearly, we must pay both outer and inner paths' startup_cost
* before we can start returning tuples, so the join's startup cost is
* their sum. We'll also pay the inner path's rescan startup cost
* multiple times.
* 注意:顯然,在開(kāi)始返回元組之前,必須耗費(fèi)外表訪問(wèn)路徑和內(nèi)表訪問(wèn)路徑的啟動(dòng)成本startup_cost,
* 因此連接的啟動(dòng)成本是它們的總和。我們還需要多次計(jì)算內(nèi)表訪問(wèn)路徑的重新掃描啟動(dòng)成本。
*/
startup_cost += outer_path->startup_cost + inner_path->startup_cost;
run_cost += outer_path->total_cost - outer_path->startup_cost;
if (outer_path_rows > 1)
run_cost += (outer_path_rows - 1) * inner_rescan_start_cost;
inner_run_cost = inner_path->total_cost - inner_path->startup_cost;
inner_rescan_run_cost = inner_rescan_total_cost - inner_rescan_start_cost;
if (jointype == JOIN_SEMI || jointype == JOIN_ANTI ||
extra->inner_unique)
{
/*
* With a SEMI or ANTI join, or if the innerrel is known unique, the
* executor will stop after the first match.
* 對(duì)于半連接或反連接,或者如果內(nèi)表已知是唯一的,執(zhí)行器將在第一次匹配后停止。
*
* Getting decent estimates requires inspection of the join quals,
* which we choose to postpone to final_cost_nestloop.
* 獲得適當(dāng)?shù)墓浪阈枰獧z查join quals,我們選擇將其推遲到final_cost_nestloop中實(shí)現(xiàn)。
*/
/* Save private data for final_cost_nestloop */
workspace->inner_run_cost = inner_run_cost;
workspace->inner_rescan_run_cost = inner_rescan_run_cost;
}
else
{
/* Normal case; we'll scan whole input rel for each outer row */
//常規(guī)的情況,對(duì)于每一個(gè)外表行,將掃描整個(gè)內(nèi)表
run_cost += inner_run_cost;
if (outer_path_rows > 1)
run_cost += (outer_path_rows - 1) * inner_rescan_run_cost;
}
/* CPU costs left for later */
/* Public result fields */
//結(jié)果賦值
workspace->startup_cost = startup_cost;
workspace->total_cost = startup_cost + run_cost;
/* Save private data for final_cost_nestloop */
workspace->run_cost = run_cost;
}
//-------------------------------------- cost_rescan
/*
* cost_rescan
* Given a finished Path, estimate the costs of rescanning it after
* having done so the first time. For some Path types a rescan is
* cheaper than an original scan (if no parameters change), and this
* function embodies knowledge about that. The default is to return
* the same costs stored in the Path. (Note that the cost estimates
* actually stored in Paths are always for first scans.)
* 給定一個(gè)已完成的訪問(wèn)路徑,估算在第一次完成之后重新掃描它的成本。
* 對(duì)于某些訪問(wèn)路徑,重新掃描比原始掃描成本更低(如果沒(méi)有參數(shù)更改),
* 該函數(shù)體現(xiàn)了這方面的信息。
* 默認(rèn)值是返回存儲(chǔ)在路徑中的相同成本。
* (請(qǐng)注意,實(shí)際存儲(chǔ)在路徑中的成本估算總是用于首次掃描。)
*
* This function is not currently intended to model effects such as rescans
* being cheaper due to disk block caching; what we are concerned with is
* plan types wherein the executor caches results explicitly, or doesn't
* redo startup calculations, etc.
* 該函數(shù)目前并不打算對(duì)諸如由于磁盤(pán)塊緩存而使后續(xù)的掃描成本更低等效果進(jìn)行建模;
* 我們關(guān)心的是計(jì)劃類(lèi)型,其中執(zhí)行器顯式緩存結(jié)果,或不重做啟動(dòng)計(jì)算,等等。
*/
static void
cost_rescan(PlannerInfo *root, Path *path,
Cost *rescan_startup_cost, /* output parameters */
Cost *rescan_total_cost)
{
switch (path->pathtype)//路徑類(lèi)型
{
case T_FunctionScan:
/*
* Currently, nodeFunctionscan.c always executes the function to
* completion before returning any rows, and caches the results in
* a tuplestore. So the function eval cost is all startup cost
* and isn't paid over again on rescans. However, all run costs
* will be paid over again.
* 目前nodeFunctionscan.c在在返回?cái)?shù)據(jù)行之前完成執(zhí)行,同時(shí)會(huì)在tuplestore中緩存結(jié)果.
* 因此,函數(shù)估算成本是啟動(dòng)成本,同時(shí)不需要考慮重新掃描.
* 但是所有的運(yùn)行成本在重新掃描時(shí)是需要的.
*/
*rescan_startup_cost = 0;
*rescan_total_cost = path->total_cost - path->startup_cost;
break;
case T_HashJoin://Hash Join
/*
* If it's a single-batch join, we don't need to rebuild the hash
* table during a rescan.
* 如果是單批連接,則不需要在重新掃描期間重新構(gòu)建哈希表。
*/
if (((HashPath *) path)->num_batches == 1)
{
/* Startup cost is exactly the cost of hash table building */
*rescan_startup_cost = 0;//啟動(dòng)成本為0(構(gòu)建hash表)
*rescan_total_cost = path->total_cost - path->startup_cost;
}
else
{
/* Otherwise, no special treatment */
*rescan_startup_cost = path->startup_cost;
*rescan_total_cost = path->total_cost;
}
break;
case T_CteScan:
case T_WorkTableScan:
{
/*
* These plan types materialize their final result in a
* tuplestore or tuplesort object. So the rescan cost is only
* cpu_tuple_cost per tuple, unless the result is large enough
* to spill to disk.
* 這些計(jì)劃類(lèi)型在tuplestore或tuplesort對(duì)象中物化它們的最終結(jié)果。
* 因此,重新掃描的成本只是每個(gè)元組的cpu_tuple_cost,除非結(jié)果大到足以溢出到磁盤(pán)。
*/
Cost run_cost = cpu_tuple_cost * path->rows;
double nbytes = relation_byte_size(path->rows,
path->pathtarget->width);
long work_mem_bytes = work_mem * 1024L;
if (nbytes > work_mem_bytes)
{
/* It will spill, so account for re-read cost */
//如果溢出到磁盤(pán),那么需考慮重新讀取的成本
double npages = ceil(nbytes / BLCKSZ);
run_cost += seq_page_cost * npages;
}
*rescan_startup_cost = 0;
*rescan_total_cost = run_cost;
}
break;
case T_Material:
case T_Sort:
{
/*
* These plan types not only materialize their results, but do
* not implement qual filtering or projection. So they are
* even cheaper to rescan than the ones above. We charge only
* cpu_operator_cost per tuple. (Note: keep that in sync with
* the run_cost charge in cost_sort, and also see comments in
* cost_material before you change it.)
* 這些計(jì)劃類(lèi)型不僅物化了它們的結(jié)果,而且沒(méi)有物化qual條件子句過(guò)濾或投影。
* 所以重新掃描比上面的路徑成本更低。每個(gè)元組耗費(fèi)的成本只有cpu_operator_cost。
* (注意:請(qǐng)與cost_sort中的run_cost成本保持一致,并在更改cost_material之前查看注釋)。
*/
Cost run_cost = cpu_operator_cost * path->rows;
double nbytes = relation_byte_size(path->rows,
path->pathtarget->width);
long work_mem_bytes = work_mem * 1024L;
if (nbytes > work_mem_bytes)
{
/* It will spill, so account for re-read cost */
//如果溢出到磁盤(pán),那么需考慮重新讀取的成本
double npages = ceil(nbytes / BLCKSZ);
run_cost += seq_page_cost * npages;
}
*rescan_startup_cost = 0;
*rescan_total_cost = run_cost;
}
break;
default:
*rescan_startup_cost = path->startup_cost;
*rescan_total_cost = path->total_cost;
break;
}
}
final_cost_nestloop
該函數(shù)實(shí)現(xiàn)nestloop join訪問(wèn)路徑的成本和結(jié)果大小的最終估算。
//---------------------------------------------------------------------- final_cost_nestloop
/*
* final_cost_nestloop
* Final estimate of the cost and result size of a nestloop join path.
* nestloop join訪問(wèn)路徑的成本和結(jié)果大小的最終估算。
*
* 'path' is already filled in except for the rows and cost fields
* 'workspace' is the result from initial_cost_nestloop
* 'extra' contains miscellaneous information about the join
*/
void
final_cost_nestloop(PlannerInfo *root, NestPath *path,//NL訪問(wèn)路徑
JoinCostWorkspace *workspace,//initial_cost_nestloop返回的結(jié)果
JoinPathExtraData *extra)//額外的信息
{
Path *outer_path = path->outerjoinpath;//外表訪問(wèn)路徑
Path *inner_path = path->innerjoinpath;//內(nèi)部訪問(wèn)路徑
double outer_path_rows = outer_path->rows;//外表訪問(wèn)路徑行數(shù)
double inner_path_rows = inner_path->rows;//內(nèi)部訪問(wèn)路徑行數(shù)
Cost startup_cost = workspace->startup_cost;//啟動(dòng)成本
Cost run_cost = workspace->run_cost;//運(yùn)行成本
Cost cpu_per_tuple;//處理每個(gè)tuple的CPU成本
QualCost restrict_qual_cost;//表達(dá)式處理成本
double ntuples;//元組數(shù)目
/* Protect some assumptions below that rowcounts aren't zero or NaN */
//確保參數(shù)正確
if (outer_path_rows <= 0 || isnan(outer_path_rows))
outer_path_rows = 1;
if (inner_path_rows <= 0 || isnan(inner_path_rows))
inner_path_rows = 1;
/* Mark the path with the correct row estimate */
//修正行數(shù)估算
if (path->path.param_info)
path->path.rows = path->path.param_info->ppi_rows;
else
path->path.rows = path->path.parent->rows;
/* For partial paths, scale row estimate. */
//調(diào)整并行執(zhí)行的行數(shù)估算
if (path->path.parallel_workers > 0)
{
double parallel_divisor = get_parallel_divisor(&path->path);
path->path.rows =
clamp_row_est(path->path.rows / parallel_divisor);
}
/*
* We could include disable_cost in the preliminary estimate, but that
* would amount to optimizing for the case where the join method is
* disabled, which doesn't seem like the way to bet.
* 我們可以將disable_cost包含在初步估算中,但是這相當(dāng)于為禁用join方法的情況進(jìn)行了優(yōu)化。
*/
if (!enable_nestloop)
startup_cost += disable_cost;
/* cost of inner-relation source data (we already dealt with outer rel) */
// 內(nèi)部源數(shù)據(jù)的成本
if (path->jointype == JOIN_SEMI || path->jointype == JOIN_ANTI ||
extra->inner_unique)//半連接/反連接或者內(nèi)部返回唯一值
{
/*
* With a SEMI or ANTI join, or if the innerrel is known unique, the
* executor will stop after the first match.
* 執(zhí)行器在第一次匹配后立即停止.
*/
Cost inner_run_cost = workspace->inner_run_cost;
Cost inner_rescan_run_cost = workspace->inner_rescan_run_cost;
double outer_matched_rows;
double outer_unmatched_rows;
Selectivity inner_scan_frac;
/*
* For an outer-rel row that has at least one match, we can expect the
* inner scan to stop after a fraction 1/(match_count+1) of the inner
* rows, if the matches are evenly distributed. Since they probably
* aren't quite evenly distributed, we apply a fuzz factor of 2.0 to
* that fraction. (If we used a larger fuzz factor, we'd have to
* clamp inner_scan_frac to at most 1.0; but since match_count is at
* least 1, no such clamp is needed now.)
* 對(duì)于至少有一個(gè)匹配的外表行,如果匹配是均勻分布的,
* 那么可以任務(wù)內(nèi)表掃描在內(nèi)部行的:1/(match_count+1)之后停止。
* 因?yàn)樗鼈兛赡懿皇蔷鶆蚍植嫉?,所以我們?duì)這個(gè)分?jǐn)?shù)乘上2.0的系數(shù)。
* (如果我們使用更大的因子,我們將不得不將inner_scan_frac限制為1.0;
* 但是因?yàn)閙atch_count至少為1,所以現(xiàn)在不需要這樣的處理了。)
*/
outer_matched_rows = rint(outer_path_rows * extra->semifactors.outer_match_frac);
outer_unmatched_rows = outer_path_rows - outer_matched_rows;
inner_scan_frac = 2.0 / (extra->semifactors.match_count + 1.0);
/*
* Compute number of tuples processed (not number emitted!). First,
* account for successfully-matched outer rows.
* 計(jì)算處理的元組的數(shù)量(而不是發(fā)出的數(shù)量!)
* 首先,計(jì)算成功匹配的外表行。
*/
ntuples = outer_matched_rows * inner_path_rows * inner_scan_frac;
/*
* Now we need to estimate the actual costs of scanning the inner
* relation, which may be quite a bit less than N times inner_run_cost
* due to early scan stops. We consider two cases. If the inner path
* is an indexscan using all the joinquals as indexquals, then an
* unmatched outer row results in an indexscan returning no rows,
* which is probably quite cheap. Otherwise, the executor will have
* to scan the whole inner rel for an unmatched row; not so cheap.
* 現(xiàn)在需要估算掃描內(nèi)部關(guān)系的實(shí)際成本,由于先前的掃描停止,這可能會(huì)比inner_run_cost小很多。
* 需要考慮兩種情況。如果內(nèi)表訪問(wèn)路徑是使用所有的joinquals連接條件作為indexquals的索引掃描indexscan,
* 那么一個(gè)不匹配的外部行將導(dǎo)致indexscan不返回任何行,這個(gè)成本可能相當(dāng)?shù)汀? * 否則,執(zhí)行器將必須掃描整個(gè)內(nèi)表以尋找一個(gè)不匹配的行;這個(gè)成本就非常的高了。
*/
if (has_indexed_join_quals(path))//連接條件上存在索引
{
/*
* Successfully-matched outer rows will only require scanning
* inner_scan_frac of the inner relation. In this case, we don't
* need to charge the full inner_run_cost even when that's more
* than inner_rescan_run_cost, because we can assume that none of
* the inner scans ever scan the whole inner relation. So it's
* okay to assume that all the inner scan executions can be
* fractions of the full cost, even if materialization is reducing
* the rescan cost. At this writing, it's impossible to get here
* for a materialized inner scan, so inner_run_cost and
* inner_rescan_run_cost will be the same anyway; but just in
* case, use inner_run_cost for the first matched tuple and
* inner_rescan_run_cost for additional ones.
* 成功匹配的外部行只需要掃描內(nèi)部關(guān)系的一部分(比例是inner_scan_frac)。
* 在這種情況下,不需要增加整個(gè)內(nèi)表運(yùn)行成本inner_run_cost,即使這比inner_rescan_run_cost還要高,
* 因?yàn)榭梢约僭O(shè)任何內(nèi)表掃描都不會(huì)掃描整個(gè)內(nèi)表。
* 因此,可以假設(shè)所有內(nèi)部掃描執(zhí)行都可以是全部成本的一部分,即使物化降低了重新掃描成本。
* 在寫(xiě)這篇文章時(shí),不可能在這里進(jìn)行物化的內(nèi)部掃描,因此inner_run_cost和inner_rescan_run_cost是相同的;
* 但是為了以防萬(wàn)一,對(duì)第一個(gè)匹配的元組使用inner_run_cost,對(duì)其他元組使用inner_rescan_run_cost。
*/
run_cost += inner_run_cost * inner_scan_frac;
if (outer_matched_rows > 1)
run_cost += (outer_matched_rows - 1) * inner_rescan_run_cost * inner_scan_frac;
/*
* Add the cost of inner-scan executions for unmatched outer rows.
* We estimate this as the same cost as returning the first tuple
* of a nonempty scan. We consider that these are all rescans,
* since we used inner_run_cost once already.
* 為不匹配的外部行添加內(nèi)部掃描執(zhí)行的成本。
* 這與返回非空掃描的第一個(gè)元組的成本相同。
* 我們認(rèn)為這些都是rescans,因?yàn)橐呀?jīng)使用了inner_run_cost一次。
*/
run_cost += outer_unmatched_rows *
inner_rescan_run_cost / inner_path_rows;
/*
* We won't be evaluating any quals at all for unmatched rows, so
* don't add them to ntuples.
* 對(duì)于所有不匹配的行,不會(huì)解析約束條件,因此不需要增加這些成本.
*/
}
else
{
/*
* Here, a complicating factor is that rescans may be cheaper than
* first scans. If we never scan all the way to the end of the
* inner rel, it might be (depending on the plan type) that we'd
* never pay the whole inner first-scan run cost. However it is
* difficult to estimate whether that will happen (and it could
* not happen if there are any unmatched outer rows!), so be
* conservative and always charge the whole first-scan cost once.
* We consider this charge to correspond to the first unmatched
* outer row, unless there isn't one in our estimate, in which
* case blame it on the first matched row.
* 在這里,一個(gè)復(fù)雜的因素是重新掃描可能比第一次掃描成本更低。
* 如果我們從來(lái)沒(méi)有掃描到內(nèi)表的末尾,那么(取決于計(jì)劃類(lèi)型)可能永遠(yuǎn)不會(huì)耗費(fèi)整個(gè)內(nèi)表first-scan運(yùn)行成本。
* 然而,很難估計(jì)這種情況是否會(huì)發(fā)生(如果存在無(wú)法匹配的外部行,這是不可能發(fā)生的!)
*/
/* First, count all unmatched join tuples as being processed */
//首先,統(tǒng)計(jì)所有處理過(guò)程中不匹配的行數(shù)
ntuples += outer_unmatched_rows * inner_path_rows;
/* Now add the forced full scan, and decrement appropriate count */
//現(xiàn)在強(qiáng)制添加全表掃描,并減少相應(yīng)的計(jì)數(shù)
run_cost += inner_run_cost;
if (outer_unmatched_rows >= 1)
outer_unmatched_rows -= 1;
else
outer_matched_rows -= 1;
/* Add inner run cost for additional outer tuples having matches */
//對(duì)于已經(jīng)匹配的外表行,增加內(nèi)表運(yùn)行成本
if (outer_matched_rows > 0)
run_cost += outer_matched_rows * inner_rescan_run_cost * inner_scan_frac;
/* Add inner run cost for additional unmatched outer tuples */
//對(duì)于未匹配的外表行,增加內(nèi)表運(yùn)行成本
if (outer_unmatched_rows > 0)
run_cost += outer_unmatched_rows * inner_rescan_run_cost;
}
}
else//普通連接
{
/* Normal-case source costs were included in preliminary estimate */
//正常情況下,源成本已在預(yù)估計(jì)算過(guò)程中統(tǒng)計(jì)
/* Compute number of tuples processed (not number emitted!) */
//計(jì)算處理的元組數(shù)量
ntuples = outer_path_rows * inner_path_rows;
}
/* CPU成本.CPU costs */
cost_qual_eval(&restrict_qual_cost, path->joinrestrictinfo, root);
startup_cost += restrict_qual_cost.startup;
cpu_per_tuple = cpu_tuple_cost + restrict_qual_cost.per_tuple;
run_cost += cpu_per_tuple * ntuples;
/* tlist eval costs are paid per output row, not per tuple scanned */
startup_cost += path->path.pathtarget->cost.startup;
run_cost += path->path.pathtarget->cost.per_tuple * path->path.rows;
path->path.startup_cost = startup_cost;
path->path.total_cost = startup_cost + run_cost;
}
測(cè)試腳本如下
select a.*,b.grbh,b.je
from t_dwxx a,
lateral (select t1.dwbh,t1.grbh,t2.je
from t_grxx t1
inner join t_jfxx t2 on t1.dwbh = a.dwbh and t1.grbh = t2.grbh) b
where a.dwbh = '1001'
order by b.dwbh;
啟動(dòng)gdb,設(shè)置斷點(diǎn)
(gdb) b try_nestloop_path
Breakpoint 1 at 0x7ae950: file joinpath.c, line 373.
(gdb) c
Continuing.
Breakpoint 1, try_nestloop_path (root=0x2fb3b30, joinrel=0x2fc6d28, outer_path=0x2fc2540, inner_path=0x2fc1280,
pathkeys=0x0, jointype=JOIN_INNER, extra=0x7ffec5f496e0) at joinpath.c:373
373 RelOptInfo *innerrel = inner_path->parent;
進(jìn)入函數(shù)initial_cost_nestloop
(gdb)
422 initial_cost_nestloop(root, &workspace, jointype,
(gdb) step
initial_cost_nestloop (root=0x2fb3b30, workspace=0x7ffec5f49540, jointype=JOIN_INNER, outer_path=0x2fc2540,
inner_path=0x2fc1280, extra=0x7ffec5f496e0) at costsize.c:2323
2323 Cost startup_cost = 0;
進(jìn)入initial_cost_nestloop->cost_rescan函數(shù)
(gdb)
2332 cost_rescan(root, inner_path,
(gdb) step
cost_rescan (root=0x2fb3b30, path=0x2fc1280, rescan_startup_cost=0x7ffec5f494a0, rescan_total_cost=0x7ffec5f49498)
at costsize.c:3613
3613 switch (path->pathtype)
路徑類(lèi)型為T(mén)_SeqScan(在執(zhí)行該SQL語(yǔ)句前,刪除了t_grxx.dwbh上的索引)
(gdb) p path->pathtype
$1 = T_SeqScan
進(jìn)入相應(yīng)的處理邏輯,直接復(fù)制,啟動(dòng)成本&總成本與T_SeqScan一樣
(gdb) n
3699 *rescan_startup_cost = path->startup_cost;
(gdb) n
3700 *rescan_total_cost = path->total_cost;
(gdb)
3701 break;
(gdb)
回到initial_cost_nestloop,執(zhí)行完成,最終結(jié)果
外表存在約束條件dwbh='1001',只有一行,內(nèi)表在dwbh上沒(méi)有索引,使用了順序全表掃描
...
(gdb)
2381 workspace->run_cost = run_cost;
(gdb)
2382 }
(gdb) p *workspace
$4 = {startup_cost = 0.28500000000000003, total_cost = 1984.3025, run_cost = 1984.0174999999999,
inner_run_cost = 2.4712728827210812e-316, inner_rescan_run_cost = 6.9530954948263344e-310,
outer_rows = 3.9937697668447996e-317, inner_rows = 2.4712728827210812e-316, outer_skip_rows = 6.9530954948287059e-310,
inner_skip_rows = 6.9443062041807458e-310, numbuckets = 50092024, numbatches = 0,
inner_rows_total = 2.4751428001118265e-316}
回到try_nestloop_path
(gdb) n
try_nestloop_path (root=0x2fb3b30, joinrel=0x2fc6d28, outer_path=0x2fc2540, inner_path=0x2fc1280, pathkeys=0x0,
jointype=JOIN_INNER, extra=0x7ffec5f496e0) at joinpath.c:425
425 if (add_path_precheck(joinrel,
設(shè)置斷點(diǎn),進(jìn)入final_cost_nestloop
(gdb) b final_cost_nestloop
Breakpoint 2 at 0x79f3ff: file costsize.c, line 2397.
(gdb) c
Continuing.
Breakpoint 2, final_cost_nestloop (root=0x2fb3b30, path=0x2fc2ef0, workspace=0x7ffec5f49540, extra=0x7ffec5f496e0)
at costsize.c:2397
2397 Path *outer_path = path->outerjoinpath;
外表訪問(wèn)路徑是索引掃描(t_dwxx),內(nèi)表訪問(wèn)路徑是全表順序掃描
(gdb) p *outer_path
$6 = {type = T_IndexPath, pathtype = T_IndexScan, parent = 0x2fb3570, pathtarget = 0x2fb8ee0, param_info = 0x0,
parallel_aware = false, parallel_safe = true, parallel_workers = 0, rows = 1, startup_cost = 0.28500000000000003,
total_cost = 8.3025000000000002, pathkeys = 0x0}
內(nèi)表行數(shù)為10,PG通過(guò)統(tǒng)計(jì)信息準(zhǔn)確計(jì)算了該值
(gdb) n
2400 double inner_path_rows = inner_path->rows;
(gdb)
2401 Cost startup_cost = workspace->startup_cost;
(gdb) p inner_path_rows
$8 = 10
計(jì)算成本
(gdb)
2556 cost_qual_eval(&restrict_qual_cost, path->joinrestrictinfo, root);
(gdb)
2557 startup_cost += restrict_qual_cost.startup;
(gdb) p restrict_qual_cost
$10 = {startup = 0, per_tuple = 0}
最終結(jié)果,T_NestPath,總成本total_cost = 1984.4024999999999,啟動(dòng)成本startup_cost = 0.28500000000000003
2567 }
(gdb) p *path
$11 = {path = {type = T_NestPath, pathtype = T_NestLoop, parent = 0x2fc6d28, pathtarget = 0x2fc6f60, param_info = 0x0,
parallel_aware = false, parallel_safe = true, parallel_workers = 0, rows = 10, startup_cost = 0.28500000000000003,
total_cost = 1984.4024999999999, pathkeys = 0x0}, jointype = JOIN_INNER, inner_unique = false,
outerjoinpath = 0x2fc2540, innerjoinpath = 0x2fc1280, joinrestrictinfo = 0x0}
完成調(diào)用
(gdb) n
create_nestloop_path (root=0x2fb3b30, joinrel=0x2fc6d28, jointype=JOIN_INNER, workspace=0x7ffec5f49540,
extra=0x7ffec5f496e0, outer_path=0x2fc2540, inner_path=0x2fc1280, restrict_clauses=0x0, pathkeys=0x0,
required_outer=0x0) at pathnode.c:2229
2229 return pathnode;
DONE!
allpaths.c
cost.h
costsize.c
PG Document:Query Planning
免責(zé)聲明:本站發(fā)布的內(nèi)容(圖片、視頻和文字)以原創(chuàng)、轉(zhuǎn)載和分享為主,文章觀點(diǎn)不代表本網(wǎng)站立場(chǎng),如果涉及侵權(quán)請(qǐng)聯(lián)系站長(zhǎng)郵箱:is@yisu.com進(jìn)行舉報(bào),并提供相關(guān)證據(jù),一經(jīng)查實(shí),將立刻刪除涉嫌侵權(quán)內(nèi)容。