PostgreSQL Source Code Walkthrough (214) - Background Processes #13 (checkpointer-IsCheckpointOnSchedule)

Published: 2020-08-08 01:57:40 | Source: ITPUB Blog | Author: husthxd | Category: Relational Databases

This section covers IsCheckpointOnSchedule, the function the checkpointer uses to pace how quickly a checkpoint flushes dirty pages to disk.

I. Data Structures

Macro definitions
checkpoints request flag bits
These macros define the flag bits that can be set on a checkpoint request.


/*
 * OR-able request flag bits for checkpoints.  The "cause" bits are used only
 * for logging purposes.  Note: the flags must be defined so that it's
 * sensible to OR together request flags arising from different requestors.
 */
/* These directly affect the behavior of CreateCheckPoint and subsidiaries */
#define CHECKPOINT_IS_SHUTDOWN  0x0001  /* Checkpoint is for shutdown */
#define CHECKPOINT_END_OF_RECOVERY  0x0002  /* Like shutdown checkpoint, but
                       * issued at end of WAL recovery */
#define CHECKPOINT_IMMEDIATE  0x0004  /* Do it without delays */
#define CHECKPOINT_FORCE    0x0008  /* Force even if no activity */
#define CHECKPOINT_FLUSH_ALL  0x0010  /* Flush all pages, including those
                     * belonging to unlogged tables */
/* These are important to RequestCheckpoint */
#define CHECKPOINT_WAIT     0x0020  /* Wait for completion */
#define CHECKPOINT_REQUESTED  0x0040  /* Checkpoint request has been made */
/* These indicate the cause of a checkpoint request */
#define CHECKPOINT_CAUSE_XLOG 0x0080  /* XLOG consumption */
#define CHECKPOINT_CAUSE_TIME 0x0100  /* Elapsed time */
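
Because the flags are distinct bits, requests from different requestors can be combined with bitwise OR and inspected with bitwise AND. The following is a minimal sketch of that idea, not code from PostgreSQL: the flag values are copied from the listing above, while `merge_requests` and `wants_immediate` are hypothetical helper names introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values copied from the listing above. */
#define CHECKPOINT_IMMEDIATE   0x0004  /* Do it without delays */
#define CHECKPOINT_WAIT        0x0020  /* Wait for completion */
#define CHECKPOINT_CAUSE_XLOG  0x0080  /* XLOG consumption */
#define CHECKPOINT_CAUSE_TIME  0x0100  /* Elapsed time */

/* Hypothetical helper: merge two pending requests into one flag word. */
int
merge_requests(int a, int b)
{
    assert(a >= 0 && b >= 0);   /* flag words are non-negative bit sets */
    return a | b;
}

/* Hypothetical helper: true if the merged request asks to skip pacing. */
bool
wants_immediate(int flags)
{
    return (flags & CHECKPOINT_IMMEDIATE) != 0;
}
```

Merging a WAL-triggered request with an immediate, waiting request keeps all three bits set, so a later check for CHECKPOINT_IMMEDIATE still succeeds.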

WRITES_PER_ABSORB


/* interval for calling AbsorbSyncRequests in CheckpointWriteDelay */
// Number of buffer writes between calls to AbsorbSyncRequests; defined as 1000
#define WRITES_PER_ABSORB   1000
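
One way to read this constant is as the reload value of a countdown counter: every write decrements the counter, and the absorb routine runs when it reaches zero. The sketch below illustrates that pattern under stated assumptions. `absorb_sync_requests_stub` and `note_buffer_write` are hypothetical names, and the stub only counts calls; it does not reproduce what AbsorbSyncRequests does in checkpointer.c.

```c
#include <assert.h>

#define WRITES_PER_ABSORB 1000

/* Stand-in for AbsorbSyncRequests(): here it only counts invocations. */
int absorb_calls = 0;

void
absorb_sync_requests_stub(void)
{
    absorb_calls++;
}

/*
 * Hypothetical per-write hook: run the absorb routine once every
 * WRITES_PER_ABSORB writes, using a countdown counter.
 */
void
note_buffer_write(void)
{
    static int absorb_counter = WRITES_PER_ABSORB;

    assert(absorb_counter > 0);
    if (--absorb_counter <= 0)
    {
        absorb_sync_requests_stub();
        absorb_counter = WRITES_PER_ABSORB;  /* restart the countdown */
    }
}
```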

II. Source Code Walkthrough

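IsCheckpointOnSchedule
This function reports whether the checkpoint is on schedule to finish in time. A return value of true means the checkpointer is ahead and may sleep; false means it is behind and must keep writing. The listing begins with CalculateCheckpointSegments, which computes the CheckPointSegments value that IsCheckpointOnSchedule divides by.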


/*
 * Calculate CheckPointSegments based on max_wal_size_mb and
 * checkpoint_completion_target.
 * Annotation: computes CheckPointSegments.
 */
static void
CalculateCheckpointSegments(void)
{
  double    target;
  /*-------
   * Calculate the distance at which to trigger a checkpoint, to avoid
   * exceeding max_wal_size_mb. This is based on two assumptions:
   *
   * a) we keep WAL for only one checkpoint cycle (prior to PG11 we kept
   *    WAL for two checkpoint cycles to allow us to recover from the
   *    secondary checkpoint if the first checkpoint failed, though we
   *    only did this on the master anyway, not on standby. Keeping just
   *    one checkpoint simplifies processing and reduces disk space in
   *    many smaller databases.)
   * b) during checkpoint, we consume checkpoint_completion_target *
   *    number of segments consumed between checkpoints.
   *-------
   */
  //#define ConvertToXSegs(x,segsize) (x / ((segsize) / (1024 * 1024)))
  target = (double) ConvertToXSegs(max_wal_size_mb, wal_segment_size) /
    (1.0 + CheckPointCompletionTarget);
  /* round down */
  CheckPointSegments = (int) target;
  if (CheckPointSegments < 1)
    CheckPointSegments = 1;
}
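
The arithmetic in CalculateCheckpointSegments can be checked with a small standalone version. This is a sketch, not the PostgreSQL function: `convert_to_xsegs` mirrors the ConvertToXSegs macro quoted in the listing, `checkpoint_segments` is a hypothetical pure-function rewrite that takes the settings as parameters instead of reading globals, and the input values mentioned afterwards are illustrative.

```c
#include <assert.h>

/* Same arithmetic as the ConvertToXSegs macro quoted in the listing. */
int
convert_to_xsegs(int size_mb, int segsize_bytes)
{
    assert(segsize_bytes >= 1024 * 1024);  /* at least 1 MB per segment */
    return size_mb / (segsize_bytes / (1024 * 1024));
}

/* Hypothetical pure-function version of CalculateCheckpointSegments. */
int
checkpoint_segments(int max_wal_size_mb, int segsize_bytes,
                    double completion_target)
{
    double target;
    int    segs;

    target = (double) convert_to_xsegs(max_wal_size_mb, segsize_bytes) /
        (1.0 + completion_target);
    segs = (int) target;            /* round down */
    return segs < 1 ? 1 : segs;     /* never below one segment */
}
```

With an assumed 1024 MB limit, 16 MB segments, and a completion target of 0.5, the limit is 64 segments and the trigger distance is 64 / 1.5, rounded down to 42 segments. Raising the target to 0.9 lowers it to 33, because more WAL is expected to accumulate while the checkpoint itself is running.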
/*
 * IsCheckpointOnSchedule -- are we on schedule to finish this checkpoint
 *     (or restartpoint) in time?
 * Annotation: is the checkpoint on schedule to complete in time?
 *
 * Compares the current progress against the time/segments elapsed since last
 * checkpoint, and returns true if the progress we've made this far is greater
 * than the elapsed time/segments.
 * Annotation: current progress is compared with the elapsed time and WAL
 * segments; if progress is ahead, return true (the checkpointer may sleep).
 */
static bool
IsCheckpointOnSchedule(double progress)
{
  XLogRecPtr  recptr;
  struct timeval now;
  double    elapsed_xlogs,
        elapsed_time;
  Assert(ckpt_active);
  /* Scale progress according to checkpoint_completion_target. */
  // Scale the actual progress to progress * checkpoint_completion_target
  progress *= CheckPointCompletionTarget;
  /*
   * Check against the cached value first. Only do the more expensive
   * calculations once we reach the target previously calculated. Since
   * neither time or WAL insert pointer moves backwards, a freshly
   * calculated value can only be greater than or equal to the cached value.
   * Annotation: if progress is below the cached value, return false; the
   * checkpoint needs to speed up.
   */
  if (progress < ckpt_cached_elapsed)
    return false;
  /*
   * Check progress against WAL segments written and CheckPointSegments.
   * Annotation: progress vs. WAL consumed.
   *
   * We compare the current WAL insert location against the location
   * computed before calling CreateCheckPoint. The code in XLogInsert that
   * actually triggers a checkpoint when CheckPointSegments is exceeded
   * compares against RedoRecptr, so this is not completely accurate.
   * However, it's good enough for our purposes, we're only calculating an
   * estimate anyway.
   *
   * During recovery, we compare last replayed WAL record's location with
   * the location computed before calling CreateRestartPoint. That maintains
   * the same pacing as we have during checkpoints in normal operation, but
   * we might exceed max_wal_size by a fair amount. That's because there can
   * be a large gap between a checkpoint's redo-pointer and the checkpoint
   * record itself, and we only start the restartpoint after we've seen the
   * checkpoint record. (The gap is typically up to CheckPointSegments *
   * checkpoint_completion_target where checkpoint_completion_target is the
   * value that was in effect when the WAL was generated).
   */
  if (RecoveryInProgress())
    recptr = GetXLogReplayRecPtr(NULL);
  else
    recptr = GetInsertRecPtr();
  elapsed_xlogs = (((double) (recptr - ckpt_start_recptr)) /
           wal_segment_size) / CheckPointSegments;
  if (progress < elapsed_xlogs)
  {
    // Progress lags behind the rate of WAL generation; keep working
    ckpt_cached_elapsed = elapsed_xlogs;
    return false;
  }
  /*
   * Check progress against time elapsed and checkpoint_timeout.
   * Annotation: progress vs. elapsed time.
   */
  gettimeofday(&now, NULL);
  elapsed_time = ((double) ((pg_time_t) now.tv_sec - ckpt_start_time) +
          now.tv_usec / 1000000.0) / CheckPointTimeout;
  if (progress < elapsed_time)
  {
    // Progress lags behind the elapsed time; keep working
    ckpt_cached_elapsed = elapsed_time;
    return false;
  }
  /* It looks like we're on schedule. */
  // On schedule; the checkpointer may rest
  return true;
}
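
To see the three comparisons in isolation, the decision logic can be modeled as a pure function. This is a simplified sketch under stated assumptions, not the PostgreSQL implementation: the `ScheduleInputs` struct and `on_schedule` are hypothetical names, and the values that the real function reads from globals, the WAL position, and gettimeofday are passed in as plain numbers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the inputs the real function reads itself. */
typedef struct ScheduleInputs
{
    double completion_target;       /* checkpoint_completion_target */
    double wal_bytes_since_start;   /* recptr - ckpt_start_recptr */
    double segment_size_bytes;      /* wal_segment_size */
    double checkpoint_segments;     /* CheckPointSegments */
    double secs_since_start;        /* now - ckpt_start_time */
    double checkpoint_timeout_secs; /* CheckPointTimeout */
} ScheduleInputs;

/* progress is the fraction of checkpoint work done so far, 0.0 to 1.0. */
bool
on_schedule(double progress, const ScheduleInputs *in, double *cached_elapsed)
{
    double elapsed_xlogs;
    double elapsed_time;

    assert(progress >= 0.0 && progress <= 1.0);
    progress *= in->completion_target;

    /* Cheap check first: the cached fraction can only grow over time. */
    if (progress < *cached_elapsed)
        return false;

    /* Fraction of the WAL budget consumed since the checkpoint started. */
    elapsed_xlogs = (in->wal_bytes_since_start / in->segment_size_bytes) /
        in->checkpoint_segments;
    if (progress < elapsed_xlogs)
    {
        *cached_elapsed = elapsed_xlogs;
        return false;
    }

    /* Fraction of checkpoint_timeout elapsed since the checkpoint started. */
    elapsed_time = in->secs_since_start / in->checkpoint_timeout_secs;
    if (progress < elapsed_time)
    {
        *cached_elapsed = elapsed_time;
        return false;
    }

    return true;
}
```

As an illustration with assumed numbers: with a target of 0.5, 10 segments of WAL written against a budget of 42, and 60 of 300 seconds elapsed, a checkpoint that is 40% done scales to 0.2, which is below 10/42 (about 0.238), so it is behind. At 60% done it scales to 0.3, which exceeds both the WAL fraction and the time fraction 0.2, so it is on schedule.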

III. Trace Analysis

N/A

IV. References

PG Source Code
PgSQL · Feature Analysis · On checkpoint scheduling
