An Analysis of the Go GC Process

Published: 2021-09-16 17:56:13  Source: Yisu Cloud  Reads: 166  Author: chen  Category: Web development

This article walks through the Go GC process: the principle of tri-color marking, the write barrier, the phases of a collection cycle, and the runtime source code that implements them.

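How tri-color marking works

Let's start with a figure that gives a rough idea of the tri-color marking algorithm:

[Figure: overview of tri-color marking]

The principle: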

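1. Put every object into the white set.
2. Traverse the objects starting from the roots, moving each white object that is reached from the white set to the grey set.
3. Traverse the grey set: move every white object referenced by a grey object into the grey set, and move each grey object that has been traversed into the black set.
4. Repeat step 3 until the grey set is empty.
5. Once step 4 finishes, the objects left in the white set are unreachable, that is, garbage, and they are reclaimed.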
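The steps above can be sketched in a few lines of Go. This is a minimal, single-threaded illustration and not the runtime's implementation: the `Object` type, the `Mark` function and the `sampleHeap` helper are hypothetical names introduced only for this example, and objects are identified by name rather than by address.

```go
package main

import (
	"fmt"
	"sort"
)

// Object is a hypothetical heap object that refers to other objects by name.
type Object struct {
	Name string
	Refs []string
}

// Mark runs tri-color marking from the given roots and returns the black set.
// Objects that are not in the returned set are white, that is, garbage.
func Mark(heap map[string]*Object, roots []string) map[string]bool {
	seen := map[string]bool{}  // objects that have left the white set
	black := map[string]bool{} // objects that have been fully scanned
	var grey []string          // worklist of grey objects

	// shade moves a white object into the grey set.
	shade := func(name string) {
		if _, ok := heap[name]; ok && !seen[name] {
			seen[name] = true
			grey = append(grey, name)
		}
	}

	for _, r := range roots { // step 2: shade everything the roots reach
		shade(r)
	}
	for len(grey) > 0 { // steps 3 and 4: drain the grey set
		name := grey[len(grey)-1]
		grey = grey[:len(grey)-1]
		for _, ref := range heap[name].Refs {
			shade(ref)
		}
		black[name] = true
	}
	return black
}

// sampleHeap builds a small object graph: A -> B -> C (with a cycle back
// to A), while D and E are not reachable from the root A.
func sampleHeap() map[string]*Object {
	return map[string]*Object{
		"A": {Name: "A", Refs: []string{"B"}},
		"B": {Name: "B", Refs: []string{"C", "A"}},
		"C": {Name: "C"},
		"D": {Name: "D", Refs: []string{"A"}},
		"E": {Name: "E"},
	}
}

func main() {
	heap := sampleHeap()
	black := Mark(heap, []string{"A"})
	var garbage []string
	for name := range heap {
		if !black[name] { // step 5: whatever stayed white is garbage
			garbage = append(garbage, name)
		}
	}
	sort.Strings(garbage)
	fmt.Println("garbage:", garbage)
}
```

Note that D is garbage even though it points at a live object: reachability is decided only by what the roots can reach, not by what an object refers to.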

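Write barrier

Go does not stop the world (STW) while tri-color marking is in progress, which means objects can still be modified during marking.

Consider the following situation:

[Figure: the object graph before the concurrent modification]

While scanning the grey set, the collector reaches object A and marks all of A's references. It is about to scan the references of object D when another goroutine modifies the D->E reference, producing the situation shown in the next figure:

[Figure: the object graph after the D->E reference is modified]

Could this cause object E never to be scanned, so that it is mistaken for a white object, that is, garbage?

The write barrier exists to solve exactly this problem. With the write barrier in place, E is treated as live after the steps above. Even if A later drops its reference to E, E is not reclaimed in this GC cycle; it is reclaimed in the next one.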

Go enabled the hybrid write barrier starting with Go 1.8. Its pseudocode is:

writePointer(slot, ptr):
    shade(*slot)
    if any stack is grey:
        shade(ptr)
    *slot = ptr

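The hybrid write barrier shades both the "old pointer" (the value being overwritten in the slot) and the "new pointer" (the value being written).

The old pointer is shaded because another running thread may concurrently copy that pointer value into a register or into a local variable on its stack.

Copying a pointer into a register or a stack local does not go through the write barrier, so the pointer could end up unmarked. Consider the following case: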

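[go] b = obj
[go] oldx = nil
[gc] scan oldx...
[go] oldx = b.x // copy b.x into a local variable; this does not go through the write barrier
[go] b.x = ptr  // the write barrier should shade the old value of b.x
[gc] scan b...

If the write barrier did not shade the old value, oldx would never be scanned.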

The new pointer is shaded because another running thread may move the pointer to a different location. Consider the following case:

[go] a = ptr
[go] b = obj
[gc] scan b...
[go] b.x = a // the write barrier should shade the new value of b.x
[go] a = nil
[gc] scan a...

If the write barrier did not shade the new value, ptr would never be scanned.

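With the hybrid write barrier, the GC no longer has to rescan every G's stack after concurrent marking finishes, which reduces the STW time in Mark Termination.

Besides the write barrier, every object newly allocated during a GC cycle is immediately marked black, as can be seen in the runtime's mallocgc function.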
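The "new pointer" scenario above can be replayed in a small simulation. This is only a sketch under simplifying assumptions: `Obj`, `Marker`, `writePointer` and `Survives` are hypothetical names, everything runs on one goroutine with the interleaving fixed by hand, and the barrier shades the new pointer unconditionally instead of checking whether a stack is still grey.

```go
package main

import "fmt"

// Obj is a hypothetical heap object with a single pointer field.
type Obj struct {
	Name string
	X    *Obj
}

// Marker is a toy collector state: a mark set plus a grey worklist.
type Marker struct {
	marked  map[*Obj]bool
	grey    []*Obj
	barrier bool // whether the write barrier is enabled
}

// shade greys an object if it is still white.
func (m *Marker) shade(o *Obj) {
	if o != nil && !m.marked[o] {
		m.marked[o] = true
		m.grey = append(m.grey, o)
	}
}

// writePointer stores ptr into *slot, shading both the old and the new
// value when the barrier is on (simplified: no "stack is grey" check).
func (m *Marker) writePointer(slot **Obj, ptr *Obj) {
	if m.barrier {
		m.shade(*slot) // old pointer
		m.shade(ptr)   // new pointer
	}
	*slot = ptr
}

// drain scans grey objects until the worklist is empty.
func (m *Marker) drain() {
	for len(m.grey) > 0 {
		o := m.grey[len(m.grey)-1]
		m.grey = m.grey[:len(m.grey)-1]
		m.shade(o.X)
	}
}

// Survives replays the interleaving from the text and reports whether
// ptr ends up marked.
func Survives(barrier bool) bool {
	ptr := &Obj{Name: "ptr"}
	b := &Obj{Name: "b"}
	m := &Marker{marked: map[*Obj]bool{}, barrier: barrier}

	a := ptr                // [go] a = ptr (a stack local, not scanned yet)
	m.shade(b)              // [gc] scan b...
	m.drain()               //      b is now black and b.X is still nil
	m.writePointer(&b.X, a) // [go] b.x = a
	a = nil                 // [go] a = nil
	m.shade(a)              // [gc] scan a... the local is nil by now
	m.drain()
	return m.marked[ptr]
}

func main() {
	fmt.Println("without barrier, ptr marked:", Survives(false))
	fmt.Println("with barrier, ptr marked:   ", Survives(true))
}
```

Without the barrier, ptr is reachable through b.X at the end of the cycle but was never marked, which is precisely the lost-object situation the barrier prevents.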

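Collection process

Go's GC is a concurrent GC: most of the GC work runs at the same time as ordinary Go code, which makes the GC process fairly complex.

A GC cycle has four phases:

Sweep Termination: sweep any spans that have not been swept yet; a new GC cycle can start only after the previous cycle's sweeping has finished
Mark: scan all root objects and every object reachable from them, marking them so that they are not reclaimed
Mark Termination: finish the marking work and rescan some of the root objects (requires STW)
Sweep: sweep spans according to the mark results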

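The figure below shows a fairly complete GC flow, with the four phases distinguished by color:

[Figure: the complete GC flow, color-coded by phase]

Two kinds of background tasks (Gs) run during GC: one for marking and one for sweeping.

Background mark workers are started when needed. The number that can work at the same time is roughly 25% of the number of Ps, which is the basis for Go's statement that 25% of the CPU is spent on GC.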
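As a rough illustration of the 25% rule, the number of dedicated mark workers can be estimated from the number of Ps. This is a simplified sketch, not the runtime's controller logic: `DedicatedWorkers` is a hypothetical helper, and rounding to the nearest integer is an assumption made for this example.

```go
package main

import (
	"fmt"
	"runtime"
)

// backgroundUtilization is the fraction of Ps the text says is devoted
// to background marking.
const backgroundUtilization = 0.25

// DedicatedWorkers estimates how many mark workers can run at once for
// the given number of Ps, rounding to the nearest integer (an assumption
// of this sketch).
func DedicatedWorkers(procs int) int {
	return int(float64(procs)*backgroundUtilization + 0.5)
}

func main() {
	procs := runtime.GOMAXPROCS(0) // passing 0 only queries the current value
	fmt.Printf("GOMAXPROCS=%d, estimated dedicated mark workers=%d\n",
		procs, DedicatedWorkers(procs))
}
```

With a single P the estimate is zero dedicated workers, which hints at why the source listings further down also have fractional and idle worker modes.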

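One background sweeper is started when the program starts, and it is woken when the sweep phase begins.

Currently a full GC cycle stops the world (STW, Stop The World) twice: the first time at the start of the Mark phase, and the second time in the Mark Termination phase.

The first STW prepares the scanning of root objects and enables the write barrier and mutator assists.

The second STW rescans some of the root objects and disables the write barrier and mutator assists.

Note that not all root scanning requires STW. Scanning the objects on a stack, for example, only requires stopping the G that owns that stack.

The write barrier is implemented as a hybrid write barrier, which greatly shortens the second STW.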
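The effect of a GC cycle can be observed from user code with the public runtime API. The sketch below forces a collection with runtime.GC and reads the counters exposed by runtime.MemStats; the helper name `CyclesDuringForcedGC` is hypothetical, and the pause figures it prints vary from machine to machine.

```go
package main

import (
	"fmt"
	"runtime"
)

// CyclesDuringForcedGC forces a collection and returns how many GC cycles
// completed between the two MemStats snapshots.
func CyclesDuringForcedGC() uint32 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	runtime.GC() // blocks until the forced collection has completed
	runtime.ReadMemStats(&after)
	return after.NumGC - before.NumGC
}

func main() {
	n := CyclesDuringForcedGC()

	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	fmt.Printf("cycles completed during the forced GC: %d\n", n)
	fmt.Printf("total GC cycles: %d, cumulative STW pause: %d ns\n",
		ms.NumGC, ms.PauseTotalNs)
}
```

PauseTotalNs accumulates the stop-the-world pauses of all cycles, so dividing it by NumGC gives a rough average pause per cycle.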

Source code analysis

gcStart

func gcStart(mode gcMode, trigger gcTrigger) {
	// Since this is called from malloc and malloc is called in
	// the guts of a number of libraries that might be holding
	// locks, don't attempt to start GC in non-preemptible or
	// potentially unstable situations.
	// Check whether the current g can be preempted; if not, do not trigger GC.
	mp := acquirem()
	if gp := getg(); gp == mp.g0 || mp.locks > 1 || mp.preemptoff != "" {
		releasem(mp)
		return
	}
	releasem(mp)
	mp = nil

	// Pick up the remaining unswept/not being swept spans concurrently
	//
	// This shouldn't happen if we're being invoked in background
	// mode since proportional sweep should have just finished
	// sweeping everything, but rounding errors, etc, may leave a
	// few spans unswept. In forced mode, this is necessary since
	// GC can be forced at any point in the sweeping cycle.
	//
	// We check the transition condition continuously here in case
	// this G gets delayed in to the next GC cycle.
	// Sweep any garbage left unswept from the previous cycle.
	for trigger.test() && gosweepone() != ^uintptr(0) {
		sweep.nbgsweep++
	}

	// Perform GC initialization and the sweep termination
	// transition.
	semacquire(&work.startSema)
	// Re-check transition condition under transition lock.
	// Check whether the gcTrigger condition still holds.
	if !trigger.test() {
		semrelease(&work.startSema)
		return
	}

	// For stats, check if this GC was forced by the user
	// Record whether this GC was forced; users can force one by calling runtime.GC().
	work.userForced = trigger.kind == gcTriggerAlways || trigger.kind == gcTriggerCycle

	// In gcstoptheworld debug mode, upgrade the mode accordingly.
	// We do this after re-checking the transition condition so
	// that multiple goroutines that detect the heap trigger don't
	// start multiple STW GCs.
	// Set the GC mode.
	if mode == gcBackgroundMode {
		if debug.gcstoptheworld == 1 {
			mode = gcForceMode
		} else if debug.gcstoptheworld == 2 {
			mode = gcForceBlockMode
		}
	}

	// Ok, we're doing it! Stop everybody else
	semacquire(&worldsema)

	if trace.enabled {
		traceGCStart()
	}

	// Start the background mark workers.
	if mode == gcBackgroundMode {
		gcBgMarkStartWorkers()
	}

	// Reset the mark-related GC state.
	gcResetMarkState()

	work.stwprocs, work.maxprocs = gomaxprocs, gomaxprocs
	if work.stwprocs > ncpu {
		// This is used to compute CPU time of the STW phases,
		// so it can't be more than ncpu, even if GOMAXPROCS is.
		work.stwprocs = ncpu
	}
	work.heap0 = atomic.Load64(&memstats.heap_live)
	work.pauseNS = 0
	work.mode = mode

	now := nanotime()
	work.tSweepTerm = now
	work.pauseStart = now
	if trace.enabled {
		traceGCSTWStart(1)
	}
	// STW: stop the world.
	systemstack(stopTheWorldWithSema)
	// Finish sweep before we start concurrent scan.
	// Sweep the previous cycle's garbage first, making sure that cycle is complete.
	systemstack(func() {
		finishsweep_m()
	})
	// clearpools before we start the GC. If we wait they memory will not be
	// reclaimed until the next GC cycle.
	// Clear sync.Pool, sched.sudogcache and sched.deferpool. Not expanded here:
	// sync.Pool has been covered already, and the rest is left for later articles.
	clearpools()

	// Increment the GC cycle count.
	work.cycles++
	if mode == gcBackgroundMode { // Do as much work concurrently as possible
		gcController.startCycle()
		work.heapGoal = memstats.next_gc

		// Enter concurrent mark phase and enable
		// write barriers.
		//
		// Because the world is stopped, all Ps will
		// observe that write barriers are enabled by
		// the time we start the world and begin
		// scanning.
		//
		// Write barriers must be enabled before assists are
		// enabled because they must be enabled before
		// any non-leaf heap objects are marked. Since
		// allocations are blocked until assists can
		// happen, we want enable assists as early as
		// possible.
		// Set the GC phase to _GCmark (this is what turns the write barrier on).
		setGCPhase(_GCmark)

		// Update the background mark state.
		gcBgMarkPrepare() // Must happen before assist enable.
		// Compute and queue the root scan jobs, and initialize the related scan state.
		gcMarkRootPrepare()

		// Mark all active tinyalloc blocks. Since we're
		// allocating from these, they need to be black like
		// other allocations. The alternative is to blacken
		// the tiny block on every allocation from it, which
		// would slow down the tiny allocator.
		// Mark the tiny allocation blocks.
		gcMarkTinyAllocs()

		// At this point all Ps have enabled the write
		// barrier, thus maintaining the no white to
		// black invariant. Enable mutator assists to
		// put back-pressure on fast allocating
		// mutators.
		// Set gcBlackenEnabled to 1 so that mutator assists and mark workers may blacken objects.
		atomic.Store(&gcBlackenEnabled, 1)

		// Assists and workers can start the moment we start
		// the world.
		gcController.markStartTime = now

		// Concurrent mark.
		systemstack(func() {
			now = startTheWorldWithSema(trace.enabled)
		})
		work.pauseNS += now - work.pauseStart
		work.tMark = now
	} else {
		// Non-concurrent mode.
		// Record the start time of the mark termination phase.
		if trace.enabled {
			// Switch to mark termination STW.
			traceGCSTWDone()
			traceGCSTWStart(0)
		}
		t := nanotime()
		work.tMark, work.tMarkTerm = t, t
		work.heapGoal = work.heap0

		// Perform mark termination. This will restart the world.
		// With the world stopped: mark, sweep, then start the world again.
		gcMarkTermination(memstats.triggerRatio)
	}

	semrelease(&work.startSema)
}

gcBgMarkStartWorkers

This function prepares the goroutines that perform background mark work. These goroutines do not start working right away; they wait until the GC phase has been set to _GCmark (see the setGCPhase(_GCmark) call in gcStart above).

func gcBgMarkStartWorkers() {
	// Background marking is performed by per-P G's. Ensure that
	// each P has a background GC G.
	for _, p := range allp {
		if p.gcBgMarkWorker == 0 {
			go gcBgMarkWorker(p)
			// Wait for the bgMarkReady signal from the gcBgMarkWorker goroutine before continuing.
			notetsleepg(&work.bgMarkReady, -1)
			noteclear(&work.bgMarkReady)
		}
	}
}

gcBgMarkWorker

The function run by each background mark worker

func gcBgMarkWorker(_p_ *p) {
	gp := getg()

	// Used to reacquire the p and m after the worker wakes up.
	type parkInfo struct {
		m      muintptr // Release this m on park.
		attach puintptr // If non-nil, attach to this p on park.
	}
	// We pass park to a gopark unlock function, so it can't be on
	// the stack (see gopark). Prevent deadlock from recursively
	// starting GC by disabling preemption.
	gp.m.preemptoff = "GC worker init"
	park := new(parkInfo)
	gp.m.preemptoff = ""
	// Record the m and p in park so they can be passed to gopark and found again
	// when gcController.findRunnable wakes this worker.
	park.m.set(acquirem())
	park.attach.set(_p_)
	// Inform gcBgMarkStartWorkers that this worker is ready.
	// After this point, the background mark worker is scheduled
	// cooperatively by gcController.findRunnable. Hence, it must
	// never be preempted, as this would put it into _Grunnable
	// and put it on a run queue. Instead, when the preempt flag
	// is set, this puts itself into _Gwaiting to be woken up by
	// gcController.findRunnable at the appropriate time.
	// Wake the notetsleepg in gcBgMarkStartWorkers so that it can continue and return.
	notewakeup(&work.bgMarkReady)

	for {
		// Go to sleep until woken by gcController.findRunnable.
		// We can't releasem yet since even the call to gopark
		// may be preempted.
		// Put this g to sleep.
		gopark(func(g *g, parkp unsafe.Pointer) bool {
			park := (*parkInfo)(parkp)

			// The worker G is no longer running, so it's
			// now safe to allow preemption.
			// Release the m acquired earlier.
			releasem(park.m.ptr())

			// If the worker isn't attached to its P,
			// attach now. During initialization and after
			// a phase change, the worker may have been
			// running on a different P. As soon as we
			// attach, the owner P may schedule the
			// worker, so this must be done after the G is
			// stopped.
			// Attach to the P recorded in park.attach (set above).
			if park.attach != 0 {
				p := park.attach.ptr()
				park.attach.set(nil)
				// cas the worker because we may be
				// racing with a new worker starting
				// on this P.
				if !p.gcBgMarkWorker.cas(0, guintptr(unsafe.Pointer(g))) {
					// The P got a new worker.
					// Exit this worker.
					return false
				}
			}
			return true
		}, unsafe.Pointer(park), waitReasonGCWorkerIdle, traceEvGoBlock, 0)

		// Loop until the P dies and disassociates this
		// worker (the P may later be reused, in which case
		// it will get a new worker) or we failed to associate.
		// Check that the P's gcBgMarkWorker is still this G; if not, end this worker.
		if _p_.gcBgMarkWorker.ptr() != gp {
			break
		}

		// Disable preemption so we can use the gcw. If the
		// scheduler wants to preempt us, we'll stop draining,
		// dispose the gcw, and then preempt.
		// The unlock function passed to gopark released the m; acquire it again here.
		park.m.set(acquirem())

		if gcBlackenEnabled == 0 {
			throw("gcBgMarkWorker: blackening not enabled")
		}

		startTime := nanotime()
		// Record when this mark worker started.
		_p_.gcMarkWorkerStartTime = startTime

		decnwait := atomic.Xadd(&work.nwait, -1)
		if decnwait == work.nproc {
			println("runtime: work.nwait=", decnwait, "work.nproc=", work.nproc)
			throw("work.nwait was > work.nproc")
		}
		// Switch to g0 to do the work.
		systemstack(func() {
			// Mark our goroutine preemptible so its stack
			// can be scanned. This lets two mark workers
			// scan each other (otherwise, they would
			// deadlock). We must not modify anything on
			// the G stack. However, stack shrinking is
			// disabled for mark workers, so it is safe to
			// read from the G stack.
			// Set the G's status to waiting so that another g can scan its stack
			// (two workers can scan each other's stacks).
			casgstatus(gp, _Grunning, _Gwaiting)
			switch _p_.gcMarkWorkerMode {
			default:
				throw("gcBgMarkWorker: unexpected gcMarkWorkerMode")
			case gcMarkWorkerDedicatedMode:
				// Mode dedicated to marking.
				gcDrain(&_p_.gcw, gcDrainUntilPreempt|gcDrainFlushBgCredit)
				if gp.preempt {
					// Preempted: move every G in the local run queue to the global run queue.
					// We were preempted. This is
					// a useful signal to kick
					// everything out of the run
					// queue so it can run
					// somewhere else.
					lock(&sched.lock)
					for {
						gp, _ := runqget(_p_)
						if gp == nil {
							break
						}
						globrunqput(gp)
					}
					unlock(&sched.lock)
				}
				// Go back to draining, this time
				// without preemption.
				// Continue marking.
				gcDrain(&_p_.gcw, gcDrainNoBlock|gcDrainFlushBgCredit)
			case gcMarkWorkerFractionalMode:
				// Mark until preempted.
				gcDrain(&_p_.gcw, gcDrainFractional|gcDrainUntilPreempt|gcDrainFlushBgCredit)
			case gcMarkWorkerIdleMode:
				// Mark while the P is otherwise idle.
				gcDrain(&_p_.gcw, gcDrainIdle|gcDrainUntilPreempt|gcDrainFlushBgCredit)
			}
			// Switch the G from waiting back to running.
			casgstatus(gp, _Gwaiting, _Grunning)
		})

		// If we are nearing the end of mark, dispose
		// of the cache promptly. We must do this
		// before signaling that we're no longer
		// working so that other workers can't observe
		// no workers and no work while we have this
		// cached, and before we compute done.
		// Flush the local cache promptly, handing it over to the global queue.
		if gcBlackenPromptly {
			_p_.gcw.dispose()
		}

		// Account for time.
		// Accumulate the time spent.
		duration := nanotime() - startTime
		switch _p_.gcMarkWorkerMode {
		case gcMarkWorkerDedicatedMode:
			atomic.Xaddint64(&gcController.dedicatedMarkTime, duration)
			atomic.Xaddint64(&gcController.dedicatedMarkWorkersNeeded, 1)
		case gcMarkWorkerFractionalMode:
			atomic.Xaddint64(&gcController.fractionalMarkTime, duration)
			atomic.Xaddint64(&_p_.gcFractionalMarkTime, duration)
		case gcMarkWorkerIdleMode:
			atomic.Xaddint64(&gcController.idleMarkTime, duration)
		}

		// Was this the last worker and did we run out
		// of work?
		incnwait := atomic.Xadd(&work.nwait, +1)
		if incnwait > work.nproc {
			println("runtime: p.gcMarkWorkerMode=", _p_.gcMarkWorkerMode,
				"work.nwait=", incnwait, "work.nproc=", work.nproc)
			throw("work.nwait > work.nproc")
		}

		// If this worker reached a background mark completion
		// point, signal the main GC goroutine.
		if incnwait == work.nproc && !gcMarkWorkAvailable(nil) {
			// Make this G preemptible and disassociate it
			// as the worker for this P so
			// findRunnableGCWorker doesn't try to
			// schedule it.
			// Disassociate this worker from the P and release the m.
			_p_.gcBgMarkWorker.set(nil)
			releasem(park.m.ptr())

			gcMarkDone()

			// Disable preemption and prepare to reattach
			// to the P.
			//
			// We may be running on a different P at this
			// point, so we can't reattach until this G is
			// parked.
			park.m.set(acquirem())
			park.attach.set(_p_)
		}
	}
}

gcDrain

The main implementation of tri-color marking

gcDrain scans all roots and objects and blackens grey objects until every root and object has been marked.

func gcDrain(gcw *gcWork, flags gcDrainFlags) {
	if !writeBarrier.needed {
		throw("gcDrain phase incorrect")
	}

	gp := getg().m.curg
	// Whether to return when the preempt flag is seen.
	preemptible := flags&gcDrainUntilPreempt != 0
	// Whether to wait for work when there is none.
	blocking := flags&(gcDrainUntilPreempt|gcDrainIdle|gcDrainFractional|gcDrainNoBlock) == 0
	// Whether to flush background scan credit, which reduces mutator assists and wakes waiting Gs.
	flushBgCredit := flags&gcDrainFlushBgCredit != 0
	// Whether marking runs only while idle.
	idle := flags&gcDrainIdle != 0
	// Record the scan work already done on entry.
	initScanWork := gcw.scanWork
	// checkWork is the scan work before performing the next
	// self-preempt check.
	// Choose the check function for the current mode.
	checkWork := int64(1<<63 - 1)
	var check func() bool
	if flags&(gcDrainIdle|gcDrainFractional) != 0 {
		checkWork = initScanWork + drainCheckThreshold
		if idle {
			check = pollWork
		} else if flags&gcDrainFractional != 0 {
			check = pollFractionalWorkerExit
		}
	}

	// Drain root marking jobs.
	// If the root objects have not all been scanned yet, scan them.
	if work.markrootNext < work.markrootJobs {
		for !(preemptible && gp.preempt) {
			job := atomic.Xadd(&work.markrootNext, +1) - 1
			if job >= work.markrootJobs {
				break
			}
			// Run the root scan job.
			markroot(gcw, job)
			if check != nil && check() {
				goto done
			}
		}
	}

	// Drain heap marking jobs.
	// Loop until preempted.
	for !(preemptible && gp.preempt) {
		// Try to keep work available on the global queue. We used to
		// check if there were waiting workers, but it's better to
		// just keep work available than to make workers wait. In the
		// worst case, we'll do O(log(_WorkbufSize)) unnecessary
		// balances.
		if work.full == 0 {
			// Balance the work: if the global mark queue is empty, move part of the local work to it.
			gcw.balance()
		}

		var b uintptr
		if blocking {
			b = gcw.get()
		} else {
			b = gcw.tryGetFast()
			if b == 0 {
				b = gcw.tryGet()
			}
		}
		// No work could be obtained: leave the loop.
		if b == 0 {
			// work barrier reached or tryGet failed.
			break
		}
		// Scan the object that was obtained.
		scanobject(b, gcw)

		// Flush background scan work credit to the global
		// account if we've accumulated enough locally so
		// mutator assists can draw on it.
		// Once the local scan work exceeds gcCreditSlack, add it to the global count (a batched update).
		if gcw.scanWork >= gcCreditSlack {
			atomic.Xaddint64(&gcController.scanWork, gcw.scanWork)
			if flushBgCredit {
				gcFlushBgCredit(gcw.scanWork - initScanWork)
				initScanWork = 0
			}
			checkWork -= gcw.scanWork
			gcw.scanWork = 0
			// Once the scan work reaches checkWork, the target for the next self-preempt check,
			// call the check function chosen at the top of this function:
			// pollWork in idle mode, pollFractionalWorkerExit in fractional mode.
			if checkWork <= 0 {
				checkWork += drainCheckThreshold
				if check != nil && check() {
					break
				}
			}
		}
	}

	// In blocking mode, write barriers are not allowed after this
	// point because we must preserve the condition that the work
	// buffers are empty.

done:
	// Flush remaining scan work credit.
	if gcw.scanWork > 0 {
		// Add the remaining scan work to the global count.
		atomic.Xaddint64(&gcController.scanWork, gcw.scanWork)
		if flushBgCredit {
			gcFlushBgCredit(gcw.scanWork - initScanWork)
		}
		gcw.scanWork = 0
	}
}
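gcDrain does not update the global scan-work counter for every object. It accumulates work locally and flushes it in batches once a threshold is crossed, then flushes whatever remains on exit. The sketch below shows that batching pattern in isolation; `Global`, `Drain` and the `creditSlack` value are hypothetical stand-ins, not the runtime's types or constants.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// creditSlack is a hypothetical flush threshold standing in for gcCreditSlack.
const creditSlack = 2000

// Global holds the shared scan-work counter.
type Global struct{ scanWork int64 }

// Total returns the scan work flushed so far.
func (g *Global) Total() int64 { return atomic.LoadInt64(&g.scanWork) }

// Drain accounts for the given amounts of scan work, flushing the local
// counter to the global one in batches. It returns the number of flushes.
func Drain(g *Global, work []int64) int {
	var local int64
	flushes := 0
	for _, w := range work {
		local += w
		if local >= creditSlack { // enough accumulated: batched update
			atomic.AddInt64(&g.scanWork, local)
			local = 0
			flushes++
		}
	}
	if local > 0 { // flush whatever is left on the way out
		atomic.AddInt64(&g.scanWork, local)
		flushes++
	}
	return flushes
}

func main() {
	g := &Global{}
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ { // four workers sharing one global counter
		wg.Add(1)
		go func() {
			defer wg.Done()
			Drain(g, []int64{1500, 600, 100, 300})
		}()
	}
	wg.Wait()
	fmt.Println("global scan work:", g.Total())
}
```

Batching trades a slightly stale global counter for far fewer atomic operations on a cache line that every worker contends for.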

markroot

This function is used to scan the root objects

func markroot(gcw *gcWork, i uint32) {
	// TODO(austin): This is a bit ridiculous. Compute and store
	// the bases in gcMarkRootPrepare instead of the counts.
	baseFlushCache := uint32(fixedRootCount)
	baseData := baseFlushCache + uint32(work.nFlushCacheRoots)
	baseBSS := baseData + uint32(work.nDataRoots)
	baseSpans := baseBSS + uint32(work.nBSSRoots)
	baseStacks := baseSpans + uint32(work.nSpanRoots)
	end := baseStacks + uint32(work.nStackRoots)

	// Note: if you add a case here, please also update heapdump.go:dumproots.
	switch {
	// Flush the spans held in mcache.
	case baseFlushCache <= i && i < baseData:
		flushmcache(int(i - baseFlushCache))
	// Scan initialized global variables (the data segment).
	case baseData <= i && i < baseBSS:
		for _, datap := range activeModules() {
			markrootBlock(datap.data, datap.edata-datap.data, datap.gcdatamask.bytedata, gcw, int(i-baseData))
		}
	// Scan zero-initialized global variables (the BSS segment).
	case baseBSS <= i && i < baseSpans:
		for _, datap := range activeModules() {
			markrootBlock(datap.bss, datap.ebss-datap.bss, datap.gcbssmask.bytedata, gcw, int(i-baseBSS))
		}
	// Scan the finalizer queue.
	case i == fixedRootFinalizers:
		// Only do this once per GC cycle since we don't call
		// queuefinalizer during marking.
		if work.markrootDone {
			break
		}
		for fb := allfin; fb != nil; fb = fb.alllink {
			cnt := uintptr(atomic.Load(&fb.cnt))
			scanblock(uintptr(unsafe.Pointer(&fb.fin[0])), cnt*unsafe.Sizeof(fb.fin[0]), &finptrmask[0], gcw)
		}
	// Free the stacks of Gs that have terminated.
	case i == fixedRootFreeGStacks:
		// Only do this once per GC cycle; preferably
		// concurrently.
		if !work.markrootDone {
			// Switch to the system stack so we can call
			// stackfree.
			systemstack(markrootFreeGStacks)
		}
	// Scan MSpan.specials.
	case baseSpans <= i && i < baseStacks:
		// mark MSpan.specials
		markrootSpans(gcw, int(i-baseSpans))

	default:
		// the rest is scanning goroutine stacks
		// Get the g that needs to be scanned.
		var gp *g
		if baseStacks <= i && i < end {
			gp = allgs[i-baseStacks]
		} else {
			throw("markroot: bad index")
		}

		// remember when we've first observed the G blocked
		// needed only to output in traceback
		status := readgstatus(gp) // We are not in a scan state
		if (status == _Gwaiting || status == _Gsyscall) && gp.waitsince == 0 {
			gp.waitsince = work.tstart
		}

		// scang must be done on the system stack in case
		// we're trying to scan our own stack.
		// Hand the scan over to g0.
		systemstack(func() {
			// If this is a self-scan, put the user G in
			// _Gwaiting to prevent self-deadlock. It may
			// already be in _Gwaiting if this is a mark
			// worker or we're in mark termination.
			userG := getg().m.curg
			selfScan := gp == userG && readgstatus(userG) == _Grunning
			// When scanning our own stack, switch our own g's status first.
			if selfScan {
				casgstatus(userG, _Grunning, _Gwaiting)
				userG.waitreason = waitReasonGarbageCollectionScan
			}

			// TODO: scang blocks until gp's stack has
			// been scanned, which may take a while for
			// running goroutines. Consider doing this in
			// two phases where the first is non-blocking:
			// we scan the stacks we can and ask running
			// goroutines to scan themselves; and the
			// second blocks.
			// Scan the g's stack.
			scang(gp, gcw)

			if selfScan {
				casgstatus(userG, _Gwaiting, _Grunning)
			}
		})
	}
}

markrootBlock

Scans the region [b0, b0+n0) according to ptrmask0

func markrootBlock(b0, n0 uintptr, ptrmask0 *uint8, gcw *gcWork, shard int) {
	if rootBlockBytes%(8*sys.PtrSize) != 0 {
		// This is necessary to pick byte offsets in ptrmask0.
		throw("rootBlockBytes must be a multiple of 8*ptrSize")
	}

	b := b0 + uintptr(shard)*rootBlockBytes
	// If the shard to scan starts beyond the [b0, b0+n0) region, return immediately.
	if b >= b0+n0 {
		return
	}
	ptrmask := (*uint8)(add(unsafe.Pointer(ptrmask0), uintptr(shard)*(rootBlockBytes/(8*sys.PtrSize))))
	n := uintptr(rootBlockBytes)
	if b+n > b0+n0 {
		n = b0 + n0 - b
	}

	// Scan this shard.
	// Scan the given shard of the block.
	scanblock(b, n, ptrmask, gcw)
}
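The shard arithmetic in markrootBlock is easier to follow with concrete numbers. The sketch below reproduces only the range computation, using offsets relative to the start of the block. `ShardRange` and `NumShards` are hypothetical helpers, and the 256 KiB value given to `rootBlockBytes` is an assumption of this example.

```go
package main

import "fmt"

// rootBlockBytes is the shard size; 256 KiB is assumed for this sketch.
const rootBlockBytes = 256 << 10

// NumShards returns how many shards are needed to cover n bytes.
func NumShards(n uintptr) uintptr {
	return (n + rootBlockBytes - 1) / rootBlockBytes
}

// ShardRange returns the offset and size of the given shard inside a block
// of n bytes. ok is false when the shard starts beyond the end of the block.
func ShardRange(n, shard uintptr) (off, size uintptr, ok bool) {
	off = shard * rootBlockBytes
	if off >= n {
		return 0, 0, false
	}
	size = rootBlockBytes
	if off+size > n { // the last shard may be shorter than the rest
		size = n - off
	}
	return off, size, true
}

func main() {
	n := uintptr(600 << 10) // a 600 KiB data segment
	for s := uintptr(0); s <= NumShards(n); s++ {
		off, size, ok := ShardRange(n, s)
		fmt.Printf("shard %d: off=%d size=%d ok=%v\n", s, off, size, ok)
	}
}
```

Splitting a large root region into fixed-size shards lets several mark workers pick up independent jobs instead of one worker scanning the whole segment.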

scanblock

func scanblock(b0, n0 uintptr, ptrmask *uint8, gcw *gcWork) {
	// Use local copies of original parameters, so that a stack trace
	// due to one of the throws below shows the original block
	// base and extent.
	b := b0
	n := n0

	for i := uintptr(0); i < n; {
		// Find bits for the next word.
		// Find the corresponding bits in the bitmap.
		bits := uint32(*addb(ptrmask, i/(sys.PtrSize*8)))
		if bits == 0 {
			i += sys.PtrSize * 8
			continue
		}
		for j := 0; j < 8 && i < n; j++ {
			if bits&1 != 0 {
				// This word holds a pointer.
				// Same work as in scanobject; see comments there.
				obj := *(*uintptr)(unsafe.Pointer(b + i))
				if obj != 0 {
					// If an object is found at that address, grey it.
					if obj, span, objIndex := findObject(obj, b, i); obj != 0 {
						greyobject(obj, b, i, span, gcw, objIndex)
					}
				}
			}
			bits >>= 1
			i += sys.PtrSize
		}
	}
}
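The loop structure of scanblock, one mask byte per eight words with a fast skip over all-zero bytes, can be tried out on plain slices. In this sketch `PointerWords` is a hypothetical helper that works on word indexes instead of raw addresses, so it needs no unsafe code.

```go
package main

import "fmt"

// PointerWords returns the indexes of the words, out of nWords, whose bit
// is set in ptrmask. Bit j of ptrmask[i] describes word i*8+j. ptrmask
// must contain at least (nWords+7)/8 bytes.
func PointerWords(ptrmask []byte, nWords int) []int {
	var out []int
	for i := 0; i < nWords; {
		bits := ptrmask[i/8]
		if bits == 0 { // no pointers in these eight words: skip them all
			i += 8
			continue
		}
		for j := 0; j < 8 && i < nWords; j++ {
			if bits&1 != 0 {
				out = append(out, i) // this word holds a pointer
			}
			bits >>= 1
			i++
		}
	}
	return out
}

func main() {
	// Words 0 and 2 are pointers in the first byte, word 11 in the second.
	mask := []byte{0b00000101, 0b00001000}
	fmt.Println(PointerWords(mask, 12))
}
```

In the runtime, each index found this way is turned back into an address, the word at that address is loaded, and the object it points to is greyed.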

greyobject

Greying an object comes down to finding its mark bit in the bitmap, marking the object as live, and putting it on the work queue

func greyobject(obj, base, off uintptr, span *mspan, gcw *gcWork, objIndex uintptr) {
	// obj should be start of allocation, and so must be at least pointer-aligned.
	if obj&(sys.PtrSize-1) != 0 {
		throw("greyobject: obj not pointer-aligned")
	}
	mbits := span.markBitsForIndex(objIndex)

	if useCheckmark {
		// This branch is for debugging: it verifies that every object was marked correctly.
		if !mbits.isMarked() {
			// This object was not marked.
			printlock()
			print("runtime:greyobject: checkmarks finds unexpected unmarked object obj=", hex(obj), "\n")
			print("runtime: found obj at *(", hex(base), "+", hex(off), ")\n")

			// Dump the source (base) object
			gcDumpObject("base", base, off)

			// Dump the object
			gcDumpObject("obj", obj, ^uintptr(0))

			getg().m.traceback = 2
			throw("checkmark found unmarked object")
		}
		hbits := heapBitsForAddr(obj)
		if hbits.isCheckmarked(span.elemsize) {
			return
		}
		hbits.setCheckmarked(span.elemsize)
		if !hbits.isCheckmarked(span.elemsize) {
			throw("setCheckmarked and isCheckmarked disagree")
		}
	} else {
		if debug.gccheckmark > 0 && span.isFree(objIndex) {
			print("runtime: marking free object ", hex(obj), " found at *(", hex(base), "+", hex(off), ")\n")
			gcDumpObject("base", base, off)
			gcDumpObject("obj", obj, ^uintptr(0))
			getg().m.traceback = 2
			throw("marking free object")
		}

		// If marked we have nothing to do.
		// The object is already marked; nothing more to do.
		if mbits.isMarked() {
			return
		}
		// mbits.setMarked() // Avoid extra call overhead with manual inlining.
		// Mark the object.
		atomic.Or8(mbits.bytep, mbits.mask)
		// If this is a noscan object, fast-track it to black
		// instead of greying it.
		// If the object contains no pointers (a noscan span), marking it is enough;
		// it is not queued, which amounts to blackening it directly.
		if span.spanclass.noscan() {
			gcw.bytesMarked += uint64(span.elemsize)
			return
		}
	}

	// Queue the obj for scanning. The PREFETCH(obj) logic has been removed but
	// seems like a nice optimization that can be added back in.
	// There needs to be time between the PREFETCH and the use.
	// Previously we put the obj in an 8 element buffer that is drained at a rate
	// to give the PREFETCH time to do its work.
	// Use of PREFETCHNTA might be more appropriate than PREFETCH
	// Try the fast path to queue the object; if that fails, fall back to put.
	// This completes the greying step.
	if !gcw.putFast(obj) {
		gcw.put(obj)
	}
}
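The core of greyobject, "set the mark bit, and queue the object only if it may contain pointers", can be sketched with a plain bitmap. Note the differences from the runtime: this example uses a compare-and-swap loop on 32-bit words rather than the runtime's check followed by an atomic OR on a byte, and `MarkBits`, `TryMark` and `Grey` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// MarkBits is a toy mark bitmap with one bit per object index.
type MarkBits struct{ words []uint32 }

// NewMarkBits allocates a bitmap able to hold nObjects mark bits.
func NewMarkBits(nObjects uint) *MarkBits {
	return &MarkBits{words: make([]uint32, (nObjects+31)/32)}
}

// TryMark sets the mark bit for idx. It returns true only for the caller
// that actually flipped the bit from 0 to 1.
func (m *MarkBits) TryMark(idx uint) bool {
	word := &m.words[idx/32]
	mask := uint32(1) << (idx % 32)
	for {
		old := atomic.LoadUint32(word)
		if old&mask != 0 {
			return false // already marked: nothing to do
		}
		if atomic.CompareAndSwapUint32(word, old, old|mask) {
			return true
		}
	}
}

// IsMarked reports whether the mark bit for idx is set.
func (m *MarkBits) IsMarked(idx uint) bool {
	return atomic.LoadUint32(&m.words[idx/32])&(1<<(idx%32)) != 0
}

// Grey marks the object and appends it to the queue unless it is noscan,
// in which case marking alone is enough (it goes straight to black).
func Grey(m *MarkBits, queue []uint, idx uint, noscan bool) []uint {
	if !m.TryMark(idx) {
		return queue
	}
	if noscan {
		return queue
	}
	return append(queue, idx)
}

func main() {
	m := NewMarkBits(64)
	var queue []uint
	queue = Grey(m, queue, 3, false)
	queue = Grey(m, queue, 3, false) // second attempt is a no-op
	queue = Grey(m, queue, 40, true) // noscan: marked but not queued
	fmt.Println("queue:", queue, "40 marked:", m.IsMarked(40))
}
```

The mark bit is what makes greying idempotent: however many pointers lead to the same object, it is queued for scanning at most once in this sketch.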

gcWork.putFast

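A gcWork has two queues, wbuf1 and wbuf2, for holding grey objects. Grey objects are first added to wbuf1. When wbuf1 is full, wbuf1 and wbuf2 are swapped, so the former wbuf2 takes over as wbuf1 and continues to receive grey objects. When both queues are full, a new buffer is requested from the global list.

putFast only tries to put the object into wbuf1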

func (w *gcWork) putFast(obj uintptr) bool {
	wbuf := w.wbuf1
	if wbuf == nil {
		// No buffer has been allocated yet: return false.
		return false
	} else if wbuf.nobj == len(wbuf.obj) {
		// wbuf1 is full: return false.
		return false
	}

	// Add the object to wbuf1, which still has room.
	wbuf.obj[wbuf.nobj] = obj
	wbuf.nobj++
	return true
}

gcWork.put

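put not only tries to put the object into wbuf1; when wbuf1 is full it also swaps the roles of wbuf1 and wbuf2. If both are full, it hands the full buffer over to the global list and requests an empty one from it.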

func (w *gcWork) put(obj uintptr) {
	flushed := false
	wbuf := w.wbuf1
	if wbuf == nil {
		// If wbuf1 does not exist, initialize both wbuf1 and wbuf2.
		w.init()
		wbuf = w.wbuf1
		// wbuf is empty at this point.
	} else if wbuf.nobj == len(wbuf.obj) {
		// wbuf1 is full: swap the roles of wbuf1 and wbuf2.
		w.wbuf1, w.wbuf2 = w.wbuf2, w.wbuf1
		wbuf = w.wbuf1
		if wbuf.nobj == len(wbuf.obj) {
			// After the swap wbuf1 is full as well, so both buffers are full.
			// Hand wbuf1 over to the global list and get an empty buffer.
			putfull(wbuf)
			wbuf = getempty()
			w.wbuf1 = wbuf
			// Record that a buffer was handed over.
			flushed = true
		}
	}

	wbuf.obj[wbuf.nobj] = obj
	wbuf.nobj++

	// If we put a buffer on full, let the GC controller know so
	// it can encourage more workers to run. We delay this until
	// the end of put so that w is in a consistent state, since
	// enlistWorker may itself manipulate w.
	// The global list now holds a full buffer, so the GC controller may schedule more workers.
	if flushed && gcphase == _GCmark {
		gcController.enlistWorker()
	}
}

With that covered, let's go on to the functions called from gcDrain and trace how the objects we greyed are turned black.

gcw.balance()

Back in gcDrain, at the gcw.balance() call: what does balancing the work mean?

func (w *gcWork) balance() {
	if w.wbuf1 == nil {
		// wbuf1 and wbuf2 have not been initialized yet.
		return
	}
	// If wbuf2 is not empty, hand it over to the global list and get an empty buffer for wbuf2.
	if wbuf := w.wbuf2; wbuf.nobj != 0 {
		putfull(wbuf)
		w.wbuf2 = getempty()
	} else if wbuf := w.wbuf1; wbuf.nobj > 4 {
		// Split the partially filled wbuf1 in two and hand one half over to the global list.
		w.wbuf1 = handoff(wbuf)
	} else {
		return
	}
	// We flushed a buffer to the full list, so wake a worker.
	// The global list now holds a full buffer, so other workers can start working.
	if gcphase == _GCmark {
		gcController.enlistWorker()
	}
}

gcw.get()

Continuing with line 63 of gcDrain: get first tries to take an object from the local queues. If wbuf1 has none it tries wbuf2; if both are empty it tries to take a full queue from the global list and returns an object from that.

func (w *gcWork) get() uintptr {
    wbuf := w.wbuf1
    if wbuf == nil {
        w.init()
        wbuf = w.wbuf1
        // wbuf is empty at this point.
    }
    if wbuf.nobj == 0 {
        // wbuf1 is empty: swap the roles of wbuf1 and wbuf2.
        w.wbuf1, w.wbuf2 = w.wbuf2, w.wbuf1
        wbuf = w.wbuf1
        // The former wbuf2 is empty too: try to take a full queue from the global list.
        if wbuf.nobj == 0 {
            owbuf := wbuf
            wbuf = getfull()
            // Nothing available, so return.
            if wbuf == nil {
                return 0
            }
            // Return the empty queue to the global empty list and use
            // the full queue we obtained as our own wbuf1.
            putempty(owbuf)
            w.wbuf1 = wbuf
        }
    }

    // TODO: This might be a good place to add prefetch code

    wbuf.nobj--
    return wbuf.obj[wbuf.nobj]
}

gcw.tryGet() and gcw.tryGetFast() follow much the same logic and are comparatively simple, so we will not analyze them further.

scanobject

We now reach line 76 of gcDrain. At this point b has been obtained and the worker starts consuming the queue.

func scanobject(b uintptr, gcw *gcWork) {
    // Find the bits for b and the size of the object at b.
    //
    // b is either the beginning of an object, in which case this
    // is the size of the object to scan, or it points to an
    // oblet, in which case we compute the size to scan below.
    // Get the heap bits for b.
    hbits := heapBitsForAddr(b)
    // Get the span that contains b.
    s := spanOfUnchecked(b)
    n := s.elemsize
    if n == 0 {
        throw("scanobject n == 0")
    }
    // The object is too large: split it and scan the pieces. maxObletBytes is 128k.
    if n > maxObletBytes {
        // Large object. Break into oblets for better
        // parallelism and lower latency.
        if b == s.base() {
            // It's possible this is a noscan object (not
            // from greyobject, but from other code
            // paths), in which case we must *not* enqueue
            // oblets since their bitmaps will be
            // uninitialized.
            // The span holds no pointers: account for it and return,
            // which is equivalent to marking it black.
            if s.spanclass.noscan() {
                // Bypass the whole scan.
                gcw.bytesMarked += uint64(n)
                return
            }

            // Enqueue the other oblets to scan later.
            // Some oblets may be in b's scalar tail, but
            // these will be marked as "no more pointers",
            // so we'll drop out immediately when we go to
            // scan those.
            // Split by maxObletBytes and put the pieces on the queue.
            for oblet := b + maxObletBytes; oblet < s.base()+s.elemsize; oblet += maxObletBytes {
                if !gcw.putFast(oblet) {
                    gcw.put(oblet)
                }
            }
        }

        // Compute the size of the oblet. Since this object
        // must be a large object, s.base() is the beginning
        // of the object.
        n = s.base() + s.elemsize - b
        if n > maxObletBytes {
            n = maxObletBytes
        }
    }

    var i uintptr
    for i = 0; i < n; i += sys.PtrSize {
        // Find bits for this word.
        // Get the corresponding bits.
        if i != 0 {
            // Avoid needless hbits.next() on last iteration.
            hbits = hbits.next()
        }
        // Load bits once. See CL 22712 and issue 16973 for discussion.
        bits := hbits.bits()
        // During checkmarking, 1-word objects store the checkmark
        // in the type bit for the one word. The only one-word objects
        // are pointers, or else they'd be merged with other non-pointer
        // data into larger allocations.
        if i != 1*sys.PtrSize && bits&bitScan == 0 {
            break // no more pointers in this object
        }
        // Not a pointer: continue.
        if bits&bitPointer == 0 {
            continue // not a pointer
        }

        // Work here is duplicated in scanblock and above.
        // If you make changes here, make changes there too.
        obj := *(*uintptr)(unsafe.Pointer(b + i))

        // At this point we have extracted the next potential pointer.
        // Quickly filter out nil and pointers back to the current object.
        if obj != 0 && obj-b >= n {
            // Test if obj points into the Go heap and, if so,
            // mark the object.
            //
            // Note that it's possible for findObject to
            // fail if obj points to a just-allocated heap
            // object because of a race with growing the
            // heap. In this case, we know the object was
            // just allocated and hence will be marked by
            // allocation itself.
            // Find the object the pointer refers to and mark it grey.
            if obj, span, objIndex := findObject(obj, b, i); obj != 0 {
                greyobject(obj, b, i, span, gcw, objIndex)
            }
        }
    }
    gcw.bytesMarked += uint64(n)
    gcw.scanWork += int64(i)
}
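The oblet arithmetic in scanobject is easy to check in isolation. The sketch below reuses the maxObletBytes value quoted in the code above (128k); the oblet type and the splitOblets helper are hypothetical names made up for this illustration and do not exist in the runtime.

```go
package main

import "fmt"

// maxObletBytes matches the 128k value mentioned in the scanobject walkthrough.
const maxObletBytes = 128 << 10

// oblet is a hypothetical record of one scan unit: start address and length in bytes.
type oblet struct {
	start, n uintptr
}

// splitOblets lists the pieces a large object of the given size starting at
// base would be scanned in: every piece is at most maxObletBytes long and
// the last piece covers whatever remains.
func splitOblets(base, size uintptr) []oblet {
	var out []oblet
	for b := base; b < base+size; b += maxObletBytes {
		n := base + size - b
		if n > maxObletBytes {
			n = maxObletBytes
		}
		out = append(out, oblet{start: b, n: n})
	}
	return out
}

func main() {
	// A 300 KiB object at an arbitrary, made-up base address.
	for _, o := range splitOblets(0x100000, 300<<10) {
		fmt.Printf("oblet at %#x, %d KiB\n", o.start, o.n>>10)
	}
}
```

A 300 KiB object yields three pieces of 128, 128 and 44 KiB. In the runtime only the pieces after the first are enqueued, because the first one is scanned by the current call.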

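In summary, marking an object grey means setting its mark bit and putting it on a queue, while marking it black only requires that the mark bit be set. So once a grey object has been taken off the queue to be scanned, we can treat it as a black object.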
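The grey and black states described here can be illustrated with a small worklist-based marker. This is a toy model under obvious simplifications: the object type, its explicit color field and the mark function are hypothetical; the runtime stores a single mark bit per object and infers "grey" from queue membership rather than keeping a color.

```go
package main

import "fmt"

type color int

const (
	white color = iota // not yet reached
	grey               // marked and waiting on the queue
	black              // marked and already scanned
)

// object is a toy heap object with outgoing references.
type object struct {
	name string
	refs []*object
	c    color
}

// mark runs tri-color marking from roots using an explicit work queue.
func mark(roots []*object) {
	var queue []*object
	greyobject := func(o *object) {
		if o.c == white { // marking and enqueuing together is "greying"
			o.c = grey
			queue = append(queue, o)
		}
	}
	for _, r := range roots {
		greyobject(r)
	}
	for len(queue) > 0 {
		o := queue[len(queue)-1]
		queue = queue[:len(queue)-1]
		for _, ref := range o.refs {
			greyobject(ref)
		}
		o.c = black // dequeued and scanned
	}
}

func main() {
	c := &object{name: "c"}
	b := &object{name: "b", refs: []*object{c}}
	a := &object{name: "a", refs: []*object{b}}
	b.refs = append(b.refs, a) // a cycle is handled because a is no longer white
	d := &object{name: "d"}    // unreachable
	mark([]*object{a})
	for _, o := range []*object{a, b, c, d} {
		fmt.Println(o.name, "is garbage:", o.c == white)
	}
}
```

After mark returns, nothing is grey: every reachable object is black, and the objects that stayed white are the garbage.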

That completes the analysis of the marking work done by gcDrain. We now return to gcBgMarkWorker.

gcMarkDone

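gcMarkDone moves the GC from the mark 1 phase to the mark 2 phase, and from the mark 2 phase to the mark termination phase.

Mark 1 phase: covers marking of all roots, using both the global queue and the local cached queues.

Mark 2 phase: the local cached queues are disabled.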

func gcMarkDone() {
top:
    semacquire(&work.markDoneSema)

    // Re-check transition condition under transition lock.
    if !(gcphase == _GCmark && work.nwait == work.nproc && !gcMarkWorkAvailable(nil)) {
        semrelease(&work.markDoneSema)
        return
    }

    // Disallow starting new workers so that any remaining workers
    // in the current mark phase will drain out.
    //
    // TODO(austin): Should dedicated workers keep an eye on this
    // and exit gcDrain promptly?
    // Forbid new mark workers from starting.
    atomic.Xaddint64(&gcController.dedicatedMarkWorkersNeeded, -0xffffffff)
    prevFractionalGoal := gcController.fractionalUtilizationGoal
    gcController.fractionalUtilizationGoal = 0

    // gcBlackenPromptly indicates that all local work caches must be
    // flushed to the global queue immediately and that local caching is disabled.
    if !gcBlackenPromptly {
        // Transition from mark 1 to mark 2.
        //
        // The global work list is empty, but there can still be work
        // sitting in the per-P work caches.
        // Flush and disable work caches.

        // Disallow caching workbufs and indicate that we're in mark 2.
        // Disable the local cached queues and enter the mark 2 phase.
        gcBlackenPromptly = true

        // Prevent completion of mark 2 until we've flushed
        // cached workbufs.
        atomic.Xadd(&work.nwait, -1)

        // GC is set up for mark 2. Let Gs blocked on the
        // transition lock go while we flush caches.
        semrelease(&work.markDoneSema)

        // Switch to g0 to flush the local caches to the global queue.
        systemstack(func() {
            // Flush all currently cached workbufs and
            // ensure all Ps see gcBlackenPromptly. This
            // also blocks until any remaining mark 1
            // workers have exited their loop so we can
            // start new mark 2 workers.
            forEachP(func(_p_ *p) {
                wbBufFlush2(_p_)
                _p_.gcw.dispose()
            })
        })

        // Check that roots are marked. We should be able to
        // do this before the forEachP, but based on issue
        // #16083 there may be a (harmless) race where we can
        // enter mark 2 while some workers are still scanning
        // stacks. The forEachP ensures these scans are done.
        //
        // TODO(austin): Figure out the race and fix this
        // properly.
        // Check that every root has been marked.
        gcMarkRootCheck()

        // Now we can start up mark 2 workers.
        atomic.Xaddint64(&gcController.dedicatedMarkWorkersNeeded, 0xffffffff)
        gcController.fractionalUtilizationGoal = prevFractionalGoal

        incnwait := atomic.Xadd(&work.nwait, +1)
        // If there is no more work, run a second pass to move from
        // mark 2 to mark termination.
        if incnwait == work.nproc && !gcMarkWorkAvailable(nil) {
            // This loop will make progress because
            // gcBlackenPromptly is now true, so it won't
            // take this same "if" branch.
            goto top
        }
    } else {
        // Transition to mark termination.
        now := nanotime()
        work.tMarkTerm = now
        work.pauseStart = now
        getg().m.preemptoff = "gcing"
        if trace.enabled {
            traceGCSTWStart(0)
        }
        systemstack(stopTheWorldWithSema)
        // The gcphase is _GCmark, it will transition to _GCmarktermination
        // below. The important thing is that the wb remains active until
        // all marking is complete. This includes writes made by the GC.

        // Record that one root marking pass has completed.
        work.markrootDone = true

        // Disable assists and background workers. We must do
        // this before waking blocked assists.
        atomic.Store(&gcBlackenEnabled, 0)

        // Wake all blocked assists. These will run when we
        // start the world again.
        // Wake every blocked GC assist.
        gcWakeAllAssists()

        // Likewise, release the transition lock. Blocked
        // workers and assists will run when we start the
        // world again.
        semrelease(&work.markDoneSema)

        // endCycle depends on all gcWork cache stats being
        // flushed. This is ensured by mark 2.
        // Compute the trigger threshold for the next GC.
        nextTriggerRatio := gcController.endCycle()

        // Perform mark termination. This will restart the world.
        // Enter the completion phase, which also starts the world again.
        gcMarkTermination(nextTriggerRatio)
    }
}

gcMarkTermination

gcMarkTermination finishes marking and then kicks off sweeping and the related bookkeeping.

func gcMarkTermination(nextTriggerRatio float64) {
    // World is stopped.
    // Start marktermination which includes enabling the write barrier.
    atomic.Store(&gcBlackenEnabled, 0)
    gcBlackenPromptly = false
    // Set the GC phase flag.
    setGCPhase(_GCmarktermination)

    work.heap1 = memstats.heap_live
    startTime := nanotime()

    mp := acquirem()
    mp.preemptoff = "gcing"
    _g_ := getg()
    _g_.m.traceback = 2
    gp := _g_.m.curg
    // Put the current g into the waiting state.
    casgstatus(gp, _Grunning, _Gwaiting)
    gp.waitreason = waitReasonGarbageCollection

    // Run gc on the g0 stack. We do this so that the g stack
    // we're currently running on will no longer change. Cuts
    // the root set down a bit (g0 stacks are not scanned, and
    // we don't need to scan gc's internal state).  We also
    // need to switch to g0 so we can shrink the stack.
    systemstack(func() {
        // Scan the current g's stack from g0.
        gcMark(startTime)
        // Must return immediately.
        // The outer function's stack may have moved
        // during gcMark (it shrinks stacks, including the
        // outer function's stack), so we must not refer
        // to any of its variables. Return back to the
        // non-system stack to pick up the new addresses
        // before continuing.
    })

    systemstack(func() {
        work.heap2 = work.bytesMarked
        if debug.gccheckmark > 0 {
            // Run a full stop-the-world mark using checkmark bits,
            // to check that we didn't forget to mark anything during
            // the concurrent mark process.
            // With gccheckmark enabled, verify that every reachable object is marked.
            gcResetMarkState()
            initCheckmarks()
            gcMark(startTime)
            clearCheckmarks()
        }

        // marking is complete so we can turn the write barrier off
        // Set the GC phase flag; entering _GCoff turns the write barrier off.
        setGCPhase(_GCoff)
        // Start sweeping.
        gcSweep(work.mode)

        if debug.gctrace > 1 {
            startTime = nanotime()
            // The g stacks have been scanned so
            // they have gcscanvalid==true and gcworkdone==true.
            // Reset these so that all stacks will be rescanned.
            gcResetMarkState()
            finishsweep_m()

            // Still in STW but gcphase is _GCoff, reset to _GCmarktermination
            // At this point all objects will be found during the gcMark which
            // does a complete STW mark and object scan.
            setGCPhase(_GCmarktermination)
            gcMark(startTime)
            setGCPhase(_GCoff) // marking is done, turn off wb.
            gcSweep(work.mode)
        }
    })

    _g_.m.traceback = 0
    casgstatus(gp, _Gwaiting, _Grunning)

    if trace.enabled {
        traceGCDone()
    }

    // all done
    mp.preemptoff = ""

    if gcphase != _GCoff {
        throw("gc done but gcphase != _GCoff")
    }

    // Update GC trigger and pacing for the next cycle.
    // Update the heap growth ratio that triggers the next GC.
    gcSetTriggerRatio(nextTriggerRatio)

    // Update timing memstats
    // Update the timing figures.
    now := nanotime()
    sec, nsec, _ := time_now()
    unixNow := sec*1e9 + int64(nsec)
    work.pauseNS += now - work.pauseStart
    work.tEnd = now
    atomic.Store64(&memstats.last_gc_unix, uint64(unixNow)) // must be Unix time to make sense to user
    atomic.Store64(&memstats.last_gc_nanotime, uint64(now)) // monotonic time for us
    memstats.pause_ns[memstats.numgc%uint32(len(memstats.pause_ns))] = uint64(work.pauseNS)
    memstats.pause_end[memstats.numgc%uint32(len(memstats.pause_end))] = uint64(unixNow)
    memstats.pause_total_ns += uint64(work.pauseNS)

    // Update work.totaltime.
    sweepTermCpu := int64(work.stwprocs) * (work.tMark - work.tSweepTerm)
    // We report idle marking time below, but omit it from the
    // overall utilization here since it's "free".
    markCpu := gcController.assistTime + gcController.dedicatedMarkTime + gcController.fractionalMarkTime
    markTermCpu := int64(work.stwprocs) * (work.tEnd - work.tMarkTerm)
    cycleCpu := sweepTermCpu + markCpu + markTermCpu
    work.totaltime += cycleCpu

    // Compute overall GC CPU utilization.
    totalCpu := sched.totaltime + (now-sched.procresizetime)*int64(gomaxprocs)
    memstats.gc_cpu_fraction = float64(work.totaltime) / float64(totalCpu)

    // Reset sweep state.
    sweep.nbgsweep = 0
    sweep.npausesweep = 0

    // If this GC was forced by the user, bump the forced-GC counter.
    if work.userForced {
        memstats.numforcedgc++
    }

    // Bump GC cycle count and wake goroutines waiting on sweep.
    // Count the GC cycle, then wake the Gs waiting on sweep.
    lock(&work.sweepWaiters.lock)
    memstats.numgc++
    injectglist(work.sweepWaiters.head.ptr())
    work.sweepWaiters.head = 0
    unlock(&work.sweepWaiters.lock)

    // Finish the current heap profiling cycle and start a new
    // heap profiling cycle. We do this before starting the world
    // so events don't leak into the wrong cycle.
    mProf_NextCycle()

    // start the world
    systemstack(func() { startTheWorldWithSema(true) })

    // Flush the heap profile so we can start a new cycle next GC.
    // This is relatively expensive, so we don't do it with the
    // world stopped.
    mProf_Flush()

    // Prepare workbufs for freeing by the sweeper. We do this
    // asynchronously because it can take non-trivial time.
    prepareFreeWorkbufs()

    // Free stack spans. This must be done between GC cycles.
    systemstack(freeStackSpans)

    // Print gctrace before dropping worldsema. As soon as we drop
    // worldsema another cycle could start and smash the stats
    // we're trying to print.
    if debug.gctrace > 0 {
        util := int(memstats.gc_cpu_fraction * 100)

        var sbuf [24]byte
        printlock()
        print("gc ", memstats.numgc,
            " @", string(itoaDiv(sbuf[:], uint64(work.tSweepTerm-runtimeInitTime)/1e6, 3)), "s ",
            util, "%: ")
        prev := work.tSweepTerm
        for i, ns := range []int64{work.tMark, work.tMarkTerm, work.tEnd} {
            if i != 0 {
                print("+")
            }
            print(string(fmtNSAsMS(sbuf[:], uint64(ns-prev))))
            prev = ns
        }
        print(" ms clock, ")
        for i, ns := range []int64{sweepTermCpu, gcController.assistTime, gcController.dedicatedMarkTime + gcController.fractionalMarkTime, gcController.idleMarkTime, markTermCpu} {
            if i == 2 || i == 3 {
                // Separate mark time components with /.
                print("/")
            } else if i != 0 {
                print("+")
            }
            print(string(fmtNSAsMS(sbuf[:], uint64(ns))))
        }
        print(" ms cpu, ",
            work.heap0>>20, "->", work.heap1>>20, "->", work.heap2>>20, " MB, ",
            work.heapGoal>>20, " MB goal, ",
            work.maxprocs, " P")
        if work.userForced {
            print(" (forced)")
        }
        print("\n")
        printunlock()
    }

    semrelease(&worldsema)
    // Careful: another GC cycle may start now.

    releasem(mp)
    mp = nil

    // now that gc is done, kick off finalizer thread if needed
    // If sweeping is not concurrent, yield the current M to the scheduler.
    if !concurrentSweep {
        // give the queued finalizers, if any, a chance to run
        Gosched()
    }
}

gcSweep

The sweep task.

func gcSweep(mode gcMode) {
    if gcphase != _GCoff {
        throw("gcSweep being done but phase is not GCoff")
    }

    lock(&mheap_.lock)
    // sweepgen grows by 2 after every GC, and the two sweepSpans sets
    // swap roles after every GC.
    mheap_.sweepgen += 2
    mheap_.sweepdone = 0
    if mheap_.sweepSpans[mheap_.sweepgen/2%2].index != 0 {
        // We should have drained this list during the last
        // sweep phase. We certainly need to start this phase
        // with an empty swept list.
        throw("non-empty swept list")
    }
    mheap_.pagesSwept = 0
    unlock(&mheap_.lock)

    // Sweeping is not concurrent, or this is a forced blocking GC.
    if !_ConcurrentSweep || mode == gcForceBlockMode {
        // Special case synchronous sweep.
        // Record that no proportional sweeping has to happen.
        lock(&mheap_.lock)
        mheap_.sweepPagesPerByte = 0
        unlock(&mheap_.lock)
        // Sweep all spans eagerly.
        for sweepone() != ^uintptr(0) {
            sweep.npausesweep++
        }
        // Free workbufs eagerly.
        prepareFreeWorkbufs()
        for freeSomeWbufs(false) {
        }
        // All "free" events for this mark/sweep cycle have
        // now happened, so we can make this profile cycle
        // available immediately.
        mProf_NextCycle()
        mProf_Flush()
        return
    }

    // Background sweep.
    lock(&sweep.lock)
    // Wake the background sweeper, i.e. the bgsweep function. Its sweep
    // flow is much the same as the non-concurrent sweep above.
    if sweep.parked {
        sweep.parked = false
        ready(sweep.g, 0, true)
    }
    unlock(&sweep.lock)
}
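The index expressions in gcSweep and sweepone show how the two sweepSpans sets trade places: sweepgen/2%2 selects the swept set and 1-sweepgen/2%2 the unswept set. The short sketch below evaluates those two expressions for a few cycles; sweptIndex and unsweptIndex are hypothetical helper names that simply wrap the expressions from the source.

```go
package main

import "fmt"

// sweptIndex is the sweepSpans slot holding already-swept spans for a given sweepgen.
func sweptIndex(sweepgen uint32) uint32 {
	return sweepgen / 2 % 2
}

// unsweptIndex is the sweepSpans slot holding spans that still need sweeping.
func unsweptIndex(sweepgen uint32) uint32 {
	return 1 - sweepgen/2%2
}

func main() {
	// sweepgen is advanced by 2 at the start of every sweep phase.
	for sg := uint32(2); sg <= 8; sg += 2 {
		fmt.Printf("sweepgen=%d swept=sweepSpans[%d] unswept=sweepSpans[%d]\n",
			sg, sweptIndex(sg), unsweptIndex(sg))
	}
}
```

Because sweepgen advances by 2 each cycle, the slot that collected swept spans in one cycle is exactly the slot sweepone pops unswept spans from in the next.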

sweepone

Next, let us walk through the sweep flow of sweepone.

func sweepone() uintptr {
    _g_ := getg()
    sweepRatio := mheap_.sweepPagesPerByte // For debugging

    // increment locks to ensure that the goroutine is not preempted
    // in the middle of sweep thus leaving the span in an inconsistent state for next GC
    _g_.m.locks++
    // Check whether sweeping has already finished.
    if atomic.Load(&mheap_.sweepdone) != 0 {
        _g_.m.locks--
        return ^uintptr(0)
    }
    // Increase the number of active sweep workers.
    atomic.Xadd(&mheap_.sweepers, +1)

    npages := ^uintptr(0)
    sg := mheap_.sweepgen
    for {
        // Keep fetching spans that need sweeping.
        s := mheap_.sweepSpans[1-sg/2%2].pop()
        if s == nil {
            atomic.Store(&mheap_.sweepdone, 1)
            break
        }
        if s.state != mSpanInUse {
            // This can happen if direct sweeping already
            // swept this span, but in that case the sweep
            // generation should always be up-to-date.
            if s.sweepgen != sg {
                print("runtime: bad span s.state=", s.state, " s.sweepgen=", s.sweepgen, " sweepgen=", sg, "\n")
                throw("non in-use span in unswept list")
            }
            continue
        }
        // sweepgen == h->sweepgen - 2: the span needs sweeping.
        // sweepgen == h->sweepgen - 1: the span is currently being swept.
        // Check the span's state here and try to transition it.
        if s.sweepgen != sg-2 || !atomic.Cas(&s.sweepgen, sg-2, sg-1) {
            continue
        }
        npages = s.npages
        // Sweep this single span.
        if !s.sweep(false) {
            // Span is still in-use, so this returned no
            // pages to the heap and the span needs to
            // move to the swept in-use list.
            npages = 0
        }
        break
    }

    // Decrement the number of active sweepers and if this is the
    // last one print trace information.
    // This worker's sweep task is done: update the sweepers count.
    if atomic.Xadd(&mheap_.sweepers, -1) == 0 && atomic.Load(&mheap_.sweepdone) != 0 {
        if debug.gcpacertrace > 0 {
            print("pacer: sweep done at heap size ", memstats.heap_live>>20, "MB; allocated ", (memstats.heap_live-mheap_.sweepHeapLiveBasis)>>20, "MB during sweep; swept ", mheap_.pagesSwept, " pages at ", sweepRatio, " pages/byte\n")
        }
    }
    _g_.m.locks--
    return npages
}
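The sweepgen check in sweepone is a claim protocol: a span at sg-2 needs sweeping, and whichever sweeper wins the compare-and-swap to sg-1 owns it. The sketch below models only that protocol with sync/atomic. The span type, tryClaim, newSpans and sweepAll are hypothetical names for this illustration, and the "sweep" is reduced to storing sg, the value that marks a span as swept.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// span is a toy span that carries only its sweep generation.
type span struct {
	sweepgen uint32
}

// tryClaim moves a span from "needs sweeping" (sg-2) to "being swept" (sg-1).
// Only one caller can win the compare-and-swap for a given span.
func tryClaim(s *span, sg uint32) bool {
	return atomic.LoadUint32(&s.sweepgen) == sg-2 &&
		atomic.CompareAndSwapUint32(&s.sweepgen, sg-2, sg-1)
}

// newSpans builds n spans that all carry generation gen.
func newSpans(n int, gen uint32) []*span {
	spans := make([]*span, n)
	for i := range spans {
		spans[i] = &span{sweepgen: gen}
	}
	return spans
}

// sweepAll lets several goroutines race over the same spans and returns
// how many claims succeeded in total.
func sweepAll(spans []*span, sg uint32, workers int) int64 {
	var claimed int64
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for _, s := range spans {
				if tryClaim(s, sg) {
					atomic.AddInt64(&claimed, 1)
					// Sweeping finished: publish the span as swept.
					atomic.StoreUint32(&s.sweepgen, sg)
				}
			}
		}()
	}
	wg.Wait()
	return claimed
}

func main() {
	spans := newSpans(100, 2)
	fmt.Println("claims:", sweepAll(spans, 4, 8)) // every span is claimed exactly once
}
```

However many workers race, the number of successful claims equals the number of spans, which is the property the runtime relies on to ensure that a span is never swept twice in one cycle.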

mspan.sweep

func (s *mspan) sweep(preserve bool) bool {
    // It's critical that we enter this function with preemption disabled,
    // GC must not start while we are in the middle of this function.
    _g_ := getg()
    if _g_.m.locks == 0 && _g_.m.mallocing == 0 && _g_ != _g_.m.g0 {
        throw("MSpan_Sweep: m is not locked")
    }
    sweepgen := mheap_.sweepgen
    // Only a span in the being-swept state may proceed.
    if s.state != mSpanInUse || s.sweepgen != sweepgen-1 {
        print("MSpan_Sweep: state=", s.state, " sweepgen=", s.sweepgen, " mheap.sweepgen=", sweepgen, "\n")
        throw("MSpan_Sweep: bad span state")
    }

    if trace.enabled {
        traceGCSweepSpan(s.npages * _PageSize)
    }
    // Update the count of swept pages first.
    atomic.Xadd64(&mheap_.pagesSwept, int64(s.npages))

    spc := s.spanclass
    size := s.elemsize
    res := false

    c := _g_.m.mcache
    freeToHeap := false

    // The allocBits indicate which unmarked objects don't need to be
    // processed since they were free at the end of the last GC cycle
    // and were not allocated since then.
    // If the allocBits index is >= s.freeindex and the bit
    // is not marked then the object remains unallocated
    // since the last GC.
    // This situation is analogous to being on a freelist.

    // Unlink & free special records for any objects we're about to free.
    // Two complications here:
    // 1. An object can have both finalizer and profile special records.
    //    In such case we need to queue finalizer for execution,
    //    mark the object as live and preserve the profile special.
    // 2. A tiny object can have several finalizers setup for different offsets.
    //    If such object is not marked, we need to queue all finalizers at once.
    // Both 1 and 2 are possible at the same time.
    specialp := &s.specials
    special := *specialp
    // For objects with special records, check whether each one is live and
    // whether it has at least one finalizer. Free the specials of objects
    // without finalizers and queue the finalizers of those that have them.
    for special != nil {
        // A finalizer can be set for an inner byte of an object, find object beginning.
        objIndex := uintptr(special.offset) / size
        p := s.base() + objIndex*size
        mbits := s.markBitsForIndex(objIndex)
        if !mbits.isMarked() {
            // This object is not marked and has at least one special record.
            // Pass 1: see if it has at least one finalizer.
            hasFin := false
            endOffset := p - s.base() + size
            for tmp := special; tmp != nil && uintptr(tmp.offset) < endOffset; tmp = tmp.next {
                if tmp.kind == _KindSpecialFinalizer {
                    // Stop freeing of object if it has a finalizer.
                    mbits.setMarkedNonAtomic()
                    hasFin = true
                    break
                }
            }
            // Pass 2: queue all finalizers _or_ handle profile record.
            for special != nil && uintptr(special.offset) < endOffset {
                // Find the exact byte for which the special was setup
                // (as opposed to object beginning).
                p := s.base() + uintptr(special.offset)
                if special.kind == _KindSpecialFinalizer || !hasFin {
                    // Splice out special record.
                    y := special
                    special = special.next
                    *specialp = special
                    freespecial(y, unsafe.Pointer(p), size)
                } else {
                    // This is profile record, but the object has finalizers (so kept alive).
                    // Keep special record.
                    specialp = &special.next
                    special = *specialp
                }
            }
        } else {
            // object is still live: keep special record
            specialp = &special.next
            special = *specialp
        }
    }

    if debug.allocfreetrace != 0 || raceenabled || msanenabled {
        // Find all newly freed objects. This doesn't have to
        // efficient; allocfreetrace has massive overhead.
        mbits := s.markBitsForBase()
        abits := s.allocBitsForIndex(0)
        for i := uintptr(0); i < s.nelems; i++ {
            if !mbits.isMarked() && (abits.index < s.freeindex || abits.isMarked()) {
                x := s.base() + i*s.elemsize
                if debug.allocfreetrace != 0 {
                    tracefree(unsafe.Pointer(x), size)
                }
                if raceenabled {
                    racefree(unsafe.Pointer(x), size)
                }
                if msanenabled {
                    msanfree(unsafe.Pointer(x), size)
                }
            }
            mbits.advance()
            abits.advance()
        }
    }

    // Count the number of free objects in this span.
    // nalloc is the number of objects that are still allocated (marked).
    nalloc := uint16(s.countAlloc())
    // If sizeclass is 0 and nothing is allocated any more, free the span to mheap.
    if spc.sizeclass() == 0 && nalloc == 0 {
        s.needzero = 1
        freeToHeap = true
    }
    nfreed := s.allocCount - nalloc
    if nalloc > s.allocCount {
        print("runtime: nelems=", s.nelems, " nalloc=", nalloc, " previous allocCount=", s.allocCount, " nfreed=", nfreed, "\n")
        throw("sweep increased allocation count")
    }

    s.allocCount = nalloc
    // Record whether the span had no free slot left before this sweep.
    wasempty := s.nextFreeIndex() == s.nelems
    // Reset freeindex.
    s.freeindex = 0 // reset allocation index to start of span.
    if trace.enabled {
        getg().m.p.ptr().traceReclaimed += uintptr(nfreed) * s.elemsize
    }

    // gcmarkBits becomes the allocBits.
    // get a fresh cleared gcmarkBits in preparation for next GC
    // Replace allocBits with gcmarkBits.
    s.allocBits = s.gcmarkBits
    // Reset gcmarkBits.
    s.gcmarkBits = newMarkBits(s.nelems)

    // Initialize alloc bits cache.
    // Refresh allocCache.
    s.refillAllocCache(0)

    // We need to set s.sweepgen = h.sweepgen only when all blocks are swept,
    // because of the potential for a concurrent free/SetFinalizer.
    // But we need to set it before we make the span available for allocation
    // (return it to heap or mcentral), because allocation code assumes that a
    // span is already swept if available for allocation.
    if freeToHeap || nfreed == 0 {
        // The span must be in our exclusive ownership until we update sweepgen,
        // check for potential races.
        if s.state != mSpanInUse || s.sweepgen != sweepgen-1 {
            print("MSpan_Sweep: state=", s.state, " sweepgen=", s.sweepgen, " mheap.sweepgen=", sweepgen, "\n")
            throw("MSpan_Sweep: bad span state after sweep")
        }
        // Serialization point.
        // At this point the mark bits are cleared and allocation ready
        // to go so release the span.
        atomic.Store(&s.sweepgen, sweepgen)
    }

    if nfreed > 0 && spc.sizeclass() != 0 {
        c.local_nsmallfree[spc.sizeclass()] += uintptr(nfreed)
        // Return the span to mcentral.
        res = mheap_.central[spc].mcentral.freeSpan(s, preserve, wasempty)
        // MCentral_FreeSpan updates sweepgen
    } else if freeToHeap {
        // This is the release of a large-object span, matching line 117.
        // Free large span to heap

        // NOTE(rsc,dvyukov): The original implementation of efence
        // in CL 22060046 used SysFree instead of SysFault, so that
        // the operating system would eventually give the memory
        // back to us again, so that an efence program could run
        // longer without running out of memory. Unfortunately,
        // calling SysFree here without any kind of adjustment of the
        // heap data structures means that when the memory does
        // come back to us, we have the wrong metadata for it, either in
        // the MSpan structures or in the garbage collection bitmap.
        // Using SysFault here means that the program will run out of
        // memory fairly quickly in efence mode, but at least it won't
        // have mysterious crashes due to confused memory reuse.
        // It should be possible to switch back to SysFree if we also
        // implement and then call some kind of MHeap_DeleteSpan.
        if debug.efence > 0 {
            s.limit = 0 // prevent mlookup from finding this span
            sysFault(unsafe.Pointer(s.base()), size)
        } else {
            // Free the span to mheap.
            mheap_.freeSpan(s, 1)
        }
        c.local_nlargefree++
        c.local_largefree += size
        res = true
    }
    if !res {
        // The span has been swept and is still in-use, so put
        // it on the swept in-use list.
        // The span was not released to mcentral or mheap, so it is still in use.
        mheap_.sweepSpans[sweepgen/2%2].push(s)
    }
    return res
}
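The core bookkeeping of mspan.sweep is that the mark bits of the finished cycle become the allocation bits of the next one, so that unmarked slots become free without being touched individually. The toy model below shows just that step. It assumes one bool per object slot in place of the runtime's packed bitmaps, and toySpan with its sweep method are hypothetical names.

```go
package main

import "fmt"

// toySpan keeps one bool per object slot instead of packed bitmaps.
type toySpan struct {
	allocBits  []bool // slots considered allocated
	markBits   []bool // slots found reachable by the last mark phase
	allocCount int    // allocated slots before sweeping
}

// sweep counts the marked (still live) slots, derives how many objects were
// freed, and then swaps the bitmaps: the mark bits become the new allocation
// bits and a fresh, cleared mark bitmap is installed for the next cycle.
func (s *toySpan) sweep() (nfreed int) {
	nalloc := 0
	for _, marked := range s.markBits {
		if marked {
			nalloc++
		}
	}
	nfreed = s.allocCount - nalloc
	s.allocCount = nalloc
	s.allocBits = s.markBits
	s.markBits = make([]bool, len(s.allocBits))
	return nfreed
}

func main() {
	s := &toySpan{
		allocBits:  []bool{true, true, true, false},
		markBits:   []bool{true, false, true, false},
		allocCount: 3,
	}
	fmt.Println("freed:", s.sweep())
	fmt.Println("allocBits after sweep:", s.allocBits)
}
```

Slot 1 was allocated but not marked, so after the swap it simply shows up as free in allocBits, and the allocator can hand it out again.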

This concludes the walkthrough of the Go GC process. Pairing the theory with hands-on practice is the best way to make it stick, so go and try it out.
