How to Traverse Files Recursively in Python

Published: 2021-03-26 17:08:44 | Source: 億速云 | Views: 194 | Author: Leah | Category: Development

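Today let's talk about how to traverse files recursively in Python. Many people may not be familiar with it, so I have summarized my own attempts below, starting with my first version. I hope you get something out of this article.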

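import os

def getallfiles(dir):
    """List all files directly inside the given directory."""
    if os.path.isdir(dir):
        filelist = os.listdir(dir)
        for ret in filelist:
            # os.path.join picks the right separator on every platform
            filename = os.path.join(dir, ret)
            if os.path.isfile(filename):
                print(filename)

def getalldirfiles(dir, basedir):
    """Recursively list all files in the directory and its subdirectories."""
    # basedir is never used; it is kept only to match the original signature
    if os.path.isdir(dir):
        getallfiles(dir)
        dirlist = os.listdir(dir)
        for dirret in dirlist:
            fullname = os.path.join(dir, dirret)
            if os.path.isdir(fullname):
                getalldirfiles(fullname, basedir)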

I used two functions, and each one calls listdir once: one call to pick out the files, the other to pick out the directories. Functionally there is nothing wrong with this, but it is hardly elegant.

Time to optimize. Option 1:

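import os

def getallfiles(dir):
    """Traverse recursively with a single listdir call per directory."""
    if not os.path.isdir(dir):
        # not a directory: print the path itself and stop
        print(dir)
        return
    dirlist = os.listdir(dir)
    for dirret in dirlist:
        fullname = os.path.join(dir, dirret)
        if os.path.isdir(fullname):
            # directory: recurse into it
            getallfiles(fullname)
        else:
            print(fullname)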

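As the code above shows, I merged the two functions into one that calls listdir only once per directory and handles files and directories in the two branches of an if/else. The recursive self-call is still there, of course.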
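
If the self-call itself is the concern, one possible alternative is to keep the pending directories on an explicit stack. The sketch below is only an illustration, not part of the original article: `iter_files` and `build_demo_tree` are hypothetical names, the function assumes `top` is an existing directory, and it yields paths instead of printing them so the result can be checked.

```python
import os
import shutil
import tempfile


def iter_files(top):
    """Yield the path of every file under `top` without recursion."""
    stack = [top]  # directories still waiting to be listed
    while stack:
        current = stack.pop()
        for name in os.listdir(current):
            fullname = os.path.join(current, name)
            if os.path.isdir(fullname):
                stack.append(fullname)  # visit later instead of recursing
            else:
                yield fullname


def build_demo_tree():
    """Create a small throwaway directory tree (demo data only) and return its root."""
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "sub", "deep"))
    for rel in ("a.txt",
                os.path.join("sub", "b.txt"),
                os.path.join("sub", "deep", "c.txt")):
        with open(os.path.join(root, rel), "w") as fh:
            fh.write("demo")
    return root


demo_root = build_demo_tree()
# Sort the relative paths because listing order is not guaranteed.
found = sorted(os.path.relpath(p, demo_root) for p in iter_files(demo_root))
print(found)
shutil.rmtree(demo_root)  # clean up the demo tree
```

The set of files found is the same as with the recursive version; only the visiting order may differ, because the stack processes the most recently discovered directory first.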

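Is there a better way? A quick search online turns up plenty of answers: there is a ready-made os.walk() function for traversing files and directories, which makes the optimization even simpler.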

Option 2:

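import os

def getallfilesofwalk(dir):
    """Traverse using os.walk."""
    if not os.path.isdir(dir):
        # not a directory: print the path itself and stop
        print(dir)
        return
    dirlist = os.walk(dir)
    # each step yields one directory (root) with its subdirectory and file names
    for root, dirs, files in dirlist:
        for file in files:
            print(os.path.join(root, file))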

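Judging purely by the code, option 2 is the most elegant and concise. But a look at the source of os.walk() shows that, in the Python 2 implementation, it still calls listdir internally to do the actual work and merely post-processes the result. (Since Python 3.5, os.walk() is built on os.scandir() instead.)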

Here is the source of os.walk() from Python 2:

def walk(top, topdown=True, onerror=None, followlinks=False):
    from os.path import join, isdir, islink
    # We may not have read permission for top, in which case we can't
    # get a list of the files the directory contains. os.path.walk
    # always suppressed the exception then, rather than blow up for a
    # minor reason when (say) a thousand readable directories are still
    # left to visit. That logic is copied here.
    try:
        # Note that listdir and error are globals in this module due
        # to earlier import-*.
        names = listdir(top)
    except error, err:
        if onerror is not None:
            onerror(err)
        return
    dirs, nondirs = [], []
    for name in names:
        if isdir(join(top, name)):
            dirs.append(name)
        else:
            nondirs.append(name)
    if topdown:
        yield top, dirs, nondirs
    for name in dirs:
        path = join(top, name)
        if followlinks or not islink(path):
            for x in walk(path, topdown, onerror, followlinks):
                yield x
    if not topdown:
        yield top, dirs, nondirs
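
The listdir-based logic quoted above can be tried out on Python 3 with a small port. This is only a sketch under stated assumptions: `simple_walk` and `normalized` are hypothetical names, `OSError` stands in for the module-level `error` of the Python 2 source, and `yield from` replaces the inner `for ... yield` loop. It is not how current versions of os.walk() are implemented.

```python
import os
import shutil
import tempfile


def simple_walk(top, topdown=True, onerror=None, followlinks=False):
    """Simplified, listdir-based walk mirroring the Python 2 logic."""
    try:
        names = os.listdir(top)
    except OSError as err:
        # Unreadable directory: report it if asked, then skip it.
        if onerror is not None:
            onerror(err)
        return
    dirs, nondirs = [], []
    for name in names:
        if os.path.isdir(os.path.join(top, name)):
            dirs.append(name)
        else:
            nondirs.append(name)
    if topdown:
        yield top, dirs, nondirs
    for name in dirs:
        path = os.path.join(top, name)
        if followlinks or not os.path.islink(path):
            yield from simple_walk(path, topdown, onerror, followlinks)
    if not topdown:
        yield top, dirs, nondirs


def normalized(walker):
    """Sort everything so the comparison ignores directory listing order."""
    return sorted((root, sorted(dirs), sorted(files))
                  for root, dirs, files in walker)


# Throwaway demo tree: demo_root/a.txt, demo_root/sub/b.txt, demo_root/sub/deep/
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, "sub", "deep"))
for rel in ("a.txt", os.path.join("sub", "b.txt")):
    with open(os.path.join(demo_root, rel), "w") as fh:
        fh.write("demo")

same = normalized(simple_walk(demo_root)) == normalized(os.walk(demo_root))
print(same)
shutil.rmtree(demo_root)  # clean up the demo tree
```

On a plain tree like this one, the port and the built-in os.walk() report the same directories, subdirectory names and file names.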

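As for how the output of listdir and walk differs: listdir returns the names of the files and directories of a single directory together in one flat list (the documentation does not guarantee any order, although the result often looks alphabetical), whereas walk, with its default topdown=True, first yields the top-level directory together with its subdirectories and files, then descends into each subdirectory and does the same there, and so on. You can copy the scripts above and verify this yourself.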
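
For a quick check, a script along the following lines can be used. It is a minimal sketch: the directory and file names are made up for the demo, and the names are sorted only to make the printed result stable, since the listing order itself is not guaranteed.

```python
import os
import shutil
import tempfile

# Throwaway demo tree: readme.txt, docs/guide.txt, docs/old/v1.txt
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, "docs", "old"))
for rel in ("readme.txt",
            os.path.join("docs", "guide.txt"),
            os.path.join("docs", "old", "v1.txt")):
    with open(os.path.join(demo_root, rel), "w") as fh:
        fh.write("demo")

# os.listdir: one flat list for a single directory, files and folders mixed.
top_level = sorted(os.listdir(demo_root))
print(top_level)

# os.walk: one (root, dirs, files) tuple per directory, parent before children.
visited = []
for root, dirs, files in os.walk(demo_root):
    dirs.sort()  # sorting in place fixes the order in which subdirectories are visited
    visited.append((os.path.relpath(root, demo_root), list(dirs), sorted(files)))
for entry in visited:
    print(entry)

shutil.rmtree(demo_root)  # clean up the demo tree
```

listdir only ever describes one directory, so the nested files do not appear in `top_level`; walk produces one entry per directory and reaches all three levels.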

