ripgrep: the fastest text search tool on Linux (the best grep replacement)

Published: 2020-10-17 13:17:05 | Source: 腳本之家 | Author: harriszh | Category: Servers

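Preface

When it comes to text search tools, everyone knows grep; it is one of the most useful and most frequently used tools on Linux.
But anyone who has searched a large project for a keyword also knows that it can be slow.
That is why a number of alternatives appeared, the best known being Ack and Ag.
More recently a new contender has arrived: ripgrep. Like Ack and Ag it is built for searching source trees, and like Ag it searches with multiple threads, but rg is faster than both.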

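Introduction

ripgrep is a line-oriented search tool that recursively searches the given directories for a pattern. It is written in Rust, and what sets it apart from similar tools is its speed.
Its main features:

Recursive search by default (grep needs -R)
Automatically skips files listed in .gitignore, as well as binary files
Can restrict the search to file types (rg -tpy foo searches only Python files, rg -Tjs foo excludes JavaScript files)
Supports most grep features (all the common ones)
Supports many text encodings (UTF-8, UTF-16, latin-1, GBK, EUC-JP, Shift_JIS and more)
Can search common compressed formats (gzip, xz, lzma, bzip2, lz4) when -z is given
Highlights matches automatically
Shorter command name: rg (grep is four characters)
No multiline search or fancy regex features in the version covered here (ripgrep 0.10.0 and later add -U/--multiline and, when built with PCRE2, -P/--pcre2)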
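
The file-type filters from the list above can be tried in a throwaway directory. This is only a sketch: the directory, the file names a.py and b.js, and their contents are made up for illustration, and the script assumes rg is already on the PATH (it skips otherwise).

```shell
#!/bin/sh
# Sketch only: a.py, b.js and their contents are hypothetical sample data.
command -v rg >/dev/null 2>&1 || { echo "rg not found on PATH; skipping" >&2; exit 0; }

demo=$(mktemp -d)
printf 'foo = 1\n' > "$demo/a.py"
printf 'var foo = 1;\n' > "$demo/b.js"

# -tpy: search Python files only; -l prints just the matching file names.
only_py=$(rg -l -tpy foo "$demo")

# -Tjs: search everything except JavaScript files.
not_js=$(rg -l -Tjs foo "$demo")

echo "only Python:       $only_py"
echo "all but JavaScript: $not_js"

rm -rf "$demo"
```

Both commands should report only a.py. The search path is passed explicitly because rg may read standard input instead of the current directory when it runs inside a script or pipeline.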

Installing ripgrep

Install Rust first

curl https://sh.rustup.rs -sSf | sh

Then press Enter at each prompt to accept the defaults

Build ripgrep with Cargo

git clone https://github.com/BurntSushi/ripgrep
cd ripgrep
cargo build --release
cp ./target/release/rg /usr/local/bin/

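For the last step, copy the binary to whichever directory on your PATH suits your setup (writing to /usr/local/bin usually requires sudo)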

Usage

Search result display

[Screenshot: search results as displayed by ripgrep]

General usage

USAGE:
  rg [OPTIONS] PATTERN [PATH ...]
  rg [OPTIONS] [-e PATTERN ...] [-f PATTERNFILE ...] [PATH ...]
  rg [OPTIONS] --files [PATH ...]
  rg [OPTIONS] --type-list
  command | rg [OPTIONS] PATTERN

輸入?yún)?shù)

ARGS:
  <PATTERN>
      A regular expression used for searching. To match a pattern beginning with a
      dash, use the -e/--regexp flag.

      For example, to search for the literal '-foo', you can use this flag:

        rg -e -foo

      You can also use the special '--' delimiter to indicate that no more flags
      will be provided. Namely, the following is equivalent to the above:

        rg -- -foo

  <PATH>...
      A file or directory to search. Directories are searched recursively. Paths specified on
      the command line override glob and ignore rules.
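
The two ways of passing a pattern that begins with a dash can be compared side by side. A minimal sketch, assuming rg is on the PATH; notes.txt and its contents are invented sample data.

```shell
#!/bin/sh
# Sketch only: notes.txt and its contents are hypothetical sample data.
command -v rg >/dev/null 2>&1 || { echo "rg not found on PATH; skipping" >&2; exit 0; }

demo=$(mktemp -d)
printf 'tar -foo archive\nplain foo\n' > "$demo/notes.txt"

# -e marks the next argument as the pattern, even though it starts with a dash.
with_e=$(rg -e -foo "$demo/notes.txt")

# -- ends option parsing, so -foo is taken as the pattern.
with_dashes=$(rg -- -foo "$demo/notes.txt")

echo "$with_e"        # prints: tar -foo archive
echo "$with_dashes"   # prints: tar -foo archive

rm -rf "$demo"
```

Without -e or --, rg would try to parse -foo as a group of flags rather than as the pattern.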

Option	Description	Notes
-A, --after-context <NUM>	Show <NUM> lines after each match	Overrides --context
-B, --before-context <NUM>	Show <NUM> lines before each match	Overrides --context
-b, --byte-offset	Print the byte offset of each matching line within the file	With -o, the offset of the matched part itself is printed
-s, --case-sensitive	Search case sensitively	Overrides -i (--ignore-case) and -S (--smart-case)
--color <WHEN>	When to use color; default is auto	With --vimgrep the default is never
	Values: never, auto, always, ansi
--colors <COLOR_SPEC>...	Set output colors:	Colors: red, blue, green, cyan,
	`{type}:{attribute}:{value}`	magenta, yellow, white, black
	`{type}`: path, line, column, match	Styles: nobold, bold, nointense,
	`{attribute}`: fg, bg, style	intense, nounderline, underline
	`{value}`: a color or a text style	Example:
	`{type}:none` clears the color settings for {type}	rg --colors 'match:fg:magenta' --colors 'line:bg:yellow' foo
		Extended colors can be used for `{value}` if the terminal supports ANSI color sequences
		They are written as 'x' (256-color) or 'x,x,x' (24-bit true color)
		x is a number from 0 to 255, decimal by default; a 0x prefix means hexadecimal
		For example: rg --colors 'match:bg:0,128,255'
		or, equivalently: rg --colors 'match:bg:0x0,0x80,0xFF'
		intense and nointense have no effect with extended color codes
--column	Show the column number (1-based) of the first match on each line	Disable with --no-column
-C, --context <NUM>	Show <NUM> lines before and after each match	Overrides -B and -A
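--context-separator <SEPARATOR>	String that separates non-contiguous context lines in the output	Escapes such as \x7F or \t may be used; default is --
-c, --count	Show only the number of matching lines per file	If only one file is given to ripgrep, just the count is printed
		Use --with-filename to force the file name to be printed
		Overrides --count-matches
--count-matches	Show only the number of individual matches per file	Use --with-filename to print the file name even when there is only one file
--debug	Show debug information
--dfa-size-limit <NUM+SUFFIX?>	Upper size limit of the regex DFA; default 10M
-E, --encoding <ENCODING>	Text encoding of the files searched; default is auto	Supported labels: https://encoding.spec.whatwg.org/#concept-encoding-get
-f, --file <PATTERNFILE>...	Read patterns from a file, one per line	May be repeated or combined with -e; a line matching any of the patterns is printed
--files	Print every file that would be searched	Used as `rg <options> --files [PATH...]`; no pattern may be given
-l, --files-with-matches	Print only the paths of files with at least one match	Overrides --files-without-match
--files-without-match	Print only the paths of files with no matches	Overrides --files-with-matches
-F, --fixed-strings	Treat the pattern as a literal string, not a regex	Disable with --no-fixed-strings
-L, --follow	Follow symbolic links while recursing; off by default	Disable with --no-follow
-g, --glob <GLOB>...	Include files and directories matching the glob; prefix the glob with ! to exclude	May be repeated; globs follow .gitignore matching rules
-h, --help	Print help
--heading	Print the file name above its matches instead of on every line	Default when printing to a terminal; disable with --no-heading
--hidden	Search hidden files and directories	Skipped by default; disable with --no-hidden
--iglob <GLOB>...	Like --glob, but matched case insensitively
-i, --ignore-case	Match the pattern case insensitively	Overridden by -s/--case-sensitive or -S/--smart-case
--ignore-file <PATH>...	Path to a file of ignore rules in .gitignore format; may be repeated	With several --ignore-file flags, later files take precedence
	To include or exclude files directly on the command line, use -g instead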
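-v, --invert-match	Invert matching: show lines that do not match
-n, --line-number	Show line numbers; on by default when printing to a terminal
-x, --line-regexp	Show only lines where the whole line matches the pattern	Overrides --word-regexp
-M, --max-columns <NUM>	Don't print matching lines longer than <NUM> columns (measured in bytes)
-m, --max-count <NUM>	Stop after <NUM> matching lines per file
--max-depth <NUM>	Limit the depth of directory recursion	`rg --max-depth 0 dir/` performs no search
--max-filesize <NUM+SUFFIX?>	Ignore files larger than <NUM> bytes	Suffix may be K, M or G; without one the unit is bytes
--mmap	Search using memory maps when possible; default behavior	Not supported with every option; disable with --no-mmap
--no-config	Don't read configuration files; RIPGREP_CONFIG_PATH is ignored
--no-filename	Never print the file name of a match
--no-heading	Print the file name at the start of every matching line instead of as a heading
--no-ignore	Don't respect ignore files such as .gitignore and .ignore	Disable with --ignore
--no-ignore-global	Don't respect global ignore files	For example $HOME/.config/git/ignore
--no-ignore-messages	Suppress error messages from parsing .ignore and .gitignore files	Disable with --ignore-messages
--no-ignore-parent	Don't read .gitignore and .ignore files in parent directories	Disable with --ignore-parent
--no-ignore-vcs	Don't respect version-control ignore files such as .gitignore; .ignore files still apply	Disable with --ignore-vcs
-N, --no-line-number	Don't print line numbers
--no-messages	Suppress error messages about opening and reading files
-0, --null	Follow each printed file path with a NUL byte	Very useful with xargs
-o, --only-matching	Print only the matched part of each line, not the whole line
--passthru	Print both matching and non-matching lines
--path-separator <SEPARATOR>	Path separator to use in output; defaults to / on Linux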
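--pre <COMMAND>	Run <COMMAND> on each file and search its output instead	May carry a heavy performance penalty
	Example preprocessor script (the file path arrives as $1, the file contents on stdin):
	case "$1" in
	*.pdf)
	# PDF: extract the text and write it to stdout
	exec pdftotext "$1" -
	;;
	*)
	case $(file "$1") in
	*Zstandard*)
	# Zstandard-compressed data: decompress stdin to stdout
	exec pzstd -cdq
	;;
	*)
	# anything else: pass the contents through unchanged
	exec cat
	;;
	esac
	;;
	esac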
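-p, --pretty	Alias for `--color always --heading --line-number`
-q, --quiet	Print nothing to stdout and stop at the first match	Useful when only rg's exit code matters
--regex-size-limit <NUM+SUFFIX?>	Upper size limit of the compiled regex
-e, --regexp <PATTERN>...	Pattern to search for	May be repeated; lines matching any of the patterns are printed
		Also lets a pattern start with -, as in `rg -e -foo`
-r, --replace <REPLACEMENT_TEXT>	Print each match replaced with the given text	Capture group references such as $5 may be used
-z, --search-zip	Search inside gz, bz2, xz, lzma and lz4 compressed files	Disable with --no-search-zip
-S, --smart-case	Case insensitive if the pattern is all lowercase, case sensitive otherwise	Overridden by -s/--case-sensitive and -i/--ignore-case
--sort-files	Sort results by file path	Disables parallel search threads
--stats	Print statistics about the search
-a, --text	Search binary files as if they were text	Disable with --no-text
-j, --threads <NUM>	Approximate number of threads to use
-t, --type <TYPE>...	Search only files of the given type	Use --type-list to list the supported types
--type-add <TYPE_SPEC>...	Add a file type definition	For example `rg --type-add 'foo:*.foo' -tfoo PATTERN`
	Can also define a type that includes several other types	--type-add 'src:include:cpp,py,md'
--type-clear <TYPE>...	Clear the globs previously defined for <TYPE>
--type-list	List all built-in file types
-T, --type-not <TYPE>...	Don't search files of the given type
-u, --unrestricted	-u also searches files ignored by .gitignore; -uu also searches hidden files	-uuu also searches binary files
-V, --version	Print version information
--vimgrep	Print every match on its own line	A line with several matches is printed once per match
-H, --with-filename	Print the file path with each match; this is the default	Disable with --no-filename
-w, --word-regexp	Match the pattern only as a whole word; roughly equivalent to \bPATTERN\b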
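
A few of the options in the table are easier to grasp when run against a small file. The sketch below combines -c, -o with -r, and -q; app.conf and its contents are hypothetical, and rg is assumed to be on the PATH.

```shell
#!/bin/sh
# Sketch only: app.conf and its contents are hypothetical sample data.
command -v rg >/dev/null 2>&1 || { echo "rg not found on PATH; skipping" >&2; exit 0; }

demo=$(mktemp -d)
printf 'port=8080\nhost=localhost\nport=9090\n' > "$demo/app.conf"

# -c: count matching lines (a single file is given, so only the number is printed).
count=$(rg -c 'port' "$demo/app.conf")

# -o with -r: print only the matched part, rewritten to capture group 1.
ports=$(rg -o -r '$1' 'port=(\d+)' "$demo/app.conf")

# -q: print nothing; the exit status says whether anything matched.
if rg -q 'host=' "$demo/app.conf"; then found=yes; else found=no; fi

echo "matching lines: $count"   # prints: matching lines: 2
echo "$ports"                   # prints 8080 and 9090, one per line
echo "host entry found: $found" # prints: host entry found: yes

rm -rf "$demo"
```

The -q form is the one to reach for in shell conditionals, since it stops at the first match and leaves stdout clean.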

Examples

Example 1

Search recursively for name under the current directory

$ rg 'name' ./

Example 2

Match name only as a standalone word (-w), roughly equivalent to \bname\b

$ rg -w 'name' ./
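
To see what -w changes, here is a small sketch on made-up input (sample.txt and its four lines are hypothetical; rg is assumed to be on the PATH). Without -w every line matches; with -w only the lines where name stands alone as a word do.

```shell
#!/bin/sh
# Sketch only: sample.txt and its contents are hypothetical sample data.
command -v rg >/dev/null 2>&1 || { echo "rg not found on PATH; skipping" >&2; exit 0; }

demo=$(mktemp -d)
printf 'name\nusername\nname_len\nthe name is set\n' > "$demo/sample.txt"

# Plain search: "name" occurs somewhere on all four lines.
plain_count=$(rg -c 'name' "$demo/sample.txt")

# -w: "username" and "name_len" drop out, because letters and "_" are word characters.
whole_words=$(rg -w 'name' "$demo/sample.txt")

echo "plain matches: $plain_count"   # prints: plain matches: 4
echo "$whole_words"                  # prints "name" and "the name is set"

rm -rf "$demo"
```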

Example 3

Print only the names of files that contain a match (-l)

$ rg -w 'name' ./ -l
src/cpp/epoll_server.cpp
src/cpp/uart_xtor.cpp

Example 4

Search only C++ files (-t cpp); use -T to exclude a file type instead

$ rg -w 'name' ./ -tcpp

Example 5

Search with a regular expression (-e)

$ rg -e "sa.*port" ./ -tcpp

Example 6

Show each match with two lines of context above and below (-C); -A and -B do the same for one side only

$ rg -e "sa.*port" ./ -tcpp -C2
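
The three context flags can be compared on a file whose lines are easy to tell apart. This is a sketch with invented data (lines.txt is hypothetical) and assumes rg is on the PATH; the output is captured, so no line numbers or colors are added.

```shell
#!/bin/sh
# Sketch only: lines.txt and its contents are hypothetical sample data.
command -v rg >/dev/null 2>&1 || { echo "rg not found on PATH; skipping" >&2; exit 0; }

demo=$(mktemp -d)
printf 'one\ntwo\nthree\nfour\nfive\nsix\nseven\n' > "$demo/lines.txt"

# -C2: two lines on each side of the match.
around=$(rg -C2 'four' "$demo/lines.txt")

# -A1: one line after the match; -B1: one line before it.
after=$(rg -A1 'four' "$demo/lines.txt")
before=$(rg -B1 'four' "$demo/lines.txt")

echo "$around"   # prints two, three, four, five, six
echo "$after"    # prints four, five
echo "$before"   # prints three, four

rm -rf "$demo"
```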

Example 7

Show lines that do not contain "debug" (-v)

$ rg -v "debug" -tcpp ./
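
A quick way to check what -v keeps is to run it on a file where the answer is obvious. A minimal sketch, assuming rg is on the PATH; main.cpp here is a made-up three-line file, not one from the project searched above.

```shell
#!/bin/sh
# Sketch only: main.cpp and its contents are hypothetical sample data.
command -v rg >/dev/null 2>&1 || { echo "rg not found on PATH; skipping" >&2; exit 0; }

demo=$(mktemp -d)
printf 'init();\ndebug("x");\nrun();\n' > "$demo/main.cpp"

# -v inverts the match: only lines WITHOUT "debug" are printed.
kept=$(rg -v 'debug' "$demo/main.cpp")

echo "$kept"   # prints init(); and run();

rm -rf "$demo"
```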

Example 8

Show only the matched part of each line (-o)

$ rg -e "if.*debug" ./ -tcpp -o

Example 9

Ignore case (-i)

$ rg -ie "if.*debug" ./ -tcpp -o

Example 10

Treat the pattern as a literal string (-F), so characters such as .(){}*+ need no escaping. If the string to search for begins with -, use -- as a separator or write rg -e "-foo"

rg -F "i++)" ./ -tcpp
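
The difference -F makes shows up as soon as the pattern contains regex metacharacters. In this sketch (loop.cpp is a hypothetical file; rg is assumed to be on the PATH) the same string is searched once as a literal and once as a regex, where the unmatched parenthesis makes the pattern invalid.

```shell
#!/bin/sh
# Sketch only: loop.cpp and its contents are hypothetical sample data.
command -v rg >/dev/null 2>&1 || { echo "rg not found on PATH; skipping" >&2; exit 0; }

demo=$(mktemp -d)
printf 'for (i = 0; i < n; i++) {\ncount += i;\n' > "$demo/loop.cpp"

# -F: "i++)" is matched character for character.
literal=$(rg -F 'i++)' "$demo/loop.cpp")

# Without -F the pattern is parsed as a regex, and the stray ")" is a syntax error.
if rg 'i++)' "$demo/loop.cpp" >/dev/null 2>&1; then as_regex=ok; else as_regex=failed; fi

echo "$literal"             # prints: for (i = 0; i < n; i++) {
echo "as regex: $as_regex"  # prints: as regex: failed

rm -rf "$demo"
```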

Example 11

Print every file that would be searched (--files)

rg --files

Example 12

List the built-in file types

$ rg --type-list
agda: *.agda, *.lagda
aidl: *.aidl
amake: *.bp, *.mk
asciidoc: *.adoc, *.asc, *.asciidoc
asm: *.S, *.asm, *.s
ats: *.ats, *.dats, *.hats, *.sats
avro: *.avdl, *.avpr, *.avsc
awk: *.awk
bazel: *.bzl, BUILD, WORKSPACE
bitbake: *.bb, *.bbappend, *.bbclass, *.conf, *.inc
bzip2: *.bz2
c: *.H, *.c, *.cats, *.h
cabal: *.cabal
cbor: *.cbor
ceylon: *.ceylon
clojure: *.clj, *.cljc, *.cljs, *.cljx
cmake: *.cmake, CMakeLists.txt
coffeescript: *.coffee
config: *.cfg, *.conf, *.config, *.ini
cpp: *.C, *.H, *.cc, *.cpp, *.cxx, *.h, *.hh, *.hpp, *.hxx, *.inl
creole: *.creole
crystal: *.cr, Projectfile
cs: *.cs
csharp: *.cs
cshtml: *.cshtml
css: *.css, *.scss
csv: *.csv
cython: *.pyx
d: *.d
dart: *.dart
dhall: *.dhall
docker: *Dockerfile*
elisp: *.el
elixir: *.eex, *.ex, *.exs
elm: *.elm
erlang: *.erl, *.hrl
fidl: *.fidl
fish: *.fish
fortran: *.F, *.F77, *.F90, *.F95, *.f, *.f77, *.f90, *.f95, *.pfo
fsharp: *.fs, *.fsi, *.fsx
gn: *.gn, *.gni
go: *.go
groovy: *.gradle, *.groovy
gzip: *.gz
h: *.h, *.hpp
haskell: *.c2hs, *.cpphs, *.hs, *.hsc, *.lhs
hbs: *.hbs
hs: *.hs, *.lhs
html: *.ejs, *.htm, *.html
idris: *.idr, *.lidr
java: *.java, *.jsp
jinja: *.j2, *.jinja, *.jinja2
jl: *.jl
js: *.js, *.jsx, *.vue
json: *.json, composer.lock
jsonl: *.jsonl
julia: *.jl
jupyter: *.ipynb, *.jpynb
kotlin: *.kt, *.kts
less: *.less
license: *[.-]LICEN[CS]E*, AGPL-*[0-9]*, APACHE-*[0-9]*, BSD-*[0-9]*, CC-BY-*, COPYING, COPYING[.-]*, COPYRIGHT, COPYRIGHT[.-]*, EULA, EULA[.-]*, GFDL-*[0-9]*, GNU-*[0-9]*, GPL-*[0-9]*, LGPL-*[0-9]*, LICEN[CS]E, LICEN[CS]E[.-]*, MIT-*[0-9]*, MPL-*[0-9]*, NOTICE, NOTICE[.-]*, OFL-*[0-9]*, PATENTS, PATENTS[.-]*, UNLICEN[CS]E, UNLICEN[CS]E[.-]*, agpl[.-]*, gpl[.-]*, lgpl[.-]*, licen[cs]e, licen[cs]e.*
lisp: *.el, *.jl, *.lisp, *.lsp, *.sc, *.scm
log: *.log
lua: *.lua
lz4: *.lz4
lzma: *.lzma
m4: *.ac, *.m4
make: *.mak, *.mk, GNUmakefile, Gnumakefile, Makefile, gnumakefile, makefile
man: *.[0-9][cEFMmpSx], *.[0-9lnpx]
markdown: *.markdown, *.md, *.mdown, *.mkdn
matlab: *.m
md: *.markdown, *.md, *.mdown, *.mkdn
mk: mkfile
ml: *.ml
msbuild: *.csproj, *.fsproj, *.proj, *.props, *.targets, *.vcxproj
nim: *.nim
nix: *.nix
objc: *.h, *.m
objcpp: *.h, *.mm
ocaml: *.ml, *.mli, *.mll, *.mly
org: *.org
pdf: *.pdf
perl: *.PL, *.perl, *.pl, *.plh, *.plx, *.pm, *.t
php: *.php, *.php3, *.php4, *.php5, *.phtml
pod: *.pod
protobuf: *.proto
ps: *.cdxml, *.ps1, *.ps1xml, *.psd1, *.psm1
puppet: *.erb, *.pp, *.rb
purs: *.purs
py: *.py
qmake: *.prf, *.pri, *.pro
r: *.R, *.Rmd, *.Rnw, *.r
rdoc: *.rdoc
readme: *README, README*
rst: *.rst
ruby: *.gemspec, *.rb, .irbrc, Gemfile, Rakefile
rust: *.rs
sass: *.sass, *.scss
scala: *.sbt, *.scala
sh: *.bash, *.bashrc, *.csh, *.cshrc, *.ksh, *.kshrc, *.sh, *.tcsh, *.zsh, .bash_login, .bash_logout, .bash_profile, .bashrc, .cshrc, .kshrc, .login, .logout, .profile, .tcshrc, .zlogin, .zlogout, .zprofile, .zshenv, .zshrc, bash_login, bash_logout, bash_profile, bashrc, profile, zlogin, zlogout, zprofile, zshenv, zshrc
smarty: *.tpl
sml: *.sig, *.sml
soy: *.soy
spark: *.spark
sql: *.psql, *.sql
stylus: *.styl
sv: *.h, *.sv, *.svh, *.v, *.vg
svg: *.svg
swift: *.swift
swig: *.def, *.i
systemd: *.automount, *.conf, *.device, *.link, *.mount, *.path, *.scope, *.service, *.slice, *.socket, *.swap, *.target, *.timer
taskpaper: *.taskpaper
tcl: *.tcl
tex: *.bib, *.cls, *.ltx, *.sty, *.tex
textile: *.textile
tf: *.tf
toml: *.toml, Cargo.lock
ts: *.ts, *.tsx
twig: *.twig
txt: *.txt
vala: *.vala
vb: *.vb
verilog: *.sv, *.svh, *.v, *.vh
vhdl: *.vhd, *.vhdl
vim: *.vim
vimscript: *.vim
webidl: *.idl, *.webidl, *.widl
wiki: *.mediawiki, *.wiki
xml: *.xml, *.xml.dist
xz: *.xz
yacc: *.y
yaml: *.yaml, *.yml
zsh: *.zsh, .zlogin, .zlogout, .zprofile, .zshenv, .zshrc, zlogin, zlogout, zprofile, zshenv, zshrc

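Summary

ripgrep is remarkably fast, and it has been a great help to me when browsing code. I believe it is valuable to every programmer, especially in combination with FZF.
Its one weakness is regex support, but that is a trade-off: relying on a library such as PCRE would significantly hurt its speed.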
