Kafka in Depth (4): Kafka and ZooKeeper Configuration Files Explained in Detail (the Full Story), Kafka Configuration

Published: 2020-07-30 13:54:46  Source: Internet  Views: 1502  Author: 馬吉輝  Category: Big Data

2/ Kafka configuration file parameters in detail: the parameters that must be configured by default
The default Kafka server.properties is as follows:
############################# Server Basics #############################

# The id of the broker. This must be set to a unique integer for each broker.
broker.id=0

############################# Socket Server Settings #############################

# The address the socket server listens on. It will get the value returned from 
# java.net.InetAddress.getCanonicalHostName() if not configured.
#   FORMAT:
#     listeners = listener_name://host_name:port
#   EXAMPLE:
#     listeners = PLAINTEXT://your.host.name:9092
#listeners=PLAINTEXT://:9092

# Hostname and port the broker will advertise to producers and consumers. If not set, 
# it uses the value for "listeners" if configured.  Otherwise, it will use the value
# returned from java.net.InetAddress.getCanonicalHostName().
#advertised.listeners=PLAINTEXT://your.host.name:9092

# Maps listener names to security protocols, the default is for them to be the same. See the config documentation for more details
#listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL

# The number of threads that the server uses for receiving requests from the network and sending responses to the network (default: 3)
num.network.threads=3

# The number of threads that the server uses for processing requests, which may include disk I/O (default: 8)
num.io.threads=8

# The send buffer (SO_SNDBUF) used by the socket server
socket.send.buffer.bytes=102400

# The receive buffer (SO_RCVBUF) used by the socket server
socket.receive.buffer.bytes=102400

# The maximum size of a request that the socket server will accept (protection against OOM)
socket.request.max.bytes=104857600

############################# Log Basics #############################

# A comma separated list of directories under which to store log files
log.dirs=/tmp/kafka-logs

# The default number of log partitions per topic. More partitions allow greater
# parallelism for consumption, but this will also result in more files across
# the brokers.
# Suggestion: on a small cluster, set this to the number of brokers.
num.partitions=1

# The number of threads per data directory to be used for log recovery at startup and flushing at shutdown.
# This value is recommended to be increased for installations with data dirs located in RAID array.
num.recovery.threads.per.data.dir=1

############################# Internal Topic Settings  #############################
# The replication factor for the group metadata internal topics "__consumer_offsets" and "__transaction_state"
# For anything other than development testing, a value greater than 1, such as 3, is recommended to ensure availability.
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
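// These three parameters set the replica counts Kafka uses for its internal topics, including __consumer_offsets.
@上海-馬吉輝 said that setting num.partitions=3 is enough to give __consumer_offsets 3 replicas. My test shows otherwise: it stayed at 1.
So offsets.topic.replication.factor is the parameter that actually controls it.
If this is not the first time Kafka has been started, keep in mind that these settings only take effect on the first start. The server.properties shipped with the Apache Kafka download sets them to 1, and that is still the case in 2.*.
They can be changed as follows:
First stop the Kafka cluster and delete every __consumer_offsets_* directory under each broker's data directory.
Then, in ZooKeeper, run rmr /kafkatest/brokers/topics/__consumer_offsets and restart Kafka.
Consume once, and __consumer_offsets is created again.
Note: this topic is created on the first consume, not when the broker cluster starts. Likewise, the __transaction_state topic is created only when transactions are used for the first time.

Summary: in production nobody deletes content from ZooKeeper because the risk is too high. The recommended approach is to increase the replica count dynamically with a partition reassignment; all it takes is a correctly written JSON file.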
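
The reassignment JSON mentioned in the summary can be generated rather than written by hand. The following is a minimal sketch, not the author's procedure: the broker IDs 0, 1, 2 and the partition count of 50 are assumptions (50 is the usual default of offsets.topic.num.partitions; check the real count by describing the topic first), and build_reassignment is a hypothetical helper name.

```python
import json


def build_reassignment(topic, num_partitions, broker_ids, replication_factor):
    """Build a reassignment plan that spreads replicas round-robin over the brokers."""
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor cannot exceed the number of brokers")
    partitions = []
    for partition in range(num_partitions):
        # Rotate the starting broker so the first replica of each partition
        # (the preferred leader) is spread evenly across the brokers.
        replicas = [
            broker_ids[(partition + offset) % len(broker_ids)]
            for offset in range(replication_factor)
        ]
        partitions.append(
            {"topic": topic, "partition": partition, "replicas": replicas}
        )
    return {"version": 1, "partitions": partitions}


# Assumed values: 50 partitions and broker IDs 0, 1, 2.
plan = build_reassignment("__consumer_offsets", 50, [0, 1, 2], 3)
print(json.dumps(plan["partitions"][:2], indent=2))
```

The resulting document, saved to a file, is what kafka-reassign-partitions.sh expects for its --reassignment-json-file option together with --execute. Compare it against the current assignment before running it, because the rotation above may move existing leaders.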

############################# Log Flush Policy #############################

# Messages are immediately written to the filesystem but by default we only fsync() to sync
# the OS cache lazily. The following configurations control the flush of data to disk.
# There are a few important trade-offs here:
#    1. Durability: Unflushed data may be lost if you are not using replication.
#    2. Latency: Very large flush intervals may lead to latency spikes when the flush does occur as there will be a lot of data to flush.
#    3. Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to excessive seeks.
# The settings below allow one to configure the flush policy to flush data after a period of time or
# every N messages (or both). This can be done globally and overridden on a per-topic basis.

# The number of messages to accept before forcing a flush of data to disk
#log.flush.interval.messages=10000

# The maximum amount of time a message can sit in a log before we force a flush
#log.flush.interval.ms=1000

############################# Log Retention Policy #############################

# The following configurations control the disposal of log segments. The policy can
# be set to delete segments after a period of time, or after a given size has accumulated.
# A segment will be deleted whenever *either* of these criteria are met. Deletion always happens
# from the end of the log.

# The minimum age of a log file to be eligible for deletion due to age
log.retention.hours=168

# A size-based retention policy for logs. Segments are pruned from the log unless the remaining
# segments drop below log.retention.bytes. Functions independently of log.retention.hours.
#log.retention.bytes=1073741824

# The maximum size of a log segment file. When this size is reached a new log segment will be created.
log.segment.bytes=1073741824

# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=300000
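
The "either criterion" rule in the retention comments above can be illustrated with a small model. This is a simplified sketch under stated assumptions, not Kafka's implementation: segments are represented as (last_modified_ms, size_bytes) tuples ordered oldest first, the newest segment is treated as active and never deleted, and segments_to_delete is a hypothetical helper name.

```python
def segments_to_delete(segments, now_ms, retention_ms, retention_bytes=-1):
    """Return indexes of segments to delete, scanning from the oldest.

    segments: list of (last_modified_ms, size_bytes) tuples, oldest first.
    A negative retention value disables that criterion.
    """
    doomed = []
    remaining = sum(size for _, size in segments)
    # Skip the last (active) segment in this simplified model.
    for index, (last_modified_ms, size) in enumerate(segments[:-1]):
        too_old = retention_ms >= 0 and now_ms - last_modified_ms > retention_ms
        # Size rule: delete only if the log stays at or above the limit afterwards.
        too_big = retention_bytes >= 0 and remaining - size >= retention_bytes
        if not (too_old or too_big):
            break  # deletion proceeds from the oldest end and stops here
        doomed.append(index)
        remaining -= size
    return doomed


HOUR_MS = 60 * 60 * 1000
GIB = 1073741824
segments = [(0, GIB), (100 * HOUR_MS, GIB), (160 * HOUR_MS, GIB), (200 * HOUR_MS, 1024)]
now = 200 * HOUR_MS

print(segments_to_delete(segments, now, 168 * HOUR_MS))
print(segments_to_delete(segments, now, 168 * HOUR_MS, retention_bytes=GIB))
```

With log.retention.hours=168 alone, only the first segment qualifies. Adding a 1 GiB size limit also removes the second one, since the two criteria work independently.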

############################# Zookeeper #############################

# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
zookeeper.connect=localhost:2181

# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=6000

############################# Group Coordinator Settings #############################

# The following configuration specifies the time, in milliseconds, that the GroupCoordinator will delay the initial consumer rebalance.
# The rebalance will be further delayed by the value of group.initial.rebalance.delay.ms as new members join the group, up to a maximum of max.poll.interval.ms.
# The default value for this is 3 seconds.
# We override this to 0 here as it makes for a better out-of-the-box experience for development and testing.
# However, in production environments the default value of 3 seconds is more suitable as this will help to avoid unnecessary, and potentially expensive, rebalances during application startup.
group.initial.rebalance.delay.ms=0
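
Several values in the default file above are described by its own comments as suitable only for development and testing. A minimal sketch of a check for them follows. It assumes a simple key=value file without line continuations, and load_properties, DEV_ONLY and dev_only_settings are hypothetical names chosen for illustration.

```python
def load_properties(text):
    """Parse simple key=value lines, skipping blank lines and comment lines."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


# Values that the comments in the listing flag as development/testing defaults.
DEV_ONLY = {
    "offsets.topic.replication.factor": "1",
    "transaction.state.log.replication.factor": "1",
    "transaction.state.log.min.isr": "1",
    "group.initial.rebalance.delay.ms": "0",
}


def dev_only_settings(props):
    """Return the keys that still carry a development-only value."""
    return sorted(key for key, value in DEV_ONLY.items() if props.get(key) == value)


sample = """
# Internal Topic Settings
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=3
group.initial.rebalance.delay.ms=0
"""
for key in dev_only_settings(load_properties(sample)):
    print("review before production:", key)
```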

Extended Kafka parameters, with the key points explained
background.threads=4                            # Number of threads for background tasks such as deleting expired message files; normally there is no need to change it
queued.max.requests=500                         # Maximum number of requests queued for the I/O threads; once the queue is full the broker stops accepting new requests, which acts as a self-protection mechanism
controller.socket.timeout.ms=30000              # Socket timeout for the controller-to-broker channels
controller.message.queue.size=10                # Queue (buffer) size for the controller-to-broker channels
replica.lag.time.max.ms=10000                   # If a follower has not sent a fetch request to the partition leader within this time, the leader removes it from the ISR (in-sync replicas) and treats it as dead
replica.lag.max.messages=4000                   # If a follower falls more than this many messages behind the leader, the follower (that is, the partition replica) is considered out of sync; this setting was removed in Kafka 0.9.0, after which only replica.lag.time.max.ms applies
                                                ## Network latency or dropped connections between follower and leader always cause some replication lag
                                                ## If the lag grows too large, the leader concludes that the follower has high network latency or limited throughput and removes it from the ISR
                                                ## In environments with few brokers or a constrained network, raising this value is recommended
                                                // The leader tracks the list of replicas that are in sync with it; this list is called the ISR (in-sync replicas). If a follower goes down or falls too far behind, the leader removes it from the ISR. "Too far behind" means either that the follower lags the leader by more messages than a threshold (replica.lag.max.messages in $KAFKA_HOME/config/server.properties, default 4000) or that the follower has not sent a fetch request to the leader for longer than a set time (replica.lag.time.max.ms in $KAFKA_HOME/config/server.properties, default 10000).
replica.socket.timeout.ms=30000                 # Socket timeout between follower and leader (30 * 1000)
replica.socket.receive.buffer.bytes=65536       # Socket receive buffer used for replication (64 * 1024); suggested value 1048576 B = 1 MB
replica.fetch.max.bytes=1048576                 # Maximum amount of data a replica fetches per request (1024 * 1024)
replica.fetch.wait.max.ms=500                   # Maximum wait time for a fetch request from a replica to the leader; failed requests are retried
replica.fetch.min.bytes=1                       # Minimum amount of data per fetch; if the leader has less unreplicated data than this, the request blocks until the condition is met
num.replica.fetchers=1                          # Number of threads used to replicate from the leader; increasing it raises I/O on the follower
replica.high.watermark.checkpoint.interval.ms=5000    # How often each replica persists its high watermark to disk
leader.imbalance.per.broker.percentage=10       # Allowed leader imbalance ratio per broker; above this value partition leadership is rebalanced
leader.imbalance.check.interval.seconds=300     # Interval at which leader imbalance is checked
zookeeper.connect=localhost:2181                # Address of the ZooKeeper cluster; several servers may be listed, separated by commas: hostname1:port1,hostname2:port2,hostname3:port3
zookeeper.session.timeout.ms=6000               # ZooKeeper session timeout; if no heartbeat arrives within this time the broker is considered dead. It should not be too large
zookeeper.connection.timeout.ms=6000            # ZooKeeper connection timeout
zookeeper.sync.time.ms=2000                     # How far a ZooKeeper follower may lag behind the ZooKeeper leader
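
The ISR rule described under replica.lag.max.messages above can be written as a single predicate. This is a sketch of the rule as the text states it, not broker code: is_in_sync and its parameters are hypothetical names, and the defaults simply mirror the values quoted above. Kafka releases from 0.9.0 onward dropped the message-count condition and judge by time only.

```python
def is_in_sync(lag_messages, ms_since_last_fetch,
               lag_max_messages=4000, lag_time_max_ms=10000):
    """Return True while the follower satisfies both ISR conditions."""
    within_message_lag = lag_messages <= lag_max_messages
    within_time_lag = ms_since_last_fetch <= lag_time_max_ms
    return within_message_lag and within_time_lag


# A follower that is slightly behind but fetching regularly stays in the ISR.
print(is_in_sync(lag_messages=120, ms_since_last_fetch=500))
# A follower that stopped fetching is dropped even if it was fully caught up.
print(is_in_sync(lag_messages=0, ms_since_last_fetch=15000))
```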

###############################################
grep '^[a-zA-Z]' server.properties
broker.id=1                             # Unique ID of this machine within the cluster, the same idea as ZooKeeper's myid
host.name=10.9.39.110                   # Disabled by default. 0.8.1 had a DNS resolution bug that caused failures, so prefer an IP address
num.network.threads=8                   # Number of threads the broker uses for network processing. They mainly handle network I/O, reading and writing buffered data with almost no I/O wait; set this to CPU cores + 1
num.io.threads=16                       # These threads mainly do disk I/O and may wait on I/O at peak load, so the value should be larger: 2x the CPU cores, at most 3x
socket.send.buffer.bytes=102400         # Send buffer size. Data is not sent immediately; it is buffered and sent once it reaches a certain size, which improves performance. 100 KB here; 1 MB recommended
socket.receive.buffer.bytes=102400      # Receive buffer size. Data is written to disk once it reaches a certain size. 100 KB here; 1 MB recommended
socket.request.max.bytes=104857600      # Maximum size of a request that fetches from or sends to Kafka; it must not exceed the Java heap size. 104857600 B = 100 MB (protection against OOM)
log.dirs=/data/kafka/kafka-logs         # Directory where messages are stored. It may be a comma-separated list; num.io.threads above should be larger than the number of directories.
                                        // With several directories, a newly created topic's partitions are persisted in whichever of the listed directories currently holds the fewest partitions
num.partitions=1                        # Default number of partitions per topic (1). I suggest matching the broker count: with 3 brokers, set the default to 3
num.recovery.threads.per.data.dir=1     # Threads per data directory used for log recovery. Increase it when the data directories sit on a RAID array; otherwise keep the default
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
// The three settings above cover the replication of the group metadata internal topics "__consumer_offsets" and "__transaction_state". For anything other than development testing, a value greater than 1, such as 3, is recommended to ensure availability.
#log.flush.interval.messages=10000       # Number of messages to accept before forcing a flush of data to disk
#log.flush.interval.ms=1000              # Maximum time a message can sit in the log before a flush is forced
log.retention.hours=24                  # Maximum time messages are retained. Set to 24 hours here; the default is 168 hours (7 days)
#log.retention.bytes=1073741824         # Size-based retention policy for logs. Segments are pruned unless the remaining segments drop below log.retention.bytes. Functions independently of log.retention.hours.
log.segment.bytes=1073741824            # Kafka appends messages to a file; once the file exceeds this size, Kafka rolls over to a new one
log.retention.check.interval.ms=300000  # Check the retention settings above every 300000 ms
zookeeper.connect=10.9.39.110:2181,10.9.139.65:2181,10.9.35.206:2181,10.9.88.40:2181,10.9.74.126:2181/kafkagroup  # ZooKeeper connection string: host:port list followed by the chroot path
zookeeper.connection.timeout.ms=60000   # ZooKeeper connection timeout
group.initial.rebalance.delay.ms=3000   # Time, in milliseconds, that the GroupCoordinator delays the initial consumer rebalance. The official recommendation for production is 3 seconds, i.e. 3000 (a value of 3 would mean 3 ms)
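
The thread-sizing rules of thumb from the annotations above (network threads = CPU cores + 1, I/O threads = 2x CPU cores) can be computed for a given machine. This is only a sketch of the author's heuristic, not an official Kafka sizing rule, and suggest_threads is a hypothetical helper name.

```python
import os


def suggest_threads(cpu_cores):
    """Apply the rules of thumb: cores + 1 network threads, 2 x cores I/O threads."""
    return {
        "num.network.threads": cpu_cores + 1,
        "num.io.threads": cpu_cores * 2,
    }


# os.cpu_count() may return None on some platforms, so fall back to 1.
cores = os.cpu_count() or 1
for key, value in suggest_threads(cores).items():
    print(f"{key}={value}")
```

Treat the output as a starting point and adjust it after measuring request queue and disk wait times under real load.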

########################################
