Lu Chunli's work notes. Who says programmers can't have a bit of literary flair?
Flume reads data from a specified directory, buffers it in a memory channel, and then writes the data to HDFS.
Spooling Directory Source (http://flume.apache.org/FlumeUserGuide.html#spooling-directory-source)
Memory Channel (http://flume.apache.org/FlumeUserGuide.html#memory-channel)
HDFS Sink (http://flume.apache.org/FlumeUserGuide.html#hdfs-sink)
Flume configuration file
# vim agent-hdfs.conf

# write data to hdfs
agent.sources = sd-source
agent.channels = mem-channel
agent.sinks = hdfs-sink

# define source
agent.sources.sd-source.type = spooldir
agent.sources.sd-source.spoolDir = /opt/flumeSpool
agent.sources.sd-source.fileHeader = true

# define channel
agent.channels.mem-channel.type = memory

# define sink
agent.sinks.hdfs-sink.type = hdfs
agent.sinks.hdfs-sink.hdfs.path = hdfs://nnode:8020/flume/webdata

# assemble
agent.sources.sd-source.channels = mem-channel
agent.sinks.hdfs-sink.channel = mem-channel
Note: the /opt/flumeSpool directory must be created in advance; otherwise Flume cannot detect it and will report an error.
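A minimal pre-flight sketch for creating the spool directory. The path /tmp/flumeSpool-demo below is a stand-in so the check can run anywhere; on the real host it would be /opt/flumeSpool, owned by the user the agent runs as:

```shell
# Create the spool directory before starting the agent; the spooldir
# source fails at startup if the configured directory does not exist.
SPOOL_DIR=${SPOOL_DIR:-/tmp/flumeSpool-demo}   # stand-in for /opt/flumeSpool
mkdir -p "$SPOOL_DIR"
test -d "$SPOOL_DIR" && test -w "$SPOOL_DIR" && echo "spool dir ready: $SPOOL_DIR"
```

The source also needs write permission on the directory, since it renames ingested files in place.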
Start the agent
[hadoop@nnode flume1.6.0]$ bin/flume-ng agent --conf conf --name agent --conf-file conf/agent-hdfs.conf -Dflume.root.logger=INFO,console
Copy data into the /opt/flumeSpool directory
cp /usr/local/hadoop2.6.0/logs/* /opt/flumeSpool
Flume detects the new files in this directory and automatically writes them to HDFS.
Check the flume directory on HDFS
[hadoop@nnode flume1.6.0]$ hdfs dfs -ls -R /flume/
drwxr-xr-x   - hadoop hadoop    0 2015-11-21 16:55 /flume/webdata
-rw-r--r--   2 hadoop hadoop 2568 2015-11-21 16:50 /flume/webdata/FlumeData.1448095836223
-rw-r--r--   2 hadoop hadoop 2163 2015-11-21 16:50 /flume/webdata/FlumeData.1448095836224
-rw-r--r--   2 hadoop hadoop 2163 2015-11-21 16:50 /flume/webdata/FlumeData.1448095836225
-rw-r--r--   2 hadoop hadoop 2163 2015-11-21 16:50 /flume/webdata/FlumeData.1448095836226
-rw-r--r--   2 hadoop hadoop 2163 2015-11-21 16:50 /flume/webdata/FlumeData.1448095836227
-rw-r--r--   2 hadoop hadoop 2163 2015-11-21 16:50 /flume/webdata/FlumeData.1448095836228
-rw-r--r--   2 hadoop hadoop 2163 2015-11-21 16:50 /flume/webdata/FlumeData.1448095836229
-rw-r--r--   2 hadoop hadoop 2163 2015-11-21 16:50 /flume/webdata/FlumeData.1448095836230
-rw-r--r--   2 hadoop hadoop 2163 2015-11-21 16:50 /flume/webdata/FlumeData.1448095836231
-rw-r--r--   2 hadoop hadoop 2163 2015-11-21 16:50 /flume/webdata/FlumeData.1448095836232
-rw-r--r--   2 hadoop hadoop 2163 2015-11-21 16:50 /flume/webdata/FlumeData.1448095836233
-rw-r--r--   2 hadoop hadoop 2163 2015-11-21 16:50 /flume/webdata/FlumeData.1448095836234
View the files
Note: when Flume writes to HDFS, the default file format (hdfs.fileType) is SequenceFile, which cannot be viewed directly as text. To store the data in plain-text format instead, set hdfs.fileType to DataStream.
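A sketch of the sink-side override, extending the agent-hdfs.conf above (the property names are from the HDFS sink section of the Flume user guide; the values shown are illustrative):

```properties
# Write events as plain text rather than the default SequenceFile
agent.sinks.hdfs-sink.hdfs.fileType = DataStream
agent.sinks.hdfs-sink.hdfs.writeFormat = Text
```

Files already written as SequenceFiles can still be inspected with hdfs dfs -text <path>, which decodes the SequenceFile wrapper on the fly.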
Check the flumeSpool directory
[root@nnode flumeSpool]# ll
total 3028
-rw-r--r-- 1 root root  227893 Nov 21 16:50 hadoop-hadoop-journalnode-nnode.log.COMPLETED
-rw-r--r-- 1 root root     718 Nov 21 16:50 hadoop-hadoop-journalnode-nnode.out.1.COMPLETED
-rw-r--r-- 1 root root     718 Nov 21 16:50 hadoop-hadoop-journalnode-nnode.out.2.COMPLETED
-rw-r--r-- 1 root root     718 Nov 21 16:50 hadoop-hadoop-journalnode-nnode.out.COMPLETED
-rw-r--r-- 1 root root 1993109 Nov 21 16:50 hadoop-hadoop-namenode-nnode.log.COMPLETED
-rw-r--r-- 1 root root     718 Nov 21 16:50 hadoop-hadoop-namenode-nnode.out.1.COMPLETED
-rw-r--r-- 1 root root     718 Nov 21 16:50 hadoop-hadoop-namenode-nnode.out.2.COMPLETED
-rw-r--r-- 1 root root     718 Nov 21 16:50 hadoop-hadoop-namenode-nnode.out.COMPLETED
-rw-r--r-- 1 root root  169932 Nov 21 16:50 hadoop-hadoop-zkfc-nnode.log.COMPLETED
-rw-r--r-- 1 root root     718 Nov 21 16:50 hadoop-hadoop-zkfc-nnode.out.1.COMPLETED
-rw-r--r-- 1 root root     718 Nov 21 16:50 hadoop-hadoop-zkfc-nnode.out.2.COMPLETED
-rw-r--r-- 1 root root     718 Nov 21 16:50 hadoop-hadoop-zkfc-nnode.out.COMPLETED
Note: by default Flume does not delete files after processing; it marks them as processed by renaming them with a .COMPLETED suffix. If the files do not need to be kept after ingestion, a deletion policy can be configured on the source:
deletePolicy (default: never): when to delete completed files, never or immediate
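To have the source remove files as soon as they are fully ingested, the source section of the config above can be extended; a sketch (property name from the spooling-directory-source section of the Flume user guide):

```properties
# Delete spooled files immediately after Flume finishes ingesting them,
# instead of leaving .COMPLETED markers behind
agent.sources.sd-source.deletePolicy = immediate
```

With deletePolicy = immediate the spool directory stays empty over time, at the cost of losing the on-disk record of what was ingested.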