Common Big Data Errors


1. Starting Spark with ./bin/spark-shell throws: java.net.BindException: Can't assign requested address: Service 'sparkDriver' failed after 16 retries!
Fix: add export SPARK_LOCAL_IP="127.0.0.1" to spark-env.sh

2. Java Kafka producer error: ERROR kafka.utils.Utils$ - fetching topic metadata for topics [Set(words_topic)] from broker [ArrayBuffer(id:0,host: xxxxxx,port:9092)] failed
Fix: set 'advertised.host.name' in the Kafka broker's server.properties to the server's real IP (the same value as the producer's 'metadata.broker.list' property)

3. java.net.NoRouteToHostException: No route to host
Fix: make sure the ZooKeeper IP is configured correctly

4. Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)   java.net.UnknownHostException: linux-pic4.site
Fix: add your hostname to /etc/hosts: 127.0.0.1 localhost linux-pic4.site

5. org.apache.spark.SparkException: A master URL must be set in your configuration
Fix: SparkConf sparkConf = new SparkConf().setAppName("JavaDirectKafkaWordCount").setMaster("local");

6. Failed to locate the winutils binary in the hadoop binary path
Fix: install Hadoop properly first

7. When starting Spark: Failed to get database default, returning NoSuchObjectException
Fix: 1) copy winutils.exe from https://github.com/steveloughran/winutils/tree/master/hadoop-2.6.0/bin to a folder, say C:\Hadoop\bin, and set HADOOP_HOME to C:\Hadoop; 2) open an admin command prompt and run C:\Hadoop\bin\winutils.exe chmod 777 /tmp/hive

8. org.apache.spark.SparkException: Only one SparkContext may be running in this JVM (see SPARK-2243). To ignore this error, set spark.driver.allowMultipleContexts = true.
Fix: use the constructor JavaStreamingContext(sparkContext: JavaSparkContext, batchDuration: Duration) instead of new JavaStreamingContext(sparkConf, Durations.seconds(5)); a sketch follows.
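A minimal Java sketch of that pattern, reusing the single JavaSparkContext for the streaming context (the app name and batch interval are placeholders):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

SparkConf conf = new SparkConf().setAppName("JavaDirectKafkaWordCount").setMaster("local[2]");
JavaSparkContext sc = new JavaSparkContext(conf);                                 // the one SparkContext in this JVM
JavaStreamingContext jssc = new JavaStreamingContext(sc, Durations.seconds(5));   // reuse it instead of creating a second context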


9. Reconnect due to socket error: java.nio.channels.ClosedChannelException
Fix: make sure the Kafka broker IP is correct

10. java.lang.IllegalArgumentException: requirement failed: No output operations registered, so nothing to execute
Fix: the RDD produced by the last transformation must have a corresponding action, e.g. messages.print()

11. Tip: in Spark, writes to Elasticsearch must be performed inside an action, operating on whole RDDs

12. Problem binding to [0.0.0.0:50010] java.net.BindException: Address already in use;
Fix: caused by the master and slave being configured with the same IP; give them different IPs

13. CALL TO LOCALHOST/127.0.0.1:9000
Fix: configure the hosts correctly: /etc/sysconfig/network, /etc/hosts, /etc/sysconfig/network-scripts/ifcfg-eth0

13. The namenode:50070 page shows only one node under Datanode Information
Fix: caused by a broken SSH setup; hostnames must match exactly, so reconfigure passwordless SSH login

14. Tip: when building a cluster, configure the hostnames first and reboot the machines so the new hostnames take effect

15. INFO hdfs.DFSClient: Exception in createBlockOutputStream  java.net.NoRouteToHostException: No route to host
Fix: if the master and slave nodes can ping each other, turn off the firewall: service iptables stop

16. Tip: do not format HDFS casually; it causes data version mismatches and many other problems. Clear the data directories before formatting.

17. namenode1: ssh: connect to host namenode1 port 22: Connection refused
Fix: sshd is stopped or not installed. Check with which sshd; if it is installed, restart sshd and run ssh against the local hostname to verify the connection works

18. Log aggregation has not completed or is not enabled.
Fix: add the corresponding settings to yarn-site.xml to enable log aggregation, as in the example below
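A typical yarn-site.xml fragment for enabling log aggregation looks like the following (the retention period is an example value):

<property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
</property>
<property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>604800</value>
    <description>Keep aggregated logs for 7 days</description>
</property>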


19. failed to launch org.apache.spark.deploy.history.HistoryServer (full log in ...)
Fix: correctly configure spark-defaults.conf and the SPARK_HISTORY_OPTS setting in spark-env.sh

20. Exception in thread "main" org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
Fix: an exception seen in yarn-client mode; no known fix for now

21. Hadoop files cannot be downloaded, and the Tracking UI in YARN cannot reach the history logs
Fix: Windows cannot resolve the cluster hostnames; copy the hostname entries from the cluster's hosts file into the Windows hosts file

22. Tip: an HDFS file path is written as hdfs://master:9000/<path>, where master is the namenode's hostname and 9000 is the HDFS port.

23. Yarn JobHistory Error: Failed redirect for container
Fix: configure the JobHistory log URL, http://<jobhistory-host>:19888/jobhistory/logs, in yarn-site.xml, then restart YARN and the JobHistoryServer

24. Accessing an HDFS folder through the Hadoop UI shows: Permission denied: user=dr.who
Fix: run on the namenode: hdfs dfs -chmod -R 755 /

25. Tip: the Spark driver only receives results when an action is executed

26. Tip: when Spark needs a globally aggregated variable, use an accumulator; a sketch follows
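A minimal Java sketch of a counter accumulator, assuming Spark 1.x, an existing JavaSparkContext named sc, and a JavaRDD<String> named rdd:

import org.apache.spark.Accumulator;

final Accumulator<Integer> badRecords = sc.accumulator(0);   // created on the driver
rdd.foreach(line -> {
    if (line == null || line.isEmpty()) {
        badRecords.add(1);                                   // executors may only add to it
    }
});
System.out.println("bad records: " + badRecords.value());    // the value is read back on the driver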

27. Tip: Kafka organizes messages by topic and consumer group. Every consumer group that subscribes to a topic consumes all of that topic's messages. If you want a single consumer to see all messages of a topic, put only one consumer in its group. The number of consumers in a group must not exceed the topic's partition count, otherwise the extra consumers have nothing to consume.

28. java.lang.NoSuchMethodError: com.google.common.util.concurrent.MoreExecutors.directExecutor()Ljava/util/concurrent/Executor;
Fix: unify the ES versions, and avoid creating an ES client directly inside Spark where possible

29. returned Bad Request(400) - failed to parse;Compressor detection can only be called on some xcontent bytes or compressed xcontent bytes; Bailing out..
Fix: correct the format of the data written to ES

30. java.util.concurrent.TimeoutException: Cannot receive any reply in 120 seconds
Fix: make sure all nodes can log into each other without a password

31. In cluster mode, Spark cannot write data to Elasticsearch
Fix: write with the ES configuration Map passed in: results.foreachRDD(javaRDD -> {JavaEsSpark.saveToEs(javaRDD, esSchema, cfg);return null;}); (see the example cfg below)
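For reference, a possible cfg for that call, built from standard elasticsearch-hadoop settings; the node address and index/type names are placeholders:

Map<String, String> cfg = new HashMap<>();
cfg.put("es.nodes", "10.0.0.1");            // ES node, placeholder address
cfg.put("es.port", "9200");
cfg.put("es.index.auto.create", "true");    // create the target index if it does not exist
String esSchema = "myindex/mytype";         // "index/type" write target, placeholder name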


32. Tip: all custom classes must implement the Serializable interface, otherwise they will not work in the cluster

33. Tip: read resources files on the Spark driver and pass the contents into closures as local variables

34. Reading a resource file via NIO throws java.nio.file.FileSystemNotFoundException  at com.sun.nio.zipfs.ZipFileSystemProvider.getFileSystem(ZipFileSystemProvider.java:171)
Fix: after packaging into a jar the URI changes to something like jar:file:/C:/path/to/my/project.jar!/my-folder, so parse it like this:

// split the URI at "!" and open the jar itself as a zip file system
final Map<String, String> env = new HashMap<>();
final String[] array = uri.toString().split("!");
final FileSystem fs = FileSystems.newFileSystem(URI.create(array[0]), env);
final Path path = fs.getPath(array[1]);

35. Tip: a DStream transformation only produces a temporary stream object; to keep using it, hold a reference to that temporary stream

36. Tip: jobs submitted in yarn-cluster mode cannot print to the console; write output to log files with log4j

37. java.io.NotSerializableException: org.apache.log4j.Logger
Fix: a serializable class must not contain non-serializable members. Keep the logger out of default serialization by making it transient or static; static final is preferred, because a transient logger is null after deserialization and any logger.debug() call then throws a NullPointerException (neither the constructor nor instance initializer blocks run during deserialization). Making it static final also keeps it thread-safe and shared by all instances of the class; this is one of the reasons a Logger should be declared static final in a Java program. A sketch follows.
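A minimal sketch of the recommended declaration (Customer is a placeholder class name):

import java.io.Serializable;
import org.apache.log4j.Logger;

public class Customer implements Serializable {
    // static final: excluded from serialization, never null after deserialization, shared by all instances
    private static final Logger LOGGER = Logger.getLogger(Customer.class);
}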


38. log4j:WARN Unsupported encoding
Fix: 1) change UTF to lowercase utf-8; 2) check for a stray space on the encoding line

39. MapperParsingException[Malformed content, must start with an object
Fix: use JavaEsSpark.saveJsonToEs, because saveToEs only handles objects, not strings

40. ERROR ApplicationMaster: SparkContext did not initialize after waiting for 100000 ms. Please check earlier log output for errors. Failing the application
Fix: do not allocate overly large resources, or remove the .setMaster("local[*]") you forgot to delete

41. WARN Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
Fix: the broker id in the configuration file must be correct, and use the real IP in commands

42. User class threw exception: org.apache.spark.SparkException: org.apache.spark.SparkException: Couldn't find leaders for Set([mywaf,7], [mywaf,1])
Fix: configure Kafka correctly and recreate the topic

43. In the ES UI, some nodes' shards are not shown
Fix: that node's disk is full; free up disk space


44. The method updateStateByKey(Function2<List<V>, Optional<S>, Optional<S>>, int) in the type JavaPairDStream is not applicable for the arguments (Function2<List<Integer>, Optional<Integer>, Optional<Integer>>, int)
Fix: Spark 1.x uses com.google.common.base.Optional, not the JDK's java.util.Optional; a sketch follows
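A rough Java sketch of a state update function against the Spark 1.x API, assuming a JavaPairDStream<String, Integer> named pairs; note the Guava import:

import java.util.List;
import com.google.common.base.Optional;               // NOT java.util.Optional
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.streaming.api.java.JavaPairDStream;

Function2<List<Integer>, Optional<Integer>, Optional<Integer>> updateFunc =
        (values, state) -> {
            int sum = state.or(0);                    // previous state, defaulting to 0
            for (Integer v : values) {
                sum += v;
            }
            return Optional.of(sum);                  // new state for this key
        };
JavaPairDStream<String, Integer> counts = pairs.updateStateByKey(updateFunc);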


45. NativeCrc32.nativeComputeChunkedSumsByteArray
Fix: configure Eclipse's hadoop-home, and put the 64-bit 2.6 hadoop.dll into the bin and system32 folders

46. Tip: Spark Streaming has three computation modes: non-state, stateful, window

47. Single point of failure of YARN's ResourceManager
Fix: set up YARN HA with a three-node ZooKeeper cluster and the yarn-site.xml configuration

48. Tip: Kafka can use its bundled ZooKeeper cluster via its configuration files

49. Tip: every Spark operation ultimately boils down to operations on RDDs

50. How to guarantee strict ordering of a Kafka message queue
Fix: give the topic that needs strict ordering only one partition, as in the command below
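For example, creating such a topic with the standard Kafka CLI (the ZooKeeper address and topic name are placeholders):

kafka-topics.sh --create --zookeeper zk1:2181 --replication-factor 2 --partitions 1 --topic strictly-ordered-topic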


51. Batch mutual SSH trust across many Linux machines
Fix: use a single shared public key

52. org.apache.spark.SparkException: Failed to get broadcast_790_piece0 of broadcast_790
Fix: remove the spark.cleaner.ttl setting from spark-defaults.conf

53. In a YARN HA environment, accessing the history logs via the web gets redirected to 8088 and nothing is shown
Fix: restore YARN's default HTTP port 8088

54. but got no response. Marking as slave lost
Fix: seen when submitting jobs in yarn-client mode; no known fix for now

55. Using config: /work/poa/zookeeper-3.4.6/bin/../conf/zoo.cfg  Error contacting service. It is probably not running.
Fix: the configuration file is wrong, e.g. a mismatched hostname

56. Tip: when deploying a Spark job there is no need to copy the whole jar; copy only the modified files and compile and package on the target server.

57. Spark setAppName doesn't appear in Hadoop running applications UI
Fix: set it on the spark-submit command line: --name BetterName

58. How to monitor whether a Spark Streaming job has died
Fix: monitor the driver port, or write a Linux cron script around the yarn command


59. Kafka on machines with both internal and external networks
Fix: for Kafka machines with dual NICs, do not put an IP in advertised.host.name in server.properties; use a domain name so that external producers and internal consumers each resolve it to the IP they need.

60. Tip: do not point Kafka's log.dirs at a directory under /tmp; /tmp appears to have file-count and disk-capacity limits

61. After moving Kafka to a new cluster, topics are auto-created and only one broker carries the load
Fix: add delete.topic.enable=true and auto.create.topics.enable=false to server.properties, delete the old topics, recreate them, and restart Kafka

62. After installing sbt, running the sbt command hangs at Getting org.scala-sbt sbt 0.13.6 ...
Fix: sbt takes some time to download its jars the first time it runs; do not quit, let sbt finish

63. Tip: ES shards are similar to Kafka partitions

64. Kafka throws an OOM exception
Fix: in the Kafka broker startup script, raise the JVM heap in export KAFKA_HEAP_OPTS="-Xmx24G -Xms1G"

65. A Linux server's disk is full; find files above a given size
Fix: find / -type f -size +10G

66. Rate-limiting Spark direct Kafka streaming
Fix: spark.streaming.kafka.maxRatePerPartition sets the per-second read rate for each Kafka partition, as in the example below
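For example, set it on the SparkConf (the app name is a placeholder and 1000 records per second per partition is just an illustrative value):

SparkConf conf = new SparkConf()
        .setAppName("DirectKafkaJob")                                   // placeholder app name
        .set("spark.streaming.kafka.maxRatePerPartition", "1000");     // max records/sec read from each partition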


67. org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: Found unrecoverable error returned Not Found(404) - [EngineClosedException CurrentState[CLOSED]
Fix: close and then reopen the index in the kopf plugin. The likely cause is a shard that went bad when the index was created.

68. Job aborted due to stage failure: Task not serializable:
Fix: make the class Serializable; or declare the instance only within the lambda function passed to map; or make the NotSerializable object static and create it once per machine; or call rdd.foreachPartition and create the NotSerializable object in there, as sketched below
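A rough sketch of the foreachPartition variant; EsClient is a stand-in for whatever non-serializable class is involved, and the address is a placeholder:

rdd.foreachPartition(partition -> {
    // create the non-serializable client on the executor, once per partition, so it never has to be shipped
    EsClient client = new EsClient("10.0.0.1:9200");
    while (partition.hasNext()) {
        client.send(partition.next());
    }
    client.close();
});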


69. Pipeline write will fail on this Pipeline because it contains a stage which does not implement Writable
Fix: this cannot be done as of Spark 1.6; upgrade Spark

70. After importing a Scala project from git into IDEA, every variable is flagged as "never used"
Fix: mark the src folder as Sources Root (Mark Directory as > Sources Root)

71. Run configuration in IntelliJ results in "Cannot start compilation: the output path is not specified for module "xxx". Specify the output path in Configure Project."
Fix: in the default IntelliJ options, "Make" was checked as "Before Launch"; unchecking it fixed the issue

72. UDFRegistration$$anonfun$register$26$$anonfun$apply$2 cannot be cast to scala.Function1
Fix: an aggregate function cannot be a UDF; define a UDAF instead

73. Spark SQL replacement for the MySQL GROUP_CONCAT aggregate function
Fix: write a custom UDAF

74. Cannot create new Scala files in an IntelliJ IDEA Maven project
Fix: add the scala-tools plugin configuration to pom.xml, then download and refresh

75. Error:scala: Error: org.jetbrains.jps.incremental.scala.remote.ServerException
Fix: update pom.xml to use the latest Scala version

76. Rebalancing Hadoop nodes when some disks are full
Fix: run hdfs balancer -threshold 3, or run the start-balancer.sh script: $HADOOP_HOME/bin/start-balancer.sh -threshold 3. The parameter 3 is a percentage, meaning the DataNodes' disk usage will be balanced to within a 3% deviation of each other.


77. Tip: in a Spark SQL UDAF, the second parameter of the update function, input: Row, is not a row of the DataFrame but a row projected by inputSchema

78. Error: No TypeTag available for String  sqlContext.udf.register()
Fix: Scala version mismatch; unify the Scala versions everywhere

79. How to add a constant column in a Spark DataFrame?
Fix: the second argument of DataFrame.withColumn must be a Column, so use a literal: df.withColumn('new_column', lit(10))

80. Error:scalac:Error:object VolatileDoubleRef does not have a member create
Fix: Scala version mismatch; align the development environment's and the system's Scala versions

81. java.lang.NoSuchMethodError: scala.collection.immutable.HashSet$.empty()Lscala/collection/immutable/HashSet
Fix: align the Scala version with the one Spark was built against

82. Trimming unneeded dependencies when packaging a Maven project, so the target jar does not get too large
Fix: mark the dependency with <scope>provided</scope> so it is not put into the target jar, and package with the maven-shade plugin; an example follows
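For example, marking the Spark core dependency as provided in pom.xml (the artifact and version are illustrative):

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>1.6.2</version>
    <!-- provided: supplied by the cluster at runtime, excluded from the shaded jar -->
    <scope>provided</scope>
</dependency>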


83. Packaging a mixed Scala and Java project with Maven
Fix: use mvn clean scala:compile compile package

84. Spark SQL's udf cannot register a UDAF aggregate function
Fix: declare the custom UDAF with the class keyword instead of object

85. Tip: deleting Hadoop data directories at runtime breaks every job that depends on HDFS

86. [IllegalArgumentException[Document contains at least one immense term in field=XXX
Fix: when creating the ES index, enable analysis (tokenization) on long text fields

87. maven shade does not package the resource files
Fix: put the resources folder under src/main/, alongside the scala or java folder

88. Tip: Spark GraphX builds the graph from the edge set; the vertex set only specifies which vertices in the graph are valid

89. An ES query using a regex fails with: Determinizing automaton would result in more than 10000 states.
Fix: the regular expression is too long and too complex; keep regex matching concise and avoid enumeration-style patterns

90. java.lang.StackOverflowError   at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:53)
Fix: the SQL WHERE clause is too long; the string overflows the stack

91. org.apache.spark.shuffle.MetadataFetchFailedException:  Missing an output location for shuffle 0
Fix: increase executor memory, reduce the number of executors, and increase executor concurrency

92. ExecutorLostFailure (executor 3 exited caused by one of the running tasks) Reason: Container killed by YARN for exceeding memory limits. 61.0 GB of 61 GB physical memory used
Fix: remove RDD caching, increase the job's spark.storage.memoryFraction, and increase the job's spark.yarn.executor.memoryOverhead


93. EsRejectedExecutionException[rejected execution (queue capacity 1000) on org.elasticsearch.search.action.SearchServiceTransportAction
Fix: reduce Spark parallelism to lower the concurrent reads against ES

94. Tip: do not give a single Spark job too many executor cores, otherwise other jobs will be delayed

95. Tip: data skew only happens during shuffles; operators that can trigger a shuffle include distinct, groupByKey, reduceByKey, aggregateByKey, join, cogroup, repartition, etc.

96. How to locate data skew in Spark
Fix: in the Spark Web UI, look at the data volume and execution time of each task in the current stage, then use the stage-splitting rules to locate the shuffle operator in the code

97. How to fix data skew in Spark
Fix: 1) filter out the few keys that cause the skew (only if dropping those keys barely affects the job); 2) increase the parallelism of the shuffle operation (limited benefit); 3) two-stage aggregation (local plus global): add a random prefix to identical keys so they become several keys, shuffle and aggregate locally, strip the prefix, then shuffle again for the global aggregation (only works for aggregation-style shuffles, where it is very effective; does nothing for join-style shuffles); 4) turn a reduce join into a map join by broadcasting the small table and walking it inside a map over the big table (only for a small table joined to a large table or RDD); 5) join with random prefixes and an expanded RDD: prefix each record of one RDD with a random number below n, use flatMap to expand the other RDD n-fold and prefix its copies with 0..n-1, then join the two rewritten RDDs (greatly relieves join-type skew but costs a lot of memory). A sketch of option 3 follows.
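A rough Java sketch of option 3, two-stage aggregation with salted keys, assuming a skewed JavaPairRDD<String, Integer> named pairs (Tuple2 comes from scala.Tuple2, Random from java.util):

int n = 10;                                   // number of salt values; tune to the skew
Random rand = new Random();
// stage 1: salt the keys so one hot key becomes n keys, then aggregate locally
JavaPairRDD<String, Integer> partial = pairs
        .mapToPair(t -> new Tuple2<>(rand.nextInt(n) + "_" + t._1(), t._2()))
        .reduceByKey(Integer::sum);
// stage 2: strip the salt prefix and aggregate globally on the original keys
JavaPairRDD<String, Integer> result = partial
        .mapToPair(t -> new Tuple2<>(t._1().substring(t._1().indexOf('_') + 1), t._2()))
        .reduceByKey(Integer::sum);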


98. Tip: shuffle write happens when a stage finishes its computation. So that the next stage can run shuffle-type operators, each task's data is partitioned by key and all records with the same key are written to the same disk file; each disk file belongs to exactly one task of the downstream stage. The data is buffered in memory before it is written to disk, and each task of the current stage creates as many disk files as there are tasks in the next stage.

99. java.util.regex.PatternSyntaxException: Dangling meta character '?' near index 0
Fix: remember to escape regex metacharacters

100. Elastic (dynamic) resource allocation in Spark
Fix: configure the Spark shuffle service and enable spark.dynamicAllocation.enabled, as in the example below
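A typical spark-defaults.conf fragment for this; the executor bounds are example values, and on YARN the spark_shuffle auxiliary service also has to be registered in yarn-site.xml:

spark.shuffle.service.enabled          true
spark.dynamicAllocation.enabled        true
spark.dynamicAllocation.minExecutors   1
spark.dynamicAllocation.maxExecutors   20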


101. Tip: Kafka consumer group IDs have no effect for Spark direct streaming

102. Starting Hadoop YARN brings up only the ResourceManager, not the NodeManagers
Fix: yarn-site.xml is misconfigured; check and clean up every setting

103. How to view the Hadoop system logs
Fix: in Hadoop 2.x, the YARN service logs consist of the ResourceManager log and the NodeManager logs. The ResourceManager log is yarn-*-resourcemanager-*.log under the logs directory of the Hadoop installation; the NodeManager logs are yarn-*-nodemanager-*.log under the logs directory of the Hadoop installation on each NodeManager node.

104. Tip: files smaller than 128M still occupy a 128M block; merge or delete small files to save disk space

105. how to remove Non DFS Used
Fix: 1) clear the user cache files in the Hadoop data directory: cd /data/hadoop/storage/tmp/nm-local-dir/usercache; du -h; rm -rf `find  -type f -size +10M`;  2) clean up junk data in the Linux file system

106. Tip: Non DFS Used refers to all files that are not HDFS files

107. Keeping Linux profile configuration separated
Fix: cd /etc/profile.d and create the individual configuration scripts there

108. The reference to entity "autoReconnect" must end with the ';' delimiter
Fix: escape the ampersand: replace & with &amp; in the XML value

109. Service hiveserver not found
Fix: try bin/hive --service hiveserver2 instead of hive --service hiveserver for this version of Apache Hive

110. Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark client.)'
Fix: do not use a pre-built Spark; rebuild Spark and make sure its version matches the one in Hive's pom

111. java.lang.NoSuchFieldError: SPARK_RPC_SERVER_ADDRESS  at org.apache.hive.spark.client.rpc.RpcConfiguration.(RpcConfiguration.java:45)
Fix: the Hive and Spark versions must match, and Spark must be built without the -Phive flag

112. javax.jdo.JDOFatalInternalException: Error creating transactional connection factory
Fix: put the MySQL connector jar into Hive's lib directory


113. org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark client  FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
Fix: there are several possible causes; check hive.log to narrow the problem down

114. Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream
Fix: Spark was built with the hadoop-provided profile, so the Hadoop jars are missing

115. Pressing backspace after a mistyped Linux command prints ^H
Fix: run stty erase ^H

116. Tip: check which Spark versions Hive supports in Hive's pom.xml; the versions only need to match at the release line, e.g. Spark 1.6.0 and 1.6.2 both match

117. Tip: open the Hive command-line client and check whether the log prints "SLF4J: Found binding in [jar:file:/work/poa/hive-2.1.0-bin/lib/spark-assembly-1.6.2-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]" to tell whether Hive is bound to Spark

118. Starting YARN brings up only some of the NodeManagers
Fix: the nodes that did not start are missing YARN jars; keep the jars identical on all nodes

119. Error: Could not find or load main class org.apache.hive.beeline.BeeLine
Fix: rebuild Hive with the -Phive-thriftserver flag

120. Tip: when building Spark for Hive on Spark, do not add the -Phive flag; add -Phive only if you need Spark SQL to support Hive syntax

121. User class threw exception: org.apache.spark.sql.AnalysisException: path hdfs://XXXXXX already exists.;
Fix: df.write.format("parquet").mode("append").save("path.parquet")

122. check the manual that corresponds to your MySQL server version for the right syntax to use near 'OPTION SQL_SELECT_LIMIT=DEFAULT' at line 1
Fix: use a newer mysql-connector

123. org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: root is not allowed to impersonate
Fix: in core-site.xml set hadoop.proxyuser.root.hosts and hadoop.proxyuser.root.groups to *, then restart YARN; see the fragment below
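The corresponding core-site.xml fragment would look like this:

<property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
</property>
<property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
</property>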


124. java.lang.NoSuchMethodError: org.apache.parquet.schema.Types$MessageTypeBuilder.addFields([Lorg/apache/parquet/schema/Type;)Lorg/apache/parquet/schema/Types$BaseGroupBuilder;
Fix: a version conflict; unify the parquet component versions used by Hive and Spark

125. Tip: you can tune Hive on Spark performance through hive-site.xml settings such as spark.executor.instances, spark.executor.cores and spark.executor.memory, but dynamic resource allocation is the better option.

126. WARN SparkContext: Dynamic Allocation and num executors both set, thus dynamic allocation disabled.
Fix: if you want dynamic resource allocation, do not set the number of executors

127. Invalid configuration property node.environment: is malformed (for class io.airlift.node.NodeConfig.environment)
Fix: the node.environment property (in the node.properties file) is set but fails to match the regular expression [a-z0-9][_a-z0-9]*; rename it to conform

128. com.facebook.presto.server.PrestoServer: No factory for connector hive-XXXXXX
Fix: connector.name in hive.properties is wrong; it must name the specific version so Presto can use the matching adapter, e.g. connector.name=hive-hadoop2

129. org.apache.spark.SparkException: Task failed while writing rows  Caused by: org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: null
Fix: ES is overloaded; repair ES


130. Tip: if Maven downloads are very slow, the default repositories are probably being blocked or throttled; add a domestic mirror under the mirrors tag of Maven's settings.xml, for example:

<mirror>
    <id>nexus-aliyun</id>
    <mirrorOf>*</mirrorOf>
    <name>Nexus aliyun</name>
    <url>http://maven.aliyun.com/nexus/content/groups/public</url>
</mirror>

131. ERROR ApplicationMaster: Uncaught exception: java.lang.SecurityException: Invalid signature file digest for Manifest main attributes
Fix: in pom.xml, configure the maven-shade-plugin to strip the signature files from the shaded jar:

<filters>
    <filter>
        <artifact>*:*</artifact>
        <excludes>
            <exclude>META-INF/*.SF</exclude>
            <exclude>META-INF/*.DSA</exclude>
            <exclude>META-INF/*.RSA</exclude>
        </excludes>
    </filter>
</filters>

132. scala.MatchError: Buffer(10.113.80.29, None) (of class scala.collection.convert.Wrappers$JListWrapper)
Fix: remove the dirty data in ES whose types are incompatible with the Scala data types


133. How to recover files deleted from HDFS by mistake
Fix: add the following to core-site.xml:

<property>
    <name>fs.trash.interval</name>
    <value>2880</value>
    <description>HDFS trash setting; deleted files can be recovered within this many minutes, 0 disables the trash</description>
</property>

To recover a file, run: hdfs dfs -mv /user/root/.Trash/Current/<deleted-file> <original-path>

134. After changing the order of some tasks inside a Linux cron script, some tasks did not run and others ran twice
Fix: edits to a Linux script take effect immediately; wait until the script has finished running before modifying it, to avoid side effects


135. Tip: Spark has two repartitioning methods, coalesce and repartition. The former is a narrow dependency and leaves the data unevenly distributed; the latter is a wide dependency that triggers a shuffle and leaves the data evenly distributed.


 

136. org.apache.spark.SparkException: Task failed while writing rows   scala.MatchError: Buffer(10.113.80.29, None) (of class scala.collection.convert.Wrappers$JListWrapper)
Fix: the ES data is incompatible with Spark SQL's type conversion; read the ES data as strings with EsSpark.esJsonRDD and then convert the RDD into a DataFrame


137. Container exited with a non-zero exit code 143  Killed by external signal
Fix: not enough resources were allocated. Increase memory, or adjust the code so that large objects such as JsonObject do not consume too much memory, or include the properties below in yarn-site.xml and restart the VMs:

<property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
    <description>Whether virtual memory limits will be enforced for containers</description>
</property>
<property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>4</value>
    <description>Ratio between virtual memory to physical memory when setting memory limits for containers</description>
</property>

138. Manually generating a Maven dependency for an existing jar
Fix: mvn install:install-file -Dfile=spark-assembly-1.6.2-hadoop2.6.0.jar -DgroupId=org.apache.repack -DartifactId=spark-assembly-1.6.2-hadoop2.6.0 -Dversion=2.6 -Dpackaging=jar

139. FAILED: SemanticException [Error 10006]: Line 1:122 Partition not found ''2016-08-01''
Fix: the Hive version is too new and has this bug; downgrade Hive from 2.1.0 to 1.2.1

140. ParseException line 1:17 mismatched input 'hdfs' expecting StringLiteral near 'inpath' in load statement
Fix: drop the hdfs:// IP-and-port prefix, write the absolute HDFS path directly, and wrap it in single quotes

141. [ERROR] Terminal initialization failed; falling back to unsupported  java.lang.IncompatibleClassChangeError: Found class jline.Terminal, but interface was expected
Fix: export HADOOP_USER_CLASSPATH_FIRST=true

142. A shell script started from crontab does not run properly, but works when run by hand
Fix: put source /etc/profile on the first line of the script, because the cron process does not load the .profile in the user's home directory; a sketch follows
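A minimal sketch of such a script; the job and log paths are placeholders:

#!/bin/bash
# cron does not load login profiles, so pull the environment in explicitly
source /etc/profile
# placeholder job path and log path
/work/poa/bin/daily_job.sh >> /work/poa/logs/daily_job.log 2>&1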


143. SparkListenerBus has already stopped! Dropping event SparkListenerStageCompleted
Fix: the cluster is short of resources; make sure the actual free memory is larger than what the Spark job requests

144. PrestoException: ROW comparison not supported for fields with null elements
Fix: replace != null with is not null

145. When starting the Presto servers, some nodes fail to start
Fix: the memory allocated to the JVM must be smaller than the actual free memory

146. Tip: once a Presto process starts, the JVM server keeps the memory allocated

147. Error injecting constructor, java.lang.IllegalArgumentException: query.max-memory-per-node set to 20GB, but only 10213706957B of useable heap available
Fix: Presto will claim 0.40 * max heap size for the system pool, so query.max-memory-per-node must not exceed this; increase the heap or decrease query.max-memory-per-node

148. failed: Encountered too many errors talking to a worker node. The node may have crashed or be under too much load. failed java.util.concurrent.CancellationException: Task was cancelled
Fix: such exceptions are caused by timeout limits; extend the wait time by setting exchange.http-client.request-timeout=50s in the worker nodes' config

149. What are the mainstream stacks for big data ETL and visualization
Fix: candidates include ELK (Elasticsearch + Logstash + Kibana) and HPA (Hive + Presto + Airpal)

150. Tip: there is no need to run a Presto cluster on YARN. Hadoop depends on HDFS, which gets awkward when some machines have small disks, whereas Presto is a pure in-memory engine that does not depend on disks; installed standalone it can span multiple clusters. Wherever there is memory, there can be Presto.

