Running Spark Remotely from IDEA on Win7

Published: 2020-06-28 03:23:43 | Source: Web | Reads: 2163 | Author: moviebat | Column: Big Data



SogouResult.scala:

package main.scala

import org.apache.spark.SparkContext._
import org.apache.spark.{SparkConf, SparkContext}

object SogouResult {
  def main(args: Array[String]) {
    if (args.length == 0) {
      System.err.println("Usage: SogouResult <input> <output>")
      System.exit(1)
    }

    // NOTE: the exact string and numeric literals in this listing (app name, master URL,
    // field delimiter, field count and index) were not readable in the published article;
    // the values shown are typical ones for the SogouQ query-log example.
    val conf = new SparkConf().setAppName("SogouResult").setMaster("local")
    val sc = new SparkContext(conf)

    // Split each log line into fields and keep only well-formed records.
    val rdd1 = sc.textFile(args(0)).map(_.split("\t")).filter(_.length == 6)
    // Count occurrences per key, then sort by count in descending order.
    val rdd2 = rdd1.map(x => (x(1), 1)).reduceByKey(_ + _)
      .map(x => (x._2, x._1)).sortByKey(false).map(x => (x._2, x._1))
    rdd2.saveAsTextFile(args(1))
    sc.stop()
  }
}

fs.defaultFS = hdfs://192.168.0.3:9000

pom.xml (key settings):

<modelVersion>4.0.0</modelVersion>
<groupId>HdfsTest</groupId>
<artifactId>HdfsTest</artifactId>
<version>1.0-SNAPSHOT</version>

<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.6.4</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-mapreduce-client-jobclient</artifactId>
        <version>2.6.4</version>
    </dependency>
    <dependency>
        <groupId>commons-cli</groupId>
        <artifactId>commons-cli</artifactId>
        <version>1.2</version>
    </dependency>
</dependencies>

<build>
    <finalName>${project.artifactId}</finalName>
</build>

The run parameters (program arguments) were:

hdfs://192.168.0.3:9000/input/SogouQ1 hdfs://192.168.0.3:9000/output/sogou1


It failed with:

"C:\Program Files\Java\jdk1.7.0_79\bin\java" -Didea.launcher.port=7535 -Didea.launcher.bin.path=D:\Java\IntelliJ\bin -Dfile.encoding=UTF-8 -classpath "C:\Program Files\Java\jdk1.7.0_79\jre\lib\charsets.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\deploy.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\ext\access-bridge-64.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\ext\dnsns.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\ext\jaccess.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\ext\localedata.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\ext\sunec.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\ext\sunjce_provider.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\ext\sunmscapi.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\ext\zipfs.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\javaws.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\jce.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\jfr.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\jfxrt.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\jsse.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\management-agent.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\plugin.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\resources.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\rt.jar;D:\scalasrc\HdfsTest\target\classes;D:\scalasrc\lib\datanucleus-core-3.2.10.jar;D:\scalasrc\lib\datanucleus-rdbms-3.2.9.jar;D:\scalasrc\lib\spark-1.5.0-yarn-shuffle.jar;D:\scalasrc\lib\datanucleus-api-jdo-3.2.6.jar;D:\scalasrc\lib\spark-assembly-1.5.0-hadoop2.6.0.jar;D:\scalasrc\lib\spark-examples-1.5.0-hadoop2.6.0.jar;D:\Java\scala210\lib\scala-actors-migration.jar;D:\Java\scala210\lib\scala-actors.jar;D:\Java\scala210\lib\scala-library.jar;D:\Java\scala210\lib\scala-reflect.jar;D:\Java\scala210\lib\scala-swing.jar;D:\Java\IntelliJ\lib\idea_rt.jar" com.intellij.rt.execution.application.AppMain main.scala.SogouResult hdfs://192.168.0.3:9000/input/SogouQ1 hdfs://192.168.0.3:9000/output/sogou1

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties

SLF4J: Class path contains multiple SLF4J bindings.

SLF4J: Found binding in [jar:file:/D:/scalasrc/lib/spark-assembly-1.5.0-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in [jar:file:/D:/scalasrc/lib/spark-examples-1.5.0-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.

SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

16/09/16 12:00:43 INFO SparkContext: Running Spark version 1.5.0

16/09/16 12:00:44 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

16/09/16 12:00:44 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path

java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.

at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:355)

at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:370)

at org.apache.hadoop.util.Shell.<clinit>(Shell.java:363)

at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79)

at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:104)

at org.apache.hadoop.security.Groups.<init>(Groups.java:86)

at org.apache.hadoop.security.Groups.<init>(Groups.java:66)

at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:280)

at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:271)

at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:248)

at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:763)

at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:748)

at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:621)

at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2084)

at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2084)

at scala.Option.getOrElse(Option.scala:120)

at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2084)

at org.apache.spark.SparkContext.<init>(SparkContext.scala:310)

at main.scala.SogouResult$.main(SogouResult.scala:16)

at main.scala.SogouResult.main(SogouResult.scala)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)

16/09/16 12:00:44 INFO SecurityManager: Changing view acls to: danger

16/09/16 12:00:44 INFO SecurityManager: Changing modify acls to: danger

16/09/16 12:00:44 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(danger); users with modify permissions: Set(danger)

16/09/16 12:00:45 INFO Slf4jLogger: Slf4jLogger started

16/09/16 12:00:45 INFO Remoting: Starting remoting

16/09/16 12:00:45 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@192.168.0.2:55944]

16/09/16 12:00:45 INFO Utils: Successfully started service 'sparkDriver' on port 55944.

16/09/16 12:00:45 INFO SparkEnv: Registering MapOutputTracker

16/09/16 12:00:45 INFO SparkEnv: Registering BlockManagerMaster

16/09/16 12:00:45 INFO DiskBlockManager: Created local directory at C:\Users\danger\AppData\Local\Temp\blockmgr-281e23a9-a059-4670-a1b0-0511e63c55a3

16/09/16 12:00:45 INFO MemoryStore: MemoryStore started with capacity 481.1 MB

16/09/16 12:00:45 INFO HttpFileServer: HTTP File server directory is C:\Users\danger\AppData\Local\Temp\spark-84f74e01-9ea2-437c-b532-a5cfec898bc8\httpd-876c9027-ebb3-44c6-8256-bd4a555eaeaf

16/09/16 12:00:45 INFO HttpServer: Starting HTTP Server

16/09/16 12:00:46 INFO Utils: Successfully started service 'HTTP file server' on port 55945.

16/09/16 12:00:46 INFO SparkEnv: Registering OutputCommitCoordinator

16/09/16 12:00:46 INFO Utils: Successfully started service 'SparkUI' on port 4040.

16/09/16 12:00:46 INFO SparkUI: Started SparkUI at http://192.168.0.2:4040

16/09/16 12:00:46 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.

16/09/16 12:00:46 INFO Executor: Starting executor ID driver on host localhost

16/09/16 12:00:46 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 55964.

16/09/16 12:00:46 INFO NettyBlockTransferService: Server created on 55964

16/09/16 12:00:46 INFO BlockManagerMaster: Trying to register BlockManager

16/09/16 12:00:46 INFO BlockManagerMasterEndpoint: Registering block manager localhost:55964 with 481.1 MB RAM, BlockManagerId(driver, localhost, 55964)

16/09/16 12:00:46 INFO BlockManagerMaster: Registered BlockManager

16/09/16 12:00:47 INFO MemoryStore: ensureFreeSpace(157320) called with curMem=0, maxMem=504511856

16/09/16 12:00:47 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 153.6 KB, free 481.0 MB)

16/09/16 12:00:47 INFO MemoryStore: ensureFreeSpace(14301) called with curMem=157320, maxMem=504511856

16/09/16 12:00:47 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 14.0 KB, free 481.0 MB)

16/09/16 12:00:47 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:55964 (size: 14.0 KB, free: 481.1 MB)

16/09/16 12:00:47 INFO SparkContext: Created broadcast 0 from textFile at SogouResult.scala:18

16/09/16 12:00:48 WARN : Your hostname, danger-PC resolves to a loopback/non-reachable address: fe80:0:0:0:0:5efe:ac1b:2301%24, but we couldn't find any external IP address!

Exception in thread "main" java.net.ConnectException: Call From danger-PC/192.168.0.2 to 192.168.0.3:9000 failed on connection exception: java.net.ConnectException: Connection refused: no further information; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)

at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)

at java.lang.reflect.Constructor.newInstance(Constructor.java:526)

at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)

at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)

at org.apache.hadoop.ipc.Client.call(Client.java:1472)

at org.apache.hadoop.ipc.Client.call(Client.java:1399)

at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)

at com.sun.proxy.$Proxy19.getFileInfo(Unknown Source)

at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)

at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)

at com.sun.proxy.$Proxy20.getFileInfo(Unknown Source)

at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1988)

at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1118)

at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1114)

at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)

at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1114)

at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:57)

at org.apache.hadoop.fs.Globber.glob(Globber.java:252)

at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1644)

at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:257)

at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228)

at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313)

at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:207)

at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)

at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)

at scala.Option.getOrElse(Option.scala:120)

at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)

at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)

at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)

at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)

at scala.Option.getOrElse(Option.scala:120)

at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)

at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)

at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)

at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)

at scala.Option.getOrElse(Option.scala:120)

at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)

at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)

at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)

at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)

at scala.Option.getOrElse(Option.scala:120)

at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)

at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)

at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)

at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)

at scala.Option.getOrElse(Option.scala:120)

at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)

at org.apache.spark.Partitioner$.defaultPartitioner(Partitioner.scala:65)

at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKey$3.apply(PairRDDFunctions.scala:290)

at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKey$3.apply(PairRDDFunctions.scala:290)

at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)

at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)

at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)

at org.apache.spark.rdd.PairRDDFunctions.reduceByKey(PairRDDFunctions.scala:289)

at main.scala.SogouResult$.main(SogouResult.scala:19)

at main.scala.SogouResult.main(SogouResult.scala)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)

Caused by: java.net.ConnectException: Connection refused: no further information

at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)

at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)

at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)

at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)

at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)

at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)

at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)

at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)

at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)

at org.apache.hadoop.ipc.Client.call(Client.java:1438)

... 61 more

16/09/16 12:00:50 INFO SparkContext: Invoking stop() from shutdown hook

16/09/16 12:00:50 INFO SparkUI: Stopped Spark web UI at http://192.168.0.2:4040

16/09/16 12:00:50 INFO DAGScheduler: Stopping DAGScheduler

16/09/16 12:00:51 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!

16/09/16 12:00:51 INFO MemoryStore: MemoryStore cleared

16/09/16 12:00:51 INFO BlockManager: BlockManager stopped

16/09/16 12:00:51 INFO BlockManagerMaster: BlockManagerMaster stopped

16/09/16 12:00:51 INFO SparkContext: Successfully stopped SparkContext

16/09/16 12:00:51 INFO ShutdownHookManager: Shutdown hook called

16/09/16 12:00:51 INFO ShutdownHookManager: Deleting directory C:\Users\danger\AppData\Local\Temp\spark-84f74e01-9ea2-437c-b532-a5cfec898bc8


Process finished with exit code 1



Half a day gone and still stuck; it felt like the missing winutils.exe was to blame.

So I grabbed one from the web and dropped it into hadoop/bin.
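If setting a system-wide HADOOP_HOME is inconvenient, the same thing can be done from the driver code. A minimal sketch, assuming winutils.exe was unpacked under D:\hadoop-2.6.4\bin (that path is an assumption, not something this article states):

// Tell Hadoop where to find bin\winutils.exe; must run before the first
// SparkContext / FileSystem call. Adjust the (assumed) path to your own layout.
System.setProperty("hadoop.home.dir", "D:\\hadoop-2.6.4")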


After re-running, that error was gone, but the connection to HDFS still failed.

telnet 192.168.0.3 9000 also failed to connect.

So that is where the problem must be.


The Hadoop configuration files were using hostnames,

so I changed all of them to IP addresses.
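Before re-running, a quick stand-alone check that the NameNode RPC port is reachable from Windows can save another round trip. A small sketch (host and port match this article's cluster; the object name is just an example):

import java.net.{InetSocketAddress, Socket}

object NameNodeCheck {
  def main(args: Array[String]): Unit = {
    val socket = new Socket()
    try {
      // Same address the job uses in fs.defaultFS; 3-second connect timeout.
      socket.connect(new InetSocketAddress("192.168.0.3", 9000), 3000)
      println("NameNode port 9000 is reachable")
    } catch {
      case e: Exception => println(s"Cannot reach NameNode: ${e.getMessage}")
    } finally {
      socket.close()
    }
  }
}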

"C:\Program Files\Java\jdk1.7.0_79\bin\java" -Didea.launcher.port=7536 -Didea.launcher.bin.path=D:\Java\IntelliJ\bin -Dfile.encoding=UTF-8 -classpath "C:\Program Files\Java\jdk1.7.0_79\jre\lib\charsets.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\deploy.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\ext\access-bridge-64.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\ext\dnsns.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\ext\jaccess.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\ext\localedata.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\ext\sunec.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\ext\sunjce_provider.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\ext\sunmscapi.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\ext\zipfs.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\javaws.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\jce.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\jfr.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\jfxrt.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\jsse.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\management-agent.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\plugin.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\resources.jar;C:\Program Files\Java\jdk1.7.0_79\jre\lib\rt.jar;D:\scalasrc\HdfsTest\target\classes;D:\scalasrc\lib\datanucleus-core-3.2.10.jar;D:\scalasrc\lib\datanucleus-rdbms-3.2.9.jar;D:\scalasrc\lib\spark-1.5.0-yarn-shuffle.jar;D:\scalasrc\lib\datanucleus-api-jdo-3.2.6.jar;D:\scalasrc\lib\spark-assembly-1.5.0-hadoop2.6.0.jar;D:\scalasrc\lib\spark-examples-1.5.0-hadoop2.6.0.jar;D:\Java\scala210\lib\scala-actors-migration.jar;D:\Java\scala210\lib\scala-actors.jar;D:\Java\scala210\lib\scala-library.jar;D:\Java\scala210\lib\scala-reflect.jar;D:\Java\scala210\lib\scala-swing.jar;D:\Java\IntelliJ\lib\idea_rt.jar" com.intellij.rt.execution.application.AppMain main.scala.SogouResult hdfs://192.168.0.3:9000/input/SogouQ1 hdfs://192.168.0.3:9000/output/sogou1

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties

SLF4J: Class path contains multiple SLF4J bindings.

SLF4J: Found binding in [jar:file:/D:/scalasrc/lib/spark-assembly-1.5.0-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in [jar:file:/D:/scalasrc/lib/spark-examples-1.5.0-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.

SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

16/09/16 14:04:45 INFO SparkContext: Running Spark version 1.5.0

16/09/16 14:04:46 INFO SecurityManager: Changing view acls to: danger

16/09/16 14:04:46 INFO SecurityManager: Changing modify acls to: danger

16/09/16 14:04:46 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(danger); users with modify permissions: Set(danger)

16/09/16 14:04:47 INFO Slf4jLogger: Slf4jLogger started

16/09/16 14:04:47 INFO Remoting: Starting remoting

16/09/16 14:04:47 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@192.168.0.2:51172]

16/09/16 14:04:47 INFO Utils: Successfully started service 'sparkDriver' on port 51172.

16/09/16 14:04:47 INFO SparkEnv: Registering MapOutputTracker

16/09/16 14:04:47 INFO SparkEnv: Registering BlockManagerMaster

16/09/16 14:04:47 INFO DiskBlockManager: Created local directory at C:\Users\danger\AppData\Local\Temp\blockmgr-087e9166-2258-4f45-b449-d184c92702a3

16/09/16 14:04:47 INFO MemoryStore: MemoryStore started with capacity 481.1 MB

16/09/16 14:04:47 INFO HttpFileServer: HTTP File server directory is C:\Users\danger\AppData\Local\Temp\spark-0d6662f5-0bfa-4e6f-a256-c97bc6ce5f47\httpd-a2355600-9a68-417d-bd52-2ccdcac7bb13

16/09/16 14:04:47 INFO HttpServer: Starting HTTP Server

16/09/16 14:04:48 INFO Utils: Successfully started service 'HTTP file server' on port 51173.

16/09/16 14:04:48 INFO SparkEnv: Registering OutputCommitCoordinator

16/09/16 14:04:48 INFO Utils: Successfully started service 'SparkUI' on port 4040.

16/09/16 14:04:48 INFO SparkUI: Started SparkUI at http://192.168.0.2:4040

16/09/16 14:04:48 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.

16/09/16 14:04:48 INFO Executor: Starting executor ID driver on host localhost

16/09/16 14:04:48 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 51192.

16/09/16 14:04:48 INFO NettyBlockTransferService: Server created on 51192

16/09/16 14:04:48 INFO BlockManagerMaster: Trying to register BlockManager

16/09/16 14:04:48 INFO BlockManagerMasterEndpoint: Registering block manager localhost:51192 with 481.1 MB RAM, BlockManagerId(driver, localhost, 51192)

16/09/16 14:04:48 INFO BlockManagerMaster: Registered BlockManager

16/09/16 14:04:49 INFO MemoryStore: ensureFreeSpace(157320) called with curMem=0, maxMem=504511856

16/09/16 14:04:49 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 153.6 KB, free 481.0 MB)

16/09/16 14:04:49 INFO MemoryStore: ensureFreeSpace(14301) called with curMem=157320, maxMem=504511856

16/09/16 14:04:49 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 14.0 KB, free 481.0 MB)

16/09/16 14:04:49 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:51192 (size: 14.0 KB, free: 481.1 MB)

16/09/16 14:04:49 INFO SparkContext: Created broadcast 0 from textFile at SogouResult.scala:18

16/09/16 14:04:50 WARN : Your hostname, danger-PC resolves to a loopback/non-reachable address: fe80:0:0:0:0:5efe:ac1b:2301%24, but we couldn't find any external IP address!

16/09/16 14:04:52 INFO FileInputFormat: Total input paths to process : 1

16/09/16 14:04:52 INFO SparkContext: Starting job: sortByKey at SogouResult.scala:19

16/09/16 14:04:52 INFO DAGScheduler: Registering RDD 4 (map at SogouResult.scala:19)

16/09/16 14:04:52 INFO DAGScheduler: Got job 0 (sortByKey at SogouResult.scala:19) with 2 output partitions

16/09/16 14:04:52 INFO DAGScheduler: Final stage: ResultStage 1(sortByKey at SogouResult.scala:19)

16/09/16 14:04:52 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 0)

16/09/16 14:04:52 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 0)

16/09/16 14:04:52 INFO DAGScheduler: Submitting ShuffleMapStage 0 (MapPartitionsRDD[4] at map at SogouResult.scala:19), which has no missing parents

16/09/16 14:04:52 INFO MemoryStore: ensureFreeSpace(4208) called with curMem=171621, maxMem=504511856

16/09/16 14:04:52 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 4.1 KB, free 481.0 MB)

16/09/16 14:04:52 INFO MemoryStore: ensureFreeSpace(2347) called with curMem=175829, maxMem=504511856

16/09/16 14:04:52 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.3 KB, free 481.0 MB)

16/09/16 14:04:52 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on localhost:51192 (size: 2.3 KB, free: 481.1 MB)

16/09/16 14:04:52 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:861

16/09/16 14:04:52 INFO DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[4] at map at SogouResult.scala:19)

16/09/16 14:04:52 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks

16/09/16 14:04:52 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, ANY, 2135 bytes)

16/09/16 14:04:53 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)

16/09/16 14:04:53 INFO HadoopRDD: Input split: hdfs://192.168.0.3:9000/input/SogouQ1:0+134217728

16/09/16 14:04:53 INFO deprecation: mapred.tip.id is deprecated. Instead, use mapreduce.task.id

16/09/16 14:04:53 INFO deprecation: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id

16/09/16 14:04:53 INFO deprecation: mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap

16/09/16 14:04:53 INFO deprecation: mapred.task.partition is deprecated. Instead, use mapreduce.task.partition

16/09/16 14:04:53 INFO deprecation: mapred.job.id is deprecated. Instead, use mapreduce.job.id

16/09/16 14:04:54 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)

java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSums(IILjava/nio/ByteBuffer;ILjava/nio/ByteBuffer;IILjava/lang/String;JZ)V

at org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSums(Native Method)

at org.apache.hadoop.util.NativeCrc32.verifyChunkedSums(NativeCrc32.java:59)

at org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:301)

at org.apache.hadoop.hdfs.RemoteBlockReader2.readNextPacket(RemoteBlockReader2.java:216)

at org.apache.hadoop.hdfs.RemoteBlockReader2.read(RemoteBlockReader2.java:146)

at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:693)

at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:749)

at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:806)

at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:847)

at java.io.DataInputStream.read(DataInputStream.java:100)

at org.apache.hadoop.util.LineReader.fillBuffer(LineReader.java:180)

at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)

at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)

at org.apache.hadoop.mapred.LineRecordReader.skipUtfByteOrderMark(LineRecordReader.java:206)

at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:244)

at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:47)

at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:248)

at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:216)

at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)

at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)

at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)

at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)

at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388)

at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)

at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:203)

at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73)

at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)

at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)

at org.apache.spark.scheduler.Task.run(Task.scala:88)

at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

at java.lang.Thread.run(Thread.java:745)

16/09/16 14:04:54 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[Executor task launch worker-0,5,main]

java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSums(IILjava/nio/ByteBuffer;ILjava/nio/ByteBuffer;IILjava/lang/String;JZ)V

at org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSums(Native Method)

at org.apache.hadoop.util.NativeCrc32.verifyChunkedSums(NativeCrc32.java:59)

at org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:301)

at org.apache.hadoop.hdfs.RemoteBlockReader2.readNextPacket(RemoteBlockReader2.java:216)

at org.apache.hadoop.hdfs.RemoteBlockReader2.read(RemoteBlockReader2.java:146)

at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:693)

at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:749)

at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:806)

at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:847)

at java.io.DataInputStream.read(DataInputStream.java:100)

at org.apache.hadoop.util.LineReader.fillBuffer(LineReader.java:180)

at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)

at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)

at org.apache.hadoop.mapred.LineRecordReader.skipUtfByteOrderMark(LineRecordReader.java:206)

at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:244)

at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:47)

at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:248)

at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:216)

at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)

at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)

at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)

at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)

at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388)

at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)

at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:203)

at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73)

at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)

at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)

at org.apache.spark.scheduler.Task.run(Task.scala:88)

at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

at java.lang.Thread.run(Thread.java:745)

16/09/16 14:04:54 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, ANY, 2135 bytes)

16/09/16 14:04:54 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)

16/09/16 14:04:54 INFO SparkContext: Invoking stop() from shutdown hook

16/09/16 14:04:54 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSums(IILjava/nio/ByteBuffer;ILjava/nio/ByteBuffer;IILjava/lang/String;JZ)V

at org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSums(Native Method)

at org.apache.hadoop.util.NativeCrc32.verifyChunkedSums(NativeCrc32.java:59)

at org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:301)

at org.apache.hadoop.hdfs.RemoteBlockReader2.readNextPacket(RemoteBlockReader2.java:216)

at org.apache.hadoop.hdfs.RemoteBlockReader2.read(RemoteBlockReader2.java:146)

at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:693)

at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:749)

at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:806)

at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:847)

at java.io.DataInputStream.read(DataInputStream.java:100)

at org.apache.hadoop.util.LineReader.fillBuffer(LineReader.java:180)

at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)

at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)

at org.apache.hadoop.mapred.LineRecordReader.skipUtfByteOrderMark(LineRecordReader.java:206)

at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:244)

at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:47)

at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:248)

at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:216)

at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)

at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)

at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)

at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)

at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388)

at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)

at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:203)

at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73)

at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)

at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)

at org.apache.spark.scheduler.Task.run(Task.scala:88)

at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

at java.lang.Thread.run(Thread.java:745)


16/09/16 14:04:54 INFO HadoopRDD: Input split: hdfs://192.168.0.3:9000/input/SogouQ1:134217728+17788332

16/09/16 14:04:54 ERROR TaskSetManager: Task 0 in stage 0.0 failed 1 times; aborting job

16/09/16 14:04:54 ERROR Executor: Exception in task 1.0 in stage 0.0 (TID 1)

java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSums(IILjava/nio/ByteBuffer;ILjava/nio/ByteBuffer;IILjava/lang/String;JZ)V

at org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSums(Native Method)

at org.apache.hadoop.util.NativeCrc32.verifyChunkedSums(NativeCrc32.java:59)

at org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:301)

at org.apache.hadoop.hdfs.RemoteBlockReader2.readNextPacket(RemoteBlockReader2.java:216)

at org.apache.hadoop.hdfs.RemoteBlockReader2.read(RemoteBlockReader2.java:146)

at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:693)

at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:749)

at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:806)

at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:847)

at java.io.DataInputStream.read(DataInputStream.java:100)

at org.apache.hadoop.util.LineReader.fillBuffer(LineReader.java:180)

at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)

at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)

at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:134)

at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)

at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:239)

at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:216)

at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)

at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)

at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)

at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)

at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)

at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)

at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)

at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)

at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)

at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)

at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)

at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)

at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)

at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)

at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)

at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)

at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)

at org.apache.spark.scheduler.Task.run(Task.scala:88)

at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

at java.lang.Thread.run(Thread.java:745)

16/09/16 14:04:54 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[Executor task launch worker-1,5,main]

java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSums(IILjava/nio/ByteBuffer;ILjava/nio/ByteBuffer;IILjava/lang/String;JZ)V

at org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSums(Native Method)

at org.apache.hadoop.util.NativeCrc32.verifyChunkedSums(NativeCrc32.java:59)

at org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:301)

at org.apache.hadoop.hdfs.RemoteBlockReader2.readNextPacket(RemoteBlockReader2.java:216)

at org.apache.hadoop.hdfs.RemoteBlockReader2.read(RemoteBlockReader2.java:146)

at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:693)

at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:749)

at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:806)

at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:847)

at java.io.DataInputStream.read(DataInputStream.java:100)

at org.apache.hadoop.util.LineReader.fillBuffer(LineReader.java:180)

at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)

at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)

at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:134)

at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)

at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:239)

at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:216)

at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)

at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)

at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)

at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)

at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)

at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)

at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)

at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)

at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)

at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)

at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)

at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)

at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)

at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)

at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)

at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)

at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)

at org.apache.spark.scheduler.Task.run(Task.scala:88)

at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

at java.lang.Thread.run(Thread.java:745)

16/09/16 14:04:54 INFO TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1) on executor localhost: java.lang.UnsatisfiedLinkError (org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSums(IILjava/nio/ByteBuffer;ILjava/nio/ByteBuffer;IILjava/lang/String;JZ)V) [duplicate 1]

16/09/16 14:04:54 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 

16/09/16 14:04:54 INFO TaskSchedulerImpl: Cancelling stage 0

16/09/16 14:04:54 INFO SparkUI: Stopped Spark web UI at http://192.168.0.2:4040

16/09/16 14:04:54 INFO DAGScheduler: ShuffleMapStage 0 (map at SogouResult.scala:19) failed in 1.350 s

16/09/16 14:04:54 INFO DAGScheduler: Job 0 failed: sortByKey at SogouResult.scala:19, took 1.693803 s

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSums(IILjava/nio/ByteBuffer;ILjava/nio/ByteBuffer;IILjava/lang/String;JZ)V

at org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSums(Native Method)

at org.apache.hadoop.util.NativeCrc32.verifyChunkedSums(NativeCrc32.java:59)

at org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:301)

at org.apache.hadoop.hdfs.RemoteBlockReader2.readNextPacket(RemoteBlockReader2.java:216)

at org.apache.hadoop.hdfs.RemoteBlockReader2.read(RemoteBlockReader2.java:146)

at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:693)

at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:749)

at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:806)

at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:847)

at java.io.DataInputStream.read(DataInputStream.java:100)

at org.apache.hadoop.util.LineReader.fillBuffer(LineReader.java:180)

at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)

at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)

at org.apache.hadoop.mapred.LineRecordReader.skipUtfByteOrderMark(LineRecordReader.java:206)

at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:244)

at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:47)

at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:248)

at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:216)

at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)

at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)

at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)

at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)

at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388)

at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)

at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:203)

at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73)

at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)

at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)

at org.apache.spark.scheduler.Task.run(Task.scala:88)

at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

at java.lang.Thread.run(Thread.java:745)


Driver stacktrace:

at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1280)

at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1268)

at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1267)

at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)

at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)

at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1267)

at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)

at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)

at scala.Option.foreach(Option.scala:236)

at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:697)

at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1493)

at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1455)

at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1444)

at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)

at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567)

at org.apache.spark.SparkContext.runJob(SparkContext.scala:1813)

at org.apache.spark.SparkContext.runJob(SparkContext.scala:1826)

at org.apache.spark.SparkContext.runJob(SparkContext.scala:1839)

at org.apache.spark.SparkContext.runJob(SparkContext.scala:1910)

at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:905)

at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)

at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)

at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)

at org.apache.spark.rdd.RDD.collect(RDD.scala:904)

at org.apache.spark.RangePartitioner$.sketch(Partitioner.scala:264)

at org.apache.spark.RangePartitioner.<init>(Partitioner.scala:126)

at org.apache.spark.rdd.OrderedRDDFunctions$$anonfun$sortByKey$1.apply(OrderedRDDFunctions.scala:62)

at org.apache.spark.rdd.OrderedRDDFunctions$$anonfun$sortByKey$1.apply(OrderedRDDFunctions.scala:61)

at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)

at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)

at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)

at org.apache.spark.rdd.OrderedRDDFunctions.sortByKey(OrderedRDDFunctions.scala:61)

at main.scala.SogouResult$.main(SogouResult.scala:19)

at main.scala.SogouResult.main(SogouResult.scala)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)

Caused by: java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSums(IILjava/nio/ByteBuffer;ILjava/nio/ByteBuffer;IILjava/lang/String;JZ)V

at org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSums(Native Method)

at org.apache.hadoop.util.NativeCrc32.verifyChunkedSums(NativeCrc32.java:59)

at org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:301)

at org.apache.hadoop.hdfs.RemoteBlockReader2.readNextPacket(RemoteBlockReader2.java:216)

at org.apache.hadoop.hdfs.RemoteBlockReader2.read(RemoteBlockReader2.java:146)

at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:693)

at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:749)

at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:806)

at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:847)

at java.io.DataInputStream.read(DataInputStream.java:100)

at org.apache.hadoop.util.LineReader.fillBuffer(LineReader.java:180)

at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)

at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)

at org.apache.hadoop.mapred.LineRecordReader.skipUtfByteOrderMark(LineRecordReader.java:206)

at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:244)

at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:47)

at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:248)

at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:216)

at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)

at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)

at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)

at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)

at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388)

at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)

at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:203)

at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73)

at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)

at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)

at org.apache.spark.scheduler.Task.run(Task.scala:88)

at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

at java.lang.Thread.run(Thread.java:745)

16/09/16 14:04:54 INFO DAGScheduler: Stopping DAGScheduler

16/09/16 14:04:54 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!

16/09/16 14:04:54 INFO MemoryStore: MemoryStore cleared

16/09/16 14:04:54 INFO BlockManager: BlockManager stopped

16/09/16 14:04:54 INFO BlockManagerMaster: BlockManagerMaster stopped

16/09/16 14:04:54 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!

16/09/16 14:04:54 INFO SparkContext: Successfully stopped SparkContext

16/09/16 14:04:54 INFO ShutdownHookManager: Shutdown hook called

16/09/16 14:04:54 INFO ShutdownHookManager: Deleting directory C:\Users\danger\AppData\Local\Temp\spark-0d6662f5-0bfa-4e6f-a256-c97bc6ce5f47

16/09/16 14:04:54 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.

16/09/16 14:04:54 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.


Process finished with exit code 50



Heh, at least that showed some progress:

16/09/16 14:04:54 INFO HadoopRDD: Input split: hdfs://192.168.0.3:9000/input/SogouQ1:134217728+17788332

16/09/16 14:04:54 ERROR TaskSetManager: Task 0 in stage 0.0 failed 1 times; aborting job


Following the error further:

Caused by: java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSums(IILjava/nio/ByteBuffer;ILjava/nio/ByteBuffer;IILjava/lang/String;JZ)V


Searching Baidu for this problem led to:

http://blog.csdn.net/glad_xiao/article/details/48825391


It points to the official issue tracker entry, HADOOP-11064:

[Screenshot: description from HADOOP-11064]

From the description it is clear that the error comes from a mismatch between the Spark build and the Hadoop version; it typically hits people who run a pre-built Spark binary downloaded from the Spark website.

So there are two ways to fix it:
1. Re-download a Spark binary pre-built against the matching Hadoop version and reconfigure it.
2. Download the Spark source and build it against the Hadoop version already installed (Spark is, after all, much easier to configure than Hadoop). A small native-library diagnostic is sketched right after this list.
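One quick way to see whether any native Hadoop library (hadoop.dll on Windows) is being picked up at all is to ask NativeCodeLoader directly. This is only a hint and not from the article: a hadoop.dll built for the wrong Hadoop version can still load and then fail later with exactly this UnsatisfiedLinkError.

import org.apache.hadoop.util.NativeCodeLoader

object NativeLibCheck {
  def main(args: Array[String]): Unit = {
    // true only means some native hadoop library was found on java.library.path;
    // it does not prove the library matches the Hadoop jars on the classpath.
    println("native hadoop loaded: " + NativeCodeLoader.isNativeCodeLoaded)
    println("java.library.path = " + System.getProperty("java.library.path"))
  }
}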

I then found another article:

http://www.cnblogs.com/marost/p/4372778.html

It mentions that Hadoop 2.6.4 and later are not compatible with earlier versions, so I went ahead and downloaded from CSDN:

http://download.csdn.net/detail/ylhlly/9485201

a Hadoop 2.6 add-on package for 64-bit Windows (hadoop.dll, winutils.exe).

I replaced the files and ran again.

It failed again:


Exception in thread "main" org.apache.hadoop.security.AccessControlException: Permission denied: user=danger, access=WRITE, inode="/output":dyq:supergroup:drwxr-xr-x

at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:271)

at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:257)

at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:238)

at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:179)

at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6545)

at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6527)

at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:6479)

at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4290)

at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4260)


More Baidu searching. First, I tried opening up the HDFS directory permissions:


dyq@ubuntu:/opt/hadoop-2.6.4$ hadoop fs -chmod 777 /input

dyq@ubuntu:/opt/hadoop-2.6.4$ hadoop fs -chmod 777 /output


That did not help; keep digging.


http://www.cnblogs.com/fang-s/p/3777784.html


The referenced post hits the same permission problem when writing files to HDFS from a local Eclipse client.

Fix: add the following to hdfs-site.xml (an alternative that keeps permission checking on is sketched after this snippet):


<property>
  <name>dfs.permissions</name>
  <value>false</value>
  <description>
    If "true", enable permission checking in HDFS.
    If "false", permission checking is turned off,
    but all other behavior is unchanged.
    Switching from one parameter value to the other does not change the mode,
    owner or group of files or directories.
  </description>
</property>
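Disabling dfs.permissions works, but it turns off permission checking for every client. A less invasive alternative (a sketch, not what this article did) is to have the Windows client identify itself as the HDFS user that owns the target directory; the user name "dyq" is taken from the error message above:

// Hadoop's client-side login falls back to the HADOOP_USER_NAME property/environment
// variable when there is no Kerberos login, so the job then writes /output as "dyq".
// Set this before the first SparkContext / FileSystem access.
System.setProperty("HADOOP_USER_NAME", "dyq")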



After the change, I restarted HDFS and ran the job again,

and finally saw:

Process finished with exit code 0

http://192.168.0.3:50070/explorer.html#/output

Checking the Hadoop file system there, the sogou2 output directory showed up as expected.
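The same check can be done programmatically with the HDFS client API instead of the web explorer. A small sketch (the object name is just an example; the address matches this setup):

import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object ListOutput {
  def main(args: Array[String]): Unit = {
    // Connect directly to the NameNode and list the job's output directory.
    val fs = FileSystem.get(new URI("hdfs://192.168.0.3:9000"), new Configuration())
    fs.listStatus(new Path("/output")).foreach(status => println(status.getPath))
    fs.close()
  }
}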


Ha, a whole day to get this working.


Big data clearly does not come cheap.
