
How to Run a WordCount Project on the Spark Platform from the Command Line

Published: 2021-12-17 09:52:52  Source: 億速云  Reads: 181  Author: 柒染  Category: Big Data

How do you run a WordCount project on the Spark platform from the command line? This article analyzes the question and walks through the answer step by step, in the hope of helping readers facing the same problem find a simple, workable approach.

Created by Wang, Jerry, last modified on Sep 22, 2015

Running in single-machine mode, i.e. local mode

Running in local mode is simple. Assuming the current directory is $SPARK_HOME, just run:
MASTER=local bin/spark-shell
Setting "MASTER=local" tells the shell to run in single-machine mode.
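To double-check which mode the shell ended up in, you can query the SparkContext (sc) that spark-shell creates for you; sc.master is a standard SparkContext accessor, though the exact string it returns depends on how the shell was launched:

scala> sc.master
res0: String = local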
scala> val textFile = sc.textFile("README.md")
or: val textFile = sc.textFile("jerry.test")
15/08/08 19:14:32 INFO MemoryStore: ensureFreeSpace(182712) called with curMem=664070, maxMem=278302556
15/08/08 19:14:32 INFO MemoryStore: Block broadcast_7 stored as values in memory (estimated size 178.4 KB, free 264.6 MB)
15/08/08 19:14:32 INFO MemoryStore: ensureFreeSpace(17237) called with curMem=846782, maxMem=278302556
15/08/08 19:14:32 INFO MemoryStore: Block broadcast_7_piece0 stored as bytes in memory (estimated size 16.8 KB, free 264.6 MB)
15/08/08 19:14:32 INFO BlockManagerInfo: Added broadcast_7_piece0 in memory on localhost:37219 (size: 16.8 KB, free: 265.3 MB)
15/08/08 19:14:32 INFO SparkContext: Created broadcast 7 from textFile at <console>:21
textFile: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[12] at textFile at <console>:21
then: textFile.filter(_.contains("Spark")).count
or: textFile.flatMap(_.split(" ")).map((_, 1))
(count is an action, so it triggers the job whose log output follows.)
15/08/08 19:16:27 INFO FileInputFormat: Total input paths to process : 1
15/08/08 19:16:27 INFO SparkContext: Starting job: count at <console>:24
15/08/08 19:16:27 INFO DAGScheduler: Got job 0 (count at <console>:24) with 1 output partitions (allowLocal=false)
15/08/08 19:16:27 INFO DAGScheduler: Final stage: ResultStage 0(count at <console>:24)
15/08/08 19:16:27 INFO DAGScheduler: Parents of final stage: List()
15/08/08 19:16:27 INFO DAGScheduler: Missing parents: List()
15/08/08 19:16:27 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[2] at filter at <console>:24), which has no missing parents
15/08/08 19:16:27 INFO MemoryStore: ensureFreeSpace(3184) called with curMem=156473, maxMem=278302556
15/08/08 19:16:27 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 3.1 KB, free 265.3 MB)
15/08/08 19:16:27 INFO MemoryStore: ensureFreeSpace(1855) called with curMem=159657, maxMem=278302556
15/08/08 19:16:27 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 1855.0 B, free 265.3 MB)
15/08/08 19:16:27 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on localhost:42648 (size: 1855.0 B, free: 265.4 MB)
15/08/08 19:16:27 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:874
15/08/08 19:16:27 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[2] at filter at <console>:24)
15/08/08 19:16:27 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
15/08/08 19:16:27 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, PROCESS_LOCAL, 1415 bytes)
15/08/08 19:16:27 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
15/08/08 19:16:27 INFO HadoopRDD: Input split: file:/root/devExpert/spark-1.4.1/README.md:0+3624
15/08/08 19:16:27 INFO deprecation: mapred.tip.id is deprecated. Instead, use mapreduce.task.id
15/08/08 19:16:27 INFO deprecation: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
15/08/08 19:16:27 INFO deprecation: mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap
15/08/08 19:16:27 INFO deprecation: mapred.task.partition is deprecated. Instead, use mapreduce.task.partition
15/08/08 19:16:27 INFO deprecation: mapred.job.id is deprecated. Instead, use mapreduce.job.id
15/08/08 19:16:27 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 1830 bytes result sent to driver
15/08/08 19:16:27 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 80 ms on localhost (1/1)
15/08/08 19:16:27 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
15/08/08 19:16:27 INFO DAGScheduler: ResultStage 0 (count at <console>:24) finished in 0.093 s
15/08/08 19:16:27 INFO DAGScheduler: Job 0 finished: count at <console>:24, took 0.176689 s
res0: Long = 19
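Note that the job above only counts lines containing "Spark". To finish the WordCount itself, the (word, 1) pairs produced by flatMap/map still need to be summed per key. A minimal sketch continuing from the same textFile RDD (reduceByKey and collect are standard RDD operations; the output order is arbitrary):

scala> val counts = textFile.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)
scala> counts.collect().foreach(println)

Each printed element is a (word, count) pair; the actual values depend on the input file.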


That concludes the answer to how to run a WordCount project on the Spark platform from the command line. Hopefully the material above is of some help; if you still have questions, you can follow the 億速云 industry news channel to learn more.


