
How to debug and troubleshoot an IDEA WordCount jar submitted to Spark

Published: 2021-12-17 13:58:48  Source: 億速云  Author: 柒染  Category: Big Data

This article walks through how to debug and troubleshoot submitting an IDEA-built WordCount jar to Spark. The steps are fairly detailed; interested readers can follow along as a reference, and hopefully it will be helpful.

Based on:

macOS

Spark 2.4.3

(Spark is running in standalone mode; see this earlier post for the setup: http://blog.itpub.net/69908925/viewspace-2644303/ )

Scala 2.12.8 (installed locally)

IDEA 2019

1  IDEA - File - Project Structure - Library - Scala SDK

[Screenshot: Project Structure > Library > Scala SDK]

Select version 2.11.12.

[Screenshot: selecting Scala SDK 2.11.12]

The version selected here must match the Scala version Spark runs with. By default it is the locally installed Scala 2.12.8, and with that version the job fails on Spark with a main-class error.

2 Create a new project and add the dependencies to pom.xml

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.ny.service</groupId>
    <artifactId>scala517</artifactId>
    <version>1.0</version>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core -->
    <dependencies>
        <!-- https://mvnrepository.com/artifact/org.scala-lang/scala-library -->
        <!-- Change the dependencies below to your own Scala, Spark and Hadoop versions -->
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>2.11.12</version>
        </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.4.3</version>
    </dependency>
    </dependencies>
    <build>
        <!-- Main source directory; adjust to your own path. If you have test sources, also add a testSourceDirectory. -->
        <sourceDirectory>src/main/scala</sourceDirectory>
        <plugins>
            <plugin>
                <groupId>org.scala-tools</groupId>
                <artifactId>maven-scala-plugin</artifactId>
                <version>2.15.2</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>2.4.3</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <filters>
                                <filter>
                                    <artifact>*:*</artifact>
                                    <excludes>
                                        <exclude>META-INF/*.SF</exclude>
                                        <exclude>META-INF/*.DSA</exclude>
                                        <exclude>META-INF/*.RSA</exclude>
                                    </excludes>
                                </filter>
                            </filters>
                            <!--<transformers>-->
                            <!--<transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">-->
                            <!--<mainClass></mainClass>-->
                            <!--</transformer>-->
                            <!--</transformers>-->
                        </configuration>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-jar-plugin</artifactId>
                <configuration>
                    <archive>
                        <manifest>
                            <addClasspath>true</addClasspath>
                            <useUniqueVersions>false</useUniqueVersions>
                            <classpathPrefix>lib/</classpathPrefix>
                            <!-- Change to your own package.ClassName (right-click the class, then Copy Reference) -->
                            <mainClass>com.ny.service.WordCount</mainClass>
                        </manifest>
                    </archive>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>

For scala-library, choose the Scala version bundled with Spark, 2.11.12, which is also the newest version Spark 2.4.3 supports.

For the org.apache.spark dependency, likewise pick the _2.11 build (spark-core_2.11).

Otherwise you will get an error at runtime; the NoClassDefFoundError below is typical of bytecode compiled against Scala 2.12 being run on Spark's Scala 2.11:

19/05/16 10:52:03 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:60010 (size: 22.9 KB, free: 366.3 MB)

19/05/16 10:52:03 INFO SparkContext: Created broadcast 0 from textFile at WordCount.scala:18

Exception in thread "main" java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError: scala/runtime/java8/JFunction2$mcIII$sp

at com.nyc.WordCount$.main(WordCount.scala:24)

at com.nyc.WordCount.main(WordCount.scala)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

How to check which Scala version Spark ships with

Go to the jars directory:

/usr/local/opt/spark-2.4.3/jars

[Screenshot: jar directory listing showing the bundled Scala 2.11 jars]
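
Alternatively, you can confirm both versions from spark-shell; a quick check, assuming the shell is started from this same Spark installation:

// Run inside spark-shell: print the Scala and Spark versions of this installation
println(scala.util.Properties.versionString)   // e.g. "version 2.11.12"
println(sc.version)                            // e.g. "2.4.3"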

3 WordCount test program

package com.ny.service

import org.apache.spark.{SparkConf, SparkContext}

object WordCount {
  def main(args: Array[String]): Unit = {
    // 1. Create the Spark configuration
    val conf = new SparkConf().setAppName("wc")
    // 2. Create the SparkContext
    val sc = new SparkContext(conf)
    // 3. Processing logic
    // Read the input file
    val lines = sc.textFile(args(0))
    // Split each line into words
    val words = lines.flatMap(_.split(" "))
    // Map each word to (word, 1) and sum the counts per word
    val k2v = words.map((_, 1))
    val results = k2v.reduceByKey(_ + _)
    // Save the results
    results.saveAsTextFile(args(1))
    // 4. Stop the context
    sc.stop()
  }
}
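
Before packaging, it helps to sanity-check the logic inside the IDE. Below is a minimal local variant, offered only as a sketch: the local[*] master and the hard-coded paths are assumptions for local testing, not part of the cluster job above.

package com.ny.service

import org.apache.spark.{SparkConf, SparkContext}

object WordCountLocal {
  def main(args: Array[String]): Unit = {
    // local[*] runs Spark inside the IDE with all available cores (local testing only)
    val conf = new SparkConf().setAppName("wc-local").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Hypothetical local input path; replace with your own test file
    val counts = sc.textFile("src/main/resources/1test")
      .flatMap(_.split(" "))
      .map((_, 1))
      .reduceByKey(_ + _)

    // Print instead of saving, so repeated runs do not fail on an existing output directory
    counts.collect().foreach(println)
    sc.stop()
  }
}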

4 Package the jar


Copy the jar to the Spark home directory. Since Spark runs in standalone mode, the Hadoop cluster has not been started. The command below moves the thin original-*.jar produced by the shade plugin; that is sufficient here because Spark already provides spark-core and scala-library at runtime.

nancylulululu:spark-2.4.3 nancy$ mv /Users/nancy/IdeaProjects/scala517/target/original-scala517-1.0.jar wc.jar 

5 Run with spark-submit

bin/spark-submit \
--class com.ny.service.WordCount \
--master spark://localhost:7077 \
./wc.jar \
file:///usr/local/opt/spark-2.4.3/test/1test \
file:///usr/local/opt/spark-2.4.3/test/out

If you are running on Hadoop, change the file:// paths to HDFS paths (for example hdfs://<namenode>:<port>/...).

Check the output files:

nancylulululu:out nancy$ ls
_SUCCESS    part-00000    part-00001
nancylulululu:out nancy$ cat part-00000
(scala,2)
(hive,1)
(mysql,1)
(hello,5)
(java,2)
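
You can also read the results back from spark-shell instead of using cat; a small sketch, pointing at the same output path used in the submit command:

// Run in spark-shell: load the saved word counts and print them
sc.textFile("file:///usr/local/opt/spark-2.4.3/test/out").collect().foreach(println)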

That covers how to debug and troubleshoot an IDEA WordCount jar submitted to Spark. Hopefully the content above is helpful and you learned something from it. If you found the article useful, share it so more people can see it.
