
How to Use Spark to Find the Two LAC Locations Where Each Phone Stayed Longest

Published: 2021-12-17 13:47:05  Source: 億速云  Views: 115  Author: 柒染  Category: Big Data

This article walks through a Spark job that finds, for each phone number, the two LAC locations where it stayed longest. It is shared here as a reference; after reading it you should have a working understanding of the approach.

package hgs.spark.othertest

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object FindTheTop2 {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("FindTheTop2").setMaster("local[3]")
    val sc = new SparkContext(conf)
    // Each input line has the form phone,timestamp,lac,flag
    val rdd1 = sc.textFile("D:\\bs_log")

    // Map each line to (K, V), where K is "phone lac" and V is the signed timestamp, e.g.
    // (18688888888 16030401EAFB68F1E3CDF819735E1C66,-20160327082400)
    // (18688888888 16030401EAFB68F1E3CDF819735E1C66,20160327170000)
    // Records with flag 1 are negated so that summing per key yields (flag 0 time) - (flag 1 time).
    val rdd_phone_lac_time = rdd1.map(x => {
      val list = x.split(",")
      if (Integer.parseInt(list(3)) == 1)
        (list(0) + " " + list(2), -list(1).toLong)
      else
        (list(0) + " " + list(2), list(1).toLong)
    })

    // Reduce by key: add up the signed timestamps of all records that share a key.
    // Note: the total is a difference of raw timestamp numbers, not an elapsed time in seconds.
    val rdd_reduce_phone_lackey = rdd_phone_lac_time.reduceByKey((x, y) => x + y)

    // Group by phone number (the part of the key before the space), e.g.
    // (18688888888,CompactBuffer((18688888888 16030401EAFB68F1E3CDF819735E1C66,87600), (18688888888 9F36407EAD0629FC166F14DDE7970F68,51200), (18688888888 CC0710CC94ECC657A8561DE549D940E0,1300)))
    val rdd_reduce_phone_lackey_groupyed = rdd_reduce_phone_lackey.groupBy(x => x._1.split(" ")(0))

    // Take the top 2 per phone. mapValues transforms only V and keeps the original K.
    val rdd_top2 = rdd_reduce_phone_lackey_groupyed.mapValues(x => {
      x.toList.sortBy(_._2).reverse.take(2)
    })

    // The result must be joined with another dataset on the LAC field, so the key
    // "18688888888 16030401EAFB68F1E3CDF819735E1C66" is split and its second part becomes the new K, e.g.
    // (16030401EAFB68F1E3CDF819735E1C66,(18688888888,16030401EAFB68F1E3CDF819735E1C66,87600))
    val rdd_result = rdd_top2.flatMap(x => {
      x._2.map(y => {
        val li = y._1.split(" ")
        (li(1), (li(0), li(1), y._2))
      })
    })

    // This file holds the coordinates to join with the result above.
    val lati_longti = sc.textFile("D:\\lac_info", 1)

    // Map each line to (lac, (lac, second field, third field)), e.g.
    // (9F36407EAD0629FC166F14DDE7970F68,(9F36407EAD0629FC166F14DDE7970F68,116.304864,40.050645))
    val rdd_coordinate = lati_longti.map(f => {
      val li = f.split(",")
      (li(0), (li(0), li(1), li(2)))
    })

    // Join on the LAC key. Both RDDs must have the same key type, otherwise they cannot be joined.
    val join_resultWithcoordinate = rdd_coordinate.join(rdd_result)
    // println(rdd_result.collect().length)

    // Save the joined result as text files.
    join_resultWithcoordinate.saveAsTextFile("d:\\dest")
    sc.stop()
  }
}
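
The core of the job is the signed-sum trick followed by a per-phone top 2. The sketch below replays those steps with plain Scala collections instead of RDDs, so it can be run without Spark. It is only an illustration under that assumption: the object name SignedSumTop2Sketch and its helper names are hypothetical, and the input is a six-row subset of the sample data for one phone.

```scala
object SignedSumTop2Sketch {
  // Subset of the article's bs_log sample rows: phone,timestamp,lac,flag
  val lines: Seq[String] = Seq(
    "18688888888,20160327082400,16030401EAFB68F1E3CDF819735E1C66,1",
    "18688888888,20160327170000,16030401EAFB68F1E3CDF819735E1C66,0",
    "18688888888,20160327075100,9F36407EAD0629FC166F14DDE7970F68,1",
    "18688888888,20160327081300,9F36407EAD0629FC166F14DDE7970F68,0",
    "18688888888,20160327081200,CC0710CC94ECC657A8561DE549D940E0,1",
    "18688888888,20160327081900,CC0710CC94ECC657A8561DE549D940E0,0"
  )

  // Key each row by (phone, lac); rows with flag 1 get a negated timestamp.
  def signed(line: String): ((String, String), Long) = {
    val Array(phone, time, lac, flag) = line.split(",")
    val t = time.toLong
    ((phone, lac), if (flag.toInt == 1) -t else t)
  }

  // Sum the signed values per key, the local counterpart of reduceByKey(_ + _).
  def totals(rows: Seq[String]): Map[(String, String), Long] =
    rows.map(signed).groupBy(_._1).map { case (k, vs) => k -> vs.map(_._2).sum }

  // Per phone, keep the two LACs with the largest totals, largest first.
  def top2(rows: Seq[String]): Map[String, List[(String, Long)]] =
    totals(rows).toList
      .groupBy { case ((phone, _), _) => phone }
      .map { case (phone, xs) =>
        phone -> xs.map { case ((_, lac), total) => (lac, total) }.sortBy(-_._2).take(2)
      }

  def main(args: Array[String]): Unit =
    top2(lines).foreach(println)
}
```

For these six rows the totals are 87600, 6200 and 700, so the two LACs kept are 16030401EAFB68F1E3CDF819735E1C66 and 9F36407EAD0629FC166F14DDE7970F68. Keying by a (phone, lac) tuple rather than a space-joined string is a design choice of this sketch; it avoids splitting the key again later.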

Sample data

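D:\\bs_log (fields: phone number, timestamp in yyyyMMddHHmmss form, LAC id, event flag; the code negates the timestamp when the flag is 1, which suggests 1 = enter and 0 = leave)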
18688888888,20160327082400,16030401EAFB68F1E3CDF819735E1C66,1
18611132889,20160327082500,16030401EAFB68F1E3CDF819735E1C66,1
18688888888,20160327170000,16030401EAFB68F1E3CDF819735E1C66,0
18611132889,20160327075000,9F36407EAD0629FC166F14DDE7970F68,1
18688888888,20160327075100,9F36407EAD0629FC166F14DDE7970F68,1
18611132889,20160327081000,9F36407EAD0629FC166F14DDE7970F68,0
18688888888,20160327081300,9F36407EAD0629FC166F14DDE7970F68,0
18688888888,20160327175000,9F36407EAD0629FC166F14DDE7970F68,1
18611132889,20160327182000,9F36407EAD0629FC166F14DDE7970F68,1
18688888888,20160327220000,9F36407EAD0629FC166F14DDE7970F68,0
18611132889,20160327230000,9F36407EAD0629FC166F14DDE7970F68,0
18611132889,20160327180000,16030401EAFB68F1E3CDF819735E1C66,0
18611132889,20160327081100,CC0710CC94ECC657A8561DE549D940E0,1
18688888888,20160327081200,CC0710CC94ECC657A8561DE549D940E0,1
18688888888,20160327081900,CC0710CC94ECC657A8561DE549D940E0,0
18611132889,20160327082000,CC0710CC94ECC657A8561DE549D940E0,0
18688888888,20160327171000,CC0710CC94ECC657A8561DE549D940E0,1
18688888888,20160327171600,CC0710CC94ECC657A8561DE549D940E0,0
18611132889,20160327180500,CC0710CC94ECC657A8561DE549D940E0,1
18611132889,20160327181500,CC0710CC94ECC657A8561DE549D940E0,0
D:\\lac_info (fields: LAC id, longitude, latitude, and a fourth field that the code does not use)
9F36407EAD0629FC166F14DDE7970F68,116.304864,40.050645,6
CC0710CC94ECC657A8561DE549D940E0,116.303955,40.041935,6
16030401EAFB68F1E3CDF819735E1C66,116.296302,40.032296,6
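
The join step pairs each top-2 record with the coordinates of its LAC. A minimal sketch of the same inner-join semantics with plain Scala collections follows. The names LacJoinSketch, coordinates and joinWithCoordinates are hypothetical, the coordinates are parsed as Doubles (the original keeps them as strings), and the top2 values are copied from the results listed in this article for one phone.

```scala
object LacJoinSketch {
  // The three lac_info sample rows: lac,longitude,latitude,unused
  val lacInfo: Seq[String] = Seq(
    "9F36407EAD0629FC166F14DDE7970F68,116.304864,40.050645,6",
    "CC0710CC94ECC657A8561DE549D940E0,116.303955,40.041935,6",
    "16030401EAFB68F1E3CDF819735E1C66,116.296302,40.032296,6"
  )

  // Top-2 records for one phone, keyed by lac: (lac, (phone, total))
  val top2: Seq[(String, (String, Long))] = Seq(
    ("16030401EAFB68F1E3CDF819735E1C66", ("18688888888", 87600L)),
    ("9F36407EAD0629FC166F14DDE7970F68", ("18688888888", 51200L))
  )

  // Parse lac_info rows into lac -> (longitude, latitude).
  def coordinates(rows: Seq[String]): Map[String, (Double, Double)] =
    rows.map { row =>
      val f = row.split(",")
      (f(0), (f(1).toDouble, f(2).toDouble))
    }.toMap

  // Inner join on lac: a record whose lac has no coordinate entry is dropped,
  // which is also how RDD.join treats keys missing on one side.
  def joinWithCoordinates(
      stays: Seq[(String, (String, Long))],
      coords: Map[String, (Double, Double)]
  ): Seq[(String, ((Double, Double), (String, Long)))] =
    stays.flatMap { case (lac, stay) =>
      coords.get(lac).map(coord => (lac, (coord, stay)))
    }

  def main(args: Array[String]): Unit =
    joinWithCoordinates(top2, coordinates(lacInfo)).foreach(println)
}
```

Because the join is an inner join, a LAC that is missing from lac_info silently disappears from the output; a left outer join would be the alternative if such records need to be kept.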
Results:
(16030401EAFB68F1E3CDF819735E1C66,((16030401EAFB68F1E3CDF819735E1C66,116.296302,40.032296),(18688888888,16030401EAFB68F1E3CDF819735E1C66,87600)))
(16030401EAFB68F1E3CDF819735E1C66,((16030401EAFB68F1E3CDF819735E1C66,116.296302,40.032296),(18611132889,16030401EAFB68F1E3CDF819735E1C66,97500)))
(9F36407EAD0629FC166F14DDE7970F68,((9F36407EAD0629FC166F14DDE7970F68,116.304864,40.050645),(18688888888,9F36407EAD0629FC166F14DDE7970F68,51200)))
(9F36407EAD0629FC166F14DDE7970F68,((9F36407EAD0629FC166F14DDE7970F68,116.304864,40.050645),(18611132889,9F36407EAD0629FC166F14DDE7970F68,54000)))
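
The totals in these results, such as 87600, come from subtracting the raw timestamp numbers, so they are not elapsed seconds. If real durations are wanted, one option is to parse the timestamps first. The sketch below assumes the timestamps are local date-times in yyyyMMddHHmmss form; the object and method names are hypothetical and this is not part of the original job.

```scala
import java.time.{Duration, LocalDateTime}
import java.time.format.DateTimeFormatter

object StaySecondsSketch {
  // Assumption: timestamps look like 20160327082400 (yyyyMMddHHmmss, local time).
  private val fmt = DateTimeFormatter.ofPattern("yyyyMMddHHmmss")

  def parse(ts: String): LocalDateTime = LocalDateTime.parse(ts, fmt)

  // Elapsed seconds between an enter timestamp and a leave timestamp.
  def staySeconds(enter: String, leave: String): Long =
    Duration.between(parse(enter), parse(leave)).getSeconds

  // Difference of the raw numbers, which is what the signed sum computes.
  def rawDifference(enter: String, leave: String): Long =
    leave.toLong - enter.toLong

  def main(args: Array[String]): Unit = {
    val (enter, leave) = ("20160327082400", "20160327170000")
    println(rawDifference(enter, leave)) // 87600
    println(staySeconds(enter, leave))   // 30960, i.e. 8 hours 36 minutes
  }
}
```

The two measures can also rank stays differently: a stay from 08:59:00 to 09:01:00 has a raw difference of 4200 but lasts 120 seconds, while a stay from 08:00:00 to 08:30:00 has a raw difference of 3000 but lasts 1800 seconds. In a Spark job the same idea would mean converting each timestamp to epoch seconds before applying the sign.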

That covers how to use Spark to find the two LAC locations where each phone stayed longest. Hopefully the example above is a useful reference.
