您好,登錄后才能下訂單哦!
背景:
為了滿足各個(gè)平臺(tái)間數(shù)據(jù)的傳輸,以及能確保歷史性和實(shí)時(shí)性。先選用kafka作為不同平臺(tái)數(shù)據(jù)傳輸?shù)闹修D(zhuǎn)站,來(lái)滿足我們對(duì)跨平臺(tái)數(shù)據(jù)發(fā)送與接收的需要。
kafka簡(jiǎn)介:
Kafka is a distributed,partitioned,replicated commit logservice。它提供了類似于JMS的特性,但是在設(shè)計(jì)實(shí)現(xiàn)上完全不同,此外它并不是JMS規(guī)范的實(shí)現(xiàn)。kafka對(duì)消息保存時(shí)根據(jù)Topic進(jìn)行歸類,發(fā)送消息者成為Producer,消息接受者成為Consumer,此外kafka集群有多個(gè)kafka實(shí)例組成,每個(gè)實(shí)例(server)成為broker。無(wú)論是kafka集群,還是producer和consumer都依賴于zookeeper來(lái)保證系統(tǒng)可用性集群保存一些meta信息。
總之:kafka做為中轉(zhuǎn)站有以下功能:
1.生產(chǎn)者(產(chǎn)生數(shù)據(jù)或者說(shuō)是從外部接收數(shù)據(jù))
2.消費(fèi)著(將接收到的數(shù)據(jù)轉(zhuǎn)花為自己所需用的格式)
環(huán)境:
1.python3.5.x
2.kafka1.4.3
3.pandas
準(zhǔn)備開(kāi)始:
1.kafka的安裝
pip install kafka-python
pip install pandas
4.kafka數(shù)據(jù)的傳輸
直接擼代碼:
# -*- coding: utf-8 -*- ''' @author: 真夢(mèng)行路 @file: kafka.py @time: 2018/9/3 10:20 ''' import sys import json import pandas as pd import os from kafka import KafkaProducer from kafka import KafkaConsumer from kafka.errors import KafkaError KAFAKA_HOST = "xxx.xxx.x.xxx" #服務(wù)器端口地址 KAFAKA_PORT = 9092 #端口號(hào) KAFAKA_TOPIC = "topic0" #topic data=pd.read_csv(os.getcwd()+'\\data\\1.csv') key_value=data.to_json() class Kafka_producer(): ''' 生產(chǎn)模塊:根據(jù)不同的key,區(qū)分消息 ''' def __init__(self, kafkahost, kafkaport, kafkatopic, key): self.kafkaHost = kafkahost self.kafkaPort = kafkaport self.kafkatopic = kafkatopic self.key = key self.producer = KafkaProducer(bootstrap_servers='{kafka_host}:{kafka_port}'.format( kafka_host=self.kafkaHost, kafka_port=self.kafkaPort) ) def sendjsondata(self, params): try: parmas_message = params #注意dumps producer = self.producer producer.send(self.kafkatopic, key=self.key, value=parmas_message.encode('utf-8')) producer.flush() except KafkaError as e: print(e) class Kafka_consumer(): def __init__(self, kafkahost, kafkaport, kafkatopic, groupid,key): self.kafkaHost = kafkahost self.kafkaPort = kafkaport self.kafkatopic = kafkatopic self.groupid = groupid self.key = key self.consumer = KafkaConsumer(self.kafkatopic, group_id=self.groupid, bootstrap_servers='{kafka_host}:{kafka_port}'.format( kafka_host=self.kafkaHost, kafka_port=self.kafkaPort) ) def consume_data(self): try: for message in self.consumer: yield message except KeyboardInterrupt as e: print(e) def sortedDictValues(adict): items = adict.items() items=sorted(items,reverse=False) return [value for key, value in items] def main(xtype, group, key): ''' 測(cè)試consumer和producer ''' if xtype == "p": # 生產(chǎn)模塊 producer = Kafka_producer(KAFAKA_HOST, KAFAKA_PORT, KAFAKA_TOPIC, key) print("===========> producer:", producer) params =key_value producer.sendjsondata(params) if xtype == 'c': # 消費(fèi)模塊 consumer = Kafka_consumer(KAFAKA_HOST, KAFAKA_PORT, KAFAKA_TOPIC, group,key) print("===========> consumer:", consumer) message = consumer.consume_data() for msg in message: msg=msg.value.decode('utf-8') python_data=json.loads(msg) ##這是一個(gè)字典 key_list=list(python_data) test_data=pd.DataFrame() for index in key_list: print(index) if index=='Month': a1=python_data[index] data1 = sortedDictValues(a1) test_data[index]=data1 else: a2 = python_data[index] data2 = sortedDictValues(a2) test_data[index] = data2 print(test_data) # print('value---------------->', python_data) # print('msg---------------->', msg) # print('key---------------->', msg.kry) # print('offset---------------->', msg.offset) if __name__ == '__main__': main(xtype='p',group='py_test',key=None) main(xtype='c',group='py_test',key=None)
向AI問(wèn)一下細(xì)節(jié)
免責(zé)聲明:本站發(fā)布的內(nèi)容(圖片、視頻和文字)以原創(chuàng)、轉(zhuǎn)載和分享為主,文章觀點(diǎn)不代表本網(wǎng)站立場(chǎng),如果涉及侵權(quán)請(qǐng)聯(lián)系站長(zhǎng)郵箱:is@yisu.com進(jìn)行舉報(bào),并提供相關(guān)證據(jù),一經(jīng)查實(shí),將立刻刪除涉嫌侵權(quán)內(nèi)容。