Running EMQ as a manually joined cluster with Nginx + Docker

Published: 2020-08-08 11:40:27  Source: ITPUB Blog  Views: 198  Author: emqx  Category: Internet Technology

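While supporting customers, the EMQ X team came across a customer who runs an EMQ cluster with Nginx as the load balancer and Docker containers that join the cluster manually. The main steps are recorded here.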

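Requirements

  • Use Nginx as the reverse proxy
  • The addresses of the servers Nginx proxies to must be allocated in advance
  • Run EMQ in Docker containers
  • EMQ restarts automatically
  • EMQ rejoins the cluster automatically after a restart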

Configuration

Nginx configuration

$ cat /etc/nginx/tcpstream.conf
## tcp LB  and SSL passthrough for backend ##
stream {
    upstream mqtt_broker {
        server 127.0.0.1:21871; #max_fails=5 fail_timeout=30s;
        server 127.0.0.1:21872; #max_fails=5 fail_timeout=30s;
        server 127.0.0.1:21873; #max_fails=5 fail_timeout=30s;
        server 127.0.0.1:21874; #max_fails=5 fail_timeout=30s;
        server 127.0.0.1:21875; #max_fails=5 fail_timeout=30s;
        server 127.0.0.1:21881; #max_fails=5 fail_timeout=30s;
        server 127.0.0.1:21891; #max_fails=5 fail_timeout=30s;
        server 127.0.0.1:21882; #max_fails=5 fail_timeout=30s;
        server 127.0.0.1:21892; #max_fails=5 fail_timeout=30s;
        server 127.0.0.1:21883; #max_fails=5 fail_timeout=30s;
        server 127.0.0.1:21893; #max_fails=5 fail_timeout=30s;
        server 127.0.0.1:21884; #max_fails=5 fail_timeout=30s;
        server 127.0.0.1:21894; #max_fails=5 fail_timeout=30s;
        server 127.0.0.1:21885; #max_fails=5 fail_timeout=30s;
        server 127.0.0.1:21895; #max_fails=5 fail_timeout=30s;
    }
    log_format basic '$proxy_protocol_addr - $remote_addr [$time_local] '
                     '$protocol $status $bytes_sent $bytes_received '
                     '$session_time "$upstream_addr" '
                     '"$upstream_bytes_sent" "$upstream_bytes_received" "$upstream_connect_time"';
    access_log /var/log/nginx/access.log basic;
    error_log  /var/log/nginx/error.log;
    server {
        listen 8884 ssl; # proxy_protocol;
        proxy_next_upstream on;
        #proxy_bind $remote_addr transparent;
        proxy_ssl off;
        proxy_pass mqtt_broker;
        proxy_protocol on;
        #ssl_on;
        # adding some extra proxy settings
        proxy_timeout 350s;
        #proxy_buffer_size 128k;
        #ssl_certificate /etc/nginx/certs/solace.pem;
        #ssl_certificate_key /etc/nginx/certs/solace.pem;
        ssl_certificate /etc/nginx/certs/cert.pem;
        ssl_certificate_key /etc/nginx/certs/key.pem;
        #ssl_verify_client off;
        ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
        ssl_ciphers HIGH:!aNULL:!MD5;
    }
}
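
Because the upstream list has to be written out before any container exists, it can help to generate the server lines from the port scheme rather than type them by hand. The following is a minimal sketch, not part of the customer's setup: gen_upstream_servers is a hypothetical helper, and its default prefixes (2187, 2188, 2189) are simply read off the upstream block above, with the instance number appended as the last digit.

```shell
# Print one Nginx "server" line per instance and port prefix.
# Usage: gen_upstream_servers COUNT [PREFIX...]
# gen_upstream_servers is a hypothetical helper, not part of Nginx or EMQ.
gen_upstream_servers() {
    local count="$1"
    shift
    # Default prefixes mirror the 2187x/2188x/2189x ports in the config above.
    [ "$#" -eq 0 ] && set -- 2187 2188 2189
    local inst=1 prefix
    while [ "$inst" -le "$count" ]; do
        for prefix in "$@"; do
            echo "        server 127.0.0.1:${prefix}${inst};"
        done
        inst=$((inst + 1))
    done
}

# Emit the 15 server lines for five instances.
gen_upstream_servers 5
```

Passing different prefixes (for example gen_upstream_servers 5 6187 6188 6189) produces the list for another port scheme; the output can be pasted into the upstream block.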

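Docker configuration

The customer builds their own Docker image rather than using the official image provided by EMQ.

The directory containing the Dockerfile looks like this: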

$ ll /opt/Docker/
total 28
-rw-r--r--  1 alexeyp emq      620 Oct 22 17:26 Dockerfile
lrwxrwxrwx  1 alexeyp emq       13 Oct 24 13:59 emqttd -> emqttd.2.3.11
drwxr-xr-x 10 alexeyp emq      110 Oct 24 14:27 emqttd.2.3.11
-rwxr-xr-x  1 alexeyp emq     3463 Oct 26 05:03 StartEmqInstance.sh
-rwxr-xr-x  1 alexeyp alexeyp  270 Oct 25 10:46 status.sh

Dockerfile:

$ cat Dockerfile
FROM centos:latest
RUN yum -y update
EXPOSE 60000-65000
WORKDIR /opt/emqttd
ADD ./emqttd /opt/emqttd
ADD ./vsparc.rpm /tmp/vsparc.rpm
ADD ./StartEmqInstance.sh /opt/emqttd/StartEmqInstance.sh
RUN yum install -y epel-release
RUN yum install -y which less sed net-tools telnet gtest /tmp/vsparc.rpm
ENV TZ Australia/Melbourne
CMD bash /opt/emqttd/StartEmqInstance.sh && bash

As shown, the container runs a script named StartEmqInstance.sh when it starts. The script reads:

$ cat StartEmqInstance.sh
#!/bin/bash
DIR=$(dirname $0)
HOSTNAME=$(hostname -s)
function adjust_instance()
{
    local INST=$1
    local INST_ROOT=$2
    cat $INST_ROOT/etc/emq.conf | \
       sed -re "s/^node\.name\s*=.*$/node.name = emq$INST@127.0.0.1/" | \
       #sed -re "s/^cluster\.name\s*=.*$/cluster.name = $HOSTNAME/" | \
       sed -re "s/^listener\.tcp\.external\s*=.*$/listener.tcp.external = 0.0.0.0:6188$INST/" | \
       sed -re "s/^listener\.tcp\.external1\s*=.*$/listener.tcp.external1 = 0.0.0.0:6189$INST/" | \
       sed -re "s/^listener\.tcp\.external2\s*=.*$/listener.tcp.external2 = 0.0.0.0:6187$INST/" | \
       sed -re "s/^listener\.tcp\.internal\s*=.*$/listener.tcp.internal = 127.0.0.1:6298$INST/" | \
       sed -re "s/^listener\.ssl\.external\s*=.*$/listener.ssl.external = 6288$INST/" | \
       sed -re "s/^listener\.ws\.external\s*=.*$/listener.ws.external = 6208$INST/" | \
       sed -re "s/^listener\.wss\.external\s*=.*$/listener.wss.external = 6308$INST/" | \
       sed -re "s/^listener\.api\.mgmt\s*=.*$/listener.api.mgmt = 6408$INST/" | \
       sed -re "s/^(##\s)?listener\.tcp\.external\.proxy_protocol\s=.*$/listener.tcp.external.proxy_protocol = on/" | \
       sed -re "s/^(##\s)?listener\.tcp\.external1\.proxy_protocol\s=.*$/listener.tcp.external1.proxy_protocol = on/" | \
       sed -re "s/^(##\s)?listener\.tcp\.external2\.proxy_protocol\s=.*$/listener.tcp.external2.proxy_protocol = on/" | \
       sed -re "s/^(##\s)?listener\.tcp\.external\.proxy_protocol_timeout\s=.*$/listener.tcp.external.proxy_protocol_timeout = 30s/" | \
       sed -re "s/^(##\s)?listener\.tcp\.external1\.proxy_protocol_timeout\s=.*$/listener.tcp.external1.proxy_protocol_timeout = 30s/" | \
       sed -re "s/^(##\s)?listener\.tcp\.external2\.proxy_protocol_timeout\s=.*$/listener.tcp.external2.proxy_protocol_timeout = 30s/" | \
       sed -re "s/^(##\s)?node.dist_listen_min\s*=.*$/node.dist_listen_min = 6000$INST/" | \
       sed -re "s/^(##\s)?node.dist_listen_max\s*=.*$/node.dist_listen_max = 6000$INST/" | \
       cat - > $INST_ROOT/etc/emq.conf.new
    mv $INST_ROOT/etc/emq.conf.new $INST_ROOT/etc/emq.conf
}
function cluster_instance()
{
    local INST=$1
    for DEST in 1 2 3 4 5; do
        if [ $DEST == $INST ]; then
            continue;
        fi
        DEST_NODE="emq$DEST@127.0.0.1"
        RESULT=$(/opt/emqttd/bin/emqttd_ctl cluster join $DEST_NODE 2>&1)
        echo "$RESULT"
        echo "$RESULT" | grep -E 'successfully|already' > /dev/null
        RC=$?
        [ $RC == 0 ] && break
    done
}
cd "$DIR"
if [ "$EMQ_INSTANCE_NUMBER" == "" ]; then
    echo "Environment variable EMQ_INSTANCE_NUMBER(1..10) is not set."
    echo "eMQ instance name is not configured."
    exit 1
else
    adjust_instance $EMQ_INSTANCE_NUMBER $DIR
fi
function run_application()
{
    local CMD="$1"
    local RC=1
    while [ $RC != 0 ]; do
        $CMD
        RC=$?
        echo "### Exited: $CMD"
        echo "### rc = $RC"
        #[ $RC != 0 ] && sleep 3
        RC=1
    done
    echo "### Done: $CMD"
}
function start_node()
{
    bin/emqttd start
    STARTED=0
    while [ $STARTED == 0 ]; do
        sleep 1
        /opt/emqttd/bin/emqttd_ctl status | grep "is running"
        [ $? == 0 ] && break
    done
    cluster_instance $EMQ_INSTANCE_NUMBER > /tmp/cluster_instance.log
}
start_node
sleep 5
run_application "/usr/local/bin/emqtt-stats-collector" &
#wait
IDLE_TIME=0
while [[ $IDLE_TIME -lt 5 ]]
do
    IDLE_TIME=$((IDLE_TIME+1))
    if [[ ! -z "$( /opt/emqttd/bin/emqttd_ctl status|grep 'is running'|awk '{print $1}')" ]]; then
        IDLE_TIME=0
    else
        echo "['$(date -u +"%Y-%m-%dT%H:%M:%SZ")']:emqttd not running, waiting for recovery in $((60-IDLE_TIME*5)) seconds"
    fi
    sleep 5
done
echo "['$(date -u +"%Y-%m-%dT%H:%M:%SZ")']:emqttd exit abnormally"
exit 1

The script is fairly long and somewhat complex; it needs to be read together with the start.sh script and etc/emq.conf.

$ cat start.sh
#!/bin/bash
for INST in 1 2 3 4 5
do
    docker ps | grep -E "\sinstance_$INST$"
    if [ $? != 0 ]; then
        #docker run -itd --ulimit nofile=1048576 --restart=always -v /opt/Docker/emqtt/emq$INST/data/mnesia:/opt/emqttd/data/mnesia  -e EMQ_INSTANCE_NUMBER=$INST --name=instance_$INST --network host emq:test &
        docker run -itd --ulimit nofile=1048576 -e EMQ_INSTANCE_NUMBER=$INST --name=instance_$INST --network host emq:latest &
    fi
done
wait

EMQ configuration

The full text of etc/emq.conf is not reproduced here. The main changes are two additional TCP listener ports, and listener.tcp.external.tune_buffer is disabled.

$ cat etc/emq.conf
......
##--------------------------------------------------------------------
listener.tcp.external = 0.0.0.0:21881
listener.tcp.external.acceptors = 16
listener.tcp.external.max_clients = 512000
listener.tcp.external.access.1 = allow all
listener.tcp.external.proxy_protocol = on
listener.tcp.external.proxy_protocol_timeout = 30s
listener.tcp.external.backlog = 1024
listener.tcp.external.send_timeout = 15s
listener.tcp.external.send_timeout_close = on
## listener.tcp.external.tune_buffer = on
listener.tcp.external.nodelay = true
listener.tcp.external.reuseaddr = true
##--------------------------------------------------------------------
listener.tcp.external1 = 0.0.0.0:21891
listener.tcp.external1.acceptors = 16
listener.tcp.external1.max_clients = 512000
listener.tcp.external1.access.1 = allow all
listener.tcp.external1.proxy_protocol = on
listener.tcp.external1.proxy_protocol_timeout = 30s
listener.tcp.external1.backlog = 1024
listener.tcp.external1.send_timeout = 15s
listener.tcp.external1.send_timeout_close = on
## listener.tcp.external1.tune_buffer = on
listener.tcp.external1.nodelay = true
listener.tcp.external1.reuseaddr = true
##--------------------------------------------------------------------
listener.tcp.external2 = 0.0.0.0:21871
listener.tcp.external2.acceptors = 16
listener.tcp.external2.max_clients = 512000
listener.tcp.external2.access.1 = allow all
listener.tcp.external2.proxy_protocol = on
listener.tcp.external2.proxy_protocol_timeout = 30s
listener.tcp.external2.backlog = 1024
listener.tcp.external2.send_timeout = 15s
listener.tcp.external2.send_timeout_close = on
## listener.tcp.external2.tune_buffer = on
listener.tcp.external2.nodelay = true
listener.tcp.external2.reuseaddr = true
......
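
With three TCP listeners per node, it is easy for the ports in emq.conf and the Nginx upstream list to drift apart. As a rough aid, the sketch below extracts only the listener address lines from an emq.conf-style file so they can be compared against the upstream block. list_tcp_listeners is a hypothetical helper introduced for illustration; it relies only on the key = value layout shown above.

```shell
# Print "NAME ADDRESS" for every TCP listener in emq.conf-style input.
# Reads the files given as arguments, or standard input if there are none.
# list_tcp_listeners is a hypothetical helper, not an EMQ tool.
list_tcp_listeners() {
    # Match top-level keys such as "listener.tcp.external2 = 0.0.0.0:21871";
    # sub-keys like listener.tcp.external2.acceptors do not match.
    sed -nE 's/^listener\.tcp\.([A-Za-z0-9_]+)[[:space:]]*=[[:space:]]*(.*)$/\1 \2/p' "$@"
}

# Demo on a few lines taken from the excerpt above.
list_tcp_listeners <<'EOF'
listener.tcp.external = 0.0.0.0:21881
listener.tcp.external.acceptors = 16
## listener.tcp.external.tune_buffer = on
listener.tcp.external1 = 0.0.0.0:21891
EOF
```

On a real node the same function could be pointed at the file directly, for example list_tcp_listeners /opt/emqttd/etc/emq.conf.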

Analysis

Docker container initialization

After the Docker container is created, StartEmqInstance.sh runs adjust_instance() to change the listener ports in etc/emq.conf to the ports of the servers that Nginx proxies to:

 sed -re "s/^node\.name\s*=.*$/node.name = emq$INST@127.0.0.1/" | \
 sed -re "s/^listener\.tcp\.external\s*=.*$/listener.tcp.external = 0.0.0.0:6188$INST/" | \
 sed -re "s/^listener\.tcp\.external1\s*=.*$/listener.tcp.external1 = 0.0.0.0:6189$INST/" | \
 sed -re "s/^listener\.tcp\.external2\s*=.*$/listener.tcp.external2 = 0.0.0.0:6187$INST/" | \
 sed -re "s/^listener\.tcp\.internal\s*=.*$/listener.tcp.internal = 127.0.0.1:6298$INST/"
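
The substitutions above append the instance number to a fixed four-digit prefix, so instance 3 listens on 61883, 61893, and 61873. One consequence worth noting: a two-digit instance number would produce a value such as 618810, which is above the maximum TCP port of 65535, so the scheme only works for single-digit instances. The sketch below prints the values that would be written and rejects anything else; instance_ports is a hypothetical helper for illustration and does not touch emq.conf.

```shell
# Print the node name and listener addresses derived from an instance number.
# instance_ports is a hypothetical helper mirroring adjust_instance() above.
instance_ports() {
    local inst="$1"
    case "$inst" in
        [0-9]) ;;   # single digit: prefix + digit stays below 65536
        *)
            echo "instance number must be a single digit, got: $inst" >&2
            return 1
            ;;
    esac
    echo "node.name = emq${inst}@127.0.0.1"
    echo "listener.tcp.external = 0.0.0.0:6188${inst}"
    echo "listener.tcp.external1 = 0.0.0.0:6189${inst}"
    echo "listener.tcp.external2 = 0.0.0.0:6187${inst}"
    echo "listener.tcp.internal = 127.0.0.1:6298${inst}"
}

instance_ports 3
```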

It then forms the cluster with the join command:

function cluster_instance()
{
    local INST=$1
    for DEST in 1 2 3 4 5; do
        if [ $DEST == $INST ]; then
            continue;
        fi
        DEST_NODE="emq$DEST@127.0.0.1"
        RESULT=$(/opt/emqttd/bin/emqttd_ctl cluster join $DEST_NODE 2>&1)
        echo "$RESULT"
        echo "$RESULT" | grep -E 'successfully|already' > /dev/null
        RC=$?
        [ $RC == 0 ] && break
    done
}
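
The loop stops at the first peer that reports success or an existing membership, and it gives up silently if no peer is reachable (which is the normal case for the first instance to start). A minimal variant of the same idea is sketched below, assuming the join command can be injected so that the logic can be exercised without a running broker. JOIN_CMD and join_first_peer are names introduced for illustration; only the emqttd_ctl cluster join invocation comes from the script above.

```shell
# Command used to join a peer; override JOIN_CMD to test without a broker.
JOIN_CMD=${JOIN_CMD:-/opt/emqttd/bin/emqttd_ctl cluster join}

# Usage: join_first_peer OWN_INSTANCE PEER...
# Returns 0 after the first successful join, 1 if no peer accepted it.
join_first_peer() {
    local inst="$1" dest result
    shift
    for dest in "$@"; do
        [ "$dest" = "$inst" ] && continue     # never join ourselves
        # JOIN_CMD is intentionally unquoted so it splits into command + args.
        result=$($JOIN_CMD "emq${dest}@127.0.0.1" 2>&1)
        if echo "$result" | grep -qE 'successfully|already'; then
            echo "joined via emq${dest}@127.0.0.1"
            return 0
        fi
    done
    echo "no peer accepted the join" >&2
    return 1
}

# Example call inside the container (not executed here):
#   join_first_peer "$EMQ_INSTANCE_NUMBER" 1 2 3 4 5
```

Returning a non-zero status when every peer refuses lets the caller decide whether to retry later, which the original function leaves implicit.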

It then checks the EMQ status in a loop and exits the container once EMQ has stopped:

IDLE_TIME=0
while [[ $IDLE_TIME -lt 5 ]]
do
    IDLE_TIME=$((IDLE_TIME+1))
    if [[ ! -z "$( /opt/emqttd/bin/emqttd_ctl status|grep 'is running'|awk '{print $1}')" ]]; then
        IDLE_TIME=0
    else
        echo "['$(date -u +"%Y-%m-%dT%H:%M:%SZ")']:emqttd not running, waiting for recovery in $((60-IDLE_TIME*5)) seconds"
    fi
    sleep 5
done
echo "['$(date -u +"%Y-%m-%dT%H:%M:%SZ")']:emqttd exit abnormally"
exit 1
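
The loop above is a consecutive-failure counter: every successful status check resets IDLE_TIME to 0, and five failed checks in a row, 5 seconds apart, end the script with exit code 1. The same pattern is sketched below with the health check, threshold, and interval passed in as arguments so it can be tried without a broker. watch_broker and its parameters are hypothetical names, not part of the customer's script.

```shell
# Usage: watch_broker CHECK_CMD [MAX_FAILS] [INTERVAL_SECONDS]
# Loops while CHECK_CMD keeps succeeding; returns 1 after MAX_FAILS
# consecutive failures. watch_broker is a hypothetical helper.
watch_broker() {
    local check_cmd="$1" max_fails="${2:-5}" interval="${3:-5}"
    local fails=0
    while [ "$fails" -lt "$max_fails" ]; do
        if $check_cmd >/dev/null 2>&1; then
            fails=0                       # healthy: reset the counter
        else
            fails=$((fails + 1))
            echo "broker not running ($fails/$max_fails)"
        fi
        sleep "$interval"
    done
    return 1
}

# Example wiring for the real broker (not executed here):
#   emq_running() { /opt/emqttd/bin/emqttd_ctl status | grep -q 'is running'; }
#   watch_broker emq_running 5 5 || exit 1
```

With the defaults, the function gives up roughly 25 seconds after the broker stops responding.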

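Access

Clients connect to the Nginx listener over SSL (port 8884 in the configuration above); Nginx load-balances the connections to the EMQ nodes over plain TCP.

PS: For details on configuring Nginx as a reverse proxy for TCP and SSL, see "EMQ X message server Nginx reverse proxy".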

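Automatic restart and automatic clustering

After the container starts, the StartEmqInstance.sh script polls the EMQ status and exits the container when EMQ stops. Combined with --restart=always, this restarts the container.

EMQ stores its cluster information in data/mnesia. That directory inside the container is mapped to the host, so when the container restarts it reads the mapped host directory and rejoins the cluster automatically.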
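
Both behaviours depend on two docker run options: --restart=always and a volume that maps data/mnesia to the host. The sketch below assembles such a command for one instance, using the host path and image name that appear in start.sh. run_instance and the DRY_RUN switch are assumptions added for illustration; by default the function only prints the command instead of starting a container.

```shell
# Usage: run_instance INSTANCE_NUMBER
# Prints the docker run command by default; set DRY_RUN=0 to execute it.
# run_instance and DRY_RUN are hypothetical, added for illustration.
run_instance() {
    local inst="$1"
    # Build the command in the positional parameters to keep quoting intact.
    set -- docker run -itd \
        --ulimit nofile=1048576 \
        --restart=always \
        -v "/opt/Docker/emqtt/emq${inst}/data/mnesia:/opt/emqttd/data/mnesia" \
        -e "EMQ_INSTANCE_NUMBER=${inst}" \
        --name="instance_${inst}" \
        --network host \
        emq:latest
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

run_instance 1
```

Because the Dockerfile's CMD chains the script with "&& bash", a non-zero exit from StartEmqInstance.sh ends the container's main process, which is what lets the restart policy take effect.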

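Known issue

  • Docker's host network mode uses the host's network stack, so port conflicts are likely when other services are running on the host

Solution

  • Set /proc/sys/net/ipv4/ip_local_port_range to 1024 60000 so that the system allocates ephemeral ports only within that range, then assign EMQ's service ports to ports above 60000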
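
To check that a chosen listener port really falls outside the ephemeral range, the two numbers in ip_local_port_range can be compared against it. The sketch below does this for a range given as a string, so it runs on any machine; port_outside_ephemeral_range is a hypothetical helper, and the commands that read or change the live kernel setting are shown only as comments because changing it requires root.

```shell
# Usage: port_outside_ephemeral_range PORT "LOW HIGH"
# Succeeds when PORT is not inside the ephemeral range LOW..HIGH.
# port_outside_ephemeral_range is a hypothetical helper.
port_outside_ephemeral_range() {
    local port="$1"
    # Word-split "LOW HIGH" (space- or tab-separated) into $1 and $2.
    set -- $2
    [ "$port" -lt "$1" ] || [ "$port" -gt "$2" ]
}

# On Linux, the live range can be read or changed like this (not run here):
#   range=$(cat /proc/sys/net/ipv4/ip_local_port_range)
#   sysctl -w net.ipv4.ip_local_port_range="1024 60000"
range="1024 60000"
for port in 61881 61891 61871; do
    if port_outside_ephemeral_range "$port" "$range"; then
        echo "$port is outside the ephemeral range"
    else
        echo "$port may collide with an ephemeral port"
    fi
done
```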

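Best practice

We recommend using Kubernetes to orchestrate the Docker containers:

  • EMQ can implement automatic clustering through kube-apiserver.
  • This customer currently runs the Docker cluster on a single machine; with Kubernetes it is easy to deploy a cluster across multiple nodes.
  • A Kubernetes Deployment can monitor the state of the emqx pods and provide automatic restart, elastic scaling, and similar features.
  • Each emqx pod has its own virtual IP, so port conflicts do not occur.
  • A Kubernetes Service can satisfy the fixed-IP and load-balancing requirements. In the Service creation request, you can set the spec.clusterIP field to specify your own cluster IP address and point Nginx's proxied server at that clusterIP; the Service then handles load balancing itself.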