您好,登錄后才能下訂單哦!
本篇內(nèi)容介紹了“PostgreSQL中citus節(jié)點(diǎn)間的網(wǎng)絡(luò)需求有哪些”的有關(guān)知識(shí),在實(shí)際案例的操作過(guò)程中,不少人都會(huì)遇到這樣的困境,接下來(lái)就讓小編帶領(lǐng)大家學(xué)習(xí)一下如何處理這些情況吧!希望大家仔細(xì)閱讀,能夠?qū)W有所成!
citus 節(jié)點(diǎn)間的網(wǎng)絡(luò)需求:
1、cn節(jié)點(diǎn)訪問(wèn)所有worker節(jié)點(diǎn)。oltp業(yè)務(wù)的訪問(wèn)較頻繁。
2、重分布數(shù)據(jù)時(shí),worker節(jié)點(diǎn)間相互訪問(wèn)。訪問(wèn)頻度不大,OLAP業(yè)務(wù)常見(jiàn),一旦有可能數(shù)據(jù)交換吞吐較大。
citus的cn節(jié)點(diǎn)連worker節(jié)點(diǎn)為有兩種模式,
一種為事務(wù)級(jí)保持連接模式(每條SQL發(fā)起時(shí)建立連接,SQL結(jié)束后釋放連接(除非在事務(wù)中,否則SQL結(jié)束立即釋放連接)。),
另一種為會(huì)話級(jí)保持連接模式(會(huì)話發(fā)起時(shí)建立連接,會(huì)話結(jié)束后釋放連接。)。
1、跑OLAP類的SQL時(shí),使用的是第一種即時(shí)連接模式(OLAP場(chǎng)景并發(fā)不高,建立連接帶來(lái)的額外開銷不明顯)
可以在worker節(jié)點(diǎn)打開參數(shù)進(jìn)行跟蹤
postgres=# show log_connections ; log_connections ----------------- on (1 row) postgres=# show log_disconnections ; log_disconnections -------------------- on (1 row)
例子,
以下兩條SQL均為即時(shí)短連接模式(Custom Scan (Citus Task-Tracker) Custom Scan (Citus Real-Time)
)。
postgres=# set citus.task_executor_type =task; ERROR: invalid value for parameter "citus.task_executor_type": "task" HINT: Available values: real-time, task-tracker. postgres=# set citus.task_executor_type ='task-tracker'; SET postgres=# explain select count(*) from pgbench_accounts ; QUERY PLAN --------------------------------------------------------------------------------------------------------------------- Aggregate (cost=0.00..0.00 rows=0 width=0) -> Custom Scan (Citus Task-Tracker) (cost=0.00..0.00 rows=0 width=0) Task Count: 128 Tasks Shown: One of 128 -> Task Node: host=172.24.211.224 port=1921 dbname=postgres -> Aggregate (cost=231.85..231.86 rows=1 width=8) -> Seq Scan on pgbench_accounts_106812 pgbench_accounts (cost=0.00..212.48 rows=7748 width=0) (8 rows) postgres=# set citus.task_executor_type ='real-time'; postgres=# explain select count(*) from pgbench_accounts ; QUERY PLAN --------------------------------------------------------------------------------------------------------------------- Aggregate (cost=0.00..0.00 rows=0 width=0) -> Custom Scan (Citus Real-Time) (cost=0.00..0.00 rows=0 width=0) Task Count: 128 Tasks Shown: One of 128 -> Task Node: host=172.24.211.224 port=1921 dbname=postgres -> Aggregate (cost=231.85..231.86 rows=1 width=8) -> Seq Scan on pgbench_accounts_106812 pgbench_accounts (cost=0.00..212.48 rows=7748 width=0) (8 rows)
2、跑OLTP查詢時(shí)(通常并發(fā)很高,前端有連接池(保持會(huì)話)),為會(huì)話級(jí)保持連接模式(Custom Scan (Citus Router)
)。
以下SQL為長(zhǎng)連接模式(不會(huì)立即釋放,而是等會(huì)再釋放,以降低高并發(fā)時(shí)連接帶來(lái)的開銷)
postgres=# explain select * from pgbench_accounts where aid=5; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------ Custom Scan (Citus Router) (cost=0.00..0.00 rows=0 width=0) Task Count: 1 Tasks Shown: All -> Task Node: host=172.24.211.224 port=1921 dbname=postgres -> Index Scan using pgbench_accounts_pkey_106836 on pgbench_accounts_106836 pgbench_accounts (cost=0.28..2.50 rows=1 width=97) Index Cond: (aid = 5) (7 rows)
看以上兩種場(chǎng)景,CITUS應(yīng)該說(shuō)設(shè)計(jì)得已經(jīng)很不錯(cuò)了。既能滿足TP也能滿足AP。
但是前面也說(shuō)了,連接保持是在事務(wù)或會(huì)話層面的,如果查詢量大,或者用戶使用了短連接,建立連接的開銷就會(huì)很凸顯。
newdb=> \c postgres postgres You are now connected to database "postgres" as user "postgres". postgres=# select * from pgbench_accounts where aid=1; aid | bid | abalance | filler -----+-----+----------+-------------------------------------------------------------------------------------- 1 | 1 | 7214 | (1 row) Time: 11.264 ms -- 包括新建連接的開銷 postgres=# select * from pgbench_accounts where aid=1; aid | bid | abalance | filler -----+-----+----------+-------------------------------------------------------------------------------------- 1 | 1 | 7214 | (1 row) Time: 0.905 ms -- 已建立連接
在worker節(jié)點(diǎn)上,部署pgbouncer,所有與worker節(jié)點(diǎn)建立的連接都通過(guò)pgbouncer連接池,以此來(lái)保持住連接,降低worker節(jié)點(diǎn)頻繁新建連接的開銷。
部署方法
pgbouncer
yum install -y pgbouncer
配置
vi /etc/pgbouncer/pgb.ini [databases] newdb = host=/tmp dbname=newdb port=1921 user=digoal pool_size=128 reserve_pool=10 [pgbouncer] logfile = /var/log/pgbouncer/pgbouncer.log pidfile = /var/run/pgbouncer/pgbouncer.pid listen_addr = 0.0.0.0 listen_port = 8001 auth_type = any auth_file = /etc/pgbouncer/userlist.txt pool_mode = session server_reset_query = DISCARD ALL max_client_conn = 5000 default_pool_size = 128 ; 最大不要超過(guò)4倍CPU數(shù)
啟動(dòng)
pgbouncer -d -u pgbouncer /etc/pgbouncer/pgb.ini
在一個(gè)citus集群中,可以同時(shí)存在直連worker或者通過(guò)pgbouncer連接worker。
不同的database可以有不同的配置。
以下例子,新建一個(gè)database,使用pgbouncer連接worker.
新建數(shù)據(jù)庫(kù),插件
su - postgres psql -c "create role digoal login;" psql -c "create database newdb;" psql -c "grant all on database newdb to digoal;" psql -U postgres newdb -c "create extension citus;"
將worker添加到集群配置,使用pgbouncer的連接端口
su - postgres -c "psql -U postgres newdb -c \"SELECT * from master_add_node('xxx.xxx.xxx.224', 8001);\"" su - postgres -c "psql -U postgres newdb -c \"SELECT * from master_add_node('xxx.xxx.xxx.230', 8001);\"" su - postgres -c "psql -U postgres newdb -c \"SELECT * from master_add_node('xxx.xxx.xxx.231', 8001);\"" su - postgres -c "psql -U postgres newdb -c \"SELECT * from master_add_node('xxx.xxx.xxx.225', 8001);\"" su - postgres -c "psql -U postgres newdb -c \"SELECT * from master_add_node('xxx.xxx.xxx.227', 8001);\"" su - postgres -c "psql -U postgres newdb -c \"SELECT * from master_add_node('xxx.xxx.xxx.232', 8001);\"" su - postgres -c "psql -U postgres newdb -c \"SELECT * from master_add_node('xxx.xxx.xxx.226', 8001);\"" su - postgres -c "psql -U postgres newdb -c \"SELECT * from master_add_node('xxx.xxx.xxx.229', 8001);\""
MX配置,同樣,將worker節(jié)點(diǎn)添加到元數(shù)據(jù)同步中。
psql newdb postgres
select * from master_add_node('xxx.xxx.xxx.224',8001); select * from master_add_node('xxx.xxx.xxx.230',8001);
開啟同步到元數(shù)據(jù)。
select start_metadata_sync_to_node('xxx.xxx.xxx.224',8001); select start_metadata_sync_to_node('xxx.xxx.xxx.230',8001);
1、tpc-b 長(zhǎng)連接測(cè)試
pgbench -i -s -U digoal newdb
psql -U digoal newdb select create_distributed_table('pgbench_accounts','aid'); select create_distributed_table('pgbench_branches','bid'); select create_distributed_table('pgbench_tellers','tid'); select create_distributed_table('pgbench_history','aid');
pgbench -M prepared -v -r -P 1 -c 64 -j 64 -T 120 -U digoal newdb -S
性能與不使用pgbouncer差不多,因?yàn)槭褂昧碎L(zhǎng)連接測(cè)試簡(jiǎn)單SQL(本身citus就使用了會(huì)話級(jí)連接保持,沒(méi)有短連接問(wèn)題)。
在一個(gè)新建的會(huì)話中,第一次查詢總是需要需要耗費(fèi)更多的時(shí)間(如果沒(méi)有建立連接,那么包括建立連接的時(shí)間。即使已建立連接,也需要耗費(fèi)額外的一些時(shí)間)(具體原因可以跟蹤分析一下code)。
以下使用的是pgbouncer連接worker,因此第一次QUERY不包含建立連接的時(shí)間。
newdb=> \q postgres@digoal-citus-gpdb-test001-> psql newdb digoal psql (10.5) Type "help" for help. \timing newdb=> select * from pgbench_accounts where aid=5; aid | bid | abalance | filler -----+-----+----------+-------------------------------------------------------------------------------------- 5 | 1 | 0 | (1 row) Time: 6.016 ms -- 不包括新建連接(已使用pgbouncer建立),但是也多了幾毫秒 -- 但是相比未使用pgbouncer,已經(jīng)降低了5毫秒左右的延遲。 newdb=> select * from pgbench_accounts where aid=5; aid | bid | abalance | filler -----+-----+----------+-------------------------------------------------------------------------------------- 5 | 1 | 0 | (1 row) Time: 0.989 ms
多出的幾毫秒,我們社區(qū)的小伙伴鄧彪、王健給出了原因如下,很多地方需要檢查安裝的citus版本與citus.control控制文件的版本是否兼容(比如加載分布式TABLE的RELCACHE時(shí),第一次訪問(wèn)就是這個(gè)問(wèn)題),不兼容報(bào)錯(cuò):
詳見(jiàn)代碼
https://github.com/citusdata/citus/blob/3fa04d8f2c8f27b1377fe4c1468ee47358117e3c/src/backend/distributed/utils/metadata_cache.c
/* * CheckAvailableVersion compares CITUS_EXTENSIONVERSION and the currently * available version from the citus.control file. If they are not compatible, * this function logs an error with the specified elevel and returns false, * otherwise it returns true. */ bool CheckAvailableVersion(int elevel) { char *availableVersion = NULL; if (!EnableVersionChecks) { return true; } availableVersion = AvailableExtensionVersion(); if (!MajorVersionsCompatible(availableVersion, CITUS_EXTENSIONVERSION)) { ereport(elevel, (errmsg("loaded Citus library version differs from latest " "available extension version"), errdetail("Loaded library requires %s, but the latest control " "file specifies %s.", CITUS_MAJORVERSION, availableVersion), errhint("Restart the database to load the latest Citus " "library."))); return false; } return true; } /* * AvailableExtensionVersion returns the Citus version from citus.control file. It also * saves the result, thus consecutive calls to CitusExtensionAvailableVersion will * not read the citus.control file again. */ static char * AvailableExtensionVersion(void) { ReturnSetInfo *extensionsResultSet = NULL; TupleTableSlot *tupleTableSlot = NULL; FunctionCallInfoData *fcinfo = NULL; FmgrInfo *flinfo = NULL; int argumentCount = 0; EState *estate = NULL; bool hasTuple = false; bool goForward = true; bool doCopy = false; char *availableExtensionVersion; InitializeCaches(); estate = CreateExecutorState(); extensionsResultSet = makeNode(ReturnSetInfo); extensionsResultSet->econtext = GetPerTupleExprContext(estate); extensionsResultSet->allowedModes = SFRM_Materialize; fcinfo = palloc0(sizeof(FunctionCallInfoData)); flinfo = palloc0(sizeof(FmgrInfo)); fmgr_info(F_PG_AVAILABLE_EXTENSIONS, flinfo); InitFunctionCallInfoData(*fcinfo, flinfo, argumentCount, InvalidOid, NULL, (Node *) extensionsResultSet); /* pg_available_extensions returns result set containing all available extensions */ (*pg_available_extensions)(fcinfo); tupleTableSlot = MakeSingleTupleTableSlot(extensionsResultSet->setDesc); hasTuple = tuplestore_gettupleslot(extensionsResultSet->setResult, goForward, doCopy, tupleTableSlot); while (hasTuple) { Datum extensionNameDatum = 0; char *extensionName = NULL; bool isNull = false; extensionNameDatum = slot_getattr(tupleTableSlot, 1, &isNull); extensionName = NameStr(*DatumGetName(extensionNameDatum)); if (strcmp(extensionName, "citus") == 0) { MemoryContext oldMemoryContext = NULL; Datum availableVersion = slot_getattr(tupleTableSlot, 2, &isNull); /* we will cache the result of citus version to prevent catalog access */ oldMemoryContext = MemoryContextSwitchTo(CacheMemoryContext); availableExtensionVersion = text_to_cstring(DatumGetTextPP(availableVersion)); MemoryContextSwitchTo(oldMemoryContext); ExecClearTuple(tupleTableSlot); ExecDropSingleTupleTableSlot(tupleTableSlot); return availableExtensionVersion; } ExecClearTuple(tupleTableSlot); hasTuple = tuplestore_gettupleslot(extensionsResultSet->setResult, goForward, doCopy, tupleTableSlot); } ExecDropSingleTupleTableSlot(tupleTableSlot); ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), errmsg("citus extension is not found"))); return NULL; }
1、對(duì)于業(yè)務(wù)層短連接會(huì)有比較好的效果??梢越档椭辽?毫秒左右的延遲。
2、對(duì)于大量復(fù)雜查詢(需要motion的查詢),可以減少節(jié)點(diǎn)間的連接數(shù)。
“PostgreSQL中citus節(jié)點(diǎn)間的網(wǎng)絡(luò)需求有哪些”的內(nèi)容就介紹到這里了,感謝大家的閱讀。如果想了解更多行業(yè)相關(guān)的知識(shí)可以關(guān)注億速云網(wǎng)站,小編將為大家輸出更多高質(zhì)量的實(shí)用文章!
免責(zé)聲明:本站發(fā)布的內(nèi)容(圖片、視頻和文字)以原創(chuàng)、轉(zhuǎn)載和分享為主,文章觀點(diǎn)不代表本網(wǎng)站立場(chǎng),如果涉及侵權(quán)請(qǐng)聯(lián)系站長(zhǎng)郵箱:is@yisu.com進(jìn)行舉報(bào),并提供相關(guān)證據(jù),一經(jīng)查實(shí),將立刻刪除涉嫌侵權(quán)內(nèi)容。