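Environment configuration, for example OS kernel parameters;
Create the gp admin user;
Exchange ssh keys (using gpssh-exkeys -e exist_hosts -x new_hosts);
Copy the greenplum binaries;
Plan the segment data directories;
Check with gpcheck (gpcheck -f new_hosts);
Check performance with gpcheckperf (gpcheckperf -f new_hosts_file -d /data1 -d /data2 -v)
This step mainly does the following:
Generate the configuration file (gpexpand -f new_hosts_file); you can also write the configuration file yourself;
Initialize the segment databases in the specified directories (gpexpand -i cnf -D dbname);
Add the new segments' information to the master catalog tables;
What to do if the expansion fails?
Plan the priority order in which tables are redistributed;
Redistribute the table data across the new segments.
Analyze the tables;
Four virtual machines, each with 16 GB of memory and running 8 segments.
View the configuration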
test=# select * from gp_segment_configuration ;
Because no new hosts are being added, go straight to step two.
Create the host file listing the hosts that will receive new segments
cat > seg_nodes << EOF
gpsegment62
gpsegment63
gpsegment64
EOF
$gpexpand -f ./seg_nodes
Would you like to initiate a new System Expansion Yy|Nn (default=N):
> y
How many new primary segments per host do you want to add? (default=0): (number of segments to add per host)
> 8
Enter new primary data directory 1: (segment data directory)
> /greenplum/data/gpdatap9
(intermediate prompts omitted ...)
Enter new primary data directory 8: (segment data directory)
> /greenplum/data/gpdatap16
Enter new mirror data directory 1: (segment data directory)
> /greenplum/data/gpdatam9
(intermediate prompts omitted ...)
Enter new mirror data directory 8: (segment data directory)
> /greenplum/data/gpdatam16
Input configuration files were written to 'gpexpand_inputfile_20180814_140954' and 'None'.
Please review the file and make sure that it is correct then re-run
with: gpexpand -i gpexpand_inputfile_20180814_140954 -D digoal
$cat gpexpand_inputfile_20180814_140954
gpsegment62:gpsegment62:40008:/greenplum/data/gpdatap9/gpseg24:50:24:p:41008
gpsegment63:gpsegment63:50008:/greenplum/data/gpdatam9/gpseg24:82:24:m:51008
gpsegment62:gpsegment62:40009:/greenplum/data/gpdatap10/gpseg25:51:25:p:41009
gpsegment63:gpsegment63:50009:/greenplum/data/gpdatam10/gpseg25:83:25:m:51009
gpsegment62:gpsegment62:40010:/greenplum/data/gpdatap11/gpseg26:52:26:p:41010
gpsegment63:gpsegment63:50010:/greenplum/data/gpdatam11/gpseg26:84:26:m:51010
gpsegment62:gpsegment62:40011:/greenplum/data/gpdatap12/gpseg27:53:27:p:41011
gpsegment63:gpsegment63:50011:/greenplum/data/gpdatam12/gpseg27:85:27:m:51011
gpsegment62:gpsegment62:40012:/greenplum/data/gpdatap13/gpseg28:54:28:p:41012
...
gpsegment62:gpsegment62:50012:/greenplum/data/gpdatam13/gpseg44:78:44:m:51012
gpsegment64:gpsegment64:40013:/greenplum/data/gpdatap14/gpseg45:71:45:p:41013
gpsegment62:gpsegment62:50013:/greenplum/data/gpdatam14/gpseg45:79:45:m:51013
gpsegment64:gpsegment64:40014:/greenplum/data/gpdatap15/gpseg46:72:46:p:41014
gpsegment62:gpsegment62:50014:/greenplum/data/gpdatam15/gpseg46:80:46:m:51014
gpsegment64:gpsegment64:40015:/greenplum/data/gpdatap16/gpseg47:73:47:p:41015
gpsegment62:gpsegment62:50015:/greenplum/data/gpdatam16/gpseg47:81:47:m:51015
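The file contains the following fields
hostname: host name
address: similar to the host name
port: port the segment listens on
fselocation: segment data directory; note that it must be a full path
dbid: unique ID within the gp cluster; existing values can be found in gp_segment_configuration, and new values must continue the sequence
content: existing values can be found in gp_segment_configuration, and new values must continue the sequence
preferred_role: role, p or m (primary, mirror)
replication_port: port used for replication; not needed if there are no mirrors
If anything above looks wrong, you can edit the file by hand.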
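Since the file can be edited by hand, a quick format check before running gpexpand -i may catch typos. The following is only a sketch: check_expand_file is a hypothetical helper, not a Greenplum tool, and it only verifies the field layout described above (7 fields without mirrors, 8 with replication_port).

```shell
# check_expand_file FILE
# Hypothetical helper: verifies the colon-separated layout of a gpexpand input file.
# It does not compare the values against gp_segment_configuration.
check_expand_file() {
    awk -F: '
        NF == 0 { next }   # skip blank lines
        NF != 7 && NF != 8 { printf "line %d: expected 7 or 8 fields, got %d\n", NR, NF; bad = 1; next }
        $3 !~ /^[0-9]+$/ { printf "line %d: port is not numeric\n", NR; bad = 1 }
        $4 !~ /^\// { printf "line %d: fselocation is not a full path\n", NR; bad = 1 }
        $5 !~ /^[0-9]+$/ || $6 !~ /^[0-9]+$/ { printf "line %d: dbid and content must be numeric\n", NR; bad = 1 }
        $7 != "p" && $7 != "m" { printf "line %d: preferred_role must be p or m\n", NR; bad = 1 }
        END { exit bad + 0 }
    ' "$1"
}
```

For example, check_expand_file gpexpand_inputfile_20180814_140954 returns a non-zero status and prints one message per problem if a line is malformed.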
gpssh -f seg_nodes 'mkdir /greenplum/data/gpdatap{9..16} /greenplum/data/gpdatam{9..16}'
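The mkdir command above hard-codes the directory ranges. As an alternative sketch, the parent directories can be derived from the input file itself, so the list stays in step with any manual edits. parent_dirs is a hypothetical helper name, and it assumes the fourth field is the full segment path as described above.

```shell
# parent_dirs FILE
# Hypothetical helper: prints "host parent_directory" once for every distinct
# pair found in a gpexpand input file (the last path component is stripped).
parent_dirs() {
    awk -F: 'NF >= 7 { sub("/[^/]+$", "", $4); print $1 " " $4 }' "$1" | sort -u
}
```

The output can be reviewed by eye, or used to confirm that every listed directory exists on its host before gpexpand -i is run.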
Modify the configuration
gpconfig -c max_connections -v 1000 -m 500
gpconfig -c shared_buffers -v 64m -m 64m
gpstop -afr
gpexpand -i gpexpand_inputfile_20180814_140954 -D test -V -v -n 8 -B 1 -t /home/gpadmin/gpAdminLogs
Explanation of the command options
-B <batch_size>
Batch size of remote commands to send to a given host before
making a one-second pause. Default is 16. Valid values are 1-128.
The gpexpand utility issues a number of setup commands that may exceed
the host's maximum threshold for authenticated connections as defined
by MaxStartups in the SSH daemon configuration. The one-second pause
allows authentications to be completed before gpexpand issues any
more commands. The default value does not normally need to be changed.
However, it may be necessary to reduce the maximum number of commands
if gpexpand fails with connection errors such as
'ssh_exchange_identification: Connection closed by remote host.'
-D <database_name>
Specifies the database in which to create the expansion schema
and tables. If this option is not given, the setting for the
environment variable PGDATABASE is used. The database templates
template1 and template0 cannot be used.
-i | --input <input_file>
Specifies the name of the expansion configuration file, which contains
one line for each segment to be added in the format of:
<hostname>:<address>:<port>:<fselocation>:<dbid>:<content>:<preferred_role>:<replication_port>
-n <parallel_processes>
The number of tables to redistribute simultaneously. Valid values
are 1 - 16. Each table redistribution process requires two database
connections: one to alter the table, and another to update the table's
status in the expansion schema. Before increasing -n, check the current
value of the server configuration parameter max_connections and make
sure the maximum connection limit is not exceeded.
-S | --simple_progress
Show simple progress view.
-t | --tardir <directory>
Specify the temporary directory on segment hosts to put tar file.
-v | --verbose
Verbose debugging output. With this option, the utility will output
all DDL and DML used to expand the database.
-V | --novacuum
Do not vacuum catalog tables before creating schema copy.
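The -n description above implies a simple connection budget: two connections per table being redistributed. A minimal sketch of that arithmetic follows. Both function names are hypothetical, and the headroom argument (connections reserved for other sessions) is an assumption that depends on the site.

```shell
# redistribution_connections N
# Connections used by gpexpand for -n N, at two connections per table.
redistribution_connections() {
    local parallel=$1
    echo $(( parallel * 2 ))
}

# fits_connection_limit N MAX_CONNECTIONS [HEADROOM]
# Succeeds when 2*N plus the reserved headroom stays within max_connections.
fits_connection_limit() {
    local parallel=$1 max_connections=$2 headroom=${3:-0}
    [ $(( parallel * 2 + headroom )) -le "$max_connections" ]
}
```

With the master limit of 500 set by gpconfig earlier, fits_connection_limit 8 500 100 succeeds, since -n 8 needs only 16 connections.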
20180814:14:15:20:002788 gpexpand:gpmaster61:gpadmin-[DEBUG]:-Transitioning from PREPARE_EXPANSION_SCHEMA_STARTED to PREPARE_EXPANSION_SCHEMA_DONE
20180814:14:15:20:002788 gpexpand:gpmaster61:gpadmin-[DEBUG]:-Transitioning from PREPARE_EXPANSION_SCHEMA_DONE to EXPANSION_PREPARE_DONE
20180814:14:15:20:002788 gpexpand:gpmaster61:gpadmin-[DEBUG]:-Removing segment configuration backup file
20180814:14:15:20:002788 gpexpand:gpmaster61:gpadmin-[INFO]:-Stopping Greenplum Database
20180814:14:16:18:002788 gpexpand:gpmaster61:gpadmin-[INFO]:-Starting Greenplum Database
20180814:14:16:36:002788 gpexpand:gpmaster61:gpadmin-[INFO]:-Starting new mirror segment synchronization
20180814:14:17:48:002788 gpexpand:gpmaster61:gpadmin-[INFO]:-************************************************
20180814:14:17:48:002788 gpexpand:gpmaster61:gpadmin-[INFO]:-Initialization of the system expansion complete.
20180814:14:17:48:002788 gpexpand:gpmaster61:gpadmin-[INFO]:-To begin table expansion onto the new segments
20180814:14:17:48:002788 gpexpand:gpmaster61:gpadmin-[INFO]:-rerun gpexpand
20180814:14:17:48:002788 gpexpand:gpmaster61:gpadmin-[INFO]:-************************************************
20180814:14:17:48:002788 gpexpand:gpmaster61:gpadmin-[INFO]:-Exiting...
20180814:14:17:48:002788 gpexpand:gpmaster61:gpadmin-[DEBUG]:-WorkerPool haltWork()
20180814:14:17:48:002788 gpexpand:gpmaster61:gpadmin-[DEBUG]:-[worker0] haltWork
20180814:14:17:48:002788 gpexpand:gpmaster61:gpadmin-[DEBUG]:-[worker0] got a halt cmd
If the expansion fails, start the database in restricted mode and roll back.
gpstart -R
gpexpand --rollback -D test
gpstart -a
Then find the problem and repeat the previous step until it succeeds.
test=# select * from gp_segment_configuration ;
Next, you can plan the order in which tables are scheduled during redistribution
digoal=# select * from gpexpand.
gpexpand.status gpexpand.status_detail gpexpand.expansion_progress
digoal=# select * from gpexpand.status;
status | updated
------------+----------------------------
SETUP | 2015-12-17 16:50:07.15973
SETUP DONE | 2015-12-17 16:50:16.427367
(2 rows)
View the pending tasks; to change their order, just change rank.
digoal=# select * from gpexpand.status_detail ;
dbname | fq_name | schema_oid | table_oid | distribution_policy | distribution_policy_names | distribution_policy_coloids | storage_options | rank | status | expansion_started | expansion_finished | source_bytes
--------+--------------+------------+-----------+---------------------+---------------------------+-----------------------------+-----------------+------+-------------+-------------------+--------------------+--------------
digoal | public.test | 2200 | 17156 | {1} | id | 17156 | | 2 | NOT STARTED | | | 0
digoal | public.test1 | 2200 | 17182 | {1} | id | 17182 | | 2 | NOT STARTED | | | 0
(2 rows)
For example:
=> UPDATE gpexpand.status_detail SET rank=10;
=> UPDATE gpexpand.status_detail SET rank=1 WHERE fq_name = 'public.lineitem';
=> UPDATE gpexpand.status_detail SET rank=2 WHERE fq_name = 'public.orders';
These commands lower the priority of all tables to 10 and then assign a rank of 1 to lineitem and a rank of 2 to orders.
When table redistribution begins, lineitem is redistributed first, followed by orders and all other tables in gpexpand.status_detail.
To exclude a table from redistribution, remove the table from gpexpand.status_detail.
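When many tables need explicit ranks, the UPDATE statements can be generated instead of typed. This is a sketch only: rank_sql is a hypothetical helper, and it does no quoting or escaping of the table names, so it should only be fed trusted names.

```shell
# rank_sql DEFAULT_RANK TABLE...
# Hypothetical helper: prints SQL that first sets every table to DEFAULT_RANK,
# then assigns ranks 1, 2, 3, ... to the named tables in argument order.
rank_sql() {
    local default_rank=$1; shift
    local rank=1 table
    echo "UPDATE gpexpand.status_detail SET rank=${default_rank};"
    for table in "$@"; do
        echo "UPDATE gpexpand.status_detail SET rank=${rank} WHERE fq_name = '${table}';"
        rank=$(( rank + 1 ))
    done
}
```

Running rank_sql 10 public.lineitem public.orders prints the same three statements as the example above; the output can be reviewed and then piped to psql against the database that holds the gpexpand schema.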
How many tables have not yet finished redistribution
digoal=# select * from gpexpand.expansion_progress ;
name | value
-------------+-------
Tables Left | 2
(1 row)
Specify either how long the run may take or the date and time by which it must finish; the script schedules the redistribution automatically.
gpexpand -a -d 1:00:00 -D test -S -t /tmp -v -n 1
Command explanation
To begin the redistribution phase, you must run gpexpand with either
the -d (duration) or -e (end time) options. Until the specified end
time or duration is reached, the utility will redistribute tables in
the expansion schema. Each table is reorganized using ALTER TABLE
commands to rebalance the tables across new segments, and to set
tables to their original distribution policy. If gpexpand completes
the reorganization of all tables before the specified duration,
it displays a success message and ends.
NOTE: Data redistribution should be performed during low-use hours.
Redistribution can divided into batches over an extended period.
-a | --analyze
Run ANALYZE to update the table statistics after expansion.
The default is to not run ANALYZE.
-d | --duration <hh:mm:ss>
Duration of the expansion session from beginning to end.
-D <database_name>
Specifies the database in which to create the expansion schema
and tables. If this option is not given, the setting for the
environment variable PGDATABASE is used. The database templates
template1 and template0 cannot be used.
-e | --end '<YYYY-MM-DD hh:mm:ss>'
Ending date and time for the expansion session.
-S | --simple_progress
Show simple progress view.
-t | --tardir <directory>
Specify the temporary directory on segment hosts to put tar file.
-v | --verbose
Verbose debugging output. With this option, the utility will output
all DDL and DML used to expand the database.
-n <parallel_processes>
The number of tables to redistribute simultaneously. Valid values
are 1 - 16. Each table redistribution process requires two database
connections: one to alter the table, and another to update the table's
status in the expansion schema. Before increasing -n, check the current
value of the server configuration parameter max_connections and make
sure the maximum connection limit is not exceeded.
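The -d option takes its value as hh:mm:ss. If the maintenance window is known in seconds (for example from a scheduler), a small conversion helps avoid mistakes. seconds_to_duration is a hypothetical helper; the output format simply mirrors the 1:00:00 value used in the command above.

```shell
# seconds_to_duration SECONDS
# Hypothetical helper: formats a number of seconds as h:mm:ss for gpexpand -d.
seconds_to_duration() {
    local total=$1
    printf '%d:%02d:%02d\n' $(( total / 3600 )) $(( total % 3600 / 60 )) $(( total % 60 ))
}
```

For a 90-minute window, seconds_to_duration 5400 prints 1:30:00.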
During redistribution, you can watch the progress.
digoal=# select * from gpexpand.expansion_progress ;
name | value
-----------------+-------
Tables Expanded | 1
Tables Left | 1
(2 rows)
test=# select * from gpexpand.status_detail ;
dbname | fq_name | schema_oid | table_oid | distribution_policy | distribution_policy_names | distribution_policy_coloids | storage_options | rank | status | expansion_started | expansion_finished | source_bytes
--------+--------------+------------+-----------+---------------------+---------------------------+-----------------------------+-----------------+------+-------------+---------------------------+----------------------------+--------------
test | public.test | 2200 | 17156 | {1} | id | 17156 | | 2 | NOT STARTED | | | 0
test | public.test1 | 2200 | 17182 | {1} | id | 17182 | | 2 | COMPLETED | 2015-12-17 17:12:12.43088 | 2015-12-17 17:13:27.335207 | 0
(2 rows)
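With many tables, the status_detail output is easier to read as a count per status. A minimal sketch, assuming the status column has been extracted one value per line (for example with psql -At -c "select status from gpexpand.status_detail"); summarize_status is a hypothetical helper.

```shell
# summarize_status
# Hypothetical helper: reads one status value per line on stdin and prints
# "status<TAB>count" for each distinct value, sorted by status.
summarize_status() {
    awk 'NF { count[$0]++ } END { for (s in count) printf "%s\t%d\n", s, count[s] }' | sort
}
```

Fed the two rows above, it reports one COMPLETED and one NOT STARTED table.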
# Or watch the progress on the command line
20151217:17:12:11:020043 gpexpand:digoal193096:digoal-[DEBUG]:-['digoal', 'public.test1', 2200L, 17182L, '{1}', 'id', '17182', None, 2, 'NOT STARTED', None, None, Decimal('0')]
20151217:17:12:11:020043 gpexpand:digoal193096:digoal-[DEBUG]:-Adding cmd to work_queue: None
20151217:17:12:11:020043 gpexpand:digoal193096:digoal-[DEBUG]:-['digoal', 'public.test', 2200L, 17156L, '{1}', 'id', '17156', None, 2, 'NOT STARTED', None, None, Decimal('0')]
20151217:17:12:11:020043 gpexpand:digoal193096:digoal-[DEBUG]:-Adding cmd to work_queue: None
20151217:17:12:11:020043 gpexpand:digoal193096:digoal-[DEBUG]:-woke up. queue: 2 finished 0
...
20151217:17:14:36:020043 gpexpand:digoal193096:digoal-[DEBUG]:-woke up. queue: 2 finished 1
20151217:17:14:40:020043 gpexpand:digoal193096:digoal-[INFO]:-Analyzing public.test
20151217:17:14:41:020043 gpexpand:digoal193096:digoal-[DEBUG]:-woke up. queue: 2 finished 1
20151217:17:14:43:020043 gpexpand:digoal193096:digoal-[INFO]:-Finished expanding digoal.public.test
20151217:17:14:43:020043 gpexpand:digoal193096:digoal-[DEBUG]:-UPDATE gpexpand.status_detail
SET status = 'COMPLETED', expansion_started='2015-12-17 17:13:29.258085', expansion_finished='2015-12-17 17:14:43.552232'
WHERE dbname = 'digoal' AND schema_oid = 2200
AND table_oid = 17156
20151217:17:14:44:020043 gpexpand:digoal193096:digoal-[DEBUG]:-[worker0] finished cmd: name cmdStr='None'
20151217:17:14:46:020043 gpexpand:digoal193096:digoal-[DEBUG]:-WorkerPool haltWork()
20151217:17:14:46:020043 gpexpand:digoal193096:digoal-[DEBUG]:-[worker0] haltWork
...
20151217:17:14:54:020043 gpexpand:digoal193096:digoal-[INFO]:-EXPANSION COMPLETED SUCCESSFULLY
20151217:17:14:54:020043 gpexpand:digoal193096:digoal-[INFO]:-Exiting...
gpexpand -c -D test
It asks whether you want to export the status information before the gpexpand schema is removed.
Do you want to dump the gpexpand.status_detail table to file? Yy|Nn (default=Y):
> y
Example 2: expand again by 6 segments and add one new host, so that each host ends up with 4 segments.
The difference from Example 1 is the new host, which requires some extra steps.
Process overview
Environment configuration, for example OS kernel parameters;
Create the gp admin user;
Exchange ssh keys (using gpssh-exkeys -e exist_hosts -x new_hosts);
Copy the greenplum binaries;
Plan the segment data directories;
Check with gpcheck (gpcheck -f new_hosts);
Check performance with gpcheckperf (gpcheckperf -f new_hosts_file -d /data1 -d /data2 -v)
yum -y install rsync coreutils glib2 lrzsz sysstat e4fsprogs xfsprogs ntp readline-devel zlib zlib-devel openssl openssl-devel pam-devel libxml2-devel libxslt-devel python-devel tcl-devel gcc make smartmontools flex bison perl perl-devel perl-ExtUtils* OpenIPMI-tools openldap openldap-devel logrotate
cat > /etc/sysctl.conf << EOF
kernel.shmmax = 68719476736
kernel.shmmni = 4096
kernel.shmall = 4000000000
kernel.sem = 50100 64128000 50100 1280
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.ip_forward = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.ipv4.ip_local_port_range = 1025 65535
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.overcommit_memory = 2
fs.file-max = 7672460
net.ipv4.netfilter.ip_conntrack_max = 655360
fs.aio-max-nr = 1048576
net.ipv4.tcp_keepalive_time = 72
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_intvl = 7
EOF
sysctl -p
cat >> /etc/security/limits.conf << EOF
* soft nofile 131072
* hard nofile 131072
* soft nproc 131072
* hard nproc 131072
* soft memlock unlimited
* hard memlock unlimited
EOF
rm -f /etc/security/limits.d/90-nproc.conf
cat > /etc/hosts << EOF
127.0.0.1 localhost
192.168.61.61 node61
192.168.61.62 node62
192.168.61.63 node63
192.168.61.64 node64
192.168.61.65 node65
192.168.61.66 node66
192.168.61.67 node67
EOF
noatime,nodiratime,nobarrier,discard,nodelalloc,data=writeback
/sbin/blockdev --setra 16384 /dev/xvda1
Create a user to manage greenplum; digoal is used here
Create a directory for the gp software and give the greenplum admin user write permission; you can also use the user's HOME directory directly, for example /home/digoal/greenplum-db-4.3.6.1
Create a directory for the database and give the greenplum admin user write permission
# mkdir -p /data01/gpdata
# chown -R digoal /data01/gpdata
# chmod -R 700 /data01/gpdata
Create the host files, covering all nodes including the master node itself
$ vi host_exist
digoal193096.zmf
digoal199092.zmf
digoal200164.zmf
digoal204016.zmf
digoal204063.zmf
$ vi host_new
digoal209198.zmf
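Before exchanging keys, it can be worth confirming that no host is listed in both files, on the assumption that host_new should contain only hosts that are not yet part of the cluster. overlapping_hosts is a hypothetical helper built on grep.

```shell
# overlapping_hosts EXISTING_FILE NEW_FILE
# Hypothetical helper: prints every line of NEW_FILE that also appears, as a
# whole line, in EXISTING_FILE. Exit status is 0 if any overlap was found.
overlapping_hosts() {
    grep -Fxf "$1" "$2"
}
```

For the two files above, overlapping_hosts host_exist host_new prints nothing and returns a non-zero status, which is the expected result.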
Exchange keys so that the master, as the gp admin user (digoal), can reach every segment without entering a password; the master's public key is copied into authorized_keys on all segments
$ gpssh-exkeys -e host_exist -x host_new
Install the software on the segment hosts
$gpseginstall -f ./host_new -u digoal
Check with gpcheck
$ gpcheck -f host_new
Check performance with gpcheckperf
$ gpcheckperf -f host_new -d /data01/gpdata -v
The remaining steps are much the same as before:
$vi host
digoal204016.zmf
digoal204063.zmf
digoal209198.zmf
Generate the configuration file
$gpexpand -f ./host -c
The generated configuration file looks like this
$cat gpexpand_inputfile_20151217_173855
digoal209198.zmf:digoal209198.zmf:40000:/data01/gpdata/gpseg20:22:20:p
digoal209198.zmf:digoal209198.zmf:40001:/data01/gpdata/gpseg21:23:21:p
digoal209198.zmf:digoal209198.zmf:40002:/data01/gpdata/gpseg22:24:22:p
digoal209198.zmf:digoal209198.zmf:40003:/data01/gpdata/gpseg23:25:23:p
digoal193096.zmf:digoal193096.zmf:40004:/data01/gpdata/gpseg24:26:24:p
digoal199092.zmf:digoal199092.zmf:40004:/data01/gpdata/gpseg25:27:25:p
digoal200164.zmf:digoal200164.zmf:40004:/data01/gpdata/gpseg26:28:26:p
digoal204016.zmf:digoal204016.zmf:40004:/data01/gpdata/gpseg27:29:27:p
digoal204063.zmf:digoal204063.zmf:40004:/data01/gpdata/gpseg28:30:28:p
digoal209198.zmf:digoal209198.zmf:40004:/data01/gpdata/gpseg29:31:29:p
Some manual adjustment is needed (dbid and content must both be contiguous, checked against gp_segment_configuration; ports on the same host must not conflict):
digoal204016.zmf:digoal204016.zmf:40004:/data01/gpdata/gpseg20:22:20:p
digoal204063.zmf:digoal204063.zmf:40004:/data01/gpdata/gpseg21:23:21:p
digoal209198.zmf:digoal209198.zmf:40000:/data01/gpdata/gpseg22:24:22:p
digoal209198.zmf:digoal209198.zmf:40001:/data01/gpdata/gpseg23:25:23:p
digoal209198.zmf:digoal209198.zmf:40002:/data01/gpdata/gpseg24:26:24:p
digoal209198.zmf:digoal209198.zmf:40003:/data01/gpdata/gpseg25:27:25:p
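The two rules for the hand-edited file (contiguous dbid and content values, no port reused on one host) can be checked mechanically. The sketch below uses a hypothetical helper, check_ids_and_ports. It only checks the file against itself; whether the ranges start right after the existing values still has to be confirmed in gp_segment_configuration.

```shell
# check_ids_and_ports FILE
# Hypothetical helper: reports reused host:port pairs, duplicate dbids, and
# gaps in the dbid range (all rows) or the content range (primary rows only).
check_ids_and_ports() {
    awk -F: '
        NF < 7 { next }
        {
            hostport = $1 ":" $3
            if (hostport in ports) { printf "port conflict: %s\n", hostport; bad = 1 }
            ports[hostport] = 1
            if ($5 in dbids) { printf "duplicate dbid: %s\n", $5; bad = 1 }
            dbids[$5] = 1; nd++
            if (nd == 1 || $5 + 0 < dmin) dmin = $5 + 0
            if (nd == 1 || $5 + 0 > dmax) dmax = $5 + 0
        }
        $7 == "p" {
            nc++
            if (nc == 1 || $6 + 0 < cmin) cmin = $6 + 0
            if (nc == 1 || $6 + 0 > cmax) cmax = $6 + 0
        }
        END {
            if (nd > 0 && dmax - dmin + 1 != nd) { print "dbid values are not contiguous"; bad = 1 }
            if (nc > 0 && cmax - cmin + 1 != nc) { print "content values are not contiguous"; bad = 1 }
            exit bad + 0
        }
    ' "$1"
}
```

The six adjusted lines above pass this check: dbid runs from 22 to 27, content from 20 to 25, and no host repeats a port.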
Next, change the permissions on the greenplum bin directory; gpexpand needs to write some files there.
chmod -R 700 /opt/gpdb
$ gpexpand -i ./gpexpand_inputfile_20151217_173855 -D digoal -S -V -v -n 1 -B 1 -t /tmp
Run the redistribution command. Specify either how long the run may take or the date and time by which it must finish; the script schedules the redistribution automatically.
$ gpexpand -a -d 1:00:00 -D digoal -S -t /tmp -v -n 1