Note: since quite a few nodes have to be configured, some steps are carried out with simple Ansible ad-hoc commands; to keep each step easy to follow, they have not been written up as a playbook.
Topology diagram:
1. Make sure time is synchronized on all nodes;
[root@ansible ~]# ansible group1 -a 'date'
server13.neo.com | success | rc=0 >>
Sat Jul 26 10:31:49 CST 2014
server11.neo.com | success | rc=0 >>
Sat Jul 26 10:31:50 CST 2014
server12.neo.com | success | rc=0 >>
Sat Jul 26 10:31:50 CST 2014
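If a node has drifted, a one-off sync can be pushed out the same way; this is only a sketch, and 192.168.100.1 is a placeholder for whatever NTP server your environment actually uses:
[root@ansible ~]# ansible group1 -m shell -a 'ntpdate 192.168.100.1' // one-shot sync against the NTP server (run before ntpd is started);
[root@ansible ~]# ansible group1 -m service -a 'name=ntpd state=started enabled=yes' // optionally keep ntpd running so the clocks stay in sync;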
2. Make sure NetworkManager is disabled;
[root@ansible ~]# ansible group1 -a 'chkconfig --list NetworkManager'
server11.neo.com | success | rc=0 >>
NetworkManager 0:off 1:off 2:off 3:off 4:off 5:off 6:off
server13.neo.com | success | rc=0 >>
NetworkManager 0:off 1:off 2:off 3:off 4:off 5:off 6:off
server12.neo.com | success | rc=0 >>
NetworkManager 0:off 1:off 2:off 3:off 4:off 5:off 6:off
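If NetworkManager were still enabled anywhere, it could be stopped and disabled on all nodes with the service module, for example:
[root@ansible ~]# ansible group1 -m service -a 'name=NetworkManager state=stopped enabled=no'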
Part 1: RHCS high-availability deployment
1. Install the luci component on the RHCS management node
[root@ansible ~]# yum install -y luci --disablerepo=epel
[root@ansible ~]# service luci start
[root@ansible ~]# netstat -tunlp |grep 8084
2. Install the ricci component on the RHCS cluster nodes
[root@ansible ~]# ansible group1 -m yum -a 'name=ricci state=present disablerepo=epel'
[root@ansible ~]# ansible group1 -m service -a 'name=ricci state=started enabled=yes'
[root@ansible ~]# ansible group1 -m shell -a 'netstat -tunlp |grep ricci'
3. Add the web service to every RHCS cluster node and provide a home page
[root@ansible ~]# ansible group1 -m yum -a 'name=httpd state=present'
[root@ansible ~]# ansible group1 -m shell -a 'echo "<h1>`uname -n`</h1>" >/var/www/html/index.html'
[root@ansible ~]# ansible group1 -m service -a 'name=httpd state=started enabled=yes'
[root@ansible ~]# ansible group1 -m shell -a 'echo asdasd |passwd --stdin ricci'
4. Configure the high-availability service through the web GUI
1) Browse to https://192.168.100.200:8084;
2) Log in with the root account and password;
3) Create the cluster;
Note: the Password field beside each Node Name is the password we set for the ricci user;
4) Define a failover domain under "Failover Domains" and set its "Priority" values;
5) Add "Resources": an LSB-style service has to be added as a Script resource; add two resources here, the VIP and httpd;
6) Define a "Group" (service group) and add the VIP and httpd resources to it;
7) Start the group;
8) Simulate a failure by relocating the group's resources between nodes from the web GUI;
We can also check node status from the shell command line:
[root@server11 ~]# clustat
Cluster Status for My_Cluster @ Sat Jul 26 11:28:36 2014
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
server11.neo.com 1 Online, Local, rgmanager
server12.neo.com 2 Online, rgmanager
server13.neo.com 3 Online, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:webserver server13.neo.com started
At this point, the deployment of a simple highly available web cluster service is complete.
Related command-line management tools:
clusvcadm
cman_tool
ccs_tool
cman's configuration file is /etc/cluster/cluster.conf
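For example, the relocation done from the web GUI in step 8) can also be done with clusvcadm; a quick sketch using the service and node names from the clustat output above:
[root@server11 ~]# clusvcadm -r webserver -m server12.neo.com // relocate the webserver group to server12;
[root@server11 ~]# clusvcadm -d webserver // disable (stop) the service group;
[root@server11 ~]# clusvcadm -e webserver // enable (start) it again on a node chosen by rgmanager;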
Part 2: Deploying the GFS2/CLVM cluster file system
The deployment above used the ricci and luci suite with the Conga web GUI; deploying the cluster file system below does not need graphical configuration, so we simply install and configure everything by hand.
1. Install cman, corosync, and rgmanager on the cluster nodes
# ansible group1 -m yum -a 'name=cman state=present'
# ansible group1 -m yum -a 'name=corosync state=present'
# ansible group1 -m yum -a 'name=rgmanager state=present'
2. On any one cluster node, create the cluster and add the cluster node entries.
This has to be done by hand: after installation there is no configuration file by default, and CMAN cannot start without one;
[root@server11 ~]# ccs_tool create My_Cluster
[root@server11 ~]# ccs_tool addnode server11.neo.com -n 1 -v 1
[root@server11 ~]# ccs_tool addnode server12.neo.com -n 2 -v 1
[root@server11 ~]# ccs_tool addnode server13.neo.com -n 3 -v 1
[root@server11 ~]# ccs_tool lsnode // list the cluster nodes;
Cluster name: My_Cluster, config_version: 4
Nodename Votes Nodeid Fencetype
server11.neo.com 1 1
server12.neo.com 1 2
server13.neo.com 1 3
[root@server11 ~]# cat /etc/cluster/cluster.conf // view the CMAN configuration file;
<?xml version="1.0"?>
<cluster name="My_Cluster" config_version="4">
<clusternodes>
<clusternode name="server11.neo.com" votes="1" nodeid="1"/>
<clusternode name="server12.neo.com" votes="1" nodeid="2"/>
<clusternode name="server13.neo.com" votes="1" nodeid="3"/>
</clusternodes>
<fencedevices>
</fencedevices>
<rm>
<failoverdomains/>
<resources/>
</rm>
</cluster>
Copy cluster.conf from server11 to the other cluster nodes: // For the next two steps: as I recall, on RHEL 5 the configuration was propagated to the other nodes automatically, but for some reason RHEL 6 does not do this, so we copy it to the other cluster nodes by hand;
[root@server11 ~]# scp -p /etc/cluster/cluster.conf server12:/etc/cluster/
[root@server11 ~]# scp -p /etc/cluster/cluster.conf server13:/etc/cluster/
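As an aside, once the cluster is running, the usual way on RHEL 6 to push a modified cluster.conf to the other nodes is cman_tool; a sketch for later configuration changes, assuming ricci is running on every node as set up in Part 1:
[root@server11 ~]# vim /etc/cluster/cluster.conf // edit the file and bump config_version;
[root@server11 ~]# cman_tool version -r // propagate the new configuration version to the other nodes;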
3. Start the service
[root@ansible ~]# ansible group1 -m service -a 'name=cman state=started enabled=yes' // start CMAN and enable it at boot;
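Once cman is up, membership and quorum can be checked from any node, for example:
[root@server11 ~]# cman_tool status // cluster summary, including quorum;
[root@server11 ~]# cman_tool nodes // status of each member node;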
4. Install and configure the iSCSI target and initiators
[root@ansible ~]# ansible group1 -m yum -a 'name=iscsi-initiator-utils state=present' // install the iSCSI initiator on each cluster node;
[root@server14 ~]# yum install scsi-target-utils -y // install the iSCSI target on the storage server;
[root@server14 ~]# cat /etc/tgt/targets.conf // target configuration file;
default-driver iscsi
<target iqn.2014-07.com.neo:server14.target2>
backing-store /dev/sdb1
backing-store /dev/sdb2
initiator-address 192.168.100.0/24
</target>
[root@server14 ~]# service tgtd start // start the iSCSI target daemon;
Starting SCSI target daemon: [OK]
[root@server14 ~]# chkconfig tgtd on // enable it at boot;
[root@server14 ~]# tgtadm -L iscsi -m target -o show // show target status;
Target 1: iqn.2014-07.com.neo:server14.target2
System information:
Driver: iscsi
State: ready
LUN information:
LUN: 0
Type: controller
SCSI ID: IET 00010000
SCSI SN: beaf10
Size: 0 MB, Block size: 1
Online: Yes
Removable media: No
Prevent removal: No
Readonly: No
Backing store type: null
Backing store path: None
Backing store flags:
LUN: 1
Type: disk
SCSI ID: IET 00010001
SCSI SN: beaf11
Size: 10742 MB, Block size: 512
Online: Yes
Removable media: No
Prevent removal: No
Readonly: No
Backing store type: rdwr
Backing store path: /dev/sdb1
Backing store flags:
LUN: 2
Type: disk
SCSI ID: IET 00010002
SCSI SN: beaf12
Size: 10742 MB, Block size: 512
Online: Yes
Removable media: No
Prevent removal: No
Readonly: No
Backing store type: rdwr
Backing store path: /dev/sdb2
Backing store flags:
Account information:
ACL information:
192.168.100.0/24
Configure a unique iSCSI initiator name on each cluster node;
[root@ansible ~]# ansible group1 -m shell -a 'echo "InitiatorName=`iscsi-iname -p iqn.2014-07.com.neo`" >/etc/iscsi/initiatorname.iscsi'
Check that the names look right;
[root@ansible ~]# ansible group1 -m shell -a 'cat /etc/iscsi/initiatorname.iscsi'
server11.neo.com | success | rc=0 >>
InitiatorName=iqn.2014-07.com.neo:bb39eb84c2
server12.neo.com | success | rc=0 >>
InitiatorName=iqn.2014-07.com.neo:3032a51c13da
server13.neo.com | success | rc=0 >>
InitiatorName=iqn.2014-07.com.neo:c82593bb8d
[root@ansible ~]# ansible group1 -m shell -a 'iscsiadm -m discovery -p 192.168.100.14 -t sendtargets' // have every cluster node discover the target host;
[root@ansible ~]# ansible group1 -m shell -a 'iscsiadm -m node -T iqn.2014-07.com.neo:server14.target2 -p 192.168.100.14 -l' // log in from every cluster node to connect to the target;
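Before touching the disks it is worth confirming that every node actually sees the new LUNs; a quick check (device names can differ from node to node; /dev/sdb and /dev/sdc are what show up in this environment):
[root@ansible ~]# ansible group1 -m shell -a 'iscsiadm -m session' // each node should report a session to 192.168.100.14;
[root@ansible ~]# ansible group1 -m shell -a 'fdisk -l /dev/sdb /dev/sdc' // the two LUNs as seen by each node;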
[root@ansible ~]# ansible group1 -m yum -a 'name=gfs2-utils state=present' // install the GFS2 utilities;
The following is done on any one node in the cluster:
[root@server11 ~]# fdisk /dev/sdb
[root@server11 ~]# partx -a /dev/sdb
[root@server11 ~]# mkfs.gfs2 -h
Usage:
mkfs.gfs2 [options] <device> [block-count]
Options:
-b <bytes> Filesystem block size
-c <MB> Size of quota change file
-D Enable debugging code
-h Print this help, then exit
-J <MB> Size of journals
-j <num> Number of journals // one journal per cluster node, so set this to the number of nodes;
-K Don't try to discard unused blocks
-O Don't ask for confirmation
-p <name> Name of the locking protocol // locking protocol: for a cluster file system we use lock_dlm;
-q Don't print anything
-r <MB> Resource Group Size
-t <name> Name of the lock table // lock table name, in the form ClusterName:string;
-u <MB> Size of unlinked file
-V Print program version information, then exit
[root@server11 ~]# mkfs.gfs2 -j 3 -t My_Cluster:myweb -p lock_dlm /dev/sdb1 // create the cluster file system;
This will destroy any data on /dev/sdb1.
It appears to contain: data
Are you sure you want to proceed? [y/n] y
Device: /dev/sdb1
Blocksize: 4096
Device Size 10.00 GB (2622460 blocks)
Filesystem Size: 10.00 GB (2622458 blocks)
Journals: 3
Resource Groups: 41
Locking Protocol: “lock_dlm”
Lock Table: “My_Cluster:myweb”
UUID: c6d9c780-9a94-e245-0471-a16c55e6b1c1
[root@ansible ~]# ansible group1 -m shell -a 'mount /dev/sdb1 /root/gfs2' // mount the cluster file system on every node; all nodes can access the files at the same time;
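To make the mount survive a reboot, the usual approach on RHEL 6 is an fstab entry handled by the gfs2 init script; a sketch, using the same /root/gfs2 mount point as above:
[root@ansible ~]# ansible group1 -m shell -a 'mkdir -p /root/gfs2' // create the mount point if it does not exist yet;
[root@ansible ~]# ansible group1 -m shell -a 'echo "/dev/sdb1 /root/gfs2 gfs2 defaults 0 0" >>/etc/fstab'
[root@ansible ~]# ansible group1 -m service -a 'name=gfs2 state=started enabled=yes' // the gfs2 service mounts GFS2 entries from fstab at boot;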
CLVM: clustered LVM logical volumes
Here we continue from the GFS2 environment built above.
Install lvm2-cluster on every cluster node:
[root@ansible ~]# ansible group1 -m yum -a 'name=lvm2-cluster state=present disablerepo=epel'
[root@ansible ~]# ansible group1 -m shell -a 'lvmconf --enable-cluster' // enable LVM's cluster locking, which is off by default; equivalently, set locking_type = 3 in /etc/lvm/lvm.conf;
[root@ansible ~]# ansible group1 -m service -a 'name=clvmd state=started enabled=yes' // start clvmd and enable it at boot;
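A quick way to confirm the change took effect everywhere:
[root@ansible ~]# ansible group1 -m shell -a 'grep "locking_type" /etc/lvm/lvm.conf' // the active (uncommented) line should read locking_type = 3 on every node;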
As before, create a new partition on any one cluster node, then create the PV, VG, and LV on it;
[root@server11 ~]# fdisk /dev/sdc
Command (m for help): n
p
Partition number (1-4): 1
First cylinder (1-1020, default 1):
Using default value 1
Last cylinder, +cylinders or +size{K,M,G} (1-1020, default 1020): +1G
Command (m for help): n
p
Partition number (1-4): 2
Last cylinder, +cylinders or +size{K,M,G} (206-1020, default 1020): +1G
Command (m for help): w
Have every node's kernel re-read the partition table (the iSCSI disk was logged into before it was partitioned, so when one cluster node partitions it, the other nodes' kernels do not see the new partitions);
[root@ansible ~]# ansible group1 -m shell -a 'partx -a /dev/sdc'
[root@server11 ~]# pvcreate /dev/sdc1 // create the PV;
Physical volume "/dev/sdc1" successfully created
[root@server11 ~]# vgcreate vg_clvm /dev/sdc1 // create the VG;
Clustered volume group "vg_clvm" successfully created
[root@server11 ~]# lvcreate -L 1G -n lv_clvm vg_clvm // create the LV;
Logical volume "lv_clvm" created
Now we can verify that the other cluster nodes all see the result of the operations done on server11;
[root@ansible ~]# ansible group1 -m shell -a 'vgs'
server11.neo.com | success | rc=0 >>
VG #PV #LV #SN Attr VSize VFree
vg_clvm 1 1 0 wz--nc 1.00g 4.00m
vg_server 1 2 0 wz--n- 19.51g 0
server12.neo.com | success | rc=0 >>
VG #PV #LV #SN Attr VSize VFree
vg_clvm 1 1 0 wz--nc 1.00g 4.00m
vg_server 1 2 0 wz--n- 19.51g 0
server13.neo.com | success | rc=0 >>
VG #PV #LV #SN Attr VSize VFree
vg_clvm 1 1 0 wz--nc 1.00g 4.00m
vg_server 1 2 0 wz--n- 19.51g 0
[root@ansible ~]# ansible group1 -m shell -a 'lvs'
server12.neo.com | success | rc=0 >>
LV VG Attr LSize Pool Origin Data% Move Log Cpy%Sync Convert
lv_clvm vg_clvm -wi-a----- 1.00g
lv_root vg_server -wi-ao---- 17.57g
lv_swap vg_server -wi-ao---- 1.94g
server13.neo.com | success | rc=0 >>
LV VG Attr LSize Pool Origin Data% Move Log Cpy%Sync Convert
lv_clvm vg_clvm -wi-a----- 1.00g
lv_root vg_server -wi-ao---- 17.57g
lv_swap vg_server -wi-ao---- 1.94g
server11.neo.com | success | rc=0 >>
LV VG Attr LSize Pool Origin Data% Move Log Cpy%Sync Convert
lv_clvm vg_clvm -wi-a----- 1.00g
lv_root vg_server -wi-ao---- 17.57g
lv_swap vg_server -wi-ao---- 1.94g
Create the GFS2 file system on the new logical volume;
[root@server11 ~]# mkfs.gfs2 -j 3 -t My_Cluster:clvm -p lock_dlm /dev/mapper/vg_clvm-lv_clvm
This will destroy any data on /dev/mapper/vg_clvm-lv_clvm.
It appears to contain: symbolic link to `../dm-2'
Are you sure you want to proceed? [y/n] y
Device: /dev/mapper/vg_clvm-lv_clvm
Blocksize: 4096
Device Size 1.00 GB (262144 blocks)
Filesystem Size: 1.00 GB (262142 blocks)
Journals: 3
Resource Groups: 4
Locking Protocol: “lock_dlm”
Lock Table: “My_Cluster:clvm”
UUID: dfcd6df8-c309-9fba-c19f-3f8d174096d9
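The new file system can be mounted on every node just like the first one, and because the logical volume is clustered it can later be grown online from any node; a sketch (the /mnt/clvm mount point and the +512M size are only illustrative, and vg_clvm needs enough free space, so extend it with another PV first if necessary):
[root@ansible ~]# ansible group1 -m shell -a 'mkdir -p /mnt/clvm && mount /dev/vg_clvm/lv_clvm /mnt/clvm'
[root@server11 ~]# lvextend -L +512M /dev/vg_clvm/lv_clvm // grow the clustered LV;
[root@server11 ~]# gfs2_grow /mnt/clvm // grow the mounted GFS2 file system to fill the LV;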