DRBD + Keepalived: an NFS high-availability setup in practice

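  • https://docs.linbit.com/ (official DRBD documentation)
  • Environment: CentOS 7
  • Primary: 192.168.212.10 (chy); NFS, keepalived and DRBD all run on this one machine
  • Backup: 192.168.212.11 (chy01)
  • Client: 192.168.212.12 (chy02)
  • Architecture diagram: (image not reproduced here)
  • Basic preparation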


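    1. Make sure the clocks on all servers are synchronized.
      yum -y install ntp 
      ntpdate -u time.nist.gov (clock synchronization command)

    2. Make sure the firewall and SELinux are disabled on all servers.
    3. Set each hostname so that the server's role is clear from its name.
    4. Make sure /etc/hosts on every server can resolve the hostname of every other server.
      Note: these are routine commands, so they are not covered here.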
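
The hosts requirement in the preparation list can be checked with a small script. This is only a sketch: the function name missing_hosts and its arguments are illustrative assumptions, and the hostnames are the three from the environment list above.

```shell
#!/bin/bash
# Sketch: print every hostname that has no entry in the given hosts file.
# missing_hosts is a hypothetical helper, not part of any package.
missing_hosts() {
    local hosts_file="$1"; shift
    local name
    for name in "$@"; do
        # Strip comments, then look for the name as a whole field after the IP.
        if ! awk -v h="$name" '
                { sub(/#.*/, "") }
                NF >= 2 { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
                END { exit (found ? 0 : 1) }' "$hosts_file"; then
            echo "$name"
        fi
    done
}

# Example: missing_hosts /etc/hosts chy chy01 chy02
```

Running it on each of the three machines should print nothing once /etc/hosts is complete.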
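  • Setting up DRBD
    1. DRBD internals are not covered here. This setup uses protocol C: the backup node receives the data over the network, writes it to disk, and only then returns OK to the master.
    2. Install DRBD (run the following on both the primary and the backup; I installed with yum rather than building from source)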
      [root@chy ~]# yum -y update kernel kernel-devel
      [root@chy ~]# rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm
      [root@chy ~]# yum -y install drbd84-utils kmod-drbd84

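      Create the disk partition (on both the primary and the backup). With DRBD installed on both nodes, prepare the disk. Here /dev/sdb is turned into a single 20G primary partition, and a 10G LVM logical volume is created on top of it.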

      [root@chy ~]# lsblk   # list all block devices
      NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
      sda           8:0    0   20G  0 disk 
      ├─sda1        8:1    0  200M  0 part /boot
      └─sda2        8:2    0 18.9G  0 part 
      ├─cl-root 253:0    0    9G  0 lvm  /
      ├─cl-swap 253:1    0  800M  0 lvm  [SWAP]
      ├─cl-home 253:2    0  500M  0 lvm  /home
      └─cl-var  253:3    0  8.6G  0 lvm  /var
      sdb           8:16   0   20G  0 disk 
      sr0          11:0    1  4.1G  0 rom  
      [root@chy ~]# fdisk /dev/sdb
      [root@chy ~]# pvcreate /dev/sdb1
      Physical volume "/dev/sdb" successfully created.
      [root@chy ~]# vgcreate nfsdisk /dev/sdb1
      [root@chy ~]# lvcreate -L 10G -n nfsvolume nfsdisk
      Logical volume "nfsvolume" created.
      [root@chy01 ~]# lsblk
      NAME                  MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
      sda                     8:0    0   20G  0 disk 
      ├─sda1                  8:1    0  200M  0 part /boot
      └─sda2                  8:2    0 18.9G  0 part 
      ├─cl-root           253:0    0    9G  0 lvm  /
      ├─cl-swap           253:1    0  800M  0 lvm  [SWAP]
      ├─cl-home           253:2    0  500M  0 lvm  /home
      └─cl-var            253:3    0  8.6G  0 lvm  /var
      sdb                     8:16   0   20G  0 disk 
      └─sdb1                  8:17   0   20G  0 part 
      └─nfsdisk-nfsvolume 253:4    0  100M  0 lvm  
      sr0                    11:0    1  4.1G  0 rom  
      [root@chy drbd.d]# ls -l /dev/nfsdisk/nfsvolume   # confirm the volume exists; it is needed later
      lrwxrwxrwx 1 root root 7 Dec 19 00:29 /dev/nfsdisk/nfsvolume -> ../dm-4
      [root@chy ~]# lvdisplay   # show logical volume details

      Configure DRBD on the primary as follows:

      
      [root@chy etc]# cat drbd.conf 
      # You can find an example in  /usr/share/doc/drbd.../drbd.conf.example

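#include "drbd.d/global_common.conf";   # commented out so it does not conflict with our own configuration
include "drbd.d/*.res";
include "drbd.d/*.cfg";   # added line so that *.cfg files are loaded
[root@chy drbd.d]# cat drbd_basic.cfg   # main configuration file (create drbd_basic.cfg under /etc/drbd.d)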
global {
usage-count yes; # whether to take part in the DRBD usage statistics; the default is yes, and either value works
}

common {
syncer { rate 30M; } # upper limit on the resync rate between the primary and backup nodes (30M here)
}
resource r0 { # r0 is the resource name; it is used later when initializing the metadata
protocol C; # use protocol C
handlers {
pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f ";
pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f ";
local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
fence-peer "/usr/lib4/heartbeat/drbd-peer-outdater -t 5";
pri-lost "echo pri-lst. Have a look at the log file. | mail -s 'Drbd Alert' root";
split-brain "/usr/lib/drbd/notify-split-brain.sh root";
out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
}
net {
cram-hmac-alg "sha1";
shared-secret "NFS-HA";
# authentication algorithm and shared secret used for DRBD replication
}
disk {
on-io-error detach;
fencing resource-only;

# Use DOPD (drbd outdate-peer daemon) so that no switchover happens while the data is out of sync.

}
startup {
    wfc-timeout 120;
    degr-wfc-timeout 120;
}
device /dev/drbd0;   # the device name that users mount; created by DRBD

on chy {    # each host section starts with "on" followed by the hostname (must be resolvable via /etc/hosts)
    disk /dev/nfsdisk/nfsvolume;   # backing device for /dev/drbd0; this is the volume created earlier and it must exist, or later steps fail
    address 192.168.212.10:7788;     # address and port DRBD listens on to talk to the peer
    meta-disk internal;     # where DRBD stores its metadata
}

on chy01 {
    disk /dev/nfsdisk/nfsvolume;
    address 192.168.212.11:7788;
    meta-disk internal;
}

}
[root@chy drbd.d]# drbdadm create-md r0   # create the metadata
WARN:
You are using the 'drbd-peer-outdater' as fence-peer program.
If you use that mechanism the dopd heartbeat plugin program needs
to be able to call drbdsetup and drbdmeta with root privileges.

You need to fix this with these commands:
chgrp haclient /lib/drbd/drbdsetup-84
chmod o-x /lib/drbd/drbdsetup-84
chmod u+s /lib/drbd/drbdsetup-84

chgrp haclient /usr/sbin/drbdmeta
chmod o-x /usr/sbin/drbdmeta
chmod u+s /usr/sbin/drbdmeta

initializing activity log
initializing bitmap (320 KB) to all zero
Writing meta data...
New drbd meta data block successfully created.   # created successfully
success

# The commands suggested in the warning are listed below; running them is optional:
useradd haclient
chgrp haclient /lib/drbd/drbdsetup-84
chmod o-x /lib/drbd/drbdsetup-84
chmod u+s /lib/drbd/drbdsetup-84
chgrp haclient /sbin/drbdmeta
chmod o-x /sbin/drbdmeta
chmod u+s /sbin/drbdmeta
[root@chy drbd.d]# service drbd start
[root@chy drbd.d]# ps aux |grep drbd
root 11624 0.0 0.0 0 0 ? S< 02:33 0:00 [drbd-reissue]
root 60363 0.0 0.0 0 0 ? S< 05:36 0:00 [drbd0_submit]
root 64193 0.0 0.0 112660 972 pts/0 R+ 05:51 0:00 grep --color=auto drbd

[root@chy drbd.d]# drbdadm primary all
# Set the current server to primary (the master node). If this step fails, run "drbdadm -- --overwrite-data-of-peer primary all" instead.
[root@chy drbd.d]# drbdadm primary all
WARN:
You are using the 'drbd-peer-outdater' as fence-peer program.
If you use that mechanism the dopd heartbeat plugin program needs
to be able to call drbdsetup and drbdmeta with root privileges.

You need to fix this with these commands:
chgrp haclient /lib/drbd/drbdsetup-84
chmod o-x /lib/drbd/drbdsetup-84
chmod u+s /lib/drbd/drbdsetup-84

chgrp haclient /usr/sbin/drbdmeta
chmod o-x /usr/sbin/drbdmeta
chmod u+s /usr/sbin/drbdmeta

0: State change failed: (-2) Need access to UpToDate data
Command 'drbdsetup-84 primary 0' terminated with exit code 17
[root@chy drbd.d]# drbdadm up r0   # bring the resource up
[root@chy drbd.d]# drbdadm primary r0 --force   # force this node to become primary

If DRBD starts normally, copy /etc/drbd.d/drbd_basic.cfg and /etc/drbd.conf to the backup machine. I used scp:

[root@chy01 drbd.d]# scp 192.168.212.10:/etc/drbd.conf /etc/drbd.conf
drbd.conf 100% 158 0.2KB/s 00:00
[root@chy01 drbd.d]# scp 192.168.212.10:/etc/drbd.d/drbd_basic.cfg /etc/drbd.d/drbd_basic.cfg
drbd_basic.cfg

On the backup machine, run drbdadm create-md r0 to create the metadata on /dev/mapper/nfsdisk-nfsvolume.

[root@chy01 drbd.d]# drbdadm create-md r0
WARN:
You are using the 'drbd-peer-outdater' as fence-peer program.
If you use that mechanism the dopd heartbeat plugin program needs
to be able to call drbdsetup and drbdmeta with root privileges.

You need to fix this with these commands:
chgrp haclient /lib/drbd/drbdsetup-84
chmod o-x /lib/drbd/drbdsetup-84
chmod u+s /lib/drbd/drbdsetup-84

chgrp haclient /usr/sbin/drbdmeta
chmod o-x /usr/sbin/drbdmeta
chmod u+s /usr/sbin/drbdmeta

initializing activity log
initializing bitmap (320 KB) to all zero
Writing meta data...
New drbd meta data block successfully created.
success

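1. Briefly, the steps on 192.168.212.11 (backup) are the same as on the primary.
Prepare the backing device named in the configuration file (the LVM volume on /dev/sdb1); it must be the same size on both nodes.
2. Run drbdadm create-md r0 to create the metadata on that device.
3. Start the service: service drbd start.
4. Check the status: service drbd status.
After these steps are done on the backup server, check the DRBD status there: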

[root@chy01 drbd.d]# service drbd status
Redirecting to /bin/systemctl status drbd.service
● drbd.service - DRBD -- please disable. Unless you are NOT using a cluster manager.
Loaded: loaded (/usr/lib/systemd/system/drbd.service; disabled; vendor preset: disabled)
Active: active (exited) since Fri 2017-12-15 06:12:37 CST; 3min 46s ago
Process: 3995 ExecStart=/lib/drbd/drbd start (code=exited, status=0/SUCCESS)
Main PID: 3995 (code=exited, status=0/SUCCESS)

Dec 15 06:12:37 chy01 drbd[3995]: to be able to call drbdsetup and drbdmeta with root privileges.
Dec 15 06:12:37 chy01 drbd[3995]: You need to fix this with these commands:
Dec 15 06:12:37 chy01 drbd[3995]: chgrp haclient /lib/drbd/drbdsetup-84
Dec 15 06:12:37 chy01 drbd[3995]: chmod o-x /lib/drbd/drbdsetup-84
Dec 15 06:12:37 chy01 drbd[3995]: chmod u+s /lib/drbd/drbdsetup-84
Dec 15 06:12:37 chy01 drbd[3995]: chgrp haclient /usr/sbin/drbdmeta
Dec 15 06:12:37 chy01 drbd[3995]: chmod o-x /usr/sbin/drbdmeta
Dec 15 06:12:37 chy01 drbd[3995]: chmod u+s /usr/sbin/drbdmeta
Dec 15 06:12:37 chy01 drbd[3995]: .
Dec 15 06:12:37 chy01 systemd[1]: Started DRBD -- please disable. Unless you are NOT using a cluster manager..
[root@chy01 drbd.d]# drbdadm dstate r0
WARN:
You are using the 'drbd-peer-outdater' as fence-peer program.
If you use that mechanism the dopd heartbeat plugin program needs
to be able to call drbdsetup and drbdmeta with root privileges.

You need to fix this with these commands:
chgrp haclient /lib/drbd/drbdsetup-84
chmod o-x /lib/drbd/drbdsetup-84
chmod u+s /lib/drbd/drbdsetup-84

chgrp haclient /usr/sbin/drbdmeta
chmod o-x /usr/sbin/drbdmeta
chmod u+s /usr/sbin/drbdmeta

Inconsistent/UpToDate
[root@chy01 drbd.d]# drbdadm role r0
WARN:
You are using the 'drbd-peer-outdater' as fence-peer program.
If you use that mechanism the dopd heartbeat plugin program needs
to be able to call drbdsetup and drbdmeta with root privileges.

You need to fix this with these commands:
chgrp haclient /lib/drbd/drbdsetup-84
chmod o-x /lib/drbd/drbdsetup-84
chmod u+s /lib/drbd/drbdsetup-84

chgrp haclient /usr/sbin/drbdmeta
chmod o-x /usr/sbin/drbdmeta
chmod u+s /usr/sbin/drbdmeta

Secondary/Primary
[root@chy drbd.d]# service drbd status   # checked on the DRBD master
Redirecting to /bin/systemctl status drbd.service
● drbd.service - DRBD -- please disable. Unless you are NOT using a cluster manager.
Loaded: loaded (/usr/lib/systemd/system/drbd.service; disabled; vendor preset: disabled)
Active: active (exited) since Fri 2017-12-15 02:33:20 CST; 3h 40min ago
Main PID: 11620 (code=exited, status=0/SUCCESS)
CGroup: /system.slice/drbd.service

Dec 15 02:33:20 chy systemd[1]: Starting DRBD -- please disable. Unless you are NOT using a cluster manager....
Dec 15 02:33:20 chy drbd[11620]: Starting DRBD resources: no resources defined!
Dec 15 02:33:20 chy drbd[11620]: no resources defined!
Dec 15 02:33:20 chy drbd[11620]: WARN: stdin/stdout is not a TTY; using /dev/consoleWARN: stdin/stdout is...ined!
Dec 15 02:33:20 chy drbd[11620]: .
Dec 15 02:33:20 chy systemd[1]: Started DRBD -- please disable. Unless you are NOT using a cluster manager..
Hint: Some lines were ellipsized, use -l to show in full.
[root@chy drbd.d]# drbdadm dstate r0
WARN:
You are using the 'drbd-peer-outdater' as fence-peer program.
If you use that mechanism the dopd heartbeat plugin program needs
to be able to call drbdsetup and drbdmeta with root privileges.

You need to fix this with these commands:
chgrp haclient /lib/drbd/drbdsetup-84
chmod o-x /lib/drbd/drbdsetup-84
chmod u+s /lib/drbd/drbdsetup-84

chgrp haclient /usr/sbin/drbdmeta
chmod o-x /usr/sbin/drbdmeta
chmod u+s /usr/sbin/drbdmeta

UpToDate/Inconsistent
[root@chy drbd.d]# drbdadm role r0
WARN:
You are using the 'drbd-peer-outdater' as fence-peer program.
If you use that mechanism the dopd heartbeat plugin program needs
to be able to call drbdsetup and drbdmeta with root privileges.

You need to fix this with these commands:
chgrp haclient /lib/drbd/drbdsetup-84
chmod o-x /lib/drbd/drbdsetup-84
chmod u+s /lib/drbd/drbdsetup-84

chgrp haclient /usr/sbin/drbdmeta
chmod o-x /usr/sbin/drbdmeta
chmod u+s /usr/sbin/drbdmeta

Primary/Secondary

Mount the DRBD device (on the DRBD master):

[root@chy drbd.d]# mkfs.ext4 /dev/drbd0   # the device must be formatted first
mke2fs 1.42.9 (28-Dec-2013)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
655360 inodes, 2621351 blocks
131067 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=2151677952
80 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632

Allocating group tables: done
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done
[root@chy drbd.d]# mkdir /database
[root@chy drbd.d]# mount /dev/drbd0 /database/
[root@chy drbd.d]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/cl-root 8.8G 7.6G 748M 92% /
devtmpfs 737M 0 737M 0% /dev
tmpfs 748M 4.0K 748M 1% /dev/shm
tmpfs 748M 8.6M 739M 2% /run
tmpfs 748M 0 748M 0% /sys/fs/cgroup
/dev/sda1 190M 136M 41M 77% /boot
/dev/mapper/cl-var 8.4G 3.5G 4.5G 44% /var
/dev/mapper/cl-home 497M 66M 431M 14% /home
tmpfs 150M 0 150M 0% /run/user/0
/dev/drbd0 9.8G 37M 9.2G 1% /database
[root@chy ~]# cat /proc/drbd   # on CentOS 7, use this to check the current DRBD state
version: 8.4.10-1 (api:1/proto:86-101)
GIT-hash: a4d5de01fffd7e4cde48a080e2c686f9e8cebf4c build by mockbuild@, 2017-09-15 14:23:22
0: cs:Connected ro:Primary/Secondary ds:Diskless/UpToDate C r-----
ns:135468 nr:1440 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
[root@chy01 ~]# cat /proc/drbd   # on CentOS 7, use this to check the current DRBD state
version: 8.4.10-1 (api:1/proto:86-101)
GIT-hash: a4d5de01fffd7e4cde48a080e2c686f9e8cebf4c build by mockbuild@, 2017-09-15 14:23:22
0: cs:Connected ro:Secondary/Primary ds:UpToDate/Diskless C r-----
ns:1440 nr:135468 dw:10620872 dr:1440 al:125 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:10485404
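
The status line in /proc/drbd packs the connection state (cs), the local/peer roles (ro) and the local/peer disk states (ds) into one row. A minimal sketch for pulling out a single field, useful in scripts that must decide whether a node is Primary; drbd_field is a hypothetical helper name, not a DRBD tool:

```shell
# Sketch: extract one field (cs, ro or ds) from /proc/drbd style text on stdin.
drbd_field() {
    # $1 = field prefix; only the per-device line ("0: cs:...") is inspected.
    awk -v key="$1" '
        /^ *[0-9]+: / {
            for (i = 2; i <= NF; i++)
                if (index($i, key ":") == 1) {
                    print substr($i, length(key) + 2)
                    exit
                }
        }'
}

# Example: drbd_field ro < /proc/drbd
```

On the primary shown above, `drbd_field ro < /proc/drbd` would print Primary/Secondary, with the local role first.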

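After mounting on the primary, format and mount on the backup as well, mainly to verify that the backup can mount and use the device. The roles must be switched before formatting; see "Switching DRBD device roles" below. (The backup can mount the device in the same way, but only after it has been promoted to master. Only the master can mount the device.)
Switching DRBD device roles

Before switching roles, first run umount on the current primary to unmount the device and demote that node to Secondary; then set the DRBD role on the other host to Primary and mount the device there.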
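
The order of operations matters here, so it may help to see it as two small functions. This is a sketch under assumptions: the resource name, device and mount point are the ones used in this article, while DRY_RUN and the function names are introduced only for illustration.

```shell
# Sketch of the manual switchover order (not a replacement for the steps below).
RES=r0
DEV=/dev/drbd0
MNT=/database

run() {
    # Print the command when DRY_RUN=1, otherwise execute it.
    if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

demote_local() {      # run on the current Primary
    run umount "$MNT" && run drbdadm secondary "$RES"
}

promote_local() {     # run on the node that should become Primary
    run drbdadm primary "$RES" && run mount "$DEV" "$MNT"
}

# Example: DRY_RUN=1 demote_local
```

The `&&` keeps the second step from running if the first fails, for example when umount reports that the target is busy.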

Method 1:
On 192.168.212.10 (currently primary):

[root@chy ~]# umount /dev/drbd0
[root@chy ~]# drbdadm secondary r0
WARN:
You are using the 'drbd-peer-outdater' as fence-peer program.
If you use that mechanism the dopd heartbeat plugin program needs
to be able to call drbdsetup and drbdmeta with root privileges.

You need to fix this with these commands:
chgrp haclient /lib/drbd/drbdsetup-84
chmod o-x /lib/drbd/drbdsetup-84
chmod u+s /lib/drbd/drbdsetup-84

chgrp haclient /usr/sbin/drbdmeta
chmod o-x /usr/sbin/drbdmeta
chmod u+s /usr/sbin/drbdmeta
[root@chy ~]# cat /proc/drbd
version: 8.4.10-1 (api:1/proto:86-101)
GIT-hash: a4d5de01fffd7e4cde48a080e2c686f9e8cebf4c build by mockbuild@, 2017-09-15 14:23:22
0: cs:Connected ro:Secondary/Secondary ds:Diskless/UpToDate C r-----
ns:135552 nr:1917 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
Both nodes are now Secondary.
On 192.168.212.11:
[root@chy01 ~]# drbdadm primary r0
WARN:
You are using the 'drbd-peer-outdater' as fence-peer program.
If you use that mechanism the dopd heartbeat plugin program needs
to be able to call drbdsetup and drbdmeta with root privileges.

You need to fix this with these commands:
chgrp haclient /lib/drbd/drbdsetup-84
chmod o-x /lib/drbd/drbdsetup-84
chmod u+s /lib/drbd/drbdsetup-84

chgrp haclient /usr/sbin/drbdmeta
chmod o-x /usr/sbin/drbdmeta
chmod u+s /usr/sbin/drbdmeta
[root@chy01 ~]# cat /proc/drbd
version: 8.4.10-1 (api:1/proto:86-101)
GIT-hash: a4d5de01fffd7e4cde48a080e2c686f9e8cebf4c build by mockbuild@, 2017-09-15 14:23:22
0: cs:Connected ro:Primary/Secondary ds:UpToDate/Diskless C r-----
ns:1917 nr:135552 dw:10784752 dr:2829 al:148 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:10485404
The switchover succeeded.
[root@chy01 ~]# mkfs.ext4 /dev/drbd0   # the device cannot be formatted until this node is primary
Method 2:
On 192.168.212.11 (currently primary):
[root@chy01 ~]# service drbd stop   # stop the service first
Stopping all DRBD resources:
[root@chy01 ~]# service drbd start   # then start it again

On 192.168.212.10:

[root@chy ~]# drbdadm primary r0
WARN:
You are using the 'drbd-peer-outdater' as fence-peer program.
If you use that mechanism the dopd heartbeat plugin program needs
to be able to call drbdsetup and drbdmeta with root privileges.

You need to fix this with these commands:
chgrp haclient /lib/drbd/drbdsetup-84
chmod o-x /lib/drbd/drbdsetup-84
chmod u+s /lib/drbd/drbdsetup-84

chgrp haclient /usr/sbin/drbdmeta
chmod o-x /usr/sbin/drbdmeta
chmod u+s /usr/sbin/drbdmeta

[root@chy ~]# cat /proc/drbd
version: 8.4.10-1 (api:1/proto:86-101)
GIT-hash: a4d5de01fffd7e4cde48a080e2c686f9e8cebf4c build by mockbuild@, 2017-09-15 14:23:22
0: cs:Connected ro:Primary/Secondary ds:Diskless/UpToDate C r-----
ns:135552 nr:2829 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
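I normally use the first method.

Note: the above is only a small back-and-forth switchover test. Next comes keepalived.
keepalived on the primary (the configuration options are not explained here; see my article at https://blog.51cto.com/chy940405/2052014 if needed)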

[root@chy keepalived]# cat keepalived.conf

global_defs {
notification_email {
chy@chy.com
}
notification_email_from root@chy.com
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id LVS_DEVEL
}
vrrp_script chk_nfs {
script "/etc/keepalived/check_nfs.sh"
interval 5
}
vrrp_instance VI_1 {
state MASTER
interface br0
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass chylinux
}
virtual_ipaddress {
192.168.212.100   # VIP address
}
track_script {
chk_nfs
}
notify_master /etc/keepalived/notify_master.sh   # script run when this node becomes master
notify_stop /etc/keepalived/notify_stop.sh
}
[root@chy keepalived]# cat check_nfs.sh
#!/bin/sh

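### Check NFS availability: is the service running
/sbin/service nfs status &>/dev/null
if [ $? -ne 0 ];then
### If the service is not healthy, try restarting it first
/sbin/service nfs restart &>/dev/null
/sbin/service nfs status &>/dev/null
if [ $? -ne 0 ];then
### NFS is still unhealthy after the restart
### Unmount the drbd0 device
umount /dev/drbd0
### Demote the DRBD primary to secondary
drbdadm secondary r0
# Stop keepalived
/sbin/service keepalived stop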
fi
fi
[root@chy keepalived]# cat notify_master.sh
#!/bin/bash

time=$(date "+%F %H:%M:%S")
echo -e "$time ------notify_master------\n" >> /etc/keepalived/logs/notify_master.log
/sbin/drbdadm primary r0 &>> /etc/keepalived/logs/notify_master.log
/bin/mount /dev/drbd0 /database &>> /etc/keepalived/logs/notify_master.log
/sbin/service nfs restart &>> /var/log/master.log
echo -e "\n" >> /etc/keepalived/logs/notify_master.log
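
The primary's keepalived.conf also references notify_stop.sh, which is not listed. Its steps can be read off the `sh -x notify_stop.sh` trace in the test section further down, so the following is a reconstruction from that trace rather than the original file. The DRY_RUN switch and the step helper are added assumptions so the sequence can be previewed safely.

```shell
#!/bin/bash
# Sketch of notify_stop.sh, rebuilt from the sh -x trace in the test section.
# LOG matches the log file read there; DRY_RUN and step() are illustrative.
LOG=${LOG:-/etc/keepalived/logs/notify_stop.log}

step() {
    # Print the command when DRY_RUN=1, otherwise execute it.
    if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

notify_stop() {
    local time
    time=$(date "+%F %H:%M:%S")
    echo -e "$time ------notify_stop------\n"
    step /sbin/service nfs stop          # stop NFS first so it releases the mount
    step /bin/umount /database           # then unmount the DRBD device
    step /sbin/drbdadm secondary all     # finally demote DRBD
    echo -e "\n"
}

# In the deployed script the last line would be:
# notify_stop &>> "$LOG"
```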

On the backup

[root@chy01 keepalived]# cat keepalived.conf

global_defs {
notification_email {
chy@chy.com
}
notification_email_from root@chy.com
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id LVS_DEVEL
}
vrrp_script chk_nfs {
script "/etc/keepalived/check_nfs.sh"
interval 5
}
vrrp_instance VI_1 {
state BACKUP
interface ens33
virtual_router_id 51
priority 90
advert_int 1
authentication {
auth_type PASS
auth_pass chylinux
}
virtual_ipaddress {
192.168.212.100
}
track_script {
chk_nfs
}

notify_master /etc/keepalived/notify_master.sh

notify_backup /etc/keepalived/notify_backup.sh
}
[root@chy01 keepalived]# cat check_nfs.sh
#!/bin/sh

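### Check NFS availability: is the service running
/sbin/service nfs status &>/dev/null
if [ $? -ne 0 ];then
### If the service is not healthy, try restarting it first
/sbin/service nfs restart &>/dev/null
/sbin/service nfs status &>/dev/null
if [ $? -ne 0 ];then
### NFS is still unhealthy after the restart
### Unmount the drbd0 device
umount /dev/drbd0
### Demote the DRBD primary to secondary
drbdadm secondary r0
# Stop keepalived
/sbin/service keepalived stop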
fi
fi
[root@chy01 keepalived]# cat notify_backup.sh
#!/bin/bash

time=$(date "+%F %H:%M:%S")
echo -e "$time ------notify_backup------\n" >> /etc/keepalived/logs/notify_backup.log
/sbin/service nfs stop &>> /etc/keepalived/logs/notify_backup.log
/bin/umount /database &>> /etc/keepalived/logs/notify_backup.log
/sbin/drbdadm secondary all &>> /etc/keepalived/logs/notify_backup.log
echo -e "\n" >> /etc/keepalived/logs/notify_backup.log
[root@chy01 keepalived]# cat notify_master.sh
#!/bin/bash
time=$(date "+%F %H:%M:%S")
echo -e "$time ------notify_master------\n" >> /etc/keepalived/logs/notify_master.log
/sbin/drbdadm primary r0 &>> /etc/keepalived/logs/notify_master.log
/bin/mount /dev/drbd0 /database &>> /etc/keepalived/logs/notify_master.log
/sbin/service nfs restart &>> /var/log/master.log
echo -e "\n" >> /etc/keepalived/logs/notify_master.log
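Stopping keepalived on the primary follows the expected sequence: stop NFS on the primary, unmount the device, demote DRBD on the primary, promote DRBD on the backup, mount the device on the backup, start NFS on the backup. That is roughly what the scripts do.

NFS on the primary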

[root@chy ~]# cat /etc/exports
/database 192.168.212.12/24(rw,sync,no_root_squash)
The shared directory here is /database, because the DRBD device was mounted on it earlier.
[root@chy ~]# exportfs -avr
exporting 192.168.212.12/24:/database
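
The entry 192.168.212.12/24 combines a host address with a /24 prefix, so the network it names is easier to see after masking off the host bits. A minimal sketch of that arithmetic in bash; cidr_network is a hypothetical helper and not part of nfs-utils:

```shell
# Sketch: compute the network address for an ADDRESS/PREFIX pair.
cidr_network() {
    local ip=${1%/*} prefix=${1#*/}
    local a b c d
    IFS=. read -r a b c d <<< "$ip"
    # Pack the four octets into one 32-bit integer.
    local addr=$(( (a << 24) | (b << 16) | (c << 8) | d ))
    # Build the netmask; a prefix of 0 means no bits are fixed.
    local mask=$(( prefix == 0 ? 0 : (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    local net=$(( addr & mask ))
    echo "$(( (net >> 24) & 255 )).$(( (net >> 16) & 255 )).$(( (net >> 8) & 255 )).$(( net & 255 ))/$prefix"
}

# Example: cidr_network 192.168.212.12/24
```

For the entry above this gives 192.168.212.0/24, which suggests the export is meant for the whole subnet rather than only the client at .12; writing the network form in /etc/exports would make that intent explicit.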

For NFS on the backup, simply scp the exports file over from the primary.

On the NFS client:

[root@chy02 mnt]# showmount -e 192.168.212.100
Export list for 192.168.212.100:
/database 192.168.212.12/24
[root@chy02 mnt]# mount -t nfs 192.168.212.100:/database/ /mnt
[root@chy02 mnt]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/cl-root 8.8G 5.7G 2.6G 69% /
devtmpfs 737M 0 737M 0% /dev
tmpfs 748M 0 748M 0% /dev/shm
tmpfs 748M 8.6M 739M 2% /run
tmpfs 748M 0 748M 0% /sys/fs/cgroup
/dev/sda1 190M 107M 70M 61% /boot
/dev/mapper/cl-var 8.4G 276M 7.7G 4% /var
/dev/mapper/cl-home 497M 26M 472M 6% /home
tmpfs 150M 0 150M 0% /run/user/0
192.168.212.100:/database 93M 1.5M 85M 2% /mnt
[root@chy02 mnt]# touch 1.19
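
A single touch shows that the share is writable right now. To watch how the client behaves during a switchover, a repeated probe can be useful. The following is a sketch only: probe_mount and the marker file name .ha_probe are assumptions made for illustration.

```shell
# Sketch: try one small write on the mounted share and report the result.
probe_mount() {
    local marker="$1/.ha_probe"
    # The braces keep redirection errors (e.g. a missing directory) quiet.
    if { date "+%F %T" > "$marker"; } 2>/dev/null && [ -s "$marker" ]; then
        echo "ok"
    else
        echo "fail"
        return 1
    fi
}

# Example loop on the client (stop with Ctrl-C):
# while true; do echo "$(date '+%T') $(probe_mount /mnt)"; sleep 1; done
```

With NFS hard mounts a write typically blocks until the server answers again instead of failing, so a pause in the loop output during failover is the likely symptom rather than a "fail" line.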

Start testing on the primary and backup (everything below is part of the test)

[root@chy keepalived]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/cl-root 8.8G 7.6G 744M 92% /
devtmpfs 737M 0 737M 0% /dev
tmpfs 748M 4.0K 748M 1% /dev/shm
tmpfs 748M 26M 722M 4% /run
tmpfs 748M 0 748M 0% /sys/fs/cgroup
/dev/sda1 190M 136M 41M 77% /boot
/dev/mapper/cl-var 8.4G 3.1G 4.9G 39% /var
/dev/mapper/cl-home 497M 66M 431M 14% /home
tmpfs 150M 0 150M 0% /run/user/0
/dev/drbd0 93M 1.6M 85M 2% /database
[root@chy keepalived]# cd /database/
[root@chy database]# ls
~ 111 112 1.19 12.19 222 333 444 bbb

Now stop keepalived on the primary and check from the client whether the share is still usable

[root@chy database]# systemctl stop keepalived
[root@chy database]# ps aux |grep keepalived
root 83980 0.0 0.0 112660 980 pts/1 R+ 06:53 0:00 grep --color=auto keepalived
[root@chy database]# cat /proc/drbd
version: 8.4.10-1 (api:1/proto:86-101)
GIT-hash: a4d5de01fffd7e4cde48a080e2c686f9e8cebf4c build by mockbuild@, 2017-09-15 14:23:22
0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:29 nr:56 dw:85 dr:3674 al:2 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
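Odd: why is the primary still Primary? Shouldn't it have switched over? No need to worry, check the log.

[root@chy database]# cat /etc/keepalived/logs/notify_stop.log
This is the log defined earlier in the script; it contains the following errors:
Redirecting to /bin/systemctl stop nfs.service
umount: /database: target is busy.
(In some cases useful info about processes that use
the device is found by lsof(8) or fuser(1))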
WARN:
You are using the 'drbd-peer-outdater' as fence-peer program.
If you use that mechanism the dopd heartbeat plugin program needs
to be able to call drbdsetup and drbdmeta with root privileges.

You need to fix this with these commands:
chgrp haclient /lib/drbd/drbdsetup-84
chmod o-x /lib/drbd/drbdsetup-84
chmod u+s /lib/drbd/drbdsetup-84

chgrp haclient /usr/sbin/drbdmeta
chmod o-x /usr/sbin/drbdmeta
chmod u+s /usr/sbin/drbdmeta

0: State change failed: (-12) Device is held open by someone
Command 'drbdsetup-84 secondary 0' terminated with exit code 11
[root@chy database]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/cl-root 8.8G 7.6G 744M 92% /
devtmpfs 737M 0 737M 0% /dev
tmpfs 748M 4.0K 748M 1% /dev/shm
tmpfs 748M 26M 722M 4% /run
tmpfs 748M 0 748M 0% /sys/fs/cgroup
/dev/sda1 190M 136M 41M 77% /boot
/dev/mapper/cl-var 8.4G 3.1G 4.9G 39% /var
/dev/mapper/cl-home 497M 66M 431M 14% /home
tmpfs 150M 0 150M 0% /run/user/0
/dev/drbd0 93M 1.6M 85M 2% /database
[root@chy database]# umount /database/   # unmounting by hand fails as well
umount: /database: target is busy.
(In some cases useful info about processes that use
the device is found by lsof(8) or fuser(1))
The fix is as follows:
[root@chy database]# fuser -m /dev/drbd0/
/dev/drbd0: 595c
[root@chy database]# ps aux |grep 595
root 595 0.0 0.1 115720 2340 pts/1 Ss 02:31 0:00 -bash
root 90942 0.0 0.0 112660 976 pts/1 R+ 07:02 0:00 grep --color=auto 595
[root@chy database]# kill 595
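
In the fuser output above, the trailing c on 595c marks a process whose current directory is on the device; here it is the login shell sitting in /database. A small sketch for turning such output into plain PIDs before deciding what to do with them; holder_pids is a hypothetical helper name:

```shell
# Sketch: read fuser-style output ("/dev/drbd0: 595c") on stdin, print the PIDs.
holder_pids() {
    # Drop the "name:" prefix, put each entry on its own line,
    # then strip the access-type letters that follow each PID.
    sed 's/^[^:]*://' | tr -s ' \t' '\n' | sed -n 's/^\([0-9][0-9]*\)[a-zA-Z]*$/\1/p'
}

# Example: fuser -m /database 2>&1 | holder_pids
```

When the only holder is an interactive shell, changing out of the directory (cd /) is a gentler alternative to killing it.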
[root@chy keepalived]# sh -x notify_stop.sh
++ date '+%F %H:%M:%S'
+ time='2017-12-19 07:04:19'
+ echo -e '2017-12-19 07:04:19 ------notify_stop------\n'
+ /sbin/service nfs stop
+ /bin/umount /database
+ /sbin/drbdadm secondary all
+ echo -e '\n'
    [root@chy keepalived]# df -h   # the device is no longer mounted
    Filesystem Size Used Avail Use% Mounted on
    /dev/mapper/cl-root 8.8G 7.6G 744M 92% /
    devtmpfs 737M 0 737M 0% /dev
    tmpfs 748M 4.0K 748M 1% /dev/shm
    tmpfs 748M 26M 722M 4% /run
    tmpfs 748M 0 748M 0% /sys/fs/cgroup
    /dev/sda1 190M 136M 41M 77% /boot
    /dev/mapper/cl-var 8.4G 3.1G 4.9G 39% /var
    /dev/mapper/cl-home 497M 66M 431M 14% /home
    tmpfs 150M 0 150M 0% /run/user/0
    [root@chy01 ~]# cat /proc/drbd   # on the backup: the switchover has completed normally
    version: 8.4.10-1 (api:1/proto:86-101)
    GIT-hash: a4d5de01fffd7e4cde48a080e2c686f9e8cebf4c build by mockbuild@, 2017-09-15 14:23:22
    0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:57 nr:113 dw:170 dr:3665 al:2 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
    [root@chy01 ~]# df -h
    Filesystem Size Used Avail Use% Mounted on
    /dev/mapper/cl-root 8.8G 5.8G 2.6G 69% /
    devtmpfs 737M 0 737M 0% /dev
    tmpfs 748M 0 748M 0% /dev/shm
    tmpfs 748M 17M 731M 3% /run
    tmpfs 748M 0 748M 0% /sys/fs/cgroup
    /dev/sda1 190M 135M 41M 77% /boot
    /dev/mapper/cl-var 8.4G 710M 7.3G 9% /var
    /dev/mapper/cl-home 497M 66M 432M 14% /home
    tmpfs 150M 0 150M 0% /run/user/0
    /dev/drbd0 93M 1.6M 85M 2% /database
    [root@chy02 ~]# cd /mnt/
    [root@chy02 mnt]# ls
    ~ 111 112 1.19 12.19 222 333 444 bbb
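    All the data is there; at this point high availability is working.
    
    The setup above is essentially complete, **but one problem remains unsolved: if the primary node loses power or is shut down abruptly, the primary/backup switchover fails. If anyone has a good approach, feel free to get in touch to discuss it; I look forward to better methods.**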

Source: http://pcwzsj.com/article/ishcic.html