LVM on top of DRBD: “Failure: (104) Can not open backing device”

Using Proxmox VE in a two-node cluster configuration with LVM on top of DRBD makes it possible to have filesystem data redundancy and to support online migration of VMs between host controllers.

DRBD provides block devices designed as a building block for high availability (HA) clusters. It works by mirroring a whole block device over a dedicated network link, so DRBD can be understood as “network-based RAID 1”.
For detailed information please visit Linbit.
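
As a reference only, a minimal DRBD 8.3 resource definition for such a two-node setup might look like the following (for instance in /etc/drbd.d/r0.res). The peer hostname pve01, the IP addresses and the TCP port are placeholders that must match your own cluster; /dev/sdb1 is the backing partition used throughout this article:

resource r0 {
    protocol C;
    net {
        # both nodes Primary at the same time, needed for online VM migration
        allow-two-primaries;
    }
    on pve01 {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   10.0.0.1:7788;
        meta-disk internal;
    }
    on pve02 {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   10.0.0.2:7788;
        meta-disk internal;
    }
}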

After rebooting a controller you may find that DRBD is no longer working properly: the rebooted node reports “Failure: (104) Can not open backing device.” and its disk state (ds) is “Diskless”.

From pve02 (Rebooted Proxmox VE Node):

root@pve02:~# cat /proc/drbd
version: 8.3.10 (api:88/proto:86-96)
GIT-hash: 5c0b0469666682443d4785d90a2c603378f9017b build by phil@fat-tyre, 2011-01-28 12:17:35
 0: cs:Connected ro:Primary/Primary ds:Diskless/UpToDate C r-----
    ns:0 nr:80 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
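
The same states can also be queried per resource with drbdadm; here the resource is assumed to be named r0, as the restart output further down suggests:

root@pve02:~# drbdadm cstate r0
Connected
root@pve02:~# drbdadm dstate r0
Diskless/UpToDate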

While trying to restart DRBD:

root@pve02:~# /etc/init.d/drbd restart
Restarting all DRBD resources:DRBD module version: 8.3.10
0: Failure: (104) Can not open backing device.
Command '/sbin/drbdsetup 0 disk /dev/sdb1 /dev/sdb1 internal
         --set-defaults --create-device' terminated with exit code 10.

The corresponding messages in dmesg:

[...]
block drbd0: Starting worker thread (from cqueue [1561])
block drbd0: open("/dev/sdb1") failed with -16
block drbd0: drbd_bm_resize called with capacity == 0
block drbd0: worker terminated
block drbd0: Terminating worker thread
[...]
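
Error -16 is EBUSY: something else is already holding /dev/sdb1 open. Assuming a stock Proxmox VE install, the kernel’s holders list and device-mapper can tell you what that is; any dm-* entry under the partition belongs to an LVM logical volume activated directly on it:

root@pve02:~# ls /sys/class/block/sdb1/holders/    # dm-* entries mean device-mapper holds the partition
root@pve02:~# dmsetup ls --tree                    # shows the LVs and the block devices they sit on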

Analyzing the system with lvscan and pvscan confirms it: the Logical Volume on top of DRBD is already marked ‘ACTIVE’ on the rebooted node, because LVM found the Physical Volume signature on the backing partition at boot and activated the Volume Group from it, keeping the device busy:

[...]
 ACTIVE '/dev/drbdvg/vm-111-disk-1' [50.00 GiB] inherit
[...]
 PV /dev/drbd0 VG drbdvg lvm2 [465.74 GiB / 415.74 GiB free]
[...]
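
The same information is available in a more compact form from pvs and lvs (standard lvm2 commands, column names chosen here as an example):

root@pve02:~# pvs -o pv_name,vg_name,pv_size,pv_free
root@pve02:~# lvs -o lv_name,vg_name,lv_size,lv_attr drbdvg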

To recover interactively, first deactivate the Volume Group so that the backing device is released, then start DRBD again and let it resync:

root@pve02:~# vgchange -a n drbdvg
 0 logical volume(s) in volume group "drbdvg" now active
root@pve02:~# /etc/init.d/drbd start
Starting DRBD resources:DRBD module version: 8.3.10
[ d(r0) s(r0) n(r0) ]DRBD module version: 8.3.10
root@pve02:~# cat /proc/drbd
version: 8.3.10 (api:88/proto:86-96)
GIT-hash: 5c0b0469666682443d4785d90a2c603378f9017b build by phil@fat-tyre, 2011-01-28 12:17:35
 0: cs:SyncTarget ro:Primary/Primary ds:Inconsistent/UpToDate C r-----
    ns:0 nr:164104 dw:155908 dr:144 al:0 bm:33 lo:66 pe:376 ua:65 ap:0 ep:1 wo:b oos:3743948
	[>....................] sync'ed:  4.1% (3743948/3899800)K
	finish: 0:03:11 speed: 19,480 (19,480) want: 30,720 K/sec
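
The resynchronisation can be followed until it completes, for example with watch (the 2-second interval is just an example):

root@pve02:~# watch -n 2 cat /proc/drbd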

Let’s finish the work by reactivating LVM on top of DRBD:

root@pve02:~# vgchange -a y drbdvg
 1 logical volume(s) in volume group "drbdvg" now active
root@pve02:~# lvscan 
 ACTIVE '/dev/drbdvg/vm-111-disk-1' [50.00 GiB] inherit
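
As an optional check, lvdisplay should now report the VM disk as available again:

root@pve02:~# lvdisplay /dev/drbdvg/vm-111-disk-1 | grep 'LV Status'
  LV Status              available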

After a few minutes the resync completes:

root@pve02:~# cat /proc/drbd
version: 8.3.10 (api:88/proto:86-96)
GIT-hash: 5c0b0469666682443d4785d90a2c603378f9017b build by phil@fat-tyre, 2011-01-28 12:17:35
 0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
    ns:0 nr:3923588 dw:3923588 dr:23332 al:0 bm:292 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

This problem is commonly caused by a misconfiguration of the LVM filter in /etc/lvm/lvm.conf.
By default, the filter accepts every block device:

# By default we accept every block device:
filter = [ "a/.*/" ]

In a DRBD cluster with LVM on top, the LVM filter in /etc/lvm/lvm.conf has to reject the DRBD backing device (here /dev/sdb1) so that LVM only ever sees the Physical Volume through /dev/drbd0:

# ProxMox DRBD Cluster
filter = [ "r|/dev/sdb1|", "r|/dev/disk/|", "r|/dev/block/|", "a/.*/" ]
