Monthly Archives: August 2020

How to fix your zfs pool after /dev/sdx name have changed

if you have created your zfs pool earlier using /dev/sdx name instead of /dev/disk/by-id/xxx then you might face an issue if the dev name changed and you try to replace the drive with itself using the new OS generated name under /dev/sdx, similar to below error:

alsolh@server:~/diskmanagement$ sudo zpool replace -f tank 53495565608654595 /dev/sdc1
invalid vdev specification
the following errors must be manually repaired:
/dev/sdc1 is part of active pool ‘tank’

alsolh@server:/dev/disk/by-id$ sudo zpool replace -f tank 53495565608654595 /dev/disk/by-id/ata-WDC_WD100EMAZ-00WXXXX_1EGGXXXX-part1
invalid vdev specification
the following errors must be manually repaired:
/dev/disk/by-id/ata-WDC_WD100EMAZ-00WXXXX_1EGGXXXX-part1 is part of active pool ‘tank’

to solve this issue, referring to both articles below:

https://unix.stackexchange.com/questions/346713/zfs-ubuntu-16-04-replace-drive-with-itself
https://plantroon.com/changing-disk-identifiers-in-zpool/

first to avoid having this issue in other drives run below command

zpool export tank
zpool import -d /dev/disk/by-id/ tank

if you have proxmox, you need to comment out these lines before running zpool export

alsolh@server:/dev/disk/by-id$ sudo nano /etc/pve/storage.cfg

dir: local

path /var/lib/vz
content backup,iso,vztmpl

lvmthin: local-lvm
thinpool data
vgname pve
content rootdir,images

#zfspool: tank
# pool tank
# content rootdir,images
# mountpoint /tank
# nodes server

then run below commands under interactive root and replace sdx1 with your disk new generated name that needs to be fixed, this will clear the disk that have changed its name from its assignment to the zpool

alsolh@server:/dev/disk/by-id$ sudo -i
root@server:~# dd bs=512 if=/dev/zero of=/dev/sdx1 count=2048 seek=$(($(blockdev –getsz /dev/sdx1) – 2048))
root@server:~# dd bs=512 if=/dev/zero of=/dev/sdc1 count=2048

now find out what is the device new name with below command

alsolh@server:/dev/disk/by-id$ cd /dev/disk/by-id
alsolh@server:/dev/disk/by-id$ ls

sample correct name is “ata-WDC_WD100EMAZ-00WXXXX_1EGGXXXX-part1” or the world wide name “wwn-0x5000cca27ecxxxxx”

now replace your drive that had the issue by itself with below command:

alsolh@server:/dev/disk/by-id$ sudo zpool replace -f tank 53495565608654595 wwn-0x5000cca27ec6xxxx

or

alsolh@server:/dev/disk/by-id$ sudo zpool replace -f tank 53495565608654595 /dev/disk/by-id/ata-WDC_WD100EMAZ-00WXXXX_1EGGXXXX-part1

now you will see your pool is re-silvering which indicates that you have done the replacement successfully

alsolh@server:~$ sudo zpool status -v
pool: tank
state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Fri Aug 7 18:37:09 2020
6.09T scanned at 1.76G/s, 1.84T issued at 547M/s, 14.6T total
356G resilvered, 12.60% done, 0 days 06:48:55 to go
config:

NAME STATE READ WRITE CKSUM
tank DEGRADED 0 0 0
raidz1-0 DEGRADED 0 0 0
wwn-0x5000cca267dbxxxx ONLINE 0 0 0
wwn-0x5000cca267f2xxxx ONLINE 0 0 0
replacing-2 DEGRADED 0 0 0
53495565608654595 UNAVAIL 0 0 0 was /dev/sdd1
wwn-0x5000cca27ec6xxxx ONLINE 0 0 0 (resilvering)

errors: No known data errors