Hi,
I'm using hd-idle to spin down 8 SAS disks. After some debugging, I have configured each disk with these settings:
/dev/sdi: SEAGATE ST2000NX0463 NT32
Power condition mode page:
PM_BG 0 [cha: n, def: 0, sav: 0]
STANDBY_Y 0 [cha: n, def: 0, sav: 0]
IDLE_C 0 [cha: y, def: 0, sav: 0]
IDLE_B 0 [cha: y, def: 1, sav: 0]
IDLE_A 0 [cha: y, def: 1, sav: 0]
STANDBY_Z 1 [cha: y, def: 0, sav: 1]
IACT 10 [cha: n, def: 10, sav: 10]
SZCT 36000 [cha: y, def:36000, sav:36000]
IBCT 6000 [cha: n, def:6000, sav:6000]
ICCT 9000 [cha: n, def:9000, sav:9000]
SYCT 18000 [cha: n, def:18000, sav:18000]
CCF_IDLE 0 [cha: y, def: 1, sav: 0]
CCF_STAND 0 [cha: y, def: 2, sav: 0]
CCF_STOPP 2 [cha: y, def: 2, sav: 2]
Here's my hd-idle config (I'm using the Debian package):
HD_IDLE_OPTS="-i 1800 -l /var/log/hd-idle.log \
-d -p 3 -s 1 \
-a /dev/disk/by-id/wwn-0x5000c500ab \
-a /dev/disk/by-id/wwn-0x5000c5006b \
-a /dev/disk/by-id/wwn-0x5000c5000f \
-a /dev/disk/by-id/wwn-0x5000c50007 \
-a /dev/disk/by-id/wwn-0x5000c50008 \
-a /dev/disk/by-id/wwn-0x5000c5003b \
-a /dev/disk/by-id/wwn-0x5000c500db \
-a /dev/disk/by-id/wwn-0x5000c500bf"
Spindown works reliably. The disks are LUKS-encrypted and organized in a ZFS RAIDZ2 pool. I have also set the ZFS timeouts to 30s and the Linux drive timeouts to 120s to tolerate long spin-up times (although spin-up usually takes 10 seconds at most).
Nevertheless, when I try to access data on a spun-down pool, hd-idle fails to wake up all the drives:
[560621.157626] sd 7:0:6:0: [sdh] tag#3960 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[560621.157626] sd 7:0:4:0: [sdf] tag#3964 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[560621.157631] sd 7:0:6:0: [sdh] tag#3960 Sense Key : Not Ready [current]
[560621.157635] sd 7:0:4:0: [sdf] tag#3964 Sense Key : Not Ready [current]
[560621.157639] sd 7:0:6:0: [sdh] tag#3960 Add. Sense: Logical unit not ready, additional power use not yet granted
[560621.157641] sd 7:0:4:0: [sdf] tag#3964 Add. Sense: Logical unit not ready, additional power use not yet granted
[560621.157645] sd 7:0:6:0: [sdh] tag#3960 CDB: Write(10) 2a 00 4c 80 a4 40 00 00 10 00
[560621.157647] sd 7:0:4:0: [sdf] tag#3964 CDB: Write(10) 2a 00 4c 80 a4 48 00 00 10 00
[560621.157651] I/O error, dev sdh, sector 1283499072 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 2
[560621.157653] I/O error, dev sdf, sector 1283499080 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 2
[560621.157658] zio pool=rust vdev=/dev/mapper/sas-0x5000c50007 error=5 type=2 offset=657134747648 size=8192 flags=3145856
[560621.157664] zio pool=rust vdev=/dev/mapper/sas-0x5000c500ab error=5 type=2 offset=657134751744 size=8192 flags=3145856
[560621.157670] sd 7:0:6:0: [sdh] tag#3962 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[560621.157675] sd 7:0:6:0: [sdh] tag#3962 Sense Key : Not Ready [current]
[560621.157675] sd 7:0:5:0: [sdg] tag#3959 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[560621.157678] sd 7:0:6:0: [sdh] tag#3962 Add. Sense: Logical unit not ready, additional power use not yet granted
[560621.157681] sd 7:0:5:0: [sdg] tag#3959 Sense Key : Not Ready [current]
[560621.157684] sd 7:0:6:0: [sdh] tag#3962 CDB: Write(10) 2a 00 4c c0 a4 38 00 00 10 00
[560621.157686] sd 7:0:5:0: [sdg] tag#3959 Add. Sense: Logical unit not ready, additional power use not yet granted
[560621.157689] I/O error, dev sdh, sector 1287693368 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 2
[560621.157692] sd 7:0:5:0: [sdg] tag#3959 CDB: Write(10) 2a 00 4c 80 a4 40 00 00 18 00
[560621.157719] zio pool=rust vdev=/dev/mapper/sas-0x5000c50007 error=5 type=2 offset=659282227200 size=8192 flags=3145856
[560621.157721] I/O error, dev sdg, sector 1283499072 op 0x1:(WRITE) flags 0x0 phys_seg 2 prio class 2
[560621.157730] zio pool=rust vdev=/dev/mapper/sas-0x5000c5000f error=5 type=2 offset=657134747648 size=12288 flags=3145856
[560621.157740] sd 7:0:4:0: [sdf] tag#3965 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[560621.157744] sd 7:0:4:0: [sdf] tag#3965 Sense Key : Not Ready [current]
[560621.157747] sd 7:0:4:0: [sdf] tag#3965 Add. Sense: Logical unit not ready, additional power use not yet granted
[560621.157752] sd 7:0:4:0: [sdf] tag#3965 CDB: Write(10) 2a 00 4c c0 a4 40 00 00 10 00
[560621.157755] I/O error, dev sdf, sector 1287693376 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 2
[560621.157760] zio pool=rust vdev=/dev/mapper/sas-0x5000c500ab error=5 type=2 offset=659282231296 size=8192 flags=3145856
[560621.157769] sd 7:0:5:0: [sdg] tag#3966 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[560621.157793] sd 7:0:5:0: [sdg] tag#3966 Sense Key : Not Ready [current]
[560621.157797] sd 7:0:5:0: [sdg] tag#3966 Add. Sense: Logical unit not ready, additional power use not yet granted
[560621.157818] sd 7:0:5:0: [sdg] tag#3966 CDB: Write(10) 2a 00 4c c0 a4 40 00 00 10 00
[560621.157822] I/O error, dev sdg, sector 1287693376 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 2
[560621.157828] zio pool=rust vdev=/dev/mapper/sas-0x5000c5000f error=5 type=2 offset=659282231296 size=8192 flags=3145856
[560621.157864] sd 7:0:4:0: [sdf] tag#3904 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[560621.157872] sd 7:0:4:0: [sdf] tag#3904 Sense Key : Not Ready [current]
[560621.157872] sd 7:0:6:0: [sdh] tag#3967 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[560621.157877] sd 7:0:4:0: [sdf] tag#3904 Add. Sense: Logical unit not ready, additional power use not yet granted
[560621.157883] sd 7:0:6:0: [sdh] tag#3967 Sense Key : Not Ready [current]
[560621.157888] sd 7:0:4:0: [sdf] tag#3904 CDB: Read(10) 28 00 00 00 82 10 00 00 10 00
[560621.157890] I/O error, dev sdf, sector 33296 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[560621.157894] sd 7:0:6:0: [sdh] tag#3967 Add. Sense: Logical unit not ready, additional power use not yet granted
[560621.157899] zio pool=rust vdev=/dev/mapper/sas-0x5000c500ab error=5 type=1 offset=270336 size=8192 flags=1245377
[560621.157903] sd 7:0:6:0: [sdh] tag#3967 CDB: Read(10) 28 00 00 00 82 10 00 00 10 00
[560621.157909] I/O error, dev sdh, sector 33296 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[560621.157912] sd 7:0:4:0: [sdf] tag#3906 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[560621.157915] zio pool=rust vdev=/dev/mapper/sas-0x5000c50007 error=5 type=1 offset=270336 size=8192 flags=1245377
[560621.157921] sd 7:0:4:0: [sdf] tag#3906 Sense Key : Not Ready [current]
[560621.157922] sd 7:0:6:0: [sdh] tag#3905 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[560621.157925] sd 7:0:4:0: [sdf] tag#3906 Add. Sense: Logical unit not ready, additional power use not yet granted
[560621.157928] sd 7:0:6:0: [sdh] tag#3905 Sense Key : Not Ready [current]
[560621.157931] sd 7:0:4:0: [sdf] tag#3906 CDB: Read(10) 28 00 e8 e0 84 10 00 00 10 00
[560621.157934] sd 7:0:6:0: [sdh] tag#3905 Add. Sense: Logical unit not ready, additional power use not yet granted
[560621.157934] I/O error, dev sdg, sector 33296 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[560621.157936] I/O error, dev sdf, sector 3907027984 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[560621.157939] zio pool=rust vdev=/dev/mapper/sas-0x5000c500ab error=5 type=1 offset=2000381550592 size=8192 flags=1245377
[560621.157940] sd 7:0:6:0: [sdh] tag#3905 CDB: Read(10) 28 00 e8 e0 84 10 00 00 10 00
[560621.157943] zio pool=rust vdev=/dev/mapper/sas-0x5000c500ab error=5 type=1 offset=2000381812736 size=8192 flags=1245377
[560621.157948] zio pool=rust vdev=/dev/mapper/sas-0x5000c5000f error=5 type=1 offset=270336 size=8192 flags=1245377
[560621.157952] zio pool=rust vdev=/dev/mapper/sas-0x5000c50007 error=5 type=1 offset=2000381550592 size=8192 flags=1245377
[560621.157962] zio pool=rust vdev=/dev/mapper/sas-0x5000c5000f error=5 type=1 offset=2000381550592 size=8192 flags=1245377
[560621.157968] zio pool=rust vdev=/dev/mapper/sas-0x5000c50007 error=5 type=1 offset=2000381812736 size=8192 flags=1245377
[560621.157970] zio pool=rust vdev=/dev/mapper/sas-0x5000c50007 error=5 type=2 offset=659288616960 size=4096 flags=3145856
[560621.157985] zio pool=rust vdev=/dev/mapper/sas-0x5000c500ab error=5 type=2 offset=659288621056 size=4096 flags=3145856
[560621.157998] zio pool=rust vdev=/dev/mapper/sas-0x5000c5000f error=5 type=1 offset=2000381812736 size=8192 flags=1245377
[560621.158008] zio pool=rust vdev=/dev/mapper/sas-0x5000c50007 error=5 type=2 offset=670026555392 size=4096 flags=3145856
[560621.158022] zio pool=rust vdev=/dev/mapper/sas-0x5000c500ab error=5 type=2 offset=652850610176 size=4096 flags=3145856
[560621.158084] zio pool=rust vdev=/dev/mapper/sas-0x5000c50007 error=5 type=2 offset=652850606080 size=4096 flags=3145856
[560621.158084] zio pool=rust vdev=/dev/mapper/sas-0x5000c5000f error=5 type=2 offset=670026559488 size=4096 flags=3145856
[560621.158115] zio pool=rust vdev=/dev/mapper/sas-0x5000c500ab error=5 type=2 offset=654998093824 size=4096 flags=3145856
[560621.158116] zio pool=rust vdev=/dev/mapper/sas-0x5000c5000f error=5 type=2 offset=652850610176 size=4096 flags=3145856
[560621.158130] zio pool=rust vdev=/dev/mapper/sas-0x5000c50007 error=5 type=2 offset=654998089728 size=4096 flags=3145856
[560621.158217] zio pool=rust vdev=/dev/mapper/sas-0x5000c50007 error=5 type=2 offset=672174039040 size=4096 flags=3145856
[560621.158227] zio pool=rust vdev=/dev/mapper/sas-0x5000c500ab error=5 type=2 offset=670026559488 size=4096 flags=3145856
[560621.158244] zio pool=rust vdev=/dev/mapper/sas-0x5000c5000f error=5 type=2 offset=654998093824 size=4096 flags=3145856
[560621.158255] zio pool=rust vdev=/dev/mapper/sas-0x5000c500ab error=5 type=2 offset=672174043136 size=4096 flags=3145856
[560621.158264] zio pool=rust vdev=/dev/mapper/sas-0x5000c5000f error=5 type=2 offset=672174043136 size=4096 flags=3145856
[560631.212264] WARNING: Pool 'rust' has encountered an uncorrectable I/O failure and has been suspended.
If I then clear the pool errors and import the pool again, everything is fine; there is no data corruption. The disks just fail to spin up in time, and ZFS thinks they are dead. I think either the START STOP UNIT command that hd-idle uses to spin the drives up is wrong, or it tries to spin them all up at the same time and the backplane limits the current (otherwise there would be a massive inrush current spike).
I am also able to spin the drives up manually with sg_start /dev/sdX. There is already an issue and a ready PR for a custom spin-down hook (#133); maybe this could be extended to add a spin-up hook and, additionally, an option to disable the default commands? hd-idle would then still monitor drive accesses and timers, but spin-down and wake-up would be delegated to external scripts (which can spin the drives up or down manually).
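To illustrate what such an external spin-up script might do, here is a minimal sketch that wakes the drives one at a time with sg_start, pausing between them to avoid a simultaneous inrush. The function names, the 2-second delay, and the idea that hd-idle would call a script like this are all assumptions for illustration; hd-idle does not currently provide a spin-up hook. It uses the `--start` option of sg_start from sg3_utils.

```python
import subprocess
import time
from typing import Callable, Iterable, List


def build_spinup_command(device: str) -> List[str]:
    """Return the sg_start invocation that asks one drive to spin up."""
    return ["sg_start", "--start", device]


def staggered_spinup(
    devices: Iterable[str],
    delay_s: float = 2.0,  # hypothetical pause between drives, tune per backplane
    run: Callable = subprocess.run,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Spin up drives sequentially and return the devices whose command failed."""
    failed = []
    for index, device in enumerate(devices):
        if index > 0:
            # Wait before waking the next drive so they do not all start at once.
            sleep(delay_s)
        result = run(build_spinup_command(device), check=False)
        if result.returncode != 0:
            failed.append(device)
    return failed
```

Calling `staggered_spinup(["/dev/sdf", "/dev/sdg", "/dev/sdh"])` as root would issue the three commands in order. The `run` and `sleep` parameters exist only so the logic can be exercised without touching real hardware.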