Clear spin-up impossible #138

@xorrvin

Hi,

I'm using hd-idle to spin down 8 SAS disks. After some debugging, I have configured each disk with these settings:

/dev/sdi: SEAGATE   ST2000NX0463      NT32
Power condition mode page:
PM_BG         0  [cha: n, def:  0, sav:  0]
STANDBY_Y     0  [cha: n, def:  0, sav:  0]
IDLE_C        0  [cha: y, def:  0, sav:  0]
IDLE_B        0  [cha: y, def:  1, sav:  0]
IDLE_A        0  [cha: y, def:  1, sav:  0]
STANDBY_Z     1  [cha: y, def:  0, sav:  1]
IACT          10  [cha: n, def: 10, sav: 10]
SZCT          36000  [cha: y, def:36000, sav:36000]
IBCT          6000  [cha: n, def:6000, sav:6000]
ICCT          9000  [cha: n, def:9000, sav:9000]
SYCT          18000  [cha: n, def:18000, sav:18000]
CCF_IDLE      0  [cha: y, def:  1, sav:  0]
CCF_STAND     0  [cha: y, def:  2, sav:  0]
CCF_STOPP     2  [cha: y, def:  2, sav:  2]

Here's my hd-idle config (I'm using the Debian package):

HD_IDLE_OPTS="-i 1800 -l /var/log/hd-idle.log \
-d -p 3 -s 1 \
-a /dev/disk/by-id/wwn-0x5000c500ab \
-a /dev/disk/by-id/wwn-0x5000c5006b \
-a /dev/disk/by-id/wwn-0x5000c5000f \
-a /dev/disk/by-id/wwn-0x5000c50007 \
-a /dev/disk/by-id/wwn-0x5000c50008 \
-a /dev/disk/by-id/wwn-0x5000c5003b \
-a /dev/disk/by-id/wwn-0x5000c500db \
-a /dev/disk/by-id/wwn-0x5000c500bf"

Spin-down works reliably. The disks are LUKS-encrypted and organized in a ZFS RAIDZ2 pool. I have also raised the ZFS timeouts to 30s and the Linux drive timeouts to 120s to tolerate long spin-up times (although spin-up usually takes 10 seconds at most).
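For reference, a minimal sketch of how the per-drive Linux timeout could be raised. It assumes "Linux drive timeouts" means the SCSI command timeout exposed at /sys/block/<dev>/device/timeout; the function name and the SYSFS_ROOT variable are hypothetical and only there for illustration. It does not cover the ZFS side.

```shell
#!/bin/sh
# Sketch: set the SCSI command timeout (in seconds) for a list of disks.
# Assumption: the timeout in question is /sys/block/<dev>/device/timeout.
# SYSFS_ROOT is a hypothetical override so the function can be tried
# against a scratch directory instead of the real sysfs.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

set_scsi_timeout() {
    secs="$1"
    shift
    for name in "$@"; do
        f="$SYSFS_ROOT/block/$name/device/timeout"
        if [ -w "$f" ]; then
            echo "$secs" > "$f"
        else
            echo "skip $name: $f is not writable" >&2
        fi
    done
}
```

Run as root, it would be called as `set_scsi_timeout 120 sdf sdg sdh`. A value written this way does not survive a reboot or a device re-attach, so it has to be reapplied (for example from a udev rule or a boot script).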

Nevertheless, when I try to access data on a spun-down pool, hd-idle fails to wake up all drives:

[560621.157626] sd 7:0:6:0: [sdh] tag#3960 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[560621.157626] sd 7:0:4:0: [sdf] tag#3964 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[560621.157631] sd 7:0:6:0: [sdh] tag#3960 Sense Key : Not Ready [current] 
[560621.157635] sd 7:0:4:0: [sdf] tag#3964 Sense Key : Not Ready [current] 
[560621.157639] sd 7:0:6:0: [sdh] tag#3960 Add. Sense: Logical unit not ready, additional power use not yet granted
[560621.157641] sd 7:0:4:0: [sdf] tag#3964 Add. Sense: Logical unit not ready, additional power use not yet granted
[560621.157645] sd 7:0:6:0: [sdh] tag#3960 CDB: Write(10) 2a 00 4c 80 a4 40 00 00 10 00
[560621.157647] sd 7:0:4:0: [sdf] tag#3964 CDB: Write(10) 2a 00 4c 80 a4 48 00 00 10 00
[560621.157651] I/O error, dev sdh, sector 1283499072 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 2
[560621.157653] I/O error, dev sdf, sector 1283499080 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 2
[560621.157658] zio pool=rust vdev=/dev/mapper/sas-0x5000c50007 error=5 type=2 offset=657134747648 size=8192 flags=3145856
[560621.157664] zio pool=rust vdev=/dev/mapper/sas-0x5000c500ab error=5 type=2 offset=657134751744 size=8192 flags=3145856
[560621.157670] sd 7:0:6:0: [sdh] tag#3962 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[560621.157675] sd 7:0:6:0: [sdh] tag#3962 Sense Key : Not Ready [current] 
[560621.157675] sd 7:0:5:0: [sdg] tag#3959 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[560621.157678] sd 7:0:6:0: [sdh] tag#3962 Add. Sense: Logical unit not ready, additional power use not yet granted
[560621.157681] sd 7:0:5:0: [sdg] tag#3959 Sense Key : Not Ready [current] 
[560621.157684] sd 7:0:6:0: [sdh] tag#3962 CDB: Write(10) 2a 00 4c c0 a4 38 00 00 10 00
[560621.157686] sd 7:0:5:0: [sdg] tag#3959 Add. Sense: Logical unit not ready, additional power use not yet granted
[560621.157689] I/O error, dev sdh, sector 1287693368 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 2
[560621.157692] sd 7:0:5:0: [sdg] tag#3959 CDB: Write(10) 2a 00 4c 80 a4 40 00 00 18 00
[560621.157719] zio pool=rust vdev=/dev/mapper/sas-0x5000c50007 error=5 type=2 offset=659282227200 size=8192 flags=3145856
[560621.157721] I/O error, dev sdg, sector 1283499072 op 0x1:(WRITE) flags 0x0 phys_seg 2 prio class 2
[560621.157730] zio pool=rust vdev=/dev/mapper/sas-0x5000c5000f error=5 type=2 offset=657134747648 size=12288 flags=3145856
[560621.157740] sd 7:0:4:0: [sdf] tag#3965 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[560621.157744] sd 7:0:4:0: [sdf] tag#3965 Sense Key : Not Ready [current] 
[560621.157747] sd 7:0:4:0: [sdf] tag#3965 Add. Sense: Logical unit not ready, additional power use not yet granted
[560621.157752] sd 7:0:4:0: [sdf] tag#3965 CDB: Write(10) 2a 00 4c c0 a4 40 00 00 10 00
[560621.157755] I/O error, dev sdf, sector 1287693376 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 2
[560621.157760] zio pool=rust vdev=/dev/mapper/sas-0x5000c500ab error=5 type=2 offset=659282231296 size=8192 flags=3145856
[560621.157769] sd 7:0:5:0: [sdg] tag#3966 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[560621.157793] sd 7:0:5:0: [sdg] tag#3966 Sense Key : Not Ready [current] 
[560621.157797] sd 7:0:5:0: [sdg] tag#3966 Add. Sense: Logical unit not ready, additional power use not yet granted
[560621.157818] sd 7:0:5:0: [sdg] tag#3966 CDB: Write(10) 2a 00 4c c0 a4 40 00 00 10 00
[560621.157822] I/O error, dev sdg, sector 1287693376 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 2
[560621.157828] zio pool=rust vdev=/dev/mapper/sas-0x5000c5000f error=5 type=2 offset=659282231296 size=8192 flags=3145856
[560621.157864] sd 7:0:4:0: [sdf] tag#3904 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[560621.157872] sd 7:0:4:0: [sdf] tag#3904 Sense Key : Not Ready [current] 
[560621.157872] sd 7:0:6:0: [sdh] tag#3967 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[560621.157877] sd 7:0:4:0: [sdf] tag#3904 Add. Sense: Logical unit not ready, additional power use not yet granted
[560621.157883] sd 7:0:6:0: [sdh] tag#3967 Sense Key : Not Ready [current] 
[560621.157888] sd 7:0:4:0: [sdf] tag#3904 CDB: Read(10) 28 00 00 00 82 10 00 00 10 00
[560621.157890] I/O error, dev sdf, sector 33296 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[560621.157894] sd 7:0:6:0: [sdh] tag#3967 Add. Sense: Logical unit not ready, additional power use not yet granted
[560621.157899] zio pool=rust vdev=/dev/mapper/sas-0x5000c500ab error=5 type=1 offset=270336 size=8192 flags=1245377
[560621.157903] sd 7:0:6:0: [sdh] tag#3967 CDB: Read(10) 28 00 00 00 82 10 00 00 10 00
[560621.157909] I/O error, dev sdh, sector 33296 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[560621.157912] sd 7:0:4:0: [sdf] tag#3906 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[560621.157915] zio pool=rust vdev=/dev/mapper/sas-0x5000c50007 error=5 type=1 offset=270336 size=8192 flags=1245377
[560621.157921] sd 7:0:4:0: [sdf] tag#3906 Sense Key : Not Ready [current] 
[560621.157922] sd 7:0:6:0: [sdh] tag#3905 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[560621.157925] sd 7:0:4:0: [sdf] tag#3906 Add. Sense: Logical unit not ready, additional power use not yet granted
[560621.157928] sd 7:0:6:0: [sdh] tag#3905 Sense Key : Not Ready [current] 
[560621.157931] sd 7:0:4:0: [sdf] tag#3906 CDB: Read(10) 28 00 e8 e0 84 10 00 00 10 00
[560621.157934] sd 7:0:6:0: [sdh] tag#3905 Add. Sense: Logical unit not ready, additional power use not yet granted
[560621.157934] I/O error, dev sdg, sector 33296 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[560621.157936] I/O error, dev sdf, sector 3907027984 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[560621.157939] zio pool=rust vdev=/dev/mapper/sas-0x5000c500ab error=5 type=1 offset=2000381550592 size=8192 flags=1245377
[560621.157940] sd 7:0:6:0: [sdh] tag#3905 CDB: Read(10) 28 00 e8 e0 84 10 00 00 10 00
[560621.157943] zio pool=rust vdev=/dev/mapper/sas-0x5000c500ab error=5 type=1 offset=2000381812736 size=8192 flags=1245377
[560621.157948] zio pool=rust vdev=/dev/mapper/sas-0x5000c5000f error=5 type=1 offset=270336 size=8192 flags=1245377
[560621.157952] zio pool=rust vdev=/dev/mapper/sas-0x5000c50007 error=5 type=1 offset=2000381550592 size=8192 flags=1245377
[560621.157962] zio pool=rust vdev=/dev/mapper/sas-0x5000c5000f error=5 type=1 offset=2000381550592 size=8192 flags=1245377
[560621.157968] zio pool=rust vdev=/dev/mapper/sas-0x5000c50007 error=5 type=1 offset=2000381812736 size=8192 flags=1245377
[560621.157970] zio pool=rust vdev=/dev/mapper/sas-0x5000c50007 error=5 type=2 offset=659288616960 size=4096 flags=3145856
[560621.157985] zio pool=rust vdev=/dev/mapper/sas-0x5000c500ab error=5 type=2 offset=659288621056 size=4096 flags=3145856
[560621.157998] zio pool=rust vdev=/dev/mapper/sas-0x5000c5000f error=5 type=1 offset=2000381812736 size=8192 flags=1245377
[560621.158008] zio pool=rust vdev=/dev/mapper/sas-0x5000c50007 error=5 type=2 offset=670026555392 size=4096 flags=3145856
[560621.158022] zio pool=rust vdev=/dev/mapper/sas-0x5000c500ab error=5 type=2 offset=652850610176 size=4096 flags=3145856
[560621.158084] zio pool=rust vdev=/dev/mapper/sas-0x5000c50007 error=5 type=2 offset=652850606080 size=4096 flags=3145856
[560621.158084] zio pool=rust vdev=/dev/mapper/sas-0x5000c5000f error=5 type=2 offset=670026559488 size=4096 flags=3145856
[560621.158115] zio pool=rust vdev=/dev/mapper/sas-0x5000c500ab error=5 type=2 offset=654998093824 size=4096 flags=3145856
[560621.158116] zio pool=rust vdev=/dev/mapper/sas-0x5000c5000f error=5 type=2 offset=652850610176 size=4096 flags=3145856
[560621.158130] zio pool=rust vdev=/dev/mapper/sas-0x5000c50007 error=5 type=2 offset=654998089728 size=4096 flags=3145856
[560621.158217] zio pool=rust vdev=/dev/mapper/sas-0x5000c50007 error=5 type=2 offset=672174039040 size=4096 flags=3145856
[560621.158227] zio pool=rust vdev=/dev/mapper/sas-0x5000c500ab error=5 type=2 offset=670026559488 size=4096 flags=3145856
[560621.158244] zio pool=rust vdev=/dev/mapper/sas-0x5000c5000f error=5 type=2 offset=654998093824 size=4096 flags=3145856
[560621.158255] zio pool=rust vdev=/dev/mapper/sas-0x5000c500ab error=5 type=2 offset=672174043136 size=4096 flags=3145856
[560621.158264] zio pool=rust vdev=/dev/mapper/sas-0x5000c5000f error=5 type=2 offset=672174043136 size=4096 flags=3145856
[560631.212264] WARNING: Pool 'rust' has encountered an uncorrectable I/O failure and has been suspended.
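As a cross-check of the log, the CDB bytes can be decoded by hand. In a Read(10)/Write(10) CDB, bytes 2-5 hold the big-endian LBA and bytes 7-8 the transfer length in blocks. The helper below is a hypothetical sketch (decode_rw10 is not part of any tool) and is run here on the first Write(10) line for sdh.

```shell
#!/bin/sh
# Sketch: decode LBA and transfer length from a 10-byte Read(10)/Write(10) CDB.
# Arguments are the ten CDB bytes as hex, exactly as printed by the kernel.
decode_rw10() {
    # Bytes 2..5 (arguments 3..6) are the LBA, most significant byte first.
    lba=$(( (0x$3 << 24) | (0x$4 << 16) | (0x$5 << 8) | 0x$6 ))
    # Bytes 7..8 (arguments 8..9) are the transfer length in blocks.
    blocks=$(( (0x$8 << 8) | 0x$9 ))
    echo "lba=$lba blocks=$blocks"
}

decode_rw10 2a 00 4c 80 a4 40 00 00 10 00
# prints: lba=1283499072 blocks=16
```

The LBA matches the sector in the corresponding "I/O error, dev sdh, sector 1283499072" line, and 16 blocks of 512 bytes match the size=8192 in the zio line, so these are ordinary pool I/Os being rejected with Not Ready rather than anything unusual being sent to the drives.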

If I then clear the pool errors and import the pool again, everything is fine. There is no data corruption; the disks just fail to spin up in time, and ZFS thinks they're dead. I think either the START STOP UNIT command hd-idle uses to spin the drives up is wrong, or it tries to spin them all up at the same time and the backplane limits the current (otherwise there would be a massive inrush current spike).

I am also able to spin up drives manually with sg_start /dev/sdX. There is already an issue and a PR ready for a custom spin-down hook (#133); maybe this could be extended to add a spin-up hook and, additionally, an option to disable the default commands? hd-idle would then still monitor drive accesses and timers, but sleep and wake-up would be outsourced to separate scripts (which can spin drives up or down manually).
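To make the idea concrete, here is a rough sketch of what such an external spin-up script could look like. hd-idle has no spin-up hook today, so how the script would be invoked is left open; the function name, SPINUP_DELAY, and DRY_RUN are hypothetical. The only real command used is sg_start --start from sg3_utils, and the 2-second gap is a guess, not a measured value.

```shell
#!/bin/sh
# Sketch: spin drives up one at a time instead of all at once.
# SPINUP_DELAY (seconds between drives) and DRY_RUN are hypothetical knobs.
# DRY_RUN defaults to 1 so that nothing touches the disks by accident.
SPINUP_DELAY="${SPINUP_DELAY:-2}"
DRY_RUN="${DRY_RUN:-1}"

staggered_spinup() {
    for dev in "$@"; do
        if [ "$DRY_RUN" = "1" ]; then
            # Only show what would be executed.
            echo "sg_start --start $dev"
        else
            # Issue the start command, then wait before the next drive
            # so the drives do not all draw spin-up current together.
            sg_start --start "$dev"
            sleep "$SPINUP_DELAY"
        fi
    done
}

staggered_spinup "$@"
```

Saved as, say, spinup.sh, it could be tried with `DRY_RUN=0 sh spinup.sh /dev/sdf /dev/sdg /dev/sdh`. If the simultaneous spin-up theory is right, staggering like this should avoid the "additional power use not yet granted" errors; if they still appear, the problem is more likely in the command itself.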
