SUMMARY: vxvm Raid-5 volume in detached/replay state

From: Fernando Nantes de Souza (fnantes@yahoo.com)
Date: Mon Jul 17 2000 - 11:27:50 CDT


Original Posting:

--- Fernando Nantes de Souza <fnantes@yahoo.com> wrote:
> Date: Sun, 16 Jul 2000 13:29:21 -0700 (PDT)
> From: Fernando Nantes de Souza <fnantes@yahoo.com>
> Subject: vxvm Raid-5 volume in detached/replay state
> To: Sun Managers <sun-managers@sunmanagers.ececs.uc.edu>
>
> Sun E3000 + A5000 + A100
> Solaris 2.5.1
> Sun StorEdge Volume Manager 2.6 (UFS no vxfs)
>
> We had some trouble with our electrical system after a thunderstorm on
> Friday night. The machine didn't reboot, but it was probably a power
> spike that caused the disks to fail with many messages like this in
> /var/adm/messages:
>
> Jul 15 04:05:43 descartes unix: WARNING: vxvm:vxio: Raid-5 vol01 block 32960: Uncorrectable write error
>
> After a couple of reboot/fsck cycles, we got to a situation where 3
> raid-5 volumes are in a "startable degraded" state according to
> vxinfo:
>
> vol13 fsgen Started
> vol09 fsgen Started
> vol14 fsgen Started
> vol04 fsgen Started
> vol12 fsgen Started
> vol06 fsgen Started
> vol11 fsgen Started
> vol07 fsgen Started
> vol03 fsgen Started
> vol05 fsgen Started
> vol01 raid5 Startable Degraded
> vol02 raid5 Started
> vol08 raid5 Startable Degraded
> vol10 raid5 Startable Degraded
>
> Sun Support looked at the output of the commands "vxprint -ht" and
> "vxdisk list" (see below) and said that the raidsets are being
> automatically recovered and it should take a couple of hours to come
> back to normal. vxprint shows a DETACHED/REPLAY state for the volumes
> (see vol01, vol08, vol10).
>
> I can format->analyze/read all the disks, probe-scsi-all looks good.
> But "vxdisk list" shows many disks with "online failing" status.
>
> The problem:
>
> - We've waited for more than 20 hours and the status hasn't changed.
> - "ps -ef | grep vxrecover" doesn't show anything. I'd expect to see a
> vxrecover process running.
> - I've tried to reach a human being for hours at Sun and all I get is
> music (good music though)
>
> Any ideas? Thanks in advance.
>
> --
> Fernando De Souza
> SysAdmin/DBA
> PricewaterhouseCoopers
> Fairfax, VA

The Solution:

This procedure only works if the disks are ok. They can be tested using
format->analyze.

VM sets a disk's status to "failing" when it can't access the disk. This
can be caused by a permanent hardware failure or by a temporary condition
(such as a power surge), but we need to clear this status before we can
restart the volumes:

$ vxedit -g <disk group> set failing=off <disk name>

Do it for each disk that shows up as "failing" in "vxdisk list".
Remember that you must first make sure the disk is not really bad
(format->analyze).
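
With many disks flagged, the per-disk step can be generated rather than typed.
The sketch below is only an illustration: failing_disk_cmds is a made-up
helper name, it assumes the "vxdisk list" column layout shown in the listing
at the end of this message, and it assumes vxedit takes the disk media name as
its last argument. It only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper, not part of VxVM.
# Reads `vxdisk list` output on stdin (columns: DEVICE TYPE DISK GROUP STATUS...)
# and prints, without running it, one vxedit command per disk whose status
# words include "failing". Disks with no disk media name ("-") are skipped.
failing_disk_cmds() {
    awk '$3 != "-" {
        for (i = 5; i <= NF; i++)
            if ($i == "failing") {
                printf "vxedit -g %s set failing=off %s\n", $4, $3
                break
            }
    }'
}

# Possible usage on a live system (review the output before running it):
#   vxdisk list | failing_disk_cmds
```

Printing instead of executing leaves room to drop any disk that did not pass
the format->analyze check.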

Now force the affected volumes to start/recover (this step takes a
while, 15 min/GB according to Sun Support, so write a script and run it
in the background):

$ vxvol -g <disk group> -f start <volume name>
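
One possible shape for such a script is sketched below. The helper names
start_cmds and est_minutes are invented for this example. It assumes vxinfo
prints lines like "vol01 raid5 Startable Degraded", as in the original
posting, and that the vxprint LENGTH column is in 512-byte sectors. The time
estimate simply applies the 15 min/GB figure and is not a measured value.

```shell
#!/bin/sh
# Hypothetical helpers, not part of VxVM.

# Reads `vxinfo` output on stdin and prints (does not run) one forced-start
# command for every volume reported as "Startable ...".
# $1 = disk group name.
start_cmds() {
    awk -v dg="$1" '$3 == "Startable" {
        printf "vxvol -g %s -f start %s\n", dg, $1
    }'
}

# Rough recovery time in minutes at 15 min/GB.
# $1 = volume LENGTH from vxprint, assumed to be in 512-byte sectors.
est_minutes() {
    awk -v s="$1" 'BEGIN { printf "%.0f\n", s * 512 / 1073741824 * 15 }'
}

# Possible usage on a live system, after reviewing the generated script:
#   vxinfo | start_cmds rootdg > /tmp/start_vols.sh
#   nohup sh /tmp/start_vols.sh &
```

Under those assumptions, vol01 (92164160 sectors, about 44 GB) works out to
roughly 11 hours, which is why running the starts in the background matters.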

During the recovery, "vxprint -ht <volume name>" will show a status like
"ENABLED SYNC". When it's done, if everything worked as expected, the
status will change to "ENABLED ACTIVE", and you should be able to
fsck/mount the file systems.
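
To avoid re-running vxprint by hand, the state check could be wrapped in a
small function. This is a sketch only: vol_state is a made-up name, and it
assumes the volume record is the line beginning with "v" and that KSTATE and
STATE are its fourth and fifth fields, as in the listing at the end of this
message.

```shell
#!/bin/sh
# Hypothetical helper, not part of VxVM.
# Reads `vxprint -ht` output on stdin and prints "KSTATE STATE" for the
# volume named in $1 (taken from the "v" record).
vol_state() {
    awk -v vol="$1" '$1 == "v" && $2 == vol { print $4, $5 }'
}

# Possible polling loop on a live system (the interval is arbitrary):
#   while [ "$(vxprint -ht vol01 | vol_state vol01)" != "ENABLED ACTIVE" ]; do
#       sleep 300
#   done
#   echo "vol01 is ready for fsck/mount"
```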

Thanks to those who replied; it's pretty impressive to get 3 replies
on a Sunday night. The most complete answer came from Sun Support, though
not without waiting for hours (literally) on hold.

* Don Cary Grant: Suggested that I subscribe to the Veritas list, which I
did, and I got one good reply (this morning, though)

* Ryan Matteson

* Tony Griffiths: Replied thru the Veritas-vx list.

--Fernando

"vxprint -ht" and "vxdisk list" for broken system:
>
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> # vxprint -ht
>
> Disk group: rootdg
>
> DG NAME NCONFIG NLOG MINORS GROUP-ID
> DM NAME DEVICE TYPE PRIVLEN PUBLEN STATE
> V NAME USETYPE KSTATE STATE LENGTH READPOL PREFPLEX
> PL NAME VOLUME KSTATE STATE LENGTH LAYOUT NCOL/WID MODE
> SD NAME PLEX DISK DISKOFFS LENGTH [COL/]OFF DEVICE MODE
>
> dg rootdg default default 0 863812056.1025.descartes
>
> dm disk01 c0t0d0s2 sliced 2159 8378640 -
> dm disk02 c0t0d1s2 sliced 2159 8378640 -
> dm disk03 c0t0d2s2 sliced 2159 8378640 -
> dm disk04 c0t0d3s2 sliced 2159 8378640 -
> dm disk05 c0t0d4s2 sliced 2159 8378640 -
> dm disk06 c0t1d0s2 sliced 2159 8378640 -
> dm disk07 c0t1d1s2 sliced 2159 8378640 -
> dm disk08 c0t1d2s2 sliced 2159 8378640 -
> dm disk09 c0t1d3s2 sliced 2159 8378640 -
> dm disk10 c0t1d4s2 sliced 2159 8378640 -
> dm disk11 c0t2d0s2 sliced 2159 8378640 -
> dm disk12 c0t2d1s2 sliced 2159 8378640 -
> dm disk13 c0t2d2s2 sliced 2159 8378640 -
> dm disk14 c0t2d3s2 sliced 2159 8378640 -
> dm disk15 c0t2d4s2 sliced 2159 8378640 -
> dm disk16 c0t3d0s2 sliced 2159 8378640 -
> dm disk17 c0t3d1s2 sliced 2159 8378640 -
> dm disk18 c0t3d2s2 sliced 2159 8378640 -
> dm disk19 c0t3d3s2 sliced 2159 8378640 -
> dm disk20 c0t3d4s2 sliced 2159 8378640 -
> dm disk21 c0t4d0s2 sliced 2159 8378640 -
> dm disk22 c0t4d1s2 sliced 2159 8378640 -
> dm disk23 c0t4d2s2 sliced 2159 8378640 -
> dm disk24 c0t4d3s2 sliced 2159 8378640 -
> dm disk25 c0t4d4s2 sliced 2159 8378640 -
> dm disk26 c0t5d0s2 sliced 2159 8378640 -
> dm disk27 c0t5d1s2 sliced 2159 8378640 -
> dm disk28 c0t5d2s2 sliced 2159 8378640 SPARE
> dm disk29 c0t5d3s2 sliced 2159 8378640 SPARE
> dm disk30 c0t5d4s2 sliced 2159 8378640 SPARE
> dm disk31 c1t1d0s2 sliced 3590 17678493 -
> dm disk32 c1t2d0s2 sliced 3590 17678493 -
> dm disk33 c1t3d0s2 sliced 3590 17678493 -
> dm disk34 c1t10d0s2 sliced 4711 35363560 -
> dm disk35 c1t11d0s2 sliced 4711 35363560 -
> dm disk36 c1t12d0s2 sliced 4711 35363560 -
> dm disk37 c1t13d0s2 sliced 4711 35363560 -
> dm disk38 c1t14d0s2 sliced 4711 35363560 -
> dm disk39 c1t15d0s2 sliced 4711 35363560 SPARE
> dm disk40 c2t64d0s2 sliced 3590 17678493 -
> dm disk41 c2t65d0s2 sliced 3590 17678493 -
> dm disk42 c2t66d0s2 sliced 3590 17678493 -
> dm disk43 c2t67d0s2 sliced 3590 17678493 -
> dm disk44 c2t68d0s2 sliced 3590 17678493 -
> dm disk45 c2t69d0s2 sliced 3590 17678493 -
> dm disk46 c2t70d0s2 sliced 3590 17678493 -
> dm disk47 c2t80d0s2 sliced 3590 17678493 -
> dm disk48 c2t81d0s2 sliced 3590 17678493 -
> dm disk49 c2t82d0s2 sliced 3590 17678493 -
> dm disk50 c2t83d0s2 sliced 3590 17678493 -
> dm disk51 c2t84d0s2 sliced 3590 17678493 -
> dm disk52 c2t85d0s2 sliced 3590 17678493 -
> dm disk53 c2t86d0s2 sliced 3590 17678493 -
>
> v vol01 raid5 DETACHED REPLAY 92164160 RAID -
> pl vol01-03 vol01 ENABLED LOG 4712 CONCAT - RW
> sd disk39-01 vol01-03 disk39 0 4712 0 c1t15d0 ENA
> pl vol01-01 vol01 ENABLED ACTIVE 92164864 RAID 12/32 RW
> sd disk01-01 vol01-01 disk01 0 8378640 0/0 c0t0d0 RCOV
> sd disk02-01 vol01-01 disk02 0 8378640 1/0 c0t0d1 RCOV
> sd disk03-01 vol01-01 disk03 0 8378640 2/0 c0t0d2 ENA
> sd disk04-01 vol01-01 disk04 0 8378640 3/0 c0t0d3 ENA
> sd disk05-01 vol01-01 disk05 0 8378640 4/0 c0t0d4 RCOV
> sd disk29-01 vol01-01 disk29 0 8378640 5/0 c0t5d3 ENA
> sd disk07-01 vol01-01 disk07 0 8378640 6/0 c0t1d1 RCOV
> sd disk08-01 vol01-01 disk08 0 8378640 7/0 c0t1d2 ENA
> sd disk09-01 vol01-01 disk09 0 8378640 8/0 c0t1d3 RCOV
> sd disk10-01 vol01-01 disk10 0 8378640 9/0 c0t1d4 ENA
> sd disk11-01 vol01-01 disk11 0 8378640 10/0 c0t2d0 RCOV
> sd disk12-01 vol01-01 disk12 0 8378640 11/0 c0t2d1 RCOV
>
> v vol02 raid5 ENABLED ACTIVE 92164160 RAID -
> pl vol02-03 vol02 ENABLED LOG 4712 CONCAT - RW
> sd disk39-02 vol02-03 disk39 4712 4712 0 c1t15d0 ENA
> pl vol02-01 vol02 ENABLED ACTIVE 92164864 RAID 12/32 RW
> sd disk30-01 vol02-01 disk30 0 8378640 0/0 c0t5d4 ENA
> sd disk14-01 vol02-01 disk14 0 8378640 1/0 c0t2d3 ENA
> sd disk15-01 vol02-01 disk15 0 8378640 2/0 c0t2d4 ENA
> sd disk16-01 vol02-01 disk16 0 8378640 3/0 c0t3d0 ENA
> sd disk17-01 vol02-01 disk17 0 8378640 4/0 c0t3d1 ENA
> sd disk18-01 vol02-01 disk18 0 8378640 5/0 c0t3d2 ENA
> sd disk19-01 vol02-01 disk19 0 8378640 6/0 c0t3d3 ENA
> sd disk20-01 vol02-01 disk20 0 8378640 7/0 c0t3d4 ENA
> sd disk21-01 vol02-01 disk21 0 8378640 8/0 c0t4d0 ENA
> sd disk22-01 vol02-01 disk22 0 8378640 9/0 c0t4d1 ENA
> sd disk23-01 vol02-01 disk23 0 8378640 10/0 c0t4d2 ENA
> sd disk24-01 vol02-01 disk24 0 8378640 11/0 c0t4d3 ENA
>
> v vol03 fsgen ENABLED ACTIVE 69812224 SELECT -
> pl vol03-01 vol03 ENABLED ACTIVE 69812695 CONCAT - RW
> sd disk31-01 vol03-01 disk31 0 17678493 0 c1t1d0 ENA
> sd disk32-01 vol03-01 disk32 0 17678493 17678493 c1t2d0 ENA
> sd disk33-01 vol03-01 disk33 0 17678493 35356986 c1t3d0 ENA
> sd disk34-04 vol03-01 disk34 28292344 6291456 53035479 c1t10d0 ENA
> sd disk36-02 vol03-01 disk36 10488912 10485760 59326935 c1t12d0 ENA
>
> v vol04 fsgen ENABLED ACTIVE 9402640 SELECT -
> pl vol04-01 vol04 ENABLED ACTIVE 9402640 CONCAT - RW
> sd disk25-01 vol04-01 disk25 0 8378640 0 c0t4d4 ENA
> sd disk34-03 vol04-01 disk34 27268344 1024000 8378640 c1t10d0 ENA
>
> v vol05 fsgen ENABLED ACTIVE 2097152 SELECT -
> pl vol05-01 vol05 ENABLED ACTIVE 2097360 CONCAT - RW
> sd disk27-01 vol05-01 disk27 0 2097360 0 c0t5d1 ENA
>
> v vol06 fsgen ENABLED ACTIVE 18874368 SELECT -
> pl vol06-01 vol06 ENABLED ACTIVE 18876272 CONCAT - RW
> sd disk34-01 vol06-01 disk34 0 18876272 0 c1t10d0 ENA
>
> v vol07 fsgen ENABLED ACTIVE 8388608 SELECT -
> pl vol07-01 vol07 ENABLED ACTIVE 8392072 CONCAT - RW
> sd disk34-02 vol07-01 disk34 18876272 8392072 0 c1t10d0 ENA
>
> v vol08 raid5 DETACHED REPLAY 88391808 RAID -
> pl vol08-02 vol08 ENABLED LOG 4712 CONCAT - RW
> sd disk35-01 vol08-02 disk35 0 4712 0 c1t11d0 ENA
> pl vol08-01 vol08 ENABLED ACTIVE 0 RAID 7/32 RW
> sd disk40-01 vol08-01 disk40 0 14733873 0/0 c2t64d0 RCOV
> sd disk42-01 vol08-01 disk42 0 14733873 1/0 c2t66d0 RCOV
> sd disk44-01 vol08-01 disk44 0 14733873 2/0 c2t68d0 RCOV
> sd disk46-01 vol08-01 disk46 0 14733873 3/0 c2t70d0 RCOV
> sd disk48-01 vol08-01 disk48 0 14733873 4/0 c2t81d0 RCOV
> sd disk50-01 vol08-01 disk50 0 14733873 5/0 c2t83d0 RCOV
> sd disk52-01 vol08-01 disk52 0 14733873 6/0 c2t85d0 RCOV
>
> v vol09 fsgen ENABLED ACTIVE 10485760 SELECT -
> pl vol09-01 vol09 ENABLED ACTIVE 10488912 CONCAT - RW
> sd disk36-01 vol09-01 disk36 0 10488912 0 c1t12d0 ENA
>
> v vol10 raid5 DETACHED REPLAY 88391808 RAID -
> pl vol10-02 vol10 ENABLED LOG 4712 CONCAT - RW
> sd disk37-01 vol10-02 disk37 0 4712 0 c1t13d0 ENA
> pl vol10-01 vol10 ENABLED ACTIVE 88403136 RAID 7/32 RW
> sd disk41-01 vol10-01 disk41 0 14733873 0/0 c2t65d0 RCOV
> sd disk43-01 vol10-01 disk43 0 14733873 1/0 c2t67d0 ENA
> sd disk45-01 vol10-01 disk45 0 14733873 2/0 c2t69d0 RCOV
> sd disk47-01 vol10-01 disk47 0 14733873 3/0 c2t80d0 RCOV
> sd disk49-01 vol10-01 disk49 0 14733873 4/0 c2t82d0 RCOV
> sd disk51-01 vol10-01 disk51 0 14733873 5/0 c2t84d0 RCOV
> sd disk53-01 vol10-01 disk53 0 14733873 6/0 c2t86d0 RCOV
>
> v vol11 fsgen ENABLED ACTIVE 12582912 SELECT -
> pl vol11-01 vol11 ENABLED ACTIVE 12585752 CONCAT - RW
> sd disk35-02 vol11-01 disk35 4712 12585752 0 c1t11d0 ENA
>
> v vol12 fsgen ENABLED ACTIVE 10485760 SELECT -
> pl vol12-01 vol12 ENABLED ACTIVE 10488912 CONCAT - RW
> sd disk37-02 vol12-01 disk37 4712 10488912 0 c1t13d0 ENA
>
> v vol13 fsgen ENABLED ACTIVE 14385152 SELECT -
> pl vol13-01 vol13 ENABLED ACTIVE 14385736 CONCAT - RW
> sd disk36-03 vol13-01 disk36 20977824 14385736 0 c1t12d0 ENA
>
> v vol14 fsgen ENABLED ACTIVE 4194304 SELECT -
> pl vol14-01 vol14 ENABLED ACTIVE 4198392 CONCAT - RW
> sd disk38-01 vol14-01 disk38 0 4198392 0 c1t14d0 ENA
>
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> # vxdisk list
>
> DEVICE TYPE DISK GROUP STATUS
> c0t0d0s2 sliced disk01 rootdg online failing
> c0t0d1s2 sliced disk02 rootdg online failing
> c0t0d2s2 sliced disk03 rootdg online
> c0t0d3s2 sliced disk04 rootdg online
> c0t0d4s2 sliced disk05 rootdg online failing
> c0t1d0s2 sliced disk06 rootdg online
> c0t1d1s2 sliced disk07 rootdg online failing
> c0t1d2s2 sliced disk08 rootdg online
> c0t1d3s2 sliced disk09 rootdg online failing
> c0t1d4s2 sliced disk10 rootdg online
> c0t2d0s2 sliced disk11 rootdg online failing
> c0t2d1s2 sliced disk12 rootdg online failing
> c0t2d2s2 sliced disk13 rootdg online failing
> c0t2d3s2 sliced disk14 rootdg online
> c0t2d4s2 sliced disk15 rootdg online
> c0t3d0s2 sliced disk16 rootdg online
> c0t3d1s2 sliced disk17 rootdg online
> c0t3d2s2 sliced disk18 rootdg online
> c0t3d3s2 sliced disk19 rootdg online
> c0t3d4s2 sliced disk20 rootdg online
> c0t4d0s2 sliced disk21 rootdg online
> c0t4d1s2 sliced disk22 rootdg online
> c0t4d2s2 sliced disk23 rootdg online
> c0t4d3s2 sliced disk24 rootdg online
> c0t4d4s2 sliced disk25 rootdg online
> c0t5d0s2 sliced disk26 rootdg online
> c0t5d1s2 sliced disk27 rootdg online
> c0t5d2s2 sliced disk28 rootdg online spare failing
> c0t5d3s2 sliced disk29 rootdg online spare
> c0t5d4s2 sliced disk30 rootdg online spare
> c1t0d0s2 sliced - - error
> c1t1d0s2 sliced disk31 rootdg online
> c1t2d0s2 sliced disk32 rootdg online
> c1t3d0s2 sliced disk33 rootdg online
> c1t10d0s2 sliced disk34 rootdg online
> c1t11d0s2 sliced disk35 rootdg online
> c1t12d0s2 sliced disk36 rootdg online
> c1t13d0s2 sliced disk37 rootdg online
> c1t14d0s2 sliced disk38 rootdg online
> c1t15d0s2 sliced disk39 rootdg online spare
> c2t64d0s2 sliced disk40 rootdg online failing
> c2t65d0s2 sliced disk41 rootdg online failing
> c2t66d0s2 sliced disk42 rootdg online failing
> c2t67d0s2 sliced disk43 rootdg online
> c2t68d0s2 sliced disk44 rootdg online failing
> c2t69d0s2 sliced disk45 rootdg online failing
> c2t70d0s2 sliced disk46 rootdg online failing
> c2t80d0s2 sliced disk47 rootdg online failing
> c2t81d0s2 sliced disk48 rootdg online failing
> c2t82d0s2 sliced disk49 rootdg online failing
> c2t83d0s2 sliced disk50 rootdg online failing
> c2t84d0s2 sliced disk51 rootdg online failing
> c2t85d0s2 sliced disk52 rootdg online failing
> c2t86d0s2 sliced disk53 rootdg online failing
>
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
>
>
> =====
> --
> Fernando Nantes de Souza
> fnantes@yahoo.com
>
> __________________________________________________
> Do You Yahoo!?
> Get Yahoo! Mail Free email you can access from anywhere!
> http://mail.yahoo.com/
>
> S
> U BEFORE POSTING please READ the FAQ located at
> N ftp://ftp.cs.toronto.edu/pub/jdd/sun-managers/faq
> .. and the list POLICY statement located at
> M ftp://ftp.cs.toronto.edu/pub/jdd/sun-managers/policy
> A To submit questions/summaries to this list send your email
> message to:
> N sun-managers@ececs.uc.edu
> A To unsubscribe from this list please send an email message to:
> G majordomo@sunmanagers.ececs.uc.edu
> E and in the BODY type:
> R unsubscribe sun-managers
> S Or
> .. unsubscribe sun-managers original@subscription.address
> L To view an archive of this list please visit:
> I http://www.latech.edu/sunman.html
> S
> T




This archive was generated by hypermail 2.1.2 : Fri Sep 28 2001 - 23:14:12 CDT